
Structural and Multidisciplinary Optimization

Homework 1 – Unconstrained Optimization

Student: Dumitrașcu Celsia – Alexandra


S202355
Contents

1. Optimization Methods & Line Search Methods
2. The minimized functions
3. Initial Points
4. Convergence Criteria
5. Explanation of the Optimization Algorithms Used
6. Strengths and Weaknesses of the Analysed Methods
7. Annexes


1. Optimization Methods & Line Search Methods

Optimization Methods
1. Steepest Descent
2. Conjugate Gradients
3. Newton Basic
4. Newton-like
5. Quasi-Newton method – Broyden-Fletcher-Goldfarb-Shanno (BFGS)

Line Search Methods


1. Dichotomy – used in this report for the second function, in the Newton-like method
2. Quadratic Interpolation – used in this report for the second function, in all the methods
3. Newton-Raphson method
4. Secant method

The maximum number of iterations used is 70 and the tolerance is 1e-3.


2. The minimized functions

2.1 Function 1

The first function discussed in this report is a strictly convex quadratic function, as can be seen in Fig. 1.

Fig.1 Function 1. The Global Minimizer.

The gradient for this function is:

g = [ 4x + 3y − 2 ;
      4y + 3x + 10 ]
The Hessian matrix has been calculated and it is symmetric positive definite:

Hessian = [ 4  3
            3  4 ]

For this function, no line search method is needed, but one can still be used.
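For reference, a minimal code sketch of this function is given below. It assumes the quadratic form f1(x, y) = 2x² + 3xy + 2y² − 2x + 10y − 1, which is inferred here from the stated gradient and the x = y reduction in Section 3; the report itself does not give f1 explicitly.

import numpy as np

# f1 is assumed here as a quadratic form inferred from the stated gradient
# g = (4x + 3y - 2, 4y + 3x + 10); the report does not state f1 explicitly.
def f1(p):
    x, y = p
    return 2 * x**2 + 3 * x * y + 2 * y**2 - 2 * x + 10 * y - 1

def grad_f1(p):
    x, y = p
    return np.array([4 * x + 3 * y - 2, 4 * y + 3 * x + 10])

# Constant, symmetric positive definite Hessian
H1 = np.array([[4.0, 3.0],
               [3.0, 4.0]])

# Stationary point: solve H1 @ p = (2, -10)
p_star = np.linalg.solve(H1, np.array([2.0, -10.0]))
print(p_star, f1(p_star))   # approximately [5.4286, -6.5714] and -39.2857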

2.2 Function 2
Unlike the first function, the second one is not a strictly convex quadratic function, and its Hessian matrix is not symmetric positive definite. The gradient and the Hessian matrix (Hm) are given below.

g = [ 0.8x − 6(π/2)·sin((π/2)x + 1) ;
      0.8y + 3(π/2)·sin((π/2)y − 1) ]

Hm = [ 0.8 − 6(π²/4)·cos((π/2)x) ,  0 ;
       0 ,  0.8 + 3(π²/4)·sin((π/2)y) ]

For this function, the optimization algorithms need a line search method. The main line search method used in this report is Quadratic Interpolation. However, for the Newton-like optimization method, I used both Dichotomy and Quadratic Interpolation.

Fig. 2 Function 2. Global and Local Minimizers


3. Initial Points

In order to define the initial point which will give us the minimum value
for the objective function, I made the following assumption: x=y.
Therefore, the studied functions become:
f1(x, x) = 7x² + 8x − 1        f2(x, x) = 0.8x² + 3cos((π/2)x)

Next, we calculate the second derivatives of these functions, which leads to the following results:

∂²f1/∂x² = 14 > 0

∂²f2/∂x² = 1.6 − 3(π²/4)cos((π/2)x) > 0, for x ≤ −1

Now, in order to find the x coordinate of the initial point, we impose that the first derivatives of f1 and f2 must be equal to 0:

∂f1/∂x = 14x + 8 = 0  =>  x = −0.5714, y = −0.5714

∂f2/∂x = 1.6x − 3(π/2)sin((π/2)x) = 0  =>  x = −1.75265, y = −1.75265

The solutions should be close to these initial points; that is, the global minimum values of the objective functions are expected around these points. For a better understanding of how the methods work, several initial points have been chosen (as will be seen in the Annexes). In this report, the results for the following initial points will be analysed:
• (0,0) for both functions – for the first function it is a feasible point, but for the second one the Hessian matrix is not positive definite. Some methods do not give good results because of this, but it was a good way to check whether the code is correct and to see how the program behaves.
• (10,10) for both functions – to see the difference in the number of iterations and to show the convergence when approaching from the right.
• (-0.5714, -0.5714) for the first function – the point for which the first derivative of f1 is 0.
• (-1.75265, -1.75265) for the second function – the point for which the first derivative of f2 is 0.
4. Convergence Criteria

Convergence Criteria:
• for the first function

max_{i=1,…,n} | ∂f(xk+1)/∂xi | < ε
It is a strictly convex quadratic function and this criterion gives the best computational time.

• for the second function

‖∇f(xk+1)‖² = Σ_{i=1}^{n} [ ∂f(xk+1)/∂xi ]² < ε

The second function is not strictly convex quadratic, so I chose to check the convergence through the magnitude of the gradient at the next point. These criteria are used for all the methods in the present report.
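As a minimal sketch (the names here are illustrative, not the report's own code), the two stopping tests could be written as:

import numpy as np

def converged_f1(grad, eps=1e-3):
    # First function: largest gradient component in absolute value
    return np.max(np.abs(grad)) < eps

def converged_f2(grad, eps=1e-3):
    # Second function: squared Euclidean norm of the gradient
    return float(np.dot(grad, grad)) < eps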

5. Explanation of the Optimization Algorithms Used

5.1 The Steepest Descent Method

For this method, we first take an initial guess X0, which in my case is (0,0). Using the iterative scheme, the program moves along the descent direction until it reaches the optimum value of the objective function.

Sk = −∇fk = −∇f(Xk)

Where:
• Sk = search direction
• ∇fk = gradient of the function at point Xk
• Xk = the analyzed point

The search direction is defined as minus the gradient because I am searching for the minimum of the functions, so I need to go downwards. To calculate the next point in the search for the optimum, the formula below is used:

Xk+1 = Xk + αSk = Xk – α∇ f k

Where:
• α = the optimal step length in the defined direction

In this method, two successive search directions are orthogonal. If the initial point is not chosen close to the optimal point, the convergence is slow. For the first function (because it is strictly convex quadratic), the method converges to the stationary point of the function (the global minimum). However, for the second function, if the initial point is not chosen well, the method will converge to a local minimum.

Line Search:
• f1 – since it is a SCQF, no line search was needed; the exact step length is

α = ‖g‖² / (s' Hm s)

Where:
s' = the direction vector transposed
Hm = the Hessian matrix
s = the direction vector

• f2 – the line search used is Quadratic Interpolation (a generic sketch is given below)
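The following is a minimal three-point quadratic interpolation sketch; it is an assumption about the implementation, since the report's own code is not shown. It samples φ(α) = f(X + α·s) at α = 0, h and 2h, and returns the minimizer of the fitted parabola.

def quad_interp_step(phi, h=1.0):
    # Sample phi(alpha) = f(X + alpha*s) at alpha = 0, h and 2h
    y1, y2, y3 = phi(0.0), phi(h), phi(2.0 * h)
    curvature = y1 - 2.0 * y2 + y3          # > 0 for a convex parabolic fit
    if curvature <= 1e-12:
        return h                            # degenerate fit: fall back to the middle sample
    # Vertex of the parabola through the three equally spaced samples
    return h * (4.0 * y2 - 3.0 * y1 - y3) / (2.0 * (2.0 * y2 - y1 - y3))

# Example usage along a descent direction s at the current point X (hypothetical f2, X, s):
# alpha = quad_interp_step(lambda a: f2(X + a * s))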

Algorithm steps
1. Initialization
a) Choose initial point, x0, which needs to have real coordinates
b) Set k=0

2. Find the direction


a) Compute sk such that sk' ∇f(Xk) < 0

3. Line Search (for function 2)


a) Compute αk such that f(Xk + αk sk) = min_{α ≥ 0} f(Xk + α sk)

4. Update the point


Xk+1 = Xk + αk sk
5. Convergence Check
a) Satisfied: Stop => X*≈ Xk+1
b) Unsatisfied: Set k=k+1 and reiterate
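Below is a minimal steepest descent sketch for the first function, using the exact step length for a quadratic. The gradient is the one stated in Section 2.1; the function and variable names are illustrative.

import numpy as np

H1 = np.array([[4.0, 3.0], [3.0, 4.0]])            # Hessian of f1
def grad_f1(p):                                     # gradient of f1 (as stated in Section 2.1)
    return H1 @ p + np.array([-2.0, 10.0])

def steepest_descent(x0, tol=1e-3, max_iter=70):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad_f1(x)
        if np.max(np.abs(g)) < tol:                 # convergence criterion for f1
            break
        s = -g                                      # steepest descent direction
        alpha = (g @ g) / (s @ H1 @ s)              # exact step length for a quadratic
        x = x + alpha * s
    return x, k

print(steepest_descent([0.0, 0.0]))                 # approaches [5.4286, -6.5714]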

The table above shows the results for the three initial points in the case of the first function. It is easy to see (and just as expected) that for the second point, which is the closest to X*, the program needed only 2 iterations to reach the global minimum. The large number of iterations needed for the first initial point ([0,0]) is caused by the asymptotic behaviour of the function at that point. The higher the value of the derivative at a point, the faster the convergence.

Since the second function is not a strictly convex quadratic function, the method does not converge to a single stationary point: the function also has local minimizers, which can be noticed in the table above as well. As can be seen, for X0 = [0,0] the method converges to a local minimizer, while for X0 = [10,10] it reaches the global minimum. However, for the initial point closest to X*, it gets remarkably close to the global minimizer, yet not exactly there.
The cosine creates waves in the x direction, which makes it difficult to predict how fast the convergence will be. Therefore, for X0 = [0,0] the program needed 21 iterations, while for the other two initial points the global minimum was reached in only 6 iterations.
The zig-zagging behaviour is noticed again.

5.2 The Conjugate Gradients Method


The previous method presents a zig-zagging behaviour, which the Conjugate Gradients method is meant to solve. The first direction of this method is the steepest descent direction (d0 = −g0), while the following directions are calculated as shown below:

dk+1 = −gk+1 + βk dk

where:
• dk+1 = the direction for the new point
• dk = the direction of the current point
• βk = ‖∇f(xk+1)‖² / ‖∇f(xk)‖²
The directions of this method are mutually conjugate with respect to the Hessian matrix. This method uses a reinitialization process after n iterations, where "n" is the number of the function's variables, which in my case is 2.

Algorithm Steps

1. Initialization
a) Choose initial point, X0, which needs to have real coordinates
b) Set k=0

2. Find the direction


a) Set d0 = -g0

3. Iteration k
a) Xk+1 = Xk + αk dk

b) Compute αk = −(dk' gk) / (dk' Hm dk)

c) dk+1 = −gk+1 + βk dk, with

βk = (gk+1' Hm dk) / (dk' Hm dk)
d) Set k= k+1 and start the iteration again
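A minimal sketch of this loop for the quadratic first function, using the Hessian-based αk and βk above with a periodic restart every n steps (illustrative names; the gradient is the one stated in Section 2.1):

import numpy as np

H1 = np.array([[4.0, 3.0], [3.0, 4.0]])                 # Hessian of f1
grad_f1 = lambda p: H1 @ p + np.array([-2.0, 10.0])     # gradient of f1

def conjugate_gradients(x0, tol=1e-3, max_iter=70, n=2):
    x = np.asarray(x0, dtype=float)
    g = grad_f1(x)
    d = -g                                              # first direction: steepest descent
    for k in range(max_iter):
        if np.max(np.abs(g)) < tol:
            break
        alpha = -(d @ g) / (d @ H1 @ d)                 # exact step along d
        x = x + alpha * d
        g_new = grad_f1(x)
        if (k + 1) % n == 0:
            d = -g_new                                  # periodic re-initialization after n steps
        else:
            beta = (g_new @ H1 @ d) / (d @ H1 @ d)      # Hessian-based beta
            d = -g_new + beta * d
        g = g_new
    return x, k

print(conjugate_gradients([0.0, 0.0]))                  # converges in 2 iterations for f1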
For the first function, the method converges to the global minimizer because the function is strictly convex quadratic. The asymptotic behaviour observed at the point X0 = [0,0] disappears, which leads to a faster convergence. Moreover, using this method, the global minimum of the objective function is reached no matter what the initial point is. The zig-zagging problem disappears, which fits the theory.

Since the second function is not a SCQF, I used the Quadratic Interpolation line search method. The fact that the function is not strictly convex quadratic also leads to a higher number of iterations until the global minimizer is reached.
In order to reach global convergence, the process needs to be periodically re-initialized. If "n" is not chosen properly, the convergence will be slow.
The behaviour of the path at X0 = [0,0], as well as the large number of iterations, can only be explained by errors in the code; this case is still to be studied and repaired. In the path for X0 = [10,10] it can be observed that there is no zigzag, just as expected.
Even with such a large number of iterations, it can be seen that this method actually leads to the global minimizer, while the Steepest Descent method stops at a local minimizer.
STEEPEST DESCENT VS CONJUGATE GRADIENTS

Steepest Descent | Conjugate Gradients
Two successive search directions are orthogonal | Search directions are mutually conjugate with respect to the Hessian matrix
Zig-zag path | No "zig-zag"
Converges well | Takes fewer iterations
Does not globally converge for non-strictly convex quadratic functions | Requires periodic re-initialization for global convergence
Converged for f1 in 2 iterations | Converged for f1 in 2 iterations
Converged for f2 in 6 iterations | Converged for f2 in 11 iterations

5.3 Basic Newton Method

This method takes a more direct path than the Conjugate Gradients method. However, it requires the Hessian matrix at each iteration to find the extrema. It is perfect for SCQF and for functions with a positive definite Hessian matrix; otherwise it will not reach the global minimizer. As will be seen further in this report, for the second function it found only local minimizers. This is because this method does not use a line search.
In this method, α is not computed for the search direction. Instead, the
inverse of the Hessian matrix is multiplied by the gradient. The descent
direction for the Basic Newton method is given by the formula shown below:

Xk+1 = Xk − [∇²f(Xk)]⁻¹ ∇f(Xk)
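A minimal sketch of this iteration (illustrative names; the linear system is solved instead of explicitly inverting the Hessian):

import numpy as np

def newton_basic(grad, hess, x0, tol=1e-3, max_iter=70):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.max(np.abs(g)) < tol:
            break
        # Full Newton step: solve the linear system instead of inverting the Hessian
        x = x - np.linalg.solve(hess(x), g)
    return x, k

# Example on f1 (gradient as stated in Section 2.1): one Newton step suffices
H1 = np.array([[4.0, 3.0], [3.0, 4.0]])
x_star, iters = newton_basic(lambda p: H1 @ p + np.array([-2.0, 10.0]),
                             lambda p: H1, [0.0, 0.0])
print(x_star, iters)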

It is easy to notice that, for the first function, this method reaches the global minimizer in just 1 iteration. The convergence to the global minimizer from both sides (left and right) of X* was verified, even if X0 is not close to the said global minimizer (as shown in the Annexes, X0 = [100, 100] is convergent).

Even from the path of the method, it can be seen that the method goes straight for X*.
For the second function, the Newton Basic method reaches only local minimizers, and which one is found depends on the initial point chosen. By running the method for only one initial point, it cannot be said whether it reached the global minimum or not.

5.4 Newton-like
The Newton-like method is the first adaptation of the Newton Basic method. Compared to the basic one, this method also converges for non-SCQF. In order for a non-SCQF to converge, the following adaptation is used in this method:

Xk+1 = Xk − αk [∇²f(Xk)]⁻¹ ∇f(Xk)

Algorithm Steps

1. Choose αk so as to obtain the minimum of f(Xk + α sk) along the descent direction.
2. If f(Xk + sk) < f(Xk), set αk = 1.
3. If f(Xk + sk) ≥ f(Xk), use the line search method to obtain an updated αk such that it satisfies

f(Xk + αk sk) < f(Xk)
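A minimal sketch of these steps (illustrative names; a simple step-halving loop stands in here for the Dichotomy / Quadratic Interpolation line searches used in the report):

import numpy as np

def newton_like(f, grad, hess, x0, tol=1e-3, max_iter=70):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.max(np.abs(g)) < tol:
            break
        s = -np.linalg.solve(hess(x), g)            # Newton direction
        alpha = 1.0
        # If the full step does not decrease f, shrink alpha until it does
        while f(x + alpha * s) >= f(x) and alpha > 1e-8:
            alpha *= 0.5
        x = x + alpha * s
    return x, k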

In the case of the first function, the results are similar to the Newton Basic method: the global minimizer is reached in a single iteration, and the convergence is verified no matter how close or how far the chosen initial point is from X*.
However, for the second function, the method reaches the minimum only if the chosen X0 is close to X*. In the table above it can be seen that, just as Newton Basic, the method stops at the first local minimizer met on the path. The Dichotomy does not give a good result for the initial point [10, 10] because of the Hessian matrix, so I used Quadratic Interpolation for that point.
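For reference, a generic dichotomy (interval-halving) line search sketch of the kind mentioned above might look as follows; this is an assumption about the implementation, not the report's own code:

def dichotomy_step(phi, a=0.0, b=1.0, delta=1e-4, tol=1e-3, max_iter=70):
    # Shrink the bracket [a, b] around the minimizer of phi(alpha) = f(X + alpha*s)
    for _ in range(max_iter):
        if (b - a) < tol:
            break
        mid = 0.5 * (a + b)
        if phi(mid - delta) < phi(mid + delta):
            b = mid + delta        # the minimizer lies in the left half
        else:
            a = mid - delta        # the minimizer lies in the right half
    return 0.5 * (a + b)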

5.5 Quasi-Newton Method

In this method, the approximation of the inverse Hessian is updated at each iteration with the BFGS formula, where δk = Xk+1 − Xk and γk = ∇fk+1 − ∇fk:

Hk+1 = Hk + (1 + (γk' Hk γk)/(δk' γk)) · (δk δk')/(δk' γk) − (δk γk' Hk + Hk γk δk')/(δk' γk)

When the initial approximation H0 is unknown, it is initialized as the identity matrix. Unlike the Newton Basic method, this method needs more iterations to find the minimum, and the convergence path has a lower precision.
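A minimal sketch of this scheme (illustrative names; a fixed unit step replaces the report's line search, and the update is skipped when δk'γk is close to zero):

import numpy as np

def bfgs(grad, x0, tol=1e-3, max_iter=70):
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                       # H0: identity when the inverse Hessian is unknown
    g = grad(x)
    for k in range(max_iter):
        if np.max(np.abs(g)) < tol:
            break
        s = -H @ g                           # quasi-Newton direction
        x_new = x + s                        # alpha = 1 here; a line search would refine this
        g_new = grad(x_new)
        delta, gamma = x_new - x, g_new - g
        dg = float(delta @ gamma)
        if abs(dg) > 1e-12:                  # skip the update if the curvature term is degenerate
            Hg = H @ gamma
            H = (H
                 + (1.0 + (gamma @ Hg) / dg) * np.outer(delta, delta) / dg
                 - (np.outer(delta, Hg) + np.outer(Hg, delta)) / dg)
        x, g = x_new, g_new
    return x, k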

For the first function, it is noticeable that the method reaches X*, but it takes 2 iterations to get there.
Unlike the Newton-like method, this one reached the global minimum when the chosen initial point is close to X*. However, the results suggest that the minimizer found by the method depends on the initial point.

NEWTON: BASIC VS LIKE VS QUASI

Newton Basic | Newton-like | Quasi-Newton
The Hessian matrix must be calculated | The Hessian matrix must be calculated | Needs just a positive definite matrix
If the function doesn't have a positive definite Hessian, there will be problems in the computation | If the function doesn't have a positive definite Hessian, there will be problems in the computation | -
- | Requires Line Search | Requires Line Search

For the non-SCQF, the methods don't reach the global minimizer.

6. Strengths and Weaknesses of the Analysed Methods

Method | Strengths | Weaknesses
Steepest Descent | The simplest gradient-based method | Large number of iterations; zig-zag behaviour of the direction
Conjugate Gradients | Straight path to X* | Requires periodic reinitialization
Newton Basic | Smallest number of iterations | Not globally convergent for f2
Newton-like | Reaches the global minimizer | Works only if the Hessian is positive definite
Quasi-Newton | Fewer iterations than Newton-like | Needs a large amount of storage
7. Annexes

1. Steepest Descent, function 1, X0 = [0,0], X* = [5.4282, -6.571], Obj. F = -39.2857.

2. Conjugate Gradients, function 1, X0 = [0,0], X* = [5.4286, -6.5714],


Obj. F = -39.2857.
3. Newton Basic, function 1, X0 = [0,0], X* = [5.4286, -6.5714], Obj. F = -39.2857.

4. Newton Like, function 1, X0 = [0,0], X* = [5.4286, -6.5714], Obj. F = -39.2857.


5.
6. Quasi-Newton, function 1, X0 = [0,0], X* = [5.4286, -6.5714], Obj. F = -39.2857.

7. Steepest Descent, function 2, X0 = [0,0], X* = [-1.962, 3.7213], Obj. F = - 7.3107.


8. Conjugate Gradients, function 2, X0 = [0,0], X* = [-1.9611, 0.12376], Obj. F =
-9.4727.

9. Newton Basic, function 2, X0 = [0,0], X* = [0.071406, 0.12192], Obj. F = 2.9746.


10. Newton Like, function 2, Dichotomy, X0 = [0,0], X* = [0.071406, 0.12129], Obj
F = 2.9746.

11. Quasi Newton, function 2, X0 = [0,0], X* = [-5.7497, 0.12328], Obj F = -1.1294.


12. Steepest Descent, function 1, X0 = [-0.5714, -0.5714], X* = [5.4286, -6.5714], Obj.
F = -39.2857

13. Conjugate Gradients, function 1, X0 = [-0.5714, -0.5714], X* = [5.4286, -6.5714],


Obj. F = -39.2857
14. Newton Basic, function 1, X0 = [-0.5714, -0.5714], X* = [5.4286, -6.5714],
Obj. F = -39.2857

15. Newton Like, function 1, X0 = [-0.5714, -0.5714], X* = [5.4286, -6.5714], Obj. F = -39.2857
16. Quasi-Newton, function 1, X0 = [-0.5714, -0.5714], X* = [5.4286, -6.5714], Obj. F =
-39.2857

17. Steepest Descent, function 1, X0 = [100, 100], X* = [5.4284, -6.5713],


Obj. F =-39.2857
18. Conjugate Gradients, function 1, X0 = [100, 100], X* = [5.4284, -6.5713],
Obj. F =-39.2857

19. Newton Basic, function 1, X0 = [100, 100], X* = [5.4284, -6.5713], Obj.


F =-39.2857
20. Newton Like, function 1, X0 = [100, 100], X* = [5.4284, -6.5713], Obj. F =-39.2857

21. Quasi-Newton, function 1, X0 = [100, 100], X* = [5.4284, -6.5713],


Obj. F =-39.2857
22. Steepest Descent, function 1, X0 = [10, 10], X* = [5.4284, -6.5713],
Obj. F =-39.2857

23. Conjugate Gradients, function 1, X0 = [10, 10], X* = [5.4284, -6.5713],


Obj. F =-39.2857
24. Newton Basic, function 1, X0 = [10, 10], X* = [5.4286, -6.5714], Obj. F =-39.2857

25. Newton Like, function 1, X0 = [10, 10], X* = [5.4286, -6.5714], Obj. F =-39.2857
26. Quasi-Newton, function 1, X0 = [10, 10], X* = [5.4286, -6.5714], Obj. F =-39.2857

27. Steepest Descent, function 2, X0 = [10, 10], X* = [-1.9622, 0.1241],


Obj. F = -9.4727
28. Conjugate Gradients, function 2, X0 = [10, 10], X* = [-1.961, 0.12396],
Obj. F = -9.4727

29. Newton Basic, function 2, X0 = [10, 10], X* = [9.2943, 0.12234], Obj. F = 38.1108
30. Newton Like, Quadratic Interpolation, function 2, X0 = [10, 10], X* = [1.8315,
3.7241], Obj. F = -3.517

31. Newton Like, Dichotomy, function 2, X0 = [10, 10], X* = [0.071406, 0.12129],


Obj. F = 2.9746
32. Quasi-Newton, function 2, X0 = [10, 10], X* = [-1.9598, 0.12387], Obj. F = -9.4726
33. Centralization of the results
