Numerical Optimization: 1 Analysis

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

Numerical Optimization

Lecture 2: Elements of Analysis and Topology

Sangkyun Lee

The content is mainly Nocedal and Wright (2006), Bertsekas (1999), and
Folland (1999). Topics marked with ** are optional.

1
1.1

Analysis
Sequences

A sequence of an infinite number of points in Rn is denoted by {xk }


k=1 =
{x1 , x2 , . . . } = {xk }.

1.2

Convergence of a Sequence, Limit

A sequence of points {xk } in Rn converges to a point x, if for any  > 0,


there exists an index K such that
kxk xk2 ,

for all k K.

Then we write
lim xk = x, or xk x as k ,

and call x the limit of {xk }.


Ex. xk = (1 2k , 1/k 2 )T converges to (1, 0)T .

1.3

Subsequence

Given an infinite index set S {1, 2, . . . }, a subsequence of {xk } corresponding to S can be constructed, and is denoted by
{xk }kS .

1.4

Sequence with a Convergence Subsequence, Limit Point

For a sequence {xk } in Rn , if there exists an infinite set of indices S =


{k1 , k2 , . . . , } such that the corresponding subsequence {xk }kS converges
to a point x
Rn , that is,
lim xki = x
,
i

then x
is called an limit point (or, an accumulation point or a cluster point
in the Euclidean space) of {xk }.
Alternatively, we say for any  > 0 and all positive integer K, we have
kxk x
k , for some k K.
Ex. a sequence with two limit points x
= (0, 0)T and x
= (1, 1)T :
      
   

1
1/2
1
1/4
1
1/8
,
,
,
,
,
,...
1
1/2
1
1/4
1
1/8
Ex. a sequence with an infinite number of limit points: xk = sin k. Every
point in the interval [1, 1] is a limit point.

1.5

Cauchy Sequence

A sequence is said to be a Cauchy sequence if for any  > 0, there exists an


integer K such that
kxk xl k2 , for all k K and ` K.

1.6

Boundedness

A set of points in F Rn is bounded if there exists a scalar M > 0 such


that
kxk2 M, x F.

1.7

Convergence of a Sequence (Cont)

A bounded sequence {xk } in Rn converges if and only if it has a unique


limit point.
A sequence {xk } in Rn converges if and only if it is a Cauchy sequence.
(Bolzano-Weierstrass theorem) Every bounded sequence in Rn has at
least one limit point. Proof**:
Every sequence {xk } in Rn has a monotone subsequence.
(Monotone convergence theorem) every monotone sequence has
a finite limit if and only if the sequence is bounded.

1.8

Convergence of a Scalar Sequence

Consider a scalar sequence {tk }, tk R for all k. This sequence is


bounded above: if there exists a scalar u such that tk u for all k.
bounded below: if there exists a scalar ` such that tk ` for all k.
2

nondecreasing: if tk+1 tk for all k.


nonincreasing: if tk+1 tk for all k.
The scalar sequence {tk } converges if (monotone convergence theorem)
it is nondecreasing and bounded above.
it is nonincreasing and bounded below.
If a sequence is bounded below and bounded above, then it is called bounded.

1.9

Properties of Scalar Sequences

limn (xn yn ) = limn xn limn yn .


limn cxn = c limn yn .
limn (xn yn ) = (limn xn )(limn yn ).
limn

xn
yn

limn xn
limn yn

if limn yn 6= 0.

limn xpn = (limn xn )p .


If xn yn for all n N for some positive integer N , then limn xn
limn yn .
(Sandwich theorem) If xn zn yn for all n N for some positive
integer N and limn xn = limn = , then limn zn = .
(LHopital theorem) For functions f and g differentiable on I \ c where
I is an open interval containing c (c can be ), if
f 0 (x)
exists,
xc g 0 (x)

lim f (x) = lim g(x) = 0 or , and lim

xc

xc

f (x)
f 0 (x)
= lim 0
.
xc g(x)
xc g (x)

then lim

1.10

Supremum and Infimum

For a scalar sequence {tk },


Supremum: the smallest real number u: tk u for all k, denoted
by sup{tk }. If there is no such number, then sup{tk } = . By
convention, sup = .
Infimum: the largest real number `: tk ` for all k, denoted by inf{tk }.
If there is no such number, then inf{tk } = . By convention, inf =
.
3

For two scalar sequences {xk } and {yk }, we have (think about the proof!)
inf{xk } + inf{yk } inf{xk + yk }.
sup{xk } + sup{yk } sup{xk + yk }.

1.11

Limsup and Liminf

Define a sequence of suprema as {ui }:


ui := sup{tk |k i}
Then {ui } is an nonincreasing sequence. If bounded below, it converges to
a finite number u
, called lim sup of {tk }, denoted by
lim sup{tk } = lim ui = lim sup{tk |k i}
i

Similarly, define the sequence of infima as {`i }:


`i = inf{tk |k i}.
Then {`i } is nondecreasing. If {`i } is bounded above, it converges to a point
v which is called lim inf of {tk }, denoted by
lim inf{tk } = lim `i = lim inf{tk |k i}.
i

Ex. the sequence 1, 1/2, 1, 1/4, 1, 1/8, . . . has a lim inf of 0 and a lim sup of
1.
Let {xk } and {yk } be scalar sequences. Then,
inf{xk } lim inf k xk lim supk xk sup{xk }.
If lim supk xk (lim inf k xk ) is finite, then it is the largest (smallest, resp.) limit point of {xk }.
{xk } converges if and only if
< lim inf xk = lim sup xk < .
k

Furthermore, the limit of {xk } is equal to the common scalar value of


lim inf k xk and lim supk xk .
If xk yk for all k, then
lim inf xk lim inf yk ,
k

lim sup xk lim sup yk .


k

Finally, we have
lim inf xk + lim inf yk lim inf (xk + yk )
k

lim sup xk + lim sup yk lim sup(xk + yk )


k

Rate of Convergence

Let {xk } be a sequence in Rn that converges to x .

2.1

Q-Linear Convergence

The convergence is Q-linear if there exists a constant r (0, 1) such that


kxk+1 x k2
r (0, 1), for all k sufficiently large.
kxk x k2
Q stands for quotient.
Q-linear convergences implies that the distance to the solution x decreases at each iteration by a constant factor 0 < r < 1.
Ex. xk = 1 + (0.5)k converges to 1 Q-linearly with rate r = 0.5.

2.2

Q-Superlinear Convergence

The convergence is superlinear if


kxk+1 x k2
= 0.
k kxk x k2
lim

Ex. xk = 1 + k k converges superlinearly to 1.


(Hint)


X
xn
x n
ex =
= lim 1 +
.
n
n!
n
n=0

2.3

Q-Quadratic Convergence

The convergence is Q-quadratic if there exists a constant M > 0 (not necessarily < 1) such that
kxk+1 x k2
M for all k sufficiently large.
kxk x k22
k

Ex. xk = 1 + (0.5)2 .

2.4

Remarks

Convergence rate depends on constants r and M , whose values depend


on algorithms and problems. But eventually a Q-quadratic sequence
will always converge faster than a Q-linear sequence.
Q-quadratic (e.g. Newtons under some conditions)
Q-superlinear (e.g. quasi-Newton)
Q-linear (e.g. gradient descent).
5

There are R-rates of convergence (R for root), but well use only
Q-rates in this lecture (therefore well often omit Q).

1.04
1.00

Error

linear
superlinear
quadratic

superlinear
quadratic

1.02

1.0 1.1 1.2 1.3 1.4 1.5

Error

In Q-rates, the error (distance to x ) cannot increase for all sufficiently


large k (in R-rates the error can increase).

2.5

3.0

3.5

4.0

0.0
1.0

0.5

0.5

1.0

Figure 1: Rate of convergence (the three examples in the text).

Figure 2: Inf, sup, liminf, limsup, and convergence illustrated (yk =


exp(k) cos(2k)).

Topology

3.1

Topology**

For a set X, T 2X is a topology on X if


T and X T .
Arbitrary unions of members of T is in T .
Finite intersections of members of T is in T .
Ex. if X 6= , then 2X and {, X} are discrete and trivial topology, resp.
The pair (X, T ) is called a topological space. The members of T are
called open sets, and their complements are called closed sets.

3.2

Metric Space**

A pair M = (X, d) with a set X and a function d : X X R satisfying


the following properties (d is called a metric) is a metric space:
d(x, y) 0 for all x, y X.
d(x, y) = 0 if and only if x = y.
d(x, y) = d(y, x) for all x, y X.
d(x, y) + d(y, z) d(x, z) for all x, y, z X (triangle inequality).
The metric space M = (X, d) with a real
X = Rn and the (EuPn vector space
2
1/2
clidean) metric d(x, y) = kxyk2 = ( i=1 (xi yi ) ) is the n-dimensional
Euclidean space. For convenience, we use Rn to denote the Euclidean space.

3.3

Open Sets of a Metric Space**

Let (X, d) be a metric space. A subset S X is open if for all point x S


we can find a scalar  > 0 such that
{y X : d(x, y) < } S.

3.4

Metric-Induced Topology**

The collection of all open sets in a metric space M = (X, d) constitutes a


topology T on X. 1 A topology induced by the Euclidean metric d is called
the Euclidean topology.
1

For a proof, see Theorem 4.7 of https://www.dpmms.cam.ac.uk/~twk/Top.pdf

Topology of the Euclidean Space

4.1

Open Sets

A subset F Rn is open if for every x F we can find a positive number


 > 0 such that an open ball of radius  centered at x is contained entirely
in F :
B(x, ) := {y Rn : ky xk2 < } F.
**These open sets are the members of the Euclidean topology. That is,
Arbitrary unions of open sets are open.
Finite intersections of open sets are open.
Ex. (infinite intersections of open sets can be closed2 )


1 1

,
= {0}
n=1
n n

4.2

Closed Sets

A set F is closed if its complement F c = Rn \ F is open.


Or, equivalently, if for all possible sequences of points {xk } in F , all limit
points of {xk } are elements in F .
Ex.
F = (0, 1) (2, 10) is an open set of R.
F = [0, 1] [2, 5] is a closed set of R.
F = (0, 1] is neither open nor closed.
Rn and are both open and closed.
**From the properties of open sets, we have
Finite unions of closed sets are closed.
Arbitrary intersections of closed sets are closed.

4.3

Interior

A point x F Rn is an interior point of the set F , if there is an open ball


B(x, ) contained entirely in F .
The interior of a set F , denoted by int F , is the largest open set contained
in F . Or, equivalently, int F is the set of all interior points of F .
Ex. F = {x Rn : xi 0}. int F = {x Rn : xi > 0}.
2

Singleton sets are closed in any metric space.

4.4

Closure

The closure of a set F , denoted by cl F , is the smallest closed set containing


F . That is,
x cl F if lim xk = x for some sequence {xk } of points in F .
k

In other words, cl F is the union of F and all limit points from sequences in
F.
From definitions: int F F , F cl F .
int and cl are idempotent: int int F = int F , cl cl F = cl F .
int and cl are monotone: X Y implies that int X int Y and
cl X cl Y .
F is open iff int F = F .
F is closed iff cl F = F .
If F is a convex set, then int F and cl F are convex.

4.5

Boundary

The boundary of a set F , denoted by bd F , is


bd(F ) = cl F cl F c , F c = Rn \ F.

4.6

Compact Sets

A set F is compact if every sequence {xk } of points in F has at least one


limit point, and all such limit points are in F . If F Rn is closed and
bounded, then F is compact.

4.7

Neighborhood

For a point x Rn , N Rn is called a neighborhood of x, if it is an open set


containing x. A common neighborhood is the ball centered at x with radius
 > 0, B(x, ).
For a set F Rn , N is a neighborhood of F if N is open and there exists
 > 0 such that
xF B(x, ) N.

References
Bertsekas, D. P. (1999). Nonlinear Programming. Athena Scientific, second
edition.
Folland, G. B. (1999). Real Analysis: Modern Techniques and Their Applications. Wiley Interscience, 2nd edition.
Nocedal, J. and Wright, S. J. (2006). Numerical Optimization. Springer,
second edition.

10

You might also like