

STA 5106
Computational Methods
in Statistics I
Department of Statistics
Florida State University

Class 8
September 26, 2023


Chapter 3

Non-Linear Statistical Methods

3.1 Introduction


Non-linear Optimization
• This chapter is devoted to solving for the optimal points of given functions, linear or nonlinear.

• An instance of nonlinear optimization problems in statistics occurs in maximum likelihood estimation.

• Example:
y = F(x, b) + ε
where x and y are random variables of interest, b is a constant vector of parameters, and ε is a random error with density function fε.

• In many practical situations, we have sampled values of y and x, and our goal is to estimate b according to some criterion.

Maximum Likelihood Estimation (MLE)
• Assuming the sample size is n and the measurements are independent of each other, b can be estimated as follows:
b̂ = arg max_b ∏_{i=1}^{n} fε(yi − F(xi, b))

• The quantity on the right-hand side of this equation is called the likelihood function.

• b̂ is called the maximum likelihood estimate (MLE) of b.

• In general, a value of b that maximizes the likelihood function can be found by seeking the roots of the first derivative of the likelihood function.
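• For example (a standard special case, stated here for illustration): if ε ~ N(0, σ²), so that fε(e) ∝ exp(−e²/(2σ²)), then maximizing the likelihood is equivalent to minimizing Σi (yi − F(xi, b))², i.e., nonlinear least squares.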

3.2 Numerical Root Finding: Scalar Case


Goal
• We are given a real-valued function f(x), where x ∈ R, and the goal is to find the roots of f, i.e., find x∗ such that f(x∗) = 0.

• Due to the non-linearity of the cost function, its roots are found in an iterative way: an initial estimate is chosen and is modified iteratively until some stopping criterion is met.

• There are many different methods for root-finding. They all have the following three essential elements:
1. Some method of determining the starting value, x0.
2. Given the i-th iterate, some formula for calculating the (i+1)-th iterate.
3. Some stopping criterion.

Order of Convergence
• For each of these algorithms we are interested in finding their rate of convergence towards the roots of the function.

• The rate of convergence is characterized by a quantity called the order of convergence.

• Definition 10: For a converging sequence {xi}, the order of convergence is β if
lim_{i→∞} E_{i+1} / E_i^β = K,
where Ei = |xi − xi−1| and K is a constant.

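• In practice, β can be estimated from the iterates: since E_{i+1} ≈ K E_i^β, we have log E_{i+1} ≈ log K + β log E_i, so β is the slope of a straight-line fit. A minimal sketch (assuming the vector x holds the iterates produced by one of the algorithms below):

E = abs(diff(x));                                % Ei = |xi - xi-1|
p = polyfit(log(E(1:end-1)), log(E(2:end)), 1);  % fit log E(i+1) vs. log E(i)
beta_hat = p(1);                                 % slope estimates beta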

Simple Iterations
• Instead of solving for the roots of f, we solve an equivalent problem of finding the fixed point of another function g. g is found in such a way that
f(x∗) = 0 ⇔ g(x∗) = x∗.

• The selection depends on which g is easier to handle and solve for a fixed point. One choice which is always valid is
g(x) = x + f(x).

• Let x0 be some starting value for the iterative search of the roots of f. At the i-th stage of the algorithm, the next iterate is given by the formula:
xi+1 = g(xi) = xi + f(xi).

Algorithm
• Algorithm: Given a function f(x), find its roots using simple iterations:

• Algorithm 21 (Simple Iterations)
% g(x) = x + f(x); epsilon > 0 is the stopping tolerance
x(1) = x0;
gx = g(x(1));
i = 1;
while (abs(x(i) - gx) > epsilon)   % stop when successive iterates agree
    gx = x(i);                     % store the previous iterate
    x(i + 1) = g(x(i));            % fixed-point update
    i = i + 1;
end

• The order of convergence of simple iterations is β = 1.


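• A minimal usage sketch (an assumed example for illustration): finding the root of f(x) = cos(x) − x, for which g(x) = x + f(x) = cos(x):

g = @(x) cos(x);   % fixed-point map for f(x) = cos(x) - x
epsilon = 1e-8;
x(1) = 0.5;        % starting value x0
gx = g(x(1));
i = 1;
while (abs(x(i) - gx) > epsilon)
    gx = x(i);
    x(i + 1) = g(x(i));
    i = i + 1;
end
% x(i) approaches the fixed point 0.7391..., the root of f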

Newton-Raphson’s Method
• The slow convergence of simple iteration is avoided by using the Newton-Raphson method (1690).

• The Newton-Raphson method is one of the most popular techniques used in numerical root finding.

• Stricter requirements: for Newton’s method to apply, f should be continuously differentiable and
f′(x) ≠ 0
near the solution x∗.

• We will derive an iterative procedure for computing xi+1 given xi.

Basic Idea
• Approximate the function at a given point by a straight line. In this case, the line is the tangent to f at that point.


Newton-Raphson’s Method
• Assume xi is the current estimate. Then
f′(xi) = (0 − f(xi)) / (xi+1 − xi)

• Therefore,
xi+1 = xi − f(xi)/f′(xi),   (f′(xi) ≠ 0)

• Newton-Raphson is one of the fastest known algorithms for root finding in general situations.

• Its order of convergence is β = 2.


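• A quick worked example (assumed for illustration): for f(x) = x² − 2 with x0 = 1, the update gives x1 = 1.5, x2 ≈ 1.4167, x3 ≈ 1.41422, converging rapidly to √2 ≈ 1.41421.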

Algorithm
• Algorithm: Given a function f(x), find its roots using the Newton-Raphson method:

• Algorithm 23 (Newton-Raphson Method)
% f and its derivative fp (= f') are given; epsilon > 0 is the stopping tolerance
x(1) = x0;
i = 1;
gx = x(i) - f(x(i))/fp(x(i));      % first Newton step
while (abs(x(i) - gx) > epsilon)   % stop when successive iterates agree
    gx = x(i);                     % store the previous iterate
    x(i + 1) = x(i) - f(x(i))/fp(x(i));   % Newton update
    i = i + 1;
end

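• A minimal usage sketch (an assumed example for illustration): the same root of f(x) = cos(x) − x, now with the derivative supplied:

f  = @(x) cos(x) - x;
fp = @(x) -sin(x) - 1;   % f'(x)
epsilon = 1e-12;
x(1) = 0.5;              % starting value x0
i = 1;
gx = x(i) - f(x(i))/fp(x(i));
while (abs(x(i) - gx) > epsilon)
    gx = x(i);
    x(i + 1) = x(i) - f(x(i))/fp(x(i));
    i = i + 1;
end
% reaches 0.7390851... to machine precision in a handful of iterations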

Important Application
• The Newton-Raphson method is particularly useful for minimizing (or maximizing) functions that are convex (or concave): a minimizer solves f′(x) = 0, so root finding applies directly (a sketch follows this slide).

• Definition: A real-valued function f(x) defined on an interval [a, b] is called convex if, for any two points x1 and x2 in [a, b] and any t in [0, 1],
f(t x1 + (1–t) x2) ≤ t f(x1) + (1–t) f(x2)

• A function f is said to be concave if −f is convex. That is, if for any two points x1 and x2 in [a, b] and any t in [0, 1],
f(t x1 + (1–t) x2) ≥ t f(x1) + (1–t) f(x2)

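• A minimal sketch of this use (an assumed example for illustration): minimizing the convex function f(x) = exp(x) + x² by running Newton-Raphson on f′:

fp  = @(x) exp(x) + 2*x;   % f'(x)
fpp = @(x) exp(x) + 2;     % f''(x) > 0, so f is convex
x = 0;                     % starting value
for i = 1:50
    step = fp(x)/fpp(x);
    x = x - step;          % Newton update applied to f'
    if abs(step) < 1e-12, break; end
end
% x is now near the minimizer (about -0.3517), where f'(x) = 0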

Convex and Concave
[Figure: plots of a convex function and a concave function]

Convex Condition
• Theorem: Assume f is twice continuously differentiable on [a, b]. If f′′ ≥ 0, then f is convex.
Proof: For any two points x1 and x2 in [a, b] and any t in [0, 1], let s = t x1 + (1–t) x2. Then, by Taylor expansion about s,
f(x1) = f(s) + f′(s)(x1 – s) + f′′(ξ1)(x1 – s)²/2,
f(x2) = f(s) + f′(s)(x2 – s) + f′′(ξ2)(x2 – s)²/2.
Since t(x1 – s) + (1–t)(x2 – s) = t x1 + (1–t) x2 – s = 0, the first-order terms cancel, giving
t f(x1) + (1–t) f(x2) = f(s) + t f′′(ξ1)(x1 – s)²/2 + (1–t) f′′(ξ2)(x2 – s)²/2.
As f′′ ≥ 0,
t f(x1) + (1–t) f(x2) ≥ f(s) = f(t x1 + (1–t) x2).
Therefore, f is convex. (For instance, f(x) = x² has f′′ = 2 ≥ 0 and is convex.)
