
Lecture 1

Introduction. Error analyses

1 Preliminaries and outline


Some problems are difficult or even impossible to solve analytically, for example:

$x^3 + x + 1 = 0$

$2x + 3\cos x - e^x = 0$

$\frac{dy}{dx} = e^{x^2}, \qquad y(0) = 1$

$Ax = B$ where $A$ is a large matrix

$\int_0^{\pi} \sqrt{1 + \cos^2 x}\, dx$
The kinds of problems we solve here:

• Nonlinear equations $f(x) = 0$

• Systems of linear equations $Ax = B$; direct or iterative methods

• Interpolation and polynomial approximation

• Curve fitting

• Eigenvalue problems ($Ax = \lambda x$); SVD

• Numerical integration

• Numerical solution of ODEs

• Numerical solution of PDEs

Remark: Numerical results are always approximations, while analytical results are exact. A numerical result can be made as accurate as needed, but it is not exact!
Numerical methods require repetitive arithmetic operations: $+$, $-$, $\times$ and $\div$.

Numerical Disasters
The Patriot missile failure in Saudi Arabia (1991), which resulted in 28 deaths, was due to miscalculation (accumulated rounding errors).

Number Representation
Example 1

Decimal System: Base = 10, Digits (0, 1, ..., 9)

$312.45 = 3 \times 10^2 + 1 \times 10^1 + 2 \times 10^0 + 4 \times 10^{-1} + 5 \times 10^{-2}$

$\pm 312.45$: Sign - Integer part - Fractional part.

Normalized Floating Point Representation:

$\pm d.f_1 f_2 f_3 f_4 \times 10^{\pm n}$; Sign - Mantissa - Exponent ($d \neq 0$)

Example 2

Binary System: Base = 2, Digits (0, 1)

$\pm 1.f_1 f_2 f_3 f_4 \times 2^{\pm n}$; Sign - Mantissa - Signed Exponent

$(101.1001)_2 = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 + 1 \cdot 2^{-1} + 0 \cdot 2^{-2} + 0 \cdot 2^{-3} + 1 \cdot 2^{-4} = 4 + 1 + \frac{1}{2} + \frac{1}{16} = (5.5625)_{10}$

Floating-Point Standard

Single Precision (32-bit representation)

1-bit Sign + 8-bit Exponent + 23-bit Fraction

Double Precision (64-bit representation)

1-bit Sign + 11-bit Exponent + 52-bit Fraction
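
These layouts can be inspected directly by printing the stored bit patterns in hexadecimal (a quick check, assuming a MATLAB session):

>> format hex
>> x=1.0
x = 3ff0000000000000   (double: sign 0, 11 exponent bits 01111111111, 52 fraction bits all 0)
>> s=single(1.0)
s = 3f800000           (single: sign 0, 8 exponent bits 01111111, 23 fraction bits all 0)
>> format short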

Error in Numerical Procedures


Suppose x̂ is an approximation to x.

Definition The absolute error is given as $E_x = |x - \hat{x}|$.

Definition The relative error is given as $R_x = \frac{|x - \hat{x}|}{|x|}$ for $x \neq 0$.

Example 1
Given $x = 1.01594$ and $\hat{x} = 1.02$:
$E_x = 0.00406$ and $R_x = 0.0039963 \Rightarrow E_x \approx R_x$

Example 2
Given $y = 1000000$ and $\hat{y} = 999996$:
$E_y = 4$ and $R_y = 0.000004 \Rightarrow E_y \gg R_y$

Example 3
Given $z = 0.000012$ and $\hat{z} = 0.000009$:
$E_z = 0.000003$ and $R_z = 0.25 \Rightarrow E_z \ll R_z$
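
A minimal MATLAB check of the three examples above:

x = 1.01594; xhat = 1.02;
Ex = abs(x - xhat)                     % 0.00406
Rx = Ex/abs(x)                         % 0.0039963
y = 1000000; yhat = 999996;
Ey = abs(y - yhat), Ry = Ey/abs(y)     % 4 and 4.0e-06
z = 0.000012; zhat = 0.000009;
Ez = abs(z - zhat), Rz = Ez/abs(z)     % 3.0e-06 and 0.25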

Kinds of Errors in Numerical Procedures

• Truncation error is the error due to the use of an approximate expression in place of an exact expression.

• Round-off error is the error due to storage of a finite number of digits.

• Propagation error is the error in the output due to errors in the input data.

• Computational error is the error made during arithmetic operations.

• Loss of significance is caused by a bad subtraction, i.e. the subtraction of a number from another one that is almost equal in value.

Round-off Errors

Computers use only a fixed number of digits to represent a number. As a result, the numerical values stored in a computer are said to have finite precision. Because of this, round-off errors occur when arithmetic operations are performed in the machine, since they involve numbers with only a finite number of digits.
Example 1

$a = \frac{4}{3}$; $\quad b = a - 1 = \frac{1}{3}$; $\quad c = 3b = 1$; $\quad d = 1 - c = 0$.
3 3

Using 6 digits:
$fl_r(a) = 1.33333$
$fl_r(b) = 0.33333$
$fl_r(c) = 0.99999$
$fl_r(d) = 0.00001$

Experiment in Matlab

>> format long e
>> x=(4/3)*3
x=4

>> a=4/3
a=1.333...3e+00
>>b=a-1
b=3.333...3e-01
>>c=1-3*b
c=2.220446...e-16

Remark Round-off errors depend on hardware and computer language used.


Round-off arises from storing numbers of the form:

$p = \pm 0.d_1 d_2 \ldots d_k d_{k+1} \ldots \times 10^n$, where $n \in \mathbb{Z}$, $1 \leq d_1 \leq 9$ and $0 \leq d_j \leq 9$ for $j > 1$

$fl_{chop}(p) = \pm 0.d_1 d_2 \ldots d_k \times 10^n$

$fl_{round}(p) = \pm 0.d_1 d_2 \ldots d_{k-1} r_k \times 10^n$,

where $r_k$ is obtained by rounding the digit string $d_k d_{k+1} d_{k+2} \ldots$,
i.e. when $d_{k+1} < 5 \Rightarrow r_k = d_k$,
when $d_{k+1} \geq 5 \Rightarrow r_k = d_k + 1$.

Example 1
$p = \frac{22}{7} = 3.142857142857142857\ldots$

Using $k = 6$ digits:

$fl_{chop}(p) = 0.314285 \times 10^1$

$fl_{round}(p) = 0.314286 \times 10^1$

Example 2
π = 3.14159265...

Using 5-digits: π̂ = 3.1416

Using 6-digits: π̂ = 3.14159

Remark Rounding is generally preferred to chopping!
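
A small MATLAB sketch of chopping and rounding to $k$ decimal digits (the helper variables are illustrative, not built-in functionality):

k = 6;  p = 22/7;
n = floor(log10(abs(p))) + 1;              % exponent n with p = 0.d1d2... x 10^n
m = abs(p)/10^n;                           % normalized mantissa in [0.1, 1)
fl_chop  = sign(p)*fix(m*10^k)/10^k*10^n   % 3.14285: digits beyond the k-th discarded
fl_round = sign(p)*round(m*10^k)/10^k*10^n % 3.14286: k-th digit rounded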

Taylor’s Theorem

If $f(x) \in C^{n+1}[a, b]$, i.e. $f(x)$ has $n + 1$ continuous derivatives on $[a, b]$, then for any points $x$ and $x_0$ from $[a, b]$ we have

$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + \frac{1}{(n+1)!} f^{(n+1)}(\zeta)(x - x_0)^{n+1}$, where $\zeta$ is between $x$ and $x_0$.

Denote $P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k$ and $E_n(x) = \frac{1}{(n+1)!} f^{(n+1)}(\zeta)(x - x_0)^{n+1}$

$\Longrightarrow f(x) = P_n(x) + E_n(x)$

So $f(x) \approx P_n(x)$ if $\lim_{n \to \infty} E_n(x) = 0$.

Theorem If $|f^{(n+1)}(\zeta)| \leq M$ for any $\zeta$ between $x$ and $x_0$, then $|E_n(x)| \leq M \frac{|x - x_0|^{n+1}}{(n+1)!}$

Truncation errors

Example 1
Use the Taylor expansion of order 5 to approximate $\sin x$ about $x = 0$ and evaluate the truncation error.

$\sin(x) \approx x - \frac{x^3}{3!} + \frac{x^5}{5!} = P_5(x)$

$\sin(x) = P_5(x) + R_5(x)$, where $R_5(x) = \frac{-\sin(\zeta)}{6!} (x - 0)^6 \Rightarrow |R_5(x)| \leq \frac{|x|^6}{6!}$

For instance, for $|x| \leq 0.3$ we have $|R_5(x)| \leq \frac{0.3^6}{6!} \approx 10^{-6}$
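
A quick MATLAB check of this bound at $x = 0.3$ (the actual error is dominated by the first omitted term, $x^7/7!$):

x = 0.3;
P5 = x - x^3/factorial(3) + x^5/factorial(5);
actual = abs(sin(x) - P5)        % roughly 4.3e-08
bound  = x^6/factorial(6)        % roughly 1.0e-06, so the bound holds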

Example 2 (I)
Using the Taylor expansion for the function $f(x) = e^x$ about $x_0 = 0$ we get

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots = P_3(x) + \sum_{n=4}^{\infty} \frac{x^n}{n!}$

The approximation $e^x \approx P_3(x)$ produces truncation errors!
Question: Where should we cut the series expansion?

Remark Truncation error is under the control of the user. It can be reduced by selecting more accurate discrete approximations; however, it cannot be eliminated entirely!

Example 2 (II)
Establish the error bounds for the approximation $e^x \approx P_8(x)$ on the intervals

• a) $-1 \leq x \leq 1$

• b) $-\frac{1}{2} \leq x \leq \frac{1}{2}$

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots$

a) $|E_n(x)| \leq e^{\zeta} \frac{|x|^{n+1}}{(n+1)!}$ as $f^{(n+1)}(\zeta) = e^{\zeta} < e \Rightarrow |E_8(x)| \leq e \frac{1^9}{9!} \approx 0.749 \cdot 10^{-5}$

b) here $|x| \leq \frac{1}{2}$ and $e^{\zeta} < e^{1/2} \Rightarrow |E_8(x)| \leq e^{1/2} \frac{(1/2)^9}{9!} \approx 0.887 \cdot 10^{-8}$
Example 2 (III)
Determine the degree of the Taylor polynomial $P_N(x)$ expanded about $x_0 = 0$ that should be used to approximate $e^{0.1}$ so that the error is less than $10^{-6}$.

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots + \frac{x^N}{N!} + \ldots$

$|E_N(x)| = |e^{\zeta}| \frac{x^{N+1}}{(N+1)!} \leq \frac{e^{0.1} \cdot 0.1^{N+1}}{(N+1)!} = \frac{e^{0.1}}{10^{N+1}(N+1)!} \leq 10^{-6}$

For $N = 3 \Rightarrow |E_3(x)| \leq \frac{e^{0.1}}{10^4 \cdot 4!} = 4.6 \cdot 10^{-6}$

For $N = 4 \Rightarrow |E_4(x)| \leq \frac{e^{0.1}}{10^5 \cdot 5!} = 9.21 \cdot 10^{-8}$

so $e^{0.1} \approx 1 + \frac{0.1}{1!} + \frac{0.1^2}{2!} + \frac{0.1^3}{3!} + \frac{0.1^4}{4!}$
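
A quick MATLAB verification that the degree-4 polynomial meets the $10^{-6}$ target:

x = 0.1;
P4 = sum(x.^(0:4)./factorial(0:4))   % 1 + x + x^2/2! + x^3/3! + x^4/4!
err = abs(exp(x) - P4)               % about 8.5e-08, below the 1e-6 target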

Example 3
$\lim_{x \to 0} \frac{\ln(\cos x)}{x^2} = \lim_{x \to 0} \frac{-\frac{x^2}{2} - \frac{x^4}{12} + \ldots}{x^2} = \lim_{x \to 0} \left( -\frac{1}{2} - \frac{x^2}{12} + \ldots \right) = -\frac{1}{2}$
Example 4 Approximate $e^1$ using the above Taylor expansion of $f(x) = e^x$ at $x_0 = 0$.
$P_0(x) = 1$
$P_1(x) = 1 + x$
$P_2(x) = 1 + x + \frac{x^2}{2!}$
$P_3(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}$
...............................

$f(1) = e = 2.718281828459\ldots$

...............................

$P_0(1) = 1 \Rightarrow Err \approx 1.72$
$P_1(1) = 2 \Rightarrow Err \approx 0.72$
$P_2(1) = 2.5 \Rightarrow Err \approx 0.22$
$P_3(1) = 2.66666\ldots \Rightarrow Err \approx 0.052$
$P_4(1) = 2.708333\ldots \Rightarrow Err \approx 0.01$
$P_5(1) = 2.716666\ldots \Rightarrow Err \approx 0.002$
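
The whole table can be generated at once with cumulative sums (a minimal MATLAB sketch):

n = 0:5;
P = cumsum(1./factorial(n))    % P_n(1): 1, 2, 2.5, 2.6667, 2.7083, 2.7167
err = exp(1) - P               % 1.7183, 0.7183, 0.2183, 0.0516, 0.0099, 0.0016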

Example 5
Determine the degree of the Taylor polynomial $P_N(x)$ expanded about $x_0 = \pi$ that should be used to approximate $\cos \frac{33\pi}{32}$ so that the error is less than $10^{-6}$.

$\cos(x) = -1 + \frac{(x - \pi)^2}{2!} - \frac{(x - \pi)^4}{4!} + \frac{(x - \pi)^6}{6!} - \ldots$

$|E_N(x)| \leq \frac{|x - \pi|^{N+1}}{(N+1)!} = \frac{(\pi/32)^{N+1}}{(N+1)!} \approx \frac{1}{(N+1)! \cdot (10.1859)^{N+1}} \leq 10^{-6}$

For $N = 3 \Rightarrow |E_3(x)| \approx \frac{1}{4! \cdot (10.1859)^4} \approx 3.87 \cdot 10^{-6}$

For $N = 4 \Rightarrow |E_4(x)| \approx \frac{1}{5! \cdot (10.1859)^5} \approx 7.6 \cdot 10^{-8}$

so $\cos\left(\frac{33\pi}{32}\right) \approx -1 + \frac{(\pi/32)^2}{2!} - \frac{(\pi/32)^4}{4!}$
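
A quick MATLAB check of the degree-4 approximation:

u = pi/32;                                % u = x - x0 with x = 33*pi/32, x0 = pi
approx = -1 + u^2/factorial(2) - u^4/factorial(4)
err = abs(cos(33*pi/32) - approx)         % about 1.2e-09, well below the 1e-6 target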

Example 6
Approximate $\frac{1}{3} = f(3)$ using the Taylor expansion of the function $f(x) = \frac{1}{x}$ at $x_0 = 1$.

$f'(x) = -\frac{1}{x^2}$, $f''(x) = \frac{2}{x^3}$, ..., $f^{(k)}(x) = (-1)^k k! \, x^{-k-1}$, so $f^{(k)}(1) = (-1)^k k!$;

$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(1)}{k!} (x - 1)^k \Rightarrow P_n(x) = \sum_{k=0}^{n} (-1)^k (x - 1)^k$

$P_0(3) = 1$
$P_1(3) = -1$
$P_2(3) = 3$
$P_3(3) = -5$
$P_4(3) = 11$
$P_5(3) = -21$
$P_6(3) = 43$
$P_7(3) = -85$

Taylor fails! The series converges only for $|x - 1| < 1$, and here $|3 - 1| = 2 > 1$.
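
The divergence is easy to see numerically: each $P_n(3)$ is a partial sum of a geometric series with ratio $-2$ (a minimal MATLAB sketch):

n = 0:7;
P = cumsum((-2).^n)    % 1 -1 3 -5 11 -21 43 -85: the partial sums oscillate and grow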
Example 7

$\int_0^{1/2} e^{x^2}\, dx = 0.544987104184\ldots = p$

$e^{x^2} = 1 + \frac{x^2}{1!} + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!} + \sum_{n=5}^{\infty} \frac{x^{2n}}{n!}$

$\int_0^{1/2} e^{x^2}\, dx \approx \int_0^{1/2} P_8(x)\, dx = \int_0^{1/2} \left( 1 + \frac{x^2}{1!} + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!} \right) dx = 0.544986720817 = \hat{p}$

$E_p = |p - \hat{p}| = 0.000000383367\ldots$
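
A sketch of this computation in MATLAB, assuming a version that provides the integral quadrature routine:

p_hat = 0.5 + 0.5^3/3 + 0.5^5/(5*factorial(2)) ...
      + 0.5^7/(7*factorial(3)) + 0.5^9/(9*factorial(4))   % termwise integral of P8
p = integral(@(x) exp(x.^2), 0, 0.5)    % adaptive quadrature: 0.544987104184
Ep = abs(p - p_hat)                     % about 3.83e-07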

Propagation of Error

Stable numerical methods: errors made at early steps die out as the method continues.

Unstable numerical methods: errors grow as the method continues.

Assume $\hat{p}$ is an approximation of $p$ and $\hat{q}$ is an approximation of $q$; then

$p = \hat{p} + \epsilon_p$, $\quad q = \hat{q} + \epsilon_q$

$p + q = (\hat{p} + \hat{q}) + (\epsilon_p + \epsilon_q)$,

$p \cdot q = (\hat{p} + \epsilon_p)(\hat{q} + \epsilon_q) = \hat{p}\hat{q} + \epsilon_q \hat{p} + \epsilon_p \hat{q} + \epsilon_p \epsilon_q$.

If $\hat{p}$ and $\hat{q}$ are large numbers, then the terms $\epsilon_q \hat{p}$ and $\epsilon_p \hat{q}$ are much larger than the original errors $\epsilon_p$ and $\epsilon_q$.

$R_{p \cdot q} = \frac{pq - \hat{p}\hat{q}}{pq} = \frac{\hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p \epsilon_q}{pq}$.

Suppose that $\hat{p} \approx p$ and $\hat{q} \approx q$:

$R_p R_q = \frac{\epsilon_p}{p} \frac{\epsilon_q}{q} \approx 0 \Rightarrow R_{p \cdot q} \approx \frac{\epsilon_p}{p} + \frac{\epsilon_q}{q} = R_p + R_q$,

so the relative error in the product $p \cdot q$ is approximately the sum of the relative errors in the approximations $\hat{p}$ and $\hat{q}$.
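
A small numerical illustration in MATLAB (the chosen approximations of $\pi$ and $e$ are illustrative):

p = pi;     phat = 3.14159;   % Rp ~ 8.4e-07
q = exp(1); qhat = 2.71828;   % Rq ~ 6.7e-07
Rpq = abs(p*q - phat*qhat)/abs(p*q)                     % ~ 1.5e-06
RpPlusRq = abs(p - phat)/abs(p) + abs(q - qhat)/abs(q)  % ~ 1.5e-06, nearly equal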

Loss of Significance

A loss of significance can be incurred if two nearly equal quantities are subtracted from one another. For example, let
$a = 0.123456789123456789$ and
$b = 0.123456789000000000$ be two given numbers. A floating-point representation of these numbers on a machine that keeps 10 significant digits would be
$a = 0.1234567891$ and
$b = 0.1234567890$.
The true difference is $a - b = 0.000000000123456789$,
while the machine will show $a - b = 0.0000000001 \Rightarrow$ only 1 significant digit.
This loss is called subtractive cancellation, and can often be avoided by rewriting the expression.
Errors can also occur when two quantities of radically different magnitudes are summed. For example, with
$a = 0.1234$ and $b = 5.6789 \times 10^{-20}$,
$a + b = 0.1234 + 5.6789 \times 10^{-20}$ might be rounded to $0.1234$ by a system that keeps only 16 significant digits. This may lead to unexpected results.
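
Both effects are easy to demonstrate in double precision (a minimal MATLAB sketch):

a = 0.123456789123456789;
b = 0.123456789000000000;
a - b                           % ~1.2345679e-10: roughly half the 16 digits are lost
0.1234 + 5.6789e-20 == 0.1234   % true: the small term is lost entirely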
The usual strategies for rewriting subtractive expressions are completing the square, factoring, or using Taylor expansions, as the following examples illustrate.

Example 1
Using 6 digits and rounding, evaluate $f(x) = x(\sqrt{x+1} - \sqrt{x})$ and the rewritten form $g(x) = \frac{x}{\sqrt{x+1} + \sqrt{x}}$ for $x = 500$.

$f(500) = 500(\sqrt{501} - \sqrt{500}) = 500(22.3830 - 22.3607) = 500 \cdot 0.0223 = 11.1500$

$g(500) = \frac{500}{\sqrt{501} + \sqrt{500}} = \frac{500}{22.3830 + 22.3607} = \frac{500}{44.7437} = 11.1748$

g(500) is the correct answer.
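
In double precision the same cancellation appears once $x$ is large enough (a minimal MATLAB sketch):

x = 1e12;
f = x*(sqrt(x+1) - sqrt(x))   % direct form: accurate to only a few digits
g = x/(sqrt(x+1) + sqrt(x))   % rewritten form: ~4.9999999999999e+05, full precision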

Example 2
Using 3 digits, evaluate
$P(x) = x^3 - 3x^2 + 3x - 1$ and
$Q(x) = ((x - 3)x + 3)x - 1$ ('nested structure') at $x = 2.19$.

$P(2.19) = 2.19^3 - 3 \cdot 2.19^2 + 3 \cdot 2.19 - 1 = 2.19 \cdot 4.80 - 3 \cdot 4.80 + 6.57 - 1 = 10.5 - 14.4 + 5.57 = 1.67$

$Q(2.19) = ((2.19 - 3) \cdot 2.19 + 3) \cdot 2.19 - 1 = (-0.81 \cdot 2.19 + 3) \cdot 2.19 - 1 = (-1.77 + 3) \cdot 2.19 - 1 = 1.23 \cdot 2.19 - 1 = 2.69 - 1 = 1.69$

Exact solution: $1.685159\ldots$

$E_P = 0.015159$

$E_Q = 0.004841$. Note: The nested structure requires fewer operations and has a smaller error!
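
In MATLAB, polyval evaluates a polynomial from its coefficient vector in nested form. A sketch (in full double precision both arrangements agree to many digits, so the 3-digit effect above is not visible; the point here is the equivalence of the two forms):

c = [1 -3 3 -1];                 % coefficients of P, highest degree first
x = 2.19;
p_naive  = x^3 - 3*x^2 + 3*x - 1 % term-by-term evaluation
p_horner = polyval(c, x)         % nested (Horner) evaluation
p_exact  = (x - 1)^3             % P(x) = (x-1)^3 exactly: 1.685159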

Tips for avoiding large errors: in order to decrease the magnitude of round-off errors and to lower the possibility of overflow or underflow errors, keep intermediate results moderate in size in consecutive multiplication or division processes; for example, $\frac{xy}{z}$ may be computed as:

• $\frac{(xy)}{z}$ when $x$ and $y$ in the multiplication are very different in magnitude,

• $x\left(\frac{y}{z}\right)$ when $y$ and $z$ in the division are close in magnitude,

• $y\left(\frac{x}{z}\right)$ when $x$ and $z$ in the division are close in magnitude.
z
Example in MATLAB

x = 36; y = 1e16

$\frac{y^{-20}}{e^{-20x}} = 0.000e{+}000$ but $\left(\frac{y}{e^x}\right)^{-20} = 0.49206e{-}08$

$\frac{y^{20}}{e^{20x}} = \mathrm{NaN}$ but $\left(\frac{y}{e^x}\right)^{20} = 2.0322e{+}07$

Eps number

Eps number: the smallest machine value that can be added to 1.0 to give a result distinguishable from 1.0.

>>eps

eps = 2.2204e − 16

>>x=1.0

x = 1.0

>>x+eps

ans = 1.0

(the default short display shows 1.0, but x + eps > 1 is true: the sum is distinguishable from 1.0)

Round-off Error vs Truncation Error

Example (Error in finite difference approximation)

Analytically, $\frac{df}{dx} = f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$.

$f'(x) \approx \frac{f(x+h) - f(x)}{h} \quad (1)$

When $h$ gets small the truncation error reduces, but at some point the round-off error will dominate!

$E_{total} = E_{truncation}(h, f'') + E_{round}(x, h) \quad (2)$

$E_{truncation}(h, f'') \approx \frac{h}{2} f''_{max}$ and $E_{round}(x, h) = \frac{e(x+h) - e(x)}{h} \leq \frac{2\epsilon}{h} \quad (3)$

The total error is minimized when $h \approx 2\sqrt{\frac{\epsilon}{M}}$, where $M$ bounds $|f''(\zeta)|$ for $\zeta$ near $x$.

The error increases for smaller $h$ because of rounding error and increases for larger $h$ because of truncation error; see Fig. 1.
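
This behaviour can be reproduced with a few lines of MATLAB (a sketch; $f(x) = e^x$ at $x = 1$ is chosen so that $f'$ and $M \approx e$ are known exactly):

f = @exp; x = 1; d_exact = exp(1);         % f'(x) = e^x is known here
h = 10.^(-16:0);                           % step sizes from 1e-16 to 1
err = abs((f(x+h) - f(x))./h - d_exact);   % total error of approximation (1)
loglog(h, err), xlabel('h-value'), ylabel('error')   % reproduces the V-shape of Fig. 1
2*sqrt(eps/exp(1))                         % predicted optimal h, about 1.8e-08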

[Figure: total error, truncation error, and round-off error plotted against the h-value on a log-log scale.]
Figure 1: Example: Finite Difference Approximation.

Well Posed and Well Conditioned Problems


Definition A problem is well posed if a solution exists, is unique, and depends continuously on the data (e.g. initial conditions).

Problems that are not well-posed are termed ill-posed. Inverse problems are often ill-
posed. For example, the inverse heat equation, deducing a previous distribution of tem-
perature from final data, is not well-posed in that the solution is highly sensitive to changes
in the final data.

Definition A well conditioned problem is not sensitive to changes in the values of the
parameters (small changes in the input do not cause large changes in the output).

If the problem is well conditioned, the model still gives useful results in spite of small
inaccuracies in the parameters.
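
A classical illustration of ill-conditioning with a nearly singular $2 \times 2$ system (a minimal MATLAB sketch):

A  = [1 1; 1 1.0001];     % nearly singular matrix
x1 = A\[2; 2.0001]        % solution [1; 1]
x2 = A\[2; 2.0002]        % a tiny change in the data gives [0; 2]
cond(A)                   % condition number ~4e+04 quantifies this sensitivity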

Convergence in Iterative Sequences


Iteration is a common component of numerical algorithms. In the most abstract form, an iteration generates a sequence of scalar values $x_1, x_2, x_3, \ldots$

Definition The sequence converges to a limit $L$ if $|x_k - L| < \delta$ for all $k > N$, where $\delta$ is a small number called the convergence tolerance.
We say the sequence has converged to within the tolerance $\delta$ after $N$ iterations.
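
A minimal MATLAB sketch of this definition, using the Newton iteration $x_{k+1} = (x_k + 2/x_k)/2$, which converges to $L = \sqrt{2}$:

L = sqrt(2); delta = 1e-12;    % known limit and convergence tolerance
x = 1; k = 0;
while abs(x - L) >= delta
    x = (x + 2/x)/2;           % Newton update for f(x) = x^2 - 2
    k = k + 1;
end
fprintf('converged to within %g after %d iterations\n', delta, k)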

Big O and small o notation


In discussing the rate of convergence of numerical methods we will use the notation $O(h^p)$, the so-called big O notation, where $p$ is a real number.

Definition A function $f(h)$ is said to be big O of $g(h)$, denoted $f(h) = O(g(h))$, if there exist constants $C$ and $c$ such that

$|f(h)| \leq C|g(h)|$, whenever $h \leq c \quad (4)$

Example 1
Given $f(x) = x^2 + 1$ and $g(x) = x^3$: since
$x^2 \leq x^3$ and $1 \leq x^3$ for $x \geq 1$, we have
$x^2 + 1 \leq 2x^3$ for $x \geq 1$, so
$f(x) = O(g(x))$, i.e. $f(x) = O(x^3)$.

Example 2
$2h^3 = O(h^2)$ as $h \to 0$, since $\left| \frac{2h^3}{h^2} \right| = |2h| \leq 1$ for all $h \leq \frac{1}{2}$.

Example 3
$\sin(h) = O(h)$ as $h \to 0$, since

$\sin(h) = h - \frac{h^3}{3!} + \frac{h^5}{5!} - \ldots < h$, for all $h > 0$.
3! 5!
Definition Let $\{x_n\}_{n=1}^{\infty}$ and $\{y_n\}_{n=1}^{\infty}$ be two sequences. The sequence $\{x_n\}$ is said to be of big order $O$ of $\{y_n\}$, denoted $x_n = O(y_n)$, if there exist constants $C$ and $N$ such that $|x_n| \leq C|y_n|$ for $n \geq N$.

Example 4
$\frac{n^2 - 1}{n^3} = O\left(\frac{1}{n}\right)$, since $\frac{n^2 - 1}{n^3} \leq \frac{n^2}{n^3} = \frac{1}{n}$, for $n \geq 1$.
n n n n n

Often a function $f(h)$ is approximated by $p(h)$ and the error bound is known to be $M|h^n|$.

Definition Assume $f(h)$ is approximated by $p(h)$ and that there exist a real constant $M > 0$ and a positive integer $n$ such that $\frac{|f(h) - p(h)|}{|h^n|} \leq M$ for sufficiently small $h$. We say that $p(h)$ approximates $f(h)$ with order of approximation $O(h^n)$ and write $f(h) = p(h) + O(h^n)$.

Definition $f(h) = o(g(h))$ as $h \to 0$ if $\frac{f(h)}{g(h)} \to 0$ as $h \to 0$.

This is slightly stronger than big O, as it means that $f(h)$ decays to 0 faster than $g(h)$.

Example 1
$2h^3 = o(h^2)$ as $h \to 0$, since $\left| \frac{2h^3}{h^2} \right| = |2h| \to 0$ as $h \to 0$.

Example 2
$\sin(h) = h + o(h)$ as $h \to 0$, since

$\frac{\sin(h) - h}{h} = -\frac{h^2}{3!} + \frac{h^4}{5!} - \ldots \to 0$ as $h \to 0$.

Theorem Assume that

$f(h) = p(h) + O(h^n)$ and $g(h) = q(h) + O(h^m)$ and $r = \min\{n, m\}$.

Then $f(h) + g(h) = p(h) + q(h) + O(h^r)$,

$f(h) \cdot g(h) = p(h) \cdot q(h) + O(h^r)$,

$\frac{f(h)}{g(h)} = \frac{p(h)}{q(h)} + O(h^r)$, provided $g(h) \neq 0$ and $q(h) \neq 0$.

Example 1

$e^h = 1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4)$

$\cos(h) = 1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6)$

$e^h + \cos(h) = 2 + h + \frac{h^3}{3!} + O(h^4) + \frac{h^4}{4!} + O(h^6) = 2 + h + \frac{h^3}{3!} + O(h^4)$

$e^h \cdot \cos(h) = \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4)\right)\left(1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6)\right)$
$= \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!}\right)\left(1 - \frac{h^2}{2!} + \frac{h^4}{4!}\right) + \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!}\right)O(h^6) + \left(1 - \frac{h^2}{2!} + \frac{h^4}{4!}\right)O(h^4) + O(h^4)O(h^6)$
$= 1 + h - \frac{h^3}{3} - \frac{5h^4}{24} - \frac{h^5}{24} + \frac{h^6}{48} + \frac{h^7}{144} + O(h^6) + O(h^4) + O(h^{10}) = 1 + h - \frac{h^3}{3} + O(h^4)$

Order of Convergence of a Sequence

Definition Suppose $\lim_{n \to \infty} x_n = L$ and $\{r_n\}_{n=1}^{\infty}$ is a sequence with $\lim_{n \to \infty} r_n = 0$. We say that $\{x_n\}_{n=1}^{\infty}$ converges to $L$ with the order of convergence $O(r_n)$ if there exists a constant $K > 0$ such that $\frac{|x_n - L|}{|r_n|} \leq K$ for $n$ sufficiently large. We write $x_n = L + O(r_n)$ or $x_n \to L$ with order $O(r_n)$.

Example
$x_n = \frac{\cos n}{n^2}$, $r_n = \frac{1}{n^2}$. We know that $\lim_{n \to \infty} x_n = 0$, and

$\frac{\left| \frac{\cos n}{n^2} \right|}{\left| \frac{1}{n^2} \right|} = |\cos n| \leq 1$ for all $n$,

so $x_n = 0 + O\left(\frac{1}{n^2}\right)$.
