
Rounding & truncation errors

Data Analysis Weeks 7-12


Numerical Methods

Engineering Mathematics

Computational errors – p.1/14


What is the point of Numerical Analysis?
The range of problems that can be solved
exactly is extremely limited:

$x^2 + 3x + 2 = 0$
$x^2 + 3\sin(x) + 2 = 0$



What is the point of Numerical Analysis?

ẋ = x + 1
ẋ = sin (x) + 1
We need:
good approximate solutions
estimates of how good or how bad those solutions are



Floating-point representation
How does the computer represent
π = 3.1415926535897 . . . ?
decimal places: digits after the decimal point
significant figures: all digits, counted from the first nonzero digit

Decimal floating-point system:

$\pi \approx +3.1416 \times 10^{0}$

sign: +   mantissa: 3.1416   exponent: 0

Base is 10, so the digit d before the decimal point satisfies 1 ≤ d < 10 (here d = 3)



Binary floating-point system
Base is 2, so the digit p before the binary point satisfies 1 ≤ p < 2
It is always 1!
$\pi \approx +1.10010010000111111_2 \times 2^{1}$
Position n after the binary point corresponds to $2^{-n} = \frac{1}{2^n}$
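
The positional rule can be checked numerically. The following is a minimal sketch in Python (an assumption of this example; the slides themselves use MATLAB) that rebuilds the value of the binary expansion quoted above, digit by digit, and compares it with π. The variable names are illustrative only.

```python
import math

# Binary digits after the point in the mantissa quoted above: 1.10010010000111111 (base 2)
digits_after_point = "10010010000111111"

# Leading 1 before the binary point; digit number n contributes digit * 2**(-n)
mantissa = 1.0 + sum(int(d) * 2.0 ** (-n)
                     for n, d in enumerate(digits_after_point, start=1))

approx_pi = mantissa * 2 ** 1        # the exponent is 1

print(approx_pi)                     # close to pi
print(abs(math.pi - approx_pi))      # small, because only 17 binary places were kept
```

Keeping more binary digits shrinks the difference; a double keeps 52 digits after the binary point.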



Computer number storage
            single (32 bits)    double (64 bits)
sign               1                   1
exponent           8                  11
mantissa          24                  53
(the mantissa counts include the implicit leading 1; 23 and 52 bits are actually stored)
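
To make the storage layout concrete, here is a hedged Python sketch (Python is an assumption of this example, not part of the slides) that splits a double into its sign, exponent and stored mantissa bits. It assumes Python floats are IEEE 754 doubles, which holds on common platforms; `double_fields` is a hypothetical helper name.

```python
import math
import struct

def double_fields(x):
    """Split a float (assumed IEEE 754 double) into its three bit fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)    # 52 stored bits; the leading 1 is implicit
    return sign, exponent, fraction

sign, exponent, fraction = double_fields(math.pi)

# Sign bit, unbiased exponent, and the 52 stored mantissa bits as a binary string
print(sign, exponent - 1023, format(fraction, "052b"))
```

For π the stored bits begin with the same digits as the binary expansion on the previous slide, and the unbiased exponent is 1.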



Rounding error
There is a smallest and largest number
in MATLAB!

Smallest positive (normalized) number: realmin; magnitudes far below it are rounded to 0


Largest number: realmax; results larger in magnitude overflow to ±Inf = ±∞
Machine unit: eps = $2^{-52}$

NaN means Not a Number
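
The same limits can be inspected outside MATLAB. The sketch below uses Python's `sys.float_info` (Python is an assumption of this example); its `epsilon`, `min` and `max` fields play the roles of `eps`, `realmin` and `realmax` for double precision.

```python
import math
import sys

eps = sys.float_info.epsilon    # counterpart of MATLAB's eps
tiny = sys.float_info.min       # counterpart of realmin (smallest normalized positive double)
huge = sys.float_info.max       # counterpart of realmax

print(eps == 2.0 ** -52)        # True
print(1.0 + eps / 2 == 1.0)     # True: the increment is lost to rounding
print(huge * 2)                 # inf (overflow)
print(tiny / 2.0 ** 60)         # 0.0 (underflow)
print(math.inf - math.inf)      # nan
```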



Rounding error — example
>> w = [4; sqrt(5); 6];

>> M = [-3 0 1; 2 5 -7; -1 4 8];

>> x = M\w
x =
   -1.3138
    1.0546
    0.0585

Is M*x indeed equal to w?
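
One way to explore this question is to compute the residual M*x - w. The sketch below does so in Python (an assumption of this example; in MATLAB the residual is simply `M*x - w`). The `solve` function is a hypothetical helper that uses Gaussian elimination with partial pivoting, which is not necessarily the method MATLAB's backslash applies.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (illustrative helper)."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

M = [[-3, 0, 1], [2, 5, -7], [-1, 4, 8]]
w = [4, math.sqrt(5), 6]

x = solve(M, w)
residual = [sum(M[i][j] * x[j] for j in range(3)) - w[i] for i in range(3)]

print(x)          # agrees with the MATLAB output to the digits shown
print(residual)   # entries are at rounding-error level, not necessarily exactly zero
```

The residual is typically of the order of eps times the size of the entries, so M*x agrees with w only up to rounding error.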



Infinite algorithms
M*x = w can be solved explicitly:
this is a finite algorithm

Evaluating $\sum_{n=0}^{\infty} \frac{1}{n!}$ term by term takes forever! This is an infinite algorithm

Infinite algorithms necessarily introduce a truncation error



Truncation errors
Recall that

$\sum_{n=0}^{\infty} \frac{1}{n!} = e^{1} = e.$
Truncate such that the finite sum equals e to three
decimal places.

Equivalently: the tail must be less than 0.0005



Bound the tail
Suppose the tail starts at n = k + 1

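$\sum_{n=k+1}^{\infty} \frac{1}{n!} = \frac{1}{(k+1)!} + \frac{1}{(k+2)!} + \dots$
$< \frac{1}{(k+1)!} \left( 1 + \frac{1}{k+1} + \frac{1}{(k+1)^2} + \dots \right)$
$= \frac{1}{(k+1)!} \cdot \frac{k+1}{(k+1) - 1} = \frac{1}{k \cdot k!}$
Now choose k such that $\frac{1}{k \cdot k!} < 5 \times 10^{-4}$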
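
The bound can be sanity-checked numerically. This Python sketch (Python is an assumption of this example) compares the actual tail, computed as e minus the partial sum, with the bound 1/(k·k!) for small k. The function names are illustrative, and the tail is only accurate up to rounding error, which is negligible for these k.

```python
import math

def partial_sum(k):
    """Sum of 1/n! for n = 0..k."""
    return sum(1.0 / math.factorial(n) for n in range(k + 1))

def tail_bound(k):
    """Upper bound 1/(k*k!) on the tail that starts at n = k + 1."""
    return 1.0 / (k * math.factorial(k))

for k in range(1, 9):
    tail = math.e - partial_sum(k)    # actual tail, up to rounding error
    print(k, tail, tail_bound(k), tail < tail_bound(k))
```

For each k printed, the actual tail stays below the bound, and the two get closer in relative terms as k grows.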



Truncate the sum
k    1/(k·k!)                      truncated sum
1    1/1    = 1                    2
2    1/4    = 0.25                 2.5
3    1/18   = 0.0556               2.6667
4    1/96   = 0.0104               2.7083
5    1/600  = 1.6667 × 10^-3       2.7167
6    1/4320 = 2.3148 × 10^-4       2.7181
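
The search through the table can be automated. The following Python sketch (Python is an assumption of this example; `smallest_k` is a hypothetical helper) increases k until the bound 1/(k·k!) drops below a given tolerance and returns that k together with the truncated sum.

```python
import math

def smallest_k(tolerance):
    """Smallest k with 1/(k*k!) < tolerance, plus the sum of 1/n! for n = 0..k."""
    k, term, total = 0, 1.0, 1.0      # term = 1/0!, total = sum up to n = 0
    while True:
        k += 1
        term /= k                     # term is now 1/k!
        total += term
        if term / k < tolerance:      # term / k equals 1/(k*k!)
            return k, total

k, approx_e = smallest_k(5e-4)
print(k, round(approx_e, 4))          # 6 2.7181
print(abs(math.e - approx_e))         # below the tolerance 5e-4
```

With the tolerance 0.0005 used in these slides, the loop stops at k = 6, the last row of the table.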
Absolute and relative errors
Absolute error: magnitude of the true value minus the estimated value
$|x - \hat{x}|$, for example $\left| e - \sum_{n=0}^{6} \frac{1}{n!} \right| < 2.3148 \times 10^{-4}$

Absolute error: agreement in decimal places

Relative error: difference relative to real value


$\frac{|x - \hat{x}|}{|x|}$
Relative error: agreement in significant figures
Absolute versus relative error
MATLAB uses e = 2.71828182845905

Estimate of e is ê = 2.718
$e - \hat{e} = 2.818 \times 10^{-4}$, $\frac{e - \hat{e}}{e} = 1.037 \times 10^{-4}$

Estimate of E := 100e is Ê = 100ê = 271.8

$E - \hat{E} = 2.818 \times 10^{-2}$, $\frac{E - \hat{E}}{E} = 1.037 \times 10^{-4}$
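
The comparison can be reproduced in a few lines. This Python sketch (Python is an assumption of this example; the two function names are illustrative) computes both kinds of error for e and for E = 100e, showing that scaling changes the absolute error but not the relative error.

```python
import math

def absolute_error(true_value, estimate):
    """Magnitude of the difference between the true value and the estimate."""
    return abs(true_value - estimate)

def relative_error(true_value, estimate):
    """Absolute error divided by the magnitude of the true value."""
    return abs(true_value - estimate) / abs(true_value)

e, e_hat = math.e, 2.718
E, E_hat = 100 * e, 100 * e_hat

print(absolute_error(e, e_hat), relative_error(e, e_hat))   # about 2.818e-04 and 1.037e-04
print(absolute_error(E, E_hat), relative_error(E, E_hat))   # about 2.818e-02 and 1.037e-04
```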

