
NUMERICAL ANALYSIS

Lecture 2: Floating point arithmetic and Error Analysis

Rashidah Kasauli Namisanvu

Department of Networks
College of Computing and Information Sciences
Makerere University
rashidah.namisanvu@mak.ac.ug



Overview

1 Number Representation in Computers

2 Introduction to Errors

3 Error Sources

4 Additions & Subtractions

5 Multiplication & Division

6 Using Binomial Theorem



Numbers in a computer

Humans use the decimal number system (base 10) in daily life.

Most computers use the binary number system (base 2).
We enter numbers in decimal and the computer converts them to binary
using machine language.
(Scientific Notation). Let k be a real number. Then k can be written
in the following form:
k = m × 10^n
where m (the mantissa) is a real number (in normalised form, 1 ≤ |m| < 10)
and the exponent n is an integer. This notation is called scientific notation
or scientific form, and is sometimes referred to as standard form.
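
As a quick illustration of both points above (a minimal Python sketch, not part of the original slides): math.frexp exposes the machine's own mantissa-and-exponent form m × 2^n, and converting decimal input to binary can introduce a small representation error.

```python
import math
from decimal import Decimal

# Machine analogue of k = m x 10^n: frexp returns the binary mantissa and exponent.
print(math.frexp(6.5))    # (0.8125, 3), because 6.5 = 0.8125 * 2**3

# Decimal input is stored in binary, so some decimal values are only approximated:
print(Decimal(0.1))       # 0.1000000000000000055511151231257827021181583404541015625
print(0.1 + 0.2 == 0.3)   # False: both operands carry a tiny representation error
```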



Numbers in a computer: Examples

1 0.00000834 = 8.34 × 10^-6
2 25.45879 = 2.545879 × 10^1
3 3400000 = 3.4 × 10^6
4 33 = 3.3 × 10^1
5 2,300,000,000 = 2.3 × 10^9
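
These can be checked with Python's exponent format (a small sketch, not from the slides):

```python
for value in (0.00000834, 25.45879, 3400000, 33, 2300000000):
    print(format(value, ".6e"))   # 8.340000e-06, 2.545879e+01, 3.400000e+06, 3.300000e+01, 2.300000e+09
```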



Errors

Errors are unavoidable in the field of scientific computing.


Numerical analysts investigate the best ways to minimise this error.
The study of error, and of how to estimate and minimise it, is a
fundamental issue in error analysis.



Errors

Not all mathematical problems can be solved analytically.

2x + 5 = 8 is easy to solve
2√x + 5 = 8 is not as easy
Hence we use numerical rather than analytical approaches.
In numerical analysis, we approximate the exact solution of the
problem by using numerical methods, which introduces an error.
The numerical error is the difference between the exact solution and
the approximate solution.
(Numerical Error). Let x be the exact solution of the underlying
problem and x* its approximate solution. Then the error (denoted by
e) in solving this problem is

e = x − x*
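
To make this concrete, here is a minimal bisection sketch (illustrative only, not from the slides) that approximates the solution of 2√x + 5 = 8; the exact solution is x = 2.25, so the numerical error e = x − x* can be computed directly.

```python
import math

def f(x):
    return 2 * math.sqrt(x) + 5 - 8   # a root of f solves 2*sqrt(x) + 5 = 8

def bisect(f, a, b, tol=1e-6):
    """Bisection: repeatedly halve an interval [a, b] on which f changes sign."""
    while b - a > tol:
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

x_exact = 2.25                         # analytically: sqrt(x) = 3/2, so x = 9/4
x_approx = bisect(f, 0, 4)
print(x_approx, x_exact - x_approx)    # approximate solution and the error e = x - x*
```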



Errors

The term error represents the imprecision and inaccuracy of a
numerical computation.
Measurements and calculations can be characterised with regard to
their accuracy and precision.
Errors are unavoidable but need to be tracked.
As computations become more complex, so do the errors.
Hence we study how they propagate.
Accuracy
How close a value is to the true value
Precision
How close the values are to each other



Rounding and Truncation Errors

Rounding
Occurs due to a computing device's inability to deal with inexact
numbers, e.g. π = 3.14159265 · · · can be rounded to four significant
digits as π ≈ 3.142.
Maximum error in rounding is 0.5 × 10^-n, where n is the number of
rounded places.
Truncating
Occurs because some series (finite or infinite) is truncated/chopped to
a smaller number of terms, e.g. π = 3.14159265 · · · can be truncated
to four significant figures as π ≈ 3.141.
Maximum error in truncation is 1 × 10^-n.
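
A small Python check of the two bounds (illustrative, not from the slides), rounding and chopping π after three decimal places:

```python
import math

pi = math.pi                                  # 3.14159265...
rounded   = round(pi, 3)                      # 3.142 (rounded)
truncated = math.floor(pi * 10**3) / 10**3    # 3.141 (truncated/chopped)
print(abs(pi - rounded))     # 0.000407... <= 0.5 x 10^-3
print(abs(pi - truncated))   # 0.000592... <= 1.0 x 10^-3
```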



Error Propagation

Calculating the uncertainty or error of an approximation against the
actual value it is trying to approximate. It is represented as an
absolute error, showing how far off the approximation is, or as a
relative error, expressed as a proportion.
Absolute Error
δx = |x − x′|
Relative Error
RE = δx / x = |(x − x′) / x|
Percentage Error
PE = RE × 100 = |(x − x′) / x| × 100
Which is better?
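
In Python these three quantities are one-liners (a minimal sketch; the values of x and its approximation are arbitrary):

```python
x, x_approx = 3.141592653589793, 3.14   # true value and an approximation

abs_err = abs(x - x_approx)             # absolute error   δx = |x - x'|
rel_err = abs_err / abs(x)              # relative error   RE = δx / x
pct_err = rel_err * 100                 # percentage error PE = RE x 100
print(abs_err, rel_err, pct_err)        # ≈ 0.00159, 0.000507, 0.0507 %
```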



Errors in Addition & Subtraction

Theorem (Errors in Addition)


For variables x1 ± δx1 and x2 ± δx2, where δxi is the maximum possible
error, the maximum possible error when they are added is (δx1 + δx2).

Proof.
Since δx1 and δx2 are ±, the possible combinations on addition are
(δx1 + δx2), (δx1 − δx2), (−δx1 + δx2) and (−δx1 − δx2). The extremes
are the first and last. Hence the error of the sum lies in ±(δx1 + δx2).

Theorem (Errors in Subtraction)

For variables x1 ± δx1 and x2 ± δx2, where δxi is the maximum possible
error, the maximum possible error when they are subtracted is (δx1 + δx2).

Proof.
Same as addition.
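
A quick numerical check of both theorems (an illustrative sketch; the values and their errors are made up): enumerating every ± combination confirms that the worst case for the sum and for the difference is δx1 + δx2.

```python
from itertools import product

x1, dx1 = 2.30, 0.05     # made-up values with their maximum errors
x2, dx2 = 4.65, 0.005

worst_sum  = max(abs((x1 + s1*dx1) + (x2 + s2*dx2) - (x1 + x2))
                 for s1, s2 in product((-1, 1), repeat=2))
worst_diff = max(abs((x1 + s1*dx1) - (x2 + s2*dx2) - (x1 - x2))
                 for s1, s2 in product((-1, 1), repeat=2))
print(worst_sum, worst_diff, dx1 + dx2)   # all three ≈ 0.055 (up to floating point noise)
```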
Errors in Multiplication & Division

Theorem (Errors in Multiplication and Division)


The relative errors in division and multiplication are the same.

Proof.
Consider x1 ± δx1 and x2 ± δx2 . For multiplication, the absolute error is
(x1 + δx1 )(x2 + δx2 ) − x1 x2 = x1 δx2 + x2 δx1 (Note: δx1 δx2 tends to zero).
The relative error is (x1 δx2 + x2 δx1) / (x1 x2) = δx1/x1 + δx2/x2.
For division, the absolute error is
(x1 + δx1)/(x2 + δx2) − x1/x2 ≡ (x1 + δx1)/(x2 + δx2) × (x2 − δx2)/(x2 − δx2) − x1/x2,
which, neglecting second-order terms, simplifies to (x2 δx1 − x1 δx2) / x2^2.
The relative error is [(x2 δx1 − x1 δx2) / x2^2] / (x1/x2), which simplifies to
δx1/x1 − δx2/x2 ≡ δx1/x1 + δx2/x2 since the errors are ±.
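
A similar check for products and quotients (illustrative sketch with made-up values): the worst-case relative errors of x1·x2 and of x1/x2 both come out close to δx1/x1 + δx2/x2, differing only in second-order terms.

```python
x1, dx1 = 2.3, 0.05      # made-up values with their maximum errors
x2, dx2 = 4.65, 0.005

predicted = dx1/x1 + dx2/x2   # sum of the relative errors

prod_err = max(abs((x1 + s1*dx1) * (x2 + s2*dx2) - x1*x2) / (x1*x2)
               for s1 in (-1, 1) for s2 in (-1, 1))
quot_err = max(abs((x1 + s1*dx1) / (x2 + s2*dx2) - x1/x2) / (x1/x2)
               for s1 in (-1, 1) for s2 in (-1, 1))
print(predicted, prod_err, quot_err)   # all close to 0.0228
```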



Generic use of Binomial Theorem

We have looked at errors in simple cases of
addition/subtraction/multiplication/division.
We may need error expressions for fairly complex cases: roots,
reciprocals, big powers (e.g. x^25), etc.
Generally we employ the binomial theorem for any index.
Approximate higher powers of δx to be very small.

Theorem (Binomial theorem for any index n)

(1 + p)^n = 1 + np + [n(n − 1)/2!] p^2 + [n(n − 1)(n − 2)/3!] p^3 + · · ·

All we do is to write the error expression as (1 + α)^k and simplify.
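
A quick sketch of why keeping only the first term is enough when δx is small (my own numbers, not from the slides): the exact change in x^25 and the leading binomial term n x^(n−1) δx agree to a few significant figures.

```python
x, dx, n = 2.0, 1e-4, 25

exact_change = (x + dx)**n - x**n
first_order  = n * x**(n - 1) * dx    # leading term; higher powers of dx are negligible
print(exact_change, first_order)      # ≈ 41968.2 vs 41943.0 - agreement to about 3 significant figures
```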



Some samples

Power 100
If the error in x is δx, the error in x^100 can be got from (x + δx)^100 − x^100.
This can be rewritten as x^100 (1 + δx/x)^100 − x^100 = x^100 ((1 + δx/x)^100 − 1).
Expanding gives x^100 [1 + 100(δx/x) + (100 × 99/2!)(δx/x)^2 + (100 × 99 × 98/3!)(δx/x)^3 + · · · − 1],
and simplifying (keeping only the leading term) gives x^100 × 100(δx/x) ≡ 100 x^99 δx.

Try to convince yourself

that the error for the
square root of x is δx / (2√x)
reciprocal of x is δx / x^2
square of the cube root of x is 2δx / (3∛x)
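
These three claims can be sanity-checked numerically with a small perturbation (an illustrative sketch; x and δx are arbitrary choices):

```python
x, dx = 2.0, 1e-6

checks = [
    ("sqrt(x)",  lambda t: t**0.5,     dx / (2 * x**0.5)),
    ("1/x",      lambda t: 1 / t,      dx / x**2),
    ("x^(2/3)",  lambda t: t**(2/3),   2 * dx / (3 * x**(1/3))),
]
for name, f, predicted in checks:
    actual = abs(f(x + dx) - f(x))     # error propagated through f
    print(name, actual, predicted)     # the two columns agree to several digits
```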



Examples

Given the numbers a = 2.3, b = 4.65 and c = 2.10, where a and b
were generated by rounding while c was generated by truncating, find
the maximum possible error in
1 a + b
2 a + b − c
3 ac / (b + c)
4 (b − c)^23
The maximum possible errors δa, δb and δc are 0.5 × 10^-1,
0.5 × 10^-2 and 1 × 10^-2 respectively.

Solutions
1 Maximum error in a + b = 0.05 + 0.005 = 0.055
2 Maximum error in a + b − c = 0.05 + 0.005 + 0.01 = 0.065
3 Error in ac / (b + c):
multiplication: δ(ac) = ac(δa/a + δc/c) = 0.128
addition: δ(b + c) = δb + δc = 0.015
division: δ(ac/(b + c)) = ac/(b + c) × (δ(ac)/ac + δ(b + c)/(b + c)) = 0.026
4 Error in (b − c)^23:
subtraction: δ(b − c) = δb + δc = 0.015
power: δ((b − c)^23) = 23(b − c)^22 δ(b − c) = 5.02 × 10^7
Recap – Errors

1 Numerical Analysis
Definition: Study of algorithms for the problems of continuous
mathematics.
We care about efficiency and accuracy of algorithms.
Sources of error:
truncation errors in mathematical models;
round-off error during calculations;
representation errors, e.g. 1/3 in decimal; 1/10 in binary.
We ask two important questions:
How sensitive is the output to changes in the input? – problem
conditioning
Do roundoff errors in an algorithm propagate? – algorithm stability
2 Floating point representation
Hardware supports single (32-bit), double (64-bit) and long double
(80-bit) precision
Adding floating point numbers



Adding two floating point numbers
Consider two numbers in a system with 4 as the size of the fraction:

x = 0.1101 × 2^3 and y = 0.1011 × 2^1

Four steps to add two floating-point numbers:

1 Align the numbers so they both have the same exponent.
y = 0.1011 × 2^1 = 0.01011 × 2^2 = 0.001011 × 2^3
2 Perform the addition:
    0.1101   × 2^3
+ 0.001011 × 2^3
= 0.111111 × 2^3
3 Round the result: x + y = 1.0000 × 2^3.
4 Normalize the result: x + y = 0.1000 × 2^4.
Exercise 1: check the accuracy of the sum. Translate all operations into
our familiar decimal notation.
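
The four steps can be mimicked with exact rational arithmetic. Below is a minimal Python sketch (illustrative, not from the slides; the helper name is my own) that normalises a positive value to 0.f1f2f3f4 × 2^e, rounds the fraction to 4 bits, and reproduces x + y = 0.1000 × 2^4 = 8 for the example above.

```python
from fractions import Fraction

def round_to_4bit_float(value):
    """Normalise a positive value to f * 2**e with 1/2 <= f < 1, then keep 4 fraction bits."""
    v, e = Fraction(value), 0
    while v >= 1:                        # bring the fraction into [1/2, 1)
        v, e = v / 2, e + 1
    while v < Fraction(1, 2):
        v, e = v * 2, e - 1
    f = Fraction(round(v * 16), 16)      # round the fraction to 4 binary places
    if f == 1:                           # rounding overflowed: re-normalise (step 4 above)
        f, e = Fraction(1, 2), e + 1
    return f, e

x = Fraction(13, 16) * 2**3              # 0.1101_2 x 2^3 = 6.5
y = Fraction(11, 16) * 2**1              # 0.1011_2 x 2^1 = 1.375
f, e = round_to_4bit_float(x + y)        # exact sum 7.875, then round and normalise
print(f, e, float(f * 2**e))             # 1/2 4 8.0 -> the stored sum is 8, not 7.875
```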
Loss of Significance

Consider two numbers in a system with 4 as the size of the fraction:

x = 0.1110 × 2^3 and y = 0.1101 × 2^3
Compute x − y:
    0.1110 × 2^3
−  0.1101 × 2^3
=  0.0001 × 2^3
After normalisation: x − y = 0.1000 × 2^0.
Problem: x and y have 4 bits of significance,
but the result x − y has only one significant bit of accuracy.
Using truncated Taylor series helps avoid loss of significance in
quadratic formulae.
For polynomial evaluation, rearranging the terms into nested
multiplication form will sometimes produce a better result.
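
The quadratic-formula remark can be illustrated with a short sketch (my own example with a deliberately large b, not from the slides): the textbook formula for the small root subtracts two nearly equal numbers, while the algebraically equivalent form 2c / (−b − √(b^2 − 4ac)) avoids the cancellation.

```python
import math

a, b, c = 1.0, 1e8, 1.0
disc = math.sqrt(b*b - 4*a*c)

naive  = (-b + disc) / (2*a)     # subtracts two nearly equal numbers -> cancellation
stable = (2*c) / (-b - disc)     # rearranged form: no cancellation
print(naive, stable)             # about -7.45e-09 vs -1e-08 (the true root is approximately -1e-08)
```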



Approximations

Decimal places and significant figures.


The number x̂ is said to approximate x to d significant digits if d is
the largest positive integer for which

|x − x̂| / |x| < 10^-d / 2

Example:
If x = 3.141592 and x̂ = 3.14, then |x − x̂|/|x| = 0.000507 < 10^-2/2.
Therefore, x̂ approximates x to two significant digits.
If y = 4,000,000 and ŷ = 4,000,016, then
|y − ŷ|/|y| = 0.000004 < 10^-5/2. Therefore ŷ approximates y to five
significant digits.
If z = 0.000012 and ẑ = 0.000009, then |z − ẑ|/|z| = 0.25 < 10^0/2.
Therefore ẑ approximates z to no significant digits.
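
The definition above translates directly into a small helper (hypothetical function, illustrative only):

```python
def significant_digits(x, x_hat):
    """Largest d >= 1 with |x - x_hat| / |x| < 10**(-d) / 2, as defined above (0 if none)."""
    rel = abs(x - x_hat) / abs(x)
    d = 0
    while rel < 10**(-(d + 1)) / 2:
        d += 1
    return d

print(significant_digits(3.141592, 3.14))       # 2
print(significant_digits(0.000012, 0.000009))   # 0
```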



Uncertainty in Data

Real-world data contains uncertainty or error,

often referred to as noise.
Successive computations using noisy data do not improve precision.
The result and the inputs should have the same number of significant
digits of accuracy.
Example: p1 = 4.152 and p2 = 0.07931 both have 4 significant digits
of accuracy.
Reporting p1 + p2 = 4.23131 includes noisy digits.
The proper answer is p1 + p2 = 4.231.
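
A small helper (hypothetical name, illustrative only) for reporting a result to the number of significant figures the data supports:

```python
def to_sig_figs(value, n):
    """Round value to n significant figures."""
    return float(f"{value:.{n}g}")

p1, p2 = 4.152, 0.07931
print(p1 + p2)                   # 4.23131 (plus possible floating point noise)
print(to_sig_figs(p1 + p2, 4))   # 4.231 - reported to 4 significant digits, matching the data
```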



Propagation of error

Error in addition and subtraction is the sum of the errors in the


addends
The relative error in multiplication and division is the sum of the
relative errors in the approximations.
If a small error in the initial conditions produces a small error in the
final result, then the algorithm is stable. Otherwise it is unstable.
Suppose ϵ represents an initial error and ϵ(n) represents the growth of
the error after n steps.
If ϵ(n) ≈ nϵ, the growth of error is said to be linear.
If ϵ(n) ≈ K^n ϵ (with K > 1), the growth of error is exponential.
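
A classic illustration of exponential growth (not from the slides): the recurrence x_n = (13/3)x_{n−1} − (4/3)x_{n−2} has (1/3)^n as an exact solution, but its general solution also contains a 4^n term, so any rounding error is amplified by roughly K = 4 at each step.

```python
x_prev, x_curr = 1.0, 1.0 / 3.0            # x_0 = 1 and x_1 = 1/3 (already slightly wrong in binary)
for n in range(2, 21):
    x_prev, x_curr = x_curr, (13.0 / 3.0) * x_curr - (4.0 / 3.0) * x_prev
    exact = (1.0 / 3.0) ** n
    print(n, x_curr, abs(x_curr - exact))  # the error grows by roughly a factor of 4 per step
```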



Exercises
1 Find the error E and the relative error R. Also determine the number
of significant digits in the approximation.
x = 2.71828182, x̂ = 2.7182
y = 98.350, ŷ = 98.000
z = 0.000068, ẑ = 0.00006
2 Consider the data p1 = 1.414 and p2 = 0.09125 which have four
significant digits of accuracy. Determine the proper answer for the
sum p1 + p2 and the product p1 p2
3 Consider the data p1 = 31.415 and p2 = 0.027182 which have five
significant digits of accuracy. Determine the proper answer for the
sum p1 + p2 and the product p1 p2
4 Let P(x) = x^3 − 3x^2 + 3x − 1, Q(x) = ((x − 3)x + 3)x − 1 and
R(x) = (x − 1)^3.
Use four-digit rounding arithmetic and compute P(2.72), Q(2.72) and
R(2.72). In the computation of P(x) assume that (2.72)^3 = 20.12 and
(2.72)^2 = 7.398. Compare them to the true values.
The End

