
ICS226/ISE225: Numerical Analysis

Oliver Mhlanga

HARARE INSTITUTE OF TECHNOLOGY


omhlanga@hit.ac.zw or olivamhlanga@gmail.com; 0712531415

June 23, 2021

Error and Computer Arithmetic


Computer arithmetic and number storage

Numerical Analysis: This refers to the analysis of mathematical
problems by numerical means, especially mathematical problems
arising from models based on calculus.
Effective numerical analysis requires several things:
- An understanding of the computational tool being used, be it a
calculator or a computer.
- An understanding of the problem to be solved.
- Construction of an algorithm which will solve the given
mathematical problem to a given desired accuracy and within the
limits of the resources (time, memory, etc.) that are available.
Computer arithmetic and number storage

Definition
A mathematical model is a mathematical description of a
physical situation.
The main goal of numerical analysis is to develop efficient
algorithms for computing precise numerical values of mathematical
quantities, including functions, integrals, solutions of algebraic
equations, solutions of differential equations (both ordinary and
partial), solutions of minimization problems, and so on.
Part of this process is the consideration of the errors that arise in
these calculations, from the errors in the arithmetic operations or
from other sources.
Computer arithmetic and number storage

Computers use binary arithmetic, representing each number as a
binary number (consisting of the two bits 0 and 1): a finite sum of
integer powers of 2. Some numbers can be represented exactly, but
others, such as 1/10, 1/100, 1/1000, ... cannot.
For example, 2.125 = 2^1 + 2^{-3} has an exact representation in
binary (base 2), but
3.1 = 2^1 + 2^0 + 2^{-4} + 2^{-5} + 2^{-8} + ... does not.
And, of course, there are the transcendental numbers like π that
have no finite representation in either the decimal or the binary
number system.
Computer arithmetic and number storage
Decimal to binary conversion: For the integer part, we divide by
2 repeatedly (using integer division); the remainders are the
successive digits of the number in base 2, from least to most
significant.
Example 1 Convert (107)10 to a binary number:
Quotients 107 53 26 13 6 3 1 0
Remainders 1 1 0 1 0 1 1
Therefore, (107)10 = (1101011)2
For the fractional part, multiply the number by 2; take away the
integer part, and multiply the fractional part of the result by 2, and
so on; the sequence of integer parts are the digits of the base 2
number, from most to least significant.
Example 2 Convert (0.625)10 to a binary number:
Fractions 0.625 0.25 0.5 0
Integer parts 1 0 1
Therefore, (0.625)10 = (0.101)2
Computer arithmetic and number storage

Note that if f has an infinite representation in base 10, its
representation in base 2 will also be infinite. However, we could
have situations where f is finitely represented in base 10 and
infinitely represented in base 2.
Octal representation: A binary number can be easily represented
in base 8. Partition the number into groups of 3 binary digits
(2^3 = 8), from the binary point to the right and to the left (add zeros
if needed). Then, replace each group by its octal equivalent.
Example 3 (107.625)10 = (|1| |101| |011| . |101|)2 = (153.5)8
Computer arithmetic and number storage

Hexadecimal representation: To represent a binary number in
base 16, proceed as above, but now partition the number into
groups of 4 binary digits (2^4 = 16). The base 16 digits are
0, ..., 9, A = 10, ..., F = 15.
Example 4 (107.625)10 = (|0110| |1011| . |1010|)2 = (6B.A)16
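The two algorithms above can be combined into one routine; the following is a minimal Python sketch (to_binary is our own illustrative name, not from the notes):

def to_binary(x, frac_digits=10):
    """Convert a non-negative decimal number to a base-2 string by
    repeated division (integer part) and repeated multiplication
    (fractional part)."""
    ipart, fpart = int(x), x - int(x)
    # Integer part: remainders of division by 2, least significant first.
    bits = ""
    while ipart > 0:
        ipart, r = divmod(ipart, 2)
        bits = str(r) + bits
    bits = bits or "0"
    # Fractional part: integer parts after multiplying by 2, most significant first.
    if fpart > 0:
        bits += "."
        for _ in range(frac_digits):
            fpart *= 2
            d, fpart = int(fpart), fpart - int(fpart)
            bits += str(d)
            if fpart == 0:
                break
    return bits

print(to_binary(107.625))   # 1101011.101, i.e. (153.5) in octal, (6B.A) in hex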
Errors in Numerical Computations

In any applied numerical computation, there are three key sources
of error:
1. Inherent errors:
1.1 Inexactness of the mathematical model for the underlying
physical phenomenon (often called modeling errors).
1.2 Errors in measurements of parameters entering the model
(often called experimental errors).
(1.1) is the domain of mathematical modeling, and will not
concern us here. Neither will (1.2), which is the domain of the
experimentalists.
2. Round-off errors: arise due to the finite numerical precision
imposed by the computer.
3. Truncation errors: arise due to approximations used to solve
the full mathematical system.
Errors in Numerical Computations

(3) is the true domain of numerical analysis, and refers to the fact
that most systems of equations are too complicated to solve
explicitly, or, even in cases when an analytic solution formula is
known, directly obtaining the precise numerical values may be
difficult.
Definition
Let x be a real number and let x* be an approximation. The
absolute error in the approximation x* ≈ x is defined as
∆x = |x* − x|. The relative error is defined as the ratio of the
absolute error to the size of x, i.e., |x* − x|/|x| = ∆x/|x|,
provided x ≠ 0; otherwise the relative error is not defined.
If x* < x, then we say that the number x* is an approximate
value of the number x by defect, and if x* > x, then it is an
approximate value of x by excess.
Errors in Numerical Computations

The percentage error (a.k.a. relative percentage error) of an
approximate number is the relative error × 100%.
NB: Absolute error measures only the quantity of error; it is the
total amount of error incurred by the approximate value. The
relative error measures both the quantity and quality of the
measurement: it is the error incurred while measuring one unit. The
absolute error depends on the measuring unit, but the relative
error does not.
Example 7. Find the absolute, relative and percentage error in x*
when x = 1/3 and x* = 0.333.
Solution. The absolute error:
∆x = |x* − x| = |0.333 − 1/3| ≈ 0.00033.
Relative error: ∆x/|x| = 0.00033/(1/3) ≈ 0.001.
Percentage error: relative error × 100% = 0.001 × 100% = 0.1%.
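The same arithmetic can be checked in a few lines of Python (a throwaway sketch using exact fractions):

from fractions import Fraction

x = Fraction(1, 3)             # exact value
x_star = Fraction(333, 1000)   # approximation 0.333

abs_err = abs(x_star - x)      # absolute error ∆x = 1/3000
rel_err = abs_err / abs(x)     # relative error ∆x/|x| = 0.001
print(float(abs_err), float(rel_err), f"{float(rel_err) * 100}%")
# ≈ 0.000333..., 0.001, 0.1%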
Errors in Numerical Computations

H/W 1. Determine the absolute error and the exact number
corresponding to the approximate number x* = 5.373 if the
percentage error is 0.01%.
H/W 2. An exact number x is in the interval [28.03, 28.08].
Assuming an approximate value, find the absolute and the
percentage errors. [Hint: the middle of the given interval is taken
as the approximate value.]
Solution of Nonlinear Equations

Our goal is to develop simple numerical methods for the
approximate solution of the equation

f(x) = 0                                            (1)

where f is a real-valued function, defined and continuous on a
bounded and closed interval of the real line.
We will restrict our discussion to the case where x ∈ ℝ, but x ∈ ℂ
can also be of significant interest in practice (especially if f is a
polynomial). Solutions of nonlinear equations are called roots or
zeroes.
Some methods will be considered, such as the bisection method and
the Newton-Raphson method.
Solution of Nonlinear Equations

These methods compute a solution of a nonlinear equation starting
from an initial guess, then applying an iterative method that,
under favorable conditions, will converge to a zero of the function
f.
The number of iterations needed to achieve a solution with a given
accuracy is considered. Iterative procedures can break down (i.e.,
fail) in various ways. The sequence generated by a procedure may
diverge, oscillate, or display chaotic behavior (a.k.a. 'wandering
behavior').
Theorem
2.1. Let f be a real-valued function, defined and continuous on a
bounded closed interval [a, b] of the real line. Assume, further,
that f(a)f(b) < 0; then there exists r in [a, b] such that f(r) = 0.
Bisection Method.

The bisection method is a procedure that repeatedly 'halves' the
interval in which a root r has been located. This 'halving' process
is repeated until the desired accuracy is reached.
Suppose that we are given an interval [a, b] satisfying theorem
2.1 and an error tolerance ε > 0. The bisection method consists
of the following steps:
1. Define c = (a + b)/2.
2. If b − c ≤ ε, then accept c as the root and stop.
3. If f(b) × f(c) ≤ 0, then set a = c. Otherwise, set b = c.
Return to step 1.
The interval [a, b] is halved with each loop through steps 1 to 3.
Step 2 will be satisfied eventually, and with it the condition
|r − c| ≤ ε will be satisfied.
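Steps 1-3 translate almost verbatim into code; a minimal Python sketch (bisect and eps are our own names, and we assume f(a)f(b) < 0 on entry):

def bisect(f, a, b, eps):
    """Bisection method: assumes f is continuous and f(a)*f(b) < 0."""
    while True:
        c = (a + b) / 2.0          # step 1: midpoint
        if b - c <= eps:           # step 2: interval small enough
            return c
        if f(b) * f(c) <= 0:       # step 3: root lies in [c, b]
            a = c
        else:                      # root lies in [a, c]
            b = c

# Example 2.1 below: largest root of x^6 - x - 1 = 0 on [1, 2]
root = bisect(lambda x: x**6 - x - 1, 1.0, 2.0, 1e-3)
print(root)   # ≈ 1.1338, matching Table 1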
Bisection Method.
Example 2.1. Find the largest root of

f(x) = x^6 − x − 1 = 0                              (2)

accurate to within ε = 0.001.
With a graph, it is easy to check that 1 < r < 2.
We choose a = 1, b = 2; then f(a) = −1, f(b) = 61, and
theorem 2.1 is satisfied.
Bisection Method.

n a b c b-c f(c)
1 1.0000 2.0000 1.5000 0.5000 8.8906
2 1.0000 1.5000 1.2500 0.2500 1.5647
3 1.0000 1.2500 1.1250 0.1250 -0.0977
4 1.1250 1.2500 1.1875 0.0625 0.6167
5 1.1250 1.1875 1.1562 0.0312 0.2333
6 1.1250 1.1562 1.1406 0.0156 0.0616
7 1.1250 1.1406 1.1328 0.0078 -0.0196
8 1.1328 1.1406 1.1367 0.0039 0.0206
9 1.1328 1.1367 1.1348 0.0020 0.0004
10 1.1328 1.1348 1.1338 0.00098 -0.0096

Table 1: Bisection Method for (2)


Bisection Method.

The number n of bisections needed to bracket a root to within a given
tolerance ε is given by

n ≥ ln((b − a)/ε) / ln 2                            (3)
Bisection Method.

For the previous example (2), with b − a = 1 we need

n ≥ ln(1/0.001) / ln 2 = 9.97

i.e., we need n = 10 iterates, exactly the number computed.
Advantages of the Bisection Method
- It is guaranteed to converge.
- The error bound is guaranteed to decrease by one-half
with each iteration.
Many other numerical methods have variable rates of decrease for
the error, and these may be worse than the bisection method for
some equations.
Bisection Method.

Disadvantage of the Bisection Method
- It generally converges more slowly than most other methods.
For functions f(x) that have a continuous derivative, other
methods are usually faster. These methods may not always
converge; when they do converge, however, they are almost always
much faster than the bisection method.
Newton-Raphson Method.

We are now going to establish general methods for computing a
root of the equation f(x) = 0, where f(x) can be an algebraic or
transcendental function.
If x_n is the current estimate, then the next estimate x_{n+1} is
given by

x_{n+1} = x_n − f(x_n)/f'(x_n),  n = 0, 1, 2, ...   (4)

(4) is referred to as Newton's method, or the Newton-Raphson
method, for solving f(x) = 0.
Newton-Raphson Method.

It is guaranteed to converge if the initial guess x_0 is close enough,
but it is hard to make a clear statement about what we mean by
close enough because this is highly problem-specific. A sketch of
the graph of f(x) can help us decide on an appropriate initial guess
x_0 for a particular problem.
Example 2.2: From example 2.1, use Newton's method to solve (2).
Here f(x) = x^6 − x − 1, f'(x) = 6x^5 − 1,
and the iteration is given by

x_{n+1} = x_n − (x_n^6 − x_n − 1)/(6x_n^5 − 1)      (5)
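Iteration (5) is just as easy to run in code; a minimal sketch (newton, the tolerance and the iteration cap are our own illustrative choices):

def newton(f, df, x0, tol=1e-9, max_iter=50):
    """Newton-Raphson iteration: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:   # successive iterates close enough
            return x_new
        x = x_new
    return x

# Example 2.2: f(x) = x^6 - x - 1, starting from x0 = 1.5
root = newton(lambda x: x**6 - x - 1, lambda x: 6 * x**5 - 1, 1.5)
print(root)   # ≈ 1.134724138, as in the table below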
Newton-Raphson Method.

Newton's Method for x^6 − x − 1 = 0

n   x_n          f(x_n)     x_n − x_{n−1}   α − x_{n−1}
0 1.5 8.89E+1
1 1.30049088 2.54E+1 -2.00E-1 -3.65E-1
2 1.18148042 5.38E-1 -1.19E-1 -1.66E-1
3 1.13945559 4.92E-2 -4.20E-2 -4.68E-2
4 1.13477763 5.50E-4 -4.68E-3 -4.73E-3
5 1.13472415 7.11E-8 -5.35E-5 -5.35E-5
6 1.13472414 1.55E-15 -6.91E-9 -6.91E-9
7 1.134724138
The true root is α = 1.134724138, and x6 = α to nine significant
digits.
N.B: Newton’s method may converge slowly at first. However, as
the iterates come closer to the root, the speed of convergence
increases.
Newton-Raphson Method.

Convergence analysis
Let α be the root, so that f(α) = 0.
Define the error e_k = x_k − α. Then
e_{k+1} = x_{k+1} − α = x_k − f(x_k)/f'(x_k) − α = [f'(x_k)e_k − f(x_k)] / f'(x_k).
Expanding f(α) = f(x_k − e_k) about x_k by Taylor's theorem and using
f(α) = 0 gives f'(x_k)e_k − f(x_k) = (1/2)f''(ξ)e_k^2, so that
|e_{k+1}| ≈ C|e_k|^2,  C = |f''(α)/(2f'(α))|.
This is called quadratic convergence.
Interpolation and Approximation

Let y = f(x). Given (n + 1) distinct data points
{x_i, y_i}, i = 0, 1, 2, ..., n, we wish to find a polynomial P_n(x)
such that

P_n(x_i) = y_i                                      (6)

A polynomial satisfying (6) is called an interpolating polynomial if
y_i = f(x_i), i = 0, 1, 2, ..., n, and d^k f/dx^k |_{x*} = d^k P_n/dx^k |_{x*}
for some finite k, where x* is one of the values x_i.

P_n(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n.     (7)

(7) has (n + 1) unknown coefficients; the x_i are called nodes.
This creates a system of (n + 1) equations with (n + 1) unknowns.
Interpolation and Approximation

The aim is to solve (7) and determine the coefficients
a_0, a_1, a_2, ..., a_n using condition (6). This leads to

P_n(x_0) = a_0 + a_1 x_0 + a_2 x_0^2 + ... + a_n x_0^n = y_0
P_n(x_1) = a_0 + a_1 x_1 + a_2 x_1^2 + ... + a_n x_1^n = y_1
   ...
P_n(x_n) = a_0 + a_1 x_n + a_2 x_n^2 + ... + a_n x_n^n = y_n

In matrix form

[ 1  x_0  x_0^2  ...  x_0^n ] [ a_0 ]   [ y_0 ]
[ 1  x_1  x_1^2  ...  x_1^n ] [ a_1 ] = [ y_1 ]
[ ...  ...  ...  ...  ...   ] [ ... ]   [ ... ]
[ 1  x_n  x_n^2  ...  x_n^n ] [ a_n ]   [ y_n ]
Interpolation and Approximation
Write this as XA = Y, where X is the (n + 1) × (n + 1) matrix above,
A = (a_0, a_1, ..., a_n)^T and Y = (y_0, y_1, ..., y_n)^T.
The coefficient matrix X is called the Vandermonde matrix. It can
be shown that det(X) = ∏_{0≤j<k≤n} (x_k − x_j). Since the x_i are
distinct, the Vandermonde system is non-singular, and solving this
system gives the coefficients a_i, which solve (6).
Other reasons:
- To approximate the function by polynomial interpolation;
- Other computations such as derivatives and definite integrals
of polynomials are easy to determine and are also polynomials.
Interpolation and Approximation
Example 3.1. Interpolate the given data set with a polynomial of
degree 2.
x_i | 0  1  2/3
y_i | 1  0  0.5
Solution: Let P_2(x) = a_0 + a_1 x + a_2 x^2.
We want to find the coefficients a_0, a_1, a_2. By the interpolating
properties we have:

x = 0,   y = 1:   P_2(0) = a_0 = 1
x = 1,   y = 0:   P_2(1) = a_0 + a_1 + a_2 = 0
x = 2/3, y = 0.5: P_2(2/3) = a_0 + (2/3)a_1 + (4/9)a_2 = 0.5

Solving the system gives a_0 = 1, a_1 = −1/4, a_2 = −3/4. Thus,
P_2(x) = 1 − (1/4)x − (3/4)x^2
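For larger data sets the Vandermonde system would be set up and solved numerically; a small NumPy sketch (our own illustration) reproduces Example 3.1:

import numpy as np

x = np.array([0.0, 1.0, 2.0 / 3.0])
y = np.array([1.0, 0.0, 0.5])

# Vandermonde matrix with columns 1, x, x^2 (increasing powers)
X = np.vander(x, N=3, increasing=True)
a = np.linalg.solve(X, y)   # coefficients a_0, a_1, a_2
print(a)                    # [ 1.   -0.25 -0.75], i.e. P_2(x) = 1 - x/4 - 3x^2/4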
Interpolation and Approximation
Problem: A Vandermonde system is often ill-conditioned, so the
computed coefficients a_i may be inaccurate. Furthermore, the
computational cost, including forming the matrix,
factorisation, and triangular substitutions, is excessive.
Lagrange Interpolation Polynomials
Given (n + 1) distinct data points (x_i, y_i) for i = 0, 1, ..., n,
there exists a unique P_n with deg(P_n) ≤ n that solves equation (6),
given by Lagrange's formula

P_n(x) = Σ_{i=0}^{n} y_i L_i(x) = y_0 L_0 + y_1 L_1 + ... + y_n L_n    (8)

where

L_i(x) = ∏_{j=0, j≠i}^{n} (x − x_j)/(x_i − x_j)
       = [(x − x_0)...(x − x_{i−1})(x − x_{i+1})...(x − x_n)] /
         [(x_i − x_0)...(x_i − x_{i−1})(x_i − x_{i+1})...(x_i − x_n)]    (9)
Interpolation and Approximation
The L_i(x) are cardinal functions, satisfying the properties
L_i(x_j) = δ_ij = { 1 if i = j; 0 if i ≠ j },  i = 0, 1, ..., n,
where δ_ij is the Kronecker delta. They are locally supported in the
discrete sense.
Example 3.2. Write the Lagrange polynomial for the data below (same
as Example 3.1).
x_i | 0  1  2/3
y_i | 1  0  0.5
Solution: x_0 = 0, x_1 = 1, x_2 = 2/3, y_0 = 1, y_1 = 0, y_2 = 0.5.
Compute the cardinal functions:

L_0(x) = (x − x_1)(x − x_2) / [(x_0 − x_1)(x_0 − x_2)]
       = (x − 1)(x − 2/3) / [(0 − 1)(0 − 2/3)] = (3/2)(x − 2/3)(x − 1)
Interpolation and Approximation

L_1(x) = (x − x_0)(x − x_2) / [(x_1 − x_0)(x_1 − x_2)]
       = (x − 0)(x − 2/3) / [(1 − 0)(1 − 2/3)] = 3x(x − 2/3)
L_2(x) = (x − x_0)(x − x_1) / [(x_2 − x_0)(x_2 − x_1)]
       = (x − 0)(x − 1) / [(2/3 − 0)(2/3 − 1)] = −(9/2)x(x − 1)

So
P_2(x) = Σ_{i=0}^{2} y_i L_i(x) = (3/2)(x − 2/3)(x − 1) − (9/4)x(x − 1) + 0
       = 1 − (1/4)x − (3/4)x^2, the same as in Example 3.1.
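Formula (8) translates directly into code; a small sketch (lagrange_eval is an illustrative name):

def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial (8) at x."""
    total = 0.0
    n = len(xs)
    for i in range(n):
        # Cardinal function L_i(x), formula (9)
        Li = 1.0
        for j in range(n):
            if j != i:
                Li *= (x - xs[j]) / (xs[i] - xs[j])
        total += ys[i] * Li
    return total

# Data of Example 3.2
print(lagrange_eval([0.0, 1.0, 2.0 / 3.0], [1.0, 0.0, 0.5], 0.5))
# 0.6875 = 1 - 0.5/4 - 3(0.25)/4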
H/W: Given the following 4 data points
x_i | 0  1  3  5
y_i | 1  2  6  7
find a polynomial in Lagrange form to interpolate these data.
Interpolation and Approximation

Pros and Cons of the Lagrange polynomial
- (+) It can be written down at once, since the coefficients are
the given y_i;
- (−) If more data points are added to the interpolation problem,
all L_i(x) have to be recalculated;
- (−) It is slow to compute, since each L_i(x) is different.
Numerical Differentiation
If f(x) can be computed with only n digits of precision, is it
possible to calculate f'(x) numerically with n digits of precision?
The challenge can be traced to the subtraction of quantities
that are nearly equal. In this section, several alternatives are
offered for the numerical computation of f'(x) and f''(x).
Recall that

f'(x) = lim_{h→0} [f(x + h) − f(x)] / h             (10)

Taylor's theorem:

f(x + h) = Σ_{k=0}^{∞} (1/k!) f^(k)(x) h^k
         = Σ_{k=0}^{n} (1/k!) f^(k)(x) h^k + E_{n+1},    (11)

E_{n+1} = Σ_{k=n+1}^{∞} (1/k!) f^(k)(x) h^k
        = (1/(n+1)!) f^(n+1)(ξ) h^{n+1} = O(h^{n+1})
Numerical Differentiation

E = O(h^n) means ∃ a real, finite C: |E| ≤ C|h|^n. E is of order h^n,
implying that E approaches zero at a rate similar to h^n.

f'(x_0) = [f(x_0 + h) − f(x_0)] / h − (h/2) f''(ξ).

For small values of h,

f'(x_0) ≈ [f(x_0 + h) − f(x_0)] / h

with an error bounded by M|h|/2, where M is a bound on |f''(x)| for
x ∈ (x_0, x_0 + h).
Numerical Differentiation

Forward Difference (FD):

f(x + h) = f(x) + hf'(x) + O(h^2)
⇒ hf'(x) = f(x + h) − f(x) + O(h^2)
⇒ f'(x) = [f(x + h) − f(x)] / h + O(h)

Backward Difference (BD):

f(x − h) = f(x) − hf'(x) + O(h^2)
⇒ hf'(x) = f(x) − f(x − h) + O(h^2)
⇒ f'(x) = [f(x) − f(x − h)] / h + O(h)
Numerical Differentiation

Central Difference (CD):

f(x + h) = f(x) + hf'(x) + (h^2/2!) f''(x) + (h^3/3!) f'''(x) + (h^4/4!) f^iv(x) + ...
f(x − h) = f(x) − hf'(x) + (h^2/2!) f''(x) − (h^3/3!) f'''(x) + (h^4/4!) f^iv(x) + ...
⇒ f(x + h) − f(x − h) = 2hf'(x) + 2(h^3/3!) f'''(x) + ...
⇒ f'(x) = [f(x + h) − f(x − h)] / (2h) + O(h^2)

The FD, BD and CD formulas each involve 2 function calls, 1
subtraction, and 1 division: the same computation time.
The FD and BD formulas are comparable in accuracy.
The CD formula is the most accurate and most recommended method;
however, sometimes CD cannot be applied.
Numerical Differentiation

Higher Order Difference Formulas

f(x + h) = f(x) + hf'(x) + (h^2/2!) f''(x) + (h^3/3!) f'''(x) + (h^4/4!) f^iv(x) + ...
f(x − h) = f(x) − hf'(x) + (h^2/2!) f''(x) − (h^3/3!) f'''(x) + (h^4/4!) f^iv(x) + ...
⇒ f(x + h) + f(x − h) = 2f(x) + 2(h^2/2!) f''(x) + 2(h^4/4!) f^iv(x) + ...
⇒ f''(x) = [f(x + h) − 2f(x) + f(x − h)] / h^2 + O(h^2),
  with Error = −f^iv(ξ) h^2 / 12
Numerical Differentiation

Other Higher Order Formulas

f''(x) ≈ [f(x + h) − 2f(x) + f(x − h)] / h^2
f'''(x) ≈ [f(x + 2h) − 2f(x + h) + 2f(x − h) − f(x − 2h)] / (2h^3)
f^iv(x) ≈ [f(x + 2h) − 4f(x + h) + 6f(x) − 4f(x − h) + f(x − 2h)] / h^4

Other formulas for f''(x), f'''(x), ... are also possible.
Numerical Differentiation

Example 4.1: f(x) = x^3, f'(x) = 3x^2; compute f'(1) using FD,
BD, CD.
Solution:
FD: h = 0.1, f'(1) ≈ [f(1.1) − f(1.0)] / 0.1 = 3.31
BD: h = 0.1, f'(1) ≈ [f(1.0) − f(0.9)] / 0.1 = 2.71
CD: h = 0.1, f'(1) ≈ [f(1.1) − f(0.9)] / 0.2 = 3.01
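These three approximations are easy to reproduce; a small sketch (the names fd, bd and cd are ours):

def fd(f, x, h):  # forward difference, O(h)
    return (f(x + h) - f(x)) / h

def bd(f, x, h):  # backward difference, O(h)
    return (f(x) - f(x - h)) / h

def cd(f, x, h):  # central difference, O(h^2)
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: x**3
print(fd(f, 1.0, 0.1), bd(f, 1.0, 0.1), cd(f, 1.0, 0.1))
# ≈ 3.31, 2.71, 3.01 — the CD value is closest to f'(1) = 3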
Numerical Differentiation

Example 4.2: The distance x of a runner from a fixed point is
measured (in metres) at intervals of half a second. The data
obtained is
t | 0.0  0.5  1.0  1.5  2.0
x | 0.00 3.65 6.80 9.90 12.15
1. Use central differences to approximate the runner's velocity at
times t = 0.5s and t = 1.25s.
2. Use a central difference to approximate the runner's
acceleration at time t = 1.5s. H/W.
Numerical Differentiation

Solution (4.2.1):
Our aim here is to approximate x'(t). The choice of h is dictated
by the available data. Using data with t = 0.5s at its centre we
obtain

x'(0.5) ≈ [x(1.0) − x(0.0)] / (2 × 0.5) = 6.80 m/s    (12)

Data centred at t = 1.25s gives us the approximation

x'(1.25) ≈ [x(1.5) − x(1.0)] / (2 × 0.25) = 6.20 m/s    (13)

Note the value of h used.
Numerical Integration

Problem: Given a function f(x), defined on an interval [a, b], we
want to find an approximation to the definite integral

I(f) = ∫_a^b f(x) dx                                (14)

The indefinite integral of a function is another function or a class
of functions, whereas the definite integral of a function over a
fixed interval is a number.
Main idea:
- Partition [a, b] into smaller sub-intervals;
- In each sub-interval, find a polynomial p_i(x) ≈ f(x);
- Integrate p_i(x) on each sub-interval, and sum the results.
Numerical Integration

∫_a^b f(x) dx ≈ h[(1/2)f(x_0) + (1/2)f(x_n) + Σ_{i=1}^{n−1} f(x_i)]    (15)

Equation (15) is the Trapezoid rule.

Example 5.1: Evaluate I(f) = ∫_0^1 e^{−x^2} dx by the Trapezoid rule
with n = 10.
Numerical Integration
Answer: h = (1 − 0)/10 = 0.1
i   x_i   x_i^2   f(x_i) = e^{−x_i^2}
0   0.0   0.0    1.000000
1   0.1   0.01   0.990050
2   0.2   0.04   0.960789
3   0.3   0.09   0.913931
4   0.4   0.16   0.852144
5   0.5   0.25   0.778801
6   0.6   0.36   0.697676
7   0.7   0.49   0.612626
8   0.8   0.64   0.527292
9   0.9   0.81   0.444858
10  1.0   1.00   0.367879
I(f) ≈ T(f; h) = 0.1[0.5 × (1 + 0.367879) + 0.990050 +
0.960789 + ... + 0.444858] = 0.746211
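The composite rule (15) is a one-loop computation; a short sketch (trapezoid is our own name):

import math

def trapezoid(f, a, b, n):
    """Composite Trapezoid rule (15) with n equal sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))          # endpoint weights 1/2
    for i in range(1, n):                # interior points, weight 1
        total += f(a + i * h)
    return h * total

print(trapezoid(lambda x: math.exp(-x**2), 0.0, 1.0, 10))
# ≈ 0.746211, as computed above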
Numerical Integration

Error Analysis: Trapezoid Rule.

Theorem
THEOREM ON PRECISION OF TRAPEZOID RULE
If f'' exists and is continuous on the interval [a, b], and if the
Trapezoid rule T(f; h) with uniform spacing h is used to estimate
the integral I(f) = ∫_a^b f(x) dx, then for some ξ ∈ (a, b),

E_T(f; h) = I(f) − T(f; h) = −(1/12)(b − a) h^2 f''(ξ) = O(h^2)
Numerical Integration

Example 5.2. If the Trapezoid rule is to be used to compute
∫_0^2 e^x dx
with an error of at most 0.5 × 10^{−4}, how many points should be
used?
Numerical Integration

Solution:
f(x) = e^x, f''(x) = e^x, a = 0, b = 2, max_{x∈(a,b)} |f''(x)| = e^2.
By the error bound, it is sufficient to require that

|E_T(f; h)| ≤ (1/6) h^2 e^2 ≤ 0.5 × 10^{−4}
⇒ h^2 ≤ 6 × 0.5 × 10^{−4} × e^{−2} ≈ 4.06 × 10^{−5}
⇒ 2/n = h ≤ √(4.06 × 10^{−5}) ≈ 0.0064
⇒ n ≥ 313.8

Thus n = 314 sub-intervals, i.e. at least 314 + 1 = 315 points, will
certainly produce the desired accuracy.
Numerical Integration
Simpson's rule.
A numerical integration rule over two equal sub-intervals with
partition points a, a + h, and a + 2h = b is the widely used basic
Simpson's Rule:
Numerical Integration

∫_a^{a+2h} f(x) dx ≈ (h/3)[f(a) + 4f(a + h) + f(a + 2h)]    (16)

Simpson's Rule computes exactly the integral of an interpolating
quadratic polynomial over an interval of length 2h using three
points; namely, the two endpoints and the middle point.
To derive the composite Simpson's Rule for approximating ∫_a^b f(x) dx,
we divide the interval into an even number of sub-intervals of width

h = (b − a)/(2m),  x_0 = a,  x_{2m} = b,  x_{i+1} − x_i = h,  2m = n.

It can be derived by integrating over each interval [x_i, x_{i+2}] the
Lagrange quadratic polynomial p_i:
Numerical Integration

Thus, we obtain

∫_a^b f(x) dx ≈ (h/3){[f(a) + f(b)] + 4 Σ_{i=1}^{n/2} f(a + (2i − 1)h)
              + 2 Σ_{i=1}^{(n−2)/2} f(a + 2ih)} = S(f; h),

which is Simpson's rule.
Numerical Integration
Example 5.3. Evaluate I(f) = ∫_0^1 e^{−x^2} dx by Simpson's rule with
2m = 10.
Answer: h = (1 − 0)/10 = 0.1
i   x_i   x_i^2   f(x_i) = e^{−x_i^2}
0   0.0   0.0    1.000000
1   0.1   0.01   0.990050
2   0.2   0.04   0.960789
3   0.3   0.09   0.913931
4   0.4   0.16   0.852144
5   0.5   0.25   0.778801
6   0.6   0.36   0.697676
7   0.7   0.49   0.612626
8   0.8   0.64   0.527292
9   0.9   0.81   0.444858
10  1.0   1.00   0.367879
I(f) ≈ S(f; h) = (0.1/3)[(1 + 0.367879)
  + 4(0.990050 + 0.913931 + 0.778801 + 0.612626 + 0.444858)
  + 2(0.960789 + 0.852144 + 0.697676 + 0.527292)] ≈ 0.746825
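A sketch of the composite Simpson rule in code (simpson is our own name; the number of sub-intervals must be even):

import math

def simpson(f, a, b, n):
    """Composite Simpson's rule; n (= 2m) must be even."""
    assert n % 2 == 0, "n must be even"
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)  # weights alternate 4, 2
    return h * total / 3

print(simpson(lambda x: math.exp(-x**2), 0.0, 1.0, 10))
# ≈ 0.746825, very close to the true value 0.746824...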
Numerical Integration

Error Estimate for Simpson's rule.

The error term is E_S(f; h) = −(1/180)(b − a) h^4 f^iv(ξ). Proof: H/W.
This gives the error bound

|E_S(f; h)| ≤ (1/180)(b − a) h^4 max_{x∈(a,b)} |f^iv(x)| ⇒ 4th order.

Example 5.4. With f(x) = e^x defined on [0, 2], use Simpson's rule
to compute ∫_0^2 f(x) dx. In order to achieve an error ≤ 0.5 × 10^{−4},
how many points must we take?
Numerical Integration

Answer:

|E_S(f; h)| ≤ (2/180) h^4 e^2 ≤ 0.5 × 10^{−4}
⇒ h^4 ≤ (180/2) × 0.5 × 10^{−4} × e^{−2} ≈ 6.09 × 10^{−4}
⇒ h ≤ 0.157
⇒ m = (b − a)/(2h) ≈ 6.4, so take m = 7.

We need at least 2m + 1 = 15 points — far fewer than the 315 needed
by the Trapezoid rule (more powerful!).
Systems of Linear Equations

A system of n linear equations in n unknowns (variables) is written
as

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
...
a_i1 x_1 + a_i2 x_2 + ... + a_in x_n = b_i          (17)
...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n

where x_1, x_2, ..., x_n are the unknowns (variables) of the system and
a_11, a_12, ..., a_nn are the coefficients of the unknowns. The numbers
b_1, b_2, ..., b_n are the constants or free terms of the system.
Systems of Linear Equations

The system of equations (17) can be written as

Σ_{j=1}^{n} a_ij x_j = b_i,  i = 1, 2, ..., n       (18)

Also, (17) can be written in matrix form as

AX = b                                              (19)

where

A = [ a_11 a_12 ... a_1n ]    X = [ x_1 ]    b = [ b_1 ]
    [ a_21 a_22 ... a_2n ]        [ x_2 ]        [ b_2 ]
    [ ...  ...  ... ...  ]        [ ... ]        [ ... ]
    [ a_n1 a_n2 ... a_nn ]        [ x_n ]        [ b_n ]
Systems of Linear Equations
- The system of linear equations (17) is consistent if it has a
solution. If a system of linear equations has no solution, then
it is inconsistent (or incompatible).
- A consistent system of linear equations may have one solution
or several solutions, and is said to be determinate if there is
one solution and indeterminate if there is more than one
solution.
Generally, the following three types of elementary
transformations of a system of linear equations are used.
1. Interchange: The order of two equations can be changed.
2. Scaling: Multiplication of both sides of an equation of the
system by any non-zero number.
3. Replacement: Addition to (subtraction from) both sides of
one equation of the corresponding sides of another equation
multiplied by any number.
Systems of Linear Equations

A system in which the constant terms b1 , b2 , ..., bn are zero is


called a homogeneous system. Two basic techniques are used to
solve a system of linear equations:
1. direct method: which are techniques that give a solution in a
finite number of steps, subject to only round-off errors are
used in this chapter. Several direct methods are used to solve
a system of equations, among them following are most useful.
(i) Cramers rule, (ii) matrix inversion, (iii) Gauss
elimination, (iv) decomposition, etc.
2. iteration method: The most widely used iteration methods are
(i) Jacobi’s iteration, (ii) Gauss-Seidal’s iteration, etc.
System of equation (19) has a unique solution X = A−1 b where
the coefficient matrix A is non-singular (or regular/ invertible).
Systems of Linear Equations
Gaussian Elimination Method:
- This is the most widely used method to solve the system of
equations (19). The crucial step is to construct an equivalent
upper-triangular form which is easier to solve. This process, in
turn, is equivalent to finding the factorization A = LU, where L is a
unit lower triangular matrix and U is an upper triangular
matrix. This factorization is especially useful when solving
many linear systems involving the same coefficient matrix but
different right-hand sides, which occurs in various applications.
- If the coefficient matrix A is symmetric, positive definite,
triangular, banded, block, or sparse, the general approach of
Gaussian elimination with partial pivoting needs to be
modified or rewritten specifically for the system. If the
coefficient matrix has predominantly zero entries, the system
is sparse and iterative methods can require much less
computer memory than Gaussian elimination.
Systems of Linear Equations

Gaussian elimination consists of (a) forward elimination and (b)
backward substitution. Forward elimination reduces the system (19)
to an equivalent system Ux = b of the form

[ u_11 u_12 ... u_1n ] [ x_1 ]   [ b_1 ]
[ 0    u_22 ... u_2n ] [ x_2 ] = [ b_2 ]
[ ...  ...  ... ...  ] [ ... ]   [ ... ]
[ 0    0    ... u_nn ] [ x_n ]   [ b_n ]

Solving this triangular system is called back substitution: the
solutions x_i are computed in reverse order by
Systems of Linear Equations

xn = bn /unn ,
xn−1 = (bn−1 − un−1,n xn )/un−1,n−1 ,
...
x1 = (b1 − u12 x2 − u13 x3 − · · · − u1n xn )/u11
(20)

provided allPuii 6= 0. The general formulation is


xi = (bi − nj=i+1 uij xj )/uii , i = n, n − 1, ..., 1.
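The general formula maps directly onto a short routine; a sketch (back_substitute is an illustrative name; U is assumed upper triangular with non-zero diagonal):

def back_substitute(U, b):
    """Solve Ux = b for upper triangular U by formula (20)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):        # i = n-1, ..., 0
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]       # requires U[i][i] != 0
    return x

# Triangular system obtained in Example 6.1 below
U = [[-0.04, 0.04, 0.12], [0, -1, 2], [0, 0, 1]]
print(back_substitute(U, [3, 43, 25]))    # ≈ [7.0, 7.0, 25.0]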
Example 6.1. Solve the following system of equations by Gauss
elimination:

−0.04x_1 + 0.04x_2 + 0.12x_3 = 3                    (21)
0.56x_1 − 1.56x_2 + 0.32x_3 = 1                     (22)
−0.24x_1 + 1.24x_2 − 0.28x_3 = 0                    (23)
Systems of Linear Equations

Solution: in matrix form, the augmented matrix is

[ −0.04  0.04  0.12 | 3 ]
[  0.56 −1.56  0.32 | 1 ]
[ −0.24  1.24 −0.28 | 0 ]

Forward elimination: R1 × (0.56/0.04) + R2 ↦ R2 and
R3 − R1 × (0.24/0.04) ↦ R3 yields:

[ −0.04  0.04  0.12 | 3   ]
[  0     −1    2    | 43  ]
[  0      1    −1   | −18 ]
Systems of Linear Equations

R2 + R3 ↦ R3 yields

[ −0.04  0.04  0.12 | 3  ]
[  0     −1    2    | 43 ]
[  0      0    1    | 25 ]

Backward substitution then gives

x_3 = 25/1 = 25
x_2 = [43 − (2)(25)] / (−1) = 7
x_1 = [3 − (0.12)(25) − (0.04)(7)] / (−0.04) = 7
Systems of Linear Equations

Scaled Partial Pivoting:
- The diagonal elements of U are called pivots. The k-th pivot is
the coefficient of the k-th variable in the k-th equation at the
k-th step of the elimination. Both the computation of the
multipliers and the back substitution require divisions by the
pivots. Consequently, the algorithm cannot be carried out if
any of the pivots are zero.
- Pivoting is used to change the sequential order of the equations
for two purposes:
1. to prevent diagonal coefficients from becoming zero,
2. to make each diagonal coefficient larger in magnitude than any
other coefficient below it. Changing the order increases the
accuracy of the computations since round-off error is reduced.
Systems of Linear Equations
Example 6.2a.
Consider the system of equations

2x + 16y + z = 39,
x + y + 25z = 83,
10x + y + z = 19.

Solve the system correct to 4 decimal places by Gaussian
elimination with scaled partial pivoting.
Solution: Rearranging the system to make it diagonally dominant:

10x + y + z = 19,
2x + 16y + z = 39,
x + y + 25z = 83.
Systems of Linear Equations
Augmented matrix

[ 10  1   1  | 19 ]
[ 2   16  1  | 39 ]
[ 1   1   25 | 83 ]

Forward elimination: R2 ↦ R2 − (2/10)R1 and R3 ↦ R3 − (1/10)R1

[ 10  1       1       | 19     ]
[ 0   158/10  8/10    | 352/10 ]
[ 0   9/10    249/10  | 811/10 ]

R2 ↦ 10R2, R3 ↦ 10R3

[ 10  1    1   | 19  ]
[ 0   158  8   | 352 ]
[ 0   9    249 | 811 ]
Systems of Linear Equations

R3 ↦ R3 − (9/158)R2

[ 10  1    1          | 19         ]
[ 0   158  8          | 352        ]
[ 0   0    39270/158  | 124970/158 ]

Backward substitution:

(39270/158) z = 124970/158 ⇒ z = 3.1823
158y + 8z = 352 ⇒ y = [352 − 8(3.1823)] / 158 = 2.0667
10x + y + z = 19 ⇒ x = (19 − 2.0667 − 3.1823) / 10 = 1.3751
Systems of Linear Equations

The quantity Cond(A), called the condition number of the matrix A, is
defined by

Cond(A) = ||A|| ||A^{−1}||

where ||A|| is any matrix norm; it gives a measure of the condition of
the matrix A. A large value of Cond(A) indicates ill-conditioning
of the matrix or of the associated system of equations.
A system is called ill-conditioned or ill-posed if a small change in
the coefficients of the system produces a large change in the solution
(unstable); however, if the change in the solution is small for small
changes in the coefficients, then the system is called
well-conditioned or well-posed.
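For a concrete matrix this quantity is one library call away; a quick sketch (NumPy's cond uses the 2-norm by default):

import numpy as np

A = np.array([[10.0, 1.0, 1.0],
              [2.0, 16.0, 1.0],
              [1.0, 1.0, 25.0]])   # matrix of Example 6.2a

# Cond(A) = ||A|| * ||A^{-1}||
print(np.linalg.cond(A))   # a modest value: this system is well-conditioned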
Iteration Methods.
- If the system of equations has a large number of variables,
then the direct methods are not very suitable. In this case,
approximate numerical methods are used to determine the
variables of the system.
- The efficiency of an approximate method depends on the choice
of the initial vector and the rate of convergence of the process.
The following two approximate methods are widely used to solve a
system of linear equations:
1. the method of iteration (Jacobi's iteration method), and
2. Gauss-Seidel's iteration method.
Systems of Linear Equations
Jacobi's Iteration Method.
Consider the linear system (17), and assume that the quantities a_ii
are pivot elements (a_ii ≠ 0). System (17) can be written as

x_1 = (1/a_11)(b_1 − a_12 x_2 − a_13 x_3 − ... − a_1n x_n)
x_2 = (1/a_22)(b_2 − a_21 x_1 − a_23 x_3 − ... − a_2n x_n)
...
x_n = (1/a_nn)(b_n − a_n1 x_1 − a_n2 x_2 − ... − a_n,n−1 x_{n−1}).

In general, if x_1^(k), x_2^(k), ..., x_n^(k) are the k-th approximate
roots, then the next approximate roots are given by
Systems of Linear Equations

1 (k) (k) (k)


x1k+1 = (b1 − a12 x2 − a13 x3 − · · · − a1n xn )
a11
k+1 1 (k) (k) (k)
x2 = (b2 − a21 x1 − a23 x3 − · · · − a2n xn )
a22
...
1 (k) (k) (k)
xnk+1 = (bn − an1 x1 − an2 x2 − · · · − ann−1 xn−1 ).
ann
k = 0, 1, 2...
The iteration process is continued until all the roots converge to
the required number of significant figures.
The Jacobi’s iteration method and Gauss-Seidel method converge
if the coefficient
Pn matrix is diagonally dominant i.e.,
(|aii | ≥ j=1,j6=i aij )
(k+1) (k)
The method is repeated until |xi − xi | < ∀i = 1, 2, ..., n,
where  > 0 is any pre-assigned number called the error tolerance.
Systems of Linear Equations

Example 6.4. Solve the following system of linear equations by
the Gauss-Jacobi method, correct up to four decimal places.
27x + 6y − z = 54, 6x + 15y + 2z = 72, x + y + 54z = 110.
Solution: Obviously, the system is diagonally dominant, as
|6| + |−1| < |27|, |6| + |2| < |15|, |1| + |1| < |54|.
The Gauss-Jacobi iteration scheme is

x^(k+1) = (1/27)(54 − 6y^(k) + z^(k))
y^(k+1) = (1/15)(72 − 6x^(k) − 2z^(k))
z^(k+1) = (1/54)(110 − x^(k) − y^(k))

Let the initial solution be (0, 0, 0). The next iterations are shown
in the following table.
Systems of Linear Equations

k x y z
0 0 0 0
1 2.00000 4.80000 2.03704
2 1.00878 3.72839 1.91111
3 1.24225 4.14167 1.94931
4 1.15183 4.04319 1.93733
5 1.17327 4.08096 1.94083
6 1.16500 4.07191 1.93974
7 1.16697 4.07537 1.94006
8 1.16614 4.07454 1.93996
9 1.16640 4.07488 1.93999
10 1.16632 4.07477 1.93998
11 1.16635 4.07481 1.93998
The solution correct up to four decimal places is
x = 1.1664, y = 4.0748, z = 1.9400.
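The scheme above is a few lines of code; a minimal sketch (jacobi, the tolerance and the iteration cap are our own choices):

def jacobi(A, b, x0, eps=0.5e-4, max_iter=100):
    """Jacobi iteration: each component uses only the previous iterate."""
    n = len(b)
    x = list(x0)
    for _ in range(max_iter):
        x_new = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
                 for i in range(n)]
        if all(abs(x_new[i] - x[i]) < eps for i in range(n)):
            return x_new
        x = x_new
    return x

A = [[27, 6, -1], [6, 15, 2], [1, 1, 54]]
print(jacobi(A, [54, 72, 110], [0, 0, 0]))
# ≈ [1.1664, 4.0748, 1.9400], as in the table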
Systems of Linear Equations

Gauss-Seidel's Iteration Method.

A simple modification of Jacobi's iteration sometimes gives faster
convergence. The modified method is known as Gauss-Seidel's
iteration method.
Consider again system (17). If x_i^(k), i = 1, 2, ..., n, is the k-th
approximate value of x_i, then the (k + 1)-th approximate values of
x_1, x_2, ..., x_n are given by
Systems of Linear Equations

x_1^(k+1) = (1/a_11)(b_1 − a_12 x_2^(k) − a_13 x_3^(k) − ... − a_1n x_n^(k))
x_2^(k+1) = (1/a_22)(b_2 − a_21 x_1^(k+1) − a_23 x_3^(k) − ... − a_2n x_n^(k))
...
x_i^(k+1) = (1/a_ii)(b_i − a_i1 x_1^(k+1) − ... − a_i,i−1 x_{i−1}^(k+1)
            − a_i,i+1 x_{i+1}^(k) − ... − a_in x_n^(k))
...
x_n^(k+1) = (1/a_nn)(b_n − a_n1 x_1^(k+1) − a_n2 x_2^(k+1) − ... − a_n,n−1 x_{n−1}^(k+1)),
k = 0, 1, 2, ...
Systems of Linear Equations

The method is repeated until |x_i^(k+1) − x_i^(k)| < ε for all
i = 1, 2, ..., n, where ε > 0 is a pre-assigned number called the
error tolerance.
Example 6.5. Solve the following system of equations by
Gauss-Seidel's iteration method, correct up to four decimal places.
27x + 6y − z = 54, 6x + 15y + 2z = 72, x + y + 54z = 110.
Solution. The iteration scheme is

x^(k+1) = (1/27)(54 − 6y^(k) + z^(k))
y^(k+1) = (1/15)(72 − 6x^(k+1) − 2z^(k))
z^(k+1) = (1/54)(110 − x^(k+1) − y^(k+1))

Let y = 0, z = 0 be the initial solution. The successive iterations
are shown below.
Systems of Linear Equations

k   x        y        z
0   -        0        0
1   2.00000  4.00000  1.92593
2   1.18244  4.07023  1.93977
3   1.16735  4.07442  1.93997
4   1.16642  4.07477  1.93998
5   1.16635  4.07480  1.93998
6   1.16634  4.07480  1.93998

The solution correct up to four decimal places is
x = 1.1663, y = 4.0748, z = 1.9400.
Remark: Usually, the Gauss-Seidel method converges more rapidly
than the Gauss-Jacobi method. But this is not always true:
there are some examples in which the Gauss-Jacobi method
converges faster than the Gauss-Seidel method.
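The only change from the Jacobi sketch above is that updated components are used immediately within each sweep; again a minimal illustration (gauss_seidel is our own name):

def gauss_seidel(A, b, x0, eps=0.5e-4, max_iter=100):
    """Gauss-Seidel iteration: updates components in place."""
    n = len(b)
    x = list(x0)
    for _ in range(max_iter):
        x_old = list(x)
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)  # mixes new and old values
            x[i] = (b[i] - s) / A[i][i]
        if all(abs(x[i] - x_old[i]) < eps for i in range(n)):
            return x
    return x

A = [[27, 6, -1], [6, 15, 2], [1, 1, 54]]
print(gauss_seidel(A, [54, 72, 110], [0, 0, 0]))
# converges in ~6 sweeps to ≈ [1.1663, 4.0748, 1.9400]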