
M.A.

Population Studies
(Distance Education)

MSP-3C

BASIC STATISTICAL METHODS FOR POPULATION STUDIES

Block- 1
Some Basic Mathematical Tools

Capacity Building for a Better Future

Department of Extra Mural Studies & Distance Education


INTERNATIONAL INSTITUTE FOR POPULATION SCIENCES
(DEEMED UNIVERSITY)
BASIC STATISTICAL METHODS FOR POPULATION STUDIES

Block - 1 : Some Basic Mathematical Tools

Unit 1 : Elementary Mathematical Tools in the Field of Population

Unit 2 : Interpolation and Graduation


Department of Extra Mural Studies and Distance Education

Prof. K. S. James
Director and Senior Professor
IIPS, Mumbai.

Prof. T. V. Sekher
Head
Department of EMS & DE, IIPS

Units in this block are originally written by:
Prof. G. Rama Rao
Prof. R. K. Sinha
Prof. A. P. Deshpande

Units of this block are revised and updated by:
Prof. Sayeed Unisa
Prof. S. K. Singh
Dr. Laxmi Kant Dwivedi
Dr. Preeti Dhillon

Edited and compiled by:
Dr. Atreyee Sinha
Dr. Md. Illias K. Sheikh

Production:
Mr. Prakash H. Fulpagare
Dr. M. V. Vaithilingam

First Edition : May 1994;
Second Edition : July 1996;
Third Edition : July 1999;
Fourth Edition : Nov. 2003;
Fifth Edition : May 2019.

©
International Institute for Population Sciences, Govandi Station Road, Deonar,
Mumbai-400 088. Ph: 022-42372428; Fax: 022-25563257; E-mail: ems@iips.net

All rights are reserved. No part of this work may be reproduced in any form, by xeroxing or any
other means without prior permission in writing from the Director, International Institute for
Population Sciences, Mumbai.
Block - 1 : Some Basic Mathematical Tools

The block in your hands is the first block of the paper on Statistical Methods for Population Studies. It contains two units, whose contents are outlined below.

Unit 1 : Elementary Mathematical Tools in the Field of Population

In this unit, we explain some basic mathematical tools and their applications in population data analysis. In the different sections of this unit you will learn about permutations and combinations, binomial and exponential functions, and the computation of population growth rates.

Unit 2 : Interpolation and Graduation

Interpolation is defined as the technique of obtaining the most likely estimate of a certain quantity under certain assumptions. In this unit, we discuss different methods of interpolation, extrapolation and graduation, together with their application in population data analysis. You will also find a critical evaluation of the different methods, including their merits and limitations.
PREFACE
(Fifth Edition)

It is indeed my great pleasure to introduce the fifth edition of the self-instructional study materials on Population Studies prepared by the Department of Extra Mural Studies and Distance Education of the Institute. While revising the course materials, we have taken care to modify the existing course content as per the new syllabus and to update the data used in the examples. In addition to revising and updating, we have also developed a few additional units in accordance with the new syllabus. We have tried to make the fifth edition simpler while covering broader aspects of Population Studies.

This block, entitled Some Basic Mathematical Tools, is the revised and updated version of the module Statistical Methods for Population Studies (MSP-3C). Prof. Sayeed Unisa, Prof. S. K. Singh, Dr. Laxmi Kant Dwivedi and Dr. Preeti Dhillon have contributed to revising and updating the block. Dr. Atreyee Sinha and Dr. Md. Illias K. Sheikh have compiled and edited the block in this version. I compliment Prof. T. V. Sekher and his colleagues in the Department of Extra Mural Studies and Distance Education, who have collectively worked hard to maintain the quality and the overall conduct of the programme.

Self-instructional study material is one of the key factors in the distance-learning process. I hope the study material will help to meet the needs of the distance learners and to achieve the desired standard of the course.

Prof. K. S. James
Director & Sr. Professor
May 2019
Unit 1: Elementary Mathematical Tools in the Field of Population

Unit Structure

1.0 Objectives
1.1 Introduction
1.2 Permutations [Self-check Exercise]
1.3 Combinations [Self-check Exercise]
1.4 Binomial Expansion and Exponential Functions
1.4.1 Binomial Expansion
1.4.2 Exponential Function [Self-check Exercise]
1.5 Ratios, Proportions and Rates
1.6 Arithmetic, Geometric and Exponential Rates of Population Growth
1.7 Estimation of Mid-year Population
[Self-check Exercise]
Let Us Sum Up
Model Answers

1.0 Objectives

In the present unit you will learn about some mathematical tools and their application in the field of population. More specifically, we will discuss:

• the concepts of permutation and combination;
• the binomial expression and its properties;
• the binomial theorem and its particular cases; and
• arithmetic, geometric, and exponential rates of population growth.

1.1 Introduction

Demography, or population studies, is interdisciplinary in nature and attracts students from different fields in the social sciences. Because of their varied backgrounds, those who enrol for the course possess varying degrees of knowledge of basic mathematics. A working knowledge of the mathematical tools and their application to the analysis and interpretation of population data is essential in the study of population. For example, the complexities of technical demography cannot be fully appreciated and understood without a proper background in basic mathematics. The topics covered in this unit will help the non-mathematicians among you to enhance your understanding of elementary mathematics, so that you can grasp the quantitative aspects of demography as well as your mathematician colleagues do. Students with a mathematics background who have forgotten how to apply the basic tools through lack of practice will also find these notes rewarding. The presentation is kept simple, and solved examples are given from time to time to illustrate the application of the various formulae.

1.2 Permutations

To determine the number of possible outcomes of an act, we must study the mathematical concepts of permutation and combination. Before defining a permutation, it is helpful to explain the basic idea with some suitable examples. So let us start with two rules that underlie the concept of permutations.

Rule 1: If a certain act A1 can be performed in m1 different ways and another act A2 can be
performed in m2 different ways then the total number of ways in which either A1 or A2 can be
performed is m1 + m2. Thus, for example, if there are 5 mathematics books and 4 physics
books, and if a boy is to choose either a mathematics book or a physics book, he can do so in
5+4=9 ways.

Rule 2: If a certain act A1 can be performed in m1 different ways, and having performed it in
any one of these m1 ways, another act A2 can be performed in m2 different ways then the two
acts, A1 and A2 can be performed in the stated order in (m1 x m2) ways.

Thus, for example, suppose that a person can travel from Bombay to Delhi by any one of
the three trains running between the two places. Also, suppose that he is to travel from Bombay
to Delhi and then return by a different train. In how many ways can he perform the journey?

Solution: The person can travel from Bombay to Delhi in three ways and corresponding to each
of these three ways there are two ways to return as he is not supposed to take the same train for
the return journey. Hence, the total number of ways he can perform the entire journey is 3x2=6
ways.

The above rule can be extended to the case where three or more acts are to be performed. Suppose there are k acts A1, A2, A3, ..., Ak such that A1 can be performed in m1 different ways; having performed A1 in any one of these m1 ways, A2 can be performed in any one of m2 different ways; and so on up to the kth act, which can be performed in mk different ways. Then the total number of ways in which the k acts can be performed in the stated order is m1 * m2 * m3 * .... * mk.

Thus, for example, suppose that a cricket team of eleven players is to choose a captain, a vice-captain and a secretary from amongst themselves. The captain may be chosen in 11 ways, as any one of the 11 players can become captain. Having chosen a captain, a vice-captain may be chosen in 10 ways (out of the remaining 10 players), and having chosen a captain and a vice-captain, a secretary can be chosen in 9 ways. Hence a captain, a vice-captain and a secretary can be chosen in 11x10x9=990 ways.
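The two worked examples of Rule 2 can be checked by brute-force enumeration. The following is a minimal Python sketch; the train labels and the player numbering are assumptions made only for illustration.

```python
from itertools import permutations

# Train example: 3 choices for the outward train, then 2 remaining
# choices for the return train (labels are hypothetical).
trains = ["T1", "T2", "T3"]
journeys = [(out, back) for out in trains for back in trains if back != out]
print(len(journeys))  # 6

# Cricket example: ordered choice of captain, vice-captain and secretary
# from 11 players, numbered 1 to 11 for illustration.
players = range(1, 12)
office_bearers = list(permutations(players, 3))
print(len(office_bearers))  # 990
```

Listing every case is practical only for small numbers; the multiplication rule gives the same counts without enumeration.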

Definition of Permutations and its Proof: Having understood the basic ideas, let us now define permutations. The word permutation in simple language means arrangement.

Definition: In general, let there be n different objects which are to be arranged in a line, taking only r of them (0 < r ≤ n) at a time. Each possible arrangement in a line of r objects is called a permutation of n objects taken r at a time. The total number of such arrangements is denoted by nPr or P(n,r). Precisely, if r objects are to be selected from a set of n different objects in such a way that the order of selection is important, the number of permutations is given by:

nPr = n(n-1) (n-2) ..........(n-r+1)

The above relation gives the number of different permutations of n different objects taken r at a time without repetition.

Proof: The required number is the same as the number of ways in which r places in a row can be filled with n different objects.

The first place can be filled in n different ways, as any one of the n objects may be placed there. Having filled the first place in any of these n ways, the second place can be filled in (n-1) different ways, as any one of the remaining n-1 objects may be placed there. Hence by Rule 2 the first two places can be filled in n(n-1) different ways. Proceeding in this way, when the first (r-1) places have been filled we are left with n-(r-1) = n-r+1 objects, any one of which can fill the rth (that is, the last) place. This can be done in (n-r+1) different ways.

Hence r places can be filled in n(n-1) (n-2) ...(n-r+1) ways.

That is, nPr = n(n-1) (n-2) ...(n-r+1)

Corollary: The number of permutations of n objects taken n at a time is:

nPn = n(n-1) (n-2) ...3.2.1

Factorial Notation for Permutations: In the corollary above we have seen that

nPn = n(n-1) (n-2) ...3.2.1

The right hand side of the above formula is nothing but the product of the first n natural
numbers. Such products of some consecutive positive integers will often occur in the present
unit.

We shall, therefore, introduce the following notation to write such products.

We write: 1*2*3 = 3!, 1*2*3*4*5*6 = 6!
In general, 1*2*3* .... *(n-1)*n = n! [n factorial]

We then write,

$$4 \times 5 \times 6 = \frac{1 \times 2 \times 3 \times 4 \times 5 \times 6}{1 \times 2 \times 3} = \frac{6!}{3!}$$

$$7 \times 8 \times 9 \times 10 = \frac{1 \times 2 \times 3 \times 4 \times \cdots \times 10}{1 \times 2 \times 3 \times \cdots \times 6} = \frac{10!}{6!}$$

$$\therefore\; n(n-1)(n-2)\cdots(n-r+1) = \frac{n(n-1)(n-2)\cdots(n-r+1)\,(n-r)(n-r-1)\cdots 3 \cdot 2 \cdot 1}{(n-r)(n-r-1)\cdots 3 \cdot 2 \cdot 1} = \frac{n!}{(n-r)!}$$

We thus see that,

$${}^{n}P_{r} = \frac{n!}{(n-r)!}$$

Now if r = n, we get,

$${}^{n}P_{n} = \frac{n!}{(n-n)!} = \frac{n!}{0!}$$

But since nPr = n (n-1) (n-2) ......(n-r+1),

nPn = n (n-1) (n-2) .... 3.2.1 = n!

Therefore, if we define 0! = 1, the formula

$${}^{n}P_{r} = \frac{n!}{(n-r)!}$$

remains valid for r = n also.

Example 1: How many three digit numbers can be formed using the digits 1,3,5,7,9 if each digit
is to be used only once?

Solution: Here we have to arrange 5 digits in a line, taking 3 at a time. This can be done in

$${}^{5}P_{3} = \frac{5!}{(5-3)!} = \frac{5!}{2!} = \frac{1 \times 2 \times 3 \times 4 \times 5}{1 \times 2} = 60$$

ways. Therefore, 60 three-digit numbers can be formed.
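The factorial formula and Example 1 can be verified with a few lines of Python. This is only an illustrative sketch: the helper name `n_p_r` is an assumption of this note, while `math.perm` and `itertools.permutations` are standard-library functions.

```python
from itertools import permutations
from math import factorial, perm

def n_p_r(n: int, r: int) -> int:
    """Permutations of n objects taken r at a time: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)

# Example 1: three-digit numbers from the digits 1, 3, 5, 7, 9 without repetition.
digits = [1, 3, 5, 7, 9]
numbers = [100 * a + 10 * b + c for a, b, c in permutations(digits, 3)]

print(n_p_r(5, 3), perm(5, 3), len(numbers))  # 60 60 60
print(n_p_r(4, 4), factorial(4))              # 24 24, since 0! = 1
```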

Self-check Exercise

1. Find the values of

(i) 3P3 (ii) 4P2 (iii) 5P2

1.3 Combinations

In the previous section on permutations we have seen the arrangement of n objects taken r at a time in a row. In permutations, while selecting or arranging the objects, we are concerned with the ORDER of the objects, whereas in combinations the order or arrangement has no relevance and is ignored. Let us now define a combination.

Definition: Let there be n different objects out of which r (where 0 < r ≤ n) are to be chosen at a time. A group of r objects selected out of the n objects without reference to the order of selection is called a combination of n objects taken r at a time. The total number of such combinations is denoted by nCr or C(n,r).

The formula for combinations is:

$${}^{n}C_{r} = \frac{{}^{n}P_{r}}{r!} = \frac{n!}{r!\,(n-r)!}\,; \qquad {}^{n}C_{0} = 1$$

For example, suppose a boy is to choose 3 books out of 5 books. The number of such combinations is denoted by 5C3 or C(5,3) [since n=5, r=3]:

$${}^{5}C_{3} = \frac{5!}{3!\,2!} = \frac{1 \times 2 \times 3 \times 4 \times 5}{(1 \times 2 \times 3)(1 \times 2)} = 10$$

Example 2: An examination paper has 13 questions and the students are expected to answer 5.
In how many ways could the questions be selected?

Solution: In this example the order of selecting a question is not important, since a student can choose or answer any question according to his choice. Therefore, this is an example of a combination.

Therefore, the total number of ways of selecting questions is

$${}^{13}C_{5} = \frac{13!}{(13-5)!\,5!} = \frac{13!}{8!\,5!} = \frac{13 \cdot 12 \cdot 11 \cdot 10 \cdot 9}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 1287 \text{ ways}$$
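The combination formula and the two counts above (10 and 1287) can be reproduced in Python. The helper `n_c_r` is a hypothetical name used only for this sketch; `math.comb` is the equivalent standard-library function.

```python
from itertools import combinations
from math import comb, factorial

def n_c_r(n: int, r: int) -> int:
    """Combinations of n objects taken r at a time: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(n_c_r(5, 3), comb(5, 3))    # 10 10   (3 books out of 5)
print(n_c_r(13, 5), comb(13, 5))  # 1287 1287   (Example 2)

# Brute-force check: list every unordered choice of 5 questions out of 13.
selections = sum(1 for _ in combinations(range(1, 14), 5))
print(selections)  # 1287
```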

Two Important Identities

(i) nCr = nCn-r

Proof:

$${}^{n}C_{n-r} = \frac{n!}{(n-r)!\,[n-(n-r)]!} = \frac{n!}{(n-r)!\,r!} = {}^{n}C_{r}$$

Hence L.H.S. = R.H.S.

This means that the number of combinations of n things taken r at a time is the same as
the number of combinations of n things taken n-r at a time.
(ii) nCr + nCr-1 = n+1Cr

Proof:

$${}^{n}C_{r} + {}^{n}C_{r-1} = \frac{n!}{r!\,(n-r)!} + \frac{n!}{(r-1)!\,(n-r+1)!}$$

$$= \frac{n!\,(n-r+1) + n!\,r}{r!\,(n-r+1)!} = \frac{n!\,(n+1)}{r!\,(n+1-r)!}$$

$$= \frac{(n+1)!}{r!\,(n+1-r)!} = {}^{n+1}C_{r}$$
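A numerical check is no substitute for the proofs above, but it is a quick way to catch slips in the algebra. The sketch below tests both identities for all n up to a chosen limit; the function name `identities_hold` and the limit are assumptions of this note.

```python
from math import comb

def identities_hold(max_n: int) -> bool:
    """Check nCr = nC(n-r) and nCr + nC(r-1) = (n+1)Cr for 1 <= r <= n <= max_n."""
    for n in range(1, max_n + 1):
        for r in range(1, n + 1):
            if comb(n, r) != comb(n, n - r):
                return False
            if comb(n, r) + comb(n, r - 1) != comb(n + 1, r):
                return False
    return True

print(identities_hold(30))  # True
```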

Self-check Exercises

2. In how many ways can a cricket team of eleven be chosen out of 15 players?
3. A committee of five persons is to be chosen from 8 men and 5 women. In
how many ways can this be done if the committee is to contain 3 men and 2
women?

1.4 Binomial Expansion and Exponential Functions

In the previous two sections we have discussed permutations and combinations. In this section we begin with the Binomial Expansion.

1.4.1 Binomial Expansion

An expression containing two terms, which are connected by a positive or negative sign, is called a Binomial Expression or simply a Binomial. For example, $x+y$, $2x-3y$, $9x-7$, $x^2+4y^2$ are all Binomials. Consider the expression $(x+y)^2$.

By actual multiplication we get,

$$(x+y)^2 = x^2 + 2xy + y^2$$

Similarly, by actual multiplication, we can get

$$(x+y)^3 = x^3 + 3x^2y + 3xy^2 + y^3$$

$$(x+y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4$$

In each of the above expansions we may observe the following properties:

(i) The number of terms in each expansion is one more than the exponent of the binomial. For example, in the expansion of $(x+y)^4$, the exponent is 4 and the number of terms in this expansion is 5.

(ii) The first term is x with an exponent the same as the exponent of the binomial, and the exponent decreases by 1 from term to term. For example, in the expansion of $(x+y)^3$, the exponent of x in the first term is 3, that in the second term is 2, and so on; in the final term x does not appear, since $x^0 = 1$.

(iii) The exponent of y in the second term is 1, and it increases by 1 from term to term. For example, in the expansion of $(x+y)^3$ there is no y in the first term, since $y^0 = 1$, and the exponent of y increases from term to term.

(iv) The sum of the exponents of x and y in any term is equal to the exponent of the binomial. For example, you will observe that in the expansion of $(x+y)^3$ the sum of the exponents of x and y in the second, third and fourth terms is equal to 3, which is also the exponent of $(x+y)^3$.

(v) The coefficient of the second term is the same as the exponent of the binomial. The coefficient of any further term may be computed from the previous term by multiplying that term's coefficient by the exponent of x and dividing by one more than the exponent of y. For example, consider the expansion $(x+y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4$.

In the above expansion the coefficient of the second term is 4, which is also the exponent of the binomial $(x+y)^4$. The coefficient of the third term, which is 6, is computed as follows:

The coefficient of the second term = 4 ... (a)


The exponent of x in the second term = 3 ... (b)
Therefore, (a) * (b) = 4 * 3 = 12 ... (c)
(c) is divided by 2 (since the exponent of y in the second term = 1)
Therefore, the coefficient of the third term = 12/2 = 6.

On similar lines, you may check the coefficients of the fourth and fifth terms in the above expansion. The discussion so far suggests the following expansion of $(x+y)^n$, n being a positive integer.

$$(x+y)^n = x^n + {}^{n}C_{1}x^{n-1}y + {}^{n}C_{2}x^{n-2}y^2 + {}^{n}C_{3}x^{n-3}y^3 + \cdots + {}^{n}C_{r}x^{n-r}y^r + \cdots + {}^{n}C_{n-1}xy^{n-1} + y^n$$

The above expansion called the binomial expansion is known as the Binomial Theorem.
It was proved by Sir Isaac Newton in 1665.

The Binomial Theorem : If n is a positive integer and x, y are any two numbers, then

$$(x+y)^n = x^n + {}^{n}C_{1}x^{n-1}y + {}^{n}C_{2}x^{n-2}y^2 + {}^{n}C_{3}x^{n-3}y^3 + \cdots + {}^{n}C_{r}x^{n-r}y^r + \cdots + {}^{n}C_{n-1}xy^{n-1} + y^n = \sum_{r=0}^{n} {}^{n}C_{r}\,x^{n-r}y^r$$
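The theorem, and the coefficient rule stated in property (v), can be checked numerically. In the minimal sketch below the function names are assumptions of this note; the values x = 2, y = 3, n = 4 are arbitrary test inputs.

```python
from math import comb

def coefficients_by_rule(n: int) -> list[int]:
    """Build the coefficients term by term using property (v):
    next = previous * (exponent of x) / (exponent of y + 1)."""
    coeffs = [1]
    for r in range(n):
        coeffs.append(coeffs[-1] * (n - r) // (r + 1))
    return coeffs

def binomial_sum(x: float, y: float, n: int) -> float:
    """Right-hand side of the theorem: sum of nCr * x^(n-r) * y^r."""
    return sum(comb(n, r) * x ** (n - r) * y ** r for r in range(n + 1))

print(coefficients_by_rule(4))              # [1, 4, 6, 4, 1]
print([comb(4, r) for r in range(5)])       # [1, 4, 6, 4, 1]
print(binomial_sum(2, 3, 4), (2 + 3) ** 4)  # 625 625
```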

Particular Cases of Binomial Expansion

(i) Replacing y by -y in the following equation

$$(x+y)^n = {}^{n}C_{0}x^n + {}^{n}C_{1}x^{n-1}y + {}^{n}C_{2}x^{n-2}y^2 + \cdots + {}^{n}C_{r}x^{n-r}y^r + \cdots + {}^{n}C_{n-1}xy^{n-1} + y^n$$

we get,

$$(x-y)^n = {}^{n}C_{0}x^n + {}^{n}C_{1}x^{n-1}(-y) + {}^{n}C_{2}x^{n-2}(-y)^2 + {}^{n}C_{3}x^{n-3}(-y)^3 + \cdots + {}^{n}C_{r}x^{n-r}(-y)^r + \cdots + {}^{n}C_{n-1}x(-y)^{n-1} + (-y)^n$$

$$= x^n - {}^{n}C_{1}x^{n-1}y + {}^{n}C_{2}x^{n-2}y^2 - {}^{n}C_{3}x^{n-3}y^3 + \cdots + (-1)^r\,{}^{n}C_{r}x^{n-r}y^r + \cdots + (-1)^{n-1}\,{}^{n}C_{n-1}xy^{n-1} + (-1)^n y^n$$

Notice that in the above expansion the terms are alternately positive and negative.

(ii) Putting x = 1 in $(x+y)^n$, we get

$$(1+y)^n = 1 + {}^{n}C_{1}y + {}^{n}C_{2}y^2 + {}^{n}C_{3}y^3 + \cdots + {}^{n}C_{r}y^r + \cdots + {}^{n}C_{n-1}y^{n-1} + y^n$$

(iii) Replacing y by -y in $(1+y)^n$, we get

$$(1-y)^n = 1 - {}^{n}C_{1}y + {}^{n}C_{2}y^2 - {}^{n}C_{3}y^3 + \cdots + (-1)^r\,{}^{n}C_{r}y^r + \cdots + (-1)^{n-1}\,{}^{n}C_{n-1}y^{n-1} + (-1)^n y^n$$

Observe the following points about the expansion of $(x+y)^n$:

(i) The number of terms in the expansion is n+1.

(ii) The exponent of x in the first term is n, and it goes on decreasing by unity in the succeeding terms, becoming zero in the (n+1)th, that is, the last term.

(iii) The exponent of y in the first term is zero, and it goes on increasing by unity in the succeeding terms, becoming n in the (n+1)th, that is, the last term.

(iv) The sum of the exponents of x and y in any term of the expansion is n.

(v) The coefficient of $x^n$ is 1. It may be written as ${}^{n}C_{0}$. The coefficient of $y^n$ is also 1, which may be written as ${}^{n}C_{n}$. The coefficients in the n+1 terms of the expansion are

$${}^{n}C_{0},\ {}^{n}C_{1},\ {}^{n}C_{2},\ \ldots,\ {}^{n}C_{r},\ \ldots,\ {}^{n}C_{n-1},\ {}^{n}C_{n}$$

These are called the Binomial coefficients of the nth order.

(vi) The (r+1)th term of the expansion of $(x+y)^n$ is

$$T_{r+1} = {}^{n}C_{r}\,x^{n-r}y^r, \quad r = 0, 1, 2, \ldots, n,$$

where $T_{r+1}$ is the (r+1)th term. The (r+1)th term of the expansion of $(x-y)^n$ is

$$T_{r+1} = (-1)^r\,{}^{n}C_{r}\,x^{n-r}y^r, \quad r = 0, 1, 2, \ldots, n.$$

(vii) If n is even, the number of terms in the expansion is odd, and the (n/2 + 1)th term is the middle term. If n is odd, the number of terms in the expansion is even. In this case there is no single middle term, but the {(n+1)/2}th and {(n+1)/2 + 1}th terms may be taken as the two middle-most terms.

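Example 3: Write down the expansion of $(x^2 + 3y)^5$.

Solution: We know that the expansion of $(x+y)^n$ is

$$(x+y)^n = x^n + {}^{n}C_{1}x^{n-1}y + {}^{n}C_{2}x^{n-2}y^2 + \cdots + {}^{n}C_{r}x^{n-r}y^r + \cdots + {}^{n}C_{n-1}xy^{n-1} + y^n$$

Substituting $x^2$ for x, $3y$ for y and n = 5 in the above expansion, we have

$$(x^2 + 3y)^5 = (x^2)^5 + {}^{5}C_{1}(x^2)^4(3y) + {}^{5}C_{2}(x^2)^3(3y)^2 + {}^{5}C_{3}(x^2)^2(3y)^3 + {}^{5}C_{4}(x^2)(3y)^4 + {}^{5}C_{5}(3y)^5$$

Now the binomial coefficients in the present example are:

$${}^{5}C_{1} = \frac{5!}{1!\,4!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{4 \cdot 3 \cdot 2 \cdot 1} = 5$$

$${}^{5}C_{2} = \frac{5!}{2!\,3!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(1 \cdot 2)(1 \cdot 2 \cdot 3)} = 10$$

$${}^{5}C_{3} = \frac{5!}{3!\,2!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(2 \cdot 1)} = 10$$

$${}^{5}C_{4} = \frac{5!}{4!\,1!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1)(1)} = 5$$

Therefore, $(x^2+3y)^5 = x^{10} + 15x^8y + 90x^6y^2 + 270x^4y^3 + 405x^2y^4 + 243y^5$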
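The coefficients and exponents in Example 3 can be generated term by term as a check. This is a small illustrative sketch; the tuple layout (coefficient, power of x, power of y) is a choice made for this note.

```python
from math import comb

n = 5
# Term r of (x^2 + 3y)^5 is 5Cr * (x^2)^(5-r) * (3y)^r,
# i.e. coefficient 5Cr * 3^r, power of x equal to 2(5-r), power of y equal to r.
terms = [(comb(n, r) * 3 ** r, 2 * (n - r), r) for r in range(n + 1)]

for coefficient, x_power, y_power in terms:
    print(coefficient, x_power, y_power)
# 1 10 0
# 15 8 1
# 90 6 2
# 270 4 3
# 405 2 4
# 243 0 5
```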

Note: Please remember that it is always better to find the values of the binomial coefficients and expand the given expression fully; otherwise you may lose marks in the examination.

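Example 4: Find the 5th term in $(2x-3)^8$.

Solution: The (r+1)th term in the expansion of $(x-y)^n$ is

$$T_{r+1} = (-1)^r\,{}^{n}C_{r}\,x^{n-r}y^r$$

Put n = 8, r = 4, x = 2x and y = 3:

$$T_5 = (-1)^4\,{}^{8}C_{4}\,(2x)^{8-4}(3)^4 = {}^{8}C_{4}\,(2x)^4(3)^4$$

$${}^{8}C_{4} = \frac{8!}{4!\,4!} = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4!}{4!\,4!} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{1 \cdot 2 \cdot 3 \cdot 4} = 70$$

$$\therefore\; T_5 = 70 \times 16x^4 \times 81 = 90720\,x^4$$

Example 5: Find the middle term in the expansion of $\left(x^{2/3} + \dfrac{1}{x^{1/2}}\right)^{10}$.

Solution: In the present example, n = 10.

The middle term is the $(n/2 + 1)$th term, that is, $T_{10/2+1} = T_6$.

The (r+1)th term is $T_{r+1} = {}^{n}C_{r}\,x^{n-r}y^r$.

Here n = 10, r = 5, $x = x^{2/3}$, $y = 1/x^{1/2}$.

$$\therefore\; T_6 = {}^{10}C_{5}\,(x^{2/3})^5 \left(\frac{1}{x^{1/2}}\right)^5 = \frac{10!}{5!\,5!} \cdot \frac{x^{10/3}}{x^{5/2}} = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}\,x^{5/6} = 252\,x^{5/6}$$

$T_6$ = the middle term = $252\,x^{5/6}$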

Now we shall see, with a solved example, how to find the coefficient of a given term in a binomial expansion.

Example 6: Find the coefficient of $x^5$ in $(x-2)^9$.

Solution: We know that the (r+1)th term in the expansion of $(x-y)^n$ is

$$T_{r+1} = (-1)^r\,{}^{n}C_{r}\,x^{n-r}y^r$$

In the present example n = 9, y = 2, and we are required to find the coefficient of $x^5$.

Since $x^{n-r} = x^5$, we have n-r = 5, and therefore r = n-5 = 9-5 = 4.

Therefore, $T_5 = (-1)^4\,{}^{9}C_{4}\,x^5\,(2)^4 = {}^{9}C_{4}\,x^5\,(2)^4$

$${}^{9}C_{4} = \frac{9!}{4!\,5!} = \frac{9 \times 8 \times 7 \times 6 \times 5!}{4!\,5!} = \frac{9 \times 8 \times 7 \times 6}{1 \cdot 2 \cdot 3 \cdot 4} = 126$$

$$\therefore\; T_5 = 126 \times 16 \times x^5 = 2016\,x^5 \qquad \therefore\; \text{Coeff. of } x^5 = 2016.$$
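The general-term formula used in Examples 4 and 6 can be wrapped in a small helper. The sketch below assumes a binomial of the form $(ax + b)^n$ with numerical a and b; the function name `term_coefficient` is hypothetical.

```python
from math import comb

def term_coefficient(a: int, b: int, n: int, r: int) -> int:
    """Numerical coefficient of T_(r+1) in (a*x + b)^n.

    T_(r+1) = nCr * (a*x)^(n-r) * b^r, so the coefficient attached to
    x^(n-r) is nCr * a^(n-r) * b^r. A negative b supplies the (-1)^r sign.
    """
    return comb(n, r) * a ** (n - r) * b ** r

print(term_coefficient(2, -3, 8, 4))  # 90720  (Example 4: 5th term of (2x - 3)^8)
print(term_coefficient(1, -2, 9, 4))  # 2016   (Example 6: coefficient of x^5 in (x - 2)^9)
```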

Now we shall see how to solve problems involving terms which do not contain the variable x.

Example 7: The nth term in the expansion of $\left(3x - \dfrac{1}{3x}\right)^{30}$ is independent of x; find n.

Solution: Let $T_{r+1}$ be the term independent of x.

$$T_{r+1} = {}^{30}C_{r}\,(3x)^{30-r}\left(-\frac{1}{3x}\right)^r$$

$$= {}^{30}C_{r}\,3^{30-r}\,x^{30-r}\,(-1)^r\,3^{-r}\,x^{-r}$$

$$= {}^{30}C_{r}\,3^{30-2r}\,(-1)^r\,x^{30-2r}$$

$T_{r+1}$ will be independent of x when

30-2r = 0, or r = 15.

Therefore $T_{r+1} = T_{15+1} = T_{16}$, so n = 16.

1.4.2 Exponential Function

The power function is represented by the general form $y = x^n$, where n is any given number. A new function can be defined by the simple process of taking the base of the power as a fixed number and the index as variable. The function obtained by writing a variable power of a fixed number is called an exponential function. It can be written as $y = a^x$, where a is the fixed base of the function.

The functions of the type $y = e^{a+bx}$ are called exponential functions. Whenever exponential functions occur, the logarithm is taken to base e instead of to base 10. Tables usually give values of logarithms to base 10. The logarithm to base e of any number x, that is, $\log_e x$, can be obtained as follows:

$$\log_e x = \frac{\log_{10} x}{\log_{10} e} = \log_{10} x \times \frac{1}{0.4342945} = 2.3026\,\log_{10} x$$

To find the value of $\log_e 26$:

By applying the rule for changing base, $\log_{10} 26 = \log_e 26 \times \log_{10} e$.

From the tables of common logs, $\log_{10} 26 = 1.4150$.

Therefore, $1.4150 = \ln 26 \times 0.4343$

Note: $\log_e 26 = \ln 26$ and $\log_{10} e = 0.4343$

$$\text{or}\quad \ln 26 = \frac{1.4150}{0.4343} = 3.2581$$
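Where log tables are not at hand, the change of base can be reproduced with Python's standard `math` module. This is a brief sketch of the same calculation, not a replacement for the table method taught above.

```python
import math

log10_e = math.log10(math.e)  # the constant used to change base
log10_26 = math.log10(26)     # common logarithm, as read from tables
ln_26 = log10_26 / log10_e    # change of base: ln x = log10 x / log10 e

print(round(log10_e, 4))      # 0.4343
print(round(1 / log10_e, 4))  # 2.3026
print(round(log10_26, 4))     # 1.415
print(round(ln_26, 4), round(math.log(26), 4))  # 3.2581 3.2581
```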

Self-check Exercises

4. Expand the following:

$$\left(\frac{2x}{3} - \frac{3y}{2}\right)^5$$

5. Find the seventh term in $\left(1 - \dfrac{x}{2}\right)^{10}$.

6. Find the middle term in the expansion of

$$\left(\frac{x}{a} - \frac{a}{x}\right)^{10}$$

7. Find the coefficient of $x^{12}$ in the expansion of

$$\left(2x^2 - \frac{1}{x}\right)^{12}$$

8. The nth term in the expansion of $\left(5x - \dfrac{1}{5x}\right)^{30}$ is independent of x; hence find n.

1.5 Ratios, Proportions and Rates

Basic data on population come from censuses, surveys, and vital registration records. All of these give us large numbers classified according to some characteristics. These basic data are in actual or "absolute" numbers. Absolute numbers given by census reports include the total population, the male and female populations, the female population in the reproductive age groups, and some other characteristics.

In general, we need to measure absolute numbers in relation to other numbers, such as population in relation to area (density), population in relation to income (per capita income), population in relation to education (literacy), population in different age groups in relation to the total population for all ages, etc. This brings in the concept of relative numbers, called Rates and Ratios. In this section we shall study both rates and ratios.

Ratio: A ratio is the term used to denote a/b, where `a' and `b' are two numbers. It indicates so many of a per unit of b. To express a ratio in a convenient whole-number form, it is generally multiplied by a constant factor `k', which may be 100, 1,000, 10,000 or any other power of 10. It is important to note that a ratio involves only one "universe" or population; that is, both numerator and denominator are derived from the same source. An example is the college-going population of Maharashtra state relative to the college-going population of India.

Proportion: The second type of ratio is the "proportion". It is represented by a/(a+b), where `a' and `b' are numbers obtained from the same source. Examples are the proportion single by age group, the proportion of workers out of total workers, the proportion of children immunized out of total children, the masculinity ratio, etc. Besides these, some other ratios are computed as follows.

$$\text{(i) Child women ratio (CWR)} = \frac{\text{Children in the age group 0-4}}{\text{Women in the age group 15-44, or 15-49}} \times 1000$$

$$\text{(ii) Sex ratio at birth (SRB)} = \frac{\text{Total number of male births}}{\text{Total number of female births}} \times 100$$

$$\text{(iii) Parity progression ratio (PPR)} = \frac{\text{Women of parity i and above}}{\text{Women of parity i}} \times 1000$$
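The child women ratio and the sex ratio at birth are both of the form k × a/b. The sketch below shows the arithmetic only; every count in it is hypothetical, invented purely for illustration, and does not refer to any real population.

```python
def ratio(numerator: float, denominator: float, k: int = 1) -> float:
    """Generic ratio k * a / b, with k a power of 10 such as 100 or 1000."""
    return k * numerator / denominator

# Hypothetical counts (not real data).
children_0_4 = 12_000
women_15_49 = 30_000
male_births = 5_250
female_births = 5_000

cwr = ratio(children_0_4, women_15_49, k=1000)  # children per 1000 women
srb = ratio(male_births, female_births, k=100)  # male births per 100 female births

print(cwr)  # 400.0
print(srb)  # 105.0
```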

Rate: A rate is a special type of ratio used to indicate the relative frequency of the occurrence of
a particular event within a population or sub-population in a specified period of time, usually
one year. Although this usage is recommended, the term has steadily acquired a wider meaning
and is often incorrectly used as a synonym for ratio. For example, percentage of population
literate is often termed as literacy rate.

A rate is defined as (a/b)*K, where `a' and `b' are derived from two different sources. For example, in computing the crude birth rate, the number of births in the numerator is obtained from vital statistics registration records, and the mid-year population in the denominator is obtained from censuses. Some examples of rates used in population analysis are: general fertility rate, age-specific marital fertility rate, total fertility rate, age cumulative fertility rate, crude death rate, age-specific death rate, infant mortality rate, and maternal mortality rate.
1.6 Arithmetic, Geometric, and Exponential Rates of Population Growth

The change in population is measured by the annual rate of growth. The rate of
population growth can be measured in two ways. One is to find the difference between the
numbers of people present at two different dates (as absolute number), and from this to calculate
the annual rate of change during the intervening period (a relative number); the other is to
reckon the rate of change from the records of individual changes as they occurred-births, deaths,
and migration - based on vital statistics. Here we are concerned with the first approach.

The rate of growth of population may follow -

(i) Arithmetic rate of growth,


(ii) Geometric rate of growth, and
(iii) Exponential rate of growth.

Now let us discuss these different rates of growth in detail one by one.

(i) Arithmetic rate of growth or Linear Growth Function: The linear growth function is applied to estimate the intercensal population only when the population is growing by a constant amount. The linear growth function, or arithmetic rate of growth, is given by the equation

$$P_{t_2} = P_{t_1}(1 + rt)$$

where $P_{t_2}$ is the size of the population at time $t_2$;
$P_{t_1}$ is the size of the population at time $t_1$;
r is the arithmetic rate of growth at which the population is increasing between the times $t_1$ and $t_2$ (to be calculated); and
$t = t_2 - t_1$ is the time interval between $P_{t_2}$ and $P_{t_1}$.

Let us discuss the computational procedure of this method with the following example:

Example 8: The population of a country was 846 million in 1991 and 1027 million in 2001. Assuming that the annual increase in population was constant during 1991-2001, estimate the population in 1997. If the growth continues at the same rate, estimate the population in 2004. Also calculate the average annual growth rate during 1991-2001.

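Solution: The linear growth function is given by:

$$P_{t_2} = P_{t_1}(1 + rt)$$

In the present example,

$P_{t_2} = P_{2001} = 1027$ million
$P_{t_1} = P_{1991} = 846$ million

and t = time interval between 1991 and 2001 = 10 years.

Since $P_{t_2} = P_{t_1}(1 + rt)$, we have $P_{t_2} = P_{t_1} + P_{t_1} \cdot r \cdot t$, or $P_{t_1} \cdot r \cdot t = P_{t_2} - P_{t_1}$.

$$\therefore\; r = \frac{P_{t_2} - P_{t_1}}{P_{t_1} \times t} = \frac{1027 - 846}{846 \times 10} = 0.0213947 \quad \ldots \text{(i)}$$

The average annual growth rate during 1991-2001 is therefore about 2.14 per cent.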

The population in 1997 can be estimated by using the following equation:

$$P_{1997} = P_{1991}(1 + rt)$$

Substituting $P_{1991} = 846$, r = 0.0213947 and t = 6, which is the time interval in years between 1991 and 1997, we have

$$P_{1997} = 846\,(1 + 0.0213947 \times 6) = 954.6 \text{ million}$$

Assuming that the growth rate r remains constant, the population in 2004 can be estimated as:

$$P_{2004} = P_{2001}(1 + rt) = 1027\,(1 + 0.0213947 \times 3) = 1092.9 \text{ million}$$
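The steps of Example 8 can be reproduced as follows. This is a minimal sketch; the function names `arithmetic_rate` and `project_linear` are assumptions of this note.

```python
def arithmetic_rate(p1: float, p2: float, t: float) -> float:
    """Arithmetic growth rate r from P_t2 = P_t1 (1 + r t)."""
    return (p2 - p1) / (p1 * t)

def project_linear(p: float, r: float, t: float) -> float:
    """Population t years after the base population p under linear growth."""
    return p * (1 + r * t)

r = arithmetic_rate(846, 1027, 10)  # populations in millions, 1991 and 2001
p1997 = project_linear(846, r, 6)   # 6 years after 1991
p2004 = project_linear(1027, r, 3)  # 3 years after 2001

print(round(r, 5))      # 0.02139
print(round(p1997, 1))  # 954.6
print(round(p2004, 1))  # 1092.9
```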

(ii) Geometric rate of growth: The geometric rate of growth of population can be used to estimate the intercensal population when the successive ratios of population are constant. The equation for the geometric rate of growth is given by

$$P_{t_2} = P_{t_1}(1 + r)^t$$

where $P_{t_2}$ is the size of the population at time $t_2$;
$P_{t_1}$ is the size of the population at time $t_1$;
r is the geometric rate of growth at which the population is increasing between the times $t_1$ and $t_2$; and
$t = t_2 - t_1$ is the time interval between $P_{t_2}$ and $P_{t_1}$.

The annual rate of geometric growth (r) can be calculated as follows:

Taking logarithms on both sides of the above equation, we have

$$\log P_{t_2} = \log P_{t_1} + t \log(1 + r)$$

$$\therefore\; \log(1 + r) = \frac{\log P_{t_2} - \log P_{t_1}}{t}$$

By taking the antilogarithm of the above equation, we have

$$1 + r = \text{Antilog}\left[\frac{\log P_{t_2} - \log P_{t_1}}{t}\right]$$

$$\therefore\; r = \text{Antilog}\left[\frac{\log P_{t_2} - \log P_{t_1}}{t}\right] - 1$$
Example 9: Given below is the population of Rajasthan state in India for the census years 1991 and 2001. Calculate the annual rate of growth of population assuming the geometric law of growth, and estimate the population 5 years after the 1991 census.

Census date        Population
1st March 1991     44,005,900
1st March 2001     56,473,122

Solution: The formula for computing the annual rate of geometric growth is given as:

$$r = \text{Antilog}\left[\frac{\log P_{t_2} - \log P_{t_1}}{t}\right] - 1$$

In the present example,

$P_{t_2}$ = 56,473,122
$P_{t_1}$ = 44,005,900
t = 10 years

Substituting these values in the above equation for r, we have

$$r = \text{Antilog}\left[\frac{\log 56{,}473{,}122 - \log 44{,}005{,}900}{10}\right] - 1 = 1.0252576 - 1 = 0.0252576$$

Hence the population 5 years after 1st March 1991 is given by:

$$P_{1996} = P_{1991}(1 + 0.0252576)^5 = 44{,}005{,}900\,(1 + 0.0252576)^5 = 49{,}851{,}232$$
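The same calculation can be done without log tables by taking the t-th root directly. In this sketch the function name `geometric_rate` is an assumption; because it does not round the logarithms, its results differ from the table-based figures above only in the last digits.

```python
def geometric_rate(p1: float, p2: float, t: float) -> float:
    """Geometric growth rate r from P_t2 = P_t1 (1 + r)^t."""
    return (p2 / p1) ** (1 / t) - 1

p1991 = 44_005_900
p2001 = 56_473_122

r = geometric_rate(p1991, p2001, 10)
p1996 = p1991 * (1 + r) ** 5

print(round(r, 4))   # 0.0253
print(round(p1996))  # close to the 49,851,232 obtained above
```

Applying the estimated rate for the full 10 years returns the 2001 census figure, which is a useful check on the computed r.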
Example 10: If Sri Lanka's population is growing at the rate of 2 percent per annum, find (i) the time required to double the population; and (ii) the rate of growth which will double the population in 20 years.

Solution:

(i) $P_{t_2} = P_{t_1}(1 + r)^t$

The population after t years is to be $2P_{t_1}$, so

$$2P_{t_1} = P_{t_1}\left(1 + \frac{2}{100}\right)^t \quad\text{or}\quad 2 = \left(1 + \frac{2}{100}\right)^t$$

$$\log 2 = t \log(1.02)$$

$$0.3010 = t\,(0.0086)$$

$$t = \frac{0.3010}{0.0086} = 35 \text{ years.}$$

(ii) Given t = 20 years,

$$2P_{t_1} = P_{t_1}(1 + r)^{20}$$

$$2 = (1 + r)^{20}$$

$$\log 2 = 20 \log(1 + r)$$

$$0.3010 = 20 \log(1 + r)$$

$$\text{or}\quad \log(1 + r) = \frac{0.3010}{20} = 0.01505$$

$$1 + r = \text{Antilog}(0.01505) = 1.035$$

$$r = 1.035 - 1 = 0.035$$

Hence, to double the population in 20 years, the required rate of growth is 3.5 percent per annum.
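Both parts of Example 10 follow from $2 = (1 + r)^t$, solved once for t and once for r. The following is a minimal sketch with hypothetical function names.

```python
import math

def doubling_time(r: float) -> float:
    """Years needed to double under geometric growth at annual rate r."""
    return math.log(2) / math.log(1 + r)

def rate_to_double(t: float) -> float:
    """Annual geometric rate that doubles the population in t years."""
    return 2 ** (1 / t) - 1

print(round(doubling_time(0.02), 1))  # 35.0
print(round(rate_to_double(20), 3))   # 0.035
```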

(iii) Exponential Rate of Growth: The equation for the exponential rate of growth is given as

$$P_{t_2} = P_{t_1}\,e^{rt}$$

where
$P_{t_2}$ is the size of the population at time $t_2$;
$P_{t_1}$ is the size of the population at time $t_1$;
r is the exponential rate of growth at which the population is increasing between the times $t_1$ and $t_2$; and
$t = t_2 - t_1$ is the time interval between $P_{t_2}$ and $P_{t_1}$.

Example 11: The Scheduled Tribe populations of Andhra Pradesh for the census years 1981 and
1991 are 3,176,001 and 4,199,481 respectively. Calculate the annual rate of growth of the Scheduled
Tribe population assuming the exponential law of growth and estimate the Scheduled Tribe population
in 1996.

Solution: We know that

$P_{t_2}$ = 4,199,481, $P_{t_1}$ = 3,176,001 and t = 10 years.

$$\frac{P_{t_2}}{P_{t_1}} = e^{rt}$$

$$\ln\frac{P_{t_2}}{P_{t_1}} = r \times t \times \ln e$$

$$r = \frac{\ln\left(P_{t_2}/P_{t_1}\right)}{t} \qquad (\because \ln e = 1)$$

$$r = \frac{\ln\left(\dfrac{4199481}{3176001}\right)}{10} = \frac{\ln(1.3222543)}{10} = 0.0279338 = 2.8\%$$

Therefore, the annual rate of exponential growth is 2.8 per cent per annum.

Now,

$$P_{96} = P_{91} \times e^{rt} = 4{,}199{,}481 \times e^{0.0279338 \times 5} = 4{,}199{,}481 \times e^{0.139669} = 4{,}199{,}481 \times 1.14989 = 4{,}828{,}954$$
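The steps of Example 11 can be reproduced with natural logarithms in Python. This is only a sketch; the function names are illustrative assumptions.

```python
import math


def exponential_growth_rate(p1, p2, years):
    """Rate r that solves p2 = p1 * exp(r * years)."""
    return math.log(p2 / p1) / years


def project_exponential(p, r, years):
    """Population after `years` years of exponential growth at rate r."""
    return p * math.exp(r * years)


# Scheduled Tribe population of Andhra Pradesh, 1981 and 1991 (Example 11)
r_exp = exponential_growth_rate(3_176_001, 4_199_481, 10)
p96 = project_exponential(4_199_481, r_exp, 5)
print(f"r = {r_exp:.7f}, estimated 1996 population = {p96:,.0f}")
```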
1.7 Estimation of Mid-Year Population

        In demography, while computing crude rates such as the crude birth rate or crude death
rate, the denominator refers to the mid-year population of the area for the year concerned. This
mid-year population is often taken to represent the person-years lived by the population during
the year in question. This is essential because the numerator is a record of events over a period
of 12 months; in other words, the sum of all events that occur in a year. In order to relate this to
the denominator, the population should also be counted over a year. This is achieved by obtaining
the number of person-years lived by the population or the population count at the middle of the year.
These two concepts are not identical but in most cases they are approximately equivalent.

Estimation of Mid-Year Population from Census Data:

(i) If the last census is within the same year for which vital rate is required, the census
population figure can be taken.

(ii)    If the two censuses are conducted with a gap of one year, the mid-year population is
        calculated by adding half of the difference between the two census populations to the
        earlier count, assuming that the increment or decrement in population size is uniform
        during the inter-censal period.

(iii)   If the date of the estimate falls between two censuses that are more than one year apart
        (generally 5 or 10 years), it is still possible to estimate the mid-year
        population by using the formula given below:

$$P = P_1 + \frac{n}{N}(P_2 - P_1)$$

Where P is the mid-year population to be estimated.
      $P_1$ is the initial population at the first census.
      $P_2$ is the final population at the second census.
      N is the number of months between the two censuses.
      n is the number of months between the date of $P_1$ and the date of estimation.

Example 12: The population of Himachal Pradesh was 5,170,877 in 1991 and 6,077,248 in
2001. Compute the mid-year population in 1996.

Solution: We know that the formula for computation of mid-year population is

$$P_t = P_{t_1} + \frac{n}{N}(P_{t_2} - P_{t_1})$$

In this case

$P_{t_1} = P_{1991}$ = 5,170,877;
$P_{t_2} = P_{2001}$ = 6,077,248;
n = 1996 - 1991 = 5;
N = 2001 - 1991 = 10.

$$\therefore\; P = 5{,}170{,}877 + \frac{5}{10}(6{,}077{,}248 - 5{,}170{,}877)$$

  = 5,170,877 + 1/2 x 906,371

  = 5,170,877 + 453,185 = 5,624,062

Mid-year population in 1996 = 5,624,062.
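The inter-censal formula is a straight-line interpolation between two census counts. The sketch below uses a hypothetical function name and expresses n and N in months, as in the definition above (60 and 120 months for Example 12).

```python
def intercensal_population(p1, p2, n, N):
    """Linear estimate of the population n time units after the first census,
    where the two censuses are N time units apart."""
    return p1 + (n / N) * (p2 - p1)


# Himachal Pradesh, 1991 and 2001 (Example 12): 60 of 120 months elapsed
p_mid = intercensal_population(5_170_877, 6_077_248, 60, 120)
print(f"estimated 1996 population = {p_mid:,.1f}")
```

The same function can be applied with n and N in years, since only the ratio n/N enters the formula.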

Self-Check Exercises

9.    The Scheduled Tribe populations of Andhra Pradesh for the census years 1971 and
      1981 were 1,657,657 and 3,176,001 respectively. Calculate the annual rate of growth
      of the Scheduled Tribe population assuming the arithmetic law of growth during 1971-81. If the
      population continues to grow at the arithmetic rate, estimate the population in 1986.

10. Given below is the population of Bihar, an Indian State, for the census years 1951 to
1981. Calculate the annual geometric growth rate of the population for Bihar during
the census decades 1951-1961, 1961-1971, and 1971-1981. Also estimate the
population of Bihar for the years 1956, 1966, 1976, and 1991 assuming geometric law
of growth for the corresponding decades.

Population
1951 1961 1971 1981
Bihar 38,782,271 46,447,547 56,353,369 69,914,734

11. If India's population is growing at the rate of 2.23 per cent per annum, find the time
required to double the population and also find the rate of growth, which will double
the population in 17 years.

12. The population of a country on 30th June 1960 was 1.9 million, and on 30th June 1970
it was 2.4 million. Find (i) the exponential rate of growth during 1960 and 1970; (ii)
the estimated population on 30th June, 1968 assuming exponential rate of growth; and
(iii) the time the population would be double that of the 1970 value.

13.   The population of Sri Lanka was 12,689,897 in October 1971 and 14,850,001 in March
      1981. Compute the October 1975 population by using the formula for mid-year population.

14.   The mid-year population of a region in 1981 was 19,896,843. If the number of births
      during the year 1981 was 656,596, compute the crude birth rate for the year 1981.

Let Us Sum Up

After completing this unit, you would have learned about the following:

* If a certain act A1 can be performed in m1 different ways and another act A2 can be performed
in m2 different ways then the total number of ways in which either A1 or A2 can be performed
is m1 + m2. Thus, for example, if there are 5 mathematics books and 4 physics books, and if a
boy is to choose either a mathematics book or a physics book he can do so in 5+4=9 ways.
However, if a certain act A1 can be performed in m1 different ways, and having performed it in
any one of these m1 ways, another act A2 can be performed in m2 different ways then the two
acts, A1 and A2 can be performed in the stated order in m1 x m2 ways.

* Let there be n different objects, which are to be arranged in a line, taking only r of them
  (0<r<n) at a time. Each possible arrangement in a line of r objects is called a Permutation of
  n objects taken r at a time. The total number of such arrangements is denoted by ${}^{n}P_{r}$,
  $P(n,r)$ or ${}_{n}P_{r}$.

$${}^{n}P_{r} = n(n-1)(n-2)\cdots(n-r+1)$$
* Let there be n different objects out of which r (where 0<r<n) are to be chosen at a time. A
  group of r objects selected out of the n objects, without reference to the order of selection, is
  called a combination of n objects taken r at a time. The total number of such combinations is
  denoted by ${}^{n}C_{r}$, $\binom{n}{r}$ or $C(n,r)$.

  (i)     ${}^{n}C_{r} = {}^{n}C_{n-r}$

  (ii)    ${}^{n}C_{r} + {}^{n}C_{r-1} = {}^{n+1}C_{r}$

* If n is a positive integer and x, y are any two numbers, then the Binomial theorem states that:

$$(x+y)^n = x^n + {}^{n}C_{1}x^{n-1}y + {}^{n}C_{2}x^{n-2}y^2 + {}^{n}C_{3}x^{n-3}y^3 + \cdots + {}^{n}C_{r}x^{n-r}y^r + \cdots + {}^{n}C_{n-1}xy^{n-1} + y^n = \sum_{r=0}^{n} {}^{n}C_{r}\, x^{n-r} y^{r}$$

* Population growth rates are usually computed on the basis of the formulae

$$P_{t_2} = P_{t_1}(1 + rt)$$

$$P_{t_2} = P_{t_1}(1 + r)^{t}$$

$$P_{t_2} = P_{t_1} e^{rt}$$

  where $P_{t_2}$, $P_{t_1}$ and other notations have the same meaning as explained in the text.

* The mid-year population is more often called the person years lived by the population during
the year under question. This is essential because the numerator is a record of events over a
period of 12 months; in other words, the sum of all events that occur in a year. In order to
relate this to the denominator, population should also be counted over a year. This is achieved
by obtaining the number of person years lived by the population or the population count at the
middle of the year.

* If the date of the estimate falls between two censuses that are more than one year apart
  (generally 5 or 10 years), it is still possible to estimate the mid-year population by using the
  formula given below:

$$P_t = P_{t_1} + \frac{n}{N}(P_{t_2} - P_{t_1})$$

  where $P_{t_2}$ is the population at time $t_2$;
        $P_{t_1}$ is the population at time $t_1$;
        $n = t - t_1$; $N = t_2 - t_1$.

Model Answers

1. (i) 6 (ii) 12 (iii) 20

Hint: Follow solved example No. 1.

2. 1365

Hint: Here n=15, r=11 and you are required to select them without any
importance of order of selection

3. 560 ways

Hint: Here we have to choose 3 men out of 8 and 2 women out of 5. This can be
done in 8C3 and 5C2 ways respectively. Hence by rule 2 the committee can
be formed by 8C3 x 5C2 ways.
4.    $\dfrac{32}{243}x^5 - \dfrac{40}{27}x^4y + \dfrac{20}{3}x^3y^2 - 15x^2y^3 + \dfrac{135}{8}xy^4 - \dfrac{243}{32}y^5$

Hint: Use the expansion of (x-y)n & substitute appropriate values for
x = 2x/3,y=3y/2 and n = 5.

5. 105/32 x6

Hint: Follow the solved example No. 4.

6. (a) -252; (b) 189/8 x17

Hint: Follow the solved example No. 5.

7.    Coefficient of $x^{12}$ is 7920

Hint: Follow the solved example No. 6.

8. n = 16

Hint: Assume that Tr+1 be the term independent of x. Find r and then Tr+1=Tn,
therefore n = r+1.

9. r = 0.091596
P86 = 4,630,546

Hint: Follow the solved example No. 8.

10. (i) Geometric Rates of Growth for Bihar


1951-61 1961-71 1971-81
0.018199 0.019520 0.021798

(ii) Estimated Population of Bihar


1956 1966 1976 1991
42,442,171 51,161,222 62,768,868 86,739,624

Hint: Follow the solved example No. 9.

11. (i) Time required to double the population = 31.4 years.


(ii) To double the population in 17 years the required rate of growth is 4.2
percent per annum.

Hint: Follow the solved example No. 10.

12. (i) 2.336 per cent per annum


(ii) 2.2904 million
(iii) 29.67 years

Hint: Follow the solved example No. 11.

13. 13,599,409

Hint: Follow the solved example No. 12.

14. CBR = 33 per thousand population

Hint: Follow the solved example No. 12.

Unit 2: Interpolation and Graduation

Unit Structure

2.0 Objectives
2.1 Introduction
2.2 Methods of Interpolation
2.3 Uses of Interpolation Formula
2.4 Limitations of Interpolation [Self-Check Exercises]
2.5 Graduation
2.6 Methods of Graduation
2.7 Osculatory Interpolation
2.7.1 Modified Osculatory Interpolation
2.7.2 Comparison and Selection of Osculatory Interpolation Formulas
[Self-Check Exercises]

Let Us Sum Up
Model Answers

2.0 Objectives

In this unit you are expected to learn about -

▪ different methods of Interpolation and their relative merits and demerits,


▪ different methods of graduation and their applications.

2.1 Introduction

        Interpolation is the insertion of an intermediate value in a series of items. For example, the
Indian census is conducted every 10 years. If, for an investigation, inter-censal population
figures are required, we have no alternative except to estimate the most likely figures on
the basis of the decennial censuses. Extrapolation is the technique of finding the most probable
figures for some future date.

       Interpolation is the process of finding the value of the function (y) for any value of the
independent variable (x) within a given range of values of x, and extrapolation is the process of
finding the value outside the given range of x.

        If two variables are connected by a known relation, then for each value of one variable there
is a corresponding value of the other. For example, with the expression $y = x^2 + 3x + 2$, if x = 2 then
y = 12; this can be determined directly by substituting the value of x and solving for y. In many
cases, however, the relationship connecting the two variables is unknown and, usually, only pairs of
values of x and y are given. If one wishes to estimate the value of the dependent variable y for
specified values of the independent variable x lying between the given points, one must establish an
empirical relationship between the two variables. One way of doing this would be to fit a curve through
the data, then to substitute the known value of the independent variable x into the formula for the
curve and solve for the unknown value of the dependent variable. Frequently, a curve that
adequately fits the entire distribution of data cannot be easily fitted. Moreover, it requires a large
investment of time to fit a curve to an entire distribution in order to read off the values at a few
intermediate points for which information is not given. Instead, one interpolates between the known
points, taking into account several adjacent values to estimate the shape of the curve around the
point of interpolation.

        The interpolation analysis is done on the basis of two assumptions: (i) the quantity
changes continuously without any break or sudden jumps and (ii) the rate of change (rise or fall)
is uniform and there are no sudden jumps in the data. In other words, the data are assumed to be
in the shape of a continuous or smooth curve. Suppose, for example, that we are interpolating the
population of India in the year 1955 and that we are given the figures of the Indian population for
the years 1941, 1951, 1961, 1971 and 1981. Our presumption would be that the population has
grown smoothly and that there are no violent ups and downs in these figures. We also assume
that the rate of growth of the Indian population has been uniform throughout the period 1941 to
1981.

        The accuracy of the interpolated figures depends on two factors: (i) the
knowledge of the possible fluctuations of the figures and (ii) the knowledge about the course of
events relating to the problem under investigation. If the assumptions of interpolation are not
fulfilled, the interpolated figures would be fictitious. Interpolated figures are not a perfect
substitute for the original figures. They are only the best possible estimates under certain
assumptions.

2.2 Methods of Interpolation

Broadly speaking there are two types of methods of interpolation. They are:
(1) Graphic method
(2) Algebraic methods.

I. Graphic Method: The graphic method is applicable in all types of data. According to this
method, the values given are plotted on a graph and are joined by a straight line. The line so
obtained is then smoothed. It is possible to determine the value of y for any x within the given
limits from the smoothed curve. Graphs are useful for deriving rough estimates for subdivision
of grouped data as well as for estimating values in a point series. They are especially useful
when the grouped data are unevenly spaced.

[Figure: population (in millions) on the y-axis plotted against the census years 1931, 1941, 1951, 1961, 1971 and 1981 on the x-axis]

        For example, suppose the figures of the population for the years 1931, 1941, 1951, 1961, 1971
and 1981 are available and it is desired to find the population for the years 1956 and 1966. Take
years along the x-axis and represent the population along the y-axis. Plot the given points and draw
a continuous smooth curve connecting them. To interpolate the population
figures for the years 1956 and 1966, first locate these years on the x-axis. From these points,
draw two ordinates up to the curve and, from the points where the ordinates meet the curve, draw
horizontal lines to the y-axis. The values read off where these lines meet the y-axis are the
interpolated figures for the years 1956 and 1966.

II. Algebraic Methods: In a situation where one quantity changes continuously and regularly
and another quantity changes in relation to it, we can estimate the discontinuous value of the
second quantity corresponding to the first by using algebraic methods. The important methods
of interpolation are

(a) Linear Interpolation


(b) Newton's Forward Difference Formula
(c) Newton's Backward Difference Formula
(d) Newton's Divided Difference Formula
(e) Lagrange's Formula

(a) Linear Interpolation: The simplest type of interpolation is linear interpolation. It requires
knowledge of the values at two points, one above and one below the point of interpolation, which
define the interval within which the interpolation is made. It may be given a general expression
by the following formula.

$$u_x = u_0 + \frac{x}{h}(u_i - u_0) \qquad ...(2.1)$$

where $u_x$ = Interpolated value
      $u_0$ = The value at the beginning of the interval.
      $u_i$ = The value at the end of the interval.
      h = The length of the interval
      x = The distance between the beginning of the interval and the point for which an
          interpolation is being made.

This interval can be either one of time or a class interval within a compositional
classification. Thus one can interpolate between two censuses or between two ages of an age
classification for a given census. Interpolation frequently must be used in order to make data
comparable. If two distributions have been tabulated for unlike age intervals, it is possible to
make equal intervals by using interpolation to make distributions approximately comparable.
For example, if the census dates fall in different years; by interpolation it is possible to arrive at
population estimates for common years.

Example 1: Below are the mortality rates for India at specified ages based on 1961-71 deaths.
Find the estimates of mortality rate for age 32.

Age Mortality rates

25 4.38
30 5.26
35 6.74
40 9.14
45 12.82
        To find the mortality rate for age 32, we apply the linear interpolation formula as given
below.

$$u_x = u_0 + \frac{x}{h}(u_i - u_0)$$

Where $u_0$ = 5.26; $u_i$ = 6.74; x = 2; h = 5

$$u_{32} = 5.26 + \frac{2}{5}(6.74 - 5.26) = 5.26 + 0.592 = 5.852$$

Mortality rate for age 32 is 5.852

(b) Newton's Forward Difference Formula: Simple (linear) interpolation will often not be adequate, as
most population distributions are not linear. Usually they have marked curvilinearity.
It is true, however, that simple interpolation can be used with a curvilinear distribution,
provided the interval of interpolation is small enough. Unfortunately, most population
distributions are tabulated in broad intervals that permit little use of linear interpolation. Hence
more exact techniques are applied: Newton's Forward Difference Formula when the values of the
argument are given at equal intervals, and Newton's Divided Difference or Lagrange's
formula when the intervals are unequal. For applying Newton's formula, it is necessary to have
some elementary knowledge of finite differences and the construction of difference tables. The
next section is devoted to the concepts, the construction of a difference table, and its applications.

Method of Finite Differences: Let $u_x$ represent the value of u for a specified value of x in a
distribution. The known values of $u_x$ may be represented by $u_0, u_1, u_2, \ldots$ etc. That is, the
values of x for which $u_x$ is known may be assigned the integers 0, 1, 2, ... etc.

        The symbol $\Delta^{1}_{x}$ (Δ (delta) is a symbol representing differences) will be used to
represent the difference between two successive known values of the distribution, $u_{x+1} - u_x$. Such
differences are called first order differences or simply "first differences". The superscript
identifies the order of the difference and the subscript specifies which pair of values has been
differenced. For example, $\Delta^{1}_{0}$ means that $u_0$ has been subtracted from the value $u_1$.

         If the underlying relationship between two variables is linear, then a change of one unit in
one variable is accompanied by a fixed amount of change in the other variable. This means that
equal changes in x are accompanied by equal changes in u. Hence, it is readily apparent that
linear interpolation assumes that the first differences are constant, that is, of equal size.
Suppose first order differences are taken for all successive known values of u. It is then possible
to take the differences of the successive pairs of first differences. These are termed `second order
differences' or simply "second differences" and are represented by the symbol $\Delta^{2}_{x}$. The
superscript signifies that the differences are of second order and the subscript specifies which
first order differences have been subtracted from each other. For example,

$$\Delta^{2}_{0} = \Delta^{1}_{1} - \Delta^{1}_{0}$$

$$\Delta^{2}_{1} = \Delta^{1}_{2} - \Delta^{1}_{1}$$

$$\Delta^{2}_{2} = \Delta^{1}_{3} - \Delta^{1}_{2}$$

Also, $\Delta^{2}_{0}$ is pronounced as delta-zero-two.

Similarly, third, fourth, ..., nth differences are denoted by $\Delta^{3}_{x}, \Delta^{4}_{x}, \ldots, \Delta^{n}_{x}$.
Differences of higher order are used to interpolate within curvilinear distributions.

It is convenient to introduce alternative names for x and y in our equations y =ux. The
independent variable is often termed as the argument and the corresponding value of y the entry.

Difference Table: The table given below illustrates the construction of a difference table for the
equation $y = u_x$.

                 Table 2.1: Single Difference Table for a General Case $y = u_x$

Argument   Entry     First          Second         Third          Fourth
   x       $u_x$     Difference     Difference     Difference     Difference

   0       $u_0$
                     $\Delta^1_0$
   1       $u_1$                    $\Delta^2_0$
                     $\Delta^1_1$                  $\Delta^3_0$
   2       $u_2$                    $\Delta^2_1$                  $\Delta^4_0$
                     $\Delta^1_2$                  $\Delta^3_1$
   3       $u_3$                    $\Delta^2_2$                  $\Delta^4_1$
                     $\Delta^1_3$                  $\Delta^3_2$
   4       $u_4$                    $\Delta^2_3$                  $\Delta^4_2$
                     $\Delta^1_4$                  $\Delta^3_3$
   5       $u_5$                    $\Delta^2_4$
                     $\Delta^1_5$
   6       $u_6$

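       The first term, $u_0$, in Table 2.1 is called the leading term, and the differences at the head
of the respective columns, namely $\Delta^1_0, \Delta^2_0, \Delta^3_0, \Delta^4_0$, are called the leading differences.
Although we have expressed the terms in the difference table by the use of Δ symbols, it is
quite easy to obtain any difference in terms of the entries alone.

For example, $\Delta^3_0$ is the difference between $\Delta^2_1$ and $\Delta^2_0$, i.e., $\Delta^3_0 = \Delta^2_1 - \Delta^2_0$.

Again, $\Delta^2_0$ is the difference between $\Delta^1_1$ and $\Delta^1_0$, i.e., $\Delta^2_0 = \Delta^1_1 - \Delta^1_0$, and $\Delta^1_0 = u_1 - u_0$.

So, we have

$$\Delta^3_0 = \Delta^2_1 - \Delta^2_0 = (\Delta^1_2 - \Delta^1_1) - (\Delta^1_1 - \Delta^1_0) = \Delta^1_2 - 2\Delta^1_1 + \Delta^1_0$$

$$= (u_3 - u_2) - 2(u_2 - u_1) + (u_1 - u_0) = u_3 - 3u_2 + 3u_1 - u_0$$

Similarly, $\Delta^4_0 = \Delta^3_1 - \Delta^3_0$, where

$$\Delta^3_1 = \Delta^2_2 - \Delta^2_1 = (\Delta^1_3 - \Delta^1_2) - (\Delta^1_2 - \Delta^1_1) = \Delta^1_3 - 2\Delta^1_2 + \Delta^1_1$$

$$= (u_4 - u_3) - 2(u_3 - u_2) + (u_2 - u_1) = u_4 - 3u_3 + 3u_2 - u_1$$

Thus,

$$\Delta^4_0 = (u_4 - 3u_3 + 3u_2 - u_1) - (u_3 - 3u_2 + 3u_1 - u_0) = u_4 - 4u_3 + 6u_2 - 4u_1 + u_0$$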

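In general, the nth difference can be written as:

$$\Delta^n_0 = \binom{n}{0}u_n + (-1)^{1}\binom{n}{1}u_{n-1} + (-1)^{2}\binom{n}{2}u_{n-2} + \cdots + (-1)^{r}\binom{n}{r}u_{n-r} + \cdots + (-1)^{n}\binom{n}{n}u_{n-n} \qquad ...(2.2)$$

Conversely, each entry can be expressed in terms of the leading term and the leading differences:

$$u_0 = u_0$$

$$u_1 = u_0 + \Delta^1_0$$

$$u_2 = u_1 + \Delta^1_1 = (u_0 + \Delta^1_0) + (\Delta^2_0 + \Delta^1_0) = u_0 + 2\Delta^1_0 + \Delta^2_0$$

$$u_3 = u_2 + \Delta^1_2 = (u_0 + 2\Delta^1_0 + \Delta^2_0) + (\Delta^2_1 + \Delta^1_1)$$

$$= (u_0 + 2\Delta^1_0 + \Delta^2_0) + (\Delta^3_0 + \Delta^2_0) + (\Delta^2_0 + \Delta^1_0) = u_0 + 3\Delta^1_0 + 3\Delta^2_0 + \Delta^3_0$$

Proceeding in a similar fashion you can compute $u_4, u_5, u_6, \ldots$ etc., and

$$u_n = u_0 + \binom{n}{1}\Delta^1_0 + \binom{n}{2}\Delta^2_0 + \cdots + \binom{n}{n-r}\Delta^{n-r}_0 + \cdots + \binom{n}{n}\Delta^n_0 \qquad ...(2.3)$$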

        We can now generalize as follows:

$$u_x = u_0 + Z\Delta^1_0 + \frac{Z(Z-1)}{1 \times 2}\Delta^2_0 + \frac{Z(Z-1)(Z-2)}{1 \times 2 \times 3}\Delta^3_0 + \cdots \qquad ...(2.4)$$

where $Z = \dfrac{x - x_0}{h}$
      h = Difference between two adjoining values of the argument.
      x = Value of the argument for which the entry is to be interpolated.
      $x_0$ = Point of origin.

This important equation is called 'Newton's Forward Difference Formula'.

Example 2: The following table contains pairs of values satisfying the equation $u_x = 1 + x$, with
differences of various orders.

                                   Table 2.2: Difference table

   x       $u_x$     $\Delta^1_x$     $\Delta^2_x$
   0       1
                     1
   1       2                          0
                     1
   2       3                          0
                     1
   3       4                          0
                     1
   4       5

The first order differences are obtained by subtracting successively each value of u x
from the value of ux immediately below it.

The second order differences are obtained by performing similar subtraction on the
first order differences. If the relationship between ux and x is linear then the first differences
are constant and the second difference are zero.

Example 3: The following table contains pairs of values satisfying the equation
$u_x = 1 + x + x^2$, with differences of various orders.

                                   Table 2.3: Difference Table

   x       $u_x$     $\Delta^1_x$     $\Delta^2_x$     $\Delta^3_x$
   0       1
                     2
   1       3                          2
                     4                                 0
   2       7                          2
                     6                                 0
   3       13                         2
                     8                                 0
   4       21                         2
                     10
   5       31

        The relationship between $u_x$ and x is curvilinear, as defined in the equation. The
increasing size of the first order differences with increasing values of x emphasizes this fact.
But note that the second order differences are constant and the third order differences are zero.

Example 4: The differencing process is now applied to a function that is even more complex:

$$u_x = x^4 + x^3 + 5x + 4$$

                                   Table 2.4: Difference table

   x       $u_x$     $\Delta^1_x$     $\Delta^2_x$     $\Delta^3_x$     $\Delta^4_x$     $\Delta^5_x$
   0       4
                     7
   1       11                         20
                     27                                42
   2       38                         62                                24
                     89                                66                                0
   3       127                        128                               24
                     217                               90
   4       344                        218
                     435
   5       779

Summarizing the results of the examples, we find

(i) In the linear equation the first differences of ux were equal and the second differences
were zero.
(ii) In the second-degree equation (highest term x2), the second difference were equal and
the third differences were zero.
(iii)   In the fourth-degree equation (highest term $x^4$), the fourth differences were equal and the
        fifth differences were zero.
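The pattern summarized above can be checked numerically. The following Python sketch builds all columns of a difference table for equally spaced entries; `difference_table` is an illustrative helper name, and the data are the entries of Example 4.

```python
def difference_table(values):
    """Return [entries, first differences, second differences, ...]
    for a list of equally spaced entries."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        # each new column holds the differences of successive pairs
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns


# Entries of u_x = x^4 + x^3 + 5x + 4 for x = 0, 1, ..., 5 (Example 4)
entries = [x**4 + x**3 + 5 * x + 4 for x in range(6)]
table = difference_table(entries)
for order, column in enumerate(table):
    print(order, column)
```

Because the polynomial is of the fourth degree, the column of fourth differences is constant and the fifth difference vanishes.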

Newton's Forward Difference Formula (NFDF): Newton's forward formula is applied
when the values of the independent variable advance by equal intervals. Given the leading
differences, it is possible to reproduce all other differences, simply by successively adding
adjacent pairs of values together and placing the total under the left entry of the pair.

        If there are n arguments and n corresponding entries, Newton's forward difference
formula for the entry $u_x$ to be interpolated for the argument x is:

$$u_x = u_0 + Z\Delta u_0 + \frac{Z(Z-1)}{2!}\Delta^2 u_0 + \frac{Z(Z-1)(Z-2)}{3!}\Delta^3 u_0 + \cdots \qquad ...(2.5)$$

Where $Z = \dfrac{x - x_0}{h}$ and $h = x_1 - x_0$,
and $u_0, \Delta u_0, \Delta^2 u_0, \Delta^3 u_0, \ldots$ etc. are the leading term and the leading differences
occurring at the top of the columns in difference Table 2.5. Newton's Forward Difference Formula
should be used when the figure to be interpolated lies near the beginning of the table. The reason
is that we take only the leading differences into account, which are always at the top.

        It provides the basis for methods of interpolation that permit the assumption of a
curvilinear relationship. Linear interpolation, in fact, considers only two successive values of $u_x$
and uses the linear relationship that will reproduce these two values to estimate $u_x$ for any
intermediate (fractional) value of x. However, if we consider three adjacent values ($u_0, u_1, u_2$),
there is a polynomial of the second degree in x which will reproduce these three values. Similarly,
there is a polynomial of the fourth degree in x which will reproduce five adjacent values of $u_x$
($u_0, u_1, u_2, u_3, u_4$).

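                           Table 2.5: Newton's Forward Difference Table

Argument x   Entry $u_x$   $\Delta^1_x$                        $\Delta^2_x$                                      $\Delta^3_x$                                      $\Delta^4_x$
   0         $u_0$
                           $u_1 - u_0 = \Delta^1 u_0$
   1         $u_1$                                             $\Delta u_1 - \Delta u_0 = \Delta^2 u_0$
                           $u_2 - u_1 = \Delta^1 u_1$                                                            $\Delta^2 u_1 - \Delta^2 u_0 = \Delta^3 u_0$
   2         $u_2$                                             $\Delta u_2 - \Delta u_1 = \Delta^2 u_1$                                                            $\Delta^3 u_1 - \Delta^3 u_0 = \Delta^4 u_0$
                           $u_3 - u_2 = \Delta^1 u_2$                                                            $\Delta^2 u_2 - \Delta^2 u_1 = \Delta^3 u_1$
   3         $u_3$                                             :
                           $u_4 - u_3 = \Delta^1 u_3$
   4         $u_4$                                             :
   :         :             :                                   :
   n-2       $u_{n-2}$
                           $u_{n-1} - u_{n-2} = \Delta^1 u_{n-2}$
   n-1       $u_{n-1}$                                         $\Delta u_{n-1} - \Delta u_{n-2} = \Delta^2 u_{n-2}$
                           $u_n - u_{n-1} = \Delta^1 u_{n-1}$
   n         $u_n$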

Values of $u_x$ may be estimated for fractional values of x within the range of the known pairs of
values by substituting the appropriate values of x in the NFDF. Hence it is possible to interpolate
between two known values of $u_x$ even though the relation between $u_x$ and x is curvilinear around
the point of interpolation.

Example 5: The age specific mortality rates for India at specified ages for 1961-71 are given
below. Estimate the mortality rate for age 32.

           Age          Age specific mortality rates
           25                     4.38
           30                     5.26
           35                     6.74
           40                     9.14
           45                    12.82
           50                    18.18

Let us form the difference table.

                                    Table 2.6: Difference table

 Age x     $u_x$     $\Delta^1_x$     $\Delta^2_x$     $\Delta^3_x$     $\Delta^4_x$
   25      4.38
                     0.88
   30      5.26                       0.60
                     1.48                              0.32
   35      6.74                       0.92                              0.04
                     2.40                              0.36
   40      9.14                       1.28                              0.04
                     3.68                              0.40
   45      12.82                      1.68
                     5.36
   50      18.18

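Taking the rate for age 25 as $u_0$, we have $x_0 = 25$, h = 5 and

$$z = \frac{x - x_0}{h} = \frac{32 - 25}{5} = 1.4$$

$$u_{1.4} = 4.38 + 1.4 \times 0.88 + \frac{1.4 \times 0.4}{1 \times 2} \times 0.60 + \frac{1.4 \times 0.4 \times (-0.6)}{1 \times 2 \times 3} \times 0.32 + \frac{1.4 \times 0.4 \times (-0.6) \times (-1.6)}{1 \times 2 \times 3 \times 4} \times 0.04$$

     = 4.38 + 1.232 + 0.168 - 0.01792 + 0.000896 = 5.762976.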


        The true value taken from the life table is 5.75. Had simple linear interpolation
been used, the estimate of the mortality rate for age 32 would have been 5.85 (see Example 1). It
may be noted that the use of the rate for age 25 as $u_0$ provides a closer estimate than the use of
the rate for age 30 as $u_0$.
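The arithmetic of Example 5 can be reproduced with a short Python sketch of Newton's forward difference formula. The function name `newton_forward` and its `order` argument (the highest order of difference retained) are assumptions made for illustration; the arguments are taken to be equally spaced.

```python
def newton_forward(xs, us, x, order=None):
    """Interpolate at x using Newton's forward difference formula.
    xs must be equally spaced; `order` limits the highest difference used."""
    h = xs[1] - xs[0]
    z = (x - xs[0]) / h
    # leading term and leading differences: the head of each column
    column = list(us)
    leading = [column[0]]
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        leading.append(column[0])
    if order is not None:
        leading = leading[: order + 1]
    total, coeff = 0.0, 1.0
    for k, diff in enumerate(leading):
        total += coeff * diff
        coeff *= (z - k) / (k + 1)  # builds z(z-1)...(z-k) / (k+1)!
    return total


ages = [25, 30, 35, 40, 45, 50]
rates = [4.38, 5.26, 6.74, 9.14, 12.82, 18.18]
u32 = newton_forward(ages, rates, 32, order=4)
print(f"interpolated mortality rate at age 32 = {u32:.6f}")
```

With `order=1` the same function reduces to linear interpolation between the first two entries.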

        As regards the choice of the set of u's to be used in interpolation, we should try
to keep the value sought as nearly central as possible to the set of u's employed. As regards
the question of how many differences have to be used, usually, but not always, using higher
orders of differences, and therefore fitting a curve to more known points of the distribution,
will increase the accuracy of an interpolation. The highest order difference used in an
interpolation implies a curve of a specified form. For much demographic work, carrying the
interpolation beyond the fourth order difference will not greatly improve the accuracy of the
result, if it is assumed that the basic relationship between the two variables involved is
approximated by a third or fourth degree polynomial.

(c) Newton's Backward Difference Formula (NBDF): For data given at equal intervals of x,
if we have to find the value of the function for a value of x near the bottom of the table, we
use Newton's Backward Difference Formula. If there are n arguments and n corresponding
entries, Newton's backward formula for the entry $u_x$ to be interpolated for the argument x is

$$u_x = u_n + z\Delta^1_n + \frac{z(z+1)}{1 \times 2}\Delta^2_n + \frac{z(z+1)(z+2)}{1 \times 2 \times 3}\Delta^3_n + \cdots$$

where $z = \dfrac{x - x_n}{h}$, $h = x_1 - x_0$, and $\Delta^1_n, \Delta^2_n, \ldots$
are the differences occurring at the bottom of each column in the difference table.
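A minimal Python sketch of the backward formula is given below. It mirrors the forward version but starts from the last entry and uses the last value in each difference column; the function name `newton_backward` is an illustrative assumption, and the check uses the entries of $u_x = 1 + x + x^2$ from Example 3.

```python
def newton_backward(xs, us, x):
    """Interpolate at x using Newton's backward difference formula.
    xs must be equally spaced."""
    h = xs[1] - xs[0]
    z = (x - xs[-1]) / h
    # last entry and the bottom difference of each column
    column = list(us)
    trailing = [column[-1]]
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        trailing.append(column[-1])
    total, coeff = 0.0, 1.0
    for k, diff in enumerate(trailing):
        total += coeff * diff
        coeff *= (z + k) / (k + 1)  # builds z(z+1)...(z+k) / (k+1)!
    return total


xs = [0, 1, 2, 3, 4, 5]
us = [1 + x + x**2 for x in xs]          # 1, 3, 7, 13, 21, 31
u_45 = newton_backward(xs, us, 4.5)      # exact value is 1 + 4.5 + 4.5**2
print(u_45)
```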

(d) Newton's Divided Difference Formula (NDDF): In population statistics, data are often
given at equal intervals of x, but sometimes we are required to
interpolate when values of the function are known for unequal intervals. Since we cannot take
the differences as defined earlier, we adopt a process of differencing that involves the
argument as well as the entry. The differences obtained by this process are called `divided'
differences. In these situations, to find the value of the function at an intermediate value of x,
say $x_0$, we proceed as follows:

       Let $f(x_1), f(x_2), \ldots, f(x_n)$ be the known values of the function at $x_1, x_2, \ldots, x_n$. To find
the value of the function at $x_0$, let us form the difference table shown in Table 2.7.

         From Table 2.7 we see that the top entries are $f(x_1)$, $f(x_1, x_2)$, $f(x_1, x_2, x_3)$, etc. Then
Newton's formula of divided differences for estimating $f(x_0)$ corresponding to $x_0$ is

$$f(x_0) = f(x_1) + (x_0 - x_1)f(x_1, x_2) + (x_0 - x_1)(x_0 - x_2)f(x_1, x_2, x_3) + \cdots + (x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_{n-1})f(x_1, x_2, \ldots, x_n) \qquad ...(2.6)$$

where

$$f(x_1, x_2) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}; \qquad f(x_1, x_2, x_3) = \frac{f(x_2, x_3) - f(x_1, x_2)}{x_3 - x_1}$$

and

$$f(x_1, \ldots, x_n) = \frac{f(x_2, \ldots, x_n) - f(x_1, \ldots, x_{n-1})}{x_n - x_1}$$

                 Table 2.7: Divided Difference Table for General Case, y = f(x)

x          f(x)          $\Delta^1_x$ (First difference)                              $\Delta^2_x$ (Second difference)                                          $\Delta^3_x$ (Third difference)
$x_1$      $f(x_1)$
                         $\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f(x_1, x_2)$
$x_2$      $f(x_2)$                                                                   $\frac{f(x_2, x_3) - f(x_1, x_2)}{x_3 - x_1} = f(x_1, x_2, x_3)$
                         $\frac{f(x_3) - f(x_2)}{x_3 - x_2} = f(x_2, x_3)$                                                                                      $\frac{f(x_2, x_3, x_4) - f(x_1, x_2, x_3)}{x_4 - x_1} = f(x_1, x_2, x_3, x_4)$
$x_3$      $f(x_3)$                                                                   $\frac{f(x_3, x_4) - f(x_2, x_3)}{x_4 - x_2} = f(x_2, x_3, x_4)$
                         $\frac{f(x_4) - f(x_3)}{x_4 - x_3} = f(x_3, x_4)$                                                                                      $\frac{f(x_3, x_4, x_5) - f(x_2, x_3, x_4)}{x_5 - x_2} = f(x_2, x_3, x_4, x_5)$
$x_4$      $f(x_4)$                                                                   $\frac{f(x_4, x_5) - f(x_3, x_4)}{x_5 - x_3} = f(x_3, x_4, x_5)$
                         $\frac{f(x_5) - f(x_4)}{x_5 - x_4} = f(x_4, x_5)$
$x_5$      $f(x_5)$
:          :             :
$x_{n-1}$  $f(x_{n-1})$
                         $\frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}} = f(x_{n-1}, x_n)$
$x_n$      $f(x_n)$

(e) Lagrange's Formula: Newton's divided difference formula can be put in another form, which does
not require the construction of the divided difference table. Let f(x) be a continuous function of x
and let $f(x_0), f(x_1), f(x_2), \ldots, f(x_n)$ be the values of f(x) when $x = x_0, x_1, x_2, \ldots, x_n$.

The formula is:

$$f(x) = \frac{(x - x_1)(x - x_2)\cdots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)}\, f(x_0) + \frac{(x - x_0)(x - x_2)\cdots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)}\, f(x_1) + \cdots + \frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}\, f(x_n) \qquad ...(2.7)$$

Where f(x) is the figure to be interpolated. The above equation is known as Lagrange's
formula.
Note: Newton's Divided Difference formula and Lagrange's formula are used for unequal
intervals.
Example 6: Find f(27), when f(26) = 10.29, f(28) = 10.54, f(29) = 10.65 and f(30) = 10.76.

                              Table 2.8: Divided difference table

   x       f(x)      $\Delta^1_x$     $\Delta^2_x$     $\Delta^3_x$
   26      10.29
                     0.125
   28      10.54                      -0.005
                     0.110                             0.00125
   29      10.65                      0
                     0.110
   30      10.76

Then, by applying Newton's Divided Difference Formula, we have

$$f(27) = 10.29 + (27-26) \times 0.125 + (27-26)(27-28) \times (-0.005) + (27-26)(27-28)(27-29) \times 0.00125$$

     = 10.29 + 0.125 + 0.005 + 0.0025 = 10.42

By applying Lagrange's formula, we have

$$f(27) = \frac{(27-28)(27-29)(27-30)}{(26-28)(26-29)(26-30)} \times 10.29 + \frac{(27-26)(27-29)(27-30)}{(28-26)(28-29)(28-30)} \times 10.54$$

$$\qquad + \frac{(27-26)(27-28)(27-30)}{(29-26)(29-28)(29-30)} \times 10.65 + \frac{(27-26)(27-28)(27-29)}{(30-26)(30-28)(30-29)} \times 10.76$$

     = 2.57 + 15.81 - 10.65 + 2.69 = 10.42
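Both calculations in Example 6 can be verified with the Python sketch below. The function names are illustrative; `divided_differences` returns the top entry of each column of the divided difference table, which are the coefficients used in Newton's formula, and `lagrange` evaluates the Lagrange form directly.

```python
def divided_differences(xs, fs):
    """Top entries of the divided difference table:
    f(x1), f(x1,x2), f(x1,x2,x3), ..."""
    coeffs = list(fs)
    n = len(xs)
    for order in range(1, n):
        # work from the bottom up so lower-order values are still available
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


def newton_divided(xs, fs, x):
    """Newton's divided difference interpolation at x."""
    total, product = 0.0, 1.0
    for coeff, xi in zip(divided_differences(xs, fs), xs):
        total += coeff * product
        product *= x - xi
    return total


def lagrange(xs, fs, x):
    """Lagrange interpolation at x."""
    total = 0.0
    for i, (xi, fi) in enumerate(zip(xs, fs)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * fi
    return total


xs6 = [26, 28, 29, 30]
fs6 = [10.29, 10.54, 10.65, 10.76]
print(newton_divided(xs6, fs6, 27), lagrange(xs6, fs6, 27))
```

The two functions fit the same polynomial through the given points, so they agree up to floating-point rounding.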

2.3 Uses of Interpolation Formula

The interpolation formula can be applied to solve many problems that may arise in
demographic analysis, some of them are mentioned below:

(i) Estimation of intermediate terms among n equidistant terms: In order to find the
intermediate terms among n equidistant terms, one of the three Newton's formulas namely,
forward, backward and central differences are applied on the basis of the position of the
interpolated value in the differences table. Example 5 has already shown how Newton's
forward difference formula can be applied suitably to estimate the interpolated value. In case,
the intervals of x values are not equal. Newton's Divided Difference and Lagrange 's formula
are used to estimate the intermediate functional values & an illustration is given in Example 6.

(ii) Method for estimating a missing term: Suppose there are "n" equidistant terms of which
n-1 are known. To estimate the missing term, a difference table is constructed in which the
missing value is written as x. We assume that the fourth-order difference (or a difference of
another order, depending upon the polynomial relationship between the variables) is zero,
and solve the resulting equation for x.

Example 7: Find f(27) by the missing-term method when f(26) = 10.29, f(28) = 10.54,
f(29) = 10.65 and f(30) = 10.76.

Solution: Let x be equal to f(27) so that the difference table will be

Table 2.9: Difference Table

x      f(x)      Δ            Δ²           Δ³           Δ⁴
26     10.29
                 x-10.29
27     x                      20.83-2x
                 10.54-x                   3x-31.26
28     10.54                  x-10.43                   41.69-4x
                 0.11                      10.43-x
29     10.65                  0
                 0.11
30     10.76

If we assume that Δ⁴ = 0, then 41.69 - 4x = 0, or

x = 41.69/4 = 10.4225, i.e. x ≈ 10.42

When Newton's Divided Difference formula and Lagrange's formula were applied to the same
data in Example 6, we obtained the same answer.
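
The missing-term calculation can also be written in closed form, because the leading difference of order n of n+1 equidistant values is an alternating binomial sum of those values. The sketch below assumes, as in Example 7, that this highest-order leading difference is zero; the function name `missing_term` is a hypothetical helper introduced only for illustration.

```python
from math import comb


def missing_term(values):
    """Estimate the single missing entry (given as None) in an equidistant series.

    Sets the leading difference of order n = len(values) - 1 to zero, using
    Delta^n u_0 = sum over i of (-1)^(n-i) * C(n, i) * u_i.
    """
    n = len(values) - 1
    k = values.index(None)
    known = sum(
        (-1) ** (n - i) * comb(n, i) * v
        for i, v in enumerate(values)
        if v is not None
    )
    coefficient = (-1) ** (n - k) * comb(n, k)
    return -known / coefficient


# f(26), f(27) missing, f(28), f(29), f(30)
f27 = missing_term([10.29, None, 10.54, 10.65, 10.76])
print(round(f27, 4))
```

With five values the binomial coefficients are 1, -4, 6, -4, 1, so the equation solved is the same fourth-difference equation as in the difference table.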

(iii) Estimation of the composition of a population: The third application of interpolation is to
estimate the composition of a population for more detailed intervals than those which are
reported.

Example 8: The population of Goa in 5-year age groups is given below. Suppose we
require an estimate of the population aged 27 years.

Table 2.10: Population of Goa

Age Groups Population (in 000's)


0-4 48254
5-9 45184
10-14 42402
15-19 33887
20-24 28962
25-29 25749

Then let us form the following difference table:

Table 2.11: Difference table


Up to     Cumulative population     Δ         Δ²        Δ³        Δ⁴        Δ⁵
age       up to the stated age
5         48254
                                    45184
10        93438                               -2782
                                    42402               -5733
15        135840                              -8515               9323
                                    33887               3590                -11201
20        169727                              -4925               -1878
                                    28962               1712
25        198689                              -3213
                                    25749
30        224438

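Population aged 27 = Population aged up to 28 years - Population aged up to 27 years.

Using Newton's Backward Difference Formula from age 30, with u = (28 - 30)/5 = -0.4, the
population up to 28 years is

$$
\begin{aligned}
& 224438 + (-0.4)(25749) + \frac{(-0.4)(0.6)}{1\times 2}(-3213) + \frac{(-0.4)(0.6)(1.6)}{1\times 2\times 3}(1712) \\
&\quad + \frac{(-0.4)(0.6)(1.6)(2.6)}{1\times 2\times 3\times 4}(-1878) + \frac{(-0.4)(0.6)(1.6)(2.6)(3.6)}{1\times 2\times 3\times 4\times 5}(-11201) \\
&= 224438 - 10300 + 386 - 110 + 78 + 335 = 214827
\end{aligned}
$$

Similarly, with u = (27 - 30)/5 = -0.6, the population up to 27 years is

$$
\begin{aligned}
& 224438 + (-0.6)(25749) + \frac{(-0.6)(0.4)}{1\times 2}(-3213) + \frac{(-0.6)(0.4)(1.4)}{1\times 2\times 3}(1712) \\
&\quad + \frac{(-0.6)(0.4)(1.4)(2.4)}{1\times 2\times 3\times 4}(-1878) + \frac{(-0.6)(0.4)(1.4)(2.4)(3.4)}{1\times 2\times 3\times 4\times 5}(-11201) \\
&= 224438 - 15449 + 386 - 96 + 63 + 256 = 209598
\end{aligned}
$$

Hence, population aged 27 years = 214827 - 209598 = 5229.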
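
The backward-difference calculation of Example 8 can be sketched as follows. The helper names are assumptions made for illustration. Because the program carries unrounded terms, it gives about 5,231 rather than the 5,229 obtained from rounded intermediate figures.

```python
def backward_differences(ys):
    """Return [y_n, nabla y_n, nabla^2 y_n, ...] taken at the last tabulated point."""
    diffs, row = [ys[-1]], list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[-1])
    return diffs


def newton_backward(xs, ys, x):
    """Evaluate Newton's backward difference formula at x (equal intervals)."""
    h = xs[1] - xs[0]
    u = (x - xs[-1]) / h
    total, coeff = 0.0, 1.0
    for k, d in enumerate(backward_differences(ys)):
        total += coeff * d
        coeff *= (u + k) / (k + 1)  # builds u(u+1)...(u+k) / (k+1)!
    return total


exact_ages = [5, 10, 15, 20, 25, 30]
cumulative = [48254, 93438, 135840, 169727, 198689, 224438]

up_to_28 = newton_backward(exact_ages, cumulative, 28)
up_to_27 = newton_backward(exact_ages, cumulative, 27)
aged_27 = up_to_28 - up_to_27
print(round(up_to_28), round(up_to_27), round(aged_27))
```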

(iv) Conversion of unconventional age groups into conventional age groups: If the age
distribution has not been tabulated in conventional age groups, it can be converted
into conventional age groups by interpolation.

Example 9: The age distribution of males in unconventional age groups is given below. Find
the age distribution of the population by conventional age groups (0-4, 5-9, 10-14, etc.).

Table 2.12: Age distribution of Males

Age Males
0-1 39,510
2-4 64,533
5-7 62,125
8-14 129,666
15-17 40,057
18-22 78,347
23-29 80,994
30-39 96,327

Population of males aged 0-4 = 39,510 + 64,533 = 104,043

The central ages of the groups 5-7, 8-14, 15-17, 18-22, 23-29 and 30-39 are 6.5, 11.5,
16.5, 20.5, 26.5 and 35 respectively. The respective intervals for the above age groups are
3, 7, 3, 5, 7 and 10 years.

To convert each group into a 5-year group with the central age fixed, we multiply
the respective group populations by 5/3, 5/7, 5/3, 5/5, 5/7 and 5/10.

Thus the populations by 5-year age groups with central ages 6.5, 11.5, 16.5, 20.5, 26.5 and
35 are:

Table 2.13: Age distribution of males by central ages

Central Age    Age Group     Population
6.5            4-8           103,542
11.5           9-13          92,619
16.5           14-18         66,762
20.5           18-22         78,347
26.5           24-28         57,853
35             32.5-37.5     48,164

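The population for the 5-year age group with central age 7.5 is obtained by linear
interpolation between the groups with central ages 6.5 and 11.5. Since 6.5 is 1 unit away from
7.5 and 11.5 is 4 units away from 7.5, the nearer value receives the larger weight.

∴ The population for the 5-year group with central age 7.5 is

$$\frac{4 \times 103542 + 1 \times 92619}{5} \approx 101357$$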

In a similar way we can obtain the populations of the other 5-year age groups with the
required central ages. They are:

Table 2.14: Age distribution of males by conventional 5-year age groups

Age Group    Central Age    Population
0-4          2.5            104043
5-9          7.5            101357
10-14        12.5           87447
15-19        17.5           69658
20-24        22.5           71516
25-29        27.5           56713
30-34        32.5           51014
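
The conversion in Example 9 is a rescaling to a 5-year width followed by linear interpolation between central ages. The sketch below is one way to reproduce it; the variable names are illustrative, and the central age of the 30-39 group is taken as 35 (the midpoint of exact ages 30 to 40), which is an assumption of this sketch. Depending on where rounding is applied, results can differ from a hand calculation by a person or two.

```python
# (lowest age, highest age in completed years, males) for the unconventional groups
groups = [
    (5, 7, 62125), (8, 14, 129666), (15, 17, 40057),
    (18, 22, 78347), (23, 29, 80994), (30, 39, 96327),
]

# Central age of each group and its population rescaled to a 5-year width.
points = [((lo + hi + 1) / 2, pop * 5 / (hi - lo + 1)) for lo, hi, pop in groups]


def interpolate(points, target):
    """Linear interpolation between the two central ages that bracket target."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= target <= x1:
            return y0 + (y1 - y0) * (target - x0) / (x1 - x0)
    raise ValueError("target lies outside the tabulated central ages")


conventional = {c: round(interpolate(points, c))
                for c in (7.5, 12.5, 17.5, 22.5, 27.5, 32.5)}
for central_age, population in conventional.items():
    print(central_age, population)
```

The 0-4 group needs no interpolation, since it is the sum of the reported groups 0-1 and 2-4.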

(v) Halving a group: A frequent application of interpolation encountered in demography is
that of halving a group; for example, dividing the frequency of a 10-year interval into two
five-year intervals. If the interpolation is limited to second differences, Newton's formula
may be used.

Let W0, W1 and W2 denote the populations in three consecutive 10-year age groups,
aged n to n+9, n+10 to n+19 and n+20 to n+29 respectively. The problem is to split the
population of the middle age group into the two five-year age groups n+10 to n+14 and
n+15 to n+19.

Let X be the population aged n+10 to n+14, so that W1 - X is the population aged n+15
to n+19. Because W0 and W2 are 10-year totals while X and W1 - X are 5-year totals, the two
outer groups are first put on a 5-year basis by halving them. The central ages of the four
groups are n+5, n+12.5, n+17.5 and n+25, so the successive intervals are 7.5, 5 and 7.5
years. The divided difference table will then be:

Table 2.15: The Divided Difference Table

Age Group        Population         First Difference         Second Difference
                 (5-year basis)
n to n+9         W0/2
                                    (X - W0/2)/7.5
n+10 to n+14     X                                           [(W1 - 2X)/5 - (X - W0/2)/7.5]/12.5
                                    (W1 - 2X)/5
n+15 to n+19     W1 - X                                      [(W2/2 - W1 + X)/7.5 - (W1 - 2X)/5]/12.5
                                    (W2/2 - W1 + X)/7.5
n+20 to n+29     W2/2

To estimate X, we assume that the second differences are constant and equal, so that

$$\frac{W_1 - 2X}{5} - \frac{X - W_0/2}{7.5} = \frac{W_2/2 - W_1 + X}{7.5} - \frac{W_1 - 2X}{5}$$

Multiplying through by 15 and collecting terms gives $6(W_1 - 2X) = W_2 - 2W_1 - W_0 + 4X$,
or $16X = 8W_1 + W_0 - W_2$, so that

$$X = \frac{W_1}{2} + \frac{W_0}{16} - \frac{W_2}{16}$$

∴ Population aged n+10 to n+14 is W1/2 + W0/16 - W2/16

The population aged n+15 to n+19 = W1 - X = W1/2 - W0/16 + W2/16

Example 10: Split the ten-year age group population of 25-34 into five-year age groups.

Table 2.16: Population by age groups

Age Group    Population
15-24        18,139
25-34        24,225
35-44        31,496

Let W0, W1 and W2 be the population totals of 15-24, 25-34 and 35-44 respectively.

Here, W0 = 18,139, W1 = 24,225 and W2 = 31,496.

Population for the age group 25-29 is

$$\frac{24{,}225}{2} + \frac{18{,}139}{16} - \frac{31{,}496}{16} = 11{,}278$$

Population for the age group 30-34 = (population for the age group 25-34) - (population for
the age group 25-29)

= 24,225 - 11,278 = 12,947

Population for the age group 25-29 = 11,278

Population for the age group 30-34 = 12,947
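
The halving rule is easy to apply mechanically. The sketch below uses the form first half = W1/2 + (W0 - W2)/16, which is the form that reproduces the figures 11,278 and 12,947 of Example 10; the function name `halve_group` is an assumption for illustration.

```python
def halve_group(w0, w1, w2):
    """Split the middle 10-year total w1 into its younger and older 5-year halves.

    w0 and w2 are the totals of the preceding and following 10-year groups.
    """
    younger = w1 / 2 + (w0 - w2) / 16
    older = w1 - younger
    return younger, older


younger, older = halve_group(18139, 24225, 31496)
print(round(younger), round(older))
```

When the neighbouring groups are equal (W0 = W2), the rule reduces to an even split, which is a quick sanity check on any implementation.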

(vi) Sub-division of intervals: A frequent problem in demography is the interpolation of
values of ux at individual points, given every fifth or tenth value of the function. For example,
the problem may be to complete the series u0, u1, u2, ... from the known values of u0, u5, u10, ...

A simple method for obtaining the individual values where quinquennial values are
known is given below. Let δ denote the difference for a unit interval of x and Δ denote the
difference for a quinquennial interval. Then ux+5 may be expressed symbolically either as
(1+δ)⁵ ux or as (1+Δ) ux, so that

$$(1+\delta)^5 = 1+\Delta, \qquad 1+\delta = (1+\Delta)^{1/5}, \qquad \delta = (1+\Delta)^{1/5} - 1$$

Expanding by the binomial theorem, one finds that

$$\delta = 0.2\,\Delta - 0.08\,\Delta^2 + 0.048\,\Delta^3 - \cdots$$

Hence

$$\delta^2 = (0.2\,\Delta - 0.08\,\Delta^2 + 0.048\,\Delta^3 - \cdots)^2 = 0.04\,\Delta^2 - 0.032\,\Delta^3 + \cdots$$

Similarly,

$$\delta^3 = 0.008\,\Delta^3 - \cdots$$

The same principle can be adopted if decennial values are known. In that event Δ, Δ², ...
represent differences for decennial intervals, and the individual differences are found from
the identity

$$\delta = (1+\Delta)^{1/10} - 1$$

Example 11: The mortality rates for quinquennial ages are given below. Obtain the mortality
rates for ages 36, 37, 38 and 39.

Table 2.17: Mortality rate by age

Age Mortality rate


25 0.0292
30 0.0338
35 0.0433
40 0.0595
45 0.0863

Since we are interested only in ages after 35 we shall consider the following abridged
difference table.

Table 2.18: Difference Table

Age Mortality Δ1 x Δ2 x
35 0.0433
0.0162
40 0.0595 0.0106
0.0268
45 0.0863

We calculate δ = 0.2 Δ - 0.08 Δ² = 0.2 (0.0162) - 0.08 (0.0106) = 0.00239

δ² = 0.04 Δ² = 0.04 (0.0106) = 0.000424

Assuming the second-order differences to be constant, we construct the following table.

Table 2.19: Difference Table

Age Mortality δ δ²
35 0.0433
0.00239
36 0.0457 0.000424
0.00281
37 0.0485 0.000424
0.00323
38 0.0517 0.000424
0.00365
39 0.0553 0.000424
0.00407
40 0.0595
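
The construction of Table 2.19 can be sketched in a few lines. The code assumes, as in Example 11, that differences beyond the second order are negligible, so that δ = 0.2Δ - 0.08Δ² and δ² = 0.04Δ²; the function names are illustrative.

```python
def unit_differences(d1, d2):
    """Convert quinquennial differences (Delta, Delta^2) to unit-interval ones."""
    delta1 = 0.2 * d1 - 0.08 * d2
    delta2 = 0.04 * d2
    return delta1, delta2


def subdivide(u0, d1, d2, steps=5):
    """Build single-year values from u0, assuming constant second differences."""
    delta1, delta2 = unit_differences(d1, d2)
    values, current, step = [u0], u0, delta1
    for _ in range(steps):
        current += step
        step += delta2
        values.append(current)
    return values


# u35 = 0.0433 with leading quinquennial differences 0.0162 and 0.0106
rates = subdivide(0.0433, 0.0162, 0.0106)
for age, rate in zip(range(35, 41), rates):
    print(age, round(rate, 4))
```

The final value returns exactly to the tabulated rate at age 40, which confirms that the unit differences are consistent with the quinquennial ones.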

It may happen that the values of a function f(x) are known at unit intervals and a
table of values at a smaller interval is required. For example, it is a common practice to
calculate every fifth value in a life table and to complete the table by interpolation. Here the
unit interval for the preliminary calculations is five years, and we are faced with the problem
of obtaining the quantities for single-year intervals. Formulae for these situations were given
in the preceding equations. Here we give certain methods which are equivalent to those
given earlier but which are easier to apply. The tables to be used to obtain single-year
values when data are given at 5- or 10-year intervals, which are based on Newton's Forward
Difference formula, are given as Tables 1 to 10 in the supplementary material of this block.

In each case, to obtain ux+1, ux+2, ux+3, etc., the values of ux, ux+5, ux+10, etc. are
multiplied by the corresponding coefficients given in the tables and the resulting products
are added up. The three-point, four-point and five-point formulae are used for differences
up to the second, third and fourth orders respectively.

Example 12: Find u31, u32, u33 and u34 when

x      ux
30     .0338
35     .0433
40     .0595

Using the coefficients from reference Table 1 (attached as supplementary material of this
block), which apply to the single years following the first tabulated age:

u31 = .0338 (.72) + .0433 (.36) + .0595 (-.08) = 0.0352
u32 = .0338 (.48) + .0433 (.64) + .0595 (-.12) = 0.0368
u33 = .0338 (.28) + .0433 (.84) + .0595 (-.12) = 0.0387
u34 = .0338 (.12) + .0433 (.96) + .0595 (-.08) = 0.0409
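
The multipliers .72, .36 and -.08 used above are what the ordinary three-point (second-degree) Lagrange weights give for a point one-fifth of the way from the first tabulated value to the second. The sketch below assumes that the reference table was built this way; it places the four estimates at ages 31 to 34, between the tabulated values for ages 30 and 35.

```python
def three_point_coefficients(t):
    """Weights for u(x0 + t*h) in terms of u(x0), u(x0 + h), u(x0 + 2h)."""
    return ((t - 1) * (t - 2) / 2, -t * (t - 2), t * (t - 1) / 2)


known = (0.0338, 0.0433, 0.0595)  # tabulated values at ages 30, 35 and 40

estimates = {}
for age in (31, 32, 33, 34):
    t = (age - 30) / 5
    weights = three_point_coefficients(t)
    estimates[age] = sum(w * u for w, u in zip(weights, known))
    print(age, [round(w, 2) for w in weights], round(estimates[age], 4))
```

The three weights always sum to one, so a constant series is reproduced exactly.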

In the compilation of population statistics the problem of interpolation occurs often, as
we have already noticed. It is most needed in the construction of age distributions: of the
population in a census, of deaths in death registration and of mothers in birth registration. In all
these cases, the age distributions as tabulated from the schedules and registers are so much
vitiated by digit preference in ages, etc., that the figures are meaningful only if
taken in large age groups such as 5-year or 10-year age groups. But in many situations single-
year age distributions are required, and so methods have to be developed to split the 5-year or
10-year totals into single-year groups. We may assume that, however much the data are
vitiated by digit preference, etc., the group totals are substantially correct, and so a
redistribution of the group totals into single-year values on the basis of these 5-year or 10-year
totals may be taken as good in many practical situations. That is, we define:

W0 = Population aged 0-4, W5 = population aged 5-9

W10 = Population aged 10-14

i.e. W0 = P0 +P1 +P2 +P3 +P4; W5 =P5 +P6 +P7 +P8 +P9

W10 =P10 +P11 +P12 +P13 +P14 etc.

2.4 Limitations of Interpolation

1. Interpolation does not correct errors in the data. If any particular five-year age group
is greatly in error due to under- or over-enumeration, the method will not correct such
deficiencies; they must be corrected by graphic interpolation or by applying
osculatory or modified osculatory interpolation formulas. Common sense must
govern the use and interpretation of results obtained by interpolation formulas.

2. If the function f(x) changes drastically within a small interval of x (such as rates of
mortality during the first five years of life, or the percentage of the population
married between the ages of 15 and 24), interpolation will tend to distribute the
change more or less smoothly throughout the interval, when in fact the change may
be concentrated in a particular part of the interval.

3. When the function tends to zero slowly, i.e. it is asymptotic to the x axis,
interpolation for the tail may have percentage errors that are larger than
permissible. In this case interpolation can be greatly improved by working with the
logarithm of f(x). Even with this transformation, since the required value lies at the
bottom of the tabulated function, a backward difference formula will be
better.

4. In the case of open-ended intervals (which are too wide), the theory of interpolation
fails to give results.

In other words, since interpolation is a "mechanical procedure", its uncritical use may
obliterate fluctuations that are basic underlying characteristics of the data.

Caution should be exercised not only in the use of interpolation formulas in
situations of the above type but also in the selection of the formula itself. Each formula has its
advantages and disadvantages, and one should be guided not only by common sense but also by
experience with similar data.

Self-Check Exercises

1. What are the basic assumptions for interpolation?

2. Fill up the blanks:

(a) Linear interpolation formula is not adequate as most of the
population distributions are _____________________.

(b) Newton's Forward Difference Formula is applied when the values
of the function have _____________________.

(c) Newton's Divided Difference Formula or Lagrange's Formula is
used for _______________________.

(d) The first term u0 in the difference table is called the _________ and
the differences which stand at the head of the respective columns are called
the _____________________.

(e) Δ30 is the difference between _________________.

3. What are the main uses of interpolation?

4. What are the main limitations of interpolation?

2.5 Graduation

"Graduation may be defined as the process of securing from an irregular series of


observed values of a continuous variable, a smooth regular series of values consistent in a
general way with the observed series of values" (Miller, 1946: 4).

The probabilities or rates of occurrence of death, birth, marriage etc., are of great
interest to the demographer. We are in many cases interested in constructing tables setting
forth probabilities or rates and for that purpose observations are made of the happening of
such events. Graduation is one of the steps in the construction of these tables.

We shall consider it under the following headings:

* How the problem of graduation arises
* What the process of graduation means
* What the justification for the process of graduation is
* How the graduation of an observed series may be accomplished
* What the criteria of an acceptable graduation are

The problem of graduation arises in connection with the construction of, say, mortality
rates because a series of observed mortality rates will be found to contain irregularities which
we have reason to believe are not a feature of the true, underlying rates of mortality. These
irregularities may mainly be due to errors in reporting. This is particularly true of tabulations
by single years of age, which tend to heap at ages ending in '0' and '5' and have other irregular
features. For this reason, erratic fluctuations in data are often regarded as symptomatic of error
in collection or processing. Before certain refined demographic calculations are made, such as
the construction of life tables, it is necessary to smooth or graduate the data to remove those
irregularities and correct for mal-distribution.

In most cases the underlying law is not known and therefore we must rely on the
information supplied by observations of the rates or probabilities. If we assume the underlying
curve to be smooth, regular and continuous (an assumption that is quite reasonable and holds in
many situations), we may be able to secure by the method of graduation, from the irregular
series of observed values, a smooth series of values consistent in a general way with the
observed series. This smooth series, or series of graduated values, is then taken as a
representation of the underlying law which gave rise to the series of observed values.

Thus, in graduating a series of k+1 observed mortality rates from q0" to qk", the
graduation process will substitute k+1 graduated mortality rates q0 to qk lying close to the
crude values but larger at some ages and smaller at others. These graduated rates are
obtained by altering each observed rate by reference to the other observed rates, so that the
new series will be smooth rather than irregular but at the same time will exhibit the trend
indicated by the observed series.

An observed series may be thought of as having two components: the underlying
smooth regular series, free from the fluctuations characteristic of observed data, and the
superimposed irregular series, consisting of a haphazard array of positive and negative terms
which account for the irregularities appearing in the observed series. The graduation process
operates on both these components. It effects a redistribution and reduction of the error
series, permitting positive and negative errors to offset one another, while it leaves the
underlying smooth series substantially unchanged.

Graduation may not be successful in wholly eliminating the error component of the
observed series. Consequently, the graduated series will contain an element of residual error
and must therefore be thought of as a representation of the underlying law rather than as the
law itself.

Graduation is characterized by two essential qualities (i) smoothness and (ii) fit or
consistency with the observed data. The graduated series should be not only smooth as
compared with the ungraduated data but it should be consistent with the indication of the
ungraduated series. Since in smoothening an observed series its values must be changed, the
new values will depart from the observed series. Generally, an increase in the smoothening
results in a reduction in the fit. Conversely, when the graduated series is drawn closer to the
observed series, improving the fit, smoothness usually suffers. Graduation must follow a
middle course between optimum fit and optimum smoothness.

2.6 Methods of Graduation

Graduation may be accomplished by (i) the graphic method, (ii) the interpolation
method, (iii) the adjusted or moving average method, or (iv) a difference equation or
mathematical formula (graduation by Newton's formula). We shall consider some of these methods.

(i) The Graphic Method: In the graphic method, the observed values are suitably plotted on
graph paper and among them a smooth continuous curve is drawn as the basis of the graduated
series.

This method was applied by Joshua Milne in the graduation of one of the earliest
mortality tables, the Carlisle Table of Mortality, published in 1815. The data consisted of
census populations and death registers in two parishes in Carlisle. The graduation was performed
separately on the population and deaths arranged in quinquennial and decennial age groups.

A similar method was applied by statisticians in the U.S. Bureau of the Census to study
the population of the Philippines from 1799 to 1903.

(ii) The Interpolation Method: Under the interpolation method, the graduated series is
obtained by interpolating between special points determined as representative of age groups
into which the data are combined. Since graduation involves the replacement of an irregular
observed series by a regular smooth series consistent with the trend of the observed values,
clearly the interpolation method of graduation includes more than interpolation alone. As a
graduation process, the interpolation method comprises three elements.

(i) The grouping of the data.
(ii) The securing of a smooth, reliable series of points, one for each group,
representative of the data.
(iii) The computation of graduated values by interpolation based upon these points.

(a) Grouping: The first step in the interpolation method is the combination of the data into
groups of suitable size and number. The data are grouped in the hope that by distributing the
excess population over the neighbouring ages the effects of these errors of reporting will be
eliminated or greatly reduced. To that end, an effort is made to select the particular grouping
that will best compensate for the heaping of the data.

(b) Pivotal points: The second step in the application of the method is the calculations of the
special interpolation points, referred to as 'pivotal points' upon which interpolation will be
based. Since the interpolation is anchored to the pivotal points, it is of great importance to the
success of the method as a whole that these points be representative of the respective groups
and at the same time form a smooth series. Because the interpolating curve segments are
constrained either to pass through the pivotal points or in the modified interpolation methods,
to pass close by them, the entire series will have the same general pattern of regularity as the
series of interpolation points.

King's method provides a means of computing the pivotal values ux from three or five
surrounding quinquennial sums, wx, into which the data are assumed to be grouped. The formula
based on the three quinquennial sums wx-5, wx and wx+5, correct to third differences, is

$$u_x = 0.2\,w_x - 0.008\,(w_{x-5} - 2w_x + w_{x+5})$$

This formula may be derived by the methods of the theory of finite differences. King's
formula should be applied separately to the exposures and the deaths. The pivotal values of the
rates are then obtained as the quotient of the pivotal deaths and the pivotal exposures.

King's method does not give satisfactory pivotal points unless the grouped data form
a comparatively smooth series, as in the case of population statistics (where the data are very
extensive). In any event, if the series of pivotal points does not seem to be smooth enough, its
smoothness may be increased by graduating the pivotal values graphically before proceeding
with the interpolation.
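
King's three-sum formula is a single line of arithmetic. The sketch below applies it to three quinquennial sums; the figures are borrowed from the five-year totals for ages 5-9, 10-14 and 15-19 of Example 13 purely for illustration, and the function name is an assumption.

```python
def king_pivotal(w_prev, w_mid, w_next):
    """Pivotal value for the central year of the middle quinquennial group."""
    return 0.2 * w_mid - 0.008 * (w_prev - 2 * w_mid + w_next)


pivot = king_pivotal(25352, 21520, 15172)
print(round(pivot, 3))
```

The first term is the simple average per single year of age; the second is a small correction for the curvature shown by the three sums, and it vanishes when the sums lie on a straight line.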

(iii) Graduation by Newton's Formula: One of the simplest and most effective
graduation techniques uses Newton's Forward Difference Formula (NFDF). The
reported data are grouped into intervals of equal size (say 5- or 10-year groups). Cumulative
frequencies for the selected intervals are then compiled and used as the known ux values in
NFDF. Application of the formula for a series of intermediate points within each interval
yields a smooth cumulative distribution, and successive subtraction of the cumulative
frequencies for the intermediate points converts the cumulative distribution to a distribution of
smoothed frequencies between the intermediate points.

One major weakness of Newton's method is the failure of the sets of single-year
estimates for successive five-year intervals to link up smoothly. If the data are graduated
properly, the entire distribution should form a smooth continuous curve. By the above method
we would find sudden shifts at the points of junction, i.e. at the ends of the intervals within
which the graduations are made. Actuarial mathematicians have developed a variety of formulas
for accomplishing a smooth junction of the interpolation curves.

The Goa census of 1981 gives the following numbers of males in the 40-44 age interval
and in the age groups preceding and following it.

Age       Number
35-39     35016
40-44     30470
45-49     24322

Successive differences of cumulative frequencies are as follows:

Table 2.20: Difference Table

Age (x)    Cumulative number      Δ         Δ²        Δ³
           from age 35 (ux)
34         0
                                  35016
39         35016                            -4546
                                  30470               -1602
44         65486                            -6148
                                  24322
49         89808

Take age 34 as the zero point for the interpolation. This means that the x values in
Newton's formula will be 1.0 for age 39, 1.2 for age 40, 1.4 for age 41, ..., 2.0 for age 44.
Substituting the leading differences for age 34 and applying Newton's formula (refer formula
2.5) yields the following:

u1.2 = 0 + 1.2 (35016) + 0.12 (-4546) - 0.032 (-1602) = 41525
u1.4 = 0 + 1.4 (35016) + 0.28 (-4546) - 0.056 (-1602) = 47839
u1.6 = 0 + 1.6 (35016) + 0.48 (-4546) - 0.064 (-1602) = 53946
u1.8 = 0 + 1.8 (35016) + 0.72 (-4546) - 0.048 (-1602) = 59833

Note that the third term is positive in each case, since a negative coefficient multiplies a
negative third difference.

These values are cumulative estimates. For example, the value for age 40 is the
number of persons from age 35 through age 40 inclusive. In order to estimate the population
that was aged 40, it is necessary to subtract the cumulative number for age 39 from that for
age 40. Similar subtractions are made to obtain ages 41, 42, 43 and 44.

Table 2.21: Graduated numbers and reported census numbers of males, age 40-44, by single
years of age, Goa, 1981

Age (x)   Graduated cumulative         Graduated number   Number reported   Difference
          population from age 35       at each age        at each age
          through age x
39        35,016
40        41,525                       6,509              17,087            -10,578
41        47,839                       6,314              2,594             3,720
42        53,946                       6,107              5,775             332
43        59,833                       5,887              2,730             3,157
44        65,486                       5,653              2,284             3,369
Total                                  30,470             30,470            0

The last column shows the difference between the graduated and the census figures.
Heaping at age 40 and low reporting at ages 41, 43 and 44 are observed. It is difficult to
explain why such an erratic age distribution should be a true characteristic of the population;
it therefore appears to be due to misreporting of age. The method may be extended to include
more five-year age groups and differences above the third order, and it can also be applied to
10-year groups. It is thus possible to smooth the entire distribution. Reference Tables 11 to 22,
given in the supplementary material of this block, are used to obtain single-year values from
data given in 5- or 10-year age groups.
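
The graduation of the 40-44 age group by Newton's forward formula can be sketched as below, using the cumulative numbers of males through ages 34, 39, 44 and 49 (0, 35016, 65486 and 89808). The helper names are assumptions. Evaluated without rounding, the third-difference term is positive, and the cumulative values come out as about 41,525, 47,839, 53,946 and 59,833.

```python
def forward_differences(ys):
    """Return the leading differences [u0, Delta u0, Delta^2 u0, ...]."""
    diffs, row = [ys[0]], list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs


def newton_forward(ys, x):
    """Evaluate Newton's forward difference formula at x (in units of the interval)."""
    total, coeff = 0.0, 1.0
    for k, d in enumerate(forward_differences(ys)):
        total += coeff * d
        coeff *= (x - k) / (k + 1)  # builds x(x-1)...(x-k) / (k+1)!
    return total


cumulative = [0, 35016, 65486, 89808]  # males from age 35 through ages 34, 39, 44, 49

single_years = {}
previous = cumulative[1]
for age in range(40, 45):
    x = (age - 34) / 5
    current = newton_forward(cumulative, x)
    single_years[age] = current - previous
    previous = current

for age, number in single_years.items():
    print(age, round(number))
```

Because the interpolating curve passes through the cumulative totals at ages 39 and 44, the five graduated single-year numbers necessarily add up to the reported 40-44 total.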

Example 13: The following table gives the single year (enumerated) age distribution of the
Andhra Pradesh male population (1961). Using the 5-point formula for the data given in 5
years’ age groups, obtain graduated estimates by single year ages from 10 to 29 years for the
data.

Age Population Age Population Age Population Age Population

0 5769 10 6583 20 6602 30 8958


1 5657 11 2569 21 1203 31 495
2 4960 12 6177 22 2983 32 1995
3 5495 13 2782 23 1661 33 736
4 5017 14 3409 24 1777 34 945
5 5730 15 3700 25 7013 35 6610
6 5966 16 3865 26 2062 36 1331
7 4184 17 1481 27 991 37 492
8 5766 18 4617 28 2990 38 1736
9 3706 19 1509 29 727 39 495
In the above example
W0 = Population aged 0-4 = 26,808
W5 = Population aged 5-9 = 25,352
W10 = Population aged 10-14 = 21,520
W15 = Population aged 15-19 = 15,172
W20 = Population aged 20-24 = 14,226
W25 = Population aged 25-29 = 13,783
W30 = Population aged 30-34 = 13,120
W35 = Population aged 35-39 = 10,664

To get P10 we multiply W0 by -.003864, W5 by .065855, W10 by .178816, W15 by -.042943 and
W20 by .006336, and then add the products (Table 3). To get P11, we multiply W0, W5, W10,
W15 and W20 by the coefficients in the second line of Table 3, which gives the five-point
formula for data given in 5-year age groups. Similarly, for P12 we use the same W's with the
third line of the table, for P13 the fourth line and for P14 the fifth line.

To get P15 we use W5, W10, W15, W20 and W25 with the first line of the table, and so on.

The p values are given below:

P10 = 4740; P11 = 4552; P12 = 4331; P13 =4082; P14 = 3813; P15 = 3346; P16 = 3126; P17 =
2971; P18 = 2882; P19 = 2847; P20 = 2861.

The above are the smoothed values of the population by single years. We can check
that the graduated totals are the same as the ungraduated totals, apart from rounding, i.e.,

Graduated P10 + P11 + P12 + P13 + P14 = Ungraduated P10 + P11 + P12 + P13 + P14 = W10
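
The check on group totals can be automated. The sketch below compares the reported and graduated single-year figures of Example 13 for ages 10 to 19; the dictionary and function names are illustrative assumptions.

```python
reported = {10: 6583, 11: 2569, 12: 6177, 13: 2782, 14: 3409,
            15: 3700, 16: 3865, 17: 1481, 18: 4617, 19: 1509}
graduated = {10: 4740, 11: 4552, 12: 4331, 13: 4082, 14: 3813,
             15: 3346, 16: 3126, 17: 2971, 18: 2882, 19: 2847}


def group_total(table, start):
    """Sum the five single-year values starting at the given age."""
    return sum(table[age] for age in range(start, start + 5))


# Difference between graduated and reported five-year totals.
gaps = {start: group_total(graduated, start) - group_total(reported, start)
        for start in (10, 15)}
print(gaps)
```

A gap of a unit or two arises only from rounding the graduated single-year values; a larger gap would point to an arithmetic slip in applying the coefficients.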

2.7 Osculatory Interpolation

Osculatory interpolation formulas are designed to secure interpolated results which
have a high degree of smoothness. They are particularly suited for use with rough data.

One of the limitations observed in adjusting rough data by the usual Newton's (single
polynomial) interpolation formulas is that at points where two interpolation curves meet, there
are sudden breaks in the values of the first-order differences (in the previous example, see
p19 = 2847 and p20 = 2861). To overcome this, osculatory interpolation was devised by
Thomas Bond Sprague in 1880. It involves combining two overlapping polynomials into one
equation. One of the polynomials begins sooner and ends sooner than the other, and the
interpolations are limited to the overlapping part. The second of the two polynomials in the
first range then becomes the first polynomial in the second range. The use of one common
polynomial for each pair of successive ranges permits a continuous joining of results from
range to range. The two overlapping polynomials should agree at the beginning and at
the end of the range in which interpolation is desired. The specific conditions of the osculatory
interpolation formulas are that both polynomials should have a common ordinate, a common
tangent (slope) and/or a common radius of curvature. This is achieved by making the first
derivative, or the first two derivatives, equal for the two polynomials.

Several methods are available to split five-year age group data into single years of age.
Some of them are the Karup-King third-degree tangential formula, Sprague's fifth-difference
formula, Jenkins' fifth-difference osculatory non-reproducing formula, Greville's formula, and
Beers' six-term ordinary and modified formulas.

Karup-King's Formula: This is the simplest formula for which interpolation coefficients
are presented. It is correct to second differences and has an adjustment involving third
differences. It uses four given points. The formula provides the third-degree curve through
the central interval u1 to u2 of the four-point series u0, u1, u2 and u3 which has at u1 and
u2 the same tangents (first differential coefficients) as the partial Newton-Stirling curves of
the second degree through u0, u1, u2 and u1, u2, u3 respectively. The formula may be expressed in
the following terms:

$$u_{x+1} = u_1 + x\,\Delta u_0 + \frac{x(x+1)}{2!}\,\Delta^2 u_0 + \frac{x^2(x-1)}{2!}\,\Delta^3 u_0$$
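
A short sketch of the Karup-King point formula follows, written in the form u(1+x) = u1 + x Δu0 + x(x+1)/2 Δ²u0 + x²(x-1)/2 Δ³u0 for 0 ≤ x ≤ 1. The function name is an assumption, and the four input values are the cumulative populations of Goa up to ages 5, 10, 15 and 20, used here only as convenient test data.

```python
def karup_king(u0, u1, u2, u3, x):
    """Interpolate between u1 (x = 0) and u2 (x = 1) from four equidistant values."""
    d1 = u1 - u0                      # first leading difference
    d2 = u2 - 2 * u1 + u0             # second leading difference
    d3 = u3 - 3 * u2 + 3 * u1 - u0    # third leading difference
    return u1 + x * d1 + x * (x + 1) / 2 * d2 + x ** 2 * (x - 1) / 2 * d3


points = (48254, 93438, 135840, 169727)
at_start = karup_king(*points, 0.0)
at_end = karup_king(*points, 1.0)
midway = karup_king(*points, 0.5)
print(at_start, at_end, round(midway))
```

The third-difference term vanishes at both ends of the interval, so the curve reproduces u1 and u2 exactly and the adjustment only affects the interior points.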

Application of the Karup-King Formula: Let Tx, Tx+5 and Tx+10 denote the enumerated
populations aged x to x+4, x+5 to x+9 and x+10 to x+14 respectively. First identify the groups
which have one T value above them and one below them in the table. These are designated
as 'mid panels'; in a triple such as Tx-5, Tx and Tx+5, the middle group Tx satisfies this.
The first group, which has two values below it in the table but none above it, is designated
the 'first end panel'. Similarly, the last group, which has two values above it but
none below it, is the 'last end panel'. The multipliers for the first, middle and last panels
are given in Tables 23 A & B, which are attached as supplementary material of this block.

In Karup-King formula (see table 23 B), there are 3 columns and 5 rows in each table.
The five rows denote the five single year values, which we are interested to find out from the
given five year grouped values. For example in Table 1 (first panel) row number say 4, when
operated on Tx, Tx+5, Tx+10 will give Px+3 (= population aged x+3). In mid-panel row number
say 3 when operated on Tx+5 ,Tx+10, and Tx+15 will give Px+12 = population aged x+12 and in
the last panel row number 4 when operated on Tx+10, Tx+15 and Tx+20 will give Px+23 =
population aged x+23. In this way all the groups can be split into single year values and this
formula is adjusted in such a way that the total in each age group is unaffected by the splitting.

Sprague's Formula: This formula smooths at the points occupied by the original data by
providing that the curve of the fifth degree passing through the central interval u2 to u3 of the
given six-point series u0, u1, ..., u5 shall have, at the point whose ordinate is u2, the same
tangent and radius of curvature as the partial Newton-Stirling curve of the fourth order through
u0, u1, ..., u4, and shall similarly, at the point whose ordinate is u3, have the same tangent and
radius of curvature as the partial curve of the fourth order through u1, u2, ..., u5. The other
important condition laid down in this process is that the formula must reproduce the given
values exactly. The formula may be stated as

$$
\begin{aligned}
u_{2+x} ={}& u_0 + \frac{(x+2)}{1!}\,\Delta u_0 + \frac{(x+2)(x+1)}{2!}\,\Delta^2 u_0 + \frac{(x+2)(x+1)x}{3!}\,\Delta^3 u_0 \\
&+ \frac{(x+2)(x+1)x(x-1)}{4!}\,\Delta^4 u_0 + \frac{x^3(x-1)(5x-7)}{4!}\,\Delta^5 u_0
\end{aligned}
$$
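
Sprague's point formula can be sketched in the same way. The code below assumes the usual statement of the formula, in which the fifth-difference term is x³(x-1)(5x-7)/4! times Δ⁵u0; the function names are illustrative, and the six cumulative populations of Goa up to ages 5 to 30 serve as test data.

```python
def leading_differences(us):
    """Return [u0, Delta u0, Delta^2 u0, ...] for an equidistant series."""
    diffs, row = [], list(us)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs


def sprague(us, x):
    """Interpolate between u2 (x = 0) and u3 (x = 1) from six equidistant values."""
    u0, d1, d2, d3, d4, d5 = leading_differences(us)
    return (u0
            + (x + 2) * d1
            + (x + 2) * (x + 1) / 2 * d2
            + (x + 2) * (x + 1) * x / 6 * d3
            + (x + 2) * (x + 1) * x * (x - 1) / 24 * d4
            + x ** 3 * (x - 1) * (5 * x - 7) / 24 * d5)


series = [48254, 93438, 135840, 169727, 198689, 224438]
at_u2 = sprague(series, 0.0)
at_u3 = sprague(series, 1.0)
print(at_u2, at_u3, round(sprague(series, 0.4)))
```

The first five terms are Newton's forward formula through u0 to u4; the last term is the osculatory adjustment, which is zero at both ends of the central interval so that the given values are reproduced exactly.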

Application of Sprague's Formula: In general, the Sprague formula calls for two five-year age
intervals on each side of the interval within which single-year graduation is to be performed.
When this condition is met, a set of 'mid-panel' multipliers can be used. At the ends of the
distribution, where two five-year intervals are available on only one side of the interval within
which graduation is desired, special sets of multipliers based on fourth-order differences are
used. There are four such sets of multipliers, two for the younger ages and two for the older
ages. The 'first-end-panel' multipliers are used for the 0-4 year interval and the
'first-next-to-end-panel' multipliers are used for the 5-9 year interval. At the oldest ages, the
'last-end-panel' multipliers are used for the oldest five-year age group and the
'last-next-to-end-panel' multipliers are used for the five-year group immediately younger. In
general, the Sprague multipliers are very flexible and will fit most distributions of data by age.
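
The rule for choosing a set of Sprague multipliers can be written down as a small lookup. The sketch below is only an illustration: the function name `sprague_panel` is hypothetical, groups are numbered from 0 for the youngest, and it assumes at least five five-year groups are available so that every panel is defined.

```python
def sprague_panel(index, n_groups):
    """Name the multiplier set for the five-year group at position `index`.

    `index` counts from 0 (youngest group); `n_groups` is the number of
    five-year groups in the distribution (assumed to be at least five).
    """
    if n_groups < 5:
        raise ValueError("Sprague multipliers need at least five groups")
    if not 0 <= index < n_groups:
        raise IndexError("group index out of range")
    if index == 0:
        return "first-end-panel"
    if index == 1:
        return "first-next-to-end-panel"
    if index == n_groups - 1:
        return "last-end-panel"
    if index == n_groups - 2:
        return "last-next-to-end-panel"
    return "mid-panel"

# Seven groups: 0-4, 5-9, ..., 30-34
panels = [sprague_panel(i, 7) for i in range(7)]
print(panels)
```

With seven groups, only the three central groups (10-14, 15-19 and 20-24) have two groups on each side and therefore use the mid-panel multipliers.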

Greville's Formula: In recent years it has been pointed out that, for most actuarial work,
interpolation is used only to obtain estimates for integral ages and that the function ux, where
ux is the estimated number at age x, may logically be regarded as a discrete series rather than
a continuous curve. Formulas which minimize the mean square error of differences of a given
order have been developed by Greville and Beers; these formulas are recommended for
general use in demographic analysis. In Greville's formula, if ur is an observed value and Ur
the true value, then Ur = ur + er, where er is an error. It is assumed that the errors in the
observed values are independent random variables with mean zero and variance e². It is
further assumed that the differences of U beyond order j are zero. Greville obtained the
graduated values by minimizing the variance of Δ^(j+1)u, where U is a linear combination of
the observed values.

Beers' Formula: Beers started with the assumption that the fifth differences of the observed
values are independent random variables with mean zero and a constant variance, that is, the
same mean square error (variance) is assumed for each Δ⁵ur.

2.7.1 Modified Osculatory Interpolation: When reproducing formulas are used to fill in
the values between certain pre-determined points, it is often found that the resulting curve
shows many undulations and points of inflection, even though it is free from discontinuities.
Since the original values are reproduced exactly, a reproducing formula ensures smoothness
in the interpolated values only when the group values are themselves correct; if they are not,
undulations may result. Therefore, when a reproducing formula is to be used, the original
values should first be graduated if needed.

As mentioned above, both ordinary interpolation and osculatory interpolation are
true interpolation formulas, in the sense that the interpolating arcs pass through the pivotal
points. W. A. Jenkins removed this restriction and produced a set of formulas known as
modified osculatory interpolation formulas, which achieve considerably greater smoothness
among the interpolated values than do true interpolation formulas. In the modified osculatory
formulas, two adjoining interpolating arcs merely meet, and do so in such a way that a
specified number of successive derivatives of the interpolating curve functions are equal at
their common point. Note that the interpolating arcs do not pass through the pivotal points.
The extent to which the value given by this formula differs from the given value at a pivotal
point is shown by

\[
U_x\,(\text{formula}) = U_x - \frac{1}{36}\,\delta^4 U_x
\]

where δ⁴Ux is the fourth central difference at the pivotal point.

2.7.2 Comparison and Selection of Osculatory Interpolation Formulas:

(i) The Beers formula yields a smoother pattern of results than the Sprague formula.
However, the Beers formula has a slight disadvantage: when it is used for
subdivision into tenths, the results will not necessarily add up exactly to the results
of subdivision into fifths.
(ii) When the data tend to follow a second-degree or third-degree polynomial, the
Sprague formula and the Beers formula will both yield about the same results as
the Karup-King formula.
(iii) The modified formula will give poor results when used with data that are of good
quality. It should be used when there is a desire to obtain a smooth series of
interpolated values from data that are known to be somewhat erratic.

The choice of a method for interpolation is dependent on the nature of the data and on
the purposes to be served. The several sets of interpolation coefficients that are presented in
Appendix Tables are based on formulas that differ in their underlying principles. There is no
one best method for all purposes.

Tables of selected sets of multipliers: These selected sets of multipliers, based on five
different formulas, namely the Karup-King, Sprague, Beers ordinary, Beers modified and
Grabill formulas, are used for subdivision of grouped data. For instance, these multipliers
may be used for subdividing age data given in 5-year age groups into single years of age.
They may also be used for subdividing data for 10-year age groups into single years of age.
These multipliers can be manipulated in various ways.

Example 14: Employing the Karup-King third-difference formula, estimate the population
20 years old from the data given below.

Age group (years)     Population

15-19                 35700
20-24                 30500
25-29                 32600

Age 20 is the 'first fifth' of age group 20-24, and age group 20-24 is a middle group.
Taking the population aged 15-19 as G1, the population aged 20-24 as G2 and the population
aged 25-29 as G3, and using the coefficients corresponding to the first fifth of G2 given in
Table 23 B ('for subdivision of groups into fifths'), the desired estimate of the population 20
years old is:

+.064 (35700) + .152 (30500) - .016 (32600) = 2285 + 4636 - 522 = 6399.
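
The same calculation can be carried out for every single year of age. The following Python sketch applies the Karup-King multipliers of Table 23 B to the three groups of Example 14, using the first panel for the first group, the middle panel for the middle group and the last panel for the last group. The function name `karup_king_split` is illustrative; the third row of the middle panel is taken as (-.024, +.248, -.024), the value required for the five rows to preserve the group total.

```python
# Karup-King multipliers (Table 23 B); each row is applied to three
# consecutive five-year groups (G1, G2, G3).
FIRST = [(0.344, -0.208, 0.064), (0.248, -0.056, 0.008),
         (0.176, 0.048, -0.024), (0.128, 0.104, -0.032),
         (0.104, 0.112, -0.016)]
MIDDLE = [(0.064, 0.152, -0.016), (0.008, 0.224, -0.032),
          (-0.024, 0.248, -0.024), (-0.032, 0.224, 0.008),
          (-0.016, 0.152, 0.064)]
# The last panel is the mirror image of the first panel.
LAST = [tuple(reversed(row)) for row in reversed(FIRST)]

def karup_king_split(groups):
    """Split five-year group totals into single-year values."""
    n = len(groups)
    if n < 3:
        raise ValueError("at least three groups are needed")
    singles = []
    for i in range(n):
        if i == 0:
            panel, window = FIRST, groups[0:3]
        elif i == n - 1:
            panel, window = LAST, groups[n - 3:n]
        else:
            panel, window = MIDDLE, groups[i - 1:i + 2]
        for row in panel:
            singles.append(sum(c * g for c, g in zip(row, window)))
    return singles

singles = karup_king_split([35700, 30500, 32600])   # ages 15 to 29
print(round(singles[5], 1))           # age 20 → 6399.2
print(round(sum(singles[5:10]), 1))   # ages 20-24 → 30500.0
```

The second printed value confirms the property stated earlier: the five single-year values add back to the original group total.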

It should be noted that, whenever possible, mid-panel multipliers should be used. For
subdivision of the first group in a distribution (e.g. ages 0-4) the first panel multipliers must be
used. Similarly, for subdivision of the last group (e.g. ages 70-74) the last panel multipliers
must be used. For more details about the use of these supplementary tables you may refer to
Bogue et al. (1994), Vol. 1, Chapter 5.

Self-Check Exercises

5. What do you mean by graduation?

6. What are the criteria of an acceptable graduation?

7. What do you mean by Pivotal points?

Let Us Sum Up

After completing this unit, you might have learnt that -

* Interpolation is the process of finding the value of the function (y) for any value of the
independent variable (x) within a given range of values of x, and extrapolation is the process
of finding the value outside the given range of x.

* The interpolation analysis is done on the basis of two assumptions: firstly, that the quantity
changes continuously without any break and, secondly, that the rate of change is uniform and
there are no sudden jumps in the data. In other words, it means that the data are in the
shape of a continuous or smooth curve.

* The symbol Δ¹x (Δ read as delta) will be used to represent the difference between two
successive known values of the distribution, ux+1 - ux. Such differences are called first order
differences or simply "first differences". The superscript identifies the order of the
difference and the subscript specifies which pair of values has been differenced.

* The simplest type of interpolation is linear interpolation, whose general expression is given
as

\[
u_x = u_0 + \frac{x - x_0}{x_1 - x_0}\,(u_1 - u_0)
\]

where the notations have their usual meanings.

* Newton's Forward formula is applied when the independent variable advances by equal
intervals. Given the first row of differences, it is possible by this method to reproduce all
other differences, simply by successively adding adjacent pairs of values together and
placing the total under the left entry of the pair. This formula is given as

\[
u_Z = u_0 + Z\,\Delta u_0 + \frac{Z(Z-1)}{2!}\,\Delta^2 u_0 + \frac{Z(Z-1)(Z-2)}{3!}\,\Delta^3 u_0 + \cdots
\]

where \(Z = \dfrac{x - x_0}{h}\) and \(h = x_1 - x_0\);
u0, Δu0, Δ²u0, Δ³u0, ..., etc. are the leading differences occurring at the top of the cone in the
difference table.

* Many a time in population statistics, data are given at equal intervals of x; but sometimes it
happens that we are required to interpolate when values of the function are known for
unequal intervals. Since we cannot take out the differences as defined earlier, we adopt a
process of differencing involving the argument as well as the entry. The differences
obtained by this process are called 'divided' differences. Newton's Divided Difference and
Lagrange's formulae are based on these concepts.

* Graduation may be defined as the process of securing, from an irregular series of observed
values of a continuous variable, a smooth regular series of values consistent in a general
way with the observed series of values. It is characterized by two essential qualities, i.e.
smoothness and fit, or consistency with the observed data. It is to be noted here that the
graduated series should not only be smooth as compared with the ungraduated data but
should also be consistent with the indications of the ungraduated series.

* The interpolation method of graduation comprises three elements viz. grouping of data,
securing of a smooth, reliable series of points, and the computation of graduated values by
interpolation based upon these points.

* Graduation by Newton's formula is a simple and effective technique of graduation. In
this, the reported data are grouped into intervals of equal size (say 5- or 10-year groups);
cumulative frequencies for the selected intervals are then compiled and utilized as known Ux
values in Newton's Forward Difference Formula (NFDF). Application of the formula for a
series of intermediate points within each interval yields a smooth cumulative distribution, and
successive subtraction of the cumulative frequencies for the intermediate points converts the
cumulative distribution to a distribution of smoothed frequencies between the intermediate points.
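
Newton's Forward formula summarised above can be sketched in a few lines of Python. The function names and the sample values below are hypothetical and chosen only so that the result is easy to verify by hand (the four values follow u = x²); with two given points the same routine reduces to linear interpolation.

```python
def forward_differences(values):
    """Return the leading differences [u0, Δu0, Δ²u0, ...]."""
    leading = []
    row = list(values)
    while row:
        leading.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return leading

def newton_forward(xs, us, x):
    """Interpolate at x from equally spaced xs with values us."""
    h = xs[1] - xs[0]
    z = (x - xs[0]) / h
    total, term = 0.0, 1.0          # term holds Z(Z-1)...(Z-n+1)/n!
    for n, diff in enumerate(forward_differences(us)):
        total += term * diff
        term *= (z - n) / (n + 1)
    return total

# Hypothetical values at x = 0, 5, 10, 15 following u = x²
estimate = newton_forward([0, 5, 10, 15], [0, 25, 100, 225], 7)
print(round(estimate, 6))   # → 49.0

# With two points only, the formula is linear interpolation
linear = newton_forward([0, 5], [10, 20], 2)
print(round(linear, 6))     # → 14.0
```

For graduation, the same routine would be applied to cumulative frequencies at the group boundaries, and successive estimates would then be subtracted to obtain the smoothed single-year frequencies.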

Model Answers

1. (i) The data should be continuous.

(ii) The rate of change in the values should be uniform and there should not be any
sudden jump in the data.

2. (a) not linearly distributed
(b) equal intervals
(c) unequal intervals
(d) leading term, the leading difference
(e) Δ²₁ and Δ²₀

3. The main uses of interpolation formula are as follows:

(i) Where there are n equidistant terms and it is required to find an intermediate
term.

(ii) Where there are n equidistant terms of which n-1 are known and it is required
to find the missing terms.

(iii) To estimate the composition of a population for a more detailed interval than
those in which they are reported.

(iv) To convert unconventional age groups into conventional age groups.

(v) To split the ten year and five year intervals into five year intervals.

(vi) To split the ten-year and five-year intervals into single year intervals.

4. Some of the important limitations of interpolation formula are:

(i) This method cannot correct the deficiencies/errors in a particular group of
observations.

(ii) When the function tends to zero slowly i.e. it is asymptotic to x-axis,
interpolation for the tail may have percentage errors that are larger than
permissible.

(iii) Caution also should be exercised not only in the use of interpolation formula in
the above situations but also in the selection of formula itself.

5. Graduation may be defined as the process of securing, from an irregular series of
observed values of a continuous variable, a smooth regular series of values consistent
in a general way with the observed series of values.

6. Graduation is characterized by the following two qualities:

(i) Smoothness
(ii) Fit

7. The pivotal points are defined as special interpolation points upon which interpolation
will be based.

Supplementary

The following 10 tables may be used to obtain single year interval values when data are given
at 5 or 10 year intervals. The tables are based on Newton's Forward Difference Formula.

ux = Population at age x

Table 1: Table for obtaining single year interval values for the first section when three
single year values are given at five year intervals.

──────────────────────────────────
          ux        ux+5      ux+10
──────────────────────────────────
ux+1      .72       .36       -.08
ux+2      .48       .64       -.12
ux+3      .28       .84       -.12
ux+4      .12       .96       -.08
──────────────────────────────────

Table 2: Table for obtaining single year interval values for the second section when three
single year values are given at five year intervals.

──────────────────────────────────
          ux-5      ux        ux+5
──────────────────────────────────
ux+1      -.08      .96       .12
ux+2      -.12      .84       .28
ux+3      -.12      .64       .48
ux+4      -.08      .36       .72
──────────────────────────────────

Table 3: Table for obtaining single year interval values for the first section when four
single year values are given at five year intervals.

──────────────────────────────────────────────
          ux        ux+5      ux+10     ux+15
──────────────────────────────────────────────
ux+1      .672      .504      -.224     .048
ux+2      .416      .832      -.312     .064
ux+3      .224      1.008     -.288     .056
ux+4      .088      1.056     -.176     .032
──────────────────────────────────────────────

Table 4: Table for obtaining single year interval values for the first section when five
single year values are given at five year intervals.

────────────────────────────────────────────────────────
          ux        ux+5      ux+10     ux+15     ux+20
────────────────────────────────────────────────────────
ux+1      .6384     .6384     -.4256    .1824     -.0336
ux+2      .3744     .9984     -.5616    .2304     -.0416
ux+3      .1904     1.1424    -.4896    .1904     -.0336
ux+4      .0704     1.1264    -.2816    .1024     -.0176
────────────────────────────────────────────────────────

Table 5: Table for obtaining single year interval values for the third section when five
single year values are given at five year intervals.

────────────────────────────────────────────────────────
          ux-10     ux-5      ux        ux+5      ux+10
────────────────────────────────────────────────────────
ux+1      .0144     -.1056    .9504     .1584     -.0176
ux+2      .0224     -.1536    .8064     .3584     -.0336
ux+3      .0224     -.1456    .5824     .5824     -.0416
ux+4      .0144     -.0896    .3024     .8064     -.0336
────────────────────────────────────────────────────────

Table 6: Table for obtaining single year interval values for the first section when three
single year values are given at ten year intervals.

──────────────────────────────────
          ux        ux+10     ux+20
──────────────────────────────────
ux+1      .855      .190      -.045
ux+2      .720      .360      -.080
ux+3      .595      .510      -.105
ux+4      .480      .640      -.120
ux+5      .375      .750      -.125
ux+6      .280      .840      -.120
ux+7      .195      .910      -.105
ux+8      .120      .960      -.080
ux+9      .055      .990      -.045
──────────────────────────────────

Table 7: Table for obtaining single year interval values for the second section when three
single year values are given at ten year intervals.

──────────────────────────────────
          ux-10     ux        ux+10
──────────────────────────────────
ux+1      -.045     .990      .055
ux+2      -.080     .960      .120
ux+3      -.105     .910      .195
ux+4      -.120     .840      .280
ux+5      -.125     .750      .375
ux+6      -.120     .640      .480
ux+7      -.105     .510      .595
ux+8      -.080     .360      .720
ux+9      -.045     .190      .855
──────────────────────────────────

Table 8: Table for obtaining single year interval values for the third section when five
single year values are given at ten year intervals.

────────────────────────────────────────────────────────
          ux-20     ux-10     ux        ux+10     ux+20
────────────────────────────────────────────────────────
ux+1      .0078     -.0597    .9873     .0733     -.0087
ux+2      .0144     -.1056    .9504     .1584     -.0176
ux+3      .0193     -.1367    .8893     .2543     -.0262
ux+4      .0224     -.1536    .8064     .3584     -.0336
ux+5      .0234     -.1561    .7029     .4689     -.0391
ux+6      .0224     -.1456    .5824     .5824     -.0416
ux+7      .0193     -.1227    .4473     .6963     -.0402
ux+8      .0144     -.0896    .3024     .8064     -.0336
ux+9      .0078     -.0477    .1513     .9093     -.0207
────────────────────────────────────────────────────────

Table 9: Table for obtaining single year interval values for the first section when four
single year values are given at ten year intervals.

──────────────────────────────────────────────
          ux        ux+10     ux+20     ux+30
──────────────────────────────────────────────
ux+1      .8265     .2755     -.1305    .0285
ux+2      .6720     .5040     -.2240    .0480
ux+3      .5355     .6885     -.2835    .0595
ux+4      .4160     .8320     -.3120    .0640
ux+5      .3125     .9375     -.3125    .0625
ux+6      .2240     1.0080    -.2880    .0560
ux+7      .1495     1.0465    -.2415    .0455
ux+8      .0880     1.0560    -.1760    .0320
ux+9      .0385     1.0395    -.0945    .0165
──────────────────────────────────────────────
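
The multipliers for three given values (Tables 1 and 6) follow directly from Newton's Forward Difference Formula truncated after the second difference. The Python sketch below, whose function name is illustrative, collects the coefficients of the three given values for a point lying a fraction z of one interval beyond the first given value, and regenerates the two tables.

```python
def three_point_multipliers(z):
    """Multipliers of (u_x, u_{x+h}, u_{x+2h}) for the value at x + z*h.

    Obtained by expanding u0 + z*Δu0 + z(z-1)/2 * Δ²u0 in terms of the
    three given values.
    """
    return ((z - 1) * (z - 2) / 2, z * (2 - z), z * (z - 1) / 2)

# Table 1: five-year intervals, single years x+1 .. x+4
table1 = [three_point_multipliers(k / 5) for k in range(1, 5)]
# Table 6: ten-year intervals, single years x+1 .. x+9
table6 = [three_point_multipliers(k / 10) for k in range(1, 10)]

print([round(c, 3) for c in table1[0]])   # → [0.72, 0.36, -0.08]
print([round(c, 3) for c in table6[0]])   # → [0.855, 0.19, -0.045]
```

Each row of multipliers sums to one, which is a quick check for copying errors when the tables are used by hand.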

Table 10: Table for obtaining single year interval values for the first section when five
single year values are given at ten year intervals.

────────────────────────────────────────────────────────
          ux        ux+10     ux+20     ux+30     ux+40
────────────────────────────────────────────────────────
ux+1      .8058     .3582     -.2545    .1112     -.0207
ux+2      .6384     .6384     -.4256    .1824     -.0336
ux+3      .4953     .8492     -.5245    .2202     -.0402
ux+4      .3744     .9984     -.5616    .2304     -.0416
ux+5      .2734     1.0938    -.5469    .2188     -.0391
ux+6      .1904     1.1424    -.4896    .1904     -.0336
ux+7      .1233     1.1512    -.3985    .1502     -.0262
ux+8      .0704     1.1264    -.2816    .1024     -.0176
ux+9      .0298     1.0742    -.1465    .0512     -.0087
────────────────────────────────────────────────────────

The following 12 tables may be used for data given in 5 or 10-year age groups to obtain single
year values. The tables are based on Newton's Forward Difference Formula.

Note : Tx = Population aged x to x+4
       Wx = Population aged x to x+9

Table 11: Multipliers for obtaining single year values for the middle group when three
5-year group values are given.

──────────────────────────────────
          Tx-5      Tx        Tx+5
──────────────────────────────────
Px        .048      .184      -.032
Px+1      .016      .208      -.024
Px+2      -.008     .216      -.008
Px+3      -.024     .208      .016
Px+4      -.032     .184      .048
──────────────────────────────────

Table 12: Multipliers for obtaining single year values for the second group when four
5-year group values are given.

──────────────────────────────────────────────
          Tx-5      Tx        Tx+5      Tx+10
──────────────────────────────────────────────
Px        .0336     .2272     -.0752    .0144
Px+1      .0080     .2320     -.0480    .0080
Px+2      -.0080    .2160     -.0080    .0000
Px+3      -.0160    .1840     .0400     -.0080
Px+4      -.0176    .1408     .0912     -.0144
──────────────────────────────────────────────

Table 13: Multipliers for obtaining single year values for the middle group when five
5-year group values are given.

──────────────────────────────────────────────────────────────────
          Tx-10       Tx-5        Tx          Tx+5        Tx+10
──────────────────────────────────────────────────────────────────
Px        -.008064    .065855     .178816     -.042943    .006336
Px+1      -.003584    .022336     .210496     -.033664    .004416
Px+2      .000896     -.011585    .221378     -.011585    .000896
Px+3      .004416     -.033664    .210496     .022336     -.003584
Px+4      .006336     -.042943    .178816     .065855     -.008064
──────────────────────────────────────────────────────────────────

Table 14: Multipliers for obtaining single year values for the first group when three
10-year group values are given.

──────────────────────────────────
          Wx        Wx+10     Wx+20
──────────────────────────────────
Px        .1735     -.1020    .0285
Px+1      .1545     -.0740    .0195
Px+2      .1365     -.0480    .0115
Px+3      .1195     -.0240    .0045
Px+4      .1035     -.0020    -.0015
Px+5      .0885     .0180     -.0065
Px+6      .0745     .0360     -.0105
Px+7      .0615     .0520     -.0135
Px+8      .0495     .0660     -.0155
Px+9      .0385     .0780     -.0165
──────────────────────────────────

Table 15: Multipliers for obtaining single year values for the middle group when three
10-year group values are given.

──────────────────────────────────
          Wx-10     Wx        Wx+10
──────────────────────────────────
Px        .0285     .0880     -.0165
Px+1      .0195     .0960     -.0155
Px+2      .0115     .1020     -.0135
Px+3      .0045     .1060     -.0105
Px+4      -.0015    .1080     -.0065
Px+5      -.0065    .1080     -.0015
Px+6      -.0105    .1060     .0045
Px+7      -.0135    .1020     .0115
Px+8      -.0155    .0960     .0195
Px+9      -.0165    .0880     .0285
──────────────────────────────────

Table 16: Multipliers for obtaining single year values for the last group when three
10-year group values are given.

──────────────────────────────────
          Wx-20     Wx-10     Wx
──────────────────────────────────
Px        -.0165    .0780     .0385
Px+1      -.0155    .0660     .0495
Px+2      -.0135    .0520     .0615
Px+3      -.0105    .0360     .0745
Px+4      -.0065    .0180     .0885
Px+5      -.0015    -.0020    .1035
Px+6      .0045     -.0240    .1195
Px+7      .0115     -.0480    .1365
Px+8      .0195     -.0740    .1545
Px+9      .0285     -.1020    .1735
──────────────────────────────────

Table 17: Multipliers for obtaining single year values for the first group when four
10-year group values are given.

──────────────────────────────────────────────
          Wx        Wx+10     Wx+20     Wx+30
──────────────────────────────────────────────
Px        .1942     -.1640    .0905     -.0207
Px+1      .1674     -.1128    .0583     -.0129
Px+2      .1431     -.0677    .0312     -.0066
Px+3      .1209     -.0283    .0088     -.0014
Px+4      .1010     .0056     -.0091    .0025
Px+5      .0830     .0344     -.0229    .0055
Px+6      .0671     .0583     -.0328    .0074
Px+7      .0529     .0777     -.0392    .0086
Px+8      .0406     .0928     -.0423    .0089
Px+9      .0298     .1040     -.0425    .0087
──────────────────────────────────────────────

Table 18: Multipliers for obtaining single year values for the second group when four
10-year group values are given.

──────────────────────────────────────────────
          Wx-10     Wx        Wx+10     Wx+20
──────────────────────────────────────────────
Px        .0207     .1115     -.0400    .0078
Px+1      .0129     .1157     -.0352    .0066
Px+2      .0066     .1168     -.0283    .0049
Px+3      .0014     .1152     -.0197    .0031
Px+4      -.0025    .1111     -.0096    .0010
Px+5      -.0055    .1049     .0016     -.0010
Px+6      -.0074    .0968     .0137     -.0031
Px+7      -.0086    .0872     .0263     -.0049
Px+8      -.0089    .0763     .0392     -.0066
Px+9      -.0087    .0645     .0520     -.0078
──────────────────────────────────────────────

Table 19: Multipliers for obtaining single year values for the middle group when five
10-year group values are given.

────────────────────────────────────────────────────────
          Wx-20     Wx-10     Wx        Wx+10     Wx+20
────────────────────────────────────────────────────────
Px        -.0045    .0388     .0842     -.0218    .0033
Px+1      -.0035    .0270     .0946     -.0211    .0030
Px+2      -.0024    .0161     .1025     -.0188    .0026
Px+3      -.0013    .0063     .1080     -.0149    .0019
Px+4      -.0001    -.0023    .1107     -.0093    .0010
Px+5      .0010     -.0093    .1107     -.0023    -.0001
Px+6      .0019     -.0149    .1080     .0063     -.0013
Px+7      .0026     -.0188    .1025     .0161     -.0024
Px+8      .0030     -.0211    .0946     .0270     -.0035
Px+9      .0033     -.0218    .0842     .0388     -.0045
────────────────────────────────────────────────────────

Table 20: Multipliers for obtaining single year values for the third group when four
10-year group values are given.

──────────────────────────────────────────────
          Wx-20     Wx-10     Wx        Wx+10
──────────────────────────────────────────────
Px        -.0078    .0520     .0645     -.0087
Px+1      -.0066    .0392     .0763     -.0089
Px+2      -.0049    .0263     .0872     -.0086
Px+3      -.0031    .0137     .0968     -.0074
Px+4      -.0010    .0016     .1049     -.0055
Px+5      .0010     -.0096    .1111     -.0025
Px+6      .0031     -.0197    .1152     .0014
Px+7      .0049     -.0283    .1168     .0066
Px+8      .0066     -.0352    .1157     .0129
Px+9      .0078     -.0400    .1115     .0207
──────────────────────────────────────────────

Table 21: Multipliers for obtaining single year values for the last group when four
10-year group values are given.

──────────────────────────────────────────────
          Wx-30     Wx-20     Wx-10     Wx
──────────────────────────────────────────────
Px        .0087     -.0425    .1040     .0298
Px+1      .0089     -.0423    .0928     .0406
Px+2      .0086     -.0392    .0777     .0529
Px+3      .0074     -.0328    .0583     .0671
Px+4      .0055     -.0229    .0344     .0830
Px+5      .0025     -.0091    .0056     .1010
Px+6      -.0014    .0088     -.0283    .1209
Px+7      -.0066    .0312     -.0677    .1431
Px+8      -.0129    .0583     -.1128    .1674
Px+9      -.0207    .0905     -.1640    .1942
──────────────────────────────────────────────

Table 22: Multipliers for splitting ten-year group values into five-year group values when
three ten-year group values are given.

Vx = Population aged x to x+4

──────────────────────────────────
          Wx        Wx+10     Wx+20
──────────────────────────────────
Vx        .6875     -.2500    .0625
Vx+5      .3125     .2500     -.0625
Vx+10     .0625     .5000     -.0625
Vx+15     -.0625    .5000     .0625
Vx+20     -.0625    .2500     .3125
Vx+25     .0625     -.2500    .6875
──────────────────────────────────
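
The group multipliers can be derived from the point formula by working with cumulated totals. The sketch below regenerates Table 11 (middle group, three 5-year groups) under that assumption: the cumulated population is taken as known at the four group boundaries, a cubic is passed through those four values, and each single-year value is the difference between the cumulated curve at two successive ages. The function names are illustrative.

```python
def lagrange_weights(nodes, z):
    """Weights of the values at `nodes` for polynomial interpolation at z."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (z - xj) / (xi - xj)
        weights.append(w)
    return weights

def middle_group_multipliers(year):
    """Multipliers of (T_{x-5}, T_x, T_{x+5}) giving P_{x+year}, year = 0..4."""
    nodes = [0, 1, 2, 3]   # group boundaries, in units of five years
    lower = lagrange_weights(nodes, 1 + year / 5)
    upper = lagrange_weights(nodes, 1 + (year + 1) / 5)
    d = [u - l for u, l in zip(upper, lower)]   # weights on cumulated totals
    # Cumulated totals: C0 = 0, C1 = T_{x-5}, C2 = C1 + T_x, C3 = C2 + T_{x+5}
    return (d[1] + d[2] + d[3], d[2] + d[3], d[3])

table11 = [middle_group_multipliers(k) for k in range(5)]
for row in table11:
    print([round(c, 3) for c in row])
```

The first row comes out as .048, .184, -.032, in agreement with Table 11, and the five multipliers of Tx add up to one, so the group total is preserved.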

Table 23: Interpolation Coefficients Based on the Karup-King Formula

[Karup-King formula is a four-term third-difference osculatory formula. It maintains the given values. Given
points or groups must be equally spaced.]

A. FOR INTERPOLATION BETWEEN GIVEN POINTS AT INTERVALS OF 0.2

Interpolated point        Coefficients to be applied to -

                          N1.0      N2.0      N3.0      N4.0

First interval
N1.0-----------------     +1.000    .000      .000      .000
N1.2-----------------     +.656     +.552     -.272     +.064
N1.4-----------------     +.408     +.856     -.336     +.072
N1.6-----------------     +.232     +.984     -.264     +.048
N1.8-----------------     +.104     +1.008    -.128     +.016

Middle interval
N2.0-----------------     .000      +1.000    .000      .000
N2.2-----------------     -.064     +.912     +.168     -.016
N2.4-----------------     -.072     +.696     +.424     -.048
N2.6-----------------     -.048     +.424     +.696     -.072
N2.8-----------------     -.016     +.168     +.912     -.064

Last interval
N3.0-----------------     .000      .000      +1.000    .000
N3.2-----------------     +.016     -.128     +1.008    +.104
N3.4-----------------     +.048     -.264     +.984     +.232
N3.6-----------------     +.072     -.336     +.856     +.408
N3.8-----------------     +.064     -.272     +.552     +.656
N4.0-----------------     .000      .000      .000      +1.000

Table 23 (Cont.): Interpolation Coefficients Based on the Karup-King Formula

B. FOR SUBDIVISION OF GROUPS INTO FIFTHS

Interpolated subgroup     Coefficients to be applied to -

                          G1        G2        G3

First panel
First fifth of G1         +.344     -.208     +.064
Second fifth of G1        +.248     -.056     +.008
Third fifth of G1         +.176     +.048     -.024
Fourth fifth of G1        +.128     +.104     -.032
Last fifth of G1          +.104     +.112     -.016

Middle panel
First fifth of G2         +.064     +.152     -.016
Second fifth of G2        +.008     +.224     -.032
Third fifth of G2         -.024     +.248     -.024
Fourth fifth of G2        -.032     +.224     +.008
Last fifth of G2          -.016     +.152     +.064

Last panel
First fifth of G3         -.016     +.112     +.104
Second fifth of G3        -.032     +.104     +.128
Third fifth of G3         -.024     +.048     +.176
Fourth fifth of G3        +.008     -.056     +.248
Last fifth of G3          +.064     -.208     +.344

C. FOR SUBDIVISION OF GROUPS INTO TENTHS OR HALVES

Interpolated subgroup     Coefficients to be applied to -

                          G1        G2        G3

First tenth of G2         +.0405    +.0640    -.0045
Second tenth of G2        +.0235    +.0880    -.0115
Third tenth of G2         +.0095    +.1060    -.0155
Fourth tenth of G2        -.0015    +.1180    -.0165
Fifth tenth of G2         -.0095    +.1240    -.0145

Sum of coefficients for first five tenths =
coefficients for first half of G2
                          +.0625    +.5000    -.0625

Sixth tenth of G2         -.0145    +.1240    -.0095
Seventh tenth of G2       -.0165    +.1180    -.0015
Eighth tenth of G2        -.0155    +.1060    +.0095
Ninth tenth of G2         -.0115    +.0880    +.0235
Last tenth of G2          -.0045    +.0640    +.0405

Sum of coefficients for last five tenths =
coefficients for last half of G2
                          -.0625    +.5000    +.0625
Note : Interpolation coefficients in tables 23, 24 and 27 were computed by Wilson H. Grabill, U.S. Bureau of
the Census, from the basic formulas. Interpolation coefficients in tables 25 and 26 are reproduced from the
sources cited in these tables.
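
Parts A and B of Table 23 are linked: the subdivision coefficients of Part B are differences of the point coefficients of Part A when the latter are applied to cumulated group totals. The following Python sketch, with illustrative names, reproduces the middle panel of Part B from the middle-interval rows of Part A, assuming N1.0 is the cumulated total before G1 (zero), N2.0 = G1, N3.0 = G1 + G2 and N4.0 = G1 + G2 + G3.

```python
# Table 23 A: coefficients applied to N1.0..N4.0 for the points
# N2.0, N2.2, N2.4, N2.6, N2.8 and N3.0.
POINT_ROWS = [
    (0.000, 1.000, 0.000, 0.000),
    (-0.064, 0.912, 0.168, -0.016),
    (-0.072, 0.696, 0.424, -0.048),
    (-0.048, 0.424, 0.696, -0.072),
    (-0.016, 0.168, 0.912, -0.064),
    (0.000, 0.000, 1.000, 0.000),
]

def group_coefficients(lower, upper):
    """Coefficients of (G1, G2, G3) for the subgroup between two points."""
    d = [u - l for l, u in zip(lower, upper)]
    # N1 = 0, N2 = G1, N3 = G1 + G2, N4 = G1 + G2 + G3
    return (d[1] + d[2] + d[3], d[2] + d[3], d[3])

middle_panel = [
    group_coefficients(POINT_ROWS[i], POINT_ROWS[i + 1]) for i in range(5)
]
for row in middle_panel:
    print([round(c, 3) for c in row])
```

The third row comes out as -.024, +.248, -.024; the same differencing applied to the first-interval and last-interval rows of Part A gives the first and last panels of Part B.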

Table 24: Interpolation Coefficients Based on the Sprague Formula

The Sprague formula is a six-term fifth-difference osculatory formula. It maintains the given values. Given
points or groups must be equally spaced. In Part A, the rows of the first two intervals apply to N1.0 to N5.0
and the rows of the last two intervals apply to N2.0 to N6.0.

A. FOR INTERPOLATION BETWEEN GIVEN POINTS AT INTERVALS OF 0.2

Interpolated point        Coefficients to be applied to -

                          N1.0      N2.0      N3.0      N4.0      N5.0      N6.0

First interval
N1.0-----------------     +1.0000   .0000     .0000     .0000     .0000
N1.2-----------------     +.6384    +.6384    -.4256    +.1824    -.0336
N1.4-----------------     +.3744    +.9984    -.5616    +.2304    -.0416
N1.6-----------------     +.1904    +1.1424   -.4896    +.1904    -.0336
N1.8-----------------     +.0704    +1.1264   -.2816    +.1024    -.0176

Next-to-first interval
N2.0-----------------     .0000     +1.0000   .0000     .0000     .0000
N2.2-----------------     -.0336    +.8064    +.3024    -.0896    +.0144
N2.4-----------------     -.0416    +.5824    +.5824    -.1456    +.0224
N2.6-----------------     -.0336    +.3584    +.8064    -.1536    +.0224
N2.8-----------------     -.0176    +.1584    +.9504    -.1056    +.0144

Middle interval
N3.0-----------------     .0000     .0000     +1.0000   .0000     .0000     .0000
N3.2-----------------     +.0128    -.0976    +.9344    +.1744    -.0256    +.0016
N3.4-----------------     +.0144    -.1136    +.7264    +.4384    -.0736    +.0080
N3.6-----------------     +.0080    -.0736    +.4384    +.7264    -.1136    +.0144
N3.8-----------------     +.0016    -.0256    +.1744    +.9344    -.0976    +.0128

Next-to-last interval
N4.0-----------------               .0000     .0000     +1.0000   .0000     .0000
N4.2-----------------               +.0144    -.1056    +.9504    +.1584    -.0176
N4.4-----------------               +.0224    -.1536    +.8064    +.3584    -.0336
N4.6-----------------               +.0224    -.1456    +.5824    +.5824    -.0416
N4.8-----------------               +.0144    -.0896    +.3024    +.8064    -.0336

Last interval
N5.0-----------------               .0000     .0000     .0000     +1.0000   .0000
N5.2-----------------               -.0176    +.1024    -.2816    +1.1264   +.0704
N5.4-----------------               -.0336    +.1904    -.4896    +1.1424   +.1904
N5.6-----------------               -.0416    +.2304    -.5616    +.9984    +.3744
N5.8-----------------               -.0336    +.1824    -.4256    +.6384    +.6384
N6.0-----------------               .0000     .0000     .0000     .0000     +1.0000

Table 24 (Cont.): Interpolation Coefficients Based on the Sprague Formula

B. FOR SUBDIVISION OF GROUPS INTO FIFTHS

(The first two panels apply to G1 to G4 and the last two panels apply to G2 to G5.)

Interpolated subgroup     Coefficients to be applied to -

                          G1        G2        G3        G4        G5

First panel
First fifth of G1         +.3616    -.2768    +.1488    -.0336
Second fifth of G1        +.2640    -.0960    +.0400    -.0080
Third fifth of G1         +.1840    +.0400    -.0320    +.0080
Fourth fifth of G1        +.1200    +.1360    -.0720    +.0160
Last fifth of G1          +.0704    +.1968    -.0848    +.0176

Next-to-first panel
First fifth of G2         +.0336    +.2272    -.0752    +.0144
Second fifth of G2        +.0080    +.2320    -.0480    +.0080
Third fifth of G2         -.0080    +.2160    -.0080    .0000
Fourth fifth of G2        -.0160    +.1840    +.0400    -.0080
Last fifth of G2          -.0176    +.1408    +.0912    -.0144

Middle panel
First fifth of G3         -.0128    +.0848    +.1504    -.0240    +.0016
Second fifth of G3        -.0016    +.0144    +.2224    -.0416    +.0064
Third fifth of G3         +.0064    -.0336    +.2544    -.0336    +.0064
Fourth fifth of G3        +.0064    -.0416    +.2224    +.0144    -.0016
Last fifth of G3          +.0016    -.0240    +.1504    +.0848    -.0128

Next-to-last panel
First fifth of G4                   -.0144    +.0912    +.1408    -.0176
Second fifth of G4                  -.0080    +.0400    +.1840    -.0160
Third fifth of G4                   .0000     -.0080    +.2160    -.0080
Fourth fifth of G4                  +.0080    -.0480    +.2320    +.0080
Last fifth of G4                    +.0144    -.0752    +.2272    +.0336

Last panel
First fifth of G5                   +.0176    -.0848    +.1968    +.0704
Second fifth of G5                  +.0160    -.0720    +.1360    +.1200
Third fifth of G5                   +.0080    -.0320    +.0400    +.1840
Fourth fifth of G5                  -.0080    +.0400    -.0960    +.2640
Last fifth of G5                    -.0336    +.1488    -.2768    +.3616

C. FOR SUBDIVISION OF GROUPS INTO TENTHS OR HALVES

Interpolated subgroup     Coefficients to be applied to -

                          G1        G2        G3        G4        G5

First tenth of G3         -.0076    +.0510    +.0660    -.0096    +.0002
Second tenth of G3        -.0052    +.0338    +.0844    -.0144    +.0014
Third tenth of G3         -.0022    +.0154    +.1036    -.0195    +.0027
Fourth tenth of G3        +.0006    -.0010    +.1188    -.0221    +.0037
Fifth tenth of G3         +.0027    -.0133    +.1272    -.0203    +.0037

Sum of coefficients for first five tenths =
coefficients for first half of G3
                          -.0117    +.0859    +.5000    -.0859    +.0117

Sixth tenth of G3         +.0037    -.0203    +.1272    -.0133    +.0027
Seventh tenth of G3       +.0037    -.0221    +.1188    -.0010    +.0006
Eighth tenth of G3        +.0027    -.0195    +.1036    +.0154    -.0022
Ninth tenth of G3         +.0014    -.0144    +.0844    +.0338    -.0052
Last tenth of G3          +.0002    -.0096    +.0660    +.0510    -.0076

Sum of coefficients for last five tenths =
coefficients for second half of G3
                          +.0117    -.0859    +.5000    +.0859    -.0117

Table 25: Interpolation Coefficients Based on the Beers "Ordinary" Formula.

The Beers "ordinary" formula is a six-term formula, which minimizes the fifth differences of the interpolated
results. It maintains the given values. Given points or groups must be equally spaced.

A. FOR INTERPOLATION BETWEEN GIVEN POINTS AT INTERVALS OF 0.2

Interpolated point        Coefficients to be applied to -

                          N1.0      N2.0      N3.0      N4.0      N5.0      N6.0

First interval
N1.0-----------------     +1.0000   .0000     .0000     .0000     .0000     .0000
N1.2-----------------     +.6667    +.4969    -.1426    -.1006    +.1079    -.0283
N1.4-----------------     +.4072    +.8344    -.2336    -.0976    +.1224    -.0328
N1.6-----------------     +.2148    +1.0204   -.2456    -.0536    +.0884    -.0244
N1.8-----------------     +.0819    +1.0689   -.1666    -.0126    +.0399    -.0115

Next-to-first interval
N2.0-----------------     .0000     +1.0000   .0000     .0000     .0000     .0000
N2.2-----------------     -.0404    +.8404    +.2344    -.0216    -.0196    +.0068
N2.4-----------------     -.0497    +.6229    +.5014    -.0646    -.0181    +.0081
N2.6-----------------     -.0389    +.3849    +.7534    -.1006    -.0041    +.0053
N2.8-----------------     -.0191    +.1659    +.9354    -.0906    +.0069    +.0015

Middle interval
N3.0-----------------     .0000     .0000     +1.0000   .0000     .0000     .0000
N3.2-----------------     +.0117    -.0921    +.9234    +.1854    -.0311    +.0027
N3.4-----------------     +.0137    -.1101    +.7194    +.4454    -.0771    +.0087
N3.6-----------------     +.0087    -.0771    +.4454    +.7194    -.1101    +.0137
N3.8-----------------     +.0027    -.0311    +.1854    +.9234    -.0921    +.0117

Next-to-last interval
N4.0-----------------     .0000     .0000     .0000     +1.0000   .0000     .0000
N4.2-----------------     +.0015    +.0069    -.0906    +.9354    +.1659    -.0191
N4.4-----------------     +.0053    -.0041    -.1006    +.7534    +.3849    -.0389
N4.6-----------------     +.0081    -.0181    -.0646    +.5014    +.6229    -.0497
N4.8-----------------     +.0068    -.0196    -.0216    +.2344    +.8404    -.0404

Last interval
N5.0-----------------     .0000     .0000     .0000     .0000     +1.0000   .0000
N5.2-----------------     -.0115    +.0399    -.0126    -.1666    +1.0689   +.0819
N5.4-----------------     -.0244    +.0884    -.0536    -.2456    +1.0204   +.2148
N5.6-----------------     -.0328    +.1224    -.0976    -.2336    +.8344    +.4072
N5.8-----------------     -.0283    +.1079    -.1006    -.1426    +.4969    +.6667
N6.0-----------------     .0000     .0000     .0000     .0000     .0000     +1.0000

Source: Henry S. Beers, "Discussion of papers presented in the Record Number 68; ......

Table 25 (Cont.): Interpolation Coefficients Based on the Beers "Ordinary" Formula.

B. FOR SUBDIVISION OF GROUPS INTO FIFTHS

Interpolated subgroup Coefficients to be applied to --

G1 G2 G3 G4 G5

First panel

First fifth of G1 +.3333 -.1636 -.0210 +.0796 -.0283


Second fifth of G1 +.2595 -.0780 +.0130 +.0100 -.0045
Third fifth of G1 +.1924 +.0064 +.0184 -.0256 +.0084
Fourth fifth of G1 +.1329 +.0844 +.0054 -.0356 +.0129
Last fifth of G1 +.0819 +.1508 -.0158 -.0284 +.0115

Next-to-first Panel

First fifth of G2 +.0404 +.2000 -.0344 -.0128 +.0068


Second fifth of G2 +.0093 +.2268 -.0402 +.0028 +.0013
Third fifth of G2 -.0108 +.2272 -.0248 +.0112 -.0028
Fourth fifth of G2 -.0198 +.1992 +.0172 +.0072 -.0038
Last fifth of G2 -.0191 +.1468 +.0822 -.0084 -.0015

Middle Panel

First fifth of G3 -.0117 +.0804 +.1570 -.0284 +.0027


Second fifth of G3 -.0020 +.0160 +.2200 -.0400 +.0060
Third fifth of G3 +.0050 -.0280 +.2460 -.0280 +.0050
Fourth fifth of G3 +.0060 -.0400 +.2200 +.0160 -.0020
Last fifth of G3 +.0027 -.0284 +.1570 +.0804 -.0117

Next-to-Last Panel

First fifth of G4 -.0015 -.0084 +.0822 +.1468 -.0191


Second fifth of G4 -.0038 +.0072 +.0172 +.1992 -.0198
Third fifth of G4 -.0028 +.0112 -.0248 +.2272 -.0108
Fourth fifth of G4 +.0013 +.0028 -.0402 +.2268 +.0093
Last fifth of G4 +.0068 -.0128 -.0344 +.2000 +.0404

Last Panel

First fifth of G5 +.0115 -.0284 -.0158 +.1508 +.0819


Second fifth of G5 +.0129 -.0356 +.0054 +.0844 +.1329
Third fifth of G5 +.0084 -.0256 +.0184 +.0064 +.1924
Fourth fifth of G5 -.0045 +.0100 +.0130 -.0780 +.2595
Last fifth of G5 -.0283 +.0796 -.0210 -.1636 +.3333
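
The rows of Part B can be applied in the same way to split a group total into fifths, for example a five-year age group into single years of age. The sketch below uses the middle-panel rows to subdivide G3; the function name `split_into_fifths` and the group totals are hypothetical.

```python
# Middle panel of Table 25, Part B (Beers "Ordinary" formula).
# Row k holds the coefficients applied to G1 ... G5 for the k-th fifth of G3.
MIDDLE_PANEL = [
    [-0.0117, +0.0804, +0.1570, -0.0284, +0.0027],
    [-0.0020, +0.0160, +0.2200, -0.0400, +0.0060],
    [+0.0050, -0.0280, +0.2460, -0.0280, +0.0050],
    [+0.0060, -0.0400, +0.2200, +0.0160, -0.0020],
    [+0.0027, -0.0284, +0.1570, +0.0804, -0.0117],
]


def split_into_fifths(panel, groups):
    """Apply each row of a panel to the five group totals."""
    return [sum(c * g for c, g in zip(row, groups)) for row in panel]


# Hypothetical totals for five consecutive, equally wide groups G1 ... G5.
groups = [5200.0, 4900.0, 4500.0, 4000.0, 3600.0]

fifths = split_into_fifths(MIDDLE_PANEL, groups)
for k, value in enumerate(fifths, start=1):
    print(f"fifth {k} of G3: {value:.1f}")
print(f"sum of fifths: {sum(fifths):.1f}")
```

In each panel of the "Ordinary" table the coefficients in the column of the group being subdivided sum to 1 and those in every other column sum to 0, so the five interpolated fifths add back to the original group total (here G3 = 4500).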

Table 26: Interpolation Coefficients Based on the Beers "Modified" Formula.

(The Beers "Modified" formula is a six-term formula which minimizes the fourth differences of the interpolated
results. This formula combines interpolation with some smoothing or graduation of given values: end panels
maintain the given values. However, given data must be equally spaced.)

A. FOR INTERPOLATION BETWEEN GIVEN POINTS AT INTERVALS OF 0.2

Interpolated point Coefficients to be applied to -

N1.0 N2.0 N3.0 N4.0 N5.0 N6.0

First interval

N1.0----------------- +1.0000 .0000 .0000 .0000 .0000 .0000


N1.2----------------- +.6668 +.5270 -.2640 +.0820 -.0140 +.0022
N1.4----------------- +.4099 +.8592 -.3598 +.1052 -.0173 +.0028
N1.6----------------- +.2196 +1.0279 -.3236 +.0874 -.0136 +.0023
N1.8----------------- +.0862 +1.0644 -.1916 +.0464 -.0066 +.0012

Next-to-first interval

N2.0----------------- .0000 +1.0000 .0000 .0000 .0000 .0000


N2.2----------------- -.0486 +.8655 +.2160 -.0350 +.0030 -.0009
N2.4----------------- -.0689 +.6903 +.4238 -.0442 +.0003 -.0013
N2.6----------------- -.0697 +.5018 +.5938 -.0152 -.0097 -.0010
N2.8----------------- -.0589 +.3233 +.7038 +.0578 -.0257 -.0003

Middle interval

N3.0----------------- -.0430 +.1720 +.7420 +.1720 -.0430 .0000


N3.2----------------- -.0270 +.0587 +.7072 +.3162 -.0538 -.0013
N3.4----------------- -.0141 -.0132 +.6098 +.4708 -.0477 -.0056
N3.6----------------- -.0056 -.0477 +.4708 +.6098 -.0132 -.0141
N3.8----------------- -.0013 -.0538 +.3162 +.7072 +.0587 -.0270

Next-to-last interval

N4.0----------------- .0000 -.0430 +.1720 +.7420 +.1720 -.0430


N4.2----------------- -.0003 -.0257 +.0578 +.7038 +.3233 -.0589
N4.4----------------- -.0010 -.0097 -.0152 +.5938 +.5018 -.0697
N4.6----------------- -.0013 +.0003 -.0442 +.4238 +.6903 -.0689
N4.8----------------- -.0009 +.0030 -.0350 +.2160 +.8655 -.0486

Last interval

N5.0----------------- .0000 .0000 .0000 .0000 +1.0000 .0000


N5.2----------------- +.0012 -.0066 +.0464 -.1916 +1.0644 +.0862
N5.4----------------- +.0023 -.0136 +.0874 -.3236 +1.0279 +.2196
N5.6----------------- +.0028 -.0173 +.1052 -.3598 +.8592 +.4099
N5.8----------------- +.0022 -.0140 +.0820 -.2640 +.5270 +.6668

N6.0----------------- .0000 .0000 .0000 .0000 .0000 +1.0000

Table 26 (cont.): Interpolation Coefficients Based on the Beers "Modified" Formula.

(The Beers "Modified" formula is a six-term formula which minimizes the fourth differences of the interpolated
results. This formula combines interpolation with some smoothing or graduation of given values: end panels
maintain the given values. However, given data must be equally spaced.)

B. FOR SUBDIVISION OF GROUPS INTO FIFTHS

Interpolated subgroup Coefficients to be applied to --

G1 G2 G3 G4 G5

First panel

First fifth of G1 +.3332 -.1938 +.0702 -.0118 +.0022


Second fifth of G1 +.2569 -.0753 +.0205 -.0027 +.0006
Third fifth of G1 +.1903 +.0216 -.0146 +.0032 -.0005
Fourth fifth of G1 +.1334 +.0969 -.0351 +.0059 -.0011
Last fifth of G1 +.0862 +.1506 -.0410 +.0054 -.0012

Next-to-first-panel

First fifth of G2 +.0486 +.1831 -.0329 +.0021 -.0009


Second fifth of G2 +.0203 +.1955 -.0123 -.0031 -.0004
Third fifth of G2 +.0008 +.1893 +.0193 -.0097 +.0003
Fourth fifth of G2 -.0108 +.1677 +.0577 -.0153 +.0007
Last fifth of G2 -.0159 +.1354 +.0972 -.0170 +.0003

Middle panel

First fifth of G3 -.0160 +.0973 +.1321 -.0121 -.0013


Second fifth of G3 -.0129 +.0590 +.1564 +.0018 -.0043
Third fifth of G3 -.0085 +.0260 +.1650 +.0260 -.0085
Fourth fifth of G3 -.0043 +.0018 +.1564 +.0590 -.0129
Last fifth of G3 -.0013 -.0121 +.1321 +.0973 -.0160

Next-to-last panel

First fifth of G4 +.0003 -.0170 +.0972 +.1354 -.0159


Second fifth of G4 +.0007 -.0153 +.0577 +.1677 -.0108
Third fifth of G4 +.0003 -.0097 +.0193 +.1893 +.0008
Fourth fifth of G4 -.0004 -.0031 -.0123 +.1955 +.0203
Last fifth of G4 -.0009 +.0021 -.0329 +.1831 +.0486

Last panel
First fifth of G5 -.0012 +.0054 -.0410 +.1506 +.0862
Second fifth of G5 -.0011 +.0059 -.0351 +.0969 +.1334
Third fifth of G5 -.0005 +.0032 -.0146 +.0216 +.1903
Fourth fifth of G5 +.0006 -.0027 +.0205 -.0753 +.2569
Last fifth of G5 +.0022 -.0118 +.0702 -.1938 +.3332

Source: Henry S. Beers, "Modified-Interpolation Formulas that Minimize Fourth Differences," The Record of the
American Institute of Actuaries, 24, Part 1(69):19-20, June 1945.
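
The note to Table 26 says that the end panels maintain the given values while the remaining panels smooth them. One way to see this, sketched below in Python, is to add up each column of a panel: the column sums show how much of each given group ends up in the subdivided group. The helper name `column_sums` is hypothetical; the coefficients are copied from Part B above.

```python
# First (end) panel and middle panel of Table 26, Part B (Beers "Modified").
# Each row holds the coefficients applied to G1 ... G5.
FIRST_PANEL = [
    [+0.3332, -0.1938, +0.0702, -0.0118, +0.0022],
    [+0.2569, -0.0753, +0.0205, -0.0027, +0.0006],
    [+0.1903, +0.0216, -0.0146, +0.0032, -0.0005],
    [+0.1334, +0.0969, -0.0351, +0.0059, -0.0011],
    [+0.0862, +0.1506, -0.0410, +0.0054, -0.0012],
]
MIDDLE_PANEL = [
    [-0.0160, +0.0973, +0.1321, -0.0121, -0.0013],
    [-0.0129, +0.0590, +0.1564, +0.0018, -0.0043],
    [-0.0085, +0.0260, +0.1650, +0.0260, -0.0085],
    [-0.0043, +0.0018, +0.1564, +0.0590, -0.0129],
    [-0.0013, -0.0121, +0.1321, +0.0973, -0.0160],
]


def column_sums(panel):
    """Sum the coefficients applied to each of G1 ... G5 over the five fifths."""
    return [round(sum(column), 4) for column in zip(*panel)]


print("first panel :", column_sums(FIRST_PANEL))
print("middle panel:", column_sums(MIDDLE_PANEL))
```

For the first panel the column sums are 1 for G1 and 0 for the other groups, so the five fifths add back to G1 exactly. For the middle panel they are -.0430, +.1720, +.7420, +.1720 and -.0430, so the fifths of G3 add up to a weighted average of all five groups rather than to G3 itself. These are the same weights that appear in the N3.0 row of Part A.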

Table 27: Interpolation Coefficients Based on Grabill's
Weighted Moving Average of Sprague Coefficients

(See text for derivation. Used for drastic smoothing. Given groups must be equally spaced)

Interpolated subgroup Coefficients to be applied to --

G1 G2 G3 G4 G5
First fifth of G3 +.0111 +.0816 +.0826 +.0256 -.0009
Second fifth of G3 +.0049 +.0673 +.0903 +.0377 -.0002
Third fifth of G3 +.0015 +.0519 +.0932 +.0519 +.0015
Fourth fifth of G3 -.0002 +.0377 +.0903 +.0673 +.0049
Last fifth of G3 -.0009 +.0256 +.0826 +.0816 +.0111
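
To give a rough sense of what "drastic smoothing" means here, the sketch below applies the Table 27 coefficients to an artificial case in which the whole count sits in G3. The function name `split_into_fifths` and both sets of input values are hypothetical, chosen only to make the effect visible.

```python
# Table 27: Grabill coefficients for the five fifths of G3.
# Each row holds the coefficients applied to G1 ... G5.
GRABILL = [
    [+0.0111, +0.0816, +0.0826, +0.0256, -0.0009],
    [+0.0049, +0.0673, +0.0903, +0.0377, -0.0002],
    [+0.0015, +0.0519, +0.0932, +0.0519, +0.0015],
    [-0.0002, +0.0377, +0.0903, +0.0673, +0.0049],
    [-0.0009, +0.0256, +0.0826, +0.0816, +0.0111],
]


def split_into_fifths(panel, groups):
    """Apply each row of a panel to the five group totals."""
    return [sum(c * g for c, g in zip(row, groups)) for row in panel]


# Artificial input: an isolated spike in G3 (illustrative only).
spike = [0.0, 0.0, 100.0, 0.0, 0.0]
# Artificial input: five equal groups (illustrative only).
level = [100.0, 100.0, 100.0, 100.0, 100.0]

spike_fifths = split_into_fifths(GRABILL, spike)
level_fifths = split_into_fifths(GRABILL, level)

print("spike:", [round(v, 2) for v in spike_fifths], "total", round(sum(spike_fifths), 2))
print("level:", [round(v, 2) for v in level_fifths], "total", round(sum(level_fifths), 2))
```

With equal groups every fifth is one fifth of the group total, because each row sums to .2. With the spike, the fifths of G3 add up to only 43.9 of the original 100: the G3 column sums to .4390, and the rest of the weight falls on the neighbouring groups. The interpolated results therefore do not reproduce the individual group totals.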

Suggested Readings

1. Ayres, Frank, Jr. (1983), Theory and Problems of Matrices, Schaum's Outline Series,
Singapore.

2. Ayres, F., Jr., 1986: Theory and Problems of Matrices, Schaum's Outline Series, McGraw-Hill
Book Company, Singapore.

3. C. P. Prakasam, G. Rama Rao and R. B. Upadhyay, 1987, Basic Mathematics in
Population Studies, Gemini Publishers, Mumbai.

4. Gordon Fuller (1971), Algebra and Trigonometry, Chapter 15 (pp. 304-317), Chapter 16,
McGraw-Hill Book Company, New York, U.S.A.

5. International Institute for Population Sciences (1974-75), Lecture Notes on
Mathematics, Interpolation and Graduation, Rates and Ratios, Vol. 1, Mimeographed,
Bound Copy.

6. Jain, S. K., 1979: Basic Mathematics for Demographers, The Australian National
University, Canberra.

7. Kruglak, H., and Moore, J. T. (1973), Schaum's Outline of Theory and Problems of Basic
Mathematics, McGraw-Hill Book Company.

8. Lewis Parry J, 1957: An Introduction to Mathematics, Macmillan and Co Ltd, New
York, St. Martin's Press.

9. Ross, M. R., 1946: Differential and Integral Calculus, McGraw-Hill Book Company,
New York (Chapters I, II, III, XIV).

Notations and Symbols

∈ Belongs to

Δ Delta

| | Determinant

d/dx Differential

= (≠) Equal to (not equal to)

e Exponential

> (≥) Greater than (greater than or equal to)

∫ Integral

< (≤) Less than (less than or equal to)

lim Limit

[ ] Matrix

| | Modulus value

× (*) Multiplication (* is used for multiplication in a few places)

π Pi

r Radius

√ Square root

! Factorial

ln Natural logarithm

Capacity Building for a Better Future

Department of Extra Mural Studies and Distance Education

International Institute for Population Sciences


B.S. Devshi Marg, (Govandi Station Road), Deonar, Mumbai-400 088
Ph: 022-42372428; Fax: 022-25563257
E-mail: ems@iips.net, Website: www.iipsindia.org
