MSP-3C 01
Population Studies
(Distance Education)
MSP-3C
Block- 1
Some Basic Mathematical Tools
Units of this block are revised and updated by:
Prof. Sayeed Unisa
Prof. S. K. Singh
Dr. Laxmi Kant Dwivedi
Dr. Preeti Dhillon
©
International Institute for Population Sciences, Govandi Station Road, Deonar,
Mumbai-400 088. Ph: 022-42372428; Fax: 022-25563257; E-mail: ems@iips.net
All rights are reserved. No part of this work may be reproduced in any form, by xeroxing or any
other means without prior permission in writing from the Director, International Institute for
Population Sciences, Mumbai.
Block - 1 : Some Basic Mathematical Tools
The block in your hand is the first block in the paper Statistical Methods for Population Studies. There are two units in this block. The contents of the units are as follows:
Unit 1: Elementary Mathematical Tools in the Field of Population
In this unit, we have endeavored to explain to you some basic mathematical tools and their applications in population data analysis. You will learn about permutations and combinations, binomial and exponential functions, and the computation of population growth rates in different sections of this unit.
Unit 2: Interpolation and Graduation
Interpolation is defined as the technique of obtaining the most likely estimate of a certain quantity under certain assumptions. In this unit, we will discuss different methods of interpolation, extrapolation and graduation, with their applications in population data analysis. You will also find a critical evaluation of the different methods, their merits and limitations.
PREFACE
(Fifth Edition)
This block, entitled Some Basic Mathematical Tools, is the revised and updated version of the module Statistical Methods for Population Studies (MSP-3C). Prof. Sayeed Unisa, Prof. S. K. Singh, Dr. Laxmi Kant Dwivedi and Dr. Preeti Dhillon have contributed to revising and updating the block. Dr. Atreyee Sinha and Dr. Md. Illias K. Sheikh have compiled and edited the block in this version. I compliment Prof. T. V. Sekher and his colleagues in the Department of Extra Mural Studies and Distance Education, who have collectively worked hard to maintain the quality and the overall conduct of the programme.
Prof. K. S. James
May 2019 Director & Sr. Professor
Unit 1: Elementary Mathematical Tools in the Field of Population
Unit Structure
1.0 Objectives
1.1 Introduction
1.2 Permutations [Self-check Exercise]
1.3 Combinations [Self-check Exercise]
1.4 Binomial Expansion and Exponential Functions
1.4.1 Binomial Expansion
1.4.2 Exponential Function [Self-check Exercise]
1.5 Ratios, Proportions and Rates
1.6 Arithmetic, Geometric and Exponential Rates of Population Growth
1.7 Estimation of Mid-year Population
[Self-check Exercise]
Let Us Sum Up
Model Answers
1.0 Objectives
In the present unit you will learn about some mathematical tools and their applications in the field of population. More specifically, we will discuss:
1.1 Introduction
1.2 Permutations
To know how to determine the number of possible outcomes, we must study the mathematical concepts of permutation and combination. Before defining permutation, it is helpful to explain the basic idea with some suitable examples. So let us start with two rules that underlie the concept of permutations.
Rule 1: If a certain act A1 can be performed in m1 different ways and another act A2 can be
performed in m2 different ways then the total number of ways in which either A1 or A2 can be
performed is m1 + m2. Thus, for example, if there are 5 mathematics books and 4 physics
books, and if a boy is to choose either a mathematics book or a physics book, he can do so in
5+4=9 ways.
Rule 2: If a certain act A1 can be performed in m1 different ways, and having performed it in
any one of these m1 ways, another act A2 can be performed in m2 different ways then the two
acts, A1 and A2 can be performed in the stated order in (m1 x m2) ways.
Thus, for example, suppose that a person can travel from Bombay to Delhi by any one of
the three trains running between the two places. Also, suppose that he is to travel from Bombay
to Delhi and then return by a different train. In how many ways can he perform the journey?
Solution: The person can travel from Bombay to Delhi in three ways and corresponding to each
of these three ways there are two ways to return as he is not supposed to take the same train for
the return journey. Hence, the total number of ways he can perform the entire journey is 3x2=6
ways.
The above rule can be extended to the case where three or more acts are to be performed. Suppose there are k acts A1, A2, A3, ..., Ak such that A1 can be performed in m1 different ways; having performed A1 in any one of these m1 ways, A2 can be performed in any one of m2 different ways; and so on up to the kth act, which can be performed in mk different ways. Then the total number of ways in which the k acts can be performed in the stated order is m1 * m2 * m3 * .... * mk.
Thus, for example, suppose that a cricket team of eleven players is to choose a captain, a vice-captain and a secretary from amongst themselves. The captain may be chosen in 11 ways, as any one of the 11 players can become captain. Having chosen a captain, a vice-captain may be chosen in 10 ways (out of the remaining 10 players), and having chosen a captain and a vice-captain, a secretary can be chosen in 9 ways. Hence a captain, a vice-captain and a secretary can be chosen in 11x10x9=990 ways.
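The multiplication rule (Rule 2) can be checked numerically. The following is a minimal sketch in Python, not part of the original module; the function names `product_rule` and `count_by_enumeration` are hypothetical and chosen only for illustration. It multiplies the number of ways for each act and, as a cross-check, lists every ordered choice explicitly.

```python
from itertools import permutations


def product_rule(ways):
    """Multiply the number of ways m1, m2, ..., mk of k successive acts (Rule 2)."""
    total = 1
    for m in ways:
        total *= m
    return total


def count_by_enumeration(n_players, n_posts):
    """Count ordered selections by listing every one of them (brute force)."""
    return sum(1 for _ in permutations(range(n_players), n_posts))


# Train example: 3 trains for the onward journey, 2 remaining for the return.
print(product_rule([3, 2]))          # 6

# Cricket example: captain, vice-captain and secretary from 11 players.
print(product_rule([11, 10, 9]))     # 990
print(count_by_enumeration(11, 3))   # 990
```

The brute-force count agrees with the product 11 x 10 x 9, which is the point of Rule 2: the listing is never needed once the number of ways for each act is known.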
Definition of Permutations and its Proof: Now that the basic ideas are clear, let us define permutations. The word permutation, in simple language, means arrangement.
Definition: In general, let there be n different objects which are to be arranged in a line, taking only r of them (0 < r ≤ n) at a time. Each possible arrangement in a line of r objects is called a permutation of n objects taken r at a time. The total number of such arrangements is denoted by nPr or P(n,r). Precisely, if r objects are to be selected from a set of n different objects in such a way that the order of selection is important, the number of permutations is given by:
\[ {}^{n}P_{r} = n(n-1)(n-2)\cdots(n-r+1) \]
The above relation gives the number of different permutations of n different objects taken r at a time without repetition.
Proof: The required number is the same as the number of ways in which r places in a row can be filled with n different objects.
The first place can be filled in n different ways, as any one of the n objects may be placed there. Having filled the first place in any of these n ways, the second place can be filled in (n-1) different ways, as any one of the remaining n-1 objects may be placed there. Hence, by Rule 2, the first two places can be filled in n(n-1) different ways. Proceeding in this way, when the first (r-1) places have been filled we are left with n-(r-1) = n-r+1 objects, any one of which can fill the rth (that is, the last) place. This can be done in (n-r+1) different ways. Hence, by Rule 2, all r places can be filled in n(n-1)(n-2)...(n-r+1) ways.
The product of the first n natural numbers is called n factorial and is written n!:
\[ n! = 1 \times 2 \times 3 \times \cdots \times n \]
The right hand side of the above formula is nothing but the product of the first n natural numbers. Products of consecutive positive integers will often occur in the present unit, and they can be written as ratios of factorials. For example,
\[ 4 \times 5 \times 6 = \frac{1 \times 2 \times 3 \times 4 \times 5 \times 6}{1 \times 2 \times 3} = \frac{6!}{3!} \]
\[ 7 \times 8 \times 9 \times 10 = \frac{1 \times 2 \times 3 \times 4 \times \cdots \times 10}{1 \times 2 \times 3 \times \cdots \times 6} = \frac{10!}{6!} \]
In the same way,
\[ {}^{n}P_{r} = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!} \]
Now if r = n, we get,
\[ {}^{n}P_{n} = \frac{n!}{(n-n)!} = \frac{n!}{0!} \]
Example 1: How many three-digit numbers can be formed using the digits 1, 3, 5, 7, 9 if each digit is to be used only once?
Solution: Here we have to arrange 5 digits in a line, taking 3 at a time. This can be done in
\[ {}^{5}P_{3} = \frac{5!}{(5-3)!} = \frac{5!}{2!} = \frac{1 \times 2 \times 3 \times 4 \times 5}{1 \times 2} = 60 \text{ ways} \]
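The permutation formula can be verified with a short computation. The sketch below is an illustration only; `n_p_r` is a hypothetical helper name. It multiplies the r factors n(n-1)...(n-r+1) and compares the result with the factorial form n!/(n-r)! and with Python's built-in `math.perm` (available from Python 3.8).

```python
import math


def n_p_r(n, r):
    """Number of permutations of n objects taken r at a time: n(n-1)...(n-r+1)."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    result = 1
    for factor in range(n, n - r, -1):   # n, n-1, ..., n-r+1
        result *= factor
    return result


# Example 1: three-digit numbers from the digits 1, 3, 5, 7, 9 without repetition.
print(n_p_r(5, 3))                                # 60
print(math.factorial(5) // math.factorial(5 - 3)) # 60, the factorial form
print(math.perm(5, 3))                            # 60, standard-library check
```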
Self-check Exercise
1.3 Combinations
In the previous section on permutations we saw the arrangement of n objects taken r at a time in a row. In permutations, while selecting or arranging the objects, we are concerned with the ORDER of the objects, whereas in combinations the order of arrangement has no relevance and is ignored. Let us now define a combination.
Definition: Let there be n different objects out of which r (where 0 < r ≤ n) are to be chosen at a time. A group of r objects selected out of the n objects, without reference to the order of selection, is called a combination of n objects taken r at a time. The total number of such combinations is denoted by nCr or C(n,r).
\[ {}^{n}C_{r} = \frac{{}^{n}P_{r}}{r!} = \frac{n!}{r!\,(n-r)!}; \qquad {}^{n}C_{0} = 1 \]
For example, suppose a boy is to choose 3 books out of 5 books. The number of ways of doing this is denoted by 5C3 or C(5,3) [since n=5, r=3].
\[ {}^{5}C_{3} = \frac{5!}{3!\,2!} = \frac{1 \times 2 \times 3 \times 4 \times 5}{(1 \times 2 \times 3)(1 \times 2)} = 10 \]
Example 2: An examination paper has 13 questions and the students are expected to answer 5.
In how many ways could the questions be selected?
Solution: In this example the order of selecting a question is not important, since a student can choose or answer any question according to his choice. Therefore, this is an example of combination.
\[ {}^{13}C_{5} = \frac{13!}{5!\,8!} = \frac{13 \times 12 \times 11 \times 10 \times 9}{5 \times 4 \times 3 \times 2 \times 1} = 1287 \text{ ways} \]
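The relation between permutations and combinations, nCr = nPr / r!, can be illustrated as follows. This is a minimal sketch; `n_c_r` is a hypothetical helper name, while `math.perm`, `math.comb` and `math.factorial` are standard-library functions (Python 3.8 or later).

```python
import math


def n_c_r(n, r):
    """Combinations of n objects taken r at a time, computed as nPr / r!."""
    # Each unordered group of r objects can be arranged in r! orders,
    # so dividing the number of ordered selections by r! removes the order.
    return math.perm(n, r) // math.factorial(r)


print(n_c_r(5, 3))    # 10   (choosing 3 books out of 5)
print(n_c_r(13, 5))   # 1287 (Example 2: answering 5 questions out of 13)
```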
(i) nCn-r = nCr
Proof:
\[ {}^{n}C_{n-r} = \frac{n!}{(n-r)!\,[n-(n-r)]!} = \frac{n!}{(n-r)!\,r!} = {}^{n}C_{r} \]
This means that the number of combinations of n things taken r at a time is the same as the number of combinations of n things taken n-r at a time.
(ii) nCr + nCr-1 = n+1Cr
Proof:
\[ {}^{n}C_{r} + {}^{n}C_{r-1} = \frac{n!}{r!\,(n-r)!} + \frac{n!}{(r-1)!\,(n-r+1)!} \]
\[ = \frac{n!\,(n-r+1) + n!\,r}{r!\,(n-r+1)!} = \frac{n!\,(n+1)}{r!\,(n+1-r)!} \]
\[ = \frac{(n+1)!}{r!\,(n+1-r)!} = {}^{n+1}C_{r} \]
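Both properties of combinations can be checked numerically for a range of values. The sketch below is illustrative only (the function names are hypothetical); it relies on the standard-library function `math.comb`.

```python
import math


def symmetry_holds(n):
    """Check property (i): nC(n-r) == nCr for every r from 0 to n."""
    return all(math.comb(n, n - r) == math.comb(n, r) for r in range(n + 1))


def addition_rule_holds(n):
    """Check property (ii): nCr + nC(r-1) == (n+1)Cr for every r from 1 to n."""
    return all(
        math.comb(n, r) + math.comb(n, r - 1) == math.comb(n + 1, r)
        for r in range(1, n + 1)
    )


# Both properties hold for every n tried here.
print(all(symmetry_holds(n) and addition_rule_holds(n) for n in range(1, 30)))
```

A numerical check of this kind does not replace the algebraic proof, but it is a quick way to catch a slip in the indices.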
Self-check Exercises
2. In how many ways can a cricket team of eleven be chosen out of 15 players?
3. A committee of five persons is to be chosen from 8 men and 5 women. In
how many ways can this be done if the committee is to contain 3 men and 2
women?
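1.4 Binomial Expansion and Exponential Functions
1.4.1 Binomial Expansion
In the earlier two sections we have discussed permutations and combinations. In this section we start with the Binomial Expansion.
An expression containing two terms, which are connected by a positive or negative sign, is called a Binomial Expression or simply a Binomial. For example, $x+y$, $2x-3y$, $9x-7$, $x^2+4y^2$ are all Binomials. Consider the expression $(x+y)^2$.
By actual multiplication we get,
\[ (x+y)^2 = x^2 + 2xy + y^2 \]
Similarly, $(x+y)^3 = x^3 + 3x^2y + 3xy^2 + y^3$ and $(x+y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4$. The following points may be observed in these expansions: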
(i) The number of terms in each expansion is one more than the exponent of the binomial. For example, in the expansion of $(x+y)^4$, the exponent is 4 and the number of terms in this expansion is 5.
(ii) The first term is x with an exponent the same as the exponent of the binomial, and the exponent decreases by 1 from term to term. For example, in the expansion of $(x+y)^3$, the exponent of x in the first term is 3, that in the second term is 2 and so on, and in the final term there is no x, as $x^0 = 1$.
(iii) The exponent of y in the second term is 1, and it increases by 1 from term to term. For example, in the expansion of $(x+y)^3$ there is no y in the first term, as $y^0 = 1$, and the exponent of y increases from term to term.
(iv) The sum of the exponents of x and y in any term is equal to the exponent of the binomial. For example, you will observe that in the expansion of $(x+y)^3$ the sum of the exponents of x and y in the second, third and fourth terms is equal to 3, which is also the exponent of $(x+y)^3$.
(v) The coefficient of the second term is the same as the exponent of the binomial. The coefficient of any further term may be computed from the previous term by multiplying that term's coefficient by the exponent of x and dividing by one more than the exponent of y. For example, consider the expansion of: $(x+y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4$.
In the above expansion the coefficient of the second term is 4, which is also the exponent of the binomial $(x+y)^4$. The coefficient of the third term, which is 6, is computed from the second term $4x^3y$ as follows:
\[ \frac{4 \times 3}{1 + 1} = 6 \]
On similar lines, you may check the coefficients of the fourth and fifth terms in the above expansion. The discussion so far suggests the following expansion of $(x+y)^n$, n being a positive integer:
\[ (x+y)^n = {}^{n}C_{0}\,x^{n} + {}^{n}C_{1}\,x^{n-1}y + {}^{n}C_{2}\,x^{n-2}y^{2} + \cdots + {}^{n}C_{r}\,x^{n-r}y^{r} + \cdots + {}^{n}C_{n-1}\,x\,y^{n-1} + y^{n} \]
This expansion is called the binomial expansion, and the result is known as the Binomial Theorem. It was proved by Sir Isaac Newton in 1665.
The Binomial Theorem : If n is a positive integer and x, y are any two numbers, then
\[ (x+y)^n = \sum_{r=0}^{n} {}^{n}C_{r}\, x^{n-r}\, y^{r} \]
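Replacing y by $-y$ in the above expansion, we get,
\[ (x-y)^n = {}^{n}C_{0}\,x^{n} + {}^{n}C_{1}\,x^{n-1}(-y) + {}^{n}C_{2}\,x^{n-2}(-y)^{2} + {}^{n}C_{3}\,x^{n-3}(-y)^{3} + \cdots + {}^{n}C_{r}\,x^{n-r}(-y)^{r} + \cdots + {}^{n}C_{n-1}\,x\,(-y)^{n-1} + (-y)^{n} \]
Notice that in the above the terms are alternately positive and negative.
Similarly, putting x = 1 in the expansion of $(x+y)^n$, we get,
\[ (1+y)^n = 1 + {}^{n}C_{1}\,y + {}^{n}C_{2}\,y^{2} + {}^{n}C_{3}\,y^{3} + \cdots + {}^{n}C_{r}\,y^{r} + \cdots + {}^{n}C_{n-1}\,y^{n-1} + y^{n} \]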
(ii) The exponent of x in the first term is n and it goes on decreasing by unity in the succeeding terms, becoming zero in the (n+1)th, that is, the last term.
(iii) The exponent of y in the first term is zero and it goes on increasing by unity in the succeeding terms, becoming n in the (n+1)th, that is, the last term.
(iv) The sum of the exponents of x and y in any term of the expansion is n.
(v) The coefficient of $x^n$ is 1. It may be written as ${}^{n}C_{0}$. The coefficient of $y^n$ is also 1, which may be written as ${}^{n}C_{n}$. The coefficients in the n+1 terms of the expansion are ${}^{n}C_{0}, {}^{n}C_{1}, {}^{n}C_{2}, \ldots, {}^{n}C_{n}$.
(vii) If n is even, the number of terms in the expansion is odd. The (n/2 + 1)th term is the middle term. If n is odd, the number of terms in the expansion is even. In this case there is no single middle term, but the ((n+1)/2)th and ((n+1)/2 + 1)th terms may be taken as the two middle-most terms.
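The properties of the binomial expansion listed above can be explored with a few lines of code. The following is a sketch under stated assumptions: the function names are hypothetical, and `coefficients_by_rule` implements the step-by-step rule described earlier (multiply the previous coefficient by the exponent of x and divide by one more than the exponent of y), which is compared with the direct values nCr from `math.comb`.

```python
import math


def binomial_coefficients(n):
    """Coefficients nC0, nC1, ..., nCn of the expansion of (x + y)^n."""
    return [math.comb(n, r) for r in range(n + 1)]


def coefficients_by_rule(n):
    """Build the same coefficients term by term from the previous term."""
    coefficients = [1]                      # coefficient of x^n
    for r in range(n):
        exponent_of_x = n - r               # in the term x^(n-r) y^r
        exponent_of_y = r
        nxt = coefficients[-1] * exponent_of_x // (exponent_of_y + 1)
        coefficients.append(nxt)
    return coefficients


def expand_numerically(x, y, n):
    """Sum of nCr * x^(n-r) * y^r for r = 0..n; should equal (x + y)^n."""
    return sum(c * x ** (n - r) * y ** r
               for r, c in enumerate(binomial_coefficients(n)))


print(binomial_coefficients(4))        # [1, 4, 6, 4, 1]
print(coefficients_by_rule(5))         # [1, 5, 10, 10, 5, 1]
print(expand_numerically(2, 3, 5))     # 3125, which is (2 + 3)^5
```

Note that the list always has n+1 entries and is symmetric, in line with the properties stated in the text.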
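Now the binomial coefficients in the present example are:
\[ {}^{5}C_{1} = \frac{5!}{1!\,4!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{4 \times 3 \times 2 \times 1} = 5 \]
\[ {}^{5}C_{2} = \frac{5!}{2!\,3!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{(1 \times 2)(1 \times 2 \times 3)} = 10 \]
\[ {}^{5}C_{3} = \frac{5!}{3!\,2!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1)(2 \times 1)} = 10 \]
\[ {}^{5}C_{4} = \frac{5!}{4!\,1!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{(4 \times 3 \times 2 \times 1)(1)} = 5 \]
Note: Please remember that it is always better to work out the values of the binomial coefficients and to expand the given expression fully; otherwise you may lose marks in the examination.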
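\[ T_{r+1} = (-1)^{r}\, {}^{n}C_{r}\, x^{n-r}\, y^{r} \]
Put n = 8, r = 4, x = 2x and y = 3:
\[ T_5 = (-1)^{4}\, {}^{8}C_{4}\, (2x)^{8-4}\, (3)^{4} = {}^{8}C_{4}\, (2x)^{4}\, (3)^{4} \]
\[ {}^{8}C_{4} = \frac{8!}{4!\,4!} = \frac{8 \times 7 \times 6 \times 5 \times 4!}{4!\,4!} = \frac{8 \times 7 \times 6 \times 5}{1 \times 2 \times 3 \times 4} = 70 \]
\[ \therefore T_5 = 70 \times 16x^{4} \times 81 = 90720\,x^{4} \]
Example 5: Find the middle term in the expansion of $\left(x^{2/3} + \dfrac{1}{x^{1/2}}\right)^{10}$
Solution: Here n = 10 is even, so the middle term is the (10/2 + 1)th, that is, the 6th term.
\[ \therefore T_6 = {}^{10}C_{5}\, \left(x^{2/3}\right)^{5} \left(\frac{1}{x^{1/2}}\right)^{5} = \frac{10!}{5!\,5!} \cdot \frac{x^{10/3}}{x^{5/2}} \]
\[ = \frac{10 \times 9 \times 8 \times 7 \times 6}{1 \times 2 \times 3 \times 4 \times 5}\, x^{5/6} = 252\, x^{5/6} \]
\[ T_6 = \text{the middle term} = 252\, x^{5/6} \]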
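The general term used in these two worked examples can be computed mechanically. The sketch below is an illustration, not the module's own method: `general_term` is a hypothetical helper that treats each part of the binomial as a coefficient times a power of x and returns the coefficient and the power of x in the term T(r+1). Exact fractions are used so that powers such as 5/6 are not rounded.

```python
import math
from fractions import Fraction


def general_term(n, r, a_coef, a_pow, b_coef, b_pow):
    """Term T_(r+1) of (a_coef * x^a_pow + b_coef * x^b_pow)^n.

    Returns (coefficient, power of x). A minus sign in the binomial is
    carried by a negative b_coef, which reproduces the factor (-1)^r.
    """
    coefficient = math.comb(n, r) * a_coef ** (n - r) * b_coef ** r
    power = Fraction(a_pow) * (n - r) + Fraction(b_pow) * r
    return coefficient, power


# Fifth term (r = 4) of (2x - 3)^8, as in the working with n = 8, x = 2x, y = 3.
print(general_term(8, 4, 2, 1, -3, 0))    # coefficient 90720, power 4

# Example 5: middle (sixth) term, r = 5, of (x^(2/3) + x^(-1/2))^10.
print(general_term(10, 5, 1, Fraction(2, 3), 1, Fraction(-1, 2)))  # 252, power 5/6
```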
Now we shall see, with a solved example, how to find the coefficient of a given term in a binomial expansion.
In the present example n = 9, and we are required to find the coefficient of $x^5$.
Since $x^{n-r} = x^{5}$, we have n - r = 5, and therefore r = n - 5 = 9 - 5 = 4.
Therefore,
\[ T_5 = (-1)^{4}\, {}^{9}C_{4}\, x^{5}\, (2)^{4} = {}^{9}C_{4}\, x^{5}\, (2)^{4} \]
\[ {}^{9}C_{4} = \frac{9!}{4!\,5!} = \frac{9 \times 8 \times 7 \times 6 \times 5!}{4!\,5!} = \frac{9 \times 8 \times 7 \times 6}{1 \times 2 \times 3 \times 4} = 126 \]
Now we shall see how to solve problems involving terms which do not have x as a variable.
Example 7: The nth term in the expansion of $\left(3x - \dfrac{1}{3x}\right)^{30}$ is independent of x; find n.
Solution: Let $T_{r+1}$ be the term independent of x. The power of x in $T_{r+1}$ is (30 - r) - r = 30 - 2r, so
30 - 2r = 0, or r = 15
\[ T_{r+1} = T_{15+1} = T_{16}, \qquad \therefore n = 16 \]
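Finding which term carries a given power of x amounts to solving a small equation in r. The sketch below simply tries every r from 0 to n; `term_index_for_power` is a hypothetical helper name, and the second call assumes that the expression in the preceding coefficient example has the form (x - 2)^9, as its working (n = 9, second part equal to 2) suggests.

```python
from fractions import Fraction


def term_index_for_power(n, a_pow, b_pow, target_power):
    """Return r such that T_(r+1) of (a*x^a_pow + b*x^b_pow)^n has x^target_power.

    Returns None when no term of the expansion has that power of x.
    """
    for r in range(n + 1):
        power = Fraction(a_pow) * (n - r) + Fraction(b_pow) * r
        if power == Fraction(target_power):
            return r
    return None


# Example 7: (3x - 1/(3x))^30, term independent of x means power 0.
r = term_index_for_power(30, 1, -1, 0)
print(r, r + 1)        # r = 15, so the term is the 16th

# Coefficient example: power x^5 in an expansion with n = 9 (assumed (x - 2)^9).
print(term_index_for_power(9, 1, 0, 5))   # r = 4, so the term is T5
```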
1.4.2 Exponential Function
The power function is represented by the general form $y = x^{n}$, where n is any given number. A new function can be defined by taking the base of the power as a fixed number and the index as the variable. The function obtained by writing a variable power of a fixed number is called an exponential function. It can be written as $y = a^{x}$, where a is the fixed base of the function.
Functions of the type $y = e^{a+bx}$ are called exponential functions. Whenever exponential functions occur, the logarithm is taken to base e instead of to base 10. Usually tables give values of logarithms to base 10. The logarithm to base e of any number x, that is, $\log_e x$, can be obtained as follows:
\[ \log_e x = \frac{\log_{10} x}{\log_{10} e} = \log_{10} x \times \frac{1}{0.4342945} \]
\[ \log_e x = 2.3026\, \log_{10} x \]
To find the value of $\log_e 26$:
By applying the rule of changing base, $\log_{10} 26 = \log_e 26 \times \log_{10} e$
From the tables of common logs,
$\log_{10} 26 = 1.4150$,
Therefore, $1.4150 = \ln 26 \times 0.4343$
Note: $\log_e 26 = \ln 26$ and $\log_{10} e = 0.4343$
\[ \text{or } \ln 26 = \frac{1.4150}{0.4343}, \quad \text{or } \ln 26 = 3.2581 \]
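The change of base can be reproduced with the standard `math` module. In this minimal sketch the helper name `natural_log_from_common_log` is hypothetical; `math.log10`, `math.log` and `math.e` are standard-library names.

```python
import math


def natural_log_from_common_log(x):
    """Compute ln(x) from base-10 logarithms: ln x = log10(x) / log10(e)."""
    return math.log10(x) / math.log10(math.e)


print(round(math.log10(math.e), 7))                   # 0.4342945
print(round(1 / math.log10(math.e), 4))               # 2.3026
print(round(natural_log_from_common_log(26), 4))      # 3.2581
print(round(math.log(26), 4))                         # 3.2581, direct natural log
```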
Self-check Exercises
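4. Expand the following: $\left(\dfrac{2x}{3} - \dfrac{3y}{2}\right)^{5}$
5. Find the seventh term in $\left(1 - \dfrac{x}{2}\right)^{10}$
6. Find the middle term in the expansion of $\left(\dfrac{x}{a} - \dfrac{a}{x}\right)^{10}$
7. Find the coefficient of $x^{12}$ in the expansion of $\left(2x^{2} - \dfrac{1}{x}\right)^{12}$
8. The nth term in the expansion of $\left(5x - \dfrac{1}{5x}\right)^{30}$ is independent of x; hence find n.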
1.5 Ratios, Proportions and Rates
Basic data on population come from censuses, surveys, and vital registration records. All these give us large numbers classified according to some characteristics. These basic data are in actual or "absolute" numbers. Absolute numbers given by census reports include the total population, the male and female populations, the female population in reproductive age groups, and some other characteristics.
Ratio: Ratio is the term used to denote a/b, where a and b are two numbers. It indicates how many units of a there are per unit of b. To express a ratio as a whole number, it is generally multiplied by a constant factor k, which may be 100, 1000, 10,000 or any other power of 10. It is important to note that a ratio involves only one "Universe or Population"; that is, both numerator and denominator are derived from the same source. For example, the college-going population of Maharashtra state out of the college-going population of India.
Proportion: The second type of ratio is the "proportion". It is represented by a/(a+b), where a and b are numbers obtained from the same source. For example, the proportion single by age group, the proportion of workers out of total workers, the proportion of children immunized out of total children, the masculinity ratio, etc. Besides these, some other ratios are also computed.
Rate: A rate is a special type of ratio used to indicate the relative frequency of the occurrence of
a particular event within a population or sub-population in a specified period of time, usually
one year. Although this usage is recommended, the term has steadily acquired a wider meaning
and is often incorrectly used as a synonym for ratio. For example, percentage of population
literate is often termed as literacy rate.
Rate is defined as (a/b) × K, where a and b are derived from two different sources. For example, while computing the crude birth rate, the number of births in the numerator is obtained from vital statistics registration records, and the mid-year population in the denominator is obtained from censuses. The following are some examples of rates used in population analysis: general fertility rate, age-specific marital fertility rate, total fertility rate, age cumulative fertility rate, crude death rate, age-specific death rate, infant mortality rate, and maternal mortality rate.
1.6 Arithmetic, Geometric, and Exponential Rates of Population Growth
The change in population is measured by the annual rate of growth. The rate of
population growth can be measured in two ways. One is to find the difference between the
numbers of people present at two different dates (as absolute number), and from this to calculate
the annual rate of change during the intervening period (a relative number); the other is to
reckon the rate of change from the records of individual changes as they occurred-births, deaths,
and migration - based on vital statistics. Here we are concerned with the first approach.
Now let us discuss these different rates of growth in detail one by one.
(i) Arithmetic rate of growth or Linear Growth Function: The linear growth function is applied to estimate the intercensal population only when the population is growing by a constant amount. The linear growth function, or arithmetic rate of growth, is given by the equation:
\[ P_{t_2} = P_{t_1}(1 + rt) \]
where r is the annual rate of growth and $t = t_2 - t_1$ is the time interval in years.
Let us discuss the computational procedure of this method with the following example:
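Example 8: The population of a country was 846 million in 1991 and 1027 million in 2001. Assuming that the annual increase in population was constant during 1991-2001, estimate the population in 1997. If the growth continues at the same rate, estimate the population in 2004. Also calculate the average annual growth rate during 1991-2001.
Solution: From $P_{t_2} = P_{t_1}(1 + rt)$, with $P_{t_1} = 846$, $P_{t_2} = 1027$ and t = 10,
\[ \therefore r = \frac{P_{t_2} - P_{t_1}}{P_{t_1}\, t} = \frac{1027 - 846}{846 \times 10} = 0.0213947 \qquad \ldots(i) \]
Substituting $P_{1991} = 846$, r = 0.0213947 and t = 6, which is the time interval in years between 1991 and 1997, we have
\[ P_{1997} = 846\,(1 + 0.0213947 \times 6) \approx 954.6 \text{ million} \]
Assuming that the growth rate r is constant, the population in 2004 (t = 13) can be estimated as:
\[ P_{2004} = 846\,(1 + 0.0213947 \times 13) \approx 1081.3 \text{ million} \]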
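The arithmetic (linear) growth calculation of Example 8 can be sketched in code. The function names below are hypothetical and the figures are those given in the example (populations in millions); the rounded results are this sketch's own arithmetic from the linear formula.

```python
def arithmetic_rate(p1, p2, t):
    """Annual arithmetic growth rate r from P2 = P1 * (1 + r * t)."""
    return (p2 - p1) / (p1 * t)


def arithmetic_projection(p1, r, t):
    """Population t years after the base population p1 under linear growth."""
    return p1 * (1 + r * t)


p_1991, p_2001 = 846.0, 1027.0                       # millions, from Example 8
r = arithmetic_rate(p_1991, p_2001, 10)
print(round(r, 7))                                   # about 0.0213948
print(round(arithmetic_projection(p_1991, r, 6), 1))   # 1997 estimate: 954.6
print(round(arithmetic_projection(p_1991, r, 13), 1))  # 2004 estimate: 1081.3
```

Under linear growth the population increases by the same absolute amount each year (here 18.1 million), so the 1997 figure is simply 846 plus six such increments.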
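(ii) Geometric rate of growth: The geometric rate of growth of population can be used to estimate the intercensal population when the successive ratios of population are constant. The equation for the geometric rate of growth is given by:
\[ P_{t_2} = P_{t_1}(1 + r)^{t} \]
Taking logarithms and solving for r,
\[ 1 + r = \operatorname{Antilog}\left[\frac{\log P_{t_2} - \log P_{t_1}}{t}\right] \]
\[ \therefore r = \operatorname{Antilog}\left[\frac{\log P_{t_2} - \log P_{t_1}}{t}\right] - 1 \]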
Example 9: Given below is the population of Rajasthan state in India for the census years 1991 and 2001. Calculate the annual rate of growth of population assuming the geometric law of growth, and estimate the population 5 years after the 1991 census.
Solution: The formula for computing the annual rate of geometric growth is given as:
\[ r = \operatorname{Antilog}\left[\frac{\log P_{t_2} - \log P_{t_1}}{t}\right] - 1 \]
Pt2 = 56,473,122
Pt1 = 44,005,900
t = 10 years
\[ r = \operatorname{Antilog}\left[\frac{\log 56{,}473{,}122 - \log 44{,}005{,}900}{10}\right] - 1 = 0.0252576 \]
Hence the population 5 years after 1st March 1991 is given by:
\[ P_{1996} = P_{1991}\,(1 + 0.0252576)^{5} = 44{,}005{,}900\,(1 + 0.0252576)^{5} = 49{,}851{,}232 \]
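The geometric growth calculation of Example 9 can be sketched as follows. The function names are hypothetical. The rate is computed both through base-10 logarithms, mirroring the antilog formula, and directly as a tenth root; small differences from the printed figures in the last digits are to be expected because of rounding in hand computation.

```python
import math


def geometric_rate(p1, p2, t):
    """Annual geometric growth rate r from P2 = P1 * (1 + r)^t."""
    return (p2 / p1) ** (1 / t) - 1


def geometric_rate_via_logs(p1, p2, t):
    """Same rate through the antilog formula: 10^((log P2 - log P1) / t) - 1."""
    return 10 ** ((math.log10(p2) - math.log10(p1)) / t) - 1


def geometric_projection(p1, r, t):
    """Population t years after the base population p1 under geometric growth."""
    return p1 * (1 + r) ** t


p_1991, p_2001 = 44_005_900, 56_473_122          # Rajasthan, from Example 9
r = geometric_rate(p_1991, p_2001, 10)
print(round(r, 6))                               # close to 0.025258
print(round(geometric_projection(p_1991, r, 5))) # close to the 49.85 million in the text
```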
Example 10: If Sri Lanka's population is growing at the rate of 2 percent per annum, find (i)
time required to double the population; (ii) also find the rate of growth, which will double the
population in 20 years.
Solution:
Hence to double the population in 20 years the required rate of growth is 3.5 percent per annum.
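Doubling-time questions of this kind can be worked out as sketched below. The example does not say which growth law is to be used, so this sketch shows both the geometric law, 2 = (1 + r)^t, and the exponential law, 2 = e^(rt), as labelled assumptions; the function names are hypothetical.

```python
import math


def doubling_time_geometric(r):
    """Years needed to double when P_t = P_0 * (1 + r)^t."""
    return math.log(2) / math.log(1 + r)


def doubling_time_exponential(r):
    """Years needed to double when P_t = P_0 * e^(r t)."""
    return math.log(2) / r


def rate_to_double_geometric(years):
    """Annual geometric rate that doubles the population in the given years."""
    return 2 ** (1 / years) - 1


def rate_to_double_exponential(years):
    """Annual exponential rate that doubles the population in the given years."""
    return math.log(2) / years


# (i) Doubling time at 2 percent per annum.
print(round(doubling_time_geometric(0.02), 1))      # about 35 years
print(round(doubling_time_exponential(0.02), 1))    # slightly shorter

# (ii) Rate needed to double in 20 years, in percent.
print(round(rate_to_double_geometric(20) * 100, 1))    # 3.5
print(round(rate_to_double_exponential(20) * 100, 1))  # 3.5
```

Both laws give about 3.5 percent per annum for part (ii), in agreement with the result stated in the example.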
(iii) Exponential Rate of Growth: The equation for the exponential rate of growth is given as
\[ P_{t_2} = P_{t_1}\, e^{rt} \]
Where,
Pt2 is the size of population at time t2;
Pt1 is the size of population at time t1;
r is the exponential rate of growth at which the population is increasing between the time
periods t1 and t2; and
t = t2-t1 = the time interval between Pt2 and Pt1.
Example 11: The scheduled tribe populations of Andhra Pradesh for the census years 1981 and 1991 were 3,176,001 and 4,199,481 respectively. Calculate the annual rate of growth of the scheduled tribe population assuming the exponential law of growth, and estimate the scheduled tribe population in 1996.
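\[ \frac{P_{t_2}}{P_{t_1}} = e^{rt} \]
\[ \ln\frac{P_{t_2}}{P_{t_1}} = r \times t \times \ln e \]
\[ r = \frac{\ln\dfrac{P_{t_2}}{P_{t_1}}}{t}; \qquad (\ln e = 1) \]
\[ r = \frac{\ln\dfrac{4199481}{3176001}}{10} = \frac{\ln(1.3222543)}{10} = 0.0279338 = 2.8\,\% \]
Therefore, the annual rate of exponential growth is 2.8 per cent per annum.
Now,
\[ P_{96} = P_{91} \times e^{rt} \]
\[ = 4{,}199{,}481 \times e^{0.0279338 \times 5} \]
\[ = 4{,}199{,}481 \times e^{0.139669} \]
\[ = 4{,}199{,}481 \times 1.14989 \]
\[ = 4{,}828{,}954 \]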
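The exponential growth calculation of Example 11 can be sketched as follows. The function names are hypothetical, and `math.log` and `math.exp` are the natural logarithm and exponential from the standard library.

```python
import math


def exponential_rate(p1, p2, t):
    """Annual exponential growth rate r from P2 = P1 * e^(r t)."""
    return math.log(p2 / p1) / t


def exponential_projection(p1, r, t):
    """Population t years after the base population p1 under exponential growth."""
    return p1 * math.exp(r * t)


p_1981, p_1991 = 3_176_001, 4_199_481      # Andhra Pradesh, from Example 11
r = exponential_rate(p_1981, p_1991, 10)
print(round(r, 7))                         # about 0.0279338, i.e. 2.8 percent
print(round(exponential_projection(p_1991, r, 5)))  # 1996 estimate, about 4.83 million
```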
1.7 Estimation of Mid-Year Population
In demography while computing the crude rates, such as crude birth rate or crude death
rate, the denominator refers to mid-year population of that area for that year. This mid-year
population is more often called the person years lived by the population during the year under
question. This is essential because the numerator is a record of events over a period of 12
months; in other words, the sum of all events that occur in a year. In order to relate this to the
denominator, population should also be counted over a year. This is achieved by obtaining the
number of person years lived by the population or the population count at the middle of the year.
These two concepts are not identical but in most cases they are equivalent.
(i) If the last census falls within the same year for which the vital rate is required, the census population figure can be taken.
(ii) If the two censuses are conducted with a gap of one year, the mid-year population is calculated by adding half of the difference in population to the earlier figure, assuming that the increment or decrement in population size is uniform during the intercensal period.
(iii) If the date of the estimate lies between two censuses more than one year apart (generally 5 or 10 years), it is still possible to estimate the mid-year population by using the formula given below:
\[ P = P_1 + \frac{n}{N}\,(P_2 - P_1) \]
Where P is the mid-year population to be estimated.
P1 is the initial population at the first census.
P2 is the final population at the second census.
N is the number of months between the two censuses.
n is the number of months between the date of P1 and the date of estimation.
Example 12: The population of Himachal Pradesh was 5,170,877 in 1991 and 6,077,248 in 2001. Compute the mid-year population in 1996.
\[ P_t = P_{t_1} + \frac{n}{N}\,(P_{t_2} - P_{t_1}) \]
Here $P_{t_1} = P_{1991} = 5{,}170{,}877$; $P_{t_2} = P_{2001} = 6{,}077{,}248$; n = 1996 - 1991 = 5; N = 2001 - 1991 = 10 (both measured here in years; only the ratio n/N matters).
\[ \therefore P = 5{,}170{,}877 + \frac{5}{10}\,(6{,}077{,}248 - 5{,}170{,}877) = 5{,}624{,}062.5 \approx 5{,}624{,}063 \]
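The mid-year formula is a linear interpolation between two census counts and can be sketched in a few lines. The function name is hypothetical; n and N may be given in months or in years, as long as both use the same unit.

```python
def interpolated_population(p1, p2, n, N):
    """Linear estimate between two censuses: P = P1 + (n / N) * (P2 - P1).

    p1, p2 : populations at the first and second census
    N      : time between the two censuses
    n      : time between the first census and the date of the estimate
    """
    return p1 + (n / N) * (p2 - p1)


# Example 12: Himachal Pradesh, censuses of 1991 and 2001, estimate for 1996.
print(interpolated_population(5_170_877, 6_077_248, 5, 10))    # 5624062.5

# The same estimate with the intervals expressed in months.
print(interpolated_population(5_170_877, 6_077_248, 60, 120))  # 5624062.5
```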
Self-Check Exercises
9. The scheduled tribe populations of Andhra Pradesh for the census years 1971 and 1981 were 1,657,657 and 3,176,001 respectively. Calculate the annual rate of growth of the scheduled tribe population assuming the arithmetic law of growth during 1971-81. If the population continues to grow at the arithmetic rate, estimate the population in 1986.
10. Given below is the population of Bihar, an Indian State, for the census years 1951 to 1981. Calculate the annual geometric growth rate of the population of Bihar during the census decades 1951-1961, 1961-1971, and 1971-1981. Also estimate the population of Bihar for the years 1956, 1966, 1976, and 1991 assuming the geometric law of growth for the corresponding decades.
Population
Year     1951          1961          1971          1981
Bihar    38,782,271    46,447,547    56,353,369    69,914,734
11. If India's population is growing at the rate of 2.23 per cent per annum, find the time
required to double the population and also find the rate of growth, which will double
the population in 17 years.
12. The population of a country on 30th June 1960 was 1.9 million, and on 30th June 1970 it was 2.4 million. Find (i) the exponential rate of growth between 1960 and 1970; (ii) the estimated population on 30th June 1968, assuming an exponential rate of growth; and (iii) the time at which the population would be double that of the 1970 value.
13. The population of Sri Lanka was 12,689,897 in October 1971 and 14,850,001 in March 1981. Compute the October 1975 population by using the formula for mid-year population.
14. The mid-year population of a region in 1981 was 19,896,843. If the number of births during the year 1981 was 656,596, compute the crude birth rate for the year 1981.
Let Us Sum Up
After completing this unit, you would have learned about the following:
* If a certain act A1 can be performed in m1 different ways and another act A2 can be performed
in m2 different ways then the total number of ways in which either A1 or A2 can be performed
is m1 + m2. Thus, for example, if there are 5 mathematics books and 4 physics books, and if a
boy is to choose either a mathematics book or a physics book he can do so in 5+4=9 ways.
However, if a certain act A1 can be performed in m1 different ways, and having performed it in
any one of these m1 ways, another act A2 can be performed in m2 different ways then the two
acts, A1 and A2 can be performed in the stated order in m1 x m2 ways.
* Let there be n different objects, which are to be arranged in a line, taking only r of them (0 < r ≤ n) at a time. Each possible arrangement in a line of r objects is called a Permutation of n objects taken r at a time. The total number of such arrangements is denoted by nPr or P(n,r).
nPr = n(n-1)(n-2)..........(n-r+1)
* Let there be n different objects out of which r (where 0 < r ≤ n) are to be chosen at a time. A group of r objects selected out of the n objects, without reference to the order of selection, is called a combination of n objects taken r at a time. The total number of such combinations is denoted by nCr or $\binom{n}{r}$ or C(n,r).
* If n is a positive integer and x, y are any two numbers, then the Binomial theorem states that:
\[ (x+y)^n = \sum_{r=0}^{n} {}^{n}C_{r}\, x^{n-r}\, y^{r} \]
* Population growth rates are usually computed on the basis of the formulae
\[ P_{t_2} = P_{t_1}(1 + rt) \ \text{(arithmetic)}, \qquad P_{t_2} = P_{t_1}(1 + r)^{t} \ \text{(geometric)}, \qquad P_{t_2} = P_{t_1}\, e^{rt} \ \text{(exponential)} \]
* The mid-year population is more often called the person years lived by the population during
the year under question. This is essential because the numerator is a record of events over a
period of 12 months; in other words, the sum of all events that occur in a year. In order to
relate this to the denominator, population should also be counted over a year. This is achieved
by obtaining the number of person years lived by the population or the population count at the
middle of the year.
* If the date of the estimate lies between two censuses more than one year apart (generally 5 or 10 years), it is still possible to estimate the mid-year population by using the formula given below:
\[ P_t = P_{t_1} + \frac{n}{N}\,(P_{t_2} - P_{t_1}) \]
Model Answers
2. 1365
Hint: Here n=15, r=11, and the players are to be selected without any importance being given to the order of selection.
3. 560 ways
Hint: Here we have to choose 3 men out of 8 and 2 women out of 5. This can be
done in 8C3 and 5C2 ways respectively. Hence by rule 2 the committee can
be formed by 8C3 x 5C2 ways.
4. $\dfrac{32}{243}x^{5} - \dfrac{40}{27}x^{4}y + \dfrac{20}{3}x^{3}y^{2} - 15x^{2}y^{3} + \dfrac{135}{8}xy^{4} - \dfrac{243}{32}y^{5}$
Hint: Use the expansion of $(x-y)^n$ and substitute x = 2x/3, y = 3y/2 and n = 5.
5. $\dfrac{105}{32}\,x^{6}$
8. n = 16
Hint: Assume that Tr+1 be the term independent of x. Find r and then Tr+1=Tn,
therefore n = r+1.
9. r = 0.091596
P86 = 4,630,546
Hint: Follow the solved example No. 10.
13. 13,599,409
Unit 2: Interpolation and Graduation
Unit Structure
2.0 Objectives
2.1 Introduction
2.2 Methods of Interpolation
2.3 Uses of Interpolation Formula
2.4 Limitations of Interpolation [Self-Check Exercises]
2.5 Graduation
2.6 Methods of Graduation
2.7 Osculatory Interpolation
2.7.1 Modified Osculatory Interpolation
2.7.2 Comparison and Selection of Osculatory Interpolation Formulas
[Self-Check Exercises]
Let Us Sum Up
Model Answers
2.0 Objectives
2.1 Introduction
Interpolation is the process of finding the value of the function (y) for any value of the independent variable (x) within a given range of values of x, and extrapolation is the process of finding the value outside the given range of x.
If two variables are connected by a known relation, for each value of one variable there is a corresponding value of the other. For example, with the expression $y = x^2 + 3x + 2$, if x = 2 then y = 12; this can be determined directly by substituting the value of x and solving for y. In many cases, however, the relationship connecting the two variables is unknown and usually only pairs of values of x and y are given. If one wishes to estimate the value of the dependent variable y for specific values of the independent variable x between the given points, one must establish an empirical relationship between the two variables. One way of doing this would be to fit a curve through the data, then to substitute the known value of the independent variable x into the formula for the curve and solve for the unknown value of the dependent variable. Frequently, a curve that adequately fits the entire distribution of data cannot be easily fitted. Moreover, it requires a large investment of time to fit a curve to an entire distribution in order to read off the values at a few intermediate points for which information is not given. Instead, we interpolate between the known points, taking into account several adjacent values to estimate the shape of the curve around the point of interpolation.
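The difference between evaluating a known relation and interpolating between known points can be illustrated with the expression used above. The following is a minimal sketch under a stated assumption: it uses the simplest possible rule, a straight line between two adjacent known points, which is only one of many interpolation methods; the function names are hypothetical.

```python
def known_relation(x):
    """The known relation from the text: y = x^2 + 3x + 2."""
    return x ** 2 + 3 * x + 2


def linear_interpolation(x0, y0, x1, y1, x):
    """Estimate y at x from the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Direct substitution when the relation is known.
print(known_relation(2))     # 12

# Suppose only the points at x = 2 and x = 4 were known, and y at x = 3 is wanted.
estimate = linear_interpolation(2, known_relation(2), 4, known_relation(4), 3)
print(estimate)              # 21.0, the interpolated value
print(known_relation(3))     # 20, the true value
```

The interpolated value differs from the true value because the underlying curve is not a straight line; this is why methods that take several adjacent points into account are usually preferred.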
Interpolation is carried out on the basis of two assumptions: (i) the quantity changes continuously without any break or sudden jumps, and (ii) the rate of change (rise or fall) is uniform and there are no sudden jumps in the data. In other words, the data are assumed to follow a continuous or smooth curve. Suppose, for example, that we are interpolating the population of India in the year 1955 and that we are given the figures of the Indian population for the years 1941, 1951, 1961, 1971 and 1981. Our presumption would be that the population has grown smoothly and that there are no violent ups and downs in these figures. We also assume that the rate of growth of the Indian population has been uniform throughout the period 1941 to 1981.
The accuracy of the interpolated figures actually depends on two factors: (i) the knowledge of the possible fluctuations of the figures and (ii) the knowledge about the course of events relating to the problem under investigation. If the assumptions of interpolation are not fulfilled, the interpolated figures would be fictitious. Interpolated figures are not a perfect substitute for the original figures. They are only the best possible estimates under certain assumptions.
Broadly speaking there are two types of methods of interpolation. They are:
(1) Graphic method
(2) Algebraic methods.
I. Graphic Method: The graphic method is applicable to all types of data. According to this
method, the given values are plotted on a graph and joined by straight lines. The line so
obtained is then smoothed. From the smoothed curve it is possible to determine the value of y for
any x within the given limits. Graphs are useful for deriving rough estimates for the subdivision
of grouped data as well as for estimating values in a point series. They are especially useful
when the grouped data are unevenly spaced.
[Figure: population (in millions) on the y axis plotted against year on the x axis]
For example, suppose the population figures for the years 1931, 1941, 1951, 1961, 1971 and
1981 are available and it is desired to find the population for the years 1956 and 1966. Take
years along the x axis and represent the population along the y axis. Draw a continuous smooth
curve connecting all the plotted points. To interpolate the population figures for the years 1956
and 1966, first locate these years on the x axis. From these points, draw two ordinates up to the
curve, and from the points where the ordinates meet the curve, draw horizontal lines to the y
axis. The values read off on the y axis are the interpolated figures for the years 1956 and 1966.
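II. Algebraic Methods: In a situation where one quantity changes continuously and regularly
and another quantity changes in relation to it, we can estimate an unknown intermediate value of
the second quantity corresponding to a given value of the first by using algebraic methods. The
important methods of interpolation discussed in this unit are (a) linear interpolation,
(b) Newton's Forward Difference Formula, (c) Newton's Backward Difference Formula, (d) Newton's
Divided Difference Formula and (e) Lagrange's Formula.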
The interval of interpolation can be either one of time or a class interval within a
compositional classification. Thus one can interpolate between two censuses or between two ages
of an age classification for a given census. Interpolation must frequently be used in order to
make data comparable. If two distributions have been tabulated for unlike age intervals, it is
possible to make the intervals equal by interpolation, so that the distributions become
approximately comparable. Similarly, if census dates fall in different years, interpolation makes
it possible to arrive at population estimates for common years.
Example 1: Below are the mortality rates for India at specified ages based on 1961-71 deaths.
Find the estimate of the mortality rate for age 32.
Age    Mortality rate
25     4.38
30     5.26
35     6.74
40     9.14
45     12.82
To find the mortality rate for age 32, we apply the linear interpolation formula given below.
$$u_x = u_0 + \frac{x}{h}\,(u_1 - u_0)$$
where $u_0 = 5.26$, $u_1 = 6.74$, $x = 2$ and $h = 5$. Hence
$$u_{32} = 5.26 + \frac{2}{5}\,(6.74 - 5.26) = 5.26 + 0.592 = 5.852$$
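The linear interpolation of Example 1 can be sketched in a few lines of Python. The function name `linear_interpolate` and its argument names are illustrative choices, not part of the text; the sketch simply evaluates the formula above with ages 30 and 35 as the two known points.

```python
def linear_interpolate(x, x0, x1, u0, u1):
    """Estimate u at x from two known points (x0, u0) and (x1, u1)."""
    h = x1 - x0                      # width of the interval between the known points
    return u0 + (x - x0) / h * (u1 - u0)

# Example 1: mortality rates at ages 30 and 35, estimate wanted for age 32
u32 = linear_interpolate(32, 30, 35, 5.26, 6.74)
print(round(u32, 3))                 # 5.852
```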
(b) Newton's Forward Difference Formula: Simple (linear) interpolation will often not be adequate,
as most population distributions are not linear. Usually they show marked curvilinearity. It is
true, however, that simple interpolation can be used with a curvilinear distribution, provided
the interval of interpolation is small enough. Unfortunately, most population distributions are
tabulated in broad intervals that permit little use of linear interpolation. Hence more exact
techniques are needed: Newton's Forward Difference Formula is applied when the values of the
function are given at equal intervals, and Newton's Divided Difference or Lagrange's formula is
used for unequal intervals. For applying Newton's formula, it is necessary to have some
elementary knowledge of finite differences and the construction of difference tables. The next
section is devoted to these concepts, the construction of a difference table, and its
applications.
Method of Finite Differences: Let $u_x$ represent the value of u for a specified value of x in a
distribution. The known values of $u_x$ may be represented by $u_0, u_1, u_2, \ldots$ etc. That
is, the values of x for which $u_x$ is known may be assigned the integers 0, 1, 2, ... etc.
If the underlying relationship between two variables is linear, then a change of one unit in
one variable is accompanied by a fixed amount of change in the other variable. This means that
equal changes in x are accompanied by equal changes in u. Hence, it is readily apparent that
linear interpolation assumes that the first differences are constant, i.e. of equal size.
Suppose first-order differences are taken for all successive known values of u. It is then
possible to take the differences of the successive pairs of first differences. These are termed
'second order differences' or simply 'second differences' and are represented by the symbol
$\Delta^2_x$. The superscript signifies that the differences are of second order and the
subscript specifies which first order differences have been subtracted from each other. For
example,
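$$\Delta^2_0 = \Delta^1_1 - \Delta^1_0$$
$$\Delta^2_1 = \Delta^1_2 - \Delta^1_1$$
$$\Delta^2_2 = \Delta^1_3 - \Delta^1_2$$
Also, $\Delta^2_0$ is pronounced "delta-zero-two".
Similarly, the third, fourth, ..., nth differences are denoted by $\Delta^3_x, \Delta^4_x,
\ldots, \Delta^n_x$.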
Differences of higher order are used to interpolate within curvilinear distributions.
It is convenient to introduce alternative names for x and y in the equation $y = u_x$. The
independent variable x is often termed the argument and the corresponding value of y the entry.
Difference Table: The table given below illustrates the construction of a difference table for
the equation $y = u_x$.
Table 2.1: Difference table
x    u_x    Δ^1_x    Δ^2_x    Δ^3_x    Δ^4_x
0    u0
            Δ^1_0
1    u1              Δ^2_0
            Δ^1_1             Δ^3_0
2    u2              Δ^2_1             Δ^4_0
            Δ^1_2             Δ^3_1
3    u3              Δ^2_2             Δ^4_1
            Δ^1_3             Δ^3_2
4    u4              Δ^2_3             Δ^4_2
            Δ^1_4             Δ^3_3
5    u5              Δ^2_4
            Δ^1_5
6    u6
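The first term, $u_0$, in Table 2.1 is called the leading term and the differences at the head
of the respective columns, namely $\Delta^1_0, \Delta^2_0, \Delta^3_0, \Delta^4_0$, are called
the leading differences.
Although we have expressed the terms in the difference table by the use of Δ symbols, it is
quite easy to obtain any difference in terms of the function values alone.
For example, $\Delta^3_0$ is the difference between $\Delta^2_1$ and $\Delta^2_0$, i.e.,
$\Delta^3_0 = \Delta^2_1 - \Delta^2_0$. Again, $\Delta^2_0$ is the difference between
$\Delta^1_1$ and $\Delta^1_0$, i.e., $\Delta^2_0 = \Delta^1_1 - \Delta^1_0$, and
$\Delta^1_0 = u_1 - u_0$. Hence
$$\Delta^3_0 = (u_3 - u_2) - 2(u_2 - u_1) + (u_1 - u_0) = u_3 - 3u_2 + 3u_1 - u_0$$
Similarly,
$$\Delta^3_1 = (u_4 - u_3) - 2(u_3 - u_2) + (u_2 - u_1) = u_4 - 3u_3 + 3u_2 - u_1$$
Thus,
$$\Delta^4_0 = (u_4 - 3u_3 + 3u_2 - u_1) - (u_3 - 3u_2 + 3u_1 - u_0) = u_4 - 4u_3 + 6u_2 - 4u_1 + u_0$$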
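Conversely, each value of u can be expressed in terms of the leading term and the leading
differences:
$$u_0 = u_0$$
$$u_1 = u_0 + \Delta^1_0$$
$$u_2 = u_1 + \Delta^1_1 = (u_0 + \Delta^1_0) + (\Delta^2_0 + \Delta^1_0) = u_0 + 2\Delta^1_0 + \Delta^2_0$$
$$u_3 = u_2 + \Delta^1_2 = (u_0 + 2\Delta^1_0 + \Delta^2_0) + (\Delta^2_1 + \Delta^1_1)$$
$$\quad = (u_0 + 2\Delta^1_0 + \Delta^2_0) + (\Delta^3_0 + \Delta^2_0) + (\Delta^2_0 + \Delta^1_0)$$
$$\quad = u_0 + 3\Delta^1_0 + 3\Delta^2_0 + \Delta^3_0$$
Proceeding in a similar fashion you can compute $u_4, u_5, u_6, \ldots$ etc.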
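and, in general, writing $\Delta^k u_0$ for the leading difference of order k,
$$u_n = u_0 + \binom{n}{1}\Delta^1 u_0 + \binom{n}{2}\Delta^2 u_0 + \cdots + \binom{n}{n-r}\Delta^{n-r} u_0 + \cdots + \binom{n}{n}\Delta^n u_0 \qquad (2.3)$$
For a value of x that lies between the tabulated arguments, the integer n is replaced by the
fraction z, which gives
$$u_x = u_0 + z\,\Delta^1 u_0 + \frac{z(z-1)}{1\cdot 2}\,\Delta^2 u_0 + \frac{z(z-1)(z-2)}{1\cdot 2\cdot 3}\,\Delta^3 u_0 + \cdots$$
where $z = \dfrac{x - x_0}{h}$,
h = difference between two adjoining points,
x = value of the argument for which the entry is to be interpolated,
$x_0$ = point of origin.
This important equation is called 'Newton's Forward Difference Formula'.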
Example 2: The following table contains pairs of values satisfying the equation $u_x = 1 + x$,
with differences of various orders.
x    u_x    Δ^1_x    Δ^2_x
0    1
            1
1    2               0
            1
2    3               0
            1
3    4               0
            1
4    5
The first order differences are obtained by subtracting successively each value of $u_x$
from the value of $u_x$ immediately below it.
The second order differences are obtained by performing a similar subtraction on the
first order differences. If the relationship between $u_x$ and x is linear, then the first
differences are constant and the second differences are zero.
Example 3: The following table contains pairs of values satisfying the equation
$u_x = 1 + x + x^2$, with differences of various orders.
Example 4: The differencing process is now applied to a function that is even more complex:
$u_x = x^4 + x^3 + 5x + 4$
From Examples 2 to 4 we observe that:
(i) In the linear equation, the first differences of $u_x$ were equal and the second
differences were zero.
(ii) In the second-degree equation (highest term $x^2$), the second differences were equal and
the third differences were zero.
(iii) In the fourth-degree equation (highest term $x^4$), the fourth differences were equal and
the fifth differences were zero.
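The three observations can be checked numerically. The sketch below builds the columns of a difference table for the functions of Examples 2 to 4; the helper name `difference_table` and the choice of arguments x = 0, 1, ..., 6 are assumptions made for illustration only.

```python
def difference_table(values):
    """Return [values, first differences, second differences, ...]."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        # each entry is subtracted from the entry immediately below it
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns

xs = range(7)                                            # assumed arguments 0..6
linear = difference_table([1 + x for x in xs])
quadratic = difference_table([1 + x + x**2 for x in xs])
quartic = difference_table([x**4 + x**3 + 5*x + 4 for x in xs])

print(linear[1], linear[2])        # constant first differences, zero second differences
print(quadratic[2], quadratic[3])  # constant second differences, zero third differences
print(quartic[4], quartic[5])      # constant fourth differences, zero fifth differences
```

For a polynomial of degree n the differences of order n come out constant and those of order n+1 vanish, which is the property the interpolation formulas rely on.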
Newton's Forward Difference Formula (NFDF): Newton's forward formula is applied when the values
of the independent variable advance by equal intervals. Given the first row of differences, it
is possible to reproduce all other differences simply by successively adding adjacent pairs of
values together and placing the total under the left entry of the pair.
The formula provides the basis for a method of interpolation that permits the assumption of a
curvilinear relationship. Linear interpolation, in fact, considers only two successive values of
$u_x$ and uses the linear relationship that reproduces these two values to estimate $u_x$ for
any intermediate (fractional) value of x. However, if we consider three adjacent values
($u_0, u_1, u_2$), there is a polynomial of the second degree in x which will reproduce these
three values. Similarly, there is a polynomial of the fourth degree in x which will reproduce
five adjacent values of $u_x$ ($u_0, u_1, u_2, u_3, u_4$).
x      u_x        First difference                     Second difference
3      u_3
                  u_4 - u_3 = Δ^1 u_3                  :
4      u_4                                             :
:      :          :                                    :
n-2    u_{n-2}
                  u_{n-1} - u_{n-2} = Δ^1 u_{n-2}
n-1    u_{n-1}                                         Δ^1 u_{n-1} - Δ^1 u_{n-2} = Δ^2 u_{n-2}
                  u_n - u_{n-1} = Δ^1 u_{n-1}
n      u_n
Values of $u_x$ may be estimated for fractional values of x within the range of the known pairs
of values by substituting the appropriate values of x in NFDF. Hence it is possible to
interpolate between two known values of $u_x$ even though the relation between $u_x$ and x is
curvilinear around the point of interpolation.
Example 5: The age specific mortality rates for India at specified ages for 1961-71 are given
below. Estimate the mortality rate for age 32.
Age    Age specific mortality rate
25     4.38
30     5.26
35     6.74
40     9.14
45     12.82
50     18.18
Let us form the difference table.
Table 2.6: Difference table
Age x    u_x      Δ^1_x    Δ^2_x    Δ^3_x    Δ^4_x
25       4.38
                  0.88
30       5.26              0.60
                  1.48              0.32
35       6.74              0.92              0.04
                  2.40              0.36
40       9.14              1.28              0.04
                  3.68              0.40
45       12.82             1.68
                  5.36
50       18.18
where $h = 5$ and $z = \dfrac{x - x_0}{h} = \dfrac{32 - 30}{5} = 0.4$
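A minimal Python sketch of Newton's forward difference formula applied to the data of Example 5 follows. It assumes, as the value z = (32 - 30)/5 implies, that age 30 is taken as the origin and that all differences available from ages 30 to 50 are used; the function names are illustrative and not part of the text.

```python
def leading_differences(values):
    """Leading term followed by the first entry of each difference column."""
    lead, column = [], list(values)
    while column:
        lead.append(column[0])
        column = [b - a for a, b in zip(column, column[1:])]
    return lead

def newton_forward(x, x0, h, values):
    """Newton's forward difference estimate of u at x.

    `values` are the entries at x0, x0 + h, x0 + 2h, ... (equal intervals).
    """
    z = (x - x0) / h
    estimate, coeff = 0.0, 1.0
    for k, diff in enumerate(leading_differences(values)):
        estimate += coeff * diff
        coeff *= (z - k) / (k + 1)   # builds z(z-1)...(z-k) / (k+1)!
    return estimate

rates_from_30 = [5.26, 6.74, 9.14, 12.82, 18.18]   # ages 30, 35, 40, 45, 50
u32 = newton_forward(32, 30, 5, rates_from_30)
print(round(u32, 3))                                # about 5.763
```

Under these assumptions the estimate is a little lower than the linear interpolation of Example 1, because the second-difference term corrects for the upward curvature of the mortality curve.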
As regards the choice of the set of u's to be used in interpolation, we should try to keep the
value sought as central as possible to the set of u's employed. As regards the question of how
many differences have to be used, usually, but not always, using higher orders of differences,
and therefore fitting a curve to more known points of the distribution, will increase the
accuracy of an interpolation. The highest order difference used in an interpolation implies a
curve of a specified form. For much demographic work, carrying the interpolation beyond the
fourth order difference will not greatly improve the accuracy of the result if it is assumed
that the basic relationship between the two variables involved is approximated by a third or
fourth degree polynomial.
(c) Newton's Backward Difference Formula (NBDF): For data given at equal intervals of x,
if we have to find the value of the function for a value of x near the bottom of the table, we
use Newton's Backward Difference Formula. If there are n arguments and n corresponding
entries, Newton's backward formula for the entry $u_x$ to be interpolated for the argument x is
$$u_x = u_n + z\,\Delta^1_n + \frac{z(z+1)}{1\cdot 2}\,\Delta^2_n + \frac{z(z+1)(z+2)}{1\cdot 2\cdot 3}\,\Delta^3_n + \cdots$$
where $z = \dfrac{x - x_n}{h}$, $h = x_1 - x_0$, and $\Delta^1_n, \Delta^2_n, \ldots$ are the
differences occurring at the bottom of each column of the difference table.
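The backward formula can be sketched in the same way as the forward one, taking the last entry of each difference column instead of the first. The function names are illustrative, and the target age 47 is a hypothetical choice used only to show a point near the bottom of the mortality table of Example 5.

```python
def trailing_differences(values):
    """Last entry u_n followed by the bottom entry of each difference column."""
    trail, column = [], list(values)
    while column:
        trail.append(column[-1])
        column = [b - a for a, b in zip(column, column[1:])]
    return trail

def newton_backward(x, xn, h, values):
    """Newton's backward difference estimate of u at x.

    `values` are equally spaced entries whose last argument is xn.
    """
    z = (x - xn) / h                 # negative for x below the last argument
    estimate, coeff = 0.0, 1.0
    for k, diff in enumerate(trailing_differences(values)):
        estimate += coeff * diff
        coeff *= (z + k) / (k + 1)   # builds z(z+1)...(z+k) / (k+1)!
    return estimate

rates = [4.38, 5.26, 6.74, 9.14, 12.82, 18.18]     # ages 25, 30, ..., 50
u47 = newton_backward(47, 50, 5, rates)            # hypothetical target age
print(round(u47, 3))
```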
(d) Newton's Divided Difference Formula (NDDF): In population statistics, data are often
given at equal intervals of x, but sometimes we are required to interpolate when values of the
function are known for unequal intervals. Since we cannot take differences as defined earlier,
we adopt a process of differencing that involves the argument as well as the entry. The
differences obtained by this process are called 'divided' differences. In these situations, to
find the value of the function at an intermediate value of x, say $x_0$, we proceed as follows:
Let $f(x_1), f(x_2), \ldots, f(x_n)$ be the known values of the function at
$x_1, x_2, \ldots, x_n$. To find the value of the function at $x_0$, let us form the divided
difference table (Table 2.7).
From Table 2.7 we see that the top entries are $f(x_1)$, $f(x_1, x_2)$, $f(x_1, x_2, x_3)$,
etc. Then Newton's formula of divided differences for estimating $f(x_0)$ corresponding to
$x_0$ is
$$f(x_0) = f(x_1) + (x_0 - x_1)\,f(x_1, x_2) + (x_0 - x_1)(x_0 - x_2)\,f(x_1, x_2, x_3) + \cdots$$
Table 2.7: Divided Difference Table for General Case, y = f(x)
x        f(x)        First difference    Second difference    Third difference
x1       f(x1)
                     f(x1,x2)
x2       f(x2)                           f(x1,x2,x3)
                     f(x2,x3)                                 f(x1,x2,x3,x4)
x3       f(x3)                           f(x2,x3,x4)
                     f(x3,x4)                                 f(x2,x3,x4,x5)
x4       f(x4)                           f(x3,x4,x5)
                     f(x4,x5)
x5       f(x5)
:        :           :
xn-1     f(xn-1)
                     f(xn-1,xn)
xn       f(xn)
The entries are defined as
$$f(x_1, x_2) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}, \quad f(x_2, x_3) = \frac{f(x_3) - f(x_2)}{x_3 - x_2}, \quad \ldots, \quad f(x_{n-1}, x_n) = \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}$$
$$f(x_1, x_2, x_3) = \frac{f(x_2, x_3) - f(x_1, x_2)}{x_3 - x_1}, \quad f(x_2, x_3, x_4) = \frac{f(x_3, x_4) - f(x_2, x_3)}{x_4 - x_2}, \quad f(x_3, x_4, x_5) = \frac{f(x_4, x_5) - f(x_3, x_4)}{x_5 - x_3}$$
$$f(x_1, x_2, x_3, x_4) = \frac{f(x_2, x_3, x_4) - f(x_1, x_2, x_3)}{x_4 - x_1}, \quad f(x_2, x_3, x_4, x_5) = \frac{f(x_3, x_4, x_5) - f(x_2, x_3, x_4)}{x_5 - x_2}$$
(e) Lagrange's Formula: Newton's divided difference formula can be put in another form which
does not require the construction of the divided difference table. Let f(x) be a continuous
function of x and let $f(x_0), f(x_1), f(x_2), \ldots, f(x_n)$ be the values of f(x) when
$x = x_0, x_1, x_2, \ldots, x_n$. Then
$$f(x) = \sum_{i=0}^{n} f(x_i) \prod_{j \neq i} \frac{x - x_j}{x_i - x_j}$$
where f(x) is the figure to be interpolated. The above equation is known as Lagrange's
formula.
Note: Newton's Divided Difference formula and Lagrange's formula are used for unequal
intervals.
Example 6: Find f(27), when f(26) = 10.29, f(28) = 10.54, f(29) = 10.65 and f(30) = 10.76.
x     f(x)     Δ^1_x    Δ^2_x     Δ^3_x
26    10.29
               0.125
28    10.54             -0.005
               0.110              0.00125
29    10.65             0
               0.110
30    10.76
f(27) = 10.29 + (27-26)(0.125) + (27-26)(27-28)(-0.005) + (27-26)(27-28)(27-29)(0.00125)
      = 10.29 + 0.125 + 0.005 + 0.0025 = 10.4225
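The calculation of Example 6 can be sketched in Python with both Newton's divided difference formula and Lagrange's formula; the two should agree, since they describe the same interpolating polynomial. The function names are illustrative and not part of the text.

```python
def divided_differences(xs, fs):
    """Top entries of the divided difference table: f(x1), f(x1,x2), f(x1,x2,x3), ..."""
    column, top = list(fs), [fs[0]]
    for order in range(1, len(xs)):
        column = [(column[i + 1] - column[i]) / (xs[i + order] - xs[i])
                  for i in range(len(column) - 1)]
        top.append(column[0])
    return top

def newton_divided(x, xs, fs):
    """Newton's divided difference estimate of f at x."""
    estimate, product = 0.0, 1.0
    for xi, coeff in zip(xs, divided_differences(xs, fs)):
        estimate += coeff * product
        product *= (x - xi)          # (x - x1)(x - x2)... for the next term
    return estimate

def lagrange(x, xs, fs):
    """Lagrange's estimate of f at x (no difference table needed)."""
    total = 0.0
    for i, (xi, fi) in enumerate(zip(xs, fs)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += fi * weight
    return total

ages = [26, 28, 29, 30]
values = [10.29, 10.54, 10.65, 10.76]
print(round(newton_divided(27, ages, values), 4))   # 10.4225
print(round(lagrange(27, ages, values), 4))         # 10.4225
```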
2.3 Uses of Interpolation Formula
The interpolation formulas can be applied to solve many problems that arise in demographic
analysis. Some of them are mentioned below:
(i) Estimation of intermediate terms among n equidistant terms: In order to find the
intermediate terms among n equidistant terms, one of the three Newton's formulas, namely
forward, backward and central differences, is applied according to the position of the
interpolated value in the difference table. Example 5 has already shown how Newton's
forward difference formula can be applied to estimate the interpolated value. In case the
intervals of the x values are not equal, Newton's Divided Difference and Lagrange's formulas
are used to estimate the intermediate functional values. An illustration is given in Example 6.
(ii) Method for Estimating a Missing Term: If there are n equidistant terms of which n-1 are
known, the missing term is estimated by constructing a difference table in which the missing
value is denoted by x. We then assume that the difference of a suitable order (the fourth
order, or another order depending upon the polynomial relationship between the variables) is
zero and solve for x.
Example 7: Find f(27) by using Newton's Forward Difference Formula when f(26) = 10.29,
f(28) = 10.54, f(29) = 10.65 and f(30) = 10.76.
x     f(x)     Δ^1_x       Δ^2_x        Δ^3_x        Δ^4_x
26    10.29
               x-10.29
27    x                    20.83-2x
               10.54-x                  3x-31.26
28    10.54                x-10.43                   41.69-4x
               0.11                     10.43-x
29    10.65                0
               0.11
30    10.76
Setting the fourth order difference equal to zero gives 41.69 - 4x = 0, so that
x = f(27) = 10.4225.
When Newton's Divided Difference and Lagrange's formulas are applied to the above example, the
same answer is obtained.
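The missing-term method can be sketched as follows. With n+1 equidistant terms, the leading difference of order n is a fixed linear combination of the terms with binomial coefficients (for n = 4 it is u4 - 4u3 + 6u2 - 4u1 + u0), so setting it to zero gives one linear equation in the missing value. The function name `missing_term` and the use of `None` as a placeholder are assumptions of this sketch.

```python
from math import comb

def missing_term(values):
    """Fill the single None in an equally spaced series.

    The difference of order n = len(values) - 1 is set to zero:
    sum over k of (-1)**(n - k) * C(n, k) * u_k = 0.
    """
    n = len(values) - 1
    m = values.index(None)                       # position of the missing term
    known = sum((-1) ** (n - k) * comb(n, k) * u
                for k, u in enumerate(values) if u is not None)
    return -known / ((-1) ** (n - m) * comb(n, m))

# Example 7: f(26), f(27) missing, f(28), f(29), f(30)
f27 = missing_term([10.29, None, 10.54, 10.65, 10.76])
print(round(f27, 4))                             # 10.4225
```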
Example 8: The population of Goa in 5-year age groups is given below. Suppose we require an
estimate of the population aged 27 years.
Table 2.10: Population of Goa
Population aged 27 = Population aged up to 28 years - Population aged up to 27 years. Using
Newton's Backward Difference Formula, population up to 28 years:
(iv) Conversion of unconventional age groups into conventional age groups: If an age
distribution has not been tabulated in conventional age groups, it is possible to convert it
into conventional age groups by interpolation.
Example 9: The age distribution of males in unconventional age groups is given below. Find
the age distribution of the population by conventional age groups (0-4, 5-9, 10-14, ... etc.)
Age Males
0-1 39,510
2-4 64,533
5-7 62,125
8-14 129,666
15-17 40,057
18-22 78,347
23-29 80,994
30-39 96,327
The central ages of the groups 5-7, 8-14, 15-17, 18-22, 23-29 and 30-39 are 6.5, 11.5,
16.5, 20.5, 26.5 and 35.0 respectively. The respective intervals for the above age groups are
3, 7, 3, 5, 7 and 10 years.
To convert each group into a 5-year group with the central age fixed, we multiply the
respective group populations by
$$\frac{5}{3},\ \frac{5}{7},\ \frac{5}{3},\ \frac{5}{5},\ \frac{5}{7}\ \text{and}\ \frac{5}{10}$$
Thus the populations by 5-year age groups with central ages 6.5, 11.5, 16.5, 20.5, 26.5 and
35.0 are:
The population for the 5-year age group with exact central age 7.5 is obtained by interpolating
between the groups with central ages 6.5 and 11.5, noting that 6.5 is 1 unit away from 7.5 and
11.5 is 4 units away from 7.5.
In a similar way we can get the population by 5-year age groups with the required
central values. They are:
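Let $W_0$, $W_1$ and $W_2$ denote the populations in three consecutive 10-year age groups,
i.e. the populations aged n to n+9, n+10 to n+19 and n+20 to n+29 respectively. We wish to
split the population of the middle age group into five-year age groups, i.e. to find the
populations aged n+10 to n+14 and n+15 to n+19.
Let X be the population aged n+10 to n+14, so that $W_1 - X$ is the population aged n+15
to n+19. So that every entry refers to a five-year interval, the ten-year totals $W_0$ and
$W_2$ are halved. The divided difference table is then built on the following entries:
Age group         Central age    Population (five-year basis)
n to n+9          n+5            W0/2
n+10 to n+14      n+12.5         X
n+15 to n+19      n+17.5         W1 - X
n+20 to n+29      n+25           W2/2
The first divided differences are $(X - W_0/2)/7.5$, $(W_1 - 2X)/5$ and
$(W_2/2 - W_1 + X)/7.5$. To estimate X, we assume that the second differences are constant
and equal, so that
$$\frac{W_1 - 2X}{5} - \frac{X - W_0/2}{7.5} = \frac{W_2/2 - W_1 + X}{7.5} - \frac{W_1 - 2X}{5}$$
Multiplying throughout by 7.5 and collecting terms,
$$3W_1 - 6X = 2X - W_1 + \frac{W_2 - W_0}{2}, \quad \text{or} \quad 8X = 4W_1 + \frac{W_0 - W_2}{2}$$
$$\text{or} \quad X = \frac{W_1}{2} + \frac{W_0}{16} - \frac{W_2}{16}$$
Population aged n+10 to n+14 is $\dfrac{W_1}{2} + \dfrac{W_0}{16} - \dfrac{W_2}{16}$
The population aged n+15 to n+19 $= W_1 - X = \dfrac{W_1}{2} - \dfrac{W_0}{16} + \dfrac{W_2}{16}$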
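The split of a ten-year group can be sketched as below. The sketch uses the widely used form of the result, X = W1/2 + (W0 - W2)/16, which follows when the neighbouring ten-year totals are halved to a five-year basis before differencing; this choice, the function name and the test populations are assumptions of the sketch. The numbers are hypothetical and correspond to a population whose density rises linearly with age, for which the exact split is known.

```python
def split_ten_year_group(w0, w1, w2):
    """Split the middle of three consecutive 10-year totals into two 5-year totals.

    Returns (population aged n+10 to n+14, population aged n+15 to n+19).
    """
    first_half = w1 / 2 + (w0 - w2) / 16
    return first_half, w1 - first_half

# Hypothetical check: density proportional to age over 0-29 gives 10-year totals
# 50, 150, 250 and true 5-year totals 62.5 (ages 10-14) and 87.5 (ages 15-19).
younger, older = split_ten_year_group(50, 150, 250)
print(younger, older)        # 62.5 87.5
```

The two parts always add up to the original ten-year total, so the split redistributes the group without changing it.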
Example 10: Split the population of the ten-year age group 25-34 into five-year age groups,
taking $W_0$, $W_1$ and $W_2$ as the population totals of the age groups 15-24, 25-34 and
35-44 respectively.
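A simple method for obtaining the individual values where quinquennial values are
known is given below. Let $\delta$ denote the difference for a unit interval of x and $\Delta$
the difference for a quinquennial interval. Then $U_{x+5}$ may be expressed symbolically either
as $(1+\delta)^5 U_x$ or as $(1+\Delta) U_x$, so that
$$(1+\delta)^5 = (1+\Delta)$$
$$1+\delta = (1+\Delta)^{1/5}$$
$$\delta = (1+\Delta)^{1/5} - 1$$
From this relation one can easily find, by binomial expansion, that
$$\delta = \frac{1}{5}\Delta - \frac{2}{25}\Delta^2 + \frac{6}{125}\Delta^3 - \cdots, \qquad \delta^2 = \frac{1}{25}\Delta^2 - \frac{4}{125}\Delta^3 + \cdots$$
The same principle can be adopted if decennial values are known. In that event
$\Delta^1_x, \Delta^2_x, \ldots$ will represent differences for decennial intervals and the
individual differences will be found from the identity
$$\delta = (1+\Delta)^{1/10} - 1$$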
Example 11: The mortality rates for quinquennial ages are given below. Obtain the mortality
rates for ages 36, 37, 38 and 39.
Since we are interested only in ages after 35, we shall consider the following abridged
difference table.
Age    Mortality    Δ^1_x     Δ^2_x
35     0.0433
                    0.0162
40     0.0595                 0.0106
                    0.0268
45     0.0863
Assuming the second order differences to be constant, we construct the following table.
Age    Mortality    Δ^1_x      Δ^2_x
35     0.0433
                    0.00239
36     0.0457                  0.000424
                    0.00281
37     0.0485                  0.000424
                    0.00323
38     0.0517                  0.000424
                    0.00365
39     0.0553                  0.000424
                    0.00407
40     0.0595
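The subdivision used in Example 11 can be sketched as follows, assuming (as the example does) that second differences are constant. The single-year second difference is then the quinquennial one divided by 25, and the first single-year difference is fixed by requiring that five single-year steps reproduce the quinquennial first difference. The function name and arguments are illustrative; because the sketch carries full precision, the last digit of an intermediate value may differ slightly from a table built up from rounded differences.

```python
def subdivide_second_order(u0, d1, d2, parts=5):
    """Single-interval values from u0 and the leading first and second
    differences (d1, d2) of a table whose interval spans `parts` units,
    assuming constant second differences."""
    small_d2 = d2 / parts**2
    # u0 + parts*small_d1 + C(parts, 2)*small_d2 must equal u0 + d1
    small_d1 = (d1 - parts * (parts - 1) / 2 * small_d2) / parts
    values, step = [u0], small_d1
    for _ in range(parts):
        values.append(values[-1] + step)
        step += small_d2             # first differences grow by the constant second difference
    return values

rates = subdivide_second_order(0.0433, 0.0162, 0.0106)   # ages 35, 36, ..., 40
for age, rate in zip(range(35, 41), rates):
    print(age, round(rate, 4))
```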
It may happen that you know the values of a function f(x) at intervals of a unit and
wish to calculate a table of values with a smaller interval. For example, it is common practice
to calculate every fifth value in a life table and to complete the table by interpolation. Here
the unit interval for the preliminary calculations is five years and we are faced with the
problem of obtaining these quantities for single-year intervals. The formulae for such
situations have already been given. Here we shall give certain methods which are equivalent to
the methods given earlier but which are easier to apply. The tables to be used to obtain
single-year interval values when data are given at 5 or 10-year intervals, which are based on
Newton's Forward Difference Formula, are given as Tables 1 to 10 in the supplement to this
block.
It is notable that in each case, to obtain $u_{x+1}, u_{x+2}, u_{x+3}$ etc., the values of
$u_x, u_{x+5}, u_{x+10}$ etc. are multiplied by the corresponding coefficients given in the
tables and the resulting values are added up. The three point, four point and five point
formulae are used for differences up to the second, third and fourth orders respectively.
x     u_x
30    .0338
35    .0433
40    .0595
Using the three point multipliers from reference Table 1, attached as a supplement to this
block, the values for the single years following age 30 are:
u31 = .0338 (.72) + .0433 (.36) + .0595 (-.08) = 0.0352
u32 = .0338 (.48) + .0433 (.64) + .0595 (-.12) = 0.0368
u33 = .0338 (.28) + .0433 (.84) + .0595 (-.12) = 0.0387
u34 = .0338 (.12) + .0433 (.96) + .0595 (-.08) = 0.0409
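The three point multipliers quoted above (.72, .36, -.08 and so on) can be reproduced from the second-degree interpolating polynomial through three equally spaced values. In the sketch below the multipliers are computed for points 1, 2, 3 and 4 years beyond the first tabulated age, which are the positions these coefficients correspond to; the function name is illustrative, and the supplementary Table 1 itself is not reproduced here.

```python
def three_point_multipliers(z):
    """Multipliers of u_x, u_{x+5}, u_{x+10} for a point z five-year
    intervals beyond x (z = 0.2 is one year past x)."""
    return ((z - 1) * (z - 2) / 2, -z * (z - 2), z * (z - 1) / 2)

known = [0.0338, 0.0433, 0.0595]          # u_30, u_35, u_40
estimates = {}
for years in range(1, 5):
    m = three_point_multipliers(years / 5)
    estimates[30 + years] = sum(c * u for c, u in zip(m, known))
    print(30 + years, [round(c, 2) for c in m], round(estimates[30 + years], 4))
```

The multipliers in each line add up to 1, so a constant series is reproduced exactly.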
vitiated due to digit preference in ages etc. that these figures have meaning only if
taken in large age groups such as 5-year or 10-year age groups. But in many situations
single-year age distributions are required, and so methods have to be developed to split the
5-year or 10-year totals into single-year groups. We may assume that, however much the data may
be vitiated by digit preference etc., the group totals are substantially correct, so that a
redistribution of the group totals into single-year values on the basis of these 5-year or
10-year totals may be taken as good in many practical situations. We define the group totals as
$$W_0 = P_0 + P_1 + P_2 + P_3 + P_4; \qquad W_5 = P_5 + P_6 + P_7 + P_8 + P_9$$
1. A limitation of the interpolation method is that if any particular five-year age group
is greatly in error due to under- or over-enumeration, the method will not correct such
deficiencies; they must be corrected by graphic interpolation or by applying
osculatory or modified osculatory interpolation formulas. Common sense must
govern the use of interpolation formulas and the interpretation of the results.
2. If the function f(x) changes drastically within a small interval of x (such as rates of
mortality during the first five years of life, or the percentage of the population
married between the ages of 15 and 24), interpolation will tend to distribute the
change more or less smoothly throughout the interval, when in fact the change may
be concentrated in a particular part of the interval.
3. When the function tends to zero slowly, i.e. it is asymptotic to the x axis,
interpolation for the tail may have percentage errors that are larger than
permissible. In this case interpolation can be greatly improved by dealing with the
logarithm of f(x). Even with this transformation, since the required value lies in the
bottom panel of the tabulated function, a backward difference formula will be
better.
4. Also, in the case of open intervals (which are too wide) the theory of interpolation
will fail to give results.
In other words, since interpolation is a "mechanical procedure", its uncritical use may
result in the obliteration of fluctuations that are basic underlying characteristics of the
data.
Caution should be exercised not only in the use of interpolation formulas in
situations of the above type but also in the selection of the formula itself. Each formula has
its advantages and disadvantages, and one should be guided not only by common sense but also
by experience with similar data.
Self-Check Exercises
(d) The first term, u0, in the difference table is called the _________ and
the differences at the head of the respective columns are called
the _____________________.
2.5 Graduation
The probabilities or rates of occurrence of death, birth, marriage etc., are of great
interest to the demographer. We are in many cases interested in constructing tables setting
forth probabilities or rates and for that purpose observations are made of the happening of
such events. Graduation is one of the steps in the construction of these tables.
The problem of graduation arises in connection with the construction of, say, mortality
rates because a series of observed mortality rates will be found to contain irregularities which
we have reason to believe are not a feature of the true, underlying rates of mortality. These
irregularities may be due mainly to errors in reporting. This is particularly true of
tabulations by single years of age, which tend to heap at ages ending in '0' and '5' and have
other irregular features. For this reason, erratic fluctuations in data are often regarded as
symptomatic of error in collection or processing. Before certain refined demographic
calculations are made, such as the construction of life tables, it is necessary to smooth or
graduate the data to remove those irregularities and correct for mal-distribution.
In most cases the underlying law may not be known and therefore we must rely on the
information supplied by observations of the rates or probabilities. If we assume the underlying
curve to be smooth, regular and continuous (this assumption is quite reasonable and holds in
many situations), we may be able to secure by the method of graduation, from the irregular
series of observed values, a smooth series of values consistent in a general way with the
observed series. This smooth series, or series of graduated values, is then taken as a
representation of the underlying law which gave rise to the series of observed values.
Thus, in graduating a series of k+1 observed mortality rates from q0" to qk", the
graduation process will substitute k+1 graduated mortality rates q0 to qk lying close to the
crude values but larger at some points and smaller at others. These graduated rates are
obtained by altering each observed rate by reference to the other observed rates, so that the
new series will be smooth rather than irregular but at the same time will exhibit the trend
indicated by the observed series.
Graduation may not be successful in wholly eliminating the error component of the
observed series. Consequently, the graduated series will contain an element of residual error
and must therefore be thought of as a representation of the underlying law rather than as the
law itself.
Graduation is characterized by two essential qualities (i) smoothness and (ii) fit or
consistency with the observed data. The graduated series should be not only smooth as
compared with the ungraduated data but it should be consistent with the indication of the
ungraduated series. Since in smoothening an observed series its values must be changed, the
new values will depart from the observed series. Generally, an increase in the smoothening
results in a reduction in the fit. Conversely, when the graduated series is drawn closer to the
observed series, improving the fit, smoothness usually suffers. Graduation must follow a
middle course between optimum fit and optimum smoothness.
(i) The Graphic Method: In the graphic method, the observed values are suitably plotted on
graph paper and a smooth continuous curve is drawn among them as the basis of the graduated
series.
This method was applied by Joshua Milne in the graduation of one of the earliest
mortality tables, the Carlisle Table of mortality, published in 1815. The data consisted of
census populations and death registers in two parishes in Carlisle. The graduation was performed
separately on the population and the deaths, arranged in quinquennial and decennial age groups.
A similar method was applied by statisticians in the U.S. Bureau of the Census to study
the population of the Philippines from 1799 to 1903.
(ii) The Interpolation Method: Under the interpolation method, the graduated series is
obtained by interpolating between special points determined as representative of age groups
into which the data are combined. Since graduation involves the replacement of an irregular
observed series by a regular smooth series consistent with the trend of the observed values,
clearly the interpolation method of graduation includes more than interpolation alone. As a
graduation process, the interpolation method comprises three elements.
(a) Grouping: The first step in the interpolation method is the combination of the data into
groups of suitable size and number. The data are grouped in the hope that by distributing the
excess population over the neighbouring ages the effects of these errors of reporting will be
eliminated or greatly reduced. To that end, an effort is made to select the particular grouping
that will best compensate for the heaping of the data.
(b) Pivotal points: The second step in the application of the method is the calculations of the
special interpolation points, referred to as 'pivotal points' upon which interpolation will be
based. Since the interpolation is anchored to the pivotal points, it is of great importance to the
success of the method as a whole that these points be representative of the respective groups
and at the same time form a smooth series. Because the interpolating curve segments are
constrained either to pass through the pivotal points or in the modified interpolation methods,
to pass close by them, the entire series will have the same general pattern of regularity as the
series of interpolation points.
King's method provides a means of computing the pivotal values $u_x$ from three or five
surrounding quinquennial sums, $w_x$, into which the data are assumed to be grouped. The
formula based on the three quinquennial sums $w_{x-5}$, $w_x$ and $w_{x+5}$, and correct to the
third difference, is
$$u_x = 0.2\,w_x - 0.008\,(w_{x-5} - 2w_x + w_{x+5})$$
This formula may be derived by the methods of the theory of finite differences. King's
formula should be applied separately to the exposures and the deaths. The pivotal values of
the rates are then obtained as the quotient of the pivotal deaths and the pivotal exposures.
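King's three-sum formula is a one-line calculation, sketched below. The function name is illustrative; the three quinquennial sums are the Goa male populations aged 35-39, 40-44 and 45-49 quoted later in this unit, used here only as convenient input, and the result is the pivotal value for the central age of the 40-44 group.

```python
def king_pivotal_value(w_prev, w_mid, w_next):
    """Pivotal value from three consecutive quinquennial sums (King's formula)."""
    return 0.2 * w_mid - 0.008 * (w_prev - 2 * w_mid + w_next)

# Quinquennial sums for ages 35-39, 40-44 and 45-49
u_central = king_pivotal_value(35016, 30470, 23422)
print(round(u_central, 1))        # 6114.0
```

When the three sums lie on a straight line the second difference vanishes and the pivotal value reduces to one fifth of the middle sum.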
King's method does not give satisfactory pivotal points unless the grouped data form
a comparatively smooth series, as in the case of population statistics (where the data are very
extensive). In any event, if the series of pivotal points does not seem to be smooth enough, its
smoothness may be increased by graduating the pivotal values graphically before proceeding
with the interpolation.
(iii) Graduation by Newton's Formula: One of the simplest and most effective of the
graduation techniques is Newton's Forward Difference Formula (NFDF). In this method, the
reported data are grouped into intervals of equal size (say 5-year or 10-year groups).
Cumulative frequencies for the selected intervals are then compiled and used as the known
$u_x$ values in NFDF. Application of the formula for a series of intermediate points within
each interval yields a smooth cumulative distribution, and successive subtraction of the
cumulative frequencies for the intermediate points converts the cumulative distribution into a
distribution of smoothed frequencies between the intermediate points.
One major weakness of Newton's method is the failure of the sets of single-year
estimates for successive five-year intervals to link up smoothly. If the data are graduated
properly, the entire distribution should form a smooth continuous curve. By the above method,
we would find sudden shifts at the points of junction, i.e. at the ends of the intervals within
which the graduations are made. Actuarial mathematicians have developed a variety of formulas
for accomplishing a smooth junction of the interpolation curves.
The Goa census of 1981 gives the following number of males in the age groups
preceding and following the 40-44 age interval.
Age Number
35-39 35016
40-44 30470
45-49 23422
Successive differences of cumulative frequencies are as follows:
Take age 34 as the zero point for the interpolation. This means that the x values in
Newton's formula will be 1.0 for age 39, 1.2 for age 40, 1.4 for age 41, ..., 2.0 for age 44.
Substituting the leading differences for age 34 and applying Newton's formula (refer formula
2.5) yields the following:
u1.2 = 0 + 1.2 (35016) + 0.12 (-4546) - 0.032 (-1602) = 41525
u1.4 = 0 + 1.4 (35016) + 0.28 (-4546) - 0.056 (-1602) = 47839
u1.6 = 0 + 1.6 (35016) + 0.48 (-4546) - 0.064 (-1602) = 53946
u1.8 = 0 + 1.8 (35016) + 0.72 (-4546) - 0.048 (-1602) = 59833
These values are cumulative estimates. For example, the value for age 40 is the
number of persons from age 35 through age 40 inclusive. In order to estimate the population
that was aged 40, it is necessary to subtract the cumulative number for age 39 from that for
age 40. Similar subtractions are made to obtain the populations at ages 41, 42, 43 and 44.
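The graduation steps described above (cumulate, interpolate at single years, then difference) can be sketched as follows. The leading differences 35016, -4546 and -1602 are taken as printed in the worked lines above; the difference table they come from is not reproduced here, so they are treated as given inputs, and the function name is illustrative.

```python
def newton_forward_from_leading(z, leading):
    """Newton's forward estimate at z from [u0, first, second, ...] leading differences."""
    estimate, coeff = 0.0, 1.0
    for k, diff in enumerate(leading):
        estimate += coeff * diff
        coeff *= (z - k) / (k + 1)   # builds z(z-1)...(z-k) / (k+1)!
    return estimate

leading = [0, 35016, -4546, -1602]   # cumulative origin at age 34, as printed

# Cumulative numbers up to each age 39..44 (z = 1.0, 1.2, ..., 2.0)
cumulative = {age: newton_forward_from_leading((age - 34) / 5, leading)
              for age in range(39, 45)}

# Single-year numbers: cumulative to the age minus cumulative to the preceding age
single_year = {age: cumulative[age] - cumulative[age - 1] for age in range(40, 45)}

for age, number in single_year.items():
    print(age, round(number))
print(round(sum(single_year.values())))   # equals the 40-44 group total, 30470
```

Because the single-year figures are differences of one smooth cumulative curve, they always add back to the reported group total.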
Column 4 shows the difference between the graduated and the census figures. The
'heaping' at age 40 and the low reporting at ages 41, 43 and 44 are evident. It is difficult to
explain why such an erratic age distribution should be a true characteristic of the population.
Therefore, it appears to be due to misreporting of age. The method may be extended to include
more five-year age groups and differences above the third order. It can also be applied to
10-year groups. In this way it is possible to smooth the entire distribution. Reference Tables
11 to 22 given in the supplement to this block are used for data given in 5 or 10-year age
groups to obtain single-year values.
Example 13: The following table gives the single year (enumerated) age distribution of the
Andhra Pradesh male population (1961). Using the 5-point formula for data given in 5-year
age groups, obtain graduated estimates by single year of age from 10 to 29 years.
To get P10 we multiply w0 by -.003864, w5 by .065855, w10 by .178816, w15 by -.042943 and w20
by .006336 and then add them up (Table 3). To get P11, multiply w0, w5, w10, w15 and w20 by
the coefficients in the second line of Table 3, which gives the five-point formula for data
given in 5-year age groups. Similarly, for P12 we use the same w's but the third line in the
table, for P13 the same w's and the fourth line, and for P14 the same w's and the fifth line.
To get P15 we use w5, w10, w15, w20 and w25 and the first line in the table, and so on.
P10 = 4740; P11 = 4552; P12 = 4331; P13 =4082; P14 = 3813; P15 = 3346; P16 = 3126; P17 =
2971; P18 = 2882; P19 = 2847; P20 = 2861.
The above are the smoothed values of the populations by single years. We can check
that the graduated totals are the same as the ungraduated totals, i.e.,
Graduated P10 + P11 + P12 + P13 + P14 = Ungraduated P10 + P11 + P12 + P13 + P14 = w10
One of the limitations observed in adjusting rough data by the usual Newton's (single
polynomial) interpolation formulas is that at points where two interpolation curves meet, there
are sudden breaks in the values of the first order differences (see in the previous example
p19 = 2847 and p20 = 2861). To solve this, osculatory interpolation was devised by
Thomas Bond Sprague in 1880. It involves combining two overlapping polynomials into one
equation. One of the polynomials begins sooner and ends sooner than the other, and the
interpolations are limited to the overlapping part. The second of the two polynomials in the
first range then becomes the first polynomial in the second range. The use of one common
polynomial for each pair of successive ranges permits a continuous joining of results from
range to range. The two overlapping polynomials should coincide at the beginning and at
the end of the range in which interpolation is desired. The specific conditions of the
osculatory interpolation formula are that both polynomials should have a common ordinate, a
common tangent (slope) and/or a common radius of curvature. This is possible by making the
first derivative or the first two derivatives equal for the two polynomials.
Several methods are available to split five-year age group data into single years of age.
Some of them are Karup-King's third-degree tangential formula, Sprague's fifth-difference
formula, Jenkins' fifth-difference osculatory non-reproducing formula, Greville's formula,
and Beers' six-term ordinary and modified formulas.
Karup-King's Formula: This is the simplest formula for which interpolation coefficients
are presented. It is correct to second differences and has an adjustment involving third
differences. It uses four given points. The formula provides the third-degree curve through
the central interval u1 to u2 of the four-point series u0, u1, u2 and u3 which has at u1 and
u2 the same tangents (first differential coefficients) as the partial Newton-Stirling curves of
the second degree through u0, u1, u2 and u1, u2, u3 respectively. The formula may be expressed
in the following terms:
$$u_{x+1} = u_1 + x\,\Delta u_0 + \frac{x(x+1)}{2!}\,\Delta^2 u_0 + \frac{x^2(x-1)}{2!}\,\Delta^3 u_0$$
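As a check on the formula, the short Python sketch below expands the differences Δu0, Δ²u0 and Δ³u0 in terms of u0, u1, u2 and u3 and collects the multiplier of each given point. The function name is an assumption introduced for illustration. For x = 0.2, 0.4, 0.6 and 0.8 the multipliers it prints agree with the middle-interval rows N2.2 to N2.8 of Table 23.

```python
def karup_king_point_coeffs(x):
    """Multipliers of u0, u1, u2, u3 giving u at position 1 + x (0 <= x <= 1).

    Based on u(1+x) = u1 + x Δu0 + x(x+1)/2 Δ²u0 + x²(x-1)/2 Δ³u0.
    """
    c1 = x                      # coefficient of Δu0
    c2 = x * (x + 1) / 2        # coefficient of Δ²u0
    c3 = x * x * (x - 1) / 2    # coefficient of Δ³u0
    # Δu0 = u1 - u0, Δ²u0 = u2 - 2u1 + u0, Δ³u0 = u3 - 3u2 + 3u1 - u0
    return (
        -c1 + c2 - c3,              # multiplier of u0
        1 + c1 - 2 * c2 + 3 * c3,   # multiplier of u1
        c2 - 3 * c3,                # multiplier of u2
        c3,                         # multiplier of u3
    )


for x in (0.2, 0.4, 0.6, 0.8):
    print(x, [round(c, 3) for c in karup_king_point_coeffs(x)])
```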
The application of Karup-King's Formula: Let Tx, Tx+5 and Tx+10 denote the enumerated
population aged x to x+4, x+5 to x+9, and x+10 to x+14 respectively. First find the groups
which have one T value above them and one below them. These are designated as `mid panel'.
In this case, a group Tx with neighbours Tx-5 and Tx+5 (for x = 5, 10) would satisfy this.
The first group, which has two values below it but none above it, is designated as the first
end panel. Similarly, the last group, which has two values above it but none below it, is
designated as the last end panel. The multipliers for the first, middle and last panels are
given in Tables 23 A & B, which are attached in the supplementary section of this block.
In the Karup-King formula (see Table 23 B), there are 3 columns and 5 rows in each panel.
The five rows denote the five single year values which we want to obtain from the given
five-year grouped values. For example, in the first panel, row number 4, when applied to
Tx, Tx+5 and Tx+10, gives Px+3 (the population aged x+3). In the mid panel, row number 3,
when applied to Tx+5, Tx+10 and Tx+15, gives Px+12 (the population aged x+12), and in the
last panel, row number 4, when applied to Tx+10, Tx+15 and Tx+20, gives Px+23 (the
population aged x+23). In this way all the groups can be split into single year values. The
formula is adjusted in such a way that the total in each age group is unaffected by the splitting.
Sprague's Formula: This formula smoothens at the points occupied by the original data by
providing that the curve of the fifth degree passing through the central interval u2 to u3 of
the given six-point series u0, u1, ..., u5 shall have at u2 the same tangent and radius of
curvature as the partial Newton-Stirling curve of the fourth order through u0, u1, ..., u4, and
shall similarly, at the point whose ordinate is u3, have the same tangent and radius of
curvature as the partial curve of the fourth order through u1, u2, ..., u5. The other important
condition laid down is that the formula must reproduce the given values exactly. The formula
may be stated as:
$$u_{2+x} = u_0 + \frac{(x+2)}{1!}\,\Delta u_0 + \frac{(x+2)(x+1)}{2!}\,\Delta^2 u_0 + \frac{(x+2)(x+1)x}{3!}\,\Delta^3 u_0 + \frac{(x+2)(x+1)x(x-1)}{4!}\,\Delta^4 u_0 + \frac{x^3(x-1)(5x-7)}{4!}\,\Delta^5 u_0$$
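The reproducing property of Sprague's formula can be illustrated with a small Python sketch. It assumes the fifth-difference form u(2+x) = u0 + (x+2)Δu0 + (x+2)(x+1)/2! Δ²u0 + (x+2)(x+1)x/3! Δ³u0 + (x+2)(x+1)x(x-1)/4! Δ⁴u0 + x³(x-1)(5x-7)/4! Δ⁵u0, expands each difference in terms of u0, ..., u5, and collects the multiplier of each point. The function name is hypothetical and used only for this illustration.

```python
from math import comb


def sprague_point_coeffs(x):
    """Multipliers of u0..u5 giving u at position 2 + x (0 <= x <= 1)."""
    diff_coeffs = [
        1.0,                                     # u0
        x + 2,                                   # Δu0
        (x + 2) * (x + 1) / 2,                   # Δ²u0
        (x + 2) * (x + 1) * x / 6,               # Δ³u0
        (x + 2) * (x + 1) * x * (x - 1) / 24,    # Δ⁴u0
        x**3 * (x - 1) * (5 * x - 7) / 24,       # Δ⁵u0
    ]
    # Δ^k u0 = sum over j of (-1)^(k-j) C(k, j) u_j
    weights = [0.0] * 6
    for k, c in enumerate(diff_coeffs):
        for j in range(k + 1):
            weights[j] += c * (-1) ** (k - j) * comb(k, j)
    return weights


print([round(w, 4) for w in sprague_point_coeffs(0.0)])  # all weight on u2
print([round(w, 4) for w in sprague_point_coeffs(1.0)])  # all weight on u3
print([round(w, 4) for w in sprague_point_coeffs(0.2)])
```

At x = 0 and x = 1 the whole weight falls on u2 and u3 respectively, which is what "reproducing the given values exactly" means. For any x the six multipliers add up to one.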
Application of Sprague's Formula: In general, Sprague's formula calls for two five-year age
intervals to precede and to follow the interval within which single year graduation is to be
performed. When this condition is met, a set of `mid panel' multipliers can be used. At the
ends of the distribution, where two five-year intervals are available on only one side of the
five-year interval within which graduation is desired, special sets of multipliers based on
4th order differences are used. There are four such sets of multipliers, two for the younger
ages and two for the older ages. The `first-end-panel' multipliers are used for the 0-4 year
interval and the `first-next-to-end-panel' multipliers are used for the 5-9 year interval. At
the oldest ages the `last-end-panel' multipliers are used for the oldest five-year age group
and the `last-next-to-end-panel' multipliers are used for the five-year group immediately
younger. In general, the Sprague multipliers are very flexible and will fit most
distributions of data by age.
Greville's Formula: In recent years it has been pointed out that for most actuarial work
interpolation is used only to obtain estimates for integral ages, and that the function ux,
where ux is the estimated number at age x, may logically be regarded as a discrete series
rather than a continuous curve. Formulas which minimize the mean square error of differences
of a given order have been developed by Greville and Beers; these formulas are recommended
for general use in demographic analysis. In Greville's formula, if ur is an observed value
and Ur the true value, then Ur = ur + er, where er is an error. It is assumed that the errors
in the observed values are independent random variables with mean zero and variance σ². It is
further assumed that the differences of U beyond order j are zero. Greville obtained the
graduated values by minimizing the variance of the differences of order j+1 of the graduated
series, where each graduated value is a linear combination of the observed values.
Beers' Formula: Beers started with the assumption that the 5th differences of the observed
values are independent random variables with mean zero and variance σ², where σ² is the
assumed constant mean square error (variance) of each Δ⁵ur.
2.7.1 Modified Osculatory Interpolation: When the reproducing formulas are used to fill in
the values between certain pre-determined points, it is often found that the whole curve which
finally results shows many undulations and points of inflection even though it is free
from discontinuities. Since the original values are reproduced exactly, the reproducing
formulas ensure smoothness at the junctions, but if the group values are not correct, this may
lead to undulation. Therefore, when a reproducing formula is used, the original values should
first be graduated if needed.
As mentioned above, both ordinary interpolation and osculatory interpolation are
true interpolation formulas, in the sense that the interpolating arcs pass through the pivotal
points. W.A. Jenkins removed this restriction and produced a set of formulas, known as
modified osculatory interpolation formulas, which achieve considerably greater smoothness
among the interpolated values than do true interpolation formulas. In the modified osculatory
formulas, two adjoining interpolating arcs merely meet, and do so in such a way that a
specified number of successive derivatives of the interpolating curve functions are equal at
their common point. Note that the interpolating arcs do not pass through the pivotal points.
The extent by which the value given by this formula differs at the pivotal point is
$$U_x\,(\text{formula}) = U_x - \frac{1}{36}\,\delta^4 U_x$$
(i) The Beers formula yields a smoother pattern of results than the Sprague formula.
However, the Beers formula has a slight disadvantage. When it is used for
subdivision into tenths, the results of the subdivisions will not necessarily add up
exactly to the results of subdivision into fifths.
(ii) When the data tend to follow the trend of a second-degree or third-degree polynomial,
the Sprague formula and the Beers formula will both yield about the same results as
the Karup-King formula.
(iii) The modified formula will give poor results when used with data that are of good
quality. It should be used when there is a desire to obtain a smooth series of
interpolated values from data that are known to be somewhat erratic.
The choice of a method for interpolation is dependent on the nature of the data and on
the purposes to be served. The several sets of interpolation coefficients that are presented in
Appendix Tables are based on formulas that differ in their underlying principles. There is no
one best method for all purposes.
Tables of selected sets of multipliers: These selected sets of multipliers, based on five
different formulas, namely the Karup-King, Sprague, Beers ordinary, Beers modified and
Grabill formulas, are used for subdivision of grouped data. For instance, these multipliers
may be used for subdividing age data given in 5-year age groups into single years of age.
They may also be used for subdividing data for 10-year age groups into single years of age.
These multipliers can be manipulated in various ways.
Example 14: Employing the Karup-King third-difference formula, estimate the population
20 years old from the data given below.
Age Number
15-19 35700
20-24 30500
25-29 32600
Age 20 is the `first fifth' of age group 20-24. Age group 20-24 is a middle group.
Taking the population aged 15-19 as G1, the population aged 20-24 as G2, and the population
aged 25-29 as G3, and using the coefficients for the first fifth of G2 given in
Table 23 B `for subdivision of groups into fifths', the desired estimate of the population
20 years old is:
+.064 (35700) + .152 (30500) - .016 (32600) = 2285 + 4636 - 522 = 6399.
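The multipliers used in Example 14 can be recovered from the Karup-King formula itself. The Python sketch below is an illustration under stated assumptions: it treats the group totals as differences of cumulated totals N0 = 0, N1 = G1, N2 = G1 + G2, N3 = G1 + G2 + G3, interpolates the cumulated series at steps of one fifth across the middle interval, and differences the results. The helper names are hypothetical.

```python
def kk_point(x):
    """Karup-King multipliers of four points for the value at position 1 + x."""
    c1, c2, c3 = x, x * (x + 1) / 2, x * x * (x - 1) / 2
    return (-c1 + c2 - c3, 1 + c1 - 2 * c2 + 3 * c3, c2 - 3 * c3, c3)


def kk_mid_panel_fifths():
    """Multipliers of G1, G2, G3 for each fifth of the middle group G2."""
    rows = []
    for i in range(5):
        lo, hi = kk_point(i / 5), kk_point((i + 1) / 5)
        d = [h - l for h, l in zip(hi, lo)]   # weights on N0, N1, N2, N3
        # G1 appears in N1, N2, N3; G2 in N2, N3; G3 in N3 only.
        rows.append((d[1] + d[2] + d[3], d[2] + d[3], d[3]))
    return rows


G1, G2, G3 = 35700, 30500, 32600      # data of Example 14
m = kk_mid_panel_fifths()[0]          # first fifth of G2, i.e. age 20
age_20 = m[0] * G1 + m[1] * G2 + m[2] * G3
print([round(c, 3) for c in m], round(age_20))
```

Summing the five rows of multipliers gives (0, 1, 0), so the single year estimates add back to the total of the group being subdivided.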
It is noted that, whenever possible, mid panel multipliers should be used. For
subdivision of the first group in a distribution (e.g. ages 0-4) the first panel multipliers
must be used. Similarly, for subdivision of the last group (e.g. ages 70-74) the last panel
multipliers must be used. For more details about the use of these supplementary tables you
may refer to Bogue et al. (1994), Vol. 1, Chapter 5.
Self-Check Exercises
Let Us Sum Up
* Interpolation is the process of finding the value of the function (y) for any of the
independent variable (x) within a given range of value of x and extrapolation is the process
of finding the value outside the given range of x.
* Interpolation analysis rests on two assumptions: firstly, that the quantity changes
continuously without any break, and secondly, that the rate of change is uniform and
there are no sudden jumps in the data. In other words, the data are in the
shape of a continuous or smooth curve.
* The symbol Δ¹ux (Δ is read as delta) is used to represent the difference between two
successive known values of the distribution, ux+1 - ux. Such differences are called first
order differences or simply "first differences". The superscript identifies the order of
the difference and the subscript specifies which pair of values has been differenced.
* Newton's Forward formula is applied when the independent variable advances by equal
intervals. Given the first row of differences, it is possible by this method to reproduce all
other differences, simply by successively adding adjacent pairs of values together and
placing the total under the left entry of the pair. This formula is given as
$$u_z = u_0 + Z\,\Delta u_0 + \frac{Z(Z-1)}{2!}\,\Delta^2 u_0 + \frac{Z(Z-1)(Z-2)}{3!}\,\Delta^3 u_0 + \cdots$$
where $Z = \frac{x - x_0}{h}$ and $h = x_1 - x_0$.
u0, Δu0, Δ²u0, Δ³u0, ..., etc. are the leading differences occurring at the top of the cone
in the difference table.
* In population statistics, data are often given at equal intervals of x, but sometimes we
are required to interpolate when values of the function are known for unequal intervals.
Since we cannot take the differences as defined earlier, we adopt a process of differencing
which involves the argument as well as the entry. The differences obtained by this process
are called `divided' differences. Newton's Divided Difference and Lagrange's formulae are
based on these concepts.
* Graduation may be defined as the process of securing, from an irregular series of observed
values of a continuous variable, a smooth regular series of values consistent in a general
way with the observed series of values. It is characterized by two essential qualities, i.e.
smoothness and fit, or consistency with the observed data. It is to be noted here that the
graduated series should be not only smooth as compared with the ungraduated data but also
consistent with the indications of the ungraduated series.
* The interpolation method of graduation comprises three elements viz. grouping of data,
securing of a smooth, reliable series of points, and the computation of graduated values by
interpolation based upon these points.
Model Answers
(ii) The rate of change in the values should be uniform and there should not be any
sudden jump in the data.
(i) Where there are n equidistant terms and it is required to find an intermediate
term.
(ii) Where there are n equidistant terms of which n-1 are known and it is required
to find the missing terms.
(iii) To estimate the composition of a population for a more detailed interval than
those in which they are reported.
(v) To split the ten year and five year intervals into five year intervals.
(vi) To split the ten-year and five-year intervals into single year intervals.
(ii) When the function tends to zero slowly i.e. it is asymptotic to x-axis,
interpolation for the tail may have percentage errors that are larger than
permissible.
(iii) Caution also should be exercised not only in the use of interpolation formula in
the above situations but also in the selection of formula itself.
5. Graduation may be defined as the process of securing, from an irregular series of
observed values of a continuous variable, a smooth regular series of values consistent
in a general way with the observed series of values.
(i) Smoothness
(ii) Fit
7. The pivotal points are defined as special interpolation points upon which interpolation
will be based.
Supplementary
The following 10 tables may be used to obtain single year interval values when data are given
at 5 or 10 year intervals. The tables are based on Newton's Forward Difference formula.
Table 1: Table for obtaining single year interval values for the first section when three
single year values are given at five year intervals.
ux = Population at age x
──────────────────────────────────
          ux       ux+5     ux+10
──────────────────────────────────
ux+1      .72      .36      -.08
ux+2      .48      .64      -.12
ux+3      .28      .84      -.12
ux+4      .12      .96      -.08
──────────────────────────────────
Table 2: Table for obtaining single year interval values for the second section when three
single year values are given at five year intervals.
──────────────────────────────────
          ux-5     ux       ux+5
──────────────────────────────────
ux+1      -.08     .96      .12
ux+2      -.12     .84      .28
ux+3      -.12     .64      .48
ux+4      -.08     .36      .72
──────────────────────────────────
Table 3: Table for obtaining single year interval values for the first section when four
single year values are given at five year intervals.
──────────────────────────────────
          ux       ux+5     ux+10    ux+15
──────────────────────────────────
ux+1      .672     .504     -.224    .048
ux+2      .416     .832     -.312    .064
ux+3      .224     1.008    -.288    .056
ux+4      .088     1.056    -.176    .032
──────────────────────────────────
Table 4: Table for obtaining single year interval values for the first section when five
single year values are given at five year intervals.
──────────────────────────────────
          ux       ux+5     ux+10    ux+15    ux+20
──────────────────────────────────
ux+1      .6384    .6384    -.4256   .1824    -.0336
ux+2      .3744    .9984    -.5616   .2304    -.0416
ux+3      .1904    1.1424   -.4896   .1904    -.0336
ux+4      .0704    1.1264   -.2816   .1024    -.0176
──────────────────────────────────
Table 5: Table for obtaining single year interval values for the third section when five
single year values are given at five year intervals.
──────────────────────────────────
          ux-10    ux-5     ux       ux+5     ux+10
──────────────────────────────────
ux+1      .0144    -.1056   .9504    .1584    -.0176
ux+2      .0224    -.1536   .8064    .3584    -.0336
ux+3      .0224    -.1456   .5824    .5824    -.0416
ux+4      .0144    -.0896   .3024    .8064    -.0336
──────────────────────────────────
Table 6: Table for obtaining single year interval values for the first section when three
single year values are given at ten year intervals.
──────────────────────────────────
          ux       ux+10    ux+20
──────────────────────────────────
ux+1      .855     .190     -.045
ux+2      .720     .360     -.080
ux+3      .595     .510     -.105
ux+4      .480     .640     -.120
ux+5      .375     .750     -.125
ux+6      .280     .840     -.120
ux+7      .195     .910     -.105
ux+8      .120     .960     -.080
ux+9      .055     .990     -.045
──────────────────────────────────
Table 7: Table for obtaining single year interval values for the second section when three
single year values are given at ten year intervals.
──────────────────────────────────
          ux-10    ux       ux+10
──────────────────────────────────
ux+1      -.045    .990     .055
ux+2      -.080    .960     .120
ux+3      -.105    .910     .195
ux+4      -.120    .840     .280
ux+5      -.125    .750     .375
ux+6      -.120    .640     .480
ux+7      -.105    .510     .595
ux+8      -.080    .360     .720
ux+9      -.045    .190     .855
──────────────────────────────────
Table 8: Table for obtaining single year interval values for the third section when five
single year values are given at ten year intervals.
──────────────────────────────────
          ux-20    ux-10    ux       ux+10    ux+20
──────────────────────────────────
ux+1      .0078    -.0597   .9873    .0733    -.0087
ux+2      .0144    -.1056   .9504    .1584    -.0176
ux+3      .0193    -.1367   .8893    .2543    -.0262
ux+4      .0224    -.1536   .8064    .3584    -.0336
ux+5      .0234    -.1561   .7029    .4689    -.0391
ux+6      .0224    -.1456   .5824    .5824    -.0416
ux+7      .0193    -.1227   .4473    .6963    -.0402
ux+8      .0144    -.0896   .3024    .8064    -.0336
ux+9      .0078    -.0477   .1513    .9093    -.0207
──────────────────────────────────
Table 9: Table for obtaining single year interval values for the first section when four
single year values are given at ten year intervals.
──────────────────────────────────
          ux       ux+10    ux+20    ux+30
──────────────────────────────────
ux+1      .8265    .2755    -.1305   .0285
ux+2      .6720    .5040    -.2240   .0480
ux+3      .5355    .6885    -.2835   .0595
ux+4      .4160    .8320    -.3120   .0640
ux+5      .3125    .9375    -.3125   .0625
ux+6      .2240    1.0080   -.2880   .0560
ux+7      .1495    1.0465   -.2415   .0455
ux+8      .0880    1.0560   -.1760   .0320
ux+9      .0385    1.0395   -.0945   .0165
──────────────────────────────────
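The point-value multipliers of Tables 1 to 10 can be regenerated with a few lines of Python. The sketch below relies on the fact that Newton's forward formula carried to the last available difference is the polynomial through the given points, so the multiplier of each point equals its Lagrange weight. The function name and the position convention (given values at 0, 1, 2, ... in units of the tabular interval) are assumptions made for this illustration.

```python
def point_multipliers(n_points, z):
    """Weights on n equally spaced values (at positions 0..n-1) for the value at z."""
    weights = []
    for j in range(n_points):
        w = 1.0
        for m in range(n_points):
            if m != j:
                w *= (z - m) / (j - m)   # Lagrange factor for node m
        weights.append(w)
    return weights


# Table 1: three values at five year intervals, first section (z = 1/5, ..., 4/5).
for k in range(1, 5):
    print(f"ux+{k}", [round(w, 4) for w in point_multipliers(3, k / 5)])

# Table 2: same three values, second section (z = 6/5, ..., 9/5).
for k in range(1, 5):
    print(f"ux+{k}", [round(w, 4) for w in point_multipliers(3, 1 + k / 5)])
```

Changing the first argument to 4 or 5 and the step to one tenth gives the multipliers for four or five values at ten year intervals. In every row the multipliers add up to one, which is a quick check on a printed table.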
Table 10: Table for obtaining single year interval values for the first section when five
single year values are given at ten year intervals.
──────────────────────────────────
          ux       ux+10    ux+20    ux+30    ux+40
──────────────────────────────────
ux+1      .8058    .3582    -.2545   .1112    -.0207
ux+2      .6384    .6384    -.4256   .1824    -.0336
ux+3      .4953    .8492    -.5245   .2202    -.0402
ux+4      .3744    .9984    -.5616   .2304    -.0416
ux+5      .2734    1.0938   -.5469   .2188    -.0391
ux+6      .1904    1.1424   -.4896   .1904    -.0336
ux+7      .1233    1.1512   -.3985   .1502    -.0262
ux+8      .0704    1.1264   -.2816   .1024    -.0176
ux+9      .0298    1.0742   -.1465   .0512    -.0087
──────────────────────────────────
The following 12 tables may be used for data given in 5 or 10-year age groups to obtain
single year values. The tables are based on Newton's Forward Difference Formula.
Table 14: Multipliers for obtaining single year values for the first group when three
10-year group values are given.
──────────────────────────────────
          Wx       Wx+10    Wx+20
──────────────────────────────────
Px        .1735    -.1020   .0285
Px+1      .1545    -.0740   .0195
Px+2      .1365    -.0480   .0115
Px+3      .1195    -.0240   .0045
Px+4      .1035    -.0020   -.0015
Px+5      .0885    .0180    -.0065
Px+6      .0745    .0360    -.0105
Px+7      .0615    .0520    -.0135
Px+8      .0495    .0660    -.0155
Px+9      .0385    .0780    -.0165
──────────────────────────────────
Table 15: Multipliers for obtaining single year values for the middle group when three
10-year group values are given.
Table 18: Multipliers for obtaining single year values for the second group when four
10-year group values are given.
──────────────────────────────────
          Wx-10    Wx       Wx+10    Wx+20
──────────────────────────────────
Px        .0207    .1115    -.0400   .0078
Px+1      .0129    .1157    -.0352   .0066
Px+2      .0066    .1168    -.0283   .0049
Px+3      .0014    .1152    -.0197   .0031
Px+4      -.0025   .1111    -.0096   .0010
Px+5      -.0055   .1049    .0016    -.0010
Px+6      -.0074   .0968    .0137    -.0031
Px+7      -.0086   .0872    .0263    -.0049
Px+8      -.0089   .0763    .0392    -.0066
Px+9      -.0087   .0645    .0520    -.0078
──────────────────────────────────
Table 19: Multipliers for obtaining single year values for the middle group when five
10-year group values are given.
──────────────────────────────────
          Wx-20    Wx-10    Wx       Wx+10    Wx+20
──────────────────────────────────
Px        -.0045   .0388    .0842    -.0218   .0033
Px+1      -.0035   .0270    .0946    -.0211   .0030
Px+2      -.0024   .0160    .1025    -.0188   .0026
Px+3      -.0013   .0063    .1080    -.0149   .0019
Px+4      -.0001   -.0023   .1107    -.0093   .0010
Px+5      .0010    -.0093   .1107    -.0023   -.0001
Px+6      .0019    -.0149   .1080    .0063    -.0013
Px+7      .0026    -.0188   .1025    .0161    -.0024
Px+8      .0030    -.0211   .0946    .0270    -.0035
Px+9      .0033    -.0218   .0842    .0388    -.0045
──────────────────────────────────
Table 20: Multipliers for obtaining single year values for the third group when four
10-year group values are given.
──────────────────────────────────
          Wx-20    Wx-10    Wx       Wx+10
──────────────────────────────────
Px        -.0078   .0520    .0645    -.0087
Px+1      -.0066   .0392    .0763    -.0089
Px+2      -.0049   .0263    .0872    -.0086
Px+3      -.0031   .0137    .0968    -.0074
Px+4      -.0010   .0016    .1049    -.0055
Px+5      .0010    -.0096   .1111    -.0025
Px+6      .0031    -.0197   .1152    .0014
Px+7      .0049    -.0283   .1168    .0066
Px+8      .0066    -.0352   .1157    .0129
Px+9      .0078    -.0400   .1115    .0207
──────────────────────────────────
Table 21: Multipliers for obtaining single year values for the last group when four
10-year group values are given.
──────────────────────────────────
          Wx-30    Wx-20    Wx-10    Wx
──────────────────────────────────
Px        .0087    -.0425   .1040    .0298
Px+1      .0089    -.0423   .0928    .0406
Px+2      .0086    -.0392   .0777    .0529
Px+3      .0074    -.0328   .0583    .0671
Px+4      .0055    -.0229   .0344    .0830
Px+5      .0025    -.0091   .0056    .1010
Px+6      -.0014   .0088    -.0283   .1209
Px+7      -.0066   .0312    -.0677   .1431
Px+8      -.0129   .0583    -.1128   .1674
Px+9      -.0207   .0905    -.1640   .1942
──────────────────────────────────
Table 22: Multipliers for splitting ten-year group values into five-year group values when
three ten-year group values are given.
Vx = Population aged x to x+4
──────────────────────────────────
          Wx       Wx+10    Wx+20
──────────────────────────────────
Vx        .6875    -.2500   .0625
Vx+5      .3125    .2500    -.0625
Vx+10     .0625    .5000    -.0625
Vx+15     -.0625   .5000    .0625
Vx+20     -.0625   .2500    .3125
Vx+25     .0625    -.2500   .6875
──────────────────────────────────
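The group multipliers can also be regenerated numerically. The Python sketch below is an illustration under stated assumptions: group totals are cumulated from a zero starting point, the polynomial through the cumulated totals is evaluated at the two ends of the wanted sub-interval, and the difference is expressed as weights on the original groups. Function names and the position convention (in units of the group width, measured from the start of the first group) are hypothetical. With three ten-year groups it reproduces the first row of Table 14 and the first two rows of Table 22.

```python
def point_multipliers(n_points, z):
    """Lagrange weights on n equally spaced values (positions 0..n-1) at z."""
    weights = []
    for j in range(n_points):
        w = 1.0
        for m in range(n_points):
            if m != j:
                w *= (z - m) / (j - m)
        weights.append(w)
    return weights


def group_multipliers(n_groups, a, b):
    """Weights on n equal-width group totals for the count between positions a and b."""
    hi = point_multipliers(n_groups + 1, b)
    lo = point_multipliers(n_groups + 1, a)
    d = [h - l for h, l in zip(hi, lo)]       # weights on cumulated totals N0..Nn
    # Group g (0-based) is contained in every cumulated total from N(g+1) onward.
    return [sum(d[g + 1:]) for g in range(n_groups)]


# First single year of the first of three ten-year groups (Table 14, row Px).
print([round(w, 4) for w in group_multipliers(3, 0.0, 0.1)])

# First and second five-year halves of the first ten-year group (Table 22).
print([round(w, 4) for w in group_multipliers(3, 0.0, 0.5)])
print([round(w, 4) for w in group_multipliers(3, 0.5, 1.0)])
```

Because the sub-intervals of a group tile it exactly, the multipliers on that group add up to one over all its sub-intervals and the multipliers on every other group add up to zero.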
Table 23: Interpolation Coefficients Based on the Karup-King Formula
[Karup-King formula is a four-term third-difference osculatory formula. It maintains the given values. Given
point or groups must be equally spaced.]
First interval
N1.0      +1.000   .000     .000     .000
N1.2      +.656    +.552    -.272    +.064
N1.4      +.408    +.856    -.336    +.072
N1.6      +.232    +.984    -.264    +.048
N1.8      +.104    +1.008   -.128    +.016
Middle interval
N2.0      .000     +1.000   .000     .000
N2.2      -.064    +.912    +.168    -.016
N2.4      -.072    +.696    +.424    -.048
N2.6      -.048    +.424    +.696    -.072
N2.8      -.016    +.168    +.912    -.064
Last interval
Table 23 (Cont.): Interpolation Coefficients Based on the Karup-King Formula
G1 G2 G3
First panel
Middle Panel
Last Panel
Table 24: Interpolation Coefficients Based on the Sprague Formula
First interval
Next-to-first interval
Middle interval
Next-to-last interval
Last interval
Table 24 (Cont.): Interpolation Coefficients Based on the Sprague Formula
G1 G2 G3 G4 G5
First panel
Next-to-first Panel
Middle Panel
Next-to-Last Panel
Last Panel
Table 25: Interpolation Coefficients Based on the Beers "Ordinary" Formula.
The Beers "ordinary" formula is a six-term formula, which minimizes the fifth differences of the interpolated
results. It maintains the given values. Given points or groups must be equally spaced.
First interval
Next-to-first interval
Middle interval
Next-to-last interval
Last interval
Source: Henry S. Beers, "Discussion of papers presented in the Record Number 68; ......
Table 25 (Cont.): Interpolation Coefficients Based on the Beers "Ordinary" Formula.
G1 G2 G3 G4 G5
First panel
Next-to-first Panel
Middle Panel
Next-to-Last Panel
Last Panel
Table 26: Interpolation Coefficients Based on the Beers "Modified" Formula.
(The Beers "Modified" formula is a six-term formula which minimizes the fourth differences of the interpolated
results. This formula combines interpolation with some smoothing or graduation of given values; end panels
maintain the given values. Given data must be equally spaced.)
First interval
Next-to-first interval
Middle interval
Next-to-last interval
Last interval
Table 26 (cont.): Interpolation Coefficients Based on the Beers "Modified" Formula.
G1 G2 G3 G4 G5
First panel
Next-to-first-panel
Middle panel
Next-to-last panel
Last panel
First fifth of G5 -.0012 +.0054 -.0410 +.1506 +.0862
Second fifth of G5 -.0011 +.0059 -.0351 +.0969 +.1334
Third fifth of G5 -.0005 +.0032 -.0146 +.0216 +.1903
Fourth fifth of G5 +.0006 -.0027 +.0205 -.0753 +.2569
Last fifth of G5 +.0022 -.0118 +.0702 -.1938 +.3332
Source: Henry S. Beers, "Modified-Interpolation Formulas that Minimize Fourth Differences,” The Record of the
American Institute of Actuaries, 24, Part 1(69):19-20, June 1945.
Table 27: Interpolation Coefficients Based on Grabill's
Weighted Moving Average of Sprague Coefficients
(See text for derivation. Used for drastic smoothing. Given groups must be equally spaced)
G1 G2 G3 G4 G5
First fifth of G3 +.0111 +.0816 +.0826 +.0256 -.0009
Second fifth of G3 +.0049 +.0673 +.0903 +.0377 -.0002
Third fifth of G3 +.0015 +.0519 +.0932 +.0519 +.0015
Fourth fifth of G3 -.0002 +.0377 +.0903 +.0673 +.0049
Last fifth of G3 -.0009 +.0256 +.0826 +.0816 +.0111
Suggested Readings
1. Ayres, Frank, Jr. (1983), Theory and Problems of Matrices, Schaum's Outline Series,
Singapore.
2. Ayres, F., Jr. (1986), Theory and Problems of Matrices, Schaum's Outline Series, McGraw-Hill
Book Company, Singapore.
4. Fuller, Gordon (1971), Algebra and Trigonometry, Chapter 15 (pp. 304-317), Chapter 16,
McGraw-Hill Book Company, New York, U.S.A.
6. Jain, S.K. (1979), Basic Mathematics for Demographers, The Australian National
University, Canberra.
7. Kruglak, H. and Moore, J.T. (1973), Schaum's Outline of Theory and Problems of Basic
Mathematics, McGraw-Hill Book Company.
9. Ross, M.R. (1946), Differential and Integral Calculus, McGraw-Hill Book Company,
New York (Chapters I, II, III, XIV).
Notations and Symbols
∈ Belongs to
Δ Delta
│ │ Determinants
d/dx Differential
e Exponential
∫ Integral
lim Limit
[ ] Matrix
│ │ Modulus Value
π Pi
r Radius
√ Square root
! Factorial
ln Natural logarithm
Capacity Building for a Better Future