Statistics Made Easy Volume 1 Descriptive Statistics by Pritish Ranjan Gayali

PREVIEW COPY, NOT FOR SALE OR REPRINT
Support Independent Authors & Publishers like me.

Do not print this book in any form, buy it instead from www.gayali.in
OR ecommerce sites like Amazon / Flipkart or Popular bookstores nearby
STATISTICS MADE EASY

( Volume I )
Descriptive Statistics
( Intended for B.Sc. ( Eco ) , B.Com. ,
B.B.A. , M.B.A. , etc )
P.R. Gayali
M.A. ( Economics ) , M.B.A. ( Finance )
ISBN: 978-93-5636-587-2
Year of Publication: 2022
Country of Publication: India
Published By: Pritish Ranjan Gayali

Dedicated To
My Father, Mother, Wife

Son & Daughter

www.gayali.in
CONTENTS
[1] COLLECTION OF DATA, CLASSIFICATION & TABULATION
[2] CHARTS AND DIAGRAMS
[3] FREQUENCY DISTRIBUTION
[4] MEASURES OF CENTRAL TENDENCY
[5] MEASURES OF DISPERSION
[6] MOMENTS, SKEWNESS & KURTOSIS
[7] CURVE FITTING AND METHODS OF LEAST SQUARES
[8] TIME SERIES
[9] CORRELATION AND REGRESSION
[10] INTERPOLATION
[11] INDEX NUMBER

PREFACE
A large number of my teachers and students have long been insisting that I should
write a book on Statistics in a lucid manner which can be useful for students of
H.S., B.Sc., B.Com., BBA, MBA, etc. With their suggestions, along with so much
goodwill and good wishes, I have ventured to bring out this volume.
Distinguishing Features:
i. This book covers the entire course of B.Sc. (Economics), B.B.A., B.Com.,
M.B.A., etc.
ii. The book elaborates various topics like
Correlations, Regression, Time Series, Index
Numbers etc.
iii. The subject matter has been developed in a
very clear, concise, lucid and intelligible
manner without sacrificing the basic ideas.
Starting with elementary concepts,
complexities and the intricacies of the
advanced concepts have been explained in
a very lucid manner.

iv. It expounds various statistical procedures, duly elucidating all the key topics
through more than 300 solved problems, which have been taken from examination
papers of B.Sc. (Economics) & B.Com.
I express my deep sense of gratitude to my son Anuran Gayali for his interest, untiring
efforts and cooperation in bringing out the book in such an elegant form.
Suggestions and criticism for further improvement of the book, as well as intimation
of errors and serious misprints, will be most gratefully received and duly acknowledged.
Please feel free to write to me at pritish@gayali.in with your queries and suggestions.
EARNEST REQUEST: Support independent Authors & Publishers like me. Please do not
reprint this book in any form; buy it instead from e-commerce sites or visit
www.gayali.in
It is not legal to reproduce a copy of this book.

COLLECTION OF DATA, CLASSIFICATION AND TABULATION
What is Statistics
Statistics, as a plural noun, is used to mean numerical data arising in any
sphere of human experience; to be precise, numerical data which arise from a host of
uncontrolled and mostly unknown causes acting together. It is in this sense that the
term is used when our daily newspapers give vital statistics, crime statistics or statistics of
rainfall, temperature, accidents, etc. Used as a singular noun, statistics
is the name for the body of scientific methods which are meant for the collection, analysis
and interpretation of numerical data.
Primary and Secondary data
The data may be of two broad types :– primary and secondary. The ordinary user
of economic and social statistics will find that the data have been already collected by
some other agency – government or private. These may exist either in a published or an
unpublished form. His job will then be simply to have access to the source and get hold
of the data. Such data will be called secondary data. Government departments collect
data on diverse topics that touch the life of the people as a matter of routine and as an
essential basis of administration. Private agencies like banks and industrial concerns
regularly compile figures on their assets and liabilities, number of employees, income of
employees, etc. The enquirer may get his material readymade from such agencies or he
may get the data in rough form and adapt them to his needs. In some cases, the enquirer
will find that the relevant data have been collected by some research organization as
part of an investigation similar to his own.
In making use of secondary data, the enquirer has to be particularly careful about
the nature of the data and their coverage, the definitions on which they are based and
their degree of reliability. He may find that the available data are more extensive
than is required for the purpose of his enquiry. In such a case, he will naturally discard
the part of the data that is redundant. Sometimes he may as well find that the available
information is inadequate for the purpose of his enquiry. He will then have to decide
whether to collect his own data, either to base his enquiry solely on them or to plug
lacunae in the secondary data.
Data collected primarily for the purpose of the given enquiry are called primary
data. These are collected by the enquirer, either on his own or through some agency set
up for this purpose, directly from the field of enquiry. It goes without saying that this
type of data may be used with greater confidence, because the enquirer will himself
decide upon the coverage of the data and the definitions to be used and, as such, will
have a measure of control on the reliability of the data.
Various Methods of Collecting Primary Data

The following methods are generally used for collection of primary data :–
(a) Direct personal observation
(b) Indirect oral investigation
(c) Questionnaire sent by mail
(d) Schedules sent through investigators

In “direct personal observation”, the investigator collects the requisite information
personally through observation or by measurement. For example, in order to study the
conditions of students residing in Calcutta hostels, the investigator meets the students
in their hostels and collects necessary data after a personal study. The method is
time-consuming and costly, but yields very accurate results. It is therefore suitable for
studies in which the field of enquiry is small.
In “indirect oral investigation”, data are collected through indirect sources.
Persons who are likely to have information about the problem are interrogated and on
the basis of their answers, factual data have to be compiled. Most of the commissions of
enquiry or committees appointed by Government collect primary data by this method.
The accuracy of the method depends largely upon the type of persons interviewed and
hence these persons have to be selected very carefully.
In the “mailed questionnaire method”, the most important instrument is the
questionnaire. This contains a set of questions, relevant to the subject of enquiry,
the answers to which are expected to yield the requisite information. Printed questionnaires
are sent by mail to a selected list of persons, with the request to return them duly
filled in. Supplementary instructions regarding the definitions of terms used and
methods of filling in the forms should also accompany the questionnaire. The method is
cheap and expeditious, and a large area can be covered at limited cost. The two principal
disadvantages of the method are the low degree of reliability of the collected data and a
large number of non-respondents.
“Schedules sent through investigators” is the most widely used method of
collection of primary data. Here, paid investigators are employed for data collection.
The investigators carry with them printed “schedules” specially designed for the
purpose, interview people concerned, and fill up the schedules on the spot, based
on answers received from the informant. The method is very popular and yields
satisfactory results. Most of the accuracy of the collected data however depends on
the ability and tactfulness of investigators, who are given special training as to how
they should elicit the correct information through friendly discussions. The method is
adopted during the decennial census of population in this country.
Classification of Data
Classification is the process of arranging data collected under different
categories.
Types of classification
Broadly, there are four types of classifications:-
1. On qualitative basis – Classification of the total population according to sex,
religion, occupation, etc.
2. On quantitative basis – Classification of the total population according to age, or of
industries according to the number of persons employed, etc. are included in this type.
3. On time basis – Some statistical data are arranged in order of their time of
occurrence. Production of crops may be shown monthly, quarterly or yearly.
4. On geographical basis – The total population of a country may be classified by
states or districts, or industrial production may be classified by different regions, states,
districts, etc.

Tabulation
Tabulation may be defined as the logical and systematic organization of
statistical data in rows and columns, designed to simplify the presentation and facilitate
comparison.
Different parts of a table
(i) Title – This is a brief description of the contents and is shown at the top of the
table.
(ii) Stub – The extreme left part of the table where descriptions of rows are shown is
called stub.
(iii) Caption and Box-head – The upper part of the table which shows the description
of columns and sub-columns is called caption. The whole upper part including caption,
unit of measurement and column numbers, if any, is called the box-head.
(iv) Body – It is that part of the table which shows the figures.
Table 1.1 – Different parts of a table
<--------------------------------------- TITLE ------------------------------------->
            |  CAPTION                                          }
            |  (1)    (2)    (3)    (4)    (5)    (6)           }  BOX-HEAD
   STUB     |  <--------------------- BODY -------------------->
Source :..............
Footnote :..........
(v) Footnote – This is the part below the Body where the source of data and
explanation are shown.
Problem 1 – Draw up a blank table in which could be shown the number of
persons employed in six industries on two different dates distinguishing males from
females and among the latter, singles, married and widows.
[I.C.W.A. Jan'1973]
Solution :
Table 1.2 : Number of persons employed in six industries
            As on 01.01.2016                        As on 01.01.2017
Industry    Male   Female                           Male   Female
                   Single   Married   Widow                Single   Married   Widow
(1)         (2)    (3)      (4)       (5)           (6)    (7)      (8)       (9)
A
B
C
D
E
F
Source : Industrial Statistics
Footnote : Data are in lakhs

Problem 2 – Prepare a blank table showing the distribution of population
according to sex and six religions in five age groups in seven different cities.
[C.U.B.A. (ECON)]
Table 1.3 : Distribution of Population : City A/B/C/D/E/F/G
            Male (age in years)                                  Female (age in years)
Religion    0 – 10   11 – 19   20 – 40   41 – 60   61 & Above    0 – 10   11 – 19   20 – 40   41 – 60   61 & Above
(1)         (2)      (3)       (4)       (5)       (6)           (7)      (8)       (9)       (10)      (11)
Hindu
Badya
Jain
Sikh
Muslim
Christian
Problem 3 – Draw up a blank table showing the Exports and Imports during the
years 1960, 1961, 1962, 1963 and 1964 relating to the ports Bombay, Calcutta, Madras
and other ports. The table should provide for the values and the balance of trade and
the totals for each year.
[C. A. ‘63]
Solution :
Table 1.4 : Value of Exports and Imports and balance of trade during 1960 to
1964 for Bombay, Calcutta, Madras and other Ports
(Value in crores INR)
Items                        1960   1961   1962   1963   1964
(1)                          (2)    (3)    (4)    (5)    (6)
Exports From
   Mumbai
   Kolkata
   Chennai
   Others
Total of Exports (A)
Imports From
   Mumbai
   Kolkata
   Chennai
   Others
Total of Imports (B)
Balance of Trade (A – B)

Problem 4 – You are given data on exports (both quantity and value) of Indian
jute to U.K., U.S.A., Russia, Japan and Canada for 5 consecutive years. Suggest a suitable
tabular representation by drawing a blank table.
Solution –
Table 1.5 : Exports of Indian Jute to Different countries during 1990 to 1994
                       1990               1991               1992               1993               1994
Items                  Quantity  Value    Quantity  Value    Quantity  Value    Quantity  Value    Quantity  Value
Exports to
   U.K.
   U.S.A.
   Russia
   Japan
   Canada
Total of Exports (A)
CHARTS AND DIAGRAMS
The common types of charts and diagrams are:-

1. Line diagram or Graph
2. Bar Diagram
3. Pie diagram
4. Pictogram
5. Histogram, Frequency Polygon and Ogive
[1] Line Diagram (or Graph)
The line diagram shows by means of a curve or a straight line, the relationship
between two variables. Two straight lines, one horizontal and the other vertical (known
as X- axis and Y-axis respectively), are drawn on a graph paper, which intersect at a
point called origin. The given data are represented as points on the graph paper. The
consecutive points thus obtained are joined by pieces of straight lines, giving the line
diagram.
Two types of line diagrams are used :–
1) Natural scale and

2) Ratio Scale
Equal distances represent equal amounts of change in the natural scale but equal
ratios of change in ratio scale.
Distinction between the natural scale and the logarithmic scale used in graphical
presentation of data
In the usual type of graph paper, all rulings are equally spaced both
horizontally and vertically. These are known as natural scale or arithmetic scale graph
papers. There is a special type of graph paper in which the distances of rulings from
the initial line are proportional to the logarithms of numbers, and hence the distances
between consecutive rulings are not equal. Such a scale is known as logarithmic scale
or ratio scale.
The natural scale is used for showing absolute amount of change.
Example:
Table 2.1 : Cheque clearance (crores of Rs)
Year    Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
1958    832   765   873   792   791   663   834   754   806   799   773   887
1959    894   828   946   923   849   –     –     –     –     –     –     –
In Figure 1.5, an increase of 50 (Rs. crores), whether from 650 to 700 or
from 900 to 950, is represented by the same distance in the vertical direction.
Fig. 1.5 : Line diagram showing cheque clearance (vertical axis: cheque clearance in
crores of Rs, 650 to 1000; horizontal axis: months, Jan 1958 to May 1959)

Figure :– Comparison of Natural Scale and Ratio Scale.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
NATURAL SCALE
1 2 3 4 5 6 7 8 9 10 20 30 40 50 60 70 80 90 100
RATIO SCALE
However, in terms of percentage changes, these do not have the same
significance. In the former case, the increase is (50/650) × 100 = 7.7%, whereas in the
latter case, it is (50/900) × 100 = 5.6%. Such discrepancies in percentage changes can be
readily detected when the data are shown on the ratio scale. Equal distances represent equal absolute
changes in the natural scale, but equal percentage changes in the ratio scale. A series
of observations having a common difference i.e. in arithmetic progression, shows
a straight line on the natural scale; but a straight line is obtained on the ratio scale
only when the consecutive observations have a common ratio, i.e. are in geometric
progression.
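The arithmetic in this paragraph can be checked with a short Python sketch. The two 50-crore increases come from the text; the two five-term series are hypothetical, chosen only to illustrate a common difference and a common ratio, and the helper names are assumptions made for this illustration.

```python
import math

def percent_change(old, new):
    """Percentage change in going from old to new."""
    return (new - old) / old * 100

# The two equal absolute increases of 50 discussed in the text.
low = percent_change(650, 700)
high = percent_change(900, 950)
print(round(low, 1), round(high, 1))

# Hypothetical series, for illustration only.
arithmetic = [100, 150, 200, 250, 300]        # common difference 50
geometric = [100, 150, 225, 337.5, 506.25]    # common ratio 1.5

def steps(values):
    """Differences between consecutive values (natural scale)."""
    return [round(b - a, 4) for a, b in zip(values, values[1:])]

def log_steps(values):
    """Differences between consecutive logarithms (ratio scale)."""
    logs = [math.log10(v) for v in values]
    return [round(b - a, 4) for a, b in zip(logs, logs[1:])]

print(steps(arithmetic))      # equal steps: straight line on the natural scale
print(log_steps(geometric))   # equal steps: straight line on the ratio scale
```

Equal steps in the values correspond to a straight line on the natural scale, while equal steps in the logarithms correspond to a straight line on the ratio scale.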
In the natural scale graph, the base line of the vertical scale must show zero. But
since the ratio scale shows proportionate changes, there is no zero point. The base line
must show either 1 or 10 or 100 or 1000 etc.
Some graph papers have a ratio scale in the vertical direction, but natural scale
in the horizontal direction. These are known as semi-logarithmic graph papers.
Semi-logarithmic Graph
Semi-logarithmic graph or Ratio Chart is a line diagram drawn on a special
type of graph paper which shows the natural scale in the horizontal direction and
the logarithmic or ratio scale in the vertical direction. In the semi-logarithmic graph
paper, the vertical rulings are equispaced, but the horizontal rulings are not, their
distances from the base line being proportional to the logarithms of the numbers
represented. If a semi-logarithmic graph paper is not available, the ratio chart may
be drawn on a natural scale graph paper by plotting the logarithms of values of the
dependent variable y against the corresponding values of the independent variable x.
[2] Bar Diagram
Bar diagram consists of a group of equispaced rectangular bars, one for each
category (or class) of given statistical data. The bars, starting from a common base line,
must be of equal width and their lengths represent the values of statistical data.
There are two types of bar diagrams – Vertical Bar Diagram and Horizontal Bar
Diagram. Vertical bars are used to represent time series data or data classified by the
values of variable. Horizontal bars are used to depict data classified by attributes only.
For each of these types, we have again grouped bar diagram, sub-divided (or
component) bar diagram, paired bar diagram, etc.
[3] Pie Diagram
Pie diagram is a circle whose area is divided proportionately among the different
components by straight lines drawn from the centre to the circumference of the circle.
When statistical data are given for a number of categories, and we are interested in the
comparison of the various categories or between a part and the whole, such a diagram
is very helpful in effectively displaying the data.
For drawing a pie diagram, it is necessary to express the value of each category
as a percentage of the total. Since the full angle of 360° around the centre of the circle
represents the whole, i.e. 100%, the percentage figure of each component is multiplied by
3.6 degrees to find the angle of the corresponding sector at the centre of the circle.
[4] Pictogram
Pictogram consists of rows of picture symbols of equal size. Each symbol
represents a definite numerical value. If a fraction of this value occurs, then the
proportionate part of the picture from the left is shown. Pictograms are used for
representing time series data, one row of pictures for each time period. They may also be
used for displaying statistical data classified by attributes.
[5] Histogram, Frequency Polygon and Ogive
These diagrams are used to depict statistical data given in the form of frequency
distributions.
Exercise–2
[1] Represent the following statistical information graphically :
Year 1924 1925 1926 1927 1928 1929 1930
Monthly Average Production 609 522 205 608 551 632 516
[C.U. B.com.(Hons.)'65]
Solution :
Figure – Line Diagram showing Monthly Average Production (vertical axis: monthly
average production, 200 to 650; horizontal axis: year, 1924 to 1930)

[2] Plot the following data relating to population of India so as to indicate the
proportionate increase in population from one period to another :–
Year 1872 1881 1891 1901 1911 1921 1931 1941
Population (in millions) 210 250 290 295 315 320 350 390
[C.U. , B.A. (Econ) ‘62]
Solution:
We draw the semi logarithmic graphs on an ordinary (arithmetic scale) graph
paper. For this purpose, the logarithms of population data should be plotted against
the corresponding years.
Table – Calculation of logarithms
Year    Population (in millions), y    Log y (from log table)    Log y (rounded to 2 decimals)
1872    210                            2.3222                    2.32
1881    250                            2.3979                    2.40
1891    290                            2.4624                    2.46
1901    295                            2.4698                    2.47
1911    315                            2.4983                    2.50
1921    320                            2.5051                    2.51
1931    350                            2.5441                    2.54
1941    390                            2.5911                    2.59
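The rounded column of this table can be reproduced in a few lines of Python. This is only a sketch: it uses math.log10 in place of a printed log table, and the variable names are assumptions made for the illustration.

```python
import math

years = [1872, 1881, 1891, 1901, 1911, 1921, 1931, 1941]
population = [210, 250, 290, 295, 315, 320, 350, 390]   # in millions

# Common logarithm of each population figure, rounded to 2 decimals.
log_y = [round(math.log10(p), 2) for p in population]

for year, p, ly in zip(years, population, log_y):
    print(year, p, ly)
```

Plotting log_y against years on ordinary graph paper gives the ratio chart asked for in the problem.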
Figure – Semi-logarithmic graph (or Ratio Chart), drawn on ordinary graph paper
(vertical axis: log y, 2 to 2.6; horizontal axis: year, 1872 to 1941)

[3] Represent the information contained in the following table in a component part chart.
Commodity pattern of India’s exports (percentage)
1956 – ’57 1957 – ‘58 1958 – ’59
Capital goods 0.29 0.31 0.30
Intermediate goods 45.82 46.87 44.19
Consumer goods 50.50 47.32 48.19
Unclassified 3.39 5.50 7.32
Total 100.00 100.00 100.00
[C. U. B. Com (Hons) ‘67]
Solution: Figure – Component bar chart showing Indian Exports during
1956-57, 1957-58, 1958-59
[4] The following table shows the values of a variable y corresponding to some given
equidistant values of the independent variable x :–
x 7 8 9 10 11 12
y 132 214 330 486 688 942
Draw a semi-logarithmic chart and find by graphical interpolation the value of
y, when x = 10.5
[ I.C.W.A. ‘71]
Solution :
Table – Calculation of logarithms for Ratio Chart
x     y      Log x            Log y
7     132    0.8451 = 0.85    2.1206 = 2.12
8     214    0.9031 = 0.90    2.3304 = 2.33
9     330    0.9542 = 0.95    2.5185 = 2.52
10    486    1.0000 = 1.00    2.6866 = 2.69
11    688    1.0414 = 1.04    2.8376 = 2.84
12    942    1.0792 = 1.08    2.9741 = 2.97

Figure – Semi-logarithmic graph (vertical axis: log y, 2 to 3; horizontal axis: x, 6 to 12;
the reading log y = 2.75 is marked at x = 10.5)
Ans. Value of log y = 2.75 when x = 10.5
∴ y = Antilog of 2.75
= 562.3
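As a rough numerical cross-check of the graphical reading, the sketch below interpolates log y along a straight line between the two tabulated points on either side of x = 10.5. This is a swapped-in technique, not the graphical method used in the solution, and the function name is hypothetical.

```python
import math

x = [7, 8, 9, 10, 11, 12]
y = [132, 214, 330, 486, 688, 942]

def interpolate_log_linear(x_new, xs, ys):
    """Straight-line interpolation of log10(y) between neighbouring points."""
    pairs = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x0 <= x_new <= x1:
            t = (x_new - x0) / (x1 - x0)
            log_y = math.log10(y0) + t * (math.log10(y1) - math.log10(y0))
            return log_y, 10 ** log_y   # 10 ** log_y is the antilog
    raise ValueError("x_new lies outside the tabulated range")

log_y, y_est = interpolate_log_linear(10.5, x, y)
print(round(log_y, 2), round(y_est, 1))
```

The straight-line estimate of log y comes out at about 2.76 (y of roughly 578), a little above the 2.75 read from the graph. A reading taken by eye from a hand-drawn chart is only approximate, so a gap of this size is to be expected.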
[5] The following table shows the number of bushels of wheat and corn produced in
a farm during the years 1950 to 1960.
Express the yearly number of bushels of wheat and corn as percentages of total
annual production. Graph the percentages by component bar charts.
Year 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
No. of Bushels of wheat 200 185 225 250 240 195 210 225 250 230 235
No. of bushels of Corn 75 90 100 85 80 100 110 105 95 110 100
[Dip. Soc. Welfare, ’68]
Solution :
Year 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
No. of Bushels of wheat 200 185 225 250 240 195 210 225 250 230 235
No. of bushels of Corn 75 90 100 85 80 100 110 105 95 110 100
Totals 275 275 325 335 320 295 320 330 345 340 335
Figure – Component bar chart showing no. of bushels of wheat and corn (1950 – 1960)
Percentages of total annual production shown on the bars:
Year         1950    1951    1952    1953    1954    1955    1956    1957    1958    1959    1960
Wheat (%)    72.73   67.27   69.23   74.63   75.00   66.10   65.63   68.18   72.46   67.65   70.15
Corn (%)     27.27   32.73   30.77   25.37   25.00   33.90   34.37   31.82   27.54   32.35   29.85
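The percentage shares needed for the component bar chart follow directly from the yearly totals. A minimal Python sketch of the calculation, using the data given in the problem (the variable names are assumptions made for the illustration):

```python
years = list(range(1950, 1961))
wheat = [200, 185, 225, 250, 240, 195, 210, 225, 250, 230, 235]
corn = [75, 90, 100, 85, 80, 100, 110, 105, 95, 110, 100]

# Share of each crop in the total annual production, in percent.
shares = {}
for year, w, c in zip(years, wheat, corn):
    total = w + c
    shares[year] = (round(w / total * 100, 2), round(c / total * 100, 2))

for year, (wheat_pct, corn_pct) in shares.items():
    print(year, wheat_pct, corn_pct)
```

Each pair of percentages gives the heights of the two components of that year's bar.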

[6] Of the life insurance policy dividends paid in the United States, 21% were taken
in cash, 31% were used to pay premiums, 18% were used to purchase additional paid-up
life insurance, and 30% were left with life insurance companies to earn interest. Construct
a pie diagram showing these different uses of policy dividends.
[C.U. M Com. ‘62]
Solution:
Table – Calculations for pie chart
Mode of payment or distribution                  Percent     Angle (degrees) at the centre    Cumulative
of dividend                                      of total    of pie chart, col. (2) × 3.6     total
(1)                                              (2)         (3)                              (4)
Cash                                             21          75.6                             75.6
Premiums                                         31          111.6                            187.2
Purchase of additional paid-up life insurance    18          64.8                             252.0
Left with life insurance companies               30          108.0                            360.0
Total                                            100         360.0
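The angle and cumulative columns of this table can be reproduced with a short Python sketch. The percentages are those of the table; the dictionary keys are shortened labels chosen for the illustration.

```python
from itertools import accumulate

# Percent of total dividends under each head, as in column (2).
uses = {
    "Cash": 21,
    "Premiums": 31,
    "Additional paid-up insurance": 18,
    "Left with companies": 30,
}

# Each 1% of the whole corresponds to 3.6 degrees at the centre.
angles = {name: round(pct * 3.6, 1) for name, pct in uses.items()}

# Running total of the angles; the last value should be 360.
cumulative = [round(a, 1) for a in accumulate(angles.values())]

for (name, angle), running in zip(angles.items(), cumulative):
    print(name, angle, running)
```

The cumulative angles are convenient when marking off the sectors one after another with a protractor.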
Fig. Pie diagram showing Life Insurance Policy dividend distributed to different heads.
[7] A summary of the estimated receipts and expenditures of Government of India
for a particular year is given below :
Receipts                            Amount (millions    Expenditure                   Amount (millions
                                    of rupees)                                        of rupees)
Direct Taxes on Income              2076.0              Interest on Public Debt       1143.3
Customs Duties                      1680.0              Education and Health          272.6
Other Taxes                         2955.6              National Defence              2774.5
Revenue from public undertakings    1660.7              Transfers to States           3755.3
Other Receipts                      1189.1              Other current Expenditures    1754.4
                                                        Capital Expenditures          5702.2
                                                        Other loans and advances      759.8
Total Receipts                      9561.4              Total Expenditure             16162.1

Draw suitable diagrams to show the relative importance of different heads of
income and expenditure.
[C.U. B.A (Econ) ‘73]
Solution:
Receipts                            Amount (millions of rupees)    Percentage of Total    Cumulative Total
Direct Taxes on Income              2076.0                         21.71                  21.71
Customs Duties                      1680.0                         17.57                  39.28
Other Taxes                         2955.6                         30.91                  70.19
Revenue from public undertakings    1660.7                         17.37                  87.56
Other Receipts                      1189.1                         12.44                  100.00
Total Receipts                      9561.4                         100.00

Expenditure                         Amount (millions of rupees)    Percentage of Total    Cumulative Total
Interest on Public Debt             1143.3                         7.07                   7.07
Education and Health                272.6                          1.69                   8.76
National Defence                    2774.5                         17.17                  25.93
Transfers to States                 3755.3                         23.23                  49.16
Other current Expenditures          1754.4                         10.86                  60.02
Capital Expenditures                5702.2                         35.28                  95.30
Other loans and advances            759.8                          4.70                   100.00
Total Expenditure                   16162.1                        100.00
Figure: Sub-divided bar chart showing receipts and expenditures of India for a
particular year.
[8] Use a suitable diagram to represent the following data relating to the Post and
Telegraph Department, Govt. of India
Year         Net Receipts (lakhs of Rs.)    Year         Net Receipts (lakhs of Rs.)
1955 – 56    565.32                         1961 – 62    497.56
1956 – 57    880.33                         1962 – 63    567.12
1957 – 58    645.03                         1963 – 64    963.80
1958 – 59    954.09                         1964 – 65    871.33
1959 – 60    859.09                         1965 – 66    516.17
1960 – 61    425.34                         1966 – 67    936.09

Solution :
Fig: Bar diagram showing data of Post & Telegraph Department during 1955–56
to 1966–67 (vertical axis: net receipts in lakhs of Rs.; horizontal axis: year).
[9] The actual outlay on the public sector in the First and Third Five – year plans of
India is shown below by head of development:
Head of Development                     First Plan outlay (Rs. Cr)    Third Plan outlay (Rs. Cr)
Agricultural & Community Development    290                           1096
Irrigation and Power                    583                           1927
Industries and Mining                   97                            1965
Transport and Communications            518                           2113
Social Services                         412                           1422
Miscellaneous                           60                            85
Total                                   1960                          8608
Draw suitable diagrams to show the relative importance attached to the various
heads in each plan. Hence, make a comparison between the First and Third Plan.
Solution:
Table – Calculations of sub-divided bar Chart
Head of Development                     % of total outlay: Plan I    % of total outlay: Plan III
Agricultural & Community Development    14.8                         12.7
Irrigation and Power                    29.8                         22.5
Industries and Mining                   4.9                          22.8
Transport and Communications            26.4                         24.5
Social Services                         21.0                         16.5
Miscellaneous                           3.1                          1.0
Total                                   100.0                        100.0

Figure – Sub-divided Bar chart showing Actual outlay in India’s Five Year Plans
I & III.
FREQUENCY DISTRIBUTION
Attribute and Variable
The character of statistical information collected from a group of individuals or
objects, is of two types – quantitative and qualitative. Information about the ages of a
group of men is quantitative, because age is expressed in numbers, say 29 years, 43.5
years, etc. Religion of a group of men is qualitative, because religion cannot be stated
in numerical terms, e.g., either Hindu or Buddhist or Christian, etc. The quantitative
character is technically called variable and the qualitative character is called attribute.
A variable takes different ‘values’ and these values can be measured numerically in
suitable units. An attribute cannot be measured but can only be classified under
different heads or categories.
Discrete and continuous variables
When we pass on to the study of data regarding quantitative characters, it is
immediately found that these may be of two principal types. In the first place, the
character may take only some isolated values, like the number of letters in a word,
number of petals in a flower, number of members in a family and so forth. Alternatively,
it may conceivably take any value within its range of variation. The height, weight
or age of a man, the diameter of a bobbin, the temperature, rainfall or humidity in a
region, etc. are variables of this type. Variables of the first type are called discontinuous
or discrete, while those of the second type are called continuous.

Frequency Distribution
Frequency Distribution is a statistical table which shows the values of the
variable arranged in order of magnitude, either individually or in groups, and also the
corresponding frequencies side by side. There are two types of frequency distributions :–
[a] Simple frequency distribution
[b] Grouped frequency distribution

Simple frequency distribution shows the values of the variable individually,
whereas grouped frequency distribution shows the values of the variable in groups or
intervals.
Table – Simple Frequency Distribution
Daily number of car accidents Frequency (no. of days)
3 5
4 9
5 11
6 4
7 1
Total 30
Table – Grouped Frequency Distribution
Age in years    Frequency (no. of persons)
15 – 19         37
20 – 24         81
25 – 29         43
30 – 34         24
35 – 44         9
45 – 59         6
Total           200
Useful terms associated with grouped frequency distributions:
[a] Class interval or class
[b] Class frequency and Total Frequency
[c] Class Limits – Lower class limit, upper class limit
[d] Class boundaries – Lower class boundary, upper class boundary
[e] Class mark, or Mid – Value, or Mid – point of class interval
[f] Width, or size of class interval
[g] Frequency density
We shall explain these terms with reference to table below
Class        Class        Class Limits      Class Boundaries    Class    Width       Frequency    Relative
Interval     Frequency    Lower    Upper    Lower    Upper      Mark     of Class    Density      Frequency
(1)          (2)          (3)      (4)      (5)      (6)        (7)      (8)         (9)          (10)
20 – 29      2            20       29       19.5     29.5       24.5     10          0.2          0.029
30 – 39      4            30       39       29.5     39.5       34.5     10          0.4          0.057
40 – 49      7            40       49       39.5     49.5       44.5     10          0.7          0.100
50 – 59      3            50       59       49.5     59.5       54.5     10          0.3          0.043
60 – 69      9            60       69       59.5     69.5       64.5     10          0.9          0.129
70 – 79      7            70       79       69.5     79.5       74.5     10          0.7          0.100
80 – 89      10           80       89       79.5     89.5       84.5     10          1.0          0.143
90 – 99      11           90       99       89.5     99.5       94.5     10          1.1          0.157
100 – 119    9            100      119      99.5     119.5      109.5    20          0.45         0.128
120 – 149    8            120      149      119.5    149.5      134.5    30          0.27         0.114
Total        70           -        -        -        -          -        -           -            1.000

[a] Class Interval (or Class)

When a large number of observations varying in a wide range are available,
these are usually classified in several groups according to the size of values. Each of
these groups, defined by an interval, is called class interval, or simply class. In table,
column (1), the class intervals are 20 – 29, 30 – 39, etc.
When one end of a class is not specified, the class is called an open – end class.
[b] Class frequency & Total frequency
The number of observations falling within a class is called its class frequency,
or simply frequency. The sum of all class frequencies is called total frequency. In table,
column (2), the class frequencies are 2,4,7, etc. and the total frequency is 70.
[c] Class limits
When a grouped frequency distribution is constructed from the collected
data, the values of the variable are shown in several classes. For determining the class
frequencies, it is necessary that these classes are mutually exclusive, i.e., be such that
any observation is contained in only one class; for example 20 – 29, 30 – 39, 40 – 49,
etc.
In the construction of grouped frequency distribution, the class intervals must
therefore be defined by the pairs of numbers such that the upper end of one class does
not coincide with the lower end of the immediately following class. The two numbers
used to specify the limits of a class interval for the purpose of tallying the original
observations in the various classes, are called ‘Class Limits’. The smaller of the pair is
known as Lower class limit and the larger as Upper Class Limit with reference to the
particular class.
[d] Class boundaries
When measurements are taken on a continuous variable, all data are recorded
nearest to a certain unit. Thus, if ages are recorded to the nearest whole number of
years, any age between 19.5 years and 20.5 years is recorded as 20 years. Similarly,
29 years denotes an age between 28.5 years and 29.5 years. Hence, the class interval
20 – 29 actually includes all ages between 19.5 and 29.5 years. These most extreme
values which would ever be included in a class interval are called ‘class boundaries’.
The lower extreme point is called lower class boundary, and the upper extreme point is
called upper class boundary with reference to any particular class. See Table columns
(5)&(6).
Class boundaries may be calculated from class limits by applying the following
rule:
Lower class boundary = Lower class limit – ½ d and
Upper class boundary = Upper class limit + ½ d
Where d is the common difference between the upper class limit of any class
interval and the lower class limit of the next class interval. In Table, d = 1.

[e] Class Mark (or Mid- Value or Mid – Point)

The value exactly at the middle of a class interval is called class mark or mid -
value. It lies half-way between the class limits or between class boundaries. See Table
column (7).
Class Mark = (Lower Class Limit + Upper Class Limit) / 2
= (Lower Class boundary + Upper Class boundary) / 2
[f] Width (or Size) of Class
Width or Size of a class is the difference between the lower and the upper class
boundaries. See table column (8).
Width of Class = Upper Class boundary – Lower Class boundary
In the construction of a frequency distribution, it is generally preferable to
have classes of equal width. Unequal width of classes is resorted to when some of the
observations are few and far away from the rest. Use of equal width in such cases may
result in some empty classes, i.e., classes with zero frequency.
[g] Frequency Density
Frequency density of a class is its frequency per unit of width. It shows the
concentration of frequency in a class and is given by the formula
Frequency Density = Class Frequency / Width of the Class
See Table, Column (9)

Frequency density is used in drawing histogram, when the classes are of unequal
width.
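The definitions in [d] to [g] can be checked numerically. The following is a minimal Python sketch, assuming the limits and frequencies of the last five classes of the table (total frequency 70, d = 1); the function name `describe_class` is only illustrative and not part of the text.

```python
def describe_class(lower_limit, upper_limit, frequency, d=1, total=70):
    """Return boundaries, class mark, width, frequency density and relative frequency."""
    lower_boundary = lower_limit - d / 2      # Lower class limit - (1/2) d
    upper_boundary = upper_limit + d / 2      # Upper class limit + (1/2) d
    width = upper_boundary - lower_boundary   # width is measured between boundaries
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "class_mark": (lower_limit + upper_limit) / 2,
        "width": width,
        "density": frequency / width,
        "relative_frequency": frequency / total,
    }

# (lower limit, upper limit, frequency) as printed in the table
classes = [(70, 79, 7), (80, 89, 10), (90, 99, 11), (100, 119, 9), (120, 149, 8)]
summary = [describe_class(lo, hi, f) for lo, hi, f in classes]

for (lo, hi, f), row in zip(classes, summary):
    print(lo, hi, f, row)
```

Note that for the two wider classes the frequency density differs from the frequency divided by 10, because the width is 20 and 30 respectively.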
Construction of frequency distribution
The Steps in the construction of a frequency distribution from ungrouped data
are as follows:-
[1] Find the largest and the smallest observations in the given data, and then
calculate the range, i.e., the difference between them.
[2] Divide the Range into a suitable number of class intervals by means of class
limits. The number of class intervals should not ordinarily be less than 5 nor more
than 15, depending on the number of observations available. The more the number of
observations, the larger will be the number of class intervals.

Class limits should be so chosen that most of the observations lie near the
class marks. The class intervals should preferably be of the same width. In special
circumstances, class intervals of unequal widths may also be used.
[3] The number of observations falling in each class interval, i.e. class-frequency, is
determined by tally marks.

[4] A table is now prepared showing the class intervals in the first column and the
corresponding class frequencies in the second column. This is the required frequency
distribution.
Cumulative Frequency Distribution
Cumulative Frequency corresponding to a specified value of the variable may be
defined as the number of observations smaller than (or greater than) that value.
The number of observations ‘upto’ a given value is called ‘Less – than’ cumulative
frequency; and the number of observations ‘greater than’ a value is called the ‘More-
than’ cumulative frequency. When a grouped frequency distribution relates to a variable
of continuous type, the cumulative frequencies calculated therefrom must be shown
against the class boundary points (i.e. end points of classes). Cumulative frequency
expressed as a percentage of total frequency, is known as cumulative percentage.
A table showing the cumulative frequencies against values of the variable
systematically arranged in increasing (or decreasing) order is known as cumulative
frequency distribution. If cumulative percentages are shown, instead of cumulative
frequencies, the table is called Cumulative Percentage Distribution.
Relative frequency distribution
Relative Frequency denotes the class frequency expressed as a fraction of the
total frequency.
Relative Frequency = Class Frequency / Total Frequency
The sum of all the relative frequencies is equal to 1. See Table, column (10).
Diagrammatic Representation of Frequency Distribution
The diagrams commonly used to depict statistical data given in the form of a
frequency distribution are:-
[1] Histogram
[2] Frequency Polygon
[3] Ogive (or cumulative frequency polygon)
[1] Histogram
Histogram is the most common form of diagrammatic representation of a
grouped frequency distribution. It consists of a set of adjoining rectangles drawn
on a horizontal baseline, with areas proportional to the class frequencies. The width
of rectangles, one for each class, extends over the class boundaries (not class limits)
shown on the horizontal scale. When all classes have equal widths, the heights of
rectangles will be proportional to the class frequencies and it is then customary to take
the heights numerically equal to the class frequencies. If, however, the classes are of
unequal width, the rectangles will also be of unequal width, and therefore the heights
must be proportional to the frequency density. Because then,

Area of each rectangle = Width × Height
= (Width of class) × (Frequency density)
= (Width of class) × (Class Frequency / Width of Class)
= Class Frequency
[2] Frequency Polygon
Frequency Polygon is the graphical representation alternative to histogram
and may be looked upon as derived from histogram by joining the mid-points of the
tops of consecutive rectangles. It is generally used in cases when all the classes have a
common width. In actual construction, therefore, the frequency polygon is obtained
by joining the successive points whose abscissae represent the mid – values and whose
ordinates represent the corresponding class frequencies. The two end-points are joined
to the baseline at the mid-values of the empty classes at each end of the frequency
distribution. Thus the frequency polygon has the same area as the histogram, provided
the widths of all classes are the same. The frequency polygon is particularly useful in
representing simple frequency distributions of a discrete variable.
[3] Ogive (or Cumulative Frequency Polygon)
Ogive is the graphical representation of the cumulative frequency distribution,
and hence is also called cumulative frequency polygon. When cumulative frequencies
are plotted against the corresponding class boundaries and the successive points are
joined by straight lines, the line diagram obtained is known as Ogive or Cumulative
Frequency Polygon. The ogive is of “less-than” or “more-than” type according as
the cumulative frequencies used are of “less-than” or “more-than” type. The “less-than”
ogive starts from the lowest class boundary on the horizontal axis and, gradually
rising upward, ends at the highest class boundary corresponding to the cumulative
frequency N, i.e., the total frequency. It looks like an elongated letter S. The “more-than”
ogive has the appearance of an elongated S turned upside down. Unequal widths
of classes in the frequency distribution do not cause any difficulty in the construction
of an ogive.
Frequency Curve
If the widths of classes be made smaller and smaller and at the same time the
total frequency be also increased indefinitely, then the histogram and the frequency
polygon will closely approach to a smooth curve known as the frequency curve.
The frequency curve shows the probability distribution of the variable in the
population and its area bounded by the ordinates at two specified points on the
horizontal axis represents the probability that a value of the variable lies between those
two limits. Like histogram, the frequency curve is therefore an area diagram.
Generally, there are 4 types of frequency curves –
i) Symmetrical bell – shaped
ii) Asymmetrical single humped
iii) J – Shaped
iv) U – Shaped

Figure – Different Types of Frequency Curves: (i) Symmetrical (bell – shaped), (ii) Asymmetrical (single humped), (iii) J – Shaped, (iv) U – Shaped
For most of the distributions met with in practice, the frequency curve is bell
– shaped, and in such cases three important characteristics are immediately apparent
from the frequency curve:
[1] The first characteristic is a measure of central tendency. In particular, the ‘mode’
of the distribution is given by the abscissa of the highest point of the frequency curve.
[2] The second characteristic is a measure of dispersion. In particular, the ‘range’,
i.e., the maximum possible discrepancy between any two values, is given by the
distance between the two points at which the frequency curve meets the horizontal
axis.
[3] The third characteristic is the shape of the frequency curve, i.e., whether the
curve is symmetrical or not; and if not, a measure of the degree of ‘Skewness’. A
symmetrical curve indicates that mean, median and mode are equal. For asymmetrical
curves this is not true, the mean being greater or less than the mode according as the
longer tail of the curve lies to the right or to the left.
Solved Problems
[1] Below is given the distribution of heights of a group of 60 students: -
Height (in cm) 145.0–149.9 150.0–154.9 155.0–159.9 160.0–164.9 165.0–169.9 170.0–174.9 175.0–179.9 180.0–184.9
No. of Students 2 5 9 15 16 7 5 1
Explain the terms ‘class limits’ and ‘class boundaries’ with reference to this
distribution.
[I.C.W.A., ‘75]
Solution:
Class Limits are 145.0 – 149.9, 150.0 – 154.9, 155.0 – 159.9, etc.
Class boundaries are 144.95 – 149.95, 149.95 – 154.95, 154.95 – 159.95, etc.

[2] Form a frequency distribution with the following :

7,4,3,5,6; 3,3,2,4,3; 4,3,3,4,4; 3,2,2,4,3; 5,4,3,4,3; 4,3,1,2,3;
[C.U. B.Com. ‘71]
Solution: From the given observations, we find that
Maximum Value = 7
Minimum Value = 1
Table – Tally Sheet
Observations Tally Marks Frequency
1 | 1
2 |||| 4
3 |||| |||| || 12
4 |||| |||| 9
5 || 2
6 | 1
7 | 1
Total 30
Value 1 2 3 4 5 6 7 Total
Frequency 1 4 12 9 2 1 1 30
[3] The following are the monthly salaries of 20 employees:
(Rs.) 130, 62, 145, 118, 125; 76, 151, 142, 110, 98;
95, 116, 100, 103, 71; 85, 80, 122, 132, 95;
Form a frequency distribution with class intervals Rs. 61 – 80, 81 – 100, 101 –
120, 121 – 140 and 141 – 160.
[C.U., B. Com, ‘74]
Solution:
Table Tally Sheet
Class Limits Tally Marks Frequency
61 – 80 |||| 4
81 – 100 |||| 5
101 – 120 |||| 4

121 – 140 |||| 4
141 – 160 ||| 3
Total 20
Salary (Rs.) 61 – 80 81 – 100 101 – 120 121 – 140 141 – 160 Total
Frequency 4 5 4 4 3 20
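The tallying in this solution follows the construction steps given earlier and can be reproduced mechanically. The sketch below is a minimal Python illustration, assuming the 20 salaries and the five pairs of class limits stated in the problem; the helper name `tally` is hypothetical.

```python
# Monthly salaries (Rs.) of the 20 employees, as listed in problem [3]
salaries = [130, 62, 145, 118, 125, 76, 151, 142, 110, 98,
            95, 116, 100, 103, 71, 85, 80, 122, 132, 95]

# Class limits given in the problem (both ends inclusive)
class_limits = [(61, 80), (81, 100), (101, 120), (121, 140), (141, 160)]

def tally(values, limits):
    """Count the observations falling within each pair of inclusive class limits."""
    counts = [0] * len(limits)
    for v in values:
        for i, (lo, hi) in enumerate(limits):
            if lo <= v <= hi:
                counts[i] += 1
                break
        else:
            # Classes must be exhaustive as well as mutually exclusive
            raise ValueError(f"{v} does not fall in any class")
    return counts

value_range = max(salaries) - min(salaries)   # step [1]: largest minus smallest
frequencies = tally(salaries, class_limits)   # step [3]: class frequencies

for (lo, hi), f in zip(class_limits, frequencies):
    print(f"{lo} - {hi}: {f}")
print("Total:", sum(frequencies))
```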

[4] The data below give the marks secured by 70 candidates in a certain examination:
21 31 35 52 64 74 89 53 42 7
22 35 43 67 76 35 46 26 32 40
72 43 38 41 63 71 28 32 45 54
15 18 52 73 86 50 39 55 47 12
44 58 67 85 39 40 50 65 72 69
57 63 5 56 79 37 24 54 82 49
51 54 68 29 34 44 58 62 59 65
Construct a frequency distribution of the marks, taking classes of uniform
width of 10 marks and 0 as the lower limit of the lower-most class.
[I.C.W.A. ‘74]
Solution:
Maximum Value = 89
Minimum Value = 5
Table: Tally Sheet
Class Limits Tally Marks Frequency
0–9 || 2
10 – 19 ||| 3
20 – 29 |||| | 6
30 – 39 |||| |||| | 11
40 – 49 |||| |||| || 12
50 – 59 |||| |||| |||| 15
60 – 69 |||| |||| 10
70 – 79 |||| || 7
80 – 89 |||| 4
Total 70
Class Limits 0 – 9 10 – 19 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89 Total

Frequency 2 3 6 11 12 15 10 7 4 70
[5] Ages at death (years) of 50 persons of a town are given below:
36 48 50 45 49 31 50 48 43 42
37 32 40 39 41 47 45 39 43 47
38 39 37 40 32 52 56 31 54 36
51 46 41 55 58 31 42 53 32 44
53 36 60 59 41 53 58 36 38 60
Arrange the data in a frequency distribution in 10 class intervals and obtain the
percentage frequency in each class interval.
[C. U., B. Com ‘72]
Solution:
Maximum Value = 60
Minimum Value = 31

Table: Tally Sheet

Class Limits Tally Marks Percentage Frequency Frequency
31 – 33 |||| | (6 / 50) × 100 = 12 6
34 – 36 |||| (4 / 50) × 100 = 8 4
37 – 39 |||| || (7 / 50) × 100 = 14 7
40 – 42 |||| || (7 / 50) × 100 = 14 7
43 – 45 |||| (5 / 50) × 100 = 10 5
46 – 48 |||| (5 / 50) × 100 = 10 5
49 – 51 |||| (4 / 50) × 100 = 8 4
52 – 54 |||| (5 / 50) × 100 = 10 5
55 – 57 || (2 / 50) × 100 = 4 2
58 – 60 |||| (5 / 50) × 100 = 10 5
Total 100 50
Class Limits 31–33 34–36 37–39 40–42 43–45 46–48 49–51 52–54 55–57 58–60 Total
Frequency 6 4 7 7 5 5 4 5 2 5 50
Percentage frequency 12 8 14 14 10 10 8 10 4 10 100
[6] Form an ordinary frequency table from the following cumulative frequency
distribution of marks obtained by 22 students:
Marks Number of students

Below 10 3
Below 20 8
Below 30 17
Below 40 20
Below 50 22
[I.C.W.A. ‘73]

Solution:
Table: Frequency Distribution
Marks Frequency
0–9 3
10 – 19 5
20 – 29 9
30 – 39 3
40 – 49 2
Total 22
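Each class frequency in this solution is the difference between two successive "less-than" cumulative frequencies. A minimal Python sketch of that step, assuming the five cumulative figures given in the problem (the function name is illustrative only):

```python
# "Below 10", "Below 20", ..., "Below 50" from the problem
less_than = [3, 8, 17, 20, 22]

def to_class_frequencies(cumulative):
    """Recover ordinary class frequencies by differencing less-than cumulative frequencies."""
    frequencies = []
    previous = 0
    for c in cumulative:
        frequencies.append(c - previous)
        previous = c
    return frequencies

class_frequencies = to_class_frequencies(less_than)
print(class_frequencies, "Total:", sum(class_frequencies))
```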
[7] From the following data, calculate the “percentage” of workers getting wages :–
(a) more than Rs. 44, (b) between Rs. 22 and Rs. 58
Wages (Rs) 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80 Total
No. of Workers 20 45 85 160 70 55 35 30 500
[C.A. ‘76]
Solution:
Table: Cumulative Frequency Distribution
Class Boundary Cumulative Frequency (less than)
10 20
20 65
22 x
30 150
40 310
44 y
50 380
58 z
60 435
70 470
80 500
[a] Number greater than Rs. 44
= Total frequency – Cumulative frequency (less than) corresponding to Rs. 44
= 500 – y = 500 – 338 = 162, where y = 338 is found by interpolation below.
Percentage of workers getting wages more than Rs. 44 = (162 / 500) × 100 = 32.4
To find the cumulative frequency (less than) x, we have
(22 − 20) / (30 − 20) = (x − 65) / (150 − 65)
or, 2 / 10 = (x − 65) / 85
or, x − 65 = 17 or, x = 82
To find the cumulative frequency (less than) y, we have
(44 − 40) / (50 − 40) = (y − 310) / (380 − 310)
or, 4 / 10 = (y − 310) / 70
or, y − 310 = 28 or, y = 338
Again, to find the cumulative frequency (less than) z, we have
(58 − 50) / (60 − 50) = (z − 380) / (435 − 380)
or, 8 / 10 = (z − 380) / 55
or, z − 380 = 44 or, z = 424
[b] No. of workers getting wages between Rs. 22 and Rs. 58
= No. of cases less than 58 – No. of cases less than 22
= Cumulative frequency corresponding to 58 – Cumulative frequency corresponding to 22
= 424 – 82 = 342
Percentage of workers getting wages between Rs. 22 and Rs. 58 = (342 / 500) × 100 = 68.4
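The three interpolations in this solution all apply the same rule of simple proportion within a class. The Python sketch below is one possible way to check them, assuming the class boundaries and frequencies given in the problem; `cumulative_less_than` and `interpolate` are hypothetical helper names.

```python
boundaries = [0, 10, 20, 30, 40, 50, 60, 70, 80]
frequencies = [20, 45, 85, 160, 70, 55, 35, 30]

def cumulative_less_than(freqs):
    """Less-than cumulative frequencies at each class boundary, starting from 0."""
    out, total = [0], 0
    for f in freqs:
        total += f
        out.append(total)
    return out

def interpolate(value, bounds, cum):
    """Cumulative frequency at `value`, assuming a uniform spread within each class."""
    for i in range(len(bounds) - 1):
        lo, hi = bounds[i], bounds[i + 1]
        if lo <= value <= hi:
            return cum[i] + (value - lo) / (hi - lo) * (cum[i + 1] - cum[i])
    raise ValueError("value lies outside the class boundaries")

cum = cumulative_less_than(frequencies)
total = cum[-1]

x = interpolate(22, boundaries, cum)
y = interpolate(44, boundaries, cum)
z = interpolate(58, boundaries, cum)

percent_more_than_44 = (total - y) / total * 100
percent_between_22_and_58 = (z - x) / total * 100
print(x, y, z, percent_more_than_44, percent_between_22_and_58)
```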
[8] Draw the histogram of the following frequency distribution of heights of 100
college students :
Height (cm) 141 – 150 151 – 160 161 – 170 171 – 180 181 -190 Total
Frequency 5 16 56 19 4 100
[W.B.H.S. ‘78]
Solution:
Table: Calculations for Drawing Histogram
Class Limits Class Boundaries Frequency
141 – 150 140.5 – 150.5 5
151 – 160 150.5 – 160.5 16
161 – 170 160.5 – 170.5 56
171 – 180 170.5 – 180.5 19
181 – 190 180.5 – 190.5 4
Figure: Histogram (horizontal axis: Height (cm), class boundaries 140.5 to 190.5; vertical axis: Frequency (No. of students), 0 to 60)
[9] Draw histogram and frequency polygon to present the following data :–
Income (Rs) 100-149 150-199 200-249 250-299 300-349 350-399 400-449 450-499 Total
No. of Individuals 21 32 52 105 62 43 18 9 342
[I.C.W.A. ‘78]
Solution:
Table: Calculations for Drawing Histogram
Class Limits Class Boundaries Frequency (f) Width of Class (w)
100 – 149 99.5 – 149.5 21 50
150 – 199 149.5 – 199.5 32 50
200 – 249 199.5 – 249.5 52 50
250 – 299 249.5 – 299.5 105 50
300 – 349 299.5 – 349.5 62 50
350 – 399 349.5 – 399.5 43 50
400 – 449 399.5 – 449.5 18 50
450 – 499 449.5 – 499.5 9 50
Figure: Histogram and Frequency Polygon (horizontal axis: Class boundaries; vertical axis: Class frequency, 0 to 110)
[10] Draw the histogram of the distribution given below and obtain the number of
firms whose sales lie between Rs. 12,00,000 and Rs. 26,00,000.
Value of Sales (Rs. 1000) No. of Firms
0 – 500 3
500 – 1000 42
1000 – 2500 288
2500 – 3500 150
3500 – 4500 51
Also draw the cumulative frequency polygon.
[C.U.B.A. (Econ.) ‘71]

Solution:
Table: Calculations for drawing histogram
Class Interval (Rs. 1000) Frequency (f) Width of Class (w) Frequency density (f ÷ w)
0 – 500 3 500 0.006
500 – 1000 42 500 0.084
1000 – 2500 288 1500 0.192
2500 – 3500 150 1000 0.150
3500 – 4500 51 1000 0.051
Figure – Histogram for the Distribution of Sales (horizontal axis: Sales ('000); vertical axis: Frequency Density)
The number of firms with annual sales between Rs. 12,00,000 and Rs. 26,00,000
is given by the area under the histogram between the vertical lines at 1200 and 2600
on the horizontal axis, i.e. the area between 1200 and 2500 plus the area between 2500
and 2600. Assuming that the frequency 288 in the class interval 1000 – 2500
is uniformly distributed over the whole interval, the area (i.e. frequency) between
1200 and 2500 is 1300 × 0.192 = 249.6, and the area (i.e. frequency) between
2500 and 2600 is 100 × 0.150 = 15.0.
So, the frequency between 1200 and 2600 is 249.6 + 15.0 = 264.6 ≈ 265
No. of firms = 265
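Since the histogram is an area diagram, the estimate above is the sum of (frequency density × overlapping width) over the classes involved. A minimal Python sketch under the same uniform-spread assumption, using the class boundaries and frequencies of this problem (the function name `frequency_between` is illustrative):

```python
boundaries = [0, 500, 1000, 2500, 3500, 4500]   # Rs. '000
frequencies = [3, 42, 288, 150, 51]

def frequency_between(a, b, bounds, freqs):
    """Area under the histogram between a and b (estimated frequency)."""
    total = 0.0
    for lo, hi, f in zip(bounds, bounds[1:], freqs):
        overlap = min(b, hi) - max(a, lo)     # part of [a, b] lying in this class
        if overlap > 0:
            density = f / (hi - lo)           # frequency density of the class
            total += density * overlap
    return total

estimate = frequency_between(1200, 2600, boundaries, frequencies)
print(round(estimate, 1), "firms, i.e. about", round(estimate))
```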
Class Boundary Cumulative Frequency (less than)

0 0
500 3
1000 45
2500 333
3500 483
4500 534 = N

Figure: Cumulative Frequency Polygon (horizontal axis: Values of Sales (Rs.'000), 0 to 4500; vertical axis: Cumulative Frequency, 0 to 550)
[11] Draw a cumulative frequency graph and estimate the number of persons
between the ages 30 – 32 in the following table:
Age 20 – 25 25 – 30 30 – 35 35 – 40 40 – 45 45 – 50 50 – 55 55 – 60 Total
No of persons 50 70 100 180 150 120 70 59 799
[C. U., M. Com. ‘68]
Solution:
Table – Cumulative Frequency Distribution
Class Boundary Cumulative Frequency (less – than)
20 0
25 50
30 120
35 220
40 400
45 550
50 670
55 740
60 799
Figure: Cumulative Frequency Polygon (horizontal axis: Age, 20 to 60; vertical axis: Cumulative Frequency, 0 to 800; the ordinates at ages 30 and 32 are 120 and 160)
No. of persons between the ages 30 and 32 = 160 – 120 = 40.

[12] Draw an ogive from the following data and find graphically the number of
observations lying between 360 and 440 :
Value Number of Observations
More than 200 400
More than 250 370
More than 300 315
More than 350 220
More Than 400 115
More than 500 45
More than 600 15
More than 700 0
[I.C.W.A. ‘72]
Solution:
Value Cumulative Frequency (more – than)
200 400
250 370
300 315
350 220
400 115
500 45
600 15
700 0
Figure: Cumulative Frequency Polygon (more – than), with the ordinates read from the graph at values 360 and 440 marked as 210 and 98
No. of observations lying between 360 and 440

= Difference between cumulative frequency corresponding to value 360 and 440
= 210 – 98 = 112

[13] Draw less-than Ogive based on the data given below. (N = 146)
Mid-Point 18 25 32 39 46 53 60
Frequency 10 15 32 42 26 12 9
[C.A. ‘74]
Solution:
Class Boundary Frequency Cumulative Frequency (less – than)
14.5 0 0
21.5 10 10
28.5 15 25
35.5 32 57
42.5 42 99
49.5 26 125
56.5 12 137
63.5 9 146
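The class boundaries in this table are obtained by going half a class width on either side of each mid-point. The Python sketch below illustrates the step, assuming equally spaced mid-points as in the problem; the variable names are illustrative.

```python
mid_points = [18, 25, 32, 39, 46, 53, 60]
frequencies = [10, 15, 32, 42, 26, 12, 9]

# With equal class widths, the width equals the gap between successive mid-points
width = mid_points[1] - mid_points[0]

# Lowest boundary, followed by the upper boundary of every class
boundaries = [mid_points[0] - width / 2] + [m + width / 2 for m in mid_points]

# Less-than cumulative frequency at each boundary
cumulative = [0]
for f in frequencies:
    cumulative.append(cumulative[-1] + f)

for b, c in zip(boundaries, cumulative):
    print(b, c)
```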
Figure: Ogive (less – than) (horizontal axis: Value)
[14] The word – length for each of 90 words in a poem by Tagore is shown below:
5 4 3 5 8 6 6 3 4
3 4 4 5 8 2 6 7 6
4 5 6 4 9 6 4 2 2
2 9 2 3 3 3 2 4 7
7 2 4 4 4 3 4 4 2
4 4 9 3 7 4 5 12 6
3 5 2 5 10 3 5 7 3
3 3 6 2 5 3 3 3 2
4 5 8 5 3 4 4 6 7
2 3 5 5 5 3 2 4 5
Construct column diagram, frequency polygon, and cumulative frequency
polygon (less than).

Solution:
Word Length Tally Marks Frequency
2 |||| |||| ||| 13
3 |||| |||| |||| |||| |||| 19
4 |||| |||| |||| |||| |||| 20
5 |||| |||| |||| 15
6 |||| |||| 9
7 |||| | 6
8 ||| 3
9 ||| 3
10 | 1
12 | 1
Total 90
Word – length Frequency (f) Cumulative Frequency (less – than) Relative Frequency
2 13 13 0.1445
3 19 32 0.2111
4 20 52 0.2222
5 15 67 0.1667
6 9 76 0.1000
7 6 82 0.0667
8 3 85 0.0333
9 3 88 0.0333
10 1 89 0.0111
12 1 90 0.0111
Total 90 1.0000
Figure: Column diagram for the frequency distribution of word lengths (horizontal axis: Word-length; vertical axis: Frequency)
Figure: Frequency Polygon for the frequency distribution of word lengths (horizontal axis: Word-length, 1 to 13; vertical axis: Frequency, 0 to 25)
Figure: Cumulative Frequency diagram (less than type) for the data on number of different word – lengths.
[15] On the basis of the table constructed in Exercise 14, answer the following
questions:
[a] What is the proportion of words with 9 letters?
[b] What is the number of words with 3 letters or less, and what is the number
of words with 5 letters or more?
[c] What is the number of words with not less than 4 and not more than 6
letters?
Answer:
[a] 0.0333 [b] (13+19)=32, (15+9+6+3+3+1+1)=38 [c] 44=(76–32)
[16] With the data shown below, form a frequency distribution with six classes. Show
the frequencies, the relative frequencies and the cumulative frequencies (of both the
less – than and the greater – than type). Finally, represent the distribution by means of
a suitable diagram.

Life (in hours) of 100 electric bulbs

511 991 1177 1016 600 777 895 749 1067 980
923 1314 1108 1137 906 1230 1099 1242 803 1131
918 1240 1057 980 992 763 759 1394 1111 1117
1143 808 948 857 962 922 817 1057 665 1171
936 1068 750 873 1139 1127 1163 934 515 907
1061 1198 1027 1081 991 1155 1199 806 950 1262
848 1293 956 1140 885 1330 1166 1333 1146 933
820 880 982 912 1100 1293 1192 1371 1023 1298
1059 1092 1091 1182 699 803 1009 922 1245 706
1053 1001 939 1248 850 985 1219 945 1012 846
Solution: From the given observations, we find that
Maximum Value = 1394
Minimum Value = 511
Difference = 883
There will be six classes, hence 883 / 6 = 147.17, so a convenient class width of 150 is used.
Table: Tally Sheet
Class Limits Tally Marks Frequency
501 – 650 ||| 3
651 – 800 |||| ||| 8
801 – 950 |||| |||| |||| |||| |||| |||| 29
951 – 1100 |||| |||| |||| |||| |||| || 27
1101 – 1250 |||| |||| |||| |||| |||| 25
1251 – 1400 |||| ||| 8
Total 100
The required frequency distribution is shown below:
Table: Frequency Distribution of life of bulbs
Life (in hours) Relative Frequency Frequency Class Boundaries
501 – 650 0.030 3 500.5 – 650.5
651 – 800 0.080 8 650.5 – 800.5
801 – 950 0.290 29 800.5 – 950.5
951 – 1100 0.270 27 950.5 – 1100.5
1101 – 1250 0.250 25 1100.5 – 1250.5
1251 – 1400 0.080 8 1250.5 – 1400.5
Total 1.000 100
Class Boundaries Cumulative Frequency (Less than) Cumulative Frequency (More than)
500.5 0 100
650.5 3 97
800.5 11 89
950.5 40 60
1100.5 67 33
1250.5 92 8
1400.5 100 0
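The two cumulative columns above are complementary: at every class boundary the less-than and more-than figures add up to the total frequency. A minimal Python sketch, assuming the class frequencies from the tally sheet of this problem:

```python
from itertools import accumulate

boundaries = [500.5, 650.5, 800.5, 950.5, 1100.5, 1250.5, 1400.5]
frequencies = [3, 8, 29, 27, 25, 8]

# Less-than type: running total of class frequencies, starting from 0
less_than = [0] + list(accumulate(frequencies))
total = less_than[-1]

# More-than type: whatever is not yet counted at each boundary
more_than = [total - c for c in less_than]

# Relative frequencies of the six classes
relative = [f / total for f in frequencies]

for b, lt, mt in zip(boundaries, less_than, more_than):
    print(b, lt, mt)
```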

Figure: Histogram (horizontal axis: Hours, class boundaries 500.5 to 1400.5; vertical axis: Frequency, 0 to 40)
Figure (untitled in the original): diagram with Hours on the horizontal axis (class boundaries 500.5 to 1400.5) and a vertical scale from 0 to 100
MEASURES OF CENTRAL TENDENCY
Central Tendency
Quite often there will be found in the data a tendency, notwithstanding their
variability, to cluster around a central value. In such a case, it would be legitimate
to use a single value, the central value, to represent the whole set of figures. Such a
representative or typical value of a variable is called a measure of central tendency or
an average.
There are three measures of central tendency – Mean, Median and Mode.

Again, Mean is of three types – Arithmetic Mean (A.M.), Geometric Mean (G.M.),
and Harmonic Mean (H.M.).
Arithmetic Mean (A.M.)
Arithmetic mean of a set of observations is defined to be their sum, divided by
the number of observations.
Given n observations $x_1, x_2, \ldots, x_n$, their A.M., denoted by the symbol $\bar{x}$, is
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum x$$
If $x_1, x_2, \ldots, x_n$ have frequencies $f_1, f_2, \ldots, f_n$ respectively, i.e. $x_1$ occurs $f_1$
times, $x_2$ occurs $f_2$ times and so on, then the sum of all the $f_1 + f_2 + \cdots + f_n$ observations
is
$$\underbrace{x_1 + x_1 + \cdots + x_1}_{f_1 \text{ terms}} + \underbrace{x_2 + x_2 + \cdots + x_2}_{f_2 \text{ terms}} + \cdots + \underbrace{x_n + x_n + \cdots + x_n}_{f_n \text{ terms}} = f_1 x_1 + f_2 x_2 + \cdots + f_n x_n$$
Hence, the arithmetic mean is
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum f x}{N}$$
where $N = \sum f$ is the total frequency. This is sometimes referred to as weighted
arithmetic mean, as distinct from simple arithmetic mean.
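As a quick numerical illustration of the two formulae, the Python sketch below computes the weighted A.M. of the word-length distribution from Solved Problem [14] of the previous chapter and confirms that it equals the simple A.M. of the 90 individual observations. The function names are illustrative only.

```python
def simple_mean(values):
    """Simple A.M.: sum of the observations divided by their number."""
    return sum(values) / len(values)

def weighted_mean(values, freqs):
    """Weighted A.M.: sum of f*x divided by the total frequency N."""
    n_total = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / n_total

# Word-length distribution (value, frequency) from Solved Problem [14]
word_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
frequencies = [13, 19, 20, 15, 9, 6, 3, 3, 1, 1]

mean_length = weighted_mean(word_lengths, frequencies)

# Writing every value out f times must give the same mean
expanded = [x for x, f in zip(word_lengths, frequencies) for _ in range(f)]
mean_from_raw = simple_mean(expanded)

# Sum of frequency-weighted deviations from the mean (should vanish)
deviation_sum = sum(f * (x - mean_length) for x, f in zip(word_lengths, frequencies))
print(mean_length, mean_from_raw, deviation_sum)
```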
Important properties of A.M.
[a] The total of a set of observations is equal to the product of their number and the
A.M.
[i] $\sum x_i = n\bar{x}$ ; [ii] $\sum f_i x_i = N\bar{x}$
[b] The sum of the deviations of a set of observations from their A.M. is always zero.
[i] $\sum (x_i - \bar{x}) = 0$, where $\bar{x} = \frac{\sum x_i}{n}$
[ii] $\sum f_i (x_i - \bar{x}) = 0$, where $\bar{x} = \frac{\sum f_i x_i}{N}$
[c] If two variables x and z are so related that $z = ax + b$ for each $x = x_i$, where a and b
are constants, then $\bar{z} = a\bar{x} + b$.
In particular, if $y = \frac{x - c}{d}$, where c and d are constants, then $\bar{x} = c + d\bar{y}$.
[d] If a group of $n_1$ observations has A.M. $\bar{x}_1$, and another group of $n_2$ observations
has A.M. $\bar{x}_2$, then the A.M. ($\bar{x}$) of the composite group of $n_1 + n_2$ (= N, say) observations
is given by
$$N\bar{x} = n_1\bar{x}_1 + n_2\bar{x}_2$$
This can be generalised to any number of groups:
$$N\bar{x} = \sum n_i \bar{x}_i, \quad \text{where } N = \sum n_i$$
[e] The sum of the squares of deviations of a set of observations has the smallest
value, when deviations are taken from their A.M.
[i] $\sum (x_i - A)^2$ is minimum, when A = Simple A.M.
[ii] $\sum f_i (x_i - A)^2$ is minimum, when A = Weighted A.M.
Example–1 : Show that if $\bar{x}$ is the arithmetic mean of the quantities $x_1, x_2, \ldots, x_n$,
then $\sum (x_i - \bar{x}) = 0$.
Solution : Let $x_1, x_2, \ldots, x_n$ be a series of numbers, and $\bar{x}$ their arithmetic mean,
defined by
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
The deviations of the numbers from their A.M. ($\bar{x}$) are $(x_1 - \bar{x}), (x_2 - \bar{x}), \ldots, (x_n - \bar{x})$.
The algebraic sum of these deviations is
$$(x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_n - \bar{x})$$
$$= x_1 + x_2 + \cdots + x_n - \bar{x} - \bar{x} - \cdots - \bar{x} \quad (n \text{ times})$$
$$= (x_1 + x_2 + \cdots + x_n) - n\bar{x}$$
$$= n\bar{x} - n\bar{x}$$
$$= 0$$
Example–2 : Show that if $\bar{x}$ be the arithmetic mean of the values $x_i$, weighted by
$f_i$ (i = 1, 2, ..., n), then
$$\sum_{i=1}^{n} f_i (x_i - \bar{x}) = 0$$
Solution : The arithmetic mean of the values $x_i$ weighted by $f_i$ (i = 1, 2, ..., n) is,
by definition,
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{N}, \quad \text{where } N = f_1 + f_2 + \cdots + f_n.$$
Therefore,
$$\sum_{i=1}^{n} f_i (x_i - \bar{x}) = f_1 (x_1 - \bar{x}) + f_2 (x_2 - \bar{x}) + \cdots + f_n (x_n - \bar{x})$$
$$= f_1 x_1 - f_1 \bar{x} + f_2 x_2 - f_2 \bar{x} + \cdots + f_n x_n - f_n \bar{x}$$
$$= f_1 x_1 + f_2 x_2 + \cdots + f_n x_n - f_1 \bar{x} - f_2 \bar{x} - \cdots - f_n \bar{x}$$
$$= (f_1 x_1 + f_2 x_2 + \cdots + f_n x_n) - \bar{x}(f_1 + f_2 + \cdots + f_n)$$
$$= N\bar{x} - N\bar{x}$$
$$= 0$$
Example–3 : If $y_i = x_i - c$ (i = 1, 2, ..., n), where c is a constant, prove that $\bar{x} = c + \bar{y}$.
Solution : Since $y_i = x_i - c$, therefore $x_i = c + y_i$.
Multiplying both sides by $f_i$ and then summing over all values of i = 1, 2, ..., n,
we have
$$\sum_{i=1}^{n} f_i x_i = \sum_{i=1}^{n} f_i (c + y_i)$$
$$= \sum (f_i c + f_i y_i)$$
$$= \sum f_i c + \sum f_i y_i$$
$$= c \sum f_i + \sum f_i y_i$$
$$= cN + \sum f_i y_i, \quad \text{since } \sum f_i = N$$
Hence,
$$\bar{x} = \frac{1}{N}\sum f_i x_i = \frac{1}{N}\left(cN + \sum f_i y_i\right)$$
$$= c + \frac{1}{N}\sum f_i y_i$$
$$= c + \bar{y}$$
Example–4 : If $y_i = \frac{x_i - c}{d}$ (i = 1, 2, ..., n), where c and d are constants, prove
that $\bar{x} = c + d\bar{y}$.
Solution : Since $y_i = \frac{x_i - c}{d}$, we have $d y_i = x_i - c$ or, $x_i = d y_i + c$.
Multiplying both sides by $f_i$, we get $f_i x_i = f_i (c + d y_i)$. Now, summing over all values of
i = 1, 2, ..., n,
$$\sum f_i x_i = \sum f_i (c + d y_i) = \sum (f_i c + d f_i y_i)$$
$$= \sum f_i c + \sum d f_i y_i$$
$$= c \sum f_i + d \sum f_i y_i$$
$$= cN + d \sum f_i y_i, \quad \text{where } N = \sum f_i$$
Hence,
$$\bar{x} = \frac{1}{N}\sum f_i x_i = \frac{1}{N}\left(cN + d \sum f_i y_i\right)$$
$$= c + d\left(\frac{1}{N}\sum f_i y_i\right)$$
$$= c + d\bar{y}$$
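The result of Example–4 is what makes the change of origin and scale useful in hand calculation. The Python sketch below checks it on the income distribution of Solved Problem [9] of the previous chapter; the choice c = 274.5 (a class mark) and d = 50 (the class width) is an assumption made here for illustration.

```python
# Income distribution from Solved Problem [9]: class limits and frequencies
class_limits = [(100, 149), (150, 199), (200, 249), (250, 299),
                (300, 349), (350, 399), (400, 449), (450, 499)]
frequencies = [21, 32, 52, 105, 62, 43, 18, 9]

class_marks = [(lo + hi) / 2 for lo, hi in class_limits]   # x values
n_total = sum(frequencies)

# Direct calculation: x-bar = (sum of f*x) / N
direct_mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n_total

# Coded calculation: y = (x - c) / d, then x-bar = c + d * y-bar
c, d = 274.5, 50                      # assumed origin and scale
y = [(x - c) / d for x in class_marks]
y_bar = sum(f * yi for yi, f in zip(y, frequencies)) / n_total
coded_mean = c + d * y_bar

print(y, y_bar, direct_mean, coded_mean)
```

With this choice the coded values y are the small integers -3 to 4, which is why the method shortens the arithmetic.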
Example–5 : Prove that $\sum_{i=1}^{n} (x_i - A)^2 / n$ is the least when $A = \bar{x}$, where $x_1, x_2, \ldots, x_n$
are the observations, A is any arbitrary constant and $\bar{x}$ the arithmetic mean.
Solution : $\sum_{i=1}^{n} (x_i - A)^2 / n$ will be the least, when $\sum_{i=1}^{n} (x_i - A)^2$ is so.
Now, we can write
$$x_i - A = (x_i - \bar{x}) + (\bar{x} - A)$$
Therefore,
$$\sum (x_i - A)^2 = \sum \{(x_i - \bar{x}) + (\bar{x} - A)\}^2$$
$$= \sum \{(x_i - \bar{x})^2 + 2(x_i - \bar{x})(\bar{x} - A) + (\bar{x} - A)^2\}$$
$$= \sum (x_i - \bar{x})^2 + \sum 2(x_i - \bar{x})(\bar{x} - A) + \sum (\bar{x} - A)^2$$
$$= \sum (x_i - \bar{x})^2 + 2(\bar{x} - A)\sum (x_i - \bar{x}) + n(\bar{x} - A)^2$$
$$= \sum (x_i - \bar{x})^2 + 2(\bar{x} - A) \cdot 0 + n(\bar{x} - A)^2$$
$$= \sum (x_i - \bar{x})^2 + n(\bar{x} - A)^2$$
Neither term on the right can be negative; because, the first is the sum of n squares
$(x_i - \bar{x})^2$, and the second is the product of n (a positive integer) and a square. The first
term does not depend on A. So we have only to choose the value of A which makes
$\sum (x_i - A)^2$ the minimum possible, and this will be achieved when the second term on
the right has the minimum possible value, viz. 0, i.e.
$$n(\bar{x} - A)^2 = 0$$
Or, $(\bar{x} - A)^2 = 0$
$\bar{x} - A = 0$
$\therefore \bar{x} = A$
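The identity reached in this proof can be verified numerically. The following Python sketch uses the 20 salaries of Solved Problem [3] of the previous chapter; the trial values of A are arbitrary choices made here for illustration.

```python
salaries = [130, 62, 145, 118, 125, 76, 151, 142, 110, 98,
            95, 116, 100, 103, 71, 85, 80, 122, 132, 95]

def sum_of_squares(values, a):
    """Sum of squared deviations of the values from an arbitrary constant a."""
    return sum((x - a) ** 2 for x in values)

n = len(salaries)
mean = sum(salaries) / n
minimum = sum_of_squares(salaries, mean)

# For each trial A: left side, and right side  sum(x - mean)^2 + n (mean - A)^2
trial_values = [90, 100, mean, 120]
checks = [(a, sum_of_squares(salaries, a), minimum + n * (mean - a) ** 2)
          for a in trial_values]

for a, left, right in checks:
    print(a, left, right)
```

In every row the two sides agree, and the sum of squares is smallest in the row where A equals the mean.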
Geometric Mean (G.M.)
Geometric mean of a group of n observations is the n-th root of their product. It
is defined only when all observations have the same sign, and none of them is zero.
Given n observations $x_1, x_2, \ldots, x_n$,
$$\text{G.M.} = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n}$$
This may also be written as $(x_1 x_2 \cdots x_n)^{1/n}$.
If, however, $x_1, x_2, \ldots, x_n$ have frequencies $f_1, f_2, \ldots, f_n$ respectively, the product
of all the N (= $f_1 + f_2 + \cdots + f_n$) observations is
$$x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}$$
So that
$$\text{G.M.} = \sqrt[N]{x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}}$$
This may also be written in the form $(x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n})^{1/N}$, where $N = \sum f_i$ is the total
frequency.
We have simple geometric mean and weighted geometric mean, given by the
formulae
$$\text{Simple G.M. } (g) = (x_1 x_2 \cdots x_n)^{1/n}$$
$$\text{Weighted G.M. } (G) = (x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n})^{1/N}$$
They are equal, only when all weights are equal. For practical calculations, these
formulae cannot be applied directly. Taking logarithms of both sides, we have
$$\log g = \frac{1}{n}(\log x_1 + \log x_2 + \cdots + \log x_n) = \frac{1}{n}\sum \log x_i$$
$$\log G = \frac{1}{N}\left[f_1 (\log x_1) + f_2 (\log x_2) + \cdots + f_n (\log x_n)\right] = \frac{1}{N}\sum f_i (\log x_i)$$
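The logarithmic form is also how the G.M. is usually computed by machine. The Python sketch below is a minimal illustration using natural logarithms (the base does not affect the result) and the word-length distribution of Solved Problem [14] of the previous chapter; the function names are illustrative only.

```python
import math

def geometric_mean(values):
    """Simple G.M. via logarithms: antilog of the A.M. of log x."""
    return math.exp(sum(math.log(x) for x in values) / len(values))

def weighted_geometric_mean(values, freqs):
    """Weighted G.M.: antilog of (1/N) * sum of f * log x."""
    n_total = sum(freqs)
    return math.exp(sum(f * math.log(x) for x, f in zip(values, freqs)) / n_total)

word_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
frequencies = [13, 19, 20, 15, 9, 6, 3, 3, 1, 1]

G = weighted_geometric_mean(word_lengths, frequencies)

# Cross-checks: simple G.M. of the 90 raw values, and the N-th root of their product
expanded = [x for x, f in zip(word_lengths, frequencies) for _ in range(f)]
g_raw = geometric_mean(expanded)
direct = math.prod(expanded) ** (1 / len(expanded))

print(G, g_raw, direct)
```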
Properties of G.M.
[a] The product of a group of n observations is equal to the n-th power of their G.M.
$$x_1 x_2 \cdots x_n = g^n$$
[b] The logarithm of G.M. of a set of observations is equal to the A.M. of their
logarithms.
$$\log g = \frac{1}{n}\sum \log x_i ; \qquad \log G = \frac{1}{N}\sum f_i (\log x_i)$$
[c] If $G_1, G_2, \ldots$ be the geometric means of several groups having $n_1, n_2, \ldots$
observations respectively, then the G.M. (G) of the composite group is given by their
weighted geometric mean.
$$G = \sqrt[N]{G_1^{n_1} G_2^{n_2} \cdots}$$
i.e. $\log G = \frac{1}{N}\sum n_i (\log G_i)$
where $N = n_1 + n_2 + \cdots$
Harmonic Mean (H.M.)
Harmonic mean of a set of observations is the reciprocal of the arithmetic mean
of their reciprocals. Like G.M., H.M. is defined only when no observation is zero.
$$\text{Simple H.M.} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum \left(\frac{1}{x_i}\right)}$$
$$\text{Weighted H.M.} = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \cdots + \frac{f_n}{x_n}} = \frac{N}{\sum \left(\frac{f_i}{x_i}\right)}$$
They are equal only when all weights are equal.
Relations between A.M., G.M., H.M.
[1] For any given set of observations, A.M. is greater than or equal to G.M., and G.M.
is greater than or equal to H.M.
A.M. ≥ G.M. ≥ H.M.
They are equal, only when all observations are equal.
[2] For two observations only,
A.M. / G.M. = G.M. / H.M.
This means that G.M. not only lies between A.M. and H.M., but (G.M.)² = A.M. ×
H.M., provided there are only two observations.
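Both relations can be checked numerically. The Python sketch below uses the 20 salaries of Solved Problem [3] of the previous chapter for relation [1], and two of those salaries (62 and 130, picked arbitrarily here) for relation [2]; the function names are illustrative only.

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product, computed through logarithms
    return math.exp(sum(math.log(x) for x in values) / len(values))

def harmonic_mean(values):
    # reciprocal of the A.M. of the reciprocals
    return len(values) / sum(1 / x for x in values)

salaries = [130, 62, 145, 118, 125, 76, 151, 142, 110, 98,
            95, 116, 100, 103, 71, 85, 80, 122, 132, 95]
a, g, h = arithmetic_mean(salaries), geometric_mean(salaries), harmonic_mean(salaries)
print("A.M. =", a, "G.M. =", g, "H.M. =", h)

# Relation [2]: with exactly two observations, (G.M.)^2 = A.M. x H.M.
pair = [62, 130]
lhs = geometric_mean(pair) ** 2
rhs = arithmetic_mean(pair) * harmonic_mean(pair)
print(lhs, rhs)
```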

Example : Prove that
A.M. ≥ G.M. ≥ H.M.
Where A.M., G.M. and H.M. represent arithmetic, geometric and harmonic
means respectively.

Solution : Let $x_1, x_2, \ldots, x_n$ be a set of n observations (all positive). Their A.M.,
G.M., and H.M. (denoted by A, G and H respectively) are
$$A = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
$$G = (x_1 x_2 \cdots x_n)^{1/n}$$
$$H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$$
Considering only two observations $x_1$ and $x_2$, we see that $(\sqrt{x_1} - \sqrt{x_2})^2 \ge 0$, because
the left side is a square quantity.
Or, $x_1 + x_2 - 2\sqrt{x_1 x_2} \ge 0$
Or, $x_1 + x_2 \ge 2\sqrt{x_1 x_2}$
Or, $\frac{x_1 + x_2}{2} \ge \sqrt{x_1 x_2}$ ... (i)
i.e. A.M. ≥ G.M., when n = 2 ... (ii)
Similarly, considering only the observations $x_3$ and $x_4$, we have
$$\frac{x_3 + x_4}{2} \ge \sqrt{x_3 x_4} \quad \ldots \text{(iii)}$$
If we now consider the two quantities $\frac{x_1 + x_2}{2}$ and $\frac{x_3 + x_4}{2}$, we must have, by (ii),
$$\frac{\frac{x_1 + x_2}{2} + \frac{x_3 + x_4}{2}}{2} \ge \sqrt{\frac{x_1 + x_2}{2} \cdot \frac{x_3 + x_4}{2}}$$
Or,
$$\frac{x_1 + x_2 + x_3 + x_4}{4} \ge \sqrt{\frac{x_1 + x_2}{2} \cdot \frac{x_3 + x_4}{2}} \quad \ldots \text{(iv)}$$
But,
$$\sqrt{\frac{x_1 + x_2}{2} \cdot \frac{x_3 + x_4}{2}} \ge \sqrt{\sqrt{x_1 x_2} \cdot \sqrt{x_3 x_4}} \quad \ldots \text{(v)}$$
because each of the two terms on the left is greater than or equal to the
corresponding term on the right, by (i) and (iii).
Substituting from (v) in (iv),
$$\frac{x_1 + x_2 + x_3 + x_4}{4} \ge \sqrt{\sqrt{x_1 x_2} \cdot \sqrt{x_3 x_4}}$$
Or,
$$\frac{x_1 + x_2 + x_3 + x_4}{4} \ge \sqrt[4]{x_1 x_2 x_3 x_4}$$
i.e. A.M. ≥ G.M., when n = 4
Proceeding this way, it can be shown that A.M. ≥ G.M. whenever n = 2 or 4 or 8 or 16 etc., i.e. of the form $2^m$, where m is a positive integer. But we have to prove the result for any value of n. For this purpose, let us suppose that the given value of n lies between two such values $2^{m-1}$ and $2^m$, i.e. $2^{m-1} < n < 2^m$.

We now consider $2^m$ (= N, say) values, consisting of the n given observations $x_1, x_2, \ldots, x_n$, and (N – n) further values each equal to A, i.e. $(x_1 + x_2 + \cdots + x_n)/n$:
$$\underbrace{x_1, x_2, \ldots, x_n}_{n \text{ terms}}, \ \underbrace{A, A, \ldots, A}_{(N-n) \text{ terms}}$$
A.M. of these N values is
$$\frac{x_1 + x_2 + \cdots + x_n + A + A + \cdots + A}{N} = \frac{nA + (N - n)A}{N} = A$$
Also G.M. of these N values is
$$(x_1 x_2 \cdots x_n \cdot A \cdot A \cdots A)^{1/N} = (G^n A^{N-n})^{1/N}$$
Since A.M. ≥ G.M. for $N = 2^m$ values, therefore, in the present case $A \ge (G^n A^{N-n})^{1/N}$.
Raising both sides to the power N, we have $A^N \ge G^n A^{N-n}$.
Simplifying, we get $A^n \ge G^n$ or, A ≥ G.
This completes the proof that A.M. ≥ G.M. for any number of observations.
Using this result, we shall now prove that G ≥ H. For this purpose, let us consider the n values $\dfrac{1}{x_1}, \dfrac{1}{x_2}, \ldots, \dfrac{1}{x_n}$.
The A.M. of these values is
$$\frac{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}}{n} = \frac{1}{H}$$
and their G.M. is
$$\left( \frac{1}{x_1} \cdot \frac{1}{x_2} \cdots \frac{1}{x_n} \right)^{1/n} = \left( \frac{1}{x_1 x_2 \cdots x_n} \right)^{1/n} = \left( \frac{1}{G^n} \right)^{1/n} = \frac{1}{G}$$
Since we have proved that A.M. ≥ G.M., in the present case $\dfrac{1}{H} \ge \dfrac{1}{G}$, i.e. G ≥ H.
Combining the two results A ≥ G and G ≥ H, we have A ≥ G ≥ H; i.e. in general
A.M. ≥ G.M. ≥ H.M.
We shall now verify that A.M. = G.M. = H.M. when all the observations have the same value, i.e. $x_1 = k, x_2 = k, \ldots, x_n = k$.
In such a situation
$$\text{A.M.} = \frac{k + k + \cdots + k}{n} = \frac{nk}{n} = k$$
$$\text{G.M.} = (k \cdot k \cdots k)^{1/n} = (k^n)^{1/n} = k$$
$$\text{H.M.} = \frac{n}{\dfrac{1}{k} + \dfrac{1}{k} + \cdots + \dfrac{1}{k}} = \frac{n}{\dfrac{n}{k}} = k$$
and hence A.M. = G.M. = H.M., when all the observations are equal.

Median
Median of a set of observations is the middle-most value when the observations
are arranged in order of magnitude. The number of observations smaller than Median is
the same as the number greater than it. Thus, Median divides the observations into two
equal parts. It is unaffected by the presence of extremely large or small observations and
can be calculated from frequency distributions with open-end classes.
An important property of Median is that for any given set of observations the sum
of absolute deviations from median is the least.
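The minimum property can be illustrated numerically. The sketch below is only an illustration under assumed data: the sample values and the helper name `sum_abs_dev` are hypothetical, and the check compares the median with a grid of candidate centres rather than proving the property.

```python
from statistics import mean, median

def sum_abs_dev(xs, centre):
    # sum of absolute deviations of the observations from a chosen centre
    return sum(abs(x - centre) for x in xs)

data = [2, 3, 5, 9, 21]          # hypothetical observations
med = median(data)               # middle-most value: 5
avg = mean(data)                 # arithmetic mean: 8

# Candidate centres from 0 to 21 in steps of 0.5
candidates = [c / 2 for c in range(0, 43)]
best = min(candidates, key=lambda c: sum_abs_dev(data, c))

print(sum_abs_dev(data, med), sum_abs_dev(data, avg), best)
```

The sum about the median (25) is smaller than the sum about the mean (28), and no candidate centre on the grid does better than the median.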
Calculation of Median
The median is calculated as follows :
[a] From simple series – The given data are arranged in order of magnitude. If
the number of observations be odd, the value of the middle-most item is the median.
However, if the number be even, the arithmetic mean of the two middle-most items is
taken as median.
[b] From simple frequency distribution – The cumulative frequency ("Less than" type)
corresponding to each distinct value of the variable is calculated. If the total frequency
be N, the value of the variable corresponding to cumulative frequency (N+1)/2 gives the
median.
[c] From grouped frequency distribution – Median from a grouped frequency distribution is that value which corresponds to cumulative frequency $\dfrac{N}{2}$. Median from a grouped frequency distribution can be calculated by any of the following methods :–
[i] By the application of formula for median :
The cumulative frequencies are calculated. The class in which cumulative frequency $\dfrac{N}{2}$ lies is called the median class. Now we apply the formula :
$$\text{Median} = l_1 + \frac{\dfrac{N}{2} - F}{f_m} \times C$$
where l1 = lower boundary of median class;
N = total frequency;
F = cumulative frequency below l1;
fm = frequency of median class;
C = width of median class.
[ii] By the application of simple interpolation in a cumulative frequency distribution – If F1 and F2 be the cumulative frequencies shown in the table which are just smaller than and just larger than $\dfrac{N}{2}$, and they correspond to the class boundaries l1 and l2 respectively, then
$$\frac{\text{Median} - l_1}{l_2 - l_1} = \frac{\dfrac{N}{2} - F_1}{F_2 - F_1}$$
[iii] Graphical method – An approximate value of median can be obtained graphically from ogive, or cumulative frequency polygon. Draw a horizontal line from the point $\dfrac{N}{2}$ on the vertical scale showing the cumulative frequencies, until it meets the ogive (either less-than or more-than type). From the point of intersection, a perpendicular is now drawn on the horizontal axis. The position of the foot of the perpendicular is read from the horizontal scale showing values of the variable, and this gives the median. If both ogives (less-than and more-than) are available on the same graph paper, the position of the foot of the perpendicular drawn from the point of intersection of the two ogives gives the median.
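The formula method [i] can be sketched in a few lines of Python. This is an illustration only: the function name `grouped_median` and the sample distribution are assumptions made for the example, not data from the text.

```python
def grouped_median(boundaries, freqs):
    """Median = l1 + (N/2 - F) / fm * C for a grouped frequency distribution.

    boundaries: class boundaries, one more entry than freqs
    freqs:      class frequencies
    """
    total = sum(freqs)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2
    below = 0  # F: cumulative frequency below the current class
    for i, f in enumerate(freqs):
        if below + f >= half:
            lower = boundaries[i]                      # l1
            width = boundaries[i + 1] - boundaries[i]  # C
            return lower + (half - below) / f * width
        below += f

# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40
median_value = grouped_median([0, 10, 20, 30, 40], [5, 15, 20, 10])
print(median_value)
```

Here N/2 = 25 falls in the class 20-30 (F = 20, fm = 20, C = 10), so the formula gives 20 + (5/20) × 10 = 22.5. Method [ii] leads to the same value, because the formula is the interpolation relation solved for the median.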
Mode
Mode of a given set of observations is that value which occurs with the maximum
frequency. It is the most typical or prevalent value and at times represents the true
characteristic of the distribution as a measure of central tendency.
Calculation of Mode
From a simple series, mode can be determined by locating that value, which
occurs the maximum number of times.
From a simple frequency distribution, mode can be determined by inspection
only. It is that value of the variable which corresponds to the largest frequency.
From a grouped frequency distribution it is very difficult to find the mode accurately. However, if all classes are of equal width, mode is usually calculated by the formula
$\text{Mode} = l_1 + \dfrac{d_1}{d_1 + d_2} \times c$ –––– (1)
where
l1 = lower boundary of the modal class;
d1 = difference of the largest frequency and the frequency of the class just preceding the modal class;
d2 = difference of the largest frequency and the frequency of the class just following the modal class;
c = common width of classes.
If $f_0, f_{-1}, f_1$ represent the frequencies of the modal class, the class just preceding and the class just following it, then
$d_1 = f_0 - f_{-1}$ and $d_2 = f_0 - f_1$,
so that equation (1) may also be written in the form
$$\text{Mode} = l_1 + \frac{f_0 - f_{-1}}{2f_0 - f_{-1} - f_1} \times c$$
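As a hedged illustration of the grouped-mode formula, the sketch below applies it to a made-up distribution. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this example rather than a rule stated in the text.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode = l1 + d1 / (d1 + d2) * c, for classes of equal width c."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])  # modal class index
    f0 = freqs[i]
    # Assumption: a missing neighbour is treated as frequency 0
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1, d2 = f0 - f_prev, f0 - f_next
    if d1 + d2 == 0:
        raise ValueError("formula undefined when d1 + d2 = 0")
    return lower_boundaries[i] + d1 / (d1 + d2) * width

# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40
mode_value = grouped_mode([0, 10, 20, 30], [5, 15, 20, 10], 10)
print(mode_value)
```

The modal class is 20-30 with d1 = 5 and d2 = 10, so the mode is 20 + (5/15) × 10, about 23.33.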
Relation between mean, median, mode

For unimodal distributions of moderate skewness the following approximate

relation has been found to hold :
Mean – Mode = 3 (Mean – Median)
Partition Values – Quartiles, Deciles, Percentiles
Some of the important types of partition values are – (a) Median (b) Quartiles (c) Deciles (d) Percentiles.
We know that Median is the middle-most value of a set of observations, i.e. it divides the total number of observations into 2 equal parts. The number of observations smaller than median is the same as the number larger than it. For data of continuous type, exactly one-half of the observations are smaller than median, i.e. median is the value of the variable corresponding to cumulative frequency $\dfrac{N}{2}$.
Quartiles are such values which divide the total number of observations into 4 equal parts. Obviously, there are 3 quartiles –
[i] First quartile (or lower quartile) : Q1
[ii] Second quartile (or middle quartile) : Q2
[iii] Third quartile (or upper quartile) : Q3
For data of continuous type, one-quarter of the observations are smaller than Q1, two-quarters are smaller than Q2 and three-quarters are smaller than Q3. This means that Q1, Q2, Q3 are values of the variable corresponding to "less-than" cumulative frequencies $\dfrac{N}{4}, \dfrac{2N}{4}, \dfrac{3N}{4}$ respectively. Since $\dfrac{2N}{4} = \dfrac{N}{2}$, it is evident that the second quartile Q2 is the same as median.
Q1 < Q2 < Q3 ; Q2 = Median
In Bowley's formula for skewness, all the three quartiles are used.
Median = Q2
$$\text{Quartile Deviation} = \frac{Q_3 - Q_1}{2}$$
$$\text{Skewness} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$$
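A small Python sketch of the two quartile-based measures follows. The function names and the quartile values used are hypothetical and serve only to show the arithmetic.

```python
def quartile_deviation(q1, q3):
    # half the distance between the upper and lower quartiles
    return (q3 - q1) / 2

def bowley_skewness(q1, q2, q3):
    # (Q3 - 2*Q2 + Q1) / (Q3 - Q1); zero when Q2 is midway between Q1 and Q3
    return (q3 - 2 * q2 + q1) / (q3 - q1)

# Hypothetical quartiles
qd = quartile_deviation(40, 70)
sk = bowley_skewness(40, 50, 70)
print(qd, sk)
```

With Q1 = 40, Q2 = 50 and Q3 = 70, the median lies closer to Q1 than to Q3, so the skewness comes out positive (1/3).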
Deciles are such values which divide the total number of observations into 10 equal parts. There are 9 deciles D1, D2, ––––, D9 called the first decile, the second decile, etc. The number of observations smaller than D1, or between two successive deciles, or larger than D9 is the same. For data of continuous type, D1, D2, ––––, D9 correspond to cumulative frequencies $\dfrac{N}{10}, \dfrac{2N}{10}, \ldots, \dfrac{9N}{10}$ respectively.
D1 < D2 < –––– < D9; D5 = Q2 = Median
Percentiles are such values which divide the total number of observations into 100 equal parts. There are 99 percentiles P1, P2, ––––, P99, called the first percentile, the second percentile, and so on. The K-th percentile (Pk) is, therefore, that value of the variable up to which lie exactly K% of the total number of observations.
In particular,
P10 = D1, P20 = D2, ––––, P90 = D9.
P25 = Q1, P50 = D5 = Q2 = Median, P75 = Q3
P1 < P2 < –––– < P99
Calculation of partition values
[a] From simple series – The given data are arranged in increasing order of
magnitude and a number showing the rank is attached to each observation. The smallest
value is given rank 1, the next higher value rank 2, etc. and the largest value is given rank
n. The ranks of partition values are as follows :
Rank of Median = $\dfrac{1}{2}(n + 1)$
Rank of Q1 = $\dfrac{1}{4}(n + 1)$
Rank of Q3 = $\dfrac{3}{4}(n + 1)$
Rank of Dk = $\dfrac{K}{10}(n + 1)$
Rank of Pk = $\dfrac{K}{100}(n + 1)$
Using simple interpolation, the value of the variable corresponding to the appropriate rank is determined, giving the partition values.
Median = Value corresponding to rank $\dfrac{1}{2}(n + 1)$
Q1 = Value corresponding to rank $\dfrac{1}{4}(n + 1)$
Q3 = Value corresponding to rank $\dfrac{3}{4}(n + 1)$
Dk = Value corresponding to rank $\dfrac{K}{10}(n + 1)$
Pk = Value corresponding to rank $\dfrac{K}{100}(n + 1)$
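The rank rule for a simple series can be sketched as follows. This is a minimal illustration: the helper names are hypothetical, and clamping ranks that fall outside 1 to n is an assumption of this sketch, since the text does not say how to treat them. The ten values are the data of Exercise [1] further below.

```python
import math

def value_at_rank(sorted_xs, rank):
    """Value at a (possibly fractional) 1-based rank, by simple interpolation."""
    n = len(sorted_xs)
    rank = min(max(rank, 1), n)   # assumption: clamp ranks outside 1..n
    lo = math.floor(rank)
    if lo >= n:
        return sorted_xs[-1]
    frac = rank - lo
    return sorted_xs[lo - 1] + frac * (sorted_xs[lo] - sorted_xs[lo - 1])

def partition_value(xs, k, parts):
    """k-th partition value when the data are split into `parts` equal parts."""
    s = sorted(xs)
    return value_at_rank(s, k * (len(s) + 1) / parts)

data = [88, 72, 33, 29, 70, 54, 86, 91, 57, 61]
med = partition_value(data, 1, 2)   # rank 5.5
q1 = partition_value(data, 1, 4)    # rank 2.75
q3 = partition_value(data, 3, 4)    # rank 8.25
print(med, q1, q3)
```

For these data the median rank 5.5 lies midway between 61 and 70, giving 65.5; Q1 and Q3 come out as 48.75 and 86.5.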
[b] From simple frequency distribution – The cumulative frequency corresponding to each distinct value of the variable is calculated. If the total frequency be N,
Median = Value corresponding to cumulative frequency $\dfrac{1}{2}(N + 1)$
Q1 = Value corresponding to cumulative frequency $\dfrac{1}{4}(N + 1)$
Q3 = Value corresponding to cumulative frequency $\dfrac{3}{4}(N + 1)$
Dk = Value corresponding to cumulative frequency $\dfrac{K}{10}(N + 1)$
Pk = Value corresponding to cumulative frequency $\dfrac{K}{100}(N + 1)$
[c] From grouped frequency distribution –
(i) By application of interpolation – A cumulative frequency distribution is constructed showing the class boundaries and the corresponding cumulative frequencies ("less-than" type). Using simple interpolation, we now find
Median = Value corresponding to cumulative frequency $\dfrac{1}{2}N$
Q1 = Value corresponding to cumulative frequency $\dfrac{1}{4}N$
Q3 = Value corresponding to cumulative frequency $\dfrac{3}{4}N$
Dk = Value corresponding to cumulative frequency $\dfrac{K}{10}N$
Pk = Value corresponding to cumulative frequency $\dfrac{K}{100}N$
(ii) Graphical method – An ogive ("less-than" type) is drawn. From this ogive
Median = Abscissa corresponding to ordinate $\dfrac{1}{2}N$
Q1 = Abscissa corresponding to ordinate $\dfrac{1}{4}N$
Q3 = Abscissa corresponding to ordinate $\dfrac{3}{4}N$
Dk = Abscissa corresponding to ordinate $\dfrac{K}{10}N$
Pk = Abscissa corresponding to ordinate $\dfrac{K}{100}N$
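For a grouped frequency distribution, every partition value follows from the same interpolation in the "less-than" cumulative distribution. The sketch below is illustrative only; the function name and the sample distribution are assumptions, not data from the text.

```python
def grouped_partition_value(boundaries, freqs, k, parts):
    """Value at cumulative frequency k*N/parts, by linear interpolation
    between the class boundaries that enclose it."""
    total = sum(freqs)
    target = k * total / parts
    below = 0  # cumulative frequency at the lower boundary of the class
    for i, f in enumerate(freqs):
        if f > 0 and below + f >= target:
            l1, l2 = boundaries[i], boundaries[i + 1]
            return l1 + (target - below) / f * (l2 - l1)
        below += f
    raise ValueError("target cumulative frequency not reached")

# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40 (N = 50)
b, f = [0, 10, 20, 30, 40], [5, 15, 20, 10]
q1 = grouped_partition_value(b, f, 1, 4)     # cumulative frequency 12.5
med = grouped_partition_value(b, f, 1, 2)    # cumulative frequency 25
q3 = grouped_partition_value(b, f, 3, 4)     # cumulative frequency 37.5
d9 = grouped_partition_value(b, f, 9, 10)    # cumulative frequency 45
print(q1, med, q3, d9)
```

The graphical method reads the same values off the ogive, since a less-than ogive drawn as a polygon joins the cumulative points by straight lines.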
Exercise
[1] Find the mean and the median of :
88, 72, 33, 29, 70, 54, 86, 91, 57, 61
[C.U. B.Com.'73]
Solution : Mean = $\dfrac{1}{10}(88 + 72 + 33 + 29 + 70 + 54 + 86 + 91 + 57 + 61) = \dfrac{641}{10} = 64.1$
For median, the data are arranged in order of magnitude : 29, 33, 54, 57, (61), (70), 72, 86, 88, 91
There are two middle-most values, 61 and 70.
Hence, median = $\dfrac{61 + 70}{2} = \dfrac{131}{2} = 65.5$
[2] Find the mean, median and mode of the following numbers :
7, 4, 3, 5, 6, 3, 3, 2, 4, 3, 4, 3, 3, 4, 4, 3, 2, 2, 4, 3, 5, 4, 3, 4, 3, 4, 3, 1, 2, 3

Solution : Data are arranged in a simple frequency distribution.

Value (x)    Frequency (f)    fx     Cumulative frequency
1            1                1      1
2            4                8      5
3            12               36     17    ←
4            9                36     26
5            2                10     28
6            1                6      29
7            1                7      30 = N
Total        30               104

Mean = $\dfrac{104}{30} = 3.47$
Median lies at cumulative frequency $\dfrac{N + 1}{2} = \dfrac{30 + 1}{2} = 15.5$
Cumulative frequency 15.5 is greater than 5 and less than 17, so it corresponds to the value 3.
Hence, Median is 3.
Mode = 3
[3] Evaluate the arithmetic mean, median and mode for the following distribution of
number of telephone calls received per one minute "interval".
No. of calls 0 1 2 3 4 5 6 7 8
Frequency 5 22 31 43 51 40 35 15 3
[B.U., B.Com.'71]
Solution :
Table : Calculations for A.M., median and mode

No. of calls (x)    Frequency (f)    Cumulative frequency    y = x – 4    fy
0                   5                5                       –4           –20
1                   22               27                      –3           –66
2                   31               58                      –2           –62
3                   43               101                     –1           –43
4                   51               152  ←                  0            0
5                   40               192                     1            40
6                   35               227                     2            70
7                   15               242                     3            45
8                   3                245 = N                 4            12
Total               245                                                   –24

Mean $(\bar{x}) = 4 + \left( -\dfrac{24}{245} \right) = 4 - 0.098 = 3.902 = 3.90$
Mode = 4
Median lies at cumulative frequency $\dfrac{N + 1}{2} = \dfrac{245 + 1}{2} = \dfrac{246}{2} = 123$
Value corresponding to cumulative frequency 123 = 4
Median = 4
[4] Calculate the simple and weighted average from the following and account for the
difference between the two :
Price per ton (Rs./P.) 45.60 50.70 42.45
Tons purchased 135 40 25
[C.A. '72]
Solution :
Simple A.M. = $\dfrac{1}{3}(45.60 + 50.70 + 42.45) = \dfrac{138.75}{3} = 46.25$
Weighted A.M. = $\dfrac{45.60 \times 135 + 50.70 \times 40 + 42.45 \times 25}{135 + 40 + 25}$
= $\dfrac{6156 + 2028 + 1061.25}{200}$
= $\dfrac{9245.25}{200} = 46.23$
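The two averages can be recomputed with a short Python sketch using the prices and tonnages of this exercise; the variable names are illustrative. The comment on the cause of the difference is an observation from these figures, offered as one way to account for it.

```python
prices = [45.60, 50.70, 42.45]   # price per ton (Rs.)
tons = [135, 40, 25]             # tons purchased at each price

simple_am = sum(prices) / len(prices)
weighted_am = sum(p * t for p, t in zip(prices, tons)) / sum(tons)

# The weighted average gives each price an importance proportional to the
# quantity bought at it, while the simple average treats all three equally.
print(round(simple_am, 2), round(weighted_am, 2))
```

Here the largest purchase (135 tons) was made at Rs.45.60, which is below the simple average, so the weighted figure is pulled slightly lower.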
[5] The numbers 3.2, 5.8, 7.9 and 4.5 have frequencies x, (x+2), (x–3) and (x+6)
respectively. If the arithmetic mean is 4.876, find the value of x.
[C.U., M.Com., '73]
Solution :
As per given condition,
$$\frac{3.2 \times x + 5.8 \times (x + 2) + 7.9 (x - 3) + 4.5 (x + 6)}{x + x + 2 + x - 3 + x + 6} = 4.876$$
Or, $\dfrac{3.2x + 5.8x + 11.6 + 7.9x - 23.7 + 4.5x + 27}{4x + 5} = 4.876$
Or, 21.4x + 14.9 = 4.876 (4x + 5)
Or, 21.4x + 14.9 = 19.504x + 24.38
Or, 21.4x – 19.504x = 24.38 – 14.9
Or, 1.896x = 9.48
∴ x = $\dfrac{9.48}{1.896}$ = 5
[6] Calculate the arithmetic mean from the following data :
[i] Class interval     50–59   60–69   70–79   80–89   90–99   100–109   110–119
    Frequency          14      38      44      54      45      30        25
[ii] Height in inches  57.5 -  60.0 -  62.5 -  65.0 -  67.5 -  70.0 -    72.5 -
     Number of men     6       26      190     281     412     127       38
[iii] Weight in lbs.   137.5–147.5  147.5–157.5  157.5–167.5  167.5–177.5  177.5–187.5  187.5–197.5  197.5–217.5  217.5–247.5
      Number of men    2            5            4            5            7            5            3            1
[iv] x                 20–30   30–50   50–100   100–200   200–350   350–550
     Frequency         2       9       11       52        10        3
Solution :
[i] Table : Calculation of Arithmetic Mean

Class interval    Frequency (f)    Mid-Value (x)    y = (x – 84.5)/10    fy
50–59             14               54.5             –3                   –42
60–69             38               64.5             –2                   –76
70–79             44               74.5             –1                   –44
80–89             54               84.5             0                    0
90–99             45               94.5             1                    45
100–109           30               104.5            2                    60
110–119           25               114.5            3                    75
Total             250              –                –                    18

Arithmetic Mean $(\bar{x}) = 84.5 + \dfrac{18}{250} \times 10$
= 84.5 + 0.72 = 85.22
[ii] Table : Calculation for Arithmetic Mean

Class interval    Frequency (f)    Mid-Value (x)    y = (x – 66.25)/2.50    fy
57.5–60.0         6                58.75            –3                      –18
60.0–62.5         26               61.25            –2                      –52
62.5–65.0         190              63.75            –1                      –190
65.0–67.5         281              66.25            0                       0
67.5–70.0         412              68.75            1                       412
70.0–72.5         127              71.25            2                       254
72.5–75.0         38               73.75            3                       114
Total             1080             –                –                       520

Arithmetic Mean $(\bar{x}) = 66.25 + \dfrac{520}{1080} \times 2.5$
= 66.25 + 1.20 = 67.45

[iii] Table : Calculation for Arithmetic Mean

Class interval    Frequency (f)    Mid-Value (x)    y = (x – 172.5)/5    fy
137.5–147.5       2                142.5            –6                   –12
147.5–157.5       5                152.5            –4                   –20
157.5–167.5       4                162.5            –2                   –8
167.5–177.5       5                172.5            0                    0
177.5–187.5       7                182.5            2                    14
187.5–197.5       5                192.5            4                    20
197.5–217.5       3                207.5            7                    21
217.5–247.5       1                232.5            12                   12
Total             32               –                –                    27

Arithmetic Mean $(\bar{x}) = 172.5 + \dfrac{27}{32} \times 5$
= 172.5 + $\dfrac{135}{32}$
= 172.5 + 4.22
= 176.72 lbs.
[iv] Table : Calculation for Arithmetic Mean

x          Frequency (f)    Mid-Value (x)    y = (x – 150)/5    fy
20–30      2                25               –25                –50
30–50      9                40               –22                –198
50–100     11               75               –15                –165
100–200    52               150              0                  0
200–350    10               275              25                 250
350–550    3                450              60                 180
Total      87               –                –                  17

Arithmetic Mean $(\bar{x}) = 150 + \dfrac{17}{87} \times 5$
= 150 + 0.98 = 150.98
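The tables above all use the same step-deviation device: shift the mid-values by a convenient origin, divide by a convenient scale, average, and convert back. A minimal Python sketch follows; the function name `step_deviation_mean` is hypothetical, and the figures are those of part [i].

```python
def step_deviation_mean(mid_values, freqs, origin, scale):
    """Mean = origin + (sum of f*y / N) * scale, where y = (x - origin) / scale."""
    total = sum(freqs)
    sum_fy = sum(f * (x - origin) / scale for x, f in zip(mid_values, freqs))
    return origin + sum_fy / total * scale

# Part [i]: mid-values of the classes 50-59, ..., 110-119
mids = [54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5]
freqs = [14, 38, 44, 54, 45, 30, 25]

mean_step = step_deviation_mean(mids, freqs, origin=84.5, scale=10)
mean_direct = sum(f * x for x, f in zip(mids, freqs)) / sum(freqs)
print(round(mean_step, 2), round(mean_direct, 2))
```

Both routes give 85.22. The choice of origin and scale only affects how easy the hand arithmetic is, not the result, which is why the method also works in parts [iii] and [iv] where the classes have unequal widths.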

[7] From the data in the following table calculate the average marks of the M.Com.
examinees in Statistics at a class test :
Marks 30–39 40–49 50–59 60–69 70–79 80–89 90–99

No. of examinees 2 3 11 20 32 25 7
[C.U., M.Com. '72]

Solution :
Table : Calculations for Arithmetic Mean

Marks     Frequency (f)    Mid-Value (x)    y = (x – 64.5)/10    fy
30–39     2                34.5             –3                   –6
40–49     3                44.5             –2                   –6
50–59     11               54.5             –1                   –11
60–69     20               64.5             0                    0
70–79     32               74.5             1                    32
80–89     25               84.5             2                    50
90–99     7                94.5             3                    21
Total     100              –                –                    80

Arithmetic Mean $(\bar{x}) = 64.5 + \dfrac{80}{100} \times 10$
= 64.5 + 8 = 72.5
[8] The following table gives the rise in prices of 300 commodities between two dates.
Calculate the mean rise in price :
% increase 0 - 5 - 10 - 15 - 25 - 35 - 45 - 60-80
Frequency 12 30 51 84 66 35 15 7
[Dip. in Social Welfare '71]
Solution :

Class boundaries    Frequency (f)    Mid-Value (x)    y = (x – 30)/2.5    fy
0–5                 12               2.5              –11                 –132
5–10                30               7.5              –9                  –270
10–15               51               12.5             –7                  –357
15–25               84               20               –4                  –336
25–35               66               30               0                   0
35–45               35               40               4                   140
45–60               15               52.5             9                   135
60–80               7                70               16                  112
Total               300              –                –                   –708

Arithmetic Mean $(\bar{x}) = 30 + \left( -\dfrac{708}{300} \times 2.5 \right)$
= 30 – 5.9 = 24.1%
[9] The following are the monthly salaries (in Rs.) of 30 employees in a firm :
140 139 126 114 100 88 62 77 99 103
108 129 144 148 134 63 69 148 132 118
142 116 123 104 95 80 85 106 123 133
The firm gave bonus of Rs.10, 15, 20, 25, 30, 35 for individuals in the respective
salary groups : 'exceeding Rs.60 but not exceeding Rs.75'; 'exceeding Rs.75 but not
exceeding Rs.90'; and so on upto 'exceeding Rs.135 but not exceeding Rs.150'. Find the
average bonus per worker.
[I.C.W.A. '76 - old]
Solution : As per data given
Table : Calculation for A.M.

Salary group (Class boundaries) (Rs.)    Bonus paid (x) (Rs.)    Frequency (f)    fx
60–75                                    10                      3                30
75–90                                    15                      4                60
90–105                                   20                      5                100
105–120                                  25                      5                125
120–135                                  30                      7                210
135–150                                  35                      6                210
Total                                    –                       30               735

Arithmetic Mean $(\bar{x}) = \dfrac{735}{30}$ = Rs.24.50
[10] For the variable x, taking the values 0, 1, 2, ––––, k, the cumulative frequencies of more-than type are F0, F1, F2, ––––, Fk. Show that $\bar{x} = \sum_{i=1}^{k} \dfrac{F_i}{n}$, where n is the total frequency.
[B.U., B.A. (Econ.) '73]

Solution :
Table : Calculation for A.M.

x        Cumulative frequency (more-than)    Frequency (f)    fx
0        F0                                  F0 – F1          0
1        F1                                  F1 – F2          F1 – F2
2        F2                                  F2 – F3          2F2 – 2F3
––––     ––––                                ––––             ––––
k–1      Fk–1                                Fk–1 – Fk        (k–1)[Fk–1 – Fk]
k        Fk                                  Fk               kFk
Total    –                                   F0 = n           F1 + F2 + F3 + –––– + Fk

$$\therefore \bar{x} = \frac{F_1 + F_2 + F_3 + \cdots + F_k}{n} = \sum_{i=1}^{k} \frac{F_i}{n}$$
[11] [a] The arithmetic mean calculated from the following frequency distribution is known to be 67.45 inches. Find the value of f3.

Height (inches) 60–62 63–65 66–68 69–71 72–74
Frequency 15 54 f3 81 24
[I.C.W.A. '71]
Solution :

Class limits    Frequency (f)    Mid-value (x)    y = (x – 67)/3    fy
60–62           15               61               –2                –30
63–65           54               64               –1                –54
66–68           f3               67               0                 0
69–71           81               70               1                 81
72–74           24               73               2                 48
Total           174 + f3         –                –                 45

$\bar{x} = 67 + 3 \times \dfrac{45}{174 + f_3}$
Or, $67.45 = 67 + \dfrac{135}{174 + f_3}$
Or, $0.45 = \dfrac{135}{174 + f_3}$
Or, 0.45f3 + 78.3 = 135
Or, 0.45f3 = 135 – 78.3 = 56.7
∴ f3 = $\dfrac{56.7}{0.45}$ = 126
[b] The expenditure of 1000 families is given below :
Expenditure (Rs.) 40–59 60–79 80–99 100–119 120–139
No. of families 50 ? 500 ? 50
The median and mean for the distribution are both Rs.87.50P. Calculate the
missing frequencies.
[I.C.W.A. '78]
Solution : Let the missing frequencies be f2 and f4 respectively.
Table : Calculations for Missing Frequencies

Class limits    Frequency (f)     Mid-Value (x)    y = (x – 89.5)/20    fy
40–59           50                49.5             –2                   –100
60–79           f2                69.5             –1                   –f2
80–99           500               89.5             0                    0
100–119         f4                109.5            1                    f4
120–139         50                129.5            2                    100
Total           600 + f2 + f4     –                –                    f4 – f2

$87.50 = 89.5 + \dfrac{f_4 - f_2}{600 + f_2 + f_4} \times 20$
Since the total frequency is 1000,
$-2 = \dfrac{f_4 - f_2}{1000} \times 20 = \dfrac{f_4 - f_2}{50}$
Or, f4 – f2 = –100 –––– (1)
Also, 600 + f2 + f4 = 1000
f2 + f4 = 400 –––– (2)
Adding (1) and (2) we get
2f4 = 300
f4 = 150
Putting the value of f4 in equation (2) we get
f2 + 150 = 400
f2 = 250
[12] Find out the missing frequencies of the following data, given the A.M. is 67.45
inches.
Height (inches) 60–62 63–65 66–68 69–71 72–74 Total
No. of students 5 18 f3 f4 8 100
[Dip. Management '72]
Solution :
Table : Calculation for Missing Entries

Class limit    Frequency (f)    Mid-Value (x)    y = (x – 67)/3    fy
60–62          5                61               –2                –10
63–65          18               64               –1                –18
66–68          f3               67               0                 0
69–71          f4               70               1                 f4
72–74          8                73               2                 16
Total          31 + f3 + f4     –                –                 f4 – 12

$67.45 = 67 + \dfrac{f_4 - 12}{100} \times 3$
Or, $0.45 = \dfrac{f_4 - 12}{100} \times 3$
Or, 45 = 3f4 – 36
∴ 3f4 = 45 + 36 = 81
f4 = $\dfrac{81}{3}$ = 27
Putting the value of f4 in the equation,
31 + f3 + f4 = 100
f3 + f4 = 100 – 31 = 69
f3 + 27 = 69
f3 = 69 – 27 = 42
[13] [i] The average marks obtained in an examination by two groups of students
was found to be 75 and 85 respectively. Determine the ratio of students in two
groups, if the average mark for all students was 80.
[B.U., B.A. (Econ.) '69]
Solution : Let the number of students in the first and second group be n1
and n2 and n1 + n2 = N.
Table : Mean of composite group

Characteristics        Group I             Group II            Composite group
No. of observations    n1                  n2                  N = n1 + n2
Mean marks             $\bar{x}_1$ = 75    $\bar{x}_2$ = 85    $\bar{x}$ = 80

Applying formula for mean of composite group,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2$
80N = 75n1 + 85n2
Or, 80 (n1 + n2) = 75n1 + 85n2
Or, 80n1 + 80n2 = 75n1 + 85n2
Or, 80n1 – 75n1 = 85n2 – 80n2
Or, 5n1 = 5n2
Or, $\dfrac{n_1}{n_2} = \dfrac{5}{5} = \dfrac{1}{1}$ Or, n1 : n2 = 1 : 1
[ii] On a certain examination, the average grade of all students in Class A is
68.4 and all students in Class B is 71.2. If the average of both classes combined
is 70.0, what is the ratio of the number of students in Class A to the number in
Class B?
[C.U., M.Com. '63]
Solution : Let the number of students in Class A and Class B be n1 and n2, and n1 + n2 = N.

Characteristics        Group I               Group II              Composite group
No. of observations    n1                    n2                    N = n1 + n2
Mean of grade          $\bar{x}_1$ = 68.4    $\bar{x}_2$ = 71.2    $\bar{x}$ = 70.0

Applying formula,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2$
Or, $(n_1 + n_2)\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2$
Or, (n1 + n2) 70 = n1 × 68.4 + n2 × 71.2
Or, 70n1 + 70n2 = 68.4n1 + 71.2n2
Or, 70n1 – 68.4n1 = 71.2n2 – 70n2
Or, 1.6n1 = 1.2n2 Or, $\dfrac{n_1}{n_2} = \dfrac{1.2}{1.6} = \dfrac{3}{4}$
∴ n1 : n2 = 3 : 4
[iii] The mean age of a combined group of men and women is 30 years. If the
mean age of the group of men is 32 and that of the group of women is 27, find
out the percentage of men and women in the group.
[C.A. '65]
Solution : Let the number of men and number of women be n1 and n2, and n1 + n2 = N.

Characteristics        Men (Group I)    Women (Group II)    Composite group
No. of observations    n1               n2                  n1 + n2 = N
Mean age (years)       32               27                  30

Applying formula,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2$
Or, (n1 + n2) 30 = 32n1 + 27n2
Or, 30n1 + 30n2 = 32n1 + 27n2
Or, 30n2 – 27n2 = 32n1 – 30n1
Or, 3n2 = 2n1
Or, $\dfrac{n_1}{n_2} = \dfrac{3}{2}$
Or, n1 : n2 = 3 : 2
Or, n1 = $\dfrac{3}{3 + 2} \times 100 = \dfrac{3}{5} \times 100$ = 60% and n2 = 40%
∴ Men = 60%
Women = 40%
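All three parts rearrange the composite-mean formula in the same way: from $N\bar{x} = n_1\bar{x}_1 + n_2\bar{x}_2$ it follows that $n_1 : n_2 = (\bar{x}_2 - \bar{x}) : (\bar{x} - \bar{x}_1)$. The Python sketch below applies this rearrangement with exact fractions; the function name `size_ratio` is a hypothetical helper.

```python
from fractions import Fraction

def size_ratio(mean1, mean2, combined):
    """n1 / n2 for two groups, from N*x = n1*x1 + n2*x2.

    Means are passed as strings so that decimals such as 68.4 stay exact.
    """
    m1, m2, m = Fraction(mean1), Fraction(mean2), Fraction(combined)
    return (m2 - m) / (m - m1)

ratio_i = size_ratio("75", "85", "80")          # part [i]
ratio_ii = size_ratio("68.4", "71.2", "70.0")   # part [ii]
ratio_iii = size_ratio("32", "27", "30")        # part [iii]

share_men = ratio_iii / (ratio_iii + 1) * 100   # percentage of men
print(ratio_i, ratio_ii, ratio_iii, share_men)
```

The ratios come out as 1, 3/4 and 3/2, with men forming 60% of the group in part [iii], in agreement with the worked solutions.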
[14] Out of the total population in a certain town in South Africa, 60% belonged to
the Black Race and the rest belonged to the White Race. It was estimated that their
mean incomes were respectively 2000 and 5000 pounds. Find the average income of
the entire town.
[C.A. '68]

Solution : Let the average income be $\bar{x}$.

Characteristics            Group I    Group II    Composite group
No. of observations (%)    60         40          100
Mean income (pounds)       2000       5000        $\bar{x}$

Applying formula,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2$
Or, $100\bar{x}$ = 60 × 2000 + 40 × 5000
= 120000 + 200000 = 320000
∴ $\bar{x} = \dfrac{320000}{100}$ = 3200 pounds
∴ Average income = 3200 pounds
[15] [i] A factory has five sections employing 105, 184, 130, 93 and 125 workers.
The mean earnings in a certain week per worker are Rs.13.80, Rs.15.00, Rs.15.20,
Rs.18.20 and Rs.14.20 for the 5 sections. Determine the mean earning per
worker of the whole factory.
[Dip. Management '70]
Solution : Let the mean earning per worker be $\bar{x}$.

Characteristics          I        II       III      IV       V        Composite group
No. of observations      105      184      130      93       125      637
Mean of earning (Rs.)    13.80    15.00    15.20    18.20    14.20    $\bar{x}$

Applying formula,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3 + n_4 \bar{x}_4 + n_5 \bar{x}_5$
$637\bar{x}$ = 105 × 13.80 + 184 × 15.00 + 130 × 15.20 + 93 × 18.20 + 125 × 14.20
= 1449 + 2760 + 1976 + 1692.6 + 1775
= 9652.6
∴ $\bar{x} = \dfrac{9652.6}{637}$ = Rs.15.15
[ii] In a survey of a locality the following figures regarding the income of the people in different occupations were received. Find out the average per capita income :
Occupation Average income (in Rs.) Number of people
Business 500 700
Labour 300 300
Craftmanship 200 200
Others 400 100
[D.S.W. '70]

Solution : Let the average per capita income be $\bar{x}$.

Characteristics          I      II     III    IV     Composite group
No. of observations      700    300    200    100    1300
Mean of earning (Rs.)    500    300    200    400    $\bar{x}$

Applying formula,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3 + n_4 \bar{x}_4$
$1300\bar{x}$ = 700 × 500 + 300 × 300 + 200 × 200 + 100 × 400
= 350000 + 90000 + 40000 + 40000
= 520000
∴ $\bar{x} = \dfrac{520000}{1300}$ = Rs.400
∴ Average per capita income = Rs.400
[16] The following shows some data collected for three regions of a country :
Region    No. of inhabitants (Millions)    Percentage of Literates    Average annual income per person (Rs.)
A         10                               52                         850
B         5                                68                         620
C         18                               39                         730
Obtain the over all figures for the three regions taken together.
[C.U., B.A. (Eco.) '77]
Solution : Let the average annual income of entire group be $\bar{x}$.

Characteristics                     A      B      C      Composite group
No. of observations (Millions)      10     5      18     33
Mean of income (Rs.)                850    620    730    $\bar{x}$

Applying formula,
$N\bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3$
$33\bar{x}$ = 10 × 850 + 5 × 620 + 18 × 730
= 8500 + 3100 + 13140
= 24740
∴ $\bar{x} = \dfrac{24740}{33}$ = Rs.749.70

Region    No. of literates (Millions)
A         $\dfrac{52}{100} \times 10 = 5.2$
B         $\dfrac{68}{100} \times 5 = 3.4$
C         $\dfrac{39}{100} \times 18 = 7.02$
Total     15.62

∴ % of literates = $\dfrac{15.62}{33} \times 100$ = 47.33
[17] The population of India in 1951 and in 1961 were 361 and 439 million
respectively. (i) What was the average percentage increase per year during the period?
(ii) If the average rate of increase from 1961 to 1971 remain the same, what would be
the population in 1971?
Solution :
[i] As per formula, $A = P(1 + i)^n$
Or, $439 = 361 (1 + i)^{10}$
Taking logarithm on both sides,
log 439 = log 361 + 10 log (1 + i)
2.6425 = 2.5575 + 10 log (1 + i)
10 log (1 + i) = 2.6425 – 2.5575
= 0.085
log (1 + i) = 0.0085
1 + i = Antilog 0.0085
= 1.020
i = .020 = 2%
[ii] Let A be the population in 1971
∴ $A = 439 (1 + .02)^{10}$
= $439 (1.02)^{10}$
log A = log 439 + 10 log 1.02
= 2.6425 + 10 × 0.0085
= 2.7275
A = Antilog 2.7275
= 533.9 million (from the antilog table, 5333 + 6 = 5339).
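The same calculation can be done without log tables. The Python sketch below is illustrative; the function name `annual_growth_rate` is hypothetical, and the small differences from the table-based answer come from four-figure rounding.

```python
def annual_growth_rate(start, end, years):
    # Solve end = start * (1 + i) ** years for i
    return (end / start) ** (1 / years) - 1

rate = annual_growth_rate(361, 439, 10)     # 1951 -> 1961, in millions
projected_1971 = 439 * (1 + rate) ** 10     # same rate for another 10 years

print(round(rate * 100, 2), round(projected_1971, 1))
```

The rate comes out a little under 2% per year, and the projected 1971 population is about 533.9 million, the same as 439 × (439/361).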
[18] A man gets three successive annual increments in salary of 20%, 30% and 25%,
each percentage being reckoned on his salary at the end of the previous year. How
much better or worse off would he have been if he had been given 3 annual increments
of 25% each, reckoned in the same way?
[I.C.W.A. '74]
Solution : Let R be his starting salary.
Let A1 be his salary when annual increments of 20%, 30% and 25% are given successively, and A2 be his salary when the annual increments are 25% each year.
∴ $A_1 = R \left( 1 + \dfrac{20}{100} \right) \left( 1 + \dfrac{30}{100} \right) \left( 1 + \dfrac{25}{100} \right)$
= R (1.2)(1.3)(1.25)
= R × 1.95
$A_2 = R \left( 1 + \dfrac{25}{100} \right)^3$ = R × 1.953125
∴ A2 – A1 = R (1.953125 – 1.95)
= R × 0.003125
= $\dfrac{R}{320}$
∴ In the second case he would have received $\dfrac{R}{320}$ more.
[19] A machine is assumed to depreciate 40% in value in the first year, 25% in the
second year and 10% per annum for the next 3 years, each percentage being calculated
on the diminishing value. What is the average percentage depreciation, reckoned on
the diminishing value, for the 5 years?
[B.U., B.A. (Eco.) '66, C.U. M.Com. '73]
Solution : Let P be the original value of the machine and i the average rate of depreciation; i1, i2, i3 are the successive rates of depreciation.
$$P(1 - i)^5 = P(1 - i_1)(1 - i_2)(1 - i_3)^3$$
Cancelling P and taking logarithm on both sides,
5 log (1 – i) = log [(1 – 0.40)(1 – 0.25)(1 – 0.10)^3]
= log 0.60 + log 0.75 + 3 log 0.90
= $\bar{1}.7782 + \bar{1}.8751 + 3 \times \bar{1}.9542$
= –1 + 0.7782 – 1 + 0.8751 – 3 + 2.8626
= –5 + 4.5159
= –0.4841
Or, log (1 – i) = –0.0968
= –1 + 1 – 0.0968
= –1 + 0.9032
= $\bar{1}.9032$
Taking Antilog, (1 – i) = 0.8002 (from the antilog table, 7998 + 4 = 8002)
∴ i = 1 – 0.8002 = 0.1998
= .20
Or, i = 20%
∴ Average percentage depreciation = 20%

[20] The G.M. of 4 observations is 47, and the G.M. of 6 others is 40. Find the G.M.
of all the 10 observations.
Solution : Formula states that
If G1, G2, –––– be the G.M. of several groups having n1, n2, –––– observations respectively, then G.M. (G) of the composite group is given by their weighted Geometric Mean.
$$G = \left( G_1^{n_1} \, G_2^{n_2} \cdots \right)^{1/N}$$
$\log G = \dfrac{1}{N} \sum n_i \log G_i$
where N = n1 + n2 + ––––
Here, n1 = 4, n2 = 6, N = 10
G1 = 47, G2 = 40, G = ?
Substituting the values in the formula,
log G = $\dfrac{1}{10} [4 \log 47 + 6 \log 40]$
= $\dfrac{1}{10} [4 \times 1.6721 + 6 \times 1.6021]$
= $\dfrac{1}{10} [6.6884 + 9.6126]$
= $\dfrac{16.3010}{10}$ = 1.6301
Taking Antilog, G = 42.67 (from the antilog table, 4266 + 1 = 4267)
Therefore, G.M. of all the 10 observations = 42.67
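The composite G.M. can also be evaluated directly. The following Python sketch uses natural logarithms in place of the common logarithms of the table method (the result is the same); the function name `composite_gm` is an illustrative assumption.

```python
import math

def composite_gm(group_gms, group_sizes):
    """Weighted G.M.: log G = (1/N) * sum(n_i * log G_i)."""
    total = sum(group_sizes)
    log_g = sum(n * math.log(g) for g, n in zip(group_gms, group_sizes)) / total
    return math.exp(log_g)

g_all = composite_gm([47, 40], [4, 6])
print(round(g_all, 2))
```

This agrees with the table-based value of 42.67 to two decimal places.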
[21] The geometric mean of six numbers is 75. If the geometric mean of four of them
is 67, what is the geometric mean of the other two?
[B.U., B.A. (Eco.) '71]
Solution : As per given condition,
log 75 = $\dfrac{1}{6} [4 \log 67 + 2 \log G_2]$
Or, 1.8751 = $\dfrac{1}{6} [4 \times 1.8261 + 2 \log G_2]$
= $\dfrac{1}{6} [7.3044 + 2 \log G_2]$
= 1.2174 + $\dfrac{1}{3} \log G_2$
Or, $\dfrac{1}{3} \log G_2$ = 1.8751 – 1.2174 = 0.6577
log G2 = 1.9731
Taking Antilog, G2 = 93.99 (from the antilog table, 9397 + 2 = 9399)
= 94
∴ Geometric Mean of other two = 94
[22] You fly to a place X in a Boeing at a speed of 500 miles per hour and come back from X, following the same route, at a speed of 160 m.p.h. What is your average speed for the to-and-fro journey?
[C.U., M.Com. '72]
Solution : Average speed = $\dfrac{2}{\dfrac{1}{500} + \dfrac{1}{160}}$
= $\dfrac{2}{\dfrac{8 + 25}{4000}}$
= $\dfrac{2}{\dfrac{33}{4000}} = \dfrac{8000}{33}$ = 242.4 m.p.h.
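When the same distance is covered at each speed, the average speed is the harmonic mean of the speeds, because total distance divided by total time reduces to that form. A minimal Python sketch, with a hypothetical function name:

```python
def average_speed(speeds):
    """Average speed over equal distances = harmonic mean of the speeds."""
    return len(speeds) / sum(1 / v for v in speeds)

round_trip = average_speed([500, 160])   # out at 500 m.p.h., back at 160 m.p.h.
print(round(round_trip, 1))
```

The result, 8000/33 or about 242.4 m.p.h., is well below the arithmetic mean of 330 m.p.h., because more time is spent on the slower leg.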
[23] An aeroplane flies around a square the sides of which measure 100 Kms each. The aeroplane covers at a speed of 100 Kms. per hour the first side, at 200 Kms. per hour the second side, at 300 Kms. per hour the third side, and at 400 Kms. per hour the fourth side. Use the correct mean to find the average speed round the square.
[I.C.W.A. '78]
Solution : Average speed = $\dfrac{4}{\dfrac{1}{100} + \dfrac{1}{200} + \dfrac{1}{300} + \dfrac{1}{400}}$
= $\dfrac{4}{\dfrac{12 + 6 + 4 + 3}{1200}} = \dfrac{4}{\dfrac{25}{1200}}$ = 192 k.p.h.
[24] If two grades of oranges sell at 10 for Rs.1 and 20 for Rs.1 respectively, calculate the average price per orange, stating your assumptions explicitly.
[B.U., B.A. (Eco.) '69]
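Solution : Let the consumer buy X oranges of each grade.
First grade : 10 oranges for Rs.1
1 orange for Rs. $\dfrac{1}{10}$
X oranges for Rs. $\dfrac{X}{10}$
Second grade : 20 oranges for Rs.1
1 orange for Rs. $\dfrac{1}{20}$
X oranges for Rs. $\dfrac{X}{20}$
Average price = $\dfrac{\dfrac{X}{10} + \dfrac{X}{20}}{2X} = \dfrac{\dfrac{3X}{20}}{2X} = \dfrac{3X}{20} \times \dfrac{1}{2X}$ = Rs. $\dfrac{3}{40}$
= $\dfrac{3}{40} \times 100$ P.
= 7.5 P.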
[25] The weights (in lbs.) of 8 persons are 138, 143, 141, 139, 152, 148, 160 and 267.
Find the average weight using a suitable form of average. Give reasons for your choice.
Solution :

No.    Weight (arranged in order of magnitude)
1.     138
2.     139
3.     141
4.     (143)
5.     (148)
6.     152
7.     160
8.     267

Average Weight = $\dfrac{143 + 148}{2} = \dfrac{291}{2}$ = 145.5 lbs.
Since there is one extremely large value, A.M. will not be suitable. Mode does not exist.
Hence Median is the suitable average, which is 145.5 lbs.
[26] Find the mean and the median for the following data and comment on the shape
of the distribution :
Weight in Kg. 36–40 41–45 46–50 51–55 56–60 61–65 66–70
No. of persons 14 26 40 53 50 37 25
[I.C.W.A., '75 - old]
Solution :
Table : Calculations for Mean and Median

Weight (Kg.)    Frequency (f)    Mid-Value (x)    y = (x – 53)/5    fy     Class Boundary    Cumulative frequency (less-than)
36–40           14               38               –3                –42    35.5–40.5         14
41–45           26               43               –2                –52    40.5–45.5         40
46–50           40               48               –1                –40    45.5–50.5         80
51–55           53               53               0                 0      50.5–55.5         133  ← N/2 = 122.5
56–60           50               58               1                 50     55.5–60.5         183
61–65           37               63               2                 74     60.5–65.5         220
66–70           25               68               3                 75     65.5–70.5         245 = N
Total           245              –                –                 65

$\bar{x} = 53 + \dfrac{65}{245} \times 5$ = 53 + 1.33 = 54.33 Kg.
Median = $l_1 + \dfrac{\dfrac{N}{2} - F}{f_m} \times c = 50.5 + \dfrac{122.5 - 80}{53} \times 5 = 50.5 + \dfrac{42.5}{53} \times 5$ = 50.5 + 4.01 = 54.5 Kg.
Here Mean (54.33) is less than Median (54.5). Hence, the curve is negatively skewed, i.e. the longer tail of the frequency curve lies to the left.
[27] The G.M., H.M. and A.M. of three observations are 3.63, 3.27 and 4 respectively.
Find the observations.
[C.U., M.Com. '75]
Solution : Let the observations be x, y and z.
$\sqrt[3]{xyz} = 3.63$
∴ xyz = 47.83 –––– (1)
$\dfrac{3}{\dfrac{1}{x} + \dfrac{1}{y} + \dfrac{1}{z}} = 3.27$
or, $\dfrac{3}{\dfrac{yz + xz + xy}{xyz}} = 3.27$ or, $\dfrac{3xyz}{xy + yz + zx} = 3.27$ or, $\dfrac{3 \times 47.83}{xy + yz + zx} = 3.27$
∴ xy + yz + zx = 43.88 –––– (2)
$\dfrac{x + y + z}{3} = 4$
or, x + y + z = 12 –––– (3)
Also, taking y as the A.M. of x and z,
$y = \dfrac{x + z}{2}$ or, 2y = x + z –––– (4)
Putting the value of x + z in (3) we get
2y + y = 12
or, y = 4 and x + z = 2 × 4 = 8
Putting xy + zy = (x + z) y = 8 × 4 = 32 in (2),
zx = 43.88 – 32 = 11.88 = 12 (approx.)
Since x = 8 – z, z (8 – z) = 12 or, $z^2 - 8z + 12 = 0$
or, (z – 6)(z – 2) = 0
z = 6 or 2
When z = 6 and y = 4, x + y + z = 12 gives x = 2.
∴ the observations are 2, 4 and 6
[28] Using a suitable formula calculate the median value from the following data :
Mid Value 115 125 135 145 155 165 175 185 195 Total
Frequency 6 25 48 72 116 60 38 22 3 390
[C.A., '66]
Solution :
Table : Calculation for Median.
Mid Value   Class boundaries   Frequency (f)   Cumulative frequency (less-than)
115         110–120            6               6
125         120–130            25              31
135         130–140            48              79
145         140–150            72              151
155         150–160            116             267   ← N/2 = 195
165         160–170            60              327
175         170–180            38              365
185         180–190            22              387
195         190–200            3               390 = N
Total       –                  390             –
\[ \text{Median} = l_1 + \frac{\frac{N}{2} - F}{f_m} \times c = 150 + \frac{195 - 151}{116} \times 10 = 150 + \frac{44}{116} \times 10 = 150 + 3.79 = 153.79 \]
∴ Median = 153.79
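The median formula for grouped data used in these problems can be sketched as a small routine. This is a minimal illustration, assuming the class boundaries are supplied in increasing order; the name `grouped_median` is hypothetical.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data by l1 + (N/2 - F) / fm * c.

    boundaries: class boundaries in increasing order (one more than freqs)
    freqs:      class frequencies
    """
    n = sum(freqs)
    cum = 0  # F: cumulative frequency before the current class
    for i, f in enumerate(freqs):
        if f > 0 and cum + f >= n / 2:
            lower = boundaries[i]
            width = boundaries[i + 1] - boundaries[i]
            return lower + (n / 2 - cum) / f * width
        cum += f
    raise ValueError("no frequencies supplied")

# Data of problem [28]: boundaries 110, 120, ..., 200
boundaries = list(range(110, 201, 10))
freqs = [6, 25, 48, 72, 116, 60, 38, 22, 3]
median = grouped_median(boundaries, freqs)
print(round(median, 2))
```

Because the routine reads the width from the boundaries, it also works for classes of unequal width.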
[29] In a group of 1000 wage earners the monthly wages of 4% are below Rs.60 and
those of 15% are under Rs.62.50. 15% earned Rs.95 and over, and 5% got Rs.100 and
over. Find the median wage.
[B.U., B.A. (Eco.) '70]
Solution :
[a] Under Rs.60 the frequency is 4% of 1000, i.e. 40
[b] Under Rs.62.50 the frequency is 15% of 1000, i.e. 150
[c] Rs.95 and over the frequency is 15% of 1000, i.e. 150
[d] Rs.100 and over the frequency is 5% of 1000, i.e. 50
Monthly wage (Rs.)   Frequency (f)   Cumulative frequency (less-than)
0–60.00              40              40
60–62.50             110             150
62.50–95             700             850   ← N/2 = 500
95–100               100             950
100 and above        50              1000 = N
\[ \text{Median} = 62.50 + \frac{500 - 150}{700} \times 32.5 = 62.50 + \frac{350}{700} \times 32.5 = 62.50 + 16.25 = 78.75 \]
∴ Median = Rs.78.75
[30] The table below gives the frequency distribution of weights of 80 apples :
Weight (gms.) 110–119 120–129 130–139 140–149 150–159 160–169 170–179 180–189
Frequency 5 7 12 20 16 10 7 3
Draw the cumulative frequency diagram and hence determine the median
weight of an apple.
[I.C.W.A. '76]
Solution :
Table : Calculation of median weight
Weight (gms)   Class boundaries   Frequency (f)   Cumulative frequency (less-than upper boundary)
110–119        109.5–119.5        5               5
120–129        119.5–129.5        7               12
130–139        129.5–139.5        12              24
140–149        139.5–149.5        20              44   ← N/2 = 40
150–159        149.5–159.5        16              60
160–169        159.5–169.5        10              70
170–179        169.5–179.5        7               77
180–189        179.5–189.5        3               80 = N
Total          –                  80              –
The cumulative frequency diagram starts from the point (109.5, 0).
[Figure: cumulative frequency diagram (less-than ogive). Horizontal axis: class boundary, 109.5 to 189.5; vertical axis: cumulative frequency, 0 to 80. The horizontal line at N/2 = 40 meets the curve at 147.5.]
∴ Median = 147.5
[31] Draw the less-than Ogive and estimate the value of the median on the basis of the
data given below :
Mid-Point 18 25 32 39 46 53 60
Frequency 10 15 32 42 26 12 9 N=146
[C.A. '74]
Solution:
Table: Ogive (less-than) for the given data
Mid-point   Class boundaries   Frequency   Cumulative frequency (less than upper boundary)
18          14.5–21.5          10          10
25          21.5–28.5          15          25
32          28.5–35.5          32          57
39          35.5–42.5          42          99    ← N/2 = 73
46          42.5–49.5          26          125
53          49.5–56.5          12          137
60          56.5–63.5          9           146 = N
Total       –                  146         –
The ogive starts from the point (14.5, 0).
[Figure: Ogive (less than). Horizontal axis: class boundary; vertical axis: cumulative frequency. The horizontal line at N/2 = 73 meets the curve at about 38.2.]
Median = 38.2
[32] An incomplete frequency distribution is given below :
Height (inches) 5.1-6.0 6.1-7.0 7.1-8.0 8.1-9.0 9.1-10.0 10.1-11.0 11.1-12.0
No of Plants 3 8 27 ? 17 11 9
It is known that the median height of the plants is 8.53 inches. Calculate the
missing frequency.
[I.C.W.A. '72]
Solution: Let the missing frequency be f4.
Table: Calculation for missing frequency
Height (class boundary)   Frequency   Cumulative frequency (less-than)
5.05–6.05                 3           3
6.05–7.05                 8           11
7.05–8.05                 27          38
8.05–9.05                 f4          38 + f4   ← N/2 = (75 + f4)/2
9.05–10.05                17          55 + f4
10.05–11.05               11          66 + f4
11.05–12.05               9           75 + f4 = N
Total                     75 + f4     –
Clearly, the median (8.53) lies in the class 8.05–9.05, so N/2 is more
than 38 but not more than 38 + f4.
As per the given condition,
\[ 8.53 = 8.05 + \frac{\frac{75 + f_4}{2} - 38}{f_4} \times 1 \quad\text{or,}\quad 0.48 = \frac{75 + f_4 - 76}{2 f_4} \]
\[ \text{or, } 0.96 f_4 = f_4 - 1 \quad\text{or,}\quad 0.04 f_4 = 1 \quad\text{or,}\quad f_4 = \frac{1}{0.04} = 25 \]
∴ Missing frequency = 25
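When the missing frequency belongs to the median class itself, the median equation can be solved once in symbols. Writing K for the total of the known frequencies, F for the cumulative frequency before the median class and r = (Median − l1)/c, the condition r = ((K + f)/2 − F)/f gives f = (K − 2F)/(2r − 1). The sketch below applies this; the function name and argument names are illustrative only.

```python
def missing_median_class_frequency(median, lower, width, cum_before, known_total):
    """Solve median = lower + ((known_total + f)/2 - cum_before) / f * width for f."""
    r = (median - lower) / width
    # 2*r*f = known_total + f - 2*cum_before  =>  f = (known_total - 2*cum_before) / (2*r - 1)
    return (known_total - 2 * cum_before) / (2 * r - 1)

# Problem [32]: median 8.53, median class 8.05-9.05, F = 38, known frequencies total 75
f4 = missing_median_class_frequency(8.53, 8.05, 1.0, 38, 75)
print(round(f4))
```

The formula breaks down when r = 1/2, that is, when the median falls exactly at the middle of its class; that case would need separate treatment.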
[33] In the following data two class frequencies are missing:
C.I 100-110 110-120 120-130 130-140 140-150 150-160 160-170 170-180 180-190 190-200
Frequency 4 7 15 ? 40 ? 16 10 6 3
However it was possible to ascertain that the total number of frequencies was 150 and
that the median has been correctly found out as 146.25. Find the two missing frequencies.
[C.A. '73]
Solution : Let ‘a’ and ‘b’ denote the missing frequencies of classes 130-140 and
150-160 respectively.
101 + a + b = 150
a + b = 49
Class boundaries   Frequency     Cumulative frequency
100-110            4             4
110-120            7             11
120-130            15            26
130-140            a             26 + a
140-150            40            66 + a   ← N/2 = 75
150-160            b             66 + a + b
160-170            16            82 + a + b
170-180            10            92 + a + b
180-190            6             98 + a + b
190-200            3             101 + a + b = N
Total              101 + a + b   –
The median (146.25) lies in the class 140-150.
\[ \text{Median} = 140 + \frac{75 - (26 + a)}{40} \times 10 \]
\[ 146.25 = 140 + \frac{49 - a}{4} \quad\text{or,}\quad 6.25 = \frac{49 - a}{4} \quad\text{or,}\quad 25 = 49 - a \]
a = 24
a + b = 49
Or, 24 + b = 49 ∴ b = 25
Therefore, the missing entries are
a = 24, b = 25
[34] Calculate the value of the mode by the usual formula (after grouping if necessary) :
x 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100 100-110
f 4 6 5 10 20 22 24 6 2 1

[C.A. '74]
Solution : Table – Calculation for Mode
Class boundaries (x)   Frequency (f)
10-20                  4
20-30                  6
30-40                  5
40-50                  10
50-60                  20
60-70                  22
70-80                  24
80-90                  6
90-100                 2
100-110                1
Combining adjacent classes in pairs:
Regrouped class boundary   Frequency
10-30                      10
30-50                      15
50-70                      42
70-90                      30
90-110                     3
Mode lies in the regrouped class 50–70.
\[ \text{Mode} = l_1 + \frac{f_0 - f_{-1}}{2 f_0 - f_{-1} - f_1} \times c = 50 + \frac{42 - 15}{2 \times 42 - 15 - 30} \times 20 = 50 + \frac{27}{84 - 45} \times 20 \]
\[ = 50 + \frac{27}{39} \times 20 = 50 + 13.85 = 63.85 \]
∴ Mode = 63.85
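The mode formula used here depends only on the modal class and its two neighbours, so it is easy to sketch in code. The function name `grouped_mode` is illustrative, not from the book.

```python
def grouped_mode(lower, width, f0, f_prev, f_next):
    """Mode = l1 + (f0 - f_prev) / (2*f0 - f_prev - f_next) * c.

    lower, width: lower boundary and width of the modal class
    f0:           frequency of the modal class
    f_prev:       frequency of the class just before it
    f_next:       frequency of the class just after it
    """
    return lower + (f0 - f_prev) / (2 * f0 - f_prev - f_next) * width

# Regrouped data of problem [34]: modal class 50-70 with frequencies 15, 42, 30
mode = grouped_mode(50, 20, 42, 15, 30)
print(round(mode, 2))
```

The formula assumes equal class widths around the modal class, which is why the classes are regrouped before it is applied.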
[35] From the following distribution of weekly earnings, calculate (i) the most usual
wage, and (ii) the percentage earning more than Rs. 31.50
Weekly earning (Rs.) 25 - 26 - 27 - 28 - 29 - 30 - 31 - 32 - 33 - 34 - 35-36 Total
No. of persons 25 70 210 275 430 550 340 130 90 55 25 2200
[ I. C. W. A. ‘73]
Solution : Table : Calculation for Mode
Weekly earnings (Rs.)(x)   Frequency (f)   Cumulative frequency (less than upper boundary)
25-26                      25              25
26-27                      70              95
27-28                      210             305
28-29                      275             580
29-30                      430             1010
30-31                      550             1560
31-32                      340             1900   ← 31.5 corresponds to x
32-33                      130             2030
33-34                      90              2120
34-35                      55              2175
35-36                      25              2200
(i) Mode lies in the class 30-31
\[ \text{Mode} = 30 + \frac{550 - 430}{2 \times 550 - 430 - 340} \times 1 = 30 + \frac{120}{1100 - 770} = 30 + \frac{120}{330} = 30 + 0.36 = 30.36 \]
(ii) For earnings less than Rs.31.50, let the cumulative frequency be x.
∴ By interpolation,
\[ \frac{31.5 - 31}{32 - 31} = \frac{x - 1560}{1900 - 1560} \quad\text{or,}\quad 0.5 = \frac{x - 1560}{340} \]
Or, x – 1560 = 170 or, x = 1730
Cumulative frequency more than Rs.31.50, i.e. 2200 – 1730 = 470
\[ \text{Percentage earning more than Rs.31.50} = \frac{470}{2200} \times 100 = 21.4\% \]
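Part (ii) is a linear interpolation on the less-than ogive. A minimal sketch, assuming the cumulative frequencies are listed at every class boundary (the helper name `cumulative_below` is hypothetical):

```python
def cumulative_below(x, boundaries, cum_freqs):
    """Less-than cumulative frequency at x, by linear interpolation
    between the two class boundaries that enclose x."""
    for i in range(len(boundaries) - 1):
        b0, b1 = boundaries[i], boundaries[i + 1]
        if b0 <= x <= b1:
            c0, c1 = cum_freqs[i], cum_freqs[i + 1]
            return c0 + (x - b0) / (b1 - b0) * (c1 - c0)
    raise ValueError("x lies outside the range of the table")

# Problem [35]: boundaries Rs.25, 26, ..., 36 and the cumulative frequency below each
boundaries = list(range(25, 37))
cum_freqs = [0, 25, 95, 305, 580, 1010, 1560, 1900, 2030, 2120, 2175, 2200]

below = cumulative_below(31.5, boundaries, cum_freqs)
percent_above = (cum_freqs[-1] - below) / cum_freqs[-1] * 100
print(below, round(percent_above, 1))
```

The interpolation assumes that earnings are spread evenly within each class, the same assumption that underlies the median formula.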
[36] Find the mean and Mode for the following :
Years under 10 20 30 40 50 60
Number of persons 15 32 51 78 97 109
[B.U., B.A. (Econ.) ‘73]
Solution : Table – Calculation for mean and mode
Years    Mid-value (x)   Number of persons (cumulative frequency)   Frequency (f)   y = (x − 25)/10   fy
0-10     5               15                                         15              -2                -30
10-20    15              32                                         17              -1                -17
20-30    25              51                                         19              0                 0
30-40    35              78                                         27              1                 27
40-50    45              97                                         19              2                 38
50-60    55              109                                        12              3                 36
Total    –               –                                          109             –                 54
\[ \text{Mean } (\bar{x}) = 25 + \frac{54}{109} \times 10 = 25 + 4.95 = 29.95 \text{ years} \]
The modal class is 30-40 (frequency 27).
\[ \text{Mode} = 30 + \frac{27 - 19}{2 \times 27 - 19 - 19} \times 10 = 30 + \frac{8}{54 - 38} \times 10 = 30 + \frac{8}{16} \times 10 = 35 \text{ years} \]
[37] From the following cumulative frequency distribution of marks obtained by 22
students, calculate (a) arithmetic mean, (b) Median and (c) Mode
Marks No. of students
Below 10 3
Below 20 8
Below 30 17
Below 40 20
Below 50 22
[I.C.W.A ‘77]
Solution: Table : Calculation for A.M.
Class limits   Cumulative frequency   Frequency (f)   Mid-value (x)   y = (x − 24.5)/10   fy   Class boundary
0-9            3                      3               4.5             -2                  -6   −0.5–9.5
10-19          8                      5               14.5            -1                  -5   9.5–19.5
20-29          17                     9               24.5            0                   0    19.5–29.5
30-39          20                     3               34.5            1                   3    29.5–39.5
40-49          22                     2               44.5            2                   4    39.5–49.5
Total          –                      22              –               –                   -4   –
\[ \text{Mean } (\bar{x}) = 24.5 + \left( -\frac{4}{22} \times 10 \right) = 24.5 - 1.82 = 22.68 = 22.7 \]
\[ \text{Median} = 19.5 + \frac{11 - 8}{9} \times 10 = 19.5 + 3.33 = 22.83 = 22.8 \]
\[ \text{Mode} = 19.5 + \frac{9 - 5}{18 - 5 - 3} \times 10 = 19.5 + 4 = 23.5 \]
[38] The table below gives the numbers (f) of candidates obtaining marks (x) or
higher in a certain examination (all marks are given in whole numbers).
x 10 20 30 40 50 60 70 80 90 100
f 140 133 118 100 75 45 25 9 2 0
Calculate the mean and the median marks obtained by the candidates.
[ I.C.W.A.’75]
Solution: Table : Calculation of A.M. and median
Marks (x) or higher   Cumulative frequency (more than)   Class interval   Frequency (f)   Cumulative frequency (less than)
10                    140                                10-20            7               7
20                    133                                20-30            15              22
30                    118                                30-40            18              40
40                    100                                40-50            25              65
50                    75                                 50-60            30              95    ← N/2 = 70
60                    45                                 60-70            20              115
70                    25                                 70-80            16              131
80                    9                                  80-90            7               138
90                    2                                  90-100           2               140 = N
100                   0                                  –                –               –
Total                 –                                  –                140             –
The median corresponds to cumulative frequency N/2 = 70, which lies in the class interval 50-60.
\[ \text{Median} = 50 + \frac{70 - 65}{30} \times 10 = 50 + \frac{5}{30} \times 10 = 50 + 1.67 = 51.67 = 51.7 \text{ (rounded)} \]
Mid-value (x)   y = (x − 55)/10   f     fy
15              -4                7     -28
25              -3                15    -45
35              -2                18    -36
45              -1                25    -25
55              0                 30    0
65              1                 20    20
75              2                 16    32
85              3                 7     21
95              4                 2     8
Total           –                 140   -53
\[ \bar{x} = 55 + \left( \frac{-53}{140} \right) \times 10 = 55 - 3.79 = 51.21 \]
∴ Arithmetic Mean = 51.21
[39] Calculate the values of (i) mean (ii) median and (iii) the two quartiles:
Income (Rs.1000) Under 1 1-2 2-3 3-5 5-10 10-25 25-50 50-100 100-1000
No. of persons 13 90 81 117 66 27 6 2 2
[C.U.M.Com.’74]
Solution: Table : Calculation for mean, median and quartiles
Income (Rs.1000)   Frequency (f)   Mid-value (x)   fx       Cumulative frequency (less-than)
0-1                13              0.5             6.5      13
1-2                90              1.5             135.0    103   ← N/4 = 404/4 = 101 (Q1)
2-3                81              2.5             202.5    184
3-5                117             4               468.0    301   ← N/2 = 404/2 = 202 (Q2)
5-10               66              7.5             495.0    367   ← 3N/4 = (3 × 404)/4 = 303 (Q3)
10-25              27              17.5            472.5    394
25-50              6               37.5            225.0    400
50-100             2               75              150.0    402
100-1000           2               550             1100.0   404 = N
Total              404             –               3254.5   –
\[ \text{[i] Arithmetic Mean } (\bar{x}) = \frac{3254.5}{404} = 8.06 \]
\[ \text{[ii] Median} = 3 + \frac{202 - 184}{117} \times 2 = 3 + \frac{18}{117} \times 2 = 3.31 \]
\[ \text{[iii] First Quartile } (Q_1) = 1 + \frac{101 - 13}{90} \times 1 = 1 + \frac{88}{90} = 1.98 \]
\[ \text{Third Quartile } (Q_3) = 5 + \frac{303 - 301}{66} \times 5 = 5 + \frac{2}{66} \times 5 = 5 + \frac{10}{66} = 5.15 \]
All values are in Rs.1000.
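The median and the two quartiles all come from one interpolation rule with ranks N/2, N/4 and 3N/4. A minimal sketch that handles the unequal class widths of this problem (the name `grouped_quantile` is illustrative):

```python
def grouped_quantile(p, boundaries, freqs):
    """Value below which a fraction p of the observations lies,
    interpolated inside the class that contains rank p*N."""
    n = sum(freqs)
    target = p * n
    cum = 0  # cumulative frequency before the current class
    for i, f in enumerate(freqs):
        if f > 0 and cum + f >= target:
            width = boundaries[i + 1] - boundaries[i]
            return boundaries[i] + (target - cum) / f * width
        cum += f
    raise ValueError("p must lie between 0 and 1")

# Problem [39]: income classes in Rs.1000
boundaries = [0, 1, 2, 3, 5, 10, 25, 50, 100, 1000]
freqs = [13, 90, 81, 117, 66, 27, 6, 2, 2]

q1 = grouped_quantile(0.25, boundaries, freqs)
q2 = grouped_quantile(0.50, boundaries, freqs)
q3 = grouped_quantile(0.75, boundaries, freqs)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

The same call with p = 0.2 or p = 0.6 gives the second or sixth decile.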
[40] In a moderately asymmetrical distribution, the mean and the median are
respectively 25.6 and 26.1 inches. What is the mode of the distribution?
[I.C.W.A.’71]
Solution: For a moderately asymmetrical distribution, Mean – Mode = 3 (Mean – Median).
25.6 – Mode = 3 (25.6 – 26.1) = 3 × (–0.5) = –1.5
Mode = 25.6 + 1.5 = 27.1 inches
[41] In a moderately skewed distribution, Arithmetic Mean = 24.6, and the Mode = 26.1.
Find the value of the Median and explain the reason for the method employed.
[C.A.’67]
Solution:
24.6 – 26.1 = 3 (24.6 – Median)
– 1.5 = 73.8 – 3 Median
3 Median = 73.8 + 1.5 = 75.3
\[ \text{Median} = \frac{75.3}{3} = 25.1 \]
For unimodal distributions of moderate skewness the following approximate
relation has been found to hold : Mean – Mode = 3 (Mean – Median)
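The empirical relation Mean – Mode = 3 (Mean – Median) can be rearranged to give any one of the three averages from the other two: Mode = 3 Median – 2 Mean and Median = (2 Mean + Mode)/3. A short sketch (function names are illustrative):

```python
def mode_from(mean, median):
    """Mode implied by Mean - Mode = 3 (Mean - Median)."""
    return 3 * median - 2 * mean

def median_from(mean, mode):
    """Median implied by the same relation."""
    return (2 * mean + mode) / 3

print(round(mode_from(25.6, 26.1), 2))    # data of problem [40]
print(round(median_from(24.6, 26.1), 2))  # data of problem [41]
```

The relation is only approximate, so the results are estimates rather than exact values.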
[42] Calculate arithmetic mean, and median of the frequency distribution given
below. Hence calculate the mode using the empirical relation between the three.
Class limits 130-134 135-139 140-144 145-149 150-154 155-159 160-164
Frequency 5 15 28 24 17 10 1
[I.C.W.A. '74]
Solution: Table: Calculation for A.M., Median
Class limit   Mid-value (x)   Frequency (f)   y = (x − 147)/5   fy    Class boundary   Cumulative frequency
130-134       132             5               -3                -15   129.5-134.5      5
135-139       137             15              -2                -30   134.5-139.5      20
140-144       142             28              -1                -28   139.5-144.5      48
145-149       147             24              0                 0     144.5-149.5      72    ← N/2 = 50
150-154       152             17              1                 17    149.5-154.5      89
155-159       157             10              2                 20    154.5-159.5      99
160-164       162             1               3                 3     159.5-164.5      100 = N
Total         –               100             –                 -33   –                –
\[ \text{Arithmetic Mean } (\bar{x}) = 147 + \left( -\frac{33}{100} \right) \times 5 = 147 - 1.65 = 145.35 \]
\[ \text{Median} = 144.5 + \frac{50 - 48}{24} \times 5 = 144.5 + \frac{2}{24} \times 5 = 144.5 + 0.42 = 144.92 \]
By the empirical relation, Mean – Mode = 3 (Mean – Median)
Or, 145.35 – Mode = 3 (145.35 – 144.92) = 3 × 0.43 = 1.29
Mode = 145.35 – 1.29 = 144.06
[43] Compute the median and the upper quartile of the following :
Intelligence Quotient (IQ) 55-64 65-74 75-84 85-94 95-104 105-114 115-124 125-134 135-144
No. of students 2 20 79 184 302 207 82 24 4
[C.U., B.A. (Econ.) '76]
Solution : Table : Calculations for median and upper quartile
Class Limits   Class boundaries   Frequency (f)   Cumulative frequency (less-than)
55-64          54.5-64.5          2               2
65-74          64.5-74.5          20              22
75-84          74.5-84.5          79              101
85-94          84.5-94.5          184             285
95-104         94.5-104.5         302             587   ← N/2 = 452
105-114        104.5-114.5        207             794   ← 3N/4 = 678
115-124        114.5-124.5        82              876
125-134        124.5-134.5        24              900
135-144        134.5-144.5        4               904 = N
Total          –                  904             –
\[ \text{Median} = 94.5 + \frac{452 - 285}{302} \times 10 = 94.5 + \frac{167}{302} \times 10 = 94.5 + 5.53 = 100.03 \]
\[ \text{Upper Quartile } (Q_3) = 104.5 + \frac{678 - 587}{207} \times 10 = 104.5 + \frac{91}{207} \times 10 = 104.5 + 4.40 = 108.90 \]
[44] The weekly wages earned by the hundred workers of a factory are set out in the
following table :–
Weekly wages (Rs.) 12.5-17.5 17.5-22.5 22.5-27.5 27.5-32.5 32.5-37.5 37.5-42.5 42.5-47.5 47.5-52.5 52.5-57.5
No. of workers 12 16 25 14 13 10 6 3 1
Calculate the three quartiles of the above distribution taking n/4, 2n/4 and 3n/4 as
their respective ranks.
[C.A.’63]
Solution : Table : Calculations for Quartiles
Class Boundaries   Frequency (f)   Cumulative frequency (less-than)
12.5-17.5          12              12
17.5-22.5          16              28    ← n/4 = 25
22.5-27.5          25              53    ← n/2 = 50
27.5-32.5          14              67
32.5-37.5          13              80    ← 3n/4 = 75
37.5-42.5          10              90
42.5-47.5          6               96
47.5-52.5          3               99
52.5-57.5          1               100 = N
Total              100             –
\[ \text{First Quartile } (Q_1) = 17.5 + \frac{25 - 12}{16} \times 5 = 17.5 + \frac{13}{16} \times 5 = 17.5 + \frac{65}{16} = 17.5 + 4.06 = 21.56 \text{ (Rs.)} \]
\[ \text{2nd Quartile } (Q_2) = 22.50 + \frac{50 - 28}{25} \times 5 = 22.50 + \frac{22}{25} \times 5 = 22.50 + \frac{110}{25} = 22.5 + 4.4 = 26.90 \text{ (Rs.)} \]
\[ \text{3rd Quartile } (Q_3) = 32.50 + \frac{75 - 67}{13} \times 5 = 32.50 + \frac{8}{13} \times 5 = 32.50 + 3.08 = 35.58 \text{ (Rs.)} \]
[45] The following table shows the age distribution of heads of families in a certain
country during the year 1957. Find the median, the third quartile and the second
decile of the distribution. Check your results by the graphical method:
Age of head of family (Yrs.) Under 25 25–29 30–34 35–44 45–54 55–64 65–74 Above 74 Total
Number (Million) 2.3 4.1 5.3 10.6 9.7 6.8 4.4 1.8 45.0
[I.C.W.A. '73]
Solution :
Class boundary   Frequency   Cumulative frequency (less-than)
Under 24.5       2.3         2.3
24.5–29.5        4.1         6.4
29.5–34.5        5.3         11.7   ← 2N/10 = 9
34.5–44.5        10.6        22.3
44.5–54.5        9.7         32.0   ← N/2 = 22.5
54.5–64.5        6.8         38.8   ← 3N/4 = 33.75
64.5–74.5        4.4         43.2
74.5 and above   1.8         45.0 = N
Total            45.0        –
\[ \text{Median} = 44.5 + \frac{22.5 - 22.3}{9.7} \times 10 = 44.5 + \frac{0.2}{9.7} \times 10 = 44.5 + \frac{2}{9.7} = 44.5 + 0.21 = 44.7 \text{ years} \]
\[ \text{Third Quartile } \left( \frac{3N}{4} \right) = 54.5 + \frac{33.75 - 32.0}{6.8} \times 10 = 54.5 + \frac{17.5}{6.8} = 54.5 + 2.57 = 57.07 = 57.1 \text{ years} \]
\[ \text{Second Decile } \left( \frac{2N}{10} \right) = 29.5 + \frac{9 - 6.4}{5.3} \times 5 = 29.5 + \frac{2.6}{5.3} \times 5 = 29.5 + 2.45 = 31.95 \approx 32 \text{ years} \]
[46] For an income distribution of a group of men, 20 percent of men have income
below Rs. 30, 35 percent below Rs.70, 60 percent below Rs.150 and 80 percent below
Rs.250, The first and third quartiles are Rs.50 and Rs.170
Put the above information in a cumulative frequency distribution and find the median.
[C.U. M.Com. ‘66]

Solution:
Income below Rs.30: the cumulative frequency is 20%
Income below Rs.50 (first quartile): the cumulative frequency is 25%
Income below Rs.170 (third quartile): the cumulative frequency is 75%
Table : Calculations for Median
Income (Rs.)   Cumulative frequency in % (less than)
30             20
50             25
70             35
                      ← Median, N/2 = 50
150            60
170            75
250            80
Above 250      100 = N
By interpolation,
\[ \frac{\text{Median} - 70}{150 - 70} = \frac{50 - 35}{60 - 35} \quad\text{Or,}\quad \frac{\text{Median} - 70}{80} = \frac{15}{25} \]
\[ \text{Or, Median} - 70 = \frac{15}{25} \times 80 = 48 \]
Median = 70 + 48 = 118
∴ Median = Rs.118
[47] For a group of 5000 workers the weekly wages vary from Rs.20 to Rs.80. The wages
of 4 percent of the workers are under Rs.25 and those of 10 percent are under Rs.30; 15
percent of the workers earn Rs.60 and over, and 5 percent of them get Rs.70 and over. The
quartile wages are Rs.40 and Rs.54, and the sixth decile is Rs.50. Put the above information
in the form of a frequency distribution and find the mean wage therefrom.
[I.C.W.A. ‘71]
Solution : Number of workers = 5000
Under Rs.25 the frequency is 4%, i.e. 200
Under Rs.30 the frequency is 10%, i.e. 500
Under Rs.40 (first quartile) the frequency is 25%, i.e. 1250
Under Rs.50 (sixth decile) the frequency is 60%, i.e. 3000
Under Rs.54 (third quartile) the frequency is 75%, i.e. 3750
Rs.60 and over the frequency is 15%, i.e. 750, so under Rs.60 it is 4250
Rs.70 and over the frequency is 5%, i.e. 250, so under Rs.70 it is 4750
Class boundary   Frequency (f)   Mid-value (x)   y = (x − 45)/2.5   fy
20-25            200             22.5            -9                 -1800
25-30            300             27.5            -7                 -2100
30-40            750             35.0            -4                 -3000
40-50            1750            45.0            0                  0
50-54            750             52.0            2.8                2100
54-60            500             57.0            4.8                2400
60-70            500             65.0            8.0                4000
70-80            250             75.0            12.0               3000
Total            5000            –               –                  4600
\[ \text{Arithmetic Mean } (\bar{x}) = 45 + \frac{4600}{5000} \times 2.5 = 45 + \frac{11500}{5000} = 45 + 2.30 = 47.30 \]
∴ Arithmetic Mean = Rs.47.30
[48] For a certain group of ‘Saree’ weavers of Varanasi, the median and quartiles of
earnings per week are Rs.44.30, Rs.43.00 and Rs.45.90 respectively. Ten percent of the
group earn under Rs.42 per week, 13% earn Rs.47 and over, and 6% earn Rs.48 and
over. The range of earnings per week is Rs.40 – Rs.50. Put the data into a frequency
distribution.
[C.U., B.A. (Eco.) ‘70]
Solution:
As per the conditions given:
Earnings per week under Rs.42.00: cumulative frequency 10%
Under Rs.43.00 (first quartile): 25%; under Rs.44.30 (median): 50%; under Rs.45.90 (third quartile): 75%
Under Rs.47.00: 100 – 13 = 87%; under Rs.48.00: 100 – 6 = 94%
Table: Frequency Distribution of Wages
Weekly Wages (Rs.)   Cumulative Frequency (%) (less-than)   Frequency (%)
40-42                10                                     10
42-43                25                                     15
43-44.30             50                                     25
44.30-45.90          75                                     25
45.90-47.00          87                                     12
47.00-48.00          94                                     7
48.00-50.00          100                                    6
[50] Given below is the frequency distribution of carbon content (percent) in 150
determinations of a certain mixed powder.
Percent carbon 4.0-4.1 4.2-4.3 4.4-4.5 4.6-4.7 4.8-4.9 5.0-5.1 5.2-5.3 5.4-5.5 5.6-5.7
Frequency 1 2 7 20 25 30 10 25 30
Compute the arithmetic mean and median.
[I.C.W.A. ‘78]
Solution : Table : Calculations for A.M. and Median
Class limit (%)   Frequency (f)   Mid-value (x)   y = (x − 4.85)/0.1   fy    Class boundary   Cumulative frequency
4.0 – 4.1         1               4.05            -8                   -8    3.95 – 4.15      1
4.2 – 4.3         2               4.25            -6                   -12   4.15 – 4.35      3
4.4 – 4.5         7               4.45            -4                   -28   4.35 – 4.55      10
4.6 – 4.7         20              4.65            -2                   -40   4.55 – 4.75      30
4.8 – 4.9         25              4.85            0                    0     4.75 – 4.95      55
5.0 – 5.1         30              5.05            2                    60    4.95 – 5.15      85    ← N/2 = 75
5.2 – 5.3         10              5.25            4                    40    5.15 – 5.35      95
5.4 – 5.5         25              5.45            6                    150   5.35 – 5.55      120
5.6 – 5.7         30              5.65            8                    240   5.55 – 5.75      150 = N
Total             150             –               –                    402   –                –
\[ \text{Arithmetic Mean } (\bar{x}) = 4.85 + \frac{402}{150} \times 0.1 = 4.85 + \frac{40.2}{150} = 4.85 + 0.268 = 5.118 \]
Arithmetic Mean = 5.118%
\[ \text{Median} = 4.95 + \frac{75 - 55}{30} \times 0.20 = 4.95 + 0.133 = 5.083 \]
Median = 5.083%
[52] Compute the arithmetic mean, median and mode of the following distribution
and explain their relationships :
Monthly income (Rs.) 0-75 75-150 150-225 225-300 300-375 375-450
Frequency 15 200 250 225 10 5
[C.U., M.Com.'76]
Solution : Calculations for A.M., Median and Mode
Class boundary   Frequency (f)   Mid-value (x)   y = (x − 187.5)/75   fy     Cumulative frequency (less-than)
0-75             15              37.5            -2                   -30    15
75-150           200             112.5           -1                   -200   215
150-225          250             187.5           0                    0      465   ← N/2 = 352.5
225-300          225             262.5           1                    225    690
300-375          10              337.5           2                    20     700
375-450          5               412.5           3                    15     705 = N
Total            705             –               –                    30     –
\[ \text{Arithmetic Mean } (\bar{x}) = 187.5 + \frac{30}{705} \times 75 = 187.5 + 3.19 = 190.69 \]
Arithmetic Mean = Rs.190.69
\[ \text{Median} = 150 + \frac{352.5 - 215}{250} \times 75 = 150 + \frac{137.5}{250} \times 75 = 150 + \frac{10312.5}{250} = 150 + 41.25 = 191.25 \]
∴ Median = Rs.191.25
\[ \text{Mode} = 150 + \frac{250 - 200}{2 \times 250 - 200 - 225} \times 75 = 150 + \frac{50}{500 - 425} \times 75 = 150 + \frac{50}{75} \times 75 = 150 + 50 = 200 \]
∴ Mode = Rs.200
Here Mean – Mode = 190.69 – 200 = –9.31, while 3 (Mean – Median) = 3 (190.69 – 191.25) = –1.68.
Hence the relation Mean – Mode = 3 (Mean – Median), which is meant for moderately skewed
distributions, does not hold good here.
[53] [a] If z1 = x1 + y1, z2 = x2 + y2, ----, zn = xn + yn, then prove that \( \bar{z} = \bar{x} + \bar{y} \), where the
symbols have their usual meaning.
[b] Prove that the logarithm of the geometric mean of observations is the
arithmetic mean of the logarithms of the observations. [D. M.’73]
Solution:
[a] z1 + z2 + ---- + zn = x1 + y1 + x2 + y2 + ---- + xn + yn
or, z1 + z2 + ---- + zn = (x1 + x2 + ---- + xn) + (y1 + y2 + ---- + yn)
\[ \text{or, } \frac{z_1 + z_2 + \cdots + z_n}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} + \frac{y_1 + y_2 + \cdots + y_n}{n} \]
\[ \text{or, } \bar{z} = \bar{x} + \bar{y} \]
[b] Given the observations x1, x2, ----, xn, the simple G.M. is
\[ g = (x_1 \cdot x_2 \cdots x_n)^{1/n} \]
Taking logarithms of both sides, we have
\[ \log g = \frac{1}{n} (\log x_1 + \log x_2 + \cdots + \log x_n) = \frac{1}{n} \Sigma \log x_i \]
In the case of the weighted G.M., with frequencies f1, f2, ----, fn and N = Σ fi,
\[ G = \left( x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n} \right)^{1/N} \]
Taking logarithms of both sides, we have
\[ \log G = \frac{1}{N} \left[ f_1 \log x_1 + f_2 \log x_2 + \cdots + f_n \log x_n \right] = \frac{1}{N} \Sigma f_i \log x_i \]
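Result [b] can be checked numerically. The sketch below uses a small hypothetical data set (2, 4, 8 with illustrative frequencies 1, 2, 1) and compares the logarithm of the G.M. with the mean of the logarithms, for both the simple and the weighted case.

```python
import math

values = [2, 4, 8]   # hypothetical observations
freqs = [1, 2, 1]    # hypothetical frequencies for the weighted case

# Simple G.M.: n-th root of the product
n = len(values)
g = math.prod(values) ** (1 / n)
log_g = math.log10(g)
mean_of_logs = sum(math.log10(x) for x in values) / n

# Weighted G.M.: N-th root of the product of x_i ** f_i
total = sum(freqs)
G = math.prod(x ** f for x, f in zip(values, freqs)) ** (1 / total)
log_G = math.log10(G)
weighted_mean_of_logs = sum(f * math.log10(x) for x, f in zip(values, freqs)) / total

print(math.isclose(log_g, mean_of_logs), math.isclose(log_G, weighted_mean_of_logs))
```

In practice this identity is how a G.M. is computed by hand or for long series: average the logarithms, then take the antilogarithm.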
[54] The following are the population figures (in thousand) of 10 cities. Find the
median: 2488, 1490, 777, 733, 522, 672, 591, 407, 387 and 391.
[D.M.’78]
Solution : The figures are arranged in order of magnitude:
387, 391, 407, 522, (591,) (672), 733, 777, 1490, 2488
There are 10 observations, i.e. an even number of observations. Hence, the median is the
arithmetic mean of the two middle-most observations.
\[ \text{Median} = \frac{591 + 672}{2} = \frac{1263}{2} = 631.5 \]
∴ Median = 631.5 thousands.
[55] The Mean weight per student in a group of 6 students is 119 lbs. The individual
weights of 5 of them are 115 lbs.,109 lbs., 129 lbs., 117 lbs, and 114 lbs. what is the
weight of the other student of the group?
Solution: Let the weight of the remaining student be x lbs.
As per the condition given,
\[ 119 = \frac{115 + 109 + 129 + 117 + 114 + x}{6} \quad\text{or,}\quad 119 = \frac{584 + x}{6} \]
or, 714 = 584 + x
or, x = 714 – 584
= 130 lbs.
Hence the weight of the other student = 130 lbs.
[56] A factory has 5 sections, employing 105, 184, 130, 93 and 124 workers. The mean
earnings per worker in a certain week are Rs. 63.84, 65.12, 65.27, 68.19 and 64.22 for the 5
sections. Determine the mean earnings per worker for the whole factory.
Solution: Applying the formula for the mean of a composite group,
\[ N \bar{x} = n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3 + n_4 \bar{x}_4 + n_5 \bar{x}_5 \]
where N = n1 + n2 + n3 + n4 + n5 and
\( \bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5 \) are the respective means of the groups.
∴ N = 105 + 184 + 130 + 93 + 124 = 636
\[ 636\, \bar{x} = 105 \times 63.84 + 184 \times 65.12 + 130 \times 65.27 + 93 \times 68.19 + 124 \times 64.22 \]
= 6703.2 + 11982.08 + 8485.1 + 6341.67 + 7963.28
= 41475.33
\[ \therefore\; \bar{x} = \frac{41475.33}{636} = 65.21 \]
∴ Mean earnings per worker = Rs.65.21
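The composite-mean formula is a weighted average of the group means, with the group sizes as weights. A minimal sketch using the figures of problem [56] (the function name `combined_mean` is illustrative):

```python
def combined_mean(sizes, means):
    """Mean of a composite group: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

sizes = [105, 184, 130, 93, 124]
means = [63.84, 65.12, 65.27, 68.19, 64.22]
factory_mean = combined_mean(sizes, means)
print(round(factory_mean, 2))
```

A simple average of the five section means would ignore the different section sizes and is not the mean for the whole factory.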
[57] The mean monthly income of a gentleman is Rs.1219 and his mean monthly
expenditure comes out to be Rs.1193. What are his mean monthly savings?
Solution: Yearly income = 12 × 1219 = Rs.14628 ------(1)
Yearly expenditure = 12 × 1193 = Rs.14316 ------(2)
(1) – (2) = Yearly saving = Rs.312
\[ \text{Mean monthly saving} = \frac{312}{12} = \text{Rs.}26 \]
[58] The following data show the length of ear – head (in cm.) for 24 ears of a variety
of wheat. Compute the mean and the median.
11.5 8.8 10.1
8.2 9.3 10.0
9.7 10.1 10.3
10.3 11.3 9.8
10.7 9.8 9.3
8.6 10.4 9.8
11.3 8.4 9.0
10.7 9.6 11.2
Solution : Table: Calculations for A.M. and Median
Class limit    Frequency (f)   Mid-value (x)   fx       Class boundary    Cumulative frequency
8.0-8.70       3               8.35            25.05    7.995-8.705       3
8.71-9.41      4               9.06            36.24    8.705-9.415       7
9.42-10.12     8               9.77            78.16    9.415-10.125      15    ← N/2 = 12
10.13-10.83    5               10.48           52.40    10.125-10.835     20
10.84-11.54    4               11.19           44.76    10.835-11.545     24 = N
Total          24              –               236.61   –                 –
\[ \text{Arithmetic mean } (\bar{x}) = \frac{236.61}{24} = 9.858 = 9.9 \text{ cm.} \]
\[ \text{Median} = 9.415 + \frac{12 - 7}{8} \times 0.71 = 9.415 + \frac{5}{8} \times 0.71 = 9.415 + 0.444 = 9.859 = 9.9 \text{ cm.} \]
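The solution above groups the 24 readings into classes before averaging. As a cross-check, the same averages can be taken directly from the ungrouped readings; the sketch below does this with Python's standard `statistics` module.

```python
import statistics

# The 24 ear-head lengths (cm.) as listed in the problem
lengths = [
    11.5, 8.8, 10.1,
    8.2, 9.3, 10.0,
    9.7, 10.1, 10.3,
    10.3, 11.3, 9.8,
    10.7, 9.8, 9.3,
    8.6, 10.4, 9.8,
    11.3, 8.4, 9.0,
    10.7, 9.6, 11.2,
]

raw_mean = statistics.mean(lengths)      # sum of readings / 24
raw_median = statistics.median(lengths)  # average of the 12th and 13th ordered readings
print(round(raw_mean, 3), round(raw_median, 2))
```

The ungrouped values agree with the grouped results to one decimal place, which is all the grouping approximation can be expected to give.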

[59] For a certain frequency table with total frequency 150, the mean was found to be
Rs.76.47. But while copying out the table, a typist left out two of the class frequencies,
say f* and f**, so that the table is given to you in the following form :
Weekly wages in Rs. (mid-value) 65 70 75 80 85 90 95 Total
Frequency 5 48 f* 30 f** 8 6 150
Determine f* and f**
Solution: Calculation for Arithmetic Mean
Mid-value (x)   Frequency (f)    fx
65              5                325
70              48               3360
75              f*               75 f*
80              30               2400
85              f**              85 f**
90              8                720
95              6                570
Total           97 + f* + f**    7375 + 75 f* + 85 f**
97 + f* + f** = 150
f* + f** = 53 ––––(1)
\[ \text{A.M. } (\bar{x}) = \frac{7375 + 75 f^{*} + 85 f^{**}}{150} \quad\text{Or,}\quad 76.47 = \frac{7375 + 75 f^{*} + 85 f^{**}}{150} \]
Or, 11470.5 = 7375 + 75f* + 85f** Or, 75f* + 85f** = 4095.5 Or, 15f* + 17f** = 819.1 ---(2)
Equation (1) × 15 : 15f* + 15f** = 795.0 ----- (3)
Equation (2) – (3), we get 2f** = 24.1
\[ \therefore\; f^{**} = \frac{24.1}{2} = 12.05 \approx 12 \]
Putting the value of f** in equation (1), we get
f* + 12 = 53
f* = 53 – 12 = 41
[60] The number of telephone calls received at an exchange per interval for 245
successive one-minute intervals are shown in the following frequency distribution :
Number of calls   Frequency
0                 14
1                 21
2                 25
3                 43
4                 51
5                 40
6                 39
7                 12
Total             245
Evaluate the mean, median and mode.
Solution: Calculations for A.M., Median and Mode
x       f     Cumulative frequency   y = x – 4   fy
0       14    14                     -4          -56
1       21    35                     -3          -63
2       25    60                     -2          -50
3       43    103                    -1          -43
4       51    154                    0           0
5       40    194                    1           40
6       39    233                    2           78
7       12    245 = N                3           36
Total   245   –                      –           -58
\[ \text{Arithmetic Mean } (\bar{x}) = 4 + \left( -\frac{58}{245} \right) = 4 - 0.24 = 3.76 \]
Mode = 4 (maximum frequency)
\[ \text{Median} = \text{value of the } \frac{N + 1}{2} \text{th observation} = \frac{245 + 1}{2} = 123\text{rd observation} \]
The value of x corresponding to cumulative frequency 123 is 4.
Median = 4.
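For a discrete frequency distribution like this one, the three averages can be read off directly from the values and their frequencies. A minimal sketch, assuming the values are listed in increasing order and N is odd (the function name `discrete_summary` is illustrative):

```python
def discrete_summary(values, freqs):
    """Return (mean, median, mode) of a discrete frequency distribution.

    values must be in increasing order. The median is taken as the value of
    the ((N + 1) / 2)-th observation, which is exact when N is odd.
    """
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    mode = values[freqs.index(max(freqs))]  # value with the largest frequency
    rank = (n + 1) / 2
    cum = 0
    median = None
    for x, f in zip(values, freqs):
        cum += f
        if cum >= rank:  # first value whose cumulative frequency reaches the rank
            median = x
            break
    return mean, median, mode

calls = [0, 1, 2, 3, 4, 5, 6, 7]
freqs = [14, 21, 25, 43, 51, 40, 39, 12]
mean, median, mode = discrete_summary(calls, freqs)
print(round(mean, 2), median, mode)
```

For an even N the two middle observations may fall on different values; this sketch does not average them.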
[61] Compute the mean, median and mode for the following frequency distribution:
Frequency distribution of I.Q. for 309 six-year-old children
I.Q.        Frequency
160 – 169   2
150 – 159   3
140 – 149   7
130 – 139   19
120 – 129   37
110 – 119   79
100 – 109   69
90 – 99     65
80 – 89     17
70 – 79     5
60 – 69     3
50 – 59     2
40 – 49     1
Total       309
Solution : Rearranging the data
Class limits   Frequency (f)   Cumulative frequency   Class boundary   Mid-value (x)   y = (x − 104.5)/10   fy
40 – 49        1               1                      39.5 – 49.5      44.5            -6                   -6
50 – 59        2               3                      49.5 – 59.5      54.5            -5                   -10
60 – 69        3               6                      59.5 – 69.5      64.5            -4                   -12
70 – 79        5               11                     69.5 – 79.5      74.5            -3                   -15
80 – 89        17              28                     79.5 – 89.5      84.5            -2                   -34
90 – 99        65              93                     89.5 – 99.5      94.5            -1                   -65
100 – 109      69              162                    99.5 – 109.5     104.5           0                    0      ← N/2 = 154.5
110 – 119      79              241                    109.5 – 119.5    114.5           1                    79
120 – 129      37              278                    119.5 – 129.5    124.5           2                    74
130 – 139      19              297                    129.5 – 139.5    134.5           3                    57
140 – 149      7               304                    139.5 – 149.5    144.5           4                    28
150 – 159      3               307                    149.5 – 159.5    154.5           5                    15
160 – 169      2               309 = N                159.5 – 169.5    164.5           6                    12
Total          309             –                      –                –               –                    123
\[ \text{Mean} = 104.5 + \frac{123}{309} \times 10 = 104.5 + 3.98 = 108.48 \]
\[ \text{Median} = 99.5 + \frac{154.5 - 93}{69} \times 10 = 99.5 + \frac{61.5}{69} \times 10 = 99.5 + 8.91 = 108.41 \]
\[ \text{Mode} = 109.5 + \frac{79 - 69}{2 \times 79 - 69 - 37} \times 10 = 109.5 + \frac{10}{158 - 106} \times 10 = 109.5 + \frac{100}{52} = 109.5 + 1.92 = 111.42 \]
[62] Determine the median and mode for the following distribution of monthly
income for 580 middle-class people :
Monthly income (Rs.)   Frequency
–300                   53
300–350                81
350–400                114
400–450                195
450–500                63
500–550                32
550–600                20
600–650                11
650–700                8
700–                   3
Total                  580
Solution : Table: Calculations for Median and Mode
Class boundary   Frequency   Cumulative frequency
– 300            53          53
300 – 350        81          134
350 – 400        114         248
400 – 450        195         443   ← N/2 = 290
450 – 500        63          506
500 – 550        32          538
550 – 600        20          558
600 – 650        11          569
650 – 700        8           577
700 –            3           580 = N
Total            580         –
Median lies in the class 400 – 450
\[ \therefore\; \text{Median} = 400 + \frac{290 - 248}{195} \times 50 = 400 + \frac{42}{195} \times 50 = 400 + 10.77 = \text{Rs.}410.77 \]
Mode lies in the class 400 – 450, i.e. the class with the maximum frequency of 195
\[ \therefore\; \text{Mode} = 400 + \frac{195 - 114}{2 \times 195 - 114 - 63} \times 50 = 400 + \frac{81}{390 - 177} \times 50 = 400 + \frac{81 \times 50}{213} = 400 + \frac{4050}{213} = 400 + 19.01 = \text{Rs.}419.01 \]
[63] The age-distribution of 4488 Bengali males is given below
Age last birth day   Frequency
0                    156
1                    121
2                    111
3                    106
4                    103
5–9                  472
10–14                434
15–19                407
20–24                383
25–29                357
30–34                335
35–39                306
40–49                522
50–59                370
60–69                213
70–79                80
80–89                11
90–99                1
Total                4488
Compute the mean age of Bengali males by means of formula.
Solution : The given data have been grouped into 3 sub-groups such as :
Value (x1)   Frequency (f1)   y1 = x1 − 2   f1 y1
0            156              -2            -312
1            121              -1            -121
2            111              0             0
3            106              1             106
4            103              2             206
Total        597              –             -121
\[ \bar{x}_1 = 2 + \left( -\frac{121}{597} \right) = 2 - 0.20 = 1.80, \qquad n_1 = 597 \]
Class limits   Frequency (f2)   Mid-value (x2)   y2 = (x2 − 22)/5   f2 y2
5 – 9          472              7                -3                 -1416
10 – 14        434              12               -2                 -868
15 – 19        407              17               -1                 -407
20 – 24        383              22               0                  0
25 – 29        357              27               1                  357
30 – 34        335              32               2                  670
35 – 39        306              37               3                  918
Total          2694             –                –                  -746
\[ \bar{x}_2 = 22 + \left( -\frac{746}{2694} \right) \times 5 = 22 - 1.38 = 20.62, \qquad n_2 = 2694 \]
Class limits   Frequency (f3)   Mid-value (x3)   y3 = (x3 − 64.5)/10   f3 y3
40 – 49        522              44.5             -2                    -1044
50 – 59        370              54.5             -1                    -370
60 – 69        213              64.5             0                     0
70 – 79        80               74.5             1                     80
80 – 89        11               84.5             2                     22
90 – 99        1                94.5             3                     3
Total          1197             –                –                     -1309
\[ \bar{x}_3 = 64.5 + \left( -\frac{1309}{1197} \right) \times 10 = 64.5 - 10.94 = 53.56, \qquad n_3 = 1197 \]
Therefore, according to the formula for the composite mean,
\[ \text{Arithmetic Mean } (\bar{x}) = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} = \frac{597 \times 1.80 + 2694 \times 20.62 + 1197 \times 53.56}{597 + 2694 + 1197} \]
\[ = \frac{1074.6 + 55550.28 + 64111.32}{4488} = \frac{120736.2}{4488} = 26.90 \]
MEASURES OF DISPERSION
Meaning :
The word dispersion is used to denote the ‘degree of heterogeneity’ in the data. It
is an important characteristic indicating the extent to which observations vary among
themselves. The dispersion of a given set of observations will be zero only when all of
them are equal. The wider the discrepancy from one observation to another, the larger
will be the dispersion.
A measure of dispersion is designed to state numerically the extent to which
individual observations vary on the average. There are several measures of dispersion.
Measures of Dispersion
  Absolute measures : Range, Quartile Deviation, Mean Deviation, Standard Deviation
  Relative measures : Coefficient of Variation, Coefficient of Quartile Deviation, Coefficient of Mean Deviation
Range
Range of a set of observations is the difference between the maximum and the
minimum value.
Range = Maximum value – Minimum value
Quartile Deviation (or Semi-interquartile Range)
Quartile Deviation is defined as half the difference between the upper and the
lower quartiles.
\[ \text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} \]
The difference Q3 – Q1, being the distance between the quartiles, may be called the
interquartile range, and half of this is the semi-interquartile range.
Mean Deviation (or Mean Absolute Deviation)
Mean Deviation of a set of observations is the arithmetic mean of absolute
deviations from the mean or any other specified value.
Given the observations x1, x2, - - -, xn, in order to find the ‘Mean Deviation about A’,
we first obtain the deviations (x1 – A), (x2 – A), - - -, (xn – A). Some of these deviations
may be positive and some negative. If we write |xi – A| to denote the positive value of
(xi – A), whatever be the actual sign, the sum of these ‘absolute deviations’ is |x1 – A| +
|x2 – A| + - - - + |xn – A| = Σ |xi – A| and the A.M. of the absolute deviations is
\[ \text{Mean Deviation about } A = \frac{1}{n} \Sigma |x_i - A| \]
Mean Deviation (M.D.) is usually calculated about the arithmetic mean \( \bar{x} \), and hence
‘Mean Deviation’ alone refers to M.D. about the mean.
For simple series,
\[ \text{Mean Deviation} = \frac{1}{n} \Sigma |x_i - \bar{x}| \]
For frequency distribution,
\[ \text{Mean Deviation} = \frac{1}{N} \Sigma f_i |x_i - \bar{x}| \]
An important property of M.D. is that it has the minimum value when deviations
are taken from the median, i.e. M.D. about the median is the least.
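The minimum property of M.D. can be seen on a small example. The sketch below uses a hypothetical series (2, 4, 6, 8, 30) and a helper named `mean_deviation`, neither of which comes from the book, and computes the M.D. about the mean and about the median.

```python
import statistics

def mean_deviation(values, about):
    """Arithmetic mean of the absolute deviations from the value `about`."""
    return sum(abs(x - about) for x in values) / len(values)

data = [2, 4, 6, 8, 30]  # hypothetical simple series

md_about_mean = mean_deviation(data, statistics.mean(data))      # mean is 10
md_about_median = mean_deviation(data, statistics.median(data))  # median is 6
print(md_about_mean, md_about_median)
```

The deviation about the median comes out smaller than the deviation about the mean, as the property requires.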
Standard Deviation (S.D.)
Standard Deviation of a set of observations is the square root of the arithmetic mean
of squares of deviations from the arithmetic mean. In short, S.D. may be defined as “Root
– Mean – Square – Deviation from mean”. It is usually denoted by the Greek small
letter σ (sigma). If x1, x2, - - -, xn be a set of observations and \( \bar{x} \) their A.M., then
Deviations from mean :
\[ (x_1 - \bar{x}),\; (x_2 - \bar{x}),\; \ldots,\; (x_n - \bar{x}) \]
Square – Deviations from mean :
\[ (x_1 - \bar{x})^2,\; (x_2 - \bar{x})^2,\; \ldots,\; (x_n - \bar{x})^2 \]
Mean – Square – Deviation from mean :
\[ \frac{1}{n} \left[ (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2 \right] = \frac{1}{n} \Sigma (x_i - \bar{x})^2 \]
Root – Mean – Square – Deviation from mean, i.e.
\[ \text{Standard Deviation } (\sigma) = \sqrt{\frac{1}{n} \Sigma (x_i - \bar{x})^2} \]
The square of the standard deviation is known as variance.
Variance = (S.D.)²
\[ \text{For simple series, } \sigma^2 = \frac{1}{n} \Sigma (x_i - \bar{x})^2 \]
\[ \text{For frequency distribution, } \sigma^2 = \frac{1}{N} \Sigma f_i (x_i - \bar{x})^2 \]
S.D. is always considered as positive.
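The definition translates step by step into code: deviations from the mean, their squares, the mean of the squares, and finally the square root. The sketch below uses a hypothetical series and an illustrative function name; Python's `statistics.pstdev` uses the same divisor n and is shown for comparison.

```python
import math
import statistics

def standard_deviation(values):
    """Root-mean-square deviation from the arithmetic mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    mean_square_deviation = sum((x - mean) ** 2 for x in values) / n  # variance
    return math.sqrt(mean_square_deviation)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical simple series with mean 5
sd = standard_deviation(data)
variance = sd ** 2
print(sd, variance, statistics.pstdev(data))
```

Note that `statistics.stdev` (without the leading `p`) divides by n − 1 instead of n and so gives a slightly larger value than the S.D. defined in this chapter.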
Important properties of S.D.
[a] S.D. is independent of the charge of origin;
i.e. if y = x - c, where c is a constant, then S.D. of x = S.D. of y
In symbol, σx = σy
This implies that the same S.D. will be obtained if each of the observations
is increased or decreased by a constant.
www.gayali.in
[b] If two variables x and z are so related that z = ax + b for each x = xi where
a and b are constant, then σz =| a | σx.
Where | a | denotes the positive value of a
In particular, if y = (x – c)/d, where c and d are constants (d positive), then σx = d. σy
This implies that S.D. does not depend on origin, but depends on scale
of measurement. If each observation is multiplied or divided by a constant, S.D.
will also be similarly affected,

www.gayali.in
[c] If a group of $n_1$ observations has mean $\bar{x}_1$ and S.D. $\sigma_1$, and another group
of $n_2$ observations has mean $\bar{x}_2$ and S.D. $\sigma_2$, then the S.D. (σ) of the composite group of
$n_1 + n_2$ (= N, say) observations can be obtained by the formula
$N\sigma^2 = (n_1\sigma_1^2 + n_2\sigma_2^2) + (n_1 d_1^2 + n_2 d_2^2)$ - - - (i)
where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$
and $N\bar{x} = n_1\bar{x}_1 + n_2\bar{x}_2$
Relation (i) may be extended to any number of groups :
$N\sigma^2 = \sum n_i\sigma_i^2 + \sum n_i d_i^2$
where $d_i = \bar{x}_i - \bar{x}$, $N = \sum n_i$ and $\bar{x}$ is the mean of the composite group, given by
$N\bar{x} = \sum n_i\bar{x}_i$
[d] S.D. is the minimum root-mean-square deviation, i.e.
$\sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2} \le \sqrt{\frac{1}{n}\sum (x_i - A)^2}$
whatever be the value of A.
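Property [c] can be checked numerically. The following is only a minimal sketch: the function names `mean_sd` and `combined_mean_sd` and the two small groups are made up for illustration and do not come from the text.

```python
import math

def mean_sd(xs):
    """Mean and S.D. (divisor n) computed directly from raw observations."""
    n = len(xs)
    mean = sum(xs) / n
    return mean, math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

def combined_mean_sd(groups):
    """groups is a list of (n_i, mean_i, sd_i).
    Applies N*sigma^2 = sum(n_i*sigma_i^2) + sum(n_i*d_i^2), d_i = mean_i - overall mean."""
    N = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / N
    total = sum(n * s ** 2 + n * (m - mean) ** 2 for n, m, s in groups)
    return mean, math.sqrt(total / N)

g1 = [2, 4, 6]          # hypothetical group 1
g2 = [1, 3, 5, 7, 9]    # hypothetical group 2
m1, s1 = mean_sd(g1)
m2, s2 = mean_sd(g2)

direct = mean_sd(g1 + g2)   # computed from all 8 values
via_formula = combined_mean_sd([(len(g1), m1, s1), (len(g2), m2, s2)])
print(direct, via_formula)  # the two pairs should agree
```

Both routes should give the same mean and S.D., which is exactly what relation (i) asserts.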
Calculation of Standard Deviation (σ)
If the observations are small in magnitude, S.D. can be calculated by using the following
relations :
For simple series,
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$
For frequency distributions,
$\sigma^2 = \frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2$
The calculations can, however, be simplified based on the following results.
[I] If $y_1, y_2, \ldots, y_n$ represent the deviations of $x_1, x_2, \ldots, x_n$ from an
arbitrary constant c, then S.D. of x = S.D. of y.
In symbols, if y = x – c, then $\sigma_x = \sigma_y$.
[II] If $y_1, y_2, \ldots, y_n$ represent the deviations of $x_1, x_2, \ldots, x_n$ from an
arbitrary constant c, in units of another constant d, then
S.D. of x = d (S.D. of y)
In symbols, if $y = \frac{x - c}{d}$, then $\sigma_x = d\,\sigma_y$.
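The three routes to S.D. described above (the definition, the shortcut formula, and the change of origin and scale) can be compared side by side. This is an illustrative sketch only; the function names are assumptions, and the data and the constants c = 60, d = 5 are chosen just for the example.

```python
import math

def sd_definition(xs):
    """Root-mean-square deviation from the arithmetic mean (divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

def sd_shortcut(xs):
    """sigma^2 = (sum of x^2)/n - ((sum of x)/n)^2."""
    n = len(xs)
    return math.sqrt(sum(x * x for x in xs) / n - (sum(xs) / n) ** 2)

def sd_step_deviation(xs, c, d):
    """Code each value as y = (x - c)/d with d > 0, then sigma_x = d * sigma_y."""
    ys = [(x - c) / d for x in xs]
    return d * sd_shortcut(ys)

data = [20, 85, 120, 60, 40]   # illustrative values
print(round(sd_definition(data), 2))
print(round(sd_shortcut(data), 2))
print(round(sd_step_deviation(data, c=60, d=5), 2))
```

All three calls should print the same value, whatever c and positive d are used; a good choice of c and d only makes the hand arithmetic lighter.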
Relative measures of dispersion
There are 3 such measures –
[i] Coefficient of Variation $= 100 \times \frac{\text{Standard Deviation}}{\text{Mean}}$
[ii] Coefficient of Quartile Deviation $= 100 \times \frac{\text{Quartile Deviation}}{\text{Median}}$
[iii] Coefficient of Mean Deviation $= 100 \times \frac{\text{Mean Deviation}}{\text{Mean or Median}}$
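The three relative measures are simple ratios, so they can be written as one-line helpers. This is a small sketch with hypothetical function names; it follows the definitions exactly as given in this chapter (in particular, the quartile deviation is divided by the median).

```python
def coefficient_of_variation(sd, mean):
    """100 * S.D. / Mean."""
    return 100 * sd / mean

def coefficient_of_quartile_deviation(q1, q3, median):
    """100 * Quartile Deviation / Median, with Q.D. = (Q3 - Q1)/2."""
    return 100 * ((q3 - q1) / 2) / median

def coefficient_of_mean_deviation(mean_dev, centre):
    """100 * Mean Deviation / (Mean or Median, whichever the deviation was taken about)."""
    return 100 * mean_dev / centre

# Illustrative inputs: an S.D. of 12 on a mean of 30
cv = coefficient_of_variation(12, 30)
print(cv)
```

Because each measure is a pure number, it can be used to compare series measured in different units.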

Lorenz curve
Lorenz curve is a diagram for showing the dispersion of a group. It is, in effect,
a cumulative percentage curve, combining the percentage of items under review with
the percentage of the factor (say, wealth) distributed among the items. If wealth
were equally distributed among the people, the curve would be the straight line ACB,
connecting the two extremes of the scales. In practice, however, curves like ADB are
obtained. The smaller the area between the Lorenz curve ADB and the diagonal straight
line ACB, the greater is the homogeneity in the distribution of wealth, i.e. the less is the
dispersion.
FIGURE – LORENZ CURVE (horizontal axis: Percentage of Population, 0 to 100; vertical axis: Percentage of Wealth, 0 to 100; the curve ADB runs from A to B)
On the other hand, the larger the area, the larger is the percentage of poor
people and the greater is the concentration of wealth in the hands of a few. Lorenz curve
does not yield a numerical measure. It is, in this respect, inferior to the familiar
measures of dispersion, e.g. Range, Standard Deviation, etc. But the advantage is that it
affords a picture of the dispersion at a glance. Lorenz curve is useful in such studies as
the distribution of land, wages and income among the population of a country or the
distribution of profits over different groups in business.
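The points of a Lorenz curve are just paired cumulative percentages, which the sketch below computes. The function names and the two tiny wealth lists are hypothetical. The area helper is an addition for illustration only: the text treats the curve as a purely graphical device, and the area is computed here merely to show that it is zero under equal distribution and grows with concentration.

```python
def lorenz_points(values):
    """Cumulative (population %, wealth %) pairs, starting from (0, 0)."""
    ordered = sorted(values)          # poorest first
    n = len(ordered)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((100 * i / n, 100 * running / total))
    return points

def area_between(points):
    """Area between the diagonal and the curve, as a fraction of the whole square
    (trapezoidal rule on the percentage points)."""
    under = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        under += (x1 - x0) * (y0 + y1) / 2
    return 0.5 - under / 10000

equal = lorenz_points([10, 10, 10, 10])     # hypothetical: everyone owns the same
unequal = lorenz_points([1, 1, 1, 7])       # hypothetical: one person owns most
print(equal)
print(unequal)
print(area_between(equal), area_between(unequal))
```

Plotting the second list against the diagonal would reproduce the general shape of the curve ADB described above.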
Exercise :
[1] If each item is reduced by 10, what effect would this have on (i) the arithmetic
mean, (ii) the range, and (iii) the standard deviation?
[CA 1964]
Ans. (i) A.M. is reduced by 10
(ii) & (iii) Range and S.D. unchanged.
[2] If the variables are increased or decreased, (i) by the same amount, (ii) by the
same proportion, what will be the effect on standard deviation?
Ans. (i) The values of standard deviation will be the same as before i.e. unchanged.
(ii) S.D. will be changed in the same proportion.
[3] (i) If the first quartile is 142 and the semi-interquartile range is 18, what is
the third quartile?
(ii) The coefficient of variation is 40 and the mean is 30; find the standard
deviation.
[C.U., M.Com. '69]
Ans. (i) Here, $Q_1 = 142$ and $\frac{Q_3 - Q_1}{2} = 18$
Or, $Q_3 - 142 = 36$
$Q_3 = 142 + 36 = 178$
(ii) Coefficient of Variation (C.V.) $= \frac{SD}{\text{Mean}} \times 100$
∴ $40 = \frac{SD}{30} \times 100$
Or, $SD = \frac{40 \times 30}{100} = 12$
[4] Find out the range of the following data
Height (inches) 60–62 63–65 66–68 69–71 72–74
No of students 8 27 42 18 5
[D.S.W. 1973]
Solution: Table: Calculation for Range
Class limits Class boundary Frequency
60–62 59.5–62.5 8
63–65 62.5–65.5 27
66–68 65.5–68.5 42
69–71 68.5–71.5 18
72–74 71.5–74.5 5
Total 100
Highest Value = 74.5 (upper boundary of the last class)
Lowest Value = 59.5 (lower boundary of the first class)
Range = 74.5 – 59.5 = 15 inches
[5] Calculate the quartile deviation and its coefficient from the following :
Cl. Interval 10 - 15 15 - 20 20 - 25 25 - 30 30 - 40 40 – 50 50 – 60 60 - 70 Total
Frequency 4 12 16 22 10 8 6 4 82
Solution : Table : Calculation for Quartile Deviation
Class boundaries Frequency Cumulative Frequency
10-15 4 4
15-20 12 16
20-25 16 32 ← Q1 class (N/4 = 20.5)
25-30 22 54 ← Q2 class (N/2 = 41)
30-40 10 64 ← Q3 class (3N/4 = 61.5)
40-50 8 72
50-60 6 78
60-70 4 82 = N
Total 82
$Q_1 = 20 + \frac{20.5 - 16}{16} \times 5 = 20 + \frac{4.5 \times 5}{16} = 20 + 1.4 = 21.4$
$Q_2 = 25 + \frac{41 - 32}{22} \times 5 = 25 + \frac{9}{22} \times 5 = 25 + 2.05 = 27.05$
$Q_3 = 30 + \frac{61.5 - 54}{10} \times 10 = 30 + 7.5 = 37.5$
Quartile Deviation $= \frac{Q_3 - Q_1}{2} = \frac{37.5 - 21.4}{2} = \frac{16.1}{2} = 8.05$
Coefficient of Quartile Deviation $= 100 \times \frac{\text{Quartile Deviation}}{\text{Median}} = 100 \times \frac{8.05}{27.05} = 29.76 = 30$ (approx.)
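The interpolation used for the quartiles of a grouped distribution can be sketched as a single function. The name `grouped_quantile` is hypothetical; the data are the class boundaries and frequencies of exercise [5].

```python
def grouped_quantile(classes, freqs, p):
    """Value below which a fraction p of the total frequency lies.
    classes: list of (lower boundary, upper boundary); linear interpolation
    is applied inside the class that contains the p*N-th item."""
    N = sum(freqs)
    target = p * N
    cum = 0
    for (lo, hi), f in zip(classes, freqs):
        if f > 0 and cum + f >= target:
            return lo + (target - cum) / f * (hi - lo)
        cum += f
    return classes[-1][1]

classes = [(10, 15), (15, 20), (20, 25), (25, 30),
           (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [4, 12, 16, 22, 10, 8, 6, 4]

q1 = grouped_quantile(classes, freqs, 0.25)
q2 = grouped_quantile(classes, freqs, 0.50)
q3 = grouped_quantile(classes, freqs, 0.75)
quartile_deviation = (q3 - q1) / 2
print(round(q1, 2), round(q2, 2), round(q3, 2), round(quartile_deviation, 2))
```

Small differences from the hand-worked figures can arise only from intermediate rounding in the manual calculation.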
[6] The following table shows the distribution of the maximum loads supported by
certain cables produced by a company :–
Maximum load (short tons) 9.3–9.7 9.8–10.2 10.3–10.7 10.8–11.2 11.3–11.7 11.8–12.2 12.3–12.7 12.8–13.2
No. of cables 2 5 12 17 14 6 3 1
Find the semi – inter quartile range
[D.S.W. 1968]
Solution: Calculation for Semi-interquartile Range
Class limits Class boundary Frequency Cumulative frequency
9.3–9.7 9.25–9.75 2 2
9.8–10.2 9.75–10.25 5 7
10.3–10.7 10.25–10.75 12 19 ← Q1 class (N/4 = 15)
10.8–11.2 10.75–11.25 17 36
11.3–11.7 11.25–11.75 14 50 ← Q3 class (3N/4 = 45)
11.8–12.2 11.75–12.25 6 56
12.3–12.7 12.25–12.75 3 59
12.8–13.2 12.75–13.25 1 60 = N
Total 60
$Q_1 = 10.25 + \frac{15 - 7}{12} \times 0.50 = 10.25 + \frac{8}{12} \times 0.5 = 10.25 + 0.33 = 10.58$
$Q_3 = 11.25 + \frac{45 - 36}{14} \times 0.50 = 11.25 + \frac{9}{14} \times 0.5 = 11.25 + 0.32 = 11.57$
Semi-interquartile range $= \frac{Q_3 - Q_1}{2} = \frac{11.57 - 10.58}{2} = \frac{0.99}{2} = 0.49$ short tons (approx.)
[7] Find the mean deviation about the arithmetic mean of the numbers 31, 35, 29,
63, 55, 72, 37.
[B.U., B.Com. 1976]
Solution :
Arithmetic Mean $(\bar{x}) = \frac{1}{7}(29 + 31 + 35 + 37 + 55 + 63 + 72) = \frac{322}{7} = 46$
Table : Calculation of Mean Deviation
x |x– x | = |x–46|
29 17
31 15
35 11
37 9
55 9
63 17
72 26
Total 104
Mean Deviation about Mean $= \frac{1}{n}\sum |x - \bar{x}| = \frac{1}{7} \times 104 = 14.86 = 14.9$ Ans.
[8] Calculate the mean deviation about the median of the following: 13, 84, 68, 24, 96, 139, 84, 27.
[B.U., B.Com. 1977]
Solution :
Since there is an even number of observations, viz. 8, the median is the average of
the two middle-most observations when arranged in order of magnitude: 13, 24, 27,
(68, 84), 84, 96, 139
∴ Median = (68 + 84)/2 = 152/2 = 76
Table : Calculation for Mean Deviation
x |x–Median| i.e. difference from median
13 63
24 52
27 49
68 8
84 8
84 8
96 20
139 63
Total 271
Mean Deviation about median $= \frac{1}{n}\sum |x - \text{median}| = \frac{1}{8} \times 271 = 33.9$
[9] Find the mean deviation about median from the following data: 46, 79, 26, 85,
39, 65, 99, 29, 56, 72
[C.U.B.com. 1977]
Solution :
Since there is an even number of observations, viz. 10, the median is the average
of the two middle-most observations when arranged in order of magnitude:
26, 29, 39, 46, (56, 65), 72, 79, 85, 99
∴ Median $= \frac{56 + 65}{2} = \frac{121}{2} = 60.5$
Table : Calculations for Mean Deviation

x |x–Median| i.e. difference from median
26 34.5
29 31.5
39 21.5
46 14.5
56 4.5
65 4.5
72 11.5
79 18.5
85 24.5
99 38.5
Total 204.0
Mean Deviation about median $= \frac{1}{n}\sum |x - \text{median}| = \frac{1}{10} \times 204 = 20.4$
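The mean-deviation calculations for ungrouped data can be reproduced with a short helper. This is a sketch; the function name and its `about` parameter are assumptions for illustration, and `statistics.median` and `statistics.mean` come from the Python standard library.

```python
import statistics

def mean_deviation(xs, about="median"):
    """Average absolute deviation of xs about the median (default) or the mean."""
    centre = statistics.median(xs) if about == "median" else statistics.mean(xs)
    return sum(abs(x - centre) for x in xs) / len(xs)

series_a = [46, 79, 26, 85, 39, 65, 99, 29, 56, 72]   # data of exercise [9]
series_b = [31, 35, 29, 63, 55, 72, 37]               # data of exercise [7]

md_median_a = mean_deviation(series_a)                 # about the median
md_mean_b = mean_deviation(series_b, about="mean")     # about the mean
print(round(md_median_a, 2), round(md_mean_b, 2))
```

Calling the helper twice on the same series, once about the mean and once about the median, also illustrates that the mean deviation is least when taken about the median.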
[10] Find mean deviation for the following frequency distribution:
Variable 3 5 7 9 11 13
Frequency 2 7 10 9 5 1
[D.M. (Suppl.), 1977]
Solution: Table: Calculations for Mean Deviation
x f fx |x– x | f|x– x |
3 2 6 4.65 9.30
5 7 35 2.65 18.55
7 10 70 0.65 6.50
9 9 81 1.35 12.15
11 5 55 3.35 16.75
13 1 13 5.35 5.35
Total 34 260 - 68.60
$\bar{x} = \frac{260}{34} = 7.65$, Mean Deviation $= \frac{1}{N}\sum f\,|x - \bar{x}| = \frac{1}{34} \times 68.60 = 2.02$
[11] Calculate the mean deviation from the following data, relating to heights (to the
nearest inches) of 100 children :
Height (inches) 60 61 62 63 64 65 66 67 68
No. of children 2 0 15 29 25 12 10 4 3
[I.C.W.A. 1973]

Solution :
x f y = x – 64 fy $|x - \bar{x}|$ $f|x - \bar{x}|$
60 2 -4 -8 3.89 7.78
61 0 -3 0 2.89 0
62 15 -2 -30 1.89 28.35
63 29 -1 -29 0.89 25.81
64 25 0 0 0.11 2.75
65 12 1 12 1.11 13.32
66 10 2 20 2.11 21.10
67 4 3 12 3.11 12.44
68 3 4 12 4.11 12.33
Total 100 - -11 - 123.88
$\bar{x} = 64 - \frac{11}{100} = 64 - 0.11 = 63.89$
Mean Deviation $= \frac{123.88}{100} = 1.24$ inches
[12] Calculate mean deviation from median from the following :
Class interval 2–4 4–6 6–8 8–10
Frequency 3 4 2 1
[I.C.W.A. 1977]
Solution :
Class interval Frequency (f) Cumulative frequency Mid-value (x) |x – Median| f|x – Median|
2–4 3 3 3 2 6
4–6 4 7 5 0 0 ← Median class (N/2 = 5)
6–8 2 9 7 2 4
8–10 1 10 = N 9 4 4
Total 10 - - - 14
Median $= 4 + \frac{5 - 3}{4} \times 2 = 4 + \frac{2}{4} \times 2 = 5$
Mean Deviation $= \frac{1}{10} \times 14 = 1.4$
[13] In a certain distribution of N = 25 measurements it was found that $\bar{x}$ = 56 inches
and S.D. = 2 inches. After these results were computed it was discovered that a mistake
had been made in one of the measurements, which was recorded as 64 inches. Find the
mean and standard deviation if the incorrect measurement is omitted.
[C.U., M.Com. 1962]
Solution:
Here, N = 25, $\bar{x}$ = 56, S.D. = 2
Σx = 25 × 56 = 1400
(–) mistaken record = 64
New Σx = 1336
New Mean $= \frac{1336}{24} = 55.67$
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$
$4 = \frac{\sum x^2}{25} - \left(\frac{1400}{25}\right)^2 = \frac{\sum x^2}{25} - 56^2$
∴ $\frac{\sum x^2}{25} = 3136 + 4 = 3140$
$\sum x^2 = 3140 \times 25 = 78500$
After excluding the mistaken record,
$\sum x^2 = 78500 - 64^2 = 78500 - 4096 = 74404$
New $\sigma^2 = \frac{74404}{24} - \left(\frac{1336}{24}\right)^2 = 3100.17 - 3098.78 = 1.39$
$\sigma = \sqrt{1.39} = 1.18$ inches
[14] The mean and S.D. of a group of 25 observations were found to be 30 and 3 respectively.
After the calculations were made, it was found that two of the observations were incorrect, which
were recorded as 29 and 31. Find the mean and S.D. if the incorrect observations are excluded.
[C.U., B.com. (Hons.)1968]
Solution :
Σx = 25 × 30 = 750
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$
$3^2 = \frac{\sum x^2}{25} - \left(\frac{750}{25}\right)^2 = \frac{\sum x^2}{25} - (30)^2$
$9 = \frac{\sum x^2}{25} - 900$
$\frac{\sum x^2}{25} = 909$
$\sum x^2 = 909 \times 25 = 22725$
When the incorrect items are omitted, we have for the remaining 23 items,
$\sum x = 750 - 29 - 31 = 690$
$\sum x^2 = 22725 - 29^2 - 31^2 = 22725 - 841 - 961 = 22725 - 1802 = 20923$
Mean $(\bar{x}) = \frac{690}{23} = 30$
S.D.$^2 = \frac{20923}{23} - \left(\frac{690}{23}\right)^2 = 909.70 - 900 = 9.70$
S.D. $= \sqrt{9.70} = 3.1$
[15] The mean and the standard deviation of a group of 100 observations were found
to be 20 and 3 respectively. After the calculations were made it was found that three of
the observations were incorrect which were recorded as 21, 21 and 18. Find the mean
and s.d. if the incorrect observations are omitted.
[C.U., B.A.(Econ.) 1965]
Solution :
$\sum x = n\bar{x} = 100 \times 20 = 2000$, where n = 100, $\bar{x}$ = 20
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{\sum x^2}{100} - \left(\frac{2000}{100}\right)^2$
$3^2 = \frac{\sum x^2}{100} - (20)^2$, or $\frac{\sum x^2}{100} = 400 + 9 = 409$
$\sum x^2 = 40900$
When the incorrect items are omitted,
$\sum x = 2000 - 21 - 21 - 18 = 1940$
Now, n = 100 – 3 = 97
New mean $= \frac{\sum x}{n} = \frac{1940}{97} = 20$
New $\sum x^2 = 40900 - 21^2 - 21^2 - 18^2 = 40900 - 441 - 441 - 324 = 40900 - 1206 = 39694$
$\sigma^2 = \frac{39694}{97} - (20)^2 = 409.22 - 400 = 9.22$
∴ $\sigma = \sqrt{9.22} = 3.04$
[16] The mean and the standard deviations of a sample of size 10 were found to be
9.5 and 2.5 respectively. Later on, an additional observation became available. This
was 15.0 and was included in the original sample. Find the mean and the standard
deviations of the 11 observations.
[I.C.W.A. 1975]
Solution :
$\sum x = 10 \times 9.5 = 95$, where n = 10, $\bar{x}$ = 9.5
As per the given condition, $2.5^2 = \frac{\sum x^2}{10} - (9.5)^2 = \frac{\sum x^2}{10} - 90.25$
$\frac{\sum x^2}{10} = 6.25 + 90.25 = 96.50$
$\sum x^2 = 965$
When the additional observation is included,
$\sum x = 95 + 15 = 110$
$\sum x^2 = 965 + 15^2 = 965 + 225 = 1190$
$\bar{x} = \frac{110}{11} = 10$
$\sigma^2 = \frac{1190}{11} - \left(\frac{110}{11}\right)^2 = \frac{1190}{11} - 100 = 108.18 - 100 = 8.18$
∴ $\sigma = \sqrt{8.18} = 2.86$
[17] The mean and the standard deviation of a sample of 100 observations were
calculated as 40 and 5.1 respectively, by a student who by mistake took one observation
as 50 instead of 40. Calculate the correct S.D.
[I.C.W.A. 1976]
Solution :
Here, $\bar{x}$ = 40, n = 100, σ = 5.1
$\sum x = 100 \times 40 = 4000$
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{\sum x^2}{100} - \left(\frac{4000}{100}\right)^2$
$5.1^2 = \frac{\sum x^2}{100} - 1600$
Or, $\frac{\sum x^2}{100} = 26.01 + 1600 = 1626.01$
$\sum x^2 = 162601$
When 50 is replaced by 40, the correct sums are
$\sum x = 4000 - 50 + 40 = 4000 - 10 = 3990$
$\sum x^2 = 162601 - 50^2 + 40^2 = 162601 - 2500 + 1600 = 162601 - 900 = 161701$
Using these in the formulae for mean and variance,
Mean $= \frac{3990}{100} = 39.90$
S.D.$^2 = \frac{161701}{100} - (39.90)^2 = 1617.01 - 1592.01 = 25$
∴ S.D. $= \sqrt{25} = 5$
[18] For a distribution of 280 observations mean and standard deviations were
found to be 54 and 3 respectively. On checking it was discovered that two observations
which should correctly read as 62 and 82, had been wrongly recorded as 64 and 80
respectively. Calculate the correct values of mean and S.D.
[C.U., B.A.(Econ.) 1969]
Solution :
Here, n = 280, $\bar{x}$ = 54, σ = 3
$\sum x = 280 \times 54 = 15120$
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$
$3^2 = \frac{\sum x^2}{280} - \left(\frac{15120}{280}\right)^2 = \frac{\sum x^2}{280} - 54^2$
$9 = \frac{\sum x^2}{280} - 2916$
Or, $\frac{\sum x^2}{280} = 2916 + 9 = 2925$
Or, $\sum x^2 = 2925 \times 280 = 819000$
When 64 and 80 are replaced by 62 and 82,
$\sum x = 15120 - 64 - 80 + 62 + 82 = 15120 - 144 + 144 = 15120$
$\sum x^2 = 819000 - 64^2 - 80^2 + 62^2 + 82^2 = 819000 + (62^2 - 64^2) + (82^2 - 80^2)$
$= 819000 + (-2)(126) + 2(162) = 819000 - 252 + 324 = 819000 + 72 = 819072$
Mean $= \frac{\sum x}{n} = \frac{15120}{280} = 54$
S.D.$^2 = \frac{819072}{280} - \left(\frac{\sum x}{n}\right)^2 = \frac{819072}{280} - 54^2 = 2925.26 - 2916 = 9.26$
∴ S.D. $= \sqrt{9.26} = 3.04$
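The correction exercises above all follow one pattern: recover Σx and Σx² from the reported mean and S.D., adjust both sums, and recompute. A minimal sketch of that pattern is given below; the function name `adjust_mean_sd` and its `remove`/`add` parameters are hypothetical.

```python
import math

def adjust_mean_sd(n, mean, sd, remove=(), add=()):
    """Recompute mean and S.D. after dropping the values in `remove`
    and including the values in `add`."""
    sum_x = n * mean
    sum_x2 = n * (sd ** 2 + mean ** 2)    # from sigma^2 = sum_x2/n - mean^2
    for v in remove:
        sum_x -= v
        sum_x2 -= v * v
    for v in add:
        sum_x += v
        sum_x2 += v * v
    m = n - len(remove) + len(add)
    new_mean = sum_x / m
    new_var = sum_x2 / m - new_mean ** 2
    return new_mean, math.sqrt(new_var)

# Omitting one wrong record of 64 from 25 measurements (mean 56, S.D. 2)
omitted = adjust_mean_sd(25, 56, 2, remove=[64])
# Replacing a wrong 50 by the correct 40 among 100 observations (mean 40, S.D. 5.1)
replaced = adjust_mean_sd(100, 40, 5.1, remove=[50], add=[40])
print(omitted, replaced)
```

The same call handles an added observation by passing only `add`, so one helper covers omission, replacement and inclusion.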
[19] $\bar{X}$ is the mean of $X_1$, $X_2$ and $X_3$. If $x_1, x_2, x_3$ are the deviations of $X_1, X_2, X_3$ from
$\bar{X}$ respectively, prove that $x_1^2 + x_2^2 + x_3^2 = X_1^2 + X_2^2 + X_3^2 - 3\bar{X}^2$.
[C.U., B.A. (Econ.) 1969]
Solution :
$\bar{X} = \frac{X_1 + X_2 + X_3}{3}$
Or, $X_1 + X_2 + X_3 = 3\bar{X}$
L.H.S. $= (X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + (X_3 - \bar{X})^2$
$= X_1^2 - 2\bar{X}X_1 + \bar{X}^2 + X_2^2 - 2\bar{X}X_2 + \bar{X}^2 + X_3^2 - 2\bar{X}X_3 + \bar{X}^2$
$= X_1^2 + X_2^2 + X_3^2 - 2\bar{X}(X_1 + X_2 + X_3) + 3\bar{X}^2 = X_1^2 + X_2^2 + X_3^2 - 2\bar{X} \cdot 3\bar{X} + 3\bar{X}^2$
$= X_1^2 + X_2^2 + X_3^2 - 6\bar{X}^2 + 3\bar{X}^2 = X_1^2 + X_2^2 + X_3^2 - 3\bar{X}^2$
= R.H.S. Proved.
[20] Let $x_1, x_2, \ldots, x_n$ be a set of observations. Suppose we compute $y_i = a + b x_i$ (i =
1, 2, ..., n), where a and b are constants. Express the s.d. of the y's in terms of the s.d.
of the x's and comment on the relation between the two.
[C.U., B.A. (Econ.) 1978]
Solution :
$y_i = a + b x_i$, so $\bar{y} = a + b\bar{x}$
$(y_i - \bar{y}) = (a + b x_i) - (a + b\bar{x}) = b(x_i - \bar{x})$
$\sigma_y^2 = \frac{\sum (y_i - \bar{y})^2}{n} = \frac{\sum \{b(x_i - \bar{x})\}^2}{n} = \frac{b^2 \sum (x_i - \bar{x})^2}{n} = b^2\sigma_x^2$, so $\sigma_y = |b|\,\sigma_x$
It is observed from the result that on the right-hand side the new origin 'a' is
absent but the scale 'b' is present. This proves that S.D. is unaffected by any change of origin,
but depends on scale. Since S.D. is always positive, the factor is the absolute value |b|.
[21] "If the mean and the standard deviation of n observations $x_1, x_2, \ldots, x_n$ be $\bar{x}$
and σ respectively, then the mean and the standard deviation of $-x_1, -x_2, \ldots, -x_n$ will
be $-\bar{x}$ and –σ respectively." Comment.
[I.C.W.A., 1975]
Solution :
Mean of $-x_1, -x_2, \ldots, -x_n$ $= \frac{-(x_1 + x_2 + \cdots + x_n)}{n} = \frac{-\sum x}{n} = -\bar{x}$
So the statement about the mean is correct.
Deviations from mean : $[-x_1 - (-\bar{x})], [-x_2 - (-\bar{x})], \ldots, [-x_n - (-\bar{x})]$
$= -(x_1 - \bar{x}), -(x_2 - \bar{x}), \ldots, -(x_n - \bar{x})$
Square deviations from mean : $\{-(x_i - \bar{x})\}^2 = (x_i - \bar{x})^2$ for each i
Mean square deviation from mean : $\frac{1}{n}\sum (x_i - \bar{x})^2 = \sigma^2$
∴ S.D. of $-x_1, -x_2, \ldots, -x_n$ $= \sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2} = \sigma$
Comment: the squared deviations are unchanged by the change of sign, and S.D. is always
taken as the positive square root. Hence the S.D. remains σ, not –σ. The statement is correct
for the mean but incorrect for the standard deviation; this agrees with $\sigma_z = |a|\,\sigma_x$ for a = –1.
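The scale property $\sigma_z = |a|\,\sigma_x$ stated earlier in the chapter can be checked numerically, including the case where every observation changes sign. The data and constants below are made up for illustration; `statistics.pstdev` is the standard-library population S.D. (divisor n).

```python
import statistics

xs = [3, 7, 8, 12, 15]              # hypothetical observations
a, b = 10, -2                        # hypothetical origin and (negative) scale
ys = [a + b * x for x in xs]         # y = a + b*x
negated = [-x for x in xs]           # every observation with its sign changed

sd_x = statistics.pstdev(xs)
sd_y = statistics.pstdev(ys)
sd_neg = statistics.pstdev(negated)

print(sd_x, sd_y, sd_neg)
print(statistics.mean(negated), -statistics.mean(xs))
```

The S.D. of y comes out as |b| times the S.D. of x, and the S.D. of the negated series equals the S.D. of the original series, while its mean changes sign.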
[22] If $d^2$ = mean square deviation about an arbitrary value x, σ = standard deviation, and $x - \bar{x} = a$,
then show that $d^2 = \sigma^2 + a^2$
[D.S.W., 1971]
Solution :
$d^2 = \frac{1}{n}\sum (x_i - x)^2$, where $x = a + \bar{x}$
$(x_i - x) = (x_i - \bar{x}) + (\bar{x} - x)$
Therefore,
$\sum (x_i - x)^2 = \sum \{(x_i - \bar{x}) + (\bar{x} - x)\}^2 = \sum \{(x_i - \bar{x})^2 + 2(x_i - \bar{x})(\bar{x} - x) + (\bar{x} - x)^2\}$
$= \sum (x_i - \bar{x})^2 + 2(\bar{x} - x)\sum (x_i - \bar{x}) + n(\bar{x} - x)^2$
$= \sum (x_i - \bar{x})^2 + 2(\bar{x} - x) \times 0 + n(\bar{x} - x)^2$, since $\sum (x_i - \bar{x}) = 0$
$= \sum (x_i - \bar{x})^2 + n(\bar{x} - x)^2$
i.e. $\frac{1}{n}\sum (x_i - x)^2 = \frac{1}{n}\sum (x_i - \bar{x})^2 + (\bar{x} - x)^2$
$d^2 = \sigma^2 + (-a)^2 = \sigma^2 + a^2$
[23] Calculate the standard deviation from the following Series: 20, 85, 120, 60, 40
[B.U., B. Com. 1971]
Solution :
Table : Calculations for Standard Deviation
x y = (x − 60)/5 y²
20 -8 64
40 -4 16
60 0 0
85 5 25
120 12 144
Total 5 249
$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{249}{5} - \left(\frac{5}{5}\right)^2 = 49.8 - 1 = 48.8$
$\sigma_y = \sqrt{48.8} = 6.986$
$\sigma_x = d\,\sigma_y = 5 \times 6.986 = 34.9$
[24] Find the standard deviation of weights (to the nearest pound) of 15 students
given below: 138, 156, 147, 115, 145, 132, 163, 158, 130, 123, 103, 109, 100, 105, 106.
[B.U., B.A. (Econ.) 1972]
Solution: Table: Calculation for S.D.
x y = x – 130 y2
100 -30 900
103 -27 729
105 -25 625
106 -24 576
109 -21 441
115 -15 225
123 -7 49
130 0 0
132 2 4
138 8 64
145 15 225
147 17 289
156 26 676
158 28 784
163 33 1089
Total -20 6676
$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{6676}{15} - \left(\frac{-20}{15}\right)^2 = 445.07 - 1.78 = 443.29$
$\sigma_y = \sqrt{443.29} = 21$ lbs. (approx.)
$\sigma_x = \sigma_y = 21$ lbs.
[25] Calculate the s.d. of the following observations: 240.12, 240.13, 240.15, 240.12,
240.17, 240.15, 240.17, 240.16, 240.22, 240.21.
[I.C.W.A. 1976]
Solution :
Table : Calculations for S.D.
x y = (x − 240)/0.01 y²
240.12 12 144
240.12 12 144
240.13 13 169
240.15 15 225
240.15 15 225
240.16 16 256
240.17 17 289
240.17 17 289
240.21 21 441
240.22 22 484
Total 160 2666
$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{2666}{10} - \left(\frac{160}{10}\right)^2 = 266.6 - 256 = 10.6$
$\sigma_y = \sqrt{10.6} = 3.256$
$\sigma_x = d\,\sigma_y = 0.01 \times 3.256 = 0.033$
[26] Find the standard deviation for the distribution given below :
x 1 2 3 4 5 6 7
Frequency 10 20 30 35 14 10 2
[Dip. Management, 1967]
Solution :
Calculation for S.D.
x f fx f x2
1 10 10 10
2 20 40 80
3 30 90 270
4 35 140 560
5 14 70 350
6 10 60 360
7 2 14 98
Total 121 424 1728

$\sigma_x^2 = \frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2 = \frac{1728}{121} - \left(\frac{424}{121}\right)^2 = 14.28 - 12.28 = 2$
$\sigma_x = \sqrt{2} = 1.41$
[27] Find the s.d. from the following table giving the age distribution of 540 members
of a Parliament:
Age in years 30 40 50 60 70
No. of members 64 132 153 140 51
[C.U.,B.Com. 1978]
Solution :
Table : Calculation for S.D.
x (Age in years) f (No. of members) y = (x − 50)/10 fy fy²
30 64 -2 -128 256
40 132 -1 -132 132
50 153 0 0 0
60 140 1 140 140
70 51 2 102 204
Total 540 - -18 732
$\sigma_y^2 = \frac{732}{540} - \left(\frac{-18}{540}\right)^2 = 1.3556 - 0.0011 = 1.3544$
$\sigma_y = \sqrt{1.3544} = 1.164$
$\sigma_x = d\,\sigma_y = 10 \times 1.164 = 11.64$ years
[28] Find the s.d. from the following frequency distribution:
Wt. (lbs.) 120–124 125–129 130–134 135–139 140–144 145–149 Total
No. of boys 12 25 28 15 12 8 100
[B.U., B.Com. 1974]
Solution :
Class interval Frequency (f) Mid-value (x) y = (x − 132)/5 fy fy²
120–124 12 122 -2 -24 48
125–129 25 127 -1 -25 25
130–134 28 132 0 0 0
135–139 15 137 1 15 15
140–144 12 142 2 24 48
145–149 8 147 3 24 72
Total 100 - - 14 208
$\sigma_y^2 = \frac{208}{100} - \left(\frac{14}{100}\right)^2 = 2.08 - 0.0196 = 2.0604$
$\sigma_y = \sqrt{2.0604} = 1.435$
$\sigma_x = d\,\sigma_y = 5 \times 1.435 = 7.18$ lbs.
[29] The weights of a certain product produced in a factory are given below from a
sample of 121 articles :
Weight – (OZ) 3.0–3.1 3.1–3.2 3.2–3.3 3.3–3.4 3.4–3.5 3.5–3.6 3.6–3.7 3.7–3.8 3.8–3.9 3.9–4.0
Frequency 5 10 12 20 25 18 10 8 8 5
Calculate the arithmetic mean and standard deviation.
[C.A. 1972]
Solution:
Class interval Frequency (f) Mid-value (x) y = (x − 3.45)/0.10 fy fy²
3.0–3.1 5 3.05 -4 -20 80
3.1–3.2 10 3.15 -3 -30 90
3.2–3.3 12 3.25 -2 -24 48
3.3–3.4 20 3.35 -1 -20 20
3.4–3.5 25 3.45 0 0 0
3.5–3.6 18 3.55 1 18 18
3.6–3.7 10 3.65 2 20 40
3.7–3.8 8 3.75 3 24 72
3.8–3.9 8 3.85 4 32 128
3.9–4.0 5 3.95 5 25 125
Total 121 - - 25 621
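$\bar{x} = 3.45 + \frac{25}{121} \times 0.10 = 3.45 + 0.02 = 3.47$ oz.
$\sigma_y^2 = \frac{\sum f y^2}{N} - \left(\frac{\sum f y}{N}\right)^2 = \frac{621}{121} - \left(\frac{25}{121}\right)^2 = 5.13 - 0.04 = 5.09$
$\sigma_y = \sqrt{5.09} = 2.26$
$\sigma_x = 0.10 \times 2.26 = 0.23$ oz. (approx.)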
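The step-deviation method used for grouped data in these exercises can be sketched as one function. The name `grouped_mean_sd` is hypothetical; the mid-values and frequencies are those of exercise [29], with c = 3.45 and d = 0.10.

```python
import math

def grouped_mean_sd(mid_values, freqs, c, d):
    """Mean and S.D. of a frequency distribution using y = (x - c)/d.
    mean = c + d * (sum fy / N);  sd = d * sqrt(sum fy^2 / N - (sum fy / N)^2)."""
    N = sum(freqs)
    ys = [(x - c) / d for x in mid_values]
    sum_fy = sum(f * y for f, y in zip(freqs, ys))
    sum_fy2 = sum(f * y * y for f, y in zip(freqs, ys))
    mean = c + d * sum_fy / N
    sd = d * math.sqrt(sum_fy2 / N - (sum_fy / N) ** 2)
    return mean, sd

mids = [3.05, 3.15, 3.25, 3.35, 3.45, 3.55, 3.65, 3.75, 3.85, 3.95]
freqs = [5, 10, 12, 20, 25, 18, 10, 8, 8, 5]

mean, sd = grouped_mean_sd(mids, freqs, c=3.45, d=0.10)
print(round(mean, 2), round(sd, 2))
```

Any other choice of c and positive d should return the same mean and S.D.; the assumed origin only affects how convenient the intermediate figures are.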
[30] Compute the standard deviation of the following data:
Weekly wages in Rs. Number of Men
30 and under 40 8
40 and under 50 12
50 and under 60 6
60 and under 70 4
70 and under 80 10
[B.U.,B.Com. 1973]
Solution :
Calculation for S.D.
Class interval Frequency (f) Mid-value (x) y = (x − 55)/10 fy fy²
30 – 40 8 35 -2 -16 32
40 – 50 12 45 -1 -12 12
50 – 60 6 55 0 0 0
60 – 70 4 65 1 4 4
70 – 80 10 75 2 20 40
Total 40 - - -4 88
$\sigma_y^2 = \frac{\sum f y^2}{N} - \left(\frac{\sum f y}{N}\right)^2 = \frac{88}{40} - \left(\frac{-4}{40}\right)^2 = 2.20 - 0.01 = 2.19$
$\sigma_y = \sqrt{2.19} = 1.48$
$\sigma_x = 10 \times 1.48$ = Rs. 14.80
[31] Find the standard deviation of the following distribution :
Turnover (Rs.'000 p.a.) 50–100 100–150 150–200 200–250 250–300 300–350 350–400
No. of firms 5 8 9 12 18 23 17
[C.U.B.Com, 1974]
Solution :
Table : Calculation for S.D.
Turnover ('000) f Mid-value (x) y = (x − 225)/50 fy fy²
50–100 5 75 -3 -15 45
100–150 8 125 -2 -16 32
150–200 9 175 -1 -9 9
200–250 12 225 0 0 0
250–300 18 275 1 18 18
300–350 23 325 2 46 92
350–400 17 375 3 51 153
Total 92 - - 75 349
$\sigma_y^2 = \frac{349}{92} - \left(\frac{75}{92}\right)^2 = 3.79 - 0.66 = 3.13$
$\sigma_y = \sqrt{3.13} = 1.769$
$\sigma_x = d\,\sigma_y = 50 \times 1.769 = 88.4$ (Rs. '000, approx.)
[32] Compute the arithmetic mean, standard deviation and mean deviation about
the mean for the following data :
Scores 4–5 6–7 8–9 10–11 12–13 14–15 Total
f 4 10 20 15 8 3 60
[I.C.W.A., 1978]
Solution :
Table : Calculations for A.M. and S.D.
Class interval f Mid-value (x) y = (x − 8.5)/2 fy fy²
4–5 4 4.5 -2 -8 16
6–7 10 6.5 -1 -10 10
8–9 20 8.5 0 0 0
10–11 15 10.5 1 15 15
12–13 8 12.5 2 16 32
14–15 3 14.5 3 9 27
Total 60 - - 22 100
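$\bar{x} = 8.5 + \frac{22}{60} \times 2 = 8.5 + 0.73 = 9.23$
$\sigma_y^2 = \frac{100}{60} - \left(\frac{22}{60}\right)^2 = 1.67 - 0.13 = 1.54$
$\sigma_y = \sqrt{1.54} = 1.24$
$\sigma_x = d\,\sigma_y = 2 \times 1.24 = 2.48$
Table : Calculations for Mean Deviation
x f $|x - \bar{x}|$ i.e. |x – 9.23| $f|x - \bar{x}|$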
4.5 4 4.73 18.92
6.5 10 2.73 27.30
8.5 20 0.73 14.60
10.5 15 1.27 19.05
12.5 8 3.27 26.16
14.5 3 5.27 15.81
Total 60 - 121.84
Mean Deviation about Mean $= \frac{121.84}{60} = 2.03$
[33] Compute the s.d. of income from the following:
Income (Rs.) Below 200 200–399 400–599 600–799 800–999 1000–1199
No. of earners 25 72 47 22 13 7
[C.U., B.A.(Econ.), 1978]
Solution : Calculations for S.D.
Income (Rs.) f Class interval Mid-value (x) y = (x − 499.5)/200 fy fy²
Below 200 25 0 – 199 99.5 -2 -50 100
200–399 72 200 – 399 299.5 -1 -72 72
400–599 47 400 – 599 499.5 0 0 0
600–799 22 600 – 799 699.5 1 22 22
800–999 13 800 – 999 899.5 2 26 52
1000–1199 7 1000 – 1199 1099.5 3 21 63
Total 186 - - -53 309
$\sigma_y^2 = \frac{\sum f y^2}{N} - \left(\frac{\sum f y}{N}\right)^2 = \frac{309}{186} - \left(\frac{-53}{186}\right)^2 = 1.66 - 0.08 = 1.58$
$\sigma_y = \sqrt{1.58} = 1.257$
$\sigma_x = d\,\sigma_y = 200 \times 1.257$ = Rs. 251.40
[34] Find the mean and the s.d. from the following frequency distribution:
Weight (lb.) 131–140 141–150 151–160 161–170 171–180 181–190 191–210 211–240
No. of person 2 5 4 9 7 5 3 1
[I.C.W.A. 1971]

Solution :
Table : Calculations for Mean and S.D.
Weight (lb.) f Mid-value (x) y = (x − 165.5)/10 fy fy²
131–140 2 135.5 -3 -6 18
141–150 5 145.5 -2 -10 20
151–160 4 155.5 -1 -4 4
161–170 9 165.5 0 0 0
171–180 7 175.5 1 7 7
181–190 5 185.5 2 10 20
191–210 3 200.5 3.5 10.5 36.75
211–240 1 225.5 6 6 36
Total 36 - - 13.5 141.75
$\bar{x} = 165.5 + \frac{13.5}{36} \times 10 = 165.5 + 3.75 = 169.25$
$\sigma_y^2 = \frac{141.75}{36} - \left(\frac{13.5}{36}\right)^2 = 3.938 - 0.1406 = 3.797$
∴ $\sigma_y = \sqrt{3.797} = 1.949$
$\sigma_x = 10 \times 1.949 = 19.49 = 19.5$ lb. Ans.
[35] Calculate the proportion of firms in which costs of production are within the
range A.M. ± S.D. in the following distribution:
Costs of production (Rs. per 5 litres) 4–6 6–8 8–10 10–12 12–14 14–16 Total
No. of dairy farms 13 111 182 105 19 7 437
[I.C.W.A. 1973]
Solution:
Table: Calculations for A.M. & S.D.
Costs of production (class boundary) f Cumulative frequency Mid-value (x) y = (x − 9)/2 fy fy²
4–6 13 13 5 -2 -26 52
6–8 111 124 7 -1 -111 111
8–10 182 306 9 0 0 0
10–12 105 411 11 1 105 105
12–14 19 430 13 2 38 76
14–16 7 437 = N 15 3 21 63
Total 437 - - - 27 407
$\bar{x} = 9 + \frac{27}{437} \times 2 = 9 + 0.124 = 9.124$
$\sigma_y^2 = \frac{407}{437} - \left(\frac{27}{437}\right)^2 = \frac{177859 - 729}{437^2} = \frac{177130}{437^2}$
∴ $\sigma_y = \frac{\sqrt{177130}}{437} = \frac{420.87}{437} = 0.963$
$\sigma_x = 2 \times 0.963 = 1.926$
A.M. + S.D. = 9.124 + 1.926 = 11.05
A.M. – S.D. = 9.124 – 1.926 = 7.198

Class boundary Cumulative frequency (less-than)
4 0
6 13
8 124
10 306
12 411
14 430
16 437
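By interpolation, let $Q_1$ and $Q_2$ denote the less-than cumulative frequencies
corresponding to 7.198 and 11.05 respectively. Then
$\frac{Q_1 - 13}{124 - 13} = \frac{7.198 - 6}{8 - 6}$ or, $\frac{Q_1 - 13}{111} = \frac{1.198}{2}$ or, $Q_1 - 13 = 66.49$ or, $Q_1 = 79.49$
$\frac{Q_2 - 306}{411 - 306} = \frac{11.05 - 10}{12 - 10}$ or, $\frac{Q_2 - 306}{105} = \frac{1.05}{2}$ or, $Q_2 - 306 = 55.13$ or, $Q_2 = 361.13$
∴ $Q_2 - Q_1 = 361.13 - 79.49 = 281.64$
∴ Proportion of firms within the given range $= \frac{282}{437} = 0.65$ (approx.)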
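The interpolation on the less-than cumulative frequencies can be sketched as follows. The function name `cumulative_below` is hypothetical; the class boundaries, cumulative frequencies and the limits 7.198 and 11.05 are those of exercise [35].

```python
def cumulative_below(boundaries, cum_freq, value):
    """Less-than cumulative frequency at `value`, by linear interpolation
    between consecutive class boundaries."""
    for i in range(len(boundaries) - 1):
        lo, hi = boundaries[i], boundaries[i + 1]
        if lo <= value <= hi:
            step = cum_freq[i + 1] - cum_freq[i]
            return cum_freq[i] + (value - lo) / (hi - lo) * step
    raise ValueError("value lies outside the tabulated boundaries")

boundaries = [4, 6, 8, 10, 12, 14, 16]
cum_freq = [0, 13, 124, 306, 411, 430, 437]
lower, upper = 7.198, 11.05          # A.M. - S.D. and A.M. + S.D.

count = (cumulative_below(boundaries, cum_freq, upper)
         - cumulative_below(boundaries, cum_freq, lower))
proportion = count / cum_freq[-1]
print(round(count, 2), round(proportion, 3))
```

The unrounded proportion is slightly below 0.65 because the hand calculation rounds the number of firms up to 282 before dividing.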
[36] Out of 400 observations, 100 observations have the value one and the rest of the
observations are zero. Find the mean and s.d. of 400 observations together.
[B.U., B.A.(Econ.),1966]
Solution :
Table : Calculations for A.M. and S.D.
x f fx f x2
0 300 0 0
1 100 100 100
Total 400 100 100
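A.M. $(\bar{x}) = \frac{\sum f x}{\sum f} = \frac{100}{400} = \frac{1}{4}$
$\sigma_x^2 = \frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2 = \frac{100}{400} - \left(\frac{100}{400}\right)^2 = \frac{1}{4} - \frac{1}{16} = \frac{4 - 1}{16} = \frac{3}{16}$
∴ $\sigma_x = \frac{\sqrt{3}}{4} = 0.433$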
[37] Two samples of sizes 60 and 90 have 52 and 48 as the respective arithmetic means,
and 9 and 12 as the respective standard deviation. Find the arithmetic mean and standard
deviation of the combined sample of size 150.
[I.C.W.A.,1970]
Solution :
Table : Data for S.D. of Composite Group
Characteristic Group I Group II Composite Group
No. of observations 60 90 150
Mean 52 48 $\bar{x}$
Standard Deviation 9 12 σ
The formulae for mean and S.D. of the combined group are
1. $N\bar{x} = n_1\bar{x}_1 + n_2\bar{x}_2$ (where $N = n_1 + n_2$)
2. $N\sigma^2 = (n_1\sigma_1^2 + n_2\sigma_2^2) + (n_1 d_1^2 + n_2 d_2^2)$, where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$
Using (1), $150\bar{x} = 60 \times 52 + 90 \times 48 = 3120 + 4320 = 7440$, so $\bar{x} = \frac{7440}{150} = 49.6$
∴ $d_1 = 52 - 49.6 = 2.4$, $d_2 = 48 - 49.6 = -1.6$
Using (2), $150\sigma^2 = 60 \times 9^2 + 90 \times 12^2 + 60 \times 2.4^2 + 90 \times (-1.6)^2 = 60 \times 81 + 90 \times 144 + 60 \times 5.76 + 90 \times 2.56$
$= 4860 + 12960 + 345.60 + 230.40 = 18396$
$\sigma^2 = \frac{18396}{150} = 122.64$
$\sigma = \sqrt{122.64} = 11.07 = 11.1$
[38] The means of two samples of sizes 50 and 100 respectively are 54.4 and 50.3, and the
standard deviations are 8 and 7. Obtain the mean and standard deviation of the sample of size
150 obtained by combining the two samples. (Give answers correct to one decimal place.)
[I.C.W.A., 1978]
Solution :
Table : Mean and S.D. of Composite Group
Characteristic Group I Group II Composite Group
No. of observations 50 100 150
Mean 54.4 50.3 $\bar{x}$
Standard Deviation 8 7 σ
$150\bar{x} = 50 \times 54.4 + 100 \times 50.3 = 2720 + 5030 = 7750$
$\bar{x} = \frac{7750}{150} = 51.67 = 51.7$
$d_1 = 54.4 - 51.7 = 2.7$
$d_2 = 50.3 - 51.7 = -1.4$
$150\sigma^2 = 50 \times 8^2 + 100 \times 7^2 + 50 \times 2.7^2 + 100 \times (-1.4)^2 = 50 \times 64 + 100 \times 49 + 50 \times 7.29 + 100 \times 1.96$
$= 3200 + 4900 + 364.50 + 196 = 8660.50$
$\sigma^2 = \frac{8660.5}{150} = 57.74$
$\sigma = \sqrt{57.74} = 7.6$
[39] An analysis of monthly wages paid to workers in two firms A and B, belonging
to same industry, gives the following results :
Firm A Firm B
Number of wage earners 550 650
Average monthly wages Rs. 50 Rs. 45
S.D. of the distribution of wages Rs. (√90) Rs. (√120)

Answer the following questions with proper justifications :

[a] Which firm A or B pays out larger amount as monthly wages?
[b] In which firm A or B is there greater variability in individual wages?
[c] What are the measures of (i) average monthly wages, and (ii) standard
deviation in the distribution of individual wages of all workers in the two firms
taken together?
[I.C.W.A, 1977]
Solution:
[a] Firm A: monthly wages = 550 × 50 = Rs.27,500
Firm B: monthly wages = 650 × 45 = Rs.29,250
Firm B pays greater wages.
[b] Firm A: S.D. of distribution of wages = √90 = Rs. 9.49
Firm B: S.D. of distribution of wages = √120 = Rs. 10.95
S.D. is higher in the case of firm B; hence there is greater variability in firm B.
[c] (i) $n_1 + n_2 = 550 + 650 = 1200$
$1200\bar{x} = 550 \times 50 + 650 \times 45 = 27500 + 29250 = 56750$
$\bar{x} = \frac{56750}{1200}$ = Rs. 47.29
(ii) $d_1 = 50 - 47.29 = 2.71$
$d_2 = 45 - 47.29 = -2.29$
$1200\sigma^2 = 550 \times (\sqrt{90})^2 + 650 \times (\sqrt{120})^2 + 550 \times 2.71^2 + 650 \times (-2.29)^2$
$= 49500 + 78000 + 4039.26 + 3408.67 = 1,34,947.93$
$\sigma^2 = \frac{134947.93}{1200} = 112.46$
$\sigma = \sqrt{112.46}$ = Rs. 10.60
[40] A company has three establishments E1, E2 and E3 in three cities. Analysis of the
monthly salaries paid to the employees in the three establishments is given below :
E1 E2 E3
Number of employees 20 25 40
Average monthly salary (Rs.) 305 300 340
S.D. of monthly salaries (Rs.) 50 40 45
Find the average and the standard deviation of the monthly salaries of all 85
employees in the company.
[I.C.W.A., 1976]
Solution :
$85\bar{x} = 20 \times 305 + 25 \times 300 + 40 \times 340 = 6100 + 7500 + 13600 = 27,200$
$\bar{x} = \frac{27200}{85}$ = Rs. 320
$d_1 = 305 - 320 = -15$
$d_2 = 300 - 320 = -20$
$d_3 = 340 - 320 = 20$
$85\sigma^2 = 20 \times 50^2 + 25 \times 40^2 + 40 \times 45^2 + 20 \times (-15)^2 + 25 \times (-20)^2 + 40 \times (20)^2$
$= 50000 + 40000 + 81000 + 4500 + 10000 + 16000 = 201500$
$\sigma^2 = 201500/85 = 2370.59$
$\sigma = \sqrt{2370.59}$ = Rs. 48.69
[41] Three sets of values of the variable x have means 26.3, 27.0 and 28.5 and standard
deviations 4.5, 3.9 and 4.8. If the three sets have respectively 50, 60 and 55 values. What
would be the mean and variance of x, if the three sets are taken together?
Solution :
n1 + n2 + n3 = 50 + 60 +55 = 165
$165\bar{x} = 50 \times 26.3 + 60 \times 27.0 + 55 \times 28.5 = 1315 + 1620 + 1567.5 = 4502.50$
$\bar{x} = 4502.50/165 = 27.29$
$d_1 = 26.3 - 27.29 = -0.99$, $d_2 = 27.0 - 27.29 = -0.29$
$d_3 = 28.5 - 27.29 = 1.21$
$165\sigma^2 = 50 \times 4.5^2 + 60 \times 3.9^2 + 55 \times 4.8^2 + 50 \times (-0.99)^2 + 60 \times (-0.29)^2 + 55 \times 1.21^2$
$= 50 \times 20.25 + 60 \times 15.21 + 55 \times 23.04 + 50 \times 0.9801 + 60 \times 0.0841 + 55 \times 1.4641$
$= 1012.50 + 912.60 + 1267.20 + 49.00 + 5.05 + 80.53 = 3326.88$
∴ $\sigma^2 = 3326.88/165 = 20.16$
∴ Variance = 20.16
[42] The mean and the variance calculated from a group of 80 observations are 63.2
and 25.9 respectively. If 60 of these observations have mean 64.8 and s.d. 4, find the mean
and the s.d. of the remaining 20 observations.
[I.C.W.A., 1971]
Solution :
when n = 80,
∑x = 80 × 63.2 = 5056 - - - - (i)
When n = 60, ∑x1 = 60 × 64.8 = 3888 - - - - (ii)
Sum of remaining 20 observation (i) – (ii) = 1168 = ∑x2 (say)
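∴ $\bar{x}_2 = \frac{1168}{20} = 58.4$
Here, $n_1 = 60$, $\sigma_1 = 4$, $\bar{x}_1 = 64.8$
$n_2 = 20$, $\sigma_2$ = ?, $\bar{x}_2 = 58.4$
$d_1 = \bar{x}_1 - \bar{x} = 64.8 - 63.2 = 1.6$
$d_2 = \bar{x}_2 - \bar{x} = 58.4 - 63.2 = -4.8$
By the formula,
$N\sigma^2 = n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2$
(the working below takes the variance of the whole group as 25.93)
$80 \times 25.93 = 60 \times 4^2 + 20\sigma_2^2 + 60 \times 1.6^2 + 20 \times (-4.8)^2$
$2074.4 = 60 \times 16 + 20\sigma_2^2 + 60 \times 2.56 + 20 \times 23.04 = 960 + 20\sigma_2^2 + 153.6 + 460.80$
$= 1574.40 + 20\sigma_2^2$
Or, $20\sigma_2^2 = 2074.4 - 1574.4 = 500$
$\sigma_2^2 = 500/20 = 25$
$\sigma_2 = \sqrt{25} = 5$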
[43] A group has the following measurements:

$\bar{x} = 10$, $\sigma^2 = 4$ and n = 60
A subgroup of the above has $\bar{x}_1 = 11$, $\sigma_1^2 = 2.25$ and $n_1 = 40$. Find the mean and
standard deviation of the other subgroup.
[I.C.W.A. 1973, MBA, 1978]

Solution: As per formula,
$N\bar{x} = n_1\bar{x}_1 + n_2\bar{x}_2$
$N\sigma^2 = n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2$
$60 \times 10 = 40 \times 11 + 20 \times \bar{x}_2$ or, $600 = 440 + 20\bar{x}_2$ or, $20\bar{x}_2 = 160$
$\bar{x}_2 = 160/20 = 8$
$d_1 = \bar{x}_1 - \bar{x} = 11 - 10 = 1$
$d_2 = \bar{x}_2 - \bar{x} = 8 - 10 = -2$
$60 \times 4 = 40 \times 2.25 + 20\sigma_2^2 + 40 \times 1^2 + 20 \times (-2)^2 = 90 + 20\sigma_2^2 + 40 + 80$
i.e. $240 = 210 + 20\sigma_2^2$ or, $20\sigma_2^2 = 30$
$\sigma_2^2 = \frac{30}{20} = 1.5$
$\sigma_2 = \sqrt{1.5} = 1.22$
[44] The following data refer to the dividend (%) paid by two companies A and B
over the Last 7 years.
A: 4 8 4 15 10 11 9
B: 12 8 3 15 6 4 10
www.gayali.in
Calculate the coefficients of variation and commeant.
[I.C.W.A. 1975]
Solution :
\[ \text{Coefficient of Variation} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100 \]
Table : Calculations for Mean and S.D.
          For A                         For B
  x    y = x − 10    y²         x    y = x − 8    y²
  4       −6         36        12        4        16
  8       −2          4         8        0         0
  4       −6         36         3       −5        25
 15        5         25        15        7        49
 10        0          0         6       −2         4
 11        1          1         4       −4        16
  9       −1          1        10        2         4
Total     −9        103      Total       2       114
For A :
\[ \bar{x} = 10 + \left(-\frac{9}{7}\right) = 10 - 1.29 = 8.71 \]
\[ \sigma^2 = \frac{103}{7} - \left(-\frac{9}{7}\right)^2 = 14.71 - 1.65 = 13.06, \qquad \sigma = \sqrt{13.06} = 3.61 \]
For B :
\[ \bar{x} = 8 + \frac{2}{7} = 8 + 0.29 = 8.29 \]
\[ \sigma^2 = \frac{114}{7} - \left(\frac{2}{7}\right)^2 = 16.29 - 0.08 = 16.21, \qquad \sigma = \sqrt{16.21} = 4.03 \]
\[ \text{C.V. (for A)} = \frac{3.61}{8.71} \times 100 = 41.45 \approx 41.5 \]
\[ \text{C.V. (for B)} = \frac{4.03}{8.29} \times 100 = 48.6 \]
The coefficient of variation is smaller for A (41.5) than for B (48.6), so the dividends of company A are more consistent. Its average dividend is also slightly higher (8.71% against 8.29%). Hence shares of company A are preferable to those of B.
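As a cross-check on the hand calculation, the coefficient of variation can be computed straight from the raw dividends. This is a small sketch using Python's standard `statistics` module; the helper name `coefficient_of_variation` is an assumption made for illustration, and the population S.D. (divisor n) is used to match the formula in the text.

```python
import statistics

def coefficient_of_variation(values):
    """C.V. = (S.D. / mean) x 100, with the population S.D. (divisor n)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return 100 * sd / mean

a = [4, 8, 4, 15, 10, 11, 9]     # dividends of company A
b = [12, 8, 3, 15, 6, 4, 10]     # dividends of company B

cv_a = coefficient_of_variation(a)
cv_b = coefficient_of_variation(b)
print(round(cv_a, 1), round(cv_b, 1))
```

Working from the raw values avoids any slip in the y² column: the sum of squared deviations for B about 8 is 114, which leads to a C.V. of about 48.6 for B against about 41.5 for A.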
[45] From the prices of shares x and y below find out which is more stable in value:
x: 35 54 52 53 56 58 52 50 51 49
y: 108 107 105 105 106 107 104 103 104 101
[I.C.W.A., 1976]
Solution :
            For x                           For y
   x    y = x − 52    y²          y    z = y − 105    z²
  35      −17        289         108        3          9
  54        2          4         107        2          4
  52        0          0         105        0          0
  53        1          1         105        0          0
  56        4         16         106        1          1
  58        6         36         107        2          4
  52        0          0         104       −1          1
  50       −2          4         103       −2          4
  51       −1          1         104       −1          1
  49       −3          9         101       −4         16
Total     −10        360       Total        0         40
For x :
\[ \bar{x} = 52 + \left(-\frac{10}{10}\right) = 52 - 1 = 51 \]
\[ \sigma_x^2 = \frac{360}{10} - \left(-\frac{10}{10}\right)^2 = 36 - 1 = 35, \qquad \sigma_x = \sqrt{35} = 5.92 \]
For y :
\[ \bar{y} = 105 + \frac{0}{10} = 105 \]
\[ \sigma_y^2 = \frac{40}{10} - \left(\frac{0}{10}\right)^2 = 4, \qquad \sigma_y = \sqrt{4} = 2 \]
\[ \text{C.V. (for } x) = \frac{5.92}{51} \times 100 = 11.61 \]
\[ \text{C.V. (for } y) = \frac{2}{105} \times 100 = 1.9 \]
Share y has the smaller coefficient of variation, so it is more stable in value.

[46] Calculate the coefficient of variation from the following data, showing Grades of
100 students in M.A. Mathematics :
Grades 30–39 40–49 50–59 60–69 70–79 80–89 90–99
Frequency 2 3 11 20 32 25 7
[C.U.,M com. 1973]
Solution:
Class interval   Frequency (f)   Mid-value (x)   y = (x − 64.5)/10    fy     fy²
30–39                 2              34.5              −3             −6      18
40–49                 3              44.5              −2             −6      12
50–59                11              54.5              −1            −11      11
60–69                20              64.5               0              0       0
70–79                32              74.5               1             32      32
80–89                25              84.5               2             50     100
90–99                 7              94.5               3             21      63
Total               100               –                 –             80     236
\[ \bar{x} = 64.5 + \frac{80}{100} \times 10 = 64.5 + 8 = 72.5 \]
\[ \sigma_y^2 = \frac{236}{100} - \left(\frac{80}{100}\right)^2 = 2.36 - 0.64 = 1.72, \qquad \sigma_y = \sqrt{1.72} = 1.31 \]
\[ \sigma_x = 1.31 \times 10 = 13.1 \]
\[ \text{C.V.} = \frac{13.1}{72.5} \times 100 = 18.07 \approx 18.1 \]
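The step-deviation method used in these grouped-data problems follows the same three sums every time (N, Σfy, Σfy²). A minimal Python sketch is given below; the function name `grouped_mean_sd` and the parameter names `origin` and `width` are illustrative assumptions.

```python
import math

def grouped_mean_sd(midpoints, freqs, origin, width):
    """Mean and S.D. of grouped data via y = (x - origin) / width."""
    n = sum(freqs)
    ys = [(x - origin) / width for x in midpoints]
    s1 = sum(f * y for f, y in zip(freqs, ys))       # sum of f*y
    s2 = sum(f * y * y for f, y in zip(freqs, ys))   # sum of f*y^2
    mean = origin + width * s1 / n
    sd = width * math.sqrt(s2 / n - (s1 / n) ** 2)
    return mean, sd

# Grades of 100 students, problem [46]
mids = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
freqs = [2, 3, 11, 20, 32, 25, 7]
mean, sd = grouped_mean_sd(mids, freqs, origin=64.5, width=10)
cv = 100 * sd / mean
print(round(mean, 2), round(sd, 2), round(cv, 2))
```

Because the code carries full precision, its C.V. (about 18.09) differs in the second decimal from the hand value 18.07, which uses the rounded S.D. 13.1.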
[47] The mean life in days and standard deviation for two types of electric bulbs are
given below :
Mean life in days Standard Deviation in days
Type I 310 9
Type II 260 14
Compare the relative variability of life of the two types of bulbs.
[B.U., B.A. (Econ.), 1965]
Solution:
\[ \text{C.V. (Type I bulb)} = \frac{9}{310} \times 100 = 2.90 \]
\[ \text{C.V. (Type II bulb)} = \frac{14}{260} \times 100 = 5.38 \approx 5.4 \]
The life of Type II bulbs is relatively more variable.

[48] You are given the distribution of wages in two factories X and Y.
Wages (Rs.)              50–100  100–150  150–200  200–250  250–300  300–350
No. of workers in X         2       9       29       54       11        5
No. of workers in Y         6      11       18       32       27       11
State in which factory the wages are more variable (Use Standard Deviation and Mean.)
[C.A., 1975]
Solution :
Table : Calculations for Mean and S.D. for X
Wages      Frequency (f)   Mid-value (x)   y = (x − 175)/50    fy    fy²
50–100           2              75               −2            −4      8
100–150          9             125               −1            −9      9
150–200         29             175                0             0      0
200–250         54             225                1            54     54
250–300         11             275                2            22     44
300–350          5             325                3            15     45
Total          110               –                –            78    160
Table : Calculations for Mean and S.D. for Y
Wages      Frequency (f)   Mid-value (y)   z = (y − 175)/50    fz    fz²
50–100           6              75               −2           −12     24
100–150         11             125               −1           −11     11
150–200         18             175                0             0      0
200–250         32             225                1            32     32
250–300         27             275                2            54    108
300–350         11             325                3            33     99
Total          105               –                –            96    274
For X :
\[ \bar{x} = 175 + \frac{78}{110} \times 50 = 175 + 35.45 = 210.45 \]
\[ \sigma_y^2 = \frac{160}{110} - \left(\frac{78}{110}\right)^2 = 1.45 - 0.50 = 0.95 \]
\[ \sigma_y = \sqrt{0.95} = 0.97, \qquad \sigma_x = 50 \times 0.97 = 48.5 \]
\[ \text{C.V.} = \frac{48.5}{210.45} \times 100 = 23 \]
For Y :
\[ \bar{y} = 175 + \frac{96}{105} \times 50 = 175 + 45.71 = 220.71 \]
\[ \sigma_z^2 = \frac{274}{105} - \left(\frac{96}{105}\right)^2 = 2.61 - 0.84 = 1.77 \]
\[ \sigma_z = \sqrt{1.77} = 1.33, \qquad \sigma_y = 50 \times 1.33 = 66.5 \]
\[ \text{C.V.} = \frac{66.5}{220.71} \times 100 = 30.13 \approx 30 \]
Wages in factory Y are more variable.

[49] Calculate a suitable measure of dispersion for the following distribution:

Cotton consumed (in thousand candles)   0–2  2–4  4–6  6–8  8–10  10–12  12–14  14–16  16–18  18–20  20–22
No. of Mills                              5   13   12   11    8     4      1      3      1      1      2
How does this dispersion compare with a S.D. of 3 lbs. for weight of yarns per
spindle among mills producing yarns of average weight 20 lbs. per spindle?
Solution :
Table : Calculation for S.D. & A.M.
Class interval   Frequency (f)   Mid-value (x)   y = (x − 11)/2     fy     fy²
0–2                   5               1               −5           −25     125
2–4                  13               3               −4           −52     208
4–6                  12               5               −3           −36     108
6–8                  11               7               −2           −22      44
8–10                  8               9               −1            −8       8
10–12                 4              11                0             0       0
12–14                 1              13                1             1       1
14–16                 3              15                2             6      12
16–18                 1              17                3             3       9
18–20                 1              19                4             4      16
20–22                 2              21                5            10      50
Total                61               –                –          −119     581
\[ \bar{x} = 11 + \left(-\frac{119}{61} \times 2\right) = 11 - 3.90 = 7.10 \]
\[ \sigma_y^2 = \frac{581}{61} - \left(-\frac{119}{61}\right)^2 = 9.52 - 3.80 = 5.72, \qquad \sigma_y = \sqrt{5.72} = 2.39 \]
\[ \sigma_x = 2 \times 2.39 = 4.78 \]
\[ \text{C.V. (for cotton consumed)} = \frac{4.78}{7.10} \times 100 = 67.32 \]
\[ \text{C.V. (for weight of yarn per spindle)} = \frac{3}{20} \times 100 = 15 \]
The relative dispersion of cotton consumption is therefore much greater.
[50] In a small town, a survey was conducted in respect of profit made by retail
shops. The following results were obtained :
Profit or loss (Rs. '000) -4 to -3 -3 to -2 -2 to -1 -1 to 0 0 to 1 1 to 2 2 to 3 3 to 4 4 to 5 5 to 6
No. of shops 4 10 22 28 38 56 40 24 18 10
Calculate : (i) the average profit made by a retail shop;
(ii) total profit made by all the shops;
(iii) the coefficient of variation of earnings.
[C.A., 1977]

Solution :
Class interval   Frequency (f)   Mid-value (x)   y = x − 0.5     fy     fy²
–4 to –3              4             −3.5             −4         −16      64
–3 to –2             10             −2.5             −3         −30      90
–2 to –1             22             −1.5             −2         −44      88
–1 to 0              28             −0.5             −1         −28      28
0 to 1               38              0.5              0           0       0
1 to 2               56              1.5              1          56      56
2 to 3               40              2.5              2          80     160
3 to 4               24              3.5              3          72     216
4 to 5               18              4.5              4          72     288
5 to 6               10              5.5              5          50     250
Total               250               –               –         212    1240
\[ \bar{x} = 0.5 + \frac{212}{250} = 0.5 + 0.848 = 1.348 \]
\[ \sigma_y^2 = \frac{1240}{250} - \left(\frac{212}{250}\right)^2 = 4.96 - 0.719 = 4.24, \qquad \sigma_y = \sqrt{4.24} = 2.06 \]
Since y = x − 0.5 involves only a change of origin, σx = σy = 2.06.
∴ (i) Average profit = 1.348 (Rs. '000), i.e. Rs. 1,348 per shop
(ii) Total profit = 250 × 1.348 = 337 (Rs. '000), i.e. Rs. 3,37,000
(iii) Coefficient of variation = (2.06 / 1.348) × 100 = 153
[51] The following data show the length of ear-head (in cm) for 24 ears of a variety of wheat
11.5 8.8 10.1
8.2 9.3 10.0
9.7 10.1 10.3
10.3 11.3 9.8
10.7 9.8 9.3
8.6 10.4 9.8
11.3 8.4 9.0
10.7 9.6 11.2
Determine the range, the mean deviation about mean and the standard deviation
for the data
Solution : Range = Maximum value – Minimum value
= 11.5 – 8.2 = 3.3 cm.
Table : Calculations for Mean Deviation about Mean and S.D.
Class limits    Frequency (f)   Mid-value (x)     fx        fx²      |x − x̄|   f|x − x̄|
8.0–8.7              3              8.35         25.05     209.17     1.55       4.65
8.71–9.41            4              9.06         36.24     328.33     0.84       3.36
9.42–10.12           8              9.77         78.16     763.62     0.13       1.04
10.13–10.83          5             10.48         52.40     549.15     0.58       2.90
10.84–11.54          4             11.19         44.76     500.86     1.29       5.16
Total               24               –          236.61    2351.13      –        17.11
\[ \bar{x} = \frac{236.61}{24} = 9.86 \approx 9.9 \text{ cm} \]
\[ \text{S.D.}^2 = \frac{2351.13}{24} - \left(\frac{236.61}{24}\right)^2 = 97.96 - 97.19 = 0.77 \]
\[ \text{S.D.} = \sqrt{0.77} = 0.88 \approx 0.9 \text{ cm} \]
\[ \text{Mean Deviation about Mean} = \frac{\sum f\,|x - \bar{x}|}{n} = \frac{17.11}{24} = 0.71 \text{ cm} \]
[52] The number of telephone calls received at an exchange in each of 245
successive one-minute intervals is shown in the following frequency distribution :
Number of calls Frequency
0 14
1 21
2 25
3 43
4 51
5 40
6 39
7 12
Total 245
Compute the mean deviation about median and the standard deviation.
Solution :
Table : Calculations for Mean Deviation and S.D.
  x      f     Cumulative frequency     fx      fx²     |x − Median| = |x − 4|    f|x − Median|
  0     14            14                 0        0               4                    56
  1     21            35                21       21               3                    63
  2     25            60                50      100               2                    50
  3     43           103               129      387               1                    43
  4     51           154               204      816               0                     0
  5     40           194               200     1000               1                    40
  6     39           233               234     1404               2                    78
  7     12           245 = N            84      588               3                    36
Total  245             –               922     4316               –                   366
Median = value corresponding to the cumulative frequency (N + 1)/2, i.e. the (245 + 1)/2 = 123rd term, which is 4.
\[ \text{Mean Deviation about Median} = \frac{366}{245} = 1.494 \]
\[ \text{S.D.}^2 = \frac{4316}{245} - \left(\frac{922}{245}\right)^2 = \frac{1057420 - 850084}{(245)^2} = \frac{207336}{(245)^2} \]
\[ \therefore \text{Standard deviation} = \sqrt{\frac{207336}{(245)^2}} = \frac{455.34}{245} = 1.858 \]
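The same steps (locate the median from the cumulative frequencies, then average the absolute deviations) can be sketched in Python as below. The function name `median_md_sd` is an illustrative assumption, and the median rule used here is the (N + 1)/2-th term, which is exact only when N is odd, as it is in this problem.

```python
import math

def median_md_sd(values, freqs):
    """Median, mean deviation about the median, and S.D. of a discrete
    frequency distribution (values must be in ascending order)."""
    n = sum(freqs)
    target = (n + 1) / 2            # position of the median term (odd N)
    cum = 0
    for x, f in zip(values, freqs):
        cum += f
        if cum >= target:
            median = x
            break
    md = sum(f * abs(x - median) for x, f in zip(values, freqs)) / n
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    var = sum(f * x * x for x, f in zip(values, freqs)) / n - mean ** 2
    return median, md, math.sqrt(var)

calls = [0, 1, 2, 3, 4, 5, 6, 7]
freqs = [14, 21, 25, 43, 51, 40, 39, 12]
median, md, sd = median_md_sd(calls, freqs)
print(median, round(md, 3), round(sd, 3))
```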

[53] Evaluate the three quartiles for the following frequency distribution of I.Q.
for 309 six-year-old children :
I.Q. Frequency
160–169 2
150–159 3
140–149 7
130–139 19
120–129 37
110–119 79
100–109 69
90–99 65
80–89 17
70–79 5
60–69 3
50–59 2
40–49 1
Total 309
Next determine the mean deviation about median, the standard deviation and
the quartile deviation.
Solution : The frequency distribution is first arranged in ascending order.
Class limits   Frequency (f)   Class boundary   Cumulative frequency
40–49               1           39.5–49.5              1
50–59               2           49.5–59.5              3
60–69               3           59.5–69.5              6
70–79               5           69.5–79.5             11
80–89              17           79.5–89.5             28
90–99              65           89.5–99.5             93
100–109            69           99.5–109.5           162
110–119            79          109.5–119.5           241
120–129            37          119.5–129.5           278
130–139            19          129.5–139.5           297
140–149             7          139.5–149.5           304
150–159             3          149.5–159.5           307
160–169             2          159.5–169.5           309 = N
Total             309               –                  –
N/4 = 77.25 lies between the cumulative frequencies 28 and 93, so Q1 lies in the class 89.5–99.5.
N/2 = 154.5 lies between 93 and 162, so Q2 lies in the class 99.5–109.5.
3N/4 = 231.75 lies between 162 and 241, so Q3 lies in the class 109.5–119.5.
\[ Q_1 = 89.5 + \frac{77.25 - 28}{65} \times 10 = 89.5 + \frac{49.25}{65} \times 10 = 89.5 + 7.58 = 97.08 \]
\[ Q_2 = 99.5 + \frac{154.5 - 93}{69} \times 10 = 99.5 + \frac{615}{69} = 99.5 + 8.91 = 108.41 \]
\[ Q_3 = 109.5 + \frac{231.75 - 162}{79} \times 10 = 109.5 + \frac{697.5}{79} = 109.5 + 8.83 = 118.33 \]
Table : Calculations for S.D. and Mean Deviation about Median
Mid-value (x)   y = (x − 104.5)/10     f      fy     fy²    |x − 108.41|    f|x − 108.41|
44.5                  −6               1      −6      36       63.91            63.91
54.5                  −5               2     −10      50       53.91           107.82
64.5                  −4               3     −12      48       43.91           131.73
74.5                  −3               5     −15      45       33.91           169.55
84.5                  −2              17     −34      68       23.91           406.47
94.5                  −1              65     −65      65       13.91           904.15
104.5                  0              69       0       0        3.91           269.79
114.5                  1              79      79      79        6.09           481.11
124.5                  2              37      74     148       16.09           595.33
134.5                  3              19      57     171       26.09           495.71
144.5                  4               7      28     112       36.09           252.63
154.5                  5               3      15      75       46.09           138.27
164.5                  6               2      12      72       56.09           112.18
Total                  –             309     123     969         –            4128.65
\[ \text{S.D.}_y^2 = \frac{969}{309} - \left(\frac{123}{309}\right)^2 = \frac{299421 - 15129}{(309)^2} = \frac{284292}{(309)^2} \]
\[ \therefore \text{S.D.}_y = \frac{533.19}{309} = 1.726, \qquad \text{S.D.}_x = 1.726 \times 10 = 17.26 \]
\[ \text{Mean deviation about median} = \frac{4128.65}{309} = 13.36 \]
\[ \text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} = \frac{118.33 - 97.08}{2} = \frac{21.25}{2} = 10.63 \]
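The quartile formula used above, Q = l + ((pN − F)/f) × c, applies the same linear interpolation for every quantile. A minimal Python sketch follows; the function name `grouped_quantile` and its parameters are illustrative assumptions, and equal class widths are assumed.

```python
def grouped_quantile(lower_bounds, freqs, width, p):
    """Quantile of grouped data by linear interpolation inside the class
    whose cumulative frequency first reaches p*N (0 < p <= 1)."""
    n = sum(freqs)
    target = p * n
    cum = 0                                  # cumulative frequency below the class
    for lo, f in zip(lower_bounds, freqs):
        if cum + f >= target:
            return lo + (target - cum) / f * width
        cum += f
    raise ValueError("p must lie in (0, 1]")

# I.Q. distribution of problem [53], class boundaries 39.5-49.5, ..., 159.5-169.5
lowers = [39.5 + 10 * k for k in range(13)]
freqs = [1, 2, 3, 5, 17, 65, 69, 79, 37, 19, 7, 3, 2]

q1, q2, q3 = (grouped_quantile(lowers, freqs, 10, p) for p in (0.25, 0.5, 0.75))
qd = (q3 - q1) / 2                           # quartile deviation
print(round(q1, 2), round(q2, 2), round(q3, 2), round(qd, 2))
```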

[54] Compute the standard deviation of the age – distribution of Bengali males as
given below:
Age last birthday Frequency
0 156
1 121
2 111
3 106
4 103
5-9 472
10-14 434
15-19 407
20-24 383
25-29 357
30-34 335
35-39 306
40-49 522
50-59 370
60-69 213
70-79 80
80-89 11
90-99 1
Total 4488
Solution : From the solution on page 86, Example-63,
\[ \bar{x} = 26.90, \quad n_1 = 597, \quad n_2 = 2694, \quad n_3 = 1197 \]
\[ \bar{x}_1 = 1.80, \quad \bar{x}_2 = 20.62, \quad \bar{x}_3 = 53.56 \]
\[ s_1^2 = \frac{1263}{597} - \left(\frac{-121}{597}\right)^2 = 2.12 - 0.04 = 2.08 \]
\[ s_2^2 = \frac{10847}{2694} - \left(\frac{-746}{2694}\right)^2 = 4.02 - 0.08 = 3.94 \]
\[ s_3^2 = \frac{2591}{1197} - \left(\frac{-1309}{1197}\right)^2 = 2.16 - 1.20 = 0.96 \]
\[ \sigma^2 = \frac{n_1 s_1^2 + n_2 s_2^2 + n_3 s_3^2}{n_1 + n_2 + n_3} + \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + n_3(\bar{x}_3 - \bar{x})^2}{n_1 + n_2 + n_3} \]
\[ \text{1st part} = \frac{597 \times 2.08 + 2694 \times 3.94 + 1197 \times 0.96}{597 + 2694 + 1197} = \frac{1241.76 + 10614.36 + 1149.12}{4488} = \frac{13005.24}{4488} = 2.90 \]
\[ \text{2nd part} = \frac{597(1.80 - 26.90)^2 + 2694(20.62 - 26.90)^2 + 1197(53.56 - 26.90)^2}{597 + 2694 + 1197} \]
\[ = \frac{597 \times (-25.10)^2 + 2694 \times (-6.28)^2 + 1197 \times (26.66)^2}{4488} = \frac{597 \times 630.01 + 2694 \times 39.44 + 1197 \times 710.76}{4488} \]
\[ = \frac{376115.97 + 106251.36 + 850779.72}{4488} = \frac{1333147.05}{4488} = 297.05 \]
\[ \therefore \sigma^2 = 2.90 + 297.05 = 299.95, \qquad \sigma = \sqrt{299.95} = 17.32 \]
[55] For a set of 250 observations on a certain variable x, the mean and standard
deviation are, respectively, 65.7 and 4.4, However, on scrutinizing the data it is found
that two observations, which should correctly read as 71 and 83, had been wrongly
recorded as 91 and 80. Obtain the correct values of the mean and the standard
deviation.
Solution : For the 250 observations (which include the incorrect values 91 and 80),
using the formulae for mean and variance, viz.
\[ \bar{x} = \frac{\sum x}{n}, \qquad \sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 \]
we have $65.7 = \sum x / 250$, so that $\sum x = 65.7 \times 250 = 16425$, and
\[ 4.4^2 = \frac{\sum x^2}{250} - \left(\frac{16425}{250}\right)^2 \]
\[ \frac{\sum x^2}{250} = 4316.49 + 19.36 = 4335.85, \qquad \sum x^2 = 1083962.50 \]
When 91 and 80 are replaced by 71 and 83, each value (and each square) is removed or added individually, so the correct totals are
\[ \sum x = 16425 - 91 - 80 + 71 + 83 = 16408 \]
\[ \sum x^2 = 1083962.50 - 91^2 - 80^2 + 71^2 + 83^2 = 1083962.50 - 14681 + 11930 = 1081211.50 \]
Using these in the formulae for mean and variance,
\[ \text{Mean} = \frac{16408}{250} = 65.632 \approx 65.63 \]
\[ (\text{S.D.})^2 = \frac{1081211.50}{250} - (65.632)^2 = 4324.85 - 4307.56 = 17.29 \]
\[ \text{S.D.} = \sqrt{17.29} = 4.16 \]
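A correction of this kind is easy to get wrong by hand, because the squares of the wrong and right values must be handled one at a time (91² and 80² are subtracted, 71² and 83² are added). The sketch below shows that bookkeeping in Python; the function name `corrected_mean_sd` and its parameters are illustrative assumptions.

```python
import math

def corrected_mean_sd(n, mean, sd, wrong, right):
    """Recover sum(x) and sum(x^2) from the reported mean and S.D.,
    swap the wrong observations for the right ones, and recompute."""
    sum_x = n * mean
    sum_x2 = n * (sd ** 2 + mean ** 2)
    sum_x += sum(right) - sum(wrong)
    # squares are corrected value by value, not as (sum of values)^2
    sum_x2 += sum(v * v for v in right) - sum(v * v for v in wrong)
    new_mean = sum_x / n
    new_var = sum_x2 / n - new_mean ** 2
    return new_mean, math.sqrt(new_var)

mean, sd = corrected_mean_sd(250, 65.7, 4.4, wrong=[91, 80], right=[71, 83])
print(round(mean, 3), round(sd, 2))
```

For the data of problem [55] this returns a corrected mean of 65.632 and a corrected S.D. of about 4.16.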
[56] The number of runs scored by cricketers A and B during a test series consisting
of 5 test matches is shown below for each of the 10 innings :
Cricketers A – 5, 26, 97, 76, 112, 89, 6, 108, 24, 16.
Cricketers B – 51, 47, 36, 60, 58, 39, 44, 42, 71, 50.
Make a comparative study of their batting performance.

Solution : We have to calculate
\[ \text{Coefficient of variation} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100 \]
        Cricketer A                       Cricketer B
   x    y = x − 55     y²          y    z = y − 50     z²
   5      −50        2500         51        1           1
  26      −29         841         47       −3           9
  97       42        1764         36      −14         196
  76       21         441         60       10         100
 112       57        3249         58        8          64
  89       34        1156         39      −11         121
   6      −49        2401         44       −6          36
 108       53        2809         42       −8          64
  24      −31         961         71       21         441
  16      −39        1521         50        0           0
Total       9       17643       Total      −2        1032
For A :
\[ \bar{x} = 55 + \frac{9}{10} = 55 + 0.9 = 55.9 \]
\[ \sigma^2 = \frac{17643}{10} - \left(\frac{9}{10}\right)^2 = 1764.3 - 0.81 = 1763.49, \qquad \sigma = \sqrt{1763.49} = 41.99 \]
For B :
\[ \bar{y} = 50 - \frac{2}{10} = 50 - 0.2 = 49.8 \]
\[ \sigma^2 = \frac{1032}{10} - \left(-\frac{2}{10}\right)^2 = 103.2 - 0.04 = 103.16, \qquad \sigma = \sqrt{103.16} = 10.16 \]
\[ \text{C.V. of A} = \frac{41.99}{55.9} \times 100 = 75.12 \]
\[ \text{C.V. of B} = \frac{10.16}{49.8} \times 100 = 20.40 \]
Cricketer A has the higher average (55.9 runs against 49.8), but the coefficient of variation is much smaller for cricketer B, so B is the more consistent batsman.

MOMENTS, SKEWNESS & KURTOSIS
Moments
Given n observations $x_1, x_2, \ldots, x_n$ and an arbitrary constant A,
$\frac{1}{n}\sum (x - A)$ is called the 1st moment about A,
$\frac{1}{n}\sum (x - A)^2$ is called the 2nd moment about A,
$\frac{1}{n}\sum (x - A)^3$ is called the 3rd moment about A,
and so on. Let us denote these moments successively by $m_1'$, $m_2'$, $m_3'$, etc. Then
\[ m_1' = \frac{\sum (x - A)}{n} = \frac{\sum x - \sum A}{n} = \frac{\sum x - nA}{n} = \bar{x} - A, \]
i.e. the 1st moment about A equals $(\bar{x} - A)$.
(a) Moments about zero (i.e. when A = 0), or raw moments
1st moment about zero = $\frac{1}{n}\sum x = \bar{x}$
2nd moment about zero = $\frac{1}{n}\sum x^2$
3rd moment about zero = $\frac{1}{n}\sum x^3$
and so on. Note that the 1st moment about zero is the mean: $m_1' = \bar{x}$.
(b) Moments about the mean (or central moments)
1st moment about mean = $\frac{1}{n}\sum (x - \bar{x}) = 0$
2nd moment about mean = $\frac{1}{n}\sum (x - \bar{x})^2 = \sigma^2$
3rd moment about mean = $\frac{1}{n}\sum (x - \bar{x})^3$
4th moment about mean = $\frac{1}{n}\sum (x - \bar{x})^4$
and so on.
These are usually denoted by $m_1, m_2, m_3, m_4$, etc. Note that the 1st central moment
is always zero, and the 2nd central moment is the variance $\sigma^2$. Hence $m_1 = 0$ and $m_2 = \sigma^2$.
From the second relation, we find that the standard deviation is the square root
of the second central moment $m_2$.
The 3rd central moment $m_3$ is used to measure skewness and the 4th central
moment $m_4$ to measure kurtosis.
In general, given n observations $x_1, x_2, \ldots, x_n$, the r-th order moments (r = 0, 1, 2, ...)
are defined as follows:
r-th moment about A : $m_r' = \frac{1}{n}\sum (x - A)^r$
r-th raw moment : $m_r' = \frac{1}{n}\sum x^r$
r-th central moment : $m_r = \frac{1}{n}\sum (x - \bar{x})^r$
For a frequency distribution,
r-th moment about A : $m_r' = \frac{1}{N}\sum f(x - A)^r$
r-th raw moment : $m_r' = \frac{1}{N}\sum f x^r$
r-th central moment : $m_r = \frac{1}{N}\sum f(x - \bar{x})^r$
where $N = \sum f$.
There are important relations between central and non-central moments. For
example, if the non-central moments ($m_1'$, $m_2'$, $m_3'$, etc.) about any arbitrary origin A
are known, the central moments can be obtained by using the relations
\[ m_2 = m_2' - m_1'^2 \]
\[ m_3 = m_3' - 3m_2' m_1' + 2m_1'^3 \]
\[ m_4 = m_4' - 4m_3' m_1' + 6m_2' m_1'^2 - 3m_1'^4 \]
In particular, using the first two moments, $m_1'$ and $m_2'$, about an arbitrary origin
A, the mean and the variance may be obtained:
\[ \bar{x} = m_1' + A, \qquad \sigma^2 = m_2' - m_1'^2 \]
Relation between central and non-central moments
[I] Formula for $m_r$ in terms of $m_r'$ and moments of lower order:
\[ m_r = \frac{\sum (x_i - \bar{x})^r}{n}, \qquad m_r' = \frac{\sum (x_i - A)^r}{n} \]
Let us write
\[ x_i - \bar{x} = (x_i - A) - (\bar{x} - A) = (x_i - A) - d, \quad\text{where } d = \bar{x} - A = m_1'. \]
Using the binomial expansion,
\[ (x_i - \bar{x})^r = (x_i - A)^r - {}^rC_1 (x_i - A)^{r-1} d + {}^rC_2 (x_i - A)^{r-2} d^2 - \cdots + (-1)^r d^r \]
Summing over all values of i = 1, 2, ..., n,
\[ \sum (x_i - \bar{x})^r = \sum (x_i - A)^r - {}^rC_1\, d \sum (x_i - A)^{r-1} + {}^rC_2\, d^2 \sum (x_i - A)^{r-2} - \cdots + (-1)^r n d^r \]
Now dividing both sides by n,
\[ m_r = m_r' - {}^rC_1\, m_{r-1}'\, d + {}^rC_2\, m_{r-2}'\, d^2 - \cdots + (-1)^r d^r \]
In particular, putting r = 1, 2, 3, 4 (with $m_0' = 1$), we get
\[ m_1 = m_1' - d \]
\[ m_2 = m_2' - 2m_1' d + d^2 \]
\[ m_3 = m_3' - 3m_2' d + 3m_1' d^2 - d^3 \]
\[ m_4 = m_4' - 4m_3' d + 6m_2' d^2 - 4m_1' d^3 + d^4 \]
Writing $d = m_1'$ and simplifying, the central moments ($m_r$) when expressed in
terms of the moments ($m_r'$) about any origin are
\[ m_1 = 0 \]
\[ m_2 = m_2' - m_1'^2 \]
\[ m_3 = m_3' - 3m_2' m_1' + 2m_1'^3 \]
\[ m_4 = m_4' - 4m_3' m_1' + 6m_2' m_1'^2 - 3m_1'^4 \]
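These three conversion formulae can be checked numerically on any small data set. The Python sketch below computes the central moments both directly and from the moments about an arbitrary origin; the data `[1, 2, 3, 6]`, the origin `A = 0` and the function names are illustrative assumptions.

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments (m2, m3, m4) from moments m1'..m4' about any origin."""
    c2 = m2 - m1 ** 2
    c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    c4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return c2, c3, c4

def moment_about(data, origin, r):
    """r-th moment of the observations about the given origin."""
    return sum((x - origin) ** r for x in data) / len(data)

data = [1, 2, 3, 6]      # illustrative observations
A = 0                    # arbitrary origin
raw = [moment_about(data, A, r) for r in (1, 2, 3, 4)]
converted = central_from_raw(*raw)

mean = sum(data) / len(data)
direct = tuple(moment_about(data, mean, r) for r in (2, 3, 4))
print(converted, direct)
```

Both routes give m2 = 3.5, m3 = 4.5 and m4 = 24.5 for this data set, and the agreement holds for any other choice of origin A.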
[II] Formula for $m_r'$ in terms of $m_r$ and moments of lower order:
We write $x_i - A = (x_i - \bar{x}) + (\bar{x} - A) = (x_i - \bar{x}) + d$.
Therefore, using the binomial expansion,
\[ (x_i - A)^r = (x_i - \bar{x})^r + {}^rC_1 (x_i - \bar{x})^{r-1} d + {}^rC_2 (x_i - \bar{x})^{r-2} d^2 + \cdots + d^r \]
Summing over all values of i = 1, 2, ..., n,
\[ \sum (x_i - A)^r = \sum (x_i - \bar{x})^r + {}^rC_1\, d \sum (x_i - \bar{x})^{r-1} + {}^rC_2\, d^2 \sum (x_i - \bar{x})^{r-2} + \cdots + n d^r \]
Dividing both sides by n,
\[ m_r' = m_r + {}^rC_1\, m_{r-1}\, d + {}^rC_2\, m_{r-2}\, d^2 + \cdots + d^r \]
where $d = \bar{x} - A = m_1'$ is the first moment about A.
In particular, putting r = 1, 2, 3, 4 (with $m_0 = 1$), we get
\[ m_1' = m_1 + d \]
\[ m_2' = m_2 + 2m_1 d + d^2 \]
\[ m_3' = m_3 + 3m_2 d + 3m_1 d^2 + d^3 \]
\[ m_4' = m_4 + 4m_3 d + 6m_2 d^2 + 4m_1 d^3 + d^4 \]
Since $m_1 = 0$ and $d = \bar{x} - A = m_1'$, we have
\[ m_1' = m_1' \quad \text{(as expected)} \]
\[ m_2' = m_2 + m_1'^2 \]
\[ m_3' = m_3 + 3m_2 m_1' + m_1'^3 \]
\[ m_4' = m_4 + 4m_3 m_1' + 6m_2 m_1'^2 + m_1'^4 \]
Beta-coefficients and Gamma-coefficients
'Beta-coefficients' are defined as follows:
\[ \beta_1 = \frac{m_3^2}{m_2^3}; \qquad \beta_2 = \frac{m_4}{m_2^2} \]
The beta-coefficients can never be negative, and are pure numbers, independent
of the origin and scale of observation. Since for a symmetrical distribution all odd-order
moments are zero, $m_3 = 0$; consequently $\beta_1 = 0$ when the distribution is symmetrical.
Frequency distributions are classified as leptokurtic, platykurtic or mesokurtic,
according as the value of $\beta_2$ is greater than, less than, or equal to 3. Beta-coefficients
are used for measuring skewness and kurtosis.
'Gamma-coefficients' are defined as follows:
\[ \gamma_1 = \sqrt{\beta_1}; \qquad \gamma_2 = \beta_2 - 3 \]
The square root is taken so that $\gamma_1$ has the same sign as $m_3$. The gamma-coefficients may be positive,
negative or zero, but are pure numbers like the beta-coefficients. These are used as
measures of 'skewness' and 'kurtosis':
\[ \text{Skewness } (\gamma_1) = \sqrt{\beta_1} = \frac{m_3}{\sigma^3} \]
\[ \text{Kurtosis } (\gamma_2) = \beta_2 - 3 = \frac{m_4}{\sigma^4} - 3 \]
Distributions are said to be 'positively skew', 'negatively skew' or 'symmetrical'
according as $\gamma_1$ is positive, negative or zero. Similarly, positive, negative or zero values of
$\gamma_2$ are associated with 'leptokurtic', 'platykurtic' or 'mesokurtic' distributions.
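The four coefficients follow mechanically from the central moments m2, m3 and m4. A short Python sketch is given below; the function name `beta_gamma` and the sample moments (1.44, 0.408, 5.8464) are illustrative assumptions chosen only to show the calculation.

```python
import math

def beta_gamma(m2, m3, m4):
    """Return (beta1, beta2, gamma1, gamma2) from central moments."""
    beta1 = m3 ** 2 / m2 ** 3
    beta2 = m4 / m2 ** 2
    # gamma1 is the square root of beta1 carrying the sign of m3
    gamma1 = math.copysign(math.sqrt(beta1), m3)
    gamma2 = beta2 - 3
    return beta1, beta2, gamma1, gamma2

b1, b2, g1, g2 = beta_gamma(1.44, 0.408, 5.8464)
print(round(b1, 3), round(b2, 2), round(g1, 2), round(g2, 2))
```

Here γ1 is positive and γ2 is negative, so a distribution with these moments would be described as positively skew and platykurtic.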
Moments of frequency distributions
If $x_1, x_2, \ldots, x_n$ have frequencies $f_1, f_2, \ldots, f_n$ respectively, the r-th moment about
A is defined as
\[ m_r' = \frac{1}{N}\sum f(x - A)^r \]
where $N = \sum f$. The r-th central moment is similarly defined as
\[ m_r = \frac{1}{N}\sum f(x - \bar{x})^r \]
where $\bar{x} = \sum fx / N$.
In the case of grouped frequency distributions, the mid-values are taken as
representatives of the respective classes, and $x_1, x_2, \ldots, x_n$ denote these mid-values. If the
successive mid-values have a common difference, the calculation of central moments
can be simplified.
If y = (x − c)/d, where c and d are constants, the r-th central moment of x is equal
to $d^r$ times the r-th central moment of y:
\[ m_r(x) = d^r\, m_r(y) \]
Since the values of y are usually small, $m_r(y)$ can be calculated easily from the raw
moments of y. Finally, $m_r(y)$ is multiplied by $d^r$ to give the required central moment of x.
In particular, for the standardized variable $z = (x - \bar{x})/\sigma$, we have $c = \bar{x}$ and $d = \sigma$, so that
$m_r(x) = \sigma^r m_r(z)$, or $m_r(z) = m_r(x)/\sigma^r$.
Charlier's check :–
C. V. L. Charlier gave a simple and effective check on the calculations for moments
from a grouped frequency distribution. Expanding $(y + 1)^4$ by the binomial theorem,
multiplying by f, and then summing, we have the identity
\[ \sum f(y + 1)^4 = \sum fy^4 + 4\sum fy^3 + 6\sum fy^2 + 4\sum fy + N \]
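In practice the check means adding one extra column, f(y + 1)⁴, to the working table and comparing its total with the right-hand side built from the other column totals. A minimal Python sketch is shown below; the function name `charlier_check` and the small symmetric distribution used as input are illustrative assumptions.

```python
def charlier_check(ys, freqs):
    """Return (lhs, rhs) of Charlier's identity
    sum f(y+1)^4 = sum fy^4 + 4 sum fy^3 + 6 sum fy^2 + 4 sum fy + N."""
    n = sum(freqs)
    # s[k-1] holds sum of f * y^k for k = 1..4
    s = [sum(f * y ** k for y, f in zip(ys, freqs)) for k in range(1, 5)]
    lhs = sum(f * (y + 1) ** 4 for y, f in zip(ys, freqs))
    rhs = s[3] + 4 * s[2] + 6 * s[1] + 4 * s[0] + n
    return lhs, rhs

ys = [-2, -1, 0, 1, 2]        # coded values y = (x - c)/d
freqs = [3, 7, 16, 7, 3]
lhs, rhs = charlier_check(ys, freqs)
print(lhs, rhs)
```

If the two totals differ, at least one of the columns fy, fy², fy³, fy⁴ or f(y + 1)⁴ contains an arithmetic slip.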
Sheppard's correction for errors due to grouping
For the calculation of mean, variance and other moments from a grouped
frequency distribution, the mid-value of a class interval is taken as representative of
all observations falling within that class interval. This, however, introduces some error
in the calculated values, known as 'error due to grouping'.
So far as the first four moments are concerned:
[i] No correction is necessary for the mean $m_1'$ and the third central moment $m_3$.
[ii] Corrections for the 2nd and the 4th central moments are
\[ m_2 \text{ (corrected)} = m_2 \text{ (uncorrected)} - \frac{c^2}{12} \]
\[ m_4 \text{ (corrected)} = m_4 \text{ (uncorrected)} - \frac{c^2}{2}\, m_2 \text{ (uncorrected)} + \frac{7c^4}{240} \]
where c is the common width of the class intervals.
Sheppard's corrections should be applied when
[a] the distribution relates to a continuous variable, and is of moderate
symmetry;
[b] the frequency becomes smaller and smaller, approaching
zero at each end of the distribution.
These corrections are not applicable to J- or U-shaped distributions or to very
skew distributions. Moreover, unless the total frequency is fairly large, the corrections
will be of little practical importance.
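The two corrections are simple enough to apply in one step once the uncorrected moments and the class width are known. The sketch below is a minimal Python version; the function name `sheppard` and the input values (m2 = 23.04, m4 = 1496.68, c = 4) are illustrative assumptions.

```python
def sheppard(m2, m4, c):
    """Sheppard-corrected 2nd and 4th central moments for class width c.
    Both corrections use the UNCORRECTED m2."""
    m2_corr = m2 - c ** 2 / 12
    m4_corr = m4 - (c ** 2 / 2) * m2 + 7 * c ** 4 / 240
    return m2_corr, m4_corr

m2_corr, m4_corr = sheppard(23.04, 1496.68, 4)
print(round(m2_corr, 2), round(m4_corr, 2))
```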
Skewness
A frequency distribution is said to be ‘symmetrical,’ if the frequencies are
symmetrically distributed about mean, i. e. when values of the variable equidistant
from mean have equal frequencies.
Illustration–I : Symmetrical distribution:
[i] x : 10 15 20 25 30
ƒ: 3 7 16 7 3
[ii] x : 10 15 20 25 30 35
ƒ: 3 7 16 16 7 3
Note that in the above distributions, the means are respectively 20, 22.5. The
median and mode for each also have the same value. In fact, for any symmetrical
distribution mean, median and mode are equal.
The word 'skewness' is used to denote the 'extent of asymmetry' in the
data. When the frequency distribution is not symmetrical, it is said to be 'skew'. The
word 'skewness' literally denotes 'asymmetry' or 'lack of symmetry', and 'skew' denotes
'asymmetrical'. A symmetrical distribution has therefore zero skewness. Skewness may
also be positive or negative.
Skewness is measured by the following formulae :–
[1] Pearson's first measure –
\[ \text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} \]
[2] Pearson's second measure –
\[ \text{Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}} \]
[3] Bowley's measure –
\[ \text{Skewness} = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{(Q_3 - Q_2) + (Q_2 - Q_1)} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} \]
where $Q_1, Q_2, Q_3$ denote the first, second and third quartiles of the distribution.
[4] Moment measure –
\[ \text{Skewness } (\gamma_1) = \frac{m_3}{\sigma^3} = \frac{m_3}{\left(\sqrt{m_2}\right)^3} \]
where $m_2$ and $m_3$ are the second and third central moments, and σ denotes the S.D.
It should be noted that all the measures of skewness are pure numbers and have
the value zero when the distribution is symmetrical.
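Each of the four measures is a one-line formula, so they are convenient to keep together as small helper functions. The Python sketch below does this; the function names and the numerical inputs are illustrative assumptions, not values prescribed by the text.

```python
def pearson_first(mean, mode, sd):
    """Pearson's first measure: (Mean - Mode) / S.D."""
    return (mean - mode) / sd

def pearson_second(mean, median, sd):
    """Pearson's second measure: 3 (Mean - Median) / S.D."""
    return 3 * (mean - median) / sd

def bowley(q1, q2, q3):
    """Bowley's quartile measure: (Q3 - 2 Q2 + Q1) / (Q3 - Q1)."""
    return (q3 - 2 * q2 + q1) / (q3 - q1)

def moment_skewness(m2, m3):
    """Moment measure gamma_1 = m3 / sigma^3, with sigma = sqrt(m2)."""
    return m3 / m2 ** 1.5

sk1 = pearson_first(18.07, 18.40, 1.77)
sk2 = pearson_second(430.61, 424.76, 158)
sk3 = bowley(22.63, 26.73, 32.07)
sk4 = moment_skewness(1.44, 0.408)
print(round(sk1, 2), round(sk2, 2), round(sk3, 2), round(sk4, 2))
```

The four measures are on different scales, so their values should be compared only for sign and rough magnitude, never against one another directly.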
Figures – Position of Mean (M), Median (Me), Mode (Mo) for different types of skewness:
(a) Positive Skewness: Mo < Me < M   (b) Negative Skewness: M < Me < Mo   (c) Zero Skewness: M = Me = Mo
Kurtosis
Kurtosis refers to the degree of 'peakedness' of the frequency curve. Two
distributions may have the same average, dispersion and skewness; yet in one there
may be a higher concentration of values near the mode, showing a sharper peak in the
frequency curve than in the other. This characteristic of the frequency distribution is
known as 'kurtosis'. The usual measure of kurtosis is based on moments, viz.
\[ \text{Kurtosis } (\gamma_2) = \frac{m_4}{\sigma^4} - 3 = \beta_2 - 3 \]
where $m_4$ and σ denote the fourth central moment and the S.D. respectively.
Figure – Different Types of Kurtosis: (a) Platykurtic (b) Mesokurtic (c) Leptokurtic
β2 < 3, for Platykurtic distribution
β2 = 3, for Mesokurtic distribution
β2 > 3, for Leptokurtic distribution
For the normal distribution, which is neither very peaked nor flat-topped, β2 = 3.
This important distribution is taken as a standard for measuring kurtosis, and it has
become customary to use γ2 = β2 – 3 as a measure of kurtosis.
A distribution is said to be 'platykurtic' when γ2 is negative; it is said to be
'mesokurtic' when γ2 = 0 and 'leptokurtic' when γ2 is positive.
The frequency curve for a platykurtic distribution is relatively flat-topped, and
for a leptokurtic distribution it has a relatively high peak. A mesokurtic distribution
(e.g. the normal distribution) is of moderate peakedness.
Exercise
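[1] The first two moments of a distribution about the value 4 are –1.5 and 2.7. (a) Find
the moments about zero. (b) Also calculate the mean and S.D.
Solution:
\[ \frac{\sum (x - 4)}{n} = -1.5 \quad \ldots (1) \qquad \frac{\sum (x - 4)^2}{n} = 2.7 \quad \ldots (2) \]
From (1) we get
\[ \frac{\sum x - 4n}{n} = -1.5, \quad\text{or}\quad \frac{\sum x}{n} - 4 = -1.5, \quad\text{or}\quad \frac{\sum x}{n} = -1.5 + 4 = 2.5 \]
From (2) we get
\[ \frac{\sum (x^2 - 8x + 16)}{n} = 2.7, \quad\text{or}\quad \frac{\sum x^2}{n} - \frac{8\sum x}{n} + \frac{16n}{n} = 2.7 \]
\[ \frac{\sum x^2}{n} - 8 \times 2.5 + 16 = 2.7, \quad\text{or}\quad \frac{\sum x^2}{n} = 2.7 - 16 + 20 = 6.7 \]
(a) Moments about zero: 1st moment $= \sum x / n = 2.5$; 2nd moment $= \sum (x - 0)^2 / n = \sum x^2 / n = 6.7$.
(b) Mean $= \sum x / n = 2.5$, and
\[ \text{S.D.}^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = 6.7 - 2.5^2 = 6.7 - 6.25 = 0.45 \]
\[ \therefore \text{S.D.} = \sqrt{0.45} = 0.671 \]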
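[2] The first three moments of a distribution about the value 3 of the variable are 2,
10 and 30 respectively. Obtain the first three moments about zero. Show also that the
variance of the distribution is 6.
Solution : Here,
\[ \frac{\sum (x - 3)}{n} = 2, \quad\text{or}\quad \frac{\sum x}{n} - \frac{3n}{n} = 2, \quad\text{or}\quad \frac{\sum x}{n} = 2 + 3 = 5 \]
\[ \frac{\sum (x - 3)^2}{n} = 10, \quad\text{or}\quad \frac{\sum (x^2 - 6x + 9)}{n} = 10, \quad\text{or}\quad \frac{\sum x^2}{n} - \frac{6\sum x}{n} + 9 = 10 \]
\[ \text{or}\quad \frac{\sum x^2}{n} - 6 \times 5 + 9 = 10, \quad\text{or}\quad \frac{\sum x^2}{n} = 10 + 30 - 9 = 31 \]
\[ \frac{\sum (x - 3)^3}{n} = 30, \quad\text{or}\quad \frac{\sum (x^3 - 9x^2 + 27x - 27)}{n} = 30, \quad\text{or}\quad \frac{\sum x^3}{n} - \frac{9\sum x^2}{n} + \frac{27\sum x}{n} - 27 = 30 \]
\[ \text{or}\quad \frac{\sum x^3}{n} - 9 \times 31 + 27 \times 5 - 27 = 30, \quad\text{or}\quad \frac{\sum x^3}{n} = 279 - 135 + 27 + 30 = 336 - 135 = 201 \]
1st moment about zero = $\frac{1}{n}\sum x = \bar{x} = 5$
2nd moment about zero = $\frac{1}{n}\sum x^2 = 31$
3rd moment about zero = $\frac{1}{n}\sum x^3 = 201$
\[ \text{Variance} = \sigma^2 = \frac{1}{n}\sum (x - \bar{x})^2 = \frac{1}{n}\sum (x^2 - 2x\bar{x} + \bar{x}^2) = \frac{\sum x^2}{n} - 2\bar{x}\,\frac{\sum x}{n} + \frac{n\bar{x}^2}{n} \]
\[ = 31 - 2 \times 5^2 + 5^2 = 31 - 50 + 25 = 6. \quad \text{Proved.} \]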
[3] The first four moments about the value 1 are 2.6, 10.2, 43.4 and 192.6 respectively.
Find the A.M. and the first four moments about 4.
Solution : As per the condition given,
\[ \frac{\sum (x - 1)}{n} = 2.6, \quad\text{or}\quad \frac{\sum x}{n} - 1 = 2.6, \quad\text{or}\quad \frac{\sum x}{n} = 3.6 = \bar{x} = \text{A.M.} \]
\[ \frac{\sum (x - 1)^2}{n} = 10.2, \quad\text{or}\quad \frac{\sum x^2}{n} - \frac{2\sum x}{n} + 1 = 10.2, \quad\text{or}\quad \frac{\sum x^2}{n} = 10.2 + 2 \times 3.6 - 1 = 16.4 \]
\[ \frac{\sum (x - 1)^3}{n} = 43.4, \quad\text{or}\quad \frac{\sum x^3}{n} - \frac{3\sum x^2}{n} + \frac{3\sum x}{n} - 1 = 43.4 \]
\[ \text{or}\quad \frac{\sum x^3}{n} = 43.4 + 3 \times 16.4 - 3 \times 3.6 + 1 = 43.4 + 49.2 - 10.8 + 1 = 82.8 \]
\[ \frac{\sum (x - 1)^4}{n} = \frac{\sum x^4}{n} - \frac{4\sum x^3}{n} + \frac{6\sum x^2}{n} - \frac{4\sum x}{n} + 1 = 192.6 \]
\[ \frac{\sum x^4}{n} = 192.6 + 4 \times 82.8 - 6 \times 16.4 + 4 \times 3.6 - 1 = 192.6 + 331.2 - 98.4 + 14.4 - 1 = 438.8 \]
Moments about 4 :
\[ \frac{\sum (x - 4)}{n} = \frac{\sum x}{n} - 4 = 3.6 - 4 = -0.4 \]
\[ \frac{\sum (x - 4)^2}{n} = \frac{\sum x^2}{n} - \frac{8\sum x}{n} + 16 = 16.4 - 8 \times 3.6 + 16 = 32.4 - 28.8 = 3.6 \]
\[ \frac{\sum (x - 4)^3}{n} = \frac{\sum x^3}{n} - \frac{12\sum x^2}{n} + \frac{48\sum x}{n} - 64 = 82.8 - 196.8 + 172.8 - 64 = 255.6 - 260.8 = -5.2 \]
\[ \frac{\sum (x - 4)^4}{n} = \frac{\sum x^4}{n} - \frac{16\sum x^3}{n} + \frac{96\sum x^2}{n} - \frac{256\sum x}{n} + 256 \]
\[ = 438.8 - 1324.8 + 1574.4 - 921.6 + 256 = 2269.20 - 2246.40 = 22.80 \]
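Moving moments from one origin to another does not actually require passing through the raw moments: writing x − B = (x − A) + (A − B) and expanding by the binomial theorem gives the new moments directly. The Python sketch below applies that expansion; the function name `shift_moments` and its parameters are illustrative assumptions.

```python
from math import comb

def shift_moments(moments, old_origin, new_origin):
    """Convert moments about old_origin into moments about new_origin.

    moments[k-1] is the k-th moment about old_origin. Uses
    (x - B)^r = sum_k C(r, k) (x - A)^k h^(r-k), with h = A - B.
    """
    h = old_origin - new_origin
    m = [1.0] + list(moments)          # m[0] = 1 is the zeroth moment
    return [sum(comb(r, k) * m[k] * h ** (r - k) for k in range(r + 1))
            for r in range(1, len(m))]

# Problem [3]: moments about 1 converted to moments about 4
about_4 = shift_moments([2.6, 10.2, 43.4, 192.6], old_origin=1, new_origin=4)
print([round(v, 2) for v in about_4])
```

For the data of problem [3] the result agrees with the hand calculation: –0.4, 3.6, –5.2 and 22.8.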
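[4] The first three moments of a distribution about the value 7, calculated from a set of
9 observations, are 0.2, 19.4 and –41.0. Find the measures of central tendency and dispersion,
and also the third moment about the origin.
[I.C.W.A 1975]
Solution :
\[ \frac{\sum (x - 7)}{n} = 0.2, \quad\text{or}\quad \frac{\sum x}{n} - 7 = 0.2, \quad\text{or}\quad \frac{\sum x}{n} = 7.2 = \bar{x} \]
\[ \frac{\sum (x - 7)^2}{n} = 19.4, \quad\text{or}\quad \frac{\sum (x^2 - 14x + 49)}{n} = 19.4, \quad\text{or}\quad \frac{\sum x^2}{n} - \frac{14\sum x}{n} + 49 = 19.4 \]
\[ \frac{\sum x^2}{n} - 14 \times 7.2 + 49 = 19.4, \quad\text{or}\quad \frac{\sum x^2}{n} = 19.4 + 100.8 - 49 = 120.2 - 49 = 71.2 \]
\[ \sigma^2 = \frac{1}{n}\sum (x - \bar{x})^2 = \frac{1}{n}\sum (x^2 - 2x\bar{x} + \bar{x}^2) = \frac{\sum x^2}{n} - 2\bar{x}^2 + \bar{x}^2 = \frac{\sum x^2}{n} - \bar{x}^2 \]
\[ = 71.2 - 7.2^2 = 71.2 - 51.84 = 19.36, \qquad \therefore \sigma = \sqrt{19.36} = 4.4 \]
\[ \frac{\sum (x - 7)^3}{n} = -41, \quad\text{or}\quad \frac{\sum (x^3 - 21x^2 + 147x - 343)}{n} = -41 \]
\[ \text{or}\quad \frac{\sum x^3}{n} - \frac{21\sum x^2}{n} + \frac{147\sum x}{n} - 343 = -41, \quad\text{or}\quad \frac{\sum x^3}{n} - 21 \times 71.2 + 147 \times 7.2 - 343 = -41 \]
\[ \text{or}\quad \frac{\sum x^3}{n} = -41 + 1495.2 - 1058.4 + 343 = 1838.20 - 1099.40 = 738.80 \]
Hence, A.M. = 7.2
S.D. = 4.4
Third moment about origin = 738.80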
[5] Find the first, the second and the third central moments of the frequency
distribution of expenditure (Rs. per month) given below:
Expenditure       3–6   6–9   9–12   12–15   15–18   18–21   21–24   Total
No. of families    28   292   389     212      59      18       2     1000
[I.C.W.A.1978]
Solution :
Table : Calculations for moments
Mid-value (x)     f     y = (x − 13.5)/3     fy      fy²      fy³      fy⁴    f(y+1)⁴
 4.5             28          −3             −84      252     −756     2268      448
 7.5            292          −2            −584     1168    −2336     4672      292
10.5            389          −1            −389      389     −389      389        0
13.5            212           0               0        0        0        0      212
16.5             59           1              59       59       59       59      944
19.5             18           2              36       72      144      288     1458
22.5              2           3               6       18       54      162      512
Total          1000           –            −956     1958    −3224     7838     3866
Charlier's check :–
\[ \sum f(y + 1)^4 = \sum fy^4 + 4\sum fy^3 + 6\sum fy^2 + 4\sum fy + N \]
\[ = 7838 + 4 \times (-3224) + 6 \times 1958 + 4 \times (-956) + 1000 = 7838 - 12896 + 11748 - 3824 + 1000 = 20586 - 16720 = 3866 = \text{L.H.S.} \]
Raw moments of y :–
\[ m_1' = \frac{\sum fy}{N} = \frac{-956}{1000} = -0.956, \qquad m_2' = \frac{\sum fy^2}{N} = \frac{1958}{1000} = 1.958 \]
\[ m_3' = \frac{\sum fy^3}{N} = \frac{-3224}{1000} = -3.224, \qquad m_4' = \frac{\sum fy^4}{N} = \frac{7838}{1000} = 7.838 \]
Central moments of y :–
\[ m_2 = m_2' - m_1'^2 = 1.958 - (-0.956)^2 = 1.958 - 0.914 = 1.044 \]
\[ m_3 = m_3' - 3m_2' m_1' + 2m_1'^3 = -3.224 - 3 \times 1.958 \times (-0.956) + 2(-0.956)^3 = -3.224 + 5.616 - 1.747 = 0.644 \]
\[ m_4 = m_4' - 4m_3' m_1' + 6m_2' m_1'^2 - 3m_1'^4 = 7.838 - 4 \times (-3.224) \times (-0.956) + 6 \times 1.958 \times 0.914 - 3 \times (0.956)^4 \]
\[ = 7.838 - 12.33 + 10.74 - 2.51 = 3.74 \]
Central moments of x :–
\[ m_1 = 0 \]
\[ m_2(x) = d^2 m_2(y) = 3^2 \times 1.044 = 9 \times 1.044 = 9.40 \]
\[ m_3(x) = d^3 m_3(y) = 3^3 \times 0.644 = 27 \times 0.644 = 17.4 \]
Therefore,
1st central moment = 0
2nd central moment = 9.40
3rd central moment = 17.4 Ans.
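A direct computation from the mid-values, without coding to y, is a useful independent check on the step-deviation working. The Python sketch below does this; the function name `central_moments` and its `orders` parameter are illustrative assumptions.

```python
def central_moments(midpoints, freqs, orders=(2, 3)):
    """Central moments of a grouped frequency distribution, computed
    directly about the mean of the mid-values."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return {r: sum(f * (x - mean) ** r for x, f in zip(midpoints, freqs)) / n
            for r in orders}

# Expenditure distribution of problem [5]
mids = [4.5, 7.5, 10.5, 13.5, 16.5, 19.5, 22.5]
freqs = [28, 292, 389, 212, 59, 18, 2]
m = central_moments(mids, freqs)
print(round(m[2], 2), round(m[3], 2))
```

Carried to full precision, the second central moment is about 9.40 and the third about 17.39; small differences from a hand result usually trace back to early rounding of m1'² to two decimals.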
[6] Find the first four moments and the values of β1 and β2 from the following
frequency distribution:
x    21–24   25–28   29–32   33–36   37–40   41–44
f      40      90     190     110      50      20
Also, find the measures of skewness and kurtosis.
Solution :
Table : Calculations for moments
Mid-value (x)     f     y = (x − 30.5)/4     fy     fy²     fy³     fy⁴
22.5             40          −2             −80     160    −320     640
26.5             90          −1             −90      90     −90      90
30.5            190           0               0       0       0       0
34.5            110           1             110     110     110     110
38.5             50           2             100     200     400     800
42.5             20           3              60     180     540    1620
Total           500           –             100     740     640    3260
Raw moments of y :–
\[ m_1' = \frac{\sum fy}{N} = \frac{100}{500} = 0.2, \qquad m_2' = \frac{\sum fy^2}{N} = \frac{740}{500} = 1.48 \]
\[ m_3' = \frac{\sum fy^3}{N} = \frac{640}{500} = 1.28, \qquad m_4' = \frac{\sum fy^4}{N} = \frac{3260}{500} = 6.52 \]
Central moments of y :–
\[ m_2 = m_2' - m_1'^2 = 1.48 - (0.2)^2 = 1.48 - 0.04 = 1.44 \]
\[ m_3 = m_3' - 3m_2' m_1' + 2m_1'^3 = 1.28 - 3 \times 1.48 \times 0.2 + 2(0.2)^3 = 1.28 - 0.888 + 0.016 = 0.408 \]
\[ m_4 = m_4' - 4m_3' m_1' + 6m_2' m_1'^2 - 3m_1'^4 = 6.52 - 4 \times 1.28 \times 0.2 + 6 \times 1.48 \times (0.2)^2 - 3(0.2)^4 \]
\[ = 6.52 - 1.024 + 0.3552 - 0.0048 = 6.8752 - 1.0288 = 5.8464 \]
Central moments of x :–
\[ m_2(x) = d^2 m_2(y) = 4^2 \times 1.44 = 23.04 \]
\[ m_3(x) = d^3 m_3(y) = 4^3 \times 0.408 = 26.11 \]
\[ m_4(x) = d^4 m_4(y) = 4^4 \times 5.8464 = 1496.68 \]
\[ \bar{x} = c + d\bar{y} = 30.5 + 4 \times \frac{100}{500} = 30.5 + 0.80 = 31.30 \]
Since β1 and β2 are independent of origin and scale, they may be computed from the moments of y:
\[ \beta_1 = \frac{m_3^2}{m_2^3} = \frac{0.408^2}{1.44^3} = \frac{0.1665}{2.99} = 0.056 \]
\[ \beta_2 = \frac{m_4}{m_2^2} = \frac{5.8464}{1.44^2} = 2.82 \]
\[ \text{Skewness } (\gamma_1) = \sqrt{\beta_1} = \sqrt{0.056} = +0.24 \]
\[ \text{Kurtosis } (\gamma_2) = \beta_2 - 3 = 2.82 - 3 = -0.18 \]
[7] Find the coefficient of skewness = (Mean – Mode)/S.D. :–
Marks        55–58   58–61   61–64   64–67   67–70
Frequency      12      17      23      18      11
Solution :
Table : Calculation for coefficient of skewness
Class Marks    Mid-value (x)     f     y = (x − 62.50)/3     fy     fy²
55–58             56.50         12          −2              −24      48
58–61             59.50         17          −1              −17      17
61–64             62.50         23           0                0       0
64–67             65.50         18           1               18      18
67–70             68.50         11           2               22      44
Total               –           81           –               −1     127
\[ \text{Mean } (\bar{x}) = c + d\bar{y} = 62.50 + 3 \times \frac{-1}{81} = 62.50 - \frac{3}{81} = 62.50 - 0.04 = 62.46 \]
\[ \text{S.D. } (\sigma) = d\sqrt{\frac{\sum fy^2}{n} - \left(\frac{\sum fy}{n}\right)^2} = 3\sqrt{\frac{127}{81} - \left(\frac{-1}{81}\right)^2} = 3\sqrt{1.57} = 3.76 \]
\[ \text{Mode} = l_1 + \frac{f_0 - f_{-1}}{2f_0 - f_{-1} - f_1} \times c = 61 + \frac{23 - 17}{2 \times 23 - 17 - 18} \times 3 = 61 + \frac{6}{46 - 17 - 18} \times 3 = 61 + \frac{18}{11} = 62.64 \]
\[ \text{Coefficient of skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}} = \frac{62.46 - 62.64}{3.76} = \frac{-0.18}{3.76} = -0.048 \]
[8] Calculate Pearson's measure of skewness on the basis of Mean, Mode and
Standard deviation :–
x   14.5   15.5   16.5   17.5   18.5   19.5   20.5   21.5
f    35     40     48    100    125     87     43     22
[C.A. 1975]
Solution :
Table : Calculations for Mean, Mode and S.D.
Class boundary     x       f     y = x − 18.5      fy      fy²
14–15            14.5     35         −4          −140      560
15–16            15.5     40         −3          −120      360
16–17            16.5     48         −2           −96      192
17–18            17.5    100         −1          −100      100
18–19            18.5    125          0             0        0
19–20            19.5     87          1            87       87
20–21            20.5     43          2            86      172
21–22            21.5     22          3            66      198
Total              –     500          –          −217     1669
\[ \bar{x} = c + d\bar{y} = 18.5 + \left(\frac{-217}{500}\right) = 18.5 - 0.43 = 18.07 \]
\[ \text{Mode} = l_1 + \frac{f_0 - f_{-1}}{2f_0 - f_{-1} - f_1} \times c = 18 + \frac{125 - 100}{2 \times 125 - 100 - 87} \times 1 = 18 + \frac{25}{250 - 187} = 18 + \frac{25}{63} = 18 + 0.40 = 18.40 \]
\[ \text{S.D.} = \sqrt{\frac{\sum fy^2}{n} - \left(\frac{\sum fy}{n}\right)^2} = \sqrt{\frac{1669}{500} - \left(\frac{-217}{500}\right)^2} = \sqrt{3.34 - 0.19} = \sqrt{3.15} = 1.77 \]
\[ \text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}} = \frac{18.07 - 18.40}{1.77} = -0.19 \]
[9] Calculate from the undernoted table the measure of skewness based on Mean,
Median and Standard Deviation :
x 100–200 200–300 300–400 400–500 500–600 600–700 700–800 800–900
y 45 88 146 206 79 52 30 14
[C.A. 1973]
Solution :
Table : Calculations for Mean, Median and S.D.
Class (x)   f     Mid-value   y = (x − 450)/100   fy     fy²    Cumulative frequency
100–200     45    150         -3                  -135   405    45
200–300     88    250         -2                  -176   352    133
300–400     146   350         -1                  -146   146    279
400–500     206   450         0                   0      0      485   ← N/2 = 330
500–600     79    550         1                   79     79     564
600–700     52    650         2                   104    208    616
700–800     30    750         3                   90     270    646
800–900     14    850         4                   56     224    660
Total       660   -           -                   -128   1684   -

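$$\text{Mean } (\bar{x}) = c + d\,\bar{y} = 450 + 100 \times \frac{-128}{660} = 450 - 19.39 = 430.61$$
$$\text{Median} = l_1 + \frac{\frac{N}{2} - F}{f_m} \times c = 400 + \frac{330 - 279}{206} \times 100 = 400 + \frac{51}{206} \times 100 = 400 + 24.76 = 424.76$$
$$\text{S.D.} = d\sqrt{\frac{\sum fy^2}{n} - \left(\frac{\sum fy}{n}\right)^2} = 100\sqrt{\frac{1684}{660} - \left(\frac{-128}{660}\right)^2} = 100\sqrt{2.55 - 0.04} = 100\sqrt{2.51} = 100 \times 1.58 = 158$$
$$\text{Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}} = \frac{3(430.61 - 424.76)}{158} = \frac{3 \times 5.85}{158} = 0.11$$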
[10] Calculate the measure of skewness based on quartiles and median from the
following data :
Variable 10–20 20–30 30–40 40–50 50–60 60–70 70–80
Frequency 358 2417 976 129 62 18 10
[C.A. 1972]
Solution :
Table : Calculation for Quartiles and Median
Class boundary   Frequency (f)   Cumulative frequency
10–20            358             358
20–30            2417            2775    ← N/4 = 992.50 (Q1 class); N/2 = 1985 (Q2 class)
30–40            976             3751    ← 3N/4 = 2977.50 (Q3 class)
40–50            129             3880
50–60            62              3942
60–70            18              3960
70–80            10              3970 = N
Total            3970            –
$$\text{Median } (Q_2) = l_1 + \frac{\frac{N}{2} - F}{f_m} \times c = 20 + \frac{1985 - 358}{2417} \times 10 = 20 + \frac{1627}{2417} \times 10 = 20 + 6.73 = 26.73$$
$$\text{First Quartile } (Q_1) = 20 + \frac{992.50 - 358}{2417} \times 10 = 20 + \frac{634.50}{2417} \times 10 = 20 + 2.63 = 22.63$$
$$\text{Third Quartile } (Q_3) = 30 + \frac{2977.50 - 2775}{976} \times 10 = 30 + \frac{202.50}{976} \times 10 = 30 + 2.07 = 32.07$$
$$\text{Skewness} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{32.07 - 2 \times 26.73 + 22.63}{32.07 - 22.63} = \frac{32.07 - 53.46 + 22.63}{9.44} = \frac{54.70 - 53.46}{9.44} = 0.13$$
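The quartile interpolation used above is the same for every problem of this kind, so it can be sketched once in Python. The names `grouped_quantile` and `bowley` are assumptions introduced here for illustration; the data are those of problem [10].

```python
def grouped_quantile(bounds, freq, p):
    """Quantile at proportion p (0 < p < 1) by linear interpolation.

    bounds holds the class boundaries, so len(bounds) == len(freq) + 1.
    Unequal class widths are allowed.
    """
    target = p * sum(freq)
    cum = 0  # cumulative frequency below the current class
    for i, f in enumerate(freq):
        if cum + f >= target:
            width = bounds[i + 1] - bounds[i]
            return bounds[i] + (target - cum) / f * width
        cum += f
    raise ValueError("p must lie between 0 and 1")


def bowley(bounds, freq):
    """Bowley's quartile coefficient of skewness."""
    q1, q2, q3 = (grouped_quantile(bounds, freq, p) for p in (0.25, 0.5, 0.75))
    return (q3 - 2 * q2 + q1) / (q3 - q1)


bounds = [10, 20, 30, 40, 50, 60, 70, 80]
freq = [358, 2417, 976, 129, 62, 18, 10]
q1 = grouped_quantile(bounds, freq, 0.25)
print(round(q1, 2), round(bowley(bounds, freq), 2))
```

Because the class width is read from the boundaries of each class, the same sketch should also cover tables whose classes are not all of equal width.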
[11] Calculate the coefficient of skewness, based on quartiles:
Class Limits 10–19 20–29 30–39 40–49 50–59 60–69 70–79 80–89
Frequency 5 9 14 20 25 15 8 4
[I.C.W.A. 1976]
Solution :
Table : Calculations for Quartiles
Class boundary   Frequency (f)   Cumulative frequency
9.5–19.5         5               5
19.5–29.5        9               14
29.5–39.5        14              28      ← N/4 = 25 (Q1 class)
39.5–49.5        20              48
49.5–59.5        25              73      ← N/2 = 50 (Q2 class)
59.5–69.5        15              88      ← 3N/4 = 75 (Q3 class)
69.5–79.5        8               96
79.5–89.5        4               100 = N
Total            100
$$Q_1 = 29.5 + \frac{25 - 14}{14} \times 10 = 29.5 + \frac{11}{14} \times 10 = 29.5 + 7.86 = 37.36$$
$$\text{Median } (Q_2) = 49.5 + \frac{50 - 48}{25} \times 10 = 49.5 + \frac{2}{25} \times 10 = 49.5 + 0.80 = 50.30$$
$$Q_3 = 59.5 + \frac{75 - 73}{15} \times 10 = 59.5 + \frac{2}{15} \times 10 = 59.5 + 1.33 = 60.83$$
$$\text{Skewness} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{60.83 - 2 \times 50.30 + 37.36}{60.83 - 37.36} = \frac{60.83 - 100.60 + 37.36}{23.47} = \frac{98.19 - 100.60}{23.47} = -\frac{2.41}{23.47} = -0.103$$
[12] Calculate Bowley’s measure of skewness from the following :

x 10–14 15–19 20–29 30–39 40–49 50–59
f 786 924 320 172 96 32
[C.A. 1975]

Solution :
Table : Calculation for Q1, Q2, and Q3
Class boundary   Frequency (f)   Cumulative Frequency
0–9.5            0               0
9.5–14.5         786             786     ← N/4 = 582.50 (Q1 class)
14.5–19.5        924             1710    ← N/2 = 1165 (Q2 class)
19.5–29.5        320             2030    ← 3N/4 = 1747.50 (Q3 class)
29.5–39.5        172             2202
39.5–49.5        96              2298
49.5–59.5        32              2330 = N
Total            2330
$$Q_1 = 9.5 + \frac{582.5 - 0}{786} \times 5 = 9.5 + 3.71 = 13.21$$
$$Q_2 = 14.5 + \frac{1165 - 786}{924} \times 5 = 14.5 + \frac{379}{924} \times 5 = 14.5 + 2.05 = 16.55$$
$$Q_3 = 19.5 + \frac{1747.5 - 1710}{320} \times 10 = 19.5 + \frac{37.5}{320} \times 10 = 19.5 + 1.17 = 20.67$$
$$\text{Skewness} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{20.67 - 2 \times 16.55 + 13.21}{20.67 - 13.21} = \frac{20.67 - 33.10 + 13.21}{7.46} = \frac{33.88 - 33.10}{7.46} = \frac{0.78}{7.46} = 0.105$$
[13] Compute Bowley’s measure and Pearson’s measures of skewness:
Monthly income (Rs.) 0–75 75–150 150–225 225–300 300–375 375–450
Frequency 15 200 250 225 10 5
[C.U., M.Com. '76]
Solution :
Table : Calculation for Bowley’s and Pearson's measure of skewness
Class boundary   Frequency (f)   Mid-value (x)   c.f.      y = (x − 262.5)/75   fy     fy²
0 – 75           15              37.50           15        -3                   -45    135
75 – 150         200             112.50          215       -2                   -400   800
150 – 225        250             187.50          465       -1                   -250   250
225 – 300        225             262.50          690       0                    0      0
300 – 375        10              337.50          700       1                    10     10
375 – 450        5               412.50          705 = N   2                    10     20
Total            705             -               -         -                    -675   1215
 675 
x = 262.5 +  −  × 75 = 262.5–71.81=190.69
 705 
352.5 − 215 137.50
Median = 150 + × 75 = 150 + × 75 = 150 + 41.25 = 191.25
250 250
250 − 200 50 × 75
Mode = 150 + × 75 = 150 + = 200
500 − 200 − 225 75
2
1215  675 
σ = 75 − − = 75 1.7234 − 0.9167 = 75 0.806 = 0.8982 × 75 = 67.36
705  705 
176.25 − 15 161.25
Q1 = 75 + × 75 = 75 + × 75 = 75 + 60.47 = 135.47
200 200
528.75 − 465 63.75
Q3 = 225 + × 75 = 225 + = 225 + 21.25 =246.25
225 3
Q − 2Q2 + Q1 246.25 − 2 × 191.25 + 135.47
Bowley's measure of skewness = 3 =
Q3 − Q1 246.25 − 135.47
381.75 − 382.50 0.78
= =− = −.0070
110.78 110.78
www.gayali.in
Mean − Mode 190.69 − 200 9.31
Pearson's measure of skewness = = =− = −0.14
S.D. 67.36 67.36
[14] Calculate the quartile measure of skewness for the distribution of the time taken
by 100 workers to complete a job:
Time (Seconds) -12 13-15 16-18 19-21 22-24 25-27 28-
No. of workers 4 16 22 28 15 9 6
[I.C.W.A 1974]
Solution:
Table : Calculation for skewness
Class boundary   Frequency   Cumulative Frequency
– 12.5           4           4
12.5 – 15.5      16          20
15.5 – 18.5      22          42
18.5 – 21.5      28          70
21.5 – 24.5      15          85
24.5 – 27.5      9           94
27.5 –           6           100 = N
Total            100
$$\text{First Quartile } (Q_1) = 15.5 + \frac{25 - 20}{22} \times 3 = 15.5 + \frac{5}{22} \times 3 = 15.5 + 0.68 = 16.18$$
$$\text{Second Quartile } (Q_2) = 18.5 + \frac{50 - 42}{28} \times 3 = 18.5 + \frac{8}{28} \times 3 = 18.5 + 0.86 = 19.36$$
$$\text{Third Quartile } (Q_3) = 21.5 + \frac{75 - 70}{15} \times 3 = 21.5 + \frac{5}{15} \times 3 = 21.5 + 1 = 22.5$$
$$\text{Skewness (Bowley's measure)} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{22.5 - 2 \times 19.36 + 16.18}{22.5 - 16.18} = \frac{22.5 - 38.72 + 16.18}{6.32} = \frac{38.68 - 38.72}{6.32} = -\frac{0.04}{6.32} = -0.0063$$
[15] Find the appropriate measure of skewness from the following:

Age (years) Below20 20 – 25 25 - 30 30 - 35 35 - 40 40 - 45 45 - 55 55 & above
No of employees 13 29 46 60 112 94 45 21
[I.C.W.A. 1975]
Solution:
Table: Calculation for skewness
Class boundary   Frequency   Cumulative Frequency
Below 20         13          13
20 – 25          29          42
25 – 30          46          88
30 – 35          60          148     ← N/4 = 105 (Q1 class)
35 – 40          112         260     ← N/2 = 210 (Q2 class)
40 – 45          94          354     ← 3N/4 = 315 (Q3 class)
45 – 55          45          399
55 & above       21          420 = N
Total            420
$$Q_1 = 30 + \frac{105 - 88}{60} \times 5 = 30 + \frac{17}{60} \times 5 = 30 + 1.42 = 31.42$$
$$Q_2 = 35 + \frac{210 - 148}{112} \times 5 = 35 + \frac{62}{112} \times 5 = 35 + 2.77 = 37.77$$
$$Q_3 = 40 + \frac{315 - 260}{94} \times 5 = 40 + \frac{55 \times 5}{94} = 40 + 2.93 = 42.93$$
$$\text{Bowley's measure of skewness} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{42.93 - 2 \times 37.77 + 31.42}{42.93 - 31.42} = \frac{42.93 - 75.54 + 31.42}{11.51} = \frac{74.35 - 75.54}{11.51} = -\frac{1.19}{11.51} = -0.103$$
[16] Compute an appropriate measure of skewness from:

Marks (Per cent) Under 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 – 90
No. of students 45 40 24 12 9 3 2
[C.U.M.Com 1975]
Solution:
Class boundary   Frequency   Cumulative frequency
Under 30         45          45      ← N/4 = 33.75 (Q1 class)
30 – 40          40          85      ← N/2 = 67.50 (Q2 class)
40 – 50          24          109     ← 3N/4 = 101.25 (Q3 class)
50 – 60          12          121
60 – 70          9           130
70 – 80          3           133
80 – 90          2           135 = N
Total            135
Taking the first class as 0 – 30,
$$Q_1 = 0 + \frac{33.75 - 0}{45} \times 30 = \frac{33.75}{45} \times 30 = 22.50$$
$$Q_2 = 30 + \frac{67.5 - 45}{40} \times 10 = 30 + \frac{22.5}{4} = 35.63$$
$$Q_3 = 40 + \frac{101.25 - 85}{24} \times 10 = 40 + \frac{16.25}{24} \times 10 = 40 + 6.77 = 46.77$$
$$\text{Skewness (Bowley's measure)} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{46.77 - 2 \times 35.63 + 22.50}{46.77 - 22.50} = \frac{46.77 - 71.26 + 22.50}{24.27} = \frac{69.27 - 71.26}{24.27} = -\frac{1.99}{24.27} = -0.082$$
[17] Calculate with the use of quartiles the coefficient of skewness for the following
frequency distribution:-
Under years 10 20 30 40 50 60
No. of persons 15 32 51 78 97 109
[C.U., M.Com 1961]
Solution:
Class boundary   Cumulative Frequency   Frequency
0 – 10           15                     15
10 – 20          32                     17      ← N/4 = 27.25 (Q1 class)
20 – 30          51                     19
30 – 40          78                     27      ← N/2 = 54.50 (Q2 class)
40 – 50          97                     19      ← 3N/4 = 81.75 (Q3 class)
50 – 60          109 = N                12
Total                                   109
$$\text{First Quartile } (Q_1) = 10 + \frac{27.25 - 15}{17} \times 10 = 10 + \frac{12.25 \times 10}{17} = 17.21$$
$$\text{Second Quartile } (Q_2) = 30 + \frac{54.50 - 51}{27} \times 10 = 30 + \frac{3.5 \times 10}{27} = 30 + 1.3 = 31.3$$
$$\text{Third Quartile } (Q_3) = 40 + \frac{81.75 - 78}{19} \times 10 = 40 + \frac{3.75}{19} \times 10 = 40 + 1.97 = 41.97$$
$$\text{Skewness (Bowley's measure)} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{41.97 - 2 \times 31.3 + 17.21}{41.97 - 17.21} = \frac{41.97 - 62.6 + 17.21}{24.76} = \frac{59.18 - 62.6}{24.76} = -\frac{3.42}{24.76} = -0.14$$
[18] Coefficient of skewness = -0.375, Mean = 62, Median = 65 Find the value of
standard deviation.
[C.A. 1972]
Solution:
$$\text{Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}}$$
$$\text{Or, } -0.375 = \frac{3(62 - 65)}{\text{S.D.}}$$
$$\text{Or, S.D.} = \frac{-3 \times 3}{-0.375} = \frac{9000}{375} = 24$$
[19] The measure of skewness for a certain distribution is –0.8. If the lower and the
upper quartiles are 44.1 and 56.6 respectively, find the median.
[I.C.W.A. 1971]
Solution:
Here, skewness = – 0.8
Q1 = 44.1, Q3 = 56.6
$$-0.8 = \frac{56.6 - 2Q_2 + 44.1}{56.6 - 44.1} = \frac{100.7 - 2Q_2}{12.5}$$
Or, 100.7 – 2Q2 = –0.8 × 12.5
Or, 2Q2 = 100.7 + 10 = 110.7
$$Q_2 = \frac{110.7}{2} = 55.35$$
[20] The median, mode and coefficient of skewness for a certain distribution are respectively
17.4, 15.3 and 0.35. Calculate the coefficient of variation.
[I.C.W.A. 1973]
Solution:
$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$
We know, Mean – Mode = 3 (Mean – Median)
Mean – 15.3 = 3 (Mean – 17.4)
= 3 Mean – 52.2
Or, 2 Mean = 52.2 – 15.3 = 36.9
Mean = 18.45
$$\text{Again, } 0.35 = \frac{18.45 - 15.3}{\text{S.D.}} = \frac{3.15}{\text{S.D.}}$$
$$\text{Or, S.D.} = \frac{3.15}{0.35} = 9$$
$$\text{Coefficient of variation} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{9}{18.45} \times 100 = 49 \text{ (approx.)}$$
[21] The mean, median and the coefficient of variation of the weekly wages of a
group of workers are respectively Rs.45, Rs.42 and 40%. Find the (i) mode (ii)
variance, and (iii) coefficient of skewness, for the distribution of wages.
Solution:
Here given, Mean = 45, Median = 42, C.V. = 40
$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100$$
$$\text{Or, } 40 = \frac{\text{S.D.}}{45} \times 100$$
$$\text{Or, S.D.} = \frac{40 \times 45}{100} = 18$$
$$\text{(iii) Coefficient of skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}} = \frac{3(45 - 42)}{18} = \frac{9}{18} = 0.5$$
(ii) Variance = (S.D.)² = 18² = 324
$$\text{(i) Coefficient of skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$
$$0.5 = \frac{45 - \text{Mode}}{18}$$
Or, 9.0 = 45 – Mode
Mode = 45 – 9 = 36
[22] You are given Mean = 50, C.V. = 40%, SK = –0.4. Find the S.D., Mode and
Median.
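[C.A. 1972]
Solution:
$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100$$
$$\text{Or, } 40 = \frac{\text{S.D.}}{50} \times 100 \quad \text{or, S.D.} = 20$$
$$\text{SK} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$
$$-0.4 = \frac{50 - \text{Mode}}{20}$$
Or, 50 – Mode = –8
Or, Mode = 58
$$\text{Again, } -0.4 = \frac{3(50 - \text{Median})}{20}$$
–8 = 150 – 3 Median
Or, 3 Median = 150 + 8 = 158
$$\text{Median} = \frac{158}{3} = 52.67$$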
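Problems [18] to [22] all rearrange the same three relations: C.V. = S.D./Mean × 100, Skewness = (Mean − Mode)/S.D., and Mean − Mode = 3(Mean − Median). A small Python sketch of that rearrangement, using the figures of problem [22], is given below; the function name `from_summary` is an assumption made for this illustration.

```python
def from_summary(mean, cv_percent, skewness):
    """Recover S.D., mode and median from mean, C.V. (%) and skewness.

    Assumes skewness = (mean - mode) / sd and the empirical relation
    mean - mode = 3 * (mean - median).
    """
    sd = cv_percent * mean / 100
    mode = mean - skewness * sd
    median = mean - (mean - mode) / 3
    return sd, mode, median


sd, mode, median = from_summary(mean=50, cv_percent=40, skewness=-0.4)
print(round(sd, 2), round(mode, 2), round(median, 2))
```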
[23] The first two moments of a distribution about the value 4 of the variable are –1.5
and 2.7. It is also known that the median of the distribution is 2.1. Comment on the
shape of the distribution.
[I.C.W.A. 1974]
Solution:
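1st moment about A = ∑(x – A)/n. Here A = 4, so
$$\frac{\sum (x - 4)}{n} = -1.5$$
$$\text{or, } \frac{\sum x}{n} - 4 = -1.5$$
$$\frac{\sum x}{n} = 4 - 1.5 = 2.5 = \bar{x}$$
2nd moment about A = ∑(x – A)²/n
$$\frac{\sum (x - 4)^2}{n} = 2.7$$
$$\text{or, } \frac{\sum (x^2 - 8x + 16)}{n} = 2.7$$
$$\frac{\sum x^2}{n} - \frac{8\sum x}{n} + 16 = 2.7$$
$$\frac{\sum x^2}{n} = 2.7 - 16 + 8 \times 2.5 = 2.7 - 16 + 20 = 22.7 - 16 = 6.7$$
$$\sigma^2 = \frac{\sum (x^2 - 2x\bar{x} + \bar{x}^2)}{n} = \frac{\sum x^2}{n} - 2\bar{x}^2 + \bar{x}^2 = \frac{\sum x^2}{n} - \bar{x}^2 = 6.7 - (2.5)^2 = 0.45$$
$$\text{S.D. } (\sigma) = \sqrt{0.45} = 0.67$$
$$\text{Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}} = \frac{3(2.5 - 2.1)}{0.67} = \frac{3 \times 0.4}{0.67} = \frac{1.2}{0.67} = 1.79$$
Comment: Frequency curve is asymmetrical and has the longer tail on the right hand side.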
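The algebra in this solution reduces to two shortcuts: Mean = A + (first moment about A) and Variance = (second moment about A) − (first moment about A)². The Python sketch below applies them to the figures of problem [23]; the function name `summary_from_moments` is an assumption introduced here.

```python
from math import sqrt


def summary_from_moments(origin, m1, m2):
    """Mean and variance from the first two moments about an arbitrary origin."""
    mean = origin + m1
    variance = m2 - m1 ** 2
    return mean, variance


mean, variance = summary_from_moments(origin=4, m1=-1.5, m2=2.7)
sd = sqrt(variance)
skewness = 3 * (mean - 2.1) / sd  # the median 2.1 is given in the problem
print(round(mean, 2), round(variance, 2), round(sd, 2), round(skewness, 2))
```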
[24] The following facts were gathered before and after an industrial dispute:
                            Before dispute   After dispute
No. of workers employed     516              508
Mean wages (Rs.)            49.50            51.75
Median wages (Rs.)          52.70            50.00
Variance of wages (Rs.)     100.00           121.00
Compare the position before and after the dispute in respect of (a) total wages, (b) modal
wages, (c) standard deviation, (d) coefficient of variation, (e) skewness.
Solution:
[a] Before the dispute, total wages = 49.50 × 516 = Rs. 25542
After the dispute, total wages = 51.75 × 508 = Rs. 26289
Total wages increased by Rs. 747.00, i.e. by about 2.9%.
[c] Before dispute: S.D. = √100 = 10
After dispute: S.D. = √121 = 11
Hence, S.D. has increased.
$$\text{[e] Before dispute, skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}} = \frac{3(49.50 - 52.70)}{10} = \frac{3 \times (-3.2)}{10} = \frac{-9.6}{10} = -0.96$$
$$\text{After dispute, skewness} = \frac{3(51.75 - 50)}{11} = \frac{3 \times 1.75}{11} = 0.48$$
$$\text{[b] Before dispute: skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$
$$-0.96 = \frac{49.5 - \text{Mode}}{10}$$
or, –9.6 = 49.5 – Mode
Mode = 49.5 + 9.6 = 59.1
$$\text{After dispute: } 0.48 = \frac{51.75 - \text{Mode}}{11}$$
or, 5.28 = 51.75 – Mode
Mode = 51.75 – 5.28 = 46.47
$$\text{[d] C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100$$
$$\text{Before dispute: C.V.} = \frac{10}{49.5} \times 100 = 20.2$$
$$\text{After dispute: C.V.} = \frac{11}{51.75} \times 100 = 21.26$$
CURVE FITTING AND METHOD OF LEAST SQUARES
Curve fitting
When observations in respect of two variables are available, very often a
relation is found to exist between them. For example, height and weight of persons
are interdependent, expenditure depends on income, yield of a crop depends on the
amount of rainfall, etc. Frequently, it is found desirable to express this relationship
between variables by means of some mathematical equation, representing a certain
geometrical curve. The process of finding such a curve or its equation on the basis of
a given set of observations is called curve fitting.
We list below the equations of some common types of curves :
[1] y = a + bx Straight line
[2] y = a + bx + cx² Parabola
[3] y = a + bx + cx² + dx³ Cubic curve
The variables x and y are often referred to as the independent variable and the
dependent variable respectively. (All letters appearing in the above equations,
except x and y, i.e. a, b, c, d, represent constants.)
Straight line
A straight line is the geometrical representation of an equation of the form
Ax + By + C = 0
where A, B, C are constants. When B is not zero, i.e. the equation contains a
term in y, the equation of the straight line can be solved for y, giving
y = (–A/B) x + (–C/B), which is of the form y = a + bx,
where a and b are constants. This is the form we shall generally use to represent
a straight line. Sometimes, the form y = mx + c is also used.
Slope of Straight Line

In the equation y = a + bx, the coefficient of x on the right, viz. b, is called the
‘slope’ of the straight line. The slope may be positive, negative or zero.

Figures – Slope of straight lines: (a) Positive, (b) Negative, (c) Zero, (d) Infinity
When the slope is positive, y increases as x increases; when the slope is negative,
y decreases as x increases; when the slope is zero, y remains constant whatever be
the value of x. The slope represents the amount of change (increase or decrease) in the
value of y for a unit increase in the value of x.
Geometrically, the slope depends on the inclination of the straight line with the
x – axis. Two parallel straight lines have the same slope. When the slope is zero, the
straight line is parallel to x – axis. As the slope increases, the inclination also increases.
When the slope is positive, the straight line is inclined towards the right ; When the
slope is negative, the line is inclined towards the left.
Parabola
A parabola is the geometrical representation of an equation of the form
y = a + bx + cx²
where a, b, c are constants.
Free-hand Method of Curve Fitting
When the given data are plotted as points on a graph paper, it is often possible
to draw a smooth curve through the cluster of points, which appears best to
represent their pattern. The smooth curve so drawn is called a free-hand curve.
It may be noted that the free-hand curve depends entirely on individual
judgement, and may either be a straight line or a curved line. If the pattern of points
is linear, the equation of a straight line of the form y = a + bx is obtained by choosing
two points on the line.
Figures – Free-hand curves: (a) Linear, (b) Curvilinear
Let (1, 5) and (4, 11) be two such points. Substituting for x and y in the equation
y = a + bx, we find two relations 5 = a + b and 11 = a + 4b, solving which we get a = 3
and b = 2. Hence y = 3 + 2x is the equation of the fitted free-hand curve, which may
be used to estimate the value of y for any given value of x. Similarly, for other types
of curves, it is necessary to choose as many points on the smooth curve as there are
constants in the equation.
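The two-point calculation can be written as a few lines of Python. This is only a sketch of the arithmetic in the paragraph above; the function name `line_through` is an assumption made here.

```python
def line_through(p, q):
    """Return (a, b) of the line y = a + b*x passing through points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("the two points must have different x values")
    b = (y2 - y1) / (x2 - x1)  # slope
    a = y1 - b * x1            # intercept
    return a, b


a, b = line_through((1, 5), (4, 11))
print(a, b)  # 3.0 2.0, i.e. y = 3 + 2x
```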
Method of Least Squares
The method of least squares is a device for finding the equation of a specified type
of curve which best fits a given set of observations. The method depends upon the
principle of least squares, which suggests that for the “best-fitting” curve, the sum
of the squares of differences between the observed and the corresponding estimated
values should be the minimum possible.
Suppose we are given n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn)
and it is required to fit a straight line to these data. The general equation of a straight
line, y = a + bx, is taken, where a and b are constants. Any values for a and b would give
a straight line, and once these values are obtained, an estimate of y can be had by
substituting the value of x. That is to say, the estimated values of y when x = x1, x2, ...,
xn would be a + bx1, a + bx2, ..., a + bxn respectively. In order that the equation
y = a + bx gives a good representation of the relationship between x and y, it is desirable
that the estimated values a + bx1, a + bx2, ..., a + bxn are, on the whole, close enough
to the corresponding observed values y1, y2, ..., yn.
Principle of Fitting Straight Line by Least Square Method
x     Observed y   Estimated y = a + bx   Difference = (2) – (3)   (Difference)²
(1)   (2)          (3)                    (4)                      (5)
x1    y1           a + bx1                y1 – a – bx1             (y1 – a – bx1)²
x2    y2           a + bx2                y2 – a – bx2             (y2 – a – bx2)²
...   ...          ...                    ...                      ...
xn    yn           a + bxn                yn – a – bxn             (yn – a – bxn)²
Total –            –                      –                        Σ(yi – a – bxi)²
For the best fitting straight line, therefore, our problem is only to choose such
values of a and b for the equation y = a + bx which will provide estimates of y as close
as possible to the observed values. According to the principle of least squares, the
“best-fitting” equation is interpreted as that which minimizes the sum of the squares
of differences,
Σ(yi – a – bxi)² i.e. (y1 – a – bx1)² + (y2 – a – bx2)² + ... + (yn – a – bxn)²
Figure – Method of Least Squares (Geometrical Interpretation): the line AB, y = a + bx, a plotted point P (xi, yi) and the vertical distance PM
If the pairs of observations (x1, y1), (x2, y2), ..., (xn, yn) are plotted as points
on a graph paper, and all possible straight lines are drawn on it, that straight line will be
considered to be the “best-fitting” for which the sum of the squares of vertical distances
PM between the plotted points P and the line AB is the least.
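To make the principle concrete, the sum of squared vertical distances can be computed for two candidate lines. The sketch below uses the data of Exercise [1](i) of this chapter and its fitted line y = 1.0 + 2.9x; the second line, drawn through the first and last points, is a hypothetical free-hand choice made here only for comparison.

```python
def sum_of_squares(a, b, xs, ys):
    """Sum of squared vertical differences between observed y and a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [2, 5, 6, 8, 9]
ys = [8, 14, 19, 20, 31]

# Least-squares line for these data, as obtained in Exercise [1](i).
sse_least_squares = sum_of_squares(1.0, 2.9, xs, ys)

# Hypothetical free-hand line through the first and last points.
b_free = (31 - 8) / (9 - 2)
a_free = 8 - b_free * 2
sse_free_hand = sum_of_squares(a_free, b_free, xs, ys)

print(round(sse_least_squares, 2), round(sse_free_hand, 2))
```

The least-squares line gives the smaller total, which is what the principle requires of the "best-fitting" line.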
Fitting Straight Line
Let y = a + bx
be the equation of the straight line to be fitted to a given set of n pairs of observations
(x1, y1), (x2, y2), ..., (xn, yn). Applying the method of least squares, the values of a
and b are so determined as to minimize
$$\sum_{i=1}^{n} (y_i - a - bx_i)^2 .$$
Taking partial derivatives with respect to a and b, and equating them to zero, we get
$$\frac{\partial}{\partial a} \sum (y - a - bx)^2 = 0, \qquad \frac{\partial}{\partial b} \sum (y - a - bx)^2 = 0$$
$$\sum (y - a - bx)(-2) = 0$$
$$\sum (y - a - bx)(-2x) = 0 \quad \text{or,} \quad \sum x(y - a - bx) = 0$$
Or, Σy = an + bΣx
Σxy = aΣx + bΣx²
Here, the values of n, Σx, Σy, Σx² and Σxy are substituted on the basis of the given data.
We have then two equations involving a and b, solving which the values of a and b are
obtained.
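The two normal equations can be solved once and for all by elimination, which gives b = (nΣxy − ΣxΣy)/(nΣx² − (Σx)²) and a = (Σy − bΣx)/n. The Python sketch below applies this to the data of Exercise [1](i); the function name `fit_line` is an assumption made here and does not come from the text.

```python
def fit_line(xs, ys):
    """Solve the normal equations of y = a + b*x and return (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b


a, b = fit_line([2, 5, 6, 8, 9], [8, 14, 19, 20, 31])
print(round(a, 2), round(b, 2))
```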
Simplified Calculations
[1] If we change the origin of x only, i.e. write X = x – c, where c is an arbitrary
constant, then x is replaced by X in the equations, giving
Σy = an + bΣX
ΣXy = aΣX + bΣX²
[2] If we change the origins of both x and y, i.e. write X = x – c and Y = y – c’, where
c and c’ are arbitrary constants, then x and y are replaced by X and Y in the equations,
so that
ΣY = an + bΣX
ΣXY = aΣX + bΣX²
[3] In case the successive values of the independent variable x are found to have
a common difference, two special transformations are available for the cases (i) n is
odd and (ii) n is even.
$$\text{[i] When n is odd, write } u = \frac{x - (\text{central value of } x)}{\text{common difference}}$$
$$\text{[ii] When n is even, write } u = \frac{x - (\text{mean of two central values of } x)}{\frac{1}{2}(\text{common difference})}$$
In both cases Σu = 0, so the normal equations can be solved separately for a and b.
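A change of origin alters the arithmetic but not the fitted line. The Python sketch below illustrates this with the data of Exercise [1](iii), where X = x − 12 and Y = y − 36: the line is fitted in the shifted variables, converted back, and compared with a direct fit. The helper `fit_line` is an assumed name, repeated here so that the block runs on its own.

```python
def fit_line(xs, ys):
    """Solve the normal equations of y = a + b*x and return (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return (sy - b * sx) / n, b


x = [4, 6, 8, 12, 15, 17, 22]
y = [46, 42, 40, 36, 30, 25, 19]
c, c_dash = 12, 36  # arbitrary origins for x and y

# Fit Y = A + B*X in the shifted variables.
A, B = fit_line([xi - c for xi in x], [yi - c_dash for yi in y])

# Convert back: y - c' = A + B*(x - c)  =>  y = (c' + A - B*c) + B*x
a_shifted, b_shifted = c_dash + A - B * c, B

a_direct, b_direct = fit_line(x, y)
print(round(a_shifted, 2), round(b_shifted, 2))
```

Both routes should give y = 52 − 1.5x; the shifted route merely keeps the sums small enough for hand calculation.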
Fitting Parabola
Let y = a + bx + cx²
be the equation of the parabola to be fitted to a given set of n pairs of observations
(x1, y1), (x2, y2), ..., (xn, yn). Using the method of least squares, the constants a, b, c of
the best fitting parabola are obtained by solving the normal equations,
Σy = an + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
Σx²y = aΣx² + bΣx³ + cΣx⁴
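The three normal equations form a linear system in a, b and c. A minimal Python sketch, assuming plain Gaussian elimination is acceptable for so small a system, is given below; the function names and the test data (points lying exactly on y = 1 + 2x + 3x²) are illustrative assumptions, not taken from the text.

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_parabola(xs, ys):
    """Return (a, b, c) of y = a + b*x + c*x^2 from the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]                   # s[k] = sum of x^k
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # sum of x^k * y
    matrix = [[s[0], s[1], s[2]],
              [s[1], s[2], s[3]],
              [s[2], s[3], s[4]]]
    return solve_linear(matrix, t)


xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]  # illustrative data on an exact parabola
a, b, c = fit_parabola(xs, ys)
print(round(a, 4), round(b, 4), round(c, 4))
```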
Fitting Exponential and Geometrical Curves
In order to fit curves with equations of the form y = ab^x and y = ax^b, the procedure
is to take logarithms of both sides and then form normal equations. For example, to fit
the exponential curve y = ab^x, we take logarithms of both sides, obtaining
log y = (log a) + x (log b)
This can be written as Y = A + Bx
where Y = log y, A = log a, B = log b. The normal equations are then
ΣY = An + BΣx
ΣxY = AΣx + BΣx²
These equations are solved for A and B, and then taking antilogs, we find the
values of a and b.
Similarly, the geometric curve y = ax^b is written as log y = (log a) + b (log x)
i.e. Y = A + bX
where Y = log y, A = log a, X = log x. The normal equations are
ΣY = An + bΣX
ΣXY = AΣX + bΣX²
These are solved for A and b, and then a = antilog A is obtained.
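The logarithmic transformation for the exponential curve y = ab^x can be sketched in Python as follows. The function name `fit_exponential` and the sample data (which lie exactly on y = 3 × 2^x) are assumptions chosen for illustration; all y values must be positive for the logarithms to exist.

```python
from math import log10


def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on Y = log10(y) = A + B*x."""
    Y = [log10(y) for y in ys]  # requires every y > 0
    n = len(xs)
    sx, sY = sum(xs), sum(Y)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * v for x, v in zip(xs, Y))
    B = (n * sxY - sx * sY) / (n * sxx - sx * sx)
    A = (sY - B * sx) / n
    return 10 ** A, 10 ** B  # antilogs give a and b


a, b = fit_exponential([0, 1, 2, 3], [3, 6, 12, 24])
print(round(a, 4), round(b, 4))
```

Note that this minimizes the squared differences in log y, not in y itself, which is the procedure the text describes.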
Exercises
[1] Fit a straight line of the form y = a + bx to each of the following set of data:
(i)
x   2   5    6    8    9
y   8   14   19   20   31
Solution: Let y = a + bx be the equation of the best fitting straight line by the
method of least squares. The constants a and b are obtained by solving the
normal equations.
Σy = an + bΣx
Σxy = aΣx + bΣx2
Where n is the number of pairs of observations.
Table : Calculations for fitting straight line
x y x2 xy
2 8 4 16
5 14 25 70
6 19 36 114
8 20 64 160
9 31 81 279
Total 30 92 210 639
Putting Σy = 92, n = 5, Σx = 30, Σxy = 639 and Σx² = 210 in the normal equations, we have
92 = 5a + 30b - - - (i)
639 = 30a + 210b - - - (ii)
Multiplying (i) by 30 and (ii) by 5, and subtracting,
2760 = 150a + 900b
3195 = 150a + 1050b
– 435 = – 150b
$$\text{or, } b = \frac{435}{150} = 2.9$$
Putting the value of b in (i), we have
5a = 92 – 30 × 2.9 = 92 – 87 = 5
∴ a = 1
Now, substituting the values of a and b in y = a + bx, the equation of the fitted
straight line is y = 1.0 + 2.9x
(ii)
x   1    3    5    7   9   11
y   16   12   10   7   5   4
Solution:
Let y = a + bx be the equation of the best fitting straight line by the method of
least squares. The constants a and b are obtained by solving the normal equations.
Σy = an + bΣx
Σxy = aΣx + bΣx2

Where n is the number of pairs of observations.

Table : Calculations for Fitting straight line
x y x2 xy
1 16 1 16
3 12 9 36
5 10 25 50
7 7 49 49
9 5 81 45
11 4 121 44
Total 36 54 286 240
Putting Σy = 54, n = 6, Σx = 36, Σxy = 240 and Σx² = 286
in the normal equations, we have
54 = 6a + 36b - - - (i)
240 = 36a + 286b - - - (ii)
Multiplying (i) by 36 and (ii) by 6, and subtracting,
1944 = 216a + 1296b
1440 = 216a + 1716b
504 = – 420b
∴ b = –1.2
Putting the value of b in (i), we have
6a = 54 – 36 × (–1.2)
= 54 + 43.2 = 97.2 ∴ a = 16.2
Now, substituting the values of a and b in y = a + bx, the equation of the fitted
straight line is y = 16.2 – 1.2x
(iii)
x   4    6    8    12   15   17   22
y   46   42   40   36   30   25   19
Solution:
Let Y = a + bX be the equation of the best fitting straight line, where we write
X = x – 12 and Y = y – 36. The normal equations for determining the values of a and
b are ΣY = an + bΣX, ΣXY = aΣX + bΣX²
Table - Calculations for fitting straight line
x y X = x –12 Y = y – 36 X2 XY
4 46 -8 10 64 -80
6 42 -6 6 36 -36
8 40 -4 4 16 -16
12 36 0 0 0 0
15 30 3 -6 9 -18
17 25 5 -11 25 -55
22 19 10 -17 100 -170
Total - - 0 -14 250 -375
Substituting the values ΣY = –14, ΣX = 0, ΣX² = 250, ΣXY = –375 and n = 7 in the
normal equations, we get
–14 = 7a + b × 0 - - - (i)
–375 = a × 0 + 250b - - - (ii)
From (i) we get, a = –2
$$\text{From (ii) we get, } b = -\frac{375}{250} = -1.5$$
Hence, the equation of the best fitting straight line is
Y = a + bX
y – 36 = –2 – 1.5 (x – 12)
or, y – 36 = –2 – 1.5x + 18
or, y = 36 – 2 + 18 – 1.5x
= 54 – 2 – 1.5x
∴ y = 52 – 1.5x
(iv)
x   1.0   1.5   2.0   2.5   3.0   3.5   4.0
y   5.3   5.7   6.3   7.2   8.2   8.7   8.4
Solution:
Let us write the equation of the straight line as Y = a + bX, where $X = \frac{x - 2.5}{0.1}$, $Y = \frac{y - 7.2}{0.1}$.
Using the method of least squares, the normal equations for determining the values of
a and b are ΣY = an + bΣX, ΣXY = aΣX + bΣX²
Table: Calculations for fitting straight line
x   y   X = (x − 2.5)/0.1   Y = (y − 7.2)/0.1   X²   XY
1.0 5.3 -15 -19 225 285
1.5 5.7 -10 -15 100 150
2.0 6.3 -5 -9 25 45
2.5 7.2 0 0 0 0
3.0 8.2 +5 10 25 50
3.5 8.7 10 15 100 150
4.0 8.4 15 12 225 180
Total - - 0 -6 700 860
Substituting the values from the table in the normal equations,
–6 = 7a + b × 0 - - - (i)
860 = a × 0 + 700b - - - (ii)
$$\text{From (i) we get } 7a = -6, \quad a = -\frac{6}{7}$$
$$\text{From (ii) we get } 700b = 860, \quad b = \frac{860}{700} = 1.23$$
The equation of the straight line is, therefore,
Y = a + bX
$$\text{or, } \frac{y - 7.2}{0.1} = -\frac{6}{7} + 1.23\left(\frac{x - 2.5}{0.1}\right)$$
$$\text{or, } \frac{y - 7.2}{0.1} = -0.86 + \frac{1.23x - 3.075}{0.1} = \frac{-0.086 + 1.23x - 3.075}{0.1}$$
or, y – 7.2 = –3.161 + 1.23x
or, y = 4.04 + 1.23x
(v)
x   70    80    90    100   110   120
y   553   547   539   533   527   520
Solution:
Let Y = a + bX be the equation of the best fitting straight line, where we write $X = \frac{x - 100}{10}$
and Y = y – 533. The normal equations are ΣY = an + bΣX, ΣXY = aΣX + bΣX².
Table: Calculations for fitting straight line
x   y   X = (x − 100)/10   Y = y – 533   X²   XY
70 553 -3 20 9 -60
80 547 -2 14 4 -28
90 539 -1 6 1 -6
100 533 0 0 0 0
110 527 1 -6 1 -6
120 520 2 -13 4 -26
Total - - -3 21 19 -126
Substituting the values in the normal equations, we get
21 = 6a – 3b - - - (i)
–126 = –3a + 19b - - - (ii)
Multiplying (ii) by 2 and adding to (i),
21 = 6a – 3b
–252 = –6a + 38b
–231 = 35b
$$\therefore b = -\frac{231}{35} = -6.6$$
Putting the value of b in equation (i), we get
21 = 6a – 3 × (–6.6)
or, 6a = 21 – 19.8 = 1.2
$$\therefore a = \frac{1.2}{6} = 0.2$$
Therefore, the equation of the fitted straight line is
$$y - 533 = 0.2 - \frac{6.6(x - 100)}{10} = \frac{2 - 6.6x + 660}{10}$$
or, 10y – 5330 = 662 – 6.6x
or, 10y = 5992 – 6.6x
or, y = 599.2 – 0.66x
(vi)
x   5     7     9     11    13    15
y   1.7   2.4   2.8   3.4   3.7   4.4
Solution:
Let Y = a + bX be the equation of the best fitting straight line, where we write
$X = \frac{x - 9}{2}$, $Y = \frac{y - 2.8}{0.1}$. The normal equations are ΣY = an + bΣX, ΣXY = aΣX + bΣX²
Table: Calculations for fitting straight line.
x   y   X = (x − 9)/2   Y = (y − 2.8)/0.1   X²   XY
5 1.7 -2 -11 4 22
7 2.4 -1 -4 1 4
9 2.8 0 0 0 0
11 3.4 1 6 1 6
13 3.7 2 9 4 18
15 4.4 3 16 9 48
Total - - 3 16 19 98
Substituting the values in the normal equations, we get
16 = 6a + 3b - - - (i)
98 = 3a + 19b - - - (ii)
Multiplying (ii) by 2 and subtracting (i) from it, we get
196 = 6a + 38b
16 = 6a + 3b
180 = 35b
$$\therefore b = \frac{180}{35} = 5.14$$
Putting the value of b in equation (i), we get
6a + 3 × 5.14 = 16
6a = 16 – 15.42 = 0.58
$$a = \frac{0.58}{6} = 0.097$$
Therefore, the equation of the fitted straight line is
$$\frac{y - 2.8}{0.1} = 0.097 + 5.14 \times \left(\frac{x - 9}{2}\right)$$
= 0.097 + 2.57x – 23.13
= 2.57x – 23.033
or, y – 2.8 = 0.257x – 2.3033
∴ y = 0.50 + 0.257x
[2] Given the following data on quantities exchanged and prices, fit a linear demand
curve y = a + bx.
Price (x) 16 10 8 9 5 4 3
Quantity (y) 70 85 100 115 120 124 130
[C.U. M.com. 1970]
Solution:
The normal equations for determining the values of a and b are
ΣY = an + bΣX
ΣXY = aΣX + bΣX²
where X = x – 9, Y = y – 115
Table : Calculations for Fitting Straight Line
x   y   X = x – 9   Y = y – 115   X²   XY
16 70 7 -45 49 -315
10 85 1 -30 1 -30
8 100 -1 -15 1 +15
9 115 0 0 0 0
5 120 -4 5 16 -20
4 124 -5 9 25 -45
3 130 -6 15 36 -90
Total - - -8 -61 128 -485
Substituting the values ΣY = –61, ΣX = –8, ΣX² = 128, ΣXY = –485 and n = 7 in
the normal equations, we get
–61 = 7a – 8b - - - (i)
–485 = –8a + 128b - - - (ii)
Multiplying equation (i) by –8 and equation (ii) by 7, and subtracting (iv) from
(iii), we get
488 = –56a + 64b - - - (iii)
–3395 = –56a + 896b - - - (iv)
3883 = –832b
$$\therefore b = -\frac{3883}{832} = -4.667$$
Putting the value of b in equation (i), we get
7a = –61 + 8 × (–4.667) = –61 – 37.336 = –98.34
∴ a = –14.05
y – 115 = –14.05 – 4.667 (x – 9)
y = 115 – 14.05 – 4.667x + 42.00
= 143.0 – 4.667x
[3] Apply the principle of least squares to fit a straight line y = a + bx to the following
data:
x 2 4 6 8 10 12 14
y 10 14 15 16 15 17 18
[C.U., B.Sc. (Math. Hons.) 1968]
Solution :
The normal equations for determining the values of a and b are
ΣY = an + bΣX
ΣXY = aΣX + bΣX²
where $X = \frac{x - 8}{2}$, Y = y – 16
Table : Calculations for Fitting Straight Line
x   y   X = (x − 8)/2   Y = y – 16   X²   XY
2 10 -3 -6 9 18
4 14 -2 -2 4 4
6 15 -1 -1 1 1
8 16 0 0 0 0
10 15 1 -1 1 -1
12 17 2 1 4 2
14 18 3 2 9 6
Total - - 0 -7 28 30
Substituting the values ΣY = –7, ΣX = 0, ΣX² = 28, ΣXY = 30 and n = 7 in the
normal equations, we get
–7 = 7a + b × 0
or, a = –1
$$30 = a \times 0 + 28b \quad \text{or, } b = \frac{30}{28} = 1.071$$
$$y - 16 = -1 + 1.071\left(\frac{x - 8}{2}\right)$$
or, y = 16 – 1 + 0.536x – 4.28
= 15 – 4.28 + 0.536x
= 10.72 + 0.536x
[4] Fit a straight line to the following data and estimate the most probable yield of
rice for 40 inches of water.
Water x (inches) 12 18 24 30 36 42 48
Yield y (tons) 5.27 5.68 6.25 7.21 8.02 8.71 8.42
[C.U., M.com 1964]
Solution:
Let Y = a + bX be the equation of the best fitting straight line, where we write
$X = \frac{x - 30}{6}$, $Y = \frac{y - 7.21}{0.01}$. The normal equations for determining the values of a and b are
ΣY = an + bΣX
ΣXY = aΣX + bΣX²
Table: Calculations for Fitting Straight Line
x   y   X = (x − 30)/6   Y = (y − 7.21)/0.01   X²   XY
12 5.27 -3 -194 9 582
18 5.68 -2 -153 4 306
24 6.25 -1 -96 1 96
30 7.21 0 0 0 0
36 8.02 1 81 1 81
42 8.71 2 150 4 300
48 8.42 3 121 9 363
Total - - - -91 28 1728
Substituting the values ΣY = –91, ΣX = 0, ΣX² = 28, ΣXY = 1728 and n = 7 in the
normal equations, we get
–91 = 7a + b × 0
$$\text{or, } a = -\frac{91}{7} = -13$$
1728 = a × 0 + 28b
$$\text{or, } b = \frac{1728}{28} = 61.71$$
$$\frac{y - 7.21}{0.01} = -13 + 61.71\left(\frac{x - 30}{6}\right)$$
= –13 + 10.29x – 308.56
= –321.56 + 10.29x
or, y – 7.21 = –3.2156 + 0.1029x
y = 7.21 – 3.2156 + 0.1029x = 3.99 + 0.103x
When x = 40, y = 3.99 + 0.103 × 40
= 3.99 + 4.120
= 8.11 tons.
[5] Calculate the values of m and k for the equation y = mx + k to show the regression
of profit per unit of output on output.
Output x (000) 5 7 9 11 13 15
Profit per unit of output y (Rs.) 1.7 2.4 2.8 3.4 3.7 4.4
Estimate the profit per unit of output when there is an output of 10,500.
[I.C.W.A. 1973]
Solution:
Let the equation of the straight line be y = mX + k; the normal equations will be
Σy = kn + mΣX
ΣXy = kΣX + mΣX²
$$\text{where } X = \frac{x - \frac{9 + 11}{2}}{\frac{1}{2} \times 2} = \frac{x - 10}{1} = x - 10$$
x   y   X = x – 10   X²   Xy
5 1.7 -5 25 -8.5
7 2.4 -3 9 -7.2
9 2.8 -1 1 -2.8
11 3.4 1 1 3.4
13 3.7 3 9 11.1
15 4.4 5 25 22.0
Total - 18.4 0 70 18.0
Substituting the values in the normal equations,
18.4 = 6k + m × 0
$$\text{or, } k = \frac{18.4}{6} = 3.067$$
18 = k × 0 + m × 70
$$\text{or, } m = \frac{18}{70} = 0.257$$
Putting the values of k and m, we get
y = 0.257 × (x – 10) + 3.067
y = 3.067 – 2.57 + 0.257x
y = 0.50 + 0.257x, which is the best fitting straight line.
$$\text{When the output is 10,500, } x = \frac{10{,}500}{1000} = 10.5 \text{ and}$$
y = 0.50 + 0.257 × 10.5
= 0.50 + 2.6985
= Rs. 3.20
[6] The following data relate to results of a fertiliser experiment on crop yields:
Units of fertiliser used(x) 0 2 4 6 8 10
Units of yield (y) 110 113 118 119 120 118
Fit a straight line to the above data and estimate the amounts of yield when units
of fertiliser used are 3 and 7 respectively.
[C.U. M.com 1969]
Solution:
Let the equation of the straight line be y = a + bu, where $u = \frac{x - 5}{\frac{1}{2} \times 2} = x - 5$.
The normal equations are
Σy = na + bΣu
Σuy = aΣu + bΣu²
Table : Calculations for Fitting Straight Line
x   y   u = x – 5   u²   uy
0 110 -5 25 -550
2 113 -3 9 -339
4 118 -1 1 -118
6 119 1 1 119
8 120 3 9 360
10 118 5 25 590
Total - 698 - 70 62
$$698 = 6a + b \times 0 \quad \text{or, } a = \frac{698}{6} = 116.33$$
$$62 = a \times 0 + 70b \quad \text{or, } b = \frac{62}{70} = 0.886$$
Putting the values of a and b and rewriting u in terms of x,
y = 116.33 + 0.886 (x – 5) = 116.33 + 0.886x – 4.43
y = 111.90 + 0.886x
When x = 3, y = 111.90 + 0.886 × 3 = 114.6
When x = 7, y = 111.90 + 0.886 × 7 = 118.1
[7] The weights (in lbs) of a calf taken at weekly intervals are given below. Fit a
straight line, and calculate the average rate of growth per week.
Age (x) 1 2 3 4 5 6 7 8 9 10
Weight (y) 52.5 58.7 65.0 70.2 75.4 81.1 87.2 95.5 102.2 106.4
Solution:
Let the equation of the straight line be y = a + bu - - - (1), where u = (x − 5.5)/(½ × 1) = 2x − 11.
The normal equations are
Σy = an + bΣu
Σuy = aΣu + bΣu²
x y u = 2x – 11 u² uy
1 52.5 -9 81 -472.5
2 58.7 -7 49 -410.9
3 65.0 -5 25 -325.0
4 70.2 -3 9 -210.6
5 75.4 -1 1 -75.4
6 81.1 1 1 81.1
7 87.2 3 9 261.6
8 95.5 5 25 477.5
9 102.2 7 49 715.4
10 106.4 9 81 957.6
Total - 794.2 0 330 998.8

Putting the values from above table in the normal equations:

794.2 = 10a + b × 0, or a = 794.2/10 = 79.42
998.8 = a × 0 + 330b, or b = 998.8/330 = 3.0267
Substituting these values of a and b in the equation y = a + bu and rewriting u in terms of x,
y = 79.42 + 3.0267 (2x – 11)
= 79.42 + 6.053x – 33.2937
y = 46.13 + 6.053x
The average rate of growth is the slope of the straight line, i.e. 6.053 lbs per week.
[8] In the following table, S is the weight of potassium bromide which will dissolve
in 100 gms of water at T°C. Fit an equation of the form S = mT + b by the method of
least squares. Use this relation to estimate S when T = 50°.
T 0 20 40 60 80
S 54 65 75 85 96
Solution:
Let us write the equation of the straight line as y = b + mu, where
u = (T – 40)/20
y = S – 75
The normal equations are
Σy = bn + mΣu
Σuy = bΣu + mΣu²
Table: Calculations for Fitting Straight Line
T  S  u = (T – 40)/20  y = S – 75  u²  uy
0 54 -2 -21 4 42
20 65 -1 -10 1 10
40 75 0 0 0 0
60 85 1 10 1 10
80 96 2 21 4 42
Total - - 0 0 10 104
Substituting the values from the table in the normal equations:
0 = 5b + m×0 or, b = 0
104 = b×0+10m or, m = 10.4
Putting the values of b and m and rewriting u and y in terms of T and S,
S − 75 = 0 + 10.4 × (T − 40)/20
or, S = 75 + 0.52T – 20.8
or, S = 0.52T + 54.2
When T = 50°,
S = 0.52 × 50 + 54.2
= 26 + 54.2 = 80.2 units
[9] Fit a curve of the form y = ab^x to the following data:

x 2 3 4 5 6 7
y 640 512 410 328 262 210
Solution:
Taking logarithms of both sides of the equation y = ab^x, we have log y = log a + x
log b, i.e. Y = A + Bx,
where Y = log y, A = log a, and B = log b. Applying the method of least squares,
the normal equations are
ΣY = An + BΣx
ΣxY = AΣx + BΣx²
Table : Fitting Exponential Curve
x y Y = log y x² xY
2 640 2.8062 4 5.6124
3 512 2.7093 9 8.1279
4 410 2.6128 16 10.4512
5 328 2.5159 25 12.5795
6 262 2.4183 36 14.5098
7 210 2.3222 49 16.2554
Total 27 - 15.3847 139 67.5362
Substituting the values in the normal equations:
15.3847 = 6A +27B
67.5362 = 27A + 139B
Solving these equations we get
A = 2.999, B = –0.0968
so that log a = 2.999 and log b = –0.0968,
or a = antilog 2.999 = 1000 (approx.) and b = antilog (–0.0968) = 0.8
Therefore, the equation of the fitted curve is y = 1000 (0.8)^x
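The logarithmic transformation above can be reproduced with a short Python sketch. It is an illustration only: the function name fit_exponential is an assumption, and logarithms are computed to full precision instead of being read from a four-figure log table, so the constants may differ from the hand-worked values in the last digit.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on Y = log10(y) = A + B*x."""
    n = len(xs)
    Y = [math.log10(y) for y in ys]
    sx, sY = sum(xs), sum(Y)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * v for x, v in zip(xs, Y))
    # Solve  ΣY = A·n + B·Σx  and  ΣxY = A·Σx + B·Σx²  for A and B
    B = (n * sxY - sx * sY) / (n * sxx - sx * sx)
    A = (sY - B * sx) / n
    return 10 ** A, 10 ** B          # a = antilog A, b = antilog B

a, b = fit_exponential([2, 3, 4, 5, 6, 7], [640, 512, 410, 328, 262, 210])
print(round(a), round(b, 2))
```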
[10] Fit a curve of the form y = ax^b to the following data:
x 1 2 3 4 5
y 5.0 6.3 7.2 7.9 8.5
Solution:
Taking logarithms of both sides of the equation y = ax^b, we have log y = log a + b
log x, i.e. Y = A + bX - - - (i)
where Y = log y, A = log a, and X = log x. The normal equations for determining
the constants A and b in (i) are
ΣY = An + bΣX
ΣXY = AΣX + bΣX²

Table: Fitting Power Curve
x  y  X = log x (from log table)  X (rounded)  Y = log y (from log table)  Y (rounded)  X²  XY
1 5.0 0.0000 0.00 0.6990 0.70 0 0
2 6.3 0.3010 0.30 0.7993 0.80 .0900 .2400
3 7.2 0.4771 0.48 0.8573 0.86 .2304 .4128
4 7.9 0.6021 0.60 0.8976 0.90 .3600 .5400
5 8.5 0.6990 0.70 0.9294 0.93 .4900 .6510
Total - - - 2.08 - 4.19 1.1704 1.8438
Substituting the values in the normal equations:
4.19 = 5A + 2.08b
1.8438 = 2.08A + 1.1704b
Solving these equations we get
A = 0.7 or, log a = 0.7
a = antilog 0.7 = 5.012 = 5 (approx)
b = 0.33
Therefore, the equation of the fitted curve is y = 5x^0.33
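For a curve of the form y = ax^b, both variables are transformed. The sketch below is an illustration under the same assumptions as the worked solution (least squares applied to log x and log y); the function name fit_power_curve is hypothetical, and unrounded logarithms are used, so b comes out close to, but not exactly, 0.33.

```python
import math

def fit_power_curve(xs, ys):
    """Fit y = a * x**b by least squares on X = log10(x), Y = log10(y)."""
    n = len(xs)
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    sX, sY = sum(X), sum(Y)
    sXX = sum(v * v for v in X)
    sXY = sum(p * q for p, q in zip(X, Y))
    # Solve  ΣY = A·n + b·ΣX  and  ΣXY = A·ΣX + b·ΣX²
    b = (n * sXY - sX * sY) / (n * sXX - sX * sX)
    A = (sY - b * sX) / n
    return 10 ** A, b                # a = antilog A; b is used directly as the exponent

a, b = fit_power_curve([1, 2, 3, 4, 5], [5.0, 6.3, 7.2, 7.9, 8.5])
print(round(a, 2), round(b, 2))
```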
[11] Estimate the constants of the Pareto curve n = Ax^–a which fits the data below:
1945-46 : Number of net incomes more than Rs.x after tax
Income (Rs.x) Number (n)
150 14,000,000
500 825,000
1000 173,000
2000 35,500
[I.C.W.A. 1973]
Solution:
Taking logarithms of both sides of the equation n = Ax^–a, we have
log n = log A – a log x, i.e. Y = a′ + b′X - - - (i)
where Y = log n, a′ = log A, b′ = –a, X = log x.
The normal equations for determining the constants a′ and b′ in (i) are
ΣY = Na′ + b′ΣX - - - (ii)
ΣXY = a′ΣX + b′ΣX² - - - (iii)
where N = 4 is the number of observations.
Table: Fitting Pareto Curve
x  n  X = log x (from log table)  X (rounded)  Y = log n (from log table)  Y (rounded)  X²  XY
150 14,000,000 2.1761 2.18 7.1461 7.15 4.7524 15.587
500 825,000 2.6990 2.70 5.9165 5.92 7.2900 15.984
1000 173,000 3.0000 3.00 5.2380 5.24 9.0000 15.720
2000 35,500 3.3010 3.30 4.5502 4.55 10.8900 15.015
Total - - - 11.18 - 22.86 31.9324 62.306

Putting the values from the table, we have
22.86 = 4a′ + 11.18b′ - - - (iv)
62.31 = 11.18a′ + 31.93b′ - - - (v)
Multiplying equation (iv) by 11.18 and equation (v) by 4 and subtracting, we get
255.5748 = 44.72a′ + 125.00b′
249.2400 = 44.72a′ + 127.72b′
6.3348 = −2.72b′, or b′ = −2.33
Putting b′ = −2.33 in equation (iv), we get a′ = 12.22
∴ log A = 12.22 and a = −b′ = 2.33
[12] Fit a second degree parabola (y = a + bx + cx²) to the following data:
x 0 1 2 3 4
y 1 5 10 22 38
[C.U, M.com. 1966]
Solution:
The constants a, b, c appearing in the equation y = a + bx + cx² - - - (i) are
obtained by solving the normal equations
Σy = an + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
Σx²y = aΣx² + bΣx³ + cΣx⁴
Table: Calculations for Fitting Parabola
x y x² x³ x⁴ xy x²y
0 1 0 0 0 0 0
1 5 1 1 1 5 5
2 10 4 8 16 20 40
3 22 9 27 81 66 198
4 38 16 64 256 152 608
Total 10 76 30 100 354 243 851
Substituting the values from the table (here n = 5)
76 = 5a + 10b + 30c - - - (ii)
243 = 10a + 30b + 100c- - - (iii)
851 = 30a + 100b + 354c- - - (iv)
Multiplying (ii) by 2 and subtracting from (iii),
243 = 10a + 30b + 100c
152 = 10a + 20b + 60c
91 = 10b + 40c - - - (v)
Again, multiplying (ii) by 6 and subtracting from (iv),
851 = 30a + 100b + 354c
456 = 30a + 60b + 180c
395 = 40b + 174c - - - (vi)
Solving (v) and (vi), we get b = 0.26, c = 2.21.
Putting the values of b and c in equation (ii), we get a = 1.42. Now putting the
values of a, b, c in (i), the required equation of the parabola is y = 1.42 + 0.26x + 2.21x²
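The three normal equations can also be solved exactly by machine. The sketch below is an illustration only (the function name fit_parabola is an assumption); it uses exact fractions and Gauss-Jordan elimination without pivoting, which is adequate here because at least three distinct x values are given. Solved without intermediate rounding, the system gives c = 31/14 ≈ 2.214, b = 17/70 ≈ 0.243 and a = 10/7 ≈ 1.429; the hand-worked values b = 0.26 and a = 1.42 arise from rounding c to 2.21 before back-substitution.

```python
from fractions import Fraction

def fit_parabola(xs, ys):
    """Solve the normal equations of y = a + b*x + c*x**2 exactly."""
    xs = [Fraction(str(x)) for x in xs]
    ys = [Fraction(str(y)) for y in ys]
    sx = [sum(x ** p for x in xs) for p in range(5)]                   # Σx⁰ (= n), Σx, ..., Σx⁴
    sxy = [sum(x ** p * y for x, y in zip(xs, ys)) for p in range(3)]  # Σy, Σxy, Σx²y
    # Augmented matrix: row i is  a·Σx^i + b·Σx^(i+1) + c·Σx^(i+2) = Σx^i·y
    rows = [[sx[i + j] for j in range(3)] + [sxy[i]] for i in range(3)]
    for i in range(3):                                                  # Gauss-Jordan elimination
        pivot = rows[i][i]
        rows[i] = [v / pivot for v in rows[i]]
        for j in range(3):
            if j != i:
                factor = rows[j][i]
                rows[j] = [vj - factor * vi for vj, vi in zip(rows[j], rows[i])]
    return tuple(row[3] for row in rows)                                # (a, b, c)

a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 5, 10, 22, 38])
print(float(a), float(b), float(c))
```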

[13] Fit a parabola of the second degree to the following data taking x as the
independent variable (y = a + bx + cx²), by the method of least squares.
x 0 1 2 3 4
y 1 1.8 1.3 2.5 6.3
Find out the difference between the actual value of y and the value of y obtained
from the fitted curve when x=2.
[I.C.W.A. 1965]
Solution:
The constants a, b, c appearing in the equation y = a + bx + cx² - - - (i) are obtained
by solving the normal equations
Σy = an + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
Σx²y = aΣx² + bΣx³ + cΣx⁴
Table: Calculations for Fitting Parabola
x y x² x³ x⁴ xy x²y
0 1 0 0 0 0 0
1 1.8 1 1 1 1.8 1.8
2 1.3 4 8 16 2.6 5.2
3 2.5 9 27 81 7.5 22.5
4 6.3 16 64 256 25.2 100.8
Total 10 12.9 30 100 354 37.1 130.3
Substituting the values from the table (here n = 5)
12.9 = 5a + 10b + 30c- - - (ii)
37.1 = 10a + 30b + 100c- - - (iii)
130.3 = 30a + 100b + 354c- - -(iv)
Multiplying (ii) by 2 and subtracting from (iii),
37.1 = 10a + 30b + 100c
25.8 = 10a + 20b + 60c
11.3 = 10b + 40c - - - (v)
Again, multiplying (ii) by 6 and subtracting from (iv),
130.3 = 30a + 100b + 354c
77.4 = 30a + 60b + 180c
52.9 = 40b + 174c - - - (vi)
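Solving (v) and (vi), we get b = –1.07, c = 0.55.
Putting these values in (ii), we have a = 1.42. Now putting the values of a, b and
c in (i), the required equation of the parabola is y = 1.42 – 1.07x + 0.55x²
When x = 2, the fitted value is y = 1.42 – 1.07 × 2 + 0.55 × 4 = 1.48, while the actual
value is 1.3. The difference (actual – fitted) is therefore 1.3 – 1.48 = –0.18.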
[14] The profits (in Rs.'000) of a company in the x-th year of its life are given below. Fit
a parabola and estimate its profit in the sixth year.
x 1 2 3 4 5
y 1250 1400 1650 1950 2300
Solution:
Let v = a + bu + cu² - - - (i) be the equation of the parabola, where u = x – 3 and
v = (y – 1650)/50. Applying the method of least squares, the normal equations for
determining the constants a, b, c are
Σv = an + bΣu + cΣu²
Σuv = aΣu + bΣu² + cΣu³
Σu²v = aΣu² + bΣu³ + cΣu⁴
Table: Calculations for Fitting Parabola
x  y  u = x – 3  v = (y – 1650)/50  u²  u³  u⁴  uv  u²v
1 1250 -2 -8 4 -8 16 16 -32
2 1400 -1 -5 1 -1 1 5 -5
3 1650 0 0 0 0 0 0 0
4 1950 1 6 1 1 1 6 6
5 2300 2 13 4 8 16 26 52
Total - - 0 6 10 0 34 53 21
Putting the values from the table in the normal equations,
6 = 5a + b × 0 + 10c - - - (ii)
53 = 0 × a + 10b + 0 × c - - - (iii)
21 = 10a + b × 0 + 34c - - - (iv)
From (ii), 5a + 10c = 6 - - - (v)
From (iii), 10b = 53, or b = 5.3
From (iv), 10a + 34c = 21 - - - (vi)
Multiplying (v) by 2 and subtracting from (vi),
10a + 34c = 21
10a + 20c = 12
14c = 9, or c = 9/14 = 0.643
Putting the value of c in (v), we get
5a + 10 × 0.643 = 6
or, 5a = 6 – 6.43 = –0.43
or, a = –0.43/5 = –0.086
Now, putting the values of a, b, c in equation (i) and rewriting u and v in terms of x and y,
(y – 1650)/50 = –0.086 + 5.3 (x – 3) + 0.643 (x – 3)²
or, (y – 1650)/50 = –0.086 + 5.3x – 15.9 + 0.643x² – 3.858x + 5.787
= –15.986 + 5.787 + 1.442x + 0.643x² = –10.199 + 1.442x + 0.643x²
or, y – 1650 = –509.95 + 72.1x + 32.15x²
or, y = 1140.05 + 72.1x + 32.15x²
When x = 6, y = 1140.05 + 72.1 × 6 + 32.15 × 36 = 1140.05 + 432.6 + 1157.4 = 2730 (approx.),
i.e. the estimated profit in the sixth year is about Rs. 2,730 thousand.
Time Series
Meaning
A series of observations recorded in accordance with the time of occurrence is
called a “Time Series”. Production, consumption, sales and profits during successive periods
of time, and population, price etc. at successive points of time, are examples of time
series.
Components of time series
The four components of time series are
1. Secular trend or Trend ( T )
2. Seasonal Variation ( S )
3. Cyclical Fluctuation ( C )
4. Irregular or Random movement ( I )
It is assumed that there is a multiplicative relationship between the four
components i.e. any particular observation is considered to be the product of the
effects of four components.
Yt = T × S × C × I
Secular trend (or simply trend) of a time series is the smooth, regular and long-term
movement exhibiting the tendency of growth or decline over a period of time. The
trend is that part which the series would have exhibited, had there been no other factors
affecting the values. Population growth and advances in technology and
methods of business organization are the main factors behind the growth or upward trend
in most economic and business data. A decline or downward trend may
be due to decreasing demand for the product, a substitute taking its place, or
difficulty in obtaining raw materials etc. Many industries, however, initially show
steady growth until a saturation point is reached, and then the trend declines steadily.
Sudden or frequent changes, on the other hand, are incompatible with the idea of trend.
Seasonal variation represents a type of periodic movement where the period is
not longer than one year. Business activities are found to have brisk and slack periods
at different parts of the year. This up-and-down movement of a time series, recurring
with remarkable regularity year after year, is attributable to the presence of seasonal
variations. The factors which cause this type of variation are the climatic changes of
the different seasons, such as changes in rainfall, temperature, humidity etc., and the
customs and habits which people follow at different parts of the year.
Cyclical fluctuation is another type of periodic movement, where the period is
more than a year. Such movements are fairly regular and oscillatory in nature. One
complete period is called a cycle. Cyclical fluctuation is found to exist in most
business and economic time series, where it is known as the business cycle. Business cycles
are caused by a complex combination of forces affecting the equilibrium of demand
and supply. Prosperity, decline, depression and recovery are usually considered to be
the four phases of a business cycle. The swing from prosperity through decline, depression
and recovery, and back again to prosperity, varies both in time span and intensity.

Irregular or random movements are variations caused by factors
of an erratic nature. These are completely unpredictable, or caused by such unforeseen
events as war, flood, earthquake, strike, lockout etc., and may sometimes be the
result of many small forces, each of which has a negligible effect but whose combined
effect is not negligible. Random movements do not reveal any pattern or repetitive
tendency and may be considered as residual variation.
Measurement of trend
There are four methods of isolating secular trend in time series :-
1. Free-hand Method
2. Semi-average Method
3. Moving average Method and
4. Fitting mathematical curves.
[1] Free-hand method : The given data are plotted as points on graph paper
against time. The time series data (Yt) are shown along the vertical axis and time (t)
along the horizontal axis. Then a smooth free-hand curve is drawn through the scatter
of the plotted points, which appears to represent their pattern of movement over
time. The distance of this line, known as the trend line, from the horizontal axis gives
the trend value for each time
period. The advantages of the method are that a quick estimate of the trend is obtained
and that the method can be used to obtain a preliminary knowledge of the nature of
the trend with a view to applying more refined methods.
[2] Semi-average Method : The semi-average method consists in dividing the data into
two parts, and then finding an average for each part. These averages are plotted as
points on graph paper against the mid-point of the time interval covered by each
part. The straight line joining these two points gives the trend line. As before, the
distances of the trend line from the horizontal axis give the trend values. If the actual trend
is a straight line, the method will give quite satisfactory results.
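The semi-average construction can be sketched in a few lines of Python. This is an illustration only: the function name and the data are hypothetical, and the convention of leaving out the middle observation when the number of observations is odd is an assumption, since the text does not specify how the two parts are formed in that case.

```python
def semi_average_trend(times, values):
    """Trend values read off the line joining the averages of the two halves."""
    n = len(values)
    half = n // 2                        # odd n: the middle value is left out (assumed convention)
    t1 = sum(times[:half]) / half        # mid-point of the first part
    y1 = sum(values[:half]) / half       # average of the first part
    t2 = sum(times[n - half:]) / half    # mid-point of the second part
    y2 = sum(values[n - half:]) / half   # average of the second part
    slope = (y2 - y1) / (t2 - t1)
    return [y1 + slope * (t - t1) for t in times]

years = [1, 2, 3, 4, 5, 6]               # hypothetical time points
sales = [10, 12, 14, 20, 22, 24]          # hypothetical observations
trend = semi_average_trend(years, sales)
print([round(v, 2) for v in trend])
```

The trend line passes through the two semi-averages by construction: here through 12 at time 2 and 22 at time 5.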
[3] Moving Average Method : The moving average method is very commonly used
for the isolation of trend and for smoothing out fluctuations in time series. In this
method, a series of arithmetic means of successive observations, known as moving
averages, are calculated from the given data, and these moving averages are used as
trend values. Precisely, moving averages of period n are a series of arithmetic means
of groups of n successive observations, and are shown against the mid-points of the time
intervals covered by the respective groups. If the period of the moving average is odd, the
trend values correspond to the given time points. If the period is even, the moving
averages fall midway between two time points, so a
two-point moving average of the moving averages so obtained has to be found for
‘centering’ them.
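The two cases (odd period, and even period with centering) can be sketched as follows. The function name moving_average and the short data list are hypothetical and serve only as an illustration; positions where no average can be formed are returned as None.

```python
def moving_average(values, period):
    """Moving averages aligned with `values`; None where no average is defined."""
    n = len(values)
    out = [None] * n
    k = period // 2
    if period % 2 == 1:
        # Odd period: each average falls on the middle observation of its group
        for i in range(k, n - k):
            out[i] = sum(values[i - k:i + k + 1]) / period
    else:
        # Even period: the raw averages fall between time points, so each
        # consecutive pair is averaged again to centre it on a time point
        raw = [sum(values[i:i + period]) / period for i in range(n - period + 1)]
        for j in range(len(raw) - 1):
            out[j + k] = (raw[j] + raw[j + 1]) / 2
    return out

data = [2, 6, 1, 5, 3, 7]                 # hypothetical observations
three_point = moving_average(data, 3)
four_point_centered = moving_average(data, 4)
print(three_point)
print(four_point_centered)
```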
[4] Fitting Mathematical Curves : In this method, an appropriate type of
mathematical equation is selected for the trend, and the constants appearing in the trend
equation are determined on the basis of the given time series data.
[i] If the plotted data show approximately a straight line tendency on
ordinary graph paper, the equation used is :
y = a + bx (Straight Line)
[ii] If they show a straight line on semi-logarithmic graph paper, the
equation used is :
log y = a + bx (Exponential Curve)
[iii] Sometimes a parabola or higher order polynomial may also be fitted.
[iv] Special types of curves are used in certain cases.
y = a + bc^x (Modified Exponential Curve)
1/y = a + bc^x (Logistic Curve)
log y = a + bc^x (Gompertz Curve)
The constants appearing in the equations referred to at [i] to [iii] are obtained
by applying the principle of least squares.
Measurement of seasonal variation
There are four methods of measuring seasonal fluctuation:
1. Method of (Monthly or Quarterly) Averages
2. Moving Average Method
3. Trend-ratio Method
4. Link Relative Method
[1] Method of (Monthly or Quarterly) Averages :
This method is applied when the given time series data do not contain trend
or cyclical fluctuations to any appreciable extent. From the quarterly data, the totals
for each quarter and the averages A1, A2, A3, A4 for the 4 quarters Q1, Q2, Q3 and
Q4 are found. The grand average G = ¼ (A1 + A2 + A3 + A4) is also calculated. If the
additive model is used, the deviations of the quarterly averages from the grand average
give the seasonal variations:
S1 = A1 – G, S2 = A2 – G, S3 = A3 – G, S4 = A4 – G
If the multiplicative model is used, each quarterly average is expressed as a
percentage of the grand average, giving the seasonal indices :
S1 = (A1/G) × 100, S2 = (A2/G) × 100, S3 = (A3/G) × 100, S4 = (A4/G) × 100
If monthly figures are given, we find 12 averages A1, A2, ..., A12 for the
months January, February, ..., December respectively, and then, proceeding
in the same way as before, the seasonal index for each month is obtained. The total (or
average) seasonal variation (in the additive model) is 0 and the average seasonal index
(in the multiplicative model) is 100.
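The method of quarterly averages can be sketched as below. The function name and the three years of quarterly figures are hypothetical; the sketch simply follows the formulas above for the additive and multiplicative models.

```python
def seasonal_by_averages(years, multiplicative=True):
    """Seasonal indices (or additive variations) from quarterly averages.

    `years` is a list of rows, one per year, each holding the 4 quarterly values.
    """
    n_years = len(years)
    averages = [sum(row[q] for row in years) / n_years for q in range(4)]  # A1..A4
    grand = sum(averages) / 4                                              # G
    if multiplicative:
        return [a / grand * 100 for a in averages]     # S = (A / G) × 100
    return [a - grand for a in averages]                # S = A − G

quarterly = [[30, 40, 36, 34],            # hypothetical data, year 1
             [34, 52, 50, 44],            # year 2
             [40, 58, 54, 48]]            # year 3
indices = seasonal_by_averages(quarterly)
variations = seasonal_by_averages(quarterly, multiplicative=False)
print([round(s, 1) for s in indices], round(sum(indices), 1))
```

As stated in the text, the indices total 400 and the additive variations total 0, apart from floating-point rounding.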
[2] Moving Average Method :

From the given quarterly figures, the trend is estimated by taking four-quarter
moving averages. The effect of trend is then eliminated from the original data. If the
additive model is used, the moving average trend values are subtracted from the
original data to give us ‘deviations from trend’. Since these deviations do not contain
any effect of trend, the method of quarterly averages is applied to these deviations,
using the additive model.
If the multiplicative model is taken, then we find ‘ratios to moving averages’
expressed as percentages, i.e. the original values are expressed as percentages of the
corresponding moving average values. These percentages are now arranged by quarters
and the averages for each quarter, P1, P2, P3, P4 (suppose), are found. These are adjusted
to a total of 400 by multiplying each by 100/P, where P = ¼ (P1 + P2 + P3 + P4) is the grand
average.
The seasonal indices are S1 = (P1/P) × 100, S2 = (P2/P) × 100, S3 = (P3/P) × 100, S4 = (P4/P) × 100
respectively for the quarters Q1, Q2, Q3 and Q4.
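A minimal sketch of the multiplicative (ratio-to-moving-average) version follows. The function name and the data are hypothetical, and the series is assumed to start with Q1 of the first year and to cover at least two years, so that every quarter receives at least one ratio.

```python
def ratio_to_moving_average(values):
    """Seasonal indices for quarterly data that starts at Q1 of the first year."""
    n = len(values)
    raw = [sum(values[i:i + 4]) / 4 for i in range(n - 3)]    # uncentered 4-quarter averages
    ratios = {q: [] for q in range(4)}
    for j in range(len(raw) - 1):
        i = j + 2                                             # observation the centered average falls on
        trend = (raw[j] + raw[j + 1]) / 2                     # centered 4-quarter moving average
        ratios[i % 4].append(values[i] / trend * 100)         # ratio to moving average, per cent
    p = [sum(ratios[q]) / len(ratios[q]) for q in range(4)]   # P1..P4
    grand = sum(p) / 4                                        # P
    return [v / grand * 100 for v in p]                       # adjusted to a total of 400

series = [30, 40, 36, 34, 34, 52, 50, 44, 40, 58, 54, 48]     # hypothetical quarterly data
indices = ratio_to_moving_average(series)
print([round(s, 1) for s in indices])
```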
[3] Trend-Ratio Method :
In this method, the multiplicative model is always taken. Trend values are
obtained by fitting a mathematical curve, and the original data are expressed as
percentages of the corresponding trend values. As in the moving average method, these
percentages are arranged by quarters, and the quarterly averages P1, P2, P3, P4 are found. Each of these is
now multiplied by 100/P to give the seasonal indices
S1 = (P1/P) × 100, S2 = (P2/P) × 100, S3 = (P3/P) × 100, S4 = (P4/P) × 100
corresponding to the quarters Q1, Q2, Q3, Q4 respectively, where P = ¼ (P1 + P2 + P3 + P4).
The total of the 4 seasonal indices will be 400.
[4] Link Relative Method :
If quarterly data are given, each value is expressed as a percentage of the value for
the immediately preceding period. These are known as Link Relatives (L.R.). Of course,
the link relative for the first quarter (Q1) of the first year cannot be obtained. The L.R.s
are arranged by quarters and the average L.R. for each quarter is found, using either
the arithmetic mean or the median. The average link relatives show the average relation of
each quarterly value to the value of the previous quarter.
From these average L.R.s we find chain relatives (C.R.) by relating them to a
common base, e.g. the first quarter, for which the C.R. is taken as 100. The C.R. for any
quarter is now obtained by multiplying the L.R. for that quarter by the C.R. for the
immediately preceding quarter and dividing by 100. Proceeding this way, we find a
second C.R. for the first quarter (Q1) by the relation
Second C.R. for Q1 = (C.R. for Q4) × (L.R. for Q1) / 100
Usually, the second C.R. for Q1 will differ from the originally assumed C.R. of 100,
owing to the presence of trend. Some adjustments to the C.R.s are therefore necessary.
Let C be the average quarterly deviation of the second C.R. from 100, i.e.
C = ¼ (Second C.R. for Q1 − 100)
Subtracting C, 2C, 3C and 4C from the C.R.s for Q2, Q3, Q4 and the second C.R.
for Q1 respectively, we find that both the C.R.s for Q1 are now equal to 100. The adjusted C.R.s for
Q1, Q2, Q3, Q4 are now expressed as percentages of their A.M. to give the seasonal
indices. The total of these seasonal indices will be 400.
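The steps of the link relative method can be traced in code. The sketch below is an illustration only: the function name and data are hypothetical, the arithmetic mean is used for averaging the link relatives, and the series is assumed to start with Q1 and to cover at least two years.

```python
def link_relative_indices(values):
    """Seasonal indices by the link relative method (quarterly data from Q1)."""
    link = {q: [] for q in range(4)}
    for i in range(1, len(values)):
        link[i % 4].append(values[i] / values[i - 1] * 100)     # L.R. of each quarter
    avg_lr = [sum(link[q]) / len(link[q]) for q in range(4)]    # average L.R. per quarter
    chain = [100.0]                                             # C.R. for Q1 taken as 100
    for q in range(1, 4):
        chain.append(avg_lr[q] * chain[q - 1] / 100)            # C.R. for Q2, Q3, Q4
    second_q1 = avg_lr[0] * chain[3] / 100                      # second C.R. for Q1
    c = (second_q1 - 100) / 4                                   # correction per quarter
    adjusted = [chain[q] - q * c for q in range(4)]             # subtract 0, C, 2C, 3C
    mean_adjusted = sum(adjusted) / 4
    return [v / mean_adjusted * 100 for v in adjusted]          # percentages of their A.M.

series = [30, 40, 36, 34, 34, 52, 50, 44, 40, 58, 54, 48]       # hypothetical quarterly data
indices = link_relative_indices(series)
print([round(s, 1) for s in indices])
```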
Exercise
[1] Using 3-year moving averages, determine the trend and short-term fluctuations.
Plot the original and the trend values on the same graph paper:
Year 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977
Production (‘000 tons) 21 22 23 25 24 22 25 26 27 26
[C.A. 1981]
Solution:
Calculations for 3 – yearly Moving Average
Year  Value ('000 tons)  3-year moving total ('000 tons)  3-year moving average ('000 tons)
1968 21 - -
1969 22 66 22.00
1970 23 70 23.33
1971 25 72 24.00
1972 24 71 23.67
1973 22 71 23.67
1974 25 73 24.33
1975 26 78 26.00
1976 27 79 26.33
1977 26 - -
∴ Trend: 22.00, 23.33, 24.00, 23.67, 23.67, 24.33, 26.00, 26.33 for 1969–76 (in '000 tons)
Short-term fluctuations:
22 – 22.00 = 0, 23 – 23.33 = –0.33, 25 – 24.00 = 1.00, 24 – 23.67 = 0.33, 22 – 23.67 = –1.67,
25 – 24.33 = 0.67, 26 – 26.00 = 0, 27 – 26.33 = 0.67 (in '000 tons)
[Figure: Original data and trend values (3-year moving averages). Vertical axis: Value ('000 tons), 20 to 27; horizontal axis: Year, 1968 to 1977.]
[2] The net profits of a company for eleven successive years are given below. Find
the three – year moving averages:-
Year 1956 ‘57 ‘58 ‘59 ‘60 ‘61 ‘62 ‘63 ‘64 ‘65 ‘66
Profit in lakh of Rs. 2.7 2.9 3.4 5.2 5.8 6.4 9.3 9.2 9.8 10.2 11.0
[I.C.W.A. 1969]

Solution: Calculations for 3 – yearly moving Average

Year Value (in lakh Rs.) 3-year moving total 3 – year moving average
1956 2.7 - -
1957 2.9 9.00 3.00
1958 3.4 11.50 3.83
1959 5.2 14.40 4.80
1960 5.8 17.40 5.80
1961 6.4 21.50 7.17
1962 9.3 24.90 8.30
1963 9.2 28.30 9.43
1964 9.8 29.20 9.73
1965 10.2 31.00 10.33
1966 11.0 - -
Trend: 3.00, 3.83, 4.80, 5.80, 7.17, 8.30, 9.43, 9.73, 10.33 for 1957 - 1965
[3] The following data give daily sales of a shop observing a five – day week, over
four successive weeks. Determine the period of the moving average and calculate the
moving average accordingly.
Day 1 2 3 4 5 6 7 8 9 10
Sales 26 29 35 47 51 26 32 37 46 53
Day 11 12 13 14 15 16 17 18 19 20
Sales 28 30 36 46 54 28 31 36 46 54
[C.A. 1974]
Solution:
The data show a regular cycle of 5 days, because every 5th figure is the highest,
after which there is a slump followed by gradual recovery. Hence a 5-day moving average is appropriate.
Table: Calculation for 5 – Day Moving Average
Day Value 5 – day moving total 5 – day moving average
1 26 - -
2 29 - -
3 35 188 37.6
4 47 188 37.6
5 51 191 38.2
6 26 193 38.6
7 32 192 38.4
8 37 194 38.8
9 46 196 39.2
10 53 194 38.8
11 28 193 38.6
12 30 193 38.6
13 36 194 38.8
14 46 194 38.8
15 54 195 39.0
16 28 195 39.0
17 31 195 39.0
18 36 195 39.0
19 46 - -
20 54 - -

5 – day moving averages are 37.6, 37.6, 38.2, 38.6, 38.4, 38.8, 39.2, 38.8, 38.6,
38.6, 38.8, 38.8, 39.0, 39.0, 39.0, 39.0 for days 3 to 18
[4] From the following data calculate the 4-yearly moving average and determine
the trend values. Find the short-term fluctuations. Plot the original values and trend on
graph paper.
Year 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967
Value 50.0 36.5 43.0 44.5 38.9 38.1 32.6 41.7 41.1 33.8
[C.A. 1980]
Solution:
Table: Calculations for 4-yearly Moving Average
Year (1)  Value (2)  4-year moving total, not centered (3)  4-year moving average, not centered (4)  2-item moving total, centered (5)  4-year moving average, centered (6)
1958 50.0 - - - -
1959 36.5 - - - -
1960 43.0 174.0 43.5 84.23 42.1
1961 44.5 162.9 40.73 81.86 40.9
1962 38.9 164.5 41.13 79.66 39.8
1963 38.1 154.1 38.53 76.36 38.2
1964 32.6 151.3 37.83 76.21 38.1
1965 41.7 153.5 38.38 75.68 37.8
1966 41.1 149.2 37.30
1967 33.8 - - - -
Note: Col (4) = Col (3)/ 4, Col (6) = Col (5)/2
Trend values are: 42.1, 40.9, 39.8, 38.2, 38.1, 37.8 for 1960 to 1965.
Short-term fluctuations: 43.0 – 42.1 = 0.9, 44.5 – 40.9 = 3.6, 38.9 – 39.8 = –0.9,
38.1 – 38.2 = –0.1, 32.6 – 38.1 = –5.5, 41.7 – 37.8 = 3.9 for 1960 to 1965.
Figure: Trend by 4-year Moving Average [plot of the original data and the moving average trend line; vertical axis: Value, 30.0 to 50.0; horizontal axis: Year, 1958 to 1968]
[5] Determine trend by the method of moving averages from the figures of quarterly
production of a commodity:

Production (in thousand tons)

Quarter/Year 1975 1976 1977
I 115 119 149
II 180 189 209
III 108 149 179
IV 99 119 145
[W.B.H.S. 1982]
Solution:
Table: Calculations for 4-Quarterly Moving Average
Year/Quarter  Value  4-quarter moving total (not centered)  4-quarter moving average (not centered)  2-item moving total (centered)  4-quarter moving average (centered)
1975 I 115 - - - -
II 180 - - - -
III 108 502 125.50 252.00 126.0
IV 99 506 126.50 255.25 127.6
1976 I 119 515 128.75 267.75 133.9
II 189 556 139.00 283.00 141.5
III 149 576 144.00 295.50 147.8
IV 119 606 151.50 308.00 154.0
1977 I 149 626 156.50 320.50 160.2
II 209 656 164.00 334.50 167.2
III 179 682 170.50 - -
IV 145 - - - -
Trend: 126.0, 127.6, 133.9, 141.5, 147.8, 154.0, 160.2, 167.2 for 1975-III to 1977-II
[6] Find the quarterly trend value from the following data by the moving average
method, using an appropriate period:
Quarterly output (million tons)
Quarter/Year 1964 1965 1966
I 52 59 57
II 54 63 61
III 67 75 72
IV 55 65 60
[I.C.W.A. 1971]
Solution:
Table: Calculations for Moving Average Trend
Year/Quarter  Output (million tons)  4-quarter moving total  2-period moving total (centered)  4-quarter moving average (centered)
1964 I 52 - - -
II 54 - - -
III 67 228 463 57.9
IV 55 235 479 59.9
1965 I 59 244 496 62.0
II 63 252 514 64.2

III 75 262 522 65.2
IV 65 260 518 64.8
1966 I 57 258 513 64.1
II 61 255 505 63.1
III 72 250 - -
IV 60 - - -
Four quarterly moving averages:
57.9, 59.9, 62.0, 64.2, 65.2, 64.8, 64.1, 63.1 for 1964-III to 1966-II

[7] Assuming a four-yearly cycle, calculate the trend by the method of moving
averages from the following data relating to the production of tea in India:
Year 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950
Production (mn.lbs.) 464 515 518 467 502 540 557 571 586 612
[I.C.W.A. 1968]
Solution:
Table: Calculation of Trend by Moving Averages
Year (1)  Production (mn. lbs.) (2)  4-year moving total (3)  2-item moving total of col. (3), centered (4)  4-year moving average, centered (5)
1941 464 - - -
1942 515 - - -
1943 518 1964 3966 495.8
1944 467 2002 4029 503.6
1945 502 2027 4093 511.6
1946 540 2066 4236 529.5
1947 557 2170 4424 553.0
1948 571 2254 4580 572.5
1949 586 2326 - -
1950 612 - - -
Moving averages are 495.8, 503.6, 511.6, 529.5, 553.0, 572.5 (mn. lbs.) for 1943–1948.
[8] For the following series of observations verify that the 4-year centered moving
average is equivalent to a 5-year weighted moving average with weights 1, 2, 2, 2, 1
respectively:
Year 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974
Sales (Rs.’000) 2 6 1 5 3 7 2 6 4 8 3
[C.A. 1979]
Solution:
Table: Calculations for Moving Average
Year (1)  Sales (Rs.'000) (2)  4-year moving total (3)  2-item moving total of col. (3) (4)  4-year centered moving average (5) = (4) ÷ 8
1964 2 - - -
1965 6 - - -
1966 1 14 29 29/8
1967 5 15 31 31/8
1968 3 16 33 33/8
1969 7 17 35 35/8
1970 2 18 37 37/8
1971 6 19 39 39/8
1972 4 20 41 41/8
1973 8 21 - -
1974 3 - - -

Instead of taking simple averages of the values for 4 consecutive years, weighted
averages of 5 consecutive years, with weights 1, 2, 2, 2, 1, are now calculated.
Table: Calculations for Weighted Moving Average
Year (1)  Sales (Rs.'000) (2)  Weighted moving total (x) (3)  Weighted moving average (a) (4)
1964 2 - -
1965 6 - -
1966 1 29 29/8
1967 5 31 31/8
1968 3 33 33/8
1969 7 35 35/8
1970 2 37 37/8
1971 6 39 39/8
1972 4 41 41/8
1973 8 - -
1974 3 - -
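(x) Weighted moving totals: 2 × 1 + 6 × 2 + 1 × 2 + 5 × 2 + 3 × 1 = 29,
6 × 1 + 1 × 2 + 5 × 2 + 3 × 2 + 7 × 1 = 31, and so on.
(a) Col. (4) = Col. (3) ÷ sum of weights, i.e. 1 + 2 + 2 + 2 + 1 = 8
The weighted moving averages agree with the centered 4-year moving averages obtained earlier. Hence, the result.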
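The equivalence can also be verified mechanically. The following sketch is an illustration (variable names are hypothetical); it computes both series with exact fractions for the sales data of this exercise and compares them year by year.

```python
from fractions import Fraction

sales = [2, 6, 1, 5, 3, 7, 2, 6, 4, 8, 3]     # 1964 to 1974, Rs.'000
weights = [1, 2, 2, 2, 1]

centered = []     # 4-year centered moving averages
weighted = []     # 5-year weighted moving averages
for i in range(2, len(sales) - 2):            # years 1966 to 1972
    first = Fraction(sum(sales[i - 2:i + 2]), 4)      # 4-year average ending one year after year i
    second = Fraction(sum(sales[i - 1:i + 3]), 4)     # 4-year average starting one year before year i
    centered.append((first + second) / 2)
    total = sum(w * s for w, s in zip(weights, sales[i - 2:i + 3]))
    weighted.append(Fraction(total, sum(weights)))

print(centered == weighted, [str(v) for v in centered])
```

The reason is visible in the code: adding two overlapping 4-year totals counts the three middle years twice and the two end years once, which is exactly the weighting 1, 2, 2, 2, 1 with divisor 8.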
[9] Fit a suitable straight line to the following data by the method of least squares:
Year 1959 1960 1961 1962 1963
% of insured people 11.3 13.0 9.7 10.6 10.7
[Dip. Management 1972]
Solution: Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the year 1961 and x unit = 1 year.
By the least square method, the normal equations for finding the constants ‘a’ and ‘b’ are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)
Table: Fitting straight Line Trend
Year % of insured people (y) x x2 xy
1959 11.3 -2 4 -22.6
1960 13.0 -1 1 -13.0
1961 9.7 0 0 0
1962 10.6 1 1 10.6
1963 10.7 2 4 21.4
Total 55.3 0 10 -3.6
Number of observations n = 5. Substituting the values from the table in equations
(ii) and (iii),
55.3 = 5a + b × 0, or a = 55.3/5 = 11.06
–3.6 = a × 0 + 10b, or b = –3.6/10 = –0.36
Putting the values of a and b in equation (i), the trend equation is
y = 11.06 – 0.36x
(origin = 1961, unit of x = 1 year)

[10] Fit a straight line trend to the following data, and show the original observations
and trend values on graph paper.
Year 1965 1966 1967 1968 1969 1970 1971

Gross ex-factory value of 672 824 967 1204 1464 1758 2057
output (Rs.crores)
[I.C.W.A. 1975]
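Solution:
Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the year 1968 and x unit = 1 year.
By the least square method, the normal equations for finding the constants ‘a’ and ‘b’ are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)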
Year Value (y) (Rs.crores) x x2 xy
1965 672 -3 9 -2016
1966 824 -2 4 -1648
1967 967 -1 1 -967
1968 1204 0 0 0
1969 1464 1 1 1464
1970 1758 2 4 3516
1971 2057 3 9 6171
Total 8946 0 28 6520
Number of observations n = 7. Substituting the values from the table in equations
(ii) and (iii),
8946 = 7a + b × 0, or a = 8946/7 = 1278
6520 = a × 0 + 28b, or b = 6520/28 = 232.9
Putting the value of a and b in equation (i), the trend equation is

y = 1278 + 232.9x
Putting x = 3 and x = –3 in the trend equation, we get
y = 1278 + 232.9 × 3 = 1278 + 698.7 = 1976.7
y = 1278 + 232.9 × (–3) = 1278 – 698.7 = 579.3
These values are plotted on the graph paper and a straight line is drawn through
the points, giving the trend line.
[Figure: Original data and trend line. Vertical axis: Value of output (Rs. crores), 600 to 2200; horizontal axis: Year, 1965 to 1971.]
[11] Find the values of the trend ordinates by the method of least squares from the
data given below.
Year 1971 1972 1973 1974 1975 1976 1977
Sales (Rs.’000) 125 128 133 135 140 141 143
[I.C.W.A. 1980]
Solution:
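Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the year 1974 and x unit = 1 year.
By the least square method, the normal equations for finding the constants ‘a’ and ‘b’ are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)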
Year Sales (y) x x2 xy
1971 125 -3 9 -375
1972 128 -2 4 -256
1973 133 -1 1 -133
1974 135 0 0 0
1975 140 1 1 140
1976 141 2 4 282
1977 143 3 9 429
Total 945 0 28 87
Number of observations n = 7. Substituting the values from the table in equations
(ii) and (iii),
945 = 7a + b × 0, or a = 945/7 = 135
87 = a × 0 + 28b, or b = 87/28 = 3.1
Putting these values of a and b in equation (i), the trend equation is
y = 135 + 3.1x
Putting x = –3, –2, ..., 3, the trend values for 1971 to 1977 are 125.7, 128.8, 131.9, 135.0, 138.1, 141.2, 144.3 (Rs.'000)
[12] The following data give the values of sales of a company for the years 1968-1978
(Sales in Rs.’000):
Year (x) 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978
Sales ( Y) 50.0 36.5 43.0 44.5 38.9 38.1 32.6 38.7 41.7 41.1 33.8
Use the method of least squares to fit a straight line trend to the data given
above. Compute the trend values for 1971 and 1976 (Take x=0 for the year 1973 and
the unit of x is 1year).
Construct a 5-year moving average and compare the trend value for the years
1971 and 1976.
[I.C.W.A. 1980]
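Solution: Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the year 1973 and x unit = 1 year. The normal equations are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)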
Year Sales (y) x x2 xy
1968 50.0 -5 25 -250.0
1969 36.5 -4 16 -146.0
1970 43.0 -3 9 -129.0
1971 44.5 -2 4 -89.0
1972 38.9 -1 1 -38.9
1973 38.1 0 0 0
1974 32.6 1 1 32.6
1975 38.7 2 4 77.4
1976 41.7 3 9 125.1
1977 41.1 4 16 164.4
1978 33.8 5 25 169.0
Total 438.9 0 110 -84.4
Number of observations n = 11. Substituting the values from the table in
equations (ii) and (iii),
438.9 = 11a + b(0), or a = 438.9/11 = 39.9
–84.4 = a(0) + b(110), or b = –84.4/110 = –0.77
Putting these values of a and b in equation (i), the trend equation is
y = 39.9 – 0.77x - - - (iv)
with origin at 1973 and x unit = 1 year.
The values of x for the years 1971 and 1976 are –2 and 3 respectively.
Hence, putting x = –2 and x = 3 in equation (iv), the estimates for 1971 and 1976 are respectively
y = 39.9 – 0.77 × (–2)
= 39.9 + 1.54 = 41.44
and y = 39.9 – 0.77 × 3
= 39.9 – 2.31 = 37.59
Table: Calculations for 5-yearly Moving Average
Year Value 5-year moving total 5-year moving average
1968 50.0 - -
1969 36.5 - -
1970 43.0 212.9 42.58
1971 44.5 201.0 40.20
1972 38.9 197.1 39.42
1973 38.1 192.8 38.56
1974 32.6 190.0 38.00
1975 38.7 192.2 38.44
1976 41.7 187.9 37.58
1977 41.1 - -
1978 33.8 - -
Trend values (in Rs.’000) for 1971 and 1976:
by least squares: 41.44 and 37.59; by moving averages: 40.20 and 37.58
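The moving average column can be reproduced with the following Python sketch. It assumes an odd window so that each average is centred on the middle year; the function name moving_average is illustrative only.

```python
def moving_average(values, window=5):
    """Centred moving average; positions without a full window get None.

    Assumption: the window length is odd, so the average is centred
    on the middle observation of each window.
    """
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        total = sum(values[i - half:i + half + 1])
        out[i] = round(total / window, 2)
    return out


years = list(range(1968, 1979))
sales = [50.0, 36.5, 43.0, 44.5, 38.9, 38.1, 32.6, 38.7, 41.7, 41.1, 33.8]
ma = dict(zip(years, moving_average(sales)))
print(ma[1971], ma[1976])
```

The first two and the last two years have no 5-year average, which is why those cells are left blank in the table.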
[13] Fit a linear trend equation to the following series on production:
Year 1961 1962 1963 1964 1965 1966
Production (tons) 21 37 48 56 62 69
[M.B.A. 1979]
Solution:
Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the mid-point of 1963 and 1964 and x unit = 6 months (since the data are given for an even number of years, i.e. n = 6 is even, the origin and unit of x have been so chosen as to make Σx = 0). By the least squares method, the normal equations for finding the constants a and b are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)
where x = (year − 1963.5) ÷ ½.
Year Production (y) x x² xy
1961 21 -5 25 -105
1962 37 -3 9 -111
1963 48 -1 1 -48
1964 56 1 1 56
1965 62 3 9 186
1966 69 5 25 345
Total 293 0 70 323
Putting the values in the normal equations (ii) and (iii),
293 = 6a + b(0), or a = 293/6 = 48.83
323 = a(0) + 70b, or b = 323/70 = 4.61
Substituting the value of 'a' and 'b' in equation (i), the trend equation is
y = 48.83 + 4.61x
(origin: mid-point of 1963-64; unit of x = 6 months)
[14] Fit a straight line trend to the following series of production data:
Electricity Generated (monthly average) in West Bengal
Year 1951 1952 1953 1954 1955 1956
Electricity Generated (million KW) 101 107 113 121 136 148
[C.U.M.Com 1980]
Solution:
Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the mid-point of 1953 and 1954
and x unit = 6 months.
The normal equations are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)
where x = (year − 1953.5) ÷ ½.
Year Electricity generated (y) (million KW) x x² xy
1951 101 -5 25 -505
1952 107 -3 9 -321
1953 113 -1 1 -113
1954 121 1 1 121
1955 136 3 9 408
1956 148 5 25 740
Total 726 0 70 330
Substituting the values from the table in the normal equations,
726 = a(6) + b(0), or a = 726/6 = 121
330 = a × 0 + b(70), or b = 330/70 = 4.71
Substituting the value of a and b in equation (i), the trend equation is
y = 121 + 4.71x
With origin at mid – point of 1953-54 and x unit = 6 months.
[15] The annual revenue expenditure (in Rs.crores) of Govt. of India is given below
for 6 successive years:
Year 1953-54 1954-55 1955-56 1956-57 1957-58 1958-59

Revenue expenditure 225 238 262 293 399 520
Find a linear trend by the method of least squares.
[W.B.H.S. 1980]
Solution:
Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the mid-point of 1955-56 and 1956-57 and x unit = 6 months.
By the least squares method, the normal equations for finding the constants ‘a’ and ‘b’ are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)
where x = (year − 1956) ÷ ½.
Year Revenue expenditure (y) x x² xy
1953-54 225 -5 25 -1125
1954-55 238 -3 9 -714
1955-56 262 -1 1 -262
1956-57 293 1 1 293
1957-58 399 3 9 1197
1958-59 520 5 25 2600
Total 1937 0 70 1989
Substituting the values from the table in the normal equations,
1937 = 6a + b × 0, or a = 1937/6 = 322.8
1989 = a × 0 + b(70), or b = 1989/70 = 28.41
The trend equation is
y = 322.8 + 28.41x
With origin at mid – point of 1955-56 and 1956-57 and x unit = 6 months.
[16] Fit a straight line trend equation by the method of least squares and estimate the
value for 1969.
Year 1960 1961 1962 1963 1964 1965 1966 1967
Value 380 400 650 720 690 600 870 930
[C.A. 1978]
Solution:
Let y = a + bx - - - (i)
be the equation of the straight line trend with origin at the mid-point of 1963 and 1964 and x unit = 6 months. The normal equations are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)
where x = (year − 1963.5) ÷ ½.
Year Value (y) x x² xy
1960 380 -7 49 -2660
1961 400 -5 25 -2000
1962 650 -3 9 -1950
1963 720 -1 1 -720
1964 690 1 1 690
1965 600 3 9 1800
1966 870 5 25 4350
1967 930 7 49 6510
Total 5240 0 168 6020
Number of observations n = 8. Putting the values in the normal equations (ii) and (iii),
5240 = 8a + b(0), or a = 5240/8 = 655
6020 = a × 0 + b(168), or b = 6020/168 = 35.83
The trend equation is
y = 655 + 35.83x
with origin at the mid-point of 1963-64 and x unit = 6 months.
For the year 1969, the value of x = (1969 − 1963.5) ÷ ½ = 5.5 ÷ ½ = 11
The trend value is y = 655 + 35.83 × 11
= 655 + 394.13
= 1049.13
= 1049 (approx.)
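When the number of years is even, x is coded in half-year units so that Σx = 0. The following Python sketch illustrates this coding for the data of this problem; the helper name coded_x is hypothetical, and the choice of a half-year step for even n simply follows the convention used in the solution above.

```python
def coded_x(years):
    """Return x = (year - mid) / step, chosen so that sum(x) = 0.

    Assumption: equally spaced years. For an even number of years the
    step is half a year, giving x = ..., -3, -1, 1, 3, ...
    """
    n = len(years)
    mid = sum(years) / n
    step = 0.5 if n % 2 == 0 else 1.0
    return [(yr - mid) / step for yr in years]


years = list(range(1960, 1968))
values = [380, 400, 650, 720, 690, 600, 870, 930]
x = coded_x(years)
a = sum(values) / len(values)
b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)

x_1969 = (1969 - 1963.5) / 0.5          # 1969 expressed in half-year units
estimate = a + round(b, 2) * x_1969      # b rounded as in the worked solution
print(x, a, round(b, 2), round(estimate, 2))
```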
[17] Fit a parabolic curve of second degree (y = a + bx + cx²) to the data given below
by the method of least squares:
Year 1973 1974 1975 1976 1977
Import(y) in ‘000 bales 10 12 13 10 8
(Take 1975 as origin and unit of x as 1 year)
[I.C.W.A. 1981]
Solution:
Let y = a + bx + cx² - - - (i)
be the equation of the second degree polynomial, i.e. parabola, with origin of x at the year 1975 and unit of x = 1 year. Using the method of least squares, the normal equations for determining the constants a, b, c are
Σy = an + bΣx + cΣx² - - - (ii)
Σxy = aΣx + bΣx² + cΣx³ - - - (iii)
Σx²y = aΣx² + bΣx³ + cΣx⁴ - - - (iv)
Table: Fitting Second Degree Polynomial
Year y x x² x³ x⁴ xy x²y
1973 10 -2 4 -8 16 -20 40
1974 12 -1 1 -1 1 -12 12
1975 13 0 0 0 0 0 0
1976 10 1 1 1 1 10 10
1977 8 2 4 8 16 16 32
Total 53 0 10 0 34 -6 94
Substituting the values from the table in the normal equations,
53 = 5a + b(0) + 10c, or 5a + 10c = 53 - - - (v)
–6 = a(0) + 10b + c(0), or b = –6/10 = –3/5 = –0.60
94 = 10a + b(0) + 34c, or 10a + 34c = 94 - - - (vi)
Multiplying equation (v) by 2 and subtracting equation (vi) from it,
10a + 20c = 106
10a + 34c = 94
–14c = 12
or, c = –12/14 = –6/7 = –0.86
Putting the value of c in equation (v), we get
5a + 10 × (–6/7) = 53
or, 35a – 60 = 371
or, 35a = 431, or a = 431/35 = 12.31
Thus a = 12.3, b = –0.60, c = –0.86.
The equation of the fitted second degree polynomial is therefore y = 12.3 – 0.60x – 0.86x²
where the origin of x is at the year 1975 and unit of x = 1 year.
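The three normal equations can also be solved mechanically. The Python sketch below assumes an odd number of equally spaced years, so that Σx = Σx³ = 0 and the system splits into one equation for b and a pair of equations for a and c; the function name fit_parabola is illustrative only.

```python
def fit_parabola(y):
    """Fit y = a + b*x + c*x^2 by least squares with centred x.

    Assumption: odd number of equally spaced observations, so
    sum(x) = sum(x^3) = 0 and the normal equations become
        sum(y)     = n*a       + sum(x^2)*c
        sum(x*y)   = sum(x^2)*b
        sum(x^2*y) = sum(x^2)*a + sum(x^4)*c
    """
    n = len(y)
    k = n // 2
    x = list(range(-k, k + 1))
    sx2 = sum(v ** 2 for v in x)
    sx4 = sum(v ** 4 for v in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi * xi * yi for xi, yi in zip(x, y))

    b = sxy / sx2
    # Solve the 2x2 system for a and c by Cramer's rule.
    det = n * sx4 - sx2 * sx2
    a = (sy * sx4 - sx2 * sx2y) / det
    c = (n * sx2y - sx2 * sy) / det
    return a, b, c


imports = [10, 12, 13, 10, 8]  # 1973 to 1977, '000 bales
a, b, c = fit_parabola(imports)
print(round(a, 2), round(b, 2), round(c, 2))
```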
[18] Fit a quadratic trend to the following data:

Year 1960 1961 1962 1963 1964 1965 1966
Average Production (‘000 tons) 37 38 37 40 41 45 50
[C.U., M.Com 1979]
Solution:
Let y = a + bx + cx² - - - (i)
be the equation of the second degree polynomial, i.e. parabola, with origin of x at the year 1963 and unit of x = 1 year. Using the method of least squares, the normal equations for determining the constants a, b, c are
Σy = an + bΣx + cΣx² - - - (ii)
Σxy = aΣx + bΣx² + cΣx³ - - - (iii)
Σx²y = aΣx² + bΣx³ + cΣx⁴ - - - (iv)
Table: Fitting Second Degree Polynomial
Year y x x² x³ x⁴ xy x²y
1960 37 -3 9 -27 81 -111 333
1961 38 -2 4 -8 16 -76 152
1962 37 -1 1 -1 1 -37 37
1963 40 0 0 0 0 0 0
1964 41 1 1 1 1 41 41
1965 45 2 4 8 16 90 180
1966 50 3 9 27 81 150 450
Total 288 0 28 0 196 57 1193
Number of observations n = 7. Substituting the values from the table in the normal equations,
288 = 7a + b(0) + 28c, or 7a + 28c = 288 - - - (v)
57 = a(0) + 28b + c(0), or b = 57/28 = 2.04
1193 = 28a + b(0) + 196c, or 28a + 196c = 1193 - - - (vi)
Multiplying equation (v) by 4 gives equation (vii); writing equation (vi) as (viii) and subtracting (viii) from (vii),
28a + 112c = 1152 - - - (vii)
28a + 196c = 1193 - - - (viii)
–84c = –41
or, c = 41/84 = 0.49
Putting the value of c in equation (v), we get
7a = 288 − 28 × 41/84 = 288 − 13.67 = 274.33; ∴ a = 39.2
Thus a = 39.2, b = 2.04, c = 0.49.
The equation of the fitted second degree polynomial is therefore y = 39.2 + 2.04x + 0.49x², where the origin of x is at the year 1963 and unit of x = 1 year.
[19] Fit a trend equation log y = A + Bx to the series of sales data given below:
Year (x) 1943 ‘44 ‘45 ‘46 ‘47 ‘48 ‘49 ‘50 ‘51
Sales (y) 97 113 129 202 195 193 192 237 235
[C.U., B.A.(Econ.) 1971]
Solution:
Let us take the origin of x at the year 1947. Starting from the exponential function y = ab^x and taking logarithms of both sides, we have
log y = (log a) + x (log b)
This can be written in the form of a straight line Y = A + Bx, - - - (i)
where Y = log y, A = log a and B = log b.
Using the method of least squares, the normal equations for determining A and B are
ΣY = An + BΣx - - - (ii)
ΣxY = AΣx + BΣx² - - - (iii)
Table: Fitting Exponential Trend
Year Sales (y) x Y = log y x² xY
1943 97 -4 1.9868 16 -7.9472
1944 113 -3 2.0531 9 -6.1593
1945 129 -2 2.1106 4 -4.2212
1946 202 -1 2.3054 1 -2.3054
1947 195 0 2.2900 0 0
1948 193 1 2.2856 1 2.2856
1949 192 2 2.2833 4 4.5666
1950 237 3 2.3747 9 7.1241
1951 235 4 2.3711 16 9.4844
Total 1593 0 20.06 60 2.8276
Substituting the values in the normal equations, we get
20.06 = 9A + B(0), or A = 20.06/9 = 2.2290
2.8276 = A(0) + 60B, or B = 2.8276/60 = 0.0471
Putting these values of A and B in equation (i), the trend equation is
log y = 2.2290 + 0.0471x
(origin: 1947; unit of x = 1 year)
[20] The following data relate to the average monthly number of tourists coming to India
in different years. Fit an exponential trend by the method of least squares:
Year 1971 1972 1973 1974 1975
Number of tourists 25,083 28,579 34,157 35,267 38,773
[W.B.H.S. 1981]
Solution:
Let us take the origin of x at the year 1973 and fit the exponential trend y = ab^x to the given data, where the year is taken as x and the number of tourists as y. Taking logarithms of both sides of the equation y = ab^x, we have
log y = log a + x log b
This can be written in the form of a straight line
Y = A + Bx, - - - (i)
where Y = log y, A = log a and B = log b. Using the method of least squares, the normal equations for determining A and B are
ΣY = An + BΣx - - - (ii)
ΣxY = AΣx + BΣx² - - - (iii)
Table: Fitting Exponential Trend
Year No. of Tourists (y) x Y = log y x² xY
1971 25,083 -2 4.3998 4 -8.7996
1972 28,579 -1 4.4573 1 -4.4573
1973 34,157 0 4.5343 0 0
1974 35,267 1 4.5481 1 4.5481
1975 38,773 2 4.5883 4 9.1766
Total - 0 22.5278 10 0.4678
Substituting the values in the normal equations, we get
5A = 22.5278, or A = 4.50556
10B = 0.4678, or B = 0.04678
∴ a = antilog 4.50556 = 31990
b = antilog 0.04678 = 1.114
Putting the values of a and b, the equation of the fitted exponential trend is
y = 31990 (1.114)^x
(origin: 1973; unit of x = 1 year)
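The same fit can be sketched in Python using full-precision common logarithms in place of log tables. The function name fit_exponential_trend is illustrative, and the code assumes x has been coded so that Σx = 0. Since the logarithms are not rounded to four figures, the later digits of a and b may differ slightly from the hand calculation.

```python
import math


def fit_exponential_trend(x, y):
    """Fit y = a * b**x by least squares on Y = log10(y).

    Assumption: x is coded so that sum(x) = 0, which reduces the
    normal equations to  sum(Y) = n*A  and  sum(x*Y) = B*sum(x^2).
    """
    n = len(x)
    log_y = [math.log10(v) for v in y]
    A = sum(log_y) / n
    B = sum(xi * Yi for xi, Yi in zip(x, log_y)) / sum(xi * xi for xi in x)
    # Taking antilogs recovers the constants of the exponential curve.
    return 10 ** A, 10 ** B


x = [-2, -1, 0, 1, 2]                      # origin at 1973
tourists = [25083, 28579, 34157, 35267, 38773]
a, b = fit_exponential_trend(x, tourists)
print(round(a), round(b, 3))
```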
[21] Fit a straight line trend to the following series of production data:
Year 1960 1961 1962 1963 1964 1965 1966
Y 37 38 37 40 41 45 50
Y Values being the average production in thousand tons, what is the monthly
trend increment ? Find the monthly trend values from the fitted equation for January,
March and December of 1961.
Solution: Let y = a + bx - - - (i)
be the equation of the straight line trend fitted to the given yearly data (origin: 1963; x unit = 1 year). The normal equations for finding the constants a and b are
Σy = an + bΣx - - - (ii)
Σxy = aΣx + bΣx² - - - (iii)
Table: Fitting Straight Line Trend
Year y x x2 xy
1960 37 -3 9 -111
1961 38 -2 4 -76
1962 37 -1 1 -37
1963 40 0 0 0
1964 41 1 1 41
1965 45 2 4 90
1966 50 3 9 150
Total 288 0 28 57
Using the results from the table in the normal equations,
288 = 7a + b(0), or a = 288/7 = 41.14
57 = a(0) + 28b, or b = 57/28 = 2.04
The trend equation fitted to yearly values is
y = 41.14 + 2.04x (origin: 1963; x unit = 1 year)
Since y represents the monthly average for each year and the unit of x is 12 months, the trend of the monthly average increases by 2.04 in 12 months, i.e. by 2.04/12 = 0.17 per month. This is the monthly trend increment.
So, the trend equation for monthly values is
y = 41.14 + (2.04/12)x = 41.14 + 0.17x
(origin: 30th June, 1963; x unit = 1 month)
If July 1963 is chosen as origin, i.e. the origin is shifted half a month later, x should be replaced by (x + ½). The monthly trend equation is, therefore,
y = 41.14 + 0.17(x + ½)
= 41.14 + 0.17x + 0.085
= 41.23 + 0.17x
(origin: July 1963; x unit = 1 month)
Let us find the trend values for January 1961, March 1961 and December 1961.
Since January 1961 is 30 months earlier than the origin, viz. July 1963, putting x = –30 in the trend equation,
y = 41.23 + 0.17 × (–30) = 41.23 – 5.1 = 36.13
Since March 1961 is 28 months earlier than the origin, putting x = –28 in the trend equation,
y = 41.23 + 0.17 × (–28) = 41.23 – 4.76 = 36.47
Since December 1961 is 19 months earlier than the origin, putting x = –19 in the trend equation,
y = 41.23 + 0.17 × (–19) = 41.23 – 3.23 = 38.00
Ans: [a] y = 41.14 + 2.04x (origin: 1963; unit of x = 1 year); monthly trend increment = 0.17
[b] Trend value for January 1961 = 36.13
Trend value for March 1961 = 36.47
Trend value for December 1961 = 38.00
[22] Determine the linear trend equation that fits the following figures on quarterly
consumption of raw material in some factory. Given that the seasonal index for the
third quarter of a year is 117%. What is the estimated consumption for the third
quarter of 1982 ?
Consumption (in tons)
Year / Quarter 1 2 3 4
1976 28 25 31 39
1977 42 44 48 51
1978 55
[D.S.W : 1978]
Solution:
Let y = a + bx be the equation of trend (origin: 1st quarter of 1977; unit of x = 1 quarter). By the method of least squares, the values of a and b are obtained from the normal equations
∑y = an + b∑x
∑xy = a∑x + b∑x²
Table: Fitting Linear Trend to Quarterly Data
Year Quarter Time Series (y) x x2 xy
1976 i. 28 -4 16 -112
ii. 25 -3 9 -75
iii. 31 -2 4 -62
iv. 39 -1 1 -39
1977 i. 42 0 0 0
ii. 44 1 1 44
iii. 48 2 4 96
iv. 51 3 9 153
1978 i. 55 4 16 220
Total 363 0 60 225
Substituting the values from the table (n = 9) in the normal equations,
363 = 9a + b × 0, or a = 363/9 = 40.33
225 = a × 0 + 60b, or b = 225/60 = 3.75
The trend equation is therefore y = 40.33 + 3.75x (origin: 1st quarter of 1977; unit of x = 1 quarter).
For the third quarter of 1982, x = 22.
Trend value, y = 40.33 + 3.75 × 22 = 40.33 + 82.5 = 122.83
The seasonal index for the third quarter is 117%.
∴ Estimated consumption = 122.83 × 1.17 = 143.7 tons.
[23] Suppose we have a series of quarterly production figures (in thousand tons) in
an industry for the years 1970 to 1976, and the equation of the linear trend fitted to the
annual data is
xt = 107.2 + 2.93t
Where t = year – 1973 and xt = annual, production in time period t.
Use this equation to estimate the annual production for the year 1977, and for
the year 1971.
Suppose now the quarterly indices of seasonal variations are:
January – March 125, April-June 105, July-September 87, October – December 83.
(The multiplicative model for the time series is assumed. Use these indices to
estimate the production during the first quarter of 1977.)
[C.U., B.A (Econ.) 1978]
Solution:
Here t for 1977 = 1977 – 1973 = 4
and t for 1971 = 1971 – 1973 = –2
Putting t = 4 in the trend equation,
Annual production = 107.2 + 2.93 × 4
= 107.2 + 11.72 = 118.92 ('000 tons)
Putting t = –2 in the trend equation,
Annual production = 107.2 + 2.93 × (–2)
= 107.2 – 5.86 = 101.34 ('000 tons)
Trend equation for quarterly values: the annual figure is divided by 4 to give production per quarter, the annual increment is divided by 4 × 4 to give the increment per quarter, and the origin is shifted by half a quarter from the middle of 1973 to the middle of the 3rd quarter of 1973. Thus
xt = 107.2/4 + [2.93/(4 × 4)] (t + ½)
= 26.8 + 0.183t + 0.09
= 26.89 + 0.183t
(origin: 3rd quarter of 1973; unit of t = 1 quarter)
Putting t = 14, the trend value for the 1st quarter of 1977 is 26.89 + 0.183 × 14 = 29.452
Seasonal Effect = Seasonal index ÷ 100
= 125 ÷100
= 1.25 for 1st quarter
Estimated Production = 29.452 × 1.25
= 36.82 (‘000 tons)
[24] A large company estimates its average monthly sales in a year to be Rs. 2,00,000.
The seasonal indices of the sales data are as follows:-
Month Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Seasonal index 76 77 98 128 137 122 101 104 100 102 82 73
Using this information, draw up a monthly sales budget for the company.
(Assume that there is no trend).
[C.A. , 1978]
Solution:
Seasonal indices are usually expressed as percentages, their average being 100. Hence, seasonal indices must be divided by 100 to obtain seasonal effects. The average monthly sales being Rs. 2,00,000, the estimated sales for each month are Rs. 2,00,000 × (seasonal effect),
where Seasonal Effect = Seasonal Index ÷ 100.
Table: Budget Estimates of Monthly Sales
Month Seasonal Index Seasonal Effect Estimated Sales (Rs.’000)
(1) (2) (3) (4)
Jan. 76 .76 152
Feb. 77 .77 154
Mar. 98 .98 196
Apr. 128 1.28 256
May. 137 1.37 274
Jun. 122 1.22 244
Jul. 101 1.01 202
Aug. 104 1.04 208

Sep. 100 1.00 200
Oct. 102 1.02 204
Nov. 82 .82 164
Dec. 73 .73 146
Total 1200 12.00 2400
Col(3) = Col(2) ÷ 100 , Col(4) = Col(3) × 200.

[25] Deseasonalise the following data with the help of seasonal index given against:
Month January February March April May June
Cash Balance(Rs.’000) 360 400 550 360 350 550
Seasonal Index 120 80 110 90 70 100
[C.A. May]
Solution:
The seasonal indices are given here as percentages, so they must be divided by 100 to obtain the seasonal effects. Since seasonal indices give ratio changes over the normal value, a multiplicative model is to be assumed for the data, Yt = T × S × C × I. In order to deseasonalise (i.e. eliminate the seasonal effect) it is therefore necessary to divide the data by the seasonal effects.
Deseasonalised data = Yt / S = (T × S × C × I) / S = T × C × I
Table: Deseasonalised Time Series Data
Month Cash Balance (Rs.’000) yt Seasonal Index Seasonal Effect S Deseasonalised data = yt / S
January 360 120 1.20 300
February 400 80 0.80 500
March 550 110 1.10 500
April 360 90 0.90 400
May 350 70 0.70 500
June 550 100 1.00 550
Deseasonalised data are 300, 500, 500, 400, 500 and 550 (Rs.’000).
Note: Seasonal Effect = Seasonal Index ÷ 100
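The division by the seasonal effect can be sketched in Python as follows. The function name deseasonalise is illustrative; it assumes a multiplicative model and seasonal indices expressed as percentages.

```python
def deseasonalise(values, seasonal_indices):
    """Divide each value by its seasonal effect (index / 100).

    Assumption: multiplicative model, indices given as percentages.
    Writing v * 100 / s avoids forming the decimal effect first.
    """
    return [v * 100 / s for v, s in zip(values, seasonal_indices)]


cash_balance = [360, 400, 550, 360, 350, 550]   # Rs.'000, January to June
index = [120, 80, 110, 90, 70, 100]
adjusted = deseasonalise(cash_balance, index)
print(adjusted)
```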
[26] The following table gives the cash receipts and the seasonal indices for 12 months:
Months Jan. Feb. Mar. Apr. May. Jun. Jul. Aug. Sep. Oct. Nov. Dec.
Cash Receipts (millions of Rs.) 35.1 23.7 20.8 21.1 28.3 22.5 23.1 24.3 41.3 62.1 65.4 71.7
Seasonal Index 1.30 0.67 0.57 0.57 0.71 0.63 0.71 0.71 1.37 1.82 1.45 1.49
Eliminate the seasonal variations in the cash-receipts and discuss the significance
of such data.
Solution:
The ‘seasonal index’ shown here actually refers to the seasonal effect. (Note that ‘seasonal index’ is generally used to show the ‘percentage’ position and should be distinguished from ‘seasonal effect’, which indicates the effect ‘per unit’. The average seasonal index is 100, but the average seasonal effect is 1.) Since seasonal indices give ratio changes over the normal value, a multiplicative model is to be assumed for the data, Yt = T × S × C × I. In order to deseasonalise (i.e. eliminate the seasonal effect) it is therefore necessary to divide the data by the ‘seasonal effects’.
Deseasonalised data = Yt / S = (T × S × C × I) / S = T × C × I
Table: Deseasonalising Time Series Data
Month Cash receipts (mill. Rs.) yt Seasonal effect S Deseasonalised data = yt / S
January 35.1 1.30 27.0
February 23.7 0.67 35.4
March 20.8 0.57 36.5
April 21.1 0.57 37.0
May 28.3 0.71 39.9
June 22.5 0.63 35.7
July 23.1 0.71 32.5
August 24.3 0.71 34.2
September 41.3 1.37 30.1
October 62.1 1.82 34.1
November 65.4 1.45 45.1
December 71.7 1.49 48.1
The significance of the deseasonalised data is that cash receipts in the different months would have been as shown in the last column of the above table if they were not affected by seasonal fluctuations.
[27] The sales of a company rose from Rs. 60,000 in the month of August to Rs. 69,000 in the month of September. The seasonal indices for these two months are 105 and 140 respectively. The owner of the company was not at all satisfied with the rise of sales in the month of September by Rs. 9,000. He expected much more because of the seasonal index for that month. What was his estimate of sales for the month of September?
[I.C.W.A. 1977]
Solution:
The seasonal index for August is 105 and the actual sales were Rs. 60,000. On this basis the normal monthly sales would be 60000/1.05 = Rs. 57,143, and the expected sales during September, when the seasonal index is 140, would be (60000/1.05) × 1.40 = Rs. 80,000.
But the actual sales during September, viz. Rs. 69,000, are less than this expected figure of Rs. 80,000.
Hence, the company has fallen short of its seasonally expected sales, which justifies the owner’s dissatisfaction.
[28] Suppose that the secular trend of sales of a company is accurately described by
the equation Ye = 120,000 + 1000x where x represents a period of one month and has a
value 0 in December 1981. The seasonal indices for the company’s sales are as follows:
January 100, February 80, March 90, April 120, May 115, June 95, July 75, August
70, September 90, October 95, November 120, December 150.
Ignoring cyclical and random influences, forecast sales for (i) February 1983,
(ii) May 1986, (iii) December 1994.
[C.U., B.Sc. (Econ.) 1982]

Solution:
[i] From December 1981 to February 1983 = 14 months
[ii] From December 1981 to May 1986 = 53 months
[iii] From December 1981 to December 1994 = 156 months
Putting x = 14, 53 and 156 in the equation ye = 120,000 + 1,000x, we get the trend values as
[i] ye = 120,000 + 1000 × 14 = 120,000 + 14,000 = Rs. 134,000
[ii] ye = 120,000 + 1000 × 53 = 120,000 + 53,000 = Rs. 173,000
[iii] ye = 120,000 + 1000 × 156 = 120,000 + 156,000 = Rs. 276,000
Now, multiplying the trend values Rs. 134,000, Rs. 173,000 and Rs. 276,000 by the seasonal effects 0.80, 1.15 and 1.50 respectively, we get the forecast sales for
[i] February 1983 as Rs. 134,000 × 0.80 = Rs. 107,200
[ii] May 1986 as Rs. 173,000 × 1.15 = Rs. 198,950
[iii] December 1994 as Rs. 276,000 × 1.50 = Rs. 414,000.
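The two steps of this forecast (trend value, then multiplication by the seasonal effect) can be sketched in Python. The names SEASONAL_INDEX, months_since_origin and forecast are illustrative; the trend equation and the indices are those given in the problem.

```python
# Seasonal indices (percentages) keyed by month number, as given in the problem.
SEASONAL_INDEX = {1: 100, 2: 80, 3: 90, 4: 120, 5: 115, 6: 95,
                  7: 75, 8: 70, 9: 90, 10: 95, 11: 120, 12: 150}


def months_since_origin(year, month, origin=(1981, 12)):
    """Number of months from the origin (December 1981) to the given month."""
    return (year - origin[0]) * 12 + (month - origin[1])


def forecast(year, month):
    """Trend value times seasonal effect, ignoring cyclical and random terms."""
    x = months_since_origin(year, month)
    trend = 120_000 + 1_000 * x
    return trend * SEASONAL_INDEX[month] / 100


for when in [(1983, 2), (1986, 5), (1994, 12)]:
    print(when, months_since_origin(*when), forecast(*when))
```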
[29] Calculate the seasonal index from the following data using the average method:
Year 1st Qr 2nd Qr 3rd Qr 4th Qr
1974 72 68 80 70
1975 76 70 82 74
1976 74 66 84 80
1977 76 74 84 78
1978 78 74 86 82
[C.A. 1979]
Solution:
The method of quarterly averages seems appropriate here, since no appreciable
trend is noticed in the given data (note that the values in any quarter do not reveal any
definite tendency to change). The calculations are shown below, using the multiplicative
model.
Table: Calculation for Seasonal Index
Year Q1 Q2 Q3 Q4 Total
1974 72 68 80 70 -
1975 76 70 82 74 -
1976 74 66 84 80 -
1977 76 74 84 78 -
1978 78 74 86 82 -
Total 376 352 416 384 1528
A.M 75.2 70.4 83.2 76.8 305.6
Seasonal Index 98 92 109 101 400
Grand Average = 305.6 ÷ 4 = 76.4

Seasonal Index = (A.M ÷ Grand Average) × 100
(75.2 ÷ 76.4) × 100 = 98
(70.4 ÷ 76.4) × 100 = 92
(83.2 ÷ 76.4) × 100 = 109
(76.8 ÷ 76.4) × 100 = 101
Therefore, the seasonal indices are 98, 92, 109 and 101, using the multiplicative model.
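The method of quarterly averages used above can be sketched in Python. The variable names are illustrative, and the code assumes, as the solution does, that no trend needs to be removed first.

```python
# Quarterly figures for 1974 to 1978, one list of four quarters per year.
data = {
    1974: [72, 68, 80, 70],
    1975: [76, 70, 82, 74],
    1976: [74, 66, 84, 80],
    1977: [76, 74, 84, 78],
    1978: [78, 74, 86, 82],
}

# Average of each quarter over the years (the A.M. row of the table).
quarter_means = [sum(col) / len(col) for col in zip(*data.values())]

# Grand average of the four quarterly means.
grand_average = sum(quarter_means) / len(quarter_means)

# Seasonal index = (quarterly mean / grand average) x 100, rounded.
seasonal_index = [round(m / grand_average * 100) for m in quarter_means]
print(quarter_means, grand_average, seasonal_index)
```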

[30] Using the method of exponential smoothing, find forecasts for the following
sales data, taking an initial forecast 25 and a smoothing coefficient 0.4.
Day 1 2 3 4 5 6 7 8
Sales 26 28 23 27 24 30 26 27
[C.A. 1974]
Solution:
The exponentially smoothed average at time t is Ut = Ut–1 + αet,
where et = yt – Ut–1 is the “error”. Here we are given α = 0.4 and the initial forecast U0 = 25.
Table: Calculations for Exponential Smoothing
Day (t) Sales yt Previous forecast Ut–1 Error et = yt – Ut–1 αet Current forecast Ut
(1) (2) (3) (4) (5) (6)
1 26 25.00 1 0.4 25.40
2 28 25.40 2.6 1.04 26.44
3 23 26.44 -3.44 -1.38 25.06
4 27 25.06 1.94 0.78 25.84
5 24 25.84 -1.84 -0.74 25.10
6 30 25.10 4.90 1.96 27.06
7 26 27.06 -1.06 -0.42 26.64
8 27 26.64 0.36 0.14 26.78
Forecasted sales data are:
25.40, 26.44, 25.06, 25.84, 25.10, 27.06, 26.64, 26.78.
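The recurrence Ut = Ut–1 + α(yt – Ut–1) used in the table can be sketched in Python. The function name exponential_smoothing is illustrative; the code carries full precision between steps and rounds only for display, which gives the same two-decimal figures as the table here.

```python
def exponential_smoothing(values, initial, alpha):
    """Return the smoothed series U_t = U_{t-1} + alpha * (y_t - U_{t-1})."""
    forecasts = []
    previous = initial
    for y in values:
        error = y - previous          # e_t = y_t - U_{t-1}
        previous = previous + alpha * error
        forecasts.append(previous)
    return forecasts


sales = [26, 28, 23, 27, 24, 30, 26, 27]
smoothed = [round(u, 2) for u in exponential_smoothing(sales, initial=25, alpha=0.4)]
print(smoothed)
```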
CORRELATION AND REGRESSION

Concepts of ‘correlation’ and ‘regression’.
Correlation is concerned with the measurement of the ‘strength of association’ between
variables; while regression is concerned with the ‘prediction’ of the most likely value of
one variable when the value of the other variable is known.
Scatter Diagram
When statistical data relating to the simultaneous measurement on two variables are
available, each pair of observations can be geometrically represented by a point on the
graph paper – the values of one variable being shown along the X-axis and those of the
other variable along Y-axis. If there are n pairs of observations, finally the graph paper
will contain n points. This diagrammatic representation of bivariate data is known as
Scatter Diagram.
A scatter diagram indicates the nature of association between the two variables, i.e. the type of correlation between them. If the pattern of points (or dots) on the scatter diagram shows a linear path diagonally across the graph paper from the bottom left-hand corner to the top right, correlation will be positive (fig. a). In other words, association between the variables is direct, indicating thereby that high values of one variable are, in general, associated with high values of the other variable, and low values are associated with low values.
[Figure: six scatter diagrams, each with x on the horizontal axis and y on the vertical axis: (a) positive, (b) negative, (c) zero, (d) zero, (e) +1, (f) –1]
On the other hand, if the pattern of dots is such as to indicate a straight line path from the upper left-hand corner to the bottom right, correlation is negative, i.e. the association is inverse, high values of one variable being associated with low values of the other (fig. b).
When the dots do not indicate any straight line tendency, but form a swarm (fig. c) or a concentration around a curved line (fig. d), correlation is small. In fact, if no straight line tendency is noticed, correlation will be zero.
When the dots lie exactly on a straight line, correlation is perfect, the correlation coefficient being +1 or –1 according as the slope of the straight line is positive or negative (figs. e, f).
The scatter diagram also gives an indication of the degree of linear correlation between the variables, i.e. whether correlation is high or low. If the plotted points on the scatter diagram lie approximately on, or near about, a straight line (figs. e, f), the correlation coefficient will be nearly one, numerically. The more scattered the points are around a straight line, the less is the correlation coefficient (figs. a, b).
Correlation:
The word ‘correlation’ is used to denote the degree of association between variables.
If two variables x and y are so related that variations in the magnitude of one variable
tend to be accompanied by variations in the magnitude of the other variable, they are
said to be correlated. If y tends to increase as x increases, the variables are said to be positively correlated. If y tends to decrease as x increases, the variables are negatively correlated. If the values of y are not affected by changes in the values of x, the variables are said to be uncorrelated.
Covariance:
Given a set of n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn) relating to two variables x and y, the covariance of x and y, usually represented by cov (x, y), is defined as
$$\operatorname{cov}(x, y) = \frac{1}{n}\sum (x - \bar{x})(y - \bar{y})$$
Expanding the expression on the right, it can be shown that
$$\operatorname{cov}(x, y) = \frac{\sum xy}{n} - \left(\frac{\sum x}{n}\right)\left(\frac{\sum y}{n}\right)$$
Covariance has properties similar to those of variance, i.e. the square of S.D.
[i] If X = x – c and Y = y – c′ where c, c′ are constants, then cov (x, y) = cov (X, Y).
[ii] If u = (x – c)/d and v = (y – c′)/d′, where c, c′ d, d′ are constants then cov (x, y) = dd′ cov (u, v)
However, while variance can never be negative, covariance may be positive, negative or zero.
Correlation coefficient (r)
Let (x1, y1), (x2 , y2) ----------, (xn, yn ) be a given set of n pairs of observations on two
variables x and y. The correlation coefficient, or coefficient of correlation, between x
and y (denoted by the symbol r) is then defined as
$$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y},$$
where σx and σy are the standard deviations of x and y respectively and cov (x, y) denotes the covariance of x and y. This expression is known as Pearson’s product moment formula and is used as a measure of linear correlation between x and y.
The formula for r may be written in various other forms:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
$$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)}}$$
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}}}$$
Properties of correlation coefficient:
[i] The correlation coefficient r is independent of the choice of both origin and scale of observations. This means that if
$$u = \frac{x - c}{d} \quad \text{and} \quad v = \frac{y - c'}{d'}$$
where c, c′, d, d′ are arbitrary constants (d, d′ positive), then rxy = ruv.
[ii] The correlation coefficient r is a pure number and is independent of the units of measurement. This means that if, for example, x represents heights in inches and y weights in lbs, then the correlation coefficient between x and y will be neither in inches nor in lbs, nor in any other unit, but only a number.
[iii] The correlation coefficient r lies between –1 and +1, i.e. r cannot exceed 1 numerically.
–1 ≤ r ≤ +1
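The product moment formula and property [i] can be illustrated with a short Python sketch. The data below are hypothetical values chosen only for illustration, and the function name pearson_r is likewise an assumption; the formula used is the last of the equivalent forms of r, written in terms of raw sums.

```python
import math


def pearson_r(xs, ys):
    """r = [n*Sxy - Sx*Sy] / sqrt([n*Sxx - Sx^2] * [n*Syy - Sy^2])."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)

# Change of origin and scale: u = (x - 3)/2, v = (y - 4)/5 (d, d' positive).
us = [(x - 3) / 2 for x in xs]
vs = [(y - 4) / 5 for y in ys]
r_uv = pearson_r(us, vs)
print(round(r, 4), round(r_uv, 4))
```

The two printed values agree, which is what property [i] asserts, and both lie between –1 and +1 as required by property [iii].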
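Calculation of r
The correlation coefficient (r) is calculated from a given set of n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn) as follows:
[i] If X = x – c and Y = y – c′ (here c, c′ are constants), then
$$r_{xy} = r_{XY} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}$$
where
$$\sigma_X^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2, \qquad \sigma_Y^2 = \frac{\sum Y^2}{n} - \left(\frac{\sum Y}{n}\right)^2$$
$$\operatorname{cov}(X, Y) = \frac{\sum XY}{n} - \left(\frac{\sum X}{n}\right)\left(\frac{\sum Y}{n}\right)$$
[ii] If u = (x – c)/d and v = (y – c′)/d′ (here c, c′, d, d′ are constants and d, d′ are positive), then
$$r_{xy} = r_{uv} = \frac{\operatorname{cov}(u, v)}{\sigma_u \sigma_v}$$
where
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2, \qquad \sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2$$
$$\operatorname{cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right)$$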
Regression:
The word "regression" is used to denote estimation or prediction of the average value of one variable for a specified value of the other variable. The estimation is done by means of suitable equations, derived on the basis of available bivariate data. Such an equation is known as a Regression equation and its geometrical representation is called a Regression curve.
In linear regression (or simple regression) the relationship between the variables is assumed to be linear. The estimate of y (say y′) is obtained from an equation of the form
$$y' - \bar{y} = b_{yx}(x - \bar{x}) \qquad \text{(i)}$$
and the estimate of x (say x′) from another equation of the form
$$x' - \bar{x} = b_{xy}(y - \bar{y}) \qquad \text{(ii)}$$
Equation (i) is known as the Regression equation of y on x, and equation (ii) as the Regression equation of x on y. The coefficient byx appearing in the regression equation of y on x is known as the Regression coefficient of y on x. Similarly, bxy is called the Regression coefficient of x on y.
Properties of linear regression:
1. [i] Regression equation of y on x:
$$y - \bar{y} = b_{yx}(x - \bar{x}), \quad \text{where } b_{yx} = \frac{\operatorname{cov}(x, y)}{\sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x}$$
[ii] Regression equation of x on y:
$$x - \bar{x} = b_{xy}(y - \bar{y}), \quad \text{where } b_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_y^2} = r\,\frac{\sigma_x}{\sigma_y}$$
2. The product of the two regression coefficients is equal to the square of the correlation coefficient: byx · bxy = r²
3. r, byx and bxy all have the same sign. If the correlation coefficient r is zero, the regression coefficients byx and bxy are also zero.
4. The regression lines always intersect at the point (x̄, ȳ). The slopes of the regression line of y on x and the regression line of x on y are respectively byx and 1/bxy.
5. The angle between the two regression lines depends on the correlation coefficient r. When r = 0 the two lines are perpendicular to each other. When r = +1 or r = –1, they coincide. As r increases numerically from 0 to 1, the angle between the regression lines diminishes from 90° to 0°.
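Properties 1 and 2 can be illustrated with a short Python sketch. The data are hypothetical values chosen only for illustration, and the function name regression_coefficients is an assumption; the coefficients are computed as cov (x, y)/σx² and cov (x, y)/σy², using divisor n as in the definitions above.

```python
def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, mean_x, mean_y) from paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / var_x, cov / var_y, mean_x, mean_y


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_yx, b_xy, mean_x, mean_y = regression_coefficients(xs, ys)

r_squared = b_yx * b_xy                     # property 2: b_yx * b_xy = r^2
y_at_6 = mean_y + b_yx * (6 - mean_x)       # estimate of y when x = 6
print(b_yx, b_xy, r_squared, y_at_6)
```

Both regression lines pass through the point of means, so the estimate of y at x = mean_x is simply mean_y, in line with property 4.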
Rank Correlation:
www.gayali.in
The correlation coefficient between the two series of ranks is called ‘Rank Correlation
Coefficient’ . It is given by the formula,
b Σd 2
R =1− - - - (1)
n3 − n
where d represents the difference of the ranks of an individual in the two characters
and n is the number of individuals. This formula is also known as ‘Spearman’s formula
for rank correlation coefficient.
The rank correlation coefficient lies between –1 and +1.
–1 ≤ R ≤ + 1
It has the maximum value +1, when the ranks in the two characters are equal. Again R
has the minimum value -1, when the ranks are just the opposite.
In the calculation of the rank correlation coefficient from given scores, if several individuals
have the same score in any character, they must be allotted the same rank, and we are
then concerned with what are known as ‘tied ranks’. In dealing with such cases, the
usual way is to allot the average rank to each of these individuals, and then calculate
the product-moment correlation coefficient from these ranks. However, in such cases
one way of correcting formula (1) is to increase $\sum d^2$ by $(t^3 - t)/12$ in respect of each tie,
where t denotes the number of individuals involved in a tie, whether in the first or
second series. The modified formula for the rank correlation coefficient, when there are
ties, is then
$$R' = 1 - \frac{6\left\{\sum d^2 + \sum (t^3 - t)/12\right\}}{n^3 - n}$$
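The tie-corrected formula can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the helper names (`average_ranks`, `tie_correction`, `spearman_rank_correlation`) are invented here, and rank 1 is assumed to go to the highest score.

```python
from collections import Counter

def average_ranks(scores):
    """Rank scores from highest (rank 1) down; tied scores share their average rank."""
    order = sorted(scores, reverse=True)
    first_pos = {}
    for pos, s in enumerate(order, start=1):
        first_pos.setdefault(s, pos)
    counts = Counter(scores)
    # t tied scores occupy positions p, p+1, ..., p+t-1, whose mean is p + (t-1)/2.
    return [first_pos[s] + (counts[s] - 1) / 2 for s in scores]

def tie_correction(scores):
    """Sum of (t^3 - t)/12 over every group of t tied scores."""
    return sum((t**3 - t) / 12 for t in Counter(scores).values() if t > 1)

def spearman_rank_correlation(xs, ys):
    """Rank correlation R', which reduces to formula (1) when there are no ties."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n**3 - n)
```

When the two series rank the individuals identically the function returns +1, and when the ranks are exactly opposite it returns -1, in line with the limits stated above.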

Exercise
[1] The data given below relate to the heights and weights of 20 persons. You are
required to form a two-way frequency table with class-intervals 62" to 64", 64" to 66"
and so on, and 115 to 125 lbs, 125 to 135 lbs, and so on.
Sl no. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Height 70 65 65 64 69 63 65 70 71 62 70 67 63 68 67 69 66 68 67 67
weight 170 135 136 137 148 124 117 128 143 129 163 139 122 134 140 132 120 148 129 152
[C.A. 1966]
Solution:
Table: Two-way Frequency Table Showing Heights and Weights of 20 Persons.
                          Height (inches)
Weight (lbs.)   62-64   64-66   66-68   68-70   70-72   Total
115-125           2       1       1       -       -       4
125-135           1       -       1       2       1       5
135-145           -       3       2       -       1       6
145-155           -       -       1       2       -       3
155-165           -       -       -       -       1       1
165-175           -       -       -       -       1       1
Total             3       4       5       4       4      20
Note. The class intervals 62-64, 64-66, etc. represent “62 and above but below 64”, “64
and above but below 66”, etc.
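The tabulation above can be reproduced with a short Python sketch. The data are those of the exercise; the helper `class_start` is a hypothetical name introduced only for this illustration, and it applies the "lower limit and above but below upper limit" rule from the note.

```python
from collections import Counter

heights = [70, 65, 65, 64, 69, 63, 65, 70, 71, 62, 70, 67, 63, 68, 67, 69, 66, 68, 67, 67]
weights = [170, 135, 136, 137, 148, 124, 117, 128, 143, 129,
           163, 139, 122, 134, 140, 132, 120, 148, 129, 152]

def class_start(value, lowest, width):
    """Lower limit of the class that contains value (lower limit inclusive)."""
    return lowest + ((value - lowest) // width) * width

# Cell frequencies keyed by (weight class lower limit, height class lower limit).
table = Counter(
    (class_start(w, 115, 10), class_start(h, 62, 2))
    for h, w in zip(heights, weights)
)

# Marginal totals for each weight class and each height class.
weight_totals = Counter()
height_totals = Counter()
for (w_lo, h_lo), f in table.items():
    weight_totals[w_lo] += f
    height_totals[h_lo] += f

for w_lo in sorted(weight_totals):
    row = [table.get((w_lo, h_lo), 0) for h_lo in sorted(height_totals)]
    print(f"{w_lo}-{w_lo + 10}", row, weight_totals[w_lo])
```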
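[2] Calculate r from the following results:
n = 10; $\sum x = 125$; $\sum x^2 = 1585$; $\sum y = 80$; $\sum y^2 = 650$; $\sum xy = 1007$.
[C.A. - 1966]
Solution:
$$\operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \left(\frac{\sum x}{n}\right)\left(\frac{\sum y}{n}\right) = \frac{1007}{10} - \frac{125}{10} \times \frac{80}{10} = \frac{10070 - 10000}{100} = \frac{70}{100} = \frac{7}{10}$$
$$\sigma_x = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{1585}{10} - \left(\frac{125}{10}\right)^2} = \sqrt{\frac{15850 - 15625}{100}} = \sqrt{\frac{225}{100}} = \frac{15}{10}$$
$$\sigma_y = \sqrt{\frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2} = \sqrt{\frac{650}{10} - \left(\frac{80}{10}\right)^2} = \sqrt{\frac{6500 - 6400}{100}} = \sqrt{\frac{100}{100}} = 1$$
$$\therefore r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{7/10}{(15/10) \times 1} = \frac{7}{15} = 0.47$$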
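The computation of r from the five summary totals follows the same three steps every time, so it can be sketched as a small Python function. The function name `pearson_r_from_sums` is an assumption made for this illustration.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Correlation coefficient from n and the totals of x, y, x^2, y^2 and xy."""
    cov = sum_xy / n - (sum_x / n) * (sum_y / n)    # Cov(x, y)
    var_x = sum_x2 / n - (sum_x / n) ** 2           # variance of x
    var_y = sum_y2 / n - (sum_y / n) ** 2           # variance of y
    return cov / math.sqrt(var_x * var_y)

# Totals of exercise [2]
print(round(pearson_r_from_sums(10, 125, 80, 1585, 650, 1007), 2))
```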
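[3] Find the coefficient of correlation from the following results:
$$\sum_{i=1}^{8} X = 42.2,\quad \sum_{i=1}^{8} Y = 46.4,\quad \sum_{i=1}^{8} X^2 = 291.20,\quad \sum_{i=1}^{8} Y^2 = 290.52,\quad \sum_{i=1}^{8} XY = 230.42$$
Solution :
$$\operatorname{Cov}(X, Y) = \frac{\sum XY}{N} - \left(\frac{\sum X}{N}\right)\left(\frac{\sum Y}{N}\right) = \frac{230.42}{8} - \frac{42.2}{8} \times \frac{46.4}{8} = \frac{1843.36 - 1958.08}{64} = -\frac{114.72}{64}$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} = \sqrt{\frac{291.20}{8} - \left(\frac{42.2}{8}\right)^2} = \sqrt{\frac{2329.60 - 1780.84}{64}} = \sqrt{\frac{548.76}{64}} = \frac{23.43}{8}$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2} = \sqrt{\frac{290.52}{8} - \left(\frac{46.4}{8}\right)^2} = \sqrt{\frac{2324.16 - 2152.96}{64}} = \sqrt{\frac{171.20}{64}} = \frac{13.08}{8}$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-114.72/64}{\dfrac{23.43}{8} \times \dfrac{13.08}{8}} = -\frac{114.72}{306.46} = -0.37$$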
[4] Obtain the correlation coefficient from the following:
x 6 2 10 4 8
y 9 11 5 8 7
[D.S.W. 1977]
Solution:
Table - Calculations for correlation coefficient
x y X=x–6 Y=y–8 X2 Y2 XY
6 9 0 1 0 1 0
2 11 -4 3 16 9 -12
10 5 4 -3 16 9 -12
4 8 -2 0 4 0 0
8 7 2 -1 4 1 -2
Total 30 40 0 0 40 20 -26
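$$\operatorname{Cov}(X, Y) = \frac{\sum XY}{N} - \left(\frac{\sum X}{N}\right)\left(\frac{\sum Y}{N}\right) = \frac{-26}{5} - 0 \times 0 = \frac{-26}{5} = -5.2$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} = \sqrt{\frac{40}{5} - \left(\frac{0}{5}\right)^2} = \sqrt{8}$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2} = \sqrt{\frac{20}{5} - \left(\frac{0}{5}\right)^2} = \sqrt{4} = 2$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-5.2}{\sqrt{8} \times 2} = -\frac{5.2}{5.66} = -0.92$$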
[5] Calculate the coefficient of correlation for the ages of husband and wife:
Age of husband 23 27 28 29 30 31 33 35 36 39
Age of wife 18 22 23 24 25 26 28 29 30 32
[I,C,W,A, 1970]
Solution:
Table - Calculations for Correlation Coefficient
x y X = x – 31 Y = y – 26 X2 Y2 XY
23 18 -8 -8 64 64 64
27 22 -4 -4 16 16 16
28 23 -3 -3 9 9 9
29 24 -2 -2 4 4 4
30 25 -1 -1 1 1 1
31 26 0 0 0 0 0
33 28 2 2 4 4 4
35 29 4 3 16 9 12
36 30 5 4 25 16 20
39 32 8 6 64 36 48
Total - - 1 -3 203 159 178
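$$\operatorname{Cov}(X, Y) = \frac{\sum XY}{N} - \left(\frac{\sum X}{N}\right)\left(\frac{\sum Y}{N}\right) = \frac{178}{10} - \frac{1}{10} \times \frac{-3}{10} = \frac{1780 + 3}{100} = \frac{1783}{100}$$
$$\sigma_X^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2 = \frac{203}{10} - \left(\frac{1}{10}\right)^2 = \frac{2030 - 1}{100} = \frac{2029}{100}$$
$$\sigma_Y^2 = \frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2 = \frac{159}{10} - \left(\frac{-3}{10}\right)^2 = \frac{1590 - 9}{100} = \frac{1581}{100}$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{1783/100}{\sqrt{\dfrac{2029}{100}} \times \sqrt{\dfrac{1581}{100}}} = \frac{1783}{\sqrt{2029} \times \sqrt{1581}} = \frac{1783}{45.04 \times 39.76} = \frac{1783}{1790.79} = 0.996$$
∴ Correlation coefficient for ages of husband and wife (r) = 0.996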
[6] Calculate the correlation coefficient rxy from the following:
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
[D.M., 1978]
Solution :
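Table : Calculations for correlation coefficient
x y X = x – 67 Y = y – 72 X² Y² XY
65 67 -2 -5 4 25 10
66 68 -1 -4 1 16 4
67 65 0 -7 0 49 0
67 68 0 -4 0 16 0
68 72 1 0 1 0 0
69 72 2 0 4 0 0
70 69 3 -3 9 9 -9
72 71 5 -1 25 1 -5
Total - - 8 -24 44 116 0
$$\operatorname{Cov}(X, Y) = \frac{\sum XY}{N} - \left(\frac{\sum X}{N}\right)\left(\frac{\sum Y}{N}\right) = \frac{0}{8} - \frac{8}{8} \times \frac{-24}{8} = 3.00$$
$$\sigma_X^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2 = \frac{44}{8} - \left(\frac{8}{8}\right)^2 = \frac{352 - 64}{64} = \frac{288}{64} = 4.50$$
$$\sigma_X = \sqrt{4.50} = 2.12$$
$$\sigma_Y^2 = \frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2 = \frac{116}{8} - \left(\frac{-24}{8}\right)^2 = \frac{928 - 576}{64} = \frac{352}{64} = 5.50$$
$$\sigma_Y = \sqrt{5.50} = 2.35$$
$$r_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{3.00}{2.12 \times 2.35} = \frac{3.00}{4.98} = 0.60$$
Since the correlation coefficient is unaffected by a change of origin, $r_{xy} = r_{XY} = 0.60$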
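The last step relies on r being unaffected by a change of origin. A short Python check on the data of this exercise is sketched below; the function name `pearson_r` is an assumption made for the illustration.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient computed directly from the data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

x = [65, 66, 67, 67, 68, 69, 70, 72]
y = [67, 68, 65, 68, 72, 72, 69, 71]
X = [xi - 67 for xi in x]   # change of origin for x
Y = [yi - 72 for yi in y]   # change of origin for y

r_raw = pearson_r(x, y)
r_shifted = pearson_r(X, Y)
print(round(r_raw, 2), round(r_shifted, 2))
```

Both values agree, which is why the working may use any convenient origin without adjusting the final answer.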
[7] Calculate the coefficient of correlation between x and y :
x 155 157 153 151 159 162 158
y 118 129 125 124 129 133 127
[C.U.B.A.(Econ), 1975]
Solution :
Table : Calculation for Correlation Coefficient
x y u = x – 155 v = y – 124 u2 v2 uv
155 118 0 -6 0 36 0
157 129 2 5 4 25 10
153 125 -2 1 4 1 -2
151 124 -4 0 16 0 0
159 129 4 5 16 25 20
162 133 7 9 49 81 63
158 127 3 3 9 9 9
Total - - 10 17 98 177 100
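$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{100}{7} - \frac{10}{7} \times \frac{17}{7} = \frac{700 - 170}{49} = \frac{530}{49}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{98}{7} - \left(\frac{10}{7}\right)^2 = \frac{686 - 100}{49} = \frac{586}{49}, \qquad \sigma_u = \sqrt{\frac{586}{49}} = \frac{\sqrt{586}}{7}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{177}{7} - \left(\frac{17}{7}\right)^2 = \frac{1239 - 289}{49} = \frac{950}{49}, \qquad \sigma_v = \sqrt{\frac{950}{49}} = \frac{\sqrt{950}}{7}$$
$$r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{530/49}{\dfrac{\sqrt{586}}{7} \times \dfrac{\sqrt{950}}{7}} = \frac{530}{\sqrt{586} \times \sqrt{950}} = \frac{530}{24.21 \times 30.82} = \frac{530}{746.15} = 0.71$$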
[8] Calculate Pearson’s coefficient of correlation from the following data using 44
and 26 as the origins of X and Y respectively.
X 43 44 46 40 44 42 45 42 38 40 42 57
Y 29 31 19 18 19 27 27 29 41 30 26 10
[C.A. 1978]
Solution:
Table - Calculations for correlation Coefficient
X Y x = X – 44 y = Y – 26 x² y² xy
43 29 -1 3 1 9 -3
44 31 0 5 0 25 0
46 19 2 -7 4 49 -14
40 18 -4 -8 16 64 32
44 19 0 -7 0 49 0
42 27 -2 1 4 1 -2
45 27 1 1 1 1 1
42 29 -2 3 4 9 -6
38 41 -6 15 36 225 -90
40 30 -4 4 16 16 -16
42 26 -2 0 4 0 0
57 10 13 -16 169 256 -208
Total - - -5 -6 255 704 -306
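$$\operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \left(\frac{\sum x}{n}\right)\left(\frac{\sum y}{n}\right) = \frac{-306}{12} - \frac{-5}{12} \times \frac{-6}{12} = \frac{-3672 - 30}{144} = \frac{-3702}{144}$$
$$\sigma_x^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{255}{12} - \left(\frac{-5}{12}\right)^2 = \frac{3060 - 25}{144} = \frac{3035}{144}, \qquad \sigma_x = \frac{\sqrt{3035}}{12}$$
$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{704}{12} - \left(\frac{-6}{12}\right)^2 = \frac{8448 - 36}{144} = \frac{8412}{144}, \qquad \sigma_y = \frac{\sqrt{8412}}{12}$$
$$\therefore r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{-3702/144}{\dfrac{\sqrt{3035}}{12} \times \dfrac{\sqrt{8412}}{12}} = \frac{-3702}{55.09 \times 91.72} = -0.733$$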
[9] Specimens of similarly treated alloy steel containing various percentages of
nickel are tested for toughness with the following results:
Toughness (arbitrary units) 47 50 52 52 54 56 58 59 60 60 62 64 65 66
Percentage of Nickel 2.7 2.7 2.8 2.8 2.9 3.2 3.2 3.3 3.4 3.5 3.5 3.6 3.7 3.8
Find the coefficient of correlation between ‘toughness’ as measured by the test,
and ‘percentage content of nickel’ in the alloy steel.
[C.U., M.Com 1971]
Solution:
x y u = x – 58 v = y – 3.2 u2 v2 uv
47 2.7 -11 -0.5 121 0.25 5.5
50 2.7 -8 -0.5 64 0.25 4.0
52 2.8 -6 -0.4 36 0.16 2.4
52 2.8 -6 -0.4 36 0.16 2.4
54 2.9 -4 -0.3 16 .09 1.2
56 3.2 -2 0 4 0 0
58 3.2 0 0 0 0 0
59 3.3 1 0.1 1 0.01 0.1
60 3.4 2 0.2 4 0.04 0.4
60 3.5 2 0.3 4 0.09 0.6
62 3.5 4 0.3 16 0.09 1.2
64 3.6 6 0.4 36 0.16 2.4
65 3.7 7 0.5 49 0.25 3.5
66 3.8 8 0.6 64 0.36 4.8
Total - - -7 0.3 451 1.91 28.5
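$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{28.5}{14} - \frac{-7}{14} \times \frac{0.3}{14} = \frac{399 + 2.1}{196} = \frac{401.1}{196}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{451}{14} - \left(\frac{-7}{14}\right)^2 = \frac{6314 - 49}{196} = \frac{6265}{196}, \qquad \sigma_u = \frac{\sqrt{6265}}{14}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{1.91}{14} - \left(\frac{0.3}{14}\right)^2 = \frac{26.74 - 0.09}{196} = \frac{26.65}{196}, \qquad \sigma_v = \frac{\sqrt{26.65}}{14}$$
$$\therefore r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{401.1/196}{\dfrac{\sqrt{6265}}{14} \times \dfrac{\sqrt{26.65}}{14}} = \frac{401.1}{\sqrt{6265} \times \sqrt{26.65}} = \frac{401.1}{79.15 \times 5.16} = 0.98$$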
[10] Calculate the coefficient of correlation from the following data:

Export of raw cotton (Rs.crores) 42 44 58 55 89 98 66
Import of manufactured goods (Rs.crores) 56 49 53 58 65 76 58
Calculate also the standard error of the coefficient of correlation.
[I.C.W.A., 1965]

Solution:
x y u = x – 55 v = y – 58 u2 v2 uv
42 56 -13 -2 169 4 26
44 49 -11 -9 121 81 99
58 53 3 -5 9 25 -15
55 58 0 0 0 0 0
89 65 34 7 1156 49 238
98 76 43 18 1849 324 774
66 58 11 0 121 0 0
Total - - 67 9 3425 483 1122
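$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{1122}{7} - \left(\frac{67}{7}\right)\left(\frac{9}{7}\right) = \frac{7854 - 603}{49} = \frac{7251}{49}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{3425}{7} - \left(\frac{67}{7}\right)^2 = \frac{23975 - 4489}{49} = \frac{19486}{49}, \qquad \sigma_u = \frac{\sqrt{19486}}{7}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{483}{7} - \left(\frac{9}{7}\right)^2 = \frac{3381 - 81}{49} = \frac{3300}{49}, \qquad \sigma_v = \frac{\sqrt{3300}}{7}$$
$$\therefore r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{7251/49}{\dfrac{\sqrt{19486}}{7} \times \dfrac{\sqrt{3300}}{7}} = \frac{7251}{139.59 \times 57.45} = \frac{7251}{8019.45} = 0.90$$
$$\text{Standard Error} = \frac{1 - r^2}{\sqrt{n}} = \frac{1 - 0.90^2}{\sqrt{7}} = \frac{1 - 0.81}{2.65} = \frac{0.19}{2.65} = 0.07$$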
[11] Determine the correlation coefficient between x and y
x 5 7 9 11 13 15
y 1.7 2.4 2.8 3.4 3.7 4.4
[Dip Management 1967]
Solution:
Table - Calculations for Correlation coefficient
x y u=x–9 v = y – 2.8 u2 v2 uv
5 1.7 -4 -1.1 16 1.21 4.4
7 2.4 -2 -0.4 4 0.16 0.8
9 2.8 0 0 0 0 0
11 3.4 2 0.6 4 0.36 1.2
13 3.7 4 0.9 16 0.81 3.6
15 4.4 6 1.6 36 2.56 9.6
Total - - 6 1.6 76 5.10 19.6

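$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{19.6}{6} - \frac{6}{6} \times \frac{1.6}{6} = \frac{117.6 - 9.6}{36} = \frac{108.0}{36}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{76}{6} - \left(\frac{6}{6}\right)^2 = \frac{456 - 36}{36} = \frac{420}{36}, \qquad \sigma_u = \frac{\sqrt{420}}{6}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{5.10}{6} - \left(\frac{1.6}{6}\right)^2 = \frac{30.6 - 2.56}{36} = \frac{28.04}{36}, \qquad \sigma_v = \frac{\sqrt{28.04}}{6}$$
$$r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{108/36}{\dfrac{\sqrt{420}}{6} \times \dfrac{\sqrt{28.04}}{6}} = \frac{108}{\sqrt{420} \times \sqrt{28.04}} = \frac{108}{20.49 \times 5.30} = \frac{108}{108.60} = 0.995$$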
[12] From the following figures, calculate the coefficient of correlation between the
income and the general level of prices:
Income (X) 360 420 500 550 600 640 680 720 750
General Level of Prices (Y) 100 104 115 160 180 290 300 320 330
[C.U., M.com 1968]
Solution:
X Y u = (X − 600)/10 v = Y – 180 u² v² uv
360 100 -24 -80 576 6400 1920
420 104 -18 -76 324 5776 1368
500 115 -10 -65 100 4225 650
550 160 -5 -20 25 400 100
600 180 0 0 0 0 0
640 290 4 110 16 12100 440
680 300 8 120 64 14400 960
720 320 12 140 144 19600 1680
750 330 15 150 225 22500 2250
Total - - -18 279 1474 85401 9368
$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{9368}{9} - \frac{-18}{9} \times \frac{279}{9} = \frac{84312 + 5022}{81} = \frac{89334}{81}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{1474}{9} - \left(\frac{-18}{9}\right)^2 = \frac{13266 - 324}{81} = \frac{12942}{81}, \qquad \sigma_u = \frac{\sqrt{12942}}{9}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{85401}{9} - \left(\frac{279}{9}\right)^2 = \frac{768609 - 77841}{81} = \frac{690768}{81}, \qquad \sigma_v = \frac{\sqrt{690768}}{9}$$
$$r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{89334/81}{\dfrac{\sqrt{12942}}{9} \times \dfrac{\sqrt{690768}}{9}} = \frac{89334}{113.76 \times 831.12} = \frac{89334}{94548.21} = 0.94$$
[13] The following data give the hardness (x) and tensile strength (y) for some
specimens of a material in certain units. Find the correlation coefficient and calculate
its probable error:
x 23.3 17.5 17.8 20.7 18.1 20.9 22.9 20.8
y 4.2 3.8 4.6 3.2 5.2 4.7 4.4 5.6
[I.C.W.A., 1972]
Solution :
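Table - Calculations for Correlation Coefficient
x y u = (x − 20.7)/0.1 v = (y − 3.2)/0.1 u² v² uv
23.3 4.2 26 10 676 100 260
17.5 3.8 -32 6 1024 36 -192
17.8 4.6 -29 14 841 196 -406
20.7 3.2 0 0 0 0 0
18.1 5.2 -26 20 676 400 -520
20.9 4.7 2 15 4 225 30
22.9 4.4 22 12 484 144 264
20.8 5.6 1 24 1 576 24
Total - - -36 101 3706 1677 -540
$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{-540}{8} - \frac{-36}{8} \times \frac{101}{8} = \frac{-4320 + 3636}{64} = \frac{-684}{64}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{3706}{8} - \left(\frac{-36}{8}\right)^2 = \frac{29648 - 1296}{64} = \frac{28352}{64}, \qquad \sigma_u = \frac{\sqrt{28352}}{8}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{1677}{8} - \left(\frac{101}{8}\right)^2 = \frac{13416 - 10201}{64} = \frac{3215}{64}, \qquad \sigma_v = \frac{\sqrt{3215}}{8}$$
$$r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{-684/64}{\dfrac{\sqrt{28352}}{8} \times \dfrac{\sqrt{3215}}{8}} = -\frac{684}{\sqrt{28352} \times \sqrt{3215}} = -\frac{684}{168.38 \times 56.70} = -\frac{684}{9547.15} = -0.072$$
$$\text{Probable Error} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (-0.072)^2}{\sqrt{8}} = 0.6745 \times \frac{1 - 5184/1000000}{2.83} = 0.6745 \times \frac{0.994816}{2.83} = 0.237$$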
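The standard error used in exercise [10] and the probable error used here differ only by the factor 0.6745. A minimal Python sketch of both formulas as they appear in these solutions follows; the function names are assumptions made for the illustration.

```python
import math

def standard_error_of_r(r, n):
    """Standard error of r as used in exercise [10]: (1 - r^2) / sqrt(n)."""
    return (1 - r**2) / math.sqrt(n)

def probable_error_of_r(r, n):
    """Probable error of r as used in exercise [13]: 0.6745 times the standard error."""
    return 0.6745 * standard_error_of_r(r, n)

print(round(standard_error_of_r(0.90, 7), 2))     # exercise [10]
print(round(probable_error_of_r(-0.072, 8), 3))   # exercise [13]
```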
[14] The following table gives the saving bank deposits in billions of dollars and
strikes and lock-outs in thousands over a number of years. Compute the correlation
coefficient and comment on the result.
Saving deposits 5.1 5.4 5.5 5.9 6.5 6.0 7.2
Strikes and lock-outs 3.8 4.4 3.3 3.6 3.3 2.3 1.0
[I.C.W.A., 1964]
Solution:
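x y u = (x − 5.9)/0.1 v = (y − 3.6)/0.1 u² v² uv
5.1 3.8 –8 2 64 4 –16
5.4 4.4 –5 8 25 64 –40
5.5 3.3 –4 –3 16 9 12
5.9 3.6 0 0 0 0 0
6.5 3.3 6 –3 36 9 –18
6.0 2.3 1 –13 1 169 –13
7.2 1.0 13 –26 169 676 –338
Total – – 3 –35 311 931 –413
$$\operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{-413}{7} - \frac{3}{7} \times \frac{-35}{7} = \frac{-2891 + 105}{49} = \frac{-2786}{49}$$
$$\sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{311}{7} - \left(\frac{3}{7}\right)^2 = \frac{2177 - 9}{49} = \frac{2168}{49}, \qquad \sigma_u = \frac{\sqrt{2168}}{7} = \frac{46.56}{7}$$
$$\sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{931}{7} - \left(\frac{-35}{7}\right)^2 = \frac{6517 - 1225}{49} = \frac{5292}{49}, \qquad \sigma_v = \frac{\sqrt{5292}}{7} = \frac{72.75}{7}$$
$$r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{-2786/49}{\dfrac{46.56}{7} \times \dfrac{72.75}{7}} = \frac{-2786}{46.56 \times 72.75} = \frac{-2786}{3387.24} = -0.82$$
Saving deposits and strikes and lock-outs are fairly highly negatively correlated: as deposits rise, strikes and lock-outs tend to fall.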
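[15] Two positively correlated variables $x_1$ and $x_2$ have variances $\sigma_1^2$ and $\sigma_2^2$ respectively.
Determine the value of the constant ‘a’ such that $x_1 + a x_2$ and $x_1 + \dfrac{\sigma_1}{\sigma_2} x_2$ are uncorrelated.
[B.U., B.A (Econ), 1972]
Solution :
Here, let $u = x_1 + a x_2$ and $v = x_1 + \dfrac{\sigma_1}{\sigma_2} x_2$.
If u, v are uncorrelated then
$$r = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = 0$$
or, $\operatorname{Cov}(u, v) = \dfrac{1}{n} \sum (u - \bar{u})(v - \bar{v}) = 0$
or, $\dfrac{1}{n} \sum \left\{(x_1 - \bar{x}_1) + a(x_2 - \bar{x}_2)\right\}\left\{(x_1 - \bar{x}_1) + \dfrac{\sigma_1}{\sigma_2}(x_2 - \bar{x}_2)\right\} = 0$
or, $\dfrac{1}{n}\left[\sum (x_1 - \bar{x}_1)^2 + a \sum (x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + \dfrac{\sigma_1}{\sigma_2} \sum (x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + a \dfrac{\sigma_1}{\sigma_2} \sum (x_2 - \bar{x}_2)^2\right] = 0$
or, $\sigma_1^2 + a \operatorname{cov}(x_1, x_2) + \dfrac{\sigma_1}{\sigma_2} \operatorname{cov}(x_1, x_2) + a \dfrac{\sigma_1}{\sigma_2} \sigma_2^2 = 0$
or, $\sigma_1(\sigma_1 + a\sigma_2) + \dfrac{\operatorname{cov}(x_1, x_2)}{\sigma_2}(\sigma_1 + a\sigma_2) = 0$
or, $(\sigma_1 + a\sigma_2)\left\{\sigma_1 + \dfrac{\operatorname{cov}(x_1, x_2)}{\sigma_2}\right\} = 0$
Since $x_1$ and $x_2$ are positively correlated, $\operatorname{cov}(x_1, x_2) > 0$, so the second factor is positive and cannot vanish. Hence
$\sigma_1 + a\sigma_2 = 0$
or, $a = -\dfrac{\sigma_1}{\sigma_2}$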
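[16] Given $\sum x = 56$, $\sum y = 40$, $\sum x^2 = 524$, $\sum y^2 = 256$, $\sum xy = 364$, n = 8.
Find (i) the correlation coefficient and (ii) the regression equation of x on y.
[I.C.W.A, 1967]
Solution:
(i)
$$\operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \left(\frac{\sum x}{n}\right)\left(\frac{\sum y}{n}\right) = \frac{364}{8} - \frac{56}{8} \times \frac{40}{8} = \frac{2912 - 2240}{64} = \frac{672}{64}$$
$$\sigma_x^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{524}{8} - \left(\frac{56}{8}\right)^2 = \frac{4192 - 3136}{64} = \frac{1056}{64}, \qquad \therefore \sigma_x = \frac{\sqrt{1056}}{8} = \frac{32.50}{8}$$
$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{256}{8} - \left(\frac{40}{8}\right)^2 = \frac{2048 - 1600}{64} = \frac{448}{64}, \qquad \therefore \sigma_y = \frac{\sqrt{448}}{8} = \frac{21.17}{8}$$
$$r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{672/64}{\dfrac{32.50}{8} \times \dfrac{21.17}{8}} = \frac{672}{688.03} = 0.98$$
(ii) The regression equation of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$, where $b_{xy} = \dfrac{\operatorname{Cov}(x, y)}{\sigma_y^2}$
Here, $\bar{x} = \dfrac{\sum x}{n} = \dfrac{56}{8} = 7$, $\bar{y} = \dfrac{\sum y}{n} = \dfrac{40}{8} = 5$
$\operatorname{Cov}(x, y) = \dfrac{672}{64}$ and $\sigma_y^2 = \dfrac{448}{64}$, as in (i) above
$$\therefore b_{xy} = \frac{672/64}{448/64} = \frac{672}{448} = 1.5$$
∴ The regression equation of x on y is then x – 7 = 1.5(y – 5)
or, x = 1.5y – 7.5 + 7 = 1.5y – 0.5 ∴ x = 1.5y – 0.5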
[17] The following sums have been obtained from 100 observations - pairs:
∑x = 12,500, ∑y = 8,000, ∑x2 = 1,585,000, ∑y2 = 648,100, ∑xy = 1,007,425
(i) Find the regression of y on x, and estimate the value of y when x=130
(ii) Compute the correlation coefficient (r) between x and y and state what you
learn from the value of r obtained by you,
[C.U., B.A.(Econ), 1976]
Solution:
[i] The regression of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$, where $b_{yx} = \dfrac{\operatorname{Cov}(x, y)}{\sigma_x^2}$
$$\operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \frac{\sum x}{n} \times \frac{\sum y}{n} = \frac{1{,}007{,}425}{100} - \frac{12{,}500}{100} \times \frac{8{,}000}{100} = \frac{1{,}007{,}425 - 1{,}000{,}000}{100} = \frac{7425}{100} = 74.25$$
$$\sigma_x^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{1{,}585{,}000}{100} - \left(\frac{12{,}500}{100}\right)^2 = 15850 - 15625 = 225$$
$$b_{yx} = \frac{74.25}{225} = 0.33$$
$$\bar{x} = \frac{\sum x}{n} = \frac{12{,}500}{100} = 125, \qquad \bar{y} = \frac{\sum y}{n} = \frac{8{,}000}{100} = 80$$
∴ The regression equation of y on x is y – 80 = 0.33(x – 125)
Or, y = 0.33x – 41.25 + 80 = 0.33x + 38.75
∴ y = 0.33x + 38.75
when x = 130, y = 0.33 × 130 + 38.75 = 42.9 + 38.75 = 81.65
(ii)
$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{648{,}100}{100} - \left(\frac{8{,}000}{100}\right)^2 = 6481 - 6400 = 81$$
∴ $\sigma_y = \sqrt{81} = 9$; $\sigma_x^2 = 225$ from (i), so $\sigma_x = 15$
We know, $b_{yx} = r \times \dfrac{\sigma_y}{\sigma_x}$
∴ $0.33 = r \times \dfrac{9}{15}$
or, $r = \dfrac{0.33 \times 15}{9} = 0.55$
x and y are moderately positively correlated.
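The regression of y on x in part (i) can be sketched in Python directly from the totals. The function name `regression_y_on_x_from_sums` is an assumption made for this illustration.

```python
def regression_y_on_x_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (slope, intercept) of the regression line of y on x."""
    mean_x, mean_y = sum_x / n, sum_y / n
    cov = sum_xy / n - mean_x * mean_y       # Cov(x, y)
    var_x = sum_x2 / n - mean_x ** 2         # variance of x
    slope = cov / var_x                      # b_yx
    intercept = mean_y - slope * mean_x      # the line passes through the means
    return slope, intercept

# Totals of exercise [17]
b_yx, a = regression_y_on_x_from_sums(100, 12500, 8000, 1585000, 1007425)
estimate = a + b_yx * 130
print(round(b_yx, 2), round(a, 2), round(estimate, 2))
```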
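[18] Given the following totals for 10 pairs of observations on two characters x and y,
obtain the two regression equations and hence calculate the correlation coefficient:
$\sum x = 12$, $\sum y = 4$, $\sum x^2 = 16.20$, $\sum y^2 = 1.96$, $\sum xy = 5.2$
[M.B.A. 1979]
Solution:
The regression equation of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$, where $b_{yx} = \dfrac{\operatorname{Cov}(x, y)}{\sigma_x^2}$
$$\bar{y} = \frac{\sum y}{n} = \frac{4}{10} = 0.4, \qquad \bar{x} = \frac{\sum x}{n} = \frac{12}{10} = 1.2$$
$$\operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \frac{\sum x}{n} \times \frac{\sum y}{n} = \frac{5.2}{10} - \frac{12}{10} \times \frac{4}{10} = \frac{52 - 48}{100} = \frac{4}{100}$$
$$\sigma_x^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{16.20}{10} - \left(\frac{12}{10}\right)^2 = \frac{162 - 144}{100} = \frac{18}{100}$$
$$\therefore b_{yx} = \frac{4/100}{18/100} = \frac{4}{18} = 0.222$$
∴ y – 0.4 = 0.222(x – 1.2)
or, y = 0.222x – 0.267 + 0.4 = 0.222x + 0.133
The regression equation of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$, where $b_{xy} = \dfrac{\operatorname{Cov}(x, y)}{\sigma_y^2}$
$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{1.96}{10} - \left(\frac{4}{10}\right)^2 = \frac{19.6 - 16}{100} = \frac{3.6}{100}$$
$$\therefore b_{xy} = \frac{4/100}{3.6/100} = \frac{4}{3.6} = 1.11$$
The regression equation of x on y is then x – 1.2 = 1.11(y – 0.4)
Or, x = 1.11y – 0.444 + 1.2 = 1.11y + 0.756
We know, $b_{yx} = r \dfrac{\sigma_y}{\sigma_x}$
$$\frac{4}{18} = r \times \frac{\sqrt{3.6/100}}{\sqrt{18/100}} = r \times \sqrt{\frac{3.6}{18}}$$
or, $\dfrac{2}{9} = r \sqrt{\dfrac{3.6}{18}}$
or, $r^2 \times \dfrac{3.6}{18} = \dfrac{4}{81}$
or, $r^2 = \dfrac{72}{291.6} = 0.247$
∴ r = 0.50 (positive, since both regression coefficients are positive)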
[19] Estimate from the information given below the probable crop yield when
rainfall is 29 inches:
                           Mean   S.D.
Rainfall in inches          25     3
Yield in units per acre     40     6
Coefficient of correlation between the variables: 0.65
[C.U.B. Sc. 1973]
Solution:
Let Rainfall = x
Yield = y
$$b_{yx} = r \frac{\sigma_y}{\sigma_x} = 0.65 \times \frac{6}{3} = 1.30$$
The regression equation of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$
y – 40 = 1.30 (x – 25)
y = 1.30x – 32.50 + 40
= 1.30x + 7.50
When x = 29 inches, y = 1.30 × 29 + 7.50 = 37.7 + 7.50 = 45.2 units per acre.
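When only the means, standard deviations and r are given, the regression line follows from $b_{yx} = r\,\sigma_y/\sigma_x$. A minimal Python sketch is given below; the function name `regression_line` is an assumption made for this illustration, and the regression of x on y is obtained simply by swapping the roles of the two variables.

```python
def regression_line(mean_x, mean_y, sd_x, sd_y, r):
    """Return (slope, intercept) of the regression of y on x."""
    slope = r * sd_y / sd_x                 # b_yx = r * sd_y / sd_x
    return slope, mean_y - slope * mean_x   # the line passes through the means

# Exercise [19]: yield (y) on rainfall (x)
b, a = regression_line(25, 40, 3, 6, 0.65)
yield_at_29 = a + b * 29
print(round(b, 2), round(a, 2), round(yield_at_29, 1))
```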
[20] The correlation coefficient between two variates x and y is r = 0.60. If $\sigma_x = 1.50$,
$\sigma_y = 2.00$, $\bar{x} = 10$, $\bar{y} = 20$, find the equations of the regression lines of (i) y on x (ii) x on y.
[I.C.W.A. 1977]
Solution:
$$b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x} = 0.60 \times \frac{2}{1.50} = \frac{1.20}{1.50} = 0.8$$
$$b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y} = 0.60 \times \frac{1.5}{2} = \frac{0.9}{2} = 0.45$$
Therefore, the regression equation of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$
y – 20 = 0.8(x – 10) = 0.8x – 8
∴ y = 0.8x – 8 + 20 = 0.8x + 12
The regression equation of x on y is
$x - \bar{x} = b_{xy}(y - \bar{y})$, i.e. x – 10 = 0.45(y – 20)
x – 10 = 0.45y – 9 or, x = 0.45y + 1
[21] The following data pertain to the marks in two subjects, say A and B.
Mean marks in A = 39.5, Mean marks in B = 47.5
S.D. of marks in A = 10.8, S.D. of marks in B = 16.8
Coefficient of correlation between marks in A and B = 0.42. Obtain the equations
of the two regression lines and then estimate the marks in B for candidates who secured 50
marks in A.
[I.C.W.A. 1978]
Solution:
Let the marks in A be denoted by x
Let the marks in B be denoted by y
∴ $\bar{x} = 39.5$, $\bar{y} = 47.5$
$\sigma_x = 10.8$, $\sigma_y = 16.8$
r = 0.42
$$b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x} = 0.42 \times \frac{16.8}{10.8} = \frac{7.056}{10.8} = 0.65$$
$$b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y} = 0.42 \times \frac{10.8}{16.8} = \frac{4.536}{16.8} = 0.27$$
Therefore, the regression equation of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$
y – 47.5 = 0.65 (x – 39.5)
y = 0.65x – 25.68 + 47.5 = 0.65x + 21.82
The regression equation of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$
x – 39.5 = 0.27 (y – 47.5)
x = 0.27y – 12.82 + 39.5 = 0.27y + 26.68
Here, the marks in B (y) are to be estimated for x = 50, so the regression of y on x is used:
∴ y = 0.65 × 50 + 21.82 = 32.5 + 21.82 = 54.32 = 54 (approx)
∴ Marks in B = 54
[22] Given the following results of the height (x) and weight (y) of 1000 men students:
$\bar{x}$ = 68 inches, $\bar{y}$ = 150 lbs, r = 0.60, $\sigma_x$ = 2.50 inches, $\sigma_y$ = 20.00 lbs. John Doe weighs
200 lbs; Richard Roe is five feet tall. Estimate the height of Doe from his weight and the
weight of Roe from his height.
[C.U.M.Com, 1976]
Solution:
Let height of students = x inches
weight of students = y lbs
Therefore, $\bar{x}$ = 68 inches, $\bar{y}$ = 150 lbs.
$\sigma_x$ = 2.5 inches, $\sigma_y$ = 20.00 lbs.
The regression equation of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$ - - - - (i)
$$b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x} = 0.60 \times \frac{20}{2.5} = \frac{12}{2.5} = 4.8$$
Putting the value of $b_{yx}$ in (i) we get
y – 150 = 4.8(x – 68)
y = 4.8x – 326.40 + 150 = 4.8x – 176.40
The regression equation of x on y is
$x - \bar{x} = b_{xy}(y - \bar{y})$ - - - - (ii)
$$b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y} = 0.60 \times \frac{2.5}{20} = \frac{1.5}{20} = 0.075$$
Putting the value of $b_{xy}$ in equation (ii) we get,
x – 68 = 0.075(y – 150)
x = 0.075y – 11.25 + 68 = 0.075y + 56.75
When the weight of John Doe is 200 lbs, i.e. y = 200,
(height) x = 0.075 × 200 + 56.75 = 15 + 56.75 = 71.75 inches.
When the height of Richard Roe is 5 feet = 60 inches, i.e. x = 60,
(weight) y = 4.8 × 60 – 176.4 = 288 – 176.40 = 111.60 lbs.
[23] From the following data find coefficient of linear correlation between X and Y,
Determine also the regression line of Y on X, and then make an estimate of the value
of Y when X=12.
X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9
[I.C.W.A. 1975]
Solution:
Table: Calculations for Correlation Coefficient
X Y X2 Y2 XY
1 1 1 1 1
3 2 9 4 6
4 4 16 16 16
6 4 36 16 24
8 5 64 25 40
9 7 81 49 63
11 8 121 64 88
14 9 196 81 126
Total 56 40 524 256 364
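$$\bar{X} = \frac{\sum X}{n} = \frac{56}{8} = 7, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{40}{8} = 5$$
$$\sigma_X^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{524}{8} - \left(\frac{56}{8}\right)^2 = \frac{4192 - 3136}{64} = \frac{1056}{64}$$
$$\sigma_Y^2 = \frac{\sum Y^2}{n} - \left(\frac{\sum Y}{n}\right)^2 = \frac{256}{8} - \left(\frac{40}{8}\right)^2 = \frac{2048 - 1600}{64} = \frac{448}{64}$$
$$\operatorname{Cov}(X, Y) = \frac{\sum XY}{n} - \frac{\sum X}{n} \times \frac{\sum Y}{n} = \frac{364}{8} - \frac{56}{8} \times \frac{40}{8} = \frac{2912 - 2240}{64} = \frac{672}{64}$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{672/64}{\sqrt{\dfrac{1056}{64}} \times \sqrt{\dfrac{448}{64}}} = \frac{672}{\sqrt{1056} \times \sqrt{448}} = \frac{672}{32.50 \times 21.17} = \frac{672}{688.03} = 0.98$$
$$b_{YX} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X^2} = \frac{672/64}{1056/64} = \frac{672}{1056} = 0.64$$
The regression equation of Y on X is $Y - \bar{Y} = b_{YX}(X - \bar{X})$
or, Y – 5 = 0.64(X – 7)
or, Y = 0.64X – 4.48 + 5 = 0.64X + 0.52
when X = 12, Y = 0.64 × 12 + 0.52 = 7.68 + 0.52 = 8.20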
[24] Obtain the lines of regression for the following data:

(X) 1 2 3 4 5 6 7 8 9
(Y) 9 8 10 12 11 13 14 16 15
[C.U., M.com 1968, I.C.W.A. 1978]
Solution:
Table: Calculations for Regression
X Y u=X–5 v = Y – 12 u2 v2 uv
1 9 -4 -3 16 9 12
2 8 -3 -4 9 16 12
3 10 -2 -2 4 4 4
4 12 -1 0 1 0 0
5 11 0 -1 0 1 0
6 13 1 1 1 1 1
7 14 2 2 4 4 4
8 16 3 4 9 16 12
9 15 4 3 16 9 12
Total 45 108 0 0 60 60 57
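Since the variance and covariance are unaffected by the change of origin,
$$\sigma_x^2 = \sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{60}{9} - \left(\frac{0}{9}\right)^2 = \frac{60}{9}$$
$$\sigma_y^2 = \sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{60}{9} - \left(\frac{0}{9}\right)^2 = \frac{60}{9}$$
$$\operatorname{Cov}(X, Y) = \operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{57}{9} - \frac{0}{9} \times \frac{0}{9} = \frac{57}{9}$$
$$b_{yx} = b_{vu} = \frac{\operatorname{Cov}(u, v)}{\sigma_u^2} = \frac{57/9}{60/9} = \frac{57}{60} = 0.95$$
$$b_{xy} = b_{uv} = \frac{\operatorname{Cov}(u, v)}{\sigma_v^2} = \frac{57/9}{60/9} = \frac{57}{60} = 0.95$$
$$\bar{X} = \frac{45}{9} = 5, \qquad \bar{Y} = \frac{108}{9} = 12$$
The regression equation of Y on X is $Y - \bar{Y} = b_{yx}(X - \bar{X})$
Y – 12 = 0.95(X – 5)
or, Y = 0.95X – 4.75 + 12
Y = 0.95X + 7.25
The regression equation of X on Y is $X - \bar{X} = b_{xy}(Y - \bar{Y})$
X – 5 = 0.95(Y – 12)
Or, X = 0.95Y – 11.4 + 5
X = 0.95Y – 6.4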
[25] Find the two lines of regression from the following data:
Age of husband(x) 25 22 28 26 35 20 22 40 20 18
Age of wife (y) 18 15 20 17 22 14 16 21 15 14
Hence, estimate (i) the age of husband when the age of wife is 19, (ii) the age of
wife when the age of husband is 30.
[C.U.M.Com 1970]
Solution:
x y u = x – 25 v = y – 20 u2 v2 uv
25 18 0 -2 0 4 0
22 15 -3 -5 9 25 15
28 20 3 0 9 0 0
26 17 1 -3 1 9 -3
35 22 10 2 100 4 20
20 14 -5 -6 25 36 30
22 16 -3 -4 9 16 12
40 21 15 1 225 1 15
20 15 -5 -5 25 25 25
18 14 -7 -6 49 36 42
Total - - 6 -28 452 156 156
$$\bar{x} = 25 + \frac{6}{10} = 25.6, \qquad \bar{y} = 20 - \frac{28}{10} = 17.2$$
$$\sigma_x^2 = \sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{452}{10} - \left(\frac{6}{10}\right)^2 = \frac{4520 - 36}{100} = \frac{4484}{100}$$
$$\sigma_y^2 = \sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{156}{10} - \left(\frac{-28}{10}\right)^2 = \frac{1560 - 784}{100} = \frac{776}{100}$$
$$\operatorname{Cov}(x, y) = \operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \left(\frac{\sum u}{n}\right)\left(\frac{\sum v}{n}\right) = \frac{156}{10} - \frac{6}{10} \times \frac{-28}{10} = \frac{1560 + 168}{100} = \frac{1728}{100}$$
$$b_{yx} = b_{vu} = \frac{\operatorname{Cov}(u, v)}{\sigma_u^2} = \frac{1728/100}{4484/100} = \frac{1728}{4484} = 0.39$$
$$b_{xy} = b_{uv} = \frac{\operatorname{Cov}(u, v)}{\sigma_v^2} = \frac{1728/100}{776/100} = \frac{1728}{776} = 2.23$$
(i) The regression equation of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$
x – 25.6 = 2.23(y – 17.2)
or, x = 2.23y – 38.36 + 25.6
or, x = 2.23y – 12.76
when y = 19, x = 2.23 × 19 – 12.76 = 42.37 – 12.76
∴ x = 29.61 = 30 (approx.)
(ii) The regression equation of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$
y – 17.2 = 0.39(x – 25.6)
or y = 0.39x – 9.98 + 17.2 = 0.39x + 7.22
when x = 30, y = 0.39 × 30 + 7.22 = 11.70 + 7.22 = 18.92 = 19 (approx.)
[26] From the following data, obtain two regression equations
Sales 91 97 108 121 67 124 51 73 111 57
Purchases 71 75 69 97 70 91 39 61 80 47
[C.A. 1977]
Solution:
Let us take sales = X, Purchases = Y
Table - Calculations for Regression
X Y u = X – 90 v = Y – 70 u2 v2 uv
91 71 1 1 1 1 1
97 75 7 5 49 25 35
108 69 18 -1 324 1 -18
121 97 31 27 961 729 837
67 70 -23 0 529 0 0
124 91 34 21 1156 441 714
51 39 -39 -31 1521 961 1209
73 61 -17 -9 289 81 153
111 80 21 10 441 100 210
57 47 -33 -23 1089 529 759
Total 900 700 0 0 6360 2868 3900
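$$\bar{X} = \frac{900}{10} = 90, \qquad \bar{Y} = \frac{700}{10} = 70$$
$$\sigma_x^2 = \sigma_u^2 = \frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2 = \frac{6360}{10} - \left(\frac{0}{10}\right)^2 = \frac{6360}{10} = 636$$
$$\sigma_y^2 = \sigma_v^2 = \frac{\sum v^2}{n} - \left(\frac{\sum v}{n}\right)^2 = \frac{2868}{10} - \left(\frac{0}{10}\right)^2 = \frac{2868}{10} = 286.8$$
$$\operatorname{Cov}(X, Y) = \operatorname{Cov}(u, v) = \frac{\sum uv}{n} - \frac{\sum u}{n} \cdot \frac{\sum v}{n} = \frac{3900}{10} - \frac{0}{10} \times \frac{0}{10} = \frac{3900}{10} = 390$$
$$b_{yx} = b_{vu} = \frac{\operatorname{Cov}(u, v)}{\sigma_u^2} = \frac{390}{636} = 0.613$$
$$b_{xy} = b_{uv} = \frac{\operatorname{Cov}(u, v)}{\sigma_v^2} = \frac{390}{286.8} = 1.36$$
The regression equation of X on Y is $X - \bar{X} = b_{xy}(Y - \bar{Y})$
X – 90 = 1.36(Y – 70)
or, X = 1.36Y – 95.20 + 90
X = 1.36Y – 5.20
The regression equation of Y on X is $Y - \bar{Y} = b_{yx}(X - \bar{X})$
or, Y – 70 = 0.613(X – 90)
or, Y = 0.613X – 55.17 + 70
Y = 0.613X + 14.83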
[27] Obtain the equation of the line of regression of yield of rice (y) on water (x) from
the data given in the following table:
Water in inches (x) 12 18 24 30 36 42 48
Yield in tons (y) 5.27 5.68 6.25 7.21 8.02 8.71 8.42
Estimate the most probable yield of rice for 40 inches of water.
[C.U., M.Com., 1964, I.C.W.A. 1976]
Solution:
x y u = (x − 30)/6 v = y – 7 u² v² uv
12 5.27 -3 -1.73 9 2.99 5.19
18 5.68 -2 -1.32 4 1.74 2.64
24 6.25 -1 -0.75 1 0.56 0.75
30 7.21 0 0.21 0 0.04 0
36 8.02 1 1.02 1 1.04 1.02
42 8.71 2 1.71 4 2.92 3.42
48 8.42 3 1.42 9 2.02 4.26
Total 210 49.56 0 0.56 28 11.31 17.28
$$\bar{x} = \frac{210}{7} = 30, \qquad \bar{y} = \frac{49.56}{7} = 7.08$$
With d = 6 as the scale factor for x,
$$\sigma_x^2 = d^2 \sigma_u^2 = 6^2 \left\{\frac{28}{7} - \left(\frac{0}{7}\right)^2\right\} = 36 \times 4 = 144$$
$$\operatorname{Cov}(x, y) = d \operatorname{Cov}(u, v) = 6 \times \left\{\frac{17.28}{7} - \frac{0}{7} \times \frac{0.56}{7}\right\} = 6 \times \frac{17.28}{7} = 14.81$$
$$b_{yx} = \frac{\operatorname{Cov}(x, y)}{\sigma_x^2} = \frac{14.81}{144} = 0.103$$
Therefore, the regression equation of y on x is y – 7.08 = 0.103(x – 30)
Or, y = 0.103x – 3.09 + 7.08
y = 0.103x + 3.99
when x = 40, y = 0.103 × 40 + 3.99 = 4.12 + 3.99 = 8.11 tons.
[28] If the regression equation of y on x be y = 0.57x + 6.93 and the regression equation
of x on y be x = 1.12y – 2.46, find the correlation coefficient between x and y.
[B.U., B.A.(Econ) 1972]
Solution:
From the equation y = 0.57x + 6.93,
∴ $b_{yx}$ = 0.57
From the equation x = 1.12y – 2.46,
∴ $b_{xy}$ = 1.12
We know $r^2 = b_{yx} \cdot b_{xy} = 0.57 \times 1.12 = 0.64$
$r = \sqrt{0.64} = 0.80$
∴ r = +0.80, since both regression coefficients are positive.
[29] For some bivariate data the following results were obtained. The mean value of
X = 53.2, the mean value of Y = 27.9, the regression coefficient of Y on X = –1.5, and the
regression coefficient of X on Y = –0.2. Find (i) the most probable value of Y when
X = 60, (ii) r, the coefficient of correlation between X and Y.
[C.U., M.Com., 1974]
Solution:
(i) The regression equation of Y on X is $Y - \bar{Y} = b_{YX}(X - \bar{X})$
or, Y – 27.9 = –1.5 (X – 53.2)
or, Y = 27.9 + 79.80 – 1.5X
Y = 107.7 – 1.5X
when X = 60, Y = 107.7 – 1.5 × 60 = 107.7 – 90 = 17.7
(ii) We know the relation
$r^2 = b_{YX} \cdot b_{XY} = (-1.5) \times (-0.2) = 0.30$
∴ $r = \pm\sqrt{0.30} = \pm 0.55$
But since the regression coefficients are negative, the correlation coefficient also
must be negative, i.e. r = –0.55.
[30] The regression equations calculated from a given set of observations are x = –0.2y
+ 4.2, y = –0.8x + 8.4. Calculate (i) $\bar{x}$ and $\bar{y}$, (ii) r, (iii) the estimated value of y when x = 4.
[I.C.W.A., 1986]
Solution:
(i) The regression equations are
y = –0.8x + 8.4 - - - (1)
x = –0.2y + 4.2 - - - (2)
Rearranging (1) and multiplying (2) by 5 we get
y + 0.8x = 8.4
y + 5x = 21.0
Subtracting, –4.2x = –12.6
or, x = 12.6/4.2 = 3
Putting the value x = 3 in equation (1), we get
y = –0.8 × 3 + 8.4 = –2.4 + 8.4 = 6
Since the two regression lines always intersect at $(\bar{x}, \bar{y})$, ∴ $\bar{x} = 3$, $\bar{y} = 6$
(ii) Here, $b_{yx}$ = –0.8 and $b_{xy}$ = –0.2
∴ $r^2 = b_{yx} \times b_{xy} = (-0.8) \times (-0.2) = 0.16$
$r = \pm\sqrt{0.16} = \pm 0.4$
But since the regression coefficients are negative, the correlation coefficient also
must be negative, i.e. r = –0.4.
(iii) When x = 4, y = –0.8 × 4 + 8.4 = –3.2 + 8.4 = 5.2
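The steps of this solution (intersect the two lines to get the means, then take the signed square root of the product of the coefficients) can be sketched in Python. The function name `means_and_r` and its argument order are assumptions made for this illustration; the lines are taken in the form y = a_yx + b_yx·x and x = a_xy + b_xy·y.

```python
import math

def means_and_r(b_yx, a_yx, b_xy, a_xy):
    """Means and r from the lines y = a_yx + b_yx*x and x = a_xy + b_xy*y."""
    # Substituting the second line into the first gives
    # y * (1 - b_yx * b_xy) = a_yx + b_yx * a_xy at the point of intersection.
    mean_y = (a_yx + b_yx * a_xy) / (1 - b_yx * b_xy)
    mean_x = a_xy + b_xy * mean_y
    # r takes the common sign of the two regression coefficients.
    r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
    return mean_x, mean_y, r

# Exercise [30]: y = -0.8x + 8.4 and x = -0.2y + 4.2
mean_x, mean_y, r = means_and_r(-0.8, 8.4, -0.2, 4.2)
print(round(mean_x, 2), round(mean_y, 2), round(r, 2))
```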
[31] The two regression lines involving two variables X and Y are Y = 5.6 + 1.2X and
X = 12.5 + 0.6Y. Find the means of X and Y and their correlation coefficient.
[W.B.H.S., 1978]
Solution:
The two regression lines are
Y = 5.6 + 1.2X - - - (i)
X = 12.5 + 0.6Y - - - (ii)
To solve equations (i) and (ii), multiply both equations by 10:
10Y = 56 + 12X - - - (iii)
10X = 125 + 6Y - - - (iv)
Multiplying (iii) by 6 and (iv) by 10 we get
60Y – 72X = 336
–60Y + 100X = 1250
(Adding) 28X = 1586
X = 1586/28 = 56.64
Putting the value of X in equation (i) we get
Y = 5.6 + 1.2 × 56.64 = 5.6 + 67.97 = 73.57
Since the two regression lines intersect at $(\bar{X}, \bar{Y})$, therefore
$\bar{X} = 56.64$, $\bar{Y} = 73.57$
Here, from equation (i), $b_{yx}$ = 1.2
From equation (ii), $b_{xy}$ = 0.6
$r^2 = b_{yx} \cdot b_{xy} = 1.2 \times 0.6 = 0.72$
∴ $r = \sqrt{0.72} = +0.85$
Ans. $\bar{X} = 56.64$, $\bar{Y} = 73.57$, r = +0.85
[32] Two variates have the least squares regression lines x + 4y + 3 = 0 and 4x + 9y + 5 = 0.
Find their mean values and the correlation coefficient.
[W.B.H.S., 1978]
Solution:
The regression lines are
x + 4y + 3 = 0 - - - (i)
4x + 9y + 5 = 0 - - - (ii)
To solve equations (i) and (ii), multiply (i) by 4 and (ii) by 1:
4x + 16y + 12 = 0
4x + 9y + 5 = 0
(Subtracting) 7y + 7 = 0
or, 7y = –7
y = –1
Putting the value y = –1 in equation (i) we get x + 4 × (–1) + 3 = 0 or, x = 1
Since the two regression lines intersect at the point $(\bar{x}, \bar{y})$, therefore $\bar{x} = 1$, $\bar{y} = -1$.
From equation (i)
4y = –x – 3
$y = -\dfrac{1}{4}x - \dfrac{3}{4}$
∴ $b_{yx} = -\dfrac{1}{4}$
From equation (ii)
4x = –9y – 5
$x = -\dfrac{9}{4}y - \dfrac{5}{4}$
∴ $b_{xy} = -\dfrac{9}{4}$
$$r^2 = b_{yx} \times b_{xy} = \left(-\frac{1}{4}\right) \times \left(-\frac{9}{4}\right) = \frac{9}{16}$$
∴ $r = \pm\sqrt{\dfrac{9}{16}} = \pm\dfrac{3}{4} = \pm 0.75$
As $b_{yx}$ and $b_{xy}$ are negative, therefore r = –0.75
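The solution takes equation (i) as the line of y on x without stating why. One common check, sketched below in Python as an illustration rather than as the author's method, is that the chosen assignment must give $0 \le b_{yx} \cdot b_{xy} \le 1$, because this product equals $r^2$. The function name `identify_regression_lines` and the tuple form (a, b, c) for a·x + b·y + c = 0 are assumptions.

```python
def identify_regression_lines(line1, line2):
    """Each line is (a, b, c) for a*x + b*y + c = 0.

    Returns (b_yx, b_xy, label), where label names the line taken as y on x.
    The first assignment that gives 0 <= b_yx * b_xy <= 1 is returned.
    """
    for y_on_x, x_on_y, label in ((line1, line2, "line1"), (line2, line1, "line2")):
        b_yx = -y_on_x[0] / y_on_x[1]   # solve the line for y: slope on x
        b_xy = -x_on_y[1] / x_on_y[0]   # solve the line for x: slope on y
        if 0 <= b_yx * b_xy <= 1:
            return b_yx, b_xy, label
    raise ValueError("no assignment gives 0 <= r^2 <= 1")

# Exercise [32]: x + 4y + 3 = 0 and 4x + 9y + 5 = 0
print(identify_regression_lines((1, 4, 3), (4, 9, 5)))
```

Taking the lines the other way round here would give a product of 16/9, which exceeds 1 and is therefore rejected.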
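[33] Two lines of regression are given by x + 2y = 5 and 2x + 3y = 8, and $\sigma_x^2 = 12$. Calculate
the values of $\bar{x}$, $\bar{y}$, $\sigma_y$ and r.
[I.C.W.A. 1976]
Solution:
The regression equations are
x + 2y = 5 - - - (i)
2x + 3y = 8 - - - (ii)
Multiplying equation (i) by 2 and equation (ii) by 1, and subtracting (iv) from (iii):
2x + 4y = 10 - - - (iii)
2x + 3y = 8 - - - (iv)
y = 2
Putting the value y = 2 in equation (i) we get
x + 2 × 2 = 5
or, x = 1
Since the two regression lines intersect at $(\bar{x}, \bar{y})$, hence $\bar{x} = 1$, $\bar{y} = 2$
Let the equation x + 2y = 5 be the regression equation of y on x (the opposite choice would give $b_{yx} \cdot b_{xy} = \frac{2}{3} \times 2 = \frac{4}{3} > 1$, which is impossible)
or, 2y = 5 – x
$y = \dfrac{5}{2} - \dfrac{1}{2}x$
∴ $b_{yx} = -\dfrac{1}{2}$
2x + 3y = 8
or, 2x = 8 – 3y
$x = 4 - \dfrac{3}{2}y$
∴ $b_{xy} = -\dfrac{3}{2}$
We know, $r^2 = b_{yx} \cdot b_{xy} = \left(-\dfrac{1}{2}\right) \times \left(-\dfrac{3}{2}\right) = \dfrac{3}{4}$
∴ $r = \pm\sqrt{\dfrac{3}{4}} = \pm\dfrac{\sqrt{3}}{2}$
As $b_{yx}$ and $b_{xy}$ are negative, therefore $r = -\dfrac{\sqrt{3}}{2}$
We know, $b_{yx} = r \cdot \dfrac{\sigma_y}{\sigma_x}$
$$-\frac{1}{2} = -\frac{\sqrt{3}}{2} \times \frac{\sigma_y}{\sqrt{12}} = -\frac{\sqrt{3}}{2} \times \frac{\sigma_y}{2\sqrt{3}} = -\frac{\sigma_y}{4}$$
∴ $\sigma_y = 2$
Ans. $\bar{x} = 1$, $\bar{y} = 2$, $\sigma_y = 2$ and $r = -\dfrac{\sqrt{3}}{2}$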
[34] In order to find the correlation coefficient between two variants x and y from 12
pairs of observations, the following calculations were made.
∑x = 30, ∑y = 5, ∑x2 = 670, ∑y2 = 285, ∑xy = 334
On subsequent verification it was found that the pair (x = 11, y = 4) was copied
wrongly, the correct value being (x = 10, y = 14). Find the correct value of correlation
coefficient.
[I.C.W.A. 1975]
www.gayali.in
Solution:
Here given, ∑x = 30, ∑y = 5
Correct ∑x = 30 –11 + 10 = 29
Correct ∑y = 5 –4 + 14 = +15
∑x2 = 670, ∑y2 = 285, ∑xy = 334
Correct ∑x2 = 670 – 112 + 102 = 670 – 121 + 100 = 649
Correct ∑y2=285–42+142=285–16+196=465
Correct ∑xy = 334 – 44 + 140 = 290 + 140 = 430

Now, Cov(x, y) = Σxy/n − (Σx/n)(Σy/n) = 430/12 − (29/12) × (15/12) = (5160 − 435)/144 = 4725/144
σx² = Σx²/n − (Σx/n)² = 649/12 − (29/12)² = (7788 − 841)/144 = 6947/144
∴ σx = √(6947/144) = √6947/12
σy² = Σy²/n − (Σy/n)² = 465/12 − (15/12)² = (5580 − 225)/144 = 5355/144
∴ σy = √(5355/144) = √5355/12
∴ r = Cov(x, y)/(σx σy) = (4725/144) ÷ (√6947/12 × √5355/12) = 4725/(83.35 × 73.18) = 0.77
Ans. r = +0.77
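The correction of the sums and the final value of r can be checked with a short Python sketch. The function name corrected_r and the layout of its arguments are illustrative assumptions; the figures are those of problem [34].

```python
from math import sqrt

def corrected_r(n, sx, sy, sxx, syy, sxy, wrong, right):
    """Pearson r after replacing one wrongly copied pair (x, y) by the right one."""
    (xw, yw), (xc, yc) = wrong, right
    # Remove the wrong pair's contribution and add the correct one.
    sx += xc - xw
    sy += yc - yw
    sxx += xc**2 - xw**2
    syy += yc**2 - yw**2
    sxy += xc * yc - xw * yw
    cov = sxy / n - (sx / n) * (sy / n)
    var_x = sxx / n - (sx / n) ** 2
    var_y = syy / n - (sy / n) ** 2
    return cov / sqrt(var_x * var_y)

r = corrected_r(12, 30, 5, 670, 285, 334, wrong=(11, 4), right=(10, 14))
print(round(r, 2))
```

Working from the five sums alone, as the solution does, means the individual observations are never needed.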
[35] Obtain the linear regression equation that you consider more relevant for the
following set of paired observations and give reasons why you consider it to be so:
Age 56 42 72 36 63 47 55 49 38 42 68 60
Blood Pressure 147 125 160 118 149 128 150 145 115 140 152 155
Also estimate the blood pressure of a person whose age is 45.
[C.U.M.Com. 1973]
Solution:
Let Age be x and blood pressure be y.
Table - Calculations for Regression
x y u = x – 50 v = y – 140 u² v² uv
56 147 6 7 36 49 42
42 125 -8 -15 64 225 120
72 160 22 20 484 400 440
36 118 -14 -22 196 484 308
63 149 13 9 169 81 117
47 128 -3 -12 9 144 36
55 150 5 10 25 100 50
49 145 -1 5 1 25 -5
38 115 -12 -25 144 625 300

42 140 -8 0 64 0 0
68 152 18 12 324 144 216
60 155 10 15 100 225 150
Total 628 1684 28 4 1616 2502 1774
x̄ = Σx/n = 628/12 = 52.33, ȳ = Σy/n = 1684/12 = 140.33
Cov(x, y) = Cov(u, v) = Σuv/n − (Σu/n)(Σv/n) = 1774/12 − (28/12) × (4/12) = (21288 − 112)/144 = 21176/144
σx² = σu² = Σu²/n − (Σu/n)² = 1616/12 − (28/12)² = (19392 − 784)/144 = 18608/144
byx = bvu = Cov(u, v)/σu² = (21176/144) ÷ (18608/144) = 21176/18608 = 1.14
Since blood pressure depends on age (and not age on blood pressure), and blood pressure is to be estimated from age, the regression of y on x is the more relevant one.
The regression equation of y on x is y – 140.33 = 1.14(x – 52.33)
or, y = 1.14x – 59.65 + 140.33
= 1.14x + 80.68
When age is 45, i.e. x = 45,
Blood pressure (y) = 1.14 × 45 + 80.68 = 51.3 + 80.68 = 131.98 ≈ 132
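The same regression can be computed directly from the raw data, without the change of origin. The sketch below is only a check on the hand calculation; the function name regression_y_on_x is an assumption made for illustration.

```python
ages = [56, 42, 72, 36, 63, 47, 55, 49, 38, 42, 68, 60]
bps = [147, 125, 160, 118, 149, 128, 150, 145, 115, 140, 152, 155]

def regression_y_on_x(xs, ys):
    """Return (byx, intercept) of the least-squares line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    slope = cov / var_x
    return slope, mean_y - slope * mean_x

slope, intercept = regression_y_on_x(ages, bps)
estimate = slope * 45 + intercept
print(round(slope, 2), round(estimate))
```

Because the code does not round byx to 1.14 before computing the intercept, its intercept differs slightly from 80.68, but the estimate at age 45 still rounds to 132.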
[36] For the variables x and y, the equations of two regression lines are 4x – 5y + 33 = 0
and 20x – 9y = 107. Identify the regression line of y on x and that of x on y. What is the
estimated value of y when x = 10? If this estimate is denoted by y0, find the estimated
value of x when y = y0.
[C.U., B.A.(Econ.)]
Solution:
Let us take the equation 4x – 5y + 33 = 0 as the regression equation of y on x
or, 5y = 4x + 33
or, y = (4/5)x + 33/5, so that byx = 4/5
and the equation 20x – 9y = 107 as the regression equation of x on y
or, 20x = 9y + 107
or, x = (9/20)y + 107/20, so that bxy = 9/20
∴ r² = byx × bxy = (4/5) × (9/20) = 9/25
or, r = √(9/25) = 3/5 = 0.60 < 1
On the other hand, if we take the equation 4x – 5y + 33 = 0 as the regression equation of x on y, then
4x = 5y – 33
or, x = (5/4)y − 33/4, so that bxy = 5/4
and 20x – 9y = 107 as the regression equation of y on x:
9y = 20x – 107
or, y = (20/9)x − 107/9, so that byx = 20/9
r² = (20/9) × (5/4) = 25/9 > 1
As r² cannot be more than 1, the first equation is the regression equation of y
on x and the second equation is the regression equation of x on y.
When x = 10, y = (4/5)x + 33/5 = (4/5) × 10 + 33/5 = 14.6
∴ y0 = 14.6
x = (9/20)y + 107/20
When y = y0 = 14.6, x = (9/20) × 14.6 + 107/20 = 6.57 + 5.35 = 11.92
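The test used above (the product byx × bxy must not exceed 1) can be sketched in Python with exact fractions. The function name r_squared and the (a, b) encoding of a line ax + by + c = 0 are assumptions made for this illustration.

```python
from fractions import Fraction

def r_squared(y_on_x, x_on_y):
    """Each line is (a, b) taken from a*x + b*y + c = 0; returns byx * bxy."""
    a1, b1 = y_on_x
    a2, b2 = x_on_y
    byx = Fraction(-a1, b1)  # coefficient of x when the first line is solved for y
    bxy = Fraction(-b2, a2)  # coefficient of y when the second line is solved for x
    return byx * bxy

line_a = (4, -5)    # 4x - 5y + 33 = 0
line_b = (20, -9)   # 20x - 9y - 107 = 0

first = r_squared(line_a, line_b)    # line_a taken as y on x
second = r_squared(line_b, line_a)   # line_b taken as y on x
print(first, second)
```

Only the assignment whose product lies between 0 and 1 is admissible, which here is the first one.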
[37] State the meaning of the terms explained variation and unexplained variation,
used in the theory of regression. If the coefficient of correlation between two variables X
and Y be 0.83, what percentage of total variation remains unexplained by the regression
equation?
[I.C.W.A. 1975]
Solution:
If yi′ represents the estimated value of y from the regression equation of
y on x when x = xi, i.e. yi′ − ȳ = byx(xi − x̄), then it can be shown that
Σ(yi − ȳ)² = Σ(yi − yi′)² + Σ(yi′ − ȳ)²
Σ(yi − ȳ)² is called the Total variation of the observed values of y. Σ(yi − yi′)² is the
sum of the squares of the vertical distances of the points on the scatter diagram from
the regression line of y on x, and as such is called variation around the regression line,
or Unexplained variation, or Residual variation. Σ(yi′ − ȳ)² is called variation due to
regression, or Explained variation.
Total variation = Unexplained variation + Explained variation
r² = Explained variation / Total variation
r = 0.83
∴ r² = 0.69
∴ Unexplained variation as a proportion of total variation = 1 − r² = 1 − 0.69 = 0.31 = 31%
[38] Given unexplained variation = 19.70 and explained variation = 19.22, determine
the coefficient of correlation.

[I.C.W.A., 1976]
Solution:
Total variation = Unexplained variation + Explained variation = 19.70 + 19.22 = 38.92
r² = Explained variation / Total variation = 19.22/38.92 = 0.49
∴ |r| = 0.70

[39] In a contest, two judges ranked eight candidates A, B, C, D, E, F, G and H in

order of their preference, as shown in the following table. Find the rank correlation
coefficient.
A B C D E F G H
First Judge 5 2 8 1 4 6 3 7
Second Judge 4 5 7 3 2 8 1 6
[I.C.W.A, 1975]
Solution:
Table - Calculations for Rank Correlation Coefficient
Candidates   Rank by Judge I (x)   Rank by Judge II (y)   d = x – y   d²
A 5 4 1 1
B 2 5 -3 9
C 8 7 1 1
D 1 3 -2 4
E 4 2 2 4
F 6 8 -2 4
G 3 1 2 4
H 7 6 1 1
Total - - 0 28
Here, n = 8, ∑d² = 28,
R = 1 − 6Σd²/(n³ − n) = 1 − (6 × 28)/(8³ − 8) = 1 − 168/504 = (504 − 168)/504 = 336/504 = 2/3 = 0.67
Ans. R = 2/3
[40] Compute the correlation coefficient of the following ranks of a group of students
in two examinations. What conclusion do you draw from the result?
Roll Nos. 1 2 3 4 5 6 7 8 9 10
Ranks in B.com Exam 1 5 8 6 7 4 2 3 9 10
Ranks in M.com Exam 2 1 5 7 6 3 4 8 10 9
[C.U., M.com 1975]
Solution:
Table - Calculations for Rank correlation coefficient
Roll Nos. Ranks in B.Com. Exam (x) Ranks in M.Com. Exam (y) d = x – y d²
1 1 2 -1 1
2 5 1 4 16
3 8 5 3 9
4 6 7 -1 1
5 7 6 1 1
6 4 3 1 1
7 2 4 -2 4
8 3 8 -5 25
9 9 10 -1 1
10 10 9 1 1
Total - - - 0 60

Here, n = 10, ∑d² = 60
∴ R = 1 − 6Σd²/(n³ − n) = 1 − (6 × 60)/(10³ − 10) = 1 − 360/990 = (99 − 36)/99 = 63/99 = +0.64
Ans. R = +0.64. It shows that the ranks in the B.Com. Exam and the M.Com. Exam are moderately
positively correlated.
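The rank correlation formula used in problems [39] and [40] can be sketched in Python as follows. The function name spearman_rank is an assumption for illustration, and the formula applies as written only when there are no tied ranks.

```python
def spearman_rank(x_ranks, y_ranks):
    """R = 1 - 6*sum(d^2) / (n^3 - n), valid for untied ranks."""
    n = len(x_ranks)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_d2 / (n ** 3 - n)

# Problem [39]: two judges, eight candidates
judge_1 = [5, 2, 8, 1, 4, 6, 3, 7]
judge_2 = [4, 5, 7, 3, 2, 8, 1, 6]

# Problem [40]: ranks in two examinations
bcom = [1, 5, 8, 6, 7, 4, 2, 3, 9, 10]
mcom = [2, 1, 5, 7, 6, 3, 4, 8, 10, 9]

print(round(spearman_rank(judge_1, judge_2), 2))
print(round(spearman_rank(bcom, mcom), 2))
```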
[41] Ten competitors in a musical contest were ranked by 3 judges A, B, C in the
following order:
Ranks by A 1 6 5 10 3 2 4 9 7 8
Ranks by B 3 5 8 4 7 10 2 1 6 9
Ranks by C 6 4 9 8 1 2 3 10 5 7
Using Rank correlation method, discuss which pair of judges has the nearest
approach to common likings in music.
[I.C.W.A., 1978]
Solution:
Ranks by
Judge A (R1)   Judge B (R2)   Judge C (R3)   d12 = R1 − R2   d13 = R1 − R3   d23 = R2 − R3   d12²   d13²   d23²
1 3 6 -2 -5 -3 4 25 9
6 5 4 1 2 1 1 4 1
5 8 9 -3 -4 -1 9 16 1
10 4 8 6 2 -4 36 4 16
3 7 1 -4 2 6 16 4 36
2 10 2 -8 0 8 64 0 64
4 2 3 2 1 -1 4 1 1
9 1 10 8 -1 -9 64 1 81
7 6 5 1 2 1 1 4 1
8 9 7 -1 1 2 1 1 4
Total - - - - - - 200 60 214
R12 = 1 − (6 × 200)/(10³ − 10) = 1 − 1200/990 = (990 − 1200)/990 = −210/990 = −0.21
R13 = 1 − (6 × 60)/(10³ − 10) = 1 − 360/990 = (990 − 360)/990 = 630/990 = 0.64
R23 = 1 − (6 × 214)/(10³ − 10) = 1 − 1284/990 = (990 − 1284)/990 = −294/990 = −0.30
Since R13 = +0.64 is the largest, Judges A and C have the nearest approach to
common likings in music.

[42] Ten students obtained the following marks in Mathematics and statistics.
Calculate the rank correlation coefficient.
Student (Roll No.) 1 2 3 4 5 6 7 8 9 10
Marks in Mathematics 78 36 98 25 75 82 90 62 65 39
Marks in Statistics 84 51 91 60 68 62 86 58 53 47

Solution:
Roll No.   Mathematics Marks (X)   Rank (x)   Statistics Marks (Y)   Rank (y)   d = x – y   d²
1 78 4 84 3 1 1
2 36 9 51 9 0 0
3 98 1 91 1 0 0
4 25 10 60 6 4 16
5 75 5 68 4 1 1
6 82 3 62 5 -2 4
7 90 2 86 2 0 0
8 62 7 58 7 0 0
9 65 6 53 8 -2 4
10 39 8 47 10 -2 4
Total - - - - - 0 30
R = 1 − 6Σd²/(n³ − n) = 1 − (6 × 30)/(10³ − 10) = 1 − 180/990 = (990 − 180)/990 = 810/990 = 0.82
[43] Compute the rank correlation coefficient from the following data:
Series A 115 109 112 87 98 98 120 100 98 118
Series B 75 73 85 70 76 65 82 73 68 80
Solution:
Table - Calculations for Rank Correlation Coefficient
Series A Series B Rank of A (x) Rank of B (y) d = x – y d²
115 75 3 5 -2 4
109 73 5 6.5 -1.5 2.25
112 85 4 1 3 9
87 70 10 8 2 4
98 76 8 4 4 16
98 65 8 10 -2 4
120 82 1 2 -1 1
100 73 6 6.5 -0.5 0.25
98 68 8 9 -1 1
118 80 2 3 -1 1
Total - - - - 0 42.50
There are two ties, one of them containing 3 entries and the other 2 entries.
∴ Σ(t³ − t)/12 = (3³ − 3)/12 + (2³ − 2)/12 = 24/12 + 6/12 = 2.5
R′ = 1 − 6[Σd² + Σ(t³ − t)/12]/(n³ − n)
= 1 − 6(42.5 + 2.5)/(10³ − 10) = 1 − (6 × 45)/990 = 1 − 270/990 = (990 − 270)/990 = 720/990 = 0.73
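The treatment of ties can be sketched in Python: tied values receive the average of the ranks they would occupy, and Σ(t³ − t)/12 is added to Σd² for each tied group. The function names below are assumptions chosen for illustration; rank 1 is given to the largest value, as in the table above.

```python
def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # first position occupied by v
        count = ordered.count(v)          # size t of the tied group
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def tie_correction(values):
    """Sum of (t^3 - t)/12 over the groups of equal values."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(t ** 3 - t for t in counts.values()) / 12

def spearman_with_ties(a, b):
    n = len(a)
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_a, ranks_b))
    correction = tie_correction(a) + tie_correction(b)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)

series_a = [115, 109, 112, 87, 98, 98, 120, 100, 98, 118]
series_b = [75, 73, 85, 70, 76, 65, 82, 73, 68, 80]
print(round(spearman_with_ties(series_a, series_b), 2))
```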
[44] Twelve salesmen are ranked in order of merit of efficiency by their manager.
They are also ranked in accordance with their length of service. What indication is
there of a relation between length of service and efficiency?

Salesman Years of service Order of merit (Service) Order of merit (Efficiency)
A 5 7.5 6
B 2 11.5 12
C 10 2 1
D 8 4 9
E 6 6 8
F 4 9 5
G 12 1 2
H 2 11.5 10
I 7 5 3
J 5 7.5 7
K 9 3 4
L 3 10 11
Solution:
Rank (Service) (x) Rank (Efficiency) (y) d = x – y d²
7.5 6 1.5 2.25
11.5 12 -0.5 0.25
2 1 1 1
4 9 -5 25
6 8 -2 4
9 5 4 16
1 2 -1 1
11.5 10 1.5 2.25
5 3 2 4
7.5 7 0.5 0.25
3 4 -1 1
10 11 -1 1
Total - - 0 58.00
There are 2 ties with 2 entries each.
Σ(t³ − t)/12 = 2(2³ − 2)/12 = (2 × 6)/12 = 1
R′ = 1 − 6[Σd² + Σ(t³ − t)/12]/(n³ − n) = 1 − 6(58 + 1)/(12³ − 12) = 1 − (6 × 59)/(143 × 12) = 1 − 59/286 = 227/286 = 0.79
This indicates a fairly high positive association between length of service and efficiency.
[45] Given the following coefficients: r12 = 0.41, r13 = 0.71, r23 = 0.5. Find r12.3, r13.2 and
R1.23, where the symbols have their usual significance.
[C.U., M.Com., 1974]
Solution:
r12.3 = (r12 − r13 r23)/√[(1 − r13²)(1 − r23²)] = (0.41 − 0.71 × 0.5)/√[(1 − 0.71²)(1 − 0.5²)] = (0.41 − 0.355)/√(0.4959 × 0.75) = 0.055/√0.3719 = 0.055/0.61 = 0.09
r13.2 = (r13 − r12 r32)/√[(1 − r12²)(1 − r32²)] = (0.71 − 0.41 × 0.5)/√[(1 − 0.41²)(1 − 0.5²)] = (0.71 − 0.205)/√[(1 − 0.17)(1 − 0.25)] = 0.505/√(0.83 × 0.75) = 0.505/0.79 = 0.64
R1.23 = √[(r12² + r13² − 2 r12 r23 r13)/(1 − r23²)] = √[(0.41² + 0.71² − 2 × 0.41 × 0.5 × 0.71)/(1 − 0.5²)] = √[(0.1681 + 0.5041 − 0.2911)/0.75] = √(0.3811/0.75) = √0.5081 = 0.71
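The partial and multiple correlation formulas for three variables can be checked with the following Python sketch. The function names partial_r and multiple_r are assumptions made for illustration; the input values are those of problem [45].

```python
from math import sqrt

def partial_r(r_ab, r_ac, r_bc):
    """r_ab.c: correlation between a and b with the effect of c removed."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def multiple_r(r12, r13, r23):
    """R1.23: multiple correlation of variable 1 on variables 2 and 3."""
    return sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2))

r12, r13, r23 = 0.41, 0.71, 0.5
r12_3 = partial_r(r12, r13, r23)   # variable 3 held fixed
r13_2 = partial_r(r13, r12, r23)   # variable 2 held fixed
R1_23 = multiple_r(r12, r13, r23)
print(round(r12_3, 2), round(r13_2, 2), round(R1_23, 2))
```

As a sanity check, R1.23 can never be smaller than |r12| or |r13|, and the value obtained here respects that bound.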
[46] In a three - variate multiple correlation analysis, the following results were found:
x̄1 = 60, x̄2 = 70, x̄3 = 100
s1 = 3, s2 = 4, s3 = 5
r12 = 0.7, r13 = 0.6, r23 = 0.4

the symbols having their usual significance. Find regression of x1 on x2 and x3,
and the multiple correlation coefficient R1.23.
[B.U., M.A.(Econ.), 1968]
Solution:
x1 − x̄1 = b12.3(x2 − x̄2) + b13.2(x3 − x̄3)
b12.3 = (σ1/σ2) × (r12 − r13 r23)/(1 − r23²) = (3/4) × (0.7 − 0.6 × 0.4)/(1 − 0.4²) = (3/4) × (0.7 − 0.24)/(1 − 0.16) = (3/4) × (0.46/0.84) = 0.41
Again, b13.2 = (σ1/σ3) × (r13 − r12 r23)/(1 − r23²) = (3/5) × (0.6 − 0.7 × 0.4)/(1 − 0.4²) = (3/5) × (0.6 − 0.28)/(1 − 0.16) = (3/5) × (0.32/0.84) = 0.23
∴ x1 – 60 = 0.41(x2 – 70) + 0.23(x3 – 100)
or, x1 = 60 + 0.41x2 – 28.70 + 0.23x3 – 23
or, x1 = 8.30 + 0.41x2 + 0.23x3
Multiple correlation coefficient of x1 on x2 and x3:
R1.23 = √[(r12² + r13² − 2 r12 r23 r13)/(1 − r23²)]
= √[((0.7)² + (0.6)² − 2 × 0.7 × 0.4 × 0.6)/(1 − (0.4)²)] = √[(0.85 − 0.336)/0.84] = √(0.514/0.84) = 0.78
INTERPOLATION
Interpolation has been defined as the ‘art of reading between the lines of a table’, and
the term usually denotes the process of finding the intermediate value of a function
from a set of given values of that function.
Finite Differences: ∆ and E operators
In problems of interpolation, the independent variable x is often known as the ‘argument’, and
the dependent variable or the function y = f(x) is known as the ‘entry’. Let x0, x1, x2, ..., xn
denote a set of equidistant values of the argument, i.e. x1 − x0 = x2 − x1 = ... = xn − xn−1 = h,
where h is a constant, and let y0, y1, y2, ..., yn denote the corresponding values of the entry.
Differences of the successive values of y, viz. (y1 − y0), (y2 − y1), (y3 − y2), ..., (yn − yn−1),
are called finite differences of the first order and are denoted by ∆y0, ∆y1, ∆y2, ..., ∆yn−1
respectively.
The differences of the successive first order differences, namely
(∆y1 − ∆y0), (∆y2 − ∆y1), ..., (∆yn−1 − ∆yn−2), are known as finite differences of the
second order and are denoted by ∆²y0, ∆²y1, ..., ∆²yn−2 respectively. Similarly, the
third differences ∆³y, the fourth differences ∆⁴y and differences of higher order may
be defined.
Argument (x)    Entry (y)    First differences (∆y)    Second differences (∆²y)
x0              y0
                             y1 – y0 = ∆y0
x1              y1                                     ∆y1 – ∆y0 = ∆²y0
                             y2 – y1 = ∆y1
x2              y2                                     ∆y2 – ∆y1 = ∆²y1
                             y3 – y2 = ∆y2
x3              y3                                     ∆y3 – ∆y2 = ∆²y2
                             y4 – y3 = ∆y3
x4              y4
A table which shows the finite differences is known as Difference Table.
Table - Difference Table
Argument x    Entry y    First differences ∆y    Second differences ∆²y    Third differences ∆³y    Fourth differences ∆⁴y
x0 = 1 y0 = 1
14
x1 = 3 y1 = 15 36
50 24
x2 = 5 y2 = 65 60 0
110 24
x3 = 7 y3 = 175 84 0
194 24
x4 = 9 y4 = 369 108
302
x5 = 11 y5 = 671
The initial term y0 of the entry is called the leading term, and the initial terms in the
difference columns, viz. ∆y0, ∆²y0, ∆³y0, etc., are called leading differences.
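The construction of a difference table is mechanical and may be sketched in a few lines of Python. The function name difference_table is an assumption for illustration; the entries are those of the difference table shown above.

```python
def difference_table(ys):
    """Return the columns [y, Δy, Δ²y, ...] of the difference table as lists."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        # Each new column holds the differences of successive terms of the previous one.
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

entries = [1, 15, 65, 175, 369, 671]
table = difference_table(entries)
leading = [column[0] for column in table]  # leading term and leading differences

for order, column in enumerate(table):
    print(order, column)
```

The constant third differences (24) and vanishing fourth differences show that these entries come from a polynomial of the third degree.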
Another symbolic operator E is defined as follows:
Ey0 = y1, Ey1 = y2, Ey2 = y3, ...
Both the operators ∆ and E can be applied repeatedly, the repeated operations being
indicated by ∆², ∆³, ... and E², E³, etc. Thus
∆²y0 = ∆(∆y0) = ∆y1 – ∆y0 = (y2 – y1) – (y1 – y0) = y2 – 2y1 + y0
∆³y0 = ∆(∆²y0) = ∆²y1 – ∆²y0 = (y3 – 2y2 + y1) – (y2 – 2y1 + y0) = y3 – 3y2 + 3y1 – y0
Similarly, E²y0 = E(Ey0) = E(y1) = y2
E³y0 = E(E²y0) = E(y2) = y3
From the definitions, we may in general write
∆yr = yr+1 – yr
Eyr = yr+1
These operators may thus be interpreted in the following manner:
(a) ∆ when prefixed to yr implies that yr is to be subtracted from the next value of the
entry, yr+1.
(b) E when prefixed to yr denotes the next value of the entry, yr+1.
From (a) and (b) we find that
Eyr = yr + ∆yr
or, Eyr = (1 + ∆)yr (symbolically)
Omitting yr from both sides, we find that the operators E and ∆ are connected by the
symbolic relation E ≡ 1 + ∆
This does not mean that 1 added to ∆ gives E, but that the operation by E is
equivalent to the operation by (1 + ∆). It may be shown that the above relation follows
certain algebraic rules.
As shown earlier, we have
∆y0 = y1 – y0
∆²y0 = y2 – 2y1 + y0
∆³y0 = y3 – 3y2 + 3y1 – y0
∆⁴y0 = y4 – 4y3 + 6y2 – 4y1 + y0
Alternatively, we may write
∆y0 = Ey0 – y0
∆²y0 = E²y0 – 2Ey0 + y0
∆³y0 = E³y0 – 3E²y0 + 3Ey0 – y0
∆⁴y0 = E⁴y0 – 4E³y0 + 6E²y0 – 4Ey0 + y0
With the operators only (removing the y0’s from both sides),
∆ = E – 1
∆² = E² – 2E + 1 = (E – 1)²
∆³ = E³ – 3E² + 3E – 1 = (E – 1)³
∆⁴ = E⁴ – 4E³ + 6E² – 4E + 1 = (E – 1)⁴
Thus, we have developed a convenient method of expressing the finite difference of any
order in terms of the entries.
Newton’s Forward Interpolation Formula
Let y0, y1, y2, ..., yn be some tabulated values of a function y = f(x) corresponding to the
equidistant values x = x0, x1, x2, ..., xn, where
x1 – x0 = x2 – x1 = x3 – x2 = ... = xn – xn–1 = h (say).
It is required to find the value of y corresponding to an intermediate value of x
lying near the beginning of the tabulated values. This is given by Newton’s Forward
Interpolation Formula:
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{1\times 2}\,\Delta^2 y_0 + \frac{u(u-1)(u-2)}{1\times 2\times 3}\,\Delta^3 y_0 + \cdots + \frac{u(u-1)(u-2)\cdots(u-n+1)}{1\times 2\times 3\times\cdots\times n}\,\Delta^n y_0$$
where $u = \dfrac{x - x_0}{h}$.
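Newton's forward formula can be sketched in Python as below. The function name newton_forward is an assumption for illustration; the data are the entries of the difference table given earlier in this chapter, which follow y = (x³ + x)/2, so the interpolated values can be verified exactly.

```python
def newton_forward(xs, ys, x):
    """Newton's forward interpolation for equidistant arguments xs."""
    h = xs[1] - xs[0]
    u = (x - xs[0]) / h

    # Collect the leading term y0 and the leading differences Δy0, Δ²y0, ...
    column = list(ys)
    leading = [column[0]]
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        leading.append(column[0])

    total, coeff = 0.0, 1.0
    for k, diff in enumerate(leading):
        total += coeff * diff          # coeff = u(u-1)...(u-k+1)/k!
        coeff *= (u - k) / (k + 1)     # prepare the coefficient of the next term
    return total

xs = [1, 3, 5, 7, 9, 11]
ys = [1, 15, 65, 175, 369, 671]
print(newton_forward(xs, ys, 2), newton_forward(xs, ys, 4))
```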
Newton’s Backward Interpolation Formula
Let x0, x1, x2, ..., xn be some equidistant values of the argument and y0, y1, ..., yn the
corresponding entries. It is required to find the value of y corresponding to a specified
value of x lying near the end of the tabulated values.
This is given by Newton’s Backward Interpolation Formula:
$$y = y_n + v\,\Delta y_{n-1} + \frac{v(v+1)}{1\times 2}\,\Delta^2 y_{n-2} + \frac{v(v+1)(v+2)}{1\times 2\times 3}\,\Delta^3 y_{n-3} + \cdots + \frac{v(v+1)(v+2)\cdots(v+n-1)}{1\times 2\times 3\times\cdots\times n}\,\Delta^n y_0$$
where $v = \dfrac{x - x_n}{h}$.
Central Difference Formulae - Stirling’s and Bessel’s: When the successive values of
the argument are equidistant, Newton’s forward and backward formulae are generally
applicable in all cases of interpolation. However, for interpolation near the middle of the
set of tabulated values, Central Difference formulae, which utilize differences near the
central part of the difference table, are found more useful, because the successive terms
converge more rapidly than in Newton’s forward and backward formulae.
Let the function y = f(x) be tabulated for some equidistant values of the argument,
(x0 – nh), ..., (x0 – 2h), (x0 – h), x0, (x0 + h), (x0 + 2h), ..., (x0 + nh), the common
difference being h, and let the corresponding entries be denoted by y–n, ..., y–2, y–1, y0, y1,
y2, ..., yn.
It is required to interpolate the value of the function y = f(x) for a value of x in the interval
(x0 – h) to (x0 + h).
I. Stirling’s Interpolation Formula
$$y = y_0 + u\cdot\frac{\Delta y_0 + \Delta y_{-1}}{2} + \frac{u^2}{2!}\,\Delta^2 y_{-1} + \frac{u(u^2-1^2)}{3!}\cdot\frac{\Delta^3 y_{-1} + \Delta^3 y_{-2}}{2} + \frac{u^2(u^2-1^2)}{4!}\,\Delta^4 y_{-2} + \frac{u(u^2-1^2)(u^2-2^2)}{5!}\cdot\frac{\Delta^5 y_{-2} + \Delta^5 y_{-3}}{2} + \cdots$$
where $u = (x - x_0)/h$.
Stirling’s formula is appropriate when the value of u lies in the interval -0.25 to +0.25.
It uses y0 and even order differences on the horizontal line through y0, but the average
of odd order differences above and below that line.

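II. Bessel’s Interpolation Formula
$$y = \frac{y_0 + y_1}{2} + v\,\Delta y_0 + \frac{v^2 - \frac{1}{4}}{2!}\cdot\frac{\Delta^2 y_0 + \Delta^2 y_{-1}}{2} + \frac{v\left(v^2 - \frac{1}{4}\right)}{3!}\,\Delta^3 y_{-1} + \frac{\left(v^2 - \frac{1}{4}\right)\left(v^2 - \frac{9}{4}\right)}{4!}\cdot\frac{\Delta^4 y_{-1} + \Delta^4 y_{-2}}{2} + \frac{v\left(v^2 - \frac{1}{4}\right)\left(v^2 - \frac{9}{4}\right)}{5!}\,\Delta^5 y_{-2} + \cdots$$
where $u = (x - x_0)/h$ and $v = u - \frac{1}{2}$.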
Bessel’s formula is appropriate for values of u lying in the interval +0.25 to +0.75, i.e.
for interpolation in the middle half of the interval between two tabulated values.
Bessel’s formula is found especially useful for interpolating the value of the
function exactly at the middle of two given values, i.e. when u = 1/2, so that v = 0. Every
alternate term in Bessel’s formula then vanishes. This special case of Bessel’s formula is
known as the “Formula for Interpolating to Halves”.
Lagrange’s Interpolation Formula
Let y0, y1, y2, ..., yn denote the tabulated values of a function y = f(x) corresponding
to the values of the argument x0, x1, x2, ..., xn (which may not be equidistant). It is
required to find the value of y corresponding to a specified value of x lying in between
the given values. This is obtained by using Lagrange’s Interpolation Formula:
$$y = \frac{(x-x_1)(x-x_2)\cdots(x-x_n)}{(x_0-x_1)(x_0-x_2)\cdots(x_0-x_n)}\,y_0 + \frac{(x-x_0)(x-x_2)\cdots(x-x_n)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)}\,y_1 + \cdots + \frac{(x-x_0)(x-x_1)\cdots(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)\cdots(x_n-x_{n-1})}\,y_n$$
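Lagrange's formula translates almost term by term into code. The sketch below uses exact fractions and integer arguments; the function name lagrange and the sample points are assumptions made for illustration. The points are taken, at unequal intervals, from the function y = (x³ + x)/2 tabulated earlier in this chapter, so the cubic through them reproduces that function exactly.

```python
from fractions import Fraction

def lagrange(xs, ys, x):
    """Lagrange interpolation at integer x; the xs need not be equidistant."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                # Multiply by (x - xj)/(xi - xj) for every other argument xj.
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

xs = [1, 3, 7, 11]          # unequal intervals
ys = [1, 15, 175, 671]      # values of (x**3 + x)/2
print(lagrange(xs, ys, 5))
```

Interchanging the roles of the two lists, i.e. calling lagrange(ys, xs, y), gives the inverse interpolation described in the next section.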
Inverse Interpolation
Given a set of tabulated values of a function y = f(x) corresponding to some values of
the argument x, the process of finding the value of the argument for an intermediate
value of the function is called ‘Inverse Interpolation’.
Let y0, y1, ..., yn be a set of tabulated values of a function y = f(x) corresponding to
some given values (usually equidistant) of the argument x = x0, x1, x2, ..., xn. In some
cases it may be required to determine the value of x for another value of y lying in
between y0, y1, ..., yn. The process of finding such a value of x is known as ‘inverse
interpolation’. If y is a function of x, say y = f(x), in many cases it is possible to treat x
also as a function of y, say x = g(y). Since Lagrange’s interpolation formula is applicable
for unequal intervals, we can use this formula for inverse interpolation, interchanging
the roles of the argument x and the function y.

$$x = \frac{(y-y_1)(y-y_2)(y-y_3)\cdots(y-y_n)}{(y_0-y_1)(y_0-y_2)(y_0-y_3)\cdots(y_0-y_n)}\,x_0 + \frac{(y-y_0)(y-y_2)(y-y_3)\cdots(y-y_n)}{(y_1-y_0)(y_1-y_2)(y_1-y_3)\cdots(y_1-y_n)}\,x_1 + \frac{(y-y_0)(y-y_1)(y-y_3)\cdots(y-y_n)}{(y_2-y_0)(y_2-y_1)(y_2-y_3)\cdots(y_2-y_n)}\,x_2 + \cdots + \frac{(y-y_0)(y-y_1)(y-y_2)\cdots(y-y_{n-1})}{(y_n-y_0)(y_n-y_1)(y_n-y_2)\cdots(y_n-y_{n-1})}\,x_n$$
Exercise
[1] The following data show the monthly average number of deaths under one year
in a certain large city. Find the missing term?
Year 1960 1961 1962 1963 1964
Number of deaths (monthly average) 940 ? 907 843 798
[I.C.W.A. 1972]
Solution:
Since only 4 values are given, we assume a third degree polynomial, so that
the fourth differences are zero. Denoting the entries corresponding to the years 1960,
1961, 1962, etc. by y0, y1, y2, ..., we then have
∆⁴y0 = 0
or, (E – 1)⁴y0 = 0
Expanding, we get
y4 – 4y3 + 6y2 – 4y1 + y0 = 0
Putting y0 = 940, y2 = 907, y3 = 843, y4 = 798,
798 – 4 × 843 + 6 × 907 – 4y1 + 940 = 0
or, 7180 – 3372 = 4y1
or, 3808 = 4y1 ∴ y1 = 3808/4 = 952
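The method of this exercise (setting ∆ⁿy0 = (E − 1)ⁿy0 = 0 and solving for the single unknown entry) can be sketched in Python. The function name missing_term and the use of None to mark the gap are assumptions made for illustration; the entries must be equally spaced with exactly one value missing.

```python
from math import comb

def missing_term(entries):
    """Solve Δ^n y0 = 0 for the single None entry, where n = len(entries) - 1."""
    n = len(entries) - 1
    k = entries.index(None)
    # Δ^n y0 = sum over i of (-1)^(n-i) * C(n, i) * y_i
    known = sum((-1) ** (n - i) * comb(n, i) * y
                for i, y in enumerate(entries) if y is not None)
    coefficient = (-1) ** (n - k) * comb(n, k)
    return -known / coefficient

deaths = [940, None, 907, 843, 798]
print(missing_term(deaths))
```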
[2] The following gives the amount y of cement in thousands of tons manufactured
in India in the year x. Find the missing term.

[I.C.W.A. 1969]
Solution:
Since five values are given, we assume a 4th degree polynomial, so that the 5th
differences are zero. Denoting the entries corresponding to the years 1946, 1948, 1950,
1952, 1954, etc. by y0, y1, y2, y3, ..., we then have
∆⁵y0 = 0
or, (E – 1)⁵y0 = y5 – 5y4 + 10y3 – 10y2 + 5y1 – y0 = 0
Putting y0 = 39, y1 = 85, y3 = 151, y4 = 264, y5 = 388 in the above equation,
388 – 5 × 264 + 10 × 151 – 10y2 + 5 × 85 – 39 = 0
or, 388 – 1320 + 1510 – 10y2 + 425 – 39 = 0
or, 2323 – 1359 – 10y2 = 0
or, 10y2 = 964
∴ y2 = 964/10 = 96.4
[3] The growth of population in India, according to the decennial census, is shown
below:
Year 1901 1911 1921 1931 1941 1951
Population (Lakh) 2384 2522 2514 2791 ------- 3613
The census figure for 1941 is not given here. Give an estimate of the actual
population for 1941.
[C.U., B.Sc., 71]
Solution:
Since only 5 values are known, we assume a 4th degree polynomial for f(x),
so that the 5th order differences are zero. Denoting the entries
corresponding to the years 1901, 1911, 1921, 1931, 1941, etc. by y0, y1, y2, ..., we have
then
∆⁵y0 = (E – 1)⁵y0
= y5 – 5y4 + 10y3 – 10y2 + 5y1 – y0 = 0
Putting y0 = 2384, y1 = 2522, y2 = 2514, y3 = 2791, y5 = 3613 in the above equation,
3613 – 5y4 + 10 × 2791 – 10 × 2514 + 5 × 2522 – 2384 = 0
or, 3613 – 5y4 + 27910 – 25140 + 12610 – 2384 = 0
or, 44133 – 27524 = 5y4
or, 5y4 = 16609
or, y4 = 16609/5 = 3321.8 ≈ 3322 lakh.
[4] Below are given the values of a function Ux for certain values of x :
x 0 1 2 3 4
Ux 1 0 5 22 57
Construct the table of differences. What does the table suggest? Use this table to
find U5.
[I.C.W.A., 1976]
Solution:
Since the given values of x are equidistant, the value U5 can be found from the difference table.
Difference Table
x    Ux    ∆Ux    ∆²Ux    ∆³Ux    ∆⁴Ux
0    1
           –1
1    0            6
           5              6
2    5            12              0
           17             6
3    22           18
           35
4    57
The third differences are constant and the fourth difference is zero, which suggests that Ux is a 3rd degree polynomial in x.
So the 4th differences may be regarded as zero, i.e. ∆⁴Ux = 0
In particular, ∆⁴U1 = 0 or, (E – 1)⁴U1 = 0
or, (E⁴ – 4E³ + 6E² – 4E + 1)U1 = 0
or, U5 – 4U4 + 6U3 – 4U2 + U1 = 0
or, U5 – 4 × 57 + 6 × 22 – 4 × 5 + 0 = 0
or, U5 = 228 – 132 + 20 – 0 = 248 – 132 = 116
[5] Form a difference table and find the values of y3 and y9 from the following :
y4=135, y5=432, y6=1015, y7=2016, y8=3591
Solution:
Since only 5 values are known, we assume a 4th degree polynomial for f(x), so
that 5th and higher order differences are zero. In particular, ∆⁵y0 = 0
or, (E – 1)⁵y0 = 0
or, E⁵y0 – 5E⁴y0 + 10E³y0 – 10E²y0 + 5Ey0 – y0 = 0
or, y5 – 5y4 + 10y3 – 10y2 + 5y1 – y0 = 0
Replacing y5, y4, y3, y2, y1, y0 by y8, y7, y6, y5, y4, y3 respectively, we get
y8 – 5y7 + 10y6 – 10y5 + 5y4 – y3 = 0
or, y3 = 3591 – 5 × 2016 + 10 × 1015 – 10 × 432 + 5 × 135
= 3591 – 10080 + 10150 – 4320 + 675 = 14416 – 14400 = 16
Difference Table
x f(x) = yx ∆yx ∆²yx ∆³yx ∆⁴yx
4 y4 = 135
297
5 y5 = 432 286
583 132
6 y6 = 1015 418 24
1001 156
7 y7 = 2016 574
1575
8 y8 = 3591

Newton’s backward interpolation formula, with xn = 8, is:
f(x) = y8 + v∆y7 + [v(v + 1)/2]∆²y6 + [v(v + 1)(v + 2)/(2 × 3)]∆³y5 + [v(v + 1)(v + 2)(v + 3)/(2 × 3 × 4)]∆⁴y4
where v = (x − xn)/h = (9 − 8)/1 = 1
f(9) = 3591 + 1 × 1575 + [1(1 + 1)/2] × 574 + [1(1 + 1)(1 + 2)/(2 × 3)] × 156 + [1(1 + 1)(1 + 2)(1 + 3)/(2 × 3 × 4)] × 24
= 3591 + 1575 + 574 + 156 + 24 = 5920
[6] [a] Given y0=8, y1=6, y2=4, y4=24, find the value of y3.
[b] Given y2=5, y5=122, y6=193, find the value of y3 and y4
[c] Given u5=5, U10=10, U20=–24, U25=–33, estimate U15 under suitable assumption.
Solution:
[a] Since 4 values are known, we assume a third degree polynomial for yx, so
that 4th and higher order differences are zero.
In particular, ∆⁴y0 = 0
or, (E – 1)⁴y0 = 0
Expanding, we get (E⁴ – 4E³ + 6E² – 4E + 1)y0 = 0
or, y4 – 4y3 + 6y2 – 4y1 + y0 = 0
or, 24 – 4y3 + 6 × 4 – 4 × 6 + 8 = 0
or, 4y3 = 24 + 24 – 24 + 8 = 32
∴ y3 = 32/4 = 8
[b] Since 3 values are given, we assume a 2nd degree polynomial for yx, so
that 3rd and higher order differences are zero. In particular,
∆³y0 = 0
or, (E – 1)³y0 = 0
or, (E³ – 3E² + 3E – 1)y0 = 0
or, y3 – 3y2 + 3y1 – y0 = 0
Replacing y0, y1, y2, y3 by y2, y3, y4, y5 respectively,
y5 – 3y4 + 3y3 – y2 = 0
or, 122 – 3y4 + 3y3 – 5 = 0
or, 3y3 – 3y4 + 117 = 0 ------ (1)
Similarly, ∆³y1 = 0 gives y4 – 3y3 + 3y2 – y1 = 0. Replacing y1, y2, y3, y4 by y3, y4, y5, y6 respectively,
y6 – 3y5 + 3y4 – y3 = 0
or, 193 – 3 × 122 + 3y4 – y3 = 0
or, 3y4 – y3 – 173 = 0 ------ (2)
Solving equations (1) and (2), we get
3y4 – y3 – 173 = 0 ------ (2)
3y3 – 3y4 + 117 = 0 ------ (1)
(Adding) 2y3 – 56 = 0
or, 2y3 = 56
∴ y3 = 56/2 = 28
Putting the value y3 = 28 in (1),
3 × 28 – 3y4 + 117 = 0
or, 84 – 3y4 + 117 = 0
or, 3y4 = 201
∴ y4 = 201/3 = 67
[c] Difference Table
Ux Value of y = f(x) ∆y ∆y2 ∆y3
5 3
–3
10 0 -21
→15 -24 -12

20 -24 -33
-57
25 -33
Since only 4 values are known, we assume a 3rd degree polynomial for
f(x), so that 4th and higher order differences are zero. In particular, denoting the
entries U5, U10, U15, U20, U25 by y0, y1, y2, y3, y4,
∆⁴y0 = 0
or, (E – 1)⁴y0 = 0
Expanding, we get
y4 – 4y3 + 6y2 – 4y1 + y0 = 0
or, –33 – 4 × (–24) + 6y2 – 4 × 0 + 3 = 0
or, 6y2 = 33 – 96 – 3 = –99 + 33 = –66
∴ y2 = –66/6 = –11
[7] Find the missing term:
x 0 1 2 3 4
f(x) 1 3 9 ? 81
Solution:
Since 4 values are given, we assume a 3rd degree polynomial, so that the 4th
differences are zero. Denoting the entries corresponding to x = 0, 1, 2, 3, 4 by y0, y1, y2, y3,
y4, we have then
∆⁴y0 = 0
or, (E – 1)⁴y0 = 0
Expanding, we get
y4 – 4y3 + 6y2 – 4y1 + y0 = 0
or, 81 – 4y3 + 6 × 9 – 4 × 3 + 1 = 0
or, 4y3 = 81 + 54 – 12 + 1 = 136 – 12 = 124
∴ y3 = 124/4 = 31
[8] Find y for x = 2, from the following table:
x 0 1 3 4 5
y 39 85 151 264 388
[I.C.W.A., 1969]
Solution:
Since 5 values are given, we assume a 4th degree polynomial, so that the 5th differences
are zero. Denoting the entries corresponding to x = 0, 1, 2, 3, 4, 5 by y0, y1, y2, y3, y4, y5, we
have then
∆⁵y0 = 0
or, (E – 1)⁵y0 = 0
or, y5 – 5y4 + 10y3 – 10y2 + 5y1 – y0 = 0
or, 388 – 5 × 264 + 10 × 151 – 10y2 + 5 × 85 – 39 = 0
or, 10y2 = 388 – 1320 + 1510 + 425 – 39
= 2323 – 1359 = 964
∴ y2 = 964/10 = 96.4
[9] Find f(5) from the following data :
f(3)=4, f(4)=13, f(6)=43
Solution:
Since only 3 values are known, we assume a 2nd degree polynomial for f(x), so
that 3rd and higher order differences are zero. Denoting the entries
f(3), f(4), f(5), f(6) by y0, y1, y2, y3, we have in particular ∆³y0 = 0
or, (E – 1)³y0 = 0
Expanding, we get
y3 – 3y2 + 3y1 – y0 = 0
or, 43 – 3y2 + 3 × 13 – 4 = 0
or, 3y2 = 43 + 39 – 4 = 82 – 4 = 78
or, y2 = 78/3 = 26
∴ f(5) = 26
[10] Find the polynomial function f(x) from the following values f(3)=–1, f(4)=5, f(5)=15
Solution:
Since only 3 values of the function are known, we may assume that the
polynomial is of 2nd degree: f(x) = a + bx + cx² ------- (i)
where a, b, c are certain constants to be determined.
Putting x = 3, 4, 5 in equation (i), we get
f(3) = a + 3b + 9c i.e. a + 3b + 9c = –1 -------(ii)
f(4) = a + 4b + 16c i.e. a + 4b + 16c = 5 --------(iii)
f(5) = a + 5b + 25c i.e. a + 5b + 25c = 15 ------(iv)
Solving these 3 equations, we get
From (ii) & (iii) a + 3b + 9c = –1
a + 4b + 16c = 5
(Subtracting) –b – 7c = –6
or, b + 7c = 6 ------ (v)
From (iii) & (iv) a + 4b + 16c = 5
a + 5b + 25c = 15
(Subtracting) –b – 9c = –10
or, b + 9c = 10 ------(vi)
From (v) & (vi) b + 7c = 6
b + 9c = 10
(Subtracting) –2c = –4
or, c = 2
Putting the value c = 2 in equation (vi), we get
b + 9 × 2 = 10
or, b = –8
Putting the values b = –8 and c = 2 in equation (ii), we get
a + 3 × (–8) + 9 × 2 = –1
or, a = 24 – 18 – 1 = 5
Hence the required polynomial is
f(x) = 5 – 8x + 2x²
or, f(x) = 2x² – 8x + 5
[11] Given the following table, find the function f(x), assuming it to be a polynomial
of the 3rd degree in x
x 0 1 2 3
f(x) 1 2 11 34
[I.C.W.A., 1975]
Solution:
We assume the polynomial as f(x) = a + bx + cx² + dx³,
where a, b, c, d are certain constants to be determined.
Putting x = 0, 1, 2, 3 successively,
f(0) = a = 1
f(1) = a + b + c + d = 2
f(2) = a + 2b + 4c + 8d = 11
f(3) = a + 3b + 9c + 27d = 34
Hence a = 1 -------(i)
b + c + d = 1 -------(ii)
2b + 4c + 8d = 10, or b + 2c + 4d = 5 -------(iii)
3b + 9c + 27d = 33, or b + 3c + 9d = 11 -------(iv)
From (ii) & (iii) b + c + d = 1
b + 2c + 4d = 5
(Subtracting) –c – 3d = –4
or, c + 3d = 4 --------(v)
From (iii) & (iv) b + 2c + 4d = 5
b + 3c + 9d = 11
(Subtracting) –c – 5d = –6
or, c + 5d = 6 -------(vi)
From (v) & (vi) c + 3d = 4
c + 5d = 6
(Subtracting) –2d = –2
or, d = 1
Putting the value d = 1 in equation (v), we get
c + 3 × 1 = 4
or, c = 1
Putting c = 1 and d = 1 in equation (ii), we get
b + 1 + 1 = 1
∴ b = –1
The required polynomial is
1 + (–1)x + 1x² + 1x³ = 1 – x + x² + x³
∴ f(x) = x³ + x² – x + 1
[12] Ux is a polynomial in x. Given the following table find Ux

x 0 1 2 5
Ux 2 3 12 147
Solution:
Since only 4 values of the function Ux are known, we may assume that the
polynomial is of 3rd degree:
Ux = a + bx + cx² + dx³
Putting x = 0, 1, 2, 5 successively,
a = 2 ------(i)
a + b + c + d = 3 ------(ii)
a + 2b + 4c + 8d = 12 ------(iii)
a + 5b + 25c + 125d = 147 ------(iv)
Putting a = 2, these become
b + c + d = 1 ------(ii)
2b + 4c + 8d = 10
or, b + 2c + 4d = 5 ------(iii)
5b + 25c + 125d = 145
or, b + 5c + 25d = 29 ------(iv)
From (ii) & (iii) b + c + d = 1
b + 2c + 4d = 5
(Subtracting) –c – 3d = –4
or, c + 3d = 4 ------(v)
From (iii) & (iv) b + 2c + 4d = 5
b + 5c + 25d = 29
(Subtracting) –3c – 21d = –24
or, c + 7d = 8 ------(vi)
From (v) & (vi) c + 3d = 4
c + 7d = 8
(Subtracting) –4d = –4
or, d = 1
Putting the value of d in (vi), we get
c + 7 × 1 = 8 or, c = 1
Putting the values of c and d in (ii), we get
b + 1 + 1 = 1 or, b = –1
Therefore, the polynomial is 2 + (–1)x + 1x² + 1x³, i.e.
Ux = x³ + x² – x + 2
[13] Below are given the values of a function f(x) for certain values of x. Find f(2),
stating your assumption.
x 0 1 3 4
f(x) 5 6 50 105
[I.C.W.A, 1975]
Solution:
Since 4 values are given, we assume a polynomial of degree 3, so that the
4th differences are zero. Denoting the entries corresponding to x = 0, 1, 2, 3, 4 by
y0, y1, y2, y3, y4, we have ∆⁴y0 = 0
or, y4 – 4y3 + 6y2 – 4y1 + y0 = 0
or, 105 – 4 × 50 + 6y2 – 4 × 6 + 5 = 0
or, 105 – 200 + 6y2 – 24 + 5 = 0
or, –224 + 110 + 6y2 = 0
or, 6y2 = 114
or, y2 = 114/6 = 19
∴ f(2) = 19
[14] Find, f(x) given that f(0)=–3, f(1)=6, f(2)=8, f(3)=12 (state your assumptions, if
any). Hence find f(6).
[I.C.W.A. 1976]

Solution:
Since only 4 values of the function are known, we may assume that the polynomial
is of 3rd degree,
∴ f(x) = a + bx + cx² + dx³
Putting x = 0, 1, 2, 3 in the polynomial,
a = –3 ------(i)
a + b + c + d = 6 -----(ii)
a + 2b + 4c + 8d = 8 -----(iii)
a + 3b + 9c + 27d = 12 -----(iv)
Putting a = –3, these become
b + c + d = 9 -----(ii)
2b + 4c + 8d = 11 -----(iii)
3b + 9c + 27d = 15
or, b + 3c + 9d = 5 -----(iv)
From (ii) & (iii) 2b + 4c + 8d = 11
2b + 2c + 2d = 18
(Subtracting) 2c + 6d = –7 -----(v)
From (iii) & (iv) 2b + 6c + 18d = 10
2b + 4c + 8d = 11
(Subtracting) 2c + 10d = –1 -----(vi)
From (v) & (vi) 2c + 10d = –1
2c + 6d = –7
(Subtracting) 4d = 6
∴ d = 6/4 = 3/2
Putting the value of d in equation (vi), we get
2c + 10 × (3/2) = –1
or, 2c = –1 – 15 = –16
∴ c = –8
Putting the values of c and d in equation (ii),
b + (–8) + 3/2 = 9
∴ b = 9 − 3/2 + 8 = 17 − 3/2 = 31/2
Hence f(x) = –3 + (31/2)x – 8x² + (3/2)x³
∴ f(6) = –3 + (31/2) × 6 + (–8) × 36 + (3/2) × 216
= –3 + 31 × 3 – 288 + 3 × 108
= –3 + 93 – 288 + 324
= 417 – 291 = 126

[15] Using any algebraic method, find the value of y when x = 6.

x 3 7 9 10
y 16.8 12.0 7.2 6.3
[I.C.W.A., 1975]
Solution:
We write the 3rd degree polynomial in the form
f(x) = A + B(x – 3) + C(x – 3)(x – 7) + D(x – 3)(x – 7)(x – 9)
where A, B, C, D are constants to be determined.
Putting x = 3, A = 16.8
Putting x = 7, f(7) = A + 4B = 16.8 + 4B = 12
or, 4B = –4.8
∴ B = –4.8/4 = –1.2
Putting x = 9, f(9) = A + 6B + C × 6 × 2
= A + 6B + 12C
= 16.8 + 6 × (–1.2) + 12C = 7.2
or, 16.8 – 7.2 + 12C = 7.2
or, 12C = 7.2 – 9.6 = –2.4
∴ C = –2.4/12 = –0.2
Putting x = 10, f(10) = A + 7B + C × 7 × 3 + D × 7 × 3 × 1 = 6.3
or, A + 7B + 21C + 21D = 6.3
or, 16.8 + 7 × (–1.2) + 21 × (–0.2) + 21D = 6.3
or, 16.8 – 8.4 – 4.2 + 21D = 6.3
or, 21D = 6.3 – 4.2 = 2.1
∴ D = 2.1/21 = 0.1
∴ f(x) = 16.8 – 1.2(x – 3) – 0.2(x – 3)(x – 7) + 0.1(x – 3)(x – 7)(x – 9)
f(6) = 16.8 – 1.2(6 – 3) – 0.2(6 – 3)(6 – 7) + 0.1(6 – 3)(6 – 7)(6 – 9)
= 16.8 – 1.2 × 3 – 0.2 × 3 × (–1) + 0.1 × 3 × (–1) × (–3)
= 16.8 – 3.6 + 0.6 + 0.9
= 18.3 – 3.6 = 14.7
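The constants A, B, C, D found above are the divided differences of the data, so the same polynomial can be built mechanically. The sketch below is one way of doing this in Python; the function names divided_differences and newton_eval are assumptions made for illustration.

```python
def divided_differences(xs, ys):
    """Coefficients A, B, C, ... of A + B(x-x0) + C(x-x0)(x-x1) + ..."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Work from the bottom up so lower-order values are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs

def newton_eval(xs, coeffs, x):
    """Evaluate the polynomial by nested multiplication."""
    result = coeffs[-1]
    for c, xk in zip(reversed(coeffs[:-1]), reversed(xs[:-1])):
        result = result * (x - xk) + c
    return result

xs = [3, 7, 9, 10]
ys = [16.8, 12.0, 7.2, 6.3]
coeffs = divided_differences(xs, ys)
print([round(c, 4) for c in coeffs], round(newton_eval(xs, coeffs, 6), 4))
```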
[16] For a certain polynomial function yx it is known that y1 = –1, y2+y3=–1, y4+y5+y6 = 61.
Find yx and hence value of y3.
Solution:
Since only 3 conditions are available, we assume a 2nd degree polynomial for yx and
we write $y_x = a + bx + cx^2$
Putting x=0, y0=a
Putting x=1, y1=a+b+c=–1 ------(1)
Putting x=2 and 3,
y2=a+2b+4c
y3=a+3b+9c
and adding, y2 + y3 = 2a + 5b + 13c = –1 ------(2)
Putting x=4, 5 and 6,
y4=a+4b+16c
y5=a+5b+25c
y6=a+6b+36c
and adding, y4+y5+y6=3a+15b+77c=61 ------(3)
Solving equations (1), (2) and (3):
Multiplying (1) by 2,
2a+2b+2c=–2
2a+5b+13c=–1
(Subtracting) –3b–11c=–1
or, 3b + 11c = 1 ------(4)
Multiplying (1) by 3,
3a+3b+3c=–3
3a+15b+77c=61
(Subtracting) –12b–74c=–64
or, 6b+37c=32 ------(5)
From (5) and 2 × (4):
6b+37c=32
6b+22c=2
(Subtracting) 15c=30
or, c=2
Putting the value of c in equation (4)
3b+11×2=1
3b=1–22=–21
b=–7
Putting the values of b and c in equation (1)
a+b+c=–1
or, a–7+2=–1
or, a–5=–1
or, a=4
∴ $y_x = 4 + (-7)x + 2x^2$
$y_x = 4 - 7x + 2x^2$
When x=3, y3=4–21+18=22–21=1
[17] Given U0 + U6 = –107, U1 + U5 = –36, U2 + U4 = –3, find the value of U3.
Solution:
Since 3 values are given, we assume a 2nd degree polynomial for U, so that 3rd and
higher order differences are zero. In particular,
$\Delta^6 U_0 = 0$
or, $(E - 1)^6 U_0 = 0$. Expanding,
$U_6 - 6U_5 + 15U_4 - 20U_3 + 15U_2 - 6U_1 + U_0 = 0$
or, $(U_0 + U_6) - 6(U_1 + U_5) + 15(U_2 + U_4) - 20U_3 = 0$
or, $-107 - 6 \times (-36) + 15 \times (-3) - 20U_3 = 0$
or, $-107 + 216 - 45 - 20U_3 = 0$
or, $216 - 152 - 20U_3 = 0$
or, $20U_3 = 64$
or, $U_3 = \frac{64}{20} = 3.2$
[18] From the following table estimate by interpolation the number of units of a
commodity supplied when the price is Rs. 4 :-
Price in Rs. 1 3 5 7 9
No. of units supplied 256 625 935 1201 1433
Solution:
Let price of the commodity be x and number of units supplied by f(x).
Since the given values of x are equidistant and x = 4 lies near the
beginning of these values, we use Newton’s forward interpolation formula.
Difference Table
Price Rs. (x) No. of units y = f(x) ∆y ∆2y ∆3y ∆4y
1 256
369
3 625 –59
310 15
5 935 –44 -5
266 10
7 1201 –34
232
9 1433
$u = \frac{x - x_0}{h} = \frac{4 - 1}{2} = \frac{3}{2} = 1.5$
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\Delta^4 y_0$$
$$= 256 + 1.5 \times 369 + (-59) \times \frac{1.5(1.5-1)}{2!} + 15 \times \frac{1.5(1.5-1)(1.5-2)}{3!} + (-5) \times \frac{1.5(1.5-1)(1.5-2)(1.5-3)}{4!}$$
$$= 256 + 553.5 - \frac{59 \times 1.5 \times 0.5}{2} + \frac{15 \times 1.5 \times 0.5 \times (-0.5)}{6} + \frac{(-5) \times 1.5 \times 0.5 \times (-0.5) \times (-1.5)}{24}$$
= 256 + 553.5 – 22.13 – 0.94 – 0.12 = 809.5 – 23.19 = 786.31 ≈ 786
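Newton's forward formula lends itself to a short routine: build the leading differences, then accumulate the terms. The sketch below is illustrative only; the function names `forward_differences` and `newton_forward` are assumptions, and equal spacing of the x values is taken for granted rather than checked.

```python
def forward_differences(ys):
    """Return [y0, Δy0, Δ²y0, ...], the leading entries of the difference table."""
    leading, row = [ys[0]], list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        leading.append(row[0])
    return leading

def newton_forward(xs, ys, x):
    """Newton's forward interpolation for equally spaced xs."""
    h = xs[1] - xs[0]
    u = (x - xs[0]) / h
    total, term = 0.0, 1.0  # term holds u(u-1)...(u-k+1)/k!
    for k, d in enumerate(forward_differences(ys)):
        total += d * term
        term *= (u - k) / (k + 1)
    return total

units = newton_forward([1, 3, 5, 7, 9], [256, 625, 935, 1201, 1433], 4)
print(round(units, 2))
```

Carrying full precision gives a value that differs from the hand calculation only in the second decimal place, because the hand working rounds each term.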
[19] The following table gives the expectation of life e°x at age x. Calculate the
expectation of life at age 12 by Newton’s forward interpolation formula.
x 10 15 20 25 30 35
e°x 35.4 32.2 29.1 26.0 23.1 20.4
[I.C.W.A. 1977]
Solution:
Difference Table
x e0x ∆e0x ∆2e0x ∆3e0x ∆4e0x
10 35.4
-3.2
15 32.2 0.1
-3.1 -0.1
20 29.1 0 0.3
-3.1 0.2
25 26.0 0.2 -0.2
-2.9 0
30 23.1 0.2
-2.7
35 20.4
$u = \frac{x - x_0}{h} = \frac{12 - 10}{5} = \frac{2}{5} = 0.4$
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\Delta^4 y_0$$
$$= 35.4 - 3.2 \times 0.4 + \frac{0.1 \times 0.4(0.4-1)}{2!} + \frac{(-0.1) \times 0.4(0.4-1)(0.4-2)}{3!} + \frac{0.3 \times 0.4(0.4-1)(0.4-2)(0.4-3)}{4!}$$
= 35.4 – 1.28 – 0.012 – 0.0064 – 0.0125 = 35.4 – 1.311 = 34.089 ≈ 34.1
[20] The following shows the values of the function y = f(x) for a number of values of x :-
x 0.5 0.6 0.7 0.8 0.9
y 0.35207 0.33322 0.31225 0.28969 0.26609
Obtain the values of y when x = 0.58, using suitable interpolation formula.
[C.U., B.A.(Econ.), 1976]
Solution:
Difference Table
x y ∆y ∆2 y ∆3 y ∆4 y
0.5 0.35207
-0.01885
0.6 0.33322 -0.00212
-0.02097 0.00053
0.7 0.31225 -0.00159 0.00002
-0.02256 0.00055
0.8 0.28969 -0.00104
-0.02360
0.9 0.26609
$u = \frac{x - x_0}{h} = \frac{0.58 - 0.5}{0.1} = \frac{0.08}{0.1} = 0.8$
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\Delta^4 y_0$$
$$= 0.35207 + (-0.01885) \times 0.8 + \frac{(-0.00212)(0.8)(-0.2)}{2!} + \frac{0.00053 \times 0.8 \times (-0.2)(-1.2)}{3!} + \frac{0.00002 \times 0.8 \times (-0.2)(-1.2)(-2.2)}{4!}$$
= 0.35207 – 0.01508 + 0.00017 + 0.00002 – 0.0000004 = 0.33718
[21] The table below gives the average number of years of life remaining to persons
who survive to exact age x, for the male African population of the Belgian Congo.
x 0 5 10 15 20
e°x 37.64 44.04 41.40 37.78 34.41
Obtain e°2 approximately.
[I.C.W.A. 1973]
Solution:
Difference Table
x yx = e°x ∆y ∆2y ∆3y ∆4y
0 37.64
6.40
5 44.04 -9.04
-2.64 8.06
10 41.40 -0.98 -6.83
-3.62 1.23
15 37.78 0.25
-3.37
20 34.41
$u = \frac{x - x_0}{h} = \frac{2 - 0}{5} = \frac{2}{5} = 0.4$
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\Delta^4 y_0$$
$$\therefore e^{\circ}_2 = 37.64 + 6.40 \times 0.4 + \frac{(-9.04) \times 0.4 \times (-0.6)}{2!} + \frac{8.06 \times 0.4 \times (-0.6)(-1.6)}{3!} + \frac{(-6.83) \times 0.4 \times (-0.6)(-1.6)(-2.6)}{4!}$$
= 37.64 + 2.56 + 1.08 + 0.52 + 0.28 = 42.08
[22] State Newton’s Forward Interpolation Formula, and use it to find √5.5 given that
√5 = 2.236, √6 = 2.449, √7 = 2.646 and √8 = 2.828.
[I.C.W.A., 1974]
Solution:
Difference Table
x y = √x ∆y ∆2y ∆3y
5 2.236
0.213
6 2.449 -0.016
0.197 0.001
7 2.646 -0.015
0.182
8 2.828
Here the argument is x (not √x), so x0 = 5 and h = 1:
$u = \frac{x - x_0}{h} = \frac{5.5 - 5}{1} = 0.5$
Newton’s Forward Interpolation Formula:
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0$$
$$\therefore y = 2.236 + 0.213 \times 0.5 + \frac{(-0.016) \times 0.5 \times (0.5-1)}{2!} + \frac{0.001 \times 0.5 \times (0.5-1)(0.5-2)}{3!}$$
= 2.236 + 0.1065 + 0.002 + 0.00006 = 2.34456 ≈ 2.345
[23] The following table shows the number of earners earning incomes exceeding
different amounts during a certain period:
Income (Rs.) 50,000 75,000 1,00,000 1,25,000 1,50,000
No. of earners 412 304 225 147 88
Find the number of earners earning more than Rs. 60,000 by linear interpolation and
also by using Newton’s Forward Formula.
[C.U., B.A.(Econ.), 1977]
Solution:
Let income = x and number of earners = y.
Linear interpolation uses only the two tabulated values on either side of x = 60,000,
i.e. it retains the first difference alone:
$u = \frac{x - x_0}{h} = \frac{60{,}000 - 50{,}000}{25{,}000} = 0.4$
$y = y_0 + u\,\Delta y_0 = 412 + 0.4 \times (304 - 412) = 412 - 43.2 = 368.8 \approx 369$
For Newton’s Forward Formula we need the full difference table.
Difference Table
Income (Rs.) x No. of earners (y) ∆y ∆2y ∆3y ∆4y
50,000 412
-108
75,000 304 29
-79 -28
1,00,000 225 1 46
-78 18
1,25,000 147 19
-59
1,50,000 88
$u = \frac{x - x_0}{h} = \frac{60{,}000 - 50{,}000}{25{,}000} = \frac{10{,}000}{25{,}000} = 0.4$
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\Delta^4 y_0$$
$$= 412 + (-108) \times 0.4 + \frac{29 \times 0.4 \times (-0.6)}{2!} + \frac{(-28) \times 0.4 \times (-0.6)(-1.6)}{3!} + \frac{46 \times 0.4 \times (-0.6)(-1.6)(-2.6)}{4!}$$
= 412 – 43.2 – 3.48 + 1.79 – 1.91 = 365.20 ≈ 365
[24] Using suitable interpolation formulae, calculate the values of y, when (i) x = 10, and (ii) x = 25.
x 7 11 15 19 23 27
y 20,256 20,625 21,296 22,407 24,098 26,511
Solution:
Difference Table
x y ∆y ∆2y ∆3 y ∆4y ∆5y
7 20,256
369
11 20,625 302
671 138
15 21,296 440 2
1111 140 0
19 22,407 580 2
1691 142
23 24,098 722
2413
27 26,511
(i) $u = \frac{10 - 7}{4} = \frac{3}{4} = 0.75$
By Newton’s forward formula,
$$y = y_0 + u\,\Delta y_0 + \frac{u(u-1)}{2!}\Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!}\Delta^3 y_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\Delta^4 y_0$$
$$= 20256 + 369 \times 0.75 + \frac{302 \times 0.75(0.75-1)}{2!} + \frac{138 \times 0.75(0.75-1)(0.75-2)}{3!} + \frac{2 \times 0.75(0.75-1)(0.75-2)(0.75-3)}{4!}$$
= 20256 + 276.75 – 28.3125 + 5.3906 – 0.0439 = 20509.78 ≈ 20510
(ii) By Newton’s Backward Formula,
$v = \frac{x - x_n}{h} = \frac{25 - 27}{4} = -\frac{2}{4} = -0.5$
$$y = y_n + v\,\Delta y_{n-1} + \frac{v(v+1)}{2!}\Delta^2 y_{n-2} + \frac{v(v+1)(v+2)}{3!}\Delta^3 y_{n-3} + \frac{v(v+1)(v+2)(v+3)}{4!}\Delta^4 y_{n-4}$$
$$= 26511 + 2413 \times (-0.5) + 722 \times \frac{(-0.5)(-0.5+1)}{2!} + 142 \times \frac{(-0.5)(-0.5+1)(-0.5+2)}{3!} + 2 \times \frac{(-0.5)(-0.5+1)(-0.5+2)(-0.5+3)}{4!}$$
= 26511 – 1206.5 – 90.25 – 8.88 – 0.08 = 26511 – 1305.71 = 25205.29
y ≈ 25205
Ans. 20510, 25205.
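The backward formula used in part (ii) can be sketched the same way as the forward one, reading the difference table from its last entries. The function name `newton_backward` is an assumption made for illustration, and equal spacing of the x values is assumed rather than checked.

```python
def newton_backward(xs, ys, x):
    """Newton's backward interpolation for equally spaced xs."""
    h = xs[1] - xs[0]
    v = (x - xs[-1]) / h
    # Trailing entries of the difference table: y_n, Δy_(n-1), Δ²y_(n-2), ...
    trailing, row = [ys[-1]], list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        trailing.append(row[-1])
    total, term = 0.0, 1.0  # term holds v(v+1)...(v+k-1)/k!
    for k, d in enumerate(trailing):
        total += d * term
        term *= (v + k) / (k + 1)
    return total

xs = [7, 11, 15, 19, 23, 27]
ys = [20256, 20625, 21296, 22407, 24098, 26511]
print(round(newton_backward(xs, ys, 25), 2))
```

A quick plausibility check on any result: the interpolated value at x = 25 must lie between the tabulated values at x = 23 and x = 27.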
[25] Find the value of sin 48° from the following table.
x (degrees) 30 35 40 45 50
sin x .5000 .5736 .6428 .7071 .7660
Solution:
Since the given values of x are equidistant and x = 48° lies near the end
of these values, we use Newton’s backward interpolation formula:
Difference Table
x y = f(x) = sin x ∆y ∆2y ∆3y ∆4y
30 .5000
.0736
35 .5736 -0.0044
.0692 -0.0005
40 .6428 -0.0049 0
.0643 -0.0005
45 .7071 -0.0054
.0589
50 .7660
$v = \frac{x - x_n}{h} = \frac{48 - 50}{5} = -\frac{2}{5} = -0.4$
$$y = y_n + v\,\Delta y_{n-1} + \frac{v(v+1)}{2!}\Delta^2 y_{n-2} + \frac{v(v+1)(v+2)}{3!}\Delta^3 y_{n-3}$$
$$= 0.7660 + 0.0589 \times (-0.4) + \frac{(-0.0054)(-0.4)(-0.4+1)}{2!} + \frac{(-0.0005)(-0.4)(-0.4+1)(-0.4+2)}{3!}$$
$$= 0.7660 - 0.0236 + \frac{0.0054 \times 0.4 \times 0.6}{2} + \frac{0.0005 \times 0.4 \times 0.6 \times 1.6}{6}$$
= 0.7660 – 0.0236 + 0.00065 + 0.00003 = 0.7667 – 0.0236 = 0.7431
Ans. 0.7431
[26] Using Newton’s interpolation formula, find the number of factories earning less
than Rs.65,000 as profits, from the following data:
Profits (Rs.’000) 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80
No. of factories 34 43 56 39 29
[I.C.W.A. 1975]
Solution:
Let yx denote the number of factories earning Profits less than x rupees i. e. yx
gives the “less than” cumulative frequency (c. f). The less than cumulative frequency
and the difference table is given below:
Difference Table
x c.f = y ∆y ∆2 y ∆3 y ∆4 y ∆5 y
30 = x0 0=y0
34
40 34 9
43 4
50 77 13 -34
56 -30 71
60 133 -17 37
39 7
70 172 -10
29
80 201

Since we have to find the value of y at x = 65, near the end of the tabulated values, and
the x values are equidistant, we may use Newton’s backward interpolation formula.
Here, $v = \frac{x - x_n}{h} = \frac{65 - 80}{10} = -\frac{15}{10} = -1.5$
$$f(65) = 201 + 29(-1.5) + \frac{(-10)(-1.5)(-1.5+1)}{2!} + \frac{7(-1.5)(-1.5+1)(-1.5+2)}{3!} + \frac{37(-1.5)(-1.5+1)(-1.5+2)(-1.5+3)}{4!} + \frac{71(-1.5)(-1.5+1)(-1.5+2)(-1.5+3)(-1.5+4)}{5!}$$
$$= 201 - 43.5 + \frac{(-10)(-1.5)(-0.5)}{2} + \frac{7(-1.5)(-0.5)(0.5)}{6} + \frac{37(-1.5)(-0.5)(0.5)(1.5)}{24} + \frac{71(-1.5)(-0.5)(0.5)(1.5)(2.5)}{120}$$
= 201 – 43.5 – 3.75 + 0.44 + 0.87 + 0.83 = 155.89 or 156 (approx).
Ans. 156
[27] Apply the appropriate interpolation formula to find log 3.146, given
log 3.141 = 0.4970679    log 3.144 = 0.4974825
log 3.142 = 0.4972062    log 3.145 = 0.4976205
log 3.143 = 0.4973444
(Find the value correct to 7 decimal places.)
[I.C.W.A., 1978]
Solution:
Since the tabulated value of the argument x are equidistant and 3.146 lies near
the end of the tabulated values of x, Newton’s backward interpolation formula seems
to be appropriate here.
Difference Table
x y ∆y ∆2 y ∆3 y ∆4 y
3.141 0.4970679
0.0001383
3.142 0.4972062 -0.0000001
0.0001382 0
3.143 0.4973444 -0.0000001 0
0.0001381 0
3.144 0.4974825 -0.0000001
0.0001380
3.145 0.4976205
Here, $v = \frac{x - x_n}{h} = \frac{3.146 - 3.145}{0.001} = 1$
$$f(3.146) = \log 3.146 = 0.4976205 + 0.0001380 \times 1 + \frac{(-0.0000001) \times 1 \times 2}{1 \times 2}$$
= 0.4976205 + 0.0001380 – 0.0000001
= 0.4977584
[28] Use Stirling’s interpolation formula to find the value of the probability integral
(P) when X = 1.52 :-
X 1.3 1.4 1.5 1.6 1.7
P = prob. integral .90320 .91924 .93319 .94520 .95543

Solution:
Difference Table
x y ∆y ∆2y ∆3y
1.3=x–2 .90320=y–2
.01604=∆y–2
1.4=x–1 .91924=y–1 –.00209=∆2y–2
.01395=∆y–1 .00015=∆3y–2
1.5=x0 .93319=y0 –.00194=∆2y–1
.01201=∆y0 .00016=∆3y–1
1.6=x1 .94520=y1 –.00178=∆2y0
.01023=∆y1
1.7=x2 .95543=y2
$u = \frac{x - x_0}{h} = \frac{1.52 - 1.50}{0.1} = \frac{0.02}{0.1} = 0.2$
Stirling’s Interpolation Formula
$$y = y_0 + \frac{u(\Delta y_0 + \Delta y_{-1})}{2} + \frac{u^2}{2!}\Delta^2 y_{-1} + \frac{u(u^2 - 1^2)}{3!} \cdot \frac{\Delta^3 y_{-1} + \Delta^3 y_{-2}}{2}$$
$$= 0.93319 + \frac{0.2 \times (0.01201 + 0.01395)}{2} + \frac{0.2^2}{2!} \times (-0.00194) + \frac{0.2(0.2^2 - 1^2)}{3!} \times \frac{0.00016 + 0.00015}{2}$$
= 0.93319 + 0.00260 – 0.00004 – 0.032 × 0.000155 = 0.93319 + 0.00260 – 0.00004 – 0.000005
= 0.93579 – 0.00005 = 0.93574
Ans. 0.93574
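Stirling's formula, truncated at third differences as in the solution above, can be written as a single expression. This is a sketch only; the function name `stirling` and its keyword arguments (`d1_m1` for ∆y–1, `d3_m2` for ∆3y–2, and so on) are naming assumptions introduced here.

```python
def stirling(y0, d1_m1, d1_0, d2_m1, d3_m2, d3_m1, u):
    """Stirling's central-difference formula up to third differences."""
    return (y0
            + u * (d1_0 + d1_m1) / 2                       # mean first difference
            + u**2 / 2 * d2_m1                             # second difference term
            + u * (u**2 - 1) / 6 * (d3_m1 + d3_m2) / 2)    # mean third difference

p = stirling(y0=0.93319, d1_m1=0.01395, d1_0=0.01201,
             d2_m1=-0.00194, d3_m2=0.00015, d3_m1=0.00016, u=0.2)
print(round(p, 5))
```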
[29] Given the following cube - roots, find by Bessel’s interpolation formula, the cube
- root of 102.5:
Number 101 102 103 104
Cube - root 4.657,0095 4.672,3287 4.687,5481 4.702,6694
[B.U., B.A.(Econ.), 1965]
Solution:
Since the values of the argument are equidistant, and 102.5 lies near the middle
of the tabulated values, we use a central difference formula (Bessel’s).
$u = \frac{x - x_0}{h} = \frac{102.5 - 102}{1} = 0.5$
$v = u - \frac{1}{2} = 0.5 - \frac{1}{2} = 0$
Difference Table
x y ∆y ∆2y ∆3y
x–1 = 101 y–1 = 4.6570095
∆y–1 = .0153192
x0 = 102 y0 = 4.6723287 ∆2y–1 = –.0000998
∆y0 = .0152194 ∆3y–1 = .0000017
x1 = 103 y1 = 4.6875481 ∆2y0 = –.0000981
∆y1 = .0151213
x2 = 104 y2 = 4.7026694
$$y = \frac{y_0 + y_1}{2} + v\,\Delta y_0 + \frac{v^2 - \frac{1}{4}}{2!} \cdot \frac{\Delta^2 y_0 + \Delta^2 y_{-1}}{2}$$
(the third-difference term carries the factor v and therefore vanishes, since v = 0)
$$= \frac{4.6723287 + 4.6875481}{2} + \frac{-\frac{1}{4}}{2} \cdot \frac{-0.0000981 + (-0.0000998)}{2}$$
= 4.6799384 + 0.0000124 = 4.6799508
Ans. 4.6799508
[30] Find by using Bessel’s interpolation formula, the expectation of life at age 22
from the following data:
Age (x) 10 15 20 25 30 35
Exp. of life (y) 35.4 32.2 29.1 26.0 23.1 20.4
Solution:
Here, x0=20
$u = \frac{x - x_0}{h} = \frac{22 - 20}{5} = \frac{2}{5} = 0.4$
$v = u - \frac{1}{2} = 0.4 - 0.5 = -0.1$
Difference Table
x y ∆y ∆2y ∆3y
x–2 = 10 y–2 = 35.4
∆y–2 = –3.2
x–1 = 15 y–1 = 32.2 ∆2y–2 = 0.1
∆y–1 = –3.1 ∆3y–2 = –0.1
x0 = 20 y0 = 29.1 ∆2y–1 = 0
∆y0 = –3.1 ∆3y–1 = 0.2
x1 = 25 y1 = 26.0 ∆2y0 = 0.2
∆y1 = –2.9 ∆3y0 = 0
x2 = 30 y2 = 23.1 ∆2y1 = 0.2
∆y2 = –2.7
x3 = 35 y3 = 20.4
$$y = \frac{y_0 + y_1}{2} + v\,\Delta y_0 + \frac{v^2 - \frac{1}{4}}{2!} \cdot \frac{\Delta^2 y_0 + \Delta^2 y_{-1}}{2} + \frac{v\left(v^2 - \frac{1}{4}\right)}{3!}\Delta^3 y_{-1}$$
$$= \frac{29.1 + 26.0}{2} + (-0.1)(-3.1) + \frac{(-0.1)^2 - 0.25}{2!} \cdot \frac{0.2 + 0}{2} + \frac{(-0.1)\left[(-0.1)^2 - 0.25\right]}{3!} \times 0.2$$
= 27.55 + 0.31 – 0.012 + 0.0008 = 27.86 – 0.0112 = 27.8488 ≈ 27.85
Ans. = 27.85
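Bessel's formula, as used in the two solutions above, can be sketched as one expression in v = u – ½. The function name `bessel` and its argument names (`d2_m1` for ∆2y–1, `d3_m1` for ∆3y–1, and so on) are assumptions made for illustration.

```python
def bessel(y0, y1, d1_0, d2_m1, d2_0, d3_m1, u):
    """Bessel's central-difference formula up to third differences."""
    v = u - 0.5
    return ((y0 + y1) / 2
            + v * d1_0
            + (v**2 - 0.25) / 2 * (d2_m1 + d2_0) / 2
            + v * (v**2 - 0.25) / 6 * d3_m1)

# Expectation of life at age 22 (x0 = 20, h = 5, so u = 0.4).
life_22 = bessel(y0=29.1, y1=26.0, d1_0=-3.1,
                 d2_m1=0.0, d2_0=0.2, d3_m1=0.2, u=0.4)
print(round(life_22, 2))
```

When u = 0.5 the odd-order terms drop out, which is why the cube-root example needs only the mean of y0 and y1 and the second-difference correction.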
[31] The following table gives the normal weight of a baby during the first six months of life:
Age in months 0 2 3 5 6
Weight in lbs. 5 7 8 10 12
Estimate the weight of a baby at the age of 4 months.
[I.C.W.A., 1970]

Solution:
Since the successive values of x for the tabulated function f(x) are not equidistant,
Newton’s forward or backward formula cannot be applied. Applying Lagrange’s formula:
$$y = \frac{(x-x_1)(x-x_2)\cdots(x-x_n)}{(x_0-x_1)(x_0-x_2)\cdots(x_0-x_n)}\,y_0 + \frac{(x-x_0)(x-x_2)\cdots(x-x_n)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)}\,y_1 + \cdots + \frac{(x-x_0)(x-x_1)\cdots(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)\cdots(x_n-x_{n-1})}\,y_n$$
Let age in months = x and
weight in lbs = y = f(x), corresponding to the values of the argument x0, x1, x2, ----, xn.
$$\therefore y = \frac{(4-2)(4-3)(4-5)(4-6)}{(0-2)(0-3)(0-5)(0-6)} \times 5 + \frac{(4-0)(4-3)(4-5)(4-6)}{(2-0)(2-3)(2-5)(2-6)} \times 7 + \frac{(4-0)(4-2)(4-5)(4-6)}{(3-0)(3-2)(3-5)(3-6)} \times 8 + \frac{(4-0)(4-2)(4-3)(4-6)}{(5-0)(5-2)(5-3)(5-6)} \times 10 + \frac{(4-0)(4-2)(4-3)(4-5)}{(6-0)(6-2)(6-3)(6-5)} \times 12$$
$$= \frac{4}{180} \times 5 + \frac{8}{-24} \times 7 + \frac{16}{18} \times 8 + \frac{-16}{-30} \times 10 + \frac{-8}{72} \times 12 = \frac{1}{9} - \frac{7}{3} + \frac{64}{9} + \frac{16}{3} - \frac{4}{3}$$
$$= \frac{1 - 21 + 64 + 48 - 12}{9} = \frac{113 - 33}{9} = \frac{80}{9} = 8\tfrac{8}{9} \text{ lbs.}$$
Ans. $8\tfrac{8}{9}$ lbs.
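Lagrange's formula is a sum of one weighted term per tabulated point, which maps directly onto two nested loops. The sketch below is illustrative; the function name `lagrange` is an assumption, and exact fractions are used (which requires integer arguments) so the result can be compared with the hand calculation without rounding.

```python
from fractions import Fraction

def lagrange(xs, ys, x):
    """Value at x of the Lagrange polynomial through the points (xs[i], ys[i])."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                weight *= Fraction(x - xj, xi - xj)  # factor (x - xj)/(xi - xj)
        total += weight * yi
    return total

weight_at_4 = lagrange([0, 2, 3, 5, 6], [5, 7, 8, 10, 12], 4)
print(weight_at_4)  # 80/9
```

The x values need not be equidistant, which is the reason this formula is chosen here.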
[32] State Lagrange’s interpolation formula. Use it to find f(x) when x = 0, given:
x -1 -2 2 4
f(x) -1 -9 11 69
[I.C.W.A. 1974]
Solution:
Let y0, y1, y2, -------, yn denote the tabulated values of a function y = f(x)
corresponding to the values of the argument x0, x1, x2, -------, xn (which may not be
equidistant). It is required to find the value of y corresponding to a specified value of
x lying in between the given values. This is obtained by using Lagrange’s Interpolation
Formula:
$$y = \frac{(x-x_1)(x-x_2)\cdots(x-x_n)}{(x_0-x_1)(x_0-x_2)\cdots(x_0-x_n)}\,y_0 + \frac{(x-x_0)(x-x_2)\cdots(x-x_n)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)}\,y_1 + \cdots + \frac{(x-x_0)(x-x_1)\cdots(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)\cdots(x_n-x_{n-1})}\,y_n$$
$$y = \frac{(0+2)(0-2)(0-4)}{(-1+2)(-1-2)(-1-4)} \times (-1) + \frac{(0+1)(0-2)(0-4)}{(-2+1)(-2-2)(-2-4)} \times (-9) + \frac{(0+1)(0+2)(0-4)}{(2+1)(2+2)(2-4)} \times 11 + \frac{(0+1)(0+2)(0-2)}{(4+1)(4+2)(4-2)} \times 69$$
$$= \frac{16}{15} \times (-1) + \frac{8}{-24} \times (-9) + \frac{-8}{-24} \times 11 + \frac{-4}{60} \times 69 = -\frac{16}{15} + 3 + \frac{11}{3} - \frac{23}{5}$$
$$= \frac{-16 + 45 + 55 - 69}{15} = \frac{100 - 85}{15} = \frac{15}{15} = 1$$
Ans. 1
[33] State Lagrange’s interpolation formula. Use it to find the value of U4 of a function
Ux, given that U1 = 10, U2 = 15, U5 = 42.
[I.C.W.A., 1974]
Solution:
Lagrange’s interpolation formula as in Q.32.
Here, x : 1 2 5
y=Ux : 10 15 42
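$$y = U_4 = \frac{(4-2)(4-5)}{(1-2)(1-5)} \times 10 + \frac{(4-1)(4-5)}{(2-1)(2-5)} \times 15 + \frac{(4-1)(4-2)}{(5-1)(5-2)} \times 42$$
$$= \frac{2 \times (-1)}{(-1) \times (-4)} \times 10 + \frac{3 \times (-1)}{1 \times (-3)} \times 15 + \frac{3 \times 2}{4 \times 3} \times 42 = -\frac{2}{4} \times 10 + \frac{-3}{-3} \times 15 + \frac{6}{12} \times 42$$
= –5+15+21=31. Ans. 31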
[34] Using Lagrange’s formula or otherwise, obtain the value of log 96 approximately
from the following table:
x 95 97 98 99
log x 1.9777236 1.9867717 1.9912261 1.9956352
[C.U., B.com(Hons), 1969]
Solution:
Using Lagrange’s Interpolation Formula,
$$y = \log 96 = \frac{(96-97)(96-98)(96-99)}{(95-97)(95-98)(95-99)} \times 1.9777236 + \frac{(96-95)(96-98)(96-99)}{(97-95)(97-98)(97-99)} \times 1.9867717 + \frac{(96-95)(96-97)(96-99)}{(98-95)(98-97)(98-99)} \times 1.9912261 + \frac{(96-95)(96-97)(96-98)}{(99-95)(99-97)(99-98)} \times 1.9956352$$
$$= \frac{-6}{-24} \times 1.9777236 + \frac{6}{4} \times 1.9867717 + \frac{3}{-3} \times 1.9912261 + \frac{2}{8} \times 1.9956352$$
= 0.4944309 + 2.98015755 – 1.9912261 + 0.4989088
= 3.97349725 – 1.9912261 = 1.98227115 Ans. 1.98227115
[35] Given log10 654=2.8156, log10 658=2.8182, log10 659=2.8189, log10 661=2.8202. Find
by Lagrange’s Interpolation formula log10 656 (Retain 4 decimal places).
[I.C.W.A., 1978]
Solution:
x 654 658 659 661
y = f(x) = log10 x 2.8156 2.8182 2.8189 2.8202
$$y = \frac{(656-658)(656-659)(656-661)}{(654-658)(654-659)(654-661)} \times 2.8156 + \frac{(656-654)(656-659)(656-661)}{(658-654)(658-659)(658-661)} \times 2.8182 + \frac{(656-654)(656-658)(656-661)}{(659-654)(659-658)(659-661)} \times 2.8189 + \frac{(656-654)(656-658)(656-659)}{(661-654)(661-658)(661-659)} \times 2.8202$$
$$= \frac{-30}{-140} \times 2.8156 + \frac{30}{12} \times 2.8182 + \frac{20}{-10} \times 2.8189 + \frac{12}{42} \times 2.8202$$
= 0.6033 + 7.0455 – 5.6378 + 0.8058 = 2.8168 Ans. 2.8168
[36] Find the value of x for which y = 40
x 10 12 15 20
y 25 32 35 45
Solution:
The process of finding the value of x corresponding to a given value of y is known as
‘Inverse Interpolation’. Since Lagrange’s interpolation formula is applicable for unequal
intervals, we can use this formula for inverse interpolation, interchanging the roles of
the argument x and the function y.
$$x = \frac{(y-y_1)(y-y_2)\cdots(y-y_n)}{(y_0-y_1)(y_0-y_2)\cdots(y_0-y_n)}\,x_0 + \frac{(y-y_0)(y-y_2)\cdots(y-y_n)}{(y_1-y_0)(y_1-y_2)\cdots(y_1-y_n)}\,x_1 + \cdots + \frac{(y-y_0)(y-y_1)\cdots(y-y_{n-1})}{(y_n-y_0)(y_n-y_1)\cdots(y_n-y_{n-1})}\,x_n$$
Applying the above formula,
$$x = \frac{(40-32)(40-35)(40-45)}{(25-32)(25-35)(25-45)} \times 10 + \frac{(40-25)(40-35)(40-45)}{(32-25)(32-35)(32-45)} \times 12 + \frac{(40-25)(40-32)(40-45)}{(35-25)(35-32)(35-45)} \times 15 + \frac{(40-25)(40-32)(40-35)}{(45-25)(45-32)(45-35)} \times 20$$
$$= \frac{-200}{-1400} \times 10 + \frac{-375}{273} \times 12 + \frac{-600}{-300} \times 15 + \frac{600}{2600} \times 20 = \frac{10}{7} - \frac{125}{91} \times 12 + 30 + \frac{60}{13}$$
$$= \frac{130 - 1500 + 2730 + 420}{91} = \frac{3280 - 1500}{91} = \frac{1780}{91} = 19.56$$
Ans. 19.56
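Inverse interpolation needs no new machinery: the same Lagrange routine is called with the two columns swapped. The following compact sketch assumes the y values are all distinct (otherwise a denominator would be zero); the function name `lagrange` and the variable `x_at_40` are illustrative.

```python
from fractions import Fraction
from math import prod

def lagrange(args, vals, t):
    """Lagrange polynomial through (args[i], vals[i]), evaluated at t."""
    return sum(
        v * prod(Fraction(t - a_j, a_i - a_j)
                 for j, a_j in enumerate(args) if j != i)
        for i, (a_i, v) in enumerate(zip(args, vals))
    )

xs, ys = [10, 12, 15, 20], [25, 32, 35, 45]
x_at_40 = lagrange(ys, xs, 40)  # y plays the role of argument, x of function
print(x_at_40, round(float(x_at_40), 2))
```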
INDEX NUMBERS
Index Numbers are numerical figures which indicate the relative position in respect of
price or quantity or value of a group of articles at certain periods of time as compared
with another period called the base period. The index number for the base period is
always taken as 100.
Methods of Construction of Index Numbers
Index Number Construction
    [I] Aggregative Method
        Simple Aggregative Formula
        Weighted Aggregative Formula
            Laspeyre's Formula
            Paasche's Formula
            Edgeworth-Marshall's Formula
            Fisher's Ideal Formula
    [II] Relative Method
        Simple Average of Relatives
        Weighted Average of Relatives
[I] Aggregative Method
In this method, the aggregate price of all items in the given year is expressed as
a percentage of the same in the base year, giving the index number.
$$\text{Index Number} = \frac{\text{Aggregate price in the given year}}{\text{Aggregate price in the base year}} \times 100$$
If simple aggregates of prices are compared we get
$$\text{Simple Aggregative Index } (I_{0n}) = \frac{\Sigma P_n}{\Sigma P_0} \times 100$$
where P0 is the price in the base year and Pn is the price in the given year.
$$\text{Weighted Aggregative Index } (I_{0n}) = \frac{\Sigma P_n w}{\Sigma P_0 w} \times 100$$
where w represents the ‘weights’.
[i] If the base year quantity (q0) is used as weight, i.e. w = q0, we get
$$\text{Laspeyre’s Index } (I_{0n}) = \frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times 100$$
[ii] If the current year quantity (qn) is used as weight, i.e. w = qn, we get
$$\text{Paasche’s Index } (I_{0n}) = \frac{\Sigma P_n q_n}{\Sigma P_0 q_n} \times 100$$
[iii] If the sum of quantities in the base year and the current year is used as weight, i.e.
w = (q0 + qn), we get
$$\text{Edgeworth-Marshall’s Index } (I_{0n}) = \frac{\Sigma P_n (q_0 + q_n)}{\Sigma P_0 (q_0 + q_n)} \times 100$$
[iv] The geometric mean (i.e. square root of the product) of Laspeyre’s Index
and Paasche’s Index gives
$$\text{Fisher’s Ideal Index } (I_{0n}) = \sqrt{\text{Laspeyre’s Index} \times \text{Paasche’s Index}} = \sqrt{\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times \frac{\Sigma P_n q_n}{\Sigma P_0 q_n}} \times 100$$
[v] The arithmetic mean of Laspeyre’s Index and Paasche’s Index gives
$$\text{Bowley’s Index } (I_{0n}) = \frac{1}{2}\left(\text{Laspeyre’s Index} + \text{Paasche’s Index}\right) = \frac{1}{2}\left[\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} + \frac{\Sigma P_n q_n}{\Sigma P_0 q_n}\right] \times 100$$
[vi] If the geometric mean of base year and current year quantities is used as
weight, i.e. $w = \sqrt{q_0 q_n}$, we get
$$\text{Walsh’s Index } (I_{0n}) = \frac{\Sigma P_n \sqrt{q_0 q_n}}{\Sigma P_0 \sqrt{q_0 q_n}} \times 100$$
[vii] If the weights used are kept fixed for all periods, i.e. the weights are constant
quantities (q) without any reference to base or current period, we get
$$\text{Kelly’s Index } (I_{0n}) = \frac{\Sigma P_n q}{\Sigma P_0 q} \times 100$$
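The weighted aggregative formulae differ only in the weights, so they can share one helper. The sketch below is illustrative: the function names are assumptions, and the price and quantity figures are those of worked sum [3] later in this chapter (commodities A to D, 1979 as base year and 1980 as current year).

```python
from math import sqrt

def agg(prices, weights):
    """Aggregate Σ(price × weight) over all items."""
    return sum(p * w for p, w in zip(prices, weights))

def laspeyres(p0, pn, q0, qn):
    return 100 * agg(pn, q0) / agg(p0, q0)      # base-year quantities as weights

def paasche(p0, pn, q0, qn):
    return 100 * agg(pn, qn) / agg(p0, qn)      # current-year quantities as weights

def marshall_edgeworth(p0, pn, q0, qn):
    w = [a + b for a, b in zip(q0, qn)]         # w = q0 + qn
    return 100 * agg(pn, w) / agg(p0, w)

def fisher(p0, pn, q0, qn):
    return sqrt(laspeyres(p0, pn, q0, qn) * paasche(p0, pn, q0, qn))

def bowley(p0, pn, q0, qn):
    return (laspeyres(p0, pn, q0, qn) + paasche(p0, pn, q0, qn)) / 2

def walsh(p0, pn, q0, qn):
    w = [sqrt(a * b) for a, b in zip(q0, qn)]   # w = sqrt(q0 * qn)
    return 100 * agg(pn, w) / agg(p0, w)

p0, pn = [20, 50, 40, 20], [40, 60, 50, 20]
q0, qn = [8, 10, 15, 20], [6, 5, 10, 15]
for index in (laspeyres, paasche, marshall_edgeworth, fisher, bowley, walsh):
    print(index.__name__, round(index(p0, pn, q0, qn), 1))
```

Kelly's index is the same `agg` ratio with any fixed list of quantities supplied as the weights.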
[II] Relative Method
In this method, the price of each item in the current year is expressed as a percentage
of the price of the base year. This is called price relative and is given by the formula,
$$\text{Price Relative} = \frac{\text{Price in the given year}}{\text{Price in the base year}} \times 100 = \frac{P_n}{P_0} \times 100$$
The average of price relatives, which shows the average percentage change for
the whole group of items, gives the index number,
Price Index = Average of Price Relatives
Simple A.M. of Relatives Index (I0n) = Σ(Price Relatives) ÷ K, where K is the
number of items included.
$$\text{Simple G.M. of Relatives Index } (I_{0n}) = \sqrt[K]{\text{Product of Price Relatives}}$$
$$\text{Weighted A.M. of Relatives Index } (I_{0n}) = \frac{\Sigma(\text{Price Relative} \times W)}{\Sigma W}$$
Aggregative Formulae by Relative Method
Considering price index numbers:
[1] The A.M. of relatives formula weighted by base year values (P0q0) gives exactly
the same formula as Laspeyre’s:
$$I_{0n} = \frac{\Sigma\left(\frac{P_n}{P_0} \times 100\right) P_0 q_0}{\Sigma P_0 q_0} = \frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times 100 = \text{Laspeyre’s Index}$$
[2] The A.M. of relatives formula weighted by values of current year quantities at
base year prices (P0qn) gives Paasche’s formula:
$$I_{0n} = \frac{\Sigma\left(\frac{P_n}{P_0} \times 100\right) P_0 q_n}{\Sigma P_0 q_n} = \frac{\Sigma P_n q_n}{\Sigma P_0 q_n} \times 100 = \text{Paasche’s Index}$$
[3] The H.M. of relatives formula weighted by current year values (Pnqn) gives the
same formula as Paasche’s:
$$I_{0n} = \frac{\Sigma P_n q_n}{\Sigma \dfrac{P_n q_n}{\frac{P_n}{P_0} \times 100}} = \frac{\Sigma P_n q_n}{\Sigma P_0 q_n} \times 100 = \text{Paasche’s Index}$$
Construction of General Index from Group Index
In the construction of any index number, the items included are usually classified under
some broad categories called Groups, with similar or related items coming under each
group. A separate index number is constructed for each group, and is called Group Index.
The weighted average (usually A.M.) of group index numbers gives the General Index.
$$\text{General Index} = \frac{\Sigma I W}{\Sigma W}$$
where I represents the Group Index and W is the Group Weight.
Quantity Index Number
Just as price index numbers measure and permit comparison of the price of a group
of related items, quantity index numbers similarly measure and permit comparison
of the physical quantity of goods produced or consumed or marketed or distributed.
Quantity index number formulae may be obtained from the corresponding price index
number formulae replacing p by q and q by p.
$$\text{Simple Aggregative Quantity Index} = \frac{\Sigma q_n}{\Sigma q_0} \times 100$$
$$\text{Laspeyre’s Quantity Index} = \frac{\Sigma q_n P_0}{\Sigma q_0 P_0} \times 100$$
$$\text{Paasche’s Quantity Index} = \frac{\Sigma q_n P_n}{\Sigma q_0 P_n} \times 100$$
$$\text{Edgeworth-Marshall’s Quantity Index} = \frac{\Sigma q_n (P_0 + P_n)}{\Sigma q_0 (P_0 + P_n)} \times 100$$
$$\text{Fisher’s Ideal Quantity Index} = \sqrt{\frac{\Sigma q_n P_0}{\Sigma q_0 P_0} \times \frac{\Sigma q_n P_n}{\Sigma q_0 P_n}} \times 100$$
$$\text{Quantity Relative} = \frac{q_n}{q_0} \times 100$$
Simple A.M. of Quantity Relatives Index = Σ(Quantity Relatives) ÷ K
$$\text{Weighted A.M. of Quantity Relatives Index} = \frac{\Sigma(\text{Quantity Relative} \times \text{Weight})}{\Sigma \text{Weight}}$$
Tests of Index Numbers
In order to judge the efficiency of an index number formula as a measure of the level of
a phenomenon from one period to another, the noted economist Irving Fisher suggested
certain tests. The three most important tests of index numbers are (1) Time Reversal
Test, (2) Factor Reversal Test, and (3) Circular Test.
[1] Time Reversal Test
According to this test, a good index number formula should work both ways,
forward and backward, with respect to time. In other words, we should get the same
picture of change between two points of time, no matter which of the two is taken as
base. Consequently, the index number (I0n) for period n with base period 0 should be the
reciprocal of the index number (In0) for period 0 with the base period n. Symbolically,
$I_{0n} \times I_{n0} = 1$ (omitting the factor 100 from each index).
An index number formula which obeys this relation is said to satisfy the time
reversal test.
The time reversal test is satisfied by the simple aggregative formula, the
Marshall-Edgeworth formula, Fisher’s ideal index formula and the simple geometric mean of
relatives formula. The weighted aggregative formula and the weighted geometric mean of
relatives formula also satisfy this test, if constant weights are used which do not depend
upon the base or current period.
[2] Factor Reversal Test

An index number formula is said to satisfy the factor reversal test, if the product
of Price Index (P0n) and Quantity Index (Q0n) gives the true value ratio (omitting the
factor 100 from each index). Symbolically,
$$P_{0n} \times Q_{0n} = \frac{\Sigma P_n q_n}{\Sigma P_0 q_0}$$
Fisher’s ideal index is the only formula which satisfies this test.
[3] Circular Test
This is an extension of the time reversal test. An index number formula is said
to satisfy the circular test, if the time reversal test is satisfied through a number of
intermediate years. Symbolically,
$I_{01} \times I_{12} \times I_{23} \times \cdots \times I_{n-1,n} \times I_{n,0} = 1$
This means that the relation is satisfied in a circular fashion through several
years, 0 to 1, 1 to 2, 2 to 3, -------, (n – 1) to n, and finally from n back to 0. The simple
aggregative formula and the simple geometric mean of relatives formula satisfy this
test. The weighted aggregative formula and the weighted geometric mean of relatives formula
satisfy this test, if constant weights are used for all time periods.
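The time reversal and factor reversal tests described above can be tried numerically. The sketch below is illustrative only: the function names are assumptions, the indices are computed as plain ratios (factor 100 omitted), and the data are those of worked sum [3] later in this chapter. Each check returns a number that equals 1 exactly when the test is satisfied.

```python
from math import sqrt, isclose

def agg(a, b):
    """Aggregate Σ(a × b) over all items."""
    return sum(x * y for x, y in zip(a, b))

def laspeyres(p0, pn, q0, qn):
    return agg(pn, q0) / agg(p0, q0)

def paasche(p0, pn, q0, qn):
    return agg(pn, qn) / agg(p0, qn)

def fisher(p0, pn, q0, qn):
    return sqrt(laspeyres(p0, pn, q0, qn) * paasche(p0, pn, q0, qn))

def time_reversal(index, p0, pn, q0, qn):
    """I_0n × I_n0, obtained by swapping base and current period."""
    return index(p0, pn, q0, qn) * index(pn, p0, qn, q0)

def factor_reversal(index, p0, pn, q0, qn):
    """(P_0n × Q_0n) divided by the true value ratio ΣPnqn / ΣP0q0."""
    price = index(p0, pn, q0, qn)
    quantity = index(q0, qn, p0, pn)  # same formula with p and q interchanged
    return price * quantity / (agg(pn, qn) / agg(p0, q0))

p0, pn = [20, 50, 40, 20], [40, 60, 50, 20]
q0, qn = [8, 10, 15, 20], [6, 5, 10, 15]
for index in (laspeyres, paasche, fisher):
    print(index.__name__,
          round(time_reversal(index, p0, pn, q0, qn), 4),
          round(factor_reversal(index, p0, pn, q0, qn), 4))
```

On these figures only Fisher's formula returns 1 for both checks, in line with the statements made in the text.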
Cost of Living Index Numbers
Cost of living index numbers are special-purpose index numbers which are designed to
measure the relative change in the cost level for maintaining similar standard of living
in two different situations. These are generally intended to represent the average changes
in prices over time, paid by the ultimate consumer for a specified group of goods and
services and hence are also called Consumer Price Index Numbers.
The steps in the construction of a Cost of Living Index are as follows:
[1] The first step is to decide on the class of people for whom the index number is
intended.
[2] The next step is to conduct a ‘family budget enquiry’ in the base period relating
to the class of people concerned, by the process of random sampling. Only important
items among those which are used by the majority of the class of people are included
in the construction of a cost of living index.
[3] The items of expenditure are classified in certain major groups, e.g. (i) Food,
(ii) Clothing, (iii) Fuel and light, (iv) Housing, and (v) Miscellaneous. These major
groups are further divided into smaller groups and sub-groups, so that the items are
individually mentioned.
[4] Arrangements should be made to collect retail prices of the items at regular
intervals of time from important local markets. Price quotations are taken at least once
a week.
[5] For each item there will be a number of price quotations covering different
qualities and markets. The simple average of price relatives of the different quotations
is taken as the price relative for the particular item.

[6] A separate index number is then computed for each group, using Laspeyre’s
formula in the form of weighted average of price relatives.
$$\text{Group Index } (I) = \frac{\Sigma W\left(\frac{P_n}{P_0} \times 100\right)}{100}, \quad \text{where } W = \frac{P_0 q_0}{\Sigma P_0 q_0} \times 100$$
Thus, in the construction of a Group Index, the weight (W) of an item is the
percentage expenditure of an ‘average family’ on that item in relation to the total
expenditure in the Group, as obtained from the family budget enquiry.
[7] The weighted average of group index numbers gives the final Cost of Living
Index number.
$$\text{Cost of Living Index} = \frac{\Sigma I W}{100}$$
The weight (W) of a group index is the percentage of total expenditure of an
average family spent on that group, as shown by the family budget enquiry.
[8] Cost of living index numbers are generally constructed for each week. The
average of the weekly index numbers is taken as the index number for a month. The
average of monthly index numbers gives the cost of living index for the whole year.
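Steps [6] and [7] both reduce to a weighted average, which the following sketch illustrates. The group indices and percentage weights below are hypothetical figures invented purely for illustration (they do not come from any family budget enquiry), and the function name `weighted_index` is likewise an assumption.

```python
def weighted_index(indices, weights):
    """Weighted average Σ(I × W) / ΣW; ΣW is 100 when weights are percentages."""
    return sum(i * w for i, w in zip(indices, weights)) / sum(weights)

# Hypothetical group indices and percentage weights for the five major groups.
groups = {
    "Food": (150, 50),
    "Clothing": (120, 10),
    "Fuel and light": (130, 10),
    "Housing": (110, 20),
    "Miscellaneous": (140, 10),
}
indices = [i for i, _ in groups.values()]
weights = [w for _, w in groups.values()]
cost_of_living = weighted_index(indices, weights)
print(cost_of_living)  # 136.0
```

Dividing by ΣW instead of a fixed 100 means the same routine still works when the weights are not expressed as percentages.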
Chain Base Method
There are two methods of constructing index numbers, depending on the nature of
base period employed: (i) Fixed Base Method and (ii) Chain Base Method. Most of
the index numbers in common use are of the fixed base type, where a fixed period
is chosen as base and the index number for any given year is calculated by direct
reference to this fixed base period. The fixed base index for any year is not, therefore,
affected by changes in price or quantity in any other year. It is however considered
that the net changes in any given year are the result of gradual changes that have taken
place during the past years. This idea is reflected in “Chain Base Index” numbers.
For the construction of index numbers by the chain base method, using an appropriate
index number formula (say Laspeyre’s formula), it is first necessary to compute index
numbers for all the years, always using the preceding year as base. These are known
as Link Indices.
Link index = Index number with preceding period as base. For example, using
Laspeyre’s formula,
$$\text{Link index for year 1 } (I_{01}) = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100$$
$$\text{Link index for year 2 } (I_{12}) = \frac{\Sigma p_2 q_1}{\Sigma p_1 q_1} \times 100$$
$$\text{Link index for year 3 } (I_{23}) = \frac{\Sigma p_3 q_2}{\Sigma p_2 q_2} \times 100$$
$$\text{Link index for year 4 } (I_{34}) = \frac{\Sigma p_4 q_3}{\Sigma p_3 q_3} \times 100 \text{ etc.}$$
The link indices I01, I12, I23, I34, ------ are then multiplied successively (called chaining
process) in order to relate them to a common base. The progressive products, expressed
as percentages, give the required index numbers by the chain base method. These are
called Chain Index Number or Chain Base Index Number. Thus, a chain index number
is the product of several index numbers, each calculated with the preceding period as
base.
The chain index numbers with reference to year 0 are (omitting the factor 100 from
each index)
I′01 = I01
I′02 = I01×I12
I′03 = I01×I12×I23
I′04 = I01×I12×I23×I34
(Here I′ is used for chain index and I for index of the fixed base type).
The chain index number I′on will not in general be equal to the corresponding fixed
base index number Ion unless the formula employed satisfies the circular test of index
numbers.
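The chaining process described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `chain_indices` and the sample link indices are assumptions chosen for the example, not part of the text.

```python
def chain_indices(link_indices):
    """Chain link indices (each with the preceding period = 100) to a common base.

    Returns the chain index for every period after the base period,
    with the base period taken as 100.
    """
    chained = []
    running = 100.0  # index of the base period
    for link in link_indices:
        # Progressive product of the link indices, expressed as a percentage.
        running = running * link / 100.0
        chained.append(running)
    return chained


# Illustrative link indices I01, I12, I23 (assumed values).
links = [110, 95.5, 109.5]
for period, value in enumerate(chain_indices(links), start=1):
    print(f"Chain index for period {period}: {value:.2f}")
```

Each chain index is simply the previous chain index multiplied by the next link index (divided by 100), which is the progressive product described in the text.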
Sums
[1] Find the simple Aggregative index number from the following data:-
Commodity Base Price Current Price
Rice 140 180
Sugar 100 300
Oil 400 550
Wheat 125 150
Pulse 160 200
Solution:
Commodity Base Price(Rs.) Current Price(Rs.)
Rice 140 180
Sugar 100 300
Oil 400 550
Wheat 125 150
Pulse 160 200
Total 925 1380
Simple Aggregative Index ($I_{0n}$) = $\frac{\Sigma P_n}{\Sigma P_0} \times 100 = \frac{1380}{925} \times 100 = 149$
[2] Find by the weighted aggregative method, the index number of the following
data:-
Commodity Base Price Current Price Weight
Rice 140 180 10
Oil 400 550 7
Sugar 100 250 6
Wheat 125 150 8
Fish 200 300 4
Solution:
Commodity Base Price (P0) Current Price (Pn) Weight (Wi) Pn w P0 w
Rice 140 180 10 1800 1400
Oil 400 550 7 3850 2800
Sugar 100 250 6 1500 600
Wheat 125 150 8 1200 1000
Fish 200 300 4 1200 800
Total  ∑Pnw = 9550  ∑P0w = 6600
Weighted Aggregative Index ($I_{0n}$) = $\frac{\Sigma P_n w}{\Sigma P_0 w} \times 100 = \frac{9550}{6600} \times 100 = 145$
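The two aggregative formulas used in Sums [1] and [2] can be checked with a short Python sketch. The function names are hypothetical and chosen only for illustration; the data are those of the two sums.

```python
def simple_aggregative(base_prices, current_prices):
    """Simple aggregative index: (sum of current prices / sum of base prices) x 100."""
    return sum(current_prices) / sum(base_prices) * 100


def weighted_aggregative(base_prices, current_prices, weights):
    """Weighted aggregative index: (sum of Pn*w / sum of P0*w) x 100."""
    numerator = sum(p * w for p, w in zip(current_prices, weights))
    denominator = sum(p * w for p, w in zip(base_prices, weights))
    return numerator / denominator * 100


# Sum [1]: Rice, Sugar, Oil, Wheat, Pulse
print(round(simple_aggregative([140, 100, 400, 125, 160],
                               [180, 300, 550, 150, 200])))

# Sum [2]: Rice, Oil, Sugar, Wheat, Fish
print(round(weighted_aggregative([140, 400, 100, 125, 200],
                                 [180, 550, 250, 150, 300],
                                 [10, 7, 6, 8, 4])))
```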
[3] Calculate the price index numbers by (a) Paasche’s Method, (b) Laspeyre’s Method,
(c) Bowley’s Method, (d) Fisher’s ideal formula.
Commodities  Price 1979 (Rs.)  Quantity 1979 (Kgs)  Price 1980 (Rs.)  Quantity 1980 (Kgs)
A 20 8 40 6
B 50 10 60 5
C 40 15 50 10
D 20 20 20 15
Solution:
Table : Calculations for Price Index
Commodity P0 Pn q0 qn p0 q0 pn qn p0 qn pn q0
A 20 40 8 6 160 240 120 320
B 50 60 10 5 500 300 250 600
C 40 50 15 10 600 500 400 750
D 20 20 20 15 400 300 300 400
Total 1660 1340 1070 2070
[a] Paasche’s Price Index = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_n} \times 100 = \frac{1340}{1070} \times 100 = 125.2$
[b] Laspeyre’s Index = $\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times 100 = \frac{2070}{1660} \times 100 = 124.7$
[c] Bowley’s Index = $\frac{1}{2}$ [Laspeyre’s Index + Paasche’s Index] = $\frac{1}{2}[124.7 + 125.2] = \frac{1}{2} \times 249.9 = 125$
[d] Fisher’s Ideal Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{124.7 \times 125.2} = 125$
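The four formulas of Sum [3] can be verified together with a small Python sketch. The function name `price_indices` and the dictionary layout are assumptions made for this illustration.

```python
from math import sqrt


def price_indices(p0, pn, q0, qn):
    """Return Laspeyres, Paasche, Bowley and Fisher price indices (base = 100)."""
    def total(prices, quantities):
        return sum(p * q for p, q in zip(prices, quantities))

    laspeyres = total(pn, q0) / total(p0, q0) * 100  # base-year quantities as weights
    paasche = total(pn, qn) / total(p0, qn) * 100    # current-year quantities as weights
    return {
        "laspeyres": laspeyres,
        "paasche": paasche,
        "bowley": (laspeyres + paasche) / 2,         # arithmetic mean of the two
        "fisher": sqrt(laspeyres * paasche),         # geometric mean of the two
    }


# Data of Sum [3]: commodities A, B, C, D
result = price_indices(p0=[20, 50, 40, 20], pn=[40, 60, 50, 20],
                       q0=[8, 10, 15, 20], qn=[6, 5, 10, 15])
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Bowley's index is the arithmetic mean and Fisher's index the geometric mean of the Laspeyres and Paasche indices, so the two values agree closely whenever those indices are near each other.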
[4] Prepare price index numbers for 1977 with 1975 as base year from the following
data, using (i) Laspeyre’s, (ii) Paasche’s, (iii) Fisher’s method.
Commodity  Unit  Quantity 1975  Price 1975 (Rs.)  Quantity 1977  Price 1977 (Rs.)
A Kg. 5 2.00 7 4.50
B Quintal 7 2.50 10 3.20
C Dozen 6 3.00 6 4.50
D Kg. 2 1.00 9 1.80
Solution:
Table : Calculations for Price Index
Commodity  P0  Pn  q0  qn  p0q0  p0qn  pnq0  pnqn
A 2.00 4.50 5 7 10.00 14.00 22.50 31.50
B 2.50 3.20 7 10 17.50 25.00 22.40 32.00
C 3.00 4.50 6 6 18.00 18.00 27.00 27.00
D 1.00 1.80 2 9 2.00 9.00 3.60 16.20
Total 47.50 66.00 75.50 106.70
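[i] Laspeyre’s Index = $\frac{\Sigma p_n q_0}{\Sigma p_0 q_0} \times 100 = \frac{75.50}{47.50} \times 100 = 159$
[ii] Paasche’s Index = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_n} \times 100 = \frac{106.70}{66.00} \times 100 = 162$
[iii] Fisher’s Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{159 \times 162} = 160$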
[5] Using the data given below, calculate price index numbers for the year 1958 by i)
Laspeyre’s formula, ii) Paasche’s Formula, iii) Fisher’s Formula, with the year 1949 as base :
Commodity  Price 1949 (Rs.)  Price 1958 (Rs.)  Quantity 1949 (1000 kg.)  Quantity 1958 (1000 kg.)
Rice 9.3 4.5 100 90
Wheat 6.4 3.7 11 10
Pulses 5.1 2.7 5 3
State with reasons one advantage of Laspeyre’s index over the Paasche’s index in
case revisions of an index number are to be made from year to year.
Solution:
Table: Calculations for Index Number
Commodity p0 pn q0 qn p0 q0 p0 qn pn q0 pn qn
Rice 9.3 4.5 100 90 930 837.0 450 405
Wheat 6.4 3.7 11 10 70.4 64.0 40.7 37
Pulses 5.1 2.7 5 3 25.5 15.3 13.5 8.1
Total 1025.9 916.3 504.2 450.1

[i] Laspeyre’s Index = $\frac{\Sigma p_n q_0}{\Sigma p_0 q_0} \times 100 = \frac{504.2}{1025.9} \times 100 = 49.15$
[ii] Paasche’s Index = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_n} \times 100 = \frac{450.1}{916.3} \times 100 = 49.12$
[iii] Fisher’s Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{49.15 \times 49.12} = 49.13$
[6] Given the following data, calculate price index numbers by i) Laspeyre’s Formula,
and ii) Fisher’s Formula with 1927 as base:
Year  Rice Price  Rice Qty  Wheat Price  Wheat Qty  Jowar Price  Jowar Qty
1927 9.3 100 6.4 11 5.1 5
1934 4.5 90 3.7 10 2.7 3
Solution:
Commodity  p0  pn  q0  qn  p0q0  p0qn  pnq0  pnqn
Rice 9.3 4.5 100 90 930 837 450 405
Wheat 6.4 3.7 11 10 70.4 64 40.7 37
Jowar 5.1 2.7 5 3 25.5 15.3 13.5 8.1
Total 1025.9 916.3 504.2 450.1
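[i] Laspeyre’s Index = $\frac{\Sigma p_n q_0}{\Sigma p_0 q_0} \times 100 = \frac{504.2}{1025.9} \times 100 = 49.15$
[ii] Paasche’s Index = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_n} \times 100 = \frac{450.1}{916.3} \times 100 = 49.12$
[iii] Fisher’s Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{49.15 \times 49.12} = 49.13$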
[7] Calculate the price index number for 1940 with 1937 as base year by the
aggregative method, using (a) Base year quantities as weight, and (b) given year
quantities as weights, from the following data:
Commodity  Quantity 1937 (‘000 tons)  Price per ton 1937 (Rs.)  Quantity 1940 (‘000 tons)  Price per ton 1940 (Rs.)
A 350 100 400 120
B 200 130 180 200
C 140 50 200 110
D 80 125 100 140

Solution:
Table: Calculations for Index Numbers
Commodity p0 pn q0 qn p0 q0 p0 qn pn q0 p n qn
A 100 120 350 400 35,000 40,000 42,000 48,000
B 130 200 200 180 26,000 23,400 40,000 36,000
C 50 110 140 200 7,000 10,000 15,400 22,000
D 125 140 80 100 10,000 12,500 11,200 14,000
Total 78,000 85,900 1,08,600 1,20,000
[A] Price Index = $\frac{\Sigma p_n q_0}{\Sigma p_0 q_0} \times 100 = \frac{1,08,600}{78,000} \times 100 = 139.2$
[B] Price Index = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_n} \times 100 = \frac{1,20,000}{85,900} \times 100 = 139.7$
[8] The following table gives the change in the price and consumption of three
commodities in the workers consumption basket. Compute Fisher’s ideal index
number from the data given in the table:
Commodity  Price 1950 (Rs.)  Consumption 1950 (units)  Price 1960 (Rs.)  Consumption 1960 (units)
Wheat 100 10 110 6
Rice 150 15 170 18
Cloth 5 50 4 30
Solution:
Commodity  p0  pn  q0  qn  p0q0  p0qn  pnq0  pnqn
Wheat 100 110 10 6 1000 600 1100 660
Rice 150 170 15 18 2250 2700 2550 3060
Cloth 5 4 50 30 250 150 200 120
Total 3500 3450 3850 3840
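[I] Laspeyre’s Index = $\frac{\Sigma p_n q_0}{\Sigma p_0 q_0} \times 100 = \frac{3850}{3500} \times 100 = 110$
[II] Paasche’s Index = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_n} \times 100 = \frac{3840}{3450} \times 100 = 111.30$
[III] Fisher’s Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{110 \times 111.30} = 111$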
[9] From the data given below, calculate Fisher’s Ideal Index Number of prices for
1963 with reference to 1960 as base period:
Commodities  Price 1960 (Rs.)  Price 1963 (Rs.)  Quantity 1960 (‘000 kg.)  Quantity 1963 (‘000 kg.)
a 4.3 5.2 20 16
b 2.1 3.9 5 4
c 0.8 1.6 11 8
d 3.2 4.8 8 6

Solution:
Table: Calculations for Price Index Number
Commodity  p0  pn  q0  qn  p0q0  p0qn  pnq0  pnqn
a 4.3 5.2 20 16 86.00 68.8 104 83.2
b 2.1 3.9 5 4 10.50 8.4 19.5 15.6
c 0.8 1.6 11 8 8.80 6.4 17.6 12.8
d 3.2 4.8 8 6 25.60 19.2 38.4 28.8
Total 130.90 102.8 179.5 140.4
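Laspeyre’s Index = $\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times 100 = \frac{179.5}{130.90} \times 100 = 137.13$
Paasche’s Index = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_n} \times 100 = \frac{140.4}{102.8} \times 100 = 136.58$
Fisher’s Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{137.13 \times 136.58} = 137$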
[10] Find by Arithmetic Mean method the index number from the following data:
Commodity Base Price Current Price
Rice 140 180
Sugar 100 300
Oil 400 550
Wheat 125 150
Pulse 160 200
Solution:
Commodity  P0  Pn  Price Relative = $\frac{P_n}{P_0} \times 100$
Rice 140 180 180/140 × 100 = 128.6
Sugar 100 300 300/100 × 100 = 300.0
Oil 400 550 550/400 × 100 = 137.5
Wheat 125 150 150/125 × 100 = 120.0
Pulse 160 200 200/160 × 100 = 125.0
Total 811.10
Simple Arithmetic Mean of Price Relatives Index = $\frac{\Sigma(\text{Price Relatives})}{n} = \frac{\Sigma \left( \frac{P_n}{P_0} \times 100 \right)}{n} = \frac{811.10}{5} = 162$
where n = number of items = 5
[11] Find the index number from the following data by the method of Relatives (use A.M.):
Commodity Rice Wheat Fish Potato Coal Pulse
Base Price 30 22 54 20 15 4
Current Price 33 25 64 23 16 5

Solution:
Commodity  p0  pn  Price Relative = $\frac{p_n}{p_0} \times 100$
Rice 30 33 33/30 × 100 = 110.00
Wheat 22 25 25/22 × 100 = 113.64
Fish 54 64 64/54 × 100 = 118.52
Potato 20 23 23/20 × 100 = 115.00
Coal 15 16 16/15 × 100 = 106.67
Pulse 4 5 5/4 × 100 = 125.00
Total = 688.83
Simple A.M. of Price Relative Index = $\frac{\Sigma(\text{Price Relatives})}{n} = \frac{688.83}{6} = 115$
where n = number of items
[12] Calculate a suitable index number from the data given below:
Commodity Price Relative Weight
A 125 5
B 67 2
C 250 3
Solution:
Commodity Price Relative (I) Weight (W) IW
A 125 5 625
B 67 2 134
C 250 3 750
Total - 10 1509
Weighted A.M. of Price Relative Index = $\frac{\Sigma IW}{\Sigma W} = \frac{1509}{10} = 150.9$
[13] Find by Arithmetic Mean method the index number from the following:
Commodity Base Price Current Price Weight

Rice 30 52 8
Wheat 25 30 6
Fish 130 150 3
Potato 35 49 5
Oil 70 105 7

Solution:
Commodity  Base Price (P0)  Current Price (Pn)  Weight (W)  Price Relative (I)  IW
Rice 30 52 8 52/30 × 100 = 173.33 1386.64
Wheat 25 30 6 30/25 × 100 = 120.00 720.00
Fish 130 150 3 150/130 × 100 = 115.38 346.14
Potato 35 49 5 49/35 × 100 = 140.00 700.00
Oil 70 105 7 105/70 × 100 = 150.00 1050.00
Total - - 29 - 4202.78
Weighted A.M. of Price Relative Index = $\frac{\Sigma IW}{\Sigma W} = \frac{4202.78}{29} = 145$
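The simple and weighted arithmetic means of price relatives used in Sums [10] to [13] can be sketched as follows. The function names are hypothetical; the data are those of Sum [13].

```python
def price_relatives(base_prices, current_prices):
    """Price relative of each item: (Pn / P0) x 100."""
    return [pn / p0 * 100 for p0, pn in zip(base_prices, current_prices)]


def simple_mean_of_relatives(base_prices, current_prices):
    """Simple A.M. of price relatives: sum of relatives / number of items."""
    relatives = price_relatives(base_prices, current_prices)
    return sum(relatives) / len(relatives)


def weighted_mean_of_relatives(base_prices, current_prices, weights):
    """Weighted A.M. of price relatives: sum(I * W) / sum(W)."""
    relatives = price_relatives(base_prices, current_prices)
    return sum(i * w for i, w in zip(relatives, weights)) / sum(weights)


# Sum [13]: Rice, Wheat, Fish, Potato, Oil
base = [30, 25, 130, 35, 70]
current = [52, 30, 150, 49, 105]
weights = [8, 6, 3, 5, 7]
print(round(weighted_mean_of_relatives(base, current, weights)))
```

Note that the divisor in the simple mean is the number of items, while in the weighted mean it is the sum of the weights.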
[14] The price quotations of 4 different commodities for 1951 and 1965 are given
below. Calculate the index number for 1965 with 1951 as base, by using (i) the Simple
Average of price relatives, (ii) the Weighted Average of price relatives.
Commodity  Unit  Weight (Rs. 1000)  Price 1951  Price 1965
A Seer 5 2.00 4.50
B Mound 7 2.50 3.20
C Dozen 6 3.00 4.50
D Seer 2 1.00 1.80
Solution:
Table: Calculations for Index Numbers
Commodity  Base price (P0)  Current price (Pn)  Weight (W)  Price Relative I = $\frac{P_n}{P_0} \times 100$  IW
A 2.00 4.50 5 4.50/2.00 × 100 = 225 1125
B 2.50 3.20 7 3.20/2.50 × 100 = 128 896
C 3.00 4.50 6 4.50/3.00 × 100 = 150 900
D 1.00 1.80 2 1.80/1.00 × 100 = 180 360
Total 20 683 3281
Simple Average of Price Relatives = $\frac{683}{4} = 171$
Weighted A.M. of Price Relative Index = $\frac{3281}{20} = 164$
[15] An index number of wholesale prices, based on the simple arithmetic mean
of price relatives, comprises 40 items. They are divided into seven groups. A separate
index is published for each group. Find the index number for all the groups combined
for 1968, from the following data:
Group A B C D E F G
No. of items 10 5 8 4 3 4 6
Group Index for 1968 120 95 115 142 86 100 105
Solution:
Group Index (I) Weight (W) IW
A 120 10 1200
B 95 5 475
C 115 8 920
D 142 4 568
E 86 3 258
F 100 4 400
G 105 6 630
Total - 40 4451
Index Number of the groups = $\frac{\Sigma IW}{\Sigma W} = \frac{4451}{40} = 111$
[16] In 1976 the average price of a commodity was 20% more than in 1975, but 20%
less than in 1974; and moreover it was 50% more than in 1977. Reduce the data to price
relatives using 1975 as base (1975 Price Relative = 100).
Solution:
Assume the 1976 price to be 100.
Then the 1975 price is $100 \times \frac{100}{120}$, the 1974 price is $100 \times \frac{100}{80}$ and the 1977 price is $100 \times \frac{100}{150}$.
If the price relative for 1975 is taken as 100:
Price relative for 1974 = $\frac{100 \times 100}{80} \times \frac{120}{100} = 150$
Price relative for 1976 = $\frac{100 \times 120}{100} = 120$
Price relative for 1977 = $\frac{100 \times 100}{150} \times \frac{120}{100} = 80$
∴ The price relatives for 1974, 1975, 1976 and 1977 are 150, 100, 120 and 80.
[17] Using Paasche's formula, compute the quantity index and price index numbers
for 1970 with 1966 as base year.
Commodity  Quantity 1966 (units)  Quantity 1970 (units)  Value 1966 (Rs.)  Value 1970 (Rs.)
A. 100 150 500 900
B. 80 100 320 500
C. 60 72 150 360
D. 30 33 360 297

Solution :
Table : Calculations for Quantity Index Number
Commodity q0 p0 qn pn q0p0 q0pn qnp0 qnpn
A. 100 5 150 6 500 600 750 900
B. 80 4 100 5 320 400 400 500
C. 60 2.5 72 5 150 300 180 360
D. 30 12 33 9 360 270 396 297
Total 1330 1570 1726 2057
Paasche's Quantity Index = $\frac{\Sigma q_n p_n}{\Sigma q_0 p_n} \times 100 = \frac{2057}{1570} \times 100 = 131$
Paasche's Price Index = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_n} \times 100 = \frac{2057}{1726} \times 100 = 119$
[18] Using Fisher’s ‘Ideal’ formula, calculate the quantity index number from the
following data:
Commodity  Base year Price (Rs.)  Base year Quantity (Kg.)  Current year Price (Rs.)  Current year Quantity (Kg.)
A 5 50 10 56
B 3 100 4 120
C 4 60 6 60
D 11 30 14 24
E 7 40 10 36
Solution:
Table: Calculations for Quantity Index Number
Commodity q0 p0 qn pn q0 p0 q0 pn qn p0 qn pn
A 50 5 56 10 250 500 280 560
B 100 3 120 4 300 400 360 480
C 60 4 60 6 240 360 240 360
D 30 11 24 14 330 420 264 336
E 40 7 36 10 280 400 252 360
Total - - - - 1400 2080 1396 2096
Laspeyre’s Quantity Index = $\frac{\Sigma q_n P_0}{\Sigma q_0 P_0} \times 100 = \frac{1396}{1400} \times 100 = 99.71$
Paasche’s Quantity Index = $\frac{\Sigma q_n P_n}{\Sigma q_0 P_n} \times 100 = \frac{2096}{2080} \times 100 = 100.77$
Fisher’s Ideal Index = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{99.71 \times 100.77} = 100.2$
[19] Annual production in million tons of three commodities is given:

Commodity Production in 1935 Production in 1940 Weights
A 160 200 13
B 10 12 21
C 80 100 35
Calculate quantity index number for the year 1940 with 1935 as base year, using
simple arithmetic mean and weighted arithmetic mean of the relatives.

Solution:
Commodity  q0  qn  W  Quantity Relative (I)  IW
A 160 200 13 200/160 × 100 = 125 1625
B 10 12 21 12/10 × 100 = 120 2520
C 80 100 35 100/80 × 100 = 125 4375
Total 69 370 8520
Arithmetic mean of quantity relatives = $\frac{370}{3} = 123.3$
Weighted arithmetic mean of quantity relatives = $\frac{8520}{69} = 123.5$
[20] In 1970 the price of a commodity increased by 50% over that in 1952 while the
production of the commodity decreased by 30%. By what percentage did the total rupee
value of the commodity in 1970 increase or decrease with respect to the 1952 value?
Solution:
Let, Base year Price (P0 ) = 100
Current year Price (Pn ) = 100 × 150% =150
Base year Quantity (qo) = 100
Current year Quantity (qn) = 100 × 70% = 70
∴ Value Ratio = $\frac{150}{100} \times \frac{70}{100} = 1.50 \times 0.70 = 1.05$
∴ Total rupee value of the commodity increased by 5%.
[21] Using the following data, show that Laspeyres price index formula does not
satisfy the time reversal test:
Commodity  Base Year Price  Base Year Quantity  Current Year Price  Current Year Quantity
A 6 50 10 56
B 2 100 2 120
C 4 60 6 60
D 10 30 12 24
E 8 40 12 36
Solution:
Using Laspeyres Price Index formula and omitting the factor 100:
Index number for current year (n) with base year (0): $I_{0n} = \frac{\Sigma P_n q_0}{\Sigma P_0 q_0}$
Interchanging the suffixes 0 and n,
Index number for base year with current year as base: $I_{n0} = \frac{\Sigma P_0 q_n}{\Sigma P_n q_n}$

Commodity p0 q0 pn qn p0 q0 p0 qn pn q0 pn qn
A 6 50 10 56 300 336 500 560
B 2 100 2 120 200 240 200 240
C 4 60 6 60 240 240 360 360
D 10 30 12 24 300 240 360 288
E 8 40 12 36 320 288 480 432
Total - - - - 1360 1344 1900 1880
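$I_{0n} = \frac{\Sigma P_n q_0}{\Sigma P_0 q_0} = \frac{1900}{1360}$, $I_{n0} = \frac{\Sigma P_0 q_n}{\Sigma P_n q_n} = \frac{1344}{1880}$
$I_{0n} \times I_{n0} = \frac{1900}{1360} \times \frac{1344}{1880} \neq 1$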
This verifies that Laspeyre's formula does not satisfy time reversal test.
[22] From the following data on shoe prices and quantities show that Laspeyres
index does not meet the factor reversal test:
Type of shoe P 1950 Q 1950 P 1960 Q 1960
Men’s £7 36 £10 48
Women’s 5 50 9 80
Children’s 4 18 6 26
Solution:
The factor reversal test may be represented in symbols as $P_{0n} \times Q_{0n}$ = Value Ratio = $\frac{\Sigma p_n q_n}{\Sigma p_0 q_0}$
Table: Calculations for Factor Reversal Test
Commodity P0 Pn q0 qn P0 q0 P0 qn Pn q0 Pn qn
Men’s 7 10 36 48 252 336 360 480
Women’s 5 9 50 80 250 400 450 720
Children’s 4 6 18 26 72 104 108 156
Total - - - - 574 840 918 1356
Using Laspeyres Price Index and omitting the factor 100.
Price Index ($P_{0n}$) = $\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} = \frac{918}{574}$
Interchanging P and q,
Quantity Index ($Q_{0n}$) = $\frac{\Sigma q_n P_0}{\Sigma q_0 P_0} = \frac{840}{574}$
Value Ratio = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_0} = \frac{1356}{574} = 2.36$
$P_{0n} \times Q_{0n} = \frac{918}{574} \times \frac{840}{574} = 2.34$
∴ $P_{0n} \times Q_{0n} \neq$ Value Ratio
This shows that Laspeyres Index formula does not satisfy the Factor Reversal Test.
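A similar numerical check works for the factor reversal test. The sketch below uses the shoe data of Sum [22]; the helper name `total` and the variable names are assumptions made for illustration, and Fisher's formula is shown alongside for comparison.

```python
from math import sqrt, isclose


def total(a, b):
    """Sum of pairwise products, e.g. sum of price x quantity."""
    return sum(x * y for x, y in zip(a, b))


# Data of Sum [22]: Men's, Women's, Children's shoes
p0, q0 = [7, 5, 4], [36, 50, 18]
pn, qn = [10, 9, 6], [48, 80, 26]

value_ratio = total(pn, qn) / total(p0, q0)

# Laspeyres price and quantity indices (factor 100 omitted).
laspeyres_p = total(pn, q0) / total(p0, q0)
laspeyres_q = total(qn, p0) / total(q0, p0)

# Fisher price and quantity indices (geometric means of Laspeyres and Paasche).
fisher_p = sqrt(laspeyres_p * total(pn, qn) / total(p0, qn))
fisher_q = sqrt(laspeyres_q * total(qn, pn) / total(q0, pn))

print("Value ratio:", round(value_ratio, 2))
print("Laspeyres P x Q:", round(laspeyres_p * laspeyres_q, 2))
print("Fisher P x Q equals value ratio:", isclose(fisher_p * fisher_q, value_ratio))
```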
[23] Prove using the following data that the factor reversal test and time reversal test
are satisfied by Fisher’s Ideal formula for index numbers:
Year  Rice Price  Rice Quantity  Wheat Price  Wheat Quantity  Jowar Price  Jowar Quantity
1949 4 50 3 10 2 5
1959 10 40 8 8 4 4
Solution:
Table: Calculations for Time and Factor Reversal Test
Commodity P0 q0 Pn qn P0 q0 P0 qn Pn q0 Pn qn
Rice 4 50 10 40 200 160 500 400
Wheat 3 10 8 8 30 24 80 64
Jowar 2 5 4 4 10 8 20 16
Total - - - - 240 192 600 480
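Price Index Number for 1959 with 1949 as base, omitting the factor 100:
Laspeyre’s Index = $\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} = \frac{600}{240}$
Paasche’s Index = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_n} = \frac{480}{192}$
Fisher’s Index ($I_{0n}$) = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{\frac{600}{240} \times \frac{480}{192}}$ ------- (i)
Interchanging the suffixes 0 and n in the above formulae gives the Price Index Number for
the year 1949 with base 1959:
Laspeyre’s Index = $\frac{\Sigma P_0 q_n}{\Sigma P_n q_n} = \frac{192}{480}$
Paasche’s Index = $\frac{\Sigma P_0 q_0}{\Sigma P_n q_0} = \frac{240}{600}$
∴ Fisher’s Index ($I_{n0}$) = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{\frac{192}{480} \times \frac{240}{600}}$ -------- (ii)
Multiplying (i) and (ii):
Fisher's Index ($I_{0n}$) × Fisher's Index ($I_{n0}$) = $\sqrt{\frac{600}{240} \times \frac{480}{192} \times \frac{192}{480} \times \frac{240}{600}} = 1$
Using Fisher’s formula we find that $I_{0n} \times I_{n0} = 1$. This verifies that Fisher’s formula
satisfies the Time Reversal Test.
The factor reversal test may be represented in symbols as $P_{0n} \times Q_{0n}$ = Value Ratio.
Using Fisher’s Ideal Formula and omitting the factor 100:
Price Index ($P_{0n}$) = $\sqrt{\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times \frac{\Sigma P_n q_n}{\Sigma P_0 q_n}} = \sqrt{\frac{600}{240} \times \frac{480}{192}}$
Quantity Index ($Q_{0n}$) = $\sqrt{\frac{\Sigma q_n P_0}{\Sigma q_0 P_0} \times \frac{\Sigma q_n P_n}{\Sigma q_0 P_n}} = \sqrt{\frac{192}{240} \times \frac{480}{600}}$
Value Ratio = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_0} = \frac{480}{240}$
Now $P_{0n} \times Q_{0n} = \sqrt{\frac{600}{240} \times \frac{480}{192} \times \frac{192}{240} \times \frac{480}{600}} = \sqrt{\frac{480}{240} \times \frac{480}{240}}$
∴ $P_{0n} \times Q_{0n} = \frac{480}{240}$ = Value Ratio
This shows that Fisher’s Ideal index formula satisfies the Factor Reversal Test.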
[24] Show that Fisher’s ideal price index number satisfies both the time reversal and
the factor reversal tests and verify from the following data:
Commodity  Price 1970  Quantity 1970  Price 1972  Quantity 1972
A 6 50 10 56
B 2 100 2 120
C 4 60 6 60
D 10 30 12 24
E 8 40 12 36
Solution:
Table: Calculations for Time and Factor Reversal Test
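Commodity  P0  q0  Pn  qn  P0q0  P0qn  Pnq0  Pnqn
A 6 50 10 56 300 336 500 560
B 2 100 2 120 200 240 200 240
C 4 60 6 60 240 240 360 360
D 10 30 12 24 300 240 360 288
E 8 40 12 36 320 288 480 432
Total - - - - 1360 1344 1900 1880
[I] Price Index Number for 1972 with base 1970, omitting the factor 100:
Laspeyre’s Index = $\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} = \frac{1900}{1360}$
Paasche’s Index = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_n} = \frac{1880}{1344}$
Fisher’s Index ($I_{0n}$) = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{\frac{1900}{1360} \times \frac{1880}{1344}}$ -------- (i)
[II] Interchanging the suffixes 0 and n in the above formulae gives the Price Index Number for
the year 1970 with base 1972:
Laspeyre’s Index = $\frac{\Sigma P_0 q_n}{\Sigma P_n q_n} = \frac{1344}{1880}$
Paasche’s Index = $\frac{\Sigma P_0 q_0}{\Sigma P_n q_0} = \frac{1360}{1900}$
Fisher’s Ideal Index ($I_{n0}$) = $\sqrt{\text{Laspeyre’s index} \times \text{Paasche’s index}} = \sqrt{\frac{1344}{1880} \times \frac{1360}{1900}}$ --------- (ii)
Multiplying (i) and (ii):
Fisher's Index ($I_{0n}$) × Fisher's Index ($I_{n0}$) = $\sqrt{\frac{1900}{1360} \times \frac{1880}{1344} \times \frac{1344}{1880} \times \frac{1360}{1900}} = 1$
Using Fisher’s formula we find that $I_{0n} \times I_{n0} = 1$. This verifies that Fisher’s formula
satisfies the Time Reversal Test.
The factor reversal test may be represented in symbols as $P_{0n} \times Q_{0n}$ = Value Ratio.
Using Fisher’s Ideal Formula and omitting the factor 100:
Price Index ($P_{0n}$) = $\sqrt{\frac{\Sigma P_n q_0}{\Sigma P_0 q_0} \times \frac{\Sigma P_n q_n}{\Sigma P_0 q_n}} = \sqrt{\frac{1900}{1360} \times \frac{1880}{1344}}$
Quantity Index ($Q_{0n}$) = $\sqrt{\frac{\Sigma q_n P_0}{\Sigma q_0 P_0} \times \frac{\Sigma q_n P_n}{\Sigma q_0 P_n}} = \sqrt{\frac{1344}{1360} \times \frac{1880}{1900}}$
Value Ratio = $\frac{\Sigma P_n q_n}{\Sigma P_0 q_0} = \frac{1880}{1360}$
$P_{0n} \times Q_{0n} = \sqrt{\frac{1900}{1360} \times \frac{1880}{1344} \times \frac{1344}{1360} \times \frac{1880}{1900}} = \frac{1880}{1360}$ = Value Ratio
This shows that Fisher’s Ideal Index formula satisfies the Factor Reversal Test.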
[25] Calculate a simple price index number for the year 1934 from the following data and
verify numerically whether the formula employed satisfies the appropriate test or not.
Commodity  A  B  C  D  E
Price (Rs.) 1927  6  2  4  10  8
Price (Rs.) 1934  10  2  6  12  12
Solution:
Let P0, Pn denote the Base Price and Current Price

Commodity  P0  Pn  Price Relative = $\frac{P_n}{P_0} \times 100$
A 6 10 10/6 × 100 = 166.67
B 2 2 2/2 × 100 = 100.00
C 4 6 6/4 × 100 = 150.00
D 10 12 12/10 × 100 = 120.00
E 8 12 12/8 × 100 = 150.00
Total 30 42 686.67
Simple A.M. of Price Relative Index = $\frac{686.67}{5} = 137.33$
Simple Aggregative Index = $\frac{42}{30} \times 100 = 140$
Verifying the Time Reversal Test for the simple aggregative index, i.e. $I_{0n} \times I_{n0} = 1$ (omitting the factor 100):
$I_{0n} = \frac{\Sigma P_n}{\Sigma P_0} = \frac{42}{30} = \frac{7}{5}$
$I_{n0} = \frac{\Sigma P_0}{\Sigma P_n} = \frac{30}{42} = \frac{5}{7}$
$I_{0n} \times I_{n0} = \frac{7}{5} \times \frac{5}{7} = 1$
∴ The simple aggregative index satisfies the Time Reversal Test.
[26] Find index numbers for the years 1961, 1962, 1963 by the chain base method,
with base-year 1960. From the following table:
Year 1960 1961 1962 1963
Link Index 100 110 95.5 109.5
Solution:
Table: Calculations for chain Base Index
Year Link Index Chain Index (Base 1960=100)
Y0 = 1960 100 100
Y1 = 1961 I01 = 110 I′01 = 100 ×1.10 = 110
Y2 = 1962 I12 = 95.5 I′02 = 100 × (1.10 × 0.955) = 105.05 = 105
Y3 = 1963 I23 = 109.5 I′03 = 100 × (1.10 × 0.955 × 1.095) = 115
Ans: Y1 = 110
Y2 = 105
Y3 = 115
[27] Compute chain index numbers with 1970 prices as base, from the following
table giving the average wholesale prices for the years 1970-74.
Average Wholesale Price (Rs.)
Commodity  1970  1971  1972  1973  1974
A 20 16 28 35 21
B 25 30 24 36 45
C 20 25 30 24 30
Solution:
The link index for each year is taken as the simple A.M. of the price relatives with the preceding year as base.
Commodity  1971 (I01)  1972 (I12)  1973 (I23)  1974 (I34)
A  16/20 × 100 = 80  28/16 × 100 = 175  35/28 × 100 = 125  21/35 × 100 = 60
B  30/25 × 100 = 120  24/30 × 100 = 80  36/24 × 100 = 150  45/36 × 100 = 125
C  25/20 × 100 = 125  30/25 × 100 = 120  24/30 × 100 = 80  30/24 × 100 = 125
Total  325  375  355  310
Link Index (Total ÷ 3)  108.33  125  118.33  103.33
Table: Calculations for Chain Index
Year  Link Index  Chain Index (Base 1970=100)
Y0 = 1970  -  100
Y1 = 1971  I01 = 108.33  I′01 = 100 × 1.0833 = 108
Y2 = 1972  I12 = 125  I′02 = 100 × 1.0833 × 1.25 = 135
Y3 = 1973  I23 = 118.33  I′03 = 100 × 1.0833 × 1.25 × 1.1833 = 160
Y4 = 1974  I34 = 103.33  I′04 = 100 × 1.0833 × 1.25 × 1.1833 × 1.0333 = 166
Ans: 100, 108, 135, 160, 166 (Using A.M. of relatives)
[28] From the table of group index numbers and group expenditures given below
calculate the cost of living index number:
Group Index Number Percentage of Total Expenditure
Food 428 45
Clothing 250 15
Fuel & Light 220 8
House Rent 125 20
Others 175 12

Solution:
Table: Calculations for Cost of Living Index
Group Group Index (I) Weight (W) IW
Food 428 45 19260
Clothing 250 15 3750
Fuel & Light 220 8 1760
House Rent 125 20 2500
Other 175 12 2100
Total - 100 29370
Cost of Living Index = $\frac{\Sigma IW}{\Sigma W} = \frac{29370}{100} = 293.70$
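The cost of living index is a weighted arithmetic mean of the group indices, which is easy to verify in Python. The function name `cost_of_living_index` is a hypothetical choice for this sketch; the data are those of Sum [28].

```python
def cost_of_living_index(group_indices, weights):
    """Weighted A.M. of group indices: sum(I * W) / sum(W)."""
    weighted_sum = sum(i * w for i, w in zip(group_indices, weights))
    return weighted_sum / sum(weights)


# Sum [28]: Food, Clothing, Fuel & Light, House Rent, Others
indices = [428, 250, 220, 125, 175]
weights = [45, 15, 8, 20, 12]  # percentage of total expenditure
print(round(cost_of_living_index(indices, weights), 2))
```

Because the function divides by the sum of the weights, it works equally well when the weights do not add up to 100, as in Sum [29].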
[29] The following are the group index numbers and corresponding group weights of
an average working class family’s budget. Construct the cost of living index number.
Group Food Fuel & Lighting Clothing Rent Miscellaneous
Index No. 352 220 230 160 190
Weight 48 10 8 12 15
Solution:
Table: Calculations for Cost of Living Index Number
Group Index No. (I) Weight (W) WI
Food 352 48 16896
Fuel & Lighting 220 10 2200
Clothing 230 8 1840
Rent 160 12 1920
Miscellaneous 190 15 2850
Total 93 25706
Cost of Living Index = $\frac{\Sigma WI}{\Sigma W} = \frac{25706}{93} = 276.41$
[30] The following table gives group index numbers and corresponding group
weights with regard to cost of living for a given year. Construct the overall cost of
living index for the year.
Group  Index No.  Weight
Food 350 5
Fuel & Lighting 220 1
Clothing 230 1
Rent 160 1
Miscellaneous 190 2
How is the overall index number altered, if
(i) all the group index numbers are changed in the same ratio;
(ii) all the weights and group indices are changed in the same ratio;
(iii) all the group index numbers are increased by 10 and all the weights are doubled?
Solution:
Table: Calculations for cost of Living Index
Group Group Index (I) Weight (W) IW
Food 350 5 1750
Fuel & Lighting 220 1 220
Clothing 230 1 230
Rent 160 1 160
Miscellaneous 190 2 380
Total 10 2740
Cost of Living Index = $\frac{\Sigma IW}{\Sigma W} = \frac{2740}{10} = 274$
i) Cost of Living Index will be changed in the same ratio.
ii) Cost of Living Index will be changed in the same ratio as group index numbers.
iii) Cost of Living index will be increased by 10.
[31] The percentage increase in price in 1971 over 1960 in the following groups for
middle class people in Calcutta and the percentage of total expenditure spent on those
groups are shown below. Calculate the cost of living index number for 1971 with 1960
as base.
Group Percentage Increase in price Percentage of total expenditure
Food 125 45
Clothing 66 6
Fuel & Lighting 112 5
House Rent& Tax 90 10
Miscellaneous 105 34
Solution:
Group  Percentage increase in price (I)  Percentage of total expenditure (W)  IW
Food 125 45 5625
Clothing 66 6 396
Fuel & Lighting 112 5 560
House Rent & Tax 90 10 900
Miscellaneous 105 34 3570
Total 100 11051
Average price increase over 1960 = $\frac{11051}{100} = 110.51$
Cost of Living Index = 100 + 110.51 = 210.51
[32] Determine the relative importance of the food group, given that the cost of
living index number for 1975 with 1970 as base is 175, from the following figures:
Group % increase in expenditure Weight
Food 65 -
Clothing 90 12
Fuel etc. 20 18
Miscellaneous 70 10
Rent etc. 150 20
Solution:
Let the weight of food group be x.
Each group index is 100 plus the percentage increase in expenditure.
Group  Index (I)  Weight (W)  IW
Food 165 x 165x
Clothing 190 12 2280
Fuel etc. 120 18 2160
Miscellaneous 170 10 1700
Rent etc. 250 20 5000
Total - 60 + x 11140 + 165x
$175 = \frac{11140 + 165x}{60 + x}$ or, 10500 + 175x = 11140 + 165x or, 10x = 640 ∴ x = 64
∴ The relative importance of food group is 64.
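The same equation can be solved for the unknown weight in general. The sketch below rearranges the weighted-mean formula; the function name `unknown_weight` and its parameter names are assumptions made for illustration.

```python
def unknown_weight(target_index, unknown_group_index, known_indices, known_weights):
    """Solve target = (sum(I*W) + Ix * x) / (sum(W) + x) for the unknown weight x.

    Rearranging gives x = (sum(I*W) - target * sum(W)) / (target - Ix).
    """
    known_iw = sum(i * w for i, w in zip(known_indices, known_weights))
    known_w = sum(known_weights)
    return (known_iw - target_index * known_w) / (target_index - unknown_group_index)


# Sum [32]: Clothing, Fuel etc., Miscellaneous, Rent etc. (index = 100 + % increase)
x = unknown_weight(target_index=175, unknown_group_index=165,
                   known_indices=[190, 120, 170, 250],
                   known_weights=[12, 18, 10, 20])
print(x)
```

The formula breaks down if the target index equals the index of the unknown group, since the divisor is then zero.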
[33] The group indices and the corresponding weights for the working class cost of
living index numbers in an industrial city for the years 1976 and 1980 are given below:
Group  Weight  Group index 1976  Group index 1980
Food 71 370 380
Clothing 3 423 504
Fuel etc 9 469 336
House Rent 7 110 116
Miscellaneous 10 279 283
Compute the cost of living indices for the two years 1976 and 1980. If a worker
was getting Rs. 300 per month in 1976, do you think that he should be given some
extra allowance so that he can maintain his 1976 standard of living? If so, what should
be the minimum amount of his extra allowance?

Solution:
Group Weight (W) Group index for 1976 (I1) Group index for 1980 (I2) I1 W I2 W
Food 71 370 380 26270 26980
Clothing 3 423 504 1269 1512
Fuel etc. 9 469 336 4221 3024
House Rent 7 110 116 770 812
Miscellaneous 10 279 283 2790 2830
Total 100 - - 35320 35158
Cost of living index for 1976 = $\frac{35320}{100} = 353.20$
Cost of living index for 1980 = $\frac{35158}{100} = 351.58$
Since the index number for 1980 is smaller, no extra allowance needs to be given.
[34] An enquiry into the budgets of the middle class families of a certain city revealed
that on an average, the percentages expended on the different groups were: Food 45,
Rent 15, Clothing 12, Fuel and Light 8, and Miscellaneous 20. The group index numbers for
the current year as compared with a fixed base period were respectively 410, 150, 343,
248 and 285. Calculate the consumer price index number for the current year. Mr. X
was getting Rs. 240 in the base period and Rs. 430 in the current year. State how much
he ought to have received as extra allowance to maintain his former standard of living.
Solution:
Group  Percentage Expenses (W)  Group Index (I)  IW
Food 45 410 18450
Rent 15 150 2250
Clothing 12 343 4116
Fuel and Light 8 248 1984
Miscellaneous 20 285 5700
Total 100 - 32500
Cost of Living Index = $\frac{32500}{100} = 325$
When the base period index is 100, the current period index is 325.
So an income of Re. 1 in the base period corresponds to $\frac{325}{100}$ in the current period,
and an income of Rs. 240 in the base period corresponds to $\frac{325}{100} \times 240 = 780$ in the current period.
∴ He has to receive an extra sum of Rs. (780 − 430) = Rs. 350 to maintain his former standard of living.
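The allowance calculation above can be written as a small Python function. This is a sketch only; the function name `extra_allowance` and its parameter names are hypothetical.

```python
def extra_allowance(base_income, current_income, cost_of_living_index):
    """Extra amount needed to keep the base-period standard of living.

    The income required now is base_income x index / 100; the allowance is
    the shortfall of the current income from that amount (never negative).
    """
    required_income = base_income * cost_of_living_index / 100
    return max(required_income - current_income, 0)


# Sum [34]: Rs. 240 in the base period, Rs. 430 now, index 325
print(extra_allowance(base_income=240, current_income=430, cost_of_living_index=325))
```

If the current income already exceeds the required income, the function returns 0, meaning no extra allowance is needed.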
[35] During a certain period the cost of living index number goes up from 110 to 200
and the salary of a worker is also raised from Rs. 325 to Rs. 500. Does the worker really
gain and if so, by how much in real terms?

Solution:
Let the years be Y1 and Y2
Year  Cost of living index (CLI)  Wages (Rs.)  Real wages = $\frac{\text{Actual Wages}}{\text{CLI}} \times 100$
Y1 110 325 325/110 × 100 = 295
Y2 200 500 500/200 × 100 = 250
Hence the worker does not gain: the real wage in the year Y2 has fallen by Rs. (295 − 250) = Rs. 45.
[36] The average weekly wages for all manufacture industries for a number of months
in 1960 are Rs. 78.52, 79.71, 78.55, 78.17, 78.99, the corresponding consumer price
index numbers are 115, 116, 118, 117, 120. Find the real wages for the different months
and calculate the percentage change in the real wages during the period.
Solution:
Table: Calculations for Real Wages
Months in 1960  Weekly wages (Rs.)  Consumer Price Index (Base = 100)  Real Wage = $\frac{\text{Actual Wages}}{\text{CPI}} \times 100$
1 78.52 115 68.28
2 79.71 116 68.72
3 78.55 118 66.57
4 78.17 117 66.81
5 78.99 120 65.82
Note: Real wage for month 1 = $\frac{78.52}{115} \times 100 = 68.28$
Real wage for month 2 = $\frac{79.71}{116} \times 100 = 68.72$, and so on.
Fall in real wage from month 1 to month 5 = Rs. (68.28 − 65.82) = Rs. 2.46
% change in real wage during the period = $\frac{2.46}{68.28} \times 100 = 3.6\%$ (a decrease)
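Deflating money wages by a price index is a one-line calculation per period, as the following Python sketch shows. The function name `real_wages` is hypothetical; the data are those of Sum [36].

```python
def real_wages(money_wages, price_indices):
    """Real wage for each period: (money wage / price index) x 100."""
    return [wage / index * 100 for wage, index in zip(money_wages, price_indices)]


# Sum [36]: average weekly wages and consumer price index for five months
wages = [78.52, 79.71, 78.55, 78.17, 78.99]
cpi = [115, 116, 118, 117, 120]

real = real_wages(wages, cpi)
# Percentage change in the real wage between the first and the last month.
percent_change = (real[-1] - real[0]) / real[0] * 100

print([round(r, 2) for r in real])
print(round(percent_change, 1))
```

A negative percentage change indicates that the real wage has fallen over the period, even though the money wage rose slightly.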
[37] The average monthly wages in different years are as follows:

Year 1967 1968 1969 1970 1971 1972 1973
Wages (Rs.) 200 240 350 360 360 380 400
Price Index 100 150 200 220 230 250 250
Calculate the real wages index numbers.

Solution:
Table: Calculations for Real Wages
Year  Consumer Price Index number (Base 1967=100)  Actual Wage (Rs.)  Real Wages = $\frac{\text{Col.3}}{\text{Col.2}} \times 100$  Real Wage Index
1  2  3  4  5
1967  100  200  200/100 × 100 = 200  200/200 × 100 = 100
1968  150  240  240/150 × 100 = 160  160/200 × 100 = 80
1969  200  350  350/200 × 100 = 175  175/200 × 100 = 88
1970  220  360  360/220 × 100 = 163.64  163.64/200 × 100 = 82
1971  230  360  360/230 × 100 = 156.52  156.52/200 × 100 = 78
1972  250  380  380/250 × 100 = 152.00  152/200 × 100 = 76
1973  250  400  400/250 × 100 = 160.00  160/200 × 100 = 80
Real Wage Index is: 100, 80, 88, 82, 78, 76 and 80 (base 1967=100)
[38] Given the following table, calculate the real wage rates and the purchasing power
of the rupee for the years 1947-1954, taking 1947 as the base year:
Year 1947 1948 1949 1950 1951 1952 1953 1954
Wage rate per day (Rs.) 1.19 1.33 1.44 1.57 1.75 1.84 1.89 1.94
Consumer price index 95.5 102.8 101.8 102.8 111.0 113.5 114.4 114.8
(1947-49=100)
Solution:
Table: Calculations for Real Wages & Purchasing Power of Rupee
Year  Consumer Price Index number (1947-49=100)  Actual Wage (Rs. per day)  Real Wages = $\frac{\text{Col.3}}{\text{Col.2}} \times 100$  Purchasing Power of Rupee for 1947-54
1  2  3  4  5
1947  95.5  1.19  1.25  95.5 ÷ 95.5 = 1.00
1948  102.8  1.33  1.29  95.5 ÷ 102.8 = 0.93
1949  101.8  1.44  1.41  95.5 ÷ 101.8 = 0.94
1950  102.8  1.57  1.53  95.5 ÷ 102.8 = 0.93
1951  111.0  1.75  1.58  95.5 ÷ 111.0 = 0.86
1952  113.5  1.84  1.62  95.5 ÷ 113.5 = 0.84
1953  114.4  1.89  1.65  95.5 ÷ 114.4 = 0.83
1954  114.8  1.94  1.69  95.5 ÷ 114.8 = 0.83
Real Wages Rates are: 1.25, 1.29, 1.41, 1.53, 1.58, 1.62, 1.65, 1.69
Purchasing Power of Rupee: 1.00, 0.93, 0.94, 0.93, 0.86, 0.84, 0.83, 0.83
[39] Given below are the average wages in rupees per hour of unskilled workers of
a factory during the years 1975-1980. Also shown is the consumer price index for these
years(taking 1975 as base year with Price Index 100). Determine the real wages of the
workers during 1975-1980 compared with their wages in 1975.
Year 1975 1976 1977 1978 1979 1980
Consumer Price Index 100 120.2 121.7 125.9 129.3 140
Average Wage(Rs/ hours) 1.19 1.94 2.13 2.28 2.45 3.10
How much is the worth of one rupee of 1975 in subsequent year?
Solution:
Table: Calculations for Real Wages & Purchasing Power of Money
Year  Consumer Price Index number  Actual Wage (Rs./hour)  Real Wages = $\frac{\text{Col.3}}{\text{Col.2}} \times 100$  Worth of one rupee of 1975 (consumer price index ÷ 100)
1  2  3  4  5
1975  100  1.19  1.19  1.00
1976  120.2  1.94  1.94/120.2 × 100 = 1.61  1.20
1977  121.7  2.13  2.13/121.7 × 100 = 1.75  1.22
1978  125.9  2.28  2.28/125.9 × 100 = 1.81  1.26
1979  129.3  2.45  2.45/129.3 × 100 = 1.89  1.29
1980  140  3.10  3.10/140 × 100 = 2.21  1.4
Note: Purchasing power of money = $\frac{100}{\text{Price Index Number}}$ (with reference to the base period)
Worth of one rupee of 1975 in a subsequent year = $\frac{\text{Index number for the subsequent year}}{\text{Index number in the base year}}$, e.g. for 1976, $\frac{120.2}{100} = 1.2$
Real wages are: 1.19, 1.61, 1.75, 1.81, 1.89, 2.21
Worth of one rupee of 1975 in subsequent years: 1.00, 1.20, 1.22, 1.26, 1.29, 1.4
[40] The following are index numbers of prices (1969=100):
Year 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978
Index 100 120 180 207 243 270 300 360 400 420
Shift the base from 1969 to 1975 and recast the index numbers.

Solution:
Table: Base shifting from 1969 to 1975
Year   Index Number (Base 1969 = 100)   Index Number (Base 1975 = 100)
1969   100                              (100 ÷ 300) × 100 = 33
1970   120                              (120 ÷ 300) × 100 = 40
1971   180                              (180 ÷ 300) × 100 = 60
1972   207                              (207 ÷ 300) × 100 = 69
1973   243                              (243 ÷ 300) × 100 = 81
1974   270                              (270 ÷ 300) × 100 = 90
1975   300                              (300 ÷ 300) × 100 = 100
1976   360                              (360 ÷ 300) × 100 = 120
1977   400                              (400 ÷ 300) × 100 = 133
1978   420                              (420 ÷ 300) × 100 = 140
Base shifted to 1975 = 100
Index Numbers are: 33, 40, 60, 69, 81, 90, 100, 120, 133, 140.
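Base shifting divides every index by the index of the new base year and multiplies by 100. A minimal Python sketch of that rule follows; the function name `shift_base` and the dictionary layout are assumptions made for illustration.

```python
def shift_base(index, new_base_year):
    """Recast an index series so that new_base_year = 100."""
    base_value = index[new_base_year]
    return {year: value / base_value * 100 for year, value in index.items()}


# Series from problem [40], base 1969 = 100.
index_1969 = {
    1969: 100, 1970: 120, 1971: 180, 1972: 207, 1973: 243,
    1974: 270, 1975: 300, 1976: 360, 1977: 400, 1978: 420,
}

index_1975 = shift_base(index_1969, 1975)

# Round to whole numbers, as in the solution table.
rounded = [round(v) for v in index_1975.values()]
print(rounded)
```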
[41] The following table shows the index number of wholesale prices in India
(Revised Series) with base 1970-71 = 100:-
Year 1971 1973 1974 1975 1976 1977
Index Number 105 132 169 176 172 185
Find the index numbers for these years with base 1973 = 100.
Solution:
Table: Index Numbers with base shifted to 1973
Year   Index Numbers   Index Numbers (1973 = 100)
1971   105             (105 ÷ 132) × 100 = 80
1973   132             100
1974   169             (169 ÷ 132) × 100 = 128
1975   176             (176 ÷ 132) × 100 = 133
1976   172             (172 ÷ 132) × 100 = 130
1977   185             (185 ÷ 132) × 100 = 140
Ans: 80, 100, 128, 133, 130, 140.

[42] An index number is at 100 in 1971. It rises 10% in 1972, falls 4% in 1973, falls
2% in 1974, and again rises 10% in 1975, over the preceding year. Calculate the index
number for the 5 years with 1974 as base.
Solution:
Table: Base shifting from 1971 to 1974
Year   Index Number (Base 1971=100)     Index Number (Base 1974=100)
1971   100                              (100 ÷ 103.5) × 100 = 96.6
1972   (110 ÷ 100) × 100 = 110          (110 ÷ 103.5) × 100 = 106.3
1973   (96 ÷ 100) × 110 = 105.6         (105.6 ÷ 103.5) × 100 = 102.0
1974   (98 ÷ 100) × 105.6 = 103.5       (103.5 ÷ 103.5) × 100 = 100
1975   (110 ÷ 100) × 103.5 = 113.9      (113.9 ÷ 103.5) × 100 = 110
Ans: 96.6, 106.3, 102.0, 100, 110.
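The two steps of problem [42], building the series from year-on-year percentage changes and then shifting the base to 1974, can be sketched in Python as below. Variable names are illustrative. The sketch keeps full precision until the final rounding, so its 1975 value on the 1971 base comes out near 113.8 rather than the 113.9 obtained in the table from the rounded 103.5; the final answers agree.

```python
# Percentage change over the preceding year, from problem [42].
changes = {1972: +10, 1973: -4, 1974: -2, 1975: +10}

# Step 1: chain the changes, starting from 100 in 1971.
index_1971 = {1971: 100.0}
previous = 100.0
for year, pct in changes.items():
    previous = previous * (100 + pct) / 100
    index_1971[year] = previous

# Step 2: shift the base to 1974 and round to one decimal place.
base = index_1971[1974]
index_1974 = {y: round(v / base * 100, 1) for y, v in index_1971.items()}

print(index_1974)
```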
[43] Given below are two series of index numbers, one with 1961 as base and the
other with 1970 as base:
a) Year   Index        b) Year   Index
   1965   180             1970   100
   1966   192             1971   108
   1967   208             1972   112
   1968   220             1973   125
   1969   232             1974   130
   1970   250             1975   150
The index number series (a) was discontinued in 1971. Splice the series (a) to
the series (b) with 1970 as base.

Solution:
Table: Splicing Two Series of Index Numbers
Year   'a' series index   'b' series index   New continuous series index
       (1961=100)         (1970=100)         (1970=100)
1      2                  3                  4
1965   180                                   (180 ÷ 250) × 100 = 72
1966   192                                   (192 ÷ 250) × 100 = 76.8
1967   208                                   (208 ÷ 250) × 100 = 83.2
1968   220                                   (220 ÷ 250) × 100 = 88
1969   232                                   (232 ÷ 250) × 100 = 92.8
1970   250                100                100
1971                      108                108
1972                      112                112
1973                      125                125
1974                      130                130
1975                      150                150
Ans: 72, 76.8, 83.2, 88, 92.8, 100, 108, 112, 125, 130, 150
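Splicing the old series to the new one multiplies each old index by the ratio (new index ÷ old index) in the overlapping year, here 100 ÷ 250. A minimal Python sketch, with `splice_backward` as a hypothetical helper name:

```python
def splice_backward(old, new, overlap_year):
    """Put `old` on the base of `new` and join the two series."""
    factor = new[overlap_year] / old[overlap_year]
    spliced = {y: round(v * factor, 1) for y, v in old.items() if y < overlap_year}
    spliced.update(new)  # from the overlap year on, use the new series as given
    return spliced


# Series from problem [43].
series_a = {1965: 180, 1966: 192, 1967: 208, 1968: 220, 1969: 232, 1970: 250}
series_b = {1970: 100, 1971: 108, 1972: 112, 1973: 125, 1974: 130, 1975: 150}

continuous = splice_backward(series_a, series_b, 1970)
print(continuous)
```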
[44] In 1950, a statistical Bureau started constructing an index number series with 1950 as base.
Year 1950 (Base) 1956 1960
Index 100 140 200
In 1961, the Bureau reconstructed the index number series on a plan with base 1960.
Year 1960 (Base) 1965 1970
Index 100 150 210
In 1971 the Bureau again reconstructed the series on yet another plan with base year 1970.
Year 1970 (Base) 1975 1981
Index 100 180 240
Obtain a Continuous series with base 1970, by splicing the three series.
Solution:
Table: Splicing of Three Series of Index Numbers
Year   A-series index   B-series index   C-series index   New continuous series      New continuous series
       (1950=100)       (1960=100)       (1970=100)       (1960=100)                 (1970=100)
1      2                3                4                5                          6
1950   100              -                -                (100 ÷ 200) × 100 = 50     (50 ÷ 210) × 100 = 23.8
1956   140              -                -                (140 ÷ 200) × 100 = 70     (70 ÷ 210) × 100 = 33.3
1960   200              100              -                (200 ÷ 200) × 100 = 100    (100 ÷ 210) × 100 = 47.6
1965   -                150              -                -                          (150 ÷ 210) × 100 = 71.4
1970   -                210              100              -                          100
1975   -                -                180              -                          180
1981   -                -                240              -                          240
Ans: 23.8, 33.3, 47.6, 71.4, 100, 180, 240.
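With three series the splicing is applied twice: series A is first put on the 1960 base of series B, and the joined series is then put on the 1970 base of series C. The Python sketch below follows those two steps; `splice_backward` is an illustrative helper, and rounding is deferred to the end.

```python
def splice_backward(old, new, overlap_year):
    """Express `old` on the base of `new`, then join the two series (unrounded)."""
    factor = new[overlap_year] / old[overlap_year]
    joined = {y: v * factor for y, v in old.items() if y < overlap_year}
    joined.update(new)
    return joined


# Series from problem [44].
series_a = {1950: 100, 1956: 140, 1960: 200}  # base 1950 = 100
series_b = {1960: 100, 1965: 150, 1970: 210}  # base 1960 = 100
series_c = {1970: 100, 1975: 180, 1981: 240}  # base 1970 = 100

on_1960 = splice_backward(series_a, series_b, 1960)  # column 5 of the table
on_1970 = splice_backward(on_1960, series_c, 1970)   # column 6 of the table

continuous = {y: round(v, 1) for y, v in on_1970.items()}
print(continuous)
```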

[45] Deflate the per capita income shown in the following table on the basis of the
rise in the cost of living index and comment on your results:
Year 1965 1966 1967 1968 1969 1970 1971 1972
Cost of Living index 100 110 120 130 150 200 250 350
Per Capita income (Rs.) 65 70 75 80 90 100 110 130
Solution:
Table: Deflating Per Capita Income
Year   Cost of Living Index   Actual Per Capita   Real Income (Rs.)
       (1965=100)             Income (Rs.)
1      2                      3                   4
1965   100                    65                  65.00
1966   110                    70                  (70 ÷ 110) × 100 = 63.64
1967   120                    75                  (75 ÷ 120) × 100 = 62.50
1968   130                    80                  (80 ÷ 130) × 100 = 61.54
1969   150                    90                  (90 ÷ 150) × 100 = 60.00
1970   200                    100                 (100 ÷ 200) × 100 = 50.00
1971   250                    110                 (110 ÷ 250) × 100 = 44.00
1972   350                    130                 (130 ÷ 350) × 100 = 37.14
Comments: It is observed from column (3) that although actual income has
gradually increased from Rs. 65 in 1965 to double, i.e. Rs. 130 in 1972, the "real income"
in column (4) has gone down considerably, from Rs. 65.00 to Rs. 37.14. This indicates that
people of this particular category have been hard hit by the substantial rise in the cost
of living index.
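The comment can be put in numbers with a short Python sketch that deflates each income and then compares the overall change in actual and real income between 1965 and 1972. Variable names are illustrative; the percentage changes are derived here from the table values and are not stated in the text.

```python
# Data from problem [45]; cost of living index has 1965 = 100.
years = [1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972]
col_index = [100, 110, 120, 130, 150, 200, 250, 350]
income = [65, 70, 75, 80, 90, 100, 110, 130]  # actual per capita income, Rs.

# Real income = (actual income / cost of living index) * 100.
real_income = [round(i / c * 100, 2) for i, c in zip(income, col_index)]

# Percentage change from the first year to the last.
nominal_change = (income[-1] / income[0] - 1) * 100
real_change = (real_income[-1] / real_income[0] - 1) * 100

print(real_income)
print(round(nominal_change, 1), round(real_change, 1))
```

On these figures actual income doubles while real income falls by roughly 43 per cent.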

*****Volume - I ends here*****
Thank you for your interest
&
Congratulations on finishing this book entirely
Keep Learning & All the best for your future!
You can give your feedback at pritish@gayali.in


