
UNIT II

MEASURES OF CENTRAL TENDENCY

Syllabus:

Measures of Central tendency – A.M, Median, Quartiles & Mode (without Grouping), G.M, H.M.

Summarisation of data is a necessary function of any statistical analysis. As a first step in this direction, the
huge mass of unwieldy data is summarised in the form of tables and frequency distributions. To
bring the characteristics of the data into sharp focus, these tables and frequency distributions need to be
summarised further. A measure of central tendency, or an average, is an essential
summary measure in any statistical analysis. An average is a single value which can be taken as
representative of the whole distribution.

“An average is an attempt to find one single figure to describe the whole group of figures.”

“A measure of central tendency is a typical value around which other figures congregate”.

Functions of an Average

1. To present a huge mass of data in a summarised form. An average condenses
a large body of numerical figures into a single figure, which makes it easier to understand and
remember.
2. To facilitate comparison. Different sets of data can be compared by comparing their averages.
3. To help in decision making. Most of the decisions to be taken in research, planning etc. are based
on the average values of certain variables.

Characteristics of a Good Average

According to Prof. Yule, an ideal measure of average must possess the following characteristics:

1. It should be rigidly defined, preferably by an algebraic formula, so that different persons obtain the
same value for a given set of data.
2. It should be easy to compute and understand.
3. It should be based on all observations. Thus, in the computation of an ideal average the entire set
of data at our disposal should be used and there should not be any loss of information resulting
from not using the available data. Obviously if the whole data is not used in computing the average,
it will be unrepresentative of the distribution.
4. It should be suitable for further mathematical treatment. In other words, the average should possess
some important and interesting mathematical properties so that its use in further statistical theory
is enhanced.
5. It should be affected as little as possible by fluctuations in sampling. By this we mean that if we
take independent random samples of the same size from a given population and compute the
average for each of these samples then, for an ideal average, the values so obtained from different
samples should not vary much from one another. The difference in the values of the average for
different samples is attributed to the so-called fluctuations of sampling. This property is also
explained by saying that an ideal average should possess sampling stability.
6. It should not be affected much by extreme observations. By extreme observations we mean very
small or very large observations. Thus a few very small or very large observations should not
unduly affect the value of a good average.

Various Measures of Central Tendency

Various measures of average can be classified into the following three categories:

a) Mathematical Averages
i) Arithmetic Mean
ii) Geometric Mean
iii) Harmonic mean
iv) Quadratic Mean
b) Positional Averages
i) Median
ii) Mode
c) Commercial Averages
i) Moving Average
ii) Progressive Average
iii) Composite Average

Out of these, A.M, Median, Mode, G.M & H.M are commonly used in practice.

I. Arithmetic Mean

Arithmetic Mean of a given set of observations is their sum divided by the number of observations.

Case 1: Individual Series:

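An individual series is one in which the observations are listed singly, without frequencies.

Here the mean can be found by two methods.

(i) Direct Method:

$$\bar{X} = \frac{\sum X}{N}$$

Example 1. Find the mean of the following figures: 30, 41, 47, 54, 23, 34, 37, 51, 53, 47.

Solution:

∑X = 30 + 41 + 47 + 54 + 23 + 34 + 37 + 51 + 53 + 47 = 417; N = 10.

$$\bar{X} = \frac{\sum X}{N} = \frac{417}{10} = 41.7$$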

(ii) Short Cut Method:

Here the mean is calculated from an assumed mean: deviations of the items are taken from it and the following formula is used.

$$\bar{X} = A + \frac{\sum dx}{N}$$

where A is the assumed mean,

dx = X – A is the deviation of an item from the assumed mean,

and ∑dx/N is known as the correction factor.
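
The two methods can be compared numerically. The following is a minimal Python sketch, using the ten figures from the direct-method example; the function names and the choice A = 40 are illustrative assumptions, not part of the original notes.

```python
# Figures taken from the direct-method example (sum = 417, N = 10).
figures = [30, 41, 47, 54, 23, 34, 37, 51, 53, 47]


def direct_mean(values):
    """Direct method: sum of the observations divided by their number."""
    return sum(values) / len(values)


def shortcut_mean(values, assumed_mean):
    """Short-cut method: assumed mean plus the correction factor sum(dx)/N."""
    deviations = [x - assumed_mean for x in values]  # dx = X - A
    return assumed_mean + sum(deviations) / len(values)


mean_direct = direct_mean(figures)
mean_shortcut = shortcut_mean(figures, assumed_mean=40)  # A = 40 is an arbitrary choice
print(mean_direct, mean_shortcut)  # both are approximately 41.7
```

Any other assumed mean should give the same answer, because the correction factor compensates exactly for the choice of A.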

Case 2: Discrete Frequency Distribution

Here each value of the variable is multiplied by its frequency; the products are totalled and the total is divided by the sum of the
frequencies to give the mean.

Symbolically,

$$\bar{X} = \frac{\sum fX}{N}$$

where f = frequency,

X = the value of the variable,

and N = the sum of the frequencies, i.e. N = ∑f.

Example 1. Calculate A.M. from the following data

Solution:
(ii) Short Cut Method:

Here an assumed mean is chosen and the deviations of the variable from it are taken. We obtain the mean by using the following
formula.

$$\bar{X} = A + \frac{\sum f\,dx}{N}$$

where A = assumed mean,

dx = X – A,

f = frequency, and N = ∑f = total number of terms.

(Note: This formula is often used when the values of the variable are large or in fractions and the direct formula is
not easy to use.)

Example 2. Calculate the Arithmetic Mean using the short-cut method:

Solution:
Important:
A formula that divides the deviations by a common factor cannot be applied to every data set. For example, if the values of X are 4, 7,
12, 17, 19, no common factor can be found. Such a problem is solved
by the Direct or Short Cut method.

Case 3: Grouped Frequency Distribution

A continuous series is one in which the frequencies are given against class intervals of the
variable. For example.

Here:

(i) 10-20, 20-30 … etc. are class intervals.

(ii) 3, 7, 11, 9, 6 are their respective frequencies.

(iii) In the class interval 20-30, 20 is the lower limit and 30 the upper limit.

(iv) Adding both the limits and taking their average, we get the mid-point of the class interval. The mid-value
of 20-30 is (20 + 30)/2 = 25.

It is often denoted by m or X.

When the mid-points of the class intervals are taken as X (or m), the mean can be found by three methods.

Important: i is the magnitude of the class intervals, and

$$d'x = \frac{X - A}{i}$$
Other Special Cases of Continuous Series:

A series such as the one in the last example, i.e. 10-20, 20-30, 30-40, ..., is known as an Exclusive Series.

All other types of continuous series are first converted into exclusive
series and then solved as above.

Important: It is essential to convert all other types of series into the exclusive type; otherwise we will
arrive at a wrong result.
Properties of A.M

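1. The sum of the deviations of all the values of x from their arithmetic mean is zero:

$$\sum (x - \bar{x}) = 0$$

Justification:

Since $\bar{x}$ is a constant, $\sum (x - \bar{x}) = \sum x - n\bar{x} = n\bar{x} - n\bar{x} = 0$.

2. The product of the arithmetic mean and the number of items gives the total of all items.

Justification:

$$\bar{x} = \frac{\sum x}{n} \quad \text{or} \quad \sum x = n\bar{x}$$

3. If $\bar{x}_1$ and $\bar{x}_2$ are the arithmetic means of two samples of sizes n1 and n2 respectively, then the
arithmetic mean of the distribution combining the two can be calculated as

$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$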

4. Wrong Observations: to correct an incorrect value of the mean, first find the corrected ∑X (or ∑fX
in the case of a discrete or continuous frequency distribution): subtract the wrong
items from the incorrect ∑X or ∑fX and add the correct observations to it. Dividing the
corrected ∑X or ∑fX by N then gives the correct mean.
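
Properties 3 and 4 lend themselves to a short numerical illustration. The Python sketch below is only an assumption-laden example: the function names and all the numbers (sample sizes, means, the misread item) are hypothetical and chosen for easy arithmetic.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Property 3: mean of two samples pooled together."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def corrected_mean(wrong_mean, n, wrong_items, correct_items):
    """Property 4: rebuild the total, swap the wrong items for the correct ones."""
    total = wrong_mean * n                                  # property 2: total = mean x N
    total = total - sum(wrong_items) + sum(correct_items)   # corrected sum of X
    return total / n


# Hypothetical data: 40 items with mean 50 pooled with 60 items with mean 55.
print(combined_mean(40, 50.0, 60, 55.0))    # (2000 + 3300) / 100 = 53.0

# Hypothetical data: mean of 20 items reported as 40, but one item 35 was read as 53.
print(corrected_mean(40.0, 20, wrong_items=[53], correct_items=[35]))  # 782 / 20
```

The combined mean always lies between the two sample means and is pulled towards the mean of the larger sample.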

Merits and Demerits of AM

(A) Merits:
1. It is easy to calculate and easy to understand, which is why it is the most widely used measure
of central tendency.

2. As every item is taken into the calculation, it is affected by every item.

3. As it is defined by a rigid mathematical formula, the result is the same whoever computes it.

4. Fluctuations are minimum for this measure of central tendency when repeated samples are taken from
one and the same population.

5. It can be subjected to further algebraic treatment, unlike the mode and the median.

6. A.M. is a calculated quantity and is not based on the position of terms in a series.

7. As it is rigidly defined, it is widely used for comparing different sets of data.

(B) Demerits or Limitations:


1. It cannot be located graphically.

2. A single item can bring a big change in the result. For example, if there are three terms 4, 7, 10, the mean is 7. If we add a new term 95, the new mean is (4 + 7 + 10 + 95)/4 = 116/4 = 29. This is a big change
compared with the A.M. of the first three terms.

3. It is a good representative only when the frequencies are approximately normally distributed. When the skewness is
large, the result can be misleading.

4. In the case of open-end class intervals we have to assume the limits of such intervals, so a little variation in
the mean can take place. This is not so with the median and the mode, because the open-end intervals are not used
in their calculation.

5. Qualitative characteristics such as cleverness, riches etc. cannot be averaged in this way, as such data cannot be expressed numerically.

6. The mean cannot be located by inspection, as the mode and the median can.

Weighted Arithmetic Mean

In the simple arithmetic mean we give equal importance to all the observations, but in practice we
may come across situations where the relative importance of the items of a distribution is not the same. In
such cases a proper weight is given to each item, the weight attached to an item being
proportional to its importance in the distribution.

The mean calculated with such weights is known
as the Weighted Arithmetic Mean.
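
As an illustration, the weighted arithmetic mean is usually computed as ∑WX/∑W, where W denotes the weight attached to the value X. The Python sketch below assumes that standard definition; the marks and weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical marks in three papers, the last two counting double.
marks = [60, 75, 80]
weights = [1, 2, 2]
print(weighted_mean(marks, weights))   # (60 + 150 + 160) / 5 = 74.0
```

With equal weights the function reduces to the simple arithmetic mean.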
MEDIAN

The median is the value which divides the series into two equal parts. It is the value exactly in the centre of the series:
an equal number of terms lie on either side of it when the terms are arranged in ascending or descending order.

Definition:
“The median is that value of the variable which divides the group into two equal parts, one part comprising
all values greater and the other all values less than median”.

“Median of a series is the value of the item actual or estimated when a series is arranged in order of
magnitude which divides the distribution into two parts.” —Horace Secrist

Determination of Median

A) Individual Series:

To find the median in this case, the terms are first arranged in ascending or descending order;
the middle term is then the median.

Given that there are N observations, the median is found as follows:

i) If N is odd, the median is the ((N+1)/2)th element.
ii) If N is even, the median is the average of the (N/2)th and (N/2 + 1)th elements.
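
The odd and even rules can be sketched in a few lines of Python. The function name and the sample figures are assumptions made for illustration only.

```python
def median_individual(values):
    """Median of an individual series using the odd/even rule."""
    ordered = sorted(values)           # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]            # the ((N+1)/2)th item, counting from 1
    # average of the (N/2)th and (N/2 + 1)th items, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_individual([19, 4, 12, 7, 18]))   # odd N: the middle item, 12
print(median_individual([18, 4, 12, 7]))       # even N: (7 + 12) / 2 = 9.5
```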

Example 1. Find Median from following data:

N = Total number of terms = 9

Median = size of the ((N+1)/2)th term = (9+1)/2 = 5th term

Md = 19

Example 2. From the following figures of ages of some students, calculate the median age:

B) When ungrouped frequency distribution is given

After arranging the terms, take cumulative frequencies, then find the ((N+1)/2)th item and read off the median.

Steps to Calculate:
(1) Arrange the data in ascending or descending order.

(2) Find cumulative frequencies.

(3) Find the value of the middle item by using the formula

Median = Size of ((N+1)/2)th item

(4) Find the total in the cumulative frequency column which is equal to (N+1)/2 or just greater than it.

(5) Locate the value of the variable corresponding to that cumulative frequency. This is the value of the Median.

Example: Locate the median of the following frequency distribution:

Variable (X) : 10 11 12 13 14 15 16

Frequency (f) : 8 15 25 20 12 10 5

Solution

X : 10 11 12 13 14 15 16

f : 8 15 25 20 12 10 5

c.f. : 8 23 48 68 80 90 95
Here, N = 95, which is odd. Thus, the median is the size of the $\left(\frac{95+1}{2}\right)$th term = 48th observation.

Md = 12

Alternative Method

$$\frac{N}{2} = \frac{95}{2} = 47.5\text{th term}$$

Looking at the c.f., Md = 12
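
The cumulative-frequency steps can be checked with a short Python sketch that uses the data of this example. The function name is an assumption, and the values are assumed to be listed in ascending order.

```python
from itertools import accumulate

# Data from the example: values of X and their frequencies.
values = [10, 11, 12, 13, 14, 15, 16]
freqs = [8, 15, 25, 20, 12, 10, 5]


def median_discrete(values, freqs):
    """Median of an ungrouped frequency distribution via cumulative frequencies."""
    cum_freqs = list(accumulate(freqs))    # 8, 23, 48, 68, 80, 90, 95
    n = cum_freqs[-1]                      # N = sum of frequencies
    position = (n + 1) / 2                 # the ((N+1)/2)th item
    for x, cf in zip(values, cum_freqs):
        if cf >= position:                 # first c.f. equal to or just greater
            return x
    raise ValueError("frequencies must not be empty")


print(median_discrete(values, freqs))      # 12
```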

C) When grouped frequency distribution is given (Interpolation Formula):

In this case, the less than cumulative frequencies are formed and the median is computed from the class interval in which
the (N/2)th term lies, using the interpolation formula.

Steps involved in its computation are:

i. Prepare the less than c.f. distribution.
ii. Find N/2.
iii. Find the c.f. just greater than N/2.
iv. The corresponding class contains the median value and is called the median class.
The value of the median is now obtained by using the formula:

$$\text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - C\right)$$

where l is the lower limit of the median class,
f is the frequency of the median class,
h is the magnitude or width of the median class,
N = ∑f is the total frequency,
C is the cumulative frequency of the class preceding the median class.

Remarks: The interpolation formula rests on the following assumptions:

1. The distribution of the variable under consideration is continuous, with exclusive type classes
without any gaps.
2. The observations are evenly distributed within each class.

Merits and Demerits of Median

Merits

• Median is rigidly defined as in the case of Mean.

• It is not much affected by extreme items, even when they differ greatly from the other values. For example, the median
of 4, 7, 12, 18, 19 is 12, and if we add the two values 450 and
10000, the new median is 18.

• It can be used for characteristics that cannot be averaged by the A.M., such as intelligence, provided the items
can be arranged in order and the middle one located. For such cases it is the best measure.

• It can be located graphically.

• It is suitable for open-end intervals, as the median remains the same whatever limits are assumed
for them.

• It can be easily calculated and is also easy to understand.

• Median is also used in other statistical measures such as mean deviation and skewness.

• It can be located by inspection in some cases.

• It can be found even when the values of the extreme items are not available.

Demerits or Limitations

• Because it ignores the size of the extreme items, however large they are, the median is
sometimes not representative of the series.

• It is affected much more by fluctuations of sampling than A.M.

• Median cannot be used for further algebraic treatment. Unlike the mean, it gives neither the total of
the terms nor the median of several groups combined.

• In a continuous series it has to be interpolated. Its true value is obtained only if the frequencies
are uniformly spread over the whole class interval in which the median lies.

• If the number of items is even, the median can only be estimated, as the A.M. of the two middle terms is
taken as the Median.

QUARTILES

The values which divide the given data into four equal parts are known as quartiles. Obviously there will
be three such points Q1, Q2, and Q3 such that Q1≤Q2≤Q3, termed as the quartiles. Q1, known as the lower
or first quartile is the value which has 25% of the items of the distribution below it and consequently 75%
of the items are greater than it. Incidentally Q2, the second quartile, coincides with the median and has an
equal number of observations above it and below it. Q3, known as the upper or third quartile, has 75% of
the observations below it and consequently 25% of the observations above it.

Determination of Quartiles

The working principle for computing the quartiles is basically the same as that of computing the median.

To compute Q1, the following are the steps involved:

i. Find N/4, where N = ∑f is the total frequency.
ii. Find the less than cumulative frequency just greater than N/4.
iii. The corresponding value of X gives the value of Q1. In the case of a continuous frequency
distribution, the corresponding class contains Q1 and the value of Q1 is obtained by the
interpolation formula:

$$Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right)$$

where l is the lower limit, f is the frequency and h is the magnitude of the class containing Q1, and
C is the cumulative frequency of the class preceding the class containing Q1.

Similarly, to compute Q3, find the less than c.f. just greater than 3N/4. The corresponding value of X gives
Q3. In the case of a continuous frequency distribution, the corresponding class contains Q3 and the value of Q3 is
given by the formula:

$$Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right)$$

where l is the lower limit, f is the frequency and h is the magnitude of the class containing Q3, and
C is the cumulative frequency of the class preceding the class containing Q3.

Graphic Method of locating Median and Quartiles

The various partition values viz., quartiles, deciles and percentiles can be easily located graphically with
the help of a curve called the cumulative frequency curve or ogive. The procedure involves the following
steps.

Less than ogive

Steps: 1. Represent the given distribution in the form of a less than cumulative frequency distribution.

2. Take the values of the variable (in the case of a discrete frequency distribution) or the class intervals (in the case
of a continuous frequency distribution) along the X-axis and the cumulative frequency along the Y-axis.

3. Plot the c.f. against the corresponding value of the variable (discrete frequency distribution) or
against the upper limit of the corresponding class (continuous frequency distribution).

4. The smooth curve obtained by joining these points free-hand is called the
'less than' ogive or less than c.f. curve.

More than ogive

In this case we form the more than cumulative frequency distribution and plot it against the corresponding
value of the variable or against the lower limit of the corresponding class (in case of continuous frequency
distribution). The curve obtained on joining the points so obtained by smooth free-hand drawing is called
more than cumulative frequency curve or more than ogive.

Remark. If we draw a perpendicular from the point of intersection of the two ogives on the x-axis, the foot
of the perpendicular gives the value of the median.

Finding Median Graphically


Marks      Conversion into      No. of students   Cumulative Frequency
(x)        exclusive series     (f)               (c.f)

410-419    409.5-419.5          14                14
420-429    419.5-429.5          20                34
430-439    429.5-439.5          42                76
440-449    439.5-449.5          54                130
450-459    449.5-459.5          45                175
460-469    459.5-469.5          18                193
470-479    469.5-479.5          7                 200

The median value of a series may be determined through the graphic presentation of data in the form of
Ogives. This can be done in 2 ways.

1. Presenting the data graphically in the form of 'less than' ogive or 'more than' ogive.
2. Presenting the data graphically and simultaneously in the form of 'less than' and 'more than' ogives. The
two ogives are drawn together.

1. Less than Ogive approach

Marks Cumulative Frequency (c.f)

Less than 419.5 14

Less than 429.5 34

Less than 439.5 76

Less than 449.5 130

Less than 459.5 175

Less than 469.5 193

Less than 479.5 200

Steps involved in finding the median using the less than Ogive approach:

1. Convert the series into a 'less than' cumulative frequency distribution as shown above.

2. Let N be the total number of students whose data is given. N is also the cumulative frequency of
the last interval. Find the (N/2)th item (student) and mark it on the y-axis. In this case the (N/2)th item
(student) is the 200/2 = 100th student.

3. Draw a horizontal line from 100 on the y-axis to the right to cut the Ogive curve at point A.

4. From point A, draw a perpendicular to the x-axis. The point at which it
meets the x-axis is the median value of the series.

More than Ogive approach

Marks Cumulative Frequency (C.f)

More than 409.5 200

More than 419.5 186

More than 429.5 166

More than 439.5 124

More than 449.5 70

More than 459.5 25

More than 469.5 7

More than 479.5 0


Steps involved in finding the median using the more than Ogive approach:

1. Convert the series into a 'more than' cumulative frequency distribution as shown above.

2. Let N be the total number of students whose data is given. Here N is the cumulative frequency of the
first entry (more than 409.5). Find the (N/2)th item (student) and mark it on the y-axis. In this case the (N/2)th item (student)
is the 200/2 = 100th student.

3. Draw a horizontal line from 100 on the y-axis to the right to cut the Ogive curve at point A.

4. From point A, draw a perpendicular to the x-axis. The point at which it
meets the x-axis is the median value of the series.

2. Less than and more than Ogive approach

Another way of graphical determination of median is through simultaneous graphic presentation of both
the less than and more than Ogives.

1. Mark the point A where the Ogive curves cut each other.

2. Draw a perpendicular from A on the x-axis. The corresponding value on the x-axis would be the median
value.
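
As a numerical cross-check on the graphical reading, the interpolation formulas for the median and the quartiles can be applied to the same marks table. The Python sketch below is a minimal illustration; the function name `partition_value` and its `fraction` parameter are assumptions, not notation from the notes.

```python
from itertools import accumulate

# Exclusive lower class boundaries and frequencies from the marks table.
lower_limits = [409.5, 419.5, 429.5, 439.5, 449.5, 459.5, 469.5]
width = 10
freqs = [14, 20, 42, 54, 45, 18, 7]


def partition_value(lower_limits, freqs, width, fraction):
    """l + (h / f) * (fraction * N - C); fraction is 0.5 for the median,
    0.25 for Q1 and 0.75 for Q3."""
    cum_freqs = list(accumulate(freqs))     # less than c.f.: 14, 34, 76, ...
    target = fraction * cum_freqs[-1]       # N/2, N/4 or 3N/4
    for i, cf in enumerate(cum_freqs):
        if cf > target:                     # first c.f. just greater than the target
            preceding_cf = cum_freqs[i - 1] if i > 0 else 0
            return lower_limits[i] + width / freqs[i] * (target - preceding_cf)
    raise ValueError("fraction must be less than 1")


median = partition_value(lower_limits, freqs, width, 0.5)
q1 = partition_value(lower_limits, freqs, width, 0.25)
q3 = partition_value(lower_limits, freqs, width, 0.75)
print(round(median, 2), round(q1, 2), round(q3, 2))
```

For this table the median class is 439.5-449.5, so the interpolated median is 439.5 + (10/54)(100 − 76), roughly 443.9, which is the value the perpendicular from point A should meet on the x-axis.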
MODE

Mode is the value which occurs most frequently in a set of observations and around which the other items
of the set cluster densely. In other words, mode is the value of a series which is predominant in it. In the
words of Croxton and Cowden, “the mode of a distribution is the value at the point around which the items
tend to be most heavily concentrated. It may be regarded as the most typical of a series of values.”

The concept of mode, as a measure of central tendency, is preferable to mean and median when it is desired
to know the most typical value, e.g., the most common size of shoes, the most common size of ready-made
garment, the most common amount of pocket expenditure of a college student etc.

Determination of mode

a) When data are either in the form of individual observations or in the form of ungrouped frequency
distribution.

Given individual observations, these are first transformed into an ungrouped frequency distribution. The
mode of an ungrouped frequency distribution can be determined in two ways:

i. By inspection
ii. By method of grouping
i. By inspection

When a frequency distribution is fairly regular, the mode is often determined by inspection. It is that value
of the variate for which the frequency is the maximum. By a fairly regular frequency distribution we mean
that as the values of the variable increase, the corresponding frequencies first increase in a
gradual manner, reach a peak at a certain value and finally decline gradually in approximately
the same manner as they increased.

Example

Compute mode of the following data:

3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18, 20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11

Solution

X : 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

f : 2 1 1 1 1 1 3 4 2 1 1 1 1 1 1 1 1 1

therefore mode = 10

Remarks: 1. If the frequency of each possible value of the variable is the same, then there is no mode.

2. If there are two values having maximum frequency, then distribution is said to be bi-modal.
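
Locating the mode by inspection amounts to counting frequencies and picking the largest. A minimal Python sketch using the data of the example follows; the function name is an assumption, and it returns a list so that the no-mode and bi-modal cases in the remarks are visible.

```python
from collections import Counter

# The 25 observations from the example.
data = [3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18, 20,
        10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11]


def modes_by_inspection(values):
    """Return the value(s) with the highest frequency, or [] if there is no mode."""
    counts = Counter(values)                  # the ungrouped frequency distribution
    highest = max(counts.values())
    if len(counts) > 1 and all(f == highest for f in counts.values()):
        return []                             # every value equally frequent: no mode
    return sorted(x for x, f in counts.items() if f == highest)


print(modes_by_inspection(data))              # [10]
print(modes_by_inspection([1, 1, 2, 2, 3]))   # [1, 2] : a bi-modal series
print(modes_by_inspection([1, 2, 3]))         # []     : no mode
```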

b) When the data are in the form of a grouped frequency distribution.

In the case of continuous frequency distribution, the class corresponding to the maximum frequency is
called the modal class and the value of mode is obtained by the interpolation formula:
$$\text{Mode} = l + \frac{h(f_1 - f_0)}{(f_1 - f_0) - (f_2 - f_1)} = l + \frac{h(f_1 - f_0)}{2f_1 - f_0 - f_2}$$

where l is the lower limit of the modal class,

h is the magnitude of the modal class,

f1 is the frequency of the modal class,

f0 is the frequency of the class preceding the modal class,

f2 is the frequency of the class succeeding the modal class.

Remarks: The above formula for computing mode is based on the following assumptions:

1. The frequency distribution must be continuous, with exclusive type classes without any gaps. If the
data are not given in exclusive form, they must first be converted into exclusive class intervals.
2. The class intervals must be uniform throughout, i.e., the width of all the class intervals must be the
same. In the case of a distribution with unequal class intervals, they should be made equal under the
assumption that the frequencies are uniformly distributed over all the classes; otherwise the
computed mode will be misleading.

Example
The frequency distribution of marks obtained by 60 students of a class in a college is given below:

Marks:      30-34   35-39   40-44   45-49   50-54   55-59   60-64

Frequency:  3       5       12      18      14      6       2

Find mode of the distribution.

Solution

Marks Frequency

29.5 – 34.5 3

34.5 – 39.5 5

39.5 – 44.5 12

44.5 – 49.5 18

49.5 – 54.5 14

54.5 – 59.5 6

59.5 – 64.5 2

Modal class is 44.5 – 49.5

l = 44.5, h = 5, f1 = 18, f0 = 12, f2 = 14

$$\text{Mode} = 44.5 + \frac{5(18 - 12)}{2 \times 18 - 12 - 14} = 44.5 + 3 = 47.5 \text{ marks}$$
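
The same calculation can be sketched in Python. The function below is an illustrative assumption rather than a standard routine: it expects equal class widths and exclusive class boundaries, and it treats a missing neighbouring class as having frequency zero.

```python
# Exclusive lower class boundaries and frequencies from the solution table.
lower_limits = [29.5, 34.5, 39.5, 44.5, 49.5, 54.5, 59.5]
freqs = [3, 5, 12, 18, 14, 6, 2]
width = 5


def grouped_mode(lower_limits, freqs, width):
    """Mode = l + h * (f1 - f0) / (2 * f1 - f0 - f2) for the modal class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)   # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                   # assumption: 0 if no preceding class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0      # assumption: 0 if no succeeding class
    return lower_limits[i] + width * (f1 - f0) / (2 * f1 - f0 - f2)


print(grouped_mode(lower_limits, freqs, width))   # 44.5 + 5 * 6 / 10 = 47.5
```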

Graphic location of mode

Mode can be located graphically from the histogram of frequency distribution by making use of rectangles
erected on the modal, pre-modal and post-modal classes. The method involves the following steps:

i. Join the top right corner of the rectangle erected on the modal class with the top right corner of the
rectangle erected on the preceding class by means of a straight line.
ii. Join the top left corner of the rectangle erected on the modal class with the top left corner of
the rectangle erected on the succeeding class by a straight line.
iii. From the point of intersection of the lines in steps (i) and (ii) above, draw a perpendicular to
the X-axis. The X-coordinate or the abscissa of the point where the perpendicular meets the
X-axis gives the modal value.

Example

Finding Mode Graphically


Marks                Conversion into      No. of students
(inclusive series)   exclusive series     (frequency)
(x)                                       (f)

10-19                9.5-19.5             10
20-29                19.5-29.5            12
30-39                29.5-39.5            18
40-49                39.5-49.5            30
50-59                49.5-59.5            16
60-69                59.5-69.5            6
70-79                69.5-79.5            8

The following steps must be followed to find the mode graphically.

1. Represent the given data in the form of a histogram. The height of each rectangle
is the frequency of its class interval. Identify the highest
rectangle. This corresponds to the modal class of the series.

2. Join the top corners of the modal rectangle with the nearest top corners of the adjacent
rectangles. The two lines cut each other inside the modal rectangle.

3. Let the point where the joining lines cut each other be 'A'. Draw a perpendicular line from point A
onto the x-axis. The point 'P' where the perpendicular meets the x-axis gives the mode.

[Figure: The Histogram]

In this case the value of point P turns out to be 44.12, which agrees with the interpolation formula:
39.5 + 10(30 − 18)/(2 × 30 − 18 − 16) ≈ 44.12.

Merits and Demerits of Mode

Merits or Uses of Mode:

1. Mode is the value that occurs most often in the series; hence it is neither an isolated value like the median nor a value
like the mean that may not occur in the series at all.

2. It is not affected by extreme values and hence is a good representative of the series.

3. It can be found graphically also.

4. For open end intervals it is not necessary to know the length of open intervals.

5. It can also be used in the case of qualitative phenomena.

6. Its value can often be found at a single glance at the data. It is the simplest average.

7. It is the average most used in day-to-day life, such as average marks of a class, average number of students
in a section, average size of shoes, etc.

Demerits or Limitations of Mode:

1. Mode cannot be uniquely determined if the series is bimodal or multimodal.

2. Mode is based only on the most concentrated values; other values are not taken into account, however much they
differ from the mode. In a continuous series only the lengths of the class intervals are considered.

3. Mode is the average most affected by fluctuations of sampling.

4. Mode is not so rigidly defined. Different methods of solution may not give the same result,
as they do in the case of the mean.

5. It is not capable of further algebraic treatment. The combined mode of several series cannot be found,
as the combined mean can.

6. Also, the total of the whole series cannot be found from the mode, as it can from the mean.

7. It can be called a representative value only if the number of terms is sufficiently large.

8. It is also said that the mode is sometimes ill-defined, indefinite and indeterminate.

Relationship between mean, median and mode

In the case of a normal or other symmetrical distribution, mean = median = mode. When the frequencies are not
symmetrically distributed, the distribution is called asymmetrical or skewed. For a moderately asymmetrical
distribution the following empirical relationship holds approximately:

Mode = 3 Median − 2 Mean.

If it is positively skewed, then mean > median > mode

If it is negatively skewed, then mode > median > mean.

GEOMETRIC MEAN

The geometric mean, usually abbreviated as G.M., of a set of n observations is the nth root of their product.
Thus, if X1, X2, X3, ..., Xn are the given n observations, their G.M. is given by

$$\text{G.M.} = \sqrt[n]{X_1 X_2 X_3 \cdots X_n} = (X_1 X_2 X_3 \cdots X_n)^{1/n} \qquad (1)$$

If n is 2, the G.M. can be computed by taking the square root of the product.

But if n, the number of observations, is greater than 2, computing the nth root directly is tedious.
In such a case the calculations are facilitated by making use of logarithms. Taking logarithms on both
sides of (1), we get

$$\log \text{G.M.} = \frac{1}{n}(\log X_1 + \log X_2 + \log X_3 + \cdots + \log X_n) = \frac{1}{n}\sum \log X \qquad (2)$$

Taking the antilog of both sides of (2), we finally obtain

$$\text{G.M.} = \text{Antilog}\left[\frac{1}{n}\sum \log X\right]$$

In the case of a frequency distribution (Xi, fi); i = 1, 2, ..., n, where the total number of observations is N = ∑f,

$$\text{G.M.} = \text{Antilog}\left[\frac{1}{N}\sum f \log X\right]$$

In the case of grouped or continuous frequency distributions, the values of X are the mid-values of the
corresponding classes.

Advantages and Disadvantages of GM:

The geometric mean, like the arithmetic mean, has a number of advantages and disadvantages.

Its advantages are:

(a) It can be defined rigidly.

(b) It is calculated on the basis of all the observations of a variable.

(c) It is very convenient for averaging ratios, rates and percentages.

(d) It is less affected than the arithmetic mean by exceptionally large values of a variable.

(e) It gives the highest weight to the lowest observation and the lowest weight to the highest observation
and thereby balances the entire procedure.

(f) It is well suited to further mathematical treatment.

(g) It helps in determining rates of exchange among the currencies of various countries.

Its disadvantages are noted below:

a. It is difficult to calculate, especially when the data are given as a grouped frequency distribution
with large frequencies.

b. The result becomes meaningless if any of the observations is zero or negative.

c. The result finally obtained may not be equal to any of the observations given in the series.

d. It gives least importance to the marginal and extreme observations.

e. In some cases it cannot serve as a true representative average.

f. It usually brings out the ratio of changes and not the differences of change.

Computation of GM:

Example 1:

Find the GM of the observations 12, 18, 48 and 61 of a variable having their frequencies 5, 3, 2 and 8
respectively

Solution:
Let us prepare the data in the form of a table so as to calculate the GM.
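
The logarithmic formula for a frequency distribution can be applied to this example directly. The Python sketch below is a minimal illustration; the function name is an assumption and base-10 logarithms are used to mirror the log/antilog working.

```python
import math

# Observations and frequencies from Example 1.
values = [12, 18, 48, 61]
freqs = [5, 3, 2, 8]


def geometric_mean_freq(values, freqs):
    """G.M. = antilog[(1/N) * sum(f * log X)], with N = sum(f)."""
    n = sum(freqs)                                                  # N = 18
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))  # sum of f log X
    return 10 ** (log_sum / n)                                      # antilog


gm = geometric_mean_freq(values, freqs)
print(round(gm, 2))   # about 30.85
```

Here ∑f log X is approximately 26.807, so log G.M. ≈ 26.807/18 ≈ 1.4893 and the G.M. comes to about 30.85.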

HARMONIC MEAN

The harmonic mean of a set of observations on a variable is defined as the reciprocal of the arithmetic
average of the reciprocals of the given observations (none of the observations may be zero).

If the variable X takes n values x1, x2, x3, ..., xn, their reciprocals are 1/x1, 1/x2, 1/x3, ..., 1/xn, and

$$\text{HM} = \frac{n}{\sum \frac{1}{x}}$$

For observations having respective frequencies f, the weighted HM can be computed as:

$$\text{HM} = \frac{\sum f}{\sum \frac{f}{x}}$$

It is a special kind of average used in some selected situations.

Important Properties of the HM:

(a) If the given values of a variable are all equal (but ≠ 0), then their harmonic mean is equal to their
common value:

$$\text{HM} = \frac{n}{n/c} = c$$

Here, n is the total number of observations of the variable and c is the common value.

(b) If a variable y is related to another variable x in the form y = ax, then the harmonic mean of y is related
to that of x in the same form:

$$\text{HM}(y) = a\,\text{HM}(x)$$

(c) If two sets of values of a variable x contain n1 and n2 observations and their respective harmonic means are H1 and H2,
then the harmonic mean of the combined set (H) is given by:

$$H = \frac{n_1 + n_2}{\frac{n_1}{H_1} + \frac{n_2}{H_2}}$$

Computation of the HM:

Example 1:

Calculate the simple harmonic mean of the numbers 3, 6, 24 and 48.

Solution:

Applying the principle of HM we get:
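
A minimal Python sketch of this calculation is given below. It uses exact fractions so that the reciprocals can be followed by hand; the function name is an assumption and the inputs are assumed to be non-zero integers.

```python
from fractions import Fraction


def harmonic_mean(values):
    """HM = n / sum(1/x), computed exactly with fractions."""
    reciprocal_sum = sum(Fraction(1, x) for x in values)   # 1/3 + 1/6 + 1/24 + 1/48
    return len(values) / reciprocal_sum


hm = harmonic_mean([3, 6, 24, 48])
print(hm, round(float(hm), 2))   # 64/9, i.e. about 7.11
```

The sum of the reciprocals is 27/48, so HM = 4 ÷ (27/48) = 192/27 = 64/9, roughly 7.11.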


Example 2:

Determine the weighted HM for the observations of the variable X from the following:

Merits and Demerits of the HM:

Like all the measures of central tendency mentioned earlier, the harmonic mean has a number of merits and
demerits.

These are:

Merits of the Harmonic Mean:

(a) It is clearly and rigidly defined.

(b) It is calculated on the basis of all the information available on the variable.

(c) It is well suited to further mathematical analysis.

(d) It remains more or less unaffected by sampling fluctuations.

(e) It is easily computable and hence precise in nature.

(f) It always possesses a definite value.

(g) It gives greater importance to smaller observations and less to larger ones.

(h) As it measures relative changes in the given observations of a variable, it is useful for
finding averages of certain ratios and rates.

Demerits of the Harmonic Mean:

(a) The result usually does not coincide with any of the observations in the given series.

(b) It is not easy to explain or compute, and hence not easy to understand.

(c) It is restrictive in the sense that it cannot be calculated if any of the observations is zero.

(d) It has limited applications in practical situations.

Interrelationship among AM,GM and HM:

Let us consider the simplest example of a variable X having only two positive observations x1 and x2. Then

$$\text{AM} = \frac{x_1 + x_2}{2}, \qquad \text{GM} = \sqrt{x_1 x_2}, \qquad \text{HM} = \frac{2 x_1 x_2}{x_1 + x_2}$$

Since $\text{AM} - \text{GM} = \frac{(\sqrt{x_1} - \sqrt{x_2})^2}{2} \ge 0$ and $\text{AM} \times \text{HM} = x_1 x_2 = (\text{GM})^2$, it follows that AM ≥ GM ≥ HM.

The inequality can be extended to any number of positive observations of a variable.

Other important relations are:

1. AM = GM = HM when all the observations of the variable are identical in magnitude.

2. AM > GM > HM for heterogeneous (unequal) observations.

Symbolically, we generalise them as

$$\text{AM} \ge \text{GM} \ge \text{HM}$$

For any two positive numbers the relation also gives:

$$\text{AM} \times \text{HM} = (\text{GM})^2$$
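
These relations can be verified numerically with Python's standard `statistics` module (its `geometric_mean` function needs Python 3.8 or later). The two observations below are hypothetical and chosen so that the G.M. is a round number.

```python
import math
import statistics

x1, x2 = 4, 9                                  # hypothetical pair of positive observations

am = statistics.mean([x1, x2])                 # (4 + 9) / 2 = 6.5
gm = statistics.geometric_mean([x1, x2])       # sqrt(4 * 9), approximately 6
hm = statistics.harmonic_mean([x1, x2])        # 2 * 4 * 9 / (4 + 9) = 72/13

print(am, gm, hm)
print(am >= gm >= hm)                          # the ordering AM >= GM >= HM
print(math.isclose(am * hm, gm ** 2))          # AM x HM = GM^2 for two numbers

# With identical observations all three averages coincide.
same = [7, 7, 7]
print(statistics.mean(same), statistics.geometric_mean(same), statistics.harmonic_mean(same))
```

The product rule AM × HM = (GM)² is specific to two observations, whereas the ordering AM ≥ GM ≥ HM holds for any number of positive observations.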
