Emba (Central Tendency & Dispersion)


Business Statistics & Quantitative Methods

Syllabus
Introduction & Concepts
Descriptive Statistics & Inferential Statistics
Measures of Central Tendency [Mean, Median & Mode]
Measures of Dispersion [Standard Deviation, Variance, Coefficient of Variation]
Correlation Analysis [Pearson Correlation, Spearman Rank Correlation]
Regression Analysis [Simple Regression]

Prepared by Dr M. Gowri Shankar

Definition of Statistics

The word statistics is derived from the Latin word ‘status’. In the plural sense it means a set of numerical figures, called ‘data’, obtained by counting or measurement. In the singular sense it means the collection, classification, presentation, analysis, comparison and meaningful interpretation of raw data.

Statistics is defined as the science which deals with the collection, analysis and interpretation of numerical data.

Nature of the statistical study

Formulation of the study

Objectives of the study

Designing data collection

Conducting the field survey

Organizing the data

Analyzing the data

Reaching statistical findings

Presentation of findings

Applications of Business Statistics

The planning of operations – relating to the special projects for the firm

Setting up of standards – size of employment, volume of sales, fixation of quality norms for the
manufactured product, norms for the daily output

Statistical quality control methods – statistics can be used in various ways to ensure the production of quality goods. This is achieved by identifying and rejecting defective or substandard goods.

Sales targets – targets can be fixed on the basis of sales forecasts, which are made using various methods of forecasting, analysis of sales etc.

Personnel management – measurement of productivity and comparison of wages with productivity are undertaken in order to increase industrial productivity; statistics is also used in the fixation of wages, incentive norms and the performance appraisal of employees

Seasonal behavior – construct a seasonal index for the consumption of products

Export marketing – analyzing the quality of the products in order to select the right products, which have demand in overseas markets, and analyzing the statistics of imports and exports

Maintenance of cost records – ensure cost of production includes cost of raw materials and
wages

Management of inventory – determine a magnitude of inventory that is neither excessive nor inadequate

Expenditure on advertising and sales – to find the association between two or more variables, such as advertising expenditure and sales, e.g. by regression and correlation

Mutual funds, banking and financial institutions – statistics provides tools and techniques that help a consultant or financial adviser guide a person in investing his savings for reasonable returns.

Concepts of Quantitative Techniques/Statistics

Population(N)

It is a collection of people, items or events about which you want to make inferences

Population is defined as the potential set of respondents in a geographical area. It is any large
collection of objects or individuals, such as Indian housewives, consumers etc about which the
information is desired.

Parameter

A parameter is a numerical value that characterizes some aspect of a population. It is any summary number, like an average or a percentage, that describes the entire population.

Ex: the average annual expenditure on clothing in a city; the proportion of employees working overtime in a factory

Ex: the mean income of the Indian middle class, denoted by µ

The mean and variance of a given population are known as population parameters.

Sample (n)

It is a subgroup of population

Statistics

Sample characteristics are called statistics

Data

It is a collection of raw facts

Types of data or modes of classification

Qualitative data

It is a non-numerical property, such as the satisfaction of a customer or the goodwill of a company.

Classification is done according to attributes or non-measurable characteristics, like social status, gender, nationality, occupation etc.

Quantitative classification

It is done according to numerical size, like weights in kg or heights in cm. Here we classify data by assigning arbitrary limits known as class limits. For example, the population of the whole country may be classified according to different variables like age, income, wage, price, etc. Hence this classification is often called ‘classification by variables’.

Variable

The quantitative phenomenon under study is called a variable. It is any measurable characteristic or quantity of the population which can assume a range of numerical values within certain limits, e.g., income, height, age, weight, wage, price etc.

Classification of Variables

A variable can be classified as either discrete or continuous

Discrete variable

A variable which can take only exact (whole-number) values and not fractional values is called a discrete variable. It takes specific values within a given range.

The number of workmen in a factory, members of a family, students in a class, number of births in a certain year, etc., are examples of discrete variables.

Continuous variable

A variable which can take any numerical value (integral or fractional) within a certain range is called a continuous variable. Height, weight, rainfall, time, temperature, etc., are examples of continuous variables.

The age of students in a school is a continuous variable, as it can be measured to the nearest fraction of time, i.e., years, months, days, etc.

Other examples are the temperatures recorded by a meteorological bureau, water consumption in litres etc.

Ungrouped data

• Data that are not arranged or tabulated and carry no frequencies

• Ex: the marks of students : 40, 30, 20, etc.

Grouped data

• Data that are arranged and tabulated

• Ex:

  Year        : 2008  2009  2010  2012
  Sales [‘00] :   20    40    60    80

Discrete series:

A series without class intervals in which each value has a frequency. The variable takes specific values, and the items can be easily counted.

Ex:

Age    Number of students (frequency)
30     10
45     30

Continuous series:

A series with class intervals in which each class has a frequency. The variable may take any value within a given class interval, and the items cannot be easily counted.

Ex:

Class interval (age)    Number of students (frequency)
30-40                   10
40-45                   30

If the class intervals are not continuous

For ex: 10 – 19

The interval can be made continuous by reducing its lower limit by 0.5 and increasing its upper limit by 0.5, i.e. by half of the gap of 1 between adjacent classes such as 10 – 19 and 20 – 29 :

10 – 19 can be rewritten as 9.5 – 19.5
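The correction above can be sketched in Python. This is only an illustration and not part of the original notes: the function name `to_class_boundaries`, the `gap` parameter and the extra classes 20 – 29 and 30 – 39 are assumptions made for the example.

```python
def to_class_boundaries(lower, upper, gap=1.0):
    """Turn inclusive class limits (e.g. 10-19) into continuous boundaries.

    `gap` is the distance between one class's upper limit and the next
    class's lower limit (1 for 10-19, 20-29, ...). Half of the gap is
    moved to each side of the class.
    """
    half = gap / 2
    return lower - half, upper + half


# 10-19 comes from the notes; the other two classes are hypothetical.
classes = [(10, 19), (20, 29), (30, 39)]
boundaries = [to_class_boundaries(lo, hi) for lo, hi in classes]
print(boundaries)  # [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
```

After the correction each upper boundary coincides with the next lower boundary, which is what the interpolation formulas for median, mode and quartiles assume.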

Frequency

Number of occurrences of a value

Frequency Distribution

The arrangement and display of data in which each observed value (or class) is paired with its frequency

Class limit

Class limits are the lowest and highest values of a class. In 30 – 50, 30 is the lower limit and 50 is the upper limit.

Class interval

The difference between the upper limit and the lower limit is called the class interval. For 30 – 50 the class interval is 50 – 30 = 20.

Midpoint = (lower limit + upper limit) / 2

Width of a class interval = (highest value – lowest value) / number of groupings

Classification of Quantitative Techniques

Descriptive Statistics and inferential statistics

Descriptive Statistics

• Descriptive statistics refers to procedures for organizing, summarizing and describing quantitative data about samples, or about the population where complete population data are available. It does not involve drawing an inference from a sample to its population.

• Measures of Descriptive Statistics

Measures of Central Tendency – Mean, Median and Mode

Measures of Dispersion/Variability – Range, Quartile Deviation, Standard Deviation, Variance and Coefficient of Variation

Inferential Statistics

Statistical procedures used for drawing inferences about the properties of populations from sample data are generally referred to as sampling or inferential techniques.

Ex: Chi-Square test, Correlation, Regression, Tests of Hypothesis (t test, z test), Analysis of Variance (ANOVA) etc.

Measures of Central Tendency

Measures of central tendency describe the centre of a distribution; they are also called measures of location.

Classification of Central Tendency

It is classified into computational measures and positional measures.

Computational measures – Arithmetic Mean (A.M), Harmonic Mean (HM) & Geometric
Mean (GM)

Positional Measures – Median & Mode

Arithmetic Mean is also called the Mean or Average. A.M. is useful while making comparisons among several data sets. A.M. is calculated by taking all the values in the given data set into account.

Calculate the mean or average for the ungrouped data

Ex: 10, 15, 30, 7, 42, 79 and 83

x̄ (sample mean) = ∑x / n = (10 + 15 + 30 + 7 + 42 + 79 + 83) / 7 = 266 / 7 = 38

n – number of observations

Mean for Ungrouped data

Calculate the mean for the ungrouped data

Ex: 12, 15, 18, 21 and 25

Short cut method

The monthly wages of 4 workmen are Rs. 400, Rs. 440, Rs. 380 and Rs. 360. Find the A.M. of the wages of the four workmen.

Solution

Wages (Rs.) x    Deviation from 380 (d = x – 380)
400               20
440               60
380                0
360              -20
                 ∑d = 60

Here A = 380 (the assumed mean), n = 4, ∑d = 60

x̄ = A + (∑d / n) = 380 + (60 / 4) = Rs. 395
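As a quick check, the direct and short-cut calculations can be compared in Python. This is a minimal sketch using the wage data above; the variable names are illustrative and not part of the notes.

```python
wages = [400, 440, 380, 360]
n = len(wages)

# Direct method: sum of the observations divided by their number.
direct_mean = sum(wages) / n

# Short-cut method: take deviations from an assumed mean A.
A = 380
deviations = [x - A for x in wages]          # [20, 60, 0, -20]
shortcut_mean = A + sum(deviations) / n

print(direct_mean, shortcut_mean)  # 395.0 395.0
```

Both methods give the same answer for any choice of A; the assumed mean only keeps the numbers small when working by hand.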

Weighted Average or Weighted Mean [WA or WM]

A weighted average is a type of average in which each observation in the data set is multiplied by a predetermined weight; the weighted values are summed and the total is divided by the sum of the weights.

Weighted Mean or Weighted Average for Ungrouped data

Calculate the weighted mean

Relative weight (w) :  2   3    5
Marks (x)           : 30  25   20
wx                  : 60  75  100

x̄ = ∑wx / ∑w = (w1x1 + w2x2 + w3x3) / (w1 + w2 + w3)

= (60 + 75 + 100) / (2 + 3 + 5) = 235 / 10 = 23.5 marks
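The weighted mean of the marks example can be reproduced with a few lines of Python. This is a sketch only; the names `weights`, `marks` and `weighted_mean` are chosen for the illustration.

```python
weights = [2, 3, 5]
marks = [30, 25, 20]

# Sum of w*x divided by the sum of the weights.
weighted_sum = sum(w * x for w, x in zip(weights, marks))   # 60 + 75 + 100
weighted_mean = weighted_sum / sum(weights)

print(weighted_sum, weighted_mean)  # 235 23.5
```

Note that the unweighted mean of the three marks would be 25; the weighted mean is lower because the largest weight is attached to the lowest mark.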

Calculate the weighted mean

Years : 1 2 3 4 5

Income(’00) : 5 10 15 20 25

For discrete series

A market survey on the demand for eggs at a local shop provided the following distribution of daily demand:

Daily demand (No. of eggs)    Frequency (No. of days)
10                            30
15                            25
20                            40
30                            20
40                            35

Find the average demand of eggs in numbers/day

Solution

Daily demand (x)    Frequency (f)     fx
10                  30                 300
15                  25                 375
20                  40                 800
30                  20                 600
40                  35                1400
                    N = ∑f = 150      ∑fx = 3475

x̄ = ∑fx / N = 3475 / 150 = 23.17 eggs per day
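The fx column and the mean of the egg-demand distribution can be checked with a short Python sketch (variable names are illustrative):

```python
demand = [10, 15, 20, 30, 40]     # x: eggs demanded per day
days = [30, 25, 40, 20, 35]       # f: number of days

fx = [x * f for x, f in zip(demand, days)]
N = sum(days)
mean_demand = sum(fx) / N

print(fx)                     # [300, 375, 800, 600, 1400]
print(N, sum(fx))             # 150 3475
print(round(mean_demand, 2))  # 23.17
```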

Find the A.M. of the following frequency distribution by the direct method and the short-cut method

x (observations)    f (frequency)
92                  12
125                  7
180                  6
80                   9

Solution

Direct Method

A.M. = x̄ = ∑fx / N = (12×92 + 7×125 + 6×180 + 9×80) / (12 + 7 + 6 + 9) = 3779 / 34 = 111.15

Short cut method

x      f     d = x – A (A = 100)    fd
92     12     -8                     -96
125     7     25                     175
180     6     80                     480
80      9    -20                    -180
       ∑f = 34                      ∑fd = 379

Here A = 100, ∑fd = 379, N = 34

x̄ = A + (∑fd / N) = 100 + (379 / 34) = 111.15

Arithmetic mean

Calculate the A.M. for the given series :

C.I : 10-20  20-30  30-40  40-50  50-60  60-70  70-80  80-90
F   : 10     15     5      30     15     12     13     10

Solution

Formula for calculating A.M. (x̄)

x̄ = A + (∑fd′ / N) × i

A – Assumed mean

f – Frequency

d′ – step deviation of the midpoint from the assumed mean, d′ = (m – A) / i

where m – the midpoint of the class

i – Width of the class interval corresponding to the assumed mean

N – the total frequency or ∑f

Steps in calculating A.M.

• Find the midpoint of each class interval

• Locate the assumed mean (it can be taken as the midpoint nearest the centre of the series)

• Compute d′ and fd′ for each class, total the fd′ column and substitute in the formula

Calculation of Arithmetic Mean for Continuous Series

Class Interval   Frequency (f)    Midpoint (m)          d′ = (m – A)/i        fd′
10 – 20          10               (10 + 20)/2 = 15      (15 – 45)/10 = -3     -30
20 – 30          15               25                    -2                    -30
30 – 40           5               35                    -1                     -5
40 – 50          30               45 (assumed mean)      0                      0
50 – 60          15               55                     1                     15
60 – 70          12               65                     2                     24
70 – 80          13               75                     3                     39
80 – 90          10               85                     4                     40
Total            N = ∑f = 110                                                 ∑fd′ = 118 – 65 = 53

x̄ = A + (∑fd′ / N) × i

x̄ = 45 + (53 / 110) × 10

x̄ = 45 + 0.4818 × 10 = 45 + 4.818 = 49.818
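The step-deviation calculation can be sketched in Python as follows. The eight classes and frequencies are those of the worked table (N = 110); the variable names are illustrative, and the final line cross-checks the result against the direct formula ∑fm / N.

```python
classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [10, 15, 5, 30, 15, 12, 13, 10]

A = 45   # assumed mean (midpoint of 40-50)
i = 10   # class width

mids = [(lo + hi) / 2 for lo, hi in classes]
step_dev = [(m - A) / i for m in mids]                 # d' = (m - A) / i
sum_fd = sum(f * d for f, d in zip(freqs, step_dev))   # 53.0
N = sum(freqs)                                         # 110

mean = A + (sum_fd / N) * i
direct = sum(f * m for f, m in zip(freqs, mids)) / N   # same value, no assumed mean

print(sum_fd, N, round(mean, 3))  # 53.0 110 49.818
```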

Median

It is the positional measure which divides the entire series into two equal halves. It is also called the second quartile.

Calculation of median for individual observations [Odd series]

Ex: 5, 6, 8, 10, 12

Here n = 5, which is odd

Median = value of the ((n + 1)/2)th term

(5 + 1)/2 = 3rd term = 8

Calculation of median for individual observations [Even series]

Ex: 4, 5, 6, 7, 8, 9

Here n = 6, which is even

Median = A.M. of the (n/2)th and (n/2 + 1)th terms

6/2 = 3rd term and 6/2 + 1 = 4th term

The 3rd and 4th terms are 6 and 7

Median = (6 + 7)/2 = 6.5
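The odd and even rules can be combined in one small Python function. This is a sketch; the function name `median_of` is an assumption made for the illustration. It sorts the data first, because the positional rule only works on an ordered series.

```python
def median_of(values):
    """Median of individual observations (odd or even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n + 1)/2)th term, i.e. index mid in 0-based counting
        return ordered[mid]
    # even n: A.M. of the (n/2)th and (n/2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([5, 6, 8, 10, 12]))    # 8
print(median_of([4, 5, 6, 7, 8, 9]))   # 6.5
```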

Calculate the Median for the discrete series [odd]

Wt in kg  : 112  118  122  130  140
Frequency :   6   10    5    4    2

Median for discrete series (odd)

Wt in kg    f     Cumulative frequency
112          6     6
118         10    16
122          5    21
130          4    25
140          2    27
            N = ∑f = 27

N = 27 (odd), so the median is the value of the ((N + 1)/2)th = ((27 + 1)/2)th = 14th term

14 lies between the cumulative frequencies 6 and 16, so the 14th term falls in the row whose cumulative frequency is 16

The value corresponding to the 14th term is 118 kg, so Median = 118 kg

Calculate the Median for the discrete series [even]

Wt in kg  : 200  250  260  270  280  300
Frequency :  10    5    8    6    7    4

Median for discrete series (even)

Wt in kg    f     Cumulative frequency
200         10    10
250          5    15
260          8    23
270          6    29
280          7    36
300          4    40
            N = ∑f = 40

N = 40 (even)

Median = A.M. of the (N/2)th and (N/2 + 1)th terms, i.e. the 20th and 21st terms

Both the 20th and the 21st terms fall in the row with cumulative frequency 23, so both values are 260

Median = (260 + 260)/2 = 260 kg
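The cumulative-frequency lookup can be sketched in Python. The helper names `value_at_position` and `discrete_median` are hypothetical, and the sketch assumes the values are listed in ascending order, as the method requires.

```python
from itertools import accumulate


def value_at_position(values, freqs, position):
    """Value of the `position`-th term (1-based) of a discrete series."""
    for value, cum_freq in zip(values, accumulate(freqs)):
        if cum_freq >= position:
            return value
    raise ValueError("position exceeds the total frequency")


def discrete_median(values, freqs):
    n = sum(freqs)
    if n % 2 == 1:
        return value_at_position(values, freqs, (n + 1) // 2)
    lower = value_at_position(values, freqs, n // 2)
    upper = value_at_position(values, freqs, n // 2 + 1)
    return (lower + upper) / 2


weights = [200, 250, 260, 270, 280, 300]
freqs = [10, 5, 8, 6, 7, 4]
print(discrete_median(weights, freqs))  # 260.0
```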

Median for Continuous Series

Calculate Median for the given series :

C.I : 10-20  20-30  30-40  40-50  50-60  60-70  70-80  80-90
F   : 10     15     5      30     15     12     13     10

Solution

Formula for calculating Median

Md = l1 + ((N/2 – C) / f) × i

l1 – lower limit of the median class

f – Frequency of the median class

i – Width of the class interval corresponding to the median class

N – the total frequency or ∑f

C – Cumulative frequency of the class preceding the median class

Steps in calculating Median

• Calculate the cumulative frequencies

• Locate the median class (the class whose cumulative frequency first reaches N/2)

• Identify the class preceding the median class and its cumulative frequency C

• Locate the frequency f of the median class

(N/2 = 110/2 = 55, so the median class is 40 – 50)

Calculation of Median Class

Class Interval   Frequency    Cumulative Frequency
10 – 20          10            10
20 – 30          15            25
30 – 40           5            30 (C)
40 – 50          30 (f)        60
50 – 60          15            75
60 – 70          12            87
70 – 80          13           100
80 – 90          10           110
Total            N = ∑f = 110

Substitute the values in the formula

Md = l1 + ((N/2 – C) / f) × i

Md = 40 + ((55 – 30) / 30) × 10

Md = 40 + 8.33 = 48.33
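A short Python sketch of the same steps, using the eight classes of the worked table. The function name `grouped_median` is an assumption for the illustration.

```python
def grouped_median(classes, freqs):
    """Median of a continuous series: l1 + ((N/2 - C) / f) * i."""
    half = sum(freqs) / 2          # N/2
    cum = 0                        # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cum + f >= half:        # first class whose cumulative frequency reaches N/2
            return lower + (half - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty distribution")


classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [10, 15, 5, 30, 15, 12, 13, 10]

md = grouped_median(classes, freqs)
print(round(md, 2))  # 48.33
```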

Mode

Mode is the value of the variable that occurs most frequently

For the Ungrouped Data

For the ungrouped data, mode is the value of the variable that occurs most frequently.

Ex: 8, 9, 11, 15, 16, 12, 15, 3, 7, 15

In the above example, out of 10 items the number 15 appears 3 times, more often than any other value, so 15 is the mode.

For the Discrete Series

The value of the individual observations or items with the highest frequency is the mode for the discrete series.

Continuous Series

The class interval with the highest frequency is the modal class for the continuous series.

Calculate the mode for the given series :

C.I : 10-20  20-30  30-40  40-50  50-60  60-70  70-80  80-90
F   : 10     15     5      30     15     12     13     10

Solution

Formula for calculating Mode

Mode = l1 + ((f1 – f0) / (2f1 – f0 – f2)) × i

l1 – Lower limit of the modal class

f0 – Frequency of the class preceding the modal class

f1 – Frequency of the modal class

f2 – Frequency of the class succeeding the modal class

i – Width of the modal class

Determination of Mode

Class Interval   Frequency
10 – 20          10
20 – 30          15
30 – 40           5 (f0)
40 – 50          30 (f1)
50 – 60          15 (f2)
60 – 70          12
70 – 80          13
80 – 90          10

Substituting the values in the formula, we get

Mode = 40 + ((30 – 5) / (2×30 – 5 – 15)) × 10

= 40 + (25 / 40) × 10

Mode = 40 + 6.25 = 46.25
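The modal-class formula can be sketched in Python. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as frequency 0 (when the modal class is first or last) is an assumption of this sketch, not something stated in the notes.

```python
def grouped_mode(classes, freqs):
    """Mode of a continuous series: l1 + ((f1 - f0) / (2*f1 - f0 - f2)) * i."""
    k = max(range(len(freqs)), key=lambda j: freqs[j])   # index of the modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                    # preceding class (assumed 0 if none)
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0       # succeeding class (assumed 0 if none)
    lower, upper = classes[k]
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)


classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [10, 15, 5, 30, 15, 12, 13, 10]

mode = grouped_mode(classes, freqs)
print(mode)  # 46.25
```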

Measures of Dispersion

Dispersion is the spread or variability in a set of data. It measures the scatter of individual items in a given distribution about a central value. A low degree of dispersion indicates a high degree of uniformity.

Measures of dispersion are classified as mathematical (computational), graphical and positional.

Mathematical or computational – mean deviation (average deviation) and standard deviation (root mean square deviation), taken from the A.M.

Graphical – Lorenz curve

Positional measures – Range, interquartile range, Quartile deviation (semi-interquartile range)

Measures of Dispersion – Range, Quartile Deviation, Variance, Standard Deviation, Coefficient of Variation

Range :

In an arranged array of data, the difference between the two extreme values, i.e., the largest and the smallest values of the distribution, is called the range.

Range for Ungrouped data

Range = L – S = Largest value – Smallest value

Ex: the marks obtained by 6 students were 6, 8, 16, 25, 30, 40. Find the range.

Solution

Range = L – S = 40 – 6 = 34

Range for Grouped data/Continuous series

Find the range for the following data :

Weight (in kg) : 140 – 150   150 – 160   160 – 170   170 – 180
No. of bags    : 5           8           10          12

Range = L – S = upper limit of the highest class – lower limit of the lowest class

Range = L – S = 180 – 140 = 40 kg

Applications of range

It is used to study fluctuations in stock market prices, money rates, gold prices etc.

It is used in industry for quality control of finished products, through the control chart for the range.

It is also used by the meteorological department for forecasting weather, since it gives an idea of the fluctuation of temperatures between maximum and minimum levels.

Coefficient of range : for comparison purposes a relative measure of range is computed using the formula given below :

Coefficient of Range (or relative range) = absolute range / sum of the two extreme values = (L – S) / (L + S)
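Applied to the marks of the six students used in the range example (6, 8, 16, 25, 30, 40), the range and its coefficient can be computed as below. This is a small sketch; the variable names are illustrative, and the coefficient itself is not worked out in the notes.

```python
marks = [6, 8, 16, 25, 30, 40]

L = max(marks)   # largest value
S = min(marks)   # smallest value

value_range = L - S                       # absolute range
coefficient_of_range = (L - S) / (L + S)  # relative range

print(value_range)                     # 34
print(round(coefficient_of_range, 3))  # 0.739
```

Being a ratio, the coefficient has no units, so it can be used to compare series measured in different units.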

Quartile Deviation

Quartiles : the median divides the series into two halves, whereas the quartiles divide the series into four equal parts

Quartile Deviation is an absolute measure of dispersion. It is half the difference between the two quartiles: Q.D. = (Q3 – Q1) / 2

Quartile Deviation for ungrouped data

Find the Q.D. of the daily expenses (in Rs.) of 7 persons

8, 9, 14, 16, 25, 30, 40

Here n = 7

Position of the 1st quartile (Q1) = (n + 1)/4 = 8/4 = 2nd term

Position of the 3rd quartile (Q3) = 3(n + 1)/4 = 24/4 = 6th term

Q1 = 2nd term = 9

Q3 = 6th term = 30

Quartile deviation = (Q3 – Q1)/2 = (30 – 9)/2 = 10.5

Coefficient of QD = (Q3 – Q1)/(Q3 + Q1) = (30 – 9)/(30 + 9) = 21/39 = 0.538
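The positional rule can be sketched in Python. The helper `term_at` is hypothetical; its linear interpolation for fractional positions is an assumption added for completeness, since the example in the notes only needs whole positions.

```python
def term_at(ordered, position):
    """Value at a 1-based position in an ordered list.

    Fractional positions are interpolated linearly between neighbours
    (an assumption of this sketch; not needed for the example below).
    """
    whole = int(position)
    frac = position - whole
    value = ordered[whole - 1]
    if frac:
        value += frac * (ordered[whole] - ordered[whole - 1])
    return value


expenses = sorted([8, 9, 14, 16, 25, 30, 40])
n = len(expenses)

q1 = term_at(expenses, (n + 1) / 4)       # 2nd term
q3 = term_at(expenses, 3 * (n + 1) / 4)   # 6th term
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, qd, round(coeff_qd, 3))  # 9 30 10.5 0.538
```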

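Quartile Deviation for discrete series

Wages (Rs.)   f     Cumulative frequency
16             1     1
18             4     5
21             6    11   (contains Q1)
28             9    20
32            12    32   (contains Q3)
40             3    35

N = ∑f = 35

Q1 = value of the ((N + 1)/4)th term = 9th term = Rs. 21

Q3 = value of the (3(N + 1)/4)th term = 27th term = Rs. 32

Q.D. = (Q3 – Q1)/2 = (32 – 21)/2 = 5.5

Coeff of QD = (Q3 – Q1)/(Q3 + Q1) = (32 – 21)/(32 + 21) = 11/53 = 0.208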

Quartile Deviation for Continuous Series

Calculate QD and its coefficient for the given series :

C.I : 10-20  20-30  30-40  40-50  50-60  60-70  70-80  80-90
F   : 10     15     5      30     15     12     13     10

Solution

Formula(s) of Quartile Deviation

Q1 (Lower or First Quartile) = l1 + ((N/4 – C) / f) × i

Q3 (Upper or Third Quartile) = l3 + ((3N/4 – C) / f) × i

Quartile Deviation (QD) = (Q3 – Q1) / 2

Coefficient of QD = (Q3 – Q1) / (Q3 + Q1)

Here l1 and l3 are the lower limits of the Q1 and Q3 classes, f is the frequency of the quartile class, C is the cumulative frequency of the class preceding it and i is the class width.

Class Interval   Frequency    Cumulative Frequency
10 – 20          10            10
20 – 30          15            25 (C for Q1)
30 – 40           5 (f)        30    Q1 class, l1 = 30
40 – 50          30            60
50 – 60          15            75 (C for Q3)
60 – 70          12 (f)        87    Q3 class, l3 = 60
70 – 80          13           100
80 – 90          10           110
Total            N = ∑f = 110

Determine N/4 = 110/4 = 27.5, which falls in the class 30 – 40

Determine 3N/4 = 330/4 = 82.5, which falls in the class 60 – 70

Q1 = 30 + ((27.5 – 25) / 5) × 10

Q1 = 30 + (2.5 / 5) × 10

Q1 = 35

Q3 = 60 + ((82.5 – 75) / 12) × 10

= 60 + (7.5 / 12) × 10

Q3 = 66.25

Quartile Deviation (QD) = (Q3 – Q1) / 2

QD = (66.25 – 35) / 2

QD = 31.25 / 2 = 15.625

Coefficient of QD = (Q3 – Q1) / (Q3 + Q1)

Coefficient of QD = 31.25 / 101.25 = 0.31
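Both quartiles use the same interpolation, only with a different target position (N/4 or 3N/4). A minimal Python sketch, assuming a hypothetical helper `grouped_quantile` that takes the fraction as a parameter:

```python
def grouped_quantile(classes, freqs, fraction):
    """Interpolated quantile of a continuous series.

    fraction = 0.25 gives Q1 and 0.75 gives Q3, using
    l + ((fraction*N - C) / f) * i for the class containing fraction*N.
    """
    target = fraction * sum(freqs)
    cum = 0                                  # C: cumulative frequency so far
    for (lower, upper), f in zip(classes, freqs):
        if cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty distribution")


classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [10, 15, 5, 30, 15, 12, 13, 10]

q1 = grouped_quantile(classes, freqs, 0.25)
q3 = grouped_quantile(classes, freqs, 0.75)
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, qd, round(coeff_qd, 3))  # 35.0 66.25 15.625 0.309
```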

• Formula for Quartile Deviation

• QD = (Q3 – Q1) / 2

• QD is also called the Semi-Interquartile Range

• Formula for Inter-Quartile Range

• IQR = Q3 – Q1

• Formula for Coefficient of Quartile Deviation

• Coefficient of Quartile Deviation = (Q3 – Q1) / (Q3 + Q1), which equals Quartile Deviation / Median when the median lies midway between Q1 and Q3

To express the variation as a percentage : coefficient of quartile deviation × 100

Quartile Deviation is rarely used for practical purposes since it ignores the variability of the values lying outside the two quartiles. It still gives a fair measure of variability, as 50% of the observations lie between the two quartiles, but it is affected by fluctuations of sampling.

Standard Deviation, Variance

Standard deviation measures how far, on average, the values are scattered about the mean. It is also called the Root Mean Square Deviation. The square of the standard deviation is called the variance.

The standard deviation is the most frequently used measure of dispersion. It enables us to determine how far individual items in a distribution deviate from its mean.

The Standard Deviation is an absolute measure of the scatter of the various values about the A.M. The relative measure of dispersion based on S.D. is called the Coefficient of S.D., which is given by :

Coefficient of S.D. = S.D. / mean = σ / x̄

For a symmetrical, bell-shaped (normal) curve :

About 68% of the values in the population fall within ± 1 standard deviation of the mean

About 95% of the values in the population fall within ± 2 standard deviations of the mean

About 99.7% of the values in the population fall within ± 3 standard deviations of the mean

[Figure: bell-shaped curve with the horizontal axis marked µ–3σ, µ–2σ, µ–σ, µ, µ+σ, µ+2σ, µ+3σ]

According to Prof. Karl Pearson, who first suggested this relative measure, the Coefficient of Variation (C.V.) expresses the standard deviation as a percentage of the mean, whereas the S.D. is the absolute variation about the mean. C.V. is widely used since it provides a suitable basis of comparison when the frequency distributions are of different sizes and have variables in different units. The expression in percentages gives a better idea of the magnitude of deviations across a number of distributions.

For comparing the uniformity, homogeneity, variability, stability and consistency of two series, we must compute the C.V. of each series. The series with the larger coefficient of variation is considered more variable than the other. The series with the smaller C.V. is said to be more consistent, more uniform and more stable than the other.

Formula(s) of Standard Deviation

For Ungrouped Data

σ = √( ∑(x – x̄)² / (n – 1) )  or  √( ∑d² / (n – 1) )  (where d = x – x̄)

x – individual observation

n – number of observations

d – deviation from the mean

This form divides by n – 1, as is usual when the observations are a sample.

Calculate the value of σ for the following series

9, 12, 10, 11, 8, 3, 11

x̄ = 64 / 7 = 9.14286

x      (x – x̄)       (x – x̄)²
9      -0.14286       0.020408
12      2.857143      8.163265
10      0.857143      0.734694
11      1.857143      3.44898
8      -1.14286       1.306122
3      -6.14286      37.73469
11      1.857143      3.44898
Total                54.85714

σ = √( ∑(x – x̄)² / (n – 1) )

σ = √( 54.85714 / (7 – 1) ) = √9.14286 = 3.02

Calculating Variance

σ² = 54.85714 / 6 = 9.14
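The table can be checked with a short Python sketch. The manual calculation follows the n – 1 formula used in this example; the cross-check uses `statistics.stdev` from the standard library, which also divides by n – 1.

```python
from math import sqrt
import statistics

data = [9, 12, 10, 11, 8, 3, 11]
n = len(data)
mean = sum(data) / n                                 # 9.142857...

sum_sq_dev = sum((x - mean) ** 2 for x in data)      # sum of (x - mean)^2
sample_variance = sum_sq_dev / (n - 1)
sample_sd = sqrt(sample_variance)

print(round(sum_sq_dev, 3))       # 54.857
print(round(sample_sd, 2))        # 3.02
print(round(sample_variance, 2))  # 9.14

# Cross-check against the standard library (sample standard deviation).
library_sd = statistics.stdev(data)
```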

For Ungrouped Data (deviations from an assumed mean)

When the values are fractional, deviations can be taken from a convenient assumed mean A, using the following formula

σ = √( ∑d²/n – (∑d/n)² )

d is the deviation from the assumed mean, d = x – A

Calculate S.D. from the following set of observations :

S.No             : 1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
x (Observations) : 9   12  10  11  8   13  11  12  10  11  11  12  11  8   11  16

Solution

σ = √( ∑d²/n – (∑d/n)² )

Computation of S.D. (A = 10)

Values (x)    d = x – 10    d²
9             -1             1
12             2             4
10             0             0
11             1             1
8             -2             4
13             3             9
11             1             1
12             2             4
10             0             0
11             1             1
11             1             1
12             2             4
11             1             1
8             -2             4
11             1             1
16             6            36
∑x = 176      ∑d = 16       ∑d² = 72

Here, n = 16

σ = √( ∑d²/n – (∑d/n)² )

σ = √( 72/16 – (16/16)² )

σ = √(4.5 – 1) = √3.5 = 1.87
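A Python sketch of the assumed-mean calculation follows. The function name `sd_from_assumed_mean` is hypothetical. The formula divides by n, so the cross-check uses `statistics.pstdev` (the population standard deviation) from the standard library.

```python
from math import sqrt
import statistics

data = [9, 12, 10, 11, 8, 13, 11, 12, 10, 11, 11, 12, 11, 8, 11, 16]


def sd_from_assumed_mean(values, A):
    """sqrt(sum(d^2)/n - (sum(d)/n)^2) with d = x - A."""
    n = len(values)
    d = [x - A for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


sd = sd_from_assumed_mean(data, A=10)
print(round(sd, 2))  # 1.87

# The choice of A does not change the answer.
sd_other = sd_from_assumed_mean(data, A=11)
library_sd = statistics.pstdev(data)
```

Because the result does not depend on A, any convenient round number near the centre of the data can be used.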

Alternative formula

σ = √( ∑x²/n – (∑x/n)² )

For Discrete Series (Direct Method)

Size of items (x)    Frequency
6                     4
10                    7
9                     5
11                   13
12                    8
13                   10
14                    3

Solution

σ = √( ∑fx²/N – (∑fx/N)² )

Size of items (x)    Frequency (f)    fx      fx²
6                     4                24      144
10                    7                70      700
9                     5                45      405
11                   13               143     1573
12                    8                96     1152
13                   10               130     1690
14                    3                42      588
                     N = ∑f = 50      ∑fx = 550    ∑fx² = 6252

σ = √( ∑fx²/N – (∑fx/N)² )

σ = √( 6252/50 – (550/50)² )

σ = √( 125.04 – 11² ) = √( 125.04 – 121 )

σ = √4.04

σ = 2.01
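The column totals and the final value can be verified with a few lines of Python (a sketch with illustrative variable names):

```python
from math import sqrt

sizes = [6, 10, 9, 11, 12, 13, 14]   # x
freqs = [4, 7, 5, 13, 8, 10, 3]      # f

N = sum(freqs)
sum_fx = sum(f * x for x, f in zip(sizes, freqs))
sum_fx2 = sum(f * x * x for x, f in zip(sizes, freqs))

# Direct method: sqrt(sum(f*x^2)/N - (sum(f*x)/N)^2)
sd = sqrt(sum_fx2 / N - (sum_fx / N) ** 2)

print(N, sum_fx, sum_fx2)  # 50 550 6252
print(round(sd, 2))        # 2.01
```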

For Discrete Series (Short-cut Method)

σ = √( ∑fd²/N – (∑fd/N)² )

Compute the S.D. of household size from the frequency distribution of 500 households :

Household size    No. of households
1                  92
2                  49
3                  52
4                  82
5                 102
6                  60
7                  35
8                  24
9                   4

Solution

Household size (x)    No. of households (f)    d = x – 4    fd      fd²
1                      92                       -3           -276    828
2                      49                       -2            -98    196
3                      52                       -1            -52     52
4                      82                        0              0      0
5                     102                        1            102    102
6                      60                        2            120    240
7                      35                        3            105    315
8                      24                        4             96    384
9                       4                        5             20    100
                      N = ∑f = 500                           ∑fd = 17    ∑fd² = 2217

S.D. (σ) = √( ∑fd²/N – (∑fd/N)² )

S.D. (σ) = √( 2217/500 – (17/500)² )

S.D. (σ) = √( 4.434 – 0.001 )

S.D. (σ) = √4.433 = 2.11
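A Python sketch of the deviation method for the household data follows. The variable names are illustrative. The mean household size, A + ∑fd/N, is printed as an extra step that the notes do not work out.

```python
from math import sqrt

sizes = list(range(1, 10))                           # household size 1..9
households = [92, 49, 52, 82, 102, 60, 35, 24, 4]    # f

A = 4                                                # assumed mean
N = sum(households)
sum_fd = sum(f * (x - A) for x, f in zip(sizes, households))
sum_fd2 = sum(f * (x - A) ** 2 for x, f in zip(sizes, households))

mean_size = A + sum_fd / N
sd = sqrt(sum_fd2 / N - (sum_fd / N) ** 2)

print(N, sum_fd, sum_fd2)     # 500 17 2217
print(round(mean_size, 3))    # 4.034
print(round(sd, 2))           # 2.11
```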

For Continuous series

σ = √( ∑fd′²/N – (∑fd′/N)² ) × i

d′ = (m – A) / i

d′ – step deviation

m – midpoint

A – Assumed Mean

i – width of the class interval

Ex:

Weight : 44-46  46-48  48-50  50-52  52-54
f      : 8      24     27     21     10

Weight    f     m     d′ = (m – 49)/2    d′²    fd′    fd′²
44-46      8    45    -2                 4      -16    32
46-48     24    47    -1                 1      -24    24
48-50     27    49     0                 0        0     0
50-52     21    51     1                 1       21    21
52-54     10    53     2                 4       20    40
Total     90                                      1   117

σ = √( ∑fd′²/N – (∑fd′/N)² ) × i

σ = √( 117/90 – (1/90)² ) × 2

σ = 1.1401 × 2 = 2.28

Calculating Variance

σ² = (2.28)² ≈ 5.20
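The step-deviation formula for a continuous series can be sketched in Python as follows (illustrative variable names; A = 49 and i = 2 as in the table):

```python
from math import sqrt

classes = [(44, 46), (46, 48), (48, 50), (50, 52), (52, 54)]
freqs = [8, 24, 27, 21, 10]

A = 49   # assumed mean (midpoint of 48-50)
i = 2    # class width

mids = [(lo + hi) / 2 for lo, hi in classes]
step_dev = [(m - A) / i for m in mids]                      # d'
N = sum(freqs)
sum_fd = sum(f * d for f, d in zip(freqs, step_dev))        # 1.0
sum_fd2 = sum(f * d * d for f, d in zip(freqs, step_dev))   # 117.0

sd = sqrt(sum_fd2 / N - (sum_fd / N) ** 2) * i
variance = sd ** 2

print(N, sum_fd, sum_fd2)                # 90 1.0 117.0
print(round(sd, 2), round(variance, 2))  # 2.28 5.2
```

Multiplying by i at the end converts the result from class-width units back to the original units of weight.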

Important points on Standard Deviation

• SD measures the absolute dispersion or variability of a distribution

• A small SD means a high degree of uniformity and homogeneity of the observations, and vice versa.

• If two or more comparable series have almost identical means, the distribution with the smallest SD has the most representative mean.

• In practice, SD is applied to population samples to determine the variability in income levels, education levels, wage levels etc. Determining the variation in income and education levels helps us to understand the standard of living of the people. It is used to measure the inequalities in the distribution of income and wealth in a country.

Coefficient of Standard Deviation (CSD)

CSD = σ / x̄

Coefficient of Variation (CV)

The Coefficient of Variation expresses the standard deviation as a percentage of the mean. It is used to compare the relative variation of two series.

Formula of CV

CV = (σ / x̄) × 100

σ is the standard deviation

x̄ is the Arithmetic Mean; for a continuous series, x̄ = A + (∑fd′ / N) × i
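A minimal Python sketch of a C.V. comparison follows. The two series are the 16-observation and 7-observation data sets used in the standard deviation examples. The function name `coefficient_of_variation` is hypothetical, and using `statistics.pstdev` (division by n) for σ is an assumption of this sketch; with the n – 1 form the percentages would be slightly larger.

```python
import statistics


def coefficient_of_variation(data):
    """CV = (sigma / mean) * 100, with sigma computed by dividing by n."""
    return statistics.pstdev(data) / statistics.mean(data) * 100


series_a = [9, 12, 10, 11, 8, 13, 11, 12, 10, 11, 11, 12, 11, 8, 11, 16]
series_b = [9, 12, 10, 11, 8, 3, 11]

cv_a = coefficient_of_variation(series_a)   # about 17.0 %
cv_b = coefficient_of_variation(series_b)   # about 30.6 %

more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 1), round(cv_b, 1), more_consistent)
```

The series with the smaller C.V., here series A, is the more consistent of the two.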

Practice Problems in Central Tendency & Measures of Dispersion

1. Calculate Arithmetic Mean, Median and mode for the following :

Class 10-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89

Frequency 2 4 9 11 12 6 4 2

2. Calculate Arithmetic Mean, Median and mode for the following :

Class 62-63 63-64 64-65 65-66 66-67 67-68 68-69

Frequency 2 6 14 16 8 3 1

3. Calculate Arithmetic Mean, Median and mode for the following :

Class       410-419  420-429  430-439  440-449  450-459  460-469  470-479
Frequency   14       20       42       54       45       18       7

4. Calculate Arithmetic Mean, Median and mode for the following :

Class       30-40  40-50  50-60  60-70  70-80  80-90
Frequency   18     38     46     27     15     8

5. Calculate Arithmetic Mean, Median and mode for the following :

Net Profit (crores)   20-30  30-40  40-50  50-60  60-70  70-80  80-90  90-100
No. of companies      5      7      19     29     16     9      8      7

6. Calculate Arithmetic Mean, Median, mode and Quartile Deviation for the following :

Marks             30-39  40-49  50-59  60-69  70-79  80-89  90-99
No. of students   1      3      11     21     43     32     9

7. Calculate Arithmetic Mean, Median and mode for the following :

Class Interval 5-9 10-14 15-19 20-24 25-29 30-34 35-39

Frequencies 1 3 13 17 27 36 38

8. Calculate Median & Q3

Height   141-150  151-160  161-170  171-180  181-190
F        5        16       56       19       4

9. Calculate Quartile Deviation and its coefficient

Net Profit (crores)   0-10  10-20  20-30  30-40  40-50  50-60  60-70
No. of companies      4     8      11     15     12     6      5

10. Calculate Arithmetic Mean, Median and mode for the following :

Net Profit (crores)   30-32  32-34  34-36  36-38  38-40  40-42  42-44  44-46  46-48  48-50
No. of companies      3      8      24     31     50     61     38     21     12     2

11. Calculate Arithmetic Mean, Median and mode for the following :

Net Profit (crores)   0-75  75-150  150-225  225-300  300-375  375-450
No. of companies      15    200     250      225      10       5

12. Calculate Arithmetic Mean, Median and mode for the following :

Net Profit (crores)   55-64  65-74  75-84  85-94  95-104  105-114  115-124  125-134  135-144
No. of companies      2      20     79     184    302     207      82       24       4

Calculate Variance, Standard Deviation and Coefficient of Variation

Net Profit (crores)   130-140  140-150  150-160  160-170  170-180  180-190
No. of companies      10       20       30       20       10       10

13. Calculate Variance, Standard Deviation and Coefficient of Standard Deviation

Net Profit (crores)   20-25  25-30  30-35  35-40  40-45  45-50
No. of companies      170    110    80     45     40     35

14. Calculate QD & SD

Net Profit (crores)   44-46  46-48  48-50  50-52  52-54
No. of companies      8      24     27     21     10

15. Calculate SD

Net Profit (crores)   30-39  40-49  50-59  60-69  70-79  80-89  90-99
No. of companies      2      12     22     20     14     4      1

16. Calculate SD & Coefficient of Variation (CV)

Net Profit (crores)   5-7  8-10  11-13  14-16  17-19
No. of companies      14   24    38     20     4

17. Calculate SD & Coefficient of Variation (CV)

Net Profit (crores)   30-39  40-49  50-59  60-69  70-79
No. of companies      8      12     6      4      10

18. Compare the variability of life of two varieties of lamps using Coefficient of Variation

Length of life in hrs   500-700  700-900  900-1100  1100-1300  1300-1500
No. of lamps (A)        5        11       26        10         8
No. of lamps (B)        4        30       12        8          6

19. Compare the variability of wages for the two factories M and N using Coefficient of Variation

Daily wages (’00)   2-3  3-4  4-5  5-6  6-7  7-8  8-9
Factory M           15   30   44   60   30   14   7
Factory N           25   40   60   35   20   15   5

Additional Problems

Find the Standard Deviation for the sample observations on the weights(g) of a certain
product

S.No x (Observations)

1 9

2 12

3 10

4 11

5 8

6 13

7 11

8 12

9 10

10 11

11 11

12 12

13 11

14 8

15 11

16 16

Problem

The following table gives the number of finished articles turned out per day by different number
of workers in a factory. Find the mean value and S.D. of the daily output of finished articles:

No. of articles (x) No. of workers (f)

18 3

19 7

20 11

21 14

22 18

23 17

24 13

25 8

26 5

27 4

Problem

The following data give the number of passengers travelled by Airbus from Kolkata to
Mumbai from Sunday to Saturday

No. of Passengers

320

290

265

300

270

200

315

Problem

Two batsmen A and B made the following scores in a series of cricket matches.

No. of Matches A B

1 14 37

2 13 22

3 26 56

4 53 52

5 17 14

6 29 10

7 79 37

8 36 48

9 84 20

10 49 4

Calculate the measure of C.V. and determine the consistent player among two batsmen A and B

Problem

Lives of two models of refrigerators in a recent survey are :

Life (No. of Years) No. of Refrigerators

Model A Model B

0–2 5 2

2–4 16 7

4–6 13 12

6–8 7 19

8 – 10 5 9

10 – 12 4 1

What is the average life of each model of refrigerator? Which model has greater uniformity?

Problem

From the prices of shares x and y below find out which is more stable in value :

x y

35 108

54 107

52 105

53 105

56 106

58 107

52 104

50 103

51 104

49 101

Problem

The following data refer to the dividend (%) paid by two companies A and B over the last seven years:

A B

4 12

8 8

4 3

15 15

10 6

11 4

9 10

Calculate the coefficient of variation and comment.

Problem

Goals scored by two teams A and B in football matches were as follows:

No. of goals scored in a match No. of Matches

A B

0 26 18

1 10 8

2 7 5

3 6 6

4 4 3

The following table gives the figures of profits of two companies X and Y for the last 10 years.
Which of the two companies has greater consistency in profits :

Year Profits (Rs. ‘000)

X Y

1979 700 550

1980 625 600

1981 725 575

1982 625 550

1983 650 650

1984 700 600

1985 650 550

1986 700 525

1987 600 625

1988 650 600

Problem

The following table gives the figures of electricity generated (million K.W. hours) by two companies X and Y. Which of the two companies has greater consistency in electricity generation :

Year Electricity generated (million K.W. hours)

X Y

2000 101 120

2001 107 115

2002 113 104

2003 121 109

2004 136 110

2005 148 116

Problem

The production figures of a certain commodity are given below :

Year Production (in ‘000’) tonnes

2005 7

2006 9

2007 10

2008 7

2009 5

Calculate the standard deviation

Calculate the percentage of variation of sales for the two quarters

Year I II

2008 68 63

2009 70 59

2010 60 55

2011 68 51

2012 65 43

Problem

Calculate the standard deviation for the following :

Year Production of wheat(‘000’ tonnes)

2008 9

2009 10

2010 12

2011 15

2012 13

2013 10

2014 8

2015 16

2016 15

Problem

Calculate the standard deviation for the following :

Year Earnings(Rs. Lakhs)

2008 38

2009 40

2010 65

2011 72

2012 69

2013 60

2014 87

2015 95

2016 74

Calculate the percentage of variation between the demand and supply of a commodity in units

Year   Supply (‘000 tons)   Demand (‘000 tons)

1979 120 240

1980 110 250

1981 120 260

1982 119 266

1983 140 232

1984 125 245

1985 127 255

1986 119 267


1987 140 268

1988 160 239
