
Lecture 9

Measures of Dispersion
and
Skewness
Stat 101/math 107
Department of Statistics
Forman Christian College (A Chartered University), Lahore
Preview

 Measures of dispersion attempt to quantify the variability or spread of data.


 Two types of measures of dispersion: absolute and relative measures of dispersion.
1. Range is determined by the two extreme values.

2. Quartile deviation ignores 50% of the scores, i.e., the first 25% and the last 25% of the scores.

3. Mean deviation is based on all observations but ignores negative signs.
Measure of Dispersion

Variance and Standard deviation


 Rather than relying on the location of the value, variance measures dispersion by
calculating how far observations are from the mean.
 Variance = average of the squared distances of the observations from the mean.
 High variance means that many/most observations are far from the mean but
could be heavily influenced by outliers.
 Low variance means that many/most observations are close to the mean.
Standard deviation
 Takes square root of variance to put measure in the
same unit as the observations.
 Example: The average rating of the Conservatives is 4.8 and the standard deviation is 2.8. This tells us that the ratings typically differ from the mean by about 2.8 points on the 11 point scale used to measure feeling towards the Conservative Party.
 In contrast, the variance is 8.0, which can only be interpreted as 8 squared points on the 11 point scale.
 That interpretation of the variance is confusing and has little intuitive power.
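The difference in units can be seen on a small data set. The sketch below uses Python's standard `statistics` module on hypothetical ratings (not the survey data quoted above) and, like the formulas in this lecture, divides by the number of observations.

```python
from statistics import pvariance, pstdev

# Hypothetical ratings on a 0-10 (11 point) scale; illustrative only.
ratings = [0, 2, 4, 5, 5, 6, 8, 10]

variance = pvariance(ratings)  # mean squared deviation, in squared points
std_dev = pstdev(ratings)      # square root of the variance, in points

print(f"variance = {variance:.2f} squared points")
print(f"standard deviation = {std_dev:.2f} points")
```

The standard deviation is reported in the same units as the ratings, which is why it is easier to interpret than the variance.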
Formula I

 $s^2 = \dfrac{\sum f (X - \bar{X})^2}{\sum f}$ (variance)
 $s = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{\sum f}}$ (standard deviation)

 Steps to calculate standard deviation and variance

1. Find the midpoints (X).
2. Calculate the mean, $\bar{X} = \sum fX / \sum f$.
3. Find the deviations $X - \bar{X}$ and square each deviation.
4. Multiply the squared deviations by the corresponding frequencies.
5. Sum the products and divide by the sum of the frequencies to get the variance.
6. Take the square root of the variance to get the standard deviation.
Formula II

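 $s^2 = \dfrac{\sum fX^2}{\sum f} - \left(\dfrac{\sum fX}{\sum f}\right)^2$ (variance)
 $s = \sqrt{\dfrac{\sum fX^2}{\sum f} - \left(\dfrac{\sum fX}{\sum f}\right)^2}$ (standard deviation)

 Steps to calculate standard deviation and variance

1. Find the midpoints (X).
2. Multiply X by the corresponding frequencies to get $fX$.
3. Multiply $fX$ by the corresponding X to get $fX^2$.
4. Sum $f$, $fX$ and $fX^2$, then substitute all values into the formula to get the variance.
5. Take the square root of the variance to get the standard deviation.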
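As a rough check that the two sets of steps lead to the same variance, here is a minimal Python sketch. The function names and the small data set are hypothetical, chosen only for illustration; both functions divide by the sum of frequencies.

```python
from math import isclose, sqrt


def variance_formula_1(midpoints, freqs):
    """Frequency-weighted mean of squared deviations from the mean."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n


def variance_formula_2(midpoints, freqs):
    """Mean of the squares minus the square of the mean."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return sum_fx2 / n - (sum_fx / n) ** 2


# Hypothetical grouped data: class midpoints and their frequencies.
midpoints = [5, 15, 25]
freqs = [2, 5, 3]

v1 = variance_formula_1(midpoints, freqs)
v2 = variance_formula_2(midpoints, freqs)
print(v1, v2, sqrt(v1))
print("formulas agree:", isclose(v1, v2))
```

Formula II avoids computing each deviation, which is convenient by hand; Formula I makes the meaning of the variance more visible.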
Example 1 (using formula 1)
 The data represent the number of miles that 20 runners ran during one week.

Class Frequency
5.5-10.5 1
10.5-15.5 2
15.5-20.5 3
20.5-25.5 5
25.5-30.5 4
30.5-35.5 3
35.5-40.5 2

 Find the variance and the standard deviation


Step 1: Find the midpoints (X) and the products fX

Class f X fX
5.5-10.5 1 8 (1*8)=8
10.5-15.5 2 13 26
15.5-20.5 3 18 54
20.5-25.5 5 23 115
25.5-30.5 4 28 112
30.5-35.5 3 33 99
35.5-40.5 2 38 76
Total 20 490

Step 2: Calculate the mean
$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{490}{20} = 24.5$
Step 3:
Find the deviations and square each deviation

Class f X fX $X-\bar{X}$ $(X-\bar{X})^2$
5.5-10.5 1 8 8 8-24.5= -16.5 272.25
10.5-15.5 2 13 26 13-24.5= -11.5 132.25
15.5-20.5 3 18 54 -6.5 42.25
20.5-25.5 5 23 115 -1.5 2.25
25.5-30.5 4 28 112 3.5 12.25
30.5-35.5 3 33 99 8.5 72.25
35.5-40.5 2 38 76 13.5 182.25
Step 4:
Multiply the squared deviations by the corresponding frequencies

Class f X fX $X-\bar{X}$ $(X-\bar{X})^2$ $f(X-\bar{X})^2$
5.5-10.5 1 8 8 -16.5 272.25 272.25
10.5-15.5 2 13 26 -11.5 132.25 264.5
15.5-20.5 3 18 54 -6.5 42.25 126.75
20.5-25.5 5 23 115 -1.5 2.25 11.25
25.5-30.5 4 28 112 3.5 12.25 49
30.5-35.5 3 33 99 8.5 72.25 216.75
35.5-40.5 2 38 76 13.5 182.25 364.5
Total 20 490 1305

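Step 5: Find the variance
$s^2 = \dfrac{\sum f (X - \bar{X})^2}{\sum f} = \dfrac{1305}{20} = 65.25$ miles²

Step 6: Find the standard deviation
$s = \sqrt{65.25} \approx 8.08$ miles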
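The six steps of Example 1 can be retraced in a few lines of Python. This is only a sketch: the variable names are illustrative, and the class boundaries and frequencies are copied from the table above.

```python
from math import sqrt

# Class boundaries and frequencies from Example 1.
classes = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
           (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
freqs = [1, 2, 3, 5, 4, 3, 2]

# Step 1: midpoints
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Step 2: mean
n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n

# Steps 3-5: squared deviations, weighted by frequency, divided by n
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n

# Step 6: standard deviation
std_dev = sqrt(variance)

print(f"mean = {mean}, variance = {variance}, sd = {std_dev:.2f} miles")
```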
Example 2 (using formula 2)
 The following gives the frequency distribution of the daily commuting time (in minutes)
from home to work for all 25 employees of a company.

Daily commuting time Number of employees


0 to less than 10 4
10 to less than 20 9
20 to less than 30 6
30 to less than 40 4
40 to less than 50 2

 Calculate mean, variance and standard deviation.


Step 1: Find the midpoints (X)
Step 2: Multiply X by the corresponding frequencies to get fX

Classes f X fX
0-10 4 5 20
10-20 9 15 135
20-30 6 25 150
30-40 4 35 140
40-50 2 45 90
Total 25 535

Step 3: Multiply fX by the corresponding X to get $fX^2$

Classes f X fX $fX^2$
0-10 4 5 20 5*20=100
10-20 9 15 135 2025
20-30 6 25 150 3750
30-40 4 35 140 4900
40-50 2 45 90 4050
Total 25 535 14825

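 Step 4: Calculate the mean and the variance, using $\sum f = 25$, $\sum fX = 535$ and $\sum fX^2 = 14825$

$\bar{X} = \dfrac{535}{25} = 21.4$ minutes
$s^2 = \dfrac{14825}{25} - \left(\dfrac{535}{25}\right)^2 = 593 - 457.96 = 135.04$ minutes²
 Step 5: Calculate the standard deviation
$s = \sqrt{135.04} \approx 11.62$ minutes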

Example video : https://www.youtube.com/watch?v=eeRrNT7DUHk&feature=youtu.be


Coefficient of Variation
 Whenever two samples have the same units of measure, the variance and standard deviation for each can be compared directly.
 For example, suppose an automobile
dealer wanted to compare the standard
deviation of miles driven for the cars she
received as trade-ins on new cars. She
found that for a specific year, the
standard deviation for Buicks was 422
miles and the standard deviation for
Cadillacs was 350 miles. She could say
that the variation in mileage was greater
in the Buicks.
Coefficient of variation

 But what if a manager wanted to compare the standard deviations of


two different variables, such as the number of sales per salesperson
over a 3-month period and the commissions made by these
salespeople?
 A statistic that allows you to compare standard deviations when the
units are different, as in this example, is called the coefficient of
variation.
Note: The coefficient of variation is always expressed as a percentage.
Coefficient of Variation

 The coefficient of variation is the percentage variation relative to the mean, with the standard deviation taken as the measure of total variation.
 Formula
$CV = \dfrac{s}{\bar{X}} \times 100\%$
Where
s = standard deviation
$\bar{X}$ = mean
 The series or group of data with the greater coefficient of variation is more variable, less stable, less uniform, less consistent or less homogeneous.
 The series or group with the smaller coefficient of variation is less variable, more stable, more uniform, more consistent or more homogeneous.
Example 3:
The mean of the number of sales of cars over a 3-month period is 87, and the standard deviation is 5. The mean
of the commissions is $5225, and the standard deviation is $773. Compare the variations of the two.

 Sales: $CV = \dfrac{5}{87} \times 100\% \approx 5.7\%$
 Commissions: $CV = \dfrac{773}{5225} \times 100\% \approx 14.8\%$

Since the coefficient of variation is larger for commissions, the commissions are more
variable than the sales.
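The comparison in Example 3 can be reproduced with a short Python sketch. The function name is illustrative; the means and standard deviations are the ones given in the example.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return std_dev / mean * 100


cv_sales = coefficient_of_variation(5, 87)             # sales (cars)
cv_commissions = coefficient_of_variation(773, 5225)   # commissions ($)

print(f"sales: {cv_sales:.1f}%  commissions: {cv_commissions:.1f}%")
print("commissions more variable:", cv_commissions > cv_sales)
```

Because the units cancel in the ratio, the two percentages can be compared even though one variable is a count and the other is in dollars.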
Example 4

 Goals scored by teams A and B in a football season were as follows:


Number of goals Number of matches
scored in match Team A Team B
0 27 17
1 9 9
2 8 6
3 5 5
4 4 3
 By calculating the coefficient of variation in each case, find which team may be considered more consistent.
Solution
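No. of goals Team A Team B
(X) f1 f1X f1X² f2 f2X f2X²

0 27 0 0 17 0 0
1 9 9 9 9 9 9
2 8 16 32 6 12 24
3 5 15 45 5 15 45
4 4 16 64 3 12 48
Total 53 56 150 40 48 126

 Team A
$\bar{X} = \dfrac{56}{53} \approx 1.0566$
$s^2 = \dfrac{150}{53} - \left(\dfrac{56}{53}\right)^2 \approx 1.71$, so $s \approx 1.3091$
$CV = \dfrac{1.3091}{1.0566} \times 100\% \approx 123.9\%$

 Team B
$\bar{X} = \dfrac{48}{40} = 1.2$
$s^2 = \dfrac{126}{40} - (1.2)^2 = 1.71$, so $s \approx 1.3077$
$CV = \dfrac{1.3077}{1.2} \times 100\% \approx 109.0\%$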

So Team B is more consistent in its performance
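The calculation for the two teams can be checked with a short Python sketch using Formula II for the variance. The function name is illustrative, and the results are computed without rounding intermediate values, so they may differ slightly from a hand calculation that rounds the standard deviation first.

```python
from math import sqrt


def grouped_cv(values, freqs):
    """Coefficient of variation (%) of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * x * x for x, f in zip(values, freqs)) / n - mean ** 2
    return sqrt(variance) / mean * 100


goals = [0, 1, 2, 3, 4]
team_a = [27, 9, 8, 5, 4]   # number of matches, Team A
team_b = [17, 9, 6, 5, 3]   # number of matches, Team B

cv_a = grouped_cv(goals, team_a)
cv_b = grouped_cv(goals, team_b)

print(f"Team A: {cv_a:.1f}%  Team B: {cv_b:.1f}%")
print("more consistent:", "Team B" if cv_b < cv_a else "Team A")
```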


Merits and Demerits of Standard deviation

 Merits
1. It is based on all the items of the distribution.
2. Squaring the deviations makes them positive, so the difficulty with algebraic signs that arises with the mean deviation is not found here.
3. For comparing the variability of two or more distributions, the coefficient of variation, which is based on the standard deviation, is considered most appropriate.
 Demerits
1. Compared to other measures, it is difficult to compute.
2. It gives more weight to extreme values and less to those near the mean.
Symmetry and Skewness
Symmetry
 The distribution is symmetrical when the values of the mean, median and mode are equal.
 If observations are symmetric around the mean, there are as many observations less than the mean as there are observations greater than the mean.
Skewness
 Skewness measures the extent to which the observations are asymmetric.
 In other words, skewness tells us whether there are many more observations above or below the mean.
 Like the mean, skew is sensitive to extreme values.
Skewness
REFERS TO ASYMMETRY

Positively skewed distribution (skewed to the right)
 There is a long tail on the right and the mean is to the right of the mode.

Negatively skewed distribution (skewed to the left)
 There is a long tail on the left and the mean is to the left of the mode.

Skewness is +ve or -ve depending upon the location of the mode with respect to the mean.
Example: The algebra test results of a class are presented using a histogram.

Positive skewness
 A small number of students have done well, which makes the graph stretch out horizontally to the right.
 The kangaroo fits our graph with its tail going to the right.
 So it has positive skewness.
 Then mean > median > mode.

Negative skewness
 A small number of students have done poorly, which makes the graph stretch out horizontally to the left.
 The kangaroo fits onto our graph with its tail going to the LEFT.
 So it has negative skewness.
 Then mean < median < mode.

Symmetry
 This particular class is fairly normal and has both low and high scores.
 The graph stretches out evenly on both sides of the mode at 60-69.
 Two kangaroos show that this graph has a mirror-image shape.
 The graph is symmetrical.
 Mean = median = mode.
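The ordering of mean, median and mode for a class with a few very high scores can be illustrated with Python's `statistics` module. The scores below are hypothetical, not the data behind the histograms, and the ordering is a common pattern rather than a guarantee for every skewed data set.

```python
from statistics import mean, median, mode

# Hypothetical test scores with one very high value (long right tail).
scores = [50, 55, 55, 55, 60, 60, 65, 70, 95]

avg = mean(scores)
mid = median(scores)
most_common = mode(scores)

print(f"mean = {avg:.2f}, median = {mid}, mode = {most_common}")
print("mean > median > mode:", avg > mid > most_common)
```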
MEASURES OF SKEWNESS

 Shows the degree of asymmetry, or departure from symmetry, of a distribution.

 Also indicates the direction of the skew.

 Karl Pearson's first formula for the coefficient of skewness (sk)

1. $sk = \dfrac{\text{Mean} - \text{Mode}}{s}$
 If the mode is indeterminate, Mean - Mode can be taken as 3(Mean - Median).
MEASURES OF SKEWNESS

 Karl Pearson's second formula for the coefficient of skewness (sk)

2. $sk = \dfrac{3(\text{Mean} - \text{Median})}{s}$

$sk > 0$: positively skewed

$sk < 0$: negatively skewed

$sk = 0$: symmetrical

 sk lies between -3 and +3
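Example 5: Given the following statistics on the weights of court justices in kilograms, calculate the coefficient of skewness using Karl Pearson's methods.

mean = 74.1, median = 75, mode = 84 and standard deviation = 11.25

First formula: $sk = \dfrac{74.1 - 84}{11.25} = -0.88$
Second formula: $sk = \dfrac{3(74.1 - 75)}{11.25} = -0.24$

Both coefficients are negative, so the distribution is negatively skewed.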
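Both of Pearson's coefficients for Example 5 can be computed with a short Python sketch. The function names are illustrative; the inputs are the statistics given in the example.

```python
def pearson_first(mean, mode, std_dev):
    """Pearson's first coefficient: (mean - mode) / s."""
    return (mean - mode) / std_dev


def pearson_second(mean, median, std_dev):
    """Pearson's second coefficient: 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


sk1 = pearson_first(74.1, 84, 11.25)
sk2 = pearson_second(74.1, 75, 11.25)

print(f"first formula: {sk1:.2f}, second formula: {sk2:.2f}")
print("negatively skewed:", sk1 < 0 and sk2 < 0)
```

The two formulas give different magnitudes here, but they agree on the sign, which is what determines the direction of skewness.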

Video Example : https://www.youtube.com/watch?v=5HUZe_QXXww


Example 6: The following frequency distribution presents the monthly
salaries in rupees of 20 employees of a firm

Salary (Rs.) 60-80 80-100 100-120 120-140 140-160


Frequency 4 5 4 4 3

Calculate Karl Pearson’s Co-efficient of Skewness.


Solution
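Calculate the mean, mode and standard deviation to measure skewness.

Classes f X fX fX²
60-80 4 70 280 19600
80-100 (modal class) 5 90 450 40500
100-120 4 110 440 48400
120-140 4 130 520 67600
140-160 3 150 450 67500
Total 20 2140 243600

$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{2140}{20} = 107$

$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 80 + \dfrac{5 - 4}{(5 - 4) + (5 - 4)} \times 20 = 80 + 10 = 90$
where $l$ is the lower boundary of the modal class, $f_m$ its frequency, $f_1$ and $f_2$ the frequencies of the classes before and after it, and $h$ the class width.

$s = \sqrt{\dfrac{243600}{20} - (107)^2} = \sqrt{731} \approx 27.04$

$sk = \dfrac{\text{Mean} - \text{Mode}}{s} = \dfrac{107 - 90}{27.04} \approx 0.63$

As sk > 0, the distribution is positively skewed.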
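Example 6 can be retraced in Python as a sketch. The grouped-mode interpolation used here is the usual textbook formula and is an assumption about the method behind the "80 + 10 = 90" step; the code also assumes the modal class is not the first or last class.

```python
from math import sqrt

lower = [60, 80, 100, 120, 140]   # lower class limits (Rs.)
width = 20                        # class width
freqs = [4, 5, 4, 4, 3]

midpoints = [lo + width / 2 for lo in lower]
n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
std_dev = sqrt(sum(f * x * x for x, f in zip(midpoints, freqs)) / n - mean ** 2)

# Grouped mode: interpolate inside the class with the highest frequency.
m = freqs.index(max(freqs))
d_prev = freqs[m] - freqs[m - 1]   # excess over the preceding class
d_next = freqs[m] - freqs[m + 1]   # excess over the following class
mode = lower[m] + d_prev / (d_prev + d_next) * width

sk = (mean - mode) / std_dev       # Pearson's first coefficient
print(f"mean = {mean}, mode = {mode}, sd = {std_dev:.2f}, sk = {sk:.2f}")
```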


MEASURES OF SKEWNESS

 Bowley suggested a formula for skewness based on the relative positions of the quartiles.

 Bowley's coefficient of skewness
$sk_b = \dfrac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$

 Its value lies between -1 and +1.

 In a symmetrical distribution, the quartiles are equidistant from the median, i.e. $Q_3 - Q_2 = Q_2 - Q_1$, and the coefficient of skewness will be zero.
 If $Q_3 - Q_2 < Q_2 - Q_1$, the distribution is negatively skewed.
 If $Q_3 - Q_2 > Q_2 - Q_1$, the distribution is positively skewed.
Example:

Calculate the quartile coefficient of skewness from the information about two places given below.

Measure Place A Place B


Median 201.0 201.6
Third Quartile 260.0 242.0
First Quartile 157.0 164.2
Solution

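Place A: $sk_b = \dfrac{260.0 + 157.0 - 2(201.0)}{260.0 - 157.0} = \dfrac{15}{103} \approx 0.146$
Place B: $sk_b = \dfrac{242.0 + 164.2 - 2(201.6)}{242.0 - 164.2} = \dfrac{3}{77.8} \approx 0.039$

Both coefficients are positive, so both distributions are positively skewed.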
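The quartile coefficient for the two places can be checked with a short Python sketch. The function name is illustrative; the quartiles are those given in the table above.

```python
def bowley_skewness(q1, q2, q3):
    """Bowley's quartile coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)


sk_a = bowley_skewness(q1=157.0, q2=201.0, q3=260.0)   # Place A
sk_b = bowley_skewness(q1=164.2, q2=201.6, q3=242.0)   # Place B

print(f"Place A: {sk_a:.3f}  Place B: {sk_b:.3f}")
```

Both values are positive, and Place A shows the larger departure from symmetry.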
Example:
 The following frequency distribution represents the annual sales and number of firms. By
using quartiles, find a measure of skewness.

Annual Sales (Rs. 00,000) No. of firms


10-20 30
20-30 195
30-40 240
40-50 115
50-60 54
60-70 10
70-80 6
80-90 15
90-100 15
Solution:
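Sales (Rs. 00,000) f C.F.
10-20 30 30
20-30 195 225 (Q1 class)
30-40 240 465 (median or Q2 class)
40-50 115 580 (Q3 class)
50-60 54 634
60-70 10 644
70-80 6 650
80-90 15 665
90-100 15 680
Total 680

Locate Q1: $n/4 = 680/4 = 170$th value, which falls in class 20-30
Locate Q2: $n/2 = 680/2 = 340$th value, which falls in class 30-40
Locate Q3: $3n/4 = 510$th value, which falls in class 40-50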
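 Q1: l=20; f=195; h=10; C.F.=30
$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - C.F.\right) = 20 + \dfrac{10}{195}(170 - 30) \approx 27.18$

 Q2: l=30; f=240; h=10; C.F.=225
$Q_2 = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C.F.\right) = 30 + \dfrac{10}{240}(340 - 225) \approx 34.79$

 Q3: l=40; f=115; h=10; C.F.=465
$Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - C.F.\right) = 40 + \dfrac{10}{115}(510 - 465) \approx 43.91$

 Bowley's coefficient of skewness
$sk_b = \dfrac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1} = \dfrac{43.91 + 27.18 - 2(34.79)}{43.91 - 27.18} \approx 0.09$

The coefficient is positive, so the distribution is positively skewed.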
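The quartile interpolation for this frequency distribution can be sketched in Python. The helper name `grouped_position_value` is hypothetical; it applies l + (h/f)(position - C.F.) to the class that contains the required cumulative position.

```python
def grouped_position_value(lower, freqs, width, position):
    """Interpolate the value at a cumulative position in grouped data."""
    cum = 0  # cumulative frequency before the current class (C.F.)
    for lo, f in zip(lower, freqs):
        if cum + f >= position:
            return lo + width / f * (position - cum)
        cum += f
    raise ValueError("position exceeds total frequency")


lower = [10, 20, 30, 40, 50, 60, 70, 80, 90]   # lower class limits
freqs = [30, 195, 240, 115, 54, 10, 6, 15, 15]
n = sum(freqs)

q1 = grouped_position_value(lower, freqs, 10, n / 4)
q2 = grouped_position_value(lower, freqs, 10, n / 2)
q3 = grouped_position_value(lower, freqs, 10, 3 * n / 4)

sk_bowley = (q3 + q1 - 2 * q2) / (q3 - q1)
print(f"Q1 = {q1:.2f}, Q2 = {q2:.2f}, Q3 = {q3:.2f}, sk = {sk_bowley:.2f}")
```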

Video Example : https://www.youtube.com/watch?v=NIU51PzSD5o
