
Measures of variation

Variability measures
• Besides locating the center of the observed values of a
variable, a descriptive study should also measure numerically
the extent of variation around that center. Two data sets of
the same variable may have similar centers and yet differ
remarkably in variability.
• A variability measure should have the following
characteristics:
- be at its minimum when all the values of the distribution are
the same
- increase as the differences among the values of the
distribution increase

Shop  Revenues  Costs  Employees  Place          Director gender  Shop on-line  R.O.
1     350       205    5          city           male             yes           145
2     200       100    3          suburbs        male             yes           100
3     600       350    10         near the city  female           no            250
4     500       270    10         suburbs        female           no            230
5     270       200    6          city           male             no            70
6     180       120    3          city           male             no            60
7     205       105    3          suburbs        male             no            100
8     340       210    5          near the city  female           no            120
9     280       140    4          city           female           yes           140

Variability
Possible distributions

Revenue     Revenue  Revenue  Revenue
(observed)  (A)      (B)      (C)
350         325      300      140
200         325      350      270
600         325      400      830
500         325      200      605
270         325      300      120
180         325      325      200
205         325      300      190
340         325      400      200
280         325      350      370

All 3 possible distributions (A, B, C) have the same mean as the
observed one, $\bar{x} = 325$, BUT the distributions are very different!

Some measures of variability

Range: $\text{range} = x_{\max} - x_{\min}$
It is the width of the interval that contains all the values of
the distribution.

Interquartile range: $d_Q = Q_3 - Q_1$
It is the width of the interval that contains the central 50% of
the values of the distribution.
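
As a rough illustration, the two measures can be computed for the observed revenues of the nine shops. This is only a sketch: the slides do not say which quartile convention they use, so the `method="inclusive"` option of Python's `statistics.quantiles` is an assumption, and the variable names are illustrative.

```python
from statistics import quantiles

# Observed revenues of the nine shops, copied from the data table.
revenues = [350, 200, 600, 500, 270, 180, 205, 340, 280]

# Range: width of the interval that contains all the values.
value_range = max(revenues) - min(revenues)

# Quartile conventions differ between textbooks and software;
# "inclusive" is one common choice and is an assumption here.
q1, _median, q3 = quantiles(revenues, n=4, method="inclusive")

# Interquartile range: width of the interval holding the central 50%.
iqr = q3 - q1

print(f"range = {value_range}, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

With this convention the quartiles fall on observed values (Q1 = 205, Q3 = 350). A different convention may give a somewhat different interquartile range for the same data.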

Example
                       Revenue     Revenue  Revenue  Revenue
                       (observed)  (A)      (B)      (C)
                       350         325      300      140
                       200         325      350      270
                       600         325      400      830
                       500         325      200      605
                       270         325      300      120
                       180         325      325      200
                       205         325      300      190
                       340         325      400      200
                       280         325      350      370
x_min                  180         325      200      120
x_max                  600         325      400      830
Range = x_max - x_min  420         0        200      710

A: no variability, all values are the same.
From A to B and from B to C the variability increases: the range
is higher.
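
The comparison in the table can be checked with a few lines of Python. This is a minimal sketch; the dictionary layout and the helper name `value_range` are illustrative choices, not part of the slides.

```python
from statistics import mean

# Revenue columns copied from the example table.
distributions = {
    "observed": [350, 200, 600, 500, 270, 180, 205, 340, 280],
    "A": [325] * 9,
    "B": [300, 350, 400, 200, 300, 325, 300, 400, 350],
    "C": [140, 270, 830, 605, 120, 200, 190, 200, 370],
}


def value_range(values):
    """Width of the interval that contains all the values."""
    return max(values) - min(values)


# Pair each distribution with its (mean, range).
summary = {name: (mean(v), value_range(v)) for name, v in distributions.items()}

for name, (m, r) in summary.items():
    print(f"{name}: mean = {m}, range = {r}")
```

All four columns share the mean 325, so the mean alone cannot tell them apart, while the range separates A, B and C clearly.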

Deviation from the mean
The variance $\sigma^2$ is a function of the differences
between each value $x_i$ and the mean $\bar{x}$:

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad \sigma^2 \ge 0$$

The sum of squared deviations is

$$\mathrm{Dev}(X) = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

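The standard deviation is the square root of the
variance:

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

The coefficient of variation CV is the ratio
between the standard deviation and the mean,
multiplied by 100 (defined for $\bar{x} \neq 0$):

$$CV = \frac{\sigma}{\bar{x}} \cdot 100$$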

Example

Revenue  Difference from mean  Squared difference
x_i      (x_i - x̄)             (x_i - x̄)^2
350        25                    625
200      -125                  15625
600       275                  75625
500       175                  30625
270       -55                   3025
180      -145                  21025
205      -120                  14400
340        15                    225
280       -45                   2025

Mean: $\mu = \bar{x} = 325$

Mean property: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$

Sum of squared deviations: $\mathrm{Dev}(X) = \sum_{i=1}^{n} (x_i - \bar{x})^2 = 163200$

Variance: $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{\mathrm{Dev}(X)}{n} = \frac{163200}{9} = 18133.3$

Standard deviation: $\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{18133.3} = 134.7$
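
The steps of this example can be retraced in Python. The sketch below follows the slide's definitions (division by n, not n - 1); the variable names are illustrative, and the coefficient of variation at the end is an extra step that the slide does not report for this data set.

```python
from math import sqrt

# Observed revenues of the nine shops.
revenues = [350, 200, 600, 500, 270, 180, 205, 340, 280]

n = len(revenues)
mean = sum(revenues) / n

# Differences from the mean; by the mean property they sum to zero.
deviations = [x - mean for x in revenues]

# Sum of squared deviations, Dev(X).
dev = sum(d ** 2 for d in deviations)

# Population variance and standard deviation (divide by n).
variance = dev / n
std_dev = sqrt(variance)

# Coefficient of variation, in percent (only meaningful for a nonzero mean).
cv = std_dev / mean * 100

print(f"mean = {mean}, Dev(X) = {dev}")
print(f"variance = {variance:.1f}, std. dev. = {std_dev:.1f}, CV = {cv:.1f}%")
```

The printed mean, Dev(X), variance and standard deviation should match the values in the table up to rounding.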


Variance from a frequency distribution

Employees  Shops
(x_j)      (n_j)   (x_j - x̄)^2 · n_j
3          2       19.34
4          1        4.45
6          3        0.04
7          1        0.79
10         2       30.26

Mean: $\mu = \bar{x} = 6.11$

$$\sigma^2 = \frac{1}{n} \sum_{j=1}^{K} (x_j - \bar{x})^2 \, n_j = \frac{54.88}{9} = 6.10$$

$$\sigma = \sqrt{6.10} = 2.47 \qquad CV = \frac{2.47}{6.11} \cdot 100 = 40.43\%$$
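
The same calculation can be sketched in Python directly from the frequency table, weighting each squared deviation by the number of shops. The list-of-pairs layout and the variable names are illustrative assumptions.

```python
from math import sqrt

# (number of employees x_j, number of shops n_j), copied from the table.
frequency_table = [(3, 2), (4, 1), (6, 3), (7, 1), (10, 2)]

# Total number of shops.
n = sum(n_j for _, n_j in frequency_table)

# Weighted mean of the number of employees.
mean = sum(x_j * n_j for x_j, n_j in frequency_table) / n

# Weighted population variance: each squared deviation counts n_j times.
variance = sum((x_j - mean) ** 2 * n_j for x_j, n_j in frequency_table) / n
std_dev = sqrt(variance)

# Coefficient of variation, in percent.
cv = std_dev / mean * 100

print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.2f}")
print(f"std. dev. = {std_dev:.2f}, CV = {cv:.2f}%")
```

Because the code keeps full precision while the slide rounds the mean and the standard deviation before dividing, the CV may differ from the slide's figure in the second decimal place.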

Standardised values
If a quantitative variable X has mean $\bar{x}$
and standard deviation $\sigma$, it is possible to
obtain its standardised values

$$y_i = \frac{x_i - \bar{x}}{\sigma}, \qquad i = 1, \dots, n$$

The distribution of Y has zero mean and standard
deviation equal to 1.
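
A short check of this property on the observed shop revenues, as a sketch: the helper name `standardise` is illustrative, and `pstdev` is used because the slides divide by n when computing the variance.

```python
from statistics import mean, pstdev

# Observed revenues of the nine shops.
revenues = [350, 200, 600, 500, 270, 180, 205, 340, 280]


def standardise(values):
    """Return (x - mean) / sigma for each value, with sigma dividing by n."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    return [(x - m) / s for x in values]


y = standardise(revenues)

# Up to floating-point error, Y has mean 0 and standard deviation 1.
print(f"mean of Y = {mean(y):.6f}, std. dev. of Y = {pstdev(y):.6f}")
```

The function assumes the values are not all equal; with zero standard deviation the division would fail.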

Comparison between two funds (equal mean)

Year  F1    F2
2003  7.7   6.4
2004  6.1   5.9
2005  0.4   3.2
2006  9.8   7.1
2007  3.5   4.9
mean  5.5   5.5
var   10.7  1.8

Over the last 5 years F1 and F2 had the same mean performance,
but their variances are different: Var(F1) > Var(F2).
Higher variability means that performances very different
from the mean are more frequent.
Higher volatility means higher risk.

Comparison between the performance of two funds (different mean)

Year  F1    F2
2003  9.7   1.4
2004  7.1   1.9
2005  0.9   2.2
2006  9.9   2.1
2007  7.5   4.9
mean  7.0   2.5
var   10.6  1.5
CV    46.5  49.3

F1 has a higher mean and a higher variance than F2.
Can we say that F1 is a higher-risk fund than F2?
We have to compare the CVs: F1 has the lower CV (46.5 against
49.3), so relative to its mean F1 has less variability.
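
The CV comparison can be reproduced with a small Python sketch. The function name `coefficient_of_variation` is illustrative, and the population standard deviation (`pstdev`, dividing by n) is used to match the variances in the table.

```python
from statistics import mean, pstdev

# Yearly performance from 2003 to 2007, copied from the table.
funds = {
    "F1": [9.7, 7.1, 0.9, 9.9, 7.5],
    "F2": [1.4, 1.9, 2.2, 2.1, 4.9],
}


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100


cv = {name: coefficient_of_variation(v) for name, v in funds.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value:.1f}")
```

The variance alone would rank F1 as the more variable fund. Dividing by the mean reverses the ranking, which is why the CV is the appropriate measure when the means differ.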
