
Measures of Variability

Lec 7

Dr. Nesrin H. Darwesh
University of Duhok, College of Dentistry
Measures of Variability (variation,
spread, scatter or dispersion around
the mean)
•Range
•Variance
•Standard deviation
Normal Distribution

• The normal distribution is a theoretical model of the whole population.
• It is a descriptive model that describes real-world situations.
• It is the most important probability distribution in statistics, and an important tool in the analysis of epidemiological data and in management science.
The standard deviation describes the spread of the data: it is a measure of how spread out the numbers are.
Normal Distribution
What is Normal (Gaussian) Distribution?

Tripthi M. Mathew, MD, MPH


Normal Distribution Curve
• mean=median=mode
• Symmetry about the center
• 50% of the values less than the mean and 50%
greater than the mean
• The standard normal distribution is symmetric, with a mean of zero and a standard deviation of 1.
Properties of Normal Distributions
1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and symmetric about
the mean.
3. The total area under the curve is equal to one.
4. The normal curve approaches, but never touches the
x-axis as it extends farther and farther away from the
mean.
[Figure: normal curve centered at μ on the x-axis; total area under the curve = 1]
The x-axis is divided into standard deviations from the mean. The shaded area in the figure spans one standard deviation from the mean.
[Figure: bell curve with the x-axis running from -5 to 5 and the y-axis from 0 to 0.5]
The standard deviation is a measure of how spread out numbers are:
• 68% of values are within 1 standard deviation of the mean
• 95% of values are within 2 standard deviations of the mean
• 99.7% of values are within 3 standard deviations of the mean
The properties of a normal distribution:
• It is a bell-shaped curve.
• It is symmetrical about the mean, μ. (The mean, the mode and the median all have
the same value).
• The total area under the curve is 1 (or 100%).
• 50% of the area is to the left of the mean, and 50% to the right.
• Approximately 68% of the area is within 1 standard deviation, σ, of the mean.
[Figure: normal curve with the central 68% of the area shaded between μ − σ and μ + σ]
The properties of a normal distribution:
• It is a bell-shaped curve.
• It is symmetrical about the mean, μ. (The mean, the mode and the median all have the same
value).
• The total area under the curve is 1 (or 100%).
• 50% of the area is to the left of the mean, and 50% to the right.
• Approximately 68% of the area is within 1 standard deviation, σ, of the mean.
• Approximately 95% of the area is within 2 standard deviations of the mean.
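• Approximately 99.7% of the area is within 3 standard deviations of the mean.
[Figure: normal curve with the x-axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ and μ + 3σ; about 99.7% of the area lies between μ − 3σ and μ + 3σ]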
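The 68%, 95% and 99.7% figures can be checked numerically. The sketch below is not part of the lecture; it relies on the standard identity that, for any normal distribution, the area within k standard deviations of the mean equals erf(k/√2). The function name `area_within` is chosen here for illustration.

```python
import math

def area_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    # For a normal distribution, P(|X - mu| < k * sigma) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas round to the 68%, 95% and 99.7% quoted in the properties above.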
Data Distribution
• Data can be “distributed” (spread out) in
different ways
Why do we need to know the Standard Deviation?
• For normally distributed data, any value is
– likely to be within 1 standard deviation of the mean
– very likely to be within 2 standard deviations
– almost certainly within 3 standard deviations
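When the mean and median are equal, the data is symmetric.
[Figure: symmetric distribution; the mean x̄ and the median M coincide]
Skewed to the left: the mean is less than the median.
[Figure: left-skewed distribution; x̄ lies to the left of M]
Skewed to the right: the mean is greater than the median.
[Figure: right-skewed distribution; x̄ lies to the right of M]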
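As a rough illustration of the rule of thumb above, the following sketch compares the mean and median of three small data sets. The data sets and the function name `describe_shape` are hypothetical, invented here for illustration; they do not come from the lecture.

```python
import statistics

def describe_shape(data):
    """Compare mean and median, following the rule of thumb on the slides."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean < median:
        shape = "skewed to the left"
    elif mean > median:
        shape = "skewed to the right"
    else:
        shape = "symmetric"
    return mean, median, shape

# Hypothetical data sets, invented for illustration only.
examples = {
    "balanced": [1, 2, 3, 4, 5],
    "long left tail": [1, 8, 9, 9, 10],
    "long right tail": [1, 2, 2, 3, 10],
}

for label, data in examples.items():
    mean, median, shape = describe_shape(data)
    print(f"{label}: mean = {mean}, median = {median} -> {shape}")
```

A single extreme value pulls the mean toward the tail while the median barely moves, which is why the comparison hints at the direction of skew.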
Measures of central tendency (location or averages)
•Mean
•Mode
•Median
Measures of variability (variation, dispersion or scatter)
•Range
•Variance
•Standard deviation
Measure of the center: sample mean x̄
Measure of spread: sample standard deviation s
Variance: a measure of how data
points differ from the mean

• Data Set 1: 3, 5, 7, 10, 10
• Data Set 2: 7, 7, 7, 7, 7
What are the mean and median of the above data sets?
Data Set 1: mean = 7, median = 7
Data Set 2: mean = 7, median = 7
But we know that the two data sets are not identical!
The variance shows how they are different.
We want a measure that shows how far most of the data points are from the mean.
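A short sketch using Python's standard `statistics` module makes the point concrete: the two data sets share the same mean and median but have different variances. It uses `pvariance`, which divides by the number of items; that choice is an assumption made here for illustration.

```python
import statistics

data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

for name, data in (("Data Set 1", data_set_1), ("Data Set 2", data_set_2)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # pvariance divides the sum of squared deviations by the number of items.
    variance = statistics.pvariance(data)
    print(f"{name}: mean = {mean}, median = {median}, variance = {variance}")
```

Data Set 2 has no spread at all, so its variance is zero, while Data Set 1 has a positive variance.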
Range
The range of quantitative data is denoted R and is
given by:

R = Maximum – Minimum

Variance

1. Find the mean of the data.
Hint: the mean is the average, so add up the values and divide by the number of items.
2. Subtract the mean from each value; the result is called the deviation from the mean.
3. Square each deviation from the mean.
4. Find the sum of the squares.
5. Divide the total by the number of items.
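The five steps can be sketched directly in code. This is a minimal illustration, not a library routine: the function name `variance_by_steps` is made up here, and it divides by the number of items exactly as step 5 states.

```python
def variance_by_steps(values):
    """Variance computed by following the five steps literally."""
    n = len(values)
    mean = sum(values) / n                    # step 1: find the mean
    deviations = [x - mean for x in values]   # step 2: deviation from the mean
    squares = [d ** 2 for d in deviations]    # step 3: square each deviation
    total = sum(squares)                      # step 4: sum of the squares
    return total / n                          # step 5: divide by the number of items

# Example data: 3, 5, 7, 10, 10 has mean 7 and squared deviations 16, 4, 0, 9, 9.
print(variance_by_steps([3, 5, 7, 10, 10]))
```

Keeping each step on its own line makes it easy to print the intermediate lists and compare them with a hand calculation.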
Variance
Variance is the average squared deviation
from the mean of a set of data. It is used to
find the standard deviation.
The higher the variance, the greater the variation in the data.
Variance Formula

The variance formula includes sigma notation, Σ, which represents the sum of all the items to the right of the sigma.

$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$

The mean is represented by μ and n is the number of items.
Standard Deviation

The standard deviation shows the variation in data.
• If the data is close together, the standard deviation will be small.
• If the data is spread out, the standard deviation will be large.
The standard deviation is the square root of the variance.
The sample variance, denoted by s², is:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Degrees of freedom (n − 1): the number of values in the final calculation of a statistic that are free to vary.

The sample standard deviation
The sample standard deviation is much more commonly used as a measure of variation.

$$s = \sqrt{s^2}$$
Standard Deviation
1. Find the variance.
a) Find the mean of the data.
b) Subtract the mean from each value.
c) Square each deviation from the mean.
d) Find the sum of the squares.
e) Divide the total by the number of items.
2. Take the square root of the variance.
Sample: 2, 4, 3, 2, 5, 2, 1, 4, 5, 2.
Find:
a) The range R
b) The mean
c) The standard deviation of this sample.
a) The range: R = 5 − 1 = 4
b) The mean:

$$\bar{x} = \frac{2 + 4 + 3 + 2 + 5 + 2 + 1 + 4 + 5 + 2}{10} = \frac{30}{10} = 3$$

c) The standard deviation: first tabulate each deviation from the mean and its square.

x_i            2    4    3    2    5    2    1    4    5    2
(x_i − x̄)     -1    1    0   -1    2   -1   -2    1    2   -1
(x_i − x̄)²     1    1    0    1    4    1    4    1    4    1
The sample variance:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{1 + 1 + 0 + 1 + 4 + 1 + 4 + 1 + 4 + 1}{10 - 1} = \frac{18}{9} = 2$$

Standard Deviation:

$$s = \sqrt{s^2} = \sqrt{2} \approx 1.41$$
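The worked example can be cross-checked with Python's standard `statistics` module. This is only a verification sketch: `statistics.variance` and `statistics.stdev` use the n − 1 denominator, matching the sample formulas above.

```python
import statistics

sample = [2, 4, 3, 2, 5, 2, 1, 4, 5, 2]

sample_range = max(sample) - min(sample)   # R = maximum - minimum
mean = statistics.mean(sample)             # sample mean
s_squared = statistics.variance(sample)    # sample variance (divides by n - 1)
s = statistics.stdev(sample)               # sample standard deviation

print(f"range = {sample_range}, mean = {mean}")
print(f"s^2 = {s_squared}, s = {s:.2f}")
```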
Find the variance and
standard deviation
The test scores of five students are:
92,88,80,68 and 52.
1) Find the mean: (92+88+80+68+52)/5 = 76.
2) Find the deviation from the mean:
92-76=16
88-76=12
80-76=4
68-76= -8
52-76= -24
3) Square the deviation from the mean:
(16)² = 256
(12)² = 144
(4)² = 16
(−8)² = 64
(−24)² = 576
4) Find the sum of the squares of the deviations from the mean:
256 + 144 + 16 + 64 + 576 = 1056
5) Divide by the number of data items to find the variance:
1056/5 = 211.2
6) Find the square root of the variance: √211.2 ≈ 14.53
Thus the standard deviation of the test scores is 14.53.
Standard Deviation
A different statistics class took the same test with these five test scores:
92, 92, 92, 52, 52.
Find the standard deviation for this class.
Solve:
The test scores of five students are: 92, 92, 92, 52 and 52.
1) Find the mean: (92+92+92+52+52)/5 = 76
2) Find the deviation from the mean:
92 − 76 = 16 (three times)
52 − 76 = −24 (twice)
3) Square the deviation from the mean:
(16)² = 256 (three times)
(−24)² = 576 (twice)
4) Find the sum of the squares:
256 + 256 + 256 + 576 + 576 = 1920
5) Divide the sum of the squares by the number of items to find the variance:
1920/5 = 384
6) Find the square root of the variance:
√384 ≈ 19.6
Thus the standard deviation of the second set of test scores is 19.6.
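Both classes can be compared side by side with a few lines of Python. The sketch assumes the scores are treated as a whole population, so it uses `statistics.pvariance` and `statistics.pstdev`, which divide by the number of items as in the steps above. The variable names are chosen here for illustration.

```python
import statistics

first_class = [92, 88, 80, 68, 52]
second_class = [92, 92, 92, 52, 52]

for name, scores in (("first class", first_class), ("second class", second_class)):
    mean = statistics.mean(scores)
    variance = statistics.pvariance(scores)   # sum of squared deviations / n
    std_dev = statistics.pstdev(scores)       # square root of that variance
    print(f"{name}: mean = {mean}, variance = {variance}, "
          f"standard deviation = {std_dev:.2f}")
```

Both classes have the same mean of 76, but the second class has the larger standard deviation because its scores sit further from the mean.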
Coefficient of Variation
• The coefficient of variation (CV), or relative standard deviation (RSD), is the sample standard deviation expressed as a percentage of the mean, i.e.

$$CV = \frac{s}{\bar{x}} \times 100\%$$

Coefficient of Variation = (Standard Deviation / Mean) × 100%
Coefficient of Variation
• Measures relative variation
• Always expressed as a percentage (%)
• Shows variation relative to the mean
• Can be used to compare two or more sets of data measured in different units

$$CV = \frac{s}{\bar{x}} \times 100\%$$
Example:
Calculate the coefficient of standard deviation and
coefficient of variation for the following sample data:
2, 4, 8, 6, 10, and 12.
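One way to check an answer to this exercise is sketched below. It assumes that "coefficient of standard deviation" means the ratio s / x̄ without the factor of 100, and it uses the sample (n − 1) standard deviation from `statistics.stdev`; both are assumptions made here, and the function name is illustrative.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

data = [2, 4, 8, 6, 10, 12]

mean = statistics.mean(data)
s = statistics.stdev(data)                 # divides by n - 1
coefficient_of_sd = s / mean               # assumed meaning: s / mean, no percentage
cv = coefficient_of_variation(data)

print(f"mean = {mean}, s = {s:.2f}")
print(f"coefficient of standard deviation = {coefficient_of_sd:.3f}")
print(f"coefficient of variation = {cv:.2f}%")
```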
Example
The systolic blood pressures (mm Hg) of seven middle-aged men were as follows:
151, 124, 132, 170, 146, 124 and 113.

The mean is

$$\bar{x} = \frac{151 + 124 + 132 + 170 + 146 + 124 + 113}{7} = \frac{960}{7} \approx 137.14$$
Example:
The reordered systolic blood pressure data seen earlier are:
113, 124, 124, 132, 146, 151, and 170.
The median is the middle value of the ordered data, i.e. 132.
Two individuals have systolic blood pressure = 124 mm Hg, so the mode is 124.
Example
Data           Deviation      Deviation²
151              13.86          192.02
124             -13.14          172.73
132              -5.14           26.45
170              32.86         1079.59
146               8.86           78.45
124             -13.14          172.73
113             -24.14          582.88
Sum = 960.0    Sum = 0.00     Sum = 2304.86
x̄ = 137.14
Example (contd.)

 x  x
2
i  2304.86
i 1

2304.86
s
7 1
 19.6
47
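The blood pressure example can be reproduced with Python's standard `statistics` module. This is a cross-check under the same sample (n − 1) formula, not an alternative method; the variable names are chosen here for illustration.

```python
import statistics

# Systolic blood pressures (mm Hg) of the seven men in the example.
blood_pressure = [151, 124, 132, 170, 146, 124, 113]

mean = statistics.mean(blood_pressure)
median = statistics.median(blood_pressure)
mode = statistics.mode(blood_pressure)

# Sum of squared deviations, as in the last column of the table.
sum_of_squares = sum((x - mean) ** 2 for x in blood_pressure)
s = statistics.stdev(blood_pressure)       # divides by n - 1

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"sum of squared deviations = {sum_of_squares:.2f}, s = {s:.1f}")
```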
Example
The pulse rates of 10 individuals arranged in
increasing order are:
62, 64, 68, 70, 74, 74, 76, 78, 78, 80
Find the variance and standard deviation
