Measures of Dispersion V4

PQX 7001
STATISTICS IN EDUCATION
Lecture 4
Understanding Data Via Descriptive
Analysis
• Two sets of descriptive measures:
– Measures of central tendency: used to
report a single piece of information that
describes the most typical response to a
question
– Measures of variability: used to reveal the
typical difference between the values in a
set of values
• Measures of Central Tendency:
– Mode: the value in a string of numbers that
occurs most often
– Median: the value whose occurrence lies in
the middle of a set of ordered values
– Mean: sometimes referred to as the
“arithmetic mean”; the average value
characterizing a set of numbers
• Measures of Variability:
– Frequency distribution reveals the number (percent) of
occurrences of each number or set of numbers
– Range identifies the maximum and minimum values in a
set of numbers
– Standard deviation indicates the degree of variation in a
way that can be translated into a bell-shaped
curve distribution
Content
• Measures of dispersion
– Range
– Interquartile range
– Variance
– Standard deviation
Measures of dispersion
• Observe the June test report for two students in the
following table:
• What are the similarities in achievement between student A
and student B?
• What are the differences in achievement between student A
and student B?
• Discuss their performance based on the report
Student A 77 78 79 80 81 82 83
Student B 47 69 73 80 96 97 98
Measure of central tendency    Student A    Student B
Mode                           None         None
Median                         80           80
Mean                           80           80
• What is the difference between the achievement of
student A and student B?
• Student A:
– His marks range between 77 and 83
– There is only a small difference between the marks obtained
– There is only a small difference between the marks and the
mean
– This student’s marks have a small dispersion
• What is the difference in achievement between student A
and student B?
• For student B:
– His marks range between 47 and 98
– There is a big difference between the test marks
– There is a big difference between the marks and the mean
– His marks have a large dispersion
• Discuss the students’ performance based on this
report
• Student A showed a more consistent performance
compared to student B because his test scores
have a smaller dispersion
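The comparison of the two students can be reproduced with a short Python sketch. It uses only the marks from the table; the helper name `summarise` is made up for illustration, and the standard deviation shown is the sample version (dividing by N - 1).

```python
from statistics import mean, median, stdev

# June test marks copied from the table above.
marks = {
    "Student A": [77, 78, 79, 80, 81, 82, 83],
    "Student B": [47, 69, 73, 80, 96, 97, 98],
}

def summarise(scores):
    """Return mean, median, range and sample standard deviation."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "range": max(scores) - min(scores),
        "sd": stdev(scores),  # sample SD, divides by N - 1
    }

summary = {name: summarise(scores) for name, scores in marks.items()}
for name, stats in summary.items():
    print(name, stats)
```

Both students share the same mean and median, so only the range and standard deviation separate them.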
• Question:
Can the measures of central tendency alone (mode,
median and mean) provide enough information to
give a good representation of how data is
distributed?
• Answer:
– No
– Therefore, we need other measures that can
provide additional information about the
dispersion of a data set. Such a measure is called
a measure of dispersion
• A measure that shows how far the values in a data
set differ from each other or from the central
position of the data set
• Common measures of dispersion:
– Range
– Interquartile range
– Variance
– Standard deviation
How do scores spread out?
• Variability
– Tells us how far scores spread out
– Tells us the degree to which
scores deviate from the central
tendency
How are these different?
[Figure: two distributions, each with Mean = 10]
Measure of Variability
Measure                       Definition                                                Related to:
Range                         Largest - Smallest                                        Mode
Interquartile Range           X75 - X25                                                 Median
Semi-Interquartile Range      (X75 - X25)/2                                             Median
Average Absolute Deviation    $\frac{\sum_{i=1}^{N} |X_i - \bar{X}|}{N}$                Mean
Variance                      $\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}$          Mean
Standard Deviation            $\sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}$   Mean
The Range
• The simplest measure of variability

– Range (R) = Xhighest – Xlowest
– Advantage – Easy to Calculate
– Disadvantages
• Like the median, depends on only two scores → unstable
{0, 8, 9, 9, 11, 53} Range = 53
{0, 8, 9, 9, 11, 11} Range = 11
• Does not reflect all scores
Range
• Measure of the difference between the largest value
and the smallest value in a data set:
• Range = largest value – smallest value
• Example:
– Find the range for the following data set:
9, 12, 6, 32, 15
• Solution:
– Range = largest value – smallest value
= 32 – 6
= 26
Range
• Self test:
Find the range for the following data set:
a) 108, 104, 109, 45, 106, 107, 110
b) 4.5, 3.2, 9.3, 4.2, 2.1
• Solution:
a) Range = largest value – smallest value
= 110 – 45
= 65
b) Range = largest value – smallest value
= 9.3 – 2.1
= 7.2
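The range calculations above can be checked with a few lines of Python. This is a minimal sketch; `value_range` is a hypothetical helper name.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

example = [9, 12, 6, 32, 15]
self_test_a = [108, 104, 109, 45, 106, 107, 110]
self_test_b = [4.5, 3.2, 9.3, 4.2, 2.1]

print(value_range(example))                # 26
print(value_range(self_test_a))            # 65
print(round(value_range(self_test_b), 1))  # 7.2
```

The rounding in the last line only hides floating-point noise from subtracting decimals.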
Interquartile Range
• Quartiles: the values that divide a data set arranged in order into four parts
containing the same number of data
• First Quartile (Q1): the value that up to 1/4 of the data lie below
• Second Quartile (Q2) or median: the value that up to 1/2 of the data lie below
• Third Quartile (Q3): the value that up to 3/4 of the data lie below
• Example: 33, 40, 45, 47, 50, 52, 60, 66, 70, 76, 82
Q1=45 Q2=52 Q3=70
Variability: IQR
• Interquartile Range
– IQR = P75 – P25 or Q3 – Q1
– This helps to get a range that is not influenced by
the extreme high and low scores
– Where the range is the spread across 100% of the
scores, the IQR is the spread across the middle
50%
Interquartile Range
• Measure of the difference between the third quartile and the first
quartile in a data set:
• Interquartile range = third quartile – first quartile
= Q3 – Q1
• Example:
– Find the interquartile range of the following data set:
33, 50, 45, 47, 40, 52, 82, 66, 70, 76, 60
Arrange data in ascending order:
33, 40, 45, 47, 50, 52, 60, 66, 70, 76, 82
Q1=45 Q3=70
• Interquartile range = third quartile – first quartile

= 70 – 45
= 25
Interquartile Range
• Self test:
– Find the inter quartile range for the following:
– a) 54, 42, 45, 76, 71, 62, 47
– b) 19, 14, 20, 27, 17, 24, 22, 18
• Solution:
– a) Arrange data in ascending order:
42, 45, 47, 54, 62, 71, 76

Q1=45 Q3=71
Inter quartile range = Q3 – Q1

= 71 – 45
= 26
Interquartile Range
• Solution:
– b) Arrange data in ascending order:
14, 17, 18, 19, 20, 22, 24, 27
Q1 = (17 + 18)/2 = 17.5
Q3 = (22 + 24)/2 = 23
Interquartile range = Q3 – Q1
= 23 – 17.5
= 5.5
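The quartile steps used in these examples can be sketched in Python. The convention below is inferred from the worked answers, not stated in the lecture: sort the data, split it into a lower and an upper half (leaving out the middle value when the count is odd), and take the median of each half. The function names are hypothetical, and statistical software may use other quartile conventions that give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumption: for an odd number of values the middle value is left out
    of both halves, which matches the worked examples in these notes.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[-half:]
    return median(lower), median(data), median(upper)

def interquartile_range(values):
    q1, _, q3 = quartiles(values)
    return q3 - q1

print(interquartile_range([33, 50, 45, 47, 40, 52, 82, 66, 70, 76, 60]))  # 25
print(interquartile_range([54, 42, 45, 76, 71, 62, 47]))                  # 26
print(interquartile_range([19, 14, 20, 27, 17, 24, 22, 18]))              # 5.5
```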
Variability: SIQR
• Semi-interquartile range
– =(P75 – P25)/2 or (Q3 – Q1)/2
– IQR/2
– This is half the spread of the middle 50% of the data
– The average distance of Q1 and Q3 from the
median
– Better for skewed data
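The statement that the semi-interquartile range is the average distance of Q1 and Q3 from the median can be checked with one line of algebra, writing the median as Q2:

```latex
\[
\frac{(Q_3 - Q_2) + (Q_2 - Q_1)}{2} = \frac{Q_3 - Q_1}{2} = \frac{\mathrm{IQR}}{2}
\]
```

For the earlier example with Q1 = 45, Q2 = 52 and Q3 = 70, the two distances are 18 and 7, and their average is 12.5, which is half of the interquartile range of 25.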
Variability: SIQR
• Semi-Interquartile range
[Figure: two distributions with Q1, Q2 and Q3 marked on each]
Variance and standard deviation
• Variance: the mean of the squares of the deviations of each data value from the
mean of a data set:
• Variance = (sum of squares of the deviations of each data value from the mean) / (number of data)
or $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
• Standard deviation is the square root of the variance:
• Standard deviation, $\sigma = \sqrt{\text{variance}}$
Variance
• When calculated for a sample
$s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$
• When calculated for the entire population
$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
Standard Deviation
• Variance is in squared units
• What about regular old units?
• Standard Deviation = Square root of the variance
$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N - 1}}$
Standard Deviation
• Uses measure of central tendency (i.e. mean)

• Uses all data points
• Has a special relationship with the normal
curve
• Can be used in further calculations
• Standard Deviation of Sample = SD or s
• Standard Deviation of Population = σ
Why N-1?
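• When using a sample (which we always do)
we want a statistic that is the best estimate
of the parameter
$E\left[\frac{\sum (X_i - \bar{X})^2}{N - 1}\right] = \sigma^2$
• Dividing by N instead would, on average, underestimate $\sigma^2$
Variance and standard deviation of
population
$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$, $\sigma = \sqrt{\sigma^2}$
Variance and standard deviation of
sample
$s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$, $s = \sqrt{s^2}$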
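The reason for dividing by N - 1 in the sample formula can be illustrated with a small simulation. This is a sketch under assumptions that are not part of the lecture: samples of size 5 are drawn from a normal population with variance 1, and the seed and number of trials are arbitrary.

```python
import random
from statistics import mean

random.seed(7001)  # fixed seed so the sketch is repeatable

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    centre = mean(sample)
    return sum((x - centre) ** 2 for x in sample)

TRUE_VARIANCE = 1.0   # samples come from a normal population with sigma = 1
N = 5                 # assumed sample size
TRIALS = 20_000       # assumed number of repeated samples

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(TRIALS):
    sample = [random.gauss(0, 1) for _ in range(N)]
    ss = sum_sq_dev(sample)
    divide_by_n.append(ss / N)
    divide_by_n_minus_1.append(ss / (N - 1))

avg_n = mean(divide_by_n)
avg_n_minus_1 = mean(divide_by_n_minus_1)
print(round(avg_n, 2), round(avg_n_minus_1, 2))
```

With these settings the average of the divide-by-N estimates should settle near (N - 1)/N = 0.8 of the true variance, while the divide-by-(N - 1) estimates should average close to 1.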
Example:
Find the variance and standard deviation of the following
data set
26, 28, 30, 35, 38, 23
Solution:
$\bar{X} = \frac{\sum X}{N} = \frac{180}{6} = 30$
X        (X - X̄)            (X - X̄)²
26       26 - 30 = -4       16
28       28 - 30 = -2       4
30       30 - 30 = 0        0
35       35 - 30 = 5        25
38       38 - 30 = 8        64
23       23 - 30 = -7       49
Total    ΣX = 180           Σ(X - X̄)² = 158
$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{158}{6 - 1} = \frac{158}{5} = 31.6$
$s = \sqrt{31.6} = 5.62$
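The same table can be built step by step in Python, which makes it easy to check each column. This is a minimal sketch that follows the definition directly and divides by N - 1, as the worked solution does.

```python
from math import sqrt

data = [26, 28, 30, 35, 38, 23]

n = len(data)
mean = sum(data) / n                      # 180 / 6 = 30.0
deviations = [x - mean for x in data]     # -4, -2, 0, 5, 8, -7
squared = [d ** 2 for d in deviations]    # 16, 4, 0, 25, 64, 49

sample_variance = sum(squared) / (n - 1)  # 158 / 5
sample_sd = sqrt(sample_variance)

print(sample_variance)       # 31.6
print(round(sample_sd, 2))   # 5.62
```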
$s^2 = \frac{1}{N - 1}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]$
• Ex.: Find the variance and standard deviation of the following
26, 28, 30, 35, 38, 23
• Solution:
X              X²
26             676
28             784
30             900
35             1225
38             1444
23             529
TOTAL = 180    TOTAL = 5558
$s^2 = \frac{1}{N - 1}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]$
$= \frac{1}{6 - 1}\left[5558 - \frac{180^2}{6}\right]$
= 31.6
s = √31.6
= 5.62
• Self test:
– Find the variance and standard deviation of the following:
15, 28, 33, 47, 56
• Solution:
X              X²
15             225
28             784
33             1089
47             2209
56             3136
Total = 179    Total = 7443
$s^2 = \frac{1}{N - 1}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]$
$= \frac{1}{5 - 1}\left[7443 - \frac{179^2}{5}\right]$
= 258.7
s = √258.7
= 16.08
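The shortcut formula, which needs only the totals of X and X², can be sketched as a small function and applied to both data sets above. The name `shortcut_variance` is a hypothetical one chosen for this illustration.

```python
from math import sqrt

def shortcut_variance(values):
    """Sample variance via (sum of X^2 - (sum of X)^2 / N) / (N - 1)."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return (total_sq - total ** 2 / n) / (n - 1)

example = [26, 28, 30, 35, 38, 23]
self_test = [15, 28, 33, 47, 56]

for values in (example, self_test):
    var = shortcut_variance(values)
    print(sum(values), sum(x * x for x in values), round(var, 1), round(sqrt(var), 2))
# 180 5558 31.6 5.62
# 179 7443 258.7 16.08
```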
• Calculating variance and standard deviation
using SPSS v 15 for Windows:
– Prepare an SPSS file
– Enter the data
– Click on the menu Analyze
– Select Descriptive Statistics
– In the Descriptive Statistics menu, select
Descriptives, Options, Variance, Std. Deviation,
Continue, OK
• Analyze
• Descriptive Statistics
• Descriptives
• Options
• Variance
• Std. Deviation
• Continue
• OK
• Self test:
– Calculate the variance and standard deviation of the following
data set using SPSS:
345, 232, 422, 341, 330, 472, 356, 436
• Solution:
Variance, s² = 5669.36 (2 d.p.)
Standard deviation, s = 75.30 (2 d.p.)
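For readers without SPSS, this self test can be cross-checked with Python's standard `statistics` module. Its `variance` and `stdev` functions use the N - 1 (sample) formulas, which is assumed here to match what SPSS Descriptives reports.

```python
from statistics import stdev, variance

data = [345, 232, 422, 341, 330, 472, 356, 436]

# statistics.variance and statistics.stdev use the N - 1 (sample) formulas.
sample_variance = variance(data)
sample_sd = stdev(data)

print(round(sample_variance, 2))  # 5669.36
print(round(sample_sd, 2))        # 75.3
```

The standard deviation is the smaller of the two numbers, since it is the square root of the variance.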
• Example:
• In a roadside survey, the speeds of 40 cars were recorded as
follows (km/h)
56 72 63 81 73 53 57 69 70 89
63 68 72 77 82 85 59 74 69 76
73 62 80 70 60 71 65 73 64 69
72 65 77 68 78 72 67 75 66 79
• Find the variance and standard deviation of the car
speed using SPSS
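As an alternative to SPSS, the car speed data can be summarised with a short Python sketch. The row labels loosely imitate a descriptive statistics table and are an assumption about layout only; the variance and standard deviation use N - 1 in the denominator.

```python
from statistics import mean, stdev, variance

# Speeds (km/h) typed row by row from the table above.
raw = """
56 72 63 81 73 53 57 69 70 89
63 68 72 77 82 85 59 74 69 76
73 62 80 70 60 71 65 73 64 69
72 65 77 68 78 72 67 75 66 79
"""
speeds = [int(token) for token in raw.split()]

descriptives = {
    "N": len(speeds),
    "Minimum": min(speeds),
    "Maximum": max(speeds),
    "Mean": mean(speeds),
    "Std. Deviation": stdev(speeds),  # N - 1 in the denominator
    "Variance": variance(speeds),     # N - 1 in the denominator
}
for label, value in descriptives.items():
    print(f"{label:15} {value:.2f}")
```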
Range, IQR, Variance and standard deviation
• Question:
– Which of the measures of dispersion is suitable to
represent a data set?
Characteristics of range, inter quartile range,
variance and standard deviation
• Range
– Advantages:
• Easy to understand and calculate
• Suitable for providing a rough picture of dispersion in a
short time
– Disadvantages:
• Its calculation only involves the values at both ends of the
data set, ignoring the remaining data
• Its value is influenced by the presence of extreme values
in a data set
• Thus, range may give a less than accurate picture of
dispersion for the data set
• Range
– Example:
• Scores obtained by students in a Mathematics test:
• 52, 54, 56, 60, 61, 64, 65, 92
• Calculate the mean and range of marks
– Solution:
• Mean = 504/8 = 63
• Range = 92 – 52 = 40
• Range
– Question:
• Can the range give an accurate picture of the dispersion of
scores?
– Answer:
• No. The range of 40 suggests a large dispersion, but:
• Most of the scores lie between 52 and 65, close to
the mean
• Only one score (92) is very high and lies far from the
mean
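The effect of the single extreme score on the range can be seen by recomputing it without that score. This is a small sketch; dropping the top score is done here only to illustrate the point, not as a recommended way to treat the data.

```python
scores = [52, 54, 56, 60, 61, 64, 65, 92]

def value_range(values):
    return max(values) - min(values)

without_top = sorted(scores)[:-1]   # drop the single extreme score (92)

print(sum(scores) / len(scores))    # 63.0
print(value_range(scores))          # 40
print(value_range(without_top))     # 13
```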
• Inter quartile range
– Advantages:
• Its value is not affected by extreme values in a data set
as it is the range of the middle 50% of the data, between
the first quartile and the third quartile
• Thus, the inter quartile range can be used even if there
are extreme values in the data set
– Disadvantages:
• Does not measure dispersion of each data from the
mean of a data set
– Example:
• Two groups of teachers made the following donations
for Warriors Day:
• Group A:
• RM1, RM2, RM2, RM3, RM4, RM5, RM5, RM6, RM6,
RM6, RM8, RM30
• Group B:
• RM1, RM2, RM5, RM5, RM6, RM6, RM6, RM6, RM8,
RM8, RM10, RM15
• Find the mean, range and inter quartile range for
each group.
                        Group A                  Group B
Mean                    RM78/12 = RM6.50         RM78/12 = RM6.50
Range                   RM30 - RM1 = RM29        RM15 - RM1 = RM14
Inter quartile range    RM6 - RM2.50 = RM3.50    RM8 - RM5 = RM3
• Can the range give accurate information about dispersion of data?
– No, because it is influenced by extreme values
• Can the inter quartile range give accurate information about
dispersion of data?
– Yes, because it is not influenced by extreme values
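The figures for the two donation groups can be reproduced with the sketch below. The quartiles are taken as the medians of the lower and upper halves of the sorted data, a convention inferred from the worked answers; the helper name `iqr` is hypothetical.

```python
from statistics import mean, median

# Donations in RM for each group of teachers.
group_a = [1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 8, 30]
group_b = [1, 2, 5, 5, 6, 6, 6, 6, 8, 8, 10, 15]

def iqr(values):
    """Q3 - Q1, with each quartile taken as the median of one half."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[-half:]) - median(data[:half])

for name, group in (("Group A", group_a), ("Group B", group_b)):
    print(name, mean(group), max(group) - min(group), iqr(group))
# Group A 6.5 29 3.5
# Group B 6.5 14 3.0
```

The single RM30 donation doubles the range of Group A relative to Group B, while the interquartile ranges stay close to each other.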
• Variance
– Advantages:
• More accurate compared to range and inter quartile
range because its calculation involves all values in the
data set
• Measures the spread of each value from the mean of
the data set
– Disadvantages:
• Calculation involves the square of deviations of each
value from the mean of the data set
• Thus, the unit for variance is not the same as the unit
for data
• Standard deviation
– Advantages:
• More accurate compared to range and inter quartile
range – its calculation includes all data from the data
set
• Measures the dispersion of each value from the mean
of the data set
• Its calculation takes the square root of the variance, which
undoes the squaring of the deviations from the mean
• Thus, the unit for standard deviation is the same as the
unit for data
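The point about units can be illustrated by changing the scale of the data. In this sketch the scale factor of 10 is a made-up example (for instance, recording marks in tenths of a mark): the variance changes by the square of the factor, while the standard deviation changes by the factor itself, just as the data do.

```python
from statistics import stdev, variance

marks = [26, 28, 30, 35, 38, 23]
SCALE = 10                      # hypothetical change of unit
rescaled = [x * SCALE for x in marks]

var_ratio = variance(rescaled) / variance(marks)
sd_ratio = stdev(rescaled) / stdev(marks)

print(round(var_ratio, 6))   # 100.0
print(round(sd_ratio, 6))    # 10.0
```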
When to Use a Particular Statistic
Research Question
• How can quiz scores of students enrolled in an
introductory statistics class be summarised
using measures of central tendency?
Measures of dispersion.
• Template:
How can [variable] be summarised using
measures of central tendency? Measures of
dispersion.
SPSS Output
Summarising results
• As shown in Table 3.5, scores ranged from 9 to
20.
• The mean was 15.56, the approximate median
was 17.00 and the mode was 17.00.
• Thus, the scores tended to lump together at
the high end of the scale.
• A negatively skewed distribution is suggested
given that the mean was less than the median
and mode
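The comparison of mean and median used here can be written as a simple rule of thumb. This sketch only encodes the reasoning in the bullet points, using the reported values; it is not a formal test of skewness, and the function name is hypothetical.

```python
def skew_direction(mean_value, median_value):
    """Rule of thumb only: compare the mean with the median."""
    if mean_value < median_value:
        return "negative skew suggested"
    if mean_value > median_value:
        return "positive skew suggested"
    return "no skew suggested"

print(skew_direction(15.56, 17.00))  # negative skew suggested
```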
Summarising results
• The range was 11, the interquartile range was
5.0, variance was 10.01 and standard deviation
was 3.16.
• For example, the middle 50% of the scores had a
range of 5 (interquartile range) indicating that
there was a reasonable spread of scores around
the median.
• Thus, despite a high “average” score, there were
some low performing students as well.
• These results are consistent with those described
using the graphical representation.
Results in APA format
As shown in Table 3.5, scores ranged from 9 to 20. The
mean was 15.56, the approximate median was 17.00 and
the mode was 17.00. Thus, the scores tended to lump
together at the high end of the scale. A negatively
skewed distribution is suggested given that the mean
was less than the median and mode. The exclusive
range was 11, the interquartile range was 5.0, variance
was 10.01 and standard deviation was 3.16. For
example, the middle 50% of the scores had a range of
5 (interquartile range), indicating that there was a
reasonable spread of scores around the median. Thus,
despite a high “average” score, there were some low
performing students as well. These results are
consistent with those described using the graphical
representation.
