Descriptive Statistics PDF

Numerical Descriptive Measures
Descriptive Statistics
The best way to work with data is to summarize and display the data.
Numbers that have not been summarized and organized are called raw data.
Descriptive measures
A descriptive measure is a single number that is used to describe a set of data.
Descriptive measures include measures of central tendency and measures of dispersion.
Summary Definitions
• The central tendency is the extent to which all the data values group around a central value.
• The variation is the amount of dispersion, or scattering, of values.
• The shape is the pattern of the distribution of values from the lowest value to the highest value.
Distribution curve
Describing Data Numerically
Central Tendency: Arithmetic Mean, Median, Mode
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
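Calculating the Mean, Median and Mode for ungrouped data
The Sample Mean
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
where \(\bar{x}\) (pronounced x-bar) is the sample mean, \(x_i\) is the ith observation, and \(n\) is the sample size (the number of observations).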
Example 1
• For this sample data xi: 2, 3, 5, 1, 4, 3, 2, 4, find the sample mean.

i     1   2   3   4   5   6   7   8   Σxi
xi    2   3   5   1   4   3   2   4   24

Sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{8} x_i}{n} = \frac{24}{8} = 3 \]
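The sample mean calculation above can be sketched in Python. This is a minimal illustration rather than part of the original slides; the function name `sample_mean` is an assumption made for this example.

```python
# Sample data from Example 1.
sample = [2, 3, 5, 1, 4, 3, 2, 4]


def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


x_bar = sample_mean(sample)
print(sum(sample), len(sample), x_bar)  # 24 8 3.0
```

The same function can be applied to any list of numbers, as long as the list is not empty.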
Example 2
The following are the ages (in years) of all eight employees of a small company:
53, 32, 61, 27, 39, 44, 49, 57
Find the mean age of these employees.

Properties of the Sample Mean
• Uniqueness: for a given set of data there is one and only one mean.
• Affected (distorted) by extreme values (outliers).
• May be better replaced by the median when the distribution of the data is skewed.
Example: for the values 1, 2, 3, 4, 5 the mean is (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3; for the values 1, 2, 3, 4, 10 the mean is (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4.
Measures of Central Tendency: The Median
The median is the value of the middle observation in a dataset that has been ranked in increasing order.
• First, arrange the observations in ascending order.
• Then, find the middle position in the ordered data, using the following formula:
\[ \text{Median position} = \frac{n+1}{2} \]
• Find the median value at that position.
Example 1
Find the median for the following data set.
27 38 12 34 42 40 24 40 23
• The ordered set becomes

Observation   12   23   24   27   34   38   40   40   42
Rank           1    2    3    4    5    6    7    8    9

• The median position is (9 + 1)/2 = 5, i.e. the 5th ranked observation.
• Therefore the median = 34.
Example 2
Sambiri Silicon manufactures computer monitors. The following data are the numbers of computer monitors produced at the company for a sample of 10 days. Find the median.
24 31 27 25 35 33 26 40 25 28
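The median procedure (sort, find position (n + 1)/2, read off the value) can be sketched as follows. The function name `median_by_position` is hypothetical, and averaging the two neighbouring values when the position falls halfway between two ranks is the usual convention, assumed here because n = 10 gives a position of 5.5.

```python
def median_by_position(values):
    """Median via the (n + 1) / 2 position rule on the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2              # 1-based position in the ordered data
    lower = ordered[int(position) - 1]  # value at the rank just below or at the position
    if position.is_integer():
        return lower                    # odd n: a single middle observation
    upper = ordered[int(position)]      # even n: average the two middle observations
    return (lower + upper) / 2


example_1 = [27, 38, 12, 34, 42, 40, 24, 40, 23]
monitors = [24, 31, 27, 25, 35, 33, 26, 40, 25, 28]

print(median_by_position(example_1))  # 34
print(median_by_position(monitors))   # 27.5
```

For the monitor data the ordered set is 24, 25, 25, 26, 27, 28, 31, 33, 35, 40, so the median lies halfway between the 5th and 6th values, 27 and 28.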
Properties of the Median
• In an ordered array, the median is the "middle" number (50% above, 50% below).
• Uniqueness: there is only one median for each set of data.
• Not affected by extreme values.
[Figure: two dot plots on a 0 to 10 axis, each with Median = 3]
The Mode
The mode is the most frequently occurring value in a set of observations.
Organizing data into an ordered array (in ascending order) helps to locate the mode.
• Find the mode for the data below:
7.00 11.00 14.25 15.00 15.00 15.50
19.00 19.00 19.00 19.00 21.00 22.00
23.00 24.00 25.00 27.00 27.00 28.00
34.22 43.25
The mode is 19.00 because it recurs the most times, i.e. four (4) times.
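Counting how often each value occurs gives the mode directly. The sketch below uses `collections.Counter` from the Python standard library; the function name `modes` and the convention of returning an empty list when every value occurs exactly once are assumptions made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency (empty list if none repeats)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: treated here as "no mode"
    return sorted(v for v, c in counts.items() if c == highest)


prices = [7.00, 11.00, 14.25, 15.00, 15.00, 15.50,
          19.00, 19.00, 19.00, 19.00, 21.00, 22.00,
          23.00, 24.00, 25.00, 27.00, 27.00, 28.00,
          34.22, 43.25]

print(modes(prices))           # [19.0]
print(modes([1, 2, 3]))        # []
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```

Returning a list makes it possible to report several modes when more than one value attains the highest frequency.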
Properties of the Mode
• Not affected by extreme values.
• There may be no mode.
• There may be several modes.
[Figure: two dot plots, one labelled Mode = 9 and one labelled No Mode]
Review Example
House prices: $2,000,000; $500,000; $300,000; $100,000; $100,000 (sum = $3,000,000)
• Sample mean = $3,000,000 / 5 = $600,000
• Median = $300,000
• Mode = $100,000
Relationship among the Mean, Median and Mode
Knowing the values of the mean, median and mode can give us some idea about the shape of a frequency distribution curve.
[Figures: a symmetric histogram and two skewed histograms]
Which Measure to Choose?
• The mean is generally used, unless extreme values (outliers) exist.
• The median is often used, since the median is not sensitive to extreme values.
• In some situations it makes sense to report both the mean and the median.
Summary
Central Tendency
• Sample mean: \(\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}\)
• Median: middle value in the ordered array
• Mode: most frequently observed value
• Geometric mean: \(\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}\), the rate of change of a variable over time
Measures of Dispersion for ungrouped data
Measures of Dispersion
The measures of central tendency, such as the mean, median and mode, do not reveal the whole picture of the distribution of the dataset.
• Two datasets with the same mean may have completely different spreads.
• The amount or degree of spread is known as variation.
• Which dataset has the larger variation?
[Figure: Dataset 1 and Dataset 2]
[Figure: two populations with the same centre but different variation]
Population 1: narrow range, smaller variation, smaller deviation, observations clustered
Population 2: wide range, larger variation, larger deviation, observations spread out
Measures of Dispersion: Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard deviation.
• The more the data are concentrated, the smaller the range, variance, and standard deviation.
• If the values are all equal (no variation), all these measures will be zero.
• None of these measures is ever negative.
Consider the following data on ages of employees at each of two companies.
The mean age of employees of these companies is the same, 40 years.
If we do not know the ages of individual employees at these two companies and we are only told that the mean age of employees at both companies is the same, we may wrongly deduce that the employees at these two companies have a similar age distribution.
The diagram shows that the ages of the employees at the second company have a much larger variation than the ages of the employees at the first company.
The mean, median and mode locate the centre of the distribution.
We also need a measure that can provide some information about the variation (spread) among data values.
Variation
Measures of variation: Range, Variance, Standard Deviation, Coefficient of Variation
Measures of variation give information on the spread or variability or dispersion of the data values.
The Range
Range = X_largest − X_smallest
Example: for data whose smallest value is 1 and largest value is 13, Range = 13 − 1 = 12
Why The Range Can Be Misleading
• Ignores the way in which data are distributed
[Figure: two differently distributed datasets on a 7 to 12 axis, each with Range = 12 − 7 = 5]
• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 − 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 − 1 = 119
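The sensitivity of the range to a single outlier can be checked with a short Python sketch. The function name `data_range` is an assumption for this illustration; the two lists reproduce the datasets shown above.

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


# Eleven 1s, eight 2s, four 3s, then the last two observations.
without_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

print(data_range(without_outlier))  # 4
print(data_range(with_outlier))     # 119
```

Only one of the 25 observations differs between the two lists, yet the range changes from 4 to 119.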

The Sample Variance
Variance is used to measure the dispersion of values relative to the mean.
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \]
where \(\bar{x}\) = arithmetic mean
n = sample size
\(x_i\) = ith observation of the variable X
When values are close to their mean (narrow range) the dispersion is less than when there is scattering over a wide range.
Example
For this sample data xi: 2, 3, 5, 1, 4, 3, 2, 4, find:
1. Sample variance
2. Sample standard deviation

xi     xi − x̄         (xi − x̄)²
2      2 − 3 = −1     1
3      3 − 3 = 0      0
5      5 − 3 = 2      4
1      1 − 3 = −2     4
4      4 − 3 = 1      1
3      3 − 3 = 0      0
2      2 − 3 = −1     1
4      4 − 3 = 1      1
Σ 24   0              12
Solution
Sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{24}{8} = 3 \]
Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{12}{7} = 1.714 \]
The Sample Standard Deviation
• Most commonly used measure of variation
• Tells us how much observations in our sample differ from the mean value within our sample.
• Has the same units as the original data
\[ s = \sqrt{s^2} \]
Solution
Sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{24}{8} = 3 \]
Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{12}{7} = 1.714 \]
Sample standard deviation:
\[ s = \sqrt{s^2} = \sqrt{1.714} = 1.309 \]
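The worked solution can be reproduced step by step in Python. This is a sketch under the definitions given in these notes (divisor n − 1); the function name `sample_variance` is an assumption for this example.

```python
import math

sample = [2, 3, 5, 1, 4, 3, 2, 4]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


s2 = sample_variance(sample)
s = math.sqrt(s2)  # the standard deviation is the square root of the variance
print(round(s2, 3), round(s, 3))  # 1.714 1.309
```

The intermediate list `squared_deviations` corresponds to the last column of the worked table, whose total is 12.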
Comparing Standard Deviations
[Figure: three dot plots on an 11 to 21 axis]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
The Coefficient of Variation
Sometimes we may need to compare the variability of two different datasets that have different units of measurement.
• Measures variation relative to the mean
• Always expressed as a percentage (%)
• Can be used to compare the variability of two or more sets of data measured in different units.
\[ CV = \left( \frac{s}{\bar{x}} \right) \times 100\% \]
Example 1
The yearly salaries of all employees who work for a company have a mean of $62,350 and a standard deviation of $6,820. The years of experience for the same employees have a mean of 15 years and a standard deviation of 2 years. Is the relative variation in the salaries larger or smaller than that in the years of experience for these employees?
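One way to answer this question is to compute both coefficients of variation and compare them. The sketch below applies the CV formula to the figures quoted in the example; the function name `coefficient_of_variation` is an assumption for this illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


cv_salary = coefficient_of_variation(6820, 62350)
cv_experience = coefficient_of_variation(2, 15)

print(f"Salaries:   {cv_salary:.2f}%")      # about 10.94%
print(f"Experience: {cv_experience:.2f}%")  # about 13.33%
print(cv_salary < cv_experience)            # True
```

On these figures the relative variation in salaries (about 10.9%) is smaller than that in years of experience (about 13.3%), even though the salary standard deviation is far larger in absolute terms.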
Example 2
We wish to know which is more variable, the price of stock A or the price of stock B.

                     Stock A   Stock B
Average price        $50       $100
Standard deviation   $5        $5

Comparing Coefficients of Variation
\[ CV_A = \left( \frac{s}{\bar{x}} \right) \cdot 100\% = \frac{5}{50} \cdot 100\% = 10\% \]
\[ CV_B = \left( \frac{s}{\bar{x}} \right) \cdot 100\% = \frac{5}{100} \cdot 100\% = 5\% \]
Comparing the coefficients of variation, it is clear that the relative variation is much higher for stock A than for stock B.
Interpretation
• A low (%) value shows low variability, implying tight clustering of observations about the mean.
• A middle to high (%) value shows high variability, implying that observations are widely spread.
Measures of Position for ungrouped data (Quartile Measures)
Quartile Measures
• Quartiles split the ranked data into 4 equal segments.
  25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
• The first quartile, Q1, is the value below which the first 25% of the observations lie.
• Q2 is the same as the median: the first 50% of the observations lie below it.
• The third quartile, Q3, is the value below which the first 75% of the observations lie.
• Q1 = 25th percentile = P25
• Q2 = 50th percentile = P50
• Q3 = 75th percentile = P75
Locating Quartile Positions
Find a quartile by determining the value in the appropriate position in the ranked data.
First quartile position: Q1 = 0.25(n + 1)
Second quartile position: Q2 = 0.5(n + 1)
Third quartile position: Q3 = 0.75(n + 1)
Quartile Measures: The Interquartile Range (IQR)
Because the range can be distorted by outliers (extreme values), a modified range which excludes these outliers is often calculated.
The IQR measures the spread in the middle 50% of the data.
IQR = Q3 − Q1
• The IQR is also called the 50% midspread.
• This modified range removes outliers, but it excludes 50% of all observations from further analysis.
• The IQR, like the range, also provides no information on the clustering of observations within the dataset, as it uses only two values in its computation.
Example 1
Given sample data in an ordered array:
11 12 13 16 16 17 18 21 22
Find
1. Q1 and Q3
2. IQR
Locating the first quartile, Q1 (n = 9)
Q1 is in the 0.25(9 + 1) = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = (12 + 13)/2 = 12.5
Locating the third quartile, Q3 (n = 9)
Q3 is in the 0.75(9 + 1) = 7.5 position of the ranked data, so use the value halfway between the 7th and 8th values: Q3 = (18 + 21)/2 = 19.5
IQR = Q3 − Q1 = 19.5 − 12.5 = 7.0
Example 2
Given sample data in an ordered array:
7 8 9 10 11 12 13 13 14 17 17 45
Find
1. Q1 and Q3
2. IQR
Locating the first quartile, Q1 (n = 12)
Q1 is in the 0.25(12 + 1) = 3.25 position of the ranked data.
First find the value halfway between the 3rd and 4th values, which is (9 + 10)/2 = 9.5.
Position 3.25 lies halfway between position 3 (value 9) and position 3.5 (value 9.5), so Q1 = (9 + 9.5)/2 = 9.25
Locating the third quartile, Q3 (n = 12)
Q3 is in the 0.75(12 + 1) = 9.75 position of the ranked data.
First find the value halfway between the 9th and 10th values, which is (14 + 17)/2 = 15.5.
Position 9.75 lies halfway between position 9.5 (value 15.5) and position 10 (value 17), so Q3 = (15.5 + 17)/2 = 16.25
IQR = Q3 − Q1 = 16.25 − 9.25 = 7.0
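The position rule used in both quartile examples can be sketched in Python. The helper names `value_at_position` and `quartiles` are assumptions for this illustration, and linear interpolation between neighbouring ranks is assumed for fractional positions; it reproduces the halfway values worked out above.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in ordered data."""
    lower_rank = int(position)
    fraction = position - lower_rank
    lower_value = ordered[lower_rank - 1]
    if fraction == 0:
        return lower_value
    upper_value = ordered[lower_rank]
    # Linear interpolation between the two neighbouring observations.
    return lower_value + fraction * (upper_value - lower_value)


def quartiles(values):
    """Q1, Q2, Q3 using the positions 0.25(n+1), 0.5(n+1), 0.75(n+1)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch assumes at least three observations")
    return tuple(value_at_position(ordered, p * (n + 1)) for p in (0.25, 0.5, 0.75))


q1, q2, q3 = quartiles([11, 12, 13, 16, 16, 17, 18, 21, 22])
print(q1, q3, q3 - q1)  # 12.5 19.5 7.0

r1, r2, r3 = quartiles([7, 8, 9, 10, 11, 12, 13, 13, 14, 17, 17, 45])
print(r1, r3, r3 - r1)  # 9.25 16.25 7.0
```

Statistical software often uses slightly different quartile conventions, so results from other tools may not match these values exactly.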
Numerical Descriptive Measures for a Population
• Numerical descriptive measures discussed so far described a sample, not the population.
• These descriptive statistics are called sample statistics.
• Summary measures describing a population are called parameters, and are denoted with Greek letters.
• Important population parameters are the population mean, population variance, and population standard deviation.
The Population Mean μ
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} \]
where μ = population mean
N = population size
\(X_i\) = ith observation of the variable X
The Population Variance σ²
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
where μ = population mean
N = population size
\(X_i\) = ith observation of the variable X
The Population Standard Deviation σ
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]
Sample statistics versus population parameters

Measure              Population Parameter   Sample Statistic
Mean                 μ                      x̄
Variance             σ²                     s²
Standard Deviation   σ                      s
Proportion           π                      p
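The only computational difference between the population and sample variance formulas is the divisor (N versus n − 1). The sketch below shows both on the same eight values used in the earlier variance example; treating those values as a whole population is purely an assumption for illustration, and the function name `variance` with its `population` flag is hypothetical.

```python
import math


def variance(values, population=False):
    """Population variance (divide by N) or sample variance (divide by n - 1)."""
    n = len(values)
    centre = sum(values) / n
    sum_of_squares = sum((x - centre) ** 2 for x in values)
    divisor = n if population else n - 1
    return sum_of_squares / divisor


data = [2, 3, 5, 1, 4, 3, 2, 4]

sigma2 = variance(data, population=True)  # 12 / 8
s2 = variance(data)                       # 12 / 7
print(sigma2, round(s2, 3))                                  # 1.5 1.714
print(round(math.sqrt(sigma2), 3), round(math.sqrt(s2), 3))  # 1.225 1.309
```

Because the sample formula divides by a smaller number, it always gives a result at least as large as the population formula applied to the same values.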
Approximating the Mean, Variance and Standard Deviation from grouped data
Computing Numerical Descriptive Measures From A Frequency Distribution
We can only compute approximations to the mean, variance and the standard deviation of the data since we are dealing with grouped data.
Approximating the Sample Mean from a Frequency Distribution
Use the midpoint of a class interval to approximate the values in that class.
\[ \bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n} \]
where n = number of observations or sample size
k = number of classes in the frequency distribution
\(x_i\) = class midpoint
\(f_i\) = frequency of observations in class i
Example 1
The table below gives the commuting times (in minutes) from home to work for 30 employees of a company.
18 15 7 24 10
23 28 10 16 12
5 23 24 16 19
26 17 27 17 17
29 18 23 9 26
12 22 14 26 22
Descriptive Statistics on Raw Data
       n    Range   Mean    Std. Deviation   Variance
Time   30   24      18.50   6.627            43.914
Question 1
Using the grouped data, approximate the mean commuting time for the 30 employees.
Frequency Distribution

Class Limits   fi   xi     fixi
5 ≤ x < 10     3    7.5    22.5
10 ≤ x < 15    5    12.5   62.5
15 ≤ x < 20    9    17.5   157.5
20 ≤ x < 25    7    22.5   157.5
25 ≤ x < 30    6    27.5   165.0
Total          30          565

The mean commuting time
\[ \bar{x} = \frac{\sum_{i=1}^{5} f_i x_i}{n} = \frac{565}{30} = 18.833 \approx 18.8 \text{ minutes} \]
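The grouped-mean calculation can be sketched in Python. Representing each class as a `(lower, upper, frequency)` tuple and the function name `grouped_mean` are assumptions made for this illustration.

```python
# Each class: (lower limit, upper limit, frequency), from the frequency distribution.
classes = [(5, 10, 3), (10, 15, 5), (15, 20, 9), (20, 25, 7), (25, 30, 6)]


def grouped_mean(classes):
    """Approximate the mean using class midpoints weighted by frequency."""
    n = sum(f for _, _, f in classes)
    weighted_total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return weighted_total / n


print(round(grouped_mean(classes), 3))  # 18.833
```

The result, 18.833, is close to but not equal to the raw-data mean of 18.50, because every observation in a class is replaced by the class midpoint.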
Approximating the Sample Standard Deviation from a Frequency Distribution
\[ s = \sqrt{\frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n-1}} \]
where n = number of observations or sample size
k = number of classes in the frequency distribution
\(x_i\) = class midpoint
\(f_i\) = frequency of observations in class i
Descriptive Statistics on Raw Data
       n    Range   Mean    Std. Deviation   Variance
Time   30   24      18.50   6.627            43.914
Question 2
Using the grouped data, approximate the sample variance and standard deviation for commuting time for the 30 employees.

Class Limits   fi   Midpoint xi   (xi − x̄)   (xi − x̄)²   (xi − x̄)² fi
5 ≤ x < 10     3    7.5           −11.333     128.437      385.311
10 ≤ x < 15    5    12.5          −6.333      40.107       200.535
15 ≤ x < 20    9    17.5          −1.333      1.777        15.993
20 ≤ x < 25    7    22.5          3.667       13.447       94.129
25 ≤ x < 30    6    27.5          8.667       75.117       450.702
Total          30                                          1146.670

The Variance
\[ s^2 = \frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n-1} = \frac{1146.670}{29} = 39.540 \]
The Standard Deviation
\[ s = \sqrt{s^2} = \sqrt{39.540} = 6.288 \approx 6.3 \text{ minutes} \]
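The table-based calculation of the grouped variance and standard deviation can be sketched in Python. The `(lower, upper, frequency)` class representation and the function name `grouped_variance` are assumptions made for this illustration.

```python
import math

# Each class: (lower limit, upper limit, frequency).
classes = [(5, 10, 3), (10, 15, 5), (15, 20, 9), (20, 25, 7), (25, 30, 6)]


def grouped_variance(classes):
    """Approximate sample variance from class midpoints and frequencies."""
    n = sum(f for _, _, f in classes)
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    freqs = [f for _, _, f in classes]
    x_bar = sum(f * m for f, m in zip(freqs, midpoints)) / n
    # Frequency-weighted sum of squared deviations of midpoints from the mean.
    weighted_ss = sum(f * (m - x_bar) ** 2 for f, m in zip(freqs, midpoints))
    return weighted_ss / (n - 1)


s2 = grouped_variance(classes)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 3))  # 39.54 6.288
```

These approximations (39.54 and 6.288) differ somewhat from the raw-data values of 43.914 and 6.627, which is expected when observations are replaced by class midpoints.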
Class Exercise 1
The frequency distribution table below gives the
number of iPods sold by a shop on each of 30 days.
Calculate the mean, variance and standard
deviation.
iPods sold   f
5 - 9        3
10 - 14      6
15 - 19      8
20 - 24      8
25 - 29      5
Total        30
Class Exercise 2
Sambiri Silicon manufactures computer monitors.
The following table represents the distribution of
computer monitors produced at the company for
a sample of 30 days. Calculate the mean, variance
and standard deviation.
Class Limits   f
21 - 23        7
24 - 26        6
27 - 29        6
30 - 32        4
33 - 35        7
Total          30
Class Exercise 3
A sample of 40 randomly selected households
from a city produced the following distribution of
the number of vehicles owned. Find the mean,
variance and standard deviation.
Class f
0 2
1 18
2 11
3 4
4 3
5 2
Approximating the Median from grouped data
Approximating the Median from a Frequency Distribution
\[ Me = L + \frac{c\,[0.5n - CF]}{f_{me}} \]
where L = lower class limit of the median class interval
c = class width
n = sample size
\(f_{me}\) = absolute frequency of the median class interval
CF = absolute cumulative frequency of the interval before the median interval
To find the median:
• Calculate the cumulative frequencies (in class order).
• Identify the median position, n/2 = 0.5n.
• Identify the median class interval from the cumulative frequency column. This is the class interval that contains the median value.
• Use the formula.
Question 3
Using the grouped data from the commuting time for 30 employees example, approximate:
1. The median.
Commuting times Example

Class Limits   fi   CF
5 ≤ x < 10     3    3
10 ≤ x < 15    5    8     ← CF before the median interval
15 ≤ x < 20    9    17    ← median interval (contains the median position)
20 ≤ x < 25    7    24
25 ≤ x < 30    6    30
n = Σfi = 30

1. The median commuting time
Median position = n/2 = 30/2 = 15th observation
The median interval is 15 to < 20, as it contains the 15th observation.
L = 15, c = 5, n = 30, f_me = 9, CF = 8
\[ Me = 15 + \frac{5\,[0.5 \times 30 - 8]}{9} = 18.889 \approx 18.9 \text{ minutes} \]
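The grouped-median steps (accumulate frequencies, locate the interval containing position 0.5n, apply the formula) can be sketched in Python. The `(lower, upper, frequency)` class representation and the function name `grouped_median` are assumptions made for this illustration.

```python
# Each class: (lower limit, upper limit, frequency).
classes = [(5, 10, 3), (10, 15, 5), (15, 20, 9), (20, 25, 7), (25, 30, 6)]


def grouped_median(classes):
    """Me = L + c * (0.5 * n - CF) / f_me for the class containing position 0.5 * n."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("no observations")
    target = 0.5 * n
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            width = upper - lower
            return lower + width * (target - cumulative) / f
        cumulative += f
    raise ValueError("median class not found")


print(round(grouped_median(classes), 3))  # 18.889
```

For the commuting data the loop stops at the class 15 to < 20 with CF = 8 and f_me = 9, matching the values substituted in the worked solution.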
Approximating the Mode from grouped data
Approximating the Mode from a Frequency Distribution
\[ Mo = L + \frac{c\,(f_m - f_{m-1})}{2 f_m - f_{m-1} - f_{m+1}} \]
where L = lower limit of the modal class interval
c = class width of the modal class interval
\(f_m\) = frequency of the modal class interval
\(f_{m-1}\) = frequency of the class preceding the modal interval
\(f_{m+1}\) = frequency of the class following the modal interval
2. The modal commuting time
Identify the modal interval. This is the interval associated with the highest frequency.

Class Limits   fi
5 to < 10      3
10 to < 15     5    ← f_m−1
15 to < 20     9    ← modal interval, f_m
20 to < 25     7    ← f_m+1
25 to < 30     6
n = Σfi = 30

The modal interval is 15 to < 20, as it has the highest frequency.
L = 15, c = 5, f_m = 9, f_m−1 = 5, f_m+1 = 7
\[ Mo = 15 + \frac{5\,(9 - 5)}{2(9) - 5 - 7} = 18.333 \approx 18.3 \text{ minutes} \]
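The grouped-mode formula can be sketched in Python. The `(lower, upper, frequency)` class representation and the function name `grouped_mode` are assumptions for this illustration; treating a missing neighbouring class as having frequency 0, and picking the first class when several share the highest frequency, are also assumptions not stated in the notes.

```python
# Each class: (lower limit, upper limit, frequency).
classes = [(5, 10, 3), (10, 15, 5), (15, 20, 9), (20, 25, 7), (25, 30, 6)]


def grouped_mode(classes):
    """Mo = L + c * (f_m - f_prev) / (2 * f_m - f_prev - f_next)."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))  # first class with the highest frequency
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0               # assumed 0 if no preceding class
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumed 0 if no following class
    lower, upper, _ = classes[m]
    denominator = 2 * f_m - f_prev - f_next
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring classes match the modal frequency")
    return lower + (upper - lower) * (f_m - f_prev) / denominator


print(round(grouped_mode(classes), 3))  # 18.333
```

With f_m = 9, f_m−1 = 5 and f_m+1 = 7 the function returns 15 + 20/6, the same value as the worked solution.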
Finding the mode using the graphical method.
[Figure: histogram of commuting times (number of minutes); x-axis: Class Midpoint (7.5, 12.5, 17.5, 22.5, 27.5); y-axis: Frequency]
Class Exercise 1
With the aid of an appropriate graph drawn to scale, determine
1. The mode for the commuting time problem
Measures of Position for grouped data (Quartile Measures)
Question 5
Using the grouped data from the Example on commuting time for 30 employees, approximate the following:
1. Q1
2. Q3
3. The IQR

Class Limits   fi   CF
5 ≤ x < 10     3    3
10 ≤ x < 15    5    8
15 ≤ x < 20    9    17
20 ≤ x < 25    7    24
25 ≤ x < 30    6    30
Total          30
Q1 = 25th percentile
• Identify the interval that contains the 25th percentile.
\[ P_{25} = L + \frac{c\,[0.25n - CF]}{f_{P25}} \]

Class Limits   fi   Cumulative frequency
5 to < 10      3    3     ← CF before the P25 interval
10 to < 15     5    8     ← P25 interval
15 to < 20     9    17
20 to < 25     7    24
25 to < 30     6    30
n = Σfi = 30

• L = 10 → lower limit of the P25 interval
• n = 30 → sample size
• f_P25 = 5 → frequency of the P25 interval
• CF = 3 → cumulative frequency of the interval before the P25 interval
• c = 5 → class width
\[ P_{25} = 10 + \frac{5\,[0.25 \times 30 - 3]}{5} = 14.5 \]
Q3 = 75th percentile
• Identify the interval that contains the 75th percentile.
\[ P_{75} = L + \frac{c\,[0.75n - CF]}{f_{P75}} \]

Class Limits   fi   Cumulative frequency
5 to < 10      3    3
10 to < 15     5    8
15 to < 20     9    17    ← CF before the P75 interval
20 to < 25     7    24    ← P75 interval
25 to < 30     6    30
n = Σfi = 30

• L = 20 → lower limit of the P75 interval
• n = 30 → sample size
• f_P75 = 7 → frequency of the P75 interval
• CF = 17 → cumulative frequency of the interval before the P75 interval
• c = 5 → class width
\[ P_{75} = 20 + \frac{5\,[0.75 \times 30 - 17]}{7} = 23.9 \]
The Interquartile Range
Example:

X minimum   Q1     Median (Q2)   Q3     X maximum
5           14.5   18.9          23.9   29

Each of the four segments between these values contains 25% of the observations.
Interquartile range = 23.9 − 14.5 = 9.4
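The P25 and P75 formulas differ only in the fraction of n they target, so a single function can cover both quartiles and the IQR. The sketch below is an illustration under that observation; the `(lower, upper, frequency)` class representation and the function name `grouped_percentile` are assumptions.

```python
# Each class: (lower limit, upper limit, frequency).
classes = [(5, 10, 3), (10, 15, 5), (15, 20, 9), (20, 25, 7), (25, 30, 6)]


def grouped_percentile(classes, p):
    """P_p = L + c * (p/100 * n - CF) / f for the class containing position p/100 * n."""
    n = sum(f for _, _, f in classes)
    if n == 0 or not 0 < p < 100:
        raise ValueError("need observations and a percentile strictly between 0 and 100")
    target = p / 100 * n
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            return lower + (upper - lower) * (target - cumulative) / f
        cumulative += f
    raise ValueError("percentile class not found")


q1 = grouped_percentile(classes, 25)
q3 = grouped_percentile(classes, 75)
print(round(q1, 1), round(q3, 1), round(q3 - q1, 1))  # 14.5 23.9 9.4
```

Calling `grouped_percentile(classes, 50)` gives the grouped median, and other percentiles can be approximated the same way.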
Class Exercise 2
With the aid of an appropriate graph drawn to scale, determine
1. Q1, Q3 and IQR
2. The 80th percentile and
3. The mid-60% range.
