Measures of Location
Numerical Measures
Chapter Goals
After completing this chapter, you should be able
to:
Compute and interpret the mean, median, and mode
for a set of data
Compute the range, variance, and standard deviation
and know what these values mean
Construct and interpret a box-and-whisker plot
Compute and explain the coefficient of variation and
z scores
Use numerical measures along with graphs, charts, and
tables to describe data
Chapter Topics
Measures of Center and Location
Mean, median, mode, geometric mean,
midrange
Measures of Variation
Range, interquartile range, variance and
standard deviation, coefficient of variation
What Is Statistics?
Statistics is the science
of collecting, organizing,
analyzing, and
interpreting data in order
to make decisions.
Important Terms
Population
The collection of all responses,
measurements, or counts that are of interest.
Sample
A portion or subset of the population.
Parameter
A number that describes a population characteristic.
Average gross income of all
people in the United States in
2002.
Statistic
A number that describes a sample characteristic.
2002 gross income of people
from a sample of three states.
Inferential Statistics
Involves using sample data to draw conclusions
about a population.
Summary Measures
Describing Data Numerically
Center and Location: Mean, Median, Mode, Weighted Mean
Other Measures of Location: Percentiles, Quartiles
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
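Center and Location: Mean, Median, Mode, Weighted Mean
Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$, where n = sample size
Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$, where N = population size
Weighted mean: $\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i}$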
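Coding Method
Sample mean (ungrouped data): $\bar{X} = A + \frac{\sum u}{n} \times h$
where $u = (x - A)/h$, A = guess value, h = class width
Sample mean (frequency data): $\bar{X} = A + \frac{\sum f u}{n} \times h$
where $u = (x - A)/h$, h = class width
Note that x = mid value for class interval data = 1/2 (lower limit + upper limit)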
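The coding method for frequency data can be sketched in a few lines of Python. The class intervals and frequencies below are hypothetical (they are not from the slides), and the function name `coded_mean` is an illustrative choice. The sketch compares the coded result with the mean computed directly from the class midpoints.

```python
def coded_mean(midpoints, freqs, A, h):
    """Mean of frequency data via the coding method: A + h * sum(f*u) / n."""
    n = sum(freqs)
    # u = (x - A) / h for each class midpoint x
    sum_fu = sum(f * (x - A) / h for x, f in zip(midpoints, freqs))
    return A + h * sum_fu / n

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (assumed for illustration)
midpoints = [5, 15, 25, 35]   # x = 1/2 (lower limit + upper limit)
freqs = [2, 5, 8, 5]

direct = sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)
coded = coded_mean(midpoints, freqs, A=25, h=10)
print(direct, coded)  # both give the same mean
```

Any guess value A gives the same answer; choosing A near the center of the data only keeps the coded values u small.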
Data 1, 2, 3, 4, 5: Mean = $\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
Data 1, 2, 3, 4, 10: Mean = $\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$
Median
Not affected by extreme values
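Median = 3 in both plotted examples (figure)
Median for class interval data: $\text{Median} = l + \frac{n/2 - cf}{f} \times h$
where l = lower limit of median class, cf = cumulative frequency preceding the median class, f = median class frequency, h = class width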
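As a rough illustration of the class-interval median formula, the sketch below locates the median class by accumulating frequencies and then interpolates inside it. The frequency table is hypothetical and the helper name `grouped_median` is an assumption, not something defined in the slides.

```python
def grouped_median(lower_limits, freqs, h):
    """Median = l + (n/2 - cf) / f * h for equal-width classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty frequency table")
    cf = 0  # cumulative frequency of the classes before the current one
    for l, f in zip(lower_limits, freqs):
        if cf + f >= n / 2:  # first class whose cumulative count reaches n/2
            return l + (n / 2 - cf) / f * h
        cf += f

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (assumed for illustration)
median = grouped_median([0, 10, 20, 30], [2, 5, 8, 5], h=10)
print(median)  # 20 + (10 - 7) / 8 * 10 = 23.75
```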
Mode
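Examples (figure): one data set with Mode = 5, one data set with No Mode
Mode for class interval data: $\text{Mode} = l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$
where l = lower limit of mode class
$\Delta_1 = f_1 - f_0$
$\Delta_2 = f_1 - f_2$
$f_1$ = mode class frequency (maximum frequency)
$f_0$ = frequency preceding the mode class
$f_2$ = frequency succeeding the mode class
h = class width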
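The class-interval mode formula might be coded as follows. The data are hypothetical, and treating a missing neighbouring class as frequency 0 is an assumption made here for the edge cases, not something the slides specify.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode = l + d1 / (d1 + d2) * h, with d1 = f1 - f0 and d2 = f1 - f2."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of the mode class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0               # assumed 0 if no preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0  # assumed 0 if no succeeding class
    d1, d2 = f1 - f0, f1 - f2
    return lower_limits[k] + d1 / (d1 + d2) * h

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (assumed for illustration)
mode = grouped_mode([0, 10, 20, 30], [2, 5, 8, 5], h=10)
print(mode)  # 20 + 3 / (3 + 3) * 10 = 25.0
```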
Weighted Mean
Used when values are grouped by
frequency or relative importance
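Example: Sample of 26 Repair Projects
Days to Complete    Weight
5                   4
6                   12
7                   8
8                   2
$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31$ days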
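The repair-project calculation can be reproduced with a short script. The variable names are illustrative; the numbers are the days and weights from the example.

```python
days = [5, 6, 7, 8]        # days to complete
weights = [4, 12, 8, 2]    # number of projects taking that many days

total_weighted = sum(w * x for w, x in zip(weights, days))  # 164
total_weight = sum(weights)                                 # 26
weighted_mean = total_weighted / total_weight

print(round(weighted_mean, 2))  # 6.31
```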
Review Example
Five houses on a hill by the beach
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum: $3,000,000
Mean: $3,000,000 / 5 = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
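The three summary statistics for the house prices can be checked with Python's standard `statistics` module. This is only a verification sketch using the five prices listed above.

```python
from statistics import mean, median, mode

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

print(mean(prices))    # 600000
print(median(prices))  # 300000
print(mode(prices))    # 100000
```

The single $2,000,000 house pulls the mean well above the median, which is why the median is often preferred for data with extreme values.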
Shape of a Distribution
Describes how data is distributed
Symmetric or skewed
Left-Skewed
Symmetric
Right-Skewed
Percentiles
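The pth percentile in an ordered array of n values is the value in the ith position, where
$i = \frac{p}{100}(n + 1)$
Example: the 60th percentile in an ordered array of 19 values is the value in the 12th position:
$i = \frac{p}{100}(n + 1) = \frac{60}{100}(19 + 1) = 12$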
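The position rule is simple enough to express as a one-line function. The name `percentile_position` is an illustrative choice.

```python
def percentile_position(p, n):
    """Position i of the p-th percentile in an ordered array of n values."""
    return p * (n + 1) / 100

print(percentile_position(60, 19))  # 12.0
```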
Quartiles
Quartiles split the ranked data into 4
equal groups
25% of the values fall in each group; the groups are divided at Q1, Q2, and Q3
so use the value halfway between the 2nd and 3rd values, so Q1 = 12.5
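A minimal sketch of finding quartiles by position follows. The ordered array is hypothetical, chosen here so that the first quartile falls halfway between the 2nd and 3rd values. Using linear interpolation for any fractional position is an assumption that generalizes the "halfway" rule.

```python
def percentile(ordered, p):
    """Value at position i = p/100 * (n + 1) in an ordered array (1-based)."""
    n = len(ordered)
    i = p * (n + 1) / 100
    lo = int(i)       # position of the lower neighbour
    frac = i - lo     # how far to move toward the next value
    if lo < 1 or lo > n or (lo == n and frac > 0):
        raise ValueError("position falls outside the data")
    if frac == 0:
        return ordered[lo - 1]
    return ordered[lo - 1] + frac * (ordered[lo] - ordered[lo - 1])

# Hypothetical ordered array of 9 values (assumed for illustration)
data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1, q2, q3 = (percentile(data, p) for p in (25, 50, 75))
print(q1, q2, q3)  # 12.5 16 19.5
```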
Decile (10 equal divisions): $D_k = l + \frac{k \cdot n/10 - cf}{f} \times h$, k = 1, 2, ..., 9
Percentile (100 equal divisions): $P_t = l + \frac{t \cdot n/100 - cf}{f} \times h$, t = 1, 2, ..., 99
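Deciles and percentiles for class interval data follow the same pattern, so one helper can cover both. The sketch below uses a hypothetical frequency table, and the function name and its `parts` argument are assumptions made for illustration.

```python
def grouped_quantile(lower_limits, freqs, h, k, parts):
    """l + (k*n/parts - cf) / f * h; parts=10 gives D_k, parts=100 gives P_k."""
    n = sum(freqs)
    target = k * n / parts
    cf = 0  # cumulative frequency before the current class
    for l, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= target:
            return l + (target - cf) / f * h
        cf += f
    raise ValueError("quantile not found; check k and parts")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (assumed for illustration)
lows, freqs = [0, 10, 20, 30], [2, 5, 8, 5]
d3 = grouped_quantile(lows, freqs, h=10, k=3, parts=10)     # third decile
p50 = grouped_quantile(lows, freqs, h=10, k=50, parts=100)  # 50th percentile
print(d3, p50)
```

With `k=50` and `parts=100` the helper reduces to the class-interval median formula.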
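Box-and-whisker plot (figure): Minimum, 1st Quartile (Q1), Median (Q2), 3rd Quartile (Q3), Maximum, with 25% of the data between each adjacent pair
Distribution shape and box-and-whisker plot (figure): relative positions of Q1, Q2, Q3 for Symmetric and Right-Skewed distributions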
Box-and-Whisker Plot
Example
Box-and-whisker plot (figure) for a data set with:
Min = 0, Q1 = 2, Q2 = 3, Q3 = 5, Max = 27
Measures of Variation
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Variance: Population Variance, Sample Variance
Standard Deviation: Population Standard Deviation, Sample Standard Deviation
Variation
Measures of variation give
information on the spread or
variability of the data values.
Same center,
different variation
Range
Simplest measure of variation
Difference between the largest and
the smallest observations:
Range = $x_{\text{maximum}} - x_{\text{minimum}}$
Example: for data plotted from 1 to 14 (figure), Range = 14 - 1 = 13
Disadvantages of the Range
Ignores how the data are distributed: both plotted data sets (figure) have Range = 12 - 7 = 5
Sensitive to outliers:
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
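The outlier sensitivity is easy to demonstrate with the two data sets listed above. The helper name `data_range` is an illustrative choice.

```python
def data_range(values):
    """Range = largest observation - smallest observation."""
    return max(values) - min(values)

# Eleven 1s, eight 2s, four 3s, then 4 and 5 (or 120 as the outlier)
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

print(data_range(base))          # 4
print(data_range(with_outlier))  # 119
```

Changing a single value out of 25 moves the range from 4 to 119, even though the other 24 observations are unchanged.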
Interquartile Range
Can eliminate some outlier problems by using the interquartile range
Eliminates some high- and low-valued observations and calculates the range from the remaining values
Interquartile range = 3rd quartile - 1st quartile
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70 (25% of the data in each of the four segments)
Interquartile range = 57 - 30 = 27
Variance
Average of squared deviations of values from the mean (individual series)
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
Has the same units as the original data
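Sample standard deviation (ungrouped data): $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
Population standard deviation (ungrouped data): $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Sample standard deviation (frequency data): $s = \sqrt{\frac{\sum f (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum f x^2}{n - 1} - \frac{(\sum f x)^2}{n(n - 1)}}$
Population standard deviation (frequency data): $\sigma = \sqrt{\frac{\sum f (x - \mu)^2}{N}} = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$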
Coding Method
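Sample std. dev.: $s = \sqrt{\frac{\sum f u^2}{n - 1} - \frac{(\sum f u)^2}{n(n - 1)}}$
where u = x - A and A = guess value
Sample standard deviation: $s = h \sqrt{\frac{\sum f {u'}^2}{n - 1} - \frac{(\sum f u')^2}{n(n - 1)}}$
where $u' = \frac{x - A}{h}$
Population standard deviation: $\sigma = \sqrt{\frac{\sum f u^2}{N} - \left(\frac{\sum f u}{N}\right)^2}$, with u = x - A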
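To see that the coding method gives the same sample standard deviation as the direct formula, the sketch below computes both for a hypothetical frequency table. The data, the guess value A = 25, and the function names are all assumptions made for illustration.

```python
from math import sqrt

def sample_sd_direct(xs, fs):
    """s = sqrt(sum f (x - mean)^2 / (n - 1)) for frequency data."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    return sqrt(sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / (n - 1))

def sample_sd_coded(xs, fs, A, h):
    """s = h * sqrt(sum f u^2 / (n-1) - (sum f u)^2 / (n (n-1))), u = (x - A) / h."""
    n = sum(fs)
    us = [(x - A) / h for x in xs]
    sum_fu = sum(f * u for u, f in zip(us, fs))
    sum_fu2 = sum(f * u * u for u, f in zip(us, fs))
    return h * sqrt(sum_fu2 / (n - 1) - sum_fu ** 2 / (n * (n - 1)))

# Hypothetical class midpoints and frequencies (assumed for illustration)
xs, fs = [5, 15, 25, 35], [2, 5, 8, 5]
s_direct = sample_sd_direct(xs, fs)
s_coded = sample_sd_coded(xs, fs, A=25, h=10)
print(round(s_direct, 4), round(s_coded, 4))
```

Setting `h=1` in `sample_sd_coded` corresponds to the simpler coding u = x - A.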
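Calculation Example: Sample Standard Deviation
Sample Data ($x_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16
$s = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$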
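The calculation can be checked with a short script that computes the mean, the sum of squared deviations, and the sample standard deviation directly from the eight data values. It uses only the standard `statistics` module.

```python
from statistics import mean, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

x_bar = mean(data)                                 # 16
sum_sq_dev = sum((x - x_bar) ** 2 for x in data)   # 36+16+4+1+1+4+4+64 = 130
s = stdev(data)                                    # sqrt(130 / 7)

print(x_bar, sum_sq_dev, round(s, 4))  # 16 130 4.3095
```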
Comparing Standard Deviations
Three data sets plotted on a common axis from 11 to 21 (figure), each with Mean = 15.5:
Data A: s = 3.338
Data B: s = 0.9258
Data C: s = 4.57
Coefficient of Variation
Population: $CV = \left(\frac{\sigma}{\mu}\right) \times 100\%$
Sample: $CV = \left(\frac{s}{\bar{x}}\right) \times 100\%$
Comparing Coefficient of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
$CV_A = \left(\frac{s}{\bar{x}}\right) \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\%$
Stock B:
Average price last year = $100
Standard deviation = $5
$CV_B = \left(\frac{s}{\bar{x}}\right) \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\%$
Both stocks have the same standard deviation, but stock B is less variable relative to its price
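The stock comparison can be written as a small function. The name `coefficient_of_variation` is an illustrative choice; the prices and standard deviations are those given for Stock A and Stock B.

```python
def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) * 100, expressed in percent."""
    return sd * 100 / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A
cv_b = coefficient_of_variation(5, 100)   # Stock B
print(cv_a, cv_b)  # 10.0 5.0
```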
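Tchebysheff's Theorem
Regardless of how the data are distributed, at least $(1 - 1/k^2)$ of the values will fall within k standard deviations of the mean
Examples:
At least $(1 - 1/1^2) = 0\%$ of the values fall within k = 1 ($\mu \pm 1\sigma$)
At least $(1 - 1/2^2) = 75\%$ of the values fall within k = 2 ($\mu \pm 2\sigma$)
At least $(1 - 1/3^2) = 89\%$ of the values fall within k = 3 ($\mu \pm 3\sigma$)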
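The bound can be tabulated and spot-checked against data. In the sketch below, the data set is the outlier example used earlier for the range, treated as a population; the function names are illustrative assumptions.

```python
from statistics import mean, pstdev

def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    return 1 - 1 / k ** 2

def fraction_within(data, k):
    """Observed fraction of values within k population standard deviations."""
    mu, sigma = mean(data), pstdev(data)
    return sum(abs(x - mu) <= k * sigma for x in data) / len(data)

for k in (1, 2, 3):
    print(k, f"{chebyshev_bound(k):.0%}")  # 0%, 75%, 89%

# Strongly skewed data: the observed fraction still meets the bound
data = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]
print(fraction_within(data, 2), fraction_within(data, 3))
```

The bound is deliberately conservative: it holds for any distribution, so actual data usually exceed it by a wide margin.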
Standardized Population Values
$z = \frac{x - \mu}{\sigma}$
where:
x = original data value
$\mu$ = population mean
$\sigma$ = population standard deviation
z = standard score (number of standard deviations x is from $\mu$)
Standardized Sample Values
$z = \frac{x - \bar{x}}{s}$
where:
x = original data value
$\bar{x}$ = sample mean
s = sample standard deviation
z = standard score (number of standard deviations x is from $\bar{x}$)
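A short sketch of standardizing sample values follows. It reuses the eight values from the sample standard deviation example; the function name `z_scores` is an illustrative choice.

```python
from statistics import mean, stdev

def z_scores(data):
    """Standard score of each value: (x - sample mean) / sample std. dev."""
    x_bar, s = mean(data), stdev(data)
    return [(x - x_bar) / s for x in data]

data = [10, 12, 14, 15, 17, 18, 18, 24]
z = z_scores(data)
for x, score in zip(data, z):
    print(x, round(score, 2))
```

Values below the mean get negative scores and values above it get positive scores, so the z scores always sum to zero (up to rounding).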
Chapter Summary
Described measures of center and
location
Mean, median, mode, geometric mean,
midrange
Illustrated distribution shapes
Symmetric, skewed