Business Statistics: A Decision-Making Approach, 7th Edition

Chapter 3: Describing Data Using Numerical Measures

Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1


Chapter Goals
After completing this chapter, you should be able to:
• Compute and interpret the mean, median, and mode for a set of data
• Compute the range, variance, and standard deviation and know what these values mean
• Construct and interpret a box and whisker graph
• Compute and explain the coefficient of variation and z-scores
• Use numerical measures along with graphs, charts, and tables to describe data

Chapter Topics
• Measures of Center and Location
  - Mean, median, mode
• Other Measures of Location
  - Weighted mean, percentiles, quartiles
• Measures of Variation
  - Range, interquartile range, variance and standard deviation, coefficient of variation
• Using the mean and standard deviation together
  - Coefficient of variation, z-scores

Summary Measures

Describing Data Numerically
• Center and Location: Mean, Median, Mode, Weighted Mean
• Other Measures of Location: Percentiles, Quartiles
• Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation

Measures of Center and Location
Overview
Center and Location: Mean, Median, Mode, Weighted Mean

Mean (sample and population):
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

Weighted mean (sample and population):
$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i}$, $\mu_W = \frac{\sum w_i x_i}{\sum w_i}$

Mean (Arithmetic Average)
• The Mean is the arithmetic average of data values
  - Population mean (N = Population Size):
    $\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$
  - Sample mean (n = Sample Size):
    $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Mean (Arithmetic Average)
(continued)
• The most common measure of central tendency
• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)

[Figure: two dot plots on a 0 to 10 axis]
Values 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Values 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4

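The two calculations above can be reproduced with a minimal Python sketch. The function name `mean` and the variable names are illustrative choices, not part of the slides.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by the number of values."""
    return sum(values) / len(values)


without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]   # same data, largest value replaced by 10

print(mean(without_outlier))  # 3.0
print(mean(with_outlier))     # 4.0
```

Changing a single value from 5 to 10 moves the mean from 3 to 4, which is the sensitivity to extreme values that the slide describes.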
Median
• In an ordered array, the median is the “middle” number, i.e., the number that splits the distribution in half
• The median is not affected by extreme values

[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]


Median
(continued)
• To find the median, sort the n data values from low to high (sorted data is called a data array)
• Find the value in the i = (1/2)n position
• The ith position is called the Median Index Point
• If i is not an integer, round up to the next higher integer


Median Example
(continued)
Data array:
4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24
• Note that n = 13
• Find the i = (1/2)n position: i = (1/2)(13) = 6.5
• Since 6.5 is not an integer, round up to 7
• The median is the value in the 7th position: Md = 12

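The Median Index Point procedure can be sketched in Python as follows. The slides only state what to do when i is not an integer; averaging the values in positions i and i + 1 when i is an integer is an assumption made here (it is the usual convention), and the function name is hypothetical.

```python
import math


def median_index_point(values):
    """Median via the index-point rule i = (1/2)n on the sorted data array."""
    data = sorted(values)          # the "data array"
    n = len(data)
    i = n / 2
    if i != int(i):
        # i is not an integer: round up and take the value in that position
        position = math.ceil(i)
        return data[position - 1]  # positions are 1-based, list indexes 0-based
    # Assumption (not stated on the slide): when i is an integer,
    # average the values in positions i and i + 1.
    position = int(i)
    return (data[position - 1] + data[position]) / 2


data_array = [4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24]
print(median_index_point(data_array))  # 12
```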
Shape of a Distribution
• Describes how data is distributed
• Symmetric or skewed

Left-Skewed: Mean < Median (longer tail extends to left)
Symmetric: Mean = Median
Right-Skewed: Median < Mean (longer tail extends to right)

Mode
• A measure of location
• The value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes

[Figure: two dot plots; left, on a 0 to 14 axis, Mode = 5; right, on a 0 to 6 axis, No Mode]

Weighted Mean
• Used when values are grouped by frequency or relative importance

Example: Sample of 26 Repair Projects

Days to Complete   Frequency
5                  4
6                  12
7                  8
8                  2

Weighted Mean Days to Complete:
$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31$ days
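As a check on the repair-project arithmetic, here is a short Python sketch of the weighted mean with the frequencies used as weights. The function and variable names are illustrative only.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * x_i divided by the sum of the weights w_i."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


days_to_complete = [5, 6, 7, 8]
frequency = [4, 12, 8, 2]         # number of projects for each duration

result = weighted_mean(days_to_complete, frequency)
print(round(result, 2))  # 6.31
```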


Review Example
• Five houses on a hill by the beach

House Prices:
$2,000,000
500,000
300,000
100,000
100,000



Summary Statistics

House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000

• Mean: $3,000,000/5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
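The three summary values for the house prices can be confirmed with Python's standard `statistics` module. This is only a cross-check of the slide's numbers; the variable name `prices` is an illustrative choice.

```python
import statistics

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

print(statistics.mean(prices))    # 600000
print(statistics.median(prices))  # 300000
print(statistics.mode(prices))    # 100000
```

The single $2,000,000 house pulls the mean to twice the median, which illustrates why the median is often preferred when outliers are present.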


Which measure of location is the “best”?
• Mean is generally used, unless extreme values (outliers) exist
• Then Median is often used, since the median is not sensitive to extreme values.
  - Example: Median home prices may be reported for a region – less sensitive to outliers


Other Location Measures
Other Measures of Location: Percentiles, Quartiles

Percentiles
The pth percentile in a data array (where 0 ≤ p ≤ 100):
• p% are less than or equal to this value
• (100 – p)% are greater than or equal to this value

Quartiles
• 1st quartile = 25th percentile
• 2nd quartile = 50th percentile = median
• 3rd quartile = 75th percentile


Percentiles
• The pth percentile in an ordered array of n values is the value in the ith position, where
  $i = \frac{p}{100}(n)$
  If i is not an integer, round up to the next higher integer value
• Example: Find the 60th percentile in an ordered array of 19 values.
  $i = \frac{p}{100}(n) = \frac{60}{100}(19) = 11.4$
  So use the value in the i = 12th position

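A possible Python sketch of this percentile rule follows. It uses integer arithmetic (`divmod`) to decide whether i is an integer, which avoids floating-point rounding. Two things are assumptions rather than content from the slides: the handling of an integer i (averaging the values in positions i and i + 1) and the hypothetical 19-value array used to exercise the example.

```python
def percentile(values, p):
    """Value at the pth percentile using the index rule i = (p/100) * n.

    p is expected to be an integer with 0 <= p <= 100.
    """
    data = sorted(values)
    n = len(data)
    whole, remainder = divmod(p * n, 100)   # i = whole + remainder/100
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based)
        return data[whole]
    # Assumption (not stated on the slide): for an integer i, average the
    # values in positions i and i + 1, clamping at the ends of the array.
    if whole == 0:
        return data[0]
    if whole == n:
        return data[-1]
    return (data[whole - 1] + data[whole]) / 2


# Hypothetical ordered array of 19 values: 1, 2, ..., 19
nineteen_values = list(range(1, 20))
print(percentile(nineteen_values, 60))  # 12, the value in the 12th position
```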
Quartiles
• Quartiles split the ranked data into 4 equal groups:

  25% | 25% | 25% | 25%
      Q1    Q2    Q3

• Note that the second quartile (the 50th percentile) is the median


Quartiles
• Example: Find the first quartile

Sample Data in Ordered Array (n = 9): 11 12 13 16 16 17 18 21 22

Q1 = 25th percentile, so find i: $i = \frac{25}{100}(9) = 2.25$
so round up and use the value in the 3rd position: Q1 = 13


Box and Whisker Plot
• A graphical display of data using a central “box” and extended “whiskers”:

[Figure: example box and whisker plot, labeled left to right: outliers (*), lower limit, 1st quartile, median, 3rd quartile, upper limit; each of the four segments holds 25% of the values]


Constructing the
Box and Whisker Plot

[Figure: box and whisker plot, labeled left to right: outliers (*), lower limit, 1st quartile, median, 3rd quartile, upper limit]

The lower limit is Q1 – 1.5 (Q3 – Q1)
The upper limit is Q3 + 1.5 (Q3 – Q1)

• The center box extends from Q1 to Q3
• The line within the box is the median
• The whiskers extend to the smallest and largest values within the calculated limits
• Outliers are plotted outside the calculated limits


Shape of Box and Whisker Plots
• The Box and central line are centered between the endpoints if data is symmetric around the median
• (A Box and Whisker plot can be shown in either vertical or horizontal format)


Distribution Shape and
Box and Whisker Plot

[Figure: three box and whisker plots, labeled Left-Skewed, Symmetric, and Right-Skewed, each marking Q1, Q2, Q3]


Box-and-Whisker Plot Example
• Below is a Box-and-Whisker plot for the following data:

0 2 2 2 3 3 4 5 6 11 27

Min = 0, Q1 = 2, Q2 = 3, Q3 = 6, Max = 27

[Figure: box and whisker plot with axis marks at 0, 2, 3, 6, 12, 27; the value 27 is plotted as an outlier (*)]

Upper limit = Q3 + 1.5 (Q3 – Q1) = 6 + 1.5 (6 – 2) = 12
27 is above the upper limit so is shown as an outlier

• This data is right skewed, as the plot depicts

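The limits, whisker ends, and outliers for this example can be computed with a short Python sketch. It applies the chapter's round-up position rule for quartiles; the helper name `percentile` and the averaging used when the position is an integer are assumptions for illustration, and the helper is only meant for 0 < p < 100.

```python
def percentile(sorted_data, p):
    """Value at the pth percentile (0 < p < 100) using i = (p/100) * n."""
    whole, remainder = divmod(p * len(sorted_data), 100)
    if remainder:
        return sorted_data[whole]          # round i up to the next position
    # Assumption: for an integer i, average positions i and i + 1
    return (sorted_data[whole - 1] + sorted_data[whole]) / 2


data = [0, 2, 2, 2, 3, 3, 4, 5, 6, 11, 27]   # already sorted

q1 = percentile(data, 25)
q3 = percentile(data, 75)
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Outliers fall outside the limits; whiskers end at the most extreme
# values that are still inside the limits.
outliers = [x for x in data if x < lower_limit or x > upper_limit]
inside = [x for x in data if lower_limit <= x <= upper_limit]
whiskers = (min(inside), max(inside))

print(q1, q3, lower_limit, upper_limit)  # 2 6 -4.0 12.0
print(outliers)                          # [27]
print(whiskers)                          # (0, 11)
```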
Measures of Variation
Variation:
• Range
• Interquartile Range
• Variance
  - Population Variance
  - Sample Variance
• Standard Deviation
  - Population Standard Deviation
  - Sample Standard Deviation
• Coefficient of Variation


Variation
• Measures of variation give information on the spread or variability of the data values.

[Figure: two distributions with the same center, different variation]


Range
• Simplest measure of variation
• Difference between the largest and the smallest observations:

Range = $x_{\text{maximum}} - x_{\text{minimum}}$

Example:
[Figure: dot plot on a 0 to 14 axis]
Range = 14 - 1 = 13

Disadvantages of the Range
• Ignores the way in which data are distributed

[Figure: two dot plots on a 7 to 12 axis with differently distributed values; Range = 12 - 7 = 5 in both]

• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119

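The outlier sensitivity shown above takes only a few lines to demonstrate in Python. The function name `data_range` and the way the shared values are built are illustrative choices.

```python
def data_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)


# The 24 values the two data sets share: eleven 1s, eight 2s, four 3s, one 4
shared = [1] * 11 + [2] * 8 + [3] * 4 + [4]

print(data_range(shared + [5]))    # 4
print(data_range(shared + [120]))  # 119
```

Only the last of 25 values differs between the two sets, yet the range grows from 4 to 119.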
Interquartile Range
• Can eliminate some outlier problems by using the interquartile range
• Eliminate some high- and low-valued observations and calculate the range from the remaining values.
• Interquartile range = 3rd quartile – 1st quartile


Interquartile Range Example

Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
(25% of the values fall in each of the four segments)

Interquartile range = 57 – 30 = 27


Variance
• Average of squared deviations of values from the mean
  - Population variance:
    $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  - Sample variance:
    $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

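To make the difference between the two divisors concrete, here is a minimal Python sketch of both formulas, cross-checked against the standard `statistics` module. The function names and the small data set (the values 1 to 5) are chosen here purely for illustration.

```python
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


data = [1, 2, 3, 4, 5]   # mean 3, squared deviations 4 + 1 + 0 + 1 + 4 = 10

print(population_variance(data))  # 2.0
print(sample_variance(data))      # 2.5

# The standard library implements the same two definitions
assert population_variance(data) == statistics.pvariance(data)
assert sample_variance(data) == statistics.variance(data)
```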
Standard Deviation
• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
  - Population standard deviation:
    $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
  - Sample standard deviation:
    $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Calculation Example:
Sample Standard Deviation
Sample Data (Xi): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16

$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$
$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$
$= \sqrt{\frac{130}{7}} = 4.3095$

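The worked example can be verified step by step in Python. This is a sketch of the same arithmetic, with variable names chosen for illustration and `statistics.stdev` used only as a cross-check.

```python
import math
import statistics

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)

x_bar = sum(sample) / n                               # sample mean
squared_deviations = sum((x - x_bar) ** 2 for x in sample)
s = math.sqrt(squared_deviations / (n - 1))           # sample standard deviation

print(x_bar)               # 16.0
print(squared_deviations)  # 130.0
print(round(s, 4))         # 4.3095

# Cross-check against the standard library's sample standard deviation
assert math.isclose(s, statistics.stdev(sample))
```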
Comparing Standard Deviations
Same mean, but different standard deviations:

[Figure: three dot plots, each on an 11 to 21 axis]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = .9258
Data C: Mean = 15.5, s = 4.57

Coefficient of Variation
• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Is used to compare two or more sets of data measured in different units

Population: $CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$
Sample: $CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$

Comparing Coefficients
of Variation
• Stock A:
  - Average price last year = $50
  - Standard deviation = $5
  $CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \left(\frac{\$5}{\$50}\right) \cdot 100\% = 10\%$
• Stock B:
  - Average price last year = $100
  - Standard deviation = $5
  $CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \left(\frac{\$5}{\$100}\right) \cdot 100\% = 5\%$

Both stocks have the same standard deviation, but stock B is less variable relative to its price

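The stock comparison reduces to one small function. The sketch below uses an illustrative function name and returns the coefficient of variation as a percentage.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation relative to the mean, expressed as a percentage."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # Stock B

print(cv_a)  # 10.0
print(cv_b)  # 5.0
```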
The Empirical Rule
• If the data distribution is bell-shaped, then the interval:
  - μ ± 1σ contains about 68% of the values in the population or the sample

[Figure: bell curve with the central 68% marked between μ – 1σ and μ + 1σ]

The Empirical Rule
  - μ ± 2σ contains about 95% of the values in the population or the sample
  - μ ± 3σ contains about 99.7% of the values in the population or the sample

[Figure: bell curves with 95% marked within μ ± 2σ and 99.7% within μ ± 3σ]
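The 68%, 95%, and 99.7% figures can be recovered numerically if "bell-shaped" is taken to mean exactly normal. That is an assumption made for this sketch, since the rule itself is only approximate for real data. The code uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

# Assumption: the bell-shaped distribution is modeled as a standard normal
standard_normal = NormalDist(mu=0, sigma=1)

# Share of values within k standard deviations of the mean, for k = 1, 2, 3
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(k, round(100 * share, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```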


Tchebysheff’s Theorem
• Regardless of how the data are distributed, at least (1 - 1/k²) of the values will fall within k standard deviations of the mean
• Examples:
  At least             within
  (1 - 1/1²) = 0%      k = 1 (μ ± 1σ)
  (1 - 1/2²) = 75%     k = 2 (μ ± 2σ)
  (1 - 1/3²) = 89%     k = 3 (μ ± 3σ)
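The bound in the table is easy to tabulate for any k. The following Python sketch uses a hypothetical function name and reproduces the three rows above.

```python
def tchebysheff_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    return 1 - 1 / k ** 2


for k in (1, 2, 3):
    print(k, round(100 * tchebysheff_bound(k)))
# 1 0
# 2 75
# 3 89
```

These are guaranteed minimums for any distribution, which is why they are lower than the Empirical Rule's figures for bell-shaped data.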


Standardized Data Values
• A standardized data value refers to the number of standard deviations a value is from the mean
• Standardized data values are sometimes referred to as z-scores


Standardized Population Values

$z = \frac{x - \mu}{\sigma}$

where:
• x = original data value
• μ = population mean
• σ = population standard deviation
• z = standard score (number of standard deviations x is from μ)


Standardized Sample Values

$z = \frac{x - \bar{x}}{s}$

where:
• x = original data value
• $\bar{x}$ = sample mean
• s = sample standard deviation
• z = standard score (number of standard deviations x is from $\bar{x}$)


Standardized Value Example
• IQ scores in a large population have a bell-shaped distribution with mean μ = 100 and standard deviation σ = 15
  Find the standardized score (z-score) for a person with an IQ of 121.

Answer: $z = \frac{x - \mu}{\sigma} = \frac{121 - 100}{15} = 1.4$

Someone with an IQ of 121 is 1.4 standard deviations above the mean

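The IQ calculation can be written as a one-line function. The sketch below uses an illustrative function name; for a sample, the same function would be called with the sample mean and sample standard deviation instead.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


z = z_score(121, mean=100, std_dev=15)
print(z)  # 1.4
```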
Using Microsoft Excel
• Descriptive Statistics are easy to obtain from Microsoft Excel
• Use menu choice: Data / data analysis / descriptive statistics
• Enter details in dialog box


Using Excel
• Select: Data / data analysis / descriptive statistics

Using Excel
(continued)
• Enter dialog box details
• Check box for summary statistics
• Click OK

Excel output

Microsoft Excel descriptive statistics output, using the house price data:

House Prices:
$2,000,000
500,000
300,000
100,000
100,000


Chapter Summary
• Described measures of center and location
  - Mean, median, mode, weighted mean
• Discussed percentiles and quartiles
• Created Box and Whisker Plots
• Illustrated distribution shapes
  - Symmetric, skewed


Chapter Summary
(continued)
• Described measures of variation
  - Range, interquartile range, variance, standard deviation, coefficient of variation
• Discussed Tchebysheff’s Theorem
• Calculated standardized data values