
Statistics for
Business and Economics

Chapter 1
Introduction to Statistics

Copyright © 2013 Pearson Education Ch. 1-1

1.1 Decision Making in an Uncertain Environment

Everyday decisions are based on incomplete information.

Examples:

• Will the job market be strong when I graduate?
• Will the price of Yahoo stock be higher in six months than it is now?
• Will interest rates remain low for the rest of the year if the federal budget deficit is as high as predicted?

Statistics for Business and Economics, 8/e Copyright © 2013 Pearson Education

Decision Making in an Uncertain Environment (continued)

Data are used to assist decision making.

• Statistics is a tool to help process, summarize, analyze, and interpret data

Key Definitions

• A population is the collection of all items of interest or under investigation
  - N represents the population size
• A sample is an observed subset of the population
  - n represents the sample size
• A parameter is a specific characteristic of a population
• A statistic is a specific characteristic of a sample

Population vs. Sample

• Population: values calculated using population data are called parameters
• Sample: values computed from sample data are called statistics

Examples of Populations

• Names of all registered voters in the United States
• Incomes of all families living in Daytona Beach
• Annual returns of all stocks traded on the New York Stock Exchange
• Grade point averages of all the students in your university

Random Sampling

Simple random sampling is a procedure in which

• each member of the population is chosen strictly by chance,
• each member of the population is equally likely to be chosen,
• every possible sample of n objects is equally likely to be chosen

The resulting sample is called a random sample.
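
The difference between a parameter and a statistic, and the idea of a simple random sample, can be sketched in a few lines of Python. The grade point averages below are made-up values for a hypothetical population of N = 10 students, and the fixed seed is only there to make the sketch repeatable; neither comes from the slides.

```python
import random
import statistics

# Hypothetical population (assumption): GPAs of N = 10 students.
population = [3.1, 2.7, 3.8, 3.4, 2.9, 3.6, 3.0, 2.5, 3.9, 3.3]
N = len(population)

# Parameter: a characteristic computed from the whole population.
mu = statistics.fmean(population)

# random.sample draws n distinct members without replacement, and every
# subset of size n is equally likely, which matches simple random sampling.
rng = random.Random(42)  # fixed seed, chosen arbitrarily for repeatability
n = 4
sample = rng.sample(population, n)

# Statistic: the same characteristic computed from the sample only.
x_bar = statistics.fmean(sample)

print(f"N = {N}, population mean (parameter) = {mu:.3f}")
print(f"n = {n}, sample = {sample}, sample mean (statistic) = {x_bar:.3f}")
```

Rerunning with a different seed changes the statistic but never the parameter, which is the point of the distinction.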

Descriptive and Inferential Statistics

Two branches of statistics:

• Descriptive statistics
  - Graphical and numerical procedures to summarize and process data
• Inferential statistics
  - Using data to make predictions, forecasts, and estimates to assist decision making

Descriptive Statistics

• Collect data
  - e.g., Survey
• Present data
  - e.g., Tables and graphs
• Summarize data
  - e.g., Sample mean = $\dfrac{\sum x_i}{n}$

Inferential Statistics

• Estimation
  - e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis testing
  - e.g., Test the claim that the population mean weight is 140 pounds

Inference is the process of drawing conclusions or making decisions about a population based on sample results.

1.2 Classification of Variables

Data

• Categorical (defined categories or groups)
  - Examples: Marital Status; Are you registered to vote?; Eye Color
• Numerical
  - Discrete (counted items). Examples: Number of Children; Defects per hour
  - Continuous (measured characteristics). Examples: Weight; Voltage

1.5 Graphs to Describe Numerical Variables

Numerical Data

• Frequency Distributions and Cumulative Distributions
  - Histogram
  - Ogive
• Stem-and-Leaf Display

Frequency Distributions

What is a Frequency Distribution?

• A frequency distribution is a list or a table
  - containing class groupings (categories or ranges within which the data fall)
  - and the corresponding frequencies with which data fall within each class or category

Why Use Frequency Distributions?

• A frequency distribution is a way to summarize data
• The distribution condenses the raw data into a more useful form
• and allows for a quick visual interpretation of the data

Class Intervals and Class Boundaries

• Each class grouping has the same width
• Determine the width of each interval by

  $w = \text{interval width} = \dfrac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$

• Use at least 5 but no more than 15-20 intervals
• Intervals never overlap
• Round up the interval width to get desirable interval endpoints

Frequency Distribution Example

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature.

Data:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Frequency Distribution Example (continued)

• Sort raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 15)
• Compute interval width: 10 (46/5 = 9.2, then round up)
• Determine interval boundaries: 10 but less than 20, 20 but less than 30, . . . , 50 but less than 60
• Count observations and assign to classes

Frequency Distribution Example (continued)

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Interval               Frequency   Relative Frequency   Percentage
10 but less than 20        3              .15               15
20 but less than 30        6              .30               30
30 but less than 40        5              .25               25
40 but less than 50        4              .20               20
50 but less than 60        2              .10               10
Total                     20             1.00              100
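
The frequency distribution for the 20 temperatures can be reproduced with a short Python sketch. The first boundary (10), the interval width (10) and the number of classes (5) are taken from the example; the variable names are illustrative only.

```python
# Daily high temperatures from the example (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

lower = 10        # lower boundary of the first class
width = 10        # interval width (46 / 5 = 9.2, rounded up to 10)
num_classes = 5   # number of classes chosen in the example

# Count how many observations fall in each "lo but less than hi" class.
counts = [0] * num_classes
for t in temps:
    counts[(t - lower) // width] += 1

n = len(temps)
rel_freqs = [c / n for c in counts]

print(f"{'Interval':<22}{'Frequency':>10}{'Rel. freq.':>12}{'Percent':>9}")
for i, c in enumerate(counts):
    lo = lower + i * width
    label = f"{lo} but less than {lo + width}"
    print(f"{label:<22}{c:>10}{rel_freqs[i]:>12.2f}{100 * rel_freqs[i]:>9.0f}")
print(f"{'Total':<22}{sum(counts):>10}{sum(rel_freqs):>12.2f}{100:>9}")
```

Integer division by the width assigns a value such as 30 to the class "30 but less than 40", so the intervals never overlap.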

Histogram

• A graph of the data in a frequency distribution is called a histogram
• The interval endpoints are shown on the horizontal axis
• The vertical axis is either frequency, relative frequency, or percentage
• Bars of the appropriate heights are used to represent the number of observations within each class

Histogram Example

Interval               Frequency
10 but less than 20        3
20 but less than 30        6
30 but less than 40        5
40 but less than 50        4
50 but less than 60        2

[Figure: Histogram: Daily High Temperature. Horizontal axis: Temperature in Degrees, with interval endpoints from 0 to 70. Vertical axis: Frequency. No gaps between bars.]

Describing Data Numerically

• Central Tendency
  - Arithmetic Mean
  - Median
  - Mode
• Variation
  - Range
  - Interquartile Range
  - Variance
  - Standard Deviation
  - Coefficient of Variation

2.1 Measures of Central Tendency

Overview

• Mean: arithmetic average, $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
• Median: midpoint of ranked values
• Mode: most frequently observed value (if one exists)

Arithmetic Mean

• The arithmetic mean (mean) is the most common measure of central tendency
• For a population of N values:

  $\mu = \dfrac{\sum_{i=1}^{N} x_i}{N} = \dfrac{x_1 + x_2 + \cdots + x_N}{N}$

  where the $x_i$ are the population values and $N$ is the population size
• For a sample of size n:

  $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$

  where the $x_i$ are the observed values and $n$ is the sample size

Arithmetic Mean (continued)

• The most common measure of central tendency
• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)

[Figure: two dot plots on a 0 to 10 axis; Mean = 3 on the left, Mean = 4 on the right]

Left: $\dfrac{1 + 2 + 3 + 4 + 5}{5} = \dfrac{15}{5} = 3$

Right: $\dfrac{1 + 2 + 3 + 4 + 10}{5} = \dfrac{20}{5} = 4$

Median

• In an ordered list, the median is the “middle” number (50% above, 50% below)
• Not affected by extreme values

[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]
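
A small Python sketch can show how an extreme value moves the mean but not the median. It uses the two five-value data sets from the arithmetic mean calculation (1, 2, 3, 4, 5 and 1, 2, 3, 4, 10); the variable names are illustrative.

```python
import statistics

without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]   # same data, largest value replaced by 10

for label, data in (("without outlier", without_outlier),
                    ("with outlier", with_outlier)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label}: mean = {mean}, median = {median}")
```

The mean rises from 3 to 4 when 5 is replaced by 10, while the median stays at 3.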

Finding the Median

• The location of the median:

  $\text{Median position} = \left(\dfrac{n + 1}{2}\right)^{\text{th}}$ position in the ordered data

  - If the number of values is odd, the median is the middle number
  - If the number of values is even, the median is the average of the two middle numbers
• Note that $\dfrac{n + 1}{2}$ is not the value of the median, only the position of the median in the ranked data
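
The position rule can be written out directly. The function below is a minimal sketch, and its name `median_by_position` is made up for illustration; it is applied to the 20 daily high temperatures from the frequency distribution example.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: (n + 1) / 2 is a whole 1-based position.
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even n: (n + 1) / 2 falls halfway between positions n/2 and n/2 + 1,
    # so average the two middle values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

print((len(temps) + 1) / 2)        # position 10.5, not the median itself
print(median_by_position(temps))   # average of the 10th and 11th ordered values
```

With n = 20 the position is 10.5, so the median is the average of the 10th and 11th ordered values, 30 and 32.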

Mode

• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes

[Figure: two dot plots. On the 0 to 14 axis: Mode = 9. On the 0 to 6 axis: No Mode.]

Review Example

• Five houses on a hill by the beach

House Prices:
$2,000,000
   500,000
   300,000
   100,000
   100,000

[Figure: five houses on a hill, labeled $2,000 K, $500 K, $300 K, $100 K, $100 K]

Review Example: Summary Statistics

House Prices:
$2,000,000
   500,000
   300,000
   100,000
   100,000
Sum 3,000,000

• Mean: ($3,000,000/5) = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
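
The three summary statistics for the house prices can be checked with Python's standard `statistics` module. This is only a verification sketch of the numbers in the example.

```python
import statistics

# House prices from the review example, in dollars.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(prices)      # sum of values / number of values
median_price = statistics.median(prices)  # middle value of the ranked data
mode_price = statistics.mode(prices)      # most frequent value

print(f"mean   = ${mean_price:,.0f}")
print(f"median = ${median_price:,.0f}")
print(f"mode   = ${mode_price:,.0f}")
```

The single $2,000,000 house pulls the mean to twice the median, which is why the median is often the preferred summary for prices like these.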

Which measure of location is the “best”?

• Mean is generally used, unless extreme values (outliers) exist
• Then median is often used, since the median is not sensitive to extreme values
  - Example: Median home prices may be reported for a region, since the median is less sensitive to outliers

Shape of a Distribution

• Describes how data are distributed
• Measures of shape
  - Symmetric or skewed

Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean

Geometric Mean

• Geometric mean
  - Used to measure the rate of change of a variable over time

  $\bar{x}_g = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n} = (x_1 \times x_2 \times \cdots \times x_n)^{1/n}$

• Geometric mean rate of return
  - Measures the status of an investment over time

  $\bar{r}_g = (x_1 \times x_2 \times \cdots \times x_n)^{1/n} - 1$

  - Where $x_i$ is the growth factor (1 + rate of return) in time period i

Example

An investment of $100,000 rose to $150,000 at the end of year one and increased to $180,000 at the end of year two:

X1 = $100,000, X2 = $150,000, X3 = $180,000

Year one: 50% increase. Year two: 20% increase.

What is the mean percentage return over time?

Example (continued)

Use the 1-year returns to compute the arithmetic mean and the geometric mean:

• Arithmetic mean rate of return:

  $\bar{X} = \dfrac{(50\%) + (20\%)}{2} = 35\%$ (misleading result)

• Geometric mean rate of return, using the growth factors 1.50 and 1.20:

  $\bar{r}_g = (x_1 \times x_2)^{1/n} - 1 = [(1.50) \times (1.20)]^{1/2} - 1 = (1.80)^{1/2} - 1 = 1.3416 - 1 = 34.16\%$ (accurate result)
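
One way to check a mean rate of return is to compound it: a correct per-year rate r must satisfy 100,000 × (1 + r)² = 180,000. The sketch below works with growth factors (ending value divided by starting value), which is an assumption about how the yearly returns should enter the geometric mean; the variable names are illustrative.

```python
import math

# Investment values at the start, end of year one, end of year two.
values = [100_000, 150_000, 180_000]

# Growth factor per year = ending value / starting value (1.5 and 1.2).
growth = [values[i + 1] / values[i] for i in range(len(values) - 1)]
n = len(growth)

# Arithmetic mean of the yearly returns (50% and 20%).
arithmetic = sum(g - 1 for g in growth) / n

# Geometric mean rate of return: n-th root of the product of the
# growth factors, minus 1.
geometric = math.prod(growth) ** (1 / n) - 1

print(f"arithmetic mean return = {arithmetic:.2%}")
print(f"geometric mean return  = {geometric:.2%}")

# Compounding check: only the geometric rate reproduces the final value.
print(f"compounded at arithmetic rate: {values[0] * (1 + arithmetic) ** n:,.0f}")
print(f"compounded at geometric rate:  {values[0] * (1 + geometric) ** n:,.0f}")
```

Compounding at the arithmetic rate of 35% overshoots the actual ending value of $180,000, while the geometric rate recovers it.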

2.2 Measures of Variability

Variation

• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation

Measures of variation give information on the spread or variability of the data values.

[Figure: two distributions with the same center, different variation]

Population Variance

• Average of squared deviations of values from the mean

Population variance:

  $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Where μ = population mean
      N = population size
      x_i = ith value of the variable x

Sample Variance

• Average (approximately) of squared deviations of values from the mean

Sample variance:

  $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Where $\bar{x}$ = arithmetic mean
      n = sample size
      x_i = ith value of the variable x

Population Standard Deviation

• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
• Population standard deviation:

  $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Sample Standard Deviation

• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
• Sample standard deviation:

  $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Calculation Example: Sample Standard Deviation

Sample data (x_i): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16

$s = \sqrt{\dfrac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$

$= \sqrt{\dfrac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$

$= \sqrt{\dfrac{130}{7}} = 4.3095$

A measure of the “average” scatter around the mean
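
The calculation can be followed step by step in Python and compared with the standard library's `statistics.stdev`, which also divides by n - 1. This is a verification sketch of the example, with illustrative variable names.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)

x_bar = sum(data) / n                             # sample mean
squared_devs = [(x - x_bar) ** 2 for x in data]   # (x_i - x_bar)^2
sum_sq = sum(squared_devs)                        # numerator of s^2

sample_variance = sum_sq / (n - 1)                # divide by n - 1, not n
s = math.sqrt(sample_variance)

print(f"mean = {x_bar}, sum of squared deviations = {sum_sq}")
print(f"s^2 = {sample_variance:.4f}, s = {s:.4f}")
print(f"statistics.stdev gives {statistics.stdev(data):.4f}")
```

Dividing by n instead (as `statistics.pstdev` does) would give the population standard deviation, which is slightly smaller for the same values.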

Measuring variation

[Figure: two distributions, one labeled "Small standard deviation" and one labeled "Large standard deviation"]

Comparing Standard Deviations

Mean = 15.5 for each data set

[Figure: three dot plots (Data A, Data B, Data C), each on an axis from 11 to 21]

• Data A: s = 3.338 (compare to the two cases below)
• Data B: s = 0.926 (values are concentrated near the mean)
• Data C: s = 4.570 (values are dispersed far from the mean)

Advantages of Variance and Standard Deviation

Coefficient of Variation

• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Can be used to compare two or more sets of data measured in different units

Population coefficient of variation:

  $CV = \left(\dfrac{\sigma}{\mu}\right) \cdot 100\%$

Sample coefficient of variation:

  $CV = \left(\dfrac{s}{\bar{x}}\right) \cdot 100\%$

Comparing Coefficient of Variation

• Stock A:
  - Average price last year = $50
  - Standard deviation = $5

  $CV_A = \left(\dfrac{s}{\bar{x}}\right) \cdot 100\% = \dfrac{\$5}{\$50} \cdot 100\% = 10\%$

• Stock B:
  - Average price last year = $100
  - Standard deviation = $5

  $CV_B = \left(\dfrac{s}{\bar{x}}\right) \cdot 100\% = \dfrac{\$5}{\$100} \cdot 100\% = 5\%$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
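
The stock comparison reduces to one small function. The helper name `coefficient_of_variation` is made up for this sketch; the prices and standard deviations are the ones given for stocks A and B.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)    # stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # stock B

print(f"CV_A = {cv_a:.0f}%")
print(f"CV_B = {cv_b:.0f}%")
```

Because the units cancel in the ratio, the same function can compare series measured in different units.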

Chebychev’s Theorem

• For any population with mean μ and standard deviation σ, and k > 1, the percentage of observations that fall within the interval

  $[\mu - k\sigma,\ \mu + k\sigma]$

  is at least

  $100\left[1 - \dfrac{1}{k^2}\right]\%$

Chebychev’s Theorem (continued)

• Regardless of how the data are distributed, at least $(1 - 1/k^2)$ of the values will fall within k standard deviations of the mean (for k > 1)
• Examples:

At least                   within
(1 - 1/1.5^2) = 55.6%      k = 1.5  (μ ± 1.5σ)
(1 - 1/2^2) = 75%          k = 2    (μ ± 2σ)
(1 - 1/3^2) = 89%          k = 3    (μ ± 3σ)
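
The lower bounds in the examples follow directly from the formula 1 - 1/k². A minimal sketch, with a hypothetical helper name `chebyshev_lower_bound`:

```python
def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} within mu ± {k} sigma")
```

These are guaranteed minimums for any distribution; for bell-shaped data the actual shares are usually much higher.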

The Empirical Rule

• If the data distribution is bell-shaped, then the interval:
  - μ ± 1σ contains about 68% of the values in the population or the sample

[Figure: bell-shaped curve centered at μ, with 68% of the area within μ ± 1σ]

The Empirical Rule (continued)

  - μ ± 2σ contains about 95% of the values in the population or the sample
  - μ ± 3σ contains almost all (about 99.7%) of the values in the population or the sample

[Figure: bell-shaped curves with 95% of the area within μ ± 2σ and 99.7% within μ ± 3σ]
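
The 68%, 95% and 99.7% figures can be recovered numerically if "bell-shaped" is taken to mean exactly normal, which is an assumption made for this sketch. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a normal distribution within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within mu ± {k} sigma: {share_within(k):.2%}")
```

Real data that are only roughly bell-shaped will match these shares approximately, not exactly.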

z-Score

A z-score shows the position of a value relative to the mean of the distribution.

• It indicates the number of standard deviations a value is from the mean.
  - A z-score greater than zero indicates that the value is greater than the mean
  - A z-score less than zero indicates that the value is less than the mean
  - A z-score of zero indicates that the value is equal to the mean

z-Score (continued)

• If the data set is the entire population of data and the population mean, μ, and the population standard deviation, σ, are known, then for each value, x_i, the z-score associated with x_i is

  $z = \dfrac{x_i - \mu}{\sigma}$
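
A z-score calculation can be sketched as follows. For illustration only, the eight values 10, 12, 14, 15, 17, 18, 18, 24 are treated as an entire population (an assumption), so the population standard deviation `statistics.pstdev` is used.

```python
import statistics

# Assumption for this sketch: these eight values are the whole population.
population = [10, 12, 14, 15, 17, 18, 18, 24]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation (divides by N)

# z = (x_i - mu) / sigma for every value in the population.
z_scores = [(x - mu) / sigma for x in population]

for x, z in zip(population, z_scores):
    side = "above" if z > 0 else "below" if z < 0 else "equal to"
    print(f"x = {x:>2}: z = {z:+.3f} ({side} the mean)")
```

Values below the mean of 16 get negative z-scores and values above it get positive ones, and the z-scores sum to zero.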

2.4 Measures of Relationships Between Variables

Two measures of the relationship between variables are

• Covariance
• Correlation Coefficient

Covariance

• The population covariance:

  $\text{Cov}(x, y) = \sigma_{xy} = \dfrac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$

• The sample covariance:

  $\text{Cov}(x, y) = s_{xy} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

• Only concerned with the strength of the relationship
• No causal effect is implied

Interpreting Covariance

• Covariance between two variables:

Cov(x,y) > 0: x and y tend to move in the same direction
Cov(x,y) < 0: x and y tend to move in opposite directions
Cov(x,y) = 0: x and y have no linear relationship (zero covariance alone does not establish independence)

Coefficient of Correlation

• Measures the relative strength of the linear relationship between two variables
• Population correlation coefficient:

  $\rho = \dfrac{\text{Cov}(x, y)}{\sigma_x \sigma_y}$

• Sample correlation coefficient:

  $r = \dfrac{\text{Cov}(x, y)}{s_x s_y}$
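
The sample covariance and the sample correlation coefficient can be computed together in a few lines. The paired data below are made up for illustration and do not come from the slides.

```python
import math

# Hypothetical paired observations (assumption).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: sum of cross-products of deviations, divided by n - 1.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations, also with n - 1 in the denominator.
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Sample correlation coefficient: covariance scaled by both standard deviations.
r = s_xy / (s_x * s_y)

print(f"s_xy = {s_xy:.4f}")
print(f"r    = {r:.4f}")
```

Rescaling x or y (for example, converting units) changes the covariance but leaves r unchanged, which is what "unit free" means for the correlation coefficient.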

Features of Correlation Coefficient, r

• Unit free
• Ranges between -1 and 1
• The closer to -1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker any linear relationship

Scatter Plots of Data with Various Correlation Coefficients

[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0 in the top row and r = +1, r = +.3, r = 0 in the bottom row]