
Statistics for Managers Using Microsoft® Excel
5th Edition

Chapter 5 (Textbook Ch7)
Sampling and Sampling Distributions

1
Learning Objectives

In this chapter, you will learn:


§  To distinguish between different survey
sampling methods
§  The concept of the sampling distribution
§  To compute probabilities related to the sample
mean and the sample proportion
§  The importance of the Central Limit Theorem

2
Why Sample?

§  Selecting a sample is less time-consuming
than selecting every item in the population
(census).
§  Selecting a sample is less costly than selecting
every item in the population.
§  An analysis of a sample is less cumbersome
and more practical than an analysis of the
entire population.

3
Types of Samples

Samples
§  Non-Probability Samples: Judgment, Chunk, Quota, Convenience
§  Probability Samples: Simple Random, Systematic, Stratified, Cluster

4
Types of Samples

§  In a nonprobability sample, items included are
chosen without regard to their probability of
occurrence.
§  In convenience sampling, items are selected based
only on the fact that they are easy, inexpensive, or
convenient to sample.
§  In a judgment sample, you get the opinions of
pre-selected experts in the subject matter.

5
Types of Samples
§  In a probability sample, items in the
sample are chosen on the basis of known
probabilities.

Probability Samples: Simple Random, Systematic, Stratified, Cluster

6
Simple Random Sampling
§  Every individual or item from the frame has
an equal chance of being selected
§  Selection may be with replacement (selected
individual is returned to frame for possible
reselection) or without replacement
(selected individual isn’t returned to the
frame).
§  Samples are obtained from a table of random
numbers or a computer random number
generator.
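
The two selection modes above can be sketched with Python's standard `random` module. This is only an illustration: the function name `simple_random_sample` and the frame of 72 numbered individuals are assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(frame, n, with_replacement=False, seed=None):
    """Draw n items from frame, each item having an equal chance of selection."""
    rng = random.Random(seed)
    if with_replacement:
        # A selected item goes back into the frame, so it can be drawn again.
        return rng.choices(frame, k=n)
    # A selected item is not returned, so every draw is a distinct item.
    return rng.sample(frame, n)

frame = list(range(1, 73))  # hypothetical frame of 72 numbered individuals
without = simple_random_sample(frame, 9, seed=1)
with_repl = simple_random_sample(frame, 9, with_replacement=True, seed=1)
print(without)
print(with_repl)
```

Sampling without replacement can never repeat an individual, while sampling with replacement can.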
7
Systematic Sampling
§  Decide on sample size: n
§  Divide frame of N individuals into groups of k
individuals: k=N/n
§  Randomly select one individual from the 1st
group
§  Select every kth individual thereafter

For example, suppose you were sampling n = 9
individuals from a population of N = 72. The
population would be divided into 9 groups of
k = 72/9 = 8 individuals each.
Randomly select a member from group 1, say
individual 3. Then select every 8th individual
thereafter (i.e., 3, 11, 19, 27, 35, 43, 51, 59, 67).
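
The systematic procedure can be sketched in a few lines of Python. The function name `systematic_sample` is hypothetical, and the sketch assumes N is an exact multiple of n so that k = N/n is a whole number.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the 1-based positions chosen by systematic sampling."""
    k = N // n  # group size; assumes N is an exact multiple of n
    if start is None:
        # Randomly pick one individual from the first group of k.
        start = random.Random(seed).randint(1, k)
    # Take every kth individual after the starting one.
    return [start + i * k for i in range(n)]

print(systematic_sample(72, 9, start=3))
# [3, 11, 19, 27, 35, 43, 51, 59, 67]
```

Fixing `start=3` reproduces the sequence in the example; leaving it unset draws the starting individual at random from positions 1 to k.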
8
Stratified Sampling

§  Divide population into two or more subgroups
(called strata) according to some common
characteristic.
§  A simple random sample is selected from each
subgroup, with sample sizes proportional to strata
sizes.
§  Samples from subgroups are combined into one.
§  This is a common technique when sampling a
population of voters, stratifying across racial or
socio-economic lines.
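
A minimal sketch of proportional allocation follows. The strata names, their sizes (60, 30, 10), and the function name `stratified_sample` are invented for illustration. With sizes that do not divide evenly, rounding can make the combined sample differ from n by an item or so.

```python
import random

def stratified_sample(strata, n, seed=None):
    """strata maps a stratum name to the list of its members."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for name, members in strata.items():
        # Proportional allocation: stratum share of the sample = share of population.
        size = round(n * len(members) / total)
        # Simple random sample (without replacement) inside the stratum.
        sample.extend(rng.sample(members, size))
    return sample

strata = {  # hypothetical strata of sizes 60, 30 and 10
    "A": [f"A{i}" for i in range(60)],
    "B": [f"B{i}" for i in range(30)],
    "C": [f"C{i}" for i in range(10)],
}
sample = stratified_sample(strata, 20, seed=1)
print(sample)
```

With n = 20 the allocation here is 12, 6 and 2 items from strata A, B and C.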

9
Cluster Sampling

§  Population is divided into several “clusters,” each
representative of the population.
§  A simple random sample of clusters is selected.
§  All items in the selected clusters can be used, or items
can be chosen from a cluster using another probability
sampling technique.
§  A common application of cluster sampling involves
election exit polls, where certain election districts are
selected and sampled.
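
The two options described above (use every item in the selected clusters, or sample within them) can be sketched as follows. The district and voter names are made up, and `cluster_sample` is a hypothetical helper, not a library function.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """clusters maps a cluster name to the list of its items."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)  # use all items in the selected cluster
        else:
            # Stage 2: simple random sample inside the selected cluster.
            sample.extend(rng.sample(members, per_cluster))
    return chosen, sample

# Hypothetical frame: 10 election districts with 50 voters each.
clusters = {f"district_{d}": [f"d{d}_voter{v}" for v in range(50)] for d in range(10)}
chosen, sample = cluster_sample(clusters, 3, per_cluster=5, seed=1)
print(chosen, len(sample))
```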

10
Comparing Sampling Methods

§  Simple random sample and Systematic sample
    §  Simple to use
    §  May not be a good representation of the population’s
    underlying characteristics
§  Stratified sample
    §  Ensures representation of individuals across the entire
    population
§  Cluster sample
    §  More cost effective
    §  Less efficient (need larger sample to acquire the same
    level of precision)

11
Evaluating Survey Worthiness

§  What is the purpose of the survey?
§  Were data collected using a non-probability
sample or a probability sample?
§  Coverage error – appropriate frame?
§  Non-response error – follow up
§  Measurement error – good questions elicit
good responses
§  Sampling error – always exists

12
Types of Survey Errors
§  Coverage error or selection bias
    §  Exists if some groups are excluded from the frame and
    have no chance of being selected
§  Non-response error or bias
    §  People who do not respond may be different from those
    who do respond
§  Sampling error
    §  Chance (luck of the draw) variation from sample to
    sample.
§  Measurement error
    §  Due to weaknesses in question design, respondent
    error, and interviewer’s impact on the respondent

13
Sampling Distributions

§  A sampling distribution is a distribution of all of the
possible values of a statistic for a given size sample
selected from a population.
§  For example, suppose you sample 50 students from
your college and compute their mean GPA. If you obtained
many different samples of 50, you would generally compute
a different mean for each sample. We are interested in
the distribution of all the mean GPAs we might
calculate from samples of 50 students.

14
Sampling Distributions

Sampling Distributions
§  Sampling Distributions of the Mean
§  Sampling Distributions of the Proportion

15
Sampling Distributions
Sample Mean Example

§  Suppose your population (simplified) was
four people at your institution.
§  Population size N=4
§  Random variable, X, is age of individuals
§  Values of X: 18, 20, 22, 24 (years)

16
Sampling Distributions
Sample Mean Example (continued)
Summary Measures for the Population Distribution:

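$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$

$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$

[Figure: population distribution P(x) over x = 18, 20, 22, 24 (individuals A, B, C, D); Uniform Distribution]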
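
As a quick check of these two summary measures, here is a short sketch using Python's `statistics` module. `pstdev` is the population standard deviation (it divides by N rather than N − 1), which matches the formula on this slide.

```python
import statistics

ages = [18, 20, 22, 24]          # the four population values of X
mu = statistics.mean(ages)       # population mean
sigma = statistics.pstdev(ages)  # population std. deviation (divides by N)
print(mu, round(sigma, 3))       # mu is 21, sigma is about 2.236
```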

17
Sampling Distributions
Sample Mean Example (continued)
Now consider all possible samples of size n=2
16 possible samples (sampling with replacement):

1st Obs    2nd Observation
           18      20      22      24
18         18,18   18,20   18,22   18,24
20         20,18   20,20   20,22   20,24
22         22,18   22,20   22,22   22,24
24         24,18   24,20   24,22   24,24

16 Sample Means:

1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

18
Sampling Distributions
Sample Mean Example (continued)

Sampling Distribution of All Sample Means

16 Sample Means:

1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

[Figure: Sample Means Distribution, $P(\bar{X})$ plotted against $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24 (no longer uniform)]
19
Sampling Distributions
Sample Mean Example (continued)

Summary Measures of this Sampling Distribution:

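$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 20 + 21 + \cdots + 24}{16} = 21$

$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$

(Here N = 16 is the number of possible samples, not the population size.)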
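
The sampling distribution in this example is small enough to enumerate completely. The sketch below lists all 16 samples of size 2 (with replacement), computes their means, and checks the summary measures against σ/√n. The variable names are illustrative only.

```python
import itertools
import math
import statistics

ages = [18, 20, 22, 24]
n = 2

# All ordered samples of size n drawn with replacement: 4 * 4 = 16 of them.
samples = list(itertools.product(ages, repeat=n))
sample_means = [sum(s) / n for s in samples]

mu_xbar = statistics.mean(sample_means)       # mean of the 16 sample means
sigma_xbar = statistics.pstdev(sample_means)  # their standard deviation
sigma = statistics.pstdev(ages)               # population standard deviation

print(len(samples), mu_xbar, round(sigma_xbar, 2))
print(math.isclose(sigma_xbar, sigma / math.sqrt(n)))
```

The mean of the sample means equals the population mean, and their standard deviation equals σ/√n = 2.236/√2 ≈ 1.58.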
20
Sampling Distributions
Sample Mean Example
(continued)

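Population: N = 4, $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution: n = 2, $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: population distribution P(X) over X = 18, 20, 22, 24 (A, B, C, D) beside the sample means distribution $P(\bar{X})$ over $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]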
21
Sampling Distributions
Standard Error

§  Different samples of the same size from the same
population will yield different sample means
§  A measure of the variability in the mean from sample
to sample is given by the Standard Error of the Mean:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

§  Note that the standard error of the mean decreases as
the sample size increases
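
To see how quickly the standard error shrinks, here is a small sketch. It reuses σ = 2.236 from the age example; the sample sizes 4, 16 and 64 are arbitrary choices for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for samples of size n."""
    return sigma / math.sqrt(n)

for n in (4, 16, 64):
    print(n, round(standard_error(2.236, n), 4))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.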

22
Sampling Distributions
Standard Error: Normal Population
§  If a population is normal with mean µ and
standard deviation σ, the sampling distribution
of $\bar{X}$ is also normally distributed with

$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

(This assumes that sampling is with replacement or
sampling is without replacement from an infinite population)

23
Sampling Distributions
Normal Population

§ 

24
Sampling Distributions
Z-value: Normal Population
§  Z-value for the sampling distribution of $\bar{X}$:

$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

where: $\bar{X}$ = sample mean
µ = population mean
σ = population standard deviation
n = sample size

25
Sampling Distributions
Z-value: Finite Population Correction
§  Apply the Finite Population Correction if:
    §  the sample is large relative to the population
    (n is greater than 5% of N), and
    §  sampling is without replacement

Then $Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$
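
A hedged sketch of the corrected Z-value follows. The helper `z_for_sample_mean` is hypothetical; it assumes sampling without replacement whenever a population size N is supplied, and applies the correction only when n exceeds 5% of N. The population size of 200 in the usage line is invented for illustration.

```python
import math

def z_for_sample_mean(xbar, mu, sigma, n, N=None):
    """Z-value of a sample mean, with optional finite population correction."""
    se = sigma / math.sqrt(n)
    if N is not None and n > 0.05 * N:
        # Finite population correction factor sqrt((N - n) / (N - 1)).
        se *= math.sqrt((N - n) / (N - 1))
    return (xbar - mu) / se

print(z_for_sample_mean(8.25, 8, 3, 36))         # no correction
print(z_for_sample_mean(8.25, 8, 3, 36, N=200))  # n is 18% of N, so corrected
```

The correction factor is below 1, so it shrinks the standard error and moves the Z-value further from zero.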
26
Sampling Distributions
Properties: Normal Population
§  $\mu_{\bar{X}} = \mu$ (i.e., $\bar{X}$ is unbiased)

[Figure: Normal Population Distribution centered at µ, above the Normal Sampling Distribution of $\bar{X}$ (has the same mean), centered at $\mu_{\bar{X}}$]
27
Sampling Distributions
Properties: Normal Population
(continued)

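§  For sampling with replacement:
As n increases, $\sigma_{\bar{X}}$ decreases

[Figure: two sampling distributions of $\bar{X}$ centered at µ, one labeled "Larger sample size" and one labeled "Smaller sample size"]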
28
Sampling Distributions
Non-Normal Population
§  The Central Limit Theorem states that as the sample
size (that is, the number of values in each sample) gets
large enough, the sampling distribution of the mean is
approximately normally distributed. This is true
regardless of the shape of the distribution of the
individual values in the population.
§  Measures of the sampling distribution:

$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
29
Central Limit Theorem

As the sample size gets large enough (n↑), the sampling
distribution becomes almost normal regardless of the
shape of the population.

30
Sampling Distributions
Non-Normal Population
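Sampling distribution properties:
§  Central Tendency: $\mu_{\bar{X}} = \mu$
§  Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ (sampling with replacement)

[Figure: Population Distribution with mean µ, above the Sampling Distribution of $\bar{X}$ (becomes normal as n increases), with curves labeled "Larger sample size" and "Smaller sample size"]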
31
Central Limit Theorem
§  Individuals in population
    §  Highly non-normal distribution
    §  Mean µ, standard deviation σ
§  Averages of n = 3 individuals
    §  Non-normal, but less so
    §  Same mean
    §  Lower std. deviation: $\sigma_{\bar{X}} = \sigma / \sqrt{3}$
§  Averages of n = 10 individuals
    §  Close to normal
    §  Same mean
    §  Lower std. deviation: $\sigma_{\bar{X}} = \sigma / \sqrt{10}$

[Figure: three histograms over the range 0 to 10, for individuals, averages of n = 3, and averages of n = 10]
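
The pattern on this slide can be reproduced with a small simulation. The slides do not say which population was used, so the sketch below assumes an exponential population with mean 1 and standard deviation 1 as a stand-in for a "highly non-normal" distribution. The seed and the number of trials are arbitrary.

```python
import random
import statistics

rng = random.Random(42)

def sample_means(n, trials=5000):
    """Means of `trials` samples of size n from an exponential(1) population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(trials)]

for n in (1, 3, 10, 30):
    means = sample_means(n)
    # Columns: n, mean of the sample means, their std. deviation, sigma / sqrt(n)
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.pstdev(means), 3),
          round(1 / n ** 0.5, 3))
```

Across the rows, the mean of the sample means should stay near 1 while their spread should track σ/√n. A histogram of `means` for each n would show the shape moving toward a normal curve.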

32
Sampling Distributions
Non-Normal Population

§  For most distributions, n > 30 will give a
sampling distribution that is nearly normal

§  For fairly symmetric distributions, n > 15
is usually large enough

§  For normal population distributions, the
sampling distribution of the mean is always
normally distributed

33
Sampling Distributions
Example

§  Suppose a population has mean µ = 8 and standard
deviation σ = 3. Suppose a random sample of size
n = 36 is selected.
§  What is the probability that the sample mean is
between 7.75 and 8.25?
§  Even if the population is not normally distributed,
the central limit theorem can be used (n > 30).
§  So, the distribution of the sample mean is
approximately normal with

$\mu_{\bar{X}} = 8$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$
34
Sampling Distributions
Example
First, compute Z values for both 7.75 and 8.25.

$Z = \frac{7.75 - 8}{3 / \sqrt{36}} = -0.5$

$Z = \frac{8.25 - 8}{3 / \sqrt{36}} = 0.5$

Now, use the cumulative normal table to compute
the correct probability.

$P(7.75 < \bar{X} < 8.25) = P(-0.5 < Z < 0.5) = 0.3830$
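
The same calculation can be checked without a table using `statistics.NormalDist` from the Python standard library. This is a sketch of the arithmetic above, not a replacement for the table method.

```python
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / n ** 0.5           # standard error of the mean: 3 / 6 = 0.5

z_low = (7.75 - mu) / se        # -0.5
z_high = (8.25 - mu) / se       # 0.5

std_normal = NormalDist()       # standard normal: mean 0, std. deviation 1
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(z_low, z_high, round(prob, 4))
```

The result is about 0.3829. The table value of 0.3830 differs only because table entries are rounded to four decimals.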


35
Sampling Distributions
Example
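$P(-0.5 < Z < 0.5) = 2(.5000 - .3085) = 2(.1915) = 0.3830$

[Figure: Population Distribution (µ = 8), from which a sample is drawn; the Sampling Distribution of $\bar{X}$ ($\mu_{\bar{X}} = 8$) with 7.75 and 8.25 marked; and the Standardized Normal Distribution ($\mu_Z = 0$) with -0.5 and 0.5 marked]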

36
Sampling Distributions
of the Proportion

Sampling Distributions
§  Sampling Distributions of the Mean
§  Sampling Distributions of the Proportion

37
Sampling Distributions
The Proportion
§  The proportion of the population having some
characteristic is denoted π.

§  Sample proportion (p) provides an estimate of π:

$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$

§  0 ≤ p ≤ 1
§  p has a binomial distribution (more precisely, the count X = np is binomial),
assuming sampling with replacement from a finite population or without
replacement from an infinite population

38
Sampling Distributions
The Proportion

§ 

39
Sampling Distributions
The Proportion

§  Standard error for the proportion:

$\sigma_p = \sqrt{\frac{\pi (1 - \pi)}{n}}$

§  Z value for the proportion:

$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi (1 - \pi)}{n}}}$

40
Sampling Distributions
The Proportion: Example

§  If the true proportion of voters who support
Proposition A is π = .4, what is the probability
that a sample of size 200 yields a sample
proportion between .40 and .45?

§  In other words, if π = .4 and n = 200,
what is P(.40 ≤ p ≤ .45)?

41
Sampling Distributions
The Proportion: Example

Find $\sigma_p$:

$\sigma_p = \sqrt{\frac{\pi (1 - \pi)}{n}} = \sqrt{\frac{.4 (1 - .4)}{200}} = .03464$

Convert to standardized normal:

$P(.40 \le p \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$

42
Sampling Distributions
The Proportion: Example

Use cumulative normal table:


P(0 ≤ Z ≤ 1.44) = P(Z ≤ 1.44) - 0.5 = .4251

[Figure: Sampling Distribution of p with .40 and .45 marked, standardized to the Standardized Normal Distribution of Z with 0 and 1.44 marked; the area between them is .4251]
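
The Proposition A example can be reproduced in a few lines of Python as a check on the table lookup. The variable names are illustrative; `NormalDist` comes from the standard library.

```python
from statistics import NormalDist

pi_pop, n = 0.4, 200                         # true proportion and sample size
sigma_p = (pi_pop * (1 - pi_pop) / n) ** 0.5 # standard error of the proportion

z_low = (0.40 - pi_pop) / sigma_p            # 0
z_high = (0.45 - pi_pop) / sigma_p           # about 1.44

std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(round(sigma_p, 5), round(z_high, 2), round(prob, 3))
```

The computed probability is close to the table value of .4251. The small difference comes from rounding Z to 1.44 before using the table.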

43
