Lecture 01
SAMPLING DISTRIBUTIONS
POPULATION AND SAMPLING
DISTRIBUTIONS
Population Distribution
Sampling Distribution
Population Distribution
Definition
The population distribution is the
probability distribution of the population
data.
Population Distribution cont.
Suppose there are only five students in an
advanced statistics class and the midterm
scores of these five students are
70 78 80 80 95
Let x denote the score of a student.
Table 7.1 Population Frequency and Relative Frequency Distributions
x     f     Relative Frequency
70    1     1/5 = .20
78    1     1/5 = .20
80    2     2/5 = .40
95    1     1/5 = .20
      N = 5 Sum = 1.00
Table 7.2 Population Probability Distribution
x P (x)
70 .20
78 .20
80 .40
95 .20
ΣP (x) = 1.00
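The step from the raw scores to the probability distribution can be sketched in a few lines of Python. This is only an illustration; the function name `population_distribution` is an assumption made here and is not part of the lecture.

```python
from collections import Counter
from fractions import Fraction

scores = [70, 78, 80, 80, 95]  # the five midterm scores

def population_distribution(values):
    """Map each distinct value to its relative frequency f / N."""
    counts = Counter(values)
    total = len(values)
    return {x: Fraction(f, total) for x, f in sorted(counts.items())}

dist = population_distribution(scores)
for x, p in dist.items():
    print(f"{x:>3}  {float(p):.2f}")
```

Exact fractions are used so that the probabilities add to exactly 1 with no rounding drift.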
Sampling Distribution
Definition
The probability distribution of x̄ is called its
sampling distribution. It lists the various
values that x̄ can assume and the probability
of each value of x̄.
In general, the probability distribution of a
sample statistic is called its sampling
distribution.
Sampling Distribution cont.
Reconsider the population of midterm
scores of five students given in Table 7.1
Consider all possible samples of three
scores each that can be selected, without
replacement, from that population.
The total number of possible samples is
\( {}_5C_3 = \frac{5!}{3!\,(5-3)!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1 \cdot 2 \cdot 1} = 10 \)
Sampling Distribution cont.
Suppose we assign the letters A, B, C, D,
and E to the scores of the five students so
that
A = 70, B = 78, C = 80, D = 80, E = 95
Then, the 10 possible samples of three scores each
are
ABC, ABD, ABE, ACD, ACE,
ADE, BCD, BCE, BDE, CDE
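Table 7.3 All Possible Samples and Their Means When the Sample Size Is 3
Sample   Scores in the Sample   x̄
ABC      70, 78, 80             76.00
ABD      70, 78, 80             76.00
ABE      70, 78, 95             81.00
ACD      70, 80, 80             76.67
ACE      70, 80, 95             81.67
ADE      70, 80, 95             81.67
BCD      78, 80, 80             79.33
BCE      78, 80, 95             84.33
BDE      78, 80, 95             84.33
CDE      80, 80, 95             85.00
Table 7.4 Frequency and Relative Frequency Distributions of x̄ When the Sample Size Is 3
x̄        f     Relative Frequency
76.00    2     2/10 = .20
76.67    1     1/10 = .10
79.33    1     1/10 = .10
81.00    1     1/10 = .10
81.67    2     2/10 = .20
84.33    2     2/10 = .20
85.00    1     1/10 = .10
         Σf = 10   Sum = 1.00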
Table 7.5 Sampling Distribution of x̄ When the Sample Size Is 3
x̄        P(x̄)
76.00    .20
76.67    .10
79.33    .10
81.00    .10
81.67    .20
84.33    .20
85.00    .10
ΣP(x̄) = 1.00
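The sampling distribution of x̄ can be reproduced by listing every sample of three scores and tallying the means. The sketch below is one way to do this; the function name `sampling_distribution_of_mean` is an assumption made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Letters A to E label the five scores, as in the lecture.
scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

def sampling_distribution_of_mean(population, n):
    """Enumerate all samples of size n (without replacement) and
    return {sample mean: probability}."""
    samples = list(combinations(population.values(), n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {m: Fraction(f, len(samples)) for m, f in sorted(counts.items())}

dist = sampling_distribution_of_mean(scores, 3)
for mean, prob in dist.items():
    print(f"{float(mean):6.2f}  {float(prob):.2f}")
```

Because C and D are both 80, `combinations` still treats them as different students, which is why some means occur twice.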
SAMPLING AND
NONSAMPLING ERRORS
Definition
Sampling error is the difference between
the value of a sample statistic and the value
of the corresponding population parameter.
In the case of the mean,
Sampling error \( = \bar{x} - \mu \)
assuming that the sample is random and no
nonsampling error has been made.
SAMPLING AND
NONSAMPLING ERRORS cont.
Definition
The errors that occur in the collection,
recording, and tabulation of data are called
nonsampling errors.
Example 7-1
Reconsider the population of five scores
given in Table 7.1. Suppose one sample of
three scores is selected from this
population, and this sample includes the
scores 70, 80, and 95. Find the sampling
error.
Solution 7-1
\( \mu = \frac{70 + 78 + 80 + 80 + 95}{5} = 80.60 \)
\( \bar{x} = \frac{70 + 80 + 95}{3} = 81.67 \)
Sampling error \( = \bar{x} - \mu = 81.67 - 80.60 = 1.07 \)
SAMPLING AND
NONSAMPLING ERRORS cont.
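Now suppose the second score in this sample is
mistakenly recorded as 82 instead of 80, so that
the sample mean is computed as (70 + 82 + 95)/3 = 82.33.
The difference between this sample mean
and the population mean is
\( \bar{x} - \mu = 82.33 - 80.60 = 1.73 \)
This difference does not represent the
sampling error.
Only 1.07 of this difference is due to the
sampling error.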
SAMPLING AND
NONSAMPLING ERRORS cont.
The remaining portion represents the
nonsampling error.
It is equal to 1.73 − 1.07 = .66.
It occurred because of the error we made in
recording the second score in the sample.
Also,
Nonsampling error = Incorrect x̄ − Correct x̄
= 82.33 − 81.67 = .66
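The split of the total difference into a sampling part and a nonsampling part can be checked numerically. The sketch below assumes the misrecorded sample is 70, 82, 95, which is the only second score consistent with the incorrect mean of 82.33; the variable names are illustrative.

```python
population = [70, 78, 80, 80, 95]
correct_sample = [70, 80, 95]
recorded_sample = [70, 82, 95]  # assumption: second score recorded as 82, not 80

def mean(values):
    return sum(values) / len(values)

mu = mean(population)
sampling_error = mean(correct_sample) - mu
nonsampling_error = mean(recorded_sample) - mean(correct_sample)
total_difference = mean(recorded_sample) - mu

print(f"sampling error    = {sampling_error:.2f}")
print(f"nonsampling error = {nonsampling_error:.2f}")
print(f"total difference  = {total_difference:.2f}")
```

Without intermediate rounding the nonsampling error is 2/3, which prints as .67; the slides obtain .66 because they subtract the already rounded means 82.33 and 81.67.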
Figure 7.1 Sampling and nonsampling errors.
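MEAN AND STANDARD
DEVIATION OF x̄
Definition
The mean and standard deviation of the
sampling distribution of x̄ are called the mean
and standard deviation of x̄ and are denoted by
\( \mu_{\bar{x}} \) and \( \sigma_{\bar{x}} \), respectively.
MEAN AND STANDARD
DEVIATION OF x̄ cont.
Mean of the Sampling Distribution of x̄
The mean of the sampling distribution of x̄
is equal to the mean of the population:
\( \mu_{\bar{x}} = \mu \)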
MEAN AND STANDARD
DEVIATION OF x̄ cont.
Standard Deviation of the Sampling Distribution of x̄
The standard deviation of the sampling
distribution of x̄ is
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
where σ is the standard deviation of the population
and n is the sample size. This formula is used when
n/N ≤ .05, where N is the population size.
Standard Deviation of the
Sampling Distribution of x̄ cont.
If the condition n/N ≤ .05 is not satisfied,
we use the following formula to calculate
\( \sigma_{\bar{x}} \):
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \)
where the factor \( \sqrt{\frac{N - n}{N - 1}} \) is called the finite
population correction factor.
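As a rough check of the two formulas, the sketch below applies the finite population correction only when n/N exceeds .05 and compares the result with a brute-force enumeration for the five-score population (n = 3, N = 5). The helper name `standard_error` is an assumption made for illustration.

```python
import math
from itertools import combinations

def standard_error(sigma, n, N=None):
    """sigma / sqrt(n), multiplied by the finite population
    correction factor when N is given and n / N > .05."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

population = [70, 78, 80, 80, 95]
N, n = len(population), 3
mu = sum(population) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

# Brute force: standard deviation of the means of all 10 possible samples.
means = [sum(s) / n for s in combinations(population, n)]
mean_of_means = sum(means) / len(means)
sd_of_means = math.sqrt(sum((m - mean_of_means) ** 2 for m in means) / len(means))

print(round(standard_error(sigma, n, N), 4), round(sd_of_means, 4))
```

The two printed values agree, and the mean of the sample means equals the population mean of 80.60.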
Two Important Observations
1. The spread of the sampling distribution
of x̄ is smaller than the spread of the
corresponding population distribution,
i.e., \( \sigma_{\bar{x}} < \sigma \).
2. The standard deviation of the sampling
distribution of x̄ decreases as the
sample size increases.
Example 7-2
The mean wage for all 5000 employees
who work at a large company is $17.50
and the standard deviation is $2.90. Let x̄
be the mean wage per hour for a random
sample of certain employees selected
from this company. Find the mean and
standard deviation of x̄ for a sample size
of
a) 30 b) 75 c) 200
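Solution 7-2
a)
\( \mu_{\bar{x}} = \mu = \$17.50 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.90}{\sqrt{30}} = \$.529 \)
Solution 7-2
b)
\( \mu_{\bar{x}} = \mu = \$17.50 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.90}{\sqrt{75}} = \$.335 \)
Solution 7-2
c)
\( \mu_{\bar{x}} = \mu = \$17.50 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.90}{\sqrt{200}} = \$.205 \)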
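The three parts of Example 7-2 differ only in n, so they can be computed in one loop. This is a small sketch; the `results` dictionary is just a convenient container chosen here.

```python
import math

mu, sigma, N = 17.50, 2.90, 5000  # population mean, standard deviation, size

results = {}
for n in (30, 75, 200):
    # n / N is at most .04 here, so sigma / sqrt(n) applies without correction.
    assert n / N <= 0.05
    results[n] = round(sigma / math.sqrt(n), 3)

for n, se in results.items():
    print(f"n = {n:>3}: mean of x-bar = {mu:.2f}, sd of x-bar = {se:.3f}")
```

The mean of x̄ stays at $17.50 for every n, while its standard deviation shrinks as n grows.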
SHAPE OF THE SAMPLING
DISTRIBUTION OF x̄
Sampling from a Normally Distributed
Population
Sampling from a Population That Is Not
Normally Distributed
Sampling From a Normally
Distributed Population
If the population from which the samples
are drawn is normally distributed with mean
μ and standard deviation σ , then the
sampling distribution of the sample mean,
x̄, will also be normally distributed with
the following mean and standard deviation,
irrespective of the sample size:
\( \mu_{\bar{x}} = \mu \) and \( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
Figure 7.2 Population distribution and sampling
distributions of x̄.
Normal distribution
Example 7-3
In a recent SAT, the mean score for all
examinees was 1020. Assume that the
distribution of SAT scores of all examinees is
normal with the mean of 1020 and a standard
deviation of 153. Let x̄ be the mean SAT score
of a random sample of certain examinees.
Calculate the mean and standard deviation of x̄
and describe the shape of its sampling
distribution when the sample size is
a) 16 b) 50 c) 1000
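Solution 7-3
a)
\( \mu_{\bar{x}} = \mu = 1020 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{153}{\sqrt{16}} = 38.250 \)
Figure 7.3 Population distribution (σ = 153) and sampling distribution of x̄ for n = 16 (\( \sigma_{\bar{x}} = 38.250 \)).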
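Solution 7-3
b)
\( \mu_{\bar{x}} = \mu = 1020 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{153}{\sqrt{50}} = 21.637 \)
Figure 7.4 Population distribution (σ = 153) and sampling distribution of x̄ for n = 50 (\( \sigma_{\bar{x}} = 21.637 \)).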
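Solution 7-3
c)
\( \mu_{\bar{x}} = \mu = 1020 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{153}{\sqrt{1000}} = 4.838 \)
Figure 7.5 Population distribution (σ = 153) and sampling distribution of x̄ for n = 1000 (\( \sigma_{\bar{x}} = 4.838 \)), both centered at \( \mu_{\bar{x}} = \mu = 1020 \) on the SAT-score axis.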
Sampling From a Population That
Is Not Normally Distributed
Central Limit Theorem
According to the central limit theorem, for a
large sample size, the sampling distribution of x̄ is
approximately normal, irrespective of the shape of
the population distribution. The mean and standard
deviation of the sampling distribution of x̄ are
\( \mu_{\bar{x}} = \mu \) and \( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
The sample size is usually considered to be large if
n ≥ 30.
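A small simulation can make the central limit theorem concrete. The sketch below assumes a right-skewed population (exponential with mean 1 and standard deviation 1, chosen here only for illustration) and draws 2000 samples of size n = 30 with a fixed seed.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_means(n, num_samples):
    """Means of num_samples random samples of size n drawn from an
    exponential population (mean 1, standard deviation 1)."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(num_samples)]

n = 30
means = sample_means(n, 2000)
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.pstdev(means)
expected_sd = 1.0 / n ** 0.5  # sigma / sqrt(n) with sigma = 1

# Share of sample means within one standard deviation of the population mean.
within_one = sum(abs(m - 1.0) <= expected_sd for m in means) / len(means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean 1.000)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
print(f"share within one sd:  {within_one:.3f}")
```

Even though the population is strongly skewed, the simulated means center near μ, spread close to σ/√n, and roughly two thirds of them fall within one standard deviation of μ, as a normal curve would suggest.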
Figure 7.6 Population distribution and sampling
distributions of x̄.
Example 7-4
The mean rent paid by all tenants in a large city
is $1550 with a standard deviation of $225.
However, the population distribution of rents for
all tenants in this city is skewed to the right.
Calculate the mean and standard deviation of x̄
and describe the shape of its sampling
distribution when the sample size is
a) 30 b) 100
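Solution 7-4
a) Let x̄ be the mean rent paid by a sample of 30
tenants.
\( \mu_{\bar{x}} = \mu = \$1550 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{225}{\sqrt{30}} = \$41.079 \)
Figure 7.7 Population distribution (μ = $1550, σ = $225) and sampling distribution of x̄ for n = 30 (\( \mu_{\bar{x}} = \$1550 \), \( \sigma_{\bar{x}} = \$41.079 \)).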
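Solution 7-4
b) Let x̄ be the mean rent paid by a sample of 100
tenants.
\( \mu_{\bar{x}} = \mu = \$1550 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{225}{\sqrt{100}} = \$22.500 \)
Figure 7.8 Population distribution (μ = $1550, σ = $225) and sampling distribution of x̄ for n = 100 (\( \mu_{\bar{x}} = \$1550 \), \( \sigma_{\bar{x}} = \$22.500 \)).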
APPLICATIONS OF THE
SAMPLING DISTRIBUTION OF x̄
1. If we make all possible samples of the
same (large) size from a population and
calculate the mean for each of these
samples, then about 68.26% of the
sample means will be within one
standard deviation of the population
mean.
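Figure 7.9 \( P(\mu - 1\sigma_{\bar{x}} \le \bar{x} \le \mu + 1\sigma_{\bar{x}}) \)
Shaded area is .6826 (.3413 on each side of μ).
APPLICATIONS OF THE SAMPLING
DISTRIBUTION OF x̄ cont.
2. About 95.44% of the sample means will
be within two standard deviations of the
population mean.
Figure 7.10 \( P(\mu - 2\sigma_{\bar{x}} \le \bar{x} \le \mu + 2\sigma_{\bar{x}}) \)
Shaded area is .9544 (.4772 on each side of μ).
APPLICATIONS OF THE SAMPLING
DISTRIBUTION OF x̄ cont.
3. About 99.74% of the sample means will
be within three standard deviations of the
population mean.
Figure 7.11 \( P(\mu - 3\sigma_{\bar{x}} \le \bar{x} \le \mu + 3\sigma_{\bar{x}}) \)
Shaded area is .9974 (.4987 on each side of μ).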
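The three shaded areas are areas under the standard normal curve between −k and +k for k = 1, 2, 3. A quick sketch with Python's standard library (`statistics.NormalDist`) reproduces them; the dictionary name `within` is an illustrative choice.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The computed areas differ from .6826, .9544, and .9974 in the last digit because the slide values come from doubling four-decimal table entries.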
Example 7-5
Assume that the weights of all packages of
a certain brand of cookies are normally
distributed with a mean of 32 ounces and a
standard deviation of .3 ounce. Find the
probability that the mean weight, x̄, of a
random sample of 20 packages of this
brand of cookies will be between 31.8 and
31.9 ounces.
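Solution 7-5
\( \mu_{\bar{x}} = \mu = 32 \) ounces
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{.3}{\sqrt{20}} = .06708204 \) ounce
z Value for a Value of x̄
The z value for a value of x̄ is calculated
as
\( z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \)
Solution 7-5
For x̄ = 31.8: \( z = \frac{31.8 - 32}{.06708204} = -2.98 \)
For x̄ = 31.9: \( z = \frac{31.9 - 32}{.06708204} = -1.49 \)
P(31.8 < x̄ < 31.9) = P(−2.98 < z < −1.49) = .0667
Figure: the shaded area between x̄ = 31.8 (z = −2.98) and x̄ = 31.9 (z = −1.49), to the left of \( \mu_{\bar{x}} = 32 \) (z = 0), is .0667.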
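The same probability can be obtained without a z table by working directly with the sampling distribution of x̄. The sketch below uses `statistics.NormalDist` from the standard library; variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 32.0, 0.3, 20

se = sigma / math.sqrt(n)              # standard deviation of x-bar
xbar_dist = NormalDist(mu=mu, sigma=se)

z_low = (31.8 - mu) / se
z_high = (31.9 - mu) / se
prob = xbar_dist.cdf(31.9) - xbar_dist.cdf(31.8)

print(f"z values: {z_low:.2f}, {z_high:.2f}")
print(f"P(31.8 < x-bar < 31.9) = {prob:.4f}")
```

The result is close to .0667; any small gap comes from rounding z to two decimals before using the table.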
Example 7-6
According to the College Board’s report, the
average tuition and fees at four-year private
colleges and universities in the United States was
$18,273 for the academic year 2002-2003 (The
Hartford Courant, October 22, 2002). Suppose that
the probability distribution of the 2002-2003 tuition
and fees at all four-year private colleges in the
United States was unknown, but its mean was
$18,273 and the standard deviation was $2100.
Let x̄ be the mean tuition and fees for 2002-2003
for a random sample of 49 four-year private U.S.
colleges. Assume that n/N ≤ .05.
Example 7-6
a) What is the probability that the 2002-2003
mean tuition and fees for this sample was
within $550 of the population mean?
b) What is the probability that the 2002-2003
mean tuition and fees for this sample was
lower than the population mean by $400
or more?
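Solution 7-6
\( \mu_{\bar{x}} = \mu = \$18{,}273 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2100}{\sqrt{49}} = \$300 \)
Solution 7-6
a)
For x̄ = $17,723: \( z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{17{,}723 - 18{,}273}{300} = -1.83 \)
For x̄ = $18,823: \( z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{18{,}823 - 18{,}273}{300} = 1.83 \)
P(17,723 ≤ x̄ ≤ 18,823)
= P(−1.83 ≤ z ≤ 1.83)
= .4664 + .4664
= .9328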
Solution 7-6
a)
Therefore, the probability that the 2002-
2003 mean tuition and fees for this
sample of 49 four-year private U.S.
colleges was within $550 of the
population mean is .9328.
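Figure 7.13 \( P(\$17{,}723 \le \bar{x} \le \$18{,}823) \)
Shaded area is .9328 (.4664 on each side of \( \mu_{\bar{x}} \)).
Solution 7-6
b)
For x̄ = $17,873: \( z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{17{,}873 - 18{,}273}{300} = -1.33 \)
P(x̄ ≤ $17,873) = P(z ≤ −1.33)
= .5 − .4082
= .0918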
Solution 7-6
b)
Therefore, the probability that the 2002-
2003 mean tuition and fees for this
sample of 49 four-year private U.S.
colleges was lower than the population
mean by $400 or more is .0918.
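Both parts of Example 7-6 can be cross-checked numerically. The sketch below relies on the central limit theorem (n = 49 is large) to treat x̄ as approximately normal and uses `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 18273.0, 2100.0, 49

se = sigma / math.sqrt(n)  # 2100 / 7 = 300
xbar_dist = NormalDist(mu=mu, sigma=se)

# a) sample mean within $550 of the population mean
p_within_550 = xbar_dist.cdf(mu + 550) - xbar_dist.cdf(mu - 550)

# b) sample mean lower than the population mean by $400 or more
p_lower_by_400 = xbar_dist.cdf(mu - 400)

print(f"a) {p_within_550:.4f}")
print(f"b) {p_lower_by_400:.4f}")
```

The values land close to .9328 and .0918; they are not identical because the table lookups round z to −1.83, 1.83, and −1.33 first.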
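Figure 7.14 \( P(\bar{x} \le \$17{,}873) \)
The required probability is .0918: the area to the left of x̄ = $17,873 (z = −1.33), which equals .5 − .4082, where \( \mu_{\bar{x}} = \$18{,}273 \) corresponds to z = 0.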
POPULATION AND SAMPLE
PROPORTIONS
The population and sample
proportions, denoted by p and p̂,
respectively, are calculated as
\( p = \frac{X}{N} \) and \( \hat{p} = \frac{x}{n} \)
POPULATION AND SAMPLE
PROPORTIONS cont.
where
N = total number of elements in the population
n = total number of elements in the sample
X = number of elements in the population that
possess a specific characteristic
x = number of elements in the sample that
possess a specific characteristic
Example 7-7
Suppose a total of 789,654 families live in a
city and 563,282 of them own homes. A
sample of 240 families is selected from this
city, and 158 of them own homes. Find the
proportion of families who own homes in
the population and in the sample. Find the
sampling error.
Solution 7-7
\( p = \frac{X}{N} = \frac{563{,}282}{789{,}654} = .71 \)
\( \hat{p} = \frac{x}{n} = \frac{158}{240} = .66 \)
Sampling error \( = \hat{p} - p = .66 - .71 = -.05 \)
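The proportions in Example 7-7 are simple ratios, sketched below with illustrative variable names.

```python
X, N = 563_282, 789_654   # home-owning families and all families in the city
x, n = 158, 240           # home-owning families and all families in the sample

p = X / N                 # population proportion
p_hat = x / n             # sample proportion
sampling_error = p_hat - p

print(f"p = {p:.2f}, p-hat = {p_hat:.2f}")
print(f"sampling error = {sampling_error:.3f}")
```

The slides subtract the rounded proportions (.66 − .71 = −.05); with unrounded proportions the difference is about −.055.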
MEAN, STANDARD DEVIATION, AND
SHAPE OF THE SAMPLING DISTRIBUTION
OF p̂
Sampling Distribution of p̂
Mean and Standard Deviation of p̂
Shape of the Sampling Distribution of p̂
Sampling Distribution of p̂
Definition
The probability distribution of the sample
proportion, p̂ , is called its sampling
distribution. It gives various values that
p̂ can assume and their probabilities.
Example 7-8
Boe Consultant Associates has five
employees. Table 7.6 gives the names of
these five employees and information
concerning their knowledge of statistics.
Table 7.6 Information on the Five Employees of Boe
Consultant Associates
Example 7-8
If we define the population proportion, p,
as the proportion of employees who know
statistics, then
p = 3 / 5 = .60
Example 7-8
Now, suppose we draw all possible
samples of three employees each and
compute the proportion of employees, for
each sample, who know statistics.
Total number of samples \( = {}_5C_3 = \frac{5!}{3!\,(5-3)!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1 \cdot 2 \cdot 1} = 10 \)
Table 7.7 All Possible Samples of Size 3 and the Value
of p̂ for Each Sample
Table 7.8 Frequency and Relative Frequency Distributions of p̂ When the Sample Size Is 3
p̂       f     Relative Frequency
.33     3     3/10 = .30
.67     6     6/10 = .60
1.00    1     1/10 = .10
        Σf = 10   Sum = 1.00
Table 7.9 Sampling Distribution of p̂ When the
Sample Size is 3
p̂ P ( p̂ )
.33 .30
.67 .60
1.00 .10
ΣP ( p̂ ) = 1.00
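The sampling distribution of p̂ can be rebuilt by enumeration. The sketch below uses hypothetical employee labels E1 to E5 (the actual names belong to Table 7.6 and are not reproduced here) and assumes only what the example states: three of the five employees know statistics.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical labels; True means the employee knows statistics (p = 3/5).
knows_statistics = {"E1": True, "E2": True, "E3": True, "E4": False, "E5": False}

def sampling_distribution_of_proportion(flags, n):
    """Enumerate all samples of size n and return {p-hat: probability}."""
    samples = list(combinations(flags.values(), n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {p: Fraction(f, len(samples)) for p, f in sorted(counts.items())}

dist = sampling_distribution_of_proportion(knows_statistics, 3)
mean_p_hat = sum(p * prob for p, prob in dist.items())

for p_hat, prob in dist.items():
    print(f"{float(p_hat):.2f}  {float(prob):.2f}")
print(f"mean of p-hat = {float(mean_p_hat):.2f}")
```

The mean of the enumerated values of p̂ comes out to exactly 3/5, the population proportion.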
Mean and Standard Deviation
of p̂
Mean of the Sample Proportion
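The mean of the sample proportion, p̂,
is denoted by \( \mu_{\hat{p}} \) and is equal to the
population proportion, p. Thus,
\( \mu_{\hat{p}} = p \)
Mean and Standard Deviation
of p̂ cont.
Standard Deviation of the Sample Proportion
The standard deviation of the sample
proportion, p̂, is denoted by \( \sigma_{\hat{p}} \) and is given by
the formula
\( \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \)
where p is the population proportion, q = 1 − p,
and n is the sample size.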
Example 7-9
The National Survey of Student Engagement
shows about 87% of freshmen and seniors rate
their college experience as “good” or “excellent”
(USA TODAY, November 12, 2002). Assume this
result is true for the current population of all
freshmen and seniors. Let p̂ be the proportion
of freshmen and seniors in a random sample of
900 who hold this view. Find the mean and
standard deviation of p̂ and describe the shape
of its sampling distribution.
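Solution 7-9
p = .87 and q = 1 − p = .13
\( \mu_{\hat{p}} = p = .87 \)
\( \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.87)(.13)}{900}} = .011 \)
Solution 7-9
np = 900(.87) = 783 and nq = 900(.13) = 117 are both greater than 5.
Therefore, the sampling distribution of p̂ is
approximately normal with a mean of .87
and a standard deviation of .011, as
shown in Figure 7.15.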
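The numbers in Solution 7-9 can be verified with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative.

```python
import math

p, n = 0.87, 900
q = 1 - p

mean_p_hat = p                      # mean of the sample proportion
sd_p_hat = math.sqrt(p * q / n)     # standard deviation of the sample proportion

# Rule used in the lecture: approximately normal when np and nq both exceed 5.
approximately_normal = n * p > 5 and n * q > 5

print(f"mean = {mean_p_hat:.2f}, sd = {sd_p_hat:.3f}")
print(f"np = {n * p:.0f}, nq = {n * q:.0f}, approximately normal: {approximately_normal}")
```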
Figure 7.15 Sampling distribution of p̂: approximately normal, with \( \mu_{\hat{p}} = .87 \) and \( \sigma_{\hat{p}} = .011 \).