Session 5
Sampling Plans and Experimental Designs
If the four objects are identified by the symbols x1, x2, x3,
and x4, there are six distinct pairs that could be selected, as
listed in Table 7.1.
Sample    Observations in Sample
1         x1, x2
2         x1, x3
3         x1, x4
4         x2, x3
5         x2, x4
6         x3, x4
Table 7.1
DEFINITION
If a sample of n elements is selected from a population of
N elements using a sampling plan in which each of the
possible samples has the same chance of selection, then
the sampling is said to be random and the resulting
sample is a simple random sample.
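As a minimal sketch of this definition, the Python snippet below enumerates every possible sample of n = 2 from the N = 4 objects of Table 7.1 and then draws one simple random sample. The helper name all_samples and the use of the standard library's random.sample are illustrative assumptions, not part of the slides.

```python
import itertools
import random

def all_samples(population, n):
    """Return every possible sample of size n (order ignored, no replacement)."""
    return list(itertools.combinations(population, n))

population = ["x1", "x2", "x3", "x4"]
samples = all_samples(population, 2)
print(len(samples))  # number of distinct pairs

# random.sample draws without replacement, so every pair has the same chance
srs = random.sample(population, 2)
print(srs)
```

Because each of the six pairs is equally likely, each has a 1/6 chance of being the selected sample.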
7.2
Statistics and Sampling Distributions
DEFINITION
The sampling distribution of a statistic is the probability
distribution for the possible values of the statistic that
results when random samples of size n are repeatedly
drawn from the population.
Example 7.3
A population consists of N = 5 numbers: 3, 6, 9, 12, 15. If a
random sample of size n = 3 is selected without
replacement, find the sampling distributions for the sample
mean x̄ and the sample median m.
Example 7.3 – Solution
We are sampling from the population shown in Figure 7.1.
It contains five distinct numbers and each is equally likely,
with probability p(x) = 1/5. We can easily find the population
mean and median as
μ = (3 + 6 + 9 + 12 + 15)/5 = 9 and M = 9
There are C(5, 3) = 5!/(3! 2!) = 10 possible random samples of
size n = 3, and each is equally likely, with probability 1/10.
These samples, along with the calculated values of x̄ and m for
each, are listed in Table 7.3.
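The enumeration described here can be sketched in Python. The snippet lists all samples of size 3 drawn without replacement from the population 3, 6, 9, 12, 15 and tallies the sample mean and sample median of each. The helper name sampling_distribution is an assumption made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from statistics import mean, median

population = [3, 6, 9, 12, 15]
samples = list(combinations(population, 3))  # all equally likely samples

def sampling_distribution(statistic):
    """Map each possible value of the statistic to its exact probability."""
    counts = Counter(statistic(s) for s in samples)
    return {value: Fraction(c, len(samples)) for value, c in sorted(counts.items())}

mean_dist = sampling_distribution(mean)
median_dist = sampling_distribution(median)

for s in samples:
    print(s, mean(s), median(s))
print({v: float(p) for v, p in mean_dist.items()})
print({v: float(p) for v, p in median_dist.items()})
```

Exact fractions are used so that the probabilities sum to exactly 1 with no rounding error.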
You will notice that some values of x̄ are more likely than
others because they occur in more than one sample. For
example, P(x̄ = 8) = 2/10 = .2 and P(m = 6) = 3/10 = .3
Using the values in Table 7.3, we can find the sampling
distributions of x̄ and m, shown in Table 7.4 and graphed in
Figure 7.2.
(a)
x̄ p(x̄)
6 .1
7 .1
8 .2
9 .2
10 .2
11 .1
12 .1
(b)
m p(m)
6 .3
9 .4
12 .3
Sampling Distributions for (a) the Sample Mean and (b) the Sample Median
Table 7.4
Figure 7.2
7.3
The Central Limit Theorem and the Sample Mean
The Central Limit Theorem
Under rather general conditions, this theorem states that
sums and means of random samples of measurements
drawn from a population tend to have an approximately
normal distribution.
Consider the number x on the upper face of a single tossed die.
This familiar random variable can take six values, each with
probability 1/6, and its probability distribution is shown in
the figure.
The shape of the distribution is flat—generally called a
discrete uniform distribution—and is symmetric about the
mean μ = 3.5, with a standard deviation σ = 1.71.
When two dice are tossed (a sample of size n = 2), the table
shows the 36 possible outcomes, each with probability 1/36.
When all of the 36 possible averages are consolidated into
a statistical table, the result is the sampling distribution of
x̄, shown in the table and graphed in the figure.
Notice the dramatic difference in the shape of the sampling
distribution. It is now roughly mound-shaped but still
symmetric about the mean μ = 3.5.
For n = 3, the sampling distribution in the figure clearly shows
the mound shape of the normal probability distribution, still
centered at μ = 3.5.
The figure dramatically shows that the distribution of x̄ is
approximately normal for a sample as small as n = 4.
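The dice example can be checked by direct enumeration. The sketch below computes the exact sampling distribution of the average of n fair dice for n = 1 to 4, then reports its mean and standard deviation. The function name dice_mean_distribution is hypothetical, and the text histogram is only a rough stand-in for the figures on the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

def dice_mean_distribution(n):
    """Exact sampling distribution of the average of n fair dice."""
    outcomes = list(product(range(1, 7), repeat=n))  # 6**n equally likely outcomes
    counts = Counter(Fraction(sum(o), n) for o in outcomes)
    return {v: Fraction(c, len(outcomes)) for v, c in sorted(counts.items())}

for n in (1, 2, 3, 4):
    dist = dice_mean_distribution(n)
    mu = sum(v * p for v, p in dist.items())
    var = sum((v - mu) ** 2 * p for v, p in dist.items())
    print(f"n = {n}: mean = {float(mu)}, sd = {sqrt(var):.3f}")
    for v, p in dist.items():
        # one '#' per 1% of probability gives a crude picture of the shape
        print(f"  {float(v):5.2f} {'#' * round(float(p) * 100)}")
```

The mean stays at 3.5 for every n, while the spread shrinks and the shape becomes more mound-like as n grows.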
Central Limit Theorem
If random samples of n observations are drawn from a
nonnormal population with finite mean μ and standard
deviation σ, then, when n is large, the sampling distribution
of the sample mean x̄ is approximately normally
distributed, with mean μ and standard deviation σ/√n,
where n is the sample size.
The Central Limit Theorem can be restated to apply to the
sum of the sample measurements Σxi, which, as n
becomes large, also has an approximately normal
distribution with mean nμ and standard deviation σ√n.
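A small simulation can illustrate the theorem. The sketch below assumes a skewed exponential population with μ = 1 and σ = 1, a choice made here for illustration and not taken from the slides. It checks that sample means of size n = 30 have mean near μ and standard deviation near σ/√n. The function name, seed, and number of repetitions are likewise hypothetical.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

def simulate_sample_means(n, reps=5000, seed=1):
    """Means of `reps` random samples of size n from an exponential population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

n = 30
means = simulate_sample_means(n)
print("mean of sample means:", round(fmean(means), 3), "(mu = 1)")
print("sd of sample means:  ", round(pstdev(means), 3),
      "(sigma / sqrt(n) =", round(1 / sqrt(n), 3), ")")
```

Multiplying each sample mean by n gives the corresponding sample sum, whose mean and standard deviation should be close to nμ and σ√n.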
• When the sampled population is approximately
symmetric, the sampling distribution of x̄ becomes
approximately normal for relatively small values of n.
Remember how rapidly the discrete uniform distribution in
the dice example became mound-shaped (n = 3).
• When the sampled population is skewed, the sample
size n must be larger, with n at least 30 before the
sampling distribution of x̄ becomes approximately normal.
These guidelines suggest that, for many populations, the
sampling distribution of x̄ will be approximately normal for
moderate sample sizes, but as specific applications of the
Central Limit Theorem arise, we will give you the
appropriate sample size n.
The Sampling Distribution of the Sample Mean
Standard Error of the Sample Mean
DEFINITION
The standard deviation of a statistic used as an estimator
of a population parameter is also called the standard error
of the estimator (abbreviated SE) because it refers to the
precision of the estimator. Therefore, the standard
deviation of x̄, given by σ/√n, is referred to as the
standard error of the mean (abbreviated as
SEM, or sometimes just SE).
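The formula can be sketched as a one-line helper. The function name standard_error_of_mean is an assumption, and the inputs reuse values that appear elsewhere in these slides (σ = 1.71 for a die, and σ = 2 with n = 50 from Example 7.6).

```python
from math import sqrt

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean when the population sd sigma is known."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / sqrt(n)

print(standard_error_of_mean(1.71, 4))   # average of four dice
print(standard_error_of_mean(2.0, 50))   # population with sigma = 2, n = 50
```

Quadrupling the sample size halves the standard error, since n enters through its square root.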
7.4
Assessing Normality
• Histogram. Construct a histogram of the data. If the
histogram departs significantly from a bell-shaped
distribution, you can conclude that the data do not have a
normal distribution.
• Box Plot. Construct a box plot and check for outliers. One
or more outliers may indicate that the data do not have a
normal distribution. The box plot also shows whether the
distribution is skewed (left or right).
• Normal Probability Plot. If the histogram is relatively
symmetric and there are no extreme outliers, use a
statistical computer package to generate a normal
probability plot, a scatter plot in which the ordered data
points are plotted against their z-values (not z-scores).
A z-score measures the distance of a value from the mean,
in standard deviations. A z-value is the inverse of the
standard normal cumulative distribution, evaluated at the
cumulative proportion of the ordered data point.
• Normal Distribution. If the data have been drawn from a
normal population, the normal probability plot should be
reasonably close to a straight line and the plotted data
points should not show a systematic departure from this
straight line pattern.
• Nonnormal Distribution. If the normality plot is not
reasonably close to a straight line and/or the plotted
points exhibit some systematic pattern that is not a
straight line, the data are not normal.
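The z-values used in a normal probability plot can be sketched with the standard library. The plotting position (i - 0.5)/n used below is one common convention and is an assumption here, since statistical packages differ on this detail. The data values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def normal_scores(n):
    """z-values for n ordered points: standard normal quantiles of (i - 0.5) / n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def straightness(data):
    """Pearson correlation between the ordered data and their z-values."""
    x = sorted(data)
    z = normal_scores(len(x))
    mx, mz = fmean(x), fmean(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / sqrt(sxx * szz)

data = [9.1, 10.4, 8.7, 11.9, 10.0, 12.3, 7.8, 9.6, 10.9, 11.2]  # hypothetical sample
for z, x in zip(normal_scores(len(data)), sorted(data)):
    print(f"{z:6.3f}  {x}")
print("correlation:", round(straightness(data), 3))
```

A correlation close to 1 corresponds to plotted points lying near a straight line. It is only a rough numerical companion to the visual check and does not replace looking at the plot for systematic curvature.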
Example 7.6
The histogram and normal probability plot in Figure 7.10 were
constructed based upon a sample of n = 50 observations
from a normal population with mean μ = 10 and standard
deviation σ = 2.
Histogram and normal probability plot for data from a normal distribution
Figure 7.10
Comment on the shape of the histogram and whether the
normal probability plot can reasonably be described as a
straight line.
Solution:
The histogram is almost symmetrical and displays the
mound shape (or bell shape) of the normal curve.
The probability plot shows the ordered data points lying
almost in a straight line.
What happens if the data come from a distribution that is not
normal? Let’s investigate some non-normal situations.
Example 7.7
Suppose the data are selected from a discrete uniform
distribution on the integers 1 to 10.
Histogram and normal probability plot for data from a discrete uniform distribution
Figure 7.11
Example 7.7 – Solution
The histogram is far from mound-shaped, and is relatively
flat, characteristic of a discrete uniform distribution, and
hence not normal.
The departure of the normal probability plot from a straight
line reflects the fact that the tails do not taper off like the
normal curve, but rather, both tails are cut off, the lower at
1 and the upper at 10. This is not characteristic of the
normal distribution.
Example 7.9
The data are n = 48 sea-level pressures measured monthly
for 4 years. Discuss the nonnormal aspects of the graphs in
Figure 7.13.
Example 7.9 – Solution
The data are not normal based upon the histogram in
Figure 7.13(a).
References – Additional Readings
• Mendenhall, W., Beaver, R. J., and Beaver, B. M. (2020). Introduction to Probability and
Statistics, 15th Edition, Chapter 7. Cengage Learning. ISBN: 1337554421
Random variables on an interval [a, b]
a and b are the lower limit and upper limit
function = RAND()*(b-a) + a
RAND() returns a random number between 0 and 1, so this gives
a random value between a and b
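The same scaling can be sketched in Python, with random.random() playing the role of RAND(). The function name uniform_between is hypothetical. The standard library's random.uniform(a, b) performs the same calculation.

```python
import random

def uniform_between(a, b, rng=random):
    """Scale a uniform draw on [0, 1) to the interval from a (lower) to b (upper)."""
    if a > b:
        raise ValueError("a must be the lower limit and b the upper limit")
    return rng.random() * (b - a) + a

draws = [uniform_between(2.0, 5.0) for _ in range(5)]
print(draws)
```

Multiplying by (b - a) stretches the unit interval to the required width, and adding a shifts it to start at the lower limit.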