Session 5

Probability
IPM – Term II, January 2023
Dr. Landis Conrad Felix Michel

7.1
Sampling Plans and Experimental Designs
The way a sample is selected is called the sampling plan
or experimental design. Knowing the sampling plan used
in a particular situation will often allow you to measure the
reliability or goodness of your inference.
Simple random sampling is a commonly used sampling
plan in which every sample of size n has the same chance
of being selected. For example, suppose you want to select
a sample of size n = 2 from a population containing N = 4
objects.
If the four objects are identified by the symbols x1, x2, x3,
and x4, there are six distinct pairs that could be selected, as
listed in Table 7.1.
Table 7.1 Ways of Selecting a Sample of Size 2 from 4 Objects
Sample   Observations in Sample
1        x1, x2
2        x1, x3
3        x1, x4
4        x2, x3
5        x2, x4
6        x3, x4
If the sample of n = 2 observations is selected so that each
of these six samples has the same chance of selection (one
out of six, or 1/6), then the resulting sample is called a
simple random sample, or just a random sample.
DEFINITION
If a sample of n elements is selected from a population of
N elements using a sampling plan in which each of the
possible samples has the same chance of selection, then
the sampling is said to be random and the resulting
sample is a simple random sample.
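The definition above can be sketched in Python. This is a minimal illustration, assuming the four objects are labeled with the strings x1 to x4 (labels chosen here for illustration only). It lists every possible sample of size 2 and then draws one sample so that each of the six has the same chance of selection.

```python
import itertools
import random

population = ["x1", "x2", "x3", "x4"]  # hypothetical labels for the N = 4 objects
n = 2

# Every distinct sample of size n (order within a sample does not matter).
all_samples = list(itertools.combinations(population, n))
print(len(all_samples), all_samples)

# random.sample draws n distinct elements without replacement, so each of
# the unordered pairs is equally likely to be the selected sample.
rng = random.Random(0)  # seed fixed only so the sketch is repeatable
chosen = tuple(sorted(rng.sample(population, n)))
print(chosen)
```

The count printed first should match the six rows of Table 7.1.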
Remember that nonrandom samples can be described
but cannot be used for making statistical inferences!
7.2
Statistics and Sampling Distributions
When you select a random sample from a population, the
numerical descriptive measures you calculate from the
sample are called statistics.
These statistics vary or change for each different random
sample you select; that is, they are random variables.
The probability distributions for statistics are called
sampling distributions because, in repeated sampling,
they tell us:
• What values of the statistic can occur.
• How often each value occurs.
DEFINITION
The sampling distribution of a statistic is the probability
distribution for the possible values of the statistic that
results when random samples of size n are repeatedly
drawn from the population.
Example 7.3
A population consists of N = 5 numbers: 3, 6, 9, 12, 15. If a
random sample of size n = 3 is selected without
replacement, find the sampling distributions for the sample
mean x̄ and the sample median m.
Example 7.3 – Solution
We are sampling from the population shown in Figure 7.1.
It contains five distinct numbers and each is equally likely,
with probability p(x) = 1/5. We can easily find the population
mean and median as
μ = (3 + 6 + 9 + 12 + 15)/5 = 9 and M = 9
To find the sampling distribution, we need to know what
values of x̄ and m can occur when the sample is taken.
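There are C(5, 3) = 10 possible random samples of size n = 3 and
each is equally likely, with probability 1/10. These samples,
along with the calculated values of x̄ and m for each, are
listed in Table 7.3.
Table 7.3 Values of x̄ and m for Simple Random Sampling when n = 3 and N = 5
Sample   Sample Values   x̄    m
1        3, 6, 9         6    6
2        3, 6, 12        7    6
3        3, 6, 15        8    6
4        3, 9, 12        8    9
5        3, 9, 15        9    9
6        3, 12, 15       10   12
7        6, 9, 12        9    9
8        6, 9, 15        10   9
9        6, 12, 15       11   12
10       9, 12, 15       12   12
You will notice that some values of x̄ are more likely than
others because they occur in more than one sample. For
example, P(x̄ = 8) = 2/10 = .2 and P(m = 6) = 3/10 = .3.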
Using the values in Table 7.3, we can find the sampling
distributions of x̄ and m, shown in Table 7.4 and graphed in
Figure 7.2.
Table 7.4 Sampling Distributions for (a) the Sample Mean and (b) the Sample Median
(a)
x̄    p(x̄)
6    .1
7    .1
8    .2
9    .2
10   .2
11   .1
12   .1
(b)
m    p(m)
6    .3
9    .4
12   .3
Figure 7.2
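The enumeration in Example 7.3 can be checked with a short Python sketch. It is only an illustration under the example's own assumptions (N = 5 values, n = 3, sampling without replacement, all samples equally likely); the variable names are chosen here and are not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from statistics import median

population = [3, 6, 9, 12, 15]
n = 3

# All samples of size n drawn without replacement; each is equally likely.
samples = list(combinations(population, n))
weight = Fraction(1, len(samples))

mean_dist = Counter()    # sampling distribution of the sample mean
median_dist = Counter()  # sampling distribution of the sample median
for s in samples:
    mean_dist[Fraction(sum(s), n)] += weight
    median_dist[median(s)] += weight

for value in sorted(mean_dist):
    print("mean  ", value, mean_dist[value])
for value in sorted(median_dist):
    print("median", value, median_dist[value])
```

Exact fractions are used instead of floats so that the probabilities add to exactly 1.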
7.3
The Central Limit Theorem and the Sample Mean
The Central Limit Theorem
Under rather general conditions, this theorem states that
sums and means of random samples of measurements
drawn from a population tend to have an approximately
normal distribution.
For example, suppose you toss a die n = 1 time. The
random variable x is the number observed on the upper
face.
This familiar random variable can take six values, each with
probability 1/6, and its probability distribution is shown in
Figure 7.3.
Figure 7.3 Probability distribution for x, the number appearing on a single toss of a die
The shape of the distribution is flat—generally called a
discrete uniform distribution—and is symmetric about the
mean μ = 3.5, with a standard deviation σ = 1.71.
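The values μ = 3.5 and σ = 1.71 follow directly from the definitions of the mean and standard deviation of a discrete random variable. A minimal Python check, assuming a fair six-sided die:

```python
import math

faces = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # each face is equally likely

mu = sum(x * p for x in faces)                    # population mean
variance = sum((x - mu) ** 2 * p for x in faces)  # population variance
sigma = math.sqrt(variance)                       # population standard deviation

print(round(mu, 2), round(sigma, 2))
```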
Now, take a sample of size n = 2 from this population; that
is, toss two dice and record the sum of the numbers on the
two upper faces, Σxi = x1 + x2.
Table 7.5(a) shows the 36 possible outcomes, each with
probability 1/36.
Table 7.5(a) Sums of the Upper Faces of Two Dice
The sums are tabulated, and each of the possible sums is
divided by n = 2 to obtain an average.
When all of the 36 possible averages are consolidated into
a statistical table, the result is the sampling distribution of
x̄ shown in Table 7.5(b) and graphed in Figure 7.4.
Table 7.5(b) Sampling Distribution of x̄ for n = 2 dice
Figure 7.4 Sampling distribution of x̄ for n = 2 dice
Notice the dramatic difference in the shape of the sampling
distribution. It is now roughly mound-shaped but still
symmetric about the mean μ = 3.5.
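The two-dice tabulation can be reproduced by brute force. The sketch below, which assumes two fair dice, lists the 36 equally likely outcomes, averages each pair, and consolidates the averages into a distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
outcomes = list(product(faces, repeat=2))  # 36 equally likely (x1, x2) pairs

xbar_dist = Counter()
for x1, x2 in outcomes:
    xbar_dist[Fraction(x1 + x2, 2)] += Fraction(1, 36)

for value in sorted(xbar_dist):
    print(float(value), xbar_dist[value])

# The distribution stays centered at the population mean 3.5.
mean_of_xbar = sum(v * p for v, p in xbar_dist.items())
print(mean_of_xbar)
```

The probabilities rise to a peak at 3.5 and fall off symmetrically, which is the mound shape described above.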
Using a similar procedure, we generated the sampling
distributions of x̄ when n = 3 and n = 4.
For n = 3, the sampling distribution in Figure 7.5 clearly shows
the mound shape of the normal probability distribution,
still centered at μ = 3.5.
Figure 7.5 Sampling distribution of x̄ for n = 3 dice
Notice also that the spread of the distribution is slowly
decreasing as the sample size n increases.
Figure 7.6 dramatically shows that the distribution of x̄ is
approximately normally distributed based on a sample as
small as n = 4.
Figure 7.6 Sampling distribution of x̄ for n = 4 dice
This phenomenon is the result of an important statistical
theorem called the Central Limit Theorem (CLT).
Central Limit Theorem
If random samples of n observations are drawn from a
nonnormal population with finite mean μ and standard
deviation σ, then, when n is large, the sampling distribution
of the sample mean x̄ is approximately normally
distributed, with mean μ and standard deviation σ/√n,
where n = sample size.
The approximation becomes more accurate as n becomes
large.
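The theorem can be illustrated by simulation. The sketch below is not from the slides: it assumes an exponential population with rate 1 (a skewed, nonnormal population with μ = 1 and σ = 1), a sample size of n = 40, and 20,000 repeated samples. These choices are arbitrary and serve only to show that the sample means center on μ with spread close to σ/√n.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable
mu, sigma = 1.0, 1.0     # exponential with rate 1 has mean 1 and sd 1
n = 40                   # assumed sample size
reps = 20_000            # assumed number of repeated samples

sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print("mean of sample means:", statistics.fmean(sample_means))
print("sd of sample means:  ", statistics.stdev(sample_means))
print("sigma / sqrt(n):     ", sigma / math.sqrt(n))
```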
The Central Limit Theorem can be restated to apply to the
sum of the sample measurements Σxi, which, as n
becomes large, also has an approximately normal
distribution with mean nμ and standard deviation σ√n.
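For the dice example, the mean and standard deviation of the sum can be verified exactly. The following sketch, assuming fair dice, builds the distribution of the sum of n dice by repeated convolution and compares its mean and standard deviation with nμ and σ√n. The function names are illustrative.

```python
import math

def sum_distribution(n):
    """Exact distribution of the sum of n fair dice, by repeated convolution."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for total, p in dist.items():
            for face in range(1, 7):
                new[total + face] = new.get(total + face, 0.0) + p / 6
        dist = new
    return dist

def mean_and_sd(dist):
    """Mean and standard deviation of a discrete distribution {value: prob}."""
    m = sum(v * p for v, p in dist.items())
    var = sum((v - m) ** 2 * p for v, p in dist.items())
    return m, math.sqrt(var)

mu, sigma = mean_and_sd(sum_distribution(1))  # one die: 3.5 and about 1.71
for n in (2, 3, 4):
    m, sd = mean_and_sd(sum_distribution(n))
    print(n, round(m, 3), round(n * mu, 3), round(sd, 3), round(sigma * math.sqrt(n), 3))
```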
When the Sample Size Is Large Enough to Use the Central Limit Theorem
• If the sampled population is normal, then the sampling
distribution of x̄ will also be normal, no matter what
sample size you choose.
Note: Σxi ~ N(nμ, σ√n), that is, the sum has mean nμ and
standard deviation σ√n.
• When the sampled population is approximately
symmetric, the sampling distribution of x̄ becomes
approximately normal for relatively small values of n.
Remember how rapidly the discrete uniform distribution in
the dice example became mound-shaped (n = 3).
• When the sampled population is skewed, the sample
size n must be larger, with n at least 30 before the
sampling distribution of x̄ becomes approximately normal.
These guidelines suggest that, for many populations, the
sampling distribution of x̄ will be approximately normal for
moderate sample sizes, but as specific applications of the
Central Limit Theorem arise, we will give you the
appropriate sample size n.
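The sample-size guidelines can be explored with a rough simulation. The sketch below assumes two populations chosen here for illustration: a fair die (symmetric) and an exponential with rate 1 (strongly right-skewed). It estimates the skewness of the sampling distribution of x̄ for several n. A skewness near 0 is consistent with, though not proof of, an approximately normal shape.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_mean_skewness(draw, n, reps, rng):
    """Skewness of the means of `reps` samples of size n, each value from draw(rng)."""
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]
    return skewness(means)

def fair_die(rng):
    return rng.randint(1, 6)      # symmetric population

def skewed(rng):
    return rng.expovariate(1.0)   # right-skewed population (an assumption)

rng = random.Random(7)
for n in (2, 5, 30):
    print(n,
          round(sample_mean_skewness(fair_die, n, 5000, rng), 2),
          round(sample_mean_skewness(skewed, n, 5000, rng), 2))
```

In runs of this kind, the symmetric population should give skewness close to 0 even for small n, while the skewed population needs a larger n before its skewness shrinks.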
The Sampling Distribution of the Sample Mean, x̄
• If a random sample of n measurements is selected from a
population with mean μ and standard deviation σ, the
sampling distribution of the sample mean x̄ will have
mean μ and standard deviation σ/√n.
• If the population has a normal distribution, the sampling
distribution of x̄ will be exactly normally distributed,
regardless of the sample size, n.
• If the population distribution is nonnormal, the sampling
distribution of x̄ will be approximately normally distributed
for large samples (by the Central Limit Theorem).
Conservatively, we require n ≥ 30.
Standard Error of the Sample Mean
DEFINITION
The standard deviation of a statistic used as an estimator
of a population parameter is also called the standard error
of the estimator (abbreviated SE) because it refers to the
precision of the estimator. Therefore, the standard
deviation of x̄, given by σ/√n, is referred to as the
standard error of the mean (abbreviated as
SEM, or sometimes just SE).
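As a small worked illustration of the definition, the standard error σ/√n can be computed for the die population used earlier (σ is about 1.71). The helper name standard_error is chosen here for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sigma / math.sqrt(n)

sigma_die = math.sqrt(35 / 12)  # population sd of one fair die, about 1.71
for n in (1, 4, 16, 64):
    print(n, round(standard_error(sigma_die, n), 3))
```

Each fourfold increase in n halves the standard error, because n enters through its square root.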
7.4
Assessing Normality
• Histogram. Construct a histogram of the data. If the
histogram departs significantly from a bell-shaped
distribution, you can conclude that the data do not have a
normal distribution.
• Box Plot. Construct a box plot and check for outliers. One
or more outliers may indicate that the data do not have a
normal distribution. You can also check whether the
distribution is skewed (left or right).
• Normal Probability Plot. If the histogram is relatively
symmetric and there are no extreme outliers, use a
statistical computer package to generate a normal
probability plot, a scatter plot in which the ordered data
points are plotted against their z-values (not z-scores).
Note: a z-score measures the distance of a value from the
mean. A z-value comes from the inverse of the normal
cumulative distribution, evaluated at the cumulative
proportion of each ordered data point.
• Normal Distribution. If the data have been drawn from a
normal population, the normal probability plot should be
reasonably close to a straight line and the plotted data
points should not show a systematic departure from this
straight line pattern.
• Nonnormal Distribution. If the normality plot is not
reasonably close to a straight line and/or the plotted
points exhibit some systematic pattern that is not a
straight line, the data are not normal.
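The coordinates behind a normal probability plot can be sketched with the Python standard library. This is one common construction, not necessarily the one used by any particular statistical package. The plotting positions (i − 0.375)/(n + 0.25) are an assumption (Blom's formula), and the correlation between z-values and ordered data is used here only as a rough numerical stand-in for judging straightness by eye.

```python
import math
import random
from statistics import NormalDist, fmean

def normal_scores(n):
    """z-values for n ordered points (Blom plotting positions: an assumption)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def probability_plot_points(data):
    """Pairs (z-value, ordered observation) shown on a normal probability plot."""
    ordered = sorted(data)
    return list(zip(normal_scores(len(ordered)), ordered))

def pearson(xs, ys):
    """Pearson correlation; values near 1 suggest a nearly straight plot."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(3)
normal_sample = [rng.gauss(10, 2) for _ in range(50)]      # simulated normal data
uniform_sample = [rng.randint(1, 10) for _ in range(100)]  # simulated discrete uniform data

for name, sample in (("normal", normal_sample), ("discrete uniform", uniform_sample)):
    z, x = zip(*probability_plot_points(sample))
    print(name, round(pearson(z, x), 3))
```

Plotting the returned pairs with any charting tool gives the scatter plot described above. A single correlation does not replace looking at the plot, since systematic curvature in the tails can hide behind a fairly high value.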
Example 7.6
The histogram and normal probability plot in Figure 7.10 were
constructed based upon a sample of n = 50 observations
from a normal population with mean μ = 10 and standard
deviation σ = 2.
Figure 7.10 Histogram and normal probability plot for data from a normal distribution
Comment on the shape of the histogram and whether the
normal probability plot can reasonably be described as a
straight line.
Solution:
The histogram is almost symmetrical and displays the
mound shape (or bell shape) of the normal curve.
The probability plot shows the ordered data points lying
almost in a straight line.
Although not all normal plots based on normal data will
look this good, these are the characteristics that
you look for.
What happens if the data are from a distribution that is not
normal? Let's investigate some nonnormal situations.
Example 7.7
Suppose the data are selected from a discrete uniform
distribution on the integers 1 to 10.
A sample of n = 100 observations produced the histogram
and normal probability plot in Figure 7.11.
Figure 7.11 Histogram and normal probability plot for data from a discrete uniform distribution
How do they differ from those produced by a normal
sample?
The histogram is far from mound-shaped, and is relatively
flat, characteristic of a discrete uniform distribution, and
hence not normal.
The normal probability plot shows a downturn in the lower
area of the plot and an upturn in the upper area of the plot.
This reflects the fact that the tails do not taper off like the
normal curve, but rather, both tails are cut off, the lower at
1 and the upper at 10. This is not characteristic of the
normal distribution.
Example 7.9
The data are n = 48 sea-level pressures measured monthly
for 4 years. Discuss the nonnormal aspects of the graphs in
Figure 7.13.
The data are not normal based upon the histogram in
Figure 7.13(a).
The probability plot in Figure 7.13(b) has the appearance
of a wavy line, first below the centerline, then above, then
below again, and ending above the centerline, indicating a
periodic pattern in the data.
References – Additional Readings
• Mendenhall, W., Beaver, R. J., and Beaver, B. M. (2020). Introduction to Probability and
Statistics, 15th Edition, Chapter 7. Cengage Learning. ISBN: 1337554421
Random variables on [a, b]
a is the lower limit and b is the upper limit.
Function: RAND()*(b - a) + a, where RAND() returns a random number between 0 and 1.
This gives a random value between a and b.
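The same rescaling idea can be written in Python. This is a minimal sketch: random.random() plays the role of a draw between 0 and 1, and the helper name uniform_between is hypothetical. Python's standard library also provides random.uniform(a, b) for the same purpose.

```python
import random

def uniform_between(a, b, rng=random):
    """Random value between a and b, by rescaling a draw from [0, 1)."""
    if a > b:
        raise ValueError("a must be the lower limit and b the upper limit")
    return rng.random() * (b - a) + a

rng = random.Random(1)  # fixed seed so the sketch is repeatable
values = [uniform_between(2.0, 5.0, rng) for _ in range(1000)]
print(min(values), max(values))
```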
