Ba1 7
[Figure: Unbiased sample: a representative sample drawn at random from the entire population of male and female students. Biased sample: an unrepresentative sample consisting of more female students than males.]
Sampling Process begins with a Sampling Frame
[Figure: types of sampling, including systematic and cluster sampling]
Types of Sampling: Non-probability Sampling
In non-probability sampling, items included are chosen without
considering their probability of occurrence.
• In convenience sampling, items are selected based only on the fact that they are
easy, inexpensive, or convenient to sample.
• In judgment sampling, one gets the opinions of pre-selected individuals or
experts in the subject matter.
• In quota sampling, individuals or items are selected on the basis of specific traits
or qualities. A fixed number of units is selected so that all the traits are represented.
• In snowball sampling, research units are selected with the help of other research
units. It is used where potential participants are difficult to identify. Examples include
customers of life insurance, network marketing, and surveys on ‘social evils’.
Types of Sampling: Probability Sampling
[Figure: excerpt from a numbered sampling frame, e.g. 849 Joann P., 850 Paul F.]
Probability Sampling: Stratified Random Sampling
• Divide population into two or more subgroups (called strata) according to some common
characteristic
• A simple random sample is selected from each subgroup, with sample sizes proportional
to strata sizes
[Figure: population divided into 4 strata]
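The two steps above can be sketched in Python. This is only an illustrative sketch: the function name `stratified_sample`, the stratum labels, and the population of 1,000 students are hypothetical and do not come from the slides.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw a proportionally allocated stratified random sample.

    strata: dict mapping a stratum label to the list of its members.
    sample_size: intended total sample size (rounding can shift it slightly).
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for label, members in strata.items():
        # Allocate the sample in proportion to the stratum size.
        k = round(sample_size * len(members) / population_size)
        # Simple random sample (without replacement) inside the stratum.
        sample[label] = rng.sample(members, k)
    return sample

# Hypothetical population of 1,000 students split into 4 strata.
strata = {
    "first_year": list(range(0, 400)),
    "second_year": list(range(400, 700)),
    "third_year": list(range(700, 900)),
    "fourth_year": list(range(900, 1000)),
}
sample = stratified_sample(strata, sample_size=50, seed=1)
sizes = {label: len(chosen) for label, chosen in sample.items()}
print(sizes)
```

With these stratum sizes the allocation is 20, 15, 10 and 5 units, mirroring the 40/30/20/10 percent split of the population.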
Probability Sampling: Systematic Sampling
Example: N = 40, n = 4, so k = N/n = 10
[Figure: frame of 40 items divided into groups of 10, with the first group marked]
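A minimal sketch of systematic sampling with these numbers, assuming the usual procedure of picking a random start within the first group of k items and then taking every k-th item. The function name `systematic_sample` and the numbered frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first group, then every k-th item."""
    rng = random.Random(seed)
    k = len(frame) // n        # interval size, k = N / n
    start = rng.randrange(k)   # random position inside the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 41))     # hypothetical frame: items numbered 1..40 (N = 40)
sample = systematic_sample(frame, n=4, seed=0)
print(sample)
```

Whatever the random start, the four selected items are always exactly k = 10 positions apart.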
Probability Sampling: Cluster Sampling
[Figure: population divided into 16 clusters; randomly selected clusters form the sample]
Probability Sample: Comparing Sampling Methods
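For the sampling distribution of the sample mean $\bar{X}$:
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$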
• Different samples of the same size from the same population will yield different
sample means.
• A measure of the variability in the mean from sample to sample is given by the
Standard Error of the Mean (the standard deviation of the sample means).
• Note that the standard error of the mean decreases as the sample size increases.
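The shrinking standard error can be checked with a small simulation. The sketch below assumes a hypothetical population of 20,000 values and samples with replacement, so that the simulated standard deviation of the sample means can be compared with σ/√n. None of these numbers come from the slides.

```python
import random
import statistics

random.seed(42)

# Hypothetical population; its standard deviation plays the role of sigma.
population = [random.gauss(50, 10) for _ in range(20_000)]
sigma = statistics.pstdev(population)

def simulated_standard_error(n, repetitions=2_000):
    """Standard deviation of many sample means, each from a sample of size n."""
    means = [
        statistics.fmean(random.choices(population, k=n))  # with replacement
        for _ in range(repetitions)
    ]
    return statistics.pstdev(means)

results = {n: (simulated_standard_error(n), sigma / n ** 0.5) for n in (4, 16, 64)}
for n, (simulated, theoretical) in results.items():
    print(f"n={n:3d}  simulated={simulated:.3f}  sigma/sqrt(n)={theoretical:.3f}")
```

Each fourfold increase in n should roughly halve the standard error, in line with the square root in the denominator.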
Central Limit Theorem
As the sample size gets large enough (n↑), the sampling distribution becomes almost
normal regardless of the shape of the population.
Sampling Distribution of $\bar{X}$:
If the Population is not Normal
• We can apply the Central Limit Theorem:
• Even if the population is not normal,
• Sample means from the population will be approximately
normal as long as the sample size is large enough.
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
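A short simulation can illustrate the theorem for a clearly non-normal population. The sketch below assumes an exponential population with mean 1 and standard deviation 1, which is strongly right-skewed. The population, the sample size n = 40 and the helper `skewness` are illustrative choices, not part of the slides.

```python
import random
import statistics

random.seed(7)

def skewness(values):
    """Mean cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

n = 40                 # size of each sample
repetitions = 5_000    # number of samples drawn

# Individual draws from the skewed population (exponential, mu = 1, sigma = 1).
raw_draws = [random.expovariate(1.0) for _ in range(repetitions)]

# Means of many samples of size n from the same population.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(repetitions)
]

mean_of_means = statistics.fmean(sample_means)   # should be near mu = 1
sd_of_means = statistics.pstdev(sample_means)    # should be near 1 / sqrt(40)
skew_raw = skewness(raw_draws)
skew_means = skewness(sample_means)
print(mean_of_means, sd_of_means, skew_raw, skew_means)
```

The individual draws stay heavily skewed, while the sample means are far more symmetric and centred on μ with spread close to σ/√n.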
Sample Mean Sampling Distribution:
If the Population is not Normal
(continued)
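Sampling distribution properties:
• Central Tendency: $\mu_{\bar{X}} = \mu$
• Variation: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
[Figure: population distribution, and sampling distributions of $\bar{X}$ for a smaller and a larger sample size; the sampling distribution becomes normal as n increases]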
How Large is Large Enough?
• For most distributions, n > 30 will give a sampling
distribution that is nearly normal
• For fairly symmetric distributions, n > 15
• For normal population distributions, the sampling
distribution of the mean is always normally distributed
Z-value for Sampling Distribution of Mean
$Z = \dfrac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \dfrac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}}}$
where: $\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
Example
• Suppose a population has mean μ = 8 and standard
deviation σ = 3. Suppose a random sample of size n = 36 is
selected.
• What is the probability that the sample mean is between
7.8 and 8.2?
Example
Solution:
• Even if the population is not normally distributed,
the central limit theorem can be used (n > 30)
• The sampling distribution of $\bar{X}$ is approximately
normal with
mean $\mu_{\bar{X}} = 8$
and standard deviation $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3}{\sqrt{36}} = 0.5$
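Solution (continued):
$P(7.8 < \bar{X} < 8.2) = P\left(\dfrac{7.8 - 8}{0.5} < Z < \dfrac{8.2 - 8}{0.5}\right)$
$= P(-0.4 < Z < 0.4)$
$= 0.6554 - 0.3446$
$= 0.3108$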
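The probability asked for in this example can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) in place of a printed Z table. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Standard error of the mean: sigma / sqrt(n) = 3 / 6 = 0.5
standard_error = sigma / n ** 0.5

# Z-values for the two limits of the interval.
z_low = (7.8 - mu) / standard_error
z_high = (8.2 - mu) / standard_error

# P(7.8 < sample mean < 8.2) under the approximately normal sampling distribution.
sampling_distribution = NormalDist(mu, standard_error)
probability = sampling_distribution.cdf(8.2) - sampling_distribution.cdf(7.8)

print(round(z_low, 2), round(z_high, 2), round(probability, 4))
```

The limits standardize to Z = -0.4 and Z = 0.4, and the probability comes out at about 0.3108.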
• The sample proportion is $\hat{p} = x/n$, where ‘x’ is the number of elements in the sample
that possess the characteristic of interest and ‘n’ is the sample size.
• $0 \le \hat{p} \le 1$
• $\hat{p}$ is approximately normally distributed when n is large
(assuming sampling with replacement from a finite population or without replacement from an
infinite population)
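Sampling Distribution of $\hat{p}$
$\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
$Z = \dfrac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$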
Example
• If the true proportion of voters who support Proposition A is
0.4, what is the probability that a sample of size 200 yields a
sample proportion between 0.40 and 0.45?
• i.e. if p = 0.4 and n = 200, what is $P(0.40 \le \hat{p} \le 0.45)$?
Example
(continued)
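$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.4(1-0.4)}{200}} = 0.03464$
Convert to standardized normal:
$P(0.40 \le \hat{p} \le 0.45) = P\left(\dfrac{0.40 - 0.40}{0.03464} \le Z \le \dfrac{0.45 - 0.40}{0.03464}\right) = P(0 \le Z \le 1.44)$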
Example
(continued)
If p = 0.4 and n = 200, what is $P(0.40 \le \hat{p} \le 0.45)$?
$P(0 \le Z \le 1.44) = 0.9251 - 0.5 = 0.4251$
[Figure: the sampling distribution of $\hat{p}$ standardized to the standard normal distribution]
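The same calculation can be sketched in Python, using `NormalDist` from the standard `statistics` module (Python 3.8 or later) in place of the Z table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.4, 200

# Standard error of the sample proportion: sqrt(p(1 - p) / n)
sigma_p_hat = sqrt(p * (1 - p) / n)

# Standardize both limits of the interval.
z_low = (0.40 - p) / sigma_p_hat
z_high = (0.45 - p) / sigma_p_hat

standard_normal = NormalDist()  # mean 0, standard deviation 1
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(sigma_p_hat, 5), round(z_high, 2), round(probability, 4))
```

Because the code does not round Z to 1.44 before looking it up, the result is about 0.4255 rather than the table-based 0.4251.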
1. The Grocery Manufacturers of America reported that 76% of consumers read the ingredients
listed on a product’s label. Assume the population proportion is p = .76 and a sample of 400
consumers is selected from the population.
(a) Show the sampling distribution of the sample proportion $\hat{p}$, where $\hat{p}$ is the proportion of
the sampled consumers who read the ingredients listed on a product’s label.
(b) What is the probability that the sample proportion will be within ±.03 of the population
proportion?
(c) Answer part (b) for a sample of 750 consumers.
2. The Food Marketing Institute shows that 17% of households spend more than $100 per week
on groceries. Assume the population proportion is p = .17 and a sample of 800 households will be
selected from the population.
(a) Show the sampling distribution of $\hat{p}$, the sample proportion of households spending more
than $100 per week on groceries.
(b) What is the probability that the sample proportion will be within ±.02 of the population
proportion?
(c) Answer part (b) for a sample of 1600 households.
Point Estimation
• Point estimation is the process of using the sample data available to estimate the unknown
value of a parameter. The point estimate obtained from the data will be a single number like
sample mean, sample standard deviation, sample proportion etc.
• Suppose we have an unknown population parameter, such as a population mean μ or a
population proportion p, which we'd like to estimate. For example, suppose we are interested
in estimating:
p = the (unknown) proportion of American college students, 18-24, who have a smart
phone
μ = the (unknown) mean number of days it takes patients to respond to a drug
In either case, we can't possibly survey the entire population. That is, we can neither survey all
American college students between the ages of 18 and 24 nor survey all patients with a
specific disease. So we take a random sample from the population and use the resulting data
to estimate the value of the population parameter.
Of course, we want the estimate to be "good" in some way.
The following table shows a sample of 30 managers of a company out of the total
2500 managers.
• The sample mean annual salary ($\bar{x} = \$51{,}814$) is a point estimate of the population mean
salary ($\mu = \$51{,}800$).
• Similarly, the sample std. dev. ($s = \$3348$) is a point estimate of the population std. dev.
($\sigma = \$4000$).
• The sample proportion of managers who have completed training ($\hat{p} = 0.63$) is a point
estimate of the population proportion ($p = 0.60$).
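As a rough illustration of how such point estimates are computed, the sketch below uses a small hypothetical sample of six managers. The salary figures and training flags are invented for the example and are not the 30-manager data set referred to above.

```python
import statistics

# Hypothetical sample (not the data from the slides).
salaries = [49_100, 53_300, 51_800, 47_600, 55_900, 52_400]
completed_training = [True, False, True, True, False, True]

x_bar = statistics.fmean(salaries)   # point estimate of the population mean
s = statistics.stdev(salaries)       # point estimate of sigma (n - 1 divisor)
p_hat = sum(completed_training) / len(completed_training)  # point estimate of p

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, p_hat = {p_hat:.2f}")
```

Each statistic is a single number computed from the sample alone. The population values μ, σ and p are never used in the calculation.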
Properties of a Point Estimator
Unbiasedness: If the expected value of the sample statistic is equal to the population
parameter being estimated, the sample statistic is said to be an unbiased estimator of
the population parameter.
In discussing the sampling distributions of the sample mean and the sample proportion,
we stated that $E(\bar{x}) = \mu$ and $E(\hat{p}) = p$. Thus, both $\bar{x}$ and $\hat{p}$ are unbiased estimators of
their corresponding population parameters μ and p. In the case of the sample standard
deviation $s$ and the sample variance $s^2$, it can be shown that $E(s^2) = \sigma^2$.
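The claim about the sample variance can be illustrated by simulation. The sketch below assumes a normal population with σ² = 4 and samples of size 5, both chosen only for illustration. It compares the usual sample variance (divisor n − 1) with the version that divides by n.

```python
import random
import statistics

random.seed(3)

mu, sigma = 0.0, 2.0          # assumed population: sigma^2 = 4
n, repetitions = 5, 10_000

unbiased, biased = [], []
for _ in range(repetitions):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    unbiased.append(statistics.variance(sample))   # divides by n - 1
    biased.append(statistics.pvariance(sample))    # divides by n

mean_unbiased = statistics.fmean(unbiased)  # should be close to sigma^2 = 4
mean_biased = statistics.fmean(biased)      # should be close to (n-1)/n * 4 = 3.2
print(round(mean_unbiased, 2), round(mean_biased, 2))
```

Averaged over many samples, the n − 1 version lands near σ², while the divide-by-n version systematically underestimates it.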
Efficiency: The most efficient point estimator is the unbiased estimator with the smallest
variance. The variance measures how widely the estimates are dispersed around the
parameter, so the estimator with the smallest variance varies the least from one sample
to the next.
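One way to see efficiency in practice is to compare two unbiased estimators of the same parameter. The sketch below assumes a normal population, for which both the sample mean and the sample median are unbiased estimators of μ. The choice of the median as the competitor, and all of the numbers, are illustrative assumptions.

```python
import random
import statistics

random.seed(11)

mu, sigma = 100.0, 15.0       # assumed normal population
n, repetitions = 25, 5_000

means, medians = [], []
for _ in range(repetitions):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)      # theory: sigma^2 / n = 9
var_median = statistics.pvariance(medians)  # larger for a normal population
print(round(var_mean, 2), round(var_median, 2))
```

Both estimators centre on μ, but the sample mean varies less from sample to sample, so it is the more efficient of the two for this population.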
Consistency: A third property associated with good point estimators is consistency. A
point estimator is consistent if the values of the point estimator tend to become closer
to the population parameter as the sample size becomes larger. In other words, a large
sample size tends to provide a better point estimate than a small sample size.