
Topic 6

Sampling distribution

Dr. Rishabh Rathore


Department of Operations Management
IBS Hyderabad
Population & Sample
Population
• A population can be defined by any number of characteristics within a
group that statisticians use to draw conclusions about the subjects in
a study.
• A population can be vague or specific, e.g., the graduates in
Hyderabad or the IT companies in Hyderabad.
Sample
• A sample is a smaller, manageable version of a larger group.
• It is a subset containing the characteristics of a population.
• Samples are used in statistical testing when population sizes are too large
for the test to include all possible members or observations.
Why sampling?
• Lower cost
• Less field time
• More accuracy, i.e., a better job of data collection can be done
• Necessary when it is impossible to study the whole population

             Population                              Sample
Definition   Collection of items being considered    Part or portion of the population chosen for study
Symbols      Population size = N                     Sample size = n
             Population mean = μ                     Sample mean = x̄
             Population standard deviation = σ       Sample standard deviation = s
Example: Selection of Class Representatives
The Sampling Design Process
1. Define the target population
2. Determine the sampling frame
3. Select a sampling technique
4. Determine the sample size
5. Execute the sampling process

Classification of Sampling Techniques
• Nonprobability sampling techniques: convenience sampling, judgmental
sampling, quota sampling, snowball sampling
• Probability sampling techniques: simple random sampling, systematic
sampling, stratified sampling, cluster sampling
Types of Sampling: Non-probability Sampling
In non-probability sampling, items included are chosen without regard
to their probability of occurrence.
• In convenience sampling, items are selected based only on the fact that they
are easy, inexpensive, or convenient to sample.
  • use of students and members of social organizations
  • mall intercept interviews without qualifying the respondents
  • “people on the street” interviews

• In judgment sampling, one gets the opinions of pre-selected individuals or
experts in the subject matter (sample members are chosen only on the basis
of the researcher's knowledge and judgment).
  • purchasing professionals selected in business-to-business marketing research
  • expert witnesses used in court
• In quota sampling, individuals or items are selected on the basis of specific traits
or qualities. A fixed number of units (a quota) is selected for each trait, so that
all the traits are represented.

• In snowball sampling, research units are selected with the help of other
research units: existing participants refer further participants. It is used where
potential participants are difficult to identify, for example customers in life
insurance, network marketing, or surveys on ‘social problems’.
Probability Sampling: Simple Random Sampling
• Every individual or item from the frame has an equal chance of
being selected.

• Selection may be with replacement (the selected individual is
returned to the frame for possible reselection) or without
replacement (the selected individual isn’t returned to the frame).

• Samples are obtained using the lottery method, random
number tables, or computer random number generators.
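The two selection modes can be sketched with Python's standard `random` module. This is only an illustration: the function name `simple_random_sample` and the frame of 100 numbered items are assumptions, not part of the slides.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n items from frame, each item equally likely on every draw."""
    rng = random.Random(seed)
    if replace:
        # With replacement: a selected item goes back and may be drawn again.
        return rng.choices(frame, k=n)
    # Without replacement: every selected item is distinct.
    return rng.sample(frame, n)

frame = list(range(1, 101))  # hypothetical frame of 100 numbered items
without_repl = simple_random_sample(frame, 10, replace=False, seed=1)
with_repl = simple_random_sample(frame, 10, replace=True, seed=1)
print(without_repl)
print(with_repl)
```

Fixing `seed` only makes the draw repeatable; leaving it as `None` gives a fresh sample on each run.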
Stratified Sampling
• Stratified random sampling is a method of selecting a sample in which
researchers first divide a population into smaller subgroups, or strata,
based on shared characteristics of the members and then randomly
select among each stratum to form the final sample.

• These shared characteristics can include gender, age, race, education
level, or income.
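A minimal sketch of the idea follows. It assumes proportional allocation (the same sampling fraction in every stratum), which the slides do not specify. The function `stratified_sample` and the two strata "G" and "PG" are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Split the population into strata, then draw a simple random
    sample of the same fraction from each stratum (an assumption)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 graduates ("G") and 40 postgraduates ("PG").
population = [("G", i) for i in range(60)] + [("PG", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda m: m[0],
                           fraction=0.1, seed=7)
print(sample)
```

With a 10% fraction the sample keeps the 60:40 split of the population, which a simple random sample of 10 would not guarantee.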
Probability Sampling: Systematic Sampling
(Pseudo Random Sampling)
• The sample is chosen by selecting a random starting point and then picking
every i-th element in succession from the sampling frame.

• The sampling interval, i, is determined by dividing the population size N by
the sample size n and rounding to the nearest integer: i = N/n.

• For example, there are 100,000 elements in the population and a sample of
1,000 is desired. In this case the sampling interval, i, is 100. A random number
between 1 and 100 is selected. If, for example, this number is 23, the sample
consists of elements 23, 123, 223, 323, 423, 523, and so on.
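The example above can be reproduced in a few lines. This is a sketch: the function name `systematic_sample` is an assumption, and elements are identified by their 1-based position in the frame.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the 1-based positions chosen by systematic sampling."""
    i = round(N / n)  # sampling interval
    if start is None:
        # Random starting point between 1 and i (inclusive).
        start = random.Random(seed).randint(1, i)
    return list(range(start, N + 1, i))

sample = systematic_sample(100_000, 1_000, start=23)
print(sample[:6])   # first six selected positions
print(len(sample))  # number of selected elements
```

When N/n is not a whole number, the rounded interval can yield slightly more or fewer than n elements.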
Probability Sampling: Cluster Sampling
• Population is divided into several “clusters,” each representative of the
population
• A simple random sample of clusters is selected
• All items in the selected clusters can be used, or items can be chosen
from a cluster using another probability sampling technique
• A common application of cluster sampling involves election exit polls,
where certain election districts are selected and sampled.
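A one-stage version (all items in the selected clusters are used) might look like the sketch below. The function `cluster_sample` and the district and voter names are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select a simple random sample of clusters and keep every item
    in the selected clusters (one-stage cluster sampling)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical frame: 10 election districts with 5 voters each.
clusters = {
    f"district_{d}": [f"voter_{d}_{v}" for v in range(5)]
    for d in range(10)
}
selected = cluster_sample(clusters, 3, seed=3)
for name, voters in selected.items():
    print(name, voters)
```

For the two-stage variant, a second probability sample would be drawn inside each selected cluster instead of keeping all of its items.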
Types of Survey Errors
• Coverage error: exists if some groups are excluded from the frame and
have no chance of being selected

• Non-response error: people who do not respond may be different from
those who do respond

• Sampling error: random differences from sample to sample

• Measurement error: bad or leading question
Developing a Sampling Distribution
• Assume there is a population …
• Population size N = 4 (individuals A, B, C, D)
• Random variable, X, is age of individuals
• Values of X: 18, 20, 22, 24 (years)
Developing a Sampling Distribution
Summary Measures for the Population Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: population distribution P(x) for x = 18, 20, 22, 24 (A, B, C, D); Uniform Distribution]
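These two population measures can be checked with the standard `statistics` module. Note that `pstdev` divides by N (population formula), which matches the formula used here; the variable names are illustrative.

```python
from statistics import fmean, pstdev

ages = [18, 20, 22, 24]  # values of X for A, B, C, D

mu = fmean(ages)      # population mean
sigma = pstdev(ages)  # population standard deviation (divides by N)

print(mu)
print(round(sigma, 3))
```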
Developing a Sampling Distribution
Now consider all possible samples of size n = 2

16 possible samples (sampling with replacement)
1st Obs    2nd Observation
           18      20      22      24
18         18,18   18,20   18,22   18,24
20         20,18   20,20   20,22   20,24
22         22,18   22,20   22,22   22,24
24         24,18   24,20   24,22   24,24

16 Sample Means (statistic)
1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24
Developing a Sampling Distribution (continued)
Sampling Distribution of All Sample Means

x̄       Freq   Relative freq. (Prob.)
18      1      1/16 = 0.0625
19      2      2/16 = 0.125
20      3      3/16 = 0.1875
21      4      4/16 = 0.25
22      3      3/16 = 0.1875
23      2      2/16 = 0.125
24      1      1/16 = 0.0625
Total   16

[Figure: Sample Means Distribution, P(x̄) against x̄ = 18, 19, ..., 24 (no longer uniform)]
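The frequency table can be rebuilt by brute-force enumeration. The sketch below lists all ordered samples of size 2 drawn with replacement and counts how often each sample mean occurs; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

ages = [18, 20, 22, 24]

# All ordered samples of size 2, sampling with replacement.
samples = list(product(ages, repeat=2))
means = [(a + b) / 2 for a, b in samples]

freq = Counter(means)
rel_freq = {m: c / len(samples) for m, c in freq.items()}

for m in sorted(freq):
    print(m, freq[m], rel_freq[m])
```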
Developing a Sampling Distribution
(continued)
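Summary Measures of this Sampling Distribution (the sums run over all 16 sample means):

$$\mu_{\bar{x}} = \frac{\sum \bar{x}_i}{16} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x}_i - \mu_{\bar{x}})^2}{16}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$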
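As a numerical check of these two values, the 16 sample means can be enumerated again and summarized. The comparison with `sigma / sqrt(n)` in the last line is included only as a sanity check for this with-replacement example.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

ages = [18, 20, 22, 24]
n = 2

# Mean of every ordered sample of size n (with replacement).
sample_means = [fmean(s) for s in product(ages, repeat=n)]

mu_xbar = fmean(sample_means)      # mean of the sample means
sigma_xbar = pstdev(sample_means)  # standard deviation of the sample means
sigma = pstdev(ages)               # population standard deviation

print(mu_xbar, round(sigma_xbar, 2), round(sigma / sqrt(n), 2))
```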
Comparing Population Distribution and Sample Means Distribution

Population; N = 4                 Sample Means Distribution; n = 2
μ = 21, σ = 2.236                 μ_x̄ = 21, σ_x̄ = 1.58

[Figure: P(X) for X = 18, 20, 22, 24 (A, B, C, D) beside P(x̄) for x̄ = 18, 19, ..., 24]
Central Limit Theorem
• If a population is normally distributed, then regardless of the sample
size, the sample means of samples taken from the population are also
normally distributed.

• The Central Limit Theorem states that if the sample size is sufficiently
large (as a rule of thumb, more than 30), then regardless of the distribution
of the population, the sample means are approximately normally distributed.

• Mathematically, it can be shown that the mean of the sample means is the
population mean.

• This is written as $\mu_{\bar{x}} = \mu$



Central Limit Theorem
• The standard deviation of the sample means is the standard deviation
of the population divided by the square root of the sample size.
• This is written as

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

• This measure of the variability in the mean from sample to sample is
called the Standard Error of the Mean. (This assumes that sampling
is with replacement, or that sampling is without replacement from an
infinite population.)
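A small simulation can make these statements concrete. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a skewed, clearly non-normal choice made only for illustration), draws many samples of size 36, and compares the mean and spread of the sample means with μ and σ/√n.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(42)

mu, sigma = 1.0, 1.0  # exponential with rate 1 (assumed population)
n = 36                # sample size
num_samples = 5000    # number of repeated samples

sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)

print(round(mean_of_means, 3), "vs mu =", mu)
print(round(sd_of_means, 3), "vs sigma/sqrt(n) =", round(sigma / sqrt(n), 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed.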
Central Limit Theorem
• It is the Central Limit Theorem that allows us to make statistical
inferences about the population based on the sample statistic.

• Even if we are unaware of the distribution of the population, the
Central Limit Theorem allows us to make statistical inferences about
the population based on the sample statistic.
Sample Mean Sampling Distribution:
If the Population is Normal
• If a population is normal with mean μ and standard deviation σ,
the sampling distribution of x̄ is also normally distributed with

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Z-value for Sampling Distribution of Mean
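Z-value for the sampling distribution of x̄:

$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \quad \text{where } \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where: x̄ = sample mean
μ = population mean
σ = population standard deviation
n = sample size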
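The formula translates directly into code. In the sketch below, the helper names `z_value` and `prob_mean_below` and the numbers in the usage lines (μ = 100, σ = 15, n = 25, x̄ = 103) are hypothetical. `statistics.NormalDist` from the standard library supplies the standard normal CDF.

```python
from math import sqrt
from statistics import NormalDist

def z_value(x_bar, mu, sigma, n):
    """Z-value of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))

def prob_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar), assuming the sample mean is normal."""
    return NormalDist().cdf(z_value(x_bar, mu, sigma, n))

# Hypothetical numbers, for illustration only.
z = z_value(103, mu=100, sigma=15, n=25)
p = prob_mean_below(103, mu=100, sigma=15, n=25)
print(z, round(p, 4))
```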
Sample Mean Sampling Distribution:
If the Population is not Normal (continued)

Sampling distribution properties:
• Central tendency: μ_x̄ = μ
• Variation: σ_x̄ = σ/√n

[Figure: Population Distribution (not normal) and Sampling Distribution of x̄ for a smaller and a larger sample size, centered at μ; the sampling distribution becomes normal as n increases]
Sampling Distribution Properties
(continued)

As n increases, σ_x̄ decreases.

[Figure: sampling distributions of x̄ for a smaller and a larger sample size, centered at μ]
Example 1
Suppose a population has mean μ = 8 and standard deviation σ = 3.
Suppose a random sample of size n = 36 is selected.

a) What is the standard error of the sample mean?

b) What is the probability that the sample mean is between 7.8 and 8.2?
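One possible way to work this example numerically is shown below. It assumes n = 36 is large enough for the sample mean to be treated as normal, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# (a) Standard error of the sample mean.
se = sigma / sqrt(n)

# (b) P(7.8 < sample mean < 8.2), treating the sample mean as N(mu, se).
xbar_dist = NormalDist(mu, se)
p_between = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)

print(se)                   # 0.5
print(round(p_between, 4))  # about 0.31
```

The bounds 7.8 and 8.2 correspond to z = -0.4 and z = 0.4, so the same answer can be read from a standard normal table.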
Example 2
Mean expenditure of all the visitors in a restaurant is Rs.2000 with a
std. deviation of Rs.250. A random sample of 40 customers is taken.
(a) What is the standard error of the sample mean?
Find the probability that the
(b) mean expenditure of the sampled customers is more than Rs.1928,
(c) mean expenditure of the sampled customers is less than Rs.1872,
(d) mean expenditure of the sampled customers is between Rs.1950 and Rs.2030.
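A hedged numerical sketch for this example, again treating the sample mean as normal because n = 40 exceeds 30; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2000, 250, 40

se = sigma / sqrt(n)            # (a) standard error of the sample mean
xbar_dist = NormalDist(mu, se)  # approximate distribution of the sample mean

p_b = 1 - xbar_dist.cdf(1928)                   # (b) P(mean > 1928)
p_c = xbar_dist.cdf(1872)                       # (c) P(mean < 1872)
p_d = xbar_dist.cdf(2030) - xbar_dist.cdf(1950) # (d) P(1950 < mean < 2030)

print(round(se, 2))
print(round(p_b, 4), round(p_c, 4), round(p_d, 4))
```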
Example 3

Example 4

The numerical population of grade point averages at a college has mean
2.61 and standard deviation 0.5. If a random sample of size 100 is taken
from the population, what is the probability that the sample mean will
be between 2.51 and 2.71?
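One way to work this, assuming n = 100 is large enough for the Central Limit Theorem to apply:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.5}{\sqrt{100}} = 0.05

Z_1 = \frac{2.51 - 2.61}{0.05} = -2, \qquad Z_2 = \frac{2.71 - 2.61}{0.05} = 2

P(2.51 < \bar{x} < 2.71) = P(-2 < Z < 2) \approx 0.9545
```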
Example 5 - A prototype automotive tire has a design life of 38,500
miles with a standard deviation of 2,500 miles. Five such tires are
manufactured and tested. On the assumption that the actual
population mean is 38,500 miles and the actual population standard
deviation is 2,500 miles, find the probability that the sample mean will
be less than 36,000 miles. Assume that the distribution of lifetimes of
such tires is normal.
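A possible worked solution: because the population is assumed normal, the sample mean is normal even for n = 5.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2500}{\sqrt{5}} \approx 1118.03

Z = \frac{36000 - 38500}{1118.03} \approx -2.236

P(\bar{x} < 36000) = P(Z < -2.236) \approx 0.013
```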

Example 6 - An automobile battery manufacturer claims that its
midgrade battery has a mean life of 50 months with a standard
deviation of 6 months. Suppose the distribution of battery lives of this
particular brand is approximately normal.
a) On the assumption that the manufacturer’s claims are true, find the
probability that a randomly selected battery of this type will last less than
48 months.
b) On the same assumption, find the probability that the mean of a random
sample of 36 such batteries will be less than 48 months.
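The contrast between parts (a) and (b) can be sketched as follows: part (a) uses the population standard deviation, while part (b) uses the standard error of the mean. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50, 6

# (a) A single battery: lifetime ~ N(mu, sigma).
p_single = NormalDist(mu, sigma).cdf(48)

# (b) Mean of n = 36 batteries: sample mean ~ N(mu, sigma / sqrt(n)).
n = 36
se = sigma / sqrt(n)
p_mean = NormalDist(mu, se).cdf(48)

print(round(p_single, 4))  # about 0.37
print(round(p_mean, 4))    # about 0.023
```

A single battery falls below 48 months fairly often, but a sample mean of 36 batteries that low would be unusual if the manufacturer's claim were true.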
