
STATISTICS FOR LIFE AND SOCIAL SCIENCES

CHAPTER 4: SAMPLING DISTRIBUTION

Thach Thanh Tien

Ton Duc Thang University

February 1, 2021

Thach Thanh Tien (Ton Duc Thang University) C01125 - SAMPLING DISTRIBUTION February 1, 2021 1 / 23
Content

1 Introduction

2 Sampling distribution for counts and proportions


The binomial distribution for sample counts
Binomial distributions in statistical sampling
Binomial mean and standard deviation
Sample proportion
Bias and variability
Mean and standard deviation of a sample proportion
Normal approximation for counts and proportions

3 The sampling distribution of a sample mean


The mean and standard deviation of a sample mean
The central limit theorem

Introduction

THE DISTRIBUTION OF A STATISTIC


A statistic from a random sample or randomized experiment is a random variable.
The probability distribution of the statistic is its sampling distribution.

Introduction

POPULATION DISTRIBUTION
The population distribution of a variable is the distribution of its values for all
members of the population. The population distribution is also the probability
distribution of the variable when we choose one individual at random from the
population.

The binomial distribution for sample counts

THE BINOMIAL SETTING


1 There are a fixed number n of observations.
2 The n observations are all independent.
3 Each observation falls into one of just two categories, which for convenience
we call “success” and “failure.”
4 The probability of a success, call it p, is the same for each observation.

Sampling distribution for counts and proportions

BINOMIAL DISTRIBUTIONS
The distribution of the count X of successes in the binomial setting is called the
binomial distribution with parameters n and p. The parameter n is the number of
observations, and p is the probability of a success on any one observation. The
possible values of X are the whole numbers from 0 to n. As an abbreviation, we
say that X is B(n, p).
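
The slide does not give the probabilities themselves. As a minimal sketch, the standard formula P(X = k) = C(n, k) p^k (1 − p)^(n − k) can be evaluated directly. The function name binomial_pmf and the values n = 10, p = 0.3 are illustrative assumptions, not taken from the slides.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a count X that is B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative values only (not from the slides).
n, p = 10, 0.3

# X takes the whole-number values 0, 1, ..., n.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)  # probabilities over all possible values add up to 1

print(f"P(X = 3) = {pmf[3]:.4f}")
print(f"sum of all probabilities = {total:.6f}")
```

The list has n + 1 entries, one for each possible value of X.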

Binomial distributions in statistical sampling

SAMPLING DISTRIBUTION OF A COUNT


A population contains proportion p of successes. If the population is much
larger than the sample, the count X of successes in an SRS of size n has
approximately the binomial distribution B(n, p).
The accuracy of this approximation improves as the size of the population
increases relative to the size of the sample. As a rule of thumb, we will use
the binomial sampling distribution for counts when the population is at least
20 times as large as the sample.
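
The rule of thumb can be checked numerically. An SRS is drawn without replacement, so the exact distribution of the count is hypergeometric (a standard fact, not stated on the slide). The sketch below compares it with B(n, p) for a population 2 times and 20 times the sample size. The helper names and the values n = 20, p = 0.3 are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) when n items are drawn without replacement from N, K of which are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X that is B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def max_abs_diff(N: int, n: int, p: float) -> float:
    """Largest gap between the exact (hypergeometric) and binomial probabilities."""
    K = round(N * p)  # number of successes in the population
    return max(
        abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))
        for k in range(n + 1)
    )

n, p = 20, 0.3  # illustrative values only
small = max_abs_diff(N=40, n=n, p=p)   # population only 2 times the sample
large = max_abs_diff(N=400, n=n, p=p)  # population 20 times the sample

print(f"max difference, N = 40:  {small:.4f}")
print(f"max difference, N = 400: {large:.4f}")
```

The gap shrinks as the population grows relative to the sample, which is what the rule of thumb relies on.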

Binomial mean and standard deviation

BINOMIAL MEAN AND STANDARD DEVIATION


If a count X has the binomial distribution B(n, p), then

$\mu_X = np$ (1)
$\sigma_X = \sqrt{np(1 - p)}$ (2)
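
As a quick check, the mean and standard deviation can be computed from the binomial probabilities and compared with np and the square root of np(1 − p). This is a sketch only; the values n = 20, p = 0.25 are illustrative assumptions.

```python
from math import comb, sqrt

n, p = 20, 0.25  # illustrative values only

# Binomial probabilities P(X = k) for k = 0, ..., n.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean_x = sum(k * pk for k, pk in enumerate(pmf))
var_x = sum((k - mean_x) ** 2 * pk for k, pk in enumerate(pmf))
sd_x = sqrt(var_x)

# Values given by the formulas for the binomial mean and standard deviation.
formula_mean = n * p
formula_sd = sqrt(n * p * (1 - p))

print(f"mean: {mean_x:.6f} (formula {formula_mean:.6f})")
print(f"sd:   {sd_x:.6f} (formula {formula_sd:.6f})")
```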

Sample proportion

Sample proportion
In statistical sampling we often want to estimate the proportion p of “successes”
in a population. Our estimator is the sample proportion of successes:
$\hat{p} = \frac{\text{count of successes in sample}}{\text{size of sample}}$ (3)
$= \frac{X}{n}$ (4)

Bias and variability

Mean and standard deviation of a sample proportion

Mean and standard deviation of a sample proportion


Let p̂ be the sample proportion of successes in an SRS of size n drawn from a
large population having population proportion p of successes. The mean and
standard deviation of p̂ are

$\mu_{\hat{p}} = p$ (5)
$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$ (6)
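
A simulation can illustrate these two facts. The sketch below draws many samples, computes p̂ = X/n for each, and compares the mean and standard deviation of the simulated p̂ values with p and the square root of p(1 − p)/n. It assumes independent draws (a very large population). The function name, seed, and the values n = 100, p = 0.4 are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def sample_proportion(n: int, p: float, rng: random.Random) -> float:
    """Draw n independent success/failure observations and return p-hat = X / n."""
    x = sum(rng.random() < p for _ in range(n))  # count of successes
    return x / n

rng = random.Random(0)  # fixed seed so the run is reproducible
n, p = 100, 0.4         # illustrative values only
reps = 5000             # number of simulated samples

p_hats = [sample_proportion(n, p, rng) for _ in range(reps)]

sim_mean = mean(p_hats)
sim_sd = pstdev(p_hats)
theory_sd = sqrt(p * (1 - p) / n)

print(f"mean of p-hat: {sim_mean:.4f} (theory {p})")
print(f"sd of p-hat:   {sim_sd:.4f} (theory {theory_sd:.4f})")
```

The simulated mean sits close to p, which is what it means for p̂ to be an unbiased estimator of p.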

Normal approximation for counts and proportions

NORMAL APPROXIMATION FOR COUNTS AND PROPORTIONS


Draw an SRS of size n from a large population having population proportion p of
successes. Let X be the count of successes in the sample and p̂ = X/n be the
sample proportion of successes. When n is large, the sampling distributions of
these statistics are approximately Normal:

X is approximately $N\left(np, \sqrt{np(1 - p)}\right)$
p̂ is approximately $N\left(p, \sqrt{\frac{p(1 - p)}{n}}\right)$

As a rule of thumb, we will use this approximation for values of n and p that
satisfy np ≥ 10 and n(1 − p) ≥ 10.
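
To see how the approximation behaves, the sketch below compares an exact binomial probability with the Normal one. The values n = 100, p = 0.3 and the cutoff 35 are illustrative assumptions, and the Normal value is computed without any correction, exactly as the approximation is stated above.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(x: int, n: int, p: float) -> float:
    """Exact P(X <= x) for X that is B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

n, p = 100, 0.3  # illustrative values only

# Rule of thumb: np >= 10 and n(1 - p) >= 10.
rule_ok = n * p >= 10 and n * (1 - p) >= 10

# Normal distribution with mean np and standard deviation sqrt(np(1 - p)).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

exact_prob = binom_cdf(35, n, p)
normal_prob = approx.cdf(35)

print(f"rule of thumb satisfied: {rule_ok}")
print(f"exact  P(X <= 35) = {exact_prob:.4f}")
print(f"Normal P(X <= 35) = {normal_prob:.4f}")
```

The two probabilities agree only roughly, which is why the result is called an approximation.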

The sampling distribution of a sample mean

Facts about sample means


1 Sample means are less variable than individual observations.
2 Sample means are more Normal than individual observations.

The mean and standard deviation of a sample mean

i.i.d. (independent and identically distributed)


Select an SRS of size n from a population, and measure a variable X on each
individual in the sample.
The n measurements are values of n random variables X1 , X2 , ..., Xn .
A single Xi is a measurement on one individual selected at random from the
population and therefore has the distribution of the population.
If the population is large relative to the sample, we can consider
X1 , X2 , ..., Xn to be independent random variables each having the same
distribution.

The mean and standard deviation of a sample mean

MEAN AND STANDARD DEVIATION OF A SAMPLE MEAN


Let x̄ be the mean of an SRS of size n from a population having mean µ and
standard deviation σ. The mean and standard deviation of x̄ are

$\mu_{\bar{x}} = \mu$ (7)
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (8)
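
These formulas can be illustrated by simulation. The sketch below takes the six faces of a fair die as the population, draws many samples of size n with replacement (so the observations are i.i.d., as for a population much larger than the sample), and compares the mean and standard deviation of the sample means with µ and σ/√n. The population, seed, and sample size are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, pstdev

# Hypothetical population: the faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation

rng = random.Random(1)  # fixed seed so the run is reproducible
n, reps = 25, 5000      # sample size and number of simulated samples

# rng.choices draws with replacement, so the n observations are i.i.d.
sample_means = [mean(rng.choices(population, k=n)) for _ in range(reps)]

sim_mean = mean(sample_means)
sim_sd = pstdev(sample_means)
theory_sd = sigma / sqrt(n)

print(f"mean of sample means: {sim_mean:.3f} (theory {mu})")
print(f"sd of sample means:   {sim_sd:.3f} (theory {theory_sd:.3f})")
print(f"sd of one observation: {sigma:.3f}")
```

The sample means vary far less than single observations, in line with the first fact about sample means.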

The central limit theorem

Sampling distribution of a sample mean


If a population has the $N(\mu, \sigma)$ distribution, then the sample mean x̄ of n
independent observations has the $N(\mu, \sigma/\sqrt{n})$ distribution.

The central limit theorem

Central limit theorem


Draw an SRS of size n from any population with mean µ and finite standard
deviation σ. When n is large, the sampling distribution of the sample mean x̄ is
approximately Normal:

x̄ is approximately $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
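
The theorem can be illustrated with a strongly skewed population. The sketch below uses an exponential population with µ = σ = 1, simulates many sample means of size n = 50, and checks what fraction fall within one standard deviation σ/√n of µ. For a Normal distribution that fraction is about 0.683. The choice of population, seed, and sample size are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(2)  # fixed seed so the run is reproducible

# Hypothetical skewed population: exponential with rate 1, so mu = sigma = 1.
mu, sigma = 1.0, 1.0
n, reps = 50, 5000

# Each entry is the mean of n independent observations from the population.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

# Standard deviation of the sample mean according to the theorem.
se = sigma / sqrt(n)

# Fraction of simulated sample means within one standard deviation of mu.
frac_within = sum(abs(m - mu) <= se for m in sample_means) / reps

# The same probability for an exactly Normal distribution.
normal_frac = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(f"simulated fraction within 1 sd: {frac_within:.3f}")
print(f"Normal fraction within 1 sd:    {normal_frac:.3f}")
```

Although single observations from this population are far from Normal, the sample means already behave much like a Normal variable at n = 50.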
