K1.3 CLT Sampling


The Sampling Distribution and Central Limit Theorem
Sampling Distributions

• Sampling distribution of the mean
• Sampling distribution of the proportion

WEEK2-2
Sampling Distributions

• A sampling distribution is the distribution of all possible values of a statistic for samples of a given size selected from a population.

Developing a Sampling Distribution

• Assume there is a population …
• Population size N = 4 (individuals A, B, C, D)
• Random variable, X, is the age of individuals
• Values of X: 18, 20, 22, 24 (years)

Developing a Sampling Distribution
(continued)

Summary Measures for the Population Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: population distribution, P(x) against x for x = 18, 20, 22, 24 (individuals A, B, C, D); uniform distribution]

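The two summary measures above can be checked numerically. The following is a minimal sketch using only the four ages from the slide; the variable names are illustrative.

```python
import math

# Ages of the four individuals A, B, C, D from the slide
ages = [18, 20, 22, 24]
N = len(ages)

# Population mean: sum of the values divided by N
mu = sum(ages) / N

# Population standard deviation: divide by N (not N - 1),
# because these four values are the whole population
sigma = math.sqrt(sum((x - mu) ** 2 for x in ages) / N)

print(mu)               # 21.0
print(round(sigma, 3))  # 2.236
```
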
Developing a Sampling Distribution
(continued)

Now consider all possible samples of size n = 2.

16 possible samples (sampling with replacement):

1st Obs    2nd Observation
           18       20       22       24
18         18,18    18,20    18,22    18,24
20         20,18    20,20    20,22    20,24
22         22,18    22,20    22,22    22,24
24         24,18    24,20    24,22    24,24

16 Sample Means:

1st Obs    2nd Observation
           18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24

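The 16 samples and their means can also be enumerated programmatically. This is a sketch, assuming ordered pairs drawn with replacement as on the slide; `itertools.product` is just one convenient way to list them.

```python
from itertools import product

ages = [18, 20, 22, 24]

# All ordered pairs (1st observation, 2nd observation), sampling with replacement
samples = list(product(ages, repeat=2))

# Mean of each sample of size n = 2
sample_means = [(a + b) / 2 for a, b in samples]

print(len(samples))       # 16
print(samples[:4])        # [(18, 18), (18, 20), (18, 22), (18, 24)]
print(sample_means[:4])   # [18.0, 19.0, 20.0, 21.0]
```
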
Developing a Sampling Distribution
(continued)

Sampling Distribution of All Sample Means

16 Sample Means:

1st Obs    2nd Observation
           18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24

[Figure: sample means distribution, $P(\bar{X})$ against $\bar{X}$ for $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24; no longer uniform]

Developing a Sampling Distribution
(continued)

Summary Measures of this Sampling Distribution:

Mean of the sample means

$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21$$

Standard Error of the Means

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

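As a rough check of these summary measures, the sketch below rebuilds the 16 sample means, tabulates their relative frequencies, and compares the resulting standard error with σ/√n. It assumes the same four ages and sampling with replacement; all names are illustrative.

```python
import math
from collections import Counter
from itertools import product

ages = [18, 20, 22, 24]
n = 2

# Means of all 16 ordered samples of size n drawn with replacement
sample_means = [sum(s) / n for s in product(ages, repeat=n)]

# Sampling distribution: relative frequency of each distinct sample mean
counts = Counter(sample_means)
distribution = {m: counts[m] / len(sample_means) for m in sorted(counts)}

# Mean and standard deviation of the 16 sample means
mu_xbar = sum(sample_means) / len(sample_means)
sigma_xbar = math.sqrt(
    sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means)
)

# Population sigma, for comparison with sigma / sqrt(n)
mu = sum(ages) / len(ages)
sigma = math.sqrt(sum((x - mu) ** 2 for x in ages) / len(ages))

print(distribution[21.0])               # 0.25
print(mu_xbar)                          # 21.0
print(round(sigma_xbar, 2))             # 1.58
print(round(sigma / math.sqrt(n), 2))   # 1.58
```

The last two lines agree, which is the relationship σ/√n for the standard error of the mean under sampling with replacement.
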
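Comparing the Population with its Sampling Distribution

Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) against X for the population (X = 18, 20, 22, 24; individuals A, B, C, D) beside $P(\bar{X})$ against $\bar{X}$ for the sample means ($\bar{X}$ = 18 to 24)]
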
Standard Error of the Mean

• Different samples of the same size from the same population will yield different sample means
• A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

• Note that the standard error of the mean decreases as the sample size increases

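To see how the standard error shrinks as n grows, here is a small sketch. The value σ = 10 and the sample sizes are hypothetical numbers chosen for illustration, not taken from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 10  # hypothetical population standard deviation (illustration only)

for n in (4, 16, 64):
    print(n, standard_error(sigma, n))
# 4 5.0
# 16 2.5
# 64 1.25
```

Quadrupling the sample size halves the standard error, because n enters through its square root.
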
If the Population is Normal

• If a population is normal with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normally distributed with

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

(This assumes that sampling is with replacement or sampling is without replacement from an infinite population)

Z-value for Sampling Distribution of the Mean

• Z-value for the sampling distribution of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where: $\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size

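The Z-value formula translates directly into a short function. This is a sketch; the function name and the numbers in the usage line (μ = 50, σ = 10, n = 25) are hypothetical and only illustrate the calculation.

```python
import math

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error

# Illustrative numbers (not from the slides): mu = 50, sigma = 10, n = 25
print(z_for_sample_mean(52, 50, 10, 25))  # 1.0
```
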
If the Population is not Normal

• We can apply the Central Limit Theorem:
  • Even if the population is not normal,
  • … sample means from the population will be approximately normal as long as the sample size is large enough.

Properties of the sampling distribution:

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

Central Limit Theorem

As the sample size gets large enough (n↑), the sampling distribution becomes almost normal regardless of the shape of the population.

[Figure: central limiting effect; sampling distribution of $\bar{x}$]

If the Population is not Normal
(continued)

Sampling distribution properties (sampling with replacement):

• Central tendency: $\mu_{\bar{x}} = \mu$
• Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

[Figure: population distribution with mean μ, above the sampling distribution of $\bar{x}$ with mean $\mu_{\bar{x}}$ for a smaller and a larger sample size; the sampling distribution becomes normal as n increases]

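A small simulation can illustrate these properties for a population that is clearly not normal. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a choice made here for illustration, not taken from the slides) and compares the mean and spread of simulated sample means for two sample sizes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def simulate_sample_means(n, num_samples=5000):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return their sample means."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

for n in (2, 30):
    means = simulate_sample_means(n)
    # Mean of the sample means should sit near 1;
    # their spread should sit near 1 / sqrt(n)
    print(n, round(statistics.fmean(means), 2), round(statistics.pstdev(means), 2))
```

Plotting a histogram of `means` for each n would show the right skew of the population fading as n increases.
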
How Large is Large Enough?

• For most distributions, n > 30 will give a sampling distribution that is nearly normal
• For fairly symmetric distributions, n > 15 is usually enough
• For normal population distributions, the sampling distribution of the mean is always normally distributed

Example

• Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.8 and 8.2?

Example
(continued)

Solution:
• Even if the population is not normally distributed, the central limit theorem can be used (n > 30)
• … so the sampling distribution of $\bar{x}$ is approximately normal
• … with mean $\mu_{\bar{x}} = 8$
• … and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$

Example
(continued)

Solution (continued):

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$

[Figure: the population distribution (μ = 8, shape unknown) is sampled to give the sampling distribution of $\bar{X}$ ($\mu_{\bar{X}} = 8$, shaded from 7.8 to 8.2), which is standardized to the standard normal distribution ($\mu_z = 0$, shaded from -0.4 to 0.4, area 0.1554 + 0.1554)]

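The worked example can be reproduced without a z-table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library as the approximate sampling distribution of the mean; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

standard_error = sigma / math.sqrt(n)            # 3 / 6 = 0.5
sampling_dist = NormalDist(mu, standard_error)   # approximate distribution of the sample mean

# P(7.8 < sample mean < 8.2) as a difference of two cumulative probabilities
prob = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)

print(standard_error)   # 0.5
print(round(prob, 4))   # 0.3108
```
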
Normal Approximations to the
Binomial
Characteristics of a Binomial Experiment
• There is a fixed number of trials (n).
• The n trials are independent and repeated under identical conditions.
• Each trial has 2 outcomes, S = Success or F = Failure.
• The probability of success on a single trial is p and the probability of failure is q.
• P(S) = p, P(F) = q, p + q = 1
• The central problem is to find the probability of x successes out of n trials, where x = 0, 1, 2, …, n.

x is a count of the number of successes in n trials.

Application
34% of Americans have type A+ blood. If 500 Americans are
sampled at random, what is the probability at least 300 have
type A+ blood?
Using techniques of chapter 4 you could calculate the probability
that exactly 300, exactly 301…exactly 500 Americans have A+
blood type and add the probabilities.
Or…you could use the normal curve probabilities to
approximate the binomial probabilities.

If np ≥ 5 and nq ≥ 5, the binomial random variable x is approximately normally distributed with

mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$

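The rule above can be wrapped in a small helper that checks the two conditions and returns the parameters of the approximating normal curve. This is a sketch; the function name and its `threshold` parameter are hypothetical, and the usage line applies it to the blood type question (n = 500, p = 0.34).

```python
import math

def normal_approx_params(n, p, threshold=5):
    """Return (ok, mu, sigma) for the normal approximation to Binomial(n, p).
    ok is True when both np and nq reach the threshold."""
    q = 1 - p
    ok = n * p >= threshold and n * q >= threshold
    return ok, n * p, math.sqrt(n * p * q)

ok, mu, sigma = normal_approx_params(500, 0.34)
print(ok, round(mu, 1), round(sigma, 2))   # True 170.0 10.59
```
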
Why do we require np ≥ 5 and nq ≥ 5?

• n = 5, p = 0.25, q = 0.75: np = 1.25, nq = 3.75
• n = 20, p = 0.25: np = 5, nq = 15
• n = 50, p = 0.25: np = 12.5, nq = 37.5

[Figure: binomial probability histograms for n = 5, n = 20 and n = 50 with p = 0.25]

The histogram becomes more symmetric and bell-shaped as np and nq grow.

Binomial Probabilities
The binomial distribution is discrete, so its graph is a probability histogram. The probability that a specific value of x will occur is equal to the area of the rectangle with midpoint at x.
If n = 50 and p = 0.25, find P(14 ≤ x ≤ 16).
Add the areas of the rectangles with midpoints at x = 14, x = 15 and x = 16:
0.111 + 0.089 + 0.065 = 0.265

P(14 ≤ x ≤ 16) = 0.265

[Figure: histogram rectangles with midpoints at x = 14, 15 and 16 and areas 0.111, 0.089 and 0.065]

Larson/Farber Ch 5

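The three rectangle areas are exact binomial probabilities, so they can be recomputed from the binomial formula. The sketch below uses `math.comb` (Python 3.8+); the helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 50, 0.25
terms = [binomial_pmf(x, n, p) for x in (14, 15, 16)]

print([round(t, 3) for t in terms])   # [0.111, 0.089, 0.065]
print(round(sum(terms), 3))           # 0.265
```
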
Correction for Continuity

Use the normal approximation to the binomial to find P(14 ≤ x ≤ 16) if n = 50 and p = 0.25.

Check that np = 12.5 ≥ 5 and nq = 37.5 ≥ 5.

[Figure: histogram rectangles with midpoints at x = 14, 15 and 16 under the normal curve]

The interval of values under the normal curve is 13.5 ≤ x ≤ 16.5.
To ensure the boundaries of each rectangle are included in the interval, subtract 0.5 from a left-hand boundary and add 0.5 to a right-hand boundary.
Example

Sixty-two percent of 12th graders attend school in a particular urban school district. If a sample of 500 12th grade children is selected, find the probability that at least 290 are actually enrolled in school.

• Step 1: Determine p, q, and n:
p is defined in the question as 62%, or 0.62
To find q, subtract p from 1: 1 - 0.62 = 0.38
n is defined in the question as 500
• Step 2: Determine if you can use the normal distribution:
n * p = 310 and n * q = 190. These are both larger than 5.
• Step 3: Find the mean, μ, by multiplying n and p:
n * p = 310

Example…continued

• Step 4: Multiply step 3 by q:
310 * 0.38 = 117.8
• Step 5: Take the square root of step 4 to get the standard deviation, σ:
sqrt(117.8) = 10.85
Note: The formula for the standard deviation for a binomial is sqrt(n*p*q).
• Step 6: Write the problem using correct notation:
P(X ≥ 290)
• Step 7: Rewrite the problem using the continuity correction factor:
P(X ≥ 290 - 0.5) = P(X ≥ 289.5)

Example…continued

• Step 8: Draw a diagram with the mean in the center. Shade the area that corresponds to the probability you are looking for. We're looking for X ≥ 289.5, so the shaded area lies to the right of 289.5.
• Step 9: Find the z-score.
You can find this by subtracting the mean (μ) from the value you found in step 7, then dividing by the standard deviation (σ):
(289.5 - 310) / 10.85 = -1.89

Example…continued

• Step 10: Look up the z-value in the z-table:
The area between the mean and z = -1.89 is 0.4706
• Step 11: Add 0.5 to your answer in step 10 to find the total area pictured:
0.4706 + 0.5 = 0.9706
• That's it! The probability is 0.9706, or 97.06%.

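Steps 1 to 11 can be condensed into a few lines. This sketch uses `statistics.NormalDist` in place of the z-table, so the result may differ from the table value in the last decimal place because z is not rounded to two decimals first.

```python
import math
from statistics import NormalDist

n, p = 500, 0.62
q = 1 - p

mu = n * p                     # step 3: about 310
sigma = math.sqrt(n * p * q)   # steps 4 and 5: about 10.85

# Step 7, continuity correction: "at least 290" becomes X >= 289.5
z = (289.5 - mu) / sigma

# Steps 10 and 11: area to the right of z under the standard normal curve
prob = 1 - NormalDist().cdf(z)

print(round(sigma, 2))   # 10.85
print(round(z, 2))       # -1.89
print(round(prob, 3))    # 0.971
```
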
Application
A survey of Internet users found that 75% favored government
regulations on “junk” e-mail. If 200 Internet users are randomly
selected, find the probability that fewer than 140 are in favor of
government regulation.
Since np = 150 ≥ 5 and nq = 50 ≥ 5 you can use the normal approximation to the binomial.

$$\mu = np = 200(0.75) = 150$$

$$\sigma = \sqrt{npq} = \sqrt{200(0.75)(0.25)} = 6.1237$$

Use the correction for continuity: P(x < 139.5)

$$z = \frac{139.5 - 150}{6.1237} = -1.71$$

P(z < -1.71) = 0.0436
The probability that fewer than 140 are in favor of government regulation is 0.0436.
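The same calculation can be sketched in code. It uses `statistics.NormalDist` rather than a z-table, so the probability is computed from the unrounded z and differs slightly from the table value 0.0436.

```python
import math
from statistics import NormalDist

n, p = 200, 0.75
q = 1 - p

mu = n * p                     # 150
sigma = math.sqrt(n * p * q)   # about 6.1237

# Continuity correction: "fewer than 140" becomes x < 139.5
z = (139.5 - mu) / sigma
prob = NormalDist().cdf(z)

print(round(sigma, 4))   # 6.1237
print(round(z, 2))       # -1.71
print(round(prob, 3))    # 0.043
```
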
Normal Approximation to the Binomial
Use the normal approximation to the binomial to find P(14 ≤ x ≤ 16) if n = 50 and p = 0.25.
Find the mean and standard deviation using binomial distribution formulas.

$$\mu = np = 50(0.25) = 12.5$$

$$\sigma = \sqrt{npq} = \sqrt{50(0.25)(0.75)} = 3.0618$$

Adjust the endpoints to correct for continuity:

P(13.5 ≤ x ≤ 16.5)

Convert each endpoint to a standard score:

$$z = \frac{13.5 - 12.5}{3.0618} = 0.33 \qquad z = \frac{16.5 - 12.5}{3.0618} = 1.31$$

P(0.33 ≤ z ≤ 1.31) = 0.9049 - 0.6293 = 0.2756
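A general helper for this kind of interval question might look like the sketch below. The function name `normal_approx_between` is hypothetical. Because it does not round the z-scores to two decimals, its result is close to, but not identical with, the table-based 0.2756.

```python
import math
from statistics import NormalDist

def normal_approx_between(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) for X ~ Binomial(n, p) using a normal
    curve with the continuity correction (lo - 0.5, hi + 0.5)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    dist = NormalDist(mu, sigma)
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)

approx = normal_approx_between(14, 16, 50, 0.25)
print(round(approx, 2))   # 0.28
```

The approximation sits near the exact binomial value of 0.265 for the same interval.
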
Group Work

• Search for a question on the application of the normal approximation to the binomial
• Present the solution using PowerPoint

