QBM101 Chapter7


QBM101 Business Statistics

Department of Business Studies


Faculty of Business, Economics & Accounting
HELP University
CHAPTER 7: SAMPLING DISTRIBUTIONS
➢ 7.1 Sampling distribution, sampling error, and nonsampling error
➢ 7.2 Mean and standard deviation of the sample mean
➢ 7.3 Shape of the sampling distribution of the sample mean
➢ 7.4 Applications of the sampling distribution of the sample mean
➢ 7.5 Population and sample proportions; mean, standard deviation, and shape of the sampling distribution of the sample proportion
➢ 7.6 Applications of the sampling distribution of the sample proportion
➢ The population distribution is the probability distribution of the population data.
➢ The probability distribution of $\bar{x}$ is called its sampling distribution. It lists the various values that $\bar{x}$ can assume and the probability of each value of $\bar{x}$.
➢ In general, the probability distribution of a sample statistic is called its sampling distribution.
POPULATION DISTRIBUTION VS
SAMPLING DISTRIBUTION

Suppose there are only five students in an advanced statistics class and the midterm scores of these five students are

70 78 80 80 95

Let X denote the score of a student.

POPULATION DISTRIBUTION
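The population distribution of the five scores can be tabulated directly from the data. The following is a minimal sketch, assuming each student is equally likely to be selected; the variable names are illustrative and not part of the slides.

```python
from collections import Counter
from fractions import Fraction

scores = [70, 78, 80, 80, 95]  # the five midterm scores from the slide

# Relative frequency of each distinct score, i.e. P(X = x) for this population.
counts = Counter(scores)
population_distribution = {
    x: Fraction(c, len(scores)) for x, c in sorted(counts.items())
}

for x, prob in population_distribution.items():
    print(f"P(X = {x}) = {prob}")
```

The score 80 occurs twice, so it carries twice the probability of each other score.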
SAMPLING DISTRIBUTION

Reconsider the population of midterm scores of five students.

Consider all possible samples of three scores each that can be selected, without replacement, from that population.

The total number of possible samples is

$${}_5C_3 = \frac{5!}{3!\,(5-3)!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1 \cdot 2 \cdot 1} = 10$$
SAMPLING DISTRIBUTION

Suppose we assign the letters A, B, C, D, and E to the scores of the five students so that
A = 70, B = 78, C = 80, D = 80, E = 95

Then, the 10 possible samples of three scores each are

ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
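The ten samples and their means can be enumerated mechanically. This is a sketch under the slide's setup (samples of three drawn without replacement, each sample equally likely); the dictionary names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# All samples of three students, drawn without replacement.
samples = list(combinations(scores, 3))

# Sample mean of each sample, rounded to two decimals.
sample_means = {
    "".join(s): round(sum(scores[k] for k in s) / 3, 2) for s in samples
}

# Sampling distribution of the sample mean: every sample has probability 1/10.
mean_counts = Counter(sample_means.values())
sampling_distribution = {
    m: Fraction(c, len(samples)) for m, c in sorted(mean_counts.items())
}

for label, m in sample_means.items():
    print(label, m)
for m, prob in sampling_distribution.items():
    print(f"P(x-bar = {m}) = {prob}")
```

Because C and D are both 80, several different samples share the same mean, which is why the sampling distribution has fewer than ten distinct values.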
SAMPLING ERROR AND
NONSAMPLING ERROR

Sampling error is the difference between the value of a sample statistic and the value of the corresponding population parameter.

Sampling error = $\bar{x} - \mu$

The errors that occur in the collection, recording, and tabulation of data are called nonsampling errors.

Nonsampling error = incorrect $\bar{x}$ - correct $\bar{x}$


SAMPLING ERROR AND
NONSAMPLING ERROR

Reconsider the population of five scores. Suppose one sample of three scores is selected from this population, and this sample includes the scores 70, 80, and 95. Find the sampling error.

$\mu = \frac{70 + 78 + 80 + 80 + 95}{5} = 80.60$

$\bar{x} = \frac{70 + 80 + 95}{3} = 81.67$

Sampling error = $\bar{x} - \mu = 81.67 - 80.60 = 1.07$

That is, the mean score estimated from the sample is 1.07 higher than the mean score of the population.
SAMPLING ERROR AND
NONSAMPLING ERROR

Now suppose, when we select the sample of three scores, we mistakenly record the second score as 82 instead of 80.

As a result, we calculate the sample mean as

$\bar{x} = \frac{70 + 82 + 95}{3} = 82.33$

The difference between this sample mean and the population mean is

$\bar{x} - \mu = 82.33 - 80.60 = 1.73$


SAMPLING ERROR AND
NONSAMPLING ERROR
This difference does not represent the sampling error. Only 1.07 of this difference is due to the sampling error.

The remaining portion represents the nonsampling error. It is equal to 1.73 - 1.07 = 0.66. It occurred due to the error we made in recording the second score in the sample.

Nonsampling error = incorrect $\bar{x}$ - correct $\bar{x}$ = 82.33 - 81.67 = 0.66
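The decomposition of the total difference into sampling and nonsampling error can be checked numerically. The sketch below uses unrounded means, so the nonsampling error prints as 0.67 rather than the 0.66 obtained when subtracting the two-decimal means; the variable names are illustrative.

```python
population = [70, 78, 80, 80, 95]
correct_sample = [70, 80, 95]
recorded_sample = [70, 82, 95]  # second score mis-recorded as 82 instead of 80

mu = sum(population) / len(population)
correct_mean = sum(correct_sample) / len(correct_sample)
recorded_mean = sum(recorded_sample) / len(recorded_sample)

sampling_error = correct_mean - mu               # due to using a sample
nonsampling_error = recorded_mean - correct_mean  # due to the recording mistake
total_difference = recorded_mean - mu

print(f"sampling error    = {sampling_error:.2f}")
print(f"nonsampling error = {nonsampling_error:.2f}")
print(f"total difference  = {total_difference:.2f}")
```

The two error components always add up to the total difference between the recorded sample mean and the population mean.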
The mean and standard deviation of the sampling distribution of $\bar{x}$ are called the mean and standard deviation of $\bar{x}$ and are denoted by $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$, respectively.

Note:
$\mu$ = population mean
$\bar{x}$ = sample mean (for ONE sample)
$\sigma$ = population standard deviation
$\mu_{\bar{x}}$ = mean of ALL sample means
$\sigma_{\bar{x}}$ = standard deviation of ALL sample means
Given the five data values:
$x_1 = 20,\ x_2 = 35,\ x_3 = 40,\ x_4 = 50,\ x_5 = 60$

Population mean: $E(X) = \mu = \mu_X = \frac{20 + 35 + 40 + 50 + 60}{5} = 41$

Taking all samples of size 2 out of the 5 values:

Sample    Sample mean      Sample    Sample mean
20, 35    27.5             35, 50    42.5
20, 40    30               35, 60    47.5
20, 50    35               40, 50    45
20, 60    40               40, 60    50
35, 40    37.5             50, 60    55

$E(\bar{X}) = \mu_{\bar{X}} = \frac{27.5 + 30 + 35 + 40 + 37.5 + 42.5 + 47.5 + 45 + 50 + 55}{10} = 41$

$\mu_{\bar{X}} = \mu$
The mean of the sample means, taken over all possible samples drawn from the population, is the same as the mean of the population.
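The equality of the two means for the five data values can be confirmed by brute force. This is a small sketch that enumerates every sample of size 2 without replacement; the variable names are illustrative.

```python
from itertools import combinations

data = [20, 35, 40, 50, 60]
population_mean = sum(data) / len(data)

# Mean of every possible sample of size 2 (without replacement).
sample_means = [sum(pair) / 2 for pair in combinations(data, 2)]
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(population_mean, mean_of_sample_means)
```

All values here are exact halves, so the two results can be compared with plain equality.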
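Given 100 data values (students' marks for an exam):
$x_1 = 10,\ x_2 = 95,\ x_3 = 100,\ \ldots,\ x_{99} = 20,\ x_{100} = 60$

Population mean: $E(X) = \mu = \mu_X = \frac{10 + 95 + 100 + \cdots + 20 + 60}{100} = 60$
Std. dev.: $\sigma = \sigma_X = \ldots = 20$

Taking a sample of size 5 out of the 100 values gives ${}_{100}C_5 = 75{,}287{,}520$ combinations:

10, 95, 100, 70, 55: $\bar{X} = 66$
10, 100, 70, 86, 5: $\bar{X} = 54.2$
...
35, 40, 46, 20, 60: $\bar{X} = 40.2$

$E(\bar{X}) = \mu_{\bar{X}} = \frac{66 + 54.2 + \cdots + 40.2}{75{,}287{,}520} = 60$

Std. dev. of $\bar{X}$: $\sigma_{\bar{X}} = \ldots = 8.944$

$\sigma_{\bar{X}} = 8.944,\ \sigma = 20$

$8.944 = \frac{20}{\sqrt{5}}$

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}, \qquad \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$

This is the relationship between the std. dev. of the sample means and the std. dev. of the population.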
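The relationship $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ can be verified exactly on a small population. The sketch below reuses the five data values from the earlier slide and, as an assumption that differs from the slides, draws samples WITH replacement, where the formula holds exactly; when sampling without replacement from a small population, a finite population correction factor would also apply.

```python
from itertools import product
from math import sqrt
from statistics import pstdev

data = [20, 35, 40, 50, 60]
n = 2

sigma = pstdev(data)  # population standard deviation

# All 5**2 = 25 ordered samples of size 2, drawn WITH replacement (assumption).
sample_means = [sum(s) / n for s in product(data, repeat=n)]
sigma_xbar = pstdev(sample_means)

print(f"sigma            = {sigma:.4f}")
print(f"sigma / sqrt(n)  = {sigma / sqrt(n):.4f}")
print(f"sigma of x-bar   = {sigma_xbar:.4f}")
```

The last two printed values agree, which is the content of the formula for this population and sample size.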
EXAMPLE 7.2
The mean wage per hour for all 5000 employees who work at a large company is $27.50 and the standard deviation is $3.70. Let $\bar{X}$ be the mean wage per hour for a random sample of certain employees selected from this company. Find the mean and standard deviation of $\bar{X}$ for a sample size of
(a) 30 (b) 70 (c) 200

$N = 5000,\ \mu = 27.5,\ \sigma = 3.7$

(a) $n = 30$: $\mu_{\bar{X}} = \mu = 27.5,\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3.7}{\sqrt{30}} = 0.676$

(b) $n = 70$: $\mu_{\bar{X}} = \mu = 27.5,\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3.7}{\sqrt{70}} = 0.442$

(c) $n = 200$: $\mu_{\bar{X}} = \mu = 27.5,\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3.7}{\sqrt{200}} = 0.262$
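The three standard deviations in Example 7.2 follow from one formula, $\sigma/\sqrt{n}$, evaluated at each sample size. A minimal sketch follows; `standard_error` is an illustrative helper name, not something defined in the slides.

```python
from math import sqrt

mu, sigma = 27.50, 3.70  # population mean and standard deviation of hourly wage


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean for a sample of size n."""
    return sigma / sqrt(n)


results = {n: round(standard_error(sigma, n), 3) for n in (30, 70, 200)}

for n, se in results.items():
    print(f"n = {n:3d}: mean of x-bar = {mu}, std. dev. of x-bar = {se}")
```

The mean of the sample mean stays at 27.5 for every n, while its standard deviation shrinks as n grows.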
SHAPE OF SAMPLING DISTRIBUTION
Central Limit Theorem
(i) Population distribution is normal
(ii) Population distribution is not normal, but $n \ge 30$
SHAPE OF SAMPLING DISTRIBUTION
If the population from which the samples are drawn is normally distributed with mean $\mu$ and standard deviation $\sigma$, then the sampling distribution of the sample mean, $\bar{x}$, will also be normally distributed with the following mean and standard deviation, irrespective of the sample size.

If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^2)$,

where $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Check the condition:
(i) Original X is normal, regardless of the sample size n
OR
(ii) Original X is not normal, but the sample size n is at least 30

$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
EXAMPLE 7.5

Assume that the weights of all packages of a certain brand of cookies are normally distributed with a mean of 32 ounces and a standard deviation of 0.3 ounce. Find the probability that the mean weight, $\bar{X}$, of a random sample of 20 packages of this brand of cookies will be between 31.8 and 31.9 ounces.
EXAMPLE 7.5
$X \sim N(\mu_X = 32,\ \sigma_X^2 = 0.3^2)$

$n = 20$: $\bar{X} \sim N\left(\mu_{\bar{X}} = 32,\ \sigma_{\bar{X}}^2 = \frac{0.3^2}{20}\right)$

$P(31.8 < \bar{X} < 31.9)$
$= P\left(\frac{31.8 - 32}{0.3/\sqrt{20}} < Z < \frac{31.9 - 32}{0.3/\sqrt{20}}\right)$
$= P(-2.98 < Z < -1.49)$
$= 0.0681 - 0.0014$
$= 0.0667$
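The same probability can be computed without a z-table. The sketch below uses Python's standard-library `statistics.NormalDist`; because it does not round z to two decimals, the result may differ from the table-based answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32.0, 0.3, 20

# Sampling distribution of the sample mean: N(mu, (sigma / sqrt(n))**2).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(31.8 < x-bar < 31.9) as a difference of cumulative probabilities.
prob = xbar_dist.cdf(31.9) - xbar_dist.cdf(31.8)

print(f"P(31.8 < x-bar < 31.9) = {prob:.4f}")
```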
EXAMPLE 7.6
According to Moebs Services Inc., an individual checking account at major U.S. banks costs the banks between $350 and $450 per year (Time, November 21, 2011). Suppose that the current average cost of all checking accounts at major U.S. banks is $400 per year with a standard deviation of $30. Let $\bar{X}$ be the current average annual cost of a random sample of 225 individual checking accounts at major banks in America.

(a) What is the probability that the average annual cost of the checking accounts in this sample is within $4 of the population mean?

(b) What is the probability that the average annual cost of the checking accounts in this sample is less than the population mean by $2.70 or more?
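EXAMPLE 7.6
(a) $n = 225 \ge 30$: $\bar{X} \sim N\left(\mu_{\bar{X}} = 400,\ \sigma_{\bar{X}}^2 = \frac{30^2}{225}\right)$

$P(|\bar{X} - 400| \le 4)$
$= P(-4 \le \bar{X} - 400 \le 4)$
$= P(400 - 4 \le \bar{X} \le 400 + 4)$
$= P(396 \le \bar{X} \le 404)$
$= P\left(\frac{396 - 400}{30/\sqrt{225}} \le Z \le \frac{404 - 400}{30/\sqrt{225}}\right)$
$= P(-2 \le Z \le 2)$
$= 0.9772 - 0.0228$
$= 0.9544$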
EXAMPLE 7.6
(b) $n = 225 \ge 30$: $\bar{X} \sim N\left(\mu_{\bar{X}} = 400,\ \sigma_{\bar{X}}^2 = \frac{30^2}{225}\right)$

$P(\mu - \bar{X} \ge 2.7)$
$= P(400 - \bar{X} \ge 2.7)$
$= P(\bar{X} - 400 \le -2.7)$
$= P(\bar{X} \le 397.3)$
$= P\left(Z \le \frac{397.3 - 400}{30/\sqrt{225}}\right)$
$= P(Z \le -1.35)$
$= 0.0885$
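Both parts of Example 7.6 reduce to cumulative probabilities of one normal distribution with standard deviation $30/\sqrt{225} = 2$. The following sketch uses `statistics.NormalDist`; the names `part_a` and `part_b` are illustrative, and part (a) may differ from the table-based answer in the fourth decimal because the table values are rounded.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 400.0, 30.0, 225

# n >= 30, so the sample mean is treated as approximately normal.
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# (a) sample mean within $4 of the population mean
part_a = xbar_dist.cdf(mu + 4) - xbar_dist.cdf(mu - 4)

# (b) sample mean below the population mean by $2.70 or more
part_b = xbar_dist.cdf(mu - 2.70)

print(f"(a) {part_a:.4f}")
print(f"(b) {part_b:.4f}")
```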
EXERCISE 1
The IQ (intelligence quotient) score of all the people in a particular country follows a normal distribution, with an average of 100 and a standard deviation of 15. Let X be the random variable that represents the IQ score of a person in that country. Hence, $X \sim N(\mu = 100,\ \sigma^2 = 15^2)$.

(i) What is the probability that a randomly selected person has an IQ score between 85 and 110?
(ii) Find the probability that the mean IQ score for five randomly selected people is between 85 and 110.

Ans: (i) 0.5899 (ii) 0.9194


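Population proportion, $p = \frac{X}{N}$

Sample proportion, $\hat{p} = \frac{x}{n}$

Here X and x are the numbers of elements in the population and in the sample, respectively, that possess the characteristic of interest, and N and n are the population and sample sizes.

E.g., in a class of 10 students, 2 are male. The proportion of students who are male is 2 out of 10 = 20%, so $p = 0.20$.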


EXAMPLE 7.7

Suppose a total of 789,654 families live in a city and 563,282 of them own homes. A sample of 240 families is selected from this city, and 158 of them own homes. Find the proportion of families who own homes in the population and in the sample.

$p = \frac{X}{N} = \frac{563{,}282}{789{,}654} = 0.71$

$\hat{p} = \frac{x}{n} = \frac{158}{240} = 0.66$
EXAMPLE 7.8

Boe Consultant Associates has five employees. Table 7.6 gives the names of these five employees and information concerning their knowledge of statistics.
EXAMPLE 7.8

Population proportion, p = 3/5 = 0.6

Now, suppose we draw all possible samples of three employees each and compute the proportion of employees, for each sample, who know statistics.

Total number of samples = ${}_5C_3 = 10$


MEAN AND S.D. OF SAMPLE PROPORTION

$\mu_{\hat{p}} = p$

$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$
CLT FOR SAMPLE PROPORTION

According to the central limit theorem, the sampling distribution of $\hat{p}$ is approximately normal for a sufficiently large sample size. In the case of proportion, the sample size is considered to be sufficiently large if np and nq are both greater than 5, that is, if $np > 5$ and $nq > 5$.
When we conduct a study, we usually take only one sample and make all decisions or inferences on the basis of the results of that one sample. We use the concepts of the mean, standard deviation, and shape of the sampling distribution of $\hat{p}$ to determine the probability that the value of $\hat{p}$ computed from one sample falls within a given interval.

$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
EXAMPLE 7.10

According to a Pew Research Center nationwide telephone survey of American adults conducted between March 15 and April 24, 2011, 75% of adults said that college education has become too expensive for most people and they cannot afford it (Time, May 30, 2011). Suppose that this result is true for the current population of American adults. Let $\hat{p}$ be the proportion in a random sample of 1400 adult Americans who will hold the said opinion. Find the probability that 76.5% to 78% of adults in this sample will hold this opinion.
EXAMPLE 7.10
$p = 0.75,\ q = 1 - p = 1 - 0.75 = 0.25,\ n = 1400$
$np = 1400(0.75) = 1050 > 5$
$nq = 1400(0.25) = 350 > 5$

$\mu_{\hat{p}} = p = 0.75$

$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.75(0.25)}{1400}} = 0.01157$

$\hat{p} \sim N(\mu_{\hat{p}} = 0.75,\ \sigma_{\hat{p}}^2 = 0.01157^2)$

$P(0.765 < \hat{p} < 0.78)$
$= P\left(\frac{0.765 - 0.75}{0.01157} < Z < \frac{0.78 - 0.75}{0.01157}\right)$
$= P(1.30 < Z < 2.59)$
$= 0.9952 - 0.9032$
$= 0.0920$
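The steps of Example 7.10 (check $np > 5$ and $nq > 5$, build the normal distribution of $\hat{p}$, then take a difference of cumulative probabilities) can be sketched as follows. `phat_distribution` is a hypothetical helper written for this illustration; since z is not rounded to two decimals here, the result is close to, but not identical with, the table-based 0.0920.

```python
from math import sqrt
from statistics import NormalDist


def phat_distribution(p: float, n: int) -> NormalDist:
    """Approximate sampling distribution of the sample proportion."""
    q = 1 - p
    # Large-sample condition used in the slides: np > 5 and nq > 5.
    if not (n * p > 5 and n * q > 5):
        raise ValueError("need np > 5 and nq > 5 for the normal approximation")
    return NormalDist(mu=p, sigma=sqrt(p * q / n))


dist = phat_distribution(p=0.75, n=1400)
prob = dist.cdf(0.78) - dist.cdf(0.765)

print(f"sigma of p-hat = {dist.stdev:.5f}")
print(f"P(0.765 < p-hat < 0.78) = {prob:.4f}")
```

Raising an error when the condition fails is a design choice for the sketch: it prevents the normal approximation from being applied silently to a sample that is too small.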
EXAMPLE 7.11

Maureen Webster, who is running for mayor in a large city, claims that she is favored by 53% of all eligible voters of that city. Assume that this claim is true. What is the probability that in a random sample of 400 registered voters taken from this city, less than 49% will favor Maureen Webster?
EXAMPLE 7.11
$p = 0.53,\ q = 1 - p = 1 - 0.53 = 0.47,\ n = 400$
$np = 400(0.53) = 212 > 5$
$nq = 400(0.47) = 188 > 5$

$\mu_{\hat{p}} = p = 0.53$

$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.53(0.47)}{400}} = 0.02495$

$\hat{p} \sim N(\mu_{\hat{p}} = 0.53,\ \sigma_{\hat{p}}^2 = 0.02495^2)$

$P(\hat{p} < 0.49)$
$= P\left(Z < \frac{0.49 - 0.53}{0.02495}\right)$
$= P(Z < -1.60)$
$= 0.0548$
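As a sanity check on the normal approximation in Example 7.11, the probability can also be computed from the exact binomial distribution of the number of voters in favor. This comparison is not part of the slides; it is a sketch that assumes the 400 voters respond independently with probability 0.53 each.

```python
from math import comb, sqrt
from statistics import NormalDist

p, n = 0.53, 400
q = 1 - p

# Normal approximation used in the example: P(p-hat < 0.49).
approx = NormalDist(mu=p, sigma=sqrt(p * q / n)).cdf(0.49)

# Exact binomial: "less than 49%" of 400 means at most 195 voters in favor,
# because 196 / 400 is exactly 49%.
max_count = 195
exact = sum(comb(n, k) * p**k * q ** (n - k) for k in range(max_count + 1))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The two values are close but not equal, which is expected: the normal curve is only an approximation to the discrete distribution of the count.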
EXERCISE 2
Brooklyn Corporation manufactures CDs. The machine
that is used to make these CDs is known to produce 6%
defective CDs. The quality control inspector selects a
sample of 100 CDs every week and inspects them for being
good or defective. If 8% or more of the CDs in the sample
are defective, the process is stopped and the machine is
readjusted. What is the probability that based on a sample
of 100 CDs, the process will be stopped to readjust the
machine?

Ans: 0.2005
Summary
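CLT: population is normal or $n \ge 30$

$X \sim N(\mu, \sigma^2)$

Taking a sample of size n:

$\bar{X} \sim N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}\right)$

$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

For proportions, if $np > 5$ and $nq > 5$:

$\hat{p} \sim N\left(\mu_{\hat{p}} = p,\ \sigma_{\hat{p}}^2 = \frac{pq}{n}\right)$

$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$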