CHAPTER 2 - 2 Hypothesis Tests About The Proportion

CHAPTER 2

Hypothesis Test

2.3 Hypothesis Tests about the Proportion (Single Population)

2.4 Hypothesis Tests about the Proportion (Two Populations)


(2.3)
HYPOTHESIS TESTS ABOUT
THE PROPORTION
(SINGLE POPULATION)
Hypothesis Tests about a Population Proportion: Large Samples

• This section presents the procedure for performing tests of hypothesis about the
population proportion, $p$, for large samples.
• The procedure for such a test is similar in many respects to the one for the
population mean, $\mu$.
• When the sample size is large, the sample proportion, $\hat{p}$, is approximately
normally distributed with mean $p$ and standard deviation $\sqrt{pq/n}$.

• For a proportion, the sample size is considered large when $np$ and $nq$
are both greater than 5.

Test statistic: The value of the test statistic $z$ for the sample proportion, $\hat{p}$,
is computed as:

$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} \quad \text{where} \quad \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$

The value of $p$ used in this formula is the one used in the null hypothesis. The value
of $q$ is equal to $1 - p$.
The value of $z$ calculated for $\hat{p}$ using the above formula is also called the observed
value of $z$.
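As a minimal sketch of the formula above, the following Python function computes the observed z for a single proportion. The name `one_proportion_z` and its parameter names are illustrative choices, not part of the original notes. The inputs are the figures of Example 2.8 (14 defective chips out of 200, with p = 0.04 taken from the null hypothesis).

```python
import math


def one_proportion_z(p_hat, p0, n):
    """Observed z for a sample proportion p_hat when H0 specifies p = p0."""
    q0 = 1.0 - p0
    # Standard deviation of p-hat, computed with the hypothesized p (not p-hat).
    sigma_p_hat = math.sqrt(p0 * q0 / n)
    return (p_hat - p0) / sigma_p_hat


z = one_proportion_z(p_hat=14 / 200, p0=0.04, n=200)
print(round(z, 2))  # 2.17
```

Note that the standard deviation uses the hypothesized p rather than the sample proportion, which matches the statement that the value of p in the formula comes from the null hypothesis.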
EXAMPLE 2.8:
When working properly, a machine that is used to make chips for calculators
does not produce more than 4% defective chips. Whenever the machine
produces more than 4% defective chips, it needs an adjustment. To check if the
machine is working properly, the quality control department at the company
often takes samples of chips and inspects them to determine if they are good
or defective. One such random sample of 200 chips taken recently from the
production line contained 14 defective chips. Test at 5% significance level
whether or not the machine needs an adjustment.

SOLUTION:

Let $p$ be the proportion of defective chips in all chips produced by this machine,
and let $\hat{p}$ be the corresponding sample proportion.
Then, from the given information,

$$n = 200, \quad \hat{p} = \frac{14}{200} = 0.07, \quad \text{and} \quad \alpha = 0.05$$
When the machine is working properly it does not produce more than 4%
defective chips. Consequently, assuming that the machine is working
properly,

p=0.04, and q=1-p=1-0.04 = 0.96.

STEP 1: State the null and alternative hypotheses.


The machine will not need an adjustment if the percentage of defective chips
is 4% or less, and it will need an adjustment if this percentage is greater than
4%. Hence, the null and alternative hypotheses are

$H_0: p \le 0.04$ (the machine does not need an adjustment)

$H_1: p > 0.04$ (the machine needs an adjustment)

STEP 2: Select the distribution to use.


The values of $np$ and $nq$ are:
$np = 200(0.04) = 8 > 5$ and $nq = 200(0.96) = 192 > 5$

Because the sample size is large, we use the normal distribution to make the
hypothesis test about $p$.
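The large-sample condition can be checked mechanically. The helper below is a hypothetical illustration (the name `is_large_sample` and the `threshold` parameter are assumptions made for this sketch); it returns np, nq, and whether both exceed 5.

```python
def is_large_sample(n, p, threshold=5):
    """Return (np, nq, ok), where ok is True when np and nq both exceed threshold."""
    q = 1.0 - p
    n_p = n * p
    n_q = n * q
    return n_p, n_q, (n_p > threshold and n_q > threshold)


# Example 2.8: n = 200 and p = 0.04 from the null hypothesis.
n_p, n_q, ok = is_large_sample(200, 0.04)
print(n_p, n_q, ok)
```

With a much smaller sample, say n = 50 at the same p, np would be only 2 and the normal approximation would not be justified by this rule.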
STEP 3: Determine the rejection and nonrejection regions.
The significance level is 0.05. The > sign in the alternative hypothesis
indicates that the test is right-tailed and the rejection region lies in the right
tail of the sampling distribution of $\hat{p}$ with its area equal to 0.05. As shown in
the following figure, the critical value of z, obtained from the normal
distribution table for 0.4500, is approximately 1.65.

[Figure: sampling distribution of $\hat{p}$ centered at $p = 0.04$. The rejection region of area $\alpha = 0.05$ lies in the right tail beyond the critical value z = 1.65; the nonrejection region lies to its left.]
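The table lookup can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_z_right_tailed` is an illustrative choice. An area of 0.4500 between 0 and z corresponds to a cumulative probability of 0.95, which is what `inv_cdf` expects.

```python
from statistics import NormalDist


def critical_z_right_tailed(alpha):
    """Critical z for a right-tailed test: the point with area alpha to its right."""
    return NormalDist().inv_cdf(1.0 - alpha)


z_crit = critical_z_right_tailed(0.05)
print(round(z_crit, 3))  # 1.645
```

The value 1.65 used in the notes is this quantity rounded to the two decimals available in a printed table.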
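STEP 4: Calculate the value of the test statistic.
The value of the test statistic $z$ for $\hat{p} = 0.07$ is calculated as follows:

$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.04)(0.96)}{200}} = 0.01385641$$

$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{0.07 - 0.04}{0.01385641} = 2.17$$

Here the value $p = 0.04$ is taken from $H_0$.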
STEP 5: Make a decision.
Because the value of the test statistic z = 2.17 is greater than the critical value
of z = 1.65, it falls in the rejection region and we reject $H_0$. We conclude that
the sample proportion is too far from the hypothesized value of the population
proportion and the difference between the two cannot be attributed to chance
alone. Therefore, based on the sample information, we conclude that the
machine needs an adjustment.

NOTE: We can also use the p-value approach to make tests of hypotheses about
the population proportion p. The procedure to calculate the p-value for the
sample proportion is similar to the one applied to the sample mean.
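To illustrate the p-value approach mentioned in the note, here is a short sketch for a right-tailed test, where the p-value is the area under the standard normal curve to the right of the observed z. The function name `right_tailed_p_value` is hypothetical; the inputs are the figures of Example 2.8.

```python
import math
from statistics import NormalDist


def right_tailed_p_value(p_hat, p0, n):
    """p-value for H1: p > p0, using the normal approximation."""
    z = (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / n)
    # Area to the right of the observed z under the standard normal curve.
    return 1.0 - NormalDist().cdf(z)


alpha = 0.05
p_value = right_tailed_p_value(14 / 200, 0.04, 200)
print(f"p-value = {p_value:.4f}; reject H0: {p_value < alpha}")
```

Rejecting when the p-value is below the significance level leads to the same decision as comparing the observed z with the critical value.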
EXERCISES:
1. Consider the following null and alternative hypotheses:
$H_0: p = 0.82$ versus $H_1: p \ne 0.82$
A random sample of 600 observations taken from this population produced
a sample proportion of 0.86.

a) If this test is made at the 2% significance level, would you reject the null
hypothesis?
b) What is the probability of making a Type I error in part (a)?
c) Calculate the p-value for the test. Based on this p-value, would you reject
the null hypothesis if $\alpha = 0.025$? What if $\alpha = 0.005$?

2. Many traffic accidents are blamed on motorists who are distracted by cell
phone use while driving. In a survey, 29% of adults said that they make
phone calls sometimes or frequently while driving alone. Suppose that a
recently taken random sample of 500 adults showed that 165 of them
make phone calls sometimes or frequently while driving alone. At the 5%
significance level, can you conclude that the current percentage of adults
who make such phone calls exceeds 29%?
(2.4)
HYPOTHESIS TESTS ABOUT
THE PROPORTION
(TWO POPULATIONS)
Hypothesis testing about $p_1 - p_2$.

• In this section we learn how to test a hypothesis about $p_1 - p_2$ for two large
and independent samples. The procedure involves the same five steps that we
have used previously.
• Once again, we calculate the standard deviation of $\hat{p}_1 - \hat{p}_2$ as

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

• When a test of hypothesis about $p_1 - p_2$ is performed, usually the null
hypothesis is $p_1 = p_2$ and the values of $p_1$ and $p_2$ are not known.

• Assuming that the null hypothesis is true and $p_1 = p_2$, a common value of $p_1$
and $p_2$, denoted by $\bar{p}$, is calculated by using one of the following
formulas:

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} \quad \text{or} \quad \bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$

• Which of these formulas is used depends on whether the values of
$x_1$ and $x_2$ or the values of $\hat{p}_1$ and $\hat{p}_2$ are known.

• Note that $x_1$ and $x_2$ are the numbers of elements in each of the two samples
that possess a certain characteristic. This value of $\bar{p}$ is called the pooled
sample proportion.

• Using the value of the pooled sample proportion, we compute an estimate of
the standard deviation of $\hat{p}_1 - \hat{p}_2$ as follows:

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \quad \text{where} \quad \bar{q} = 1 - \bar{p}$$

Test statistic $z$ for $\hat{p}_1 - \hat{p}_2$.

The value of the test statistic $z$ for $\hat{p}_1 - \hat{p}_2$ is calculated as

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}}$$

The value of $p_1 - p_2$ is substituted from $H_0$, which usually is zero.
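The pooled proportion and the test statistic for two samples can be sketched in Python as follows. All function and parameter names here are illustrative assumptions, not part of the original notes. The sketch also checks that the two pooled-proportion formulas agree, using the counts from Example 2.9 (100 of 500 and 68 of 400).

```python
import math


def pooled_from_counts(x1, n1, x2, n2):
    """Pooled sample proportion from the numbers of successes."""
    return (x1 + x2) / (n1 + n2)


def pooled_from_proportions(p_hat1, n1, p_hat2, n2):
    """Pooled sample proportion from the two sample proportions."""
    return (n1 * p_hat1 + n2 * p_hat2) / (n1 + n2)


def two_proportion_z(p_hat1, n1, p_hat2, n2, diff0=0.0):
    """Observed z for p_hat1 - p_hat2, with p1 - p2 = diff0 taken from H0."""
    p_bar = pooled_from_proportions(p_hat1, n1, p_hat2, n2)
    q_bar = 1.0 - p_bar
    s = math.sqrt(p_bar * q_bar * (1.0 / n1 + 1.0 / n2))
    return ((p_hat1 - p_hat2) - diff0) / s


x1, n1, x2, n2 = 100, 500, 68, 400
# Both formulas give the same pooled value (up to floating-point error).
print(pooled_from_counts(x1, n1, x2, n2))
print(pooled_from_proportions(x1 / n1, n1, x2 / n2, n2))
print(round(two_proportion_z(x1 / n1, n1, x2 / n2, n2), 2))  # 1.15
```

The code keeps the pooled proportion unrounded, so intermediate values can differ in later decimals from a hand calculation that rounds it to three places.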


EXAMPLE 2.9:
Reconsider the example about the percentages of users of two toothpastes who will
never switch to another toothpaste. At the 1% significance level, can we
conclude that the proportion of users of toothpaste A who will never switch to
another toothpaste is higher than the proportion of users of Toothpaste B who
will never switch to another toothpaste?

SOLUTION:
Let $p_1$ and $p_2$ be the proportions of all users of Toothpaste A and B,
respectively, who will never switch to another toothpaste, and let $\hat{p}_1$ and $\hat{p}_2$
be the corresponding sample proportions. Let $x_1$ and $x_2$ be the numbers of users
of Toothpaste A and B, respectively, in the two samples who said that they will
never switch to another toothpaste. From the given information,

Toothpaste A: $n_1 = 500$ and $x_1 = 100$

Toothpaste B: $n_2 = 400$ and $x_2 = 68$
The significance level is $\alpha = 0.01$. The two sample proportions are
calculated as follows:

$$\hat{p}_1 = \frac{x_1}{n_1} = \frac{100}{500} = 0.20$$

$$\hat{p}_2 = \frac{x_2}{n_2} = \frac{68}{400} = 0.17$$

STEP 1: State the null and alternative hypotheses.

We are to test whether $p_1$ is greater than $p_2$. Thus, the two hypotheses are:

$H_0: p_1 - p_2 = 0$ ($p_1$ is equal to $p_2$)
$H_1: p_1 - p_2 > 0$ ($p_1$ is greater than $p_2$)
STEP 2: Select the distribution to use.

As shown earlier, $n_1\hat{p}_1$, $n_1\hat{q}_1$, $n_2\hat{p}_2$, and $n_2\hat{q}_2$ are all greater than 5. Consequently,
both samples are large, and we apply the normal distribution to make the test.
STEP 3: Determine the rejection and nonrejection regions.
The > sign in the alternative hypothesis indicates that the test is right-tailed.
From the normal distribution table, for the $\alpha = 0.01$ significance level, the critical
value of z is 2.33. This is shown in the following figure:

[Figure: sampling distribution of $\hat{p}_1 - \hat{p}_2$ centered at $p_1 - p_2 = 0$. The area between the center and the critical value z = 2.33 is 0.4900; the rejection region of area $\alpha = 0.01$ lies in the right tail beyond it.]
STEP 4: Calculate the value of the test statistic.
The pooled sample proportion is

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{100 + 68}{500 + 400} = 0.187$$

$$\bar{q} = 1 - \bar{p} = 1 - 0.187 = 0.813$$

The estimate of the standard deviation of $\hat{p}_1 - \hat{p}_2$ is

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{(0.187)(0.813)\left(\frac{1}{500} + \frac{1}{400}\right)} = 0.02615606$$

The value of the test statistic $z$ for $\hat{p}_1 - \hat{p}_2$, with $p_1 - p_2 = 0$ taken from $H_0$, is

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}} = \frac{(0.20 - 0.17) - 0}{0.02615606} = 1.15$$
STEP 5: Make a decision.
Since the value of the test statistic z = 1.15 for $\hat{p}_1 - \hat{p}_2$ falls in the nonrejection
region, we fail to reject the null hypothesis. Therefore, the sample does not provide
sufficient evidence to conclude that the proportion of users of Toothpaste A who will
never switch to another toothpaste is greater than the proportion of users of
Toothpaste B who will never switch to another toothpaste.
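The decision step follows the same rule in every example: compare the observed z with the critical value for the chosen tail. A minimal sketch of that rule is given below; the function name `decide` and the tail labels `"right"`, `"left"`, and `"two"` are assumptions made for illustration. Critical values come from `statistics.NormalDist` rather than a printed table.

```python
from statistics import NormalDist


def decide(z, alpha, tail):
    """Return the test decision for an observed z at significance level alpha."""
    std_normal = NormalDist()
    if tail == "right":
        reject = z > std_normal.inv_cdf(1.0 - alpha)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return "reject H0" if reject else "fail to reject H0"


print(decide(2.17, 0.05, "right"))  # Example 2.8: reject H0
print(decide(1.15, 0.01, "right"))  # Example 2.9: fail to reject H0
```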

EXAMPLE 2.10:
According to a poll conducted for Men’s Health magazine by Opinion
Research, 67% of men and 77% of women said that a good diet is very
important to good health. Suppose that this study is based on samples of 900
men and 1200 women. Test whether the percentages of all men and women
who hold this view are different. Use the 1% significance level.
SOLUTION:
Let $p_1$ and $p_2$ be the proportions of all men and all women, respectively, who
hold the view that a good diet is very important to good health. Let $\hat{p}_1$ and $\hat{p}_2$
be the corresponding sample proportions. From the given information,

For men: $n_1 = 900$ and $\hat{p}_1 = 0.67$
For women: $n_2 = 1200$ and $\hat{p}_2 = 0.77$

STEP 1: State the null and alternative hypotheses.

The null and alternative hypotheses are

$H_0: p_1 - p_2 = 0$ (the two proportions are not different)

$H_1: p_1 - p_2 \ne 0$ (the two proportions are different)
STEP 2: Select the distribution to use.
Because the samples are large and independent, we apply the normal
distribution to make the test.

STEP 3: Determine the rejection and nonrejection regions.

The $\ne$ sign in the alternative hypothesis indicates that the test is two-tailed.
For a 1% significance level, the critical values of z are -2.58 and 2.58.
These values are shown in the following figure:

[Figure: sampling distribution of $\hat{p}_1 - \hat{p}_2$ centered at $p_1 - p_2 = 0$. Each tail beyond the critical values z = -2.58 and z = 2.58 is a rejection region of area $\alpha/2 = 0.005$; the area between the center and each critical value is 0.4950.]
STEP 4: Calculate the value of the test statistic.
The pooled sample proportion is

$$\bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \frac{900(0.67) + 1200(0.77)}{900 + 1200} = 0.727$$

$$\bar{q} = 1 - \bar{p} = 1 - 0.727 = 0.273$$

The estimate of the standard deviation of $\hat{p}_1 - \hat{p}_2$ is

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{(0.727)(0.273)\left(\frac{1}{900} + \frac{1}{1200}\right)} = 0.01964474$$

The value of the test statistic $z$ for $\hat{p}_1 - \hat{p}_2$ is

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}} = \frac{(0.67 - 0.77) - 0}{0.01964474} = -5.09$$

STEP 5: Make a decision.


The value of the test statistic z = -5.09 for $\hat{p}_1 - \hat{p}_2$ falls in the rejection region.
Consequently, we reject the null hypothesis. As a result, we conclude that the
percentages of all men and all women who hold the view that a good diet is very
important to good health are different.
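The calculation in Example 2.10 can be reproduced, and extended with a two-tailed p-value, in a few lines of Python. This is a sketch under the example's stated figures; the variable names are illustrative, and the p-value step is an addition that the example itself does not carry out. For a two-tailed test the p-value is twice the tail area beyond the observed |z|.

```python
import math
from statistics import NormalDist

# Figures from Example 2.10.
n1, p_hat1 = 900, 0.67    # men
n2, p_hat2 = 1200, 0.77   # women
alpha = 0.01

# Pooled proportion from the two sample proportions (kept unrounded).
p_bar = (n1 * p_hat1 + n2 * p_hat2) / (n1 + n2)
q_bar = 1.0 - p_bar

# Estimated standard deviation of p_hat1 - p_hat2 and observed z under H0: p1 - p2 = 0.
s = math.sqrt(p_bar * q_bar * (1.0 / n1 + 1.0 / n2))
z = ((p_hat1 - p_hat2) - 0.0) / s

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2.0 * NormalDist().cdf(-abs(z))

print(f"z = {z:.2f}, p-value = {p_value:.2e}, reject H0: {p_value < alpha}")
```

Because the pooled proportion is not rounded to three decimals here, the standard deviation differs slightly in later decimals from the hand calculation, but z still rounds to -5.09 and the decision is unchanged.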
EXERCISES:

1. What is the shape of the sampling distribution of $\hat{p}_1 - \hat{p}_2$ for two large
samples? What are the mean and standard deviation of this sampling
distribution?
TASK 7
A sample of 500 observations taken from the first population gave $x_1 = 305$. Another
sample of 600 observations taken from the second population gave $x_2 = 348$.

a) Find the point estimate of $p_1 - p_2$.

b) Make a 97% confidence interval for $p_1 - p_2$.
c) Show the rejection and nonrejection regions on the sampling distribution of $\hat{p}_1 - \hat{p}_2$
for $H_0: p_1 = p_2$ versus $H_1: p_1 > p_2$. Use a significance level of 2.5%.
d) Find the value of the test statistic z for the test in part (c).
e) Will you reject the null hypothesis mentioned in part (c) at a significance level of
2.5%?
