Lecture 6 - Hypothesis Testing
Hypothesis
The jury does not know which hypothesis is true. They must
make a decision on the basis of evidence presented.
K J Somaiya Institute of Management, India
Nonstatistical Hypothesis Testing
That is, the jury is saying that there is enough evidence to conclude
that the defendant is guilty (i.e., there is enough evidence to
support the alternative hypothesis).
Notice that the jury is not saying that the defendant is innocent,
only that there is not enough evidence to support the alternative
hypothesis. That is why we never say that we accept the null
hypothesis.
P(Type I error) = α
P(Type II error) = β
There are two hypotheses. One is called the null hypothesis and
the other the alternative or research hypothesis. The usual
notation is:
H0: the null hypothesis (H0 is pronounced “H nought”)
H1: the alternative or research hypothesis
Developing Null and Alternative Hypotheses
- Alternative Hypothesis as a Research Hypothesis
Example:
A new teaching method is developed that is believed to be better than
the current method.
Alternative Hypothesis:
The new teaching method is better.
Null Hypothesis:
The new method is no better than the old method.
Example:
A new sales force bonus plan is developed in an attempt to increase
sales.
Alternative Hypothesis:
The new bonus plan increases sales.
Null Hypothesis:
The new bonus plan does not increase sales.
Example:
A new drug is developed with the goal of lowering blood pressure
more than the existing drug.
Alternative Hypothesis:
The new drug lowers blood pressure more than the existing drug.
Null Hypothesis:
The new drug does not lower blood pressure more than the existing
drug.
Example:
The label on a soft drink bottle states that it contains 67.6 fluid
ounces.
Alternative Hypothesis:
The label is incorrect. µ < 67.6 ounces.
Null Hypothesis:
The label is correct. µ ≥ 67.6 ounces.
Forms of the null and alternative hypotheses:
        One-tailed        One-tailed        Two-tailed
        (lower-tail)      (upper-tail)
H0:     µ ≥ µ0            µ ≤ µ0            µ = µ0
H1:     µ < µ0            µ > µ0            µ ≠ µ0
The testing procedure begins with the assumption that the null
hypothesis is true.
Once the null and alternative hypotheses are stated, the next step
is to randomly sample the population and calculate a test statistic
(in this example, the sample mean).
Because hypothesis tests are based on sample data, we must allow for the
possibility of errors.
Two possible errors can be made in any test:
A Type I error occurs when we reject the null hypothesis when it is true,
and
a Type II error occurs when we do not reject the null hypothesis when it is false.
The probability of making a Type I error when the null hypothesis is true
as an equality is called the level of significance, α. The probability of
making a Type II error is denoted β:
P(Type I error) = α
P(Type II error) = β
                                      Population Condition
Conclusion                            H0 True (µ ≤ 12)     H0 False (µ > 12)
Do not reject H0 (conclude µ ≤ 12)    Correct Decision     Type II Error
Reject H0 (conclude µ > 12)           Type I Error         Correct Decision

Type I error: reject H0 when it is TRUE.
Type II error: do NOT reject H0 when it is FALSE.
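The meaning of α as a long-run error rate can be checked with a small simulation. The sketch below is illustrative only: the values of σ, n, the number of trials and the random seed are assumptions that do not come from the lecture, and only µ0 = 12 and the upper-tail form of the test are taken from the table above.

```python
import random
from statistics import NormalDist

def simulate_type_i_rate(mu0=12.0, sigma=2.0, n=30, alpha=0.05,
                         trials=20000, seed=1):
    """Estimate how often an upper-tail z test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
    std_err = sigma / n ** 0.5                 # standard error of the mean
    rejections = 0
    for _ in range(trials):
        # Draw a sample mean from its sampling distribution with mu = mu0,
        # i.e. H0 holds as an equality.
        xbar = rng.gauss(mu0, std_err)
        z = (xbar - mu0) / std_err
        if z > z_crit:
            rejections += 1                    # rejecting a true H0: Type I error
    return rejections / trials

rate = simulate_type_i_rate()
print(f"Estimated P(Type I error) = {rate:.3f}")
```

The estimated rate should land close to the chosen α, which is the sense in which the decision maker controls the Type I error by choosing the level of significance.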
Example to explain the Process of Hypothesis Testing
The label on a large can of Hilltop Coffee states that the can contains
3 pounds of coffee. The FTC knows that Hilltop’s production process
cannot place exactly 3 pounds of coffee in each can, even if the mean
filling weight for the population of all cans filled is 3 pounds per can.
However, as long as the population mean filling weight is at least 3
pounds per can, the rights of consumers will be protected. Thus, the
FTC interprets the label information on a large can of coffee as a
claim by Hilltop that the population mean filling weight is at least 3
pounds per can.
If the value of the sample mean is less than 3 pounds, the sample results
will cast doubt on the null hypothesis. What we want to know is how much
less than 3 pounds the sample mean must be before we would be willing to
declare the difference significant and risk making a Type I error by falsely
accusing Hilltop of a label violation. A key factor in addressing this issue
is the value the decision maker selects for the level of significance.
In the Hilltop Coffee study, the director of the FTC’s testing program
made the following statement: “If the company is meeting its weight
specifications at µ = 3, I do not want to take action against them. But, I am
willing to risk a 1% chance of making such an error.”
From the director’s statement, we set the level of significance for the
hypothesis test at α = .01.
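Level of significance = α
Hypotheses                Type of test                        Rejection region
H0: μ = 3, H1: μ ≠ 3      Non-directional (two-tailed)        α/2 in each tail
H0: μ ≤ 3, H1: μ > 3      Directional (one-tailed, right)     α in the right tail
H0: μ ≥ 3, H1: μ < 3      Directional (one-tailed, left)      α in the left tail
In each case, the boundary of the rejection region is the critical value.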
• The p-value is the probability, computed using the test statistic, that
measures the support (or lack of support) provided by the sample for the
null hypothesis.
• If the p-value is less than or equal to the level of significance α, the value of the
test statistic is in the rejection region.
• Reject H0 if the p-value ≤ α.
Lower-Tailed Test About a Population Mean: σ Known
• p-Value Approach
Test statistic: z = (x̄ − µ0) / (σ/√n)
[Figure: sampling distribution of z with α = .10, −zα = −1.28, z = −1.46, and the p-value = .0721 shaded in the lower tail]
p-value = .0721 < α = .10, so reject H0.
Upper-Tailed Test About a Population Mean: σ Known
• p-Value Approach
Test statistic: z = (x̄ − µ0) / (σ/√n)
[Figure: sampling distribution of z with α = .04, zα = 1.75, z = 2.29, and the p-value = .011 shaded in the upper tail]
p-value = .011 < α = .04, so reject H0.
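The two p-values quoted on the slides above can be reproduced from the standard normal distribution. This is a minimal sketch using Python's standard library; the function name `p_value` and its `tail` argument are illustrative choices, not part of the lecture.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value of a z statistic for a 'lower', 'upper', or 'two' tailed test."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    if tail == "lower":
        return std_normal.cdf(z)         # area to the left of z
    if tail == "upper":
        return 1 - std_normal.cdf(z)     # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

p_lower = p_value(-1.46, "lower")   # lower-tailed example, alpha = .10
p_upper = p_value(2.29, "upper")    # upper-tailed example, alpha = .04
print(round(p_lower, 4), p_lower < 0.10)
print(round(p_upper, 4), p_upper < 0.04)
```

In both cases the p-value is below the stated α, which agrees with the "reject H0" conclusion on the slides.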
Critical Value Approach to One-Tailed Hypothesis Testing
• The test statistic z has a standard normal probability distribution.
• We can use the standard normal probability distribution table to find the z-value
with an area of α in the lower (or upper) tail of the distribution.
• The value of the test statistic that establishes the boundary of the
rejection region is called the critical value for the test.
• The rejection rule is:
• Lower tail: Reject H0 if z < −zα
• Upper tail: Reject H0 if z > zα
Lower-Tailed Test About a Population Mean: σ Known
• Critical Value Approach
Test statistic: z = (x̄ − µ0) / (σ/√n)
[Figure: sampling distribution of z with α = .10 shaded in the lower tail]
Reject H0 if z < −zα = −1.28; otherwise do not reject H0.
Upper-Tailed Test About a Population Mean: σ Known
• Critical Value Approach
Test statistic: z = (x̄ − µ0) / (σ/√n)
[Figure: sampling distribution of z with α = .05 shaded in the upper tail]
Reject H0 if z > zα = 1.645; otherwise do not reject H0.
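Instead of reading critical values from a table, they can be computed from the inverse of the standard normal distribution. The sketch below is one way to do this with Python's standard library; the helper names `critical_value` and `reject_h0` are hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha, tail):
    """Boundary of the rejection region for a one-tailed z test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # z with upper-tail area alpha
    return -z_alpha if tail == "lower" else z_alpha

def reject_h0(z, alpha, tail):
    """Apply the rejection rule: z < -z_alpha (lower) or z > z_alpha (upper)."""
    crit = critical_value(alpha, tail)
    return z < crit if tail == "lower" else z > crit

lower_crit = critical_value(0.10, "lower")   # lower-tailed slide, alpha = .10
upper_crit = critical_value(0.05, "upper")   # upper-tailed slide, alpha = .05
print(round(lower_crit, 2), round(upper_crit, 3))
```

The critical value approach and the p-value approach always lead to the same decision, since z falls in the rejection region exactly when its tail area is smaller than α.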
Example
The U.S. Golf Association (USGA) establishes rules that manufacturers of golf
equipment must meet if their products are to be acceptable for use in USGA
events. MaxFlight uses a high technology manufacturing process to produce
golf balls with a mean driving distance of 295 yards. Sometimes, however, the
process gets out of adjustment and produces golf balls with a mean driving
distance different from 295 yards. When the mean distance falls below 295
yards, the company worries about losing sales because the golf balls do not
provide as much distance as advertised. When the mean distance passes 295
yards, MaxFlight’s golf balls may be rejected by the USGA for exceeding the
overall distance standard concerning carry and roll. MaxFlight’s quality control
program involves taking periodic samples of 50 golf balls to monitor the
manufacturing process. The quality control team selected α = .05 as the level
of significance for the test. Data from previous tests conducted when the
process was known to be in adjustment show that the population standard
deviation can be assumed known with a value of σ = 12.
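Because MaxFlight is concerned about the mean distance drifting in either direction, a natural reading is a two-tailed test of H0: µ = 295 against H1: µ ≠ 295 (the hypotheses are an interpretation, since the example text does not state them). The sketch below computes the sample-mean limits implied by n = 50, σ = 12 and α = .05. The sample mean passed to the function is a hypothetical value, because the example gives none.

```python
from statistics import NormalDist

MU0, SIGMA, N, ALPHA = 295, 12, 50, 0.05        # values given in the example

std_err = SIGMA / N ** 0.5                      # sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)    # two-tailed: alpha/2 per tail

# Sample means beyond these limits lead to rejecting H0: mu = 295.
lower_limit = MU0 - z_crit * std_err
upper_limit = MU0 + z_crit * std_err

def two_tailed_z_test(sample_mean):
    """Return (z, p_value, reject) for H0: mu = 295 vs H1: mu != 295."""
    z = (sample_mean - MU0) / std_err
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p, abs(z) > z_crit

print(f"Reject H0 if the sample mean is below {lower_limit:.2f} "
      f"or above {upper_limit:.2f}")
print(two_tailed_z_test(297.6))   # 297.6 is a hypothetical sample mean
```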
The system will be cost effective if the mean account balance for
all customers is greater than $170.
Example: Rejection Region
For this upper-tail test, the rejection region is the set of sample means beyond a critical value x̄L, i.e. x̄ > x̄L.
α = P(Type I error)
  = P(Reject H0 given that H0 is true)
  = P(x̄ > x̄L given that µ = 170)
OR, in standardized form,
α = P(Z > zα)
Here, Z = (x̄ − µ)/(σ/√n) and zα = (x̄L − µ)/(σ/√n).
Let α = .05, so that zα = z.05 = 1.645.
Therefore, Decision Criteria: Reject H0 if z > 1.645 at the 5% significance level (or 95%
confidence level).
Test Statistic: z = (x̄ − µ)/(σ/√n) = 2.46
Critical Value: z.05 = 1.645. This can be calculated for any level of
significance (α).
H0: µ = 170
H1: µ > 170
Since z = 2.46 > z.05 = 1.645, reject H0 in favor of H1.
Example - The Big Picture
H0: µ = 170
H1: µ > 170
x̄L = 175.34
x̄ = 178
Since x̄ = 178 > x̄L = 175.34, reject H0 in favor of H1.
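The numbers in this example are mutually consistent, which can be checked without knowing σ or n. The sketch below backs out the standard error implied by the critical sample mean 175.34 and α = .05, then recomputes z and the p-value for the sample mean 178. Treating 175.34 as µ0 + z.05 × (standard error) is an assumption about how that figure was obtained.

```python
from statistics import NormalDist

MU0 = 170          # H0: mu = 170
XBAR = 178         # sample mean
XBAR_L = 175.34    # critical sample mean
ALPHA = 0.05

z_alpha = NormalDist().inv_cdf(1 - ALPHA)     # about 1.645

# If x_L = mu0 + z_alpha * (sigma / sqrt(n)), the standard error can be
# recovered from the quoted numbers (an inference, not a given value).
std_err = (XBAR_L - MU0) / z_alpha

z = (XBAR - MU0) / std_err
p_value = 1 - NormalDist().cdf(z)             # upper-tail area beyond z

print(f"implied standard error = {std_err:.3f}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if z > z_alpha else "Do not reject H0")
```

Comparing x̄ with x̄L and comparing z with zα are the same test expressed in different units, so both must give the same decision.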
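p-Value of a Test
For the example above, the test statistic z = 2.46 lies beyond the critical value z.05 = 1.645.
α = .05
p-value = P(Z > 2.46) = .0069
Since p-value = .0069 < α = .05, reject H0.
Interpreting the p-value
The smaller the p-value, the stronger the evidence in favor of the alternative hypothesis. Ranked from the smallest to the largest p-value:
Overwhelming Evidence (Highly Significant)
Strong Evidence (Significant)
Weak Evidence (Not Significant)
No Evidence (Not Significant)
For the example, p = .0069.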
The department store example (Example 11.1) was a one-tail test,
because the rejection region is located in only one tail of the
sampling distribution.
The SSA Envelope example is a left-tail test because the rejection
region is located in the left tail of the sampling distribution.
One-Tail Test (left tail)     Two-Tail Test     One-Tail Test (right tail)
H0 : µ ≥ µ0                   H0 : µ = µ0       H0 : µ ≤ µ0
H1 : µ < µ0                   H1 : µ ≠ µ0       H1 : µ > µ0
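Test Statistic for Testing
Hypothesis about Population Mean
The table below summarizes the test statistics for testing hypotheses about a population
mean.
Population Standard Deviation is known     Population Standard Deviation is unknown
Z statistic: z = (x̄ − µ0) / (σ/√n)         t statistic: t = (x̄ − µ0) / (s/√n), with n − 1 degrees of freedom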
A filling machine at a soft drink factory is designed to fill an average of 200 ml of drink per
bottle. A random sample of 50 filled bottles was taken and the average volume of soft drink
was computed to be 198 ml per bottle with a standard deviation of 10 ml. Test the hypothesis
that the mean volume of soft drink per bottle is less than 200 ml at the 5% level of significance.
Solution:
1. State the appropriate null and alternative hypotheses
H0: μ = 200
H1: μ < 200 (This is a Left Tail Test)
2. Identify the given data, the sample size and determine the appropriate technique
α = 0.05, n = 50, x̄ = 198 and s = 10
σ is unknown but sample size n > 30 so this is a Z test
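The remaining steps of this solution (compute the test statistic, compare it with the critical value, conclude) can be sketched as follows. The code uses only the figures given in the problem and, as the solution states, uses the sample standard deviation in place of σ because n > 30.

```python
from statistics import NormalDist

mu0, xbar, s, n, alpha = 200, 198, 10, 50, 0.05   # values from the problem

z = (xbar - mu0) / (s / n ** 0.5)       # s stands in for sigma (n > 30)
z_crit = NormalDist().inv_cdf(alpha)    # lower-tail critical value, -z_alpha
p_value = NormalDist().cdf(z)           # area to the left of z

reject = z < z_crit                     # left-tail rejection rule
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

With these figures the test statistic does not fall in the rejection region, so at the 5% level there is not enough evidence to conclude that the mean fill is less than 200 ml.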
The results of a household survey indicated that a sample of 20 households bought an average
of 75 litres of milk per month with a standard deviation of 13 litres. Test the hypothesis that
the value of the population mean is 70 litres against the alternative that it is more than 70
litres. Use the 5% level of significance.
Solution:
State the appropriate null and alternative hypotheses
H0: μ = 70
H1: μ > 70 (This is a Right Tail Test)
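Here σ is unknown and the sample is small (n = 20), so a t test with n − 1 = 19 degrees of freedom is the natural choice. The sketch below completes the calculation under that assumption. Python's standard library has no t distribution, so the critical value t.05, 19 = 1.729 is hard-coded from a standard t table.

```python
mu0, xbar, s, n = 70, 75, 13, 20   # values from the problem
T_CRIT = 1.729                     # t.05 with 19 df, taken from a t table

t = (xbar - mu0) / (s / n ** 0.5)  # t statistic with n - 1 = 19 df
reject = t > T_CRIT                # right-tail rejection rule

print(f"t = {t:.3f}, critical value = {T_CRIT}")
print("Reject H0" if reject else "Do not reject H0")
```

The statistic falls just short of the critical value, so at the 5% level there is not enough evidence that mean consumption exceeds 70 litres per month. A result this close to the boundary is sensitive to rounding in the inputs.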
Royal Tyres has launched a new brand of tyres for tractors and claims that under
normal circumstances the average life of the tyres is 40,000 km. A retailer wants to test
this claim and has taken a random sample of 8 tyres. He tests the life of the tyres under
normal circumstances. The results obtained are presented in the table below. Use α =
0.05 for testing the hypothesis.
Tyres         1     2     3     4     5     6      7     8
Kms (‘000)    35    38    42    41    39    41.5   43    38.5
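Solution
H0: µ = 40 (thousand km)
H1: µ ≠ 40 (This is a Two Tail Test)
At a 5% significance level (i.e. α = .05), we have α/2 = .025. With n − 1 = 7
degrees of freedom, t.025, 7 = 2.365 and our rejection region is:
t < −t.025, 7 = −2.365 or t > +t.025, 7 = 2.365
The computed test statistic is t = −0.27, which lies between the two critical
values, so we do not reject H0. There is not enough evidence to dispute the
claimed average life of 40,000 km.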
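The test statistic t = −0.27 can be reproduced directly from the eight observations. This is a minimal sketch assuming a two-tailed test of H0: µ = 40 (in thousands of km); the critical value 2.365 is hard-coded from a standard t table, since the standard library provides no t distribution.

```python
from statistics import mean, stdev

life = [35, 38, 42, 41, 39, 41.5, 43, 38.5]   # tyre life in '000 km
MU0 = 40                                       # claimed mean life ('000 km)
T_CRIT = 2.365                                 # t.025 with 7 df, from a t table

n = len(life)
xbar = mean(life)
s = stdev(life)                                # sample SD (n - 1 divisor)
t = (xbar - MU0) / (s / n ** 0.5)

reject = abs(t) > T_CRIT                       # two-tail rejection rule
print(f"mean = {xbar}, s = {s:.3f}, t = {t:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```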
Steps to follow to conduct any hypothesis test
1. Identify the null and alternative hypotheses.
2. Identify the appropriate test statistic based on the given data (i.e. whether the
population SD is known or unknown, and accordingly a z test or a t test).
3. Identify the rejection region with respect to α and state the decision
rule: Reject H0 if
| Z/t test statistic | > Z/t critical value OR
p-value < α
4. Assuming H0 is true, compute the test statistic for the sample mean
(and/or the p-value).
5. Compare the computed values using the decision rule from step 3.
6. Conclusion: statistical, in terms of “reject H0 …” or “do not reject H0 …”,
and managerial, in terms of “there exists enough evidence …” or
“there does not exist enough evidence …”.
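The steps above can be sketched as a single function for the z-test case. The function name `z_test`, its arguments and the returned dictionary are illustrative choices rather than anything prescribed by the lecture, and the sample call reuses the soft drink filling figures with the sample standard deviation standing in for σ.

```python
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, tail):
    """Run the steps for a z test; tail is 'lower', 'upper', or 'two'."""
    norm = NormalDist()
    z = (xbar - mu0) / (sigma / n ** 0.5)     # test statistic, assuming H0
    if tail == "lower":
        p = norm.cdf(z)
    elif tail == "upper":
        p = 1 - norm.cdf(z)
    elif tail == "two":
        p = 2 * (1 - norm.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    reject = p < alpha                        # decision rule: p-value < alpha
    statistical = "Reject H0" if reject else "Do not reject H0"
    managerial = ("there exists enough evidence" if reject
                  else "there does not exist enough evidence")
    return {"z": z, "p_value": p, "statistical": statistical,
            "managerial": managerial}

result = z_test(xbar=198, mu0=200, sigma=10, n=50, alpha=0.05, tail="lower")
print(result)
```

Returning both conclusions mirrors the last step: the statistical statement is about H0, while the managerial statement is about the evidence for the claim in H1.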