Professional Documents
Culture Documents
Hypothesis Tests
Hypothesis Tests
Hypothesis Tests
Hypothesis tests
Learning outcomes:
• understand the nature of a hypothesis test, the difference between one-tailed and two-tailed tests, and the
terms null hypothesis, alternative hypothesis, significance level, rejection region (or critical region),
acceptance region and test statistic (Outcomes of hypothesis tests are expected to be interpreted in terms
of the contexts in which questions are set.)
• formulate hypotheses and carry out a hypothesis test in the context of a single observation from a
population which has a binomial or Poisson distribution, using
• formulate hypotheses and carry out a hypothesis test concerning the population mean in cases where the
population is normally distributed with known variance or where a large sample is used
• understand the terms Type I error and Type II error in relation to hypothesis tests
• calculate the probabilities of making Type I and Type II errors in specific situations involving tests based on a
normal distribution or direct evaluation of binomial or Poisson probabilities.
What is hypothesis testing?
Consider this situation.
An opinion poll asks a sample of voters who will win they vote for if there are two candidates. So e.g. if 53%
voters say they will vote for candidate A, how will you be certain that candidate A will win the election?
Evidence is provided in various courtrooms. A jury then has to make a decision based on this evidence. But
how can a judge be so certain that this evidence is true?
• A Hypothesis test analyses the data to find how likely it is that the results could happen by chance.
Suppose that you have a set of dice numbered 1 to 6. Some of the dice are fair and some are biased. The biased
dice have bias towards the number six.
Now if you pick one of the dice and roll it 16 times. How many times would you need to roll a six to convince
yourself that this die is biased?
It is possible, although, highly unlikely, to roll 16 sixes with 16 rolls of a fair die. If is the number of sixes obtained
with 16 rolls of the die then .
We can calculate theoretical probabilities of rolling different numbers of sixes.
These probabilities tell us that in 16 rolls of a die, almost 89% of the time we would expect to roll at most 4 sixes, and for over
96% of the time we would expect to roll at most 5 sixes. This means that rolling 6 sixes or more will occur by chance less than 4%
of the time.
The shaded regions on these graphs show the regions for the probabilities. The first graph shows and the second shows .
In both graphs, the region to the right of the dotted line represents 5% of the total probability.
The second of the previous two graphs shows that is completely to the right of the dotted line, which we would
expect as is less than 5%. However, the first graph shows the dotted line dividing into two parts, which is as
expected since is greater than 5%.
• If we choose 5% as the critical percentage at which the number of sixes rolled is significant; that is, the
critical value at which the number of sixes rolled is unlikely to occur by chance, then rolling 6 sixes is the critical
value we use to decide if the die is biased.
• The percentage value of 5% is known as the significance level. The significance level determines where we draw a
dotted line on the graph of the probabilities.
• This region at one end of the graph is the critical region. The other region of the graph is the acceptance region.
Key facts
• The significance level is the probability of rejecting a claim. A claim cannot be proven by
scientific testing and analysis, but if the probability of it occurring by chance is very small it is
said that there is sufficient evidence to reject the claim.
• The range of values at which you reject the claim is the critical region or rejection region.
• The value at which you change from accepting to rejecting the claim is the critical value.
Sample data, such as the data you obtain in the dice experiment, may only partly support or refute a supposition or
claim. A hypothesis is a claim believed or suspected to be true.
A hypothesis test is the investigation that aims to find out if the claim could happen by chance or if the probability
of it happening by chance is statistically significant.
It is very unlikely in any hypothesis test to be able to be absolutely certain that a claim is true; what you can say is
that the test shows the probability of the claim happening by chance is so small that you are statistically confident in
your decision to accept or reject the claim.
• If you are not going to accept the null hypothesis, then you must have an alternative hypothesis to accept; the
alternative hypothesis abbreviation is .
• Both the null and alternative hypotheses are expressed in terms of a parameter, such as a probability or a mean
value.
Example
Experience has shown that drivers on a Formula 1 racetrack simulator crash 40% of the time. Zander decides to run a simulator
training program to reduce the number of crashes. To evaluate the effectiveness of the training program, Zander allows 20 drivers,
one by one, to use the simulator. Zander counts how many of the drivers crash.
Four drivers crash. Test the effectiveness of Zander’s simulation training program at the 5% level of significance.
Solution:
Let be ‘the number of drivers that crash’. Then .
Four drivers crash; you calculate the test statistic, the region
As , the result is not significant. 4 is not the critical value so will be accepted. There is insufficient evidence at the 5% level of
significance to suggest that the proportion of drivers crashing their simulators has decreased.
In this example:
• is the assumption that there is no difference between the usual outcome and what you are
testing.
• Test statistic is the calculated region/probability using sample data in a hypothesis test. You
compare your calculated test statistic with what is the expected outcome from the null
hypothesis.
• In this case, the calculated probability must be less than the percentage significance level to say
there is sufficient evidence to reject the null hypothesis.
Solution:
As , the result not significant. 3 is the critical value so will be rejected. There is sufficient evidence at the 5% level
of significance to suggest that the proportion of drivers crashing their simulators has decreased and the simulator
program is effective.
• When you investigate a claim using a hypothesis test, you are not interested in the probability of an exact number
occurring, but the probability of a range of values up to and including that number occurring. In effect that is the
probability of a region. The test statistic is the calculated region using sample data in a hypothesis test.
• The range of values for which you reject the null hypothesis is the critical region or rejection region.
• The value at which you change from accepting the null hypothesis to rejecting it is the critical value.
Example
Studies suggest that 10% of the world’s population is left-handed. Bailin suspects that being left-handed is less common amongst
basketball
players and plans to test this by asking a random sample of 50 basketball players if they are left-handed.
a)Find the rejection region at the 5% significance level and state the critical value.
b)If the critical value is 2, what would be the least integer percentage significance level for you to conclude that Bailin’s suspicion is
correct?
Solution:
c)Let X be the random variable ‘number of basketball players’ and p be the probability of being left-handed; then
Significance level: 5%
The rejection region is such that and
b) For Bailin to reject the null hypothesis the probability of the region must be less than the significance level.
, so with a critical value of 2 the significance level needs to be 12% to reject the null hypothesis.
Summary
To carry out a hypothesis test:
a) Does this provide evidence at the 5% significance level of an increase in the percentage of the population that
react positively to this new treatment?
b) Find the critical value.
Solution:
c) Let be ‘the number of people who react positively to the new cream’. Then
Hence
Cont'd
As , so will be accepted. There is insufficient evidence at the 5% level of significance to support the
manufacture’s claim.
; accept the null hypothesis. There is insufficient evidence to support the claim that the spinner is not working
correctly.
The graph shows the distribution of the spinner landing on red when it is spun ten times.
Note that the alternative hypothesis uses to show that neither an increase nor a decrease is expected; and the significance level
percentage has been split to show two critical regions at either end of the graph.
• If a claim is for neither an increase nor a decrease, just a different value of the parameter, then the hypothesis test is two-tailed.
• In a two-tailed test use to express the alternative hypothesis and share the significance level percentage between the two tails;
that is’ between the two critical regions.
Example
In a particular coastal area, on average, 35% of the rocks contain fossils. Jamila selects 12 rocks at random from this area and
breaks them open. She finds fossils in two of the rocks.
a) Test at the 10% level of significance if there is evidence to show the average of 35% is incorrect.
b) How many fossils would Jamila need to find to reject the null hypothesis?
c) Would Jamila’s conclusion be different if the test was one – tailed?
Solution:
Let X be the random variable ‘the proportion of rocks containing fossils’. Then .
Therefore, accept the null hypothesis. Jamila must have been unlucky to find so few fossils. She expected to find about four fossils
in 12 rocks . She found only two fossils, yet this still was too many to have any evidence to reject the null hypothesis.
b) and
Since is greater than 5% and is less than 5%, the critical value is 1.
c) Since , the answer is still to accept the null hypothesis: there is insufficient evidence to show that
• For a single observation from a population that has a Poisson distribution, we can directly
compare Poisson probabilities or use a normal approximation to the Poisson distribution.
• To carry out hypothesis testing with the Poisson distribution, the process is the same as
before.
Example
Records show that 1% of the population has a positive reaction to a test for a particular allergy. In a village, 120 people are tested
and four people have a positive reaction. Test at the 5% level of significance if there is any evidence of an increase in the
population with a positive reaction for this particular allergy.
Solution:
State the distribution; is large and , so approximate to Poisson.
As , reject the null hypothesis. There is evidence at the 5% level of significance to suggest that in this village there is an increase
in the population who react positively with this particular allergy.
Example
The number of accidents on a stretch of road occurs at the rate of seven each month. New traffic measures are put in place to
reduce the number of accidents. In the following month, there are only two accidents.
a)Test at the 5% level of significance if there is evidence that the new traffic measures have significantly reduced the number of
accidents.
b)Over a period of 6 months, there are 32 accidents. It is claimed the new traffic measures are no longer reducing accidents.
I.Test this claim at the 5% level of significance.
II.What would your conclusion be if you test the claim at the 10% level of significance?
Solution:
a)
As reject the null hypothesis. There is evidence to suggest the new traffic measures reduce the number of accidents.
Cont’d
b. I)
As accept the null hypothesis. There is insufficient evidence to suggest the new traffic measures have reduced the number of
accidents.
b. II) As , reject the null hypothesis. There is evidence at the 10% significance level that the new traffic measures reduce the
number of accidents.
Hypothesis testing of the Population mean
• Sample data are often collected to test a statistical hypothesis about a population. Such a sample, even if
it is a random sample, may or may not be representative of the population.
• Using the central limit theorem, estimates of the sample mean and sample variance can be calculated
from the sample and these estimates can be used to see if they support or reject the null hypothesis.
For a sample size drawn from a normal distribution with known variance, , and sample mean , the test statistic
is:
so the masses are in the critical region. We reject , because there is some evidence to support the plant food producer’s claim.
Alternative way of writing the solution
• An alternative approach is to compare the test statistic z with the critical value from tables.
• It is possible to carry out a hypothesis test of a population mean when the population variance is unknown.
Provided the sample is large, we follow the same process as for a hypothesis test of a population mean from a
normal population with known variance. For the variance, we use , an unbiased estimate of the population
variance.
If the population mean and population variance are unknown, sample data can be used to conduct a hypothesis
test that the population mean has a particular value, as follows:
For a large sample size drawn with unknown and sample mean , the test statistic is:
Investigate at the 10% level of significance if there is any evidence to support the researcher’s claim. What advice
would you give to the researcher based on your findings?
Solution:
Accept Reject
The null hypothesis, is true Correct decision Type I error
The null hypothesis, is false Type II error Correct decision
Cont’d
• For example, in UK and many other countries the law states that a person is innocent until ‘proven’ guilty. A
Type I error happens when an innocent person is found guilty; a Type II error happens when a guilty person is
found innocent.
• In both these situations, the evidence was either incorrect or not interpreted correctly.
• E.g. for the experiment to determine whether a die is biased – a Type I error occurs when we conclude the die
is biased given that it is fair; and a Type II error occurs when we conclude the die is fair given that it is
biased.
In statistics, an error may be made if:
• However, simply changing the significance level to a higher percentage increases the chance
of a Type I error.