
HYPOTHESIS TESTING

RM185101 - Applied Statistics and Probability

Ira Mutiara Anjasmara, PhD

Postgraduate Program
Department of Geomatics Engineering
Faculty of Civil, Planning, and Geo Engineering
Institut Teknologi Sepuluh Nopember
Sample distribution

- Recall: a sample is a selection from a population that is deemed to be
  representative of that population.
- Many samples can be taken from the same population, and each sample should
  be unbiased, i.e., collected in a random fashion.
- Statistics can be calculated from each sample,
  e.g., the mean, variance, and standard deviation.
- We wish to infer information about the population based on all of the possible
  samples, rather than just one.

-IM Anjasmara, 2022-

RM185101 - Applied Statistics and Probability 2/61 HYPOTHESIS TESTING


Sampling distribution of the mean

- If many different samples are selected from the same population we get a range
  of different sample means.
  For example, suppose we are interested in the mean height of Indonesian
  women: a sample of 100 women will give us a sample mean, but in general, this
  will be different to the true (population) mean. Another sample of 100 women
  will give a different sample mean, also different to the true mean.
- All the possible sample means taken from a population have their own
  distribution, known as the sampling distribution of the mean.



Sampling distribution of the mean

The sampling distribution of the mean is the probability distribution for all
possible values of the sample mean, x̄.

If we take enough samples, the mean of these sample means will tend towards the
population mean, i.e.:
E[x̄] = µ (1)
or, the expected value of x̄ = the population mean.



Standard error of the mean

The standard deviation of all the sample means measures the dispersion of all
possible values of x̄ for all possible samples.

The standard deviation of all the sample means is called the standard error of the
mean, and is defined by:
$$\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \, \frac{\sigma}{\sqrt{n}} \quad (2)$$
where σ is the standard deviation of the population being sampled, n is the sample
size, and N is the population size (often unknown).



Standard error of the mean

If n/N < 0.05, i.e., the sample size is much smaller than the population size, then
eq. (2) reduces to:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad (3)$$
From now on, we will assume that the n/N < 0.05 condition is always met.





Standard error of the mean: Example

Assume that many random samples of size n = 49 are to be taken from a large
population with mean µ = 100 and standard deviation σ = 21. What are the mean
and standard deviation of the values of all the sample means?

We know that repeating the sampling process will generate different sample means
due to the different samples selected. The mean of these x̄ values is E[x̄] = µ = 100.
Since the population is large relative to the sample, the standard deviation of the x̄
values is:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{21}{\sqrt{49}} = 3$$
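
The standard error calculation can be checked with a few lines of Python. This is only a sketch: the function name `standard_error` and its arguments are chosen here for illustration, and the two population sizes in the last lines are hypothetical values used to show the effect of the finite population correction in eq. (2).

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size (None if unknown or very large)
    """
    se = sigma / math.sqrt(n)                 # eq. (3)
    if N is not None:
        # eq. (2): apply the finite population correction factor
        se *= math.sqrt((N - n) / (N - 1))
    return se

print(standard_error(21, 49))             # large population, eq. (3)
print(standard_error(21, 49, N=10000))    # hypothetical N with n/N < 0.05
print(standard_error(21, 49, N=100))      # hypothetical N with n/N = 0.49
```

The first call reproduces the value of 3 from the example; the other two suggest how little the correction matters when n/N is small, and how much when it is not.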



Central limit theorem

The central limit theorem states:

In selecting simple random samples of size n from a population of mean µ and
standard deviation σ, the sampling distribution of x̄ approaches a normal probability
distribution with mean µ and standard deviation σx̄ as n increases; n ≥ 30 is the
usual rule of thumb for the approximation to be adequate.

This is true for any population distribution with a finite variance, not just
normally-distributed ones.

Furthermore, if the population being sampled itself has a normal distribution, then
the sampling distribution of the mean is normally distributed for any value of n.
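
A small simulation can make the theorem more concrete. The sketch below is an illustration under assumed settings, not part of the course material: it draws repeated samples of size 30 from a strongly skewed (exponential) population with µ = 1 and σ = 1, and compares the mean and standard deviation of the sample means with µ and σ/√n. The seed, the number of samples and the choice of population are arbitrary.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable

n = 30                   # sample size (the rule-of-thumb threshold)
num_samples = 5000       # number of repeated samples (arbitrary choice)

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1
mu, sigma = 1.0, 1.0

# Draw many samples and record the mean of each one
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("mean of sample means:", round(mean_of_means, 3), "vs mu =", mu)
print("sd of sample means:  ", round(sd_of_means, 3),
      "vs sigma/sqrt(n) =", round(sigma / n ** 0.5, 3))
```

Plotting a histogram of `sample_means` should show a roughly bell-shaped curve even though the population itself is far from normal.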



Central limit theorem: example (1)
Consider a variable x that has a population distribution given by:

Since this distribution has a shape that cannot be represented by a mathematical
equation, we cannot find the answer to the question: What is p(x < 1340)? (The
red-shaded area).


Central limit theorem: example (2)
We can, however, sample this distribution many times, and as long as the sample
size is ≥ 30, we ‘convert’ the distribution of x into a distribution of the sample
means x̄, which, from the central limit theorem, is normally-distributed:

Now we can ask: What is p(x̄ < 1340)? (The red-shaded area).




Probabilities using sampling distributions

We can use the sampling distribution of the mean to compute the probability of
selecting a sample that will provide a value of x̄ within any specified distance from
the population mean.

If we approximate the sampling distribution of x̄ to a normal distribution, through
the central limit theorem, we may compute a z-value, where:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
Note that here the z-value is defined using x̄ rather than x, and the standard
error of the mean rather than the standard deviation.



Probabilities using sampling distributions
So, for the population of x given, we would convert the normal distribution of the
sample means (x̄) into a standard normal distribution of z-scores (assuming we knew
µ and σ):

And now we can find p(x̄ < 1340) = p(z < −1.4)


Probabilities using sampling distributions: example

For the Statistics mid-semester test the mean score of 85 students was 48% with a
standard deviation of 25%. What is the probability that the mean of a sample of 30
students will be greater than 55%?

We have µ = 48, σ = 25, n = 30, and require p(x̄ > 55).

First, compute the standard error of the mean using eq. (3). (Strictly, n/N = 30/85
is well above 0.05 here, so eq. (2) would apply; the example keeps the simplifying
assumption made earlier.)
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{25}{\sqrt{30}} = 4.56$$
Then calculate the z-score:
$$z = \frac{55 - 48}{4.56} = 1.53$$



Probabilities using sampling distributions: example

From the standard normal tables, the area enclosed between z = 0 (the mean) and
z = 1.53 is A = 0.4370.

So the area we want (shaded) is 0.5 − 0.4370 = 0.0630,
i.e., p(x̄ > 55) = 0.0630.
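
The table lookup can be cross-checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library in place of the printed tables; the variable names are chosen here for illustration.

```python
from statistics import NormalDist
import math

mu, sigma, n = 48, 25, 30      # values from the example
x_bar = 55

se = sigma / math.sqrt(n)      # standard error of the mean, eq. (3)
z = (x_bar - mu) / se          # z-score of the sample mean
p = 1 - NormalDist().cdf(z)    # upper-tail area, p(x̄ > 55)

print(round(se, 2), round(z, 2), round(p, 4))
```

The result should be close to the table-based value of 0.0630; small differences are expected because the tables use z rounded to two decimal places.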



Hypothesis Testing
- This is a process for making a statistical decision based on information
  contained in the sample.
- A statistical hypothesis is an assumption, statement or question concerning one
  or more populations, which may or may not be true.
- The truth of the assumption can only be known for certain if we examine the
  whole population, which is impractical.
- Thus, the aim of hypothesis testing is to decide whether the assumption is true
  based on random samples.
- We make an assumption about the value of a population statistic (e.g., mean,
  variance):
  - this is called the null hypothesis
  - it is denoted H0.


Hypothesis Testing

- We then define another hypothesis, opposite to the null hypothesis:
  - this is called the alternative hypothesis
  - it is denoted Ha or sometimes H1.
- We then test the null hypothesis, usually through experiment on a sample.
- If the results of the test (based on the sample) are inconsistent with the null
  hypothesis then we reject H0.
- If the results are consistent with the null hypothesis, then this does not
  necessarily imply that H0 is true, or that we accept it, only that there is
  insufficient evidence to reject H0.



Hypothesis Testing
The null hypothesis is phrased so that the status quo is preserved, i.e., things don’t
change.
For example, in (Western) criminal law, a defendant on trial for a crime is innocent
until proven guilty. Therefore, we make:

H0 : innocent

i.e., the null hypothesis states that he will still be innocent after the trial, preserving
the status quo.
Think in terms of a courtroom:
- the null hypothesis is like the defence lawyer, pleading innocence;
- the alternative hypothesis is like the prosecution lawyer, attempting to prove
  guilt.


Hypothesis Testing

- When testing a hypothesis, the aim is to use sample data to refute the null
  hypothesis (and not to prove the alternative hypothesis). Then, if there is any
  doubt as to the validity of the alternative hypothesis, we revert to the null
  hypothesis.
- In mathematical terms, the null hypothesis is a statement of a population, and
  not sample, statistic, because the population statistic is a representation of the
  accepted wisdom, while the sample statistic is a representation of the new
  evidence. In this chapter, null hypothesis statements will concern µ (and never
  x̄).



Hypothesis Testing: Example
The speed of light is known to be 299,792,458 m s⁻¹. However, someone comes
along and measures it as 299,792,457 m s⁻¹.
Null hypothesis: the accepted speed of light value has been determined over decades
of rigorous research, and there is no need to change it:
H0 : µ = 299,792,458
Alternative hypothesis: the new evidence suggests that the value needs changing:
Ha : µ ≠ 299,792,458
Note that we do not test Ha : x̄ = 299,792,457. Furthermore, our conclusions after
the statistical test is performed will be made in terms of the null hypothesis. For
example:
- there is no evidence to suggest that H0 should be rejected; or,
- the new evidence suggests that H0 should be rejected.


Errors in hypothesis testing

Because our results are based only on a sample, our decision to reject or accept
a hypothesis may be incorrect.
There are two types of errors:
- Type I error: reject H0 when it is true (being too sceptical)
- Type II error: accept H0 when it is false (being too gullible)

             H0 true             H0 false
Accept H0    Correct decision    Type II error
Reject H0    Type I error        Correct decision



Errors in hypothesis testing

A type II error is more difficult to detect than a type I error. Recall, this is when we
accept H0 when it is false (i.e., we free the guilty man).
In an experiment, if we accept H0 it may be because H0 is actually true, but it may
also be because we did not have enough evidence to reject it. This latter case is like
the police bungling an investigation, so that the jury have no choice but to free the
guilty man.



Errors in hypothesis testing

To acknowledge the possibility of a type II error, we say “do not reject H0” rather
than “accept H0”.
- This statement does not rule out the possibility that H0 is in fact false.
- It embodies the subtle difference between saying “this man is innocent” and
  “we cannot prove this man guilty”.
So a statistical test can either reject or fail to reject a null hypothesis, but can never
prove it.

Failing to reject a null hypothesis does not prove it true.

It is largely up to the researcher to determine which type of error is the “worst” to
commit.


Errors in hypothesis testing: Example

Example 1: A woman suspects she’s pregnant and visits the doctor.

H0            Reality       Action                                   Error    Consequence
not pregnant  not pregnant  informed pregnant (H0 rejected)          type I   freaks out
not pregnant  pregnant      informed not pregnant (H0 not rejected)  type II  carries on

Example 2: A man is on trial for murder:

H0        Reality   Action                       Error    Consequence
innocent  innocent  convicted (H0 rejected)      type I   sentenced to death
innocent  guilty    freed (H0 not rejected)      type II  released



Significance levels
Type I errors are controlled by setting the level of significance (α) for the test:

α = p(type I error occurring)

i.e., there is an α chance that we mistakenly reject H0 .

The value of α gives the area under the probability distribution curve corresponding
to the probability of making a type I error. An example for the normal distribution is:



Significance levels

As a null hypothesis can never be rejected with 100% certainty, we test at various
levels of significance: a small value of α means a small chance of mistakenly
rejecting H0, and thus a large chance of correctly retaining it when it is true.

Suppose we reject H0 at α = 0.01. This is a more significant result than if H0 were
rejected at α = 0.05, because 0.01 represents only a 1% chance of mistakenly
rejecting it, whereas 0.05 represents a 5% chance of mistakenly rejecting it.
The choice of value of α is subjective, but should always be greater than zero. The
chosen value should be at most 0.1, but the most popular choice is 0.05.
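
The meaning of α as a type I error rate can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the course material: it repeatedly samples from a hypothetical normal population for which H0 is true, applies a two-tailed z-test of the kind described in the steps that follow, and counts how often H0 is (wrongly) rejected. The population, seed and number of trials are arbitrary.

```python
import random
import statistics
from statistics import NormalDist

rng = random.Random(1)            # fixed seed for repeatability

alpha = 0.05                      # significance level
mu0, sigma, n = 0.0, 1.0, 30      # hypothetical population; H0: mu = mu0 is true
trials = 4000                     # number of simulated experiments (arbitrary)

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
se = sigma / n ** 0.5                          # standard error of the mean

rejections = 0
for _ in range(trials):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z = (statistics.fmean(sample) - mu0) / se
    if abs(z) > z_crit:           # z falls in the rejection region
        rejections += 1

type_i_rate = rejections / trials
print("observed type I error rate:", type_i_rate, "(alpha =", alpha, ")")
```

The observed rate should land near α, which is the sense in which the significance level controls type I errors.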



The 8 steps to hypothesis testing

Performing a hypothesis test can be broken down into 8 steps.


1. formulate hypotheses
2. determine number of tails
3. determine significance levels
4. determine critical z value
5. determine rejection region
6. determine test statistic
7. perform the test
8. draw conclusion



Step 1 - formulate hypotheses

i) Formulate an alternative hypothesis, Ha

This is the statement of the “new” result, i.e., the result that is contended to alter
the status quo; or, the case for the prosecution.
Decide what we are testing for:
- are we testing whether the new results are “less than” or “greater than” the
  established results (use < or > in the formulation for Ha);
- or whether they are “just different” (use ≠ in the formulation for Ha)?
The clue comes from the wording of the question. If the question doesn’t specifically
state “less than” or “greater than” (or wording to that effect), use ≠.



Step 1 - formulate hypotheses
ii) Formulate the null hypothesis, H0

This will be the opposite or inverse of the alternative hypothesis; i.e., what is the
status quo?
Use the opposite sign to Ha (≤, ≥, or =).
In summary, a hypothesis test concerning the value of a population mean (µ) can
take one of three forms:

H0          Ha
µ ≤ µ0      µ > µ0
µ ≥ µ0      µ < µ0
µ = µ0      µ ≠ µ0

where µ0 is the numerical value being considered in the hypothesis.




Step 2 - determine number of tails

The number of tails of a test comes from a graphical representation of the
hypothesis:



Step 2 - determine number of tails - Example

It is known that a certain quantity has a value µ0 . Recent tests find that this
quantity actually has a value x̄.

For a 1-tailed test:
- if the value of x̄ < µ0, then have Ha : µ < µ0
- if the value of x̄ > µ0, then have Ha : µ > µ0
For a 2-tailed test always have:
- Ha : µ ≠ µ0

In general: if in doubt, use a 2-tailed test.



Step 3 - determine significance level

Recall, the significance level is the probability of rejecting a true null hypothesis. This
value is usually given to you, either as a fraction between (but not including) 0 and 1
or as the equivalent percentage.
If it is not given, it is up to you to choose a value. Common choices are α = 0.1,
0.05, 0.01 and 0.001, but 0.05 is most widely used.
The value of α equals the area of the rejection region: this is the part of the normal
distribution where the sample data indicate that H0 should be rejected.



Step 4 - determine critical z value

Use the value of α to get a value of zα from the normal tables.
- i.e., what value of z gives an area in the rejection region of α?

This critical value will be used to test the null hypothesis - see Step 7.

For a given value of α the value of zα will depend on whether we have a 2-tailed or
1-tailed test:
- in a 1-tailed test, all of α is in one rejection region, so find zα
- in a 2-tailed test, α is split into two rejection regions, each one with area α/2, so
  find zα/2
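
As an alternative to the printed tables, the critical value can be obtained from the inverse of the standard normal cumulative distribution. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is made up for this illustration.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Critical z value for a 1- or 2-tailed test at significance level alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    tail_area = alpha / tails                  # alpha, or alpha/2 in each tail
    return NormalDist().inv_cdf(1 - tail_area)

# Common significance levels: 1-tailed and 2-tailed critical values
for alpha in (0.1, 0.05, 0.01):
    print(alpha, round(critical_z(alpha, 1), 2), round(critical_z(alpha, 2), 2))
```

The function returns the positive critical value; for a left-tailed test the rejection boundary is its negative.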



Step 5 - determine rejection region

The boundary of the rejection region is determined by the value of zα .

Its location is determined by the form of the alternative hypothesis (<, >, or ≠):



Step 5 - determine rejection region

1-tailed:

H0 : µ ≤ µ0
Ha : µ > µ0

1-tailed:

H0 : µ ≥ µ0
Ha : µ < µ0


Step 5 - determine rejection region

2-tailed:

H0 : µ = µ0
Ha : µ ≠ µ0



Step 6 – determine test statistic

The test statistic value is calculated using:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \quad (4)$$
where:
- x̄ is the mean of the sample taken to test the hypotheses;
- µ is the population mean;
- σx̄ = σ/√n, where σ is the population standard deviation.

NB: If σ is unknown, then as long as n ≥ 30, we may use the
sample standard deviation (s) as an approximation.



Step 7 – perform the test

Compare the test statistic value against its critical value.

i.e., plot the position of z on the z-axis, and check its position relative to zα and the
rejection region:
- if z lies in the rejection region, reject H0;
- if z does not lie in the rejection region, do not reject H0.

Always state the significance level at which you make your decision.



Step 7 – perform the test
For example, the following would indicate rejection of H0 :



Step 8 – draw conclusions

Always refer back to the wording of the original problem:
- do not just leave the answer as “reject H0”;
- and always include the significance or confidence level in your answer.
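
The mechanical part of the eight steps (Steps 2 to 7) can be gathered into a single function. This is a minimal sketch under the assumptions of this chapter (known σ, or n ≥ 30 with s standing in for σ); the function name `z_test`, its argument names and the numbers in the usage lines are hypothetical and chosen only for illustration. Steps 1 and 8, formulating the hypotheses and wording the conclusion, remain the analyst's job.

```python
from statistics import NormalDist
import math

def z_test(x_bar, mu0, sigma, n, alpha, alternative):
    """One-sample z-test for a population mean.

    alternative: "less" (Ha: mu < mu0), "greater" (Ha: mu > mu0)
                 or "two-sided" (Ha: mu != mu0).
    Returns (z, z_crit, reject), where z_crit is the positive critical value.
    """
    if alternative not in ("less", "greater", "two-sided"):
        raise ValueError("unknown alternative")
    tails = 2 if alternative == "two-sided" else 1          # Step 2
    z_crit = NormalDist().inv_cdf(1 - alpha / tails)        # Steps 3 and 4
    z = (x_bar - mu0) / (sigma / math.sqrt(n))              # Step 6, eq. (4)
    if alternative == "less":                               # Steps 5 and 7
        reject = z < -z_crit
    elif alternative == "greater":
        reject = z > z_crit
    else:
        reject = abs(z) > z_crit
    return z, z_crit, reject

# Hypothetical numbers, for illustration only
z, z_crit, reject = z_test(x_bar=52.0, mu0=50.0, sigma=6.0, n=36,
                           alpha=0.05, alternative="two-sided")
print(round(z, 2), round(z_crit, 2),
      "reject H0" if reject else "do not reject H0")
```

Returning the statistic and the critical value along with the decision makes it easier to state the significance level and the evidence when writing up the conclusion.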





Example 1

A count of vehicles travelling past a point on Albany Highway in 30 seconds (during
peak hour) is supposed to be about 25 vehicles with a standard deviation of 4.3
vehicles. When a further 40 measurements were taken it was found that the mean
was 23.5 vehicles per 30 seconds. Can we be 99% certain that the vehicular traffic is
less than 25 vehicles per 30 seconds?

Take 25 vehicles as the population mean.
We therefore want to test whether the sample mean from the new data (23.5)
indicates that this value is too high. We have: µ = 25, σ = 4.3, x̄ = 23.5, n = 40,
α = 0.01.



Example 1
Step 1

Formulate alternative hypothesis: Ha : µ < 25
- i.e., test whether the true population mean is actually less than the established
  value.
Formulate null hypothesis: H0 : µ ≥ 25
- i.e., assume the given population mean is correct, and the sample data are
  misleading.

Step 2

Determine number of tails.
This is a 1-tailed test, because the alternative hypothesis is a one-sided
inequality (µ < 25).


Example 1

Step 3

Determine level of significance: We are told that the confidence level is 99%,
therefore α = 0.01.

Step 4

Determine the critical value of z:
We have a 1-tailed test, so we need to find zα = z0.01.
From the standard normal distribution table, we have:
z0.01 = z(0.5 − 0.01) = z(0.49) = 2.33



Example 1
Step 5

Determine the rejection region: The null hypothesis will be rejected if the sample
provides sufficient evidence that µ < 25, so we have the following situation:

Since we are testing µ < 25, we are in the LHS of the normal curve, therefore the
rejection region is z < −2.33.



Example 1
Step 6
Determine the test statistic (z-score) from the sample data:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{23.5 - 25}{4.3 / \sqrt{40}} = -2.21 \quad (5)$$

Step 7
Compare the test statistic against its critical value: −2.21 > −2.33, therefore z, and
hence x̄, the sample mean, do not lie in the rejection region.
Hence, we do not reject H0 at the 0.01 significance level.

Step 8
Our sample measurement is compatible with the supposed population mean at the 99%
confidence level. There is therefore insufficient evidence to conclude that the true
mean is less than 25 vehicles per 30 seconds.
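
The arithmetic of this example can be reproduced in a few lines. This is a sketch that uses `statistics.NormalDist` instead of the printed tables; the variable names are illustrative.

```python
from statistics import NormalDist
import math

mu0, sigma = 25, 4.3       # established mean and standard deviation
x_bar, n = 23.5, 40        # mean and size of the new sample
alpha = 0.01               # 99% confidence

z_crit = -NormalDist().inv_cdf(1 - alpha)      # lower-tail critical value
z = (x_bar - mu0) / (sigma / math.sqrt(n))     # test statistic, eq. (4)
reject = z < z_crit                            # left-tailed rejection region

print(round(z, 2), round(z_crit, 2),
      "reject H0" if reject else "do not reject H0")
```

The test statistic lies just inside the do-not-reject side of the boundary, which matches the conclusion reached by hand.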




Exercise

The value of a well-observed angle was known to be 30° 15′ 30″. A new theodolite
was tested against this angle for calibration. A sample of 36 arcs produced a mean
of 30° 15′ 32″, with an SD of 6″. Is this value significantly different from the standard
value at the 5% level of significance?

Take 30° 15′ 30″ as the population mean. We therefore want to test whether the
sample mean from the new data (30° 15′ 32″) indicates that this value is incorrect.
We have: µ = 30° 15′ 30″, s = 6″, x̄ = 30° 15′ 32″, n = 36, α = 0.05.



Confidence Interval

- The degree of confidence is defined as: 1 − α
- It is usually expressed as a percentage, called the confidence level (CL).
- Confidence is the probability that the mean (sample or population) lies within a
  confidence interval (CI).
- The confidence interval represents the region within which 100(1 − α)% of all the
  sample means will lie:
  CI = µ ± zα/2 σx̄
  or
  p(µ − zα/2 σx̄ ≤ x̄ ≤ µ + zα/2 σx̄) = 1 − α



Confidence Interval

CI also represents the region in which we are 100(1 − α)% likely to find the population
mean, µ:
CI = x̄ ± zα/2 σx̄
or
p(x̄ − zα/2 σx̄ ≤ µ ≤ x̄ + zα/2 σx̄ ) = 1 − α
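
A confidence interval about a sample mean can be computed as follows. This is a minimal sketch: the function name `confidence_interval` is made up for illustration, and the numbers in the usage lines (x̄ = 100, σ = 21, n = 49) are hypothetical.

```python
from statistics import NormalDist
import math

def confidence_interval(x_bar, sigma, n, alpha):
    """Two-sided 100(1 - alpha)% confidence interval for the population mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)      # z_{alpha/2} times standard error
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample: mean 100, sigma 21, n = 49, 95% confidence
low, high = confidence_interval(100, 21, 49, alpha=0.05)
print(round(low, 2), round(high, 2))
```

Only the sample mean, the standard deviation and the sample size are needed; the population mean itself does not enter the calculation.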



Confidence Interval

This means that we don’t need to know µ in order to determine the confidence
interval.
For instance, for α = 0.05, the CL = 95%, and zα/2 = ±1.96:



Confidence Interval

- The quantity zα/2 σx̄ is called the margin of error. This is not the precision of
  the data (which is just σx̄); rather it gives the maximum allowable error.
- It is obviously desirable to have a low margin of error, because a low margin of
  error indicates that we have pinned down the mean quite precisely.
  However, for a given standard error, a low margin of error implies a low zα/2,
  and thus a low confidence level. Conversely, a high confidence percentage gives
  a large margin of error (because zα/2 is larger).
- So having high confidence does not imply that we have good data: it just
  means that we have allocated a wider range in which to place the measurement.



Confidence intervals vs hypothesis tests

Since the confidence interval can give a range of values where we are 100(1 − α)% likely
to find µ or x̄, we can use it to perform a 2-tailed hypothesis test (but not 1-tailed).

If significance is the probability of rejecting the null hypothesis when it is actually
true (making a type I error), then confidence can be thought of as our degree of
certainty in making the correct decision.

Remember, a low value of α means a low probability of mistakenly rejecting H0, and
therefore a high confidence in making the right decision. Conversely, a higher value
of α means a higher probability of mistakenly rejecting H0, and therefore a lower
confidence in making the right decision.



Confidence intervals vs hypothesis tests

Consider an extreme example, where α = 0. In a hypothesis test we have a zero
probability of making a mistake (type I error), and we are extremely confident of our
decision, with a confidence level of 100%. However, do not be fooled: the critical
value is zα/2 = ±∞, so the margin of error is infinite. So even if x̄ was a long, long
way from µ we still would not reject H0. In fact we could never reject H0.

Obviously, real-world examples would never have a zero significance level. But this
example above shows that as α decreases, it becomes harder to reject H0 .



Confidence intervals vs hypothesis tests

Now consider:

H0 : µ = µ0
Ha : µ ≠ µ0

where µ0 is the hypothesized value for the population mean.

By analogy with the 8 steps to hypothesis testing for a 2-tailed test, we can see that
the do-not-reject H0 region is given by:

µ0 ± zα/2 σx̄

So if the sample mean does not fall in this region, we must reject H0.




Example 1
The value of a well-observed angle was known to be 30° 15′ 30″. A new theodolite
was tested against this angle for calibration. A sample of 36 arcs produced a mean
of 30° 15′ 32″, with an SD of 6″. Is this value significantly different from the standard
value at the 5% level of significance?

We have: µ = 30° 15′ 30″, s = 6″, x̄ = 30° 15′ 32″, n = 36, α = 0.05.

So, the hypotheses are

H0 : µ = 30° 15′ 30″
Ha : µ ≠ 30° 15′ 30″

We can use the confidence interval method because this is a 2-tailed test.

The critical value of z is: zα/2 = z0.025 = 1.96

The standard error of the mean is: σx̄ = 6″/√36 = 1″



Example 1

So the confidence limits are: zα/2 σx̄ = ±(1.96 × 1″) = ±1.96″

And the confidence interval about the population mean is:
CI = 30° 15′ 30″ ± 1.96″ = 30° 15′ 28.04″ to 30° 15′ 31.96″



Example 1

The sample mean (30° 15′ 32″, the green arrow) does not lie within this interval.
Hence, we reject H0 at the 0.05 significance level (95% confidence).
Our sample measurement is incompatible with the supposed population mean at 0.05
significance. We therefore conclude, at this level, that the true mean is not 30° 15′ 30″.



Example 2

What if we do the test at the 0.01 level of significance (99% confidence)?

The critical value of z is now: zα/2 = z0.005 ≈ 2.58

And the confidence limits are: zα/2 σx̄ ≈ ±(2.58 × 1″) = ±2.58″



Example 2

The confidence interval about the population mean is now:
CI = 30° 15′ 30″ ± 2.58″ = 30° 15′ 27.42″ to 30° 15′ 32.58″

The sample mean (30° 15′ 32″, the green arrow) does lie within this interval. Hence,
we do not reject H0 at the 0.01 significance level (99% confidence).
Our sample measurement is now compatible with the supposed population mean at
0.01 significance. This does not prove that the true mean is 30° 15′ 30″; there is
simply insufficient evidence to reject that value at this level.
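
Both theodolite examples can be reproduced together. The sketch below works only with the seconds part of the angles (30″ and 32″), since the degrees and minutes are identical; that simplification, and the variable names, are choices made here for illustration.

```python
from statistics import NormalDist
import math

mu0 = 30.0       # seconds part of the standard angle 30° 15′ 30″
x_bar = 32.0     # seconds part of the sample mean 30° 15′ 32″
s, n = 6.0, 36   # sample SD (arcseconds) and sample size

se = s / math.sqrt(n)                             # standard error = 1″
results = {}
for alpha in (0.05, 0.01):
    z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{alpha/2}
    low, high = mu0 - z * se, mu0 + z * se        # do-not-reject region
    results[alpha] = low <= x_bar <= high         # True means: do not reject H0
    print(alpha, round(low, 2), round(high, 2),
          "do not reject H0" if results[alpha] else "reject H0")
```

The same sample mean falls outside the interval at α = 0.05 and inside it at α = 0.01, which is the contrast the two examples are meant to show.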



Conclusion

- The data haven’t changed, but the outcomes have. In Example 1 we rejected
  H0, but in Example 2 we didn’t.
- Whereas in Example 1 we had a 5% chance of mistakenly rejecting H0, in
  Example 2 we only had a 1% chance of mistakenly rejecting it. So in Example 2
  we “accepted” H0, not because the data got better or worse, but because we
  were allowed more freedom via a larger margin of error.
- In terms of statistical theory, while setting a low significance level means a low
  probability of mistakenly rejecting H0 (making a type I error), it raises the
  probability of making a type II error, i.e., mistakenly “accepting” H0 (and thus
  accepting any old rubbish!).
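
The trade-off between the two error types can be put into numbers. The sketch below reuses the theodolite setting (µ0 = 30″, σx̄ = 1″) and ASSUMES, purely for illustration, that the true mean is actually 32″, so that H0 is false; it then computes the probability of failing to reject H0 (a type II error) at each significance level. The assumed true mean and the function name `type_ii_probability` are hypothetical.

```python
from statistics import NormalDist

mu0, se = 30.0, 1.0       # hypothesised mean and standard error (arcseconds)
true_mu = 32.0            # ASSUMPTION: hypothetical true mean, so H0 is false

def type_ii_probability(alpha):
    """Probability of not rejecting H0 when the true mean is true_mu."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    low, high = mu0 - z * se, mu0 + z * se        # do-not-reject region
    sampling_dist = NormalDist(true_mu, se)       # distribution of x̄ under true_mu
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

for alpha in (0.05, 0.01):
    print(alpha, round(type_ii_probability(alpha), 2))
```

Under this assumption the type II error probability is larger at α = 0.01 than at α = 0.05, illustrating that lowering the risk of one error raises the risk of the other.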
