CH 10 Slides
Kuan Xu
University of Toronto
kuan.xu@utoronto.ca
1 Introduction
2 Elements of a Statistical Test
3 Common Large-Sample Tests
4 Calculating Type II Error Probabilities and Finding the Sample Size for
Z Tests
5 Relationships between Hypothesis-Testing Procedures and Confidence
Intervals
6 Another Way to Report the Results of a Statistical Test: p-Values
7 Some Comments on the Theory of Hypothesis Testing
8 Small-Sample Hypothesis Testing for µ and µ1 − µ2
9 Testing Hypotheses for σ² and σ1² versus σ2²
(11) The probability of type I error is called α while the probability of type
II error is called β.
Example:
For Jones’s political poll, n = 15. We wish to test
H0 : p = .5 vs Ha : p < .5
Let the test statistic be Y , the number of sampled voters favoring Jones.
Calculate α, the probability of type I error, if we select RR = {y ≤ 2} as
rejection region.
Solution:
α = P(Type I error)
= P(rejecting H0 when H0 is true)
= P(Y ≤ 2 when p = .5).
If H0 is true, p = .5 and
α = Σ_{y=0}^{2} (15 choose y)(.5)^y (.5)^(15−y)
  = (15 choose 0)(.5)^15 + (15 choose 1)(.5)^15 + (15 choose 2)(.5)^15
  = .004.
β = P(Type II error)
= P(accepting H0 when Ha is true)
= P(Y > 2 when p = .1)
= Σ_{y=3}^{15} (15 choose y)(.1)^y (.9)^(15−y)
= 1 − Σ_{y=0}^{2} (15 choose y)(.1)^y (.9)^(15−y)
= 1 − .816
= .184.
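The two binomial sums above are easy to verify numerically. A minimal sketch in Python (`math.comb` plays the role of the binomial coefficient; the slides' own calculations use the binomial tables):

```python
from math import comb

n, cutoff = 15, 2  # n sampled voters; rejection region RR = {y <= 2}

# alpha = P(Y <= 2 when p = .5), i.e., rejecting H0 when H0 is true
alpha = sum(comb(n, y) * 0.5**y * 0.5**(n - y) for y in range(cutoff + 1))

# beta = P(Y > 2 when p = .1), i.e., accepting H0 when Ha is true
beta = 1 - sum(comb(n, y) * 0.1**y * 0.9**(n - y) for y in range(cutoff + 1))

print(round(alpha, 3), round(beta, 3))  # 0.004 0.184
```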
Remarks: From the previous two examples, we see that β depends on the difference between the true p under the alternative hypothesis (.3 and .1) and the hypothesized p under the null hypothesis (.5). The greater this difference, the lower the probability β of a type II error (accepting H0 when it is false).
Example:
Following the previous examples, we still test H0 : p = .5 vs Ha : p < .5.
We assume that the true p is p = .3 and that our RR changes to
RR = {y ≤ 5}. Calculate α, the probability of type I error, and β, the
probability of type II error.
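A numerical answer, as a sketch (the same binomial formula as before, now with the wider rejection region and the true p = .3):

```python
from math import comb

n, cutoff = 15, 5  # rejection region RR = {y <= 5}

# alpha = P(Y <= 5 when p = .5); note (.5)^y (.5)^(n-y) = (.5)^n
alpha = sum(comb(n, y) * 0.5**n for y in range(cutoff + 1))

# beta = P(Y > 5 when p = .3)
beta = 1 - sum(comb(n, y) * 0.3**y * 0.7**(n - y) for y in range(cutoff + 1))

print(round(alpha, 3), round(beta, 3))  # 0.151 0.278
```

Enlarging the rejection region from {y ≤ 2} to {y ≤ 5} raises α (from .004 to about .15) but lowers β, illustrating the trade-off between the two error probabilities.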
k = θ0 + zα σθ̂

(7) If Z = (θ̂ − θ0)/σθ̂ is used as a test statistic, then RR = {z > zα}. RR is an upper-tail rejection region.
[Table: standard normal curve areas (z table); unreadable in this extraction.]
Kuan Xu (UofT) ECO 227 March 28, 2024
Common Large-Sample Tests (7)
Example: A supervisor claims that the machine must be repaired because it produces more than 10% defectives in a lot during a day. A random sample of 100 items is taken, among which 15 are defective. Is there any evidence to support the supervisor's claim? Use a test with the .01 level of significance.
Solution: It is known that n = 100, Y = 15, and α = .01.
H0 : p = .10 against Ha : p > .10.
It is also known that under H0, p̂ is approximately N(p0, p0(1 − p0)/n) by the Central Limit Theorem. The test statistic, which is based on p̂ = Y/n = 15/100 = .15, is

Z = (p̂ − p0)/σp̂ = (p̂ − p0)/√(p0(1 − p0)/n).
Checking the standard normal table, we get P(Z > 2.33) = .01. Calculate the observed value of Z to get

z = (.15 − .10)/√((.1)(.9)/100) = .05/.03 = 5/3 = 1.667.

Because z = 1.667 < 2.33 = z.01, we cannot reject H0; at the .01 level of significance, the evidence is insufficient to support the supervisor's claim.
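The same test can be carried out with the standard library's `NormalDist` in place of the normal table. A sketch (not part of the original slides):

```python
from statistics import NormalDist

n, y, p0, alpha = 100, 15, 0.10, 0.01
p_hat = y / n  # sample proportion of defectives

# large-sample test statistic Z = (p_hat - p0) / sqrt(p0 (1 - p0) / n)
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5

z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value (~2.33)

print(round(z, 3), z > z_crit)  # 1.667 False
```

Because the observed z does not exceed the critical value, the test fails to reject H0 at the .01 level.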
Common Large-Sample Tests (8)
(8) Differing from point (4), we can develop a lower-tail hypothesis test:
H0 : θ = θ0.
Ha : θ < θ0.
Test statistic: Z = (θ̂ − θ0)/σθ̂.
Rejection region: RR = {z < −zα}.
Solution:
H0 : µ1 − µ2 = 0 versus Ha : µ1 − µ2 ̸= 0. Because n1 = 50 and n2 = 50
(greater than 30), we can calculate the value of the test statistic Z , z.
z ≈ (3.6 − 3.8)/√(.18/50 + .14/50)
  = −.2/√(.32/50) = −.2/√(.64/100)
  = −.2/√.0064 = −.2/.08 = −2.5
Because −2.5 = z < −zα/2 = −1.96, we reject H0. In other words, there is sufficient evidence to suggest that the mean reaction times of men and women differ.
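The arithmetic above can be mirrored in Python, and `NormalDist` also gives the two-tailed p-value used later in the slides. A sketch with the values read off the computation (sample variances .18 and .14):

```python
from statistics import NormalDist

n1 = n2 = 50
y1bar, y2bar = 3.6, 3.8  # sample mean reaction times
v1, v2 = 0.18, 0.14      # sample variances

# large-sample statistic for H0: mu1 - mu2 = 0
z = (y1bar - y2bar) / (v1 / n1 + v2 / n2) ** 0.5

p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed p-value

print(round(z, 2), round(p_value, 4))  # -2.5 0.0124
```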
β = P((θ̂ − θa)/σθ̂ ≤ (k − θa)/σθ̂)
  = P(Z ≤ (15.8225 − 16)/(3/√36))
  = P(Z ≤ −.36)
  = .3594.
β = P(Ȳ ≤ k when µ = µa)
  = P((Ȳ − µa)/(σ/√n) ≤ (k − µa)/(σ/√n))
  = P(Z ≤ −zβ).
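For the worked example above (µa = 16, k = 15.8225, σ = 3, n = 36), the type II error probability can be computed directly; a sketch:

```python
from statistics import NormalDist

mu_a, k, sigma, n = 16, 15.8225, 3, 36

# standardize: beta = P(Ybar <= k when mu = mu_a) = P(Z <= (k - mu_a)/(sigma/sqrt(n)))
z = (k - mu_a) / (sigma / n ** 0.5)

beta_exact = NormalDist().cdf(z)      # ~0.3613, carrying full precision in z
beta_table = NormalDist().cdf(-0.36)  # 0.3594, rounding z to -.36 as the slides do

print(round(beta_table, 4))  # 0.3594
```

The small gap between the two values comes only from rounding z before the table lookup.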
Remarks:
α is referred to as the significance level of a test; it is typically predetermined at .01, .05, or .10. It is not directly related to the value of a test statistic.
We need to find the p-value (the shaded areas). Checking the standard normal table, we note that P(Z ≥ 2.5) = P(Z ≤ −2.5) = .0062. The p-value is 2(.0062) = .0124 because this is a two-tailed test. If we choose α = .05, we can reject H0 because the p-value is less than .05. If we choose α = .01, we cannot reject H0 because the p-value is greater than .01.
Remarks: As we can see, researchers often report p-values of their
hypothesis tests so that readers can interpret how strong the empirical
evidence is.
In our previous discussion, we stated that when the sample is large, we can use the standard normal distribution as an approximation according to the Central Limit Theorem.
However, when our data are from the normal population(s) with the
sample size less than 30 and when we have no information on the
population variance(s), we cannot use the justification based on the
Central Limit Theorem. Instead, as demonstrated in Ch 7, we can form a statistic that follows the t distribution; the resulting test is called a small-sample test. Because we focus on µ and µ1 − µ2, we call these tests small-sample tests for µ and µ1 − µ2.
H0 : µ = µ0.
Ha : µ > µ0, µ < µ0, or µ ≠ µ0.
Test statistic: T = (Ȳ − µ0)/(S/√n).
Rejection region: t > tα, t < −tα, or |t| > tα/2, respectively.
Here t is the value of T and tα (tα/2) is from the t distribution with n − 1 df.
Small-Sample Hypothesis Testing for µ and µ1 − µ2 (3)
Example:
Muzzle velocities of eight shells are tested with a new gunpowder. Muzzle
velocities are approximately normally distributed. The sample mean and
sample standard deviation are ȳ = 2959 (feet per second) and s = 39.1,
respectively. The manufacturer claims that the new gunpowder produces
an average velocity of not less than 3000 feet per second. Do the sample
data provide sufficient evidence to contradict this claim at the .025 level of
significance?
Solution: It is known that Yi (muzzle velocities) is approximately normally distributed with mean E(Yi) = µ. In addition, n = 8, ȳ = 2959, and s = 39.1. We note both normality and n < 30; we also note that σ² is unknown. We wish to test
H0 : µ = 3000 versus Ha : µ < 3000
at the .025 level of significance. The test statistic is
t = (2959 − 3000)/(39.1/√8) = −41/13.824 = −2.966.
Small-Sample Hypothesis Testing for µ and µ1 − µ2 (4)
Solution (continued): Check the t table (see the next slide) to get −tα = −t.025 = −2.365 with 7 df. Because t = −2.966 < −2.365 = −t.025, we reject H0 and, with it, the manufacturer's claim.
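The test statistic is a one-liner to check; the critical value below is the table entry quoted in the solution. A sketch:

```python
n, ybar, s, mu0 = 8, 2959, 39.1, 3000

# small-sample test statistic T = (Ybar - mu0) / (S / sqrt(n))
t = (ybar - mu0) / (s / n ** 0.5)

t_crit = -2.365  # -t_.025 with n - 1 = 7 df, from the t table

print(round(t, 3), t < t_crit)  # -2.966 True
```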
Example:
Visit the previous example again. Now, given t = −2.966, find the p-value. We cannot find the precise p-value from the t table; we must use software such as R.
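In R this is just `pt(-2.966, 7)`. Outside R, the lower-tail probability can also be computed by integrating the t density numerically; the helper below is my own sketch, not from the slides:

```python
from math import gamma, pi, sqrt

def t_cdf(x, df, lo=-40.0, steps=100_000):
    """P(T <= x) for Student's t with df degrees of freedom,
    by trapezoidal integration of the density from lo (~ -infinity) to x."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    f = lambda t: c * (1 + t * t / df) ** (-(df + 1) / 2)
    h = (x - lo) / steps
    interior = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * f(lo) + interior + 0.5 * f(x))

p_value = t_cdf(-2.966, 7)  # lower-tail p-value for the observed t
print(round(p_value, 4))
```

The t table only brackets this probability between .01 and .025; the direct computation pins it down just above .01.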
H0 : µ1 − µ2 = D0.
Ha : µ1 − µ2 > D0, µ1 − µ2 < D0, or µ1 − µ2 ≠ D0.
Test statistic: T = (Ȳ1 − Ȳ2 − D0)/(Sp √(1/n1 + 1/n2)),
where Sp = √([(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2)).
sp² = (195.56 + 160.22)/(9 + 9 − 2) = 22.24, so sp = √22.24 = 4.716.
Then,

t = (35.22 − 31.56)/(4.716 √(1/9 + 1/9)) = 1.65.
Check the t table to get t.025 = 2.120 with 9 + 9 − 2 = 16 df. Because
|t| = 1.65 < 2.120 = t.025 , we do not reject H0 .
Example: For the above example, find the p-value of the test.
> # small sample test for mu1-mu2
> # data
> n1 = 9
> n2 = 9
> y1bar = 35.22
> y2bar = 31.56
> ssry1 = 195.56
> ssry2 = 160.22
> sig =.05
> # H_0: mu1-mu2 = 0 versus H_a: mu1-mu2 neq 0
> # pooled standard deviation
> sp = sqrt((ssry1+ssry2)/(n1+n2-2))
> # test statistic
> t=(y1bar-y2bar)/(sp*sqrt(1/n1+1/n2))
> df = n1+n2-2
> t.025=qt(.025,df,lower.tail = FALSE)
> t.025
[1] 2.1199
> isTRUE(abs(t)<t.025)
[1] TRUE
> # Do not reject H_0
> # get the p-value for this two tailed test
> p.value = 2*pt(t,df, lower.tail = FALSE)
> p.value
[1] 0.11916
H0 : σ² = σ0².
Ha : σ² > σ0², σ² < σ0², or σ² ≠ σ0².
Test statistic: χ² = (n − 1)S²/σ0².
Rejection region: χ² > χ²α, χ² < χ²1−α, or (χ² > χ²α/2 or χ² < χ²1−α/2), respectively.
χ²α (χ²1−α) and χ²α/2 (χ²1−α/2) are from the χ² table. That is, χ²α is chosen so that P(χ² > χ²α) = α with n − 1 df.
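The slides give no numerical example here, so the figures below are hypothetical, chosen only to illustrate the mechanics; the critical value is the standard table entry for 9 df:

```python
# hypothetical data: n = 10 observations with sample variance s2 = 12,
# testing H0: sigma^2 = 8 against Ha: sigma^2 > 8 at the .05 level
n, s2, sigma0_sq = 10, 12, 8

chi2 = (n - 1) * s2 / sigma0_sq  # test statistic with n - 1 = 9 df

chi2_crit = 16.919  # chi-square upper .05 critical value with 9 df (table entry)

print(chi2, chi2 > chi2_crit)  # 13.5 False
```

Since 13.5 does not exceed 16.919, H0 would not be rejected for these (made-up) numbers.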
H0 : σ1² = σ2².
Ha : σ1² > σ2².
Test statistic: F = S1²/S2².
Rejection region: F > Fα
where Fα is chosen so that P(F > Fα ) = α with v1 = n1 − 1 numerator df
and v2 = n2 − 1 denominator df, respectively.
And also report the p-value of the test statistic.
Solution: Note that

F = s1²/s2² = .0003/.0001 = 3.
> # test
> # H_0: sigma1^2 = sigma2^2 versus H_a: sigma1^2 > sigma2^2
> # data
> n1 = 10
> n2 = 20
> v1 = n1 - 1
> v2 = n2 - 1
> sig = .05
> s12 = .0003
> s22 = .0001
> # test statistic
> F = s12/s22
> F
[1] 3
> # critical value
> F.critical = qf(.05, v1, v2, lower.tail = FALSE)
> F.critical
[1] 2.4227
> # the p-value of test statistic
> p.value = pf(F, v1, v2, lower.tail = FALSE)
> p.value
[1] 0.02096