CH 10 Slides
Kuan Xu
University of Toronto
kuan.xu@utoronto.ca
1 Introduction
2 Elements of a Statistical Test
3 Common Large-Sample Tests
4 Calculating Type II Error Probabilities and Finding the Sample Size for
Z Tests
5 Relationships between Hypothesis-Testing Procedures and Confidence
Intervals
6 Another Way to Report the Results of a Statistical Test: p-Values
7 Some Comments on the Theory of Hypothesis Testing
8 Small-Sample Hypothesis Testing for µ and µ1 − µ2
9 Testing Hypotheses for σ² and σ1² versus σ2²
(11) The probability of type I error is called α while the probability of type
II error is called β.
Example:
For Jones’s political poll, n = 15. We wish to test
H0 : p = .5 vs Ha : p < .5
Let the test statistic be Y , the number of sampled voters favoring Jones.
Calculate α, the probability of type I error, if we select RR = {y ≤ 2} as
rejection region.
Solution:
α = P(Type I error)
= P(rejecting H0 when H0 is true)
= P(Y ≤ 2 when p = .5).
If H0 is true, p = .5 and
α = Σ_{y=0}^{2} (15 choose y)(.5)^y (.5)^(15−y)
  = (15 choose 0)(.5)^15 + (15 choose 1)(.5)^15 + (15 choose 2)(.5)^15
  = .004.
β = P(Type II error)
= P(accepting H0 when Ha is true)
= P(Y > 2 when p = .1)
= Σ_{y=3}^{15} (15 choose y)(.1)^y (.9)^(15−y)
= 1 − Σ_{y=0}^{2} (15 choose y)(.1)^y (.9)^(15−y)
= 1 − .816
= .184.
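The two binomial sums above are easy to verify numerically. A minimal sketch in Python (`math.comb` plays the role of the binomial coefficient; the slides' own calculations use the binomial tables):

```python
from math import comb

n, cutoff = 15, 2  # n sampled voters; rejection region RR = {y <= 2}

# alpha = P(Y <= 2 when p = .5), i.e., rejecting H0 when H0 is true
alpha = sum(comb(n, y) * 0.5**y * 0.5**(n - y) for y in range(cutoff + 1))

# beta = P(Y > 2 when p = .1), i.e., accepting H0 when Ha is true
beta = 1 - sum(comb(n, y) * 0.1**y * 0.9**(n - y) for y in range(cutoff + 1))

print(round(alpha, 3), round(beta, 3))  # 0.004 0.184
```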
Remarks: From the previous two examples, we see that β depends on the difference between the true p under the alternative hypothesis (.3 and .1) and the hypothesized p under the null hypothesis (.5). The greater this difference, the lower the probability β of a type II error (accepting H0 when it is false).
Example:
Following the previous examples, we still test H0 : p = .5 vs Ha : p < .5.
We assume that the true p is p = .3 and that our RR changes to
RR = {y ≤ 5}. Calculate α, the probability of type I error, and β, the
probability of type II error.
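A numerical answer, as a sketch (the same binomial formula as before, now with the wider rejection region and the true p = .3):

```python
from math import comb

n, cutoff = 15, 5  # rejection region RR = {y <= 5}

# alpha = P(Y <= 5 when p = .5); note (.5)^y (.5)^(n-y) = (.5)^n
alpha = sum(comb(n, y) * 0.5**n for y in range(cutoff + 1))

# beta = P(Y > 5 when p = .3)
beta = 1 - sum(comb(n, y) * 0.3**y * 0.7**(n - y) for y in range(cutoff + 1))

print(round(alpha, 3), round(beta, 3))  # 0.151 0.278
```

Enlarging the rejection region from {y ≤ 2} to {y ≤ 5} raises α (from .004 to about .15) but lowers β, illustrating the trade-off between the two error probabilities.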
k = θ0 + zα σθ̂

(7) If Z = (θ̂ − θ0)/σθ̂ is used as a test statistic, then RR = {z > zα}. RR is an upper-tail rejection region.
[Table: standard normal curve areas (z table); unreadable in this extraction.]
Kuan Xu (UofT) ECO 227 March 28, 2024
Common Large-Sample Tests (7)
Example: A supervisor claims that the machine must be repaired because it produces more than 10% defectives in a lot during a day. A random sample of 100 items is taken, among which 15 are defective. Is there any evidence to support the supervisor's claim? Use a test with the .01 level of significance.
Solution: It is known that n = 100, Y = 15, and α = .01.
H0 : p = .10 against Ha : p > .10.
It is also known that under H0, p̂ is approximately N(p0, p0(1 − p0)/n) by the Central Limit Theorem. The test statistic, which is based on p̂ = Y/n = 15/100 = .15, is

Z = (p̂ − p0)/σp̂ = (p̂ − p0)/√(p0(1 − p0)/n).
Checking the standard normal table, we get P(Z > 2.33) = .01. Calculate the observed value of Z to get

z = (.15 − .10)/√((.1)(.9)/100) = .05/.03 = 5/3 = 1.667.

Because z = 1.667 < 2.33 = z.01, we cannot reject H0; at the .01 level of significance, the evidence is insufficient to support the supervisor's claim.
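The same test can be carried out with the standard library's `NormalDist` in place of the normal table. A sketch (not part of the original slides):

```python
from statistics import NormalDist

n, y, p0, alpha = 100, 15, 0.10, 0.01
p_hat = y / n  # sample proportion of defectives

# large-sample test statistic Z = (p_hat - p0) / sqrt(p0 (1 - p0) / n)
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5

z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value (~2.33)

print(round(z, 3), z > z_crit)  # 1.667 False
```

Because the observed z does not exceed the critical value, the test fails to reject H0 at the .01 level.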
Common Large-Sample Tests (8)
(8) Differing from point (4), we can develop a lower-tail hypothesis test:
H0 : θ = θ0.
Ha : θ < θ0.
Test statistic: Z = (θ̂ − θ0)/σθ̂.
Rejection region: RR = {z < −zα}.
Solution:
H0 : µ1 − µ2 = 0 versus Ha : µ1 − µ2 ̸= 0. Because n1 = 50 and n2 = 50
(greater than 30), we can calculate the value of the test statistic Z , z.
z ≈ (3.6 − 3.8)/√(.18/50 + .14/50)
  = −.2/√(.32/50) = −.2/√(.64/100)
  = −.2/√.0064 = −.2/.08 = −2.5
Because −2.5 = z < −zα/2 = −1.96, we reject H0. In other words, there is sufficient evidence to suggest that the mean reaction times of men and women differ.
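The arithmetic above can be mirrored in Python, and `NormalDist` also gives the two-tailed p-value used later in the slides. A sketch with the values read off the computation (sample variances .18 and .14):

```python
from statistics import NormalDist

n1 = n2 = 50
y1bar, y2bar = 3.6, 3.8  # sample mean reaction times
v1, v2 = 0.18, 0.14      # sample variances

# large-sample statistic for H0: mu1 - mu2 = 0
z = (y1bar - y2bar) / (v1 / n1 + v2 / n2) ** 0.5

p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed p-value

print(round(z, 2), round(p_value, 4))  # -2.5 0.0124
```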
β = P((θ̂ − θa)/σθ̂ ≤ (k − θa)/σθ̂)
  = P(Z ≤ (15.8225 − 16)/(3/√36))
  = P(Z ≤ −.36)
  = .3594.
β = P(Ȳ ≤ k when µ = µa)
  = P((Ȳ − µa)/(σ/√n) ≤ (k − µa)/(σ/√n))
  = P(Z ≤ −zβ).
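For the worked example above (µa = 16, k = 15.8225, σ = 3, n = 36), the type II error probability can be computed directly; a sketch:

```python
from statistics import NormalDist

mu_a, k, sigma, n = 16, 15.8225, 3, 36

# standardize: beta = P(Ybar <= k when mu = mu_a) = P(Z <= (k - mu_a)/(sigma/sqrt(n)))
z = (k - mu_a) / (sigma / n ** 0.5)

beta_exact = NormalDist().cdf(z)      # ~0.3613, carrying full precision in z
beta_table = NormalDist().cdf(-0.36)  # 0.3594, rounding z to -.36 as the slides do

print(round(beta_table, 4))  # 0.3594
```

The small gap between the two values comes only from rounding z before the table lookup.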
Remarks:
α is referred to as the significance level of a test; it is typically predetermined at .01, .05, or .10. It is not directly related to the value of a test statistic.
We need to find the p-value (the shaded areas). Checking the standard normal table, we note that P(Z ≥ 2.5) = P(Z ≤ −2.5) = .0062. The p-value is 2(.0062) = .0124 because this is a two-tailed test. If we choose α = .05, we can reject H0 because the p-value is less than .05. If we choose α = .01, we cannot reject H0 because the p-value is greater than .01.
Remarks: As we can see, researchers often report p-values of their
hypothesis tests so that readers can interpret how strong the empirical
evidence is.
In our previous discussion, we stated that when the sample is large, we can use the standard normal distribution as an approximation according to the Central Limit Theorem.
However, when our data are from the normal population(s) with the
sample size less than 30 and when we have no information on the
population variance(s), we cannot use the justification based on the
Central Limit Theorem. Instead, as demonstrated in Ch 7, we can form a statistic that follows the t distribution; the resulting test is called a small-sample test. Because we focus on µ and µ1 − µ2, we call these tests small-sample tests for µ and µ1 − µ2.
H0 : µ = µ0.
Ha : µ > µ0, µ < µ0, or µ ≠ µ0.
Test statistic: T = (Ȳ − µ0)/(S/√n).
Rejection region: t > tα, t < −tα, or |t| > tα/2, respectively.
Here t is the value of T and tα (tα/2) is from the t distribution with n − 1 df.
Small-Sample Hypothesis Testing for µ and µ1 − µ2 (3)
Example:
Muzzle velocities of eight shells are tested with a new gunpowder. Muzzle
velocities are approximately normally distributed. The sample mean and
sample standard deviation are ȳ = 2959 (feet per second) and s = 39.1,
respectively. The manufacturer claims that the new gunpowder produces
an average velocity of not less than 3000 feet per second. Do the sample
data provide sufficient evidence to contradict this claim at the .025 level of
significance?
Solution: It is known that Yi (muzzle velocities) is approximately normally distributed with mean E(Yi) = µ. In addition, n = 8, ȳ = 2959, and s = 39.1. We note both normality and n < 30; we also note that σ² is unknown. We wish to test
H0 : µ = 3000 versus Ha : µ < 3000
at the .025 level of significance. The test statistic is
t = (2959 − 3000)/(39.1/√8) = −41/13.824 = −2.966.
Small-Sample Hypothesis Testing for µ and µ1 − µ2 (4)
Solution (continued): Check the t table (see the next slide) to get −tα = −t.025 = −2.365 with 7 df. Because t = −2.966 < −2.365 = −t.025, we reject H0 and, with it, the manufacturer's claim.
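The test statistic is a one-liner to check; the critical value below is the table entry quoted in the solution. A sketch:

```python
n, ybar, s, mu0 = 8, 2959, 39.1, 3000

# small-sample test statistic T = (Ybar - mu0) / (S / sqrt(n))
t = (ybar - mu0) / (s / n ** 0.5)

t_crit = -2.365  # -t_.025 with n - 1 = 7 df, from the t table

print(round(t, 3), t < t_crit)  # -2.966 True
```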
Example:
Visit the previous example again. Now, given t = −2.966, find the p-value. We cannot find the precise p-value from the t table; we must use software such as R.
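In R this is just `pt(-2.966, 7)`. Outside R, the lower-tail probability can also be computed by integrating the t density numerically; the helper below is my own sketch, not from the slides:

```python
from math import gamma, pi, sqrt

def t_cdf(x, df, lo=-40.0, steps=100_000):
    """P(T <= x) for Student's t with df degrees of freedom,
    by trapezoidal integration of the density from lo (~ -infinity) to x."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    f = lambda t: c * (1 + t * t / df) ** (-(df + 1) / 2)
    h = (x - lo) / steps
    interior = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * f(lo) + interior + 0.5 * f(x))

p_value = t_cdf(-2.966, 7)  # lower-tail p-value for the observed t
print(round(p_value, 4))
```

The t table only brackets this probability between .01 and .025; the direct computation pins it down just above .01.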
H0 : µ1 − µ2 = D0.
Ha : µ1 − µ2 > D0, µ1 − µ2 < D0, or µ1 − µ2 ≠ D0.
Test statistic: T = (Ȳ1 − Ȳ2 − D0)/(Sp √(1/n1 + 1/n2)),
where Sp = √([(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2)).
sp² = (195.56 + 160.22)/(9 + 9 − 2) = 22.24, so sp = √22.24 = 4.716.
Then,

t = (35.22 − 31.56)/(4.716 √(1/9 + 1/9)) = 1.65.
Check the t table to get t.025 = 2.120 with 9 + 9 − 2 = 16 df. Because
|t| = 1.65 < 2.120 = t.025 , we do not reject H0 .
Example: For the above example, find the p-value of the test.
> # small sample test for mu1-mu2
> # data
> n1 = 9
> n2 = 9
> y1bar = 35.22
> y2bar = 31.56
> ssry1 = 195.56
> ssry2 = 160.22
> sig =.05
> # H_0: mu1-mu2 = 0 versus H_a: mu1-mu2 neq 0
> # pooled standard deviation
> sp = sqrt((ssry1+ssry2)/(n1+n2-2))
> # test statistic
> t=(y1bar-y2bar)/(sp*sqrt(1/n1+1/n2))
> df = n1+n2-2
> t.025=qt(.025,df,lower.tail = FALSE)
> t.025
[1] 2.1199
> isTRUE(abs(t)<t.025)
[1] TRUE
> # Do not reject H_0
> # get the p-value for this two tailed test
> p.value = 2*pt(t,df, lower.tail = FALSE)
> p.value
[1] 0.11916
H0 : σ² = σ0².
Ha : σ² > σ0², σ² < σ0², or σ² ≠ σ0².
Test statistic: χ² = (n − 1)S²/σ0².
Rejection region: χ² > χ²α, χ² < χ²1−α, or (χ² > χ²α/2 or χ² < χ²1−α/2), respectively.
χ²α (χ²1−α) and χ²α/2 (χ²1−α/2) are from the χ² table. That is, χ²α is chosen so that P(χ² > χ²α) = α with n − 1 df.
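The slides give no numerical example here, so the figures below are hypothetical, chosen only to illustrate the mechanics; the critical value is the standard table entry for 9 df:

```python
# hypothetical data: n = 10 observations with sample variance s2 = 12,
# testing H0: sigma^2 = 8 against Ha: sigma^2 > 8 at the .05 level
n, s2, sigma0_sq = 10, 12, 8

chi2 = (n - 1) * s2 / sigma0_sq  # test statistic with n - 1 = 9 df

chi2_crit = 16.919  # chi-square upper .05 critical value with 9 df (table entry)

print(chi2, chi2 > chi2_crit)  # 13.5 False
```

Since 13.5 does not exceed 16.919, H0 would not be rejected for these (made-up) numbers.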
H0 : σ1² = σ2².
Ha : σ1² > σ2².
Test statistic: F = S1²/S2².
Rejection region: F > Fα
where Fα is chosen so that P(F > Fα ) = α with v1 = n1 − 1 numerator df
and v2 = n2 − 1 denominator df, respectively.
And also report the p-value of the test statistic.
Solution: Note that

F = s1²/s2² = .0003/.0001 = 3.
> # test
> # H_0: sigma1^2 = sigma2^2 versus H_a: sigma1^2 > sigma2^2
> # data
> n1 = 10
> n2 = 20
> v1 = n1 - 1
> v2 = n2 - 1
> sig = .05
> s12 = .0003
> s22 = .0001
> # test statistic
> F = s12/s22
> F
[1] 3
> # critical value
> F.critical = qf(.05, v1, v2, lower.tail = FALSE)
> F.critical
[1] 2.4227
> # the p-value of test statistic
> p.value = pf(F, v1, v2, lower.tail = FALSE)
> p.value
[1] 0.02096