Week 12

Section 8
t-test, $\chi^2$-test, two-sample z/F-test
Andrew Thangaraj (IIT Madras) Hypothesis testing 44 / 71

Normal samples and statistics
$X_1, \ldots, X_n \sim$ iid $\mathrm{Normal}(\mu, \sigma^2)$
Sample mean $\bar{X} = \frac{1}{n}(X_1 + \cdots + X_n)$, $E[\bar{X}] = \mu$
Sample variance $S^2 = \frac{1}{n-1}\left((X_1 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2\right)$, $E[S^2] = \sigma^2$
$\bar{X} \sim \mathrm{Normal}(\mu, \sigma^2/n)$
$\frac{(n-1)}{\sigma^2} S^2 \sim \chi^2_{n-1}$
$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
[Figure: densities of the normal distribution of $\bar{X}$ for n = 1, 5, 10, of $\chi^2_n$, and of $t_n$]
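The two expectation facts above can be checked empirically. The following is a minimal simulation sketch, not part of the original slides; the values of `MU`, `SIGMA`, `N` and `TRIALS` are illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative values (assumptions, not taken from the slides)
MU, SIGMA, N, TRIALS = 2.0, 3.0, 5, 20000

sample_means = []
sample_variances = []
for _ in range(TRIALS):
    xs = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(xs))
    sample_variances.append(statistics.variance(xs))  # divides by n - 1

mean_of_s2 = statistics.fmean(sample_variances)  # should be near sigma^2
var_of_xbar = statistics.variance(sample_means)  # should be near sigma^2 / n

print(f"average S^2      = {mean_of_s2:.3f}  (sigma^2   = {SIGMA**2:.3f})")
print(f"variance of Xbar = {var_of_xbar:.3f}  (sigma^2/n = {SIGMA**2 / N:.3f})")
```

Both printed estimates should land close to the theoretical values in parentheses, up to simulation noise.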

t-test for mean (variance unknown)
$X_1, \ldots, X_n \sim$ iid $\mathrm{Normal}(\mu, \sigma^2)$, $\sigma^2$ unknown
Null $H_0: \mu = \mu_0$, Alternative $H_A: \mu > \mu_0$

$T = \bar{X}$, Test: Reject $H_0$ if $T > c$
How to compute significance level?
Sample variance $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
Given $H_0$, $\frac{T - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$
For a given sampling, let $S^2 = s^2$
$\alpha = P(T > c \mid \mu = \mu_0) = P\left(t_{n-1} > \frac{c - \mu_0}{s/\sqrt{n}}\right)$
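Setting $c$ equal to the observed sample mean turns the formula for $\alpha$ into a p-value. The sketch below is one possible way to evaluate it with only the Python standard library. The helper names `t_sf` and `one_sided_t_test` are hypothetical, and the t tail probability is approximated by Simpson's rule rather than taken from a statistics library. The inputs are illustrative.

```python
import math

def t_sf(x, dof, steps=2000):
    """Approximate P(t_dof > x) by Simpson's rule on the t density."""
    log_c = math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
    c = math.exp(log_c) / math.sqrt(dof * math.pi)

    def pdf(u):
        return c * (1 + u * u / dof) ** (-(dof + 1) / 2)

    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    upper = 0.5 - total * h / 3  # P(t > |x|), using symmetry about 0
    return upper if x >= 0 else 1.0 - upper

def one_sided_t_test(xbar, s, n, mu0):
    """Return (t statistic, p-value) for H0: mu = mu0 against HA: mu > mu0."""
    t_stat = (xbar - mu0) / (s / math.sqrt(n))
    return t_stat, t_sf(t_stat, n - 1)

# Illustrative inputs: n = 16, sample mean 10.2, sample std 3, mu0 = 9.5
t_stat, p_value = one_sided_t_test(xbar=10.2, s=3.0, n=16, mu0=9.5)
print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}")
```

The null hypothesis is rejected at level $\alpha$ when the p-value falls below $\alpha$.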

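$\chi^2$-test for variance
$X_1, \ldots, X_n \sim$ iid $\mathrm{Normal}(\mu, \sigma^2)$
Null $H_0: \sigma = \sigma_0$, Alternative $H_A: \sigma > \sigma_0$

$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$, Test: Reject $H_0$ if $S > c$
Given $H_0$, $\frac{(n-1)}{\sigma_0^2} S^2 \sim \chi^2_{n-1}$
$\alpha = P(S > c \mid H_0) = P\left(\frac{(n-1)}{\sigma_0^2} S^2 > \frac{(n-1)}{\sigma_0^2} c^2\right) = P\left(\chi^2_{n-1} > \frac{(n-1)}{\sigma_0^2} c^2\right)$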
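As a rough illustration of the last formula, the sketch below evaluates the chi-square tail probability for integer degrees of freedom using the recurrence $Q(a+1, y) = Q(a, y) + y^a e^{-y}/\Gamma(a+1)$ for the regularized upper incomplete gamma function. The function names `chi2_sf` and `variance_test` and the example inputs are assumptions made for this sketch.

```python
import math

def chi2_sf(x, dof):
    """P(chi^2_dof > x) for integer dof >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    while a < dof / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q

def variance_test(s, n, sigma0):
    """Return (statistic, p-value) for H0: sigma = sigma0 against HA: sigma > sigma0."""
    stat = (n - 1) * s ** 2 / sigma0 ** 2
    return stat, chi2_sf(stat, n - 1)

# Illustrative inputs: n = 16, sample std 3.5, sigma0 = 3
stat, p_value = variance_test(s=3.5, n=16, sigma0=3.0)
print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Here the p-value plays the role of $\alpha$ with $c$ set to the observed sample standard deviation.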

Two samples from normal distribution
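$X_1, \ldots, X_{n_1} \sim$ iid $\mathrm{Normal}(\mu_1, \sigma_1^2)$

$Y_1, \ldots, Y_{n_2} \sim$ iid $\mathrm{Normal}(\mu_2, \sigma_2^2)$, independent of the $X_i$
$\bar{X} \sim \mathrm{Normal}(\mu_1, \sigma_1^2/n_1)$, $\bar{Y} \sim \mathrm{Normal}(\mu_2, \sigma_2^2/n_2)$

$\frac{(n_1-1)}{\sigma_1^2} S_X^2 \sim \chi^2_{n_1-1}$, $\frac{(n_2-1)}{\sigma_2^2} S_Y^2 \sim \chi^2_{n_2-1}$
$\bar{X} - \bar{Y} \sim \mathrm{Normal}\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$
$\frac{S_X^2}{S_Y^2} \sim F(n_1 - 1, n_2 - 1)$ when $\sigma_1 = \sigma_2$
[Figure: density of F(n1, n2) for n1 = n2 = 1; n1 = n2 = 5; n1 = 5, n2 = 10]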

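Two-sample z-test (known variances)
$X_1, \ldots, X_{n_1} \sim$ iid $\mathrm{Normal}(\mu_1, \sigma_1^2)$

$Y_1, \ldots, Y_{n_2} \sim$ iid $\mathrm{Normal}(\mu_2, \sigma_2^2)$
Null $H_0: \mu_1 = \mu_2$, Alternative $H_A: \mu_1 \neq \mu_2$
$T = \bar{Y} - \bar{X}$, Test: Reject $H_0$ if $|T| > c$
Given $H_0$, $T \sim \mathrm{Normal}(0, \sigma_T^2)$, where $\sigma_T^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
$\alpha = P(|T| > c \mid H_0) = P\left(|\mathrm{Normal}(0, 1)| > \frac{c}{\sigma_T}\right)$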
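A short sketch of this computation follows, assuming the second parameter of each Normal is a variance, as in the notation above. The function name `two_sample_z_test` and the numbers passed to it are illustrative assumptions. The two-sided tail of the standard normal is written with `math.erfc`, since $P(|\mathrm{Normal}(0,1)| > a) = \mathrm{erfc}(a/\sqrt{2})$.

```python
import math

def two_sample_z_test(xbar, ybar, var_x, var_y, n_x, n_y):
    """Return (z, two-sided p-value) for H0: mu1 = mu2 with known variances."""
    sigma_t = math.sqrt(var_x / n_x + var_y / n_y)
    z = (ybar - xbar) / sigma_t                 # T / sigma_T
    p_value = math.erfc(abs(z) / math.sqrt(2))  # P(|Normal(0,1)| > |z|)
    return z, p_value

# Illustrative inputs: variances 3 and 4, sample sizes 16 and 8
z, p_value = two_sample_z_test(xbar=10.2, ybar=8.2, var_x=3.0, var_y=4.0, n_x=16, n_y=8)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```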

Two-sample F-test
$X_1, \ldots, X_{n_1} \sim$ iid $\mathrm{Normal}(\mu_1, \sigma_1^2)$

$Y_1, \ldots, Y_{n_2} \sim$ iid $\mathrm{Normal}(\mu_2, \sigma_2^2)$
Null $H_0: \sigma_1 = \sigma_2$, Alternative $H_A: \sigma_1 \neq \sigma_2$
$T = \frac{S_X^2}{S_Y^2}$, Test: Reject $H_0$ if $T > 1 + c_R$ or $T < 1 - c_L$
Given $H_0$, $T \sim F(n_1 - 1, n_2 - 1)$
$\alpha/2 = P(T < 1 - c_L \mid H_0) = P(T > 1 + c_R \mid H_0)$

Section 9
Problems on t-test, $\chi^2$-test and two-sample z/F-test

Problem 1
Suppose $X \sim \mathrm{Normal}(\mu, \sigma^2)$ with unknown $\sigma$. For $n = 16$ iid samples of $X$,
the observed sample mean is 10.2 and the sample standard deviation is 3.
What conclusion would a t-test reach if the null hypothesis assumes $\mu = 9.5$
(against an alternative hypothesis $\mu > 9.5$) at a significance level of
$\alpha = 0.05$?

Problem 2
Suppose $X$ is normally distributed with unknown standard deviation $\sigma$. For
$n = 16$ iid samples of $X$, the sample standard deviation is 3.5. What
conclusion would a $\chi^2$-test reach if the null hypothesis assumes $\sigma = 3$, with
an alternative hypothesis that $\sigma > 3$, and a significance level of $\alpha = 0.05$?

Problem 3
Suppose $X \sim \mathrm{Normal}(\mu_1, 3)$ and $Y \sim \mathrm{Normal}(\mu_2, 4)$. For $n_1 = 16$ iid
samples of $X$ and $n_2 = 8$ iid samples of $Y$, the observed sample means are
10.2 and 8.2, respectively. What conclusion would a two-sample z-test
reach if the null hypothesis assumes $\mu_1 = \mu_2$ (against an alternative
hypothesis $\mu_1 \neq \mu_2$) at a significance level of $\alpha = 0.05$?

Problem 4
Suppose $X_1, X_2, \ldots, X_{30}$ is an i.i.d. sample from a distribution
$X \sim \mathrm{Normal}(\mu_1, \sigma_1^2)$ and suppose $Y_1, Y_2, \ldots, Y_{25}$ is an i.i.d. sample from a
distribution $Y \sim \mathrm{Normal}(\mu_2, \sigma_2^2)$, independent of the $X_j$ variables. If
$S_X^2 = 11.4$ and $S_Y^2 = 5.1$, what conclusion would an F-test reach for a null
hypothesis $\sigma_1 = \sigma_2$, an alternative hypothesis
$\sigma_1 \neq \sigma_2$, and a significance level of $\alpha = 0.05$?

Section 10
More problems on t-test, $\chi^2$-test and two-sample z/F-test

Problem 1
The average annual salary of an entry-level data scientist is reported to be
Rs. 8 lakhs per annum. You suspect that this is too high, and make
enquiries with 10 such persons and find that their annual salaries (in Rs. lakhs) are
6.9, 7.2, 8.7, 7.7, 8.5, 8.0, 8.0, 7.5, 8.7, 7.4
Based on the above, what conclusion can you reach about your suspicion?

Problem 2
The weight of a cooking gas cylinder is reported to have a standard
deviation of 500 g, which you suspect is too low. A sample of 10 cylinders
had weights (in kg) of 15.1, 14.0, 14.8, 14.8, 15.8, 14.8, 16.1, 14.3, 14.8,
14.8. Based on this data, what is your conclusion on the standard deviation?

Problem 3
Weights of a species of squirrels have a standard deviation $\sigma = 10$ grams.
Suppose samples of 30 squirrels from each of two different locations result in
respective sample averages of 122.4 grams and 127.6 grams. Do the
squirrels have the same average weight in the two locations?

Problem 4
Two instruments for measuring resistors provide the following measurements
when repeatedly measuring a 1000 Ohm and a 3000 Ohm resistor, respectively.
Instrument 1   Instrument 2
1004           3005
999            2995
993            3019
1000           2993
1008           2992
1002           2991
994            3015
999            2986
Do the two instruments have the same variance in their measurements?
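One possible way to approach this problem numerically is a two-sided F-test on the ratio of sample variances, assuming both sets of readings are normal and independent. In the sketch below, `f_cdf` is a hypothetical helper that approximates the F distribution function through the regularized incomplete beta integral with Simpson's rule. It is a stand-in for a library routine, not an exact method.

```python
import math
import statistics

def f_cdf(x, d1, d2, steps=4000):
    """Approximate P(F(d1, d2) <= x) for d1 >= 2.

    Uses F-cdf(x) = I_z(d1/2, d2/2) with z = d1*x / (d1*x + d2),
    where I_z is the regularized incomplete beta function.
    """
    if d1 < 2:
        raise ValueError("this sketch needs d1 >= 2")
    if x <= 0:
        return 0.0
    a, b = d1 / 2, d2 / 2
    z = d1 * x / (d1 * x + d2)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        return math.exp(-log_beta) * t ** (a - 1) * (1 - t) ** (b - 1)

    h = z / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

instrument_1 = [1004, 999, 993, 1000, 1008, 1002, 994, 999]
instrument_2 = [3005, 2995, 3019, 2993, 2992, 2991, 3015, 2986]

ratio = statistics.variance(instrument_1) / statistics.variance(instrument_2)
d1, d2 = len(instrument_1) - 1, len(instrument_2) - 1
lower_tail = f_cdf(ratio, d1, d2)
p_value = 2 * min(lower_tail, 1 - lower_tail)  # equal mass alpha/2 in each tail

print(f"T = {ratio:.4f}, lower tail = {lower_tail:.4f}, two-sided p-value = {p_value:.4f}")
```

Doubling the smaller tail corresponds to splitting $\alpha$ equally between the two rejection regions.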

Section 11
Likelihood ratio tests

Recall: Is a coin authentic or counterfeit?
An authentic coin is known to have P(H) = 0.5 when tossed, while a
counterfeit coin has P(H) = 0.6. Suppose you have a coin that could be
authentic or counterfeit. You may toss the coin multiple times and observe
the results. How will you test whether the coin is authentic or counterfeit?


Hypothesis testing
Null $H_0: P(H) = 0.5$, Alternative $H_A: P(H) = 0.6$
Toss the coin $n$ times: $2^n$ possible outcomes
- $A$: acceptance set, i.e. if the outcome is in $A$, accept $H_0$; otherwise, reject $H_0$
Significance level: $\alpha = P(\text{not } A \mid H_0)$, Power: $1 - \beta = P(\text{not } A \mid H_A)$
Question: How to decide acceptance subset $A$?

Size vs power: n = 3 tosses
Goal: For a given $\alpha$, find $A$ that minimizes $\beta$ or maximizes $1 - \beta$.

Is this possible for 100 tosses?
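For n = 3 the search over acceptance sets is small enough to do by brute force: there are 8 outcomes and hence $2^8 = 256$ candidate sets. The sketch below enumerates them and keeps, for each achievable $\alpha$, the set with the highest power. The variable names are illustrative. For 100 tosses the same enumeration would need $2^{2^{100}}$ subsets, which is not feasible.

```python
from itertools import product

N = 3
P0, P1 = 0.5, 0.6  # P(H) under H0 and under HA
outcomes = ["".join(seq) for seq in product("HT", repeat=N)]  # 8 outcomes

def prob(outcome, p_heads):
    heads = outcome.count("H")
    return p_heads ** heads * (1 - p_heads) ** (N - heads)

best = {}  # alpha -> (power, acceptance set) with the highest power seen so far
for mask in range(2 ** len(outcomes)):  # every subset A of the outcomes
    accept = [o for i, o in enumerate(outcomes) if (mask >> i) & 1]
    reject = [o for o in outcomes if o not in accept]
    alpha = round(sum(prob(o, P0) for o in reject), 6)  # P(not A | H0)
    power = sum(prob(o, P1) for o in reject)            # P(not A | HA)
    if alpha not in best or power > best[alpha][0]:
        best[alpha] = (power, accept)

for alpha in sorted(best):
    power, accept = best[alpha]
    print(f"alpha = {alpha:.3f}  max power = {power:.3f}  A = {accept}")
```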

Simple hypotheses and likelihood ratio test
$X_1, \ldots, X_n \sim P$
Simple null and alternative

$H_0: P = f_X$ and $H_A: P = g_X$

Likelihood ratio
$L(X_1, \ldots, X_n) = \frac{\prod_{i=1}^{n} g_X(X_i)}{\prod_{i=1}^{n} f_X(X_i)}$
Likelihood ratio test

Reject $H_0$ if $T = L(X_1, \ldots, X_n) > c$

$X_1, \ldots, X_n \sim \mathrm{Bernoulli}(p)$, Null $H_0: p = 0.5$, Alternative $H_A: p = 0.6$

$T = \frac{0.6^w \, 0.4^{n-w}}{0.5^n} > c$
where $w$ is the number of H's in the samples
Since $T = 1.5^w \cdot 0.8^n$ is increasing in $w$, the likelihood ratio test is equivalent to $w > w_c$
- Eg, $n = 100$: Reject null if number of H > 55

Optimality of likelihood ratio test
Theorem
Suppose both null and alternative hypotheses are simple, and there is some
test that achieves a power of 1 − β at a significance level α. Then, there is
a likelihood ratio test at significance level α achieving power at least as high
as 1 − β.
Good news: For simple null and alternative, likelihood ratio tests are
enough.
Bad news
- If any of the hypotheses are composite, then the optimality result does not hold.
- In most situations, we will not have simple null and alternative!

Fake coin: Size vs power for n = 100 tosses
Optimal test: Reject H0 if number of heads > c, c = 0, 1, . . . , 100
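The size and power of each threshold test can be computed directly from binomial tail probabilities. The following sketch does this for a few thresholds, including the threshold of 55 heads. The helper names `tail_prob` and `size_and_power` are assumptions made for this illustration.

```python
from math import comb

def tail_prob(n, c, p):
    """P(W > c) for W ~ Binomial(n, p)."""
    return sum(comb(n, w) * p ** w * (1 - p) ** (n - w) for w in range(c + 1, n + 1))

def size_and_power(n, c, p0=0.5, p1=0.6):
    """Return (alpha, power) of the test 'reject H0 if number of heads > c'."""
    return tail_prob(n, c, p0), tail_prob(n, c, p1)

for c in (50, 55, 60, 65):
    alpha, power = size_and_power(100, c)
    print(f"c = {c}: alpha = {alpha:.4f}, power = {power:.4f}")

alpha_55, power_55 = size_and_power(100, 55)
```

Raising $c$ lowers $\alpha$ but also lowers the power, which is the trade-off a size vs power plot displays.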

Section 12
Goodness of fit tests

Example: Assessing goodness of fit
The expected distribution of grades of students in a class ($0 < p < 1$) and
the actual frequencies of grades are shown below:
Grade     S     A    B    C    D    E     U
Fit       p/32  p/4  p/2  1−p  p/8  p/16  p/32
Observed  15    97   203  397  55   33    10
ML estimate: $L \propto (1 - p)^{397} p^{413}$, $\hat{p}_{ML} = 413/(413 + 397) = 0.51$
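The estimate comes from maximizing the log-likelihood. A short derivation, where 413 = 15 + 97 + 203 + 55 + 33 + 10 is the number of non-C grades and 397 is the number of C grades:

```latex
\begin{align*}
\log L(p) &= 397 \log(1-p) + 413 \log p + \text{const} \\
\frac{d}{dp}\log L(p) &= -\frac{397}{1-p} + \frac{413}{p} = 0 \\
\hat{p}_{ML} &= \frac{413}{413 + 397} = \frac{413}{810} \approx 0.51
\end{align*}
```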

Expected counts of grades:
Grade     S     A      B      C      D     E     U
Expected  12.9  103.2  206.6  396.9  51.6  25.8  12.9
Question: Is the above a good-enough fit?

Chi-square goodness of fit test
$X \in \{a_1, \ldots, a_k\}$ with $P(X = a_i) = p_i$
Observed vs expected counts: $n$ samples
          $a_1$   $a_2$   ...  $a_k$
Observed  $y_1$   $y_2$   ...  $y_k$
Expected  $np_1$  $np_2$  ...  $np_k$
$H_0$: Samples are iid $X$, $H_A$: Samples are not iid $X$

Test statistic: $T = \sum_{i=1}^{k} \frac{(y_i - np_i)^2}{np_i}$ is approx $\chi^2_{k-1}$
Test: Reject $H_0$ if $T > c$
Significance level: $\alpha = P(T > c \mid H_0) \approx 1 - F_{\chi^2_{k-1}}(c)$

Problem: Grades data
$n = 810$ samples, $k = 7$
Grade     S     A      B      C      D     E     U
Observed  15    97     203    397    55    33    10
Expected  12.9  103.2  206.6  396.9  51.6  25.8  12.9
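A minimal sketch of the computation for this data follows, using $k - 1 = 6$ degrees of freedom as in the test described above. Since $p$ was estimated from the same data, some treatments subtract one further degree of freedom; that adjustment is not applied here. The helper `chi2_sf_even` is a hypothetical name; it uses the closed form of the chi-square tail that holds for even degrees of freedom.

```python
import math

def chi2_sf_even(x, dof):
    """P(chi^2_dof > x) for even dof: exp(-x/2) * sum_{j < dof/2} (x/2)^j / j!."""
    if dof % 2:
        raise ValueError("this closed form needs an even dof")
    y = x / 2
    return math.exp(-y) * sum(y ** j / math.factorial(j) for j in range(dof // 2))

observed = [15, 97, 203, 397, 55, 33, 10]
expected = [12.9, 103.2, 206.6, 396.9, 51.6, 25.8, 12.9]

t_stat = sum((y - e) ** 2 / e for y, e in zip(observed, expected))
dof = len(observed) - 1  # k - 1
p_value = chi2_sf_even(t_stat, dof)

print(f"T = {t_stat:.3f}, dof = {dof}, p-value = {p_value:.3f}")
```

A p-value well above the chosen $\alpha$ means the test does not reject the fitted grade distribution.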

Goodness of fit for continuous distributions
Basic idea: convert continuous to discrete by binning

$X \sim f_X(x)$, continuous with PDF $f_X(x)$, $n$ samples
Bins: $[a_0, a_1], [a_1, a_2], \ldots, [a_{k-1}, a_k]$
Bin probabilities: Let $p_i = P(a_{i-1} < X < a_i) = \int_{a_{i-1}}^{a_i} f_X(x)\,dx$
Observed vs expected counts:
          $[a_0, a_1]$  $[a_1, a_2]$  ...  $[a_{k-1}, a_k]$
Observed  $y_1$         $y_2$         ...  $y_k$
Expected  $np_1$        $np_2$        ...  $np_k$

Pick bins such that each $y_i \geq 5$ and $\sum_i p_i = 1$
Test: same as before

Example: Beta(3,3) goodness of fit
Bin counts of 100 samples from Beta(3,3) distribution are given below.
          [0.0, 0.2]  [0.2, 0.4]  [0.4, 0.6]  [0.6, 0.8]  [0.8, 1.0]
Observed  7           23          40          28          6
Expected  5.8         25.9        36.5        25.9        5.8
Eg: Expected count for [0.2, 0.4] $= 100\,(F_{B(3,3)}(0.4) - F_{B(3,3)}(0.2)) = 25.9$

$T = \frac{(7 - 5.8)^2}{5.8} + \frac{(23 - 25.9)^2}{25.9} + \frac{(40 - 36.5)^2}{36.5} + \frac{(28 - 25.9)^2}{25.9} + \frac{(6 - 5.8)^2}{5.8} = 1.09$
P-value $= 1 - F_{\chi^2_4}(1.09) = 0.896$ is quite high. So, accept fit to Beta(3,3).

Example: Test for independence
Consider the following cross-tabulation of grades across 3 different courses.
         S    A    B     C     D    E    U    Total
Math I   15   97   203   387   55   33   10   800
Stats I  28   182  381   726   103  62   19   1500
CT       47   303  634   1209  172  103  31   2500
Total    90   582  1218  2321  331  198  60   4800
Are the grades independent of subjects?

Marginal PMF of grades: P(S) = 90/4800, P(A) = 582/4800 etc.
Marginal PMF of subjects: P(Math I) = 800/4800, P(Stats I) = 1500/4800 etc.
If independent, count of (Math I, S) $= \frac{800}{4800} \cdot \frac{90}{4800} \cdot 4800 = 15$
If independent, count of (Stats I, A) $= \frac{1500}{4800} \cdot \frac{582}{4800} \cdot 4800 = 181.9$ etc.

Example: Observed vs Expected
Observed
         S    A    B     C     D    E    U    Total
Math I   15   97   203   387   55   33   10   800
Stats I  28   182  381   726   103  62   19   1500
CT       47   303  634   1209  172  103  31   2500
Total    90   582  1218  2321  331  198  60   4800
Expected, if independent
         S     A      B      C       D      E      U
Math I   15    97     203    386.8   55.2   33     10
Stats I  28.1  181.9  380.6  725.3   103.4  61.9   18.8
CT       46.9  303.1  634.4  1208.9  172.4  103.1  31.2

Example: Chi-squared test for independence
Null $H_0$: Joint PMF is product of marginals, $H_A$: It is not

Test statistic: $T = \sum_{i,j} \frac{(y_{ij} - np_{ij})^2}{np_{ij}}$
- $p_{ij}$: product of marginals for $(i, j)$
- $np_{ij}$: expected, if independent
- We get $T = 0.012$
Approximate distribution of $T$: Chi-squared with $(3 - 1)(7 - 1) = 12$ degrees of freedom
- dof = (no of rows − 1)(no of cols − 1)

Significance level: $\alpha = 1 - F_{\chi^2_{12}}(c)$
P-value: $1 - F_{\chi^2_{12}}(0.012) = 0.999$, very good fit
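The expected counts and the test statistic can be recomputed from the cell counts alone. The sketch below does this with illustrative variable names. The margins are recomputed from the cells rather than read from the printed Total row and column, so the resulting $T$ may differ slightly from the rounded value quoted above.

```python
observed = [
    [15, 97, 203, 387, 55, 33, 10],      # Math I
    [28, 182, 381, 726, 103, 62, 19],    # Stats I
    [47, 303, 634, 1209, 172, 103, 31],  # CT
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under independence: expected count = n * (row total / n) * (column total / n)
expected = [[r * c / n for c in col_totals] for r in row_totals]

t_stat = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(observed))
    for j in range(len(observed[0]))
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"T = {t_stat:.4f}, dof = {dof}")
```

A statistic this small relative to 12 degrees of freedom gives a p-value very close to 1, so the test finds no evidence against independence.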
