Class 4 Slides

R300 Advanced Econometric Methods:
Problem Set 4
TA: Cheuk (Next online class 2pm-4pm)
November 10, 2022
Question 1
Suppose that
$$y_i = x_i \beta_1 + z_i \beta_2 + \varepsilon_i.$$
State and show the Frisch-Waugh-Lovell theorem (say, for $\beta_1$) by explicit calculation.
Frisch-Waugh-Lovell Theorem
Recall the projection matrix
$$P = X (X'X)^{-1} X'$$
and the annihilator matrix
$$M = I - P = I - X (X'X)^{-1} X'.$$
These matrices are symmetric and idempotent ($A^2 = A$).
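FWL states that
$$y = X_1 \hat\beta_1 + X_2 \hat\beta_2 + \hat\varepsilon$$
and
$$\underbrace{M_1 y}_{\hat u} = \underbrace{M_1 X_2}_{\hat v} \hat\beta_2 + \hat\varepsilon, \qquad M_1 = I - X_1 (X_1' X_1)^{-1} X_1',$$
give the same $\hat\beta_2$.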
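Proof:
$$y = X_1 \hat\beta_1 + X_2 \hat\beta_2 + \hat\varepsilon$$
$$M_1 y = \underbrace{M_1 X_1}_{=0} \hat\beta_1 + M_1 X_2 \hat\beta_2 + \underbrace{M_1 \hat\varepsilon}_{=\hat\varepsilon}$$
where
$$M_1 X_1 = X_1 - X_1 (X_1' X_1)^{-1} X_1' X_1 = 0$$
and
$$M_1 \hat\varepsilon = \left(I - X_1 (X_1' X_1)^{-1} X_1'\right) \hat\varepsilon = \hat\varepsilon - X_1 (X_1' X_1)^{-1} \underbrace{X_1' \hat\varepsilon}_{=0} = \hat\varepsilon.$$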
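The proof above shows that the residual vector is unchanged after premultiplying by $M_1$. One way to complete the argument, sketched here under the added assumption that $X_2' M_1 X_2$ is invertible, is to premultiply the transformed equation by $(M_1 X_2)'$ and use the OLS orthogonality condition $X_2' \hat\varepsilon = 0$ together with $M_1' M_1 = M_1$:

```latex
\begin{align*}
(M_1 X_2)' M_1 y
  &= X_2' M_1 X_2 \hat\beta_2 + X_2' M_1 \hat\varepsilon \\
  &= X_2' M_1 X_2 \hat\beta_2 + X_2' \hat\varepsilon \\
  &= X_2' M_1 X_2 \hat\beta_2 .
\end{align*}
```

Hence $\hat\beta_2 = (X_2' M_1 X_2)^{-1} X_2' M_1 y$, which is the least-squares coefficient from regressing $M_1 y$ on $M_1 X_2$.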
Question 1 - Solution
The Frisch-Waugh-Lovell theorem here goes as follows. Regressing $y_i$ on $z_i$ gives residuals
$$\hat u_i = y_i - z_i \hat\gamma_1, \qquad \hat\gamma_1 = \frac{\sum_i z_i y_i}{\sum_i z_i^2}.$$
Regressing $x_i$ on $z_i$ gives residuals
$$\hat v_i = x_i - z_i \hat\gamma_2, \qquad \hat\gamma_2 = \frac{\sum_i z_i x_i}{\sum_i z_i^2}.$$
Regressing $\hat u_i$ on $\hat v_i$ gives slope coefficient
$$\hat\beta_1 = \frac{\sum_i \hat v_i \hat u_i}{\sum_i \hat v_i^2}.$$
This slope is numerically identical to the least-squares estimator applied directly to the bivariate regression problem.
To see this, first note that the least-squares estimator solves
$$\min_{b_1, b_2} \sum_{i=1}^n (y_i - x_i b_1 - z_i b_2)^2.$$
The first-order conditions are
$$\sum_{i=1}^n x_i (y_i - x_i b_1 - z_i b_2) = 0,$$
$$\sum_{i=1}^n z_i (y_i - x_i b_1 - z_i b_2) = 0.$$
From the second equation,
$$\sum_i z_i (y_i - x_i b_1) = \sum_i z_i^2 \, b_2$$
and so
$$b_2 = \frac{\sum_i z_i (y_i - x_i b_1)}{\sum_i z_i^2}.$$
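Substituting back into the first-order condition for $b_1$ gives
$$\sum_{i=1}^n x_i \left( y_i - x_i b_1 - z_i \frac{\sum_{i'} z_{i'} y_{i'}}{\sum_{i'} z_{i'}^2} + z_i \frac{\sum_{i'} z_{i'} x_{i'}}{\sum_{i'} z_{i'}^2} \, b_1 \right) = 0.$$
But note that (using the definitions from above) this is
$$\sum_{i=1}^n x_i \left( (y_i - z_i \hat\gamma_1) - (x_i - z_i \hat\gamma_2) \, b_1 \right) = 0,$$
or even
$$\sum_{i=1}^n x_i (\hat u_i - \hat v_i b_1) = 0.$$
The solution is
$$\hat\beta_1 = \frac{\sum_i x_i \hat u_i}{\sum_i x_i \hat v_i} = \frac{\sum_i \hat v_i \hat u_i}{\sum_i \hat v_i^2}.$$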
To see the last transition note that, by definition, $x_i = z_i \hat\gamma_2 + \hat v_i$. Then,
$$\sum_i x_i \hat u_i = \sum_i (z_i \hat\gamma_2 + \hat v_i) \hat u_i = \hat\gamma_2 \sum_i z_i \hat u_i + \sum_i \hat v_i \hat u_i = \sum_i \hat v_i \hat u_i.$$
Indeed, $\sum_i z_i \hat u_i = 0$ by properties of the least-squares regression of $y_i$ on $z_i$. The same argument explains why $\sum_i x_i \hat v_i = \sum_i \hat v_i^2$.

Note that the regression of $y_i$ on $z_i$ is performed mostly for didactic purposes. Replacing $\hat u_i$ by $y_i$ and estimating $\beta_1$ by
$$\frac{\sum_i y_i \hat v_i}{\sum_i \hat v_i^2}$$
gives the same result. (Be sure to convince yourself of this.)
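The equivalence derived above can be checked numerically. The following is a minimal sketch, not part of the original solution: the simulated data, the coefficient values 1.0 and -2.0, and the helper names `slope`, `residuals` and `bivariate_ols` are all illustrative assumptions. It compares the direct bivariate least-squares estimate of $\beta_1$ with the two residual-based versions.

```python
import random

def slope(x, y):
    """Slope from regressing y on x without an intercept: sum(x*y) / sum(x^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def residuals(x, y):
    """Residuals from regressing y on x without an intercept."""
    g = slope(x, y)
    return [b - a * g for a, b in zip(x, y)]

def bivariate_ols(x, z, y):
    """Solve the two first-order conditions for (b1, b2) by Cramer's rule."""
    sxx = sum(a * a for a in x)
    szz = sum(a * a for a in z)
    sxz = sum(a * b for a, b in zip(x, z))
    sxy = sum(a * b for a, b in zip(x, y))
    szy = sum(a * b for a, b in zip(z, y))
    det = sxx * szz - sxz ** 2
    return (sxy * szz - sxz * szy) / det, (sxx * szy - sxz * sxy) / det

# Hypothetical data: x is correlated with z, so partialling out z matters.
rng = random.Random(0)
n = 200
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * zi + rng.gauss(0, 1) for zi in z]
y = [1.0 * xi - 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

b1_direct, _ = bivariate_ols(x, z, y)   # least squares on (x, z) jointly
u_hat = residuals(z, y)                 # y with z partialled out
v_hat = residuals(z, x)                 # x with z partialled out
b1_fwl = slope(v_hat, u_hat)            # regress u_hat on v_hat
b1_short = slope(v_hat, y)              # regress y itself on v_hat

print(b1_direct, b1_fwl, b1_short)
```

The three printed numbers should agree up to floating-point rounding, which is the content of the theorem in this scalar setting.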

Question 2
Suppose $x$ is continuous and uniformly distributed on the interval $[\theta, \theta + 1]$. We wish to test
$$H_0: \theta = 0 \quad \text{vs} \quad H_1: \theta > 0.$$
Consider the procedure
Reject $H_0$ if $x > .95$, Accept $H_0$ otherwise.
(i) Compute the size of this test.
(ii) Derive the power function.
Hypothesis Testing
Reading: Casella and Berger Chapter 8.

Definition (Hypothesis)
A hypothesis is a statement about a population parameter.
Definition (Null and Alternative Hypothesis)
In any hypothesis test, there are two opposing hypotheses: the null hypothesis, denoted by $H_0$, and the alternative hypothesis, denoted by $H_1$.
Their general form is given by:
$$H_0: \theta \in \Theta_0 \quad \text{vs.} \quad H_1: \theta \in \Theta_0^c \equiv \Theta \setminus \Theta_0$$
where $\theta \in \Theta$ is a parameter and $\Theta_0 \subset \Theta$ is the subset of the parameter space specified by the null hypothesis.
Definition (Hypothesis Test)
A hypothesis test is a rule that specifies whether $H_0$ is rejected or not for any realization of the sample.
Definition (Statistic)
A statistic is a measurable function of the sample. When a statistic is used for estimation, it is called an estimator. When a statistic is used for hypothesis testing, it is called a test statistic.
Definition (Rejection Region)
The rejection region is the subset of the sample space on which $H_0$ is rejected. This rejection region is often defined using the test statistic. E.g. if we are testing the hypothesis that the population mean is $\theta$, we might define the rejection region as
$$R_n = \{x_1, x_2, \ldots, x_n : \bar x_n \le c_n\}$$
where $c_n$ is a number to be determined and is called the critical value.
Definition (Power Function)
The power function of a hypothesis test with rejection region $R_n$ is the function of $\theta$ defined by
$$\beta(\theta) = P_\theta(X_n \in R_n).$$
Definition (Type I and Type II Errors)
Type I: reject $H_0$ when $H_0$ is true
Type II: accept $H_0$ when $H_1$ is true
Relating the power function to Type I and Type II errors

Theorem
$$\beta(\theta) = \begin{cases} \text{probability of a Type I error} & \text{if } \theta \in \Theta_0 \\ 1 - \text{probability of a Type II error} & \text{if } \theta \in \Theta_0^c \end{cases}$$
A good test has power function near 1 for $\theta \in \Theta_0^c$ and near 0 for $\theta \in \Theta_0$.
Definition (Size)
The size is the maximum probability of a Type I error among distributions in the null hypothesis:
$$\text{size}_n = \sup_{\theta \in \Theta_0} P_\theta(X_n \in R_n)$$
Definition (Size α Test)
A test with power function $\beta(\theta)$ is a size $\alpha$ test if
$$\sup_{\theta \in \Theta_0} \beta(\theta) = \alpha,$$
where $\alpha \in [0, 1]$.
Definition (Level α Test)
A test with power function $\beta(\theta)$ is a level $\alpha$ test if
$$\sup_{\theta \in \Theta_0} \beta(\theta) \le \alpha,$$
where $\alpha \in [0, 1]$.
We define these objects to fix the probability of committing a Type I error. Then it makes sense to talk about maximizing power. Otherwise, we can trivially achieve maximum power by always rejecting.
Definition (Unbiased Test)
A test with power function $\beta(\theta)$ is unbiased if $\beta(\theta') \ge \beta(\theta'')$ for every $\theta' \in \Theta_0^c$ and $\theta'' \in \Theta_0$.
An unbiased test is at least as likely to reject $H_0$ if $\theta \in \Theta_0^c$ as if $\theta \in \Theta_0$.
Definition (Uniformly Most Powerful (UMP))
Let $\mathcal{C}$ be the class of all level $\alpha$ tests. A test in class $\mathcal{C}$ with power function $\beta(\theta)$ is uniformly most powerful if
$$\beta(\theta) \ge \beta'(\theta)$$
for every $\theta \in \Theta_0^c$ and every $\beta'(\theta)$ that is the power function of another test in class $\mathcal{C}$.
Recall that $\beta(\theta)$ is equal to one minus the probability of a Type II error when $\theta \in \Theta_0^c$. This means a UMP test minimizes the probability of a Type II error while keeping the probability of a Type I error at most $\alpha$.
Question 2 - Solution
(i) Under $H_0$, $x$ is uniformly distributed on $[0, 1]$. Hence,
$$\sup_{\theta \in \Theta_0} P_\theta(X > 0.95) = P_{\theta = 0}(X > 0.95) = 1 - 0.95 = 0.05$$
so that this test has size $\alpha = .05$.
(ii) Given the parameter $\theta$, $x$ is uniformly distributed on $[\theta, \theta + 1]$ and so, for $v \in [\theta, \theta + 1]$,
$$P_\theta(x \le v) = v - \theta.$$
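The power function at $\theta$ is
$$\beta(\theta) := P_\theta(x > .95) = 1 - P_\theta(x \le .95) = \begin{cases} 0 & \text{if } \theta \le -.05 \\ \theta + .05 & \text{if } \theta \in (-.05, .95] \\ 1 & \text{if } \theta > .95 \end{cases}$$
Note that $\beta(0) = 0.05 < \beta(0 + \varepsilon) = 0.05 + \varepsilon$ for $0 < \varepsilon \le .9$. This is an unbiased test!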
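The piecewise power function can be written in one line and compared against simulation. This is a rough sketch rather than part of the solution; the function names `power` and `simulated_power`, the number of draws and the seed are assumptions chosen for illustration.

```python
import random

def power(theta, cutoff=0.95):
    """P_theta(x > cutoff) for x ~ Uniform[theta, theta + 1], clipped to [0, 1]."""
    return min(1.0, max(0.0, theta + 1.0 - cutoff))

def simulated_power(theta, cutoff=0.95, draws=100_000, seed=0):
    """Monte Carlo rejection frequency of the rule 'reject if x > cutoff'."""
    rng = random.Random(seed)
    rejections = sum(theta + rng.random() > cutoff for _ in range(draws))
    return rejections / draws

# Compare the closed form with simulation at a few parameter values.
for theta in (-0.5, 0.0, 0.3, 0.95, 1.5):
    print(theta, power(theta), simulated_power(theta))
```

Evaluating `power(0.0)` gives the size of the test, and the function is nondecreasing in `theta`, in line with the unbiasedness remark above.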
Illustration - one-sided test
Figure 1: Source: lecture slides

Illustration - two-sided test
Figure 2: Source: lecture slides

Question 3
Suppose $x \sim N(\mu, \sigma^2)$. Consider two independent random samples on $x$, $\{x_{1i}\}_{i=1}^n$ and $\{x_{2i}\}_{i=1}^n$. Find a sample size $n$ so that
$$P\left(|\bar x_1 - \bar x_2| < \frac{\sigma}{5}\right)$$
is .99. Explain how you proceed.
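By normality both sample means $\bar x_1$ and $\bar x_2$ have distribution $N(\mu, \sigma^2/n)$. By independence, their difference $\bar x_1 - \bar x_2$ is distributed as $N(0, 2\sigma^2/n)$. Now,
$$P\left(|\bar x_1 - \bar x_2| < \frac{\sigma}{5}\right) = P\left(-\frac{\sigma}{5} < \bar x_1 - \bar x_2 < \frac{\sigma}{5}\right) = P\left(-\frac{1/5}{\sqrt{2}/\sqrt{n}} < z < \frac{1/5}{\sqrt{2}/\sqrt{n}}\right)$$
for $z = (\bar x_1 - \bar x_2)/(\sqrt{2}\,\sigma/\sqrt{n})$ standard normal.
Thus, by symmetry of the normal distribution, we need to solve
$$P\left(z \ge \frac{1}{5}\sqrt{\frac{n}{2}}\right) = 1 - \Phi\left(\frac{1}{5}\sqrt{\frac{n}{2}}\right) = .005 \implies n = 2\left(5\,\Phi^{-1}(0.995)\right)^2$$
for $n$. This is equivalent to solving $\frac{1}{5}\sqrt{\frac{n}{2}} = 2.576$ for $n$ (e.g. use norminv from Matlab or look it up in a table). This yields $n \approx 332$.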
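The same calculation can be reproduced without a table. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for Matlab's norminv; the variable names and the helper `coverage` are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()          # standard normal, mean 0 and sd 1
target = 0.99

# Two-sided probability .99 leaves .005 in each tail, so use the .995 quantile.
z_crit = std_normal.inv_cdf(1 - (1 - target) / 2)
n_exact = 2 * (5 * z_crit) ** 2    # solves (1/5) * sqrt(n/2) = z_crit
n = ceil(n_exact)                  # smallest integer sample size that suffices

def coverage(n):
    """P(|xbar_1 - xbar_2| < sigma/5) for sample size n."""
    c = sqrt(n / 2) / 5
    return std_normal.cdf(c) - std_normal.cdf(-c)

print(round(z_crit, 3), round(n_exact, 1), n, coverage(n))
```

Rounding up rather than to the nearest integer is a design choice here: it guarantees the probability is at least .99, and in this case both conventions give the same answer.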
Question 4
Suppose that $x_i$ is exponential with density
$$f_\theta(x) = \theta e^{-x\theta},$$
where $\theta > 0$ and $x \ge 0$.
(i) Derive the maximum likelihood estimator of $\theta$, say $\hat\theta$.
(ii) Derive the asymptotic distribution of $\hat\theta$.
(iii) Derive the asymptotic distribution of the estimator $1/\hat\theta$.
Delta Method
Theorem (Delta Method)
Let $\{X_n\}_{n \ge 1}$ be a sequence of $k$-dimensional random vectors, let $\mu \in \mathbb{R}^k$ be such that
$$\sqrt{n}(X_n - \mu) \xrightarrow{d} N(0_k, \Sigma)$$
and let $F: \mathbb{R}^k \to \mathbb{R}^u$ be continuously differentiable at $\mu$. Then
$$\sqrt{n}(F(X_n) - F(\mu)) \xrightarrow{d} N(0_u, DF(\mu) \Sigma DF(\mu)')$$
where $DF(\mu)$ is the gradient (Jacobian) of $F$ evaluated at $\mu$.
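Proof - Univariate:
By the mean value theorem (first-order Taylor expansion around $\theta$),
$$F(X_n) = F(\theta) + DF(\tilde\theta)(X_n - \theta)$$
where $\tilde\theta$ lies between $X_n$ and $\theta$.
Rearrange the expression above and multiply by $\sqrt{n}$ to get
$$\sqrt{n}[F(X_n) - F(\theta)] = \underbrace{DF(\tilde\theta)}_{\xrightarrow{p} DF(\theta)} \underbrace{\sqrt{n}(X_n - \theta)}_{\xrightarrow{d} N(0, \sigma^2)}$$
where $DF(\tilde\theta) \xrightarrow{p} DF(\theta)$ because $\tilde\theta \xrightarrow{p} \theta$ and by the continuous mapping theorem.
Note that $\tilde\theta \xrightarrow{p} \theta$ because $|\tilde\theta - \theta| \le |X_n - \theta|$ and $X_n \xrightarrow{p} \theta$.
By Slutsky's theorem, the product converges in distribution to $N(0, DF(\theta)^2 \sigma^2)$.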
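(i) The log density is
$$\log f_\theta(x) = \log\theta - x\theta$$
and so
$$\frac{\partial \log f_\theta(x)}{\partial\theta} = \frac{1}{\theta} - x.$$
Setting the sample average of the score to zero gives $1/\hat\theta - \bar x = 0$, so the MLE is $\hat\theta = 1/\bar x$.
(ii) The information bound is $\theta^2/n$. The asymptotic distribution result is
$$\sqrt{n}(\hat\theta - \theta) \xrightarrow{d} N(0, \theta^2).$$
(iii) We have $F(\theta) = \frac{1}{\theta}$ and $DF(\theta) = -\frac{1}{\theta^2}$. Apply the delta method:
$$\sqrt{n}\left(\frac{1}{\hat\theta} - \frac{1}{\theta}\right) \xrightarrow{d} N\left(0, \left(-\frac{1}{\theta^2}\right)^2 \theta^2\right) = N(0, 1/\theta^2).$$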
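A small Monte Carlo experiment can illustrate both limiting variances. This is only a sketch under assumed settings ($\theta = 2$, $n = 200$, 2000 replications, fixed seed); the function name `simulate` is hypothetical, and finite-sample variances will differ somewhat from the asymptotic values.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def simulate(theta=2.0, n=200, reps=2000, seed=0):
    """Variances of sqrt(n)(theta_hat - theta) and sqrt(n)(1/theta_hat - 1/theta)."""
    rng = random.Random(seed)
    scaled_mle, scaled_inv = [], []
    for _ in range(reps):
        # random.expovariate takes the rate, matching the density theta*exp(-x*theta).
        xbar = fmean(rng.expovariate(theta) for _ in range(n))
        theta_hat = 1.0 / xbar                      # MLE from part (i)
        scaled_mle.append(sqrt(n) * (theta_hat - theta))
        scaled_inv.append(sqrt(n) * (xbar - 1.0 / theta))
    return pvariance(scaled_mle), pvariance(scaled_inv)

var_mle, var_inv = simulate()
# Asymptotic theory suggests roughly theta^2 = 4 and 1/theta^2 = 0.25.
print(var_mle, var_inv)
```

Since $1/\hat\theta = \bar x$, part (iii) can also be read off directly from the central limit theorem; the delta method gives the same variance.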