
Econ 3334

Module 5
Linear Regression with a
Single Regressor:
Inference
Department of Economics, HKUST
Instructor: Junlong Feng
Fall 2022
Menu of Module 5

I. Hypothesis testing
II. Confidence interval
III. Two-sample mean differential
IV. Variance estimation and heteroscedasticity
I. Hypothesis testing

Linear regression model Yᵢ = β₀ + β₁Xᵢ + uᵢ under unconfoundedness, i.i.d. sampling, and no
large outliers:
• E[Y|X = b] − E[Y|X = a]: ATE of X changing from a to b on Y.
• E[Y|X] = β₀ + β₁X
• OLS estimators β̂₁ ≡ ∑ᵢ(Xᵢ − X̄)(Yᵢ − Ȳ) / ∑ᵢ(Xᵢ − X̄)² and β̂₀ ≡ Ȳ − β̂₁X̄ are
  • Unbiased for β₀ and β₁
  • Consistent for β₀ and β₁
  • Asymptotically normal:
    β̂₁ is approximately N(β₁, σ²_{β̂₁}), where σ²_{β̂₁} = Var[(Xᵢ − μ_X)uᵢ] / (nσ⁴_X)
    β̂₀ is approximately N(β₀, σ²_{β̂₀}), where σ²_{β̂₀} = Var[Hᵢuᵢ] / (n(E[Hᵢ²])²), where Hᵢ = 1 − (μ_X / E[Xᵢ²])Xᵢ
I. Hypothesis testing

Recall estimation and inference for the population mean μ_Y:
• By the CLT: Ȳ is approximately N(μ_Y, σ²_Y/n).
• Standardize: √n(Ȳ − μ_Y)/σ_Y is approximately N(0, 1).
• For a null H₀: μ_Y = μ_{Y,0} against H₁: μ_Y ≠ μ_{Y,0} with size α,
    Pr(|√n(Ȳ − μ_{Y,0})/σ_Y| ≤ z_{1−α/2}) ≈ 1 − α
• In practice, σ_Y is unknown. Replace it by a consistent estimator s_Y (the sample standard
  deviation: √(∑ᵢ(Yᵢ − Ȳ)² / (n − 1))).
• Reject if |√n(Ȳ − μ_{Y,0})/s_Y| > z_{1−α/2}. Do not reject if |√n(Ȳ − μ_{Y,0})/s_Y| ≤ z_{1−α/2}.
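The recipe above can be sketched in code. This is a minimal illustration, not part of the course materials; the function name and the made-up sample are my own.

```python
# Minimal sketch of the two-sided test for a population mean (stdlib only).
import math


def mean_t_stat(sample, mu_0):
    """t-statistic sqrt(n) * (Ybar - mu_0) / s_Y for H0: mu_Y = mu_0."""
    n = len(sample)
    y_bar = sum(sample) / n
    # sample standard deviation with the (n - 1) divisor, as on the slide
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in sample) / (n - 1))
    return math.sqrt(n) * (y_bar - mu_0) / s_y


sample = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1]  # made-up data
t = mean_t_stat(sample, mu_0=2.0)
reject = abs(t) > 1.96  # two-sided test at size alpha = 0.05
```

For this sample the t-statistic is about 1.41, below 1.96, so H₀: μ_Y = 2.0 is not rejected at the 5% level.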
I. Hypothesis testing

Now suppose we want to test the null H₀: β₁ = β_{1,0} against H₁: β₁ ≠ β_{1,0}. We can use exactly
the same idea.
• β̂₁ is approximately N(β₁, σ²_{β̂₁}).
• Standardize: (β̂₁ − β₁)/σ_{β̂₁} is approximately N(0, 1).
• Under the null H₀: β₁ = β_{1,0} with size α,
    Pr(|(β̂₁ − β_{1,0})/σ_{β̂₁}| ≤ z_{1−α/2}) ≈ 1 − α
• In practice, σ_{β̂₁} is unknown. We need to replace it by a consistent estimator. Recall σ²_{β̂₁} =
  Var[(Xᵢ − μ_X)uᵢ] / (nσ⁴_X). Details will be given later. For now, call the estimator SE(β̂₁).
• Reject if |(β̂₁ − β_{1,0})/SE(β̂₁)| > z_{1−α/2}. Do not reject if |(β̂₁ − β_{1,0})/SE(β̂₁)| ≤ z_{1−α/2}.
I. Hypothesis testing

Reject if |(β̂₁ − β_{1,0})/SE(β̂₁)| > z_{1−α/2}. Do not reject if |(β̂₁ − β_{1,0})/SE(β̂₁)| ≤ z_{1−α/2}.
• (β̂₁ − β_{1,0})/SE(β̂₁) is a t-statistic.
• The rejection rule is a two-sided t-test.
• With critical value z_{1−α/2}, the size (or significance level) of the test is controlled
  at α.
• The asymptotic power is 1, just like the t-test for the population mean with the sample
  average as the estimator.
I. Hypothesis testing

The same procedure works for H₀: β₀ = β_{0,0} vs H₁: β₀ ≠ β_{0,0}:
• Form a t-statistic: (β̂₀ − β_{0,0})/SE(β̂₀).
• T-test: reject if |(β̂₀ − β_{0,0})/SE(β̂₀)| > z_{1−α/2}. Do not reject if |(β̂₀ − β_{0,0})/SE(β̂₀)| ≤ z_{1−α/2}.
I. Hypothesis testing

Example:

  Testscore^ = 698.9 − 2.28 ⋅ STR
               (10.4)   (0.52)

• Convention: the numbers in parentheses below the estimated parameters are the
  standard errors.
• β̂₀ = 698.9; SE(β̂₀) = 10.4.
• β̂₁ = −2.28; SE(β̂₁) = 0.52.
• Consider a two-sided test for H₀: β₁ = 0.
• Interpretation of the null: since β₁ is the marginal average causal effect of STR on test
  score, β₁ = 0 means STR has no causal effect on test score on average.
• T-statistic: |−2.28/0.52| = 4.38 > 2.58 = z_{0.995}. So reject the null at the 1% significance level.
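As a sketch, the decision rule can be wrapped in a small function and checked against the numbers on this slide; the function name and critical-value table are my own, not from the course.

```python
# Two-sided t-test decision rule with the usual normal critical values.
CRITICAL_VALUES = {0.10: 1.64, 0.05: 1.96, 0.01: 2.58}  # z_{1 - alpha/2}


def two_sided_t_test(beta_hat, beta_null, se, alpha):
    """Return (t_stat, reject): reject H0 when |t| > z_{1 - alpha/2}."""
    t_stat = (beta_hat - beta_null) / se
    return t_stat, abs(t_stat) > CRITICAL_VALUES[alpha]


# The slide's example: beta_1_hat = -2.28, SE = 0.52, H0: beta_1 = 0.
t, reject = two_sided_t_test(-2.28, 0.0, 0.52, alpha=0.01)
```

Here |t| ≈ 4.38 > 2.58, so the null is rejected at the 1% level, matching the slide.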
I. Hypothesis testing

We can also compute the p-value.
• For instance, for the null β₁ = β_{1,0} with the two-sided alternative,
    p = 2Φ(−|(β̂₁ − β_{1,0})/SE(β̂₁)|)
• In the example, p = 2Φ(−4.38) = 1.19×10⁻⁵ < 0.01. So, again, significant at the 1%
  level.
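The p-value formula can be checked numerically with the standard library alone; the helper built from `math.erf` is my own, not something used in the course.

```python
# Sketch: two-sided p-value p = 2 * Phi(-|t|), with Phi built from math.erf.
import math


def normal_cdf(z):
    """Standard normal CDF Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_p_value(t_stat):
    return 2.0 * normal_cdf(-abs(t_stat))


p = two_sided_p_value(4.38)  # the t-statistic from the STR example
```

This gives p ≈ 1.19×10⁻⁵, matching the slide; since p < 0.01, the null is rejected at the 1% level.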
I. Hypothesis testing

Example:

  Testscore^ = 698.9 − 2.28 ⋅ STR
               (10.4)   (0.52)

• We can also test hypotheses about β₀.
• E.g. H₀: β₀ = 690 vs H₁: β₀ ≠ 690.
• T-statistic: (698.9 − 690)/10.4 = 0.86.
• Smaller than any commonly used critical value.
• Do not reject at α = 0.1, 0.05, 0.01.
• p = 2Φ(−0.86) = 0.39.
I. Hypothesis testing

Output from R

• The t-values and p-values are all for H₀: parameter = 0 against a two-sided
  alternative.
II. Confidence interval

Recall there is another approach to statistical inference:
• Based on the estimator, build a set/interval such that it covers the true
  parameter with a large probability (1 − α).
• A (1 − α) CI for β₁: [β̂₁ − z_{1−α/2} ⋅ SE(β̂₁), β̂₁ + z_{1−α/2} ⋅ SE(β̂₁)]
• z_{1−α/2} is again the (1 − α/2)th quantile of N(0, 1):
  • For α = 0.01, z_{1−α/2} = 2.58. For α = 0.05, z_{1−α/2} = 1.96. For α = 0.1, z_{1−α/2} = 1.64.
• The coverage probability is still derived from Pr(|(β̂₁ − β₁)/SE(β̂₁)| ≤ z_{1−α/2}) ≈ 1 − α.
• Similarly, a (1 − α) CI for β₀: [β̂₀ − z_{1−α/2} ⋅ SE(β̂₀), β̂₀ + z_{1−α/2} ⋅ SE(β̂₀)]
II. Confidence interval

Example:

  Testscore^ = 698.9 − 2.28 ⋅ STR
               (10.4)   (0.52)

• A 95% confidence interval for β₁: [−2.28 − 1.96×0.52, −2.28 + 1.96×0.52] =
  [−3.30, −1.26].
  • 0 is not in the confidence interval, so H₀: β₁ = 0 can be rejected if the alternative is two-sided.
  • We cannot reject any null inside the interval.
• A 95% confidence interval for β₀: [698.9 − 1.96×10.4, 698.9 + 1.96×10.4] =
  [678.52, 719.28].
  • 690 is in the confidence set.
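These intervals are easy to reproduce in code; a minimal sketch (the helper name is my own):

```python
# Sketch: a (1 - alpha) confidence interval beta_hat +/- z * SE(beta_hat).
def confidence_interval(beta_hat, se, z=1.96):  # z = 1.96 for 95% coverage
    return (beta_hat - z * se, beta_hat + z * se)


ci_slope = confidence_interval(-2.28, 0.52)      # for beta_1
ci_intercept = confidence_interval(698.9, 10.4)  # for beta_0
```

This reproduces approximately (−3.30, −1.26) and (678.52, 719.28), matching the slide.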
II. Confidence interval

From the output from R, you’re able to compute the confidence interval for any
given coverage probability.
• Example: the 90% CI for β₀, i.e., the intercept, is [698.93 − 1.64 ⋅ 10.36, 698.93 + 1.64 ⋅ 10.36].
III. Two-sample mean differential

We didn’t talk about the two-sample mean testing problem. That is not because it lacks
importance, but because the conventional approach is not convenient.
• The two-sample mean testing problem is crucial for causal inference.
• Suppose you randomize a treatment variable D ∈ {0, 1} in an i.i.d. sample.
  • Dᵢ = 1 means individual i receives the treatment.
  • Dᵢ = 0 means individual i does not receive the treatment.
  • Examples of D include vaccines, vouchers, the draft (military service), etc.
• Let the outcome for i be Yᵢ. By the randomness of Dᵢ,
    ATE = E[Yᵢ|Dᵢ = 1] − E[Yᵢ|Dᵢ = 0]
• H₀: ATE = 0 is equivalent to H₀: E[Yᵢ|Dᵢ = 1] = E[Yᵢ|Dᵢ = 0].
III. Two-sample mean differential

A conventional method: the two-sample mean test.
• Divide your sample into two subsamples, each associated with one value of D.
• Calculate the two subsample averages.
• Figure out the asymptotic distribution of the subsample average difference.
• Conduct a t-test.
This is inconvenient and needs new formulas.
III. Two-sample mean differential

We can solve the problem by simply running a regression:
• Write down a linear regression model:
    Yᵢ = β₀ + β₁Dᵢ + uᵢ
• Since Dᵢ is randomly assigned, the assumption E[uᵢ|Dᵢ] = E[uᵢ] is plausible.
  • E[Yᵢ|Dᵢ = 0] = β₀ + E[uᵢ]
  • E[Yᵢ|Dᵢ = 1] = β₀ + β₁ + E[uᵢ]
  • Therefore, β₁ = E[Yᵢ|Dᵢ = 1] − E[Yᵢ|Dᵢ = 0]
• Run OLS, get β̂₁, and test whether it is zero.
• Further, by constructing the confidence interval, we can get a range for the true ATE
  with confidence.
• A unified approach: no separate procedures or formulas are required.
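The derivation above implies that, with a binary regressor, the OLS slope equals the difference of the two subsample means exactly. A quick check with made-up data (all names and numbers are my own):

```python
# With binary D, the OLS slope equals mean(Y | D=1) - mean(Y | D=0).
def ols_slope(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den


d = [0, 0, 0, 1, 1, 1, 1]                              # made-up treatment
y = [640.0, 650.0, 660.0, 655.0, 665.0, 670.0, 662.0]  # made-up outcomes

treated = [yi for di, yi in zip(d, y) if di == 1]
control = [yi for di, yi in zip(d, y) if di == 0]
diff_in_means = sum(treated) / len(treated) - sum(control) / len(control)
slope = ols_slope(d, y)  # numerically identical to diff_in_means
```

Here both equal 13.0: running OLS on the binary treatment reproduces the two-sample mean difference, which is why one regression replaces the separate two-sample procedure.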
III. Two-sample mean differential

Example: In the STR–test score data, construct D = 1 if STR < 20 and D = 0 otherwise.
• Interpretation: β̂₀ is the sample average of test scores for the group D = 0; β̂₁ is the sample mean
  difference of test scores between the two groups; β̂₀ + β̂₁ is the sample average of test scores for the group D = 1.
• For the null H₀: β₁ = 0, the t-value is 4.04, significant at all commonly adopted levels.
• The means of the two subsamples (STR < 20 and STR ≥ 20) are thus significantly different at all
  commonly adopted levels.
• 95% confidence interval for β₁: [7.37 − 1.96 ⋅ 1.82, 7.37 + 1.96 ⋅ 1.82].
IV. Variance estimation and heteroscedasticity

The rationale behind the tests and confidence intervals is

  (β̂₁ − β₁)/σ_{β̂₁} ~ N(0, 1) and (β̂₀ − β₀)/σ_{β̂₀} ~ N(0, 1)

• However, in practice we use SE(β̂₁) and SE(β̂₀) to replace σ_{β̂₁} and σ_{β̂₀}.
• This is because σ_{β̂₁} and σ_{β̂₀} are not directly computable:
    σ²_{β̂₁} = Var[(Xᵢ − μ_X)uᵢ] / (nσ⁴_X),  σ²_{β̂₀} = Var[Hᵢuᵢ] / (n(E[Hᵢ²])²), where Hᵢ = 1 − (μ_X / E[Xᵢ²])Xᵢ
• SE(β̂₁) needs to be a consistent estimator of σ_{β̂₁}. The same holds for SE(β̂₀).
• We now only discuss SE(β̂₁) for simplicity.
IV. Variance estimation and heteroscedasticity

For σ²_{β̂₁} = Var[(Xᵢ − μ_X)uᵢ] / (nσ⁴_X), a natural estimator is

  SE(β̂₁) = √[ (1/n) ⋅ ( (1/n)∑ᵢ(Xᵢ − X̄)²ûᵢ² ) / ( (1/n)∑ᵢ(Xᵢ − X̄)² )² ]

This estimator makes sense because
• In the denominator, (1/n)∑ᵢ(Xᵢ − X̄)² is consistent for σ²_X.
• For the numerator, note that
    Var[(Xᵢ − μ_X)uᵢ] = E[((Xᵢ − μ_X)uᵢ)²] − (E[(Xᵢ − μ_X)uᵢ])²
  The second term is 0 because
    E[(Xᵢ − μ_X)uᵢ] = E[E[(Xᵢ − μ_X)uᵢ | Xᵢ]] = E[(Xᵢ − μ_X)E[uᵢ | Xᵢ]] = E[(Xᵢ − μ_X) ⋅ 0] = 0
  Meanwhile, (1/n)∑ᵢ(Xᵢ − X̄)²ûᵢ² is consistent for E[((Xᵢ − μ_X)uᵢ)²].
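A sketch of this estimator in code, fitted on made-up data (the helper names and the numbers are my own):

```python
# Heteroscedasticity-robust SE for the OLS slope, following the formula above.
import math


def ols_fit(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b1 * x_bar, b1  # (b0, b1)


def robust_se_slope(x, y):
    n = len(x)
    b0, b1 = ols_fit(x, y)
    x_bar = sum(x) / n
    u_hat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals
    num = sum((xi - x_bar) ** 2 * ui ** 2 for xi, ui in zip(x, u_hat)) / n
    den = (sum((xi - x_bar) ** 2 for xi in x) / n) ** 2
    return math.sqrt((num / den) / n)


# Made-up data with a rough downward trend; residuals are nonzero, so SE > 0.
se = robust_se_slope([15.0, 17.0, 19.0, 21.0], [660.0, 655.0, 649.0, 646.0])
```

A sanity check on the formula: with only two points the OLS line fits exactly, the residuals are zero, and the estimated SE is zero.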
IV. Variance estimation and heteroscedasticity

For σ²_{β̂₁} = Var[(Xᵢ − μ_X)uᵢ] / (nσ⁴_X), the natural estimator

  SE(β̂₁) = √[ (1/n) ⋅ ( (1/n)∑ᵢ(Xᵢ − X̄)²ûᵢ² ) / ( (1/n)∑ᵢ(Xᵢ − X̄)² )² ]

is complicated. This is because σ_{β̂₁} itself is complicated. Under one condition,
σ_{β̂₁} can be simplified.
• Homoscedasticity: E[uᵢ² | Xᵢ] = E[uᵢ²] ≡ σ²_u (or Var[uᵢ | Xᵢ] = Var[uᵢ] ≡ σ²_u. Why?)
• Under this condition, recall Var[(Xᵢ − μ_X)uᵢ] = E[((Xᵢ − μ_X)uᵢ)²]. And
    E[((Xᵢ − μ_X)uᵢ)²] = E[(Xᵢ − μ_X)²uᵢ²] = E[E[uᵢ² | Xᵢ](Xᵢ − μ_X)²] = σ²_u σ²_X
• Therefore, σ²_{β̂₁} under homoscedasticity simplifies to σ²_u / (nσ²_X).
IV. Variance estimation and heteroscedasticity

Under homoscedasticity, σ²_{β̂₁} = σ²_u / (nσ²_X). A corresponding consistent estimator is

  SE(β̂₁) = √[ (1/n) ⋅ ( (1/n)∑ᵢûᵢ² ) / ( (1/n)∑ᵢ(Xᵢ − X̄)² ) ]

• Much simpler than the general case without assuming homoscedasticity.
• Incorrect when homoscedasticity fails.
  • When homoscedasticity fails, this simpler SE(β̂₁)² is still consistent for σ²_u / (nσ²_X), but
    σ²_u / (nσ²_X) is no longer equal to σ²_{β̂₁}. A t-test based on this simpler SE(β̂₁) then no
    longer has the desired size control.
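To see the contrast concretely, here is a sketch computing both formulas on made-up data whose error spread grows with X; all names and numbers are my own.

```python
# Homoscedasticity-only vs. heteroscedasticity-robust SE for the OLS slope.
import math


def slope_ses(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals
    # simpler formula: valid only under homoscedasticity
    se_homo = math.sqrt(sum(ui ** 2 for ui in u) / n / sxx)
    # robust formula: valid in both scenarios
    num = sum((xi - x_bar) ** 2 * ui ** 2 for xi, ui in zip(x, u)) / n
    se_robust = math.sqrt((num / (sxx / n) ** 2) / n)
    return se_homo, se_robust


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.5, 7.2, 11.5, 10.0]  # noisier at larger X (heteroscedastic)
se_homo, se_robust = slope_ses(x, y)
```

On this sample the two formulas disagree (roughly 0.246 vs. 0.261): the large residuals sit at large |Xᵢ − X̄|, which the robust numerator weights up while the simpler formula ignores.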
IV. Variance estimation and heteroscedasticity

When E[uᵢ² | Xᵢ] ≠ E[uᵢ²] with positive probability, we say the error is
heteroscedastic.
• Homo- and heteroscedasticity concern whether the conditional variance of uᵢ given Xᵢ
  depends on Xᵢ.
• Our three assumptions for OLS to be unbiased, consistent, and asymptotically normal
  are unrelated to them.
• The only thing that matters is whether you can use the simpler formula for the SE.
  • The simpler formula can only be used under homoscedasticity.
  • The more complicated one can be used in both scenarios because we derived it without
    assuming either homo- or heteroscedasticity.
  • For this reason, the more complicated formula is called the heteroscedasticity-robust
    standard error.
IV. Variance estimation and heteroscedasticity

In the past, people tested for homoscedasticity first, and if homoscedasticity could not be
rejected, they used the simpler formula for the SE.

This is NOT necessary, because homoscedasticity tests are known to have many problems (low
power, etc.).

Why not just use a universal formula that applies to both cases and stay
agnostic about the conditional variance? It doesn’t affect the unbiasedness or
consistency of OLS anyway.

Just using the heteroscedasticity-robust SE (the more complicated version) without
worrying about heteroscedasticity is today’s standard.
