
Econometrics for Consultants
Data, everywhere
Instructor: Tirthatanmoy Das

Lecture 3
Jun 12, 2024
Last class…
Software for analysis

Have a project in mind?

Data fitting?

Estimation – bivariate linear regression
Today…

Causality

Multivariate linear regression

Causal interpretation of results

Evaluating regression results
Causality

Are the estimates confounded?

Is this GDPN’s causal effect?

Is $\hat{\beta}_1 = 0.92$ the causal effect?

▷ Theory says CVN, PP, and DPC also affect P, but they are not included

▷ Could GDPN pick up their effects? Possibly yes
Notion of causality

Causality comes from the domain: concepts/theory establish causal relationships, not data

Think of thought experiments

Causal effect of $X_1$

Consider $Y = f(X_1, X_2, \dots, X_k)$

▷ Causal relationship: a change in $X_1$ causes a change in $Y$, holding the other $X$s constant (ceteris paribus)
How to hold other factors constant

$u$: the regression error
▷ All omitted determinants of $Y$
▷ Any random noise

Causal effect of $X_1$

With linearity: $Y = B_0 + B_1 X_1 + \dots + B_k X_k$

▷ But if you run $Y = B_0 + B_1 X_1$

▷ Then you are missing $u = B_2 X_2 + \dots + B_k X_k$
Confounding? The imperfect regression model

Imperfect regression model

$Y = B_0 + B_1 X + u$

▷ $B_1$ is not the causal effect if $\mathrm{Corr}(X, u) \neq 0$
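A minimal simulation sketch of this point, assuming NumPy and a made-up data-generating process (none of the numbers come from the lecture's data). The true model has two regressors, `x2` is correlated with `x1`, and the short regression that omits `x2` leaves its effect in $u$, so the slope on `x1` drifts away from the true $B_1 = 2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (illustrative values only)
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)      # x2 is correlated with x1
noise = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + noise   # true B1 = 2


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_short = ols(y, x1)       # omits x2, so u = 3*x2 + noise is correlated with x1
b_long = ols(y, x1, x2)    # includes x2

print("short regression slope on x1:", round(b_short[1], 2))
print("long regression slope on x1: ", round(b_long[1], 2))
```

Under these assumptions the short-regression slope should land near 2 + 3 × 0.8 = 4.4, while the long regression should recover a slope near 2.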
What can cause $\mathrm{Corr}(X, u) \neq 0$?
Investigate the root cause of the violation

Ask the following for $\mathrm{Cov}[X, u] \neq 0$

▷ Are any relevant determinants of $Y$ omitted that could be correlated with $X$?

▷ Are you using a linear form when the true relationship is nonlinear?

▷ Is $X$ measured with error?
▷ Does $Y$ also affect $X$ (reverse causality)?

▷ Does your sample (or data) fail to represent the population?
The answer to all of these questions must be NO for a causal interpretation
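As a rough illustration of one item on this checklist, the sketch below simulates classical measurement error in $X$. It assumes NumPy and invented numbers: the true slope is 2, and the observed regressor equals the true one plus independent noise of the same variance. In that setting the estimated slope is expected to shrink toward zero by the factor Var(true X) / (Var(true X) + Var(error)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical setup (illustrative values only)
x_true = rng.normal(size=n)                    # true regressor, variance 1
y = 0.5 + 2.0 * x_true + rng.normal(size=n)    # true slope = 2
x_obs = x_true + rng.normal(size=n)            # observed with error, error variance 1


def slope(x, y):
    """Bivariate OLS slope: Cov(x, y) / Var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


slope_true = slope(x_true, y)   # regression on the correctly measured X
slope_obs = slope(x_obs, y)     # regression on the mismeasured X

print("slope using true X:    ", round(slope_true, 2))
print("slope using observed X:", round(slope_obs, 2))
```

With equal variances for the true regressor and the error, the second slope should come out at roughly half the first.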
Today…
Causality

Multivariate linear regression

Another example – US manufacturing data?

Causal interpretation of results

Evaluating regression results
What about factors that affect $Y$ but are unobserved or unavailable?

How to ensure ‘ceteris paribus’

How to account for unobserved or unavailable factors?

▷ Add $X_2, \dots, X_k$

▷ Then add $u$, the error term, to the equation

$Y = B_0 + B_1 X_1 + \dots + B_k X_k + u$
A more complete model emerges

Multivariate regression

$Y = B_0 + B_1 X_1 + \dots + B_k X_k + u$

▷ If it is true for all observations, then it is true for a particular $i$-th unit or observation

$Y_i = B_0 + B_1 X_{1i} + \dots + B_k X_{ki} + u_i$
What about the ‘pharma’ example?

GDPN, CVN, PP are indeed correlated

Is this GDPN’s causal effect?

Answer: possibly yes, unfortunately!

▷ If CVN, PP are correlated with GDPN (is this true here?)

▷ If so, $\hat{\beta}_1 = 0.92$ may not be the true effect of GDPN on P
How to deal with this new crisis

New crisis? Causality?

Need to hold CVN, PP constant (‘ceteris paribus’)

▷ Add them to the regression
A more complete model emerges

Including other factors determining ‘pharma’ price

Multivariate regression: the ‘pharma’ example

$P_i = B_0 + B_{GDPN}\,GDPN_i + B_{CVN}\,CVN_i + B_{DPC}\,DPC_i + B_{IPC}\,IPC_i + B_{PP}\,PP_i + u_i$

where $i$ is the country identifier
The model is applicable to every entity in the population

Multivariate regression model

$Y_i = B_0 + B_1 X_{1i} + \dots + B_k X_{ki} + u_i$

▷ Deterministic part, $E(Y \mid X)$: $XB = B_0 + B_1 X_{1i} + \dots + B_k X_{ki}$

▷ Error: $u_i$
▷ For now: assume $u$ is uncorrelated with $X_1, X_2, \dots, X_k$

▷ More on this assumption later
Nomenclature of the model

Multivariate regression model

$Y_i = B_0 + B_1 X_{1i} + \dots + B_k X_{ki} + u_i$, where $i = 1, 2, \dots, N$

▷ $Y$: regressand, outcome, or dependent variable

▷ $X$: vector of regressors, determinants, or independent variables

▷ $u$: error term with $E[u] = 0$

▷ $B_0, B_1, \dots, B_k$: regression coefficients or regression parameters

▷ $N$: the number of observations

▷ Reminder: linearity is assumed, though it doesn't have to be
Obtaining causal effect from the regression (remember $u$ is uncorrelated with the $X$s)

Causal effect

When $X_1$ changes by 1 unit, $Y$ changes by $B_1$ units (on average), holding everything else constant (‘ceteris paribus’)

$\frac{\partial E[Y \mid X]}{\partial X_1} = \frac{\partial (B_0 + B_1 X_1 + B_2 X_2 + \dots + B_k X_k)}{\partial X_1} = B_1$
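One way to see what "holding the other $X$s constant" means in a sample is the Frisch–Waugh–Lovell result. It is not covered on the slide and is brought in here only as an illustration: the coefficient on $X_1$ in the full regression equals the slope obtained after removing from both $Y$ and $X_1$ the part explained by the other regressors. The sketch assumes NumPy and invented data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Hypothetical data (illustrative values only)
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)


def fit(y, X):
    """OLS coefficients for y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


ones = np.ones(n)
Z = np.column_stack([ones, x2])          # constant plus the "other" regressor

# Full multivariate regression: coefficient on x1
b1_full = fit(y, np.column_stack([ones, x1, x2]))[1]

# Partial x2 (and the constant) out of y and x1, then regress residual on residual
y_res = y - Z @ fit(y, Z)
x1_res = x1 - Z @ fit(x1, Z)
b1_partial = (x1_res @ y_res) / (x1_res @ x1_res)

print("coefficient on x1, full regression:", b1_full)
print("coefficient on x1, partialled out: ", b1_partial)
```

The two numbers should agree up to floating-point error, which is the sample analogue of varying $X_1$ while the other regressors stay fixed.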
Sample counterpart of the regression

$Y_i = b_0 + b_1 X_{1i} + \dots + b_k X_{ki} + e_i$

▷ where $b_0, b_1, \dots, b_k$ are the estimates of $B_0, B_1, \dots, B_k$

▷ and $e_i$ is the residual
How to obtain the estimates again

Least squares method

$\min_{b} \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_{1i} - b_2 X_{2i} - \dots - b_k X_{ki})^2$

▷ Take derivatives w.r.t. the coefficients

▷ Set them to zero
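Setting those derivatives to zero yields the normal equations $X'Xb = X'Y$, where $X$ is the data matrix with a leading column of ones. The sketch below solves them directly, assuming NumPy and simulated data with arbitrary coefficients. In practice a routine such as `np.linalg.lstsq` is usually preferred for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000

# Hypothetical data with three regressors (illustrative values only)
X_raw = rng.normal(size=(n, 3))
y = 0.5 + X_raw @ np.array([1.0, -2.0, 0.3]) + rng.normal(size=n)

X = np.column_stack([np.ones(n), X_raw])   # first column is the intercept

# First-order conditions of the least squares problem: X'X b = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

e = y - X @ b                              # residuals
foc = X.T @ e                              # should be ~0: residuals orthogonal to each column of X

print("estimates b0..b3:", np.round(b, 3))
print("largest first-order condition in absolute value:", np.abs(foc).max())
```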


Note

Multivariate regression – ‘pharma’ example

Estimates

▷ The coefficient of GDPN changed quite a bit

▷ Does 1.58 represent the causal effect? Yes, but only if $u_i$ is now uncorrelated with the $X$s
Today…

Causality

Multivariate linear regression

Evaluating regression results
Evaluating the results from regression – credibility and reliability

Result evaluations

How reliable are OLS results? Check boxes

✓ Is the regression equation supported by sound domain concept/theory?
✓ Are all important variables included in the equation?

✓ Is the functional form correct? Note we used linear

✓ Is OLS the best estimator to be used for this equation? Note that there could be other estimators

✓ Is the regression free from other econometric problems? (more later)

✓ Is the dataset reasonably large, accurate, and representative of the population you are interested in?

✓ How do the estimates compare with the expectations (e.g., magnitude, sign)?

✓ How well does the estimated regression fit the data?
Goodness of fit: $R^2$

▷ $R^2$: the coefficient of determination

▷ Overall measure of goodness of fit
How good is the fit

▷ $R^2$ is the fraction of the total variation in $Y$ that is explained by the $X$s

▷ It is a value between 0 (no fit) and 1 (perfect fit)
▷ Explained sum of squares: $ESS = \sum_i (\hat{Y}_i - \bar{Y})^2$

▷ Residual sum of squares: $RSS = \sum_i e_i^2$

▷ Total sum of squares: $TSS = \sum_i (Y_i - \bar{Y})^2$

▷ Then, $R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$
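The three sums of squares can be computed directly from a fitted regression. The sketch below assumes NumPy and simulated data (the coefficients are arbitrary). The identity $ESS + RSS = TSS$, and with it the equality of the two $R^2$ expressions, relies on the regression being estimated by OLS with an intercept.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500

# Hypothetical data (illustrative values only)
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])       # intercept plus one regressor
b, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ b                              # fitted values
e = y - y_hat                              # residuals

ess = np.sum((y_hat - y.mean()) ** 2)      # explained sum of squares
rss = np.sum(e ** 2)                       # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)          # total sum of squares

r2_from_ess = ess / tss
r2_from_rss = 1 - rss / tss

print("ESS + RSS:", ess + rss, " TSS:", tss)
print("R^2 via ESS/TSS:", r2_from_ess, " via 1 - RSS/TSS:", r2_from_rss)
```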
Higher $R^2$ is better, but not the only thing to consider

Misuse of $R^2$

▷ Do not attempt to maximize $R^2$ only

▷ Spurious regressions can sometimes give a high $R^2$ but have no theoretical basis
Example: misuse of $R^2$

Effect of income on Mozzarella consumption

▷ Model 1: $R^2 = 0.88$

▷ Model 2: $R^2 = 0.97$

▷ Are deaths due to drowning a determinant? No

▷ Retain Model 1
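The mechanics behind this warning can be sketched with simulated data: adding any regressor to an OLS model with an intercept cannot lower $R^2$, even when the added variable is pure noise. The example assumes NumPy, and the variable `junk` is a hypothetical stand-in for an irrelevant regressor; it is not the lecture's Mozzarella data.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50


def r_squared(y, X):
    """R^2 = 1 - RSS/TSS for an OLS fit of y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ b) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1 - rss / tss


# Hypothetical data (illustrative values only)
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
junk = rng.normal(size=n)                  # unrelated to y by construction

X1 = np.column_stack([np.ones(n), x])      # model 1: intercept + x
X2 = np.column_stack([X1, junk])           # model 2: adds the irrelevant regressor

r2_model1 = r_squared(y, X1)
r2_model2 = r_squared(y, X2)

print("R^2 model 1:", r2_model1)
print("R^2 model 2:", r2_model2)
```

Model 2 never fits worse in-sample, so a higher $R^2$ on its own says nothing about whether the extra variable belongs in the equation.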
Steps for applied regression project

Applied regression analysis progresses according to a set of steps

Applied regression: steps

6 steps

▷ Review domain concepts/literature/theory

▷ Based on domain concepts/literature/theory, select the $Y$ and $X$s
▷ Based on domain concepts/literature/theory, select the functional form (linear for now)

▷ Hypothesize the expected signs of the coefficients, based on theory

▷ Collect the data

▷ Inspect and clean the data

▷ Estimate and evaluate the equation

▷ Document the results
