Econometrics Formula
F-TEST
1. Hypotheses:
H0: null
Ha: alternate
2. Significance level
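3. Estimate the restricted and unrestricted models, where the restricted model is the one
obtained if H0 is true
4. Compute the observed F-statistic from the two models
5. Find the critical value
- Fc = F(m, n-k-1, α), where α is the significance level
- m = number of restrictions (variables dropped under H0)
6. Compare the observed statistic to the critical value: reject H0 if F > Fc, and write the
conclusion in words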
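A minimal sketch of the comparison in step 6, assuming the course uses the usual sum-of-squared-residuals form of the F-statistic (the notes do not state the formula). The SSR values, sample size and critical value below are hypothetical; n is the number of observations, k the number of regressors in the unrestricted model and m the number of restrictions.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, m, n, k):
    """F = ((SSR_r - SSR_ur) / m) / (SSR_ur / (n - k - 1))."""
    numerator = (ssr_restricted - ssr_unrestricted) / m
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator


# Hypothetical numbers, for illustration only (not from the notes)
f_obs = f_statistic(ssr_restricted=120.0, ssr_unrestricted=100.0, m=2, n=104, k=3)

# Approximate 5% critical value for F(2, 100), read from a table
f_crit = 3.09

reject_h0 = f_obs > f_crit
print(f_obs, f_crit, reject_h0)
```

With these made-up inputs the observed statistic exceeds the critical value, so H0 would be rejected.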
Omitted variable bias
- Violation of assumption 3 (all independent variables are uncorrelated with the error
term), i.e. exogeneity
- Oi: omitted variable; excluding it results in omitted variable bias if
o Oi is a predictor of Yi (criterion 1)
o Oi is correlated with one of your Xs (criterion 2)
- B1 = true beta
- a1*b2 is the size of the bias
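The bias size a1*b2 can be checked numerically. The sketch below assumes the usual reading of the symbols, which the notes do not spell out: b2 is the effect of the omitted variable Oi on Yi, and a1 is the slope from regressing Oi on Xi. All data are made up, and Y is built without noise so the identity holds exactly.

```python
def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Made-up data: true model is Y = 1 + 2*X + 3*O, so b1 = 2 and b2 = 3
x = [1.0, 2.0, 3.0, 4.0, 5.0]
o = [2.0, 1.0, 4.0, 3.0, 6.0]  # omitted variable, correlated with x
b1_true, b2_true = 2.0, 3.0
y = [1.0 + b1_true * xi + b2_true * oi for xi, oi in zip(x, o)]

a1 = slope(x, o)            # how strongly O moves with X
short_slope = slope(x, y)   # regression of Y on X only (O left out)
bias = short_slope - b1_true

print(short_slope, a1 * b2_true, bias)
```

Leaving O out makes the slope on X absorb a1*b2 on top of the true b1. If either criterion fails (b2 = 0 or a1 = 0), the bias disappears.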
Rescaling variables
- If X1 is multiplied by 10
o B1 is divided by 10
o Standard error of B1 is divided by 10
- If Y1 is multiplied by 10
o Every coefficient in the regression is multiplied by 10
o Standard errors of all coefficients are multiplied by 10
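The rescaling rules can be verified on a small example. This is a sketch with made-up data and a hand-written simple regression (one X, with intercept); simple_ols is an illustrative helper, not a library function.

```python
import math


def simple_ols(x, y):
    """Return (slope, standard error of the slope) for y = b0 + b1*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(ssr / (n - 2) / sxx)
    return slope, se_slope


# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, se1 = simple_ols(x, y)
b1_x10, se1_x10 = simple_ols([10 * xi for xi in x], y)  # X multiplied by 10
b1_y10, se1_y10 = simple_ols(x, [10 * yi for yi in y])  # Y multiplied by 10

print("original:", b1, se1)
print("X * 10:  ", b1_x10, se1_x10)  # both divided by 10
print("Y * 10:  ", b1_y10, se1_y10)  # both multiplied by 10
```

Because the coefficient and its standard error scale together, the t-statistic is unchanged by rescaling.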
Level-Level
- Usual interpretation
Log-log
- Both X and Y are interpreted as percentages: a 1% increase in X changes Y by (coefficient)%
- Don't multiply by 100, so a coefficient of 0.37 means 0.37%
Log-level
- Income increases by 1 euro, house size increases by 20%
- Dependent variable is Ln(y)
- Y is interpreted as a percentage, X in its own units
- Multiply the coefficient by 100 to get the percentage
Level-log
- Income increases by 1%, house size increases by 0.015 square meters
- X interpreted as a percentage, Y not
- Divide coefficient by 100
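The multiply/divide-by-100 rules above can be written as small helpers. This is a sketch: the function names are illustrative, and the coefficients 0.20 and 1.5 are assumptions chosen to reproduce the 20% and 0.015 square meters in the examples. The exact log-level formula is a standard refinement that the notes do not mention.

```python
import math


def log_log_pct(coef):
    """Log-log: a 1% increase in X changes Y by coef % (no rescaling)."""
    return coef


def log_level_pct(coef):
    """Log-level: a one-unit increase in X changes Y by about 100*coef %."""
    return 100 * coef


def log_level_pct_exact(coef):
    """Exact percentage change for log-level: 100*(exp(coef) - 1)."""
    return 100 * (math.exp(coef) - 1)


def level_log_units(coef):
    """Level-log: a 1% increase in X changes Y by coef/100 units."""
    return coef / 100


print(log_log_pct(0.37))          # percent change in Y per 1% change in X
print(log_level_pct(0.20))        # approximate percent change in Y
print(log_level_pct_exact(0.20))  # exact percent change in Y
print(level_log_units(1.5))       # change in Y, in units of Y
```

The approximation 100*coef is close for small coefficients and drifts from the exact value as the coefficient grows.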
Dummy variable
- 1 or 0
- Gender
- Economic interpretation
o Men will have 0.015 more votes than women, ceteris paribus
Interaction term
- If you want to see if the impact of X on Y is the same across (2) groups
Nominal Variable
B4 interpretation =
- By being Asian you earn b4% more or less in wages than a white person, ceteris paribus
- Since white is omitted, you always compare with the omitted category, in this case
white
- Compare each category against the omitted category
Ordinal variable
- Ranking
- Age group, happiness
o 0-18 years
o 18-40 years
o 40+ years
B3 interpretation =
- If you are less than 18, then you will earn b3 percent more or less than individuals who
are over 60, ceteris paribus
- As over 60 is the group omitted
If you include a categorical variable (dummies), you are not comparing against the overall
population but against the omitted category
Key question of causal interpretation of a regression model is whether the error term is
correlated with the independent variable(s)
Week 8:
- How to calculate predicted values outside the 0-1 range for the LPM
o Predictions below 0 or above 1 occur because the LPM is unbounded
Week 7
Spurious regression
- When dealing with time series, always check whether time trend is significant
- Include time trend in regression model to make sure you control for time
o Put year in regression
- Check P VALUE for significance
Non-stationary variables:
Week 8
Linear probability model:
2 problems:
- Heteroskedasticity
o Always use the robust option in Stata
- Predicted probabilities may lie outside the 0-1 interval
o If they are outside, use a logit or probit model instead of the LPM
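The second problem is easy to see with a fitted line. A minimal sketch, assuming a hypothetical LPM with intercept -0.25 and slope 0.125 (these numbers are not from the notes):

```python
# Hypothetical LPM coefficients, for illustration only
INTERCEPT = -0.25
SLOPE = 0.125


def lpm_prediction(x):
    """Predicted probability from the linear probability model."""
    return INTERCEPT + SLOPE * x


def outside_unit_interval(p):
    """True if the predicted 'probability' is below 0 or above 1."""
    return p < 0 or p > 1


for x in [0, 4, 11]:
    p = lpm_prediction(x)
    print(x, p, outside_unit_interval(p))
```

Nothing in the linear form keeps the prediction between 0 and 1, which is why logit or probit is preferred when such predictions occur.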
Causality
Solutions:
- Observational
o Control for more observables to remove OVB; use a larger sample to obtain
more precision
- Experimental
o Increase validity
o Have other age groups
o Use larger sample to obtain more precision
o Treatment vs non-treatment
Give two groups different treatment intake levels to gauge the effect
of the treatment intensity
Strict exogeneity
- Error term at time t is uncorrelated with the explanatory variables at all times: present,
past, future
- Exists in theory, but not in real life
Weak exogeneity
- Error term at time t is uncorrelated with the explanatory variables in the present, but may
be correlated with them at other times
- Practice
- Learn the pattern
- Do not spend too much time on lecture slides
- If both wage and unemployment contain a time trend, then it would be necessary to
include a time trend
For every additional year of education, wage increases by 2.7%, all else equal
c. Region 9 workers earn 13.3% higher wages than region 5 workers, all else equal
2 conditions:
- Omitted variable has a relationship with one of the other independent variables
- Omitted variable has a relationship with y variable
- Violates the OLS assumption that the error term and the independent variables are uncorrelated
- Parameter estimates are biased, E(b_hat) != B, hence they do not have a causal interpretation
- (b2+b3+b4)/(1-b1) = 0.0009
- Hence the long-run effect is that 1000 more prison inmates lead to an increase of 0.0009
percentage points in the unemployment rate
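A sketch of the long-run calculation, assuming b1 is the coefficient on the lagged dependent variable and b2, b3, b4 are the coefficients on the current and lagged explanatory variable. The notes only report the result, so the individual coefficient values below are hypothetical, chosen to reproduce 0.0009. The sum of b2, b3 and b4 forms the numerator as a whole.

```python
def long_run_effect(b_lagged_y, x_coefficients):
    """Long-run effect: sum of the X coefficients divided by (1 - b_lagged_y)."""
    return sum(x_coefficients) / (1 - b_lagged_y)


# Hypothetical coefficients, chosen only to reproduce the 0.0009 in the notes
b1 = 0.5
b2, b3, b4 = 0.0002, 0.0001, 0.00015

effect = long_run_effect(b1, [b2, b3, b4])

# Dividing only b4 by (1 - b1) gives a different, incorrect number
unparenthesized = b2 + b3 + b4 / (1 - b1)

print(effect, unparenthesized)
```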
- Male politicians are 1.6% more likely to win in a municipal election compared to female
politicians, all else equal
- This increases the validity of the regression and reduces omitted variable bias
- The estimate for the causal effect of age would go down/become smaller once the obesity
rate is controlled for
d. T-TEST
- T-stat = 2.9/0.4 = 7.25
- Tc = 1.96
- T > Tc, so reject H0
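The t-test arithmetic as a short sketch. It assumes 2.9 is the estimated coefficient, 0.4 its standard error, and that H0 is a zero coefficient tested two-sided; the notes give only the numbers.

```python
# Numbers from the notes; their roles are an assumption (coefficient / standard error)
coefficient = 2.9
standard_error = 0.4
t_critical = 1.96  # critical value used in the notes

t_stat = coefficient / standard_error
reject_h0 = abs(t_stat) > t_critical

print(round(t_stat, 2), reject_h0)
```

Since the statistic is far above the critical value, H0 is rejected: the coefficient is statistically significant.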
Women who live in urban areas are 6.5% less likely to be in a polygamous relationship
compared to women in non-urban areas, all else equal
Summary
Heteroskedasticity:
- Heteroskedasticity does not bias the coefficient estimates, but it biases the standard errors,
so hypothesis tests are not valid unless robust standard errors are used
Multicollinearity:
Strict exogeneity
- Error term at time t is uncorrelated with the explanatory variables at all times: present,
past, future
- Exists in theory, but not in real life
Weak exogeneity
- Error term at time t is uncorrelated with the explanatory variables in the present, but may
be correlated with them at other times
Biased upwards/downwards
Drawbacks of LPM:
Partial derivative outside domain = The stationary point (10.60) lies outside the domain of
grade, since grade goes from 1 to 10 [1 pt]. Hence log SETs (log evaluation scores) increase
with grade but at a decreasing rate, i.e. the relevant range is the upward-sloping part of a
hump-shaped function [2 pts]
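A sketch of the stationary-point check, assuming the model is quadratic in grade: log(SET) = b0 + b1*grade + b2*grade^2 with b2 < 0. The notes do not give the coefficients, so b1 and b2 below are hypothetical values chosen to reproduce the stationary point of 10.60.

```python
# Hypothetical coefficients, chosen only to reproduce the 10.60 in the notes
b1 = 0.212   # coefficient on grade
b2 = -0.01   # coefficient on grade squared (negative: hump shape)

# Set the partial derivative b1 + 2*b2*grade to zero
stationary_point = -b1 / (2 * b2)


def marginal_effect(grade):
    """Partial derivative of log(SET) with respect to grade."""
    return b1 + 2 * b2 * grade


grades = range(1, 11)  # grade goes from 1 to 10
effects = [marginal_effect(g) for g in grades]

print(stationary_point)
print(all(e > 0 for e in effects))                        # increasing everywhere in the domain
print(all(a > b for a, b in zip(effects, effects[1:])))   # at a decreasing rate
```

Because the turning point lies beyond grade 10, only the rising part of the hump is ever observed.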