
Var(ε) is always smaller than (or equal to) Var(Y)

Adjusted R² is always less than or equal to R² because it takes into account the cost of adding more variables:
Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k)
ESS + RSS = TSS
Var(X – Y) = Var(X) + Var(Y) – 2Cov(X,Y); if X and Y are independent, then Var(X – Y) = Var(X) + Var(Y)
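A quick numpy check of this variance identity, on simulated data (the data-generating numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # y is correlated with x by construction

# Sample version of Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y)
# (np.cov defaults to the N-1 divisor, matching ddof=1)
lhs = np.var(x - y, ddof=1)
rhs = np.var(x, ddof=1) + np.var(y, ddof=1) - 2 * np.cov(x, y)[0, 1]
print(lhs, rhs)   # the two values agree (the identity is exact in-sample)
```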
Robust command still uses regular OLS coefficient estimates; it only fixes the standard errors
Instrumental variables are used to correct for omitted variable bias
Probit model – case of binary dependent variables
Instrumental Variables must satisfy:
Cov(x,z) ≠ 0
Cov(z,e) = 0
E(ε) = 0 is an assumption about the PRF in the CLRM not OLS
Σε = 0 is a property of OLS
OLS estimators are the minimum variance among all unbiased estimators under CLRM assumptions
Estimates of coefficients are only biased if the omitted variable is correlated with the included variables
Z ~ N(0,1)

R² = ESS/TSS = 1 − RSS/TSS = 1 − Σe²/TSS
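A quick numerical check of the ESS/RSS/TSS decomposition and the two R² formulas, using plain-numpy OLS on simulated data (k counts coefficients including the intercept, the notes' convention):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3                                   # n observations, k coefficients (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS estimates
yhat = X @ beta
e = y - yhat                                    # residuals

TSS = np.sum((y - y.mean()) ** 2)
ESS = np.sum((yhat - y.mean()) ** 2)
RSS = np.sum(e ** 2)                            # ESS + RSS = TSS (with an intercept)

R2 = 1 - RSS / TSS                              # = ESS / TSS
adj_R2 = 1 - (1 - R2) * (n - 1) / (n - k)
print(R2, adj_R2)                               # adjusted R² is never above R²
```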


If you scale a variable, its coefficient and standard error change by the same factor (so the t-statistic is unchanged)
OLS estimators are still efficient in the multiple regression model; they just have larger SEs
OLS estimators are biased in the presence of omitted variables
Irrelevant variables – OLS estimators are unbiased but inefficient.
ESS goes down as variables are dropped.
OLS is still BLUE with imperfect multicollinearity (SEs might be large)
OVB might bias OLS estimates of betas
Instrumental variables:
Regress x on z
Get X hat
Regress y on X hat
Probit model:
Can’t interpret magnitude of coefficients
The higher the confidence level (e.g., 99% vs. 95%), the wider the confidence interval
Least squares assumption for OLS: E(ε | x) = 0
Multicollinearity: degree of linear association between two or more regressors
Exogenous variables are determined outside the model; endogenous variables are determined inside the model
Instrumental variable estimation:
When there is a correlation b/w the error term and the independent variable
This is from OVB, measurement error in x, or simultaneous causality bias
The instrument is valid if correlated with x and not correlated with error
Allows us to separate the variation in x which is correlated with the error term from the part that is not. So our
estimates are consistent.
Linear probability model: binary variable as the left-hand-side variable
Predicted values can be greater than 1 or less than 0
The coefficients are the change in the probability that Y = 1 from a one-unit change in x
Probit model uses the normal distribution
Coefficients can be positive or negative since they shift the z-score
Slopes tell you how the z-score changes if we increase x by one unit; the effect on y depends on how the predicted probability
changes
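The slope logic can be sketched with a toy probit index (the coefficients b0 and b1 below are hypothetical, not estimated from anything): the predicted probability is Φ(z) with z = b0 + b1·x, so the same b1 translates into a different effect on P(Y = 1) depending on where z sits, via the normal density φ(z)·b1.

```python
import math

def phi_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def phi_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

b0, b1 = -1.0, 0.4        # hypothetical probit coefficients, for illustration only

for x in (0.0, 2.5, 5.0):
    z = b0 + b1 * x
    p = phi_cdf(z)        # predicted P(Y = 1 | x): always strictly between 0 and 1
    me = phi_pdf(z) * b1  # marginal effect of x on P(Y = 1): varies with x
    print(f"x={x}: P(Y=1)={p:.3f}, marginal effect={me:.3f}")
```

This is why the magnitude of a probit coefficient can't be read directly as a change in probability.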
Logit model uses logistic distribution
Binary dependent variable model
Predicted values are the probability that the dependent variable equals 1
If a variable is insignificant, we should consider whether the adjusted R^2 is affected by taking it out, and we might introduce
omitted variable bias if we drop it
Adjusted R^2 can be negative
Under perfect multicollinearity OLS estimators cannot be computed.
Adjusted R^2 and R^2 are only the same if k=1
If the estimated slope coefficient is 0 then R^2 is 0
Slope Estimator: cov(X,Y)/var(X)
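A quick check that the sample cov(X,Y)/var(X) ratio is the same number as the fitted OLS slope (simulated data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 3.0 + 2.0 * x + rng.normal(size=500)

# Slope estimator as sample cov(x, y) / var(x), both with the N-1 divisor
slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Same number from a least-squares line fit
slope_ols = np.polyfit(x, y, deg=1)[0]
print(slope, slope_ols)   # the two agree up to floating-point error
```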
The only thing necessary to compute OLS is variation in x (Var(X) ≠ 0)
E(Y) = E(E(Y | X)) is true for all X and Y
Correlation can be from – 1 to 1
Standardize a variable:
(x − µ)/σ
If two variables are independent then they have a correlation of 0
The sample average is the most efficient estimator of the population mean; it is also unbiased
OLS is biased when:
Omitted variables, misspecification of the regression function, measurement error in the independent variable,
sample selection, simultaneous causality
Instrumental Variables can fix:
OVB, measurement error, simultaneous causality
Can’t fix: sample selection, misspecification
Simultaneous causality:
Y causes one or more of the X’s
Model picks up the effect of X on Y and of Y on X, so the coefficient is biased and inconsistent
Instrumental variables:
Isolate the variation in X that is not correlated with error
1) Regress X on Z
2) Replace X with Xhat in the regression
3) Or use ivreg y (x=z)
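A minimal numpy sketch of the two-stage procedure above, on simulated data where x is made endogenous by construction (all the data-generating numbers are illustrative; in Stata this is what ivreg does in one step):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
z = rng.normal(size=n)                          # instrument: correlated with x, not with e
e = rng.normal(size=n)                          # structural error
x = 1.0 * z + 0.8 * e + rng.normal(size=n)      # x is endogenous: Cov(x, e) != 0
y = 2.0 + 1.5 * x + e                           # true slope is 1.5

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Plain OLS of y on x: biased because x is correlated with e
b_ols = ols(np.column_stack([const, x]), y)[1]

# Stage 1: regress x on z, form x-hat
g = ols(np.column_stack([const, z]), x)
x_hat = g[0] + g[1] * z

# Stage 2: regress y on x-hat
b_iv = ols(np.column_stack([const, x_hat]), y)[1]

print(b_ols, b_iv)   # b_iv is near the true 1.5; b_ols is pushed away from it
```

Note that the standard errors from a manual second stage are wrong; a canned 2SLS routine (like ivreg) corrects them.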
Long-run propensity = sum of the coefficients on the current and lagged values of that variable
Linear probability model: P(y = 1| x) = E(y | x)
Always use reg y x, robust
Predicted probabilities can be outside of 0 and 1
To restrict predictions to between 0 and 1, use a probit or logit model
Heteroskedastic: variance of error term is different for different values of the x’s
OLS is still unbiased and consistent
SE is biased, so OLS is not efficient.
Use ROBUST in stata
Test for it using the White test
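A sketch of what the robust option computes, assuming the HC0 (White) sandwich formula, next to the conventional homoskedastic formula, on simulated data where the error variance grows with x:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000
x = rng.uniform(1, 5, size=n)
e = rng.normal(size=n) * x                      # error variance grows with x: heteroskedastic
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS: still unbiased and consistent
u = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)

# Conventional (homoskedastic) SEs: s^2 (X'X)^-1
s2 = u @ u / (n - 2)
se_conv = np.sqrt(np.diag(s2 * XtX_inv))

# White / HC0 robust SEs: (X'X)^-1 X' diag(u^2) X (X'X)^-1
meat = X.T @ (X * (u ** 2)[:, None])
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print(se_conv, se_robust)                       # the two disagree under heteroskedasticity
```

The coefficient estimates are identical either way; only the standard errors change, which is the point of the robust option.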
