Lecture: Simultaneous Equation Model (Wooldridge's Book Chapter 16)
Model
y1 = β1 y2 + u1 (1)
y2 = β2 y1 + u2 (2)
• Both variables are determined within the model, so both are endogenous; they are denoted by the letter y.
Example
• If β1 < 0, β2 > 0, then (1) is the demand function while (2) is the supply function; u1 is the
demand shock and u2 is the supply shock.
• Another example is the Keynesian cross (45 degree line) model in which y1 is the national
income and y2 is total consumption.
Structural Form
• (1) and (2) are structural in the sense that they are directly implied by economic theory.
• We assume
E(u1 u2 ) = 0 (3)
So the structural errors are uncorrelated (orthogonal).
• Our goal is to estimate the structural coefficients β1 and β2 , which measure the causal effect of one endogenous variable on the other.
Simultaneity Bias
• Substituting (1) into (2) yields
y2 = β2 (β1 y2 + u1 ) + u2 ,
so y2 is a function of u1 and the two are correlated.
• Structural equation (1) therefore suffers from an endogeneity issue (simultaneity bias)—the regressor in (1) is correlated with the error term. Consequently, OLS applied to (1) gives a biased and inconsistent estimate. The same applies to (2).
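A minimal simulation sketch of the simultaneity bias (the parameter values, normal errors, and variable names are illustrative assumptions, not from the text): OLS of y1 on y2 does not recover β1 because y2 is correlated with the demand shock u1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta1, beta2 = -1.0, 1.0          # demand slope < 0, supply slope > 0 (assumed values)
u1 = rng.standard_normal(n)       # demand shock
u2 = rng.standard_normal(n)       # supply shock

# Solve the two structural equations for the endogenous variables:
# y2 = (beta2*u1 + u2) / (1 - beta2*beta1), then y1 from equation (1)
y2 = (beta2 * u1 + u2) / (1 - beta2 * beta1)
y1 = beta1 * y2 + u1

# OLS slope of y1 on y2 (no intercept, matching equation (1))
beta1_ols = (y1 @ y2) / (y2 @ y2)
print(beta1_ols)                  # badly biased away from the true beta1 = -1
```

With these assumed parameters, the probability limit of the OLS slope is β1 + cov(y2 , u1 )/var(y2 ) = −1 + 0.5/0.5 = 0, so the estimate sits near 0 rather than −1.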
Example
• The demand shock u1 moves quantity y1 through (1); next, quantity y1 affects the price y2 through the supply function (2). Some people call this reverse causation.
• So u1 affects y2 , and the two are correlated. In other words, the regressor y2 in (1) is endogenous.
Endogeneity
• We just showed that the SEM suffers from simultaneity bias. The endogenous variables on the right-hand side are therefore correlated with the errors, so they are endogenous from the econometrics perspective.
Graph
We cannot identify either the demand curve or the supply curve from the scatter plot of quantity versus price. However, if there is some exogenous variable, say, an input price that shifts the supply curve, then we can identify the demand curve.
y1 = β1 y2 + c1 z1 + u1 (6)
y2 = β2 y1 + c2 z2 + u2 (7)
We assume z1 and z2 are exogenous:
E(z j u1 ) = E(z j u2 ) = 0 for j = 1, 2 (8)
Reduced Form
The reduced form expresses the endogenous variables in terms of the exogenous variables only:
y1 = (c1 z1 + β1 c2 z2 + e1 ) / (1 − β2 β1 ) (9)
y2 = (β2 c1 z1 + c2 z2 + e2 ) / (1 − β2 β1 ) (10)
e1 = u1 + β1 u2 (11)
e2 = β2 u1 + u2 (12)
(9) and (10) are the reduced forms; only the exogenous variables z1 and z2 appear on the right-hand side (RHS).
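As a check on (9), here is a short derivation (not in the original slide): substitute (7) into (6) and collect the y1 terms, assuming β2 β1 ̸= 1:

```latex
y_1 = \beta_1(\beta_2 y_1 + c_2 z_2 + u_2) + c_1 z_1 + u_1
\quad\Longrightarrow\quad
(1-\beta_2\beta_1)\,y_1 = c_1 z_1 + \beta_1 c_2 z_2 + (u_1 + \beta_1 u_2).
```

Dividing through by 1 − β2 β1 gives (9) with e1 = u1 + β1 u2 as in (11); (10) follows symmetrically.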
• Let
π11 = c1 / (1 − β2 β1 ); π12 = β1 c2 / (1 − β2 β1 ); e∗1 = e1 / (1 − β2 β1 )
π21 = β2 c1 / (1 − β2 β1 ); π22 = c2 / (1 − β2 β1 ); e∗2 = e2 / (1 − β2 β1 )
• The reduced form can be rewritten as
y1 = π11 z1 + π12 z2 + e∗1 (13)
y2 = π21 z1 + π22 z2 + e∗2 (14)
• Note that the reduced-form errors are correlated, cov(e∗1 , e∗2 ) ̸= 0, whereas the structural errors are uncorrelated.
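To verify the claim, a short computation using (3), (11), and (12), where σ1² and σ2² denote the variances of u1 and u2 (notation introduced here):

```latex
\operatorname{cov}(e_1^*, e_2^*)
= \frac{\operatorname{cov}(u_1 + \beta_1 u_2,\; \beta_2 u_1 + u_2)}{(1-\beta_2\beta_1)^2}
= \frac{\beta_2 \sigma_1^2 + \beta_1 \sigma_2^2}{(1-\beta_2\beta_1)^2},
```

which is non-zero in general.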
• Note that all exogenous variables appear on the right-hand side of each reduced form; by contrast, each structural form has an endogenous variable and only some of the exogenous variables on the right-hand side. As a result, OLS applied to the structural form is inconsistent, whereas OLS applied to the reduced form is consistent.
• Reduced form (14) is the first-stage regression if we want to use the 2SLS estimator to obtain the causal effect of y2 on y1 . Notice that all exogenous variables are used as regressors in the first-stage regression.
• OLS applied to (13) and (14) separately gives consistent estimates for the π s.
• The indirect least squares (ILS) estimator recovers the structural-form coefficients β from the estimated reduced-form coefficients π .
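A minimal simulation sketch of ILS (all parameter values are illustrative assumptions, not from the text): fit the two reduced-form regressions by OLS, then take ratios of the reduced-form coefficients. From the definitions of the π s, π12 /π22 = β1 and π21 /π11 = β2 .

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta1, beta2, c1, c2 = -1.0, 1.0, 1.0, 1.0    # assumed illustrative values
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
u1, u2 = rng.standard_normal(n), rng.standard_normal(n)

# generate the endogenous variables from the reduced forms (9) and (10)
d = 1 - beta2 * beta1
y1 = (c1 * z1 + beta1 * c2 * z2 + (u1 + beta1 * u2)) / d
y2 = (beta2 * c1 * z1 + c2 * z2 + (beta2 * u1 + u2)) / d

# OLS on the two reduced forms (13) and (14)
Z = np.column_stack([z1, z2])
pi1 = np.linalg.lstsq(Z, y1, rcond=None)[0]   # (pi11_hat, pi12_hat)
pi2 = np.linalg.lstsq(Z, y2, rcond=None)[0]   # (pi21_hat, pi22_hat)

# ILS: ratios of reduced-form coefficients
beta1_ils = pi1[1] / pi2[1]                   # pi12/pi22 = beta1
beta2_ils = pi2[0] / pi1[0]                   # pi21/pi11 = beta2
print(beta1_ils, beta2_ils)
```

Both ratios converge to the true structural coefficients as n grows, since the π̂ s are consistent.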
• β1 in (6) is identified if
c2 ̸= 0 (17)
• β2 in (7) is identified if
c1 ̸= 0 (18)
• c2 ̸= 0 means there is an exogenous variable z2 that is excluded from the first structural equation (order condition) but appears in the second structural equation with a non-zero coefficient (rank condition).
• For the demand-and-supply example, the demand function can be identified if the input price is present in the supply function. Graphically, the demand curve can be traced out (identified) when the supply curve shifts due to the varying input price.
Remarks
• β1 is over-identified if more than one exogenous variable is excluded from the first equation and appears in the second equation with a non-zero coefficient.
• In that case, the ILS estimator for β1 is not unique (Exercise), a big disadvantage of ILS.
• Another disadvantage of ILS is that β̂ ILS is a nonlinear function of π̂ , so deriving its variance entails the delta method.
Delta Method
• For simplicity, let π̂ and β̂ = f (π̂ ) be scalars. Consider the first-order Taylor expansion
β̂ = f (π̂ ) ≈ f (π ) + f ′ (π )(π̂ − π )
where f ′ (π ) = ∂ β̂ /∂ π̂ is the derivative; in the vector case it is the gradient (column) vector.
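The expansion implies the usual delta-method variance. Stated for the scalar case, and then applied to the ILS ratio β̂1 = π̂12 /π̂22 (a worked instance consistent with the definitions above; var(π̂ ) is the 2×2 covariance matrix of (π̂12 , π̂22 )′):

```latex
\operatorname{var}(\hat\beta) \approx \big[f'(\pi)\big]^2 \operatorname{var}(\hat\pi),
\qquad
\operatorname{var}\!\big(\hat\beta_1^{\,ILS}\big) \approx
\nabla f(\pi)'\,\operatorname{var}(\hat\pi)\,\nabla f(\pi),
\quad
\nabla f(\pi) = \begin{pmatrix} 1/\pi_{22} \\ -\pi_{12}/\pi_{22}^2 \end{pmatrix}.
```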
2SLS Estimator
• A nice by-product of the structural model is that instrumental variables are readily available.
• The reduced forms (9) and (10) clearly show that the exogenous variables z1 and z2 are correlated with the endogenous regressors y1 and y2 . Moreover, we assume z1 and z2 are uncorrelated with the errors; see (8).
• So z1 and z2 are valid instrumental variables for y1 and y2 if the exogeneity assumption (8) holds.
• Essentially the 2SLS estimator replaces the endogenous regressors with their exogenous
parts, and we use instrumental variables to isolate those exogenous parts.
• Step 1: Estimate the reduced form (13) and (14) (first stage) using OLS and keep the fitted
values
• Step 2: Replace the endogenous regressors with fitted values, and fit the second-stage
regressions using OLS
y1 = β1 ŷ2 + c1 z1 + u1 (21)
y2 = β2 ŷ1 + c2 z2 + u2 (22)
• ŷ1 is the exogenous part of y1 and ŷ2 is the exogenous part of y2 ; both are linear combinations of the exogenous z1 and z2 . (Where are the endogenous parts?)
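The two steps can be sketched by hand (a minimal simulation, with assumed illustrative parameter values; data generated from the reduced forms (9) and (10)):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
beta1, beta2, c1, c2 = -1.0, 1.0, 1.0, 1.0    # assumed illustrative values
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
u1, u2 = rng.standard_normal(n), rng.standard_normal(n)
d = 1 - beta2 * beta1
y1 = (c1 * z1 + beta1 * c2 * z2 + (u1 + beta1 * u2)) / d
y2 = (beta2 * c1 * z1 + c2 * z2 + (beta2 * u1 + u2)) / d

Z = np.column_stack([z1, z2])                 # all exogenous variables
# Step 1: first stage -- fit reduced form (14) by OLS and keep the fitted values
y2_hat = Z @ np.linalg.lstsq(Z, y2, rcond=None)[0]
# Step 2: replace y2 by its fitted value and run OLS, as in (21)
X = np.column_stack([y2_hat, z1])
beta1_2sls, c1_2sls = np.linalg.lstsq(X, y1, rcond=None)[0]
print(beta1_2sls, c1_2sls)                    # close to the true beta1 and c1
```

In an exactly identified model like this one, the 2SLS estimate coincides with ILS. Note that standard errors from the manual second-stage regression are not the correct 2SLS standard errors; dedicated IV routines adjust for the generated regressor.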
• Exercise: what is the Stata command to get β̂2 2SLS in (7)? You need to think carefully about which variable is which.
• Reduced forms (13) and (14) are an example of seemingly unrelated regressions (SUR).
• They are in fact related because the reduced-form errors are correlated across equations, i.e., cov(e∗1 , e∗2 ) ̸= 0.
• Generally the optimal estimator for the SUR model is the generalized least squares (GLS) estimator, due to the correlation between errors across regressions.
• However, if each equation in the SUR has identical RHS variables, GLS reduces to equation-by-equation OLS.
• If the error is homoskedastic and uncorrelated, then the OLS estimator is the best linear unbiased estimator (BLUE) conditional on the regressors.
• Otherwise, an estimator better than OLS is GLS, which is OLS applied to the transformed regression (24).
We use GLS when E(UU′ |X) = Ω ̸= σ 2 I. Because Ω is symmetric and positive definite, it can be factored as Ω = AA′ (for instance, via the Cholesky decomposition). Now consider the transformed model
Y∗ = X∗ β + U∗
where Y∗ = A−1 Y, X∗ = A−1 X, U∗ = A−1 U. It follows that the GLS estimator is OLS applied to the transformed regression, which satisfies the conditions of the Gauss-Markov Theorem. That is,
E(U∗ U∗′ |X∗ ) = I.
In short,
β̂ GLS = (X∗′ X∗ )−1 X∗′ Y∗ = (X′ Ω−1 X)−1 X′ Ω−1 Y
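A small numeric sketch of this equivalence (the design matrix, the diagonal Ω, and all values are assumptions for illustration): OLS on the transformed variables reproduces the direct GLS formula.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 500, 2
X = rng.standard_normal((n, k))
beta = np.array([1.0, -2.0])                      # assumed true coefficients
sig2 = rng.uniform(0.5, 2.0, size=n)              # known heteroskedastic variances
Omega = np.diag(sig2)
Y = X @ beta + rng.standard_normal(n) * np.sqrt(sig2)

# factor Omega = A A' (Cholesky), then transform the regression
A = np.linalg.cholesky(Omega)
A_inv = np.linalg.inv(A)
X_star, Y_star = A_inv @ X, A_inv @ Y
beta_transformed = np.linalg.lstsq(X_star, Y_star, rcond=None)[0]

# direct GLS formula: (X' Omega^{-1} X)^{-1} X' Omega^{-1} Y
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)
print(np.allclose(beta_transformed, beta_gls))    # the two coincide
```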
• In Stata, the command sureg obtains the GLS estimator for the SUR model.
• Alternatively, you can generate the transformed variables and fit the transformed regression using OLS.