Applied Econometrics: William Greene Department of Economics Stern School of Business
William Greene
Department of Economics
Stern School of Business
Applied Econometrics
Longitudinal data
National longitudinal survey of youth (NLSY)
British household panel survey (BHPS)
Panel Study of Income Dynamics (PSID)
German Socioeconomic Panel (GSOEP)
Agricultural Resource Management Survey (ARMS)
Cross section time series
Grunfeld’s investment data
Penn world tables
Financial data by firm, year
$r_{it} - r_{ft} = \beta_i (r_{mt} - r_{ft}) + \varepsilon_{it}$, i = 1,…,many; t = 1,…,many
Exchange rate data, essentially infinite T, large N
Effects: $\beta_i = \beta + v_i$
Terms of Art
Cross sectional vs. time series variation (history: consumption function studies)
Heterogeneity
Group effects (individual effects)
Fixed effects and/or random effects
Substantive differences?
Is it possible to tell them apart in observed data?
Panel Data
Rotating panels: Spanish household survey
Spanish income study
(http://www.cemfi.es/~albarran/0008r.pdf)
Efficiency analysis: “Efficiency measurement in
rotating panel data,” Heshmati, A, Applied
Economics, 30, 1998, pp. 919-930
Hierarchical (nested) data sets: Student
outcome, by year, district, school, teacher
Nested Panel Data
Antweiler, W., “Nested Random Effects…,”
Journal of Econometrics, 101, 2001, 295-313
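Sulfide concentration (year, country, station = t, c, s)
$= \beta_1 + \beta_2 (\log \mathrm{GDP}/\mathrm{km}^2)_{c,s,t} + \beta_3 \log(K/L)_{c,t} + \beta_4 \mathrm{Communist}_c + \dots + \beta_8 \log(\mathrm{Oil\ Price})_t + \beta_9 t + \varepsilon_{c,s,t} + v_{c,s} + w_s$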
Balanced and Unbalanced Panels
Distinction
A notation to help with mechanics
zi,t, i = 1,…,N; t = 1,…,Ti
The role of the assumption
Mathematical and notational convenience:
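Balanced: $NT$ observations
Unbalanced: $\sum_{i=1}^{N} T_i$ observations
$X_i = \begin{bmatrix} x_{i1}' \\ x_{i2}' \\ \vdots \\ x_{iT_i}' \end{bmatrix}$ = $T_i$ rows, $K$ columns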
Linear specification:
Fixed Effects: E[ci | Xi ] = g(Xi); effects are correlated with
included variables. Common: Cov[xit,ci] ≠0
Random Effects: E[ci | Xi ] = μ; effects are uncorrelated with
included variables. If Xi contains a constant term, μ=0 WLOG.
Common: Cov[xit,ci] =0, but E[ci | Xi ] = μ is needed for the
full model
Convenient Notation
Fixed Effects
Random Effects
$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$
$Hb_{it}$ = hemoglobin level, grams/deciliter, range 3+ to 15
$Hb_{it}^{7} = 1(3 \le Hb_{it} < 7.5)$ (Base case; $\beta_7 = 0$)
$Hb_{it}^{8} = 1(7.5 \le Hb_{it} < 8.5)$
…
$Hb_{it}^{15} = 1(14.5 \le Hb_{it} \le 15)$
Effects and Covariates
Individual effects that would impact a self reported
QOL: Depression, comorbidity factors (smoking), recent
financial setback, recent loss of spouse, etc.
Covariates
Change in tumor status
Measured progressivity of disease
Change in number of transfusions
Presence of pain and nausea
Change in number of chemotherapy cycles
Change in radiotherapy types
Elapsed days since chemotherapy treatment
Amount of time between baseline and exit
First Differences Model
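$\Delta QOL_i = QOL_{i1} - QOL_{i0}$
$= (\alpha_1 - \alpha_0) + \sum_{j=8}^{15} \beta_j (Hb_{i1}^{j} - Hb_{i0}^{j}) + \sum_{k=1}^{K} \gamma_k (x_{ik,1} - x_{ik,0}) + \varepsilon_{i1} - \varepsilon_{i0}$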
See Baltagi (2001, p. 24) for analysis of these data. The article on which the
analysis is based is Baltagi, B. and Griffin, J., "Gasoline Demand in the OECD: An
Application of Pooling and Testing Procedures," European Economic Review, 22,
1983, pp. 117-137. The data were downloaded from the website for Baltagi's
text.
Analysis of Variance
+--------------------------------------------------------------------------+
| Analysis of Variance for LGASPCAR |
| Stratification Variable _STRATUM |
| Observations weighted by ONE |
| Total Sample Size 342 |
| Number of Groups 18 |
| Number of groups with no data 0 |
| Overall Sample Mean 4.2962420 |
| Sample Standard Deviation .5489071 |
| Total Sample Variance .3012990 |
| |
| Source of Variation Variation Deg.Fr. Mean Square |
| Between Groups 85.68228007 17 5.04013 |
| Within Groups 17.06068428 324 .05266 |
| Total 102.74296435 341 .30130 |
| Residual S.D. .22946990 |
| R-squared .83394791 MSB/MSW 21.96425 |
| F ratio 95.71734806 P value .00000 |
+--------------------------------------------------------------------------+
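The summary statistics in the table follow from the sums of squares and degrees of freedom by simple arithmetic. As a check, the short sketch below recomputes the mean squares, R-squared, F ratio and residual standard deviation from the printed figures; the variable names are illustrative and are not part of the original output.

```python
import math

# Sums of squares and degrees of freedom copied from the table above.
ss_between, df_between = 85.68228007, 17
ss_within, df_within = 17.06068428, 324
ss_total = ss_between + ss_within

ms_between = ss_between / df_between      # mean square between groups
ms_within = ss_within / df_within         # mean square within groups
r_squared = ss_between / ss_total         # share of variation between groups
f_ratio = ms_between / ms_within          # F statistic with (17, 324) d.f.
resid_sd = math.sqrt(ms_within)           # residual standard deviation

print(f"MSB={ms_between:.5f}  MSW={ms_within:.5f}")
print(f"R-squared={r_squared:.6f}  F={f_ratio:.4f}  resid SD={resid_sd:.6f}")
```

Roughly 83% of the variation in LGASPCAR is between groups, which is why group effects matter so much in this data set.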
Estimating the Fixed Effects Model
The FEM is a linear regression model but with
many independent variables
Least squares is unbiased, consistent, efficient,
but inconvenient if N is large.
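$\begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} X'X & X'D \\ D'X & D'D \end{bmatrix}^{-1} \begin{bmatrix} X'y \\ D'y \end{bmatrix}$
Using the Frisch-Waugh theorem
$b = [X'M_D X]^{-1} X'M_D y$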
Fixed Effects Estimator (cont.)
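$M_D = \begin{bmatrix} M_D^1 & 0 & \cdots & 0 \\ 0 & M_D^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_D^N \end{bmatrix}$ (The dummy variables are orthogonal)
$M_D^i = I_{T_i} - d_i (d_i'd_i)^{-1} d_i' = I_{T_i} - (1/T_i)\, d_i d_i'$
$X'M_D X = \sum_{i=1}^{N} X_i' M_D^i X_i$,
$\left[ X_i' M_D^i X_i \right]_{k,l} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(x_{it,l} - \bar x_{i.,l})$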
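$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$
$\bar y_i = \bar x_i'\beta + c_i + \bar\varepsilon_i$
$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i)$
$\tilde y_{it} = \tilde x_{it}'\beta + \tilde\varepsilon_{it}$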
Classical assumptions apply to the transformed model
Least Squares Dummy Variable Estimator
b is obtained by ‘within’ groups least squares
(group mean deviations)
Normal equations for a are D’Xb+D’Da=D’y
$a = (D'D)^{-1} D'(y - Xb)$
$a_i = (1/T_i) \sum_{t=1}^{T_i} (y_{it} - x_{it}'b) = \bar e_i$
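The equivalence between the 'within' estimator and least squares with dummy variables can be checked numerically. The sketch below uses a small synthetic unbalanced panel (all data, names and parameter values are assumptions made for illustration) and compares the group-mean-deviation regression with a regression on [X, D].

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic unbalanced panel (assumed data, for illustration only).
T = [3, 5, 4, 6]                           # T_i for N = 4 groups
ids = np.repeat(np.arange(len(T)), T)      # group index of every row
K = 2
X = rng.normal(size=(ids.size, K))
c = np.array([1.0, -2.0, 0.5, 3.0])        # individual effects
beta = np.array([0.7, -1.2])
y = X @ beta + c[ids] + 0.1 * rng.normal(size=ids.size)

def group_means(v, ids):
    """Group means of v, repeated for every row of the group."""
    counts = np.bincount(ids)
    if v.ndim == 1:
        return (np.bincount(ids, weights=v) / counts)[ids]
    sums = np.zeros((counts.size, v.shape[1]))
    np.add.at(sums, ids, v)                # accumulate rows by group
    return (sums / counts[:, None])[ids]

# Within estimator: regress deviations from group means.
Xd = X - group_means(X, ids)
yd = y - group_means(y, ids)
b_within = np.linalg.lstsq(Xd, yd, rcond=None)[0]

# a_i = group mean of (y_it - x_it'b).
resid = y - X @ b_within
a_within = np.bincount(ids, weights=resid) / np.bincount(ids)

# LSDV: regress y on X and one dummy variable per group.
D = (ids[:, None] == np.arange(len(T))[None, :]).astype(float)
coef = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)[0]
b_lsdv, a_lsdv = coef[:K], coef[K:]

print(np.allclose(b_within, b_lsdv), np.allclose(a_within, a_lsdv))
```

The within route never forms the N dummy columns, which is the practical point when N is large.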
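$\mathrm{Var}[\varepsilon_i + u_i \mathbf{i}] = \begin{bmatrix} \sigma_\varepsilon^2 + \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\ \sigma_u^2 & \sigma_\varepsilon^2 + \sigma_u^2 & \cdots & \sigma_u^2 \\ \vdots & & \ddots & \vdots \\ \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_\varepsilon^2 + \sigma_u^2 \end{bmatrix}$ ($T_i \times T_i$)
$= \sigma_\varepsilon^2 I_{T_i} + \sigma_u^2 \mathbf{i}\mathbf{i}'$
$= \Omega_i$
$\mathrm{Var}[w \mid X] = \begin{bmatrix} \Omega_1 & 0 & \cdots & 0 \\ 0 & \Omega_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Omega_N \end{bmatrix}$ (Note these differ only in the dimension $T_i$)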
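To make the block structure concrete, the following sketch builds each group's covariance block and stacks the blocks for an unbalanced panel. The function names and the variance values are illustrative assumptions, not estimates from any data set.

```python
import numpy as np

def omega_i(T_i, sigma_e2, sigma_u2):
    """sigma_e^2 * I + sigma_u^2 * ii' for a group with T_i observations."""
    ones = np.ones((T_i, 1))
    return sigma_e2 * np.eye(T_i) + sigma_u2 * (ones @ ones.T)

def block_diag_omega(T_list, sigma_e2, sigma_u2):
    """Block-diagonal Var[w | X]; blocks differ only in their size T_i."""
    n = sum(T_list)
    V = np.zeros((n, n))
    start = 0
    for T_i in T_list:
        V[start:start + T_i, start:start + T_i] = omega_i(T_i, sigma_e2, sigma_u2)
        start += T_i
    return V

# Two groups with T_1 = 2 and T_2 = 3 (made-up variances).
V = block_diag_omega([2, 3], sigma_e2=1.0, sigma_u2=0.5)
print(V)
```

Observations in the same group share covariance sigma_u^2; observations in different groups are uncorrelated.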
Convergence of Moments
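$\frac{X'X}{\sum_{i=1}^{N} T_i} = \sum_{i=1}^{N} f_i \frac{X_i'X_i}{T_i}$, a weighted sum of individual moment matrices, with $f_i = T_i / \sum_{j=1}^{N} T_j$
$\frac{X'\Omega X}{\sum_{i=1}^{N} T_i} = \sum_{i=1}^{N} f_i \frac{X_i'\Omega_i X_i}{T_i}$, a weighted sum of individual moment matrices
$= \sigma_\varepsilon^2 \sum_{i=1}^{N} f_i \frac{X_i'X_i}{T_i} + \sigma_u^2 \sum_{i=1}^{N} f_i T_i \bar x_i \bar x_i'$
Note asymptotics are with respect to N. Each matrix $X_i'X_i / T_i$ is the moments for the $T_i$ observations. Should be 'well behaved' in micro level data. The average of N such matrices should be likewise. T or $T_i$ is assumed to be fixed (and small).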
Random vs. Fixed Effects
Random Effects
Small number of parameters
Efficient estimation
Objectionable orthogonality assumption ($c_i \perp X_i$)
Fixed Effects
Robust – generally consistent
Large number of parameters
Ordinary Least Squares
Standard results for OLS in a generalized regression (GR) model
Consistent
Unbiased
Inefficient
True Variance
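$\mathrm{Var}[b \mid X] = \frac{1}{\sum_{i=1}^{N} T_i} \left[ \frac{X'X}{\sum_{i=1}^{N} T_i} \right]^{-1} \left[ \frac{X'\Omega X}{\sum_{i=1}^{N} T_i} \right] \left[ \frac{X'X}{\sum_{i=1}^{N} T_i} \right]^{-1}$
$\to 0 \times Q^{-1} \times Q^{*} \times Q^{-1}$
$\to 0$ as $N \to \infty$ with our convergence assumptions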
Estimating the Variance for OLS
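$\mathrm{Var}[b \mid X] = \frac{1}{\sum_{i=1}^{N} T_i} \left[ \frac{X'X}{\sum_{i=1}^{N} T_i} \right]^{-1} \left[ \frac{X'\Omega X}{\sum_{i=1}^{N} T_i} \right] \left[ \frac{X'X}{\sum_{i=1}^{N} T_i} \right]^{-1}$
$\frac{X'\Omega X}{\sum_{i=1}^{N} T_i} = \sum_{i=1}^{N} f_i \frac{X_i'\Omega_i X_i}{T_i}$, where $\Omega_i = E[w_i w_i' \mid X_i]$
In the spirit of the White estimator, use
$\frac{X'\hat\Omega X}{\sum_{i=1}^{N} T_i} = \sum_{i=1}^{N} f_i \frac{X_i' \hat w_i \hat w_i' X_i}{T_i}$, $\hat w_i = y_i - X_i b$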
Hypothesis tests are then based on Wald statistics.
$\text{Est.Var}[b \mid X] = (X'X)^{-1} \left[ \sum_{i=1}^{N} X_i' \hat w_i \hat w_i' X_i \right] (X'X)^{-1}$
$\hat w_i$ = set of $T_i$ OLS residuals for individual i.
$X_i$ = $T_i \times K$ data on exogenous variables for individual i.
$X_i' \hat w_i$ = $K \times 1$ vector of products
$(X_i' \hat w_i)(\hat w_i' X_i)$ = $K \times K$ matrix (rank 1, outer product)
$\sum_{i=1}^{N} X_i' \hat w_i \hat w_i' X_i$ = sum of N rank 1 matrices. Rank K.
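The estimator above is a short loop over individuals. The sketch below is one way to compute it, assuming pooled OLS residuals and integer group labels; the function name, the simulated data and the parameter values are all hypothetical, and no finite-sample correction is applied.

```python
import numpy as np

def cluster_robust_cov(X, y, ids):
    """Pooled OLS b and (X'X)^-1 [sum_i X_i'w_i w_i'X_i] (X'X)^-1."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    w = y - X @ b                          # pooled OLS residuals
    XtX_inv = np.linalg.inv(X.T @ X)
    K = X.shape[1]
    middle = np.zeros((K, K))
    for g in np.unique(ids):
        rows = ids == g
        s = X[rows].T @ w[rows]            # K-vector X_i'w_i
        middle += np.outer(s, s)           # rank-1 K x K contribution
    return b, XtX_inv @ middle @ XtX_inv

# Simulated balanced panel with a common group effect (assumed data).
rng = np.random.default_rng(1)
N, T, K = 50, 4, 3
ids = np.repeat(np.arange(N), T)
X = np.column_stack([np.ones(N * T), rng.normal(size=(N * T, K - 1))])
u = rng.normal(size=N)[ids]
y = X @ np.array([1.0, 0.5, -0.5]) + u + rng.normal(size=N * T)

b, V = cluster_robust_cov(X, y, ids)
print("b =", b)
print("robust s.e. =", np.sqrt(np.diag(V)))
```

Each group contributes a single rank-1 matrix, so at least K groups are needed before the middle matrix can have rank K.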
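From the pooled OLS estimator: $\widehat{\sigma_\varepsilon^2 + \sigma_u^2} = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T_i} (y_{it} - a_{OLS} - x_{it}'b_{OLS})^2}{\sum_{i=1}^{N} T_i}$
$\hat\sigma_u^2 = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T_i} (y_{it} - a_{OLS} - x_{it}'b_{OLS})^2}{\sum_{i=1}^{N} T_i} - \frac{\sum_{i=1}^{N} \sum_{t=1}^{T_i} (y_{it} - a_i - x_{it}'b_{LSDV})^2}{\sum_{i=1}^{N} T_i} \ge 0$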
Regress; lhs=lwage;rhs=fixedx,varyingx;res=e$
Matrix ; tebar=7*gxbr(e,person)$
Calc ; list;lm=595*7/(2*(7-1))*
(tebar'tebar/sumsqdev - 1)^2$
LM = 3797.06757
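The commands above compute the Lagrange multiplier statistic for a balanced panel with N = 595 individuals and T = 7 periods. A Python sketch of the same arithmetic follows, assuming that `gxbr(e,person)` returns the group means of the residuals and that `sumsqdev` holds the residual sum of squares; the function name `lm_random_effects` and the toy residuals are hypothetical.

```python
import numpy as np

def lm_random_effects(e, ids, T):
    """LM = NT / (2(T-1)) * [ sum_i (T * ebar_i)^2 / sum e_it^2 - 1 ]^2.

    e   : pooled OLS residuals
    ids : integer group labels 0..N-1, balanced panel with T rows per group
    """
    N = np.unique(ids).size
    group_sums = np.bincount(ids, weights=e)       # T * ebar_i per group
    ratio = (group_sums @ group_sums) / (e @ e)
    return N * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

# Toy residuals: two groups, two periods, perfectly correlated within group.
e = np.array([1.0, 1.0, -1.0, -1.0])
ids = np.array([0, 0, 1, 1])
lm = lm_random_effects(e, ids, T=2)
print(lm)
```

Under the null hypothesis of no group effects the statistic is compared with a chi-squared distribution with one degree of freedom.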
Hausman Test for FE vs. RE
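Wald Criterion: $\hat q = \hat\beta_{FE} - \hat\beta_{RE}$; $W = \hat q' [\mathrm{Var}(\hat q)]^{-1} \hat q$
$\sqrt{nT}\,[\hat\beta_{FE} - \beta] \xrightarrow{d} N[0, V_{FE}]$ (inefficient)
Note: $\hat q = (\hat\beta_{FE} - \beta) - (\hat\beta_{RE} - \beta)$. The lemma states that in the joint limiting distribution of $\sqrt{nT}\,[\hat\beta_{RE} - \beta]$ and $\sqrt{nT}\,\hat q$, the limiting covariance is zero.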
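$\text{Est.Var}[\hat\beta_{RE}] = \hat\sigma_\varepsilon^2 \left[ \sum_{i=1}^{N} X_i' \left( I - \frac{\hat\gamma_i}{T_i} \mathbf{i}\mathbf{i}' \right) X_i \right]^{-1}$, $0 \le \hat\gamma_i = \frac{T_i \hat\sigma_u^2}{\hat\sigma_\varepsilon^2 + T_i \hat\sigma_u^2} \le 1$
As long as $\hat\sigma_\varepsilon^2$ and $\hat\sigma_u^2$ are consistent, as $N \to \infty$, $\text{Est.Var}[\hat\beta_{FE}] - \text{Est.Var}[\hat\beta_{RE}]$ will be nonnegative definite.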
invariant variables in X.
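Once both sets of estimates are available, the Wald criterion is a one-line quadratic form. The sketch below assumes the standard consequence of the lemma, Var(q) = V_FE - V_RE, and uses made-up coefficient vectors and covariance matrices purely to exercise the function; `hausman` is a hypothetical helper name.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """W = q'[Var(q)]^-1 q with q = b_fe - b_re and Var(q) = V_fe - V_re.

    Only coefficients on time-varying regressors should be passed in,
    so that V_fe - V_re is nonsingular.
    """
    q = np.asarray(b_fe) - np.asarray(b_re)
    Vq = np.asarray(V_fe) - np.asarray(V_re)
    W = float(q @ np.linalg.solve(Vq, q))
    return W, q.size                       # statistic and degrees of freedom

# Made-up numbers, for illustration only.
b_fe = np.array([0.50, -1.00])
b_re = np.array([0.40, -0.90])
V_fe = np.diag([0.020, 0.030])
V_re = np.diag([0.010, 0.010])

W, df = hausman(b_fe, b_re, V_fe, V_re)
print(W, df)
```

W is compared with a chi-squared distribution whose degrees of freedom equal the number of coefficients being compared.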
+--------------------------------------------------+
| Random Effects Model: v(i,t) = e(i,t) + u(i) |
| Estimates: Var[e] = .235236D-01 |
| Var[u] = .133156D+00 |
| Corr[v(i,t),v(i,s)] = .849862 |
| Lagrange Multiplier Test vs. Model (3) = 4061.11 |
| ( 1 df, prob value = .000000) |
| (High values of LM favor FEM/REM over CR model.) |
| Fixed vs. Random Effects (Hausman) = 2632.34 |
| ( 4 df, prob value = .000000) |
| (High (low) values of H favor FEM (REM).) |
+--------------------------------------------------+
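The reported correlation follows directly from the two variance estimates, since Corr[v(i,t),v(i,s)] = Var[u] / (Var[e] + Var[u]). A quick check using the figures printed above (variable names are illustrative):

```python
# Variance components copied from the output above.
var_e = 0.0235236      # Var[e] = .235236D-01
var_u = 0.133156       # Var[u] = .133156D+00

# Within-group correlation of the compound disturbance v = e + u.
corr = var_u / (var_e + var_u)
print(f"Corr[v(i,t),v(i,s)] = {corr:.6f}")
```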
Wu (Variable Addition) Test