
The projection approach for unbalanced panel data
Author(s): Jason Abrevaya

Source: The Econometrics Journal , 2013, Vol. 16, No. 2 (2013), pp. 161-178
Published by: Oxford University Press on behalf of the Royal Economic Society
Stable URL: http://www.jstor.com/stable/43697634
doi: 10.1111/j.1368-423X.2012.00389.x
Jason Abrevaya†
† Department of Economics, The University of Texas at Austin, 2225 Speedway Stop C3100, Austin, TX 78712, USA.
E-mail: abrevaya@eco.utexas.edu
First version received: August 2010; final version accepted: September 2012
Summary The Chamberlain projection approach, a powerful tool for the analysis of linear
fixed-effects models, was introduced within the context of balanced panels. This paper extends
the Chamberlain projection approach to unbalanced panels. The extension is especially useful
for models with sequential exogeneity, where existing control-variable approaches are not
applicable. A generalized method of moments (GMM) estimation framework is considered,
and hypothesis tests (testing strict exogeneity, testing random effects, etc.) are discussed within
the GMM context.
Keywords: Fixed effects, Linear projections, Unbalanced panel data.
1. INTRODUCTION
Unbalanced panel data sets are commonly encountered in empirical research.1

unbalanced nature of panels does not affect applicability of many commonly used
(such as the within estimator or the random-effects estimator), this is not true of al
In this paper, we discuss one such estimation approach, the projection approach of
(1982), for which unbalanced panels pose a difficulty. The original idea of Chamber
projecting the (unobserved) fixed effect upon covariates from all time periods, is not
applicable in unbalanced panels. We introduce a modified Chamberlain approac
the fixed-effect projection depends on the form of missingness for a given cr
unit. The resulting projections lead to orthogonality conditions that depend upon t
exogeneity assumption maintained. We focus on the alternative assumptions of strict
and sequential exogeneity.
Related work by Wooldridge (2009) considers the use of Mundlak (1978) p
(i.e. projections of fixed effects onto averages of covariates) in the context of
panels under an assumption of strict exogeneity. Wooldridge (2009) shows that poo
squares regressions that use covariate averages as control variables are numerically
1 Baltagi and Song (2006) give an extensive survey of the econometrics literature on unbalanced panels.
2 These remain applicable as long as the missingness is not informative about the error disturbances, an
in Section 2.
© 2013 The Author(s).
The Econometrics Journal © 2013 Royal Economic Society. Published by John Wiley & Sons Ltd, 9600 Garsington
Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA, 02148, USA.

to the traditional fixed-effects (wit

a projection approach in the strictly
to point out that a Chamberlain ap
as we view the resulting orthogonal
(GMM) framework, efficiency gains
obtain.
A drawback of the Mundlak approa

strict exogeneity fails, for example,
'predetermined'). The Mundlak project
covariates, making the usual applicat
or lagged covariates as instruments) i
In contrast, the modified Chamberla
sequential exogeneity.
Among empirical researchers, the u
The fixed-effects estimator is used
estimators based upon first differen
are used in the presence of predeterm
and are included as part of statistical
the addition of general GMM comma
outlined in this paper is also easy to
Chamberlain GMM estimator can pr
estimator. The Chamberlain framew
noted in the literature, including (a) t
(or weaker versions of exogeneity), (b
and (c) the ability to estimate the rel
(from the estimated projection coeffi
The paper is organized as follows. S
data missingness that can lead to an
for the balanced-panel case is review
balanced-panel case is also reviewed. F
an example is provided to illustrate
inconsistent estimates of the mod
case in more detail. A modified Ch
The approach requires additional pro
for the cross-sectional units. The pr
in a GMM framework to develop e
GMM framework allows for straigh
For the sequentially exogenous case
inconsistency of the Mundlak appro
Chamberlain-based orthogonality con
assumption. Section 4 illustrates the
panel data set on wages, first assum
the strict exogeneity assumption of
also compared to commonly used esti
3 Wooldridge (2009) also generalizes the Mu

models.
exogeneity and Arellano-Bond-type first-difference in

sequential exogeneity.
2. THE MODEL
Consider the standard linear fixed-effects model
$$y_{it} = x_{it}\beta_0 + c_i + u_{it} \qquad (i = 1, \dots, n;\ t = 1, \dots, T), \tag{2.1}$$
where T represents the maximum number of time periods that a cross-sectional unit i would
be observed. As in Chamberlain (1982), we consider the case of large-n ($n \to \infty$), fixed-T
asymptotics. Let k denote the number of covariates in $x_{it}$ (so that $\beta_0$ is a k-vector). The
unbalanced nature of the panel is introduced by allowing time periods to be 'missing'. Following
the notation of Wooldridge (2002, subsection 17.7.1), an observability indicator is defined as
$$s_{it} = \begin{cases} 1 & \text{if } (y_{it}, x_{it}) \text{ observed}, \\ 0 & \text{otherwise.} \end{cases}$$
For missing observations ($s_{it} = 0$), we use the convention that $x_{it} = 0$ and $y_{it} = 0$. With
$T_i = \sum_t s_{it}$ denoting the total number of time periods observed for cross-sectional unit i, the
transformations (used for the standard fixed-effects (or within) estimator of (2.1)) are
$$\ddot{y}_{it} = y_{it} - T_i^{-1}\sum_r s_{ir}y_{ir}, \qquad \ddot{x}_{it} = x_{it} - T_i^{-1}\sum_r s_{ir}x_{ir}.$$
We w
no di
follow
$$y_i = (y_{i1}, \dots, y_{iT})', \quad x_i = (x_{i1}, \dots, x_{iT})', \quad u_i = (u_{i1}, \dots, u_{iT})', \quad s_i = (s_{i1}, \dots, s_{iT})'.$$
Each of these is a column vector, with $x_i$ of dimension Tk × 1 and $y_i$, $u_i$ and $s_i$ each of dimension
T × 1.
To complete the set-up of the model, we introduce two assumptions. First, the strict
exogeneity assumption is given by:
Assumption 2.1 (Strict Exogeneity). $E(u_i \mid x_i, c_i, s_i) = 0$.
This assumption allows observability ($s_i$) to be related in arbitrary ways with $x_i$ and $c_i$,
but restricts the error disturbances ($u_i$) to be conditionally mean independent of $(x_i, c_i, s_i)$.5
Assumption 2.1 is a stronger version of the usual strict exogeneity assumption, as it maintains
strict exogeneity of observability (selection) in addition to strict exogeneity of covariates.
Observability of $(y_{it}, x_{it})$ cannot be systematically related to any of the disturbances $u_{is}$
(s = 1, . . . , T) once unobserved heterogeneity and observables are controlled for.
4 If the missingness mechanism causes only a single observation to remain for a cross-sectional unit, we simply assume
that this unit has already been dropped from the observed data. Note that this implicit assumption would not require
anything stronger than Assumption 2.1 below, which already assumes that the error disturbances are conditionally mean
independent of the missingness indicators.
5 This assumption is identical to the assumption made by Wooldridge (2002) in his textbook treatment of unbalanced
panel data and also in Wooldridge (2009).

Secondly, to guarantee parameter

possible missingness is required.6
ASSUMPTION 2.2 (Full Rank). Let

$s \in S$ and $t \in \{1, \dots, T\}$ such that
The Chamberlain (1982) projection

effect $c_i$ upon $(x_{i1}, \dots, x_{iT})$,
$$c_i = \psi_0 + x_{i1}\lambda_{01} + x_{i2}\lambda_{02} + \cdots + x_{iT}\lambda_{0T} + a_i, \tag{2.2}$$
where $\psi_0$ is a scalar, each $\lambda_{0t}$ is a k × 1 vector, and $E[x_{it}'a_i] = 0$ by construction (for each t).
Plugging (2.2) into (2.1) yields a model from which $(\beta_0, \psi_0, \lambda_{01}, \dots, \lambda_{0T})$ can be estimated.
A simple estimation method is the pooled ordinary least squares (OLS) regression of $y_{it}$ on
$(x_{it}, 1, x_{i1}, \dots, x_{iT})$. It is now well-known (e.g. Imbens and Wooldridge, 2007, and Wooldridge,
2009) that this estimator is numerically equivalent to the within estimator and also the Mundlak
(1978) regression estimator, as stated in the following result:
Proposition 2.1 (Equivalence Result for Balanced Panels and No Missingness). The
following estimators of $\beta_0$ are numerically equivalent: (a) the within estimator (OLS of $\ddot{y}_{it}$
on $\ddot{x}_{it}$); (b) the Chamberlain regression estimator (OLS of $y_{it}$ on $(x_{it}, 1, x_{i1}, \dots, x_{iT})$); (c) the
Mundlak regression estimator (OLS of $y_{it}$ on $(x_{it}, 1, \bar{x}_i)$); (d) the least-squares dummy variable
(LSDV) regression (OLS of $y_{it}$ on $(x_{it}, d1_i, d2_i, \dots, dN_i)$, where $dj_i$ is an indicator variable
equal to one if i = j and zero if i ≠ j).
When missingness causes the panel to be unbalanced, the Chamberlain approach is no longer
directly applicable. If a researcher blindly attempts to apply the projection (e.g. by plugging in
zeros for xit in periods without data), this will generally cause inconsistency of the resulting
estimators. We provide a simple three-period (T = 3) example with scalar covariates to illustrate
this point. Consider the model
$$y_{it} = x_{it} + c_i + u_{it} \quad (t = 1, 2, 3),$$
$$c_i = x_{i3},$$
where $E(x_i) = 0$ and $\mathrm{var}(x_i) = I_3$. That is, the covariates are each mean zero with unit variance
and no serial correlation. The third period is missing for unit i with probability p. (When the
third period is missing, it is still the case that $c_i = x_{i3}$ but $(y_{i3}, x_{i3})$ are just not observed.) The
'blind' Chamberlain approach in this situation would be a pooled OLS regression with the $x_{it}$s as
control variables but $x_{i3} = 0$ used in the case of a missing observation. To easily evaluate the
probability limit of the Chamberlain pooled OLS estimator, we insert a row of zeroes (so as not
to otherwise affect estimation) for missing t = 3 observations. Then, the Chamberlain covariate
6 Note that Assumption 2.2 is somewhat stronger than the assumption made by Wooldridge (2002, Assumption 17
Whereas the latter assumption is sufficient for identification of the within estimator, the former assumption is needed to
guarantee identification of all parameters within the Chamberlain projection approach.

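matrix, denoted $X_i$, is given by
$$X_i = \begin{bmatrix} x_{i1} & 1 & x_{i1} & x_{i2} & x_{i3} \\ x_{i2} & 1 & x_{i1} & x_{i2} & x_{i3} \\ x_{i3} & 1 & x_{i1} & x_{i2} & x_{i3} \end{bmatrix} \tag{2.3}$$
for fully observed i, and by
$$X_i = \begin{bmatrix} x_{i1} & 1 & x_{i1} & x_{i2} & 0 \\ x_{i2} & 1 & x_{i1} & x_{i2} & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{2.4}$$
for i with missing t = 3. The probability limit of the pooled OLS estimator is
$E(X_i'X_i)^{-1}E(X_i'y_i)$. The distributional assumption on $x_i$ and the model for $y_{it}$ imply
$$E(X_i'X_i) = (1 - p)\begin{bmatrix} 3 & 0 & 1 & 1 & 1 \\ 0 & 3 & 0 & 0 & 0 \\ 1 & 0 & 3 & 0 & 0 \\ 1 & 0 & 0 & 3 & 0 \\ 1 & 0 & 0 & 0 & 3 \end{bmatrix} + p\begin{bmatrix} 2 & 0 & 1 & 1 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 1 & 0 & 2 & 0 & 0 \\ 1 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
and
$$E(X_i'y_i) = (1 - p)\begin{bmatrix} 4 \\ 0 \\ 1 \\ 1 \\ 4 \end{bmatrix} + p\begin{bmatrix} 3 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}.$$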
We focus on the model's β parameter (equal to one in our design), corresponding to the first
component of the pooled OLS estimator. Figure 1 graphs the probability limit of the Chamberlain
estimator of β versus the probability of t = 3 missingness (p). With no missingness (p = 0), the
Chamberlain estimator is consistent and has probability limit equal to one. The inconsistency
worsens as p increases, with a probability limit that reaches twice as large as the true value when
p gets close to 1.⁷
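The probability limit described above can be traced numerically. The following is a minimal sketch, assuming the entries of the two mixture components of $E(X_i'X_i)$ and $E(X_i'y_i)$ exactly as printed in the displays above (they are copied, not re-derived); the function name `blind_chamberlain_plim` is hypothetical and introduced only for illustration.

```python
import numpy as np

# Moment matrices copied from the displays above (assumption: entries as printed).
# Column order: (x_it, 1, x_i1, x_i2, x_i3).
EXX_FULL = np.array([[3, 0, 1, 1, 1],
                     [0, 3, 0, 0, 0],
                     [1, 0, 3, 0, 0],
                     [1, 0, 0, 3, 0],
                     [1, 0, 0, 0, 3]], dtype=float)
EXX_MISS = np.array([[2, 0, 1, 1, 0],
                     [0, 2, 0, 0, 0],
                     [1, 0, 2, 0, 0],
                     [1, 0, 0, 2, 0],
                     [0, 0, 0, 0, 0]], dtype=float)
EXY_FULL = np.array([4, 0, 1, 1, 4], dtype=float)
EXY_MISS = np.array([3, 0, 1, 1, 0], dtype=float)


def blind_chamberlain_plim(p):
    """First component of E(X'X)^{-1} E(X'y) for a missingness probability p < 1."""
    exx = (1 - p) * EXX_FULL + p * EXX_MISS
    exy = (1 - p) * EXY_FULL + p * EXY_MISS
    return np.linalg.solve(exx, exy)[0]


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 0.99):
        print(f"p = {p:4.2f}  plim of beta-hat = {blind_chamberlain_plim(p):.4f}")
```

The mixture matrix is singular at p = 1, so the function should only be evaluated for p strictly below one.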
Interestingly, as Wooldridge (2009) notes, the Mundlak regression (with $\bar{x}_i = T_i^{-1}\sum_t s_{it}x_{it}$)
remains consistent and numerically identical to the within estimator in the presence of
unbalanced panels.8 Unfortunately, while fine for estimation purposes under strict exogeneity,
the Mundlak approach does not offer the researcher the same flexibility for handling (and testing)
less stringent assumptions like sequential exogeneity.
3. CHAMBERLAIN PROJECTION FOR UNBALANCED PANELS
The T = 3 case is considered for ease of exposition. The methods discussed

naturally to larger T but with more moment conditions and additional paramet
7 Note that $E(X_i'X_i)$ is not invertible when p = 1 but is otherwise non-singular for p < 1. The probability limits
reported in Figure 1 were calculated for values of p strictly less than 1.
8 Others have recommended the use of covariate averages as control variables to handle unbalanced p
linear models (Wooldridge 2002, 2009) and quantile regression models (Bach et al., 2008).

Figure 1. Inconsistency of Chamb
case of model (2.1) is
$$y_{it} = x_{it}\beta_0 + c_i + u_{it} \qquad (i = 1, \dots, n;\ t = 1, 2, 3). \tag{3.1}$$
Recall that $T_i \ge 2$ for all i, so that $T_i$ is either two (one missing period) or three (no missing
periods) here.
The possible $s_i$ values for the $T_i = 2$ observations are (0, 1, 1)', (1, 0, 1)' and (1, 1, 0)'. There
are two possible methods for dealing with the $T_i = 2$ subsample: (a) combining all possible
configurations into a single projection method or (b) handling each possible $s_i$ configuration
as a separate projection method. Method (a) has the virtue of being simpler to implement
and requiring fewer moment conditions for estimation/testing purposes. By treating different $s_i$
configurations differently, method (b) offers the researcher the ability to check the sensitivity of
the model specification to violations of strict exogeneity that depend upon which time periods are
missing (and not just how many time periods are missing). These two methods, which we call
$T_i$-dependent projections and $s_i$-dependent projections, respectively, are described in Sections 3.1
and 3.3.
3.1. $T_i$-dependent projections
To use a single method for all $s_i$ configurations for $T_i = 2$ observations, the easiest approach is
to simply shift the time indices for unit i such that $s_i = (1, 1, 0)'$. That is, the last time period
(t = 3) is made to be missing for each $T_i = 2$ observation.9 The shift of the time indices should
be done in such a way that the ordering of time is maintained, which allows for a treatment of
9 Note that this relabelling of the time indices does not affect estimation of time dummies (or other t-dependent
covariates) since we are not changing any of the covariate values themselves.

the sequentially exogenous case in subsection 3.2. Specifically, a unit i with $s_i = (1, 1, 0)'$ would
be left alone, a unit i with $s_i = (1, 0, 1)'$ would have t = 3 shifted to t = 2, and a
unit i with $s_i = (0, 1, 1)'$ would have t = 2 and t = 3 shifted to t = 1 and t = 2,
respectively.
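The order-preserving relabelling can be sketched in a few lines. This is only an illustration under the T = 3 set-up above; the helper name `relabel_time_indices` and the tuple encoding of $s_i$ are assumptions made for the example, not part of the paper.

```python
def relabel_time_indices(s):
    """Map each observed period to its shifted index, keeping the time ordering.

    s is a tuple of 0/1 observability indicators (s_i1, ..., s_iT).
    Returns a dict {original period: relabelled period}.
    """
    observed = [t for t, s_t in enumerate(s, start=1) if s_t == 1]
    return {old: new for new, old in enumerate(observed, start=1)}


if __name__ == "__main__":
    for s in [(1, 1, 0), (1, 0, 1), (0, 1, 1)]:
        print(s, "->", relabel_time_indices(s))
```

Only the period labels used to select a projection change; the covariate values (including any time dummies) attached to each observation are left untouched.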
For fully observed cross-sectional units ($T_i = 3$), the usual projection is applied:
$$c_i = \psi_0 + x_{i1}\lambda_{01} + x_{i2}\lambda_{02} + x_{i3}\lambda_{03} + a_i \quad \text{(fully observed)}, \tag{3.2}$$
where $E(x_{i1}'a_i) = E(x_{i2}'a_i) = E(x_{i3}'a_i) = 0$ by construction. For the $T_i = 2$ observations, where
$x_{i3}$ is not observed, the fixed effect $c_i$ is projected onto only $x_{i1}$ and $x_{i2}$:
$$c_i = \psi_0^3 + x_{i1}\lambda_{01}^3 + x_{i2}\lambda_{02}^3 + a_i^3 \quad (t = 3 \text{ missing}), \tag{3.3}$$
where $E(x_{i1}'a_i^3) = E(x_{i2}'a_i^3) = 0$ by construction. Superscripts are used for parameters and
projection errors as a convention for denoting which period is missing data (period t = 3 here).
Plugging the two projections (3.2) and (3.3) back into the original model (3.1) yields,
respectively,
$$y_{it} = x_{it}\beta_0 + \psi_0 + x_{i1}\lambda_{01} + x_{i2}\lambda_{02} + x_{i3}\lambda_{03} + a_i + u_{it} \quad \text{for } t = 1, 2, 3 \text{ and } T_i = 3 \tag{3.4}$$
and
$$y_{it} = x_{it}\beta_0 + \psi_0^3 + x_{i1}\lambda_{01}^3 + x_{i2}\lambda_{02}^3 + a_i^3 + u_{it} \quad \text{for } t = 1, 2 \text{ and } T_i = 2. \tag{3.5}$$
For (3.4) and (3.5), the composite error disturbances $a_i + u_{it}$ and $a_i^3 + u_{it}$, respectively, are
uncorrelated with the regressors due to Assumption 2.1 and the linear projections. These
orthogonality conditions can naturally be represented as moment conditions to allow for GMM
estimation. To simplify notation, let $\theta = (\beta, \psi, \lambda_1, \lambda_2, \lambda_3, \psi^3, \lambda_1^3, \lambda_2^3)$ denote the full vector of
parameters and $\theta_0 = (\beta_0, \psi_0, \lambda_{01}, \lambda_{02}, \lambda_{03}, \psi_0^3, \lambda_{01}^3, \lambda_{02}^3)$ the true parameters.
Specifically, consider the following set of moment functions corresponding to the
orthogonality conditions for (3.4) and (3.5):
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3) \quad \text{for } t = 1, 2, 3, \tag{3.6}$$
$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^3 - x_{i1}\lambda_1^3 - x_{i2}\lambda_2^3) \quad \text{for } t = 1, 2. \tag{3.7}$$
Let $g(z_i, \theta)$ denote the stacked vector of all these functions, where $z_i = (y_i, x_i, s_i)$. There are a
total of 13k + 5 moment conditions and 6k + 2 parameters (7k + 3 overidentifying restrictions).
In contrast, the usual situation with no missing data would be based solely upon (3.6), with 9k + 3
moment conditions and 4k + 1 parameters (5k + 2 overidentifying restrictions).
Note that we have additional overidentifying restrictions from the proposed orthogonality
conditions as compared to the non-missing case. These additional restrictions arise since the

orthogonality conditions must hold

The additional restrictions need not y
not make sense to purposefully throw
The orthogonality conditions imply
true parameter values.
LEMMA 3.1. Under Assumption 2.

moment functions given in (3.6) and
$\theta \neq \theta_0$.
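Let $\hat\theta$ denote the unweighted GMM estimator obtained by minimizing
$$\left(\sum_i g(z_i, \theta)\right)'\left(\sum_i g(z_i, \theta)\right). \tag{3.8}$$
The GMM estimator can be implemented without numerical optimization by using instrumental-
variables methods. Specifically, define the instrumental-variable matrix $Z_i$ as
$$Z_i \equiv \begin{bmatrix} z_{i1} & 0 & 0 \\ 0 & z_{i2} & 0 \\ 0 & 0 & z_{i3} \end{bmatrix}, \tag{3.9}$$
where
$$z_{i1} = [\, s_{i3} \quad s_{i3}x_{i1} \quad s_{i3}x_{i2} \quad s_{i3}x_{i3} \quad (1 - s_{i3}) \quad (1 - s_{i3})x_{i1} \quad (1 - s_{i3})x_{i2} \,], \tag{3.10}$$
$$z_{i2} = [\, s_{i3} \quad s_{i3}x_{i1} \quad s_{i3}x_{i2} \quad s_{i3}x_{i3} \quad (1 - s_{i3}) \quad (1 - s_{i3})x_{i1} \quad (1 - s_{i3})x_{i2} \,], \tag{3.11}$$
$$z_{i3} = [\, s_{i3} \quad s_{i3}x_{i1} \quad s_{i3}x_{i2} \quad s_{i3}x_{i3} \,], \tag{3.12}$$
and the 0s in (3.9) are row vectors of appropriate dimension.
Note that (3.4) and (3.5) can be combined into a single equation:
$$y_{it} = x_{it}\beta_0 + s_{i3}(\psi_0 + x_{i1}\lambda_{01} + x_{i2}\lambda_{02} + x_{i3}\lambda_{03} + a_i) + (1 - s_{i3})(\psi_0^3 + x_{i1}\lambda_{01}^3 + x_{i2}\lambda_{02}^3 + a_i^3) + u_{it}. \tag{3.13}$$
Then, defining
$$X_i = \begin{bmatrix} x_{i1} & s_{i3} & s_{i3}x_{i1} & s_{i3}x_{i2} & s_{i3}x_{i3} & (1 - s_{i3}) & (1 - s_{i3})x_{i1} & (1 - s_{i3})x_{i2} \\ x_{i2} & s_{i3} & s_{i3}x_{i1} & s_{i3}x_{i2} & s_{i3}x_{i3} & (1 - s_{i3}) & (1 - s_{i3})x_{i1} & (1 - s_{i3})x_{i2} \\ x_{i3} & s_{i3} & s_{i3}x_{i1} & s_{i3}x_{i2} & s_{i3}x_{i3} & (1 - s_{i3}) & (1 - s_{i3})x_{i1} & (1 - s_{i3})x_{i2} \end{bmatrix}, \tag{3.14}$$
the unweighted GMM estimator $\hat\theta$ from (3.8) can be obtained directly as the system IV estimator
$$\hat\theta = (X'ZZ'X)^{-1}(X'ZZ'Y), \tag{3.15}$$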
10 An alternative way to proceed, which maintains the same number of ortho

case, is provided in the Appendix. This alternative approach would be particula
of orthogonality conditions will increase rapidly in T.
where X, Z and Y are the stacked versions of $X_i$, $Z_i$

× (13k + 5) and nT × 1, respectively.
The unweighted GMM estimator is, in general, ineff
allows for weighting is
w^n-'¿g(zi,0
The system 2SLS estimator has $W_{2SLS} = (Z'Z/n)^{-1}$, so
$$\hat\theta_{2SLS} = (X'ZW_{2SLS}Z'X)^{-1}(X'ZW_{2SLS}Z'Y).$$
Let fa s lš denote the components of §2sls correspon

also not necessarily efficient, is equivalent to the with
PROPOSITION 3.1. The following estimators of ßo

estimator ( OLS of y u on x it)', (b) the 2SLS Chamber
on Xc, ( d ) the Mundlak regression estimator (OLS of
(e) the LSDV regression for sit = 1 observations.
This proposition extends the equivalence results

Looking at the combined model in (3.13), the validity
in the unbalanced-panel setting by interacting the app
missingness configuration. Equivalently, one can think
a specific missingness configuration as in the moment
Finally, the optimal GMM estimator can be obtained
instance, after obtaining the 2SLS estimator 62SLS , t
objective function (3.16) where the optimal weighting
$$W = \left(n^{-1}\sum_i g(z_i, \hat\theta_{2SLS})\,g(z_i, \hat\theta_{2SLS})'\right)^{-1}.$$
The optimal GMM estimator can be obtained directly as $\tilde\theta = (X'ZWZ'X)^{-1}X'ZWZ'Y$.
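The estimators above are closed-form, so they can be sketched directly with matrix algebra. The following is a minimal illustration for a scalar covariate (k = 1) and T = 3, after time indices have been relabelled so that only t = 3 can be missing. The simulated design (fixed effect correlated with the covariate mean, period 3 missing completely at random) and all function names are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np


def simulate_unbalanced(n, beta=1.0, p_missing=0.3, seed=0):
    """Hypothetical design: scalar x, T = 3, period 3 missing at random."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 3))
    c = x.mean(axis=1) + rng.standard_normal(n)      # fixed effect correlated with x
    y = beta * x + c[:, None] + rng.standard_normal((n, 3))
    s3 = (rng.random(n) > p_missing).astype(float)   # 1 = fully observed
    x[:, 2] *= s3                                    # convention: x_i3 = y_i3 = 0 if missing
    y[:, 2] *= s3
    return y, x, s3


def build_system(x, s3):
    """Regressors X_i as in (3.14) and instruments Z_i as in (3.9)-(3.12), k = 1."""
    n = x.shape[0]
    full = np.column_stack([s3, s3 * x[:, 0], s3 * x[:, 1], s3 * x[:, 2]])
    part = np.column_stack([1 - s3, (1 - s3) * x[:, 0], (1 - s3) * x[:, 1]])
    ctrl = np.hstack([full, part])                                   # n x 7
    X = np.stack([np.column_stack([x[:, t], ctrl]) for t in range(3)], axis=1)
    Z = np.zeros((n, 3, 18))                                         # 13k + 5 = 18 columns
    Z[:, 0, 0:7] = ctrl
    Z[:, 1, 7:14] = ctrl
    Z[:, 2, 14:18] = full
    return X, Z


def linear_gmm(y, X, Z):
    """System 2SLS, optimal (two-step) GMM and the overidentification statistic J."""
    n = y.shape[0]
    A = np.einsum("itm,itp->mp", Z, X)               # Z'X
    b = np.einsum("itm,it->m", Z, y)                 # Z'Y

    def solve(W):
        return np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

    def moments(theta):
        return np.einsum("itm,it->im", Z, y - X @ theta)   # rows are g(z_i, theta)

    W_2sls = np.linalg.inv(np.einsum("itm,itq->mq", Z, Z) / n)
    theta_2sls = solve(W_2sls)
    g = moments(theta_2sls)
    W_opt = np.linalg.inv(g.T @ g / n)
    theta_opt = solve(W_opt)
    gbar = moments(theta_opt).mean(axis=0)
    return theta_2sls, theta_opt, n * gbar @ W_opt @ gbar


def within_estimator(y, x, s3):
    """Fixed-effects (within) estimator using only observed periods."""
    s = np.column_stack([np.ones_like(s3), np.ones_like(s3), s3])
    Ti = s.sum(axis=1, keepdims=True)
    xd = s * (x - (s * x).sum(axis=1, keepdims=True) / Ti)
    yd = s * (y - (s * y).sum(axis=1, keepdims=True) / Ti)
    return (xd * yd).sum() / (xd ** 2).sum()


if __name__ == "__main__":
    y, x, s3 = simulate_unbalanced(5000, seed=1)
    X, Z = build_system(x, s3)
    theta_2sls, theta_opt, J = linear_gmm(y, X, Z)
    print("beta (2SLS):       ", theta_2sls[0])
    print("beta (within):     ", within_estimator(y, x, s3))
    print("beta (optimal GMM):", theta_opt[0])
    print("J statistic (7k + 3 = 10 d.o.f.):", J)
```

In this sketch the β-component of the 2SLS estimate coincides with the within estimator up to floating-point error, which is the numerical equivalence that the combined model (3.13) is meant to preserve; the optimal GMM estimate generally differs slightly.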
3.1.1. Test of overidentifying restrictions. If the moment conditions are correctly specified
(which will occur if the model itself is correctly specified and Assumption 2.1 holds), then
the optimal GMM estimator $\tilde\theta$ can be used directly for a test of overidentifying restrictions.
Specifically, under correct specification
$$J = n\left(n^{-1}\sum_i g(z_i, \tilde\theta)\right)' W \left(n^{-1}\sum_i g(z_i, \tilde\theta)\right) \xrightarrow{d} \chi^2_{7k+3}.$$
Rejection based upon this test statistic (i.e. a value greater than the appropriately chosen critical
value from the $\chi^2_{7k+3}$ distribution) is evidence of a violation of strict exogeneity.
Note that the degrees of freedom for the test of overidentifying restrictions can be quite
large, even for the T = 3 case considered here. As a result, it is quite possible that such tests will

have low power. This property is cert

balanced panels and has been noted a
3.1.2. Test of a random-effects

heterogeneity $c_i$ is independent of a
λ parameters in the Chamberlain pr
interest is
$$H_0 : \lambda_{01} = \lambda_{02} = \lambda_{03} = \lambda_{01}^3 = \lambda_{02}^3 = 0,$$
which could be tested in a variety o

a Wald test based upon an estimator
distribution under $H_0$. LM- and LR-t
3.2. Sequential exogeneity
An appealing feature of the Chambe

(Assumption 2.1) can be weakened di
researcher would like to relax. In t
assumption with a sequential-exogen
Whereas the Mundlak approach is
projections does not seem to provide
exogenous case, even for a balanced
period balanced panel example with
$$y_{it} = x_{it} + c_i + u_{it} \quad (t = 1, 2),$$
$$c_i = x_{i1},$$
where the violation of strict exogeneity comes from a strong form of feedback, $x_{i2} = u_{i1}$.
This model satisfies the following sequential-exogeneity assumption:
$$E(u_{i1} \mid x_{i1}, c_i) = 0 \quad \text{and} \quad E(u_{i2} \mid x_{i1}, x_{i2}, c_i) = 0.$$
The Mundlak and Chamberlain projections would lead to corresponding 'residuals' given
by $y_{it} - \beta x_{it} - \lambda\bar{x}_i$ and $y_{it} - \beta x_{it} - \lambda_1 x_{i1} - \lambda_2 x_{i2}$, respectively. The Mundlak residual is not
guaranteed to be orthogonal to $x_{i1}$ or $x_{i2}$ for either t = 1 or t = 2. In contrast, the Chamberlain
residual is orthogonal to $x_{i1}$ and $x_{i2}$ for t = 2 and orthogonal to $x_{i1}$ for t = 1. We conducted a
simple GMM exercise using the model above, drawing $x_{i1}$, $u_{i1}$ and $u_{i2}$ as independent standard
normal random variables (and $c_i$, $x_{i2}$, $y_{i1}$, $y_{i2}$ resulting from the specification). We used a sample
size of n = 1,000,000 to ensure precision. Table 1 reports the results from the exercise. Four sets
of (efficient) GMM estimates are reported, corresponding to Mundlak and Chamberlain under
strict exogeneity and sequential exogeneity. For strict exogeneity, both covariates are used in the
orthogonality conditions in both time periods. For sequential exogeneity, $x_{i2}$ is dropped from
the first time period's orthogonality conditions. Both estimators are clearly inconsistent when
strict exogeneity is incorrectly assumed, with the Chamberlain estimator actually performing
worse than the Mundlak estimator. When sequential exogeneity is correctly assumed, however,
the Chamberlain estimator works perfectly and recovers both the true β parameter and the correct
projection parameters. The Mundlak estimator remains inconsistent.
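A small simulation in the spirit of this exercise can be sketched as follows. It is an illustration only: it assumes the feedback takes the form $x_{i2} = u_{i1}$, uses a smaller sample than the paper, and replaces the efficient GMM estimators of Table 1 with simpler stand-ins (pooled OLS of the Chamberlain regression when strict exogeneity is imposed, and system 2SLS when only the sequential-exogeneity moments are used). The function names are hypothetical.

```python
import numpy as np


def simulate(n, seed=0):
    """Two-period design: c_i = x_i1 and feedback x_i2 = u_i1 (assumed form)."""
    rng = np.random.default_rng(seed)
    x1, u1, u2 = rng.standard_normal((3, n))
    c = x1
    x2 = u1
    y1 = x1 + c + u1
    y2 = x2 + c + u2
    return x1, x2, y1, y2


def chamberlain_estimates(x1, x2, y1, y2):
    """Return (theta_strict, theta_seq), each ordered as (beta, psi, lambda_1, lambda_2)."""
    n = x1.size
    one, zero = np.ones(n), np.zeros(n)
    # Regressors (x_it, 1, x_i1, x_i2): t = 1 rows stacked above t = 2 rows.
    X = np.vstack([np.column_stack([x1, one, x1, x2]),
                   np.column_stack([x2, one, x1, x2])])
    Y = np.concatenate([y1, y2])

    # Strict exogeneity imposed: pooled OLS of the Chamberlain regression.
    theta_strict = np.linalg.lstsq(X, Y, rcond=None)[0]

    # Sequential exogeneity: instruments (1, x_i1) at t = 1 and (1, x_i1, x_i2) at t = 2.
    Z = np.vstack([np.column_stack([one, x1, zero, zero, zero]),
                   np.column_stack([zero, zero, one, x1, x2])])
    ZX, ZY = Z.T @ X, Z.T @ Y
    W = np.linalg.inv(Z.T @ Z / n)                    # system 2SLS weighting
    theta_seq = np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ ZY)
    return theta_strict, theta_seq


if __name__ == "__main__":
    theta_strict, theta_seq = chamberlain_estimates(*simulate(200_000))
    print("beta, strict exogeneity imposed:    ", theta_strict[0])
    print("beta, sequential exogeneity moments:", theta_seq[0])
    print("projection coefficients (sequential):", theta_seq[2], theta_seq[3])
```

With this design the estimate that imposes strict exogeneity settles near 0.5, while the sequential-exogeneity moments recover a β close to one and projection coefficients close to (1, 0), in line with the Chamberlain columns of Table 1.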

Table 1. Failure of Mundlak approach under sequential exogeneity.

                                        Assuming strict exogeneity     Assuming sequential exogeneity
                                        Mundlak       Chamberlain      Mundlak       Chamberlain
β (true value = 1)                      0.6686        0.5015           0.6673        1.0000
                                        (0.0006)      (0.0009)         (0.0013)      (0.0014)
Projection coefficient on $\bar{x}_i$   1.9995                         2.0022
                                        (0.0005)                       (0.0024)
Projection coefficient on $x_{i1}$                    1.3492                         1.0002
                                                      (0.0008)                       (0.0010)
Projection coefficient on $x_{i2}$                    0.8497                         0.0007
                                                      (0.0003)                       (0.0017)
Note: Efficient GMM estimates

and Chamberlain uses a projection
periods. The last two columns do n
Returning to the T = 3
one is interested in impo
easily restore the mome
exogeneity, incorporati
exogeneity assumption E(u
The error disturbance is
and lagged covariates but
back effects in the dyna
observability is still restri
Note that only a subset o
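sequential exogeneity, spe
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{i3} - x_{i3}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3), \tag{3.17}$$
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{i2} - x_{i2}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3), \tag{3.18}$$
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \end{bmatrix}(y_{i1} - x_{i1}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3), \tag{3.19}$$
$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{i2} - x_{i2}\beta - \psi^3 - x_{i1}\lambda_1^3 - x_{i2}\lambda_2^3), \tag{3.20}$$
$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \end{bmatrix}(y_{i1} - x_{i1}\beta - \psi^3 - x_{i1}\lambda_1^3 - x_{i2}\lambda_2^3). \tag{3.21}$$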

A total of 4 k moment functions have b
leaving 9k + 5 moment functions and
the stacked moment functions (3.1
an optimal GMM estimator (denote
instrument matrix for IV estimation
corresponding to the dropped orthogo
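$$z_{i1} = [\, s_{i3} \quad s_{i3}x_{i1} \quad (1 - s_{i3}) \quad (1 - s_{i3})x_{i1} \,], \tag{3.22}$$
$$z_{i2} = [\, s_{i3} \quad s_{i3}x_{i1} \quad s_{i3}x_{i2} \quad (1 - s_{i3}) \quad (1 - s_{i3})x_{i1} \quad (1 - s_{i3})x_{i2} \,], \tag{3.23}$$
$$z_{i3} = [\, s_{i3} \quad s_{i3}x_{i1} \quad s_{i3}x_{i2} \quad s_{i3}x_{i3} \,]. \tag{3.24}$$
Redefining $Z_i$ using these rows (and Z as the stacked version), the 2SLS estimator is
$$\hat\theta_{seq,2SLS} = (X'ZW_{2SLS}Z'X)^{-1}(X'ZW_{2SLS}Z'Y),$$
where $W_{2SLS} = (Z'Z/n)^{-1}$. The optimal GMM estimator is
$$\tilde\theta_{seq} = (X'ZW_{seq}Z'X)^{-1}(X'ZW_{seq}Z'Y),$$
where
$$W_{seq} = \left(n^{-1}\sum_i g_{seq}(z_i, \hat\theta_{seq,2SLS})\,g_{seq}(z_i, \hat\theta_{seq,2SLS})'\right)^{-1}.$$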
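3.2.1. Overidentification test. The overidentification test statistic based upon the optimal GMM
estimator $\tilde\theta_{seq}$ is
$$J_{seq} = n\left(n^{-1}\sum_i g_{seq}(z_i, \tilde\theta_{seq})\right)' W_{seq} \left(n^{-1}\sum_i g_{seq}(z_i, \tilde\theta_{seq})\right)$$
and, under correct model specification and sequential exogeneity, has a $\chi^2_{3k+3}$ limiting
distribution.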
3.2.2. Test of strict exogeneity. To test the stronger assumption of strict exogeneity (Assumption
2.1) against the alternative of sequential exogeneity, one wants to test the validity of the extra
moments used for GMM estimation under strict exogeneity. There are several ways of performing
such a test, but a particularly simple method is the GMM-based approach of Eichenbaum
et al. (1988) (EHS hereafter).12 The EHS test statistic is simply the difference between the two
11 In the case with no missingness, there are 6k + 3 moment functions and 4k + 1 parameters.
12 See Hall (2005) for an excellent discussion of the EHS test and its asymptotic properties.

overidentification test statistics given above (one unde

exogeneity):
$$J_{EHS} = J - J_{seq}.$$
Under the null hypothesis that the full set of moment conditions is valid,
$J_{EHS} \xrightarrow{d} \chi^2_{4k}$. Note that the degrees of freedom here is the number of additional moment conditions
used for $\tilde\theta$.
3.3. $s_i$-dependent projections
In this subsection, the GMM approach of subsection 3.1 is modified to

conditions to be conditional on specific configurations rather than
of the aforementioned specification tests will apply to the GMM estim
subsection; as these tests only require a change in the degrees of freedo
described below.
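Whereas the $T_i$-dependent projection approach allows for two types of projections (one for
$T_i = 2$, one for $T_i = 3$), the $s_i$-dependent projection approach specifies four different projections
(three for $T_i = 2$, one for $T_i = 3$):
$$c_i = \psi_0 + x_{i1}\lambda_{01} + x_{i2}\lambda_{02} + x_{i3}\lambda_{03} + a_i \quad \text{(fully observed)}, \tag{3.25}$$
$$c_i = \psi_0^1 + x_{i2}\lambda_{02}^1 + x_{i3}\lambda_{03}^1 + a_i^1 \quad (t = 1 \text{ missing}), \tag{3.26}$$
$$c_i = \psi_0^2 + x_{i1}\lambda_{01}^2 + x_{i3}\lambda_{03}^2 + a_i^2 \quad (t = 2 \text{ missing}), \tag{3.27}$$
$$c_i = \psi_0^3 + x_{i1}\lambda_{01}^3 + x_{i2}\lambda_{02}^3 + a_i^3 \quad (t = 3 \text{ missing}). \tag{3.28}$$
Recall that the superscript notation is used to denote the missing time period. For each projection,
the projection error is, by construction, uncorrelated with the $x_{it}$s that appear on the right-hand
side of the projection equation (e.g. $a_i^2$ is uncorrelated with $x_{i1}$ and $x_{i3}$). The total number of
estimable parameters has increased to 10k + 4, with the true parameter vector given by
$\theta_0 = (\beta_0, \psi_0, \lambda_{01}, \lambda_{02}, \lambda_{03}, \psi_0^1, \lambda_{02}^1, \lambda_{03}^1, \psi_0^2, \lambda_{01}^2, \lambda_{03}^2, \psi_0^3, \lambda_{01}^3, \lambda_{02}^3)'$.
Using the projections (3.25)-(3.28) in conjunction with the model (3.1), the following
moment functions are implied by the resulting orthogonality conditions under strict exogeneity:
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3) \quad \text{for } t = 1, 2, 3, \tag{3.29}$$
$$(1 - s_{i1})s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^1 - x_{i2}\lambda_2^1 - x_{i3}\lambda_3^1) \quad \text{for } t = 2, 3, \tag{3.30}$$
$$s_{i1}(1 - s_{i2})s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^2 - x_{i1}\lambda_1^2 - x_{i3}\lambda_3^2) \quad \text{for } t = 1, 3, \tag{3.31}$$
$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^3 - x_{i1}\lambda_1^3 - x_{i2}\lambda_2^3) \quad \text{for } t = 1, 2. \tag{3.32}$$
There are 21k + 9 moment functions within (3.29)-(3.32) and, therefore, 11k + 5
overidentifying restrictions. For sequential exogeneity, one would drop the necessary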
orthogonality conditions from the set above, similar to subsection 3.2.
To keep the notation simple, re-define $g(z_i, \theta)$ to be the stacked moment functions above
and $\hat\theta_{2SLS}$ and $\tilde\theta$ to be the 2SLS and optimal GMM estimators, respectively. Similarly, re-define
$g_{seq}(z_i, \theta)$ to be the stacked moment functions under sequential exogeneity (removing a total of
6k moments) and $\hat\theta_{seq,2SLS}$ and $\tilde\theta_{seq}$ to be the corresponding 2SLS and optimal GMM estimators,
respectively. Finally, let J and $J_{seq}$ denote the overidentification test statistics associated with
the optimal GMM estimators $\tilde\theta$ and $\tilde\theta_{seq}$, respectively. With this notation, it is straightforward to
extend the various tests introduced above. A Wald test of the random-effects specification (all λ
parameters being equal to zero) would have 9k degrees of freedom. The overidentification tests
based upon J and $J_{seq}$ have 11k + 5 and 5k + 5 degrees of freedom, respectively. The EHS test
of the additional moment restrictions used under strict exogeneity, based upon the test statistic
$J - J_{seq}$, has 6k degrees of freedom.
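The moment and parameter counts quoted for the two projection schemes follow a simple pattern, which the sketch below tallies. It assumes each missingness configuration gets its own intercept and one λ vector per observed period, that strict exogeneity uses all of a unit's observed covariates as instruments in every observed period, and that sequential exogeneity uses only current and earlier observed covariates. The function name `count_moments` is hypothetical.

```python
def count_moments(configs, k, sequential=False):
    """Return (moments, parameters, overidentifying restrictions).

    configs: list of tuples, each listing the observed periods of one
             missingness configuration, e.g. (1, 2, 3) or (1, 3).
    k:       number of covariates.
    """
    n_params = k                                   # beta
    n_moments = 0
    for observed in configs:
        m = len(observed)
        n_params += 1 + m * k                      # intercept + one lambda per observed period
        for rank, _ in enumerate(sorted(observed), start=1):
            n_instr_periods = rank if sequential else m
            n_moments += 1 + n_instr_periods * k   # constant + instrumenting covariates
    return n_moments, n_params, n_moments - n_params


T_DEPENDENT = [(1, 2, 3), (1, 2)]                  # after relabelling, only t = 3 is missing
S_DEPENDENT = [(1, 2, 3), (2, 3), (1, 3), (1, 2)]

if __name__ == "__main__":
    k = 4
    for name, configs in [("T_i-dependent", T_DEPENDENT), ("s_i-dependent", S_DEPENDENT)]:
        strict = count_moments(configs, k)
        seq = count_moments(configs, k, sequential=True)
        print(name, "strict:", strict, "sequential:", seq,
              "EHS d.o.f.:", strict[0] - seq[0])
```

For general k this reproduces 13k + 5 and 9k + 5 moments with 6k + 2 parameters for the $T_i$-dependent scheme, and 21k + 9 and 15k + 9 moments with 10k + 4 parameters for the $s_i$-dependent scheme, so the EHS degrees of freedom are 4k and 6k, respectively.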
The larger number of moment conditions for the $s_i$-dependent approach could yield efficiency
gains over the $T_i$-dependent approach. It is important to note, however, that even though there
are additional moments, each of these moments will have fewer associated observations within
the sample (i.e. observations for which the moment function is not trivially equal to zero). Any
efficiency gain would come from the fact that the form of residual heteroscedasticity and/or serial
correlation varies with $s_i$ even after conditioning on $T_i$. If this is not the case, the $T_i$-dependent
and $s_i$-dependent estimators should yield extremely similar results.
4. AN EMPIRICAL EXAMPLE
To illustrate the projection method described in Section 3, we consider the pane

wages originally studied by Vella and Verbeek (1998). Their original sample, take
National Longitudinal Survey (Youth Sample), was a balanced panel of 545 full-ti
males for the 8 years between 1980 and 1987.13 Starting from this original sample
created an unbalanced panel data set. Specifically, for a random sample of 25% of
in the data, we dropped 4 years of data (1981, 1983, 1985, 1987).14 The resultin
panel has 409 individuals with complete data (8 years) and 136 individuals with i
(4 years).
Since our primary purpose is an illustrative one, we focus on a simple specification of the
fixed-effects model for log-wages that includes indicator variables for marital (1 if married) and
union status (1 if union member) and a quadratic specification in experience (measured in years):
$$\ln(\text{wage})_{it} = \beta_1\,\text{married}_{it} + \beta_2\,\text{union}_{it} + \beta_3\,\text{exper}_{it} + \beta_4\,\text{exper}_{it}^2 + c_i + u_{it}.$$
Table 2 reports the results from several different estimators and hypothesis tests for this model.
13 Vella and Verbeek (1998) provide additional details on the choice of sample.
14 Every fourth observation from the original data was dropped, so the missingness mechanism is totally random and,
by itself, would not cause a violation of the strict exogeneity assumption.

86.59.13.237 on Mon, 11 Dec 2023 22:50:44 +00:00
[Table 2. Estimation results and hypothesis tests for the log-wage model; table body not recoverable.]
The first five columns of Table 2

first three columns provide the fu
optimal GMM
estimates as a baselin
optimal GMM) for the unbalanced p
by Assumption 2.1 are utilized. The
with previous studies. The FE point
in magnitude, as expected given th
(FE) estimator has standard errors
to lower number of observations.
in standard errors as compared to
estimates, standard errors are cut n
evidence of misspecification (p-v a
case has standard errors very simi
Note, however, that the unbalance
the implicit assumption that the
Assumption 2.1.
The final four columns of Table 2
relaxed to allow for correlation between error disturbances and future union status. Strict
exogeneity is maintained for marital status and experience, whereas sequential exogeneity is
maintained for union status. Four different estimators are considered. The 2SLS and optimal
GMM estimators from the modified Chamberlain approach are used. As a comparison, two
different IV estimators based upon a first-difference model are also presented. FD-1LAG is a
first-difference estimator where $\text{union}_{i,t-1}$ is used as an instrument for $\text{union}_{it} - \text{union}_{i,t-1}$, and
FD-2LAG is a first-difference estimator where $\text{union}_{i,t-1}$ and $\text{union}_{i,t-2}$ are used as instruments
for $\text{union}_{it} - \text{union}_{i,t-1}$.15 These two FD estimators, based upon 2SLS rather than efficient IV,
are used since they correspond closely to popular practice by empirical researchers.
Looking at the results, for either 2SLS or optimal GMM, the coefficient estimates and
standard errors for the non-union variables do not change much. As expected, the standard errors
on the union variable increase (e.g. going from 0.0130 to 0.0223 for optimal GMM estimation).
The difference in estimated union premium between the sequential-exogeneity optimal GMM
estimator (0.0963) and strict-exogeneity optimal GMM estimator (0.0628) is quite large in
magnitude. A back-of-the-envelope Hausman-type test has an associated z-statistic of roughly
$(0.0963 - 0.0628)/\sqrt{0.0223^2 - 0.0130^2} \approx 1.85$, meaning the difference is nearly statistically
significant at the 5% level.16 As for the strict exogeneity case, the optimal GMM estimator
exhibits substantial efficiency gains relative to the 2SLS estimator. The optimal GMM standard
errors are roughly one-third lower than the 2SLS standard errors.
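The back-of-the-envelope calculation quoted above can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, the variance-difference form assumes the strict-exogeneity estimator is efficient under the null hypothesis, and the p-value relies on the usual standard normal approximation.

```python
import math

# Union coefficient estimates and standard errors as quoted in the text.
b_seq, se_seq = 0.0963, 0.0223        # sequential-exogeneity optimal GMM
b_strict, se_strict = 0.0628, 0.0130  # strict-exogeneity optimal GMM

# Hausman-type variance of the difference:
# Var(less efficient estimator) - Var(more efficient estimator).
var_diff = se_seq**2 - se_strict**2
z = (b_seq - b_strict) / math.sqrt(var_diff)

# Two-sided p-value under a standard normal reference distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

The resulting z-statistic is close to 1.85, and the two-sided p-value sits slightly above the 5% threshold, which is consistent with the description "nearly statistically significant".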
The optimal GMM estimator also exhibits substantial efficiency gains relative to the FD-
1LAG and FD-2LAG estimators. The standard error on the union estimate for FD-1LAG is the
same as the 2SLS standard error. There is sensitivity in the standard error to the inclusion of
the additional lag, as the FD-2LAG standard error for the union estimate is substantially higher.
The standard errors for the non-union variables are also higher for the FD-1LAG and FD-2LAG
estimators as compared to even the 2SLS standard errors.
15 Note that the 'Experience' coefficient is identified by these estimators, whereas it is not by the FD estimator on the
full data. The reason is that missingness causes the first difference of experience to be equal to two for a quarter of the
observations.
16 I am grateful to a referee for pointing this out.

The conclusion from the overidentification test (n

the strict-exogeneity case. To test the additional or
exogeneity GMM, we conducted the EHS test. The E
of 0.508, providing little statistical evidence against
used for the strict-exogeneity GMM estimation. Fin
specification is soundly rejected across all estimator
ACKNOWLEDGMENTS
Financial support from the National Science Foundation (grant SES-092120

acknowledged. The author thanks Stephen Donald and seminar participants at Univ
Columbia for useful comments and Badi Baltagi and Peter Schmidt for useful motiv
Suggestions from the Editor and two anonymous referees greatly enhanced this p
code necessary to replicate the paper's empirical results are available from the journ
REFERENCES
Arellano, M. and S. Bond (1991). Some tests of specification for panel data: Monte Carlo e
application to employment equations. Review of Economic Studies 58, 277-97.
Bach, S. H., C. Dahl and J. T. Kristensen (2008). Headlights on tobacco road to low birthwe
evidence from a battery of quantile regression estimators and a heterogeneous panel. Rese
2008-20, CREATES, Aarhus University.
Baltagi, B. H. and S. H. Song (2006). Unbalanced panel data: a survey. Statistical Papers 4
Bowsher, C. G. (2002). On testing overidentifying restrictions in dynamic panel data mo
Letters 77, 211-20.
Chamberlain, G. (1982). Multivariate regression models for panel data. Journal of Econom
Eichenbaum, M., L. P. Hansen and K. J. Singleton (1988). A time series analysis of repre
models of consumption and leisure choice under uncertainty. Quarterly Journal of E
51-78.
Hall, A. R. (2005). Generalized Method of Moments. New York: Oxford University Press.
Imbens, G. and J. M. Wooldridge (2007). Linear panel data models. What's New in Econometrics , Summer
Institute 2008 Lectures, National Bureau of Economic Research. See http://www.nber.org/WNE/
Lect_2Jinpanel.pdf.
Mundlak, Y. (1978). On the pooling of time series and cross sectional data. Econometrica 56, 69-86.
Vella, F. and M. Verbeek (1998). Whose wages do unions raise? a dynamic model of unionism and wage
rate determination for young men. Journal of Applied Econometrics 13, 163-83.
Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT
Press.
Wooldridge, J. M. (2009). Correlated random effects models with unbalanced panels. Working paper,
Michigan State University.
APPENDIX
Estimation with fewer moments: The approach of subsection 3.1 considers separate o
conditions for each value of $T_i$, with associated moment functions in (3.6) and (3.7). The res

of moments ($13k + 5$) is a significant

there are many covariates ( k large), th
the orthogonality conditions by using mo
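\[ \begin{bmatrix} 1 \\ s_{i1}x_{i1}' \\ s_{i2}x_{i2}' \\ s_{i3}x_{i3}' \end{bmatrix} u_{it}(y_i, X_i, s_i) \quad \text{for } t = 1, 2, 3, \tag{A.1} \]
where
\[ u_{it}(y_i, X_i, s_i) = \begin{cases} y_{it} - x_{it}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3 & \text{for } T_i = 3, \\ y_{it} - x_{it}\beta - \psi^* - x_{i1}\lambda_1^* - x_{i2}\lambda_2^* & \text{for } T_i = 2. \end{cases} \]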
This formulation has $9k + 3$ moments and $6k + 2$ parameters. The consolidation of orthogonality conditions
will generally result in a loss of efficiency relative to the original GMM estimator: optimal weighting for
that estimator will provide efficiency gains when the form of heteroscedasticity and/or serial correlation
depends upon $T_i$. The overidentification test for the GMM estimator based upon (A.1) would test
orthogonality unconditionally, whereas the original overidentification test tests orthogonality conditional
on $T_i$. The formulation in (A.1) may still be preferable in cases where $k$ and/or $T$ are large. The idea of
consolidating orthogonality conditions can also be used in a similar way for $s_i$-dependent GMM estimation
(Section 3.3).
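The moment and parameter counts can be tallied with a short sketch. The decomposition used here (a constant plus the covariates of every observed period as instruments for each observed period's error) is an assumption chosen to reproduce the totals stated in the text; the function name `moment_counts` is hypothetical.

```python
def moment_counts(k: int) -> dict:
    """Count moments and parameters for the T = 3 case in which only t = 3 may be missing."""
    # Separate (T_i-dependent) conditions: each observed period's error is paired
    # with a constant and the covariates of all periods observed for that unit.
    full = 3 * (1 + 3 * k)      # T_i = 3: three periods, instruments (1, x_i1, x_i2, x_i3)
    partial = 2 * (1 + 2 * k)   # T_i = 2: two periods, instruments (1, x_i1, x_i2)
    separate = full + partial   # equals 13k + 5

    # Consolidated conditions: one (1 + 3k)-vector of instruments per period.
    consolidated = 3 * (1 + 3 * k)  # equals 9k + 3

    # Parameters: beta (k), two intercepts, and projection coefficients (3k + 2k).
    params = k + 2 + 3 * k + 2 * k  # equals 6k + 2

    return {
        "separate": separate,
        "consolidated": consolidated,
        "params": params,
        "overid_separate": separate - params,
        "overid_consolidated": consolidated - params,
    }


counts = moment_counts(k=4)
print(counts)
```

With the four covariates of the empirical illustration, the consolidated formulation uses noticeably fewer moments, at the cost of fewer overidentifying restrictions to test.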
Proof of Proposition 3.1: It is well-known that (a) and (d) are equivalent even in unbalanced panels. The
other equivalence results are most easily seen in a partitioned regression framework. For the pooled OLS in
(c), the regression of $x_{it}$ upon the other partition yields fitted values $\bar{x}_i$ for both $s_{i3} = 1$ cross-sectional units
and $s_{i3} = 0$ cross-sectional units and, thus, residuals $x_{it} - \bar{x}_i$ for all observations. Therefore, (a) and (c) are
numerically equivalent. The Mundlak regression in (d) yields the same partitioned-regression residuals as
the pooled OLS (trivially $x_{it} - \bar{x}_i$ since $\bar{x}_i$ is part of the non-$x_{it}$ partition). For the 2SLS estimator in (b),
the first-stage regression is vacuous in the sense that the fitted values from the first stage are identical to the
original covariates (specifically, $Z(Z'Z)^{-1}Z'X = X$). Then, the 2SLS estimator is immediately equivalent
to the pooled OLS estimator in (c). □
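The equivalence between the fixed-effects estimator (a) and the Mundlak regression (d) in an unbalanced panel can be checked numerically. The sketch below uses simulated data; the data-generating process, sample sizes, and variable names are all assumptions made for illustration and are not taken from the paper's application.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 300, 3, 2

# Simulated panel in which the unit effect c_i is correlated with the covariates.
x = rng.normal(size=(n, T, k))
c = x.mean(axis=1).sum(axis=1) + rng.normal(size=n)
beta = np.array([1.0, -0.5])
y = x @ beta + c[:, None] + rng.normal(size=(n, T))

# Selection indicators s_it: period 3 is missing for roughly a quarter of units.
s = np.ones((n, T), dtype=bool)
s[rng.random(n) < 0.25, 2] = False

# Unit-specific means computed over observed periods only.
T_i = s.sum(axis=1)
x_bar = (x * s[:, :, None]).sum(axis=1) / T_i[:, None]
y_bar = (y * s).sum(axis=1) / T_i

# (a) Within (fixed-effects) estimator using observed rows.
x_dm = (x - x_bar[:, None, :])[s]
y_dm = (y - y_bar[:, None])[s]
beta_fe = np.linalg.lstsq(x_dm, y_dm, rcond=None)[0]

# (d) Mundlak regression: pooled OLS of y_it on (1, x_it, x_bar_i).
x_bar_rep = np.broadcast_to(x_bar[:, None, :], x.shape)[s]
W = np.column_stack([np.ones(s.sum()), x[s], x_bar_rep])
beta_mundlak = np.linalg.lstsq(W, y[s], rcond=None)[0][1:1 + k]

print(beta_fe, beta_mundlak)
```

The two coefficient vectors agree up to floating-point error, which reflects the partitioned-regression argument: regressing $x_{it}$ on a constant and $\bar{x}_i$ leaves the residual $x_{it} - \bar{x}_i$.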
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this artic
publisher's web site:
Replication Files Data and Code
