Abrevaya Projectionapproachunbalanced 2013
Royal Economic Society and Oxford University Press are collaborating with JSTOR to digitize,
preserve and extend access to The Econometrics Journal
Jason Abrevaya†
† Department of Economics, The University of Texas at Austin, 2225 Speedway Stop
C3100, Austin, TX 78712, USA.
E-mail: abrevaya@eco.utexas.edu
First version received: August 2010; final version accepted: September 2012
Summary The Chamberlain projection approach, a powerful tool for the analysis of linear
fixed-effects models, was introduced within the context of balanced panels. This paper extends
the Chamberlain projection approach to unbalanced panels. The extension is especially useful
for models with sequential exogeneity, where existing control-variable approaches are not
applicable. A generalized method of moments (GMM) estimation framework is considered,
and hypothesis tests (testing strict exogeneity, testing random effects, etc.) are discussed within
the GMM context.
1. INTRODUCTION
1 Baltagi and Song (2006) give an extensive survey of the econometrics literature on unbalanced panels.
2 These remain applicable as long as the missingness is not informative about the error disturbances, an
in Section 2.
© 2013 The Author(s). The Econometrics Journal © 2013 Royal Economic Society.
2. THE MODEL
where T represents the maximum number of time periods that a cross-sectional unit i would
be observed. As in Chamberlain (1982), we consider the case of large-$n$ ($n \to \infty$), fixed-$T$
asymptotics. Let $k$ denote the number of covariates in $x_{it}$ (so that $\beta_0$ is a $k$-vector). The
unbalanced nature of the panel is introduced by allowing time periods to be 'missing'. Following
the notation of Wooldridge (2002, subsection 17.7.1), an observability indicator is defined as
$$s_{it} = \begin{cases} 1 & \text{if } (y_{it}, x_{it}) \text{ is observed,} \\ 0 & \text{otherwise.} \end{cases}$$
For missing observations ($s_{it} = 0$), we use the convention that $x_{it} = 0$ and $y_{it} = 0$. With
$T_i = \sum_{t=1}^{T} s_{it}$ denoting the total number of time periods observed for cross-sectional unit $i$, the
transformations (used for the standard fixed-effects (or within) estimator of (2.1)) are
We w
no di
follow
To complete the set-up of the model, we introduce two assumptions. First, the strict
exogeneity assumption is given by:
This assumption allows observability ($s_i$) to be related in arbitrary ways with $x_i$ and $c_i$,
but restricts the error disturbances ($u_i$) to be conditionally mean independent of $(x_i, c_i, s_i)$.5
Assumption 2.1 is a stronger version of the usual strict exogeneity assumption, as it maintains
strict exogeneity of observability (selection) in addition to strict exogeneity of covariates.
Observability of $(y_{it}, x_{it})$ cannot be systematically related to any of the disturbances $u_{is}$
($s = 1, \ldots, T$) once unobserved heterogeneity and observables are controlled for.
4 If the missingness mechanism causes only a single observation to remain for a cross-sectional unit, we simply assume
that this unit has already been dropped from the observed data. Note that this implicit assumption would not require
anything stronger than Assumption 2.1 below, which already assumes that the error disturbances are conditionally mean
independent of the missingness indicators.
5 This assumption is identical to the assumption made by Wooldridge (2002) in his textbook treatment of unbalanced
panel data and also in Wooldridge (2009).
Proposition 2.1 (Equivalence Result for Balanced Panels and No Missingness). The
following estimators of $\beta_0$ are numerically equivalent: (a) the within estimator (OLS of time-demeaned $y_{it}$
on time-demeaned $x_{it}$); (b) the Chamberlain regression estimator (OLS of $y_{it}$ on $(x_{it}, 1, x_{i1}, \ldots, x_{iT})$); (c) the
Mundlak regression estimator (OLS of $y_{it}$ on $(x_{it}, 1, \bar{x}_i)$); (d) the least-squares dummy variable
(LSDV) regression (OLS of $y_{it}$ on $(x_{it}, d1_i, d2_i, \ldots, dN_i)$, where $dj_i$ is an indicator variable
equal to one if $i = j$ and zero if $i \neq j$).
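The numerical equivalence in Proposition 2.1 can be checked on simulated data. The following is a minimal sketch rather than the paper's own code: the sample size, the data-generating process and the helper name `first_coef` are all assumptions chosen for illustration, with a scalar covariate and $T = 3$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 40, 3                                   # hypothetical balanced panel
x = rng.normal(size=(n, T))                    # scalar covariate x_it
c = x.mean(axis=1) + rng.normal(size=n)        # heterogeneity correlated with x (assumed design)
y = 1.0 * x + c[:, None] + rng.normal(size=(n, T))   # beta_0 = 1

def first_coef(X, y):
    """Coefficient on the first column from OLS of y on X."""
    return np.linalg.lstsq(X, y, rcond=None)[0][0]

# Stack unit by unit: (i=0,t=0), (i=0,t=1), (i=0,t=2), (i=1,t=0), ...
y_long = y.reshape(-1)
x_long = x.reshape(-1)
ones = np.ones(n * T)

# (a) within estimator: OLS on time-demeaned data
xd = (x - x.mean(axis=1, keepdims=True)).reshape(-1)
yd = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
b_within = (xd @ yd) / (xd @ xd)

# (b) Chamberlain regression: controls are (1, x_i1, ..., x_iT)
X_cham = np.column_stack([x_long, ones, np.repeat(x, T, axis=0)])
b_cham = first_coef(X_cham, y_long)

# (c) Mundlak regression: controls are (1, xbar_i)
X_mund = np.column_stack([x_long, ones, np.repeat(x.mean(axis=1), T)])
b_mund = first_coef(X_mund, y_long)

# (d) LSDV: one dummy per cross-sectional unit, no separate intercept
D = np.kron(np.eye(n), np.ones((T, 1)))
b_lsdv = first_coef(np.column_stack([x_long, D]), y_long)

print(b_within, b_cham, b_mund, b_lsdv)
```

The four printed numbers agree up to floating-point error, because in each regression the part of $x_{it}$ left after partialling out the unit-level controls is the within deviation $x_{it} - \bar{x}_i$.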
When missingness causes the panel to be unbalanced, the Chamberlain approach is no longer
directly applicable. If a researcher blindly attempts to apply the projection (e.g. by plugging in
zeros for xit in periods without data), this will generally cause inconsistency of the resulting
estimators. We provide a simple three-period ( T = 3) example with scalar covariates to illustrate
this point. Consider the model
$c_i = x_{i3}$,
where $E(x_i) = 0$ and $\mathrm{var}(x_i) = I_3$. That is, the covariates are each mean zero with unit variance
and no serial correlation. The third period is missing for unit $i$ with probability $p$. (When the
third period is missing, it is still the case that $c_i = x_{i3}$ but $(y_{i3}, x_{i3})$ are just not observed.) The
'blind' Chamberlain approach in this situation would be a pooled OLS regression with the $x_{it}$'s as
control variables but $x_{i3} = 0$ used in the case of a missing observation. To easily evaluate the
probability limit of the Chamberlain pooled OLS estimator, we insert a row of zeroes (so as not
to otherwise affect estimation) for missing $t = 3$ observations. Then, the Chamberlain covariate matrix is
6 Note that Assumption 2.2 is somewhat stronger than the assumption made by Wooldridge (2002, Assumption 17
Whereas the latter assumption is sufficient for identification of the within estimator, the former assumption is needed to
guarantee identification of all parameters within the Chamberlain projection approach.
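$$X_i = \begin{bmatrix} x_{i1} & 1 & x_{i1} & x_{i2} & 0 \\ x_{i2} & 1 & x_{i1} & x_{i2} & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{2.4}$$
for $i$ with missing $t = 3$. The probability limit of the pooled OLS estimator is
$E(X_i'X_i)^{-1}E(X_i'y_i)$. The distributional assumption on $x_i$ and the model for $y_{it}$ yield
$$E(X_i'X_i) = (1 - p)\begin{bmatrix} 3 & 0 & 1 & 1 & 1 \\ 0 & 3 & 0 & 0 & 0 \\ 1 & 0 & 3 & 0 & 0 \\ 1 & 0 & 0 & 3 & 0 \\ 1 & 0 & 0 & 0 & 3 \end{bmatrix} + p\begin{bmatrix} 2 & 0 & 1 & 1 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 1 & 0 & 2 & 0 & 0 \\ 1 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
and
$$E(X_i'y_i) = (1 - p)\begin{bmatrix} 4 \\ 0 \\ 1 \\ 1 \\ 4 \end{bmatrix} + p\begin{bmatrix} 3 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}.$$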
We focus on the model's $\beta$ parameter (equal to one in our design), corresponding to the first
component of the pooled OLS estimator. Figure 1 graphs the probability limit of the Chamberlain
estimator of $\beta$ versus the probability of $t = 3$ missingness ($p$). With no missingness ($p = 0$), the
Chamberlain estimator is consistent and has probability limit equal to one. The inconsistency
worsens as $p$ increases, with a probability limit that reaches twice the true value when
$p$ gets close to one.7
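The probability limit discussed here can be evaluated directly from the two moment expressions for $E(X_i'X_i)$ and $E(X_i'y_i)$. The sketch below transcribes those matrices as printed in this example (rows fixed by symmetry and by the unit-variance, no-correlation design are filled in accordingly) and solves the linear system exactly with rational arithmetic. The function names `solve` and `plim_beta` are illustrative, not from the paper.

```python
from fractions import Fraction as F

# E(X_i'X_i) and E(X_i'y_i) for complete units (A0, b0) and for units
# with t = 3 missing (A1, b1), transcribed from the example in the text.
A0 = [[3, 0, 1, 1, 1], [0, 3, 0, 0, 0], [1, 0, 3, 0, 0], [1, 0, 0, 3, 0], [1, 0, 0, 0, 3]]
A1 = [[2, 0, 1, 1, 0], [0, 2, 0, 0, 0], [1, 0, 2, 0, 0], [1, 0, 0, 2, 0], [0, 0, 0, 0, 0]]
b0 = [4, 0, 1, 1, 4]
b1 = [3, 0, 1, 1, 0]

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination on Fractions."""
    m = len(b)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(m):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, m) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(m):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

def plim_beta(p):
    """First component of E(X'X)^{-1} E(X'y) for missingness probability p < 1."""
    p = F(p)
    A = [[(1 - p) * a0 + p * a1 for a0, a1 in zip(r0, r1)] for r0, r1 in zip(A0, A1)]
    b = [(1 - p) * v0 + p * v1 for v0, v1 in zip(b0, b1)]
    return solve(A, b)[0]

for p in (F(0), F(1, 4), F(1, 2), F(3, 4), F(99, 100)):
    print(p, float(plim_beta(p)))
```

With these inputs the value is exactly one at $p = 0$ and rises towards two as $p$ approaches one, in line with the description of Figure 1. The function is only defined for $p < 1$, since the combined matrix is singular at $p = 1$.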
Interestingly, as Wooldridge (2009) notes, the Mundlak regression (with $\bar{x}_i = T_i^{-1} \sum_t s_{it}x_{it}$)
remains consistent and numerically identical to the within estimator in the presence of
unbalanced panels.8 Unfortunately, while fine for estimation purposes under strict exogeneity,
the Mundlak approach does not offer the researcher the same flexibility for handling (and testing)
less stringent assumptions like sequential exogeneity.
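The claim that the Mundlak regression stays numerically identical to the within estimator in an unbalanced panel can be illustrated with a short simulation. This is a sketch under assumed settings (scalar covariate, $T = 3$, a random quarter of units losing one period); none of the variable names come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 200, 3                                   # hypothetical panel dimensions
x = rng.normal(size=(n, T))
c = x[:, 2] + rng.normal(size=n)                # heterogeneity correlated with x (assumed design)
y = x + c[:, None] + rng.normal(size=(n, T))    # beta_0 = 1

# Observability indicators s_it: some units lose one randomly chosen period.
s = np.ones((n, T))
rows = np.flatnonzero(rng.random(n) < 0.25)
s[rows, rng.integers(0, T, size=rows.size)] = 0

Ti = s.sum(axis=1)                              # number of observed periods per unit
xbar = (s * x).sum(axis=1) / Ti                 # averages over observed periods only
ybar = (s * y).sum(axis=1) / Ti

obs = s.reshape(-1) == 1                        # keep observed rows only

# Within estimator on the unbalanced panel
xd = (x - xbar[:, None]).reshape(-1)[obs]
yd = (y - ybar[:, None]).reshape(-1)[obs]
b_within = (xd @ yd) / (xd @ xd)

# Mundlak regression: y_it on (x_it, 1, xbar_i) over observed rows
X_mund = np.column_stack([x.reshape(-1), np.ones(n * T), np.repeat(xbar, T)])[obs]
b_mund = np.linalg.lstsq(X_mund, y.reshape(-1)[obs], rcond=None)[0][0]

print(b_within, b_mund)
```

The two estimates coincide because $x_{it} - \bar{x}_i$, summed over a unit's observed periods, is orthogonal to both the constant and $\bar{x}_i$.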
7 Note that $E(X_i'X_i)$ is not invertible when $p = 1$ but is otherwise non-singular for $p < 1$. The probability limits
reported in Figure 1 were calculated for values of $p$ strictly less than 1.
8 Others have recommended the use of covariate averages as control variables to handle unbalanced panels in
linear models (Wooldridge 2002, 2009) and quantile regression models (Bach et al., 2008).
Recall that $T_i \geq 2$ for all $i$, so that $T_i$ is either two (one missing period) or three (no missing
periods) here.
The possible $s_i$ values for the $T_i = 2$ observations are $(0, 1, 1)'$, $(1, 0, 1)'$ and $(1, 1, 0)'$. There
are two possible methods for dealing with the $T_i = 2$ subsample: (a) combining all possible
configurations into a single projection method or (b) handling each possible $s_i$ configuration
as a separate projection method. Method (a) has the virtue of being simpler to implement
and requiring fewer moment conditions for estimation/testing purposes. By treating different $s_i$
configurations differently, method (b) offers the researcher the ability to check the sensitivity of
the model specification to violations of strict exogeneity that depend upon which time periods are
missing (and not just how many time periods are missing). These two methods, which we call
$T_i$-dependent projections and $s_i$-dependent projections, respectively, are described in Sections 3.1
and 3.3.
To use a single method for all $s_i$ configurations for $T_i = 2$ observations, the easiest approach is
to simply shift the time indices for unit $i$ such that $s_i = (1, 1, 0)'$. That is, the last time period
($t = 3$) is made to be missing for each $T_i = 2$ observation.9 The shift of the time indices should
be done in such a way that the ordering of time is maintained , which allows for a treatment of
9 Note that this relabelling of the time indices does not affect estimation of time dummies (or other $t$-dependent
covariates) since we are not changing any of the covariate values themselves.
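The relabelling of time indices for $T_i = 2$ units can be sketched as follows. This is an illustrative helper, not the paper's code: the name `shift_to_front` is hypothetical, the covariate is taken to be scalar, and missing periods follow the zero convention used for $x_{it}$ and $y_{it}$.

```python
def shift_to_front(s_i, x_i, y_i):
    """Relabel time so that observed periods come first, keeping their order.

    Missing periods are moved to the end and carry the zero convention.
    Covariate and outcome values travel with their observation, so only
    the time labels change.
    """
    observed = [t for t, s in enumerate(s_i) if s == 1]
    n_missing = len(s_i) - len(observed)
    new_s = [1] * len(observed) + [0] * n_missing
    new_x = [x_i[t] for t in observed] + [0.0] * n_missing
    new_y = [y_i[t] for t in observed] + [0.0] * n_missing
    return new_s, new_x, new_y

# A unit observed in periods 1 and 3 only, i.e. s_i = (1, 0, 1)'
print(shift_to_front([1, 0, 1], [0.5, 0.0, -1.2], [2.0, 0.0, 3.1]))
```

After the shift every $T_i = 2$ unit has $s_i = (1, 1, 0)'$, and the earlier of its two observed periods is still labelled before the later one.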
and
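$$y_{it} = x_{it}\beta_0 + \psi_0^2 + x_{i1}\lambda_{01}^2 + x_{i2}\lambda_{02}^2 + a_i^2 + u_{it} \quad \text{for } t = 1, 2 \text{ and } T_i = 2. \tag{3.5}$$
For (3.4) and (3.5), the composite error disturbances $a_i + u_{it}$ and $a_i^2 + u_{it}$, respectively, are
uncorrelated with the regressors due to Assumption 2.1 and the linear projections. These
orthogonality conditions can naturally be represented as moment conditions to allow for GMM
estimation. To simplify notation, let $\theta = (\beta, \psi, \lambda_1, \lambda_2, \lambda_3, \psi^2, \lambda_1^2, \lambda_2^2)$ denote the full vector of
parameters and $\theta_0 = (\beta_0, \psi_0, \lambda_{01}, \lambda_{02}, \lambda_{03}, \psi_0^2, \lambda_{01}^2, \lambda_{02}^2)$ the true parameters. (The superscript 2 is an index marking the $T_i = 2$ projection, not a power.)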
Specifically, consider the following set of moment functions corresponding to the
orthogonality conditions for (3.4) and (3.5):
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3) \quad \text{for } t = 1, 2, 3, \tag{3.6}$$
$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^2 - x_{i1}\lambda_1^2 - x_{i2}\lambda_2^2) \quad \text{for } t = 1, 2. \tag{3.7}$$
Let $g(z_i, \theta)$ denote the stacked vector of all these functions, where $z_i = (y_i, x_i, s_i)$. There are a
total of $13k + 5$ moment conditions and $6k + 2$ parameters ($7k + 3$ overidentifying restrictions).
In contrast, the usual situation with no missing data would be based solely upon (3.6), with $9k + 3$
moment conditions and $4k + 1$ parameters ($5k + 2$ overidentifying restrictions).
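To make the stacking of (3.6) and (3.7) concrete, the following sketch builds the moment vector for a single cross-sectional unit with $T = 3$ and checks its length against the $13k + 5$ count. The function name `g_Ti` and its argument layout are assumptions for illustration; the paper does not prescribe an implementation.

```python
import numpy as np

def g_Ti(y_i, x_i, s_i, beta, psi, lam, psi2, lam2):
    """Stacked moment functions (3.6)-(3.7) for one unit, T = 3.

    x_i  : (3, k) array of covariates (zeros in missing periods)
    y_i  : length-3 array of outcomes (zeros in missing periods)
    s_i  : length-3 array of observability indicators
    lam  : (3, k) projection coefficients for complete units
    lam2 : (2, k) projection coefficients for T_i = 2 units
    """
    full = s_i[0] * s_i[1] * s_i[2]                            # complete unit
    two = s_i[0] * s_i[1] * (1 - s_i[2])                       # last period missing
    inst3 = np.concatenate([[1.0], x_i[0], x_i[1], x_i[2]])    # 3k + 1 instruments
    inst2 = np.concatenate([[1.0], x_i[0], x_i[1]])            # 2k + 1 instruments
    parts = []
    for t in range(3):                                         # moments in (3.6)
        resid = y_i[t] - x_i[t] @ beta - psi - np.sum(x_i * lam)
        parts.append(full * inst3 * resid)
    for t in range(2):                                         # moments in (3.7)
        resid = y_i[t] - x_i[t] @ beta - psi2 - np.sum(x_i[:2] * lam2)
        parts.append(two * inst2 * resid)
    return np.concatenate(parts)

# Example: a T_i = 2 unit with k = 2 covariates
k = 2
rng = np.random.default_rng(0)
x_i, y_i = rng.normal(size=(3, k)), rng.normal(size=3)
s_i = np.array([1, 1, 0])
x_i[2], y_i[2] = 0.0, 0.0                                      # zero convention for the missing period
g = g_Ti(y_i, x_i, s_i, np.ones(k), 0.0, np.zeros((3, k)), 0.0, np.zeros((2, k)))
print(g.shape)
```

For a $T_i = 2$ unit the first $9k + 3$ entries are identically zero, so only the (3.7) block contributes to the sample moments.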
Note that we have additional overidentifying restrictions from the proposed orthogonality
conditions as compared to the non-missing case. These additional restrictions arise since the
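$$Z_i \equiv \begin{bmatrix} Z_{i1} & 0 & 0 \\ 0 & Z_{i2} & 0 \\ 0 & 0 & Z_{i3} \end{bmatrix}, \tag{3.9}$$
where
$$Z_{i2} = \begin{bmatrix} s_{i3} & s_{i3}x_{i1} & s_{i3}x_{i2} & s_{i3}x_{i3} & (1 - s_{i3}) & (1 - s_{i3})x_{i1} & (1 - s_{i3})x_{i2} \end{bmatrix}, \tag{3.11}$$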
Then, defining
$$\hat{\theta} = (X'ZZ'X)^{-1}(X'ZZ'Y), \tag{3.15}$$
w^n-'¿g(zi,0
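The system 2SLS estimator has $W_{2SLS} = (Z'Z/n)^{-1}$, so
$$\hat{\theta}_{2SLS} = (X'ZW_{2SLS}Z'X)^{-1}(X'ZW_{2SLS}Z'Y),$$
$$\hat{W} = \Big( n^{-1}\sum_{i=1}^{n} g(z_i, \hat{\theta}_{2SLS})\, g(z_i, \hat{\theta}_{2SLS})' \Big)^{-1}.$$
The optimal GMM estimator can be obtained directly as $\hat{\theta} = (X'Z\hat{W}Z'X)^{-1}(X'Z\hat{W}Z'Y)$.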
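The two-step procedure (system 2SLS followed by optimally weighted GMM) has a closed form in this linear setting. The sketch below is one way to code it, under several assumptions: data are held as arrays of shape `(n, T, ·)`, the 2SLS weight is the usual $(Z'Z/n)^{-1}$, and the overidentification statistic uses the standard Hansen form $n\,\bar{g}'\hat{W}\bar{g}$. The function name `two_step_gmm` is hypothetical.

```python
import numpy as np

def two_step_gmm(X, Z, Y):
    """Linear two-step GMM for stacked panel data.

    X : (n, T, p) regressors, Z : (n, T, q) instruments, Y : (n, T) outcomes.
    Returns the 2SLS estimate, the optimal GMM estimate and the J statistic.
    """
    n = X.shape[0]
    ZX = np.einsum("itq,itp->qp", Z, X)          # Z'X summed over units and periods
    ZY = np.einsum("itq,it->q", Z, Y)            # Z'Y
    ZZ = np.einsum("itq,itr->qr", Z, Z)          # Z'Z

    def estimate(W):
        # (X'Z W Z'X)^{-1} X'Z W Z'Y
        return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ ZY)

    def unit_moments(theta):
        # g(z_i, theta) = Z_i'(Y_i - X_i theta), one row per unit
        resid = Y - np.einsum("itp,p->it", X, theta)
        return np.einsum("itq,it->iq", Z, resid)

    theta_2sls = estimate(np.linalg.inv(ZZ / n))
    g = unit_moments(theta_2sls)
    W_opt = np.linalg.inv(g.T @ g / n)           # inverse of the moment second-moment matrix
    theta_opt = estimate(W_opt)

    gbar = unit_moments(theta_opt).mean(axis=0)
    J = n * gbar @ W_opt @ gbar                  # Hansen-type overidentification statistic
    return theta_2sls, theta_opt, J
```

A quick sanity check is the exactly identified case $Z = X$: both estimates then reduce to pooled OLS for any weighting matrix, and the J statistic is zero up to rounding.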
3.1.1. Test of overidentifying restrictions. If the moment conditions are correctly specified
(which will occur if the model itself is correctly specified and Assumption 2.1 holds), then
the optimal GMM estimator $\hat{\theta}$ can be used directly for a test of overidentifying restrictions.
Specifically, under correct specification
$y_{it} = x_{it} + c_i + u_{it}$ ($t = 1, 2$),
$c_i = x_{i1}$,
where the violation of strict exogeneity comes from a strong form of feedback
$u_{i1}$. This model satisfies the following sequential-exogeneity assumption
The Mundlak and Chamberlain projections would lead to corresponding 'residuals' given
by $y_{it} - \beta x_{it} - \lambda\bar{x}_i$ and $y_{it} - \beta x_{it} - \lambda_1 x_{i1} - \lambda_2 x_{i2}$, respectively. The Mundlak residual is not
guaranteed to be orthogonal to $x_{i1}$ or $x_{i2}$ for either $t = 1$ or $t = 2$. In contrast, the Chamberlain
residual is orthogonal to $x_{i1}$ and $x_{i2}$ for $t = 2$ and orthogonal to $x_{i1}$ for $t = 1$. We conducted a
simple GMM exercise using the model above, drawing $x_{i1}$, $u_{i1}$ and $u_{i2}$ as independent standard
normal random variables (and $c_i$, $x_{i2}$, $y_{i1}$, $y_{i2}$ resulting from the specification). We used a sample
size of $n = 1{,}000{,}000$ to ensure precision. Table 1 reports the results from the exercise. Four sets
of (efficient) GMM estimates are reported, corresponding to Mundlak and Chamberlain under
strict exogeneity and sequential exogeneity. For strict exogeneity, both covariates are used in the
orthogonality conditions in both time periods. For sequential exogeneity, $x_{i2}$ is dropped from
the first time period's orthogonality conditions. Both estimators are clearly inconsistent when
strict exogeneity is incorrectly assumed, with the Chamberlain estimator actually performing
worse than the Mundlak estimator. When sequential exogeneity is correctly assumed, however,
the Chamberlain estimator works perfectly and recovers both the true $\beta$ parameter and the correct
projection parameters. The Mundlak estimator remains inconsistent.
Returning to the T = 3
one is interested in impo
easily restore the mome
exogeneity, incorporati
exogeneity assumption E(u
The error disturbance is
and lagged covariates but
back effects in the dyna
observability is still restri
Note that only a subset o
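sequential exogeneity, spe
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{i3} - x_{i3}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3), \tag{3.17}$$
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{i2} - x_{i2}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3), \tag{3.18}$$
$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{i2} - x_{i2}\beta - \psi^2 - x_{i1}\lambda_1^2 - x_{i2}\lambda_2^2). \tag{3.20}$$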
$$\hat{\theta}_{seq,2SLS} = (X'ZW_{2SLS}Z'X)^{-1}(X'ZW_{2SLS}Z'Y),$$
$$\hat{\theta}_{seq} = (X'Z\hat{W}_{seq}Z'X)^{-1}(X'Z\hat{W}_{seq}Z'Y),$$
where
3.2.2. Test of strict exogeneity. To test the stronger assumption of strict exogeneity (Assumption
2.1) against the alternative of sequential exogeneity, one wants to test the validity of the extra
moments used for GMM estimation under strict exogeneity. There are several ways of performing
such a test, but a particularly simple method is the GMM-based approach of Eichenbaum
et al. (1988) (EHS hereafter).12 The EHS test statistic is simply the difference between the two
11 In the case with no missingness, there are $6k + 3$ moment functions and $4k + 1$ parameters.
12 See Hall (2005) for an excellent discussion of the EHS test and its asymptotic properties.
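$$J_{EHS} = J - J_{seq}.$$
3.3. $s_i$-dependent projections
$$s_{i1}s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi - x_{i1}\lambda_1 - x_{i2}\lambda_2 - x_{i3}\lambda_3) \quad \text{for } t = 1, 2, 3, \tag{3.29}$$
$$(1 - s_{i1})s_{i2}s_{i3}\begin{bmatrix} 1 \\ x_{i2}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^1 - x_{i2}\lambda_2^1 - x_{i3}\lambda_3^1) \quad \text{for } t = 2, 3, \tag{3.30}$$
$$s_{i1}(1 - s_{i2})s_{i3}\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i3}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^2 - x_{i1}\lambda_1^2 - x_{i3}\lambda_3^2) \quad \text{for } t = 1, 3, \tag{3.31}$$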
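$$s_{i1}s_{i2}(1 - s_{i3})\begin{bmatrix} 1 \\ x_{i1}' \\ x_{i2}' \end{bmatrix}(y_{it} - x_{it}\beta - \psi^3 - x_{i1}\lambda_1^3 - x_{i2}\lambda_2^3) \quad \text{for } t = 1, 2. \tag{3.32}$$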
There are $21k + 9$ moment functions within (3.29)-(3.32) and, therefore, $11k + 5$
overidentifying restrictions. For sequential exogeneity, one would drop the necessary
orthogonality conditions from the set above, similar to subsection 3.2.
To keep the notation simple, re-define $g(z_i, \theta)$ to be the stacked moment functions above
and $\hat{\theta}_{2SLS}$ and $\hat{\theta}$ to be the 2SLS and optimal GMM estimators, respectively. Similarly, re-define
$g_{seq}(z_i, \theta)$ to be the stacked moment functions under sequential exogeneity (removing a total of
$6k$ moments) and $\hat{\theta}_{seq,2SLS}$ and $\hat{\theta}_{seq}$ to be the corresponding 2SLS and optimal GMM estimators,
respectively. Finally, let $J$ and $J_{seq}$ denote the overidentification test statistics associated with
the optimal GMM estimators $\hat{\theta}$ and $\hat{\theta}_{seq}$, respectively. With this notation, it is straightforward to
extend the various tests introduced above. A Wald test of the random-effects specification (all $\lambda$
parameters being equal to zero) would have $9k$ degrees of freedom. The overidentification tests
based upon $J$ and $J_{seq}$ have $11k + 5$ and $5k + 5$ degrees of freedom, respectively. The EHS test
of the additional moment restrictions used under strict exogeneity, based upon the test statistic
$J - J_{seq}$, has $6k$ degrees of freedom.
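The degrees-of-freedom bookkeeping for the $s_i$-dependent approach with $T = 3$ can be reproduced by counting instruments and parameters block by block. The helper below is an illustrative sketch (the name `si_dependent_counts` and the dictionary keys are hypothetical); it assumes one complete-data block and three $T_i = 2$ configurations, each with its own intercept and projection coefficients.

```python
def si_dependent_counts(k):
    """Moment and parameter counts for s_i-dependent projections, T = 3."""
    # Complete units: instruments (1, x_i1, x_i2, x_i3) in each of 3 periods.
    full_moments = 3 * (3 * k + 1)
    # Three T_i = 2 configurations: instruments (1, two covariate vectors)
    # in each of the 2 observed periods.
    two_period_moments = 3 * 2 * (2 * k + 1)
    moments = full_moments + two_period_moments

    # beta, plus (psi, lambda_1..3) for complete units,
    # plus (psi, two lambdas) for each T_i = 2 configuration.
    parameters = k + (3 * k + 1) + 3 * (2 * k + 1)
    lambdas = 3 * k + 3 * 2 * k

    # Sequential exogeneity drops future covariates as instruments:
    # complete units lose 2k (t = 1) + k (t = 2); each T_i = 2
    # configuration loses k in its first observed period.
    dropped = (2 * k + k) + 3 * k

    return {
        "moments": moments,
        "parameters": parameters,
        "df_J": moments - parameters,
        "df_J_seq": moments - dropped - parameters,
        "df_EHS": dropped,
        "df_Wald_RE": lambdas,
    }

print(si_dependent_counts(1))
```

For any $k$ the counts agree with the figures in the text: $21k + 9$ moments, $11k + 5$ and $5k + 5$ degrees of freedom for $J$ and $J_{seq}$, $6k$ for the EHS test and $9k$ for the Wald test.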
The larger number of moment conditions for the $s_i$-dependent approach could yield efficiency
gains over the $T_i$-dependent approach. It is important to note, however, that even though there
are additional moments, each of these moments will have fewer associated observations within
the sample (i.e. observations for which the moment function is not trivially equal to zero). Any
efficiency gain would come from the fact that the form of residual heteroscedasticity and/or serial
correlation varies with $s_i$ even after conditioning on $T_i$. If this is not the case, the $T_i$-dependent
and $s_i$-dependent estimators should yield extremely similar results.
4. AN EMPIRICAL EXAMPLE
Table 2 reports the results from several different estimators and hypothesis tests for this model.
13 Vella and Verbeek (1998) provide additional details on the choice of sample.
14 Every fourth observation from the original data was dropped, so the missingness mechanism is totally random and,
by itself, would not cause a violation of the strict exogeneity assumption.
[Table 2. Results from several estimators and hypothesis tests for the empirical example. The table body is not recoverable from this copy.]
15 Note that the 'Experience' coefficient is identified by these estimators, whereas it is not by the FD estimator on the
full data. The reason is that missingness causes the first difference of experience to be equal to two for a quarter of the
observations.
16 I am grateful to a referee for pointing this out.
ACKNOWLEDGMENTS
REFERENCES
Arellano, M. and S. Bond (1991). Some tests of specification for panel data: Monte Carlo e
application to employment equations. Review of Economic Studies 58, 277-97.
Bach, S. H., C. Dahl and J. T. Kristensen (2008). Headlights on tobacco road to low birthwe
evidence from a battery of quantile regression estimators and a heterogeneous panel. Rese
2008-20, CREATES, Aarhus University.
Baltagi, B. H. and S. H. Song (2006). Unbalanced panel data: a survey. Statistical Papers 4
Bowsher, C. G. (2002). On testing overidentifying restrictions in dynamic panel data mo
Letters 77, 211-20.
Chamberlain, G. (1982). Multivariate regression models for panel data. Journal of Econom
Eichenbaum, M., L. P. Hansen and K. J. Singleton (1988). A time series analysis of repre
models of consumption and leisure choice under uncertainty. Quarterly Journal of E
51-78.
Hall, A. R. (2005). Generalized Method of Moments. New York: Oxford University Press.
Imbens, G. and J. M. Wooldridge (2007). Linear panel data models. What's New in Econometrics , Summer
Institute 2008 Lectures, National Bureau of Economic Research. See http://www.nber.org/WNE/
Lect_2Jinpanel.pdf.
Mundlak, Y. (1978). On the pooling of time series and cross sectional data. Econometrica 46, 69-86.
Vella, F. and M. Verbeek (1998). Whose wages do unions raise? A dynamic model of unionism and wage
rate determination for young men. Journal of Applied Econometrics 13, 163-83.
Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT
Press.
Wooldridge, J. M. (2009). Correlated random effects models with unbalanced panels. Working paper,
Michigan State University.
APPENDIX
Estimation with fewer moments: The approach of subsection 3.1 considers separate orthogonality
conditions for each value of $T_i$, with associated moment functions in (3.6) and (3.7). The res
where
This formulation has $9k + 3$ moments and $6k + 2$ parameters. The consolidation of orthogonality conditions
will generally result in a loss of efficiency relative to the original GMM estimator; optimal weighting for
that estimator will provide efficiency gains when the form of heteroscedasticity and/or serial correlation
depends upon $T_i$. The overidentification test for the GMM estimator based upon (A.1) would test
orthogonality unconditionally, whereas the original overidentification test tests orthogonality conditional
on $T_i$. The formulation in (A.1) may still be preferable in cases where $k$ and/or $T$ are large. Also, the idea to
consolidate orthogonality conditions can be used in a similar way for $s_i$-dependent GMM estimation
(Section 3.3).
Proof of Proposition 3.1: It is well-known that (a) and (d) are equivalent even in unbalanced panels. The
other equivalence results are most easily seen in a partitioned regression framework. For the pooled OLS in
(c), the regression of $x_{it}$ upon the other partition yields fitted values $\bar{x}_i$ for both $s_{i3} = 1$ cross-sectional units
and $s_{i3} = 0$ cross-sectional units and, thus, residuals $x_{it} - \bar{x}_i$ for all observations. Therefore, (a) and (c) are
numerically equivalent. The Mundlak regression in (d) yields the same partitioned-regression residuals as
the pooled OLS (trivially $x_{it} - \bar{x}_i$ since $\bar{x}_i$ is part of the non-$x_{it}$ partition). For the 2SLS estimator in (b),
the first-stage regression is vacuous in the sense that the fitted values from the first stage are identical to the
original covariates (specifically, $Z(Z'Z)^{-1}Z'X = X$). Then, the 2SLS estimator is immediately equivalent
to the pooled OLS estimator in (c). □
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article at the
publisher's web site: