Professional Documents
Culture Documents
Holtz Eakin1988 PDF
Holtz Eakin1988 PDF
North-Holland
Douglas HOLTZ-EAKIN*
Columbia Uniuersit.y, New York, NY I002 7, USA
This note describes a test for the presence of individual effects in dynamic models. The test
employs restrictions on sample moments implied by the absence of individual effects. Testing the
assumption of unobserved heterogeneity is desirable because estimation methods which control
for heterogeneity are typically more complicated and require more data than would be required
without individual effects. The test is easily implemented as the calculation of statistics follows
directly from a linear, instrumental variables estimation technique. The test is illustrated in the
context of estimating a dynamic wage equation using data from the Panel Study of Income
Dynamics.
1. Introduction
This note presents a simple, linear test for individual effects in dynamic
models using panel data. Econometric specifications which include individual
effects as a component of the stochastic disturbance have become increasingly
popular because these specifications permit the investigator to control for
unobserved heterogeneity which may otherwise contaminate estimates of the
parameters of interest. During estimation, the primary complication intro-
duced by this consideration is that the individual effects may be correlated
with the observed data for the right-side variables. In the context of dynamic
models, i.e., models with lagged dependent variables, this correlation is
guaranteed. If correlation between right-side regressors and individual effects
is present, consistent estimation of the parameters will not be possible using a
single cross-section of data. Moreover, there will be no internal evidence of the
problem. Through the use of panel data consistent parameter estimates may be
obtained.
The use of an individual effects specification is not, however, costless. First,
in the absence of individual effects a single cross-section is sufficient data to
*I am indebted to Joe Altonji, Bruce Lehmann and Whitney Newey for comments on previous
drafts and to Joe Altonji for generously providing the data. I particularly thank Whitney Newey
for detecting an error in the original version. This research was supported in part by the Council
for Research in the Humanities and Social Sciences at Columbia University. It is part of the
National Bureau of Economic Research Project on State and Local Government Finance.
298 D. Holtz-Eakin, Individual effects in autoregressive models
‘See Hsiao (1986, ch. 4) for a detailed discussion of both fixed and random effects estimation
methods. The hypothesis of individual effects is, however, maintained throughout.
D. Ho&z-Eakin, Individual effects in autoregressive models 299
2. Basic method
The basic notion is easily seen by considering the following simple autore-
gressive model:
Yi,r=PYi,t-1 +fr+Pi,t,
E{Y,,,-l~i,J =o.
In the presence of the f,, the orthogonality condition (2) will be violated.
Notice also that the use of an instrumental variable, say JJ_~, does not solve
the problem. In this case, the necessary condition is
and note that the individual effects are eliminated: E, f - E, ,_l = p,,, - p, r_l.
Due to the induced serial correlation in the error term and the presence of
‘In what follows, the term consistent applies to situations in which the size of the cross-section
grows (N + m), while the number of time periods, T, is held fixed.
3Additional right-side variables are easily accommodated. However, because the asymptotic bias
is due to the autoregressive structure, I focus on purely autoregressive models in what follows.
4As is well known, taking deviations from individual specific time means (the ‘within’ estimator)
will also eliminate the individual effect. However, in the presence of lagged values of the
dependent variable, this technique fails to yield consistent estimates. Throughout, I restrict the
discussion to the first-difference approach.
I.Econ-C
300 D. Holtr-Eakin, Individual effects in autoregressive models
E{Y,-,bi,t- E,,,-I)}= 0,
which is satisfied both in the presence and in the absence of individual effects.
The orthogonality conditions in eqs. (2) and (4) may be exploited to test for
the presence of individual effects. Consider a panel data set which contains
observations on the variable y for three periods. The simple AR(l) model in
eq. (1) may be estimated for the last two periods. Under the null hypothesis of
no individual effects, the following orthogonality conditions hold:
q Y&,3 - d) = 02
E{ Yi,l&i,Z 1 = ‘3
Notice that eqs. (6) impose the same restrictions as eqs. (5) and, thus, are
equivalent under the null hypothesis. On the other hand, condition (6a) will
also hold in the presence of individual effects and can be used to identify a
first-differenced estimator of j3 under the alternative hypothesis of individual
effects. The null hypothesis imposes only the two additional restrictions (6b)
and (6~) on the data. Intuitively, the test for individual effects is a test of
whether the sample moments corresponding to these restrictions are suffi-
ciently close to zero; contingent upon imposing (6a) to identify the parame-
ter /3.
To implement the test, first estimate the model under the null hypothesis. It
is conceptually simplest to think of each of the orthogonality conditions (6) as
identifying different ‘equations’ since cross-sectional variation may be used to
D. Hoh-Eakin, Individual effects in autoregressive models 301
identify the parameters. Thus, under the null, there is an equation in levels for
periods 2 and 3 and an equation in differences for period 3. Estimates of the
parameter, p, may be obtained using the method of moments. Efficient
estimation requires that they are estimated jointly with the restriction imposed
that the estimated value of p be the same for both periods. Using the
estimated parameter, sample residuals may be calculated and a cl-u-square test
statistic for whether the sample moments satisfy the orthogonality conditions
(6) computed. (See below for the details of the computations.)
To test for individual effects one needs to measure the marginal contribu-
tion of the conditions (6b) and (6~) after first imposing only (6a). Since (6a)
holds in the presence of individual effects, this amounts to a comparison of
estimation results under the null versus the alternative hypothesis. The only
complication is that the equations in levels are not estimable in the presence of
individual and the test must correctly account for the fact that some parame-
ters are not identified under the alternative hypothesis.
As the example makes clear, implementation of the test requires specifica-
tion of the dynamic structure of the model, consideration of the appropriate
orthogonality conditions, and computation of the parameter estimates and test
statistics. The next section considers these issues in greater detail,
Notice that only the equations for t > m may be estimated in order to
accommodate the lag distribution. As a result, there are T - m equations
containing the m parameters. The null hypothesis imposes R = [T(T - 1) -
m( m - 1)]/2 orthogonality conditions (ignoring constant terms) which may be
used both to identify the parameters of (7) and leaves (R - m) overidentifying
restrictions which may contribute to a test of the null hypothesis.
As in the example above, the conditions (8) may be reformulated in a
manner which sheds more light on the test. An equivalent set of R restrictions
‘HNR discuss the inference of m in great detail and allow both the lag parameters and the
individual effects to be time-varying. Here I restrict the discussion to the case of time-invariant
parameters. Also, while not shown, l’t is straightforward to include other variables on the right side
of eq. (7).
302 D. H&z-Eakin, Individual effects in autoregressive models
is
j=2,...,t-1,
E{ X&i,,- &,,,-A} = 03 (9a)
t=(rn+2),...,T,
t=(m+2),...,T.
The restrictions (9a) are the moment conditions needed to identify the
first-differenced estimator and hold under both the null and alternative hy-
potheses. The restrictions (9b) are those needed to estimate the equation
for period (m + 1) in levels using the previous m lags as instrumental
variables. Similarly, the restrictions (SC) are those needed to admit the first lag
of y as an instrumental variable in the estimation of the last (T- m - 1)
equations in levels.
Before considering the empirical implementation of these restrictions, it is
worth noting that the investigator may not wish to implement the full set of
orthogonality conditions. For example, in the empirical work below a panel
data set covering fourteen years is used (T = 14). Choosing even a relatively
long lag length, say m = 5, gives R = 81 orthogonality conditions. Included
among these are the covariance between yi,i and E;,~~. Covariances at such
long lags are likely to be quite small in practice, contribute little to sample
measures of covariance, and lower the power of tests.
Instead, consider imposing the orthogonality conditions on only the most
recent q covariances, i.e.:
j=l 4 if qlt-1,
E{Yl,t-jEl,tl =O,
j=l:::::r-1 if q>t-1, (10)
t=(m+l),...,T.
In this case, fewer restrictions are imposed, the total number being R = qT -
Uq(q + 1) + m(m - 1)1/2).
Once again, a reformulation of the moment conditions is desirable. Here
there are two cases. First, when q I m, the conditions are a straightforward
modification of those in (9):
j=2 ,..., 4,
E{ Yr.r-jCEi,fBEi,tpl)} =O? W4
t=(m+2),...,T,
t=(m+2),...,T. (114
D. Holiz-Eakin, Individual effects in autoregressive models 303
If, instead, q > m there are fewer than q conditions imposed on the first
(q - m - 1) equations, viz.:
j=2 ,...,t-1,
E{Yz,t-jbi,t- %-I)) = 0, 024
t=(m+2),...,q,
j=2 ,...,
4,
E,,t-I>}=o,
E{Y,,t-,(&,,,- (12b)
t=(q+l),...,T,
Notice that there will be more than one equation for most time periods due to
the use of both differenced and levels equations under the null hypothesis. For
example, in the model from section 2 the equations corresponding to (13) are
Next ‘stack’ the eqs. (14) to form a system. With this, the observations for eqs.
(14) may be written
Y= I@+,. 05)
fi=E{Z’&Z}. 08)
B = [ W’ZO-‘Z’W]-‘W’ZO-‘Z’Y. (20)
6To see this, let P be defined by P’P = s2-I. Then the transformed residuals - PZ’e/fi - are
asymptotically normally distributed with an identity covariance matrix. As usual, sums of squared
normal variables will be distributed as a chi-square.
D. H&z-Eakin, Individual effects VI autoregressive models 305
L=Q,-Q, (24
where Qa is the sum of squared residuals from the restricted estimation. L has
a &i-squared distribution with degrees of freedom equal to the degrees of
freedom of QR minus the degrees of freedom of Q. In this applications, QR is
the sum of squared residuals when imposing the full set of orthogonality
conditions implied by the null hypothesis, and Q is the sum of squared
residuals from imposing only those restrictions needed for the first differenced
versions.
Importantly, the same estimate of fi must be used in both computations.
Because not all the equations are identified under the alternative hypothesis, ti
is estimated under the null hypothesis. When calculating parameters and test
statistics under the alternative hypothesis, the inverse of the submatrix corre-
sponding to the differenced equations only is employed.’
4. Empirical example
‘HNR discuss this in greater detail, including the estimation of D in the presence of
heteroskedasticity.
“I thank Joe Altonji for suggesting this example of the test.
‘AltonJi and Paxson (1986) has a complete discussion of the data.
306 D. Holtz-Eakin, Individual effects in autoregressive models
Table 1
(Xi-square test resultsa
Log wage
L D.F.
Lag7=0 0.07 1
Lag6=0 0.07 1
Lag 5 = 0 0.09 1
Lag4=0 0.29 1
Lag 3 = 0 0.49 1
Lag 2 = 0 2.79d 1
No individual effects 98.08b 9
Lag 2 = 0,
in first differences 6.86b 1
m = 6, and differencing to eliminate the fj. As a result, equations for the years
1975 to 1981 are estimable.
Twice lagged values ( w,_~) of the dependent variable for each year were
used as an instrumental variables, the equations estimated, and an estimate of
the asymptotic covariance matrix 0 constructed. Following this, the equations
were successively re-estimated with one fewer lags until the lag length could no
longer be reduced.
The tests for lag length are shown in table 1. In each case one compares the
difference between the sum of squared residuals for the previous model (Q)
with the sum of squared residuals for the model with the shorter lag length
(QR). The difference L = QR - Q is distributed as a &i-square with one
degree of freedom. Since the main danger is to inappropriately truncate the lag
distribution (Type II error) these tests were conducted at the 10% level. The
result is an AR(2) specification for the reduced form wage equation.‘O
Contingent upon this specification of the autoregressive structure, the test
for individual effects was conducted. In doing so, restrictions were placed only
on the first two covariances (i.e., q = 2 in terms of the notation of section 3)
rather than on the covariance with all potential lags. As shown in table 1, the
hypothesis of no individual effects is strongly rejected. Thus, as is typically
assumed, wage determination over time contains an important element of
unobserved heterogeneity and the steady state of the wage process will differ
across individuals.
Finally, the AR(2) reduced form and the presence of individual effects
suggests that the correct model may be an AR(l) in first differences. This
Table 2
Parameter estimates (r-ratios).
Lag 1 Lag 2
0.323 0.098
(6.547) (2.620)
5. Summary
This note has presented a method of testing for the presence of individual
effects in dynamic models. Controlling for cross-sectional heterogeneity using
an individual effects specification increases data requirements and substan-
tially complicates the consistent estimation of parameters. Testing the assump-
tion of heterogeneity is desirable because in the absence of individual effects
simpler methods of data analysis are feasible. In an empirical example, the test
finds evidence of individual effects in a dynamic wage equation estimated
using a subsample of the Panel Study of Income Dynamics. In this cir-
cumstance, then, the test supports the attention paid to controlling heterogene-
ity in the estimation of wage equations.
References
Altonji, Joseph G. and Christina Paxson, 1985, Job characteristics and hours of work, Research in
Labor Economics 8, 1-55.
Chamberlain, Gary, 1984, Panel data, in: Zvi Griliches and Michael Intriligator, eds., Handbook
of econometrics, Vol. II (North-Holland, Amsterdam) 1247-1318.
Hansen, Lam, 1982, Large sample properties of generalized method of moments estimators,
Econometrica 50, 1029-1054.
Hausman, Jerry and William Taylor, 1981, Panel data and unobservable individual effects,
Econometrica 49, 1377-1398.
Holtz-Eakin, Douglas, Whitney Newey and Harvey Rosen, 1985, Implementing causality tests
with panel data, with an example from local public finance, National Bureau of Economics
technical working paper no. 48.
Hsiao, Cheng, 1986, Analysis of panel data (Cambridge University Press, Cambridge).
Nickell, Stephen, 1981, Biases in dynamic models with fixed effects, Econometrica 49,1417-1426.
White, Halbert, 1980, A heteroskedasticity-consistent covariance matrix estimator and a direct test
for heteroskedasticity, Econometrica 48, 817-838.