
1 Lecture 20: Panel Data Models

• Multiple regression works well when all relevant variables are observed.
However, if a relevant variable is missing, the model suffers from omitted variable bias.
• Panel data allow us to control for some types of omitted variables
without even observing them!
• Notation:
  $i = 1, \dots, n$ denotes the different entities (agents)
  $t = 1, \dots, T$ denotes the time periods
  $n \times T$ gives the total number of observations (in a balanced panel)
• Distinction between:
  Balanced panels ($T$ is the same for every entity $i$)
  Unbalanced panels ($T$ differs across entities)
• Example of two different time periods:
  $Y_{i,t} = a + b_1 X_{i,t} + Z_i + u_{i,t}$
  $Y_{i,t+1} = a + b_1 X_{i,t+1} + Z_i + u_{i,t+1}$
• Note the $Z_i$ variable: it does not vary with time. Let us assume that we
cannot observe it directly.
• Take the difference of the two equations:
  $(Y_{i,t+1} - Y_{i,t}) = b_1 (X_{i,t+1} - X_{i,t}) + (u_{i,t+1} - u_{i,t})$
• In this instance, differencing with respect to time has removed the
time-invariant variable $Z_i$ (and the intercept $a$).
1.1 Other possibilities for estimating entity-specific effects
1.1.1 Fixed Effects Model
• Fixed effects model:
  $Y_{it} = a + b_1 X_{it} + Z_i + u_{it}$
• Fixed effects regression introduces a new variable for each $i$.
• Represent $Z_i$, $i = 1, \dots, n$, by a binary variable for each $i$. Note that you
include only $n - 1$ binary (dummy) variables in the regression, because the
intercept $a$ already serves the omitted entity.
• The model will produce $n$ intercepts.
• Define variables such that:
  $D_1 = 1$ if $i = 1$
  $= 0$ otherwise
  $D_2 = 1$ if $i = 2$
  $= 0$ otherwise
  $\vdots$
  $D_{(n-1)} = 1$ if $i = n - 1$
  $= 0$ otherwise
• Model for estimation becomes:
  $Y_{it} = a + b_1 X_{it} + \gamma_1 D_1 + \gamma_2 D_2 + \dots + \gamma_{(n-1)} D_{(n-1)} + u_{it}$
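As an illustration of the dummy variable regression, here is a minimal sketch on a made-up panel. The panel dimensions, the entity effects in `Z` and the true slope are assumptions; the last entity is the omitted category, so its intercept is $a$ and every other entity has intercept $a$ plus its dummy coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, b1 = 5, 40, 1.5                        # hypothetical panel dimensions and slope
entity = np.repeat(np.arange(n), T)          # entity index i for each row
Z = np.array([0.0, 1.0, -2.0, 3.0, 0.5])     # made-up entity effects

X = Z[entity] + rng.normal(size=n * T)       # regressor correlated with Z_i
Y = 2.0 + b1 * X + Z[entity] + rng.normal(scale=0.1, size=n * T)

# n - 1 dummies D_1 ... D_{n-1}; the last entity is the omitted category.
D = (entity[:, None] == np.arange(n - 1)).astype(float)

# Columns: intercept, X, D_1 ... D_{n-1}
W = np.column_stack([np.ones(n * T), X, D])
coef, *_ = np.linalg.lstsq(W, Y, rcond=None)

a_hat, b1_hat, dummy_coefs = coef[0], coef[1], coef[2:]

# One intercept per entity: a + coefficient for entities 1..n-1, a for entity n.
intercepts = np.concatenate([a_hat + dummy_coefs, [a_hat]])
print("slope:", round(b1_hat, 3), "intercepts:", np.round(intercepts, 2))
```

The regression has $n - 1$ dummies but still recovers $n$ intercepts, one per entity.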
1.1.2 Within Groups
• The within groups model (entity time-demeaned) is estimated by OLS.
• Same basic model as before:
  $Y_{it} = a + b_1 X_{it} + Z_i + u_{it}$
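• Define a transformation of the variables:
  $\tilde{Y}_i = \frac{\sum_{t=1}^{T} Y_{it}}{T} \quad \forall i$
• Note that
  $\tilde{Z}_i = \frac{\sum_{t=1}^{T} Z_i}{T} = Z_i$
• Demean the basic model:
  $Y_{it} - \tilde{Y}_i = (a - \tilde{a}) + b_1 (X_{it} - \tilde{X}_i) + (Z_i - \tilde{Z}_i) + (u_{it} - \tilde{u}_i)$
• Estimating equation becomes:
  $Y_{it} - \tilde{Y}_i = b_1 (X_{it} - \tilde{X}_i) + (u_{it} - \tilde{u}_i)$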
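• Note that the coefficient $\hat{b}_1$ from the fixed effects (dummy variable) model
and from within groups is numerically identical. The first-differenced form
returns the same point estimate when $T = 2$; with more than two periods it
generally gives a different estimate of the same $b_1$.
• For reasons beyond this course, within groups can cause problems in
dynamic specifications.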
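The relation between the three estimators can be checked numerically. The sketch below uses simulated data (the panel sizes, the true slope of 0.8 and the seed are assumptions) and compares within groups, the dummy variable regression and first differences for $T = 2$ and $T = 5$.

```python
import numpy as np

def within_slope(x, y, entity, n):
    """OLS on entity-demeaned data (within groups)."""
    counts = np.bincount(entity, minlength=n)
    x_dm = x - (np.bincount(entity, weights=x, minlength=n) / counts)[entity]
    y_dm = y - (np.bincount(entity, weights=y, minlength=n) / counts)[entity]
    return (x_dm @ y_dm) / (x_dm @ x_dm)

def lsdv_slope(x, y, entity, n):
    """OLS with one dummy per entity and no separate intercept.
    This spans the same columns as an intercept plus n - 1 dummies."""
    D = (entity[:, None] == np.arange(n)).astype(float)
    coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
    return coef[0]

def fd_slope(x, y, n, T):
    """OLS without intercept on first differences (rows sorted by entity, then time)."""
    dx = np.diff(x.reshape(n, T), axis=1).ravel()
    dy = np.diff(y.reshape(n, T), axis=1).ravel()
    return (dx @ dy) / (dx @ dx)

def simulate(n, T, seed):
    """Hypothetical panel with X correlated with the entity effect."""
    rng = np.random.default_rng(seed)
    entity = np.repeat(np.arange(n), T)
    Z = rng.normal(size=n)[entity]
    x = Z + rng.normal(size=n * T)
    y = 1.0 + 0.8 * x + Z + rng.normal(size=n * T)
    return x, y, entity

n = 50
x2, y2, e2 = simulate(n, T=2, seed=2)
x5, y5, e5 = simulate(n, T=5, seed=2)

b_within_2, b_lsdv_2, b_fd_2 = within_slope(x2, y2, e2, n), lsdv_slope(x2, y2, e2, n), fd_slope(x2, y2, n, 2)
b_within_5, b_lsdv_5, b_fd_5 = within_slope(x5, y5, e5, n), lsdv_slope(x5, y5, e5, n), fd_slope(x5, y5, n, 5)

print("T=2:", b_within_2, b_lsdv_2, b_fd_2)
print("T=5:", b_within_5, b_lsdv_5, b_fd_5)
```

With two periods the demeaned variables are just plus and minus half the first difference, which is why all three slopes coincide there.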

1.1.3 Dummy variables and transformed models
• Note that other dummy variables might be included.
• What if the dummy variables are time invariant, e.g. race, gender, industry?
They are removed as part of the transformation.
• What if the dummy variable indicates a change within an entity over time,
e.g. union status for individuals or firms, a change in marital status, or a
change in management? Such variables survive the transformation, so their
effects can be picked up with any of the methods mentioned above.
• Special case for time dummies (see below).
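A small made-up example shows why the two kinds of dummy variable behave differently. The arrays below are hypothetical (4 entities, 3 periods, one row per entity) and the names `gender` and `union` are only labels for illustration.

```python
import numpy as np

# Hypothetical panel: 4 entities observed for 3 periods (rows = entities).
gender = np.array([[1, 1, 1],
                   [0, 0, 0],
                   [1, 1, 1],
                   [0, 0, 0]], dtype=float)      # time-invariant dummy
union = np.array([[0, 1, 1],
                  [0, 0, 0],
                  [1, 1, 0],
                  [0, 0, 1]], dtype=float)       # status changes for some entities

def entity_demean(A):
    """Subtract each entity's time average from its observations."""
    return A - A.mean(axis=1, keepdims=True)

gender_dm = entity_demean(gender)   # all zeros: nothing left to estimate from
union_dm = entity_demean(union)     # non-zero only where status changed over time

print(gender_dm)
print(union_dm)
```

The second entity never changes union status, so its row is also wiped out; only the entities that switch contribute to the estimated coefficient.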
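1.2 Applications to data
• Examination of the Cobb-Douglas production function:
  $Y_{it} = e^{A} L_{it}^{\alpha} K_{it}^{\beta} e^{\varepsilon_{it}}$
• In logs:
  $\ln Y_{it} = A + \alpha \ln L_{it} + \beta \ln K_{it} + \varepsilon_{it}$
• Think of the error component ($\varepsilon_{it}$) as being composed of three factors:
  $\varepsilon_{it} = Z_i + \lambda_t + u_{it}$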
• Notice that a new element has been added: $\lambda_t$ = time effects. Factors that
vary or evolve across time, but are common to all employers (entities), are
represented by time effects.
• $\lambda_t$ can be represented in the regression by including $T - 1$ binary (dummy)
variables for time:
  $DT_1 = 1$ if $t = 1976$
  $= 0$ otherwise
  $DT_2 = 1$ if $t = 1977$
  $= 0$ otherwise
  $\vdots$
  $DT_{(T-1)} = 1$ if $t$ is period $(T - 1)$
  $= 0$ otherwise
• For example, time effects ($\lambda_t$) can be thought of as common macroeconomic
factors that may affect company performance.
• What might we think of as unobserved effects ($Z_i$)? Managerial talent,
teamwork at the plant, worker-management relations. Note that these must
be time-invariant factors.

• Combined time ($\lambda_t$) and fixed effects ($Z_i$) regression eliminates omitted
variable bias arising both from unobserved variables that are constant over
time and from unobserved variables that are constant across entities (agents).
• L20.xls has balanced panel data on publicly quoted British manufacturing
employers, 1976-84.
• Note that the change of Government in 1979 saw new macroeconomic policies
enacted on British industry in 1980. There was a large decline in the
manufacturing sector (mirroring the decline in the US) and a fall in the
numbers employed in manufacturing.
• How to estimate the various types of panel data models:
  1. First differences: L20.xls and L20a.xls. Note that Linest cannot
  cope with the missing data cells.
  2. Fixed effects estimates: L20 1.xls. Note that Linest cannot cope
  with the number of variables, since there is a dummy variable for each
  entity (firm).
  3. Within groups estimates: L20 2.xls.
• Compare the results from the various forms of estimation.
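The combined entity and time effects regression described above can also be sketched outside the spreadsheets. The example below uses simulated data, not the L20 files: the dimensions (with $T = 9$ chosen to mimic 1976-84), the true slope of 0.6 and the seed are assumptions. For a balanced panel it compares a two-way demeaning transformation with a regression on $n - 1$ entity dummies and $T - 1$ time dummies.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, b1 = 30, 9, 0.6                      # hypothetical; T = 9 mimics 1976-84
Z = rng.normal(size=(n, 1))                # entity effects Z_i
lam = rng.normal(size=(1, T))              # time effects common to all entities
X = Z + lam + rng.normal(size=(n, T))      # regressor correlated with both
Y = 1.0 + b1 * X + Z + lam + rng.normal(scale=0.2, size=(n, T))

def two_way_demean(A):
    """Subtract entity means and time means, then add back the grand mean."""
    return A - A.mean(axis=1, keepdims=True) - A.mean(axis=0, keepdims=True) + A.mean()

x_dm, y_dm = two_way_demean(X).ravel(), two_way_demean(Y).ravel()
b_two_way = (x_dm @ y_dm) / (x_dm @ x_dm)

# Same slope from a regression on n - 1 entity dummies and T - 1 time dummies.
ent = np.repeat(np.arange(n), T)           # matches the row-major order of X.ravel()
per = np.tile(np.arange(T), n)
D_ent = (ent[:, None] == np.arange(1, n)).astype(float)
D_per = (per[:, None] == np.arange(1, T)).astype(float)
W = np.column_stack([np.ones(n * T), X.ravel(), D_ent, D_per])
b_dummies = np.linalg.lstsq(W, Y.ravel(), rcond=None)[0][1]

print(f"two-way demeaned: {b_two_way:.4f}, dummy regression: {b_dummies:.4f}")
```

The demeaning shortcut relies on the panel being balanced; with missing cells the dummy variable form is the safer route.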
1.3 Random effects estimation
• Return to the model before:
  $Y_{it} = a + b_1 X_{it} + \varepsilon_{it}$
  $\varepsilon_{it} = Z_i + \lambda_t + u_{it}$
• Underlying the fixed effects (entity and time specific) model, both the $Z_i$
and $\lambda_t$ are assumed correlated with the independent variables ($X_{it}$):
  $E(X_{it}, Z_i) \neq 0$
  $E(X_{it}, \lambda_t) \neq 0$
• With an assumption of random effects, the entity-specific effects ($Z_i$) are
assumed to be random variables.
• Assumptions behind the random effects model are:
  $E(Z_i) = E(\lambda_t) = E(u_{it}) = 0$
  $E(Z_i, \lambda_t) = E(\lambda_t, u_{it}) = 0$
  $E(Z_i, X_{it}) = E(\lambda_t, X_{it}) = E(u_{it}, X_{it}) = 0$
  $V(Z_i) = \sigma^2_Z$; $V(\lambda_t) = \sigma^2_\lambda$; $V(u_{it}) = \sigma^2_u$

• If we return to the production function example, the presence
of fixed effects supposes that output ($Y_{it}$) will be affected by managerial
ability ($Z_i$). More efficient management will use fewer inputs ($L_{it}$ and $K_{it}$),
so $Z_i$ and $L_{it}$ and $K_{it}$ (or $X_{it}$) are not independent.
• What do random effects imply? That although there may be unobserved
managerial ability, it is uncorrelated with the independent variables because
management combine inputs by a process of `trial and error', or random
luck!
• If the effects are random and uncorrelated with the variables, so that there
is no omitted variable bias, we nevertheless still need to consider their
inclusion in the estimation.
• If the random effects are not considered, then the estimates are not efficient.
The variance component of the model has not been modelled correctly.
• Mundlak, Y. (1978), Econometrica, argued that the distinction between
random and fixed effects is not necessary. Suppose that there is a linear
relation between the unobserved fixed effects and the independent variables:
  $E(Z_i \mid X_{it}) = \tilde{X}_i \pi + \omega_i$
• If $H_0: \pi = 0$ holds, then there is no correlation between the fixed effect
and the independent variable.
• The question is how to test for this when the $Z_i$ are unobserved!
• See Hausman, J. "Specification Tests in Econometrics", Econometrica,
1978, and Hausman, J. and Taylor, W. "Panel Data and Unobservable
Individual Effects", Econometrica, 1981, for the Hausman test.
• Note that $T$ is fixed and the panel is balanced. (Worrying about unbalanced
$T$ is beyond this course.) Drop $\lambda_t$ for notational convenience. Estimate
(averaging over $T$) by OLS (see L20 2a.xls):
  $\tilde{Y}_i = a + b_1 \tilde{X}_i + Z_i + \tilde{u}_i$
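• Note that the variance of the error term ($Z_i + \tilde{u}_i$) would be:
  $Var(Z_i + \tilde{u}_i) = \sigma^2_Z + \frac{\sigma^2_u}{T_i}$
  (with a balanced panel, $T_i = T$ for every entity).
• A consistent estimate of this term is given by $\hat{\sigma}^2_{\tilde{Y}\tilde{X}}$ (the residual
variance of the regression line from $\tilde{Y}_i = a + b_1 \tilde{X}_i + Z_i + \tilde{u}_i$).
Denote this as $\hat{\sigma}^2$.
• Estimate for the variance of the random effects:
  $\hat{\sigma}^2_{RE} = \max\left(0,\; \hat{\sigma}^2 - \frac{\hat{\sigma}^2_{FE}}{T}\right)$
• An estimate for $\theta$:
  $\hat{\theta}_i = 1 - \sqrt{\frac{\hat{\sigma}^2_{FE}}{T_i \hat{\sigma}^2_{RE} + \hat{\sigma}^2_{FE}}}$
  where $\hat{\sigma}^2_{FE}$ is estimated by within groups.
• This makes intuitive sense. If $\hat{\sigma}^2_{FE} / T$ exceeds $\hat{\sigma}^2$, the difference
would be negative, so $\hat{\sigma}^2_{RE}$ is set to zero and $\hat{\theta} = 0$: the estimator
reduces to pooled OLS. As the variation in the entity effects becomes large
relative to $\hat{\sigma}^2_{FE}$, $\hat{\theta}$ approaches 1 and the estimates approach
within groups.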
• Use the estimate of $\theta$ in generalized least squares (GLS) to obtain random
effects estimates:
  $Y_{it} - \hat{\theta}_i \tilde{Y}_i = a (1 - \hat{\theta}_i) + b_1 (X_{it} - \hat{\theta}_i \tilde{X}_i) + [(1 - \hat{\theta}_i) Z_i + (u_{it} - \hat{\theta}_i \tilde{u}_i)]$
  From this estimating equation, $\hat{b}_1$ provides the random effects coefficient
  on the independent variable.
• Hausman test (of which this is a close approximation):
  $\frac{(\hat{b}_{1,FE} - \hat{b}_{1,RE})^2}{\hat{\sigma}^2_{\hat{b}_1,FE} - \hat{\sigma}^2_{\hat{b}_1,RE}} \sim \chi^2_{(k-1)}$
  where $(k - 1)$ is the number of partial coefficients in the model. Note
  that this must be the same between the fixed and random effects models.
• Also note that the true estimate of $\hat{\sigma}^2_{\hat{b}_1,FE}$ and $\hat{\sigma}^2_{\hat{b}_1,RE}$ contains a
covariance term:
  $\hat{\sigma}^2_{\hat{b}_1,RE} = var(b_1) + 2cov(b_1)$
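The steps of this section (within groups, the regression on entity means, $\hat{\theta}$, GLS on the quasi-demeaned data, and the Hausman comparison) can be sketched in a few lines. This is only an illustration under stated assumptions: a balanced panel with a single regressor, simulated data rather than the L20 files, the textbook formula for $\hat{\theta}$, and the within-groups residual variance used for both coefficient variances so that their difference stays positive. The function name `fe_re_hausman` and the parameter `rho` are hypothetical.

```python
import math
import numpy as np

def fe_re_hausman(X, Y):
    """X, Y: (n, T) arrays for a balanced panel with a single regressor."""
    n, T = X.shape
    Xbar, Ybar = X.mean(axis=1), Y.mean(axis=1)

    # Within groups: slope and residual variance (sigma^2_FE).
    xw, yw = (X - Xbar[:, None]).ravel(), (Y - Ybar[:, None]).ravel()
    b_fe = (xw @ yw) / (xw @ xw)
    s2_fe = np.sum((yw - b_fe * xw) ** 2) / (n * (T - 1) - 1)

    # Regression on entity means: residual variance (sigma^2).
    xb, yb = Xbar - Xbar.mean(), Ybar - Ybar.mean()
    b_be = (xb @ yb) / (xb @ xb)
    s2_b = np.sum((yb - b_be * xb) ** 2) / (n - 2)

    # Variance of the entity effect, truncated at zero, and theta.
    s2_re = max(0.0, s2_b - s2_fe / T)
    theta = 1.0 - math.sqrt(s2_fe / (T * s2_re + s2_fe))

    # GLS: OLS on quasi-demeaned data; the intercept column is (1 - theta).
    xs = (X - theta * Xbar[:, None]).ravel()
    ys = (Y - theta * Ybar[:, None]).ravel()
    W = np.column_stack([np.full(n * T, 1.0 - theta), xs])
    coef, *_ = np.linalg.lstsq(W, ys, rcond=None)
    b_re = coef[1]

    # Coefficient variances, both based on s2_fe (an assumption of this sketch).
    var_fe = s2_fe / (xw @ xw)
    var_re = s2_fe * np.linalg.inv(W.T @ W)[1, 1]

    # Scalar Hausman statistic; with one slope, k - 1 = 1 and the
    # chi-squared(1) tail probability equals erfc(sqrt(H / 2)).
    diff_var = var_fe - var_re
    H = (b_fe - b_re) ** 2 / diff_var if diff_var > 0 else float("nan")
    p = math.erfc(math.sqrt(H / 2.0)) if diff_var > 0 else float("nan")
    return {"b_fe": b_fe, "b_re": b_re, "theta": theta, "H": H, "p": p}

def simulate(rho, seed, n=300, T=5, b1=1.0):
    """rho controls how strongly X loads on the entity effect Z_i."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, 1))
    X = rho * Z + rng.normal(size=(n, T))
    Y = 0.5 + b1 * X + Z + rng.normal(size=(n, T))
    return X, Y

res_uncorrelated = fe_re_hausman(*simulate(rho=0.0, seed=4))
res_correlated = fe_re_hausman(*simulate(rho=1.0, seed=4))
print(res_uncorrelated)
print(res_correlated)
```

When $Z_i$ is correlated with $X_{it}$ (`rho=1.0`), the random effects slope drifts away from the within-groups slope and the statistic becomes large; when it is uncorrelated, both slopes should be close to the true value.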
1.3.1 Application to data
• L20 2.xls and L20 2a.xls contain details of how the estimation is carried
out.
