Panel Data Regression
Brodjol Sutijo
Introduction
• A panel data set, or longitudinal data set, is one in which there are repeated
observations on the same units.
• The units may be individuals, households, enterprises, countries, or any
set of entities that remain stable through time.
• The National Longitudinal Survey of Youth (NLSY) is an example. The
same respondents were interviewed every year from 1979 to 1994. Since
1994 they have been interviewed every two years.
• A balanced panel is one where every unit is surveyed in every time period.
The NLSY is unbalanced because some individuals have not been
interviewed in some years. Some could not be located, some refused, and
a few have died.
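The balanced/unbalanced distinction above can be checked mechanically. The following is a minimal sketch, assuming the panel index is available as (unit, period) pairs; the function name `is_balanced` and the toy data are hypothetical and are not NLSY records.

```python
from collections import defaultdict


def is_balanced(observations):
    """Return True if every unit is observed in every time period.

    `observations` is an iterable of (unit, period) pairs, a hypothetical
    minimal representation of a panel's index.
    """
    periods_by_unit = defaultdict(set)
    all_periods = set()
    for unit, period in observations:
        periods_by_unit[unit].add(period)
        all_periods.add(period)
    return all(periods == all_periods for periods in periods_by_unit.values())


# Toy data, made up for illustration: n = 3 units, T = 3 periods.
balanced = [(u, t) for u in ("a", "b", "c") for t in (1979, 1980, 1981)]
# Drop one interview to mimic a respondent who could not be located.
unbalanced = [obs for obs in balanced if obs != ("b", 1980)]

print(len(balanced), is_balanced(balanced))      # 9 True  (n * T observations)
print(len(unbalanced), is_balanced(unbalanced))  # 8 False
```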
Introduction
• Panel data sets have several advantages over cross-section data sets:
• They may make it possible to overcome a problem of bias caused
by unobserved heterogeneity.
• They make it possible to investigate dynamics without relying on
retrospective questions that may yield data subject to measurement
error.
• They are often very large. If there are n units and T time periods,
the potential number of observations is nT.
• Because they tend to be expensive to undertake, they are often well
designed and have high response rates. The NLSY is an example.
Introduction
• We will start with an example of the use of panel data to investigate simple
dynamics. We will use data from the 1988 round of the NLSY for 1,538 males in
full-time employment.
• Here is the result of regressing the logarithm of hourly earnings on a
dummy variable for being married and a set of control variables (years of
schooling, ASVABC score, years of tenure and its square, years of work
experience and its square, etc.; coefficients not shown).
• Married males earn 12.9 percent more than single males and the effect is highly
significant (standard error in parentheses).
Introduction
R²    0.271    0.274
n     1,538    1,538
• To test whether it is significantly lower, the easiest method is to change the reference
category to those who were married by 1988 and to introduce a new dummy variable
SINGLE that is equal to 1 if the respondent was still single four years later.
Introduction
• REGRESSION ANALYSIS WITH PANEL DATA
$$Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \sum_{p=1}^{s} \gamma_p Z_{pi} + \delta t + \varepsilon_{it}$$
• where the $X_j$ variables are observed and the $Z_p$ variables are unobserved (unobserved
heterogeneity).
• The index i refers to the unit of observation, t refers to the time period, and j and p
are used to differentiate between different observed and unobserved explanatory
variables.
• Note that the unobserved heterogeneity is assumed to be unchanging and
accordingly the Zp variables do not have a time subscript.
Introduction
$$Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \alpha_i + \delta t + \varepsilon_{it}$$
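• Here $\alpha_i = \sum_{p=1}^{s} \gamma_p Z_{pi}$ collects the unobserved heterogeneity of unit i.
If there is no relevant unobserved heterogeneity, the $\alpha_i$ term may be dropped and pooled
OLS may be used to fit the model, treating all the observations for all of the time periods
as a single sample.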
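Pooled OLS simply stacks every unit-period observation into one sample. The sketch below illustrates this under simplifying assumptions that are not in the slides: a single regressor, no time trend (δ = 0), no unobserved effect, and noise-free made-up data generated from Y = 1 + 2X. The helper `ols_simple` is a hypothetical name.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy panel (made up): unit -> list of (X, Y) pairs, one per period.
# Every observation satisfies Y = 1 + 2 * X, so there is no alpha_i term.
panel = {
    "a": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
    "b": [(2.0, 5.0), (4.0, 9.0), (6.0, 13.0)],
}

# Pooling: ignore which unit or period an observation came from.
x = [xv for rows in panel.values() for xv, _ in rows]
y = [yv for rows in panel.values() for _, yv in rows]

intercept, slope = ols_simple(x, y)
print(intercept, slope)  # 1.0 2.0
```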
Fixed Effects
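$$Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \alpha_i + \delta t + \varepsilon_{it}$$
$$\bar{Y}_i = \beta_1 + \sum_{j=2}^{k} \beta_j \bar{X}_{ji} + \alpha_i + \delta \bar{t} + \bar{\varepsilon}_i$$
$$Y_{it} - \bar{Y}_i = \sum_{j=2}^{k} \beta_j \left( X_{jit} - \bar{X}_{ji} \right) + \delta \left( t - \bar{t} \right) + \varepsilon_{it} - \bar{\varepsilon}_i$$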
• The last model is known as the ‘within-groups’ method because it explains the
variations about the mean of the dependent variable in terms of the variations about the
means of the explanatory variables for the group of observations relating to a given
individual.
• The intercept $\beta_1$ and any X variable that remains constant for each individual will drop
out of the model.
• A drawback of the fixed effects approach is that the variables are likely to have much
smaller variances than in the original specification, because they are now measured as
deviations from the individual mean rather than as absolute amounts.
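The within-groups transformation can be illustrated with a small numerical sketch. It assumes a single regressor, no time trend (δ = 0), and noise-free made-up data in which the unobserved effect is larger for the unit with larger X values, so pooled OLS is biased while the demeaned regression recovers the true slope. All names and numbers are hypothetical.

```python
def slope(x, y):
    """One-regressor OLS slope (regression with an intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def demean(values):
    """Subtract the unit's own mean from each of its observations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Assumed data-generating process: Y_it = 1 + 2 * X_it + alpha_i
BETA1, BETA2 = 1.0, 2.0
alphas = {"a": 0.0, "b": 10.0}                      # unobserved effects
xs = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]}   # X is higher where alpha is higher
ys = {u: [BETA1 + BETA2 * x + alphas[u] for x in xs[u]] for u in xs}

# Pooled OLS ignores alpha_i, which is correlated with X here.
pooled = slope([x for u in xs for x in xs[u]],
               [y for u in xs for y in ys[u]])

# Within-groups: demean X and Y unit by unit, then regress.
within = slope([v for u in xs for v in demean(xs[u])],
               [v for u in xs for v in demean(ys[u])])

print(pooled)  # about 4.57, well above the true slope of 2
print(within)  # 2.0
```

Because alpha_i is constant within each unit, it is removed by the demeaning step, which is why the second estimate is unaffected by it.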
Fixed Effects: First Differences
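$$Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \alpha_i + \delta t + \varepsilon_{it}$$
$$Y_{i,t-1} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{ji,t-1} + \alpha_i + \delta (t - 1) + \varepsilon_{i,t-1}$$
$$Y_{it} - Y_{i,t-1} = \sum_{j=2}^{k} \beta_j \left( X_{jit} - X_{ji,t-1} \right) + \delta + \varepsilon_{it} - \varepsilon_{i,t-1}$$
$$\Delta Y_{it} = \sum_{j=2}^{k} \beta_j \Delta X_{jit} + \delta + \varepsilon_{it} - \varepsilon_{i,t-1}$$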
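The first-difference transformation can be sketched numerically as follows. The example assumes a single regressor and noise-free made-up data generated from Y_it = 1 + 2 X_it + α_i + 0.5 t; in the differenced regression the intercept then estimates δ and the slope estimates the coefficient on X. Parameter values and names are hypothetical.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Assumed data-generating process: Y_it = 1 + 2 * X_it + alpha_i + 0.5 * t
BETA1, BETA2, DELTA = 1.0, 2.0, 0.5
alphas = {"a": 0.0, "b": 10.0}
xs = {"a": [1.0, 3.0, 4.0], "b": [4.0, 5.0, 8.0]}  # periods t = 1, 2, 3

panel = {
    unit: [(x, BETA1 + BETA2 * x + alphas[unit] + DELTA * t)
           for t, x in enumerate(x_path, start=1)]
    for unit, x_path in xs.items()
}

# Difference consecutive periods within each unit; alpha_i and beta_1 cancel.
dx, dy = [], []
for rows in panel.values():
    for (x_prev, y_prev), (x_cur, y_cur) in zip(rows, rows[1:]):
        dx.append(x_cur - x_prev)
        dy.append(y_cur - y_prev)

delta_hat, beta_hat = ols_simple(dx, dy)
print(delta_hat, beta_hat)  # 0.5 2.0
```

Note that differencing uses up the first observation of each unit, so the sample shrinks from n·T to n·(T − 1) observations (here from 6 to 4).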
• Note that the error term is now $(\varepsilon_{it} - \varepsilon_{i,t-1})$. Thus the differencing gives rise to
moving average autocorrelation if $\varepsilon_{it}$ satisfies the regression model assumptions.
• However, if $\varepsilon_{it}$ is subject to AR(1) autocorrelation and $\rho$ is close to 1, taking first
differences may approximately solve the problem:
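$$\varepsilon_{it} = \rho \varepsilon_{i,t-1} + v_{it}$$
$$\varepsilon_{it} - \varepsilon_{i,t-1} = v_{it} - (1 - \rho) \varepsilon_{i,t-1} \approx v_{it}$$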
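To make the moving average claim in the first bullet concrete, here is a short worked derivation. It assumes that the original disturbances are uncorrelated over time with constant variance; the symbols $u_{it}$ and $\sigma_\varepsilon^2$ are introduced here for illustration and do not appear in the slides.

```latex
% Assumption: \varepsilon_{it} is uncorrelated over t with variance \sigma_\varepsilon^2.
% Notation (introduced here): u_{it} is the differenced disturbance.
\begin{align*}
u_{it} &= \varepsilon_{it} - \varepsilon_{i,t-1} \\
\operatorname{Var}(u_{it}) &= \sigma_\varepsilon^2 + \sigma_\varepsilon^2 = 2\sigma_\varepsilon^2 \\
\operatorname{Cov}(u_{it}, u_{i,t-1})
  &= \operatorname{Cov}(\varepsilon_{it} - \varepsilon_{i,t-1},\;
                        \varepsilon_{i,t-1} - \varepsilon_{i,t-2})
   = -\sigma_\varepsilon^2 \\
\operatorname{Corr}(u_{it}, u_{i,t-1}) &= \frac{-\sigma_\varepsilon^2}{2\sigma_\varepsilon^2} = -\tfrac{1}{2} \\
\operatorname{Cov}(u_{it}, u_{i,t-s}) &= 0 \quad \text{for } s \ge 2
\end{align*}
```

Adjacent differenced disturbances share the term $\varepsilon_{i,t-1}$ with opposite signs, which produces the negative first-order correlation. In the AR(1) case, by contrast, the leftover term $(1 - \rho)\varepsilon_{i,t-1}$ vanishes as $\rho$ approaches 1, leaving only the innovation $v_{it}$.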