Panel Data Models
• Setting up
• First, know what type of panel you’ll be dealing with – because this can affect how
you estimate your equation
• Balanced panel (each 𝑖 has the same number of 𝑡) vs. Unbalanced panel (𝑖’s have different 𝑡)
• Short panel – where 𝑁 > 𝑇 vs. Long panel – where 𝑁 < 𝑇
• Why does this matter? Aside from calling for different estimation methods, a longer 𝑇 means that time-
series concepts (e.g., stationarity, cointegration) come into play
• Real panel data – “longitudinal data” which follows EXACTLY the same subjects over
time (e.g., Panel Study of Income Dynamics/PSID in the US, Indonesian Family Life
Survey/IFLS, Household Income and Labor Dynamics in Australia/HILDA Survey)
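A minimal sketch of how the panel type could be checked before estimation. The function name `describe_panel` and the sample identifiers are hypothetical, and the code assumes the data arrive as (entity, period) pairs. Taking 𝑇 as the largest number of periods observed for any entity, and labelling the 𝑁 = 𝑇 case "neither", are assumptions of this sketch.

```python
from collections import defaultdict


def describe_panel(observations):
    """Classify a panel from an iterable of (entity, period) pairs."""
    periods_by_entity = defaultdict(set)
    for entity, period in observations:
        periods_by_entity[entity].add(period)

    n_entities = len(periods_by_entity)
    counts = [len(periods) for periods in periods_by_entity.values()]

    # Balanced: every entity is observed for the same number of periods.
    balanced = len(set(counts)) == 1

    # Assumption of this sketch: use the longest observed span as T.
    t_max = max(counts)
    if n_entities > t_max:
        shape = "short"
    elif n_entities < t_max:
        shape = "long"
    else:
        shape = "neither"

    return {"N": n_entities, "T": t_max, "balanced": balanced, "shape": shape}


# Hypothetical example: three entities, each observed in two years.
obs = [(i, t) for i in ("a", "b", "c") for t in (2001, 2002)]
info = describe_panel(obs)
print(info)
```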
• Take the case where structural differences are attributed to different entities 𝑖
• 𝑌𝑖𝑡 = 𝛽0𝑖 + 𝛽1 𝑋𝑖𝑡 + 𝑢𝑖𝑡 ∀ 𝑖 = 1, … , 𝑁 and 𝑡 = 1, … , 𝑇
• Note the differential intercepts – each 𝑖 has its own intercept, representing a distinct average 𝑌 for a
particular 𝑖 – “on average, 𝑌 for this 𝑖 is different from other 𝑗 ≠ 𝑖, and the manner this occurs is fixed for
each 𝑖” – hence “fixed effects”
• These differential intercepts represent time-invariant characteristics of each 𝑖
• Can also be written as $Y_{it} = \beta_0 + \beta_1 X_{it} + \delta_i + u_{it}$, where $\delta_i$ are the fixed effects per $i$, $\forall\, i = 1, \dots, N-1$,
or $Y_{it} = \beta_0 + \beta_1 X_{it} + \sum_{i=1}^{N-1} \delta_i D_i + u_{it}$
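• Note that we only add 𝑁 − 1 dummy variables (differential intercepts) in this notation; including all 𝑁 of them alongside 𝛽0 would cause perfect collinearity (the dummy variable trap).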
• Sometimes called LSDV M1
The Fixed Effects and Random Effects Models
• Fixed Effects Model Least-Squares Dummy Variable (LSDV) Model
• The effect of controlling for fixed effects is illustrated below
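As a small numerical illustration of the same point, the sketch below compares the pooled ("Naïve") OLS slope with the slope obtained after removing each entity's own mean. The data are made up for this example: two entities share a slope of 2 but have very different intercepts. Demeaning by entity is used here instead of explicit dummies; by the Frisch-Waugh-Lovell theorem it yields the same slope as LSDV M1. The helper names are hypothetical.

```python
def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its own observations."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Made-up data: Y = 10 + 2X for entity A, Y = -10 + 2X for entity B.
entity = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
y = [12.0, 14.0, 16.0, 2.0, 4.0, 6.0]

pooled = ols_slope(x, y)  # ignores the entity-specific intercepts
within = ols_slope(demean_by_group(x, entity), demean_by_group(y, entity))

print("pooled slope:", pooled)
print("fixed-effects slope:", within)
```

In this constructed case the pooled slope is negative even though the slope within each entity is 2, because the entity with the larger 𝑋 values has the lower intercept.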
• This can also be done for a time-varying intercept – if we assume structural
differences occur across time and are experienced by all entities (LSDV M2)
• 𝑌𝑖𝑡 = 𝛽0𝑡 + 𝛽1 𝑋𝑖𝑡 + 𝑢𝑖𝑡 ∀ 𝑖 = 1, … , 𝑁 and 𝑡 = 1, … , 𝑇
• 𝑌𝑖𝑡 = 𝛽0 + 𝛽1 𝑋𝑖𝑡 + 𝜏𝑡 + 𝑢𝑖𝑡 where 𝜏𝑡 are the fixed effects per 𝑡, ∀ 𝑡 = 1, … , 𝑇 − 1
• $Y_{it} = \beta_0 + \beta_1 X_{it} + \sum_{t=1}^{T-1} \tau_t D_t + u_{it}$
• Can also be done for both entity-varying and time-varying intercept – this is called
a two-way fixed effects estimator (LSDV M3)
• 𝑌𝑖𝑡 = 𝛽0𝑖𝑡 + 𝛽1 𝑋𝑖𝑡 + 𝑢𝑖𝑡 ∀ 𝑖 = 1, … , 𝑁 and 𝑡 = 1, … , 𝑇
• 𝑌𝑖𝑡 = 𝛽0 + 𝛽1 𝑋𝑖𝑡 + 𝛿𝑖 + 𝜏𝑡 + 𝑢𝑖𝑡 where 𝛿𝑖 , 𝜏𝑡 are the fixed effects per 𝑖 and 𝑡, ∀ 𝑖 = 1, … , 𝑁 −
1 and 𝑡 = 1, … , 𝑇 − 1
• $Y_{it} = \beta_0 + \beta_1 X_{it} + \sum_{i=1}^{N-1} \delta_i D_i + \sum_{t=1}^{T-1} \tau_t D_t + u_{it}$
• In Stata – estimate by OLS with regress and add the fixed effects as factor variables: i.var, or ib#.var, where # is the value of var to use as the base (omitted) category
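A sketch of the two-way case (LSDV M3) for a balanced panel, using made-up numbers. Rather than building 𝑁 − 1 entity dummies and 𝑇 − 1 time dummies, it subtracts entity means and period means and adds back the grand mean, which in a balanced panel gives the same slope as the dummy-variable regression. The function name and the data are hypothetical; the example has no noise, so the slope of 1.5 is recovered up to rounding.

```python
def two_way_demean(panel):
    """panel: dict mapping (entity, period) -> value; must be balanced."""
    entities = sorted({i for i, _ in panel})
    periods = sorted({t for _, t in panel})
    grand = sum(panel.values()) / len(panel)
    ent_mean = {i: sum(panel[(i, t)] for t in periods) / len(periods)
                for i in entities}
    per_mean = {t: sum(panel[(i, t)] for i in entities) / len(entities)
                for t in periods}
    # Remove entity and period means, then add back the grand mean.
    return {key: panel[key] - ent_mean[key[0]] - per_mean[key[1]] + grand
            for key in panel}


# Made-up balanced panel: 3 entities, 3 periods.
x = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 4.0,
     (1, 0): 3.0, (1, 1): 1.0, (1, 2): 2.0,
     (2, 0): 2.0, (2, 1): 5.0, (2, 2): 1.0}
delta = {0: 5.0, 1: -3.0, 2: 10.0}   # entity effects
tau = {0: 0.0, 1: 2.0, 2: -4.0}      # time effects
y = {(i, t): delta[i] + tau[t] + 1.5 * x[(i, t)] for (i, t) in x}

x_tilde = two_way_demean(x)
y_tilde = two_way_demean(y)

# OLS slope through the origin on the transformed data.
slope = (sum(x_tilde[k] * y_tilde[k] for k in x_tilde)
         / sum(v * v for v in x_tilde.values()))
print("two-way fixed-effects slope:", slope)
```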
• We can test whether the inclusion of differential intercepts is significant by using the
Wald test for linear restrictions
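• $F = \dfrac{(RSS_R - RSS_{UR}) \,/\, \#\,\text{restrictions}}{RSS_{UR} \,/\, df_{UR}} \sim F_{\#\,\text{restrictions},\; df_{UR}}$, compared with the critical value at significance level $\alpha$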
• 𝐻0 : restrictions (that is, the fe’s 𝛿𝑖 = 0, 𝜏𝑡 = 0) are valid,
• 𝐻1 : restrictions are invalid
• OLS (without fe’s, called “Naïve” or N ) is considered the restricted model, LSDV’s are unrestricted
models
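The test statistic can be computed directly from the two residual sums of squares. The sketch below uses a hypothetical function name and made-up RSS values. It returns only the statistic, which would still need to be compared with an F critical value taken from a table or a statistics package.

```python
def restriction_f_stat(rss_r, rss_ur, n_restrictions, df_ur):
    """F statistic for testing linear restrictions.

    rss_r          -- RSS of the restricted model (Naive OLS, no fixed effects)
    rss_ur         -- RSS of the unrestricted model (LSDV)
    n_restrictions -- number of fixed-effect coefficients set to zero under H0
    df_ur          -- residual degrees of freedom of the unrestricted model
    """
    numerator = (rss_r - rss_ur) / n_restrictions
    denominator = rss_ur / df_ur
    return numerator / denominator


# Made-up numbers: 4 dummies tested, 40 residual degrees of freedom.
f_value = restriction_f_stat(rss_r=120.0, rss_ur=80.0,
                             n_restrictions=4, df_ur=40)
print("F statistic:", f_value)
```

A large value relative to the critical value leads to rejecting 𝐻0, meaning the fixed effects belong in the model.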
• This gives consistent estimates of the slope 𝛽1 , but they are inefficient (i.e., have a larger variance) because
less variation remains in the variables (and therefore, more variation is left in 𝑢𝑖𝑡 ).
• This may also render the coefficients on any time-invariant variables (e.g., sex, race, industry) inestimable, and may
remove long-run effects, because of 𝐸 .
• In Stata – this can be done using the xtreg command with the fe option (xtreg y x, fe),
provided you ran the xtset command at the beginning.
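One way to see why time-invariant variables drop out is to write the within transformation explicitly, which is the transformation xtreg with the fe option applies. The derivation below is a sketch using the entity fixed-effects notation from the LSDV M1 model; the bars denote averages over 𝑡 for a given 𝑖.

```latex
\begin{aligned}
Y_{it} &= \beta_0 + \beta_1 X_{it} + \delta_i + u_{it} \\
\bar{Y}_i &= \beta_0 + \beta_1 \bar{X}_i + \delta_i + \bar{u}_i \\
Y_{it} - \bar{Y}_i &= \beta_1 \left( X_{it} - \bar{X}_i \right) + \left( u_{it} - \bar{u}_i \right)
\end{aligned}
```

Subtracting the second line from the first removes $\beta_0$ and $\delta_i$. Any regressor that does not change over time equals its own entity mean, so it is transformed to zero and its coefficient cannot be estimated.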
The Fixed Effects and Random Effects Models
• Other Fixed Effects Models
• First-difference estimator
• An alternative to the within-group (WG) estimator – in principle, it also effectively removes any entity-specific,
time-invariant heterogeneity
• You take the first difference of all variables in the equation
• For any $Z_{it} \in \{Y_{it}, X_{it}, u_{it}\}$, $\Delta Z_{it} = Z_{i,t} - Z_{i,t-1}$
• Estimate the equation ∆𝑌𝑖𝑡 = 𝛽1 ∆𝑋𝑖𝑡 + ∆𝑢𝑖𝑡
• In Stata – this can be done manually by running an OLS regression on variables transformed
with the d#.var operator, where # is the order of differencing, usually either 1 or 2. You must
have run xtset or tsset first
• For example, 𝑟𝑒𝑔𝑟𝑒𝑠𝑠 𝑑1. 𝑖𝑛𝑐𝑜𝑚𝑒 𝑑1. 𝑒𝑑𝑢𝑐
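The same steps can be sketched outside Stata. The example below differences each entity's series separately, so no difference is ever taken across two entities, and then estimates 𝛽1 by OLS through the origin, matching the differenced equation, which has no intercept. The function name and the data are made up, and consecutive periods without gaps are assumed.

```python
def first_difference_slope(panel):
    """panel: dict mapping entity -> list of (period, x, y) tuples.

    Assumes each entity is observed in consecutive periods with no gaps.
    """
    sxy = 0.0
    sxx = 0.0
    for rows in panel.values():
        rows = sorted(rows)  # order by period within the entity
        for (_, x0, y0), (_, x1, y1) in zip(rows, rows[1:]):
            dx = x1 - x0
            dy = y1 - y0
            sxy += dx * dy
            sxx += dx * dx
    # OLS through the origin on the differenced data.
    return sxy / sxx


# Made-up data: Y = 4 + 0.5 X for entity "a", Y = -2 + 0.5 X for entity "b".
panel = {
    "a": [(1, 1.0, 4.5), (2, 3.0, 5.5), (3, 6.0, 7.0)],
    "b": [(1, 2.0, -1.0), (2, 5.0, 0.5), (3, 9.0, 2.5)],
}
slope = first_difference_slope(panel)
print("first-difference slope:", slope)
```

Stata's regress includes a constant unless the noconstant option is given, so the Stata example above estimates a slightly different specification from this sketch.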
The Fixed Effects and Random Effects Models
• Random Effects Model
• Also known as the error components model (ECM)
• As a criticism of FEM – the inclusion of dummy variables represents a lack of
knowledge about the “true” model, so why not express it through the disturbance term?
• 𝑌𝑖𝑡 = 𝛽0𝑖 + 𝛽1 𝑋𝑖𝑡 + 𝑢𝑖𝑡 , ∀ 𝑖 = 1, … , 𝑁
• But 𝛽0𝑖 = 𝛽0 + 𝜀𝑖 , … so it becomes 𝑌𝑖𝑡 = 𝛽0 + 𝛽1 𝑋𝑖𝑡 + 𝜀𝑖 + 𝑢𝑖𝑡
• Where $\varepsilon_i$ is the cross-sectional error component, which represents the unobserved
heterogeneity across $i$, such that $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$
• So, we now have a new error term, 𝜔𝑖𝑡 , such that 𝜔𝑖𝑡 = 𝜀𝑖 + 𝑢𝑖𝑡
• Usual assumptions of the ECM include
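• $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$
• $u_{it} \sim N(0, \sigma_u^2)$
• $E(\varepsilon_i u_{it}) = 0$; $E(\varepsilon_i \varepsilon_j) = 0 \;\forall\, i \neq j$; $E(u_{it} u_{is}) = 0 \;\forall\, t \neq s$; $E(u_{it} u_{jt}) = 0 \;\forall\, i \neq j$ – meaning error
components are not correlated with other cross-section and time series units
• $E(X_{it} \omega_{it}) = 0$ – exogeneity of regressors must be preserved, otherwise ECM will be inconsistent.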
• Note that by the assumptions about the distribution of the error components, $E(\omega_{it}) = 0$
and $var(\omega_{it}) = \sigma_\varepsilon^2 + \sigma_u^2$
• If 𝜎𝜀2 = 0, then that means the model is no different from the Naïve model – we can
just use OLS.
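The role of $\sigma_\varepsilon^2$ becomes clearer when the covariance of the composite error is written out. The lines below are a short derivation from the ECM assumptions already listed, for two different periods $t \neq s$ of the same entity.

```latex
\begin{aligned}
\operatorname{var}(\omega_{it}) &= \sigma_\varepsilon^2 + \sigma_u^2 \\
\operatorname{cov}(\omega_{it}, \omega_{is}) &= E\left[ (\varepsilon_i + u_{it})(\varepsilon_i + u_{is}) \right] = \sigma_\varepsilon^2, \quad t \neq s \\
\operatorname{corr}(\omega_{it}, \omega_{is}) &= \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + \sigma_u^2}
\end{aligned}
```

The errors of a given entity are therefore correlated over time unless $\sigma_\varepsilon^2 = 0$, in which case the correlation vanishes and the model collapses to the Naïve one.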
• Use the Breusch-Pagan Lagrange Multiplier Test, which tests the null hypothesis
$H_0: \sigma_\varepsilon^2 = 0$ (just use Naïve) against $H_1: \sigma_\varepsilon^2 > 0$ (use REM).
• If we reject the null hypothesis, that means REM is better than Naïve
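A sketch of how the Breusch-Pagan LM statistic could be computed by hand for a balanced panel, using the textbook form LM = [NT / (2(T − 1))] × [Σᵢ(Σₜ eᵢₜ)² / ΣᵢΣₜ eᵢₜ² − 1]², where eᵢₜ are residuals from the Naïve (pooled OLS) regression. Under 𝐻0 the statistic is asymptotically chi-squared with 1 degree of freedom, so the 5% critical value is about 3.84. The function name and the residuals are made up for illustration.

```python
def breusch_pagan_lm(residuals):
    """LM statistic for H0: sigma_epsilon^2 = 0 in a balanced panel.

    residuals: dict mapping entity -> list of pooled-OLS residuals,
    one per period, with the same number of periods for every entity.
    """
    n = len(residuals)
    t = len(next(iter(residuals.values())))
    # Sum over entities of the squared within-entity residual totals.
    sum_sq_of_sums = sum(sum(e) ** 2 for e in residuals.values())
    # Sum of all squared residuals.
    sum_of_squares = sum(v * v for e in residuals.values() for v in e)
    ratio = sum_sq_of_sums / sum_of_squares
    return n * t / (2 * (t - 1)) * (ratio - 1) ** 2


# Made-up residuals that cluster strongly by entity.
resid = {
    "a": [2.0, 2.0, 2.0],
    "b": [-2.0, -2.0, -2.0],
    "c": [1.0, 1.0, 1.0],
    "d": [-1.0, -1.0, -1.0],
}
lm = breusch_pagan_lm(resid)
print("LM statistic:", lm)
print("reject H0 at 5%:", lm > 3.84)
```

With residuals this strongly clustered by entity, the statistic is well above the critical value, which points to REM over the Naïve model.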