Chapter 1. Linear Panel Models and Heterogeneity: School of Economics and Management - University of Geneva
University of Orléans
February 2018
Objectives
Notations
Let us consider the following linear model:

    y_it = α_it + β′_it x_it + ε_it    ∀ i = 1, …, n,  ∀ t = 1, …, T

where α_it is a scalar that varies across i and t.

2 Regression intercepts are the same, and slope coefficients are not
(unusual):

    y_it = α + β′_i x_it + ε_it

3 Both slope and intercept coefficients are the same (homogeneous /
pooled panel):

    y_it = α + β′ x_it + ε_it
Specification Tests
Remark 1

Under H1, the residual sum of squares is equal to the sum of the n residual
sums of squares associated with the n individual regressions:

    RSS₁ = Σ_{i=1}^n RSS_{1,i} = Σ_{i=1}^n Σ_{t=1}^T ε̂²_it = Σ_{i=1}^n ( S_yy,i − S′_xy,i S⁻¹_xx,i S_xy,i )

with

    S_yy,i = Σ_{t=1}^T (y_it − ȳ_i)²    where  x̄_i = (1/T) Σ_{t=1}^T x_it   and   ȳ_i = (1/T) Σ_{t=1}^T y_it

    S_xx,i = Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′        S_xy,i = Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i)
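As a quick numerical illustration of Remark 1 (a sketch, not from the slides; the function name and array shapes are my own conventions), RSS₁ can be computed by running the n individual OLS regressions and summing their residual sums of squares:

```python
import numpy as np

def rss_individual_regressions(y, X):
    """RSS_1 = sum of the residual sums of squares of the n individual
    OLS regressions (each with its own intercept and slopes).

    y: (n, T) outcomes, X: (n, T, K) regressors."""
    n, T, K = X.shape
    total = 0.0
    for i in range(n):
        Zi = np.column_stack([np.ones(T), X[i]])        # constant + regressors
        coef, *_ = np.linalg.lstsq(Zi, y[i], rcond=None)
        resid = y[i] - Zi @ coef
        total += resid @ resid
    return total

# If every individual's data lie exactly on a line, RSS_1 is (numerically) zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 6, 1))
y = 1.0 + 2.0 * X[:, :, 0]
print(rss_individual_regressions(y, X))
```

Under H1 each individual has its own regression, so this is exactly the unrestricted residual sum of squares used in the poolability tests below.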
Remark 2

Under H0¹, the model becomes the pooled model

    y_it = α + β′ x_it + ε_it

with

    S_yy = Σ_{i=1}^n Σ_{t=1}^T (y_it − ȳ)²

    S_xx = Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(x_it − x̄)′

    S_xy = Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(y_it − ȳ)
    H0²: β_i = β    ∀ i = 1, …, n
    S_yy,i = Σ_{t=1}^T (y_it − ȳ_i)²    with  x̄_i = (1/T) Σ_{t=1}^T x_it   and   ȳ_i = (1/T) Σ_{t=1}^T y_it

    S_xx,i = Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′

    S_xy,i = Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i)
Under the null, the model is homogeneous (pooled) and the restricted
residual sum of squares is RSS_{1,c}.
Under the alternative, the model is y_it = α_i + β′ x_it + ε_it, and there are
nT − K − n degrees of freedom.

    F₃ = [ (RSS_{1,c} − RSS_{1,c′}) / (n − 1) ] / [ RSS_{1,c′} / (n(T − 1) − K) ]    (1)
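The F₃ statistic is a one-liner once the two residual sums of squares are known. A minimal sketch of equation (1) (the function name is mine, not from the slides):

```python
def f3_statistic(rss_pooled, rss_fe, n, T, K):
    """F3 for H0: alpha_i = alpha, given a common slope vector beta.

    rss_pooled : restricted RSS (pooled model, RSS_1,c)
    rss_fe     : unrestricted RSS (individual-effects model, RSS_1,c')
    Under H0, F3 follows F(n - 1, n(T - 1) - K)."""
    num = (rss_pooled - rss_fe) / (n - 1)
    den = rss_fe / (n * (T - 1) - K)
    return num / den
```

With the strikes example used later (n = 17, T = 35, K = 2), the degrees of freedom are (16, 576); the RSS values themselves come from the pooled and individual-effects regressions.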
s_it: the number of strike days per 1,000 workers for country i at
time t
u_it: the unemployment rate
p_it: the inflation rate
    F₁ ~ F(48, 544)   under H0¹

since

    (n − 1)(K + 1) = (17 − 1)(2 + 1) = 48
    nT − n(K + 1) = 595 − 17 × (2 + 1) = 544
    F₂ ~ F(32, 544)   under H0²

since

    (n − 1) K = (17 − 1) × 2 = 32
    nT − n(K + 1) = 595 − 17 × (2 + 1) = 544
    F₃ ~ F(16, 576)   under H0³

since

    (n − 1) = 17 − 1 = 16
    n(T − 1) − K = 17 × (35 − 1) − 2 = 576
Example

Is it reasonable to assume that the slope parameters of the production
function are the same across countries? What does that imply? Should we
impose a common mean for the TFP of France and Germany? The answer is
probably no.
Objectives
    y_i = e α_i + X_i β + ε_i

    E(ε_i) = 0
    E(ε_i ε_i′) = σ²_ε I_T
    E(ε_i ε_j′) = 0   if i ≠ j
     y_i  =   e α_i   +   X_i β   +   ε_i
    (T,1)  (T,1)(1,1)   (T,K)(K,1)   (T,1)

For example, with T = 3 and K = 2:

    ( y_i,1 )   ( 1 )         ( k_i,1  n_i,1 )           ( ε_i,1 )
    ( y_i,2 ) = ( 1 ) α_i  +  ( k_i,2  n_i,2 ) (β_k)  +  ( ε_i,2 )
    ( y_i,3 )   ( 1 )         ( k_i,3  n_i,3 ) (β_n)     ( ε_i,3 )
    (3,1)     (3,1)(1,1)      (3,2)(2,1)                 (3,1)
    Y = ẽ α + X β + ε

with

    Y = ( y_1 )        X = ( X_1 )        ε = ( ε_1 )
        ( y_2 )            ( X_2 )            ( ε_2 )
        (  ⋮  )            (  ⋮  )            (  ⋮  )
        ( y_n )            ( X_n )            ( ε_n )
      (Tn,1)             (Tn,K)             (Tn,1)
where 0_T is the null vector (T,1):

    ẽ = I_n ⊗ e = ( e    0_T  …  0_T )        α = ( α_1 )
                  ( 0_T  e    …  0_T )            ( α_2 )
                  (  ⋮    ⋮   ⋱   ⋮  )            (  ⋮  )
                  ( 0_T  0_T  …  e   )            ( α_n )
      (Tn,n)                                  (n,1)

    Y = ẽ α + X β + ε
For n = 2 and T = 3:

    ( y_1,1 )   ( 1 0 )          ( k_11  n_11 )          ( ε_1,1 )
    ( y_1,2 )   ( 1 0 )          ( k_12  n_12 )          ( ε_1,2 )
    ( y_1,3 ) = ( 1 0 ) (α_1) +  ( k_13  n_13 ) (β_k) +  ( ε_1,3 )
    ( y_2,1 )   ( 0 1 ) (α_2)    ( k_21  n_21 ) (β_n)    ( ε_2,1 )
    ( y_2,2 )   ( 0 1 )          ( k_22  n_22 )          ( ε_2,2 )
    ( y_2,3 )   ( 0 1 )          ( k_23  n_23 )          ( ε_2,3 )
Discussion
For Wooldridge (2010), these discussions about whether the αi should
be treated as random variables or as parameters to be estimated are
wrongheaded for microeconomic panels.
Mundlak Y. (1978), ”On the Pooling of Time Series and Cross Section
Data”, Econometrica, 46, 69-85
Remarks
Actually, a stronger conditional mean independence assumption is:

    E(α_i | x_i1, …, x_iT) = 0
Remarks
Wooldridge (2010) avoids referring to α_i as a random effect or a fixed
effect. Instead, he refers to α_i as an unobserved effect, unobserved
heterogeneity, and so on.
where RD_it is spending on R&D for firm i at time t and z_it contains other
explanatory variables. α_i is a firm heterogeneity term that may influence
patents_it and that may be correlated with current, past, and future R&D
expenditures:

    cov(α_i, RD_{i,t−k}) ≠ 0    ∀ k
Objectives
Notations

Let us denote

    y_i = ( y_i,1 )        X_i = ( x_1,i,1  x_2,i,1  …  x_K,i,1 )
          ( y_i,2 )              ( x_1,i,2  x_2,i,2  …  x_K,i,2 )
          (   ⋮   )              (    ⋮        ⋮     ⋱     ⋮    )
          ( y_i,T )              ( x_1,i,T  x_2,i,T  …  x_K,i,T )
      (T,1)                  (T,K)
    y_i = e α_i + X_i β + ε_i    ∀ i = 1, …, n
Assumptions (H1): The error terms ε_it are i.i.d. ∀ (i,t) with:

    E(ε_it) = 0

    E(ε_it ε_is) = σ²_ε if t = s, 0 otherwise,   or   E(ε_i ε_i′) = σ²_ε I_T

where I_T denotes the (T,T) identity matrix.
Theorem
Under assumptions H1 , the ordinary-least-squares (OLS) estimator of β is
the best linear unbiased estimator (BLUE).
with

    x̄_i = (1/T) Σ_{t=1}^T x_it        ȳ_i = (1/T) Σ_{t=1}^T y_it

Given the second FOC (with respect to β) and the previous result, we can
derive the formula for β̂_LSDV:

    β̂_LSDV = ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′ )⁻¹ ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i) )
Remarks

    y_i = e α_i + X_i β + ε_i

    Q = I_T − (1/T) e e′ = ( 1 − 1/T   −1/T     …   −1/T    )
                           (  −1/T    1 − 1/T   …   −1/T    )
                           (    ⋮        ⋮      ⋱     ⋮     )
                           (  −1/T     −1/T     …  1 − 1/T  )
      (T,T)
    Q X_i = X_i − (1/T) e e′ X_i

          = ( x_1,i,1  x_2,i,1  …  x_K,i,1 )          ( 1 )
            ( x_1,i,2  x_2,i,2  …  x_K,i,2 )  − (1/T) ( 1 ) ( Σ_{t=1}^T x_1,it   Σ_{t=1}^T x_2,it   …   Σ_{t=1}^T x_K,it )
            (    ⋮        ⋮     ⋱     ⋮    )          ( ⋮ )
            ( x_1,i,T  x_2,i,T  …  x_K,i,T )          ( 1 )
    Q e = ( I_T − (1/T) e e′ ) e = e − (1/T) e e′ e = e − e = 0

since

    e′e = ( 1 … 1 ) ( 1 )
                    ( ⋮ ) = T
                    ( 1 )
So, we have:

    y_i = e α_i + X_i β + ε_i

where

    Q = I_T − (1/T) e e′

    Q z_i = ( I_T − (1/T) e e′ ) z_i = z_i − (1/T) e e′ z_i = z_i − z̄_i e

which equals 0_T when z_it is constant over t (as for e α_i).
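These properties of Q are easy to verify numerically. A minimal NumPy sketch (my own check, not from the slides) confirms Qz = z − z̄e, Qe = 0, and idempotency:

```python
import numpy as np

T = 5
e = np.ones((T, 1))
Q = np.eye(T) - (e @ e.T) / T           # within (covariance) transformation

z = np.arange(1.0, T + 1.0).reshape(-1, 1)    # an arbitrary T x 1 vector
assert np.allclose(Q @ z, z - z.mean() * e)   # Qz = z - zbar * e
assert np.allclose(Q @ e, np.zeros((T, 1)))   # Qe = 0: constants are swept out
assert np.allclose(Q @ Q, Q)                  # Q is idempotent
print("all Q properties verified")
```

Because Qe = 0, premultiplying y_i = e α_i + X_i β + ε_i by Q removes the individual effect, which is exactly why the within transformation delivers the LSDV estimator.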
Example

Let us consider a simple panel regression model for the total number of
strike days in OECD countries. We have a balanced panel data set for 17
countries (n = 17) and annual data from 1951 to 1985 (T = 35). General
idea: evaluate the link between strikes and some macroeconomic factors
(inflation, unemployment, etc.)

s_it: the number of strike days per 1,000 workers for country i at
time t
u_it: the unemployment rate
p_it: the inflation rate
Theorem

The LSDV estimator β̂_LSDV is unbiased and consistent when either n, or T, or
both tend to infinity:

    β̂_LSDV →p β    as nT → ∞

Theorem

The estimator of the unobserved effects α̂_i, although unbiased, is
consistent only when T → ∞:

    α̂_i →p α_i    as T → ∞
Theorem

The asymptotic variance-covariance matrix of the LSDV estimator β̂_LSDV is
given by:

    V(β̂_LSDV) = σ²_ε ( Σ_{i=1}^n X_i′ Q X_i )⁻¹

    V̂(β̂_LSDV) = σ̂²_ε ( Σ_{i=1}^n X_i′ Q X_i )⁻¹

with

    σ̂²_ε = 1/(nT − K − n) Σ_{i=1}^n Σ_{t=1}^T ε̂²_it

In the strikes example:

    σ̂²_ε = 0.146958×10⁹ / (595 − 2 − 17) ≈ 255,135    so that    σ̂_ε ≈ 505.1093
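The arithmetic of this example can be replicated directly (values as on the slide; the interpretation of 505.1093 as the residual standard error is implied by the square root):

```python
rss = 0.146958e9              # sum of squared LSDV residuals (from the slide)
nT, K, n = 595, 2, 17
sigma2_eps = rss / (nT - K - n)   # degrees of freedom: 595 - 2 - 17 = 576
sigma_eps = sigma2_eps ** 0.5     # residual standard error
print(sigma2_eps, sigma_eps)
```

This gives σ̂²_ε ≈ 255,135, which matches the estimate of σ̂²_v = 0.25514×10⁶ reported later for the same data.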
Objectives
Terminology

    ε_it = α_i + λ_t + v_it

    E(α_i α_j) = σ²_α if i = j, 0 otherwise
    E(λ_t λ_s) = σ²_λ if t = s, 0 otherwise
    E(v_it v_js) = σ²_v if t = s and i = j, 0 otherwise
    E(α_i x_it′) = E(λ_t x_it′) = E(v_it x_it′) = 0
Remark

As suggested by Wooldridge (2001), the "fixed effect" specification can be
viewed as a case in which α_i is a random parameter with

    cov(α_i, x_it′) ≠ 0

whereas the "random effect" specification corresponds to

    cov(α_i, x_it′) = 0        E(α_i) = μ

Two-way error component: ε_it = α_i + λ_t + v_it.
One-way (individual effects only): ε_it = α_i + v_it.
Vectorial form

The vectorial expression of the individual effects model is then defined as:

     y_i  =    X̃_i γ     +   ε_i
    (T,1)  (T,K+1)(K+1,1)   (T,1)

     ε_i  =   e α_i   +   v_i
    (T,1)  (T,1)(1,1)    (T,1)

with X̃_i = (e : X_i) and γ′ = (μ : β′).
Remark

The presence of α_i produces a correlation among the errors of the same
cross-sectional unit (autocorrelation).
Remark

If we consider the nT×1 vector of errors ε = (ε_1′, …, ε_n′)′, we have

    V(ε) = E(ε ε′) = I_n ⊗ V

            ( V  0  …  0 )
    V(ε)  = ( 0  V  …  0 )      where each block V is of dimension (T,T)
            ( ⋮  ⋮  ⋱  ⋮ )
  (nT,nT)   ( 0  0  …  V )
Within transformation

Regardless of whether the α_i are treated as fixed or as random, the
individual-specific effects for a given sample can be swept out by the
idempotent (covariance) transformation matrix Q.
Since Q e = ( I_T − (1/T) e e′ ) e = 0, we have
Theorem

Under assumptions H2, when the α_i are treated as random, the LSDV
estimator is unbiased and consistent when either n, or T, or both tend to
infinity. However, whereas the LSDV estimator is the BLUE under the
assumption that the α_i are fixed constants, it is not the BLUE when the α_i
are assumed random. The BLUE in the latter case is the
Generalized-Least-Squares (GLS) estimator.
Summary

Notations

Let us consider the model

    y_i = X̃_i γ + ε_i    ∀ i = 1, …, n

where ε_i = α_i e + v_i, X̃_i = (e : X_i) and γ′ = (μ : β′).

    V⁻¹ = (1/σ²_v) ( Q + ψ (1/T) e e′ )

    ψ = σ²_v / (σ²_v + T σ²_α)
Proof

    V⁻¹ V = (1/σ²_v) ( Q + ψ (1/T) e e′ ) ( σ²_α e e′ + σ²_v I_T )

          = (1/σ²_v) ( σ²_α Q e e′ + σ²_v Q + ψ (σ²_α/T) e e′ e e′ + ψ (σ²_v/T) e e′ )

          = (1/σ²_v) ( σ²_v Q + ψ σ²_α e e′ + ψ (σ²_v/T) e e′ )    since Q e = 0 and e′e = T

          = Q + (1/σ²_v) (1/T) e e′ ψ ( T σ²_α + σ²_v )

          = I_T − (1/T) e e′ + (1/T) e e′ ψ ( T σ²_α + σ²_v ) / σ²_v

Proof (cont'd)

    V⁻¹ V = I_T − (1/T) e e′ + (1/T) e e′ ψ ( T σ²_α + σ²_v ) / σ²_v

Since

    ψ = σ²_v / (σ²_v + T σ²_α)

we have ψ (T σ²_α + σ²_v) / σ²_v = 1, and therefore

    V⁻¹ V = I_T − (1/T) e e′ + (1/T) e e′ = I_T
with X̃_i = (e : X_i) and γ′ = (μ : β′). The GLS estimator of β is:

    β̂_GLS = ( (1/T) Σ_{i=1}^n X_i′ Q X_i + ψ Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹
            ( (1/T) Σ_{i=1}^n X_i′ Q y_i + ψ Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )

with ψ = σ²_v (σ²_v + T σ²_α)⁻¹.
    ȳ_i = c + β′ x̄_i + ε̄_i    ∀ i = 1, …, n

    β̂_BE = ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )

    β̂_pooled = ( Σ_{t=1}^T Σ_{i=1}^n (x_it − x̄)(x_it − x̄)′ )⁻¹ ( Σ_{t=1}^T Σ_{i=1}^n (x_it − x̄)(y_it − ȳ) )
Theorem

Under assumptions H2, the GLS estimator β̂_GLS is a weighted average of
the between-group estimator β̂_BE and the within-group (LSDV) estimator β̂_LSDV:

    β̂_GLS = Δ β̂_BE + (I_K − Δ) β̂_LSDV

with

    Δ = ψT ( Σ_{i=1}^n X_i′ Q X_i + ψT Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )

When Δ = 0:

    β̂_GLS = β̂_LSDV
    β̂_BE = ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )

    β̂_LSDV = ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′ )⁻¹ ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i) )
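On simulated data the two estimators can be compared directly. This is a sketch under my own simulation design (names and parameter values are assumptions): the within estimator removes the individual effects and recovers β, while the between estimator uses only the cross-sectional variation in the individual means and is contaminated by the α_i:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, K = 200, 10, 2
beta = np.array([1.5, -0.7])

X = rng.normal(size=(n, T, K))
alpha = rng.normal(size=n)                           # individual effects
y = alpha[:, None] + X @ beta + 0.1 * rng.normal(size=(n, T))

# Within (LSDV): demean within each individual, then pooled OLS on deviations.
Xw = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
yw = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
b_lsdv, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# Between: OLS of individual means on individual means (with a constant).
Xb = np.column_stack([np.ones(n), X.mean(axis=1)])
b_be, *_ = np.linalg.lstsq(Xb, y.mean(axis=1), rcond=None)

print(b_lsdv, b_be[1:])
```

Both point estimates are close to β here because the simulated α_i are uncorrelated with the regressors; the between estimate is noticeably noisier, since it uses only n observations.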
Proof: case 2 (ψ = 1)

    β̂_GLS = Δ β̂_BE + (I_K − Δ) β̂_LSDV

          = ( Σ_{i=1}^n X_i′ Q X_i + T Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹
            ( T Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) + Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i) )

Using the decompositions

    Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(x_it − x̄)′ = Σ_{i=1}^n X_i′ Q X_i + T Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′

and the analogous identity for the cross-products, we get

    β̂_GLS = ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(x_it − x̄)′ )⁻¹ ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(y_it − ȳ) )

          = β̂_pooled
Fact

The procedure of treating the α_i as random provides an intermediate
solution between treating them all as different (fixed effects, LSDV) and
treating them all as equal (pooled model).

    lim_{T→∞} ψ = lim_{T→∞} σ²_v / (σ²_v + T σ²_α) = 0
Interpretation

    β̂_GLS → β̂_LSDV    as T → ∞

For each i, the individual effects α_i then behave just like fixed parameters.
    P = I_T − (1 − ψ^{1/2}) (1/T) e e′

We have

    V⁻¹ = (1/σ²_v) P P′

Premultiplying the model by the transformation matrix P, we obtain the
GLS estimator by applying the least-squares method to the transformed
model (Theil (1971, Chapter 6)).
Transformation matrix

The GLS estimator is equivalent to:
1 Transforming the data by subtracting a fraction (1 − ψ^{1/2}) of the
individual means ȳ_i and x̄_i from their corresponding y_it and x_it.
2 Regressing y_it − (1 − ψ^{1/2}) ȳ_i on a constant and x_it − (1 − ψ^{1/2}) x̄_i
using simple OLS.
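The quasi-demeaning in step 1 nests both polar cases, which a tiny sketch makes visible (the function name is mine): ψ = 1 leaves the data untouched (pooled OLS), and ψ → 0 reproduces the full within demeaning of the LSDV estimator.

```python
import numpy as np

def quasi_demean(a, psi):
    """Subtract a fraction (1 - sqrt(psi)) of the individual (time) means.

    a has shape (n, T); row i holds the T observations of individual i."""
    return a - (1 - np.sqrt(psi)) * a.mean(axis=1, keepdims=True)

a = np.arange(12.0).reshape(2, 6)
assert np.allclose(quasi_demean(a, 1.0), a)                                   # pooled OLS
assert np.allclose(quasi_demean(a, 0.0), a - a.mean(axis=1, keepdims=True))   # within
print("polar cases verified")
```

Intermediate values 0 < ψ < 1 remove only part of the individual mean, which is exactly how GLS interpolates between the pooled and within estimators.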
Remark

    V(β̂_GLS) = σ²_v ( Σ_{i=1}^n X_i′ Q X_i + ψT Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹

    V(β̂_LSDV) = σ²_ε ( Σ_{i=1}^n X_i′ Q X_i )⁻¹

    V(β̂_GLS) ≤ V(β̂_LSDV)
    ψ̂ = σ̂²_v / (σ̂²_v + T σ̂²_α)

    V̂⁻¹ = (1/σ̂²_v) ( Q + ψ̂ (1/T) e e′ )
Lemma

When the sample size is large (in the sense of either n → ∞ or T → ∞),
the two-step GLS estimator has the same asymptotic efficiency as the
GLS procedure with known variance components.

Lemma

Even for moderate sample sizes (for T ≥ 3, n − (K + 1) ≥ 9; for T = 2,
n − (K + 1) ≥ 10), the two-step procedure is still more efficient than the
LSDV estimator in the sense that the difference between the covariance
matrices of the covariance (LSDV) estimator and the two-step estimator is
nonnegative definite.
Example

Let us consider a simple panel regression model for the total number of
strike days in OECD countries. We have a balanced panel data set for 17
countries (n = 17) and annual data from 1951 to 1985 (T = 35).
Here we have

    σ̂²_v = 0.25514×10⁶ = 255,140
    σ̂²_α = 55,401

    ψ̂ = σ̂²_v / (σ̂²_v + T σ̂²_α) = 255,140 / (255,140 + 35 × 55,401) = 0.1163
1 Error-component model.
2 GLS, Between and pooled estimators.
3 Write the GLS estimator as a weighted average.
4 Feasible GLS estimator.
5 Properties of the GLS estimator.
6 Asymptotic variance-covariance matrix of the GLS estimator.
Objectives

    β̂_GLS → β̂_LSDV    as T → ∞
Mundlak's specification

Mundlak Y. (1978), "On the Pooling of Time Series and Cross Section
Data", Econometrica, 46, 69-85.

Mundlak's specification

    α_i = x̄_i′ a + α_i*

where x̄_i′ a is the component proportional to x̄_i and α_i* is the component
orthogonal to x̄_i:

    E(α_i* x_it′) = 0
    ε_it = α_i + v_it

Assumption H3: The error terms ε_it = α_i + v_it are i.i.d. ∀ (i,t) with:

    E(α_i) = E(v_it) = 0
    E(α_i v_it) = 0
    E(α_i α_j) = σ²_α if i = j, 0 otherwise
    E(v_it v_js) = σ²_v if t = s and i = j, 0 otherwise
    E(v_it x_it′) = E(α_i x_it′) = 0
Mundlak's specification

The model can be rewritten as follows:

     y_i = X̃_i γ + ε_i    ∀ i = 1, …, n
    (T,1)           (T,1)

with

    ε_i = α_i* e + v_i
    X̃_i = ( e x̄_i′ : e : X_i )
    γ′ = ( a′ : μ : β′ )
Mundlak's specification

The variance-covariance matrix of the error term is defined as:

    E(ε_i ε_j′) = E( (α_i* e + v_i)(α_j* e + v_j)′ )

                = σ²_α e e′ + σ²_v I_T = V    if i = j
                = 0                           if i ≠ j
GLS estimator

Utilizing the expression for the inverse of a partitioned matrix, we obtain
the GLS estimators of μ, β, and a as:

    μ̂_GLS = ȳ − x̄′ β̂_BE

    β̂_GLS = β̂_LSDV

    â_GLS = β̂_BE − β̂_LSDV
Between estimator

The between estimator corresponds to the OLS estimator θ̂_BE obtained in
the model:

    ȳ_i = c + (β + a)′ x̄_i + ε̄_i = c + θ′ x̄_i + ε̄_i    ∀ i = 1, …, n

    θ̂_BE = ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )
    β̂_GLS →p β    as nT → ∞

Besides, we have

    β̂_GLS → β̂_LSDV    as T → ∞
    ε_it = α_i + v_it

    α_i = x̄_i′ a + α_i*
In general, we have:

    β̂_GLS = Δ β̂_BE + (I_K − Δ) β̂_LSDV

with

    Δ = ψT ( Σ_{i=1}^n X_i′ Q X_i + ψT Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )
When T is fixed and n tends to infinity, the GLS estimator is not consistent if
there is a correlation between the individual effects and the explanatory
variables:

    plim_{n→∞} β̂_GLS = Δ̄ plim_{n→∞} β̂_BE + (I_K − Δ̄) plim_{n→∞} β̂_LSDV

                     = Δ̄ (β + a) + (I_K − Δ̄) β    (2)

                     = β + Δ̄ a

with Δ̄ = plim_{n→∞} Δ.
Summary

                          E(α_i | x_i1,…,x_iK) = 0      E(α_i | x_i1,…,x_iK) ≠ 0
                          LSDV          GLS             LSDV          GLS
T fixed, n → ∞            Consistent    —               Consistent    Not consistent
T → ∞ and n → ∞           Consistent    BLUE            Consistent    Consistent
    y = f(x; β) + ε

Distance measure

This distance is naturally defined as follows:

    H = ( β̂₂ − β̂₁ )′ [ V( β̂₂ − β̂₁ ) ]⁻¹ ( β̂₂ − β̂₁ )

Generally we know V(β̂₂) and V(β̂₁), but not V(β̂₂ − β̂₁).
Theorem

From this lemma, it follows that

    V( β̂₂ − β̂₁ ) = V( β̂₂ ) − V( β̂₁ )

or equivalently

    H = q̂′ ( V(q̂) )⁻¹ q̂

with q̂ = β̂₂ − β̂₁ and, under H1, q̃ = plim_{n→∞} ( β̂₂ − β̂₁ ).
    H0: E(α_i | X_i) = 0
    H1: E(α_i | X_i) ≠ 0
    y_i = X_i β + e α_i + v_i

3 Under H1, β̂_GLS is not consistent.
    cov( β̂_GLS, β̂_LSDV − β̂_GLS ) = 0   ⟺   cov( β̂_LSDV, β̂_GLS ) = V( β̂_GLS )

Since

    V( β̂_LSDV − β̂_GLS ) = V( β̂_LSDV ) + V( β̂_GLS ) − 2 cov( β̂_LSDV, β̂_GLS )

we have:

    V( β̂_LSDV − β̂_GLS ) = V( β̂_LSDV ) − V( β̂_GLS )
Under H0: E(α_i | X_i) = 0, we have:

    H →d χ²(K)    as nT → ∞
Remarks

    H = ( β̂_LSDV − β̂_GLS )′ [ V( β̂_LSDV ) − V( β̂_GLS ) ]⁻¹ ( β̂_LSDV − β̂_GLS )

    V( β̂_LSDV ) − V( β̂_GLS ) > 0

since β̂_GLS is the BLUE.
Remarks
Example

Let us consider a simple panel regression model for the total number of
strike days in OECD countries. We have a balanced panel data set for 17
countries (n = 17) and annual data from 1951 to 1985 (T = 35).
1 Mundlak's specification.
2 Dependence between random effects and explanatory variables.
3 The GLS estimator may not be consistent (fixed T, n tends to
infinity).
4 The GLS is always consistent when T tends to infinity.
5 Hausman's lemma.
6 Hausman's specification test for fixed or random effects models.
Objectives
But some "links" between the individuals may require a panel regression
model:

    β_i  =  β  +  ξ_i
  (K,1)  (K,1)  (K,1)

where y_i and v_i are T×1 vectors (y_i1, …, y_iT)′ and (v_i1, …, v_iT)′, and X_i is
the T×K matrix of the time-series observations of the ith individual's
explanatory variables with the tth row equal to x_it′.
    E(β_i) = β        V(β_i) = Δ        ∀ i = 1, …, n

    y_it = Σ_{k=1}^K ( β_k + ξ_ki ) x_kit + v_it        i = 1, …, n

    β_ki = β_k + ξ_ki
Remarks

    E(α_i) = 0

Swamy P.A. (1970), "Efficient Inference in a Random Coefficient Regression
Model", Econometrica, 38, 311-323.

    E(ξ_i ξ_j′) = Δ if i = j, 0 otherwise

    E(v_i v_j′) = σ²_i I_T if i = j, 0 otherwise,   so   E(v_i v_i′) = σ²_i I_T

    E(β_i) = β        V(β_i) = E(ξ_i ξ_i′) = Δ
    (K,1)             (K,K)
Homogeneous moments

The variance-covariance matrix Δ of the random parameters
β_i = (β_1i, β_2i, …, β_Ki)′ is assumed to be common to all cross-section units:

    Δ = E( (β_i − β)(β_i − β)′ ) = ( σ²_β1      σ_β1,β2   …  σ_β1,βK )
  (K,K)                            ( σ_β2,β1    σ²_β2     …  σ_β2,βK )
                                   (    ⋮          ⋮      ⋱     ⋮    )
                                   ( σ_βK,β1    σ_βK,β2   …  σ²_βK   )
Vectorial form

For each cross-section unit, we have:

    y_i = X_i β + X_i ξ_i + v_i

    β_i = β + ξ_i

where X_i includes a constant term (i.e. the average of the random
individual effects, α).

    y_i = X_i β + ε_i

    ε_i = X_i ξ_i + v_i = X_i (β_i − β) + v_i
Covariance matrix

For a given cross-section unit, the covariance matrix of the composite
disturbance term ε_i = X_i ξ_i + v_i is defined by:

    Φ_i = E( ε_i ε_i′ )
        = E( (X_i ξ_i + v_i)(X_i ξ_i + v_i)′ )
        = X_i E(ξ_i ξ_i′) X_i′ + E(v_i v_i′)
        = X_i Δ X_i′ + σ²_i I_T

Definition

For a given cross-section unit, the covariance matrix of the composite
disturbance term ε_i = X_i ξ_i + v_i is defined by:

    Φ_i = X_i Δ X_i′ + σ²_i I_T
Remarks
1 Under Swamy's assumptions, the simple regression of y on X will yield
an unbiased and consistent estimator of β if (1/nT) X′X converges
to a nonzero constant matrix.
2 But the estimator is inefficient, and the usual least-squares formula
for computing the variance-covariance matrix of the estimator is
incorrect, often leading to misleading statistical inferences.
    β̂_i = ( X_i′ X_i )⁻¹ X_i′ y_i

    σ̂²_i = 1/(T − K) y_i′ ( I_T − X_i (X_i′ X_i)⁻¹ X_i′ ) y_i = 1/(T − K) Σ_{t=1}^T v̂²_it

with

    y_it = β_i′ x_it + v_it
Remarks
1 Swamy proved that substituting σ̂²_i and Δ̂ for σ²_i and Δ yields an
asymptotically normal and efficient estimator of β.
2 The speed of convergence of the GLS estimator is n^{1/2}.
2 Compute σ̂²_i and Swamy's estimator Δ̂ as follows:

    Δ̂ = 1/(n − 1) Σ_{i=1}^n ( β̂_i − (1/n) Σ_{j=1}^n β̂_j ) ( β̂_i − (1/n) Σ_{j=1}^n β̂_j )′
  (K,K)

    σ̂²_i = 1/(T − K) Σ_{t=1}^T v̂²_it

with

    Φ̂_i = X_i Δ̂ X_i′ + σ̂²_i I_T

In the example:

    Δ̂ = ( 0.0011  0.0002 )
         ( 0.0002  0.0187 )
This predictor is the best linear unbiased estimator in the sense that
E(β̂_i − β_i) = 0, where the expectation is an unconditional one.
Assumptions

      y   =    X β     +    W γ     +   v
    (nT,1)  (nT,nK)(nK,1)  (nT,nm)(nm,1)  (nT,1)

    X = ( X_1  0    …  0   )        W = ( W_1  0    …  0   )
        ( 0    X_2  …  0   )            ( 0    W_2  …  0   )
        ( ⋮     ⋮   ⋱  ⋮   )            ( ⋮     ⋮   ⋱  ⋮   )
        ( 0    0    …  X_n )            ( 0    0    …  W_n )

    β = A_1 β̄ + ζ,   where β̄ is an L×1 vector of constants,
    γ = A_2 γ̄,       where γ̄ is an n×1 vector of constants.
MFR model

Since A_2 is known, we can rewrite the model as

      y   =  X β  +    W̃ γ̄    +   v
    (nT,1)          (nT,n)(n,1)   (nT,1)

where W̃ = W A_2.

Many of the linear panel data models with unobserved individual-specific
but time-invariant heterogeneity can be treated as special cases of this
model.
Example

A common model for all cross-sectional units. If there is no interindividual
difference in behavioral patterns, we may let X = 0 and A_2 = e_n ⊗ I_m, so the
model becomes

    y_it = w_it′ γ̄ + v_it

Example

When each individual is considered different, then X = 0, A_2 = I_n ⊗ I_m,
and the model becomes

    y_it = w_it′ γ_i + v_it

Example

When the effects of the individual-specific, time-invariant omitted variables
are treated as random variables, just as in the assumption on the effects of
other omitted variables, we can let X_i = e_T, ζ′ = (ζ_1, …, ζ_n), A_1 = e_n,
C = I_n σ²_α, β̄ be an unknown constant, and w_it not contain an intercept
term. Then the model becomes:
Units in the same group share the same time profile α_gt.

Definition

The grouped fixed-effects estimator in this model is defined as the solution
of the following minimization problem:

    ( θ̂, α̂, γ̂ ) = arg min Σ_{i=1}^n Σ_{t=1}^T ( y_it − x_it′ θ − α_{g_i t} )²

where the minimum is taken over all possible groupings γ = {g_1, …, g_n} of
the n units into G groups, common parameters θ, and group-specific time
effects α.
Definition

For given values of θ and α, the optimal group assignment for each
individual unit is given by:

    ĝ_i(θ, α) = arg min_{g ∈ {1,…,G}} Σ_{t=1}^T ( y_it − x_it′ θ − α_gt )²
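This assignment step can be vectorized. The following sketch (array shapes and names are my own conventions) assigns each unit to the group whose time profile best fits its residuals:

```python
import numpy as np

def assign_groups(y, X, theta, alpha):
    """Optimal group assignment g_i(theta, alpha) for grouped fixed effects.

    y: (n, T) outcomes, X: (n, T, K) regressors,
    theta: (K,) common slopes, alpha: (G, T) group-specific time effects."""
    resid = y - X @ theta                                              # (n, T)
    dist = ((resid[:, None, :] - alpha[None, :, :]) ** 2).sum(axis=2)  # (n, G)
    return dist.argmin(axis=1)

# Two well-separated group profiles; units 0-1 belong to group 0, units 2-3 to group 1.
alpha = np.array([[0.0, 0.0], [10.0, 10.0]])
y = np.array([[0.0, 0.1], [-0.1, 0.0], [9.9, 10.0], [10.1, 9.9]])
X = np.zeros((4, 2, 1))
print(assign_groups(y, X, np.zeros(1), alpha))   # [0 0 1 1]
```

In the full estimator this step alternates with re-estimating θ and α given the current grouping, in the spirit of a k-means iteration.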