Chapter 1. Linear Panel Models and Heterogeneity: School of Economics and Management - University of Geneva
University of Orléans
February 2018
Objectives
Notations
Let us consider the following linear model:

    y_it = α_it + β′_it x_it + ε_it    ∀ i = 1, …, n,  ∀ t = 1, …, T

where α_it is a scalar that varies across i and t.

2 Regression intercepts are the same, and slope coefficients are not
(unusual):

    y_it = α + β′_i x_it + ε_it

3 Both slope and intercept coefficients are the same (homogeneous /
pooled panel):

    y_it = α + β′ x_it + ε_it
Specification Tests
Remark 1

Under H1, the residual sum of squares is equal to the sum of the n residual
sums of squares associated with the n individual regressions:

    RSS₁ = Σ_{i=1}^n RSS_{1,i} = Σ_{i=1}^n Σ_{t=1}^T ε̂²_it = Σ_{i=1}^n ( S_yy,i − S′_xy,i S⁻¹_xx,i S_xy,i )

with

    S_yy,i = Σ_{t=1}^T (y_it − ȳ_i)²    where  x̄_i = (1/T) Σ_{t=1}^T x_it   and   ȳ_i = (1/T) Σ_{t=1}^T y_it

    S_xx,i = Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′        S_xy,i = Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i)
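As a quick numerical illustration of Remark 1 (a sketch, not from the slides; the function name and array shapes are my own conventions), RSS₁ can be computed by running the n individual OLS regressions and summing their residual sums of squares:

```python
import numpy as np

def rss_individual_regressions(y, X):
    """RSS_1 = sum of the residual sums of squares of the n individual
    OLS regressions (each with its own intercept and slopes).

    y: (n, T) outcomes, X: (n, T, K) regressors."""
    n, T, K = X.shape
    total = 0.0
    for i in range(n):
        Zi = np.column_stack([np.ones(T), X[i]])        # constant + regressors
        coef, *_ = np.linalg.lstsq(Zi, y[i], rcond=None)
        resid = y[i] - Zi @ coef
        total += resid @ resid
    return total

# If every individual's data lie exactly on a line, RSS_1 is (numerically) zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 6, 1))
y = 1.0 + 2.0 * X[:, :, 0]
print(rss_individual_regressions(y, X))
```

Under H1 each individual has its own regression, so this is exactly the unrestricted residual sum of squares used in the poolability tests below.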
Remark 2

Under H0¹, the model becomes the pooled model

    y_it = α + β′ x_it + ε_it

with

    S_yy = Σ_{i=1}^n Σ_{t=1}^T (y_it − ȳ)²

    S_xx = Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(x_it − x̄)′

    S_xy = Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(y_it − ȳ)
    H0²: β_i = β    ∀ i = 1, …, n
    S_yy,i = Σ_{t=1}^T (y_it − ȳ_i)²    with  x̄_i = (1/T) Σ_{t=1}^T x_it   and   ȳ_i = (1/T) Σ_{t=1}^T y_it

    S_xx,i = Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′

    S_xy,i = Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i)
Under the null, the model is homogeneous (pooled) and the restricted
residual sum of squares is RSS_{1,c}.
Under the alternative, the model is y_it = α_i + β′ x_it + ε_it, and there are
nT − K − n degrees of freedom.

    F₃ = [ (RSS_{1,c} − RSS_{1,c′}) / (n − 1) ] / [ RSS_{1,c′} / (n(T − 1) − K) ]    (1)
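The F₃ statistic is a one-liner once the two residual sums of squares are known. A minimal sketch of equation (1) (the function name is mine, not from the slides):

```python
def f3_statistic(rss_pooled, rss_fe, n, T, K):
    """F3 for H0: alpha_i = alpha, given a common slope vector beta.

    rss_pooled : restricted RSS (pooled model, RSS_1,c)
    rss_fe     : unrestricted RSS (individual-effects model, RSS_1,c')
    Under H0, F3 follows F(n - 1, n(T - 1) - K)."""
    num = (rss_pooled - rss_fe) / (n - 1)
    den = rss_fe / (n * (T - 1) - K)
    return num / den
```

With the strikes example used later (n = 17, T = 35, K = 2), the degrees of freedom are (16, 576); the RSS values themselves come from the pooled and individual-effects regressions.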
s_it: the number of strike days per 1,000 workers for country i at
time t
u_it: the unemployment rate
p_it: the inflation rate
    F₁ ~ F(48, 544)   under H0¹

since

    (n − 1)(K + 1) = (17 − 1)(2 + 1) = 48
    nT − n(K + 1) = 595 − 17 × (2 + 1) = 544
    F₂ ~ F(32, 544)   under H0²

since

    (n − 1) K = (17 − 1) × 2 = 32
    nT − n(K + 1) = 595 − 17 × (2 + 1) = 544
    F₃ ~ F(16, 576)   under H0³

since

    (n − 1) = 17 − 1 = 16
    n(T − 1) − K = 17 × (35 − 1) − 2 = 576
Example

Is it reasonable to assume that the slope parameters of the production
function are the same across countries? What does that imply? Should we
impose a common mean for the TFP of France and Germany? The answer is
probably no.
Objectives
    y_i = e α_i + X_i β + ε_i

    E(ε_i) = 0
    E(ε_i ε_i′) = σ²_ε I_T
    E(ε_i ε_j′) = 0   if i ≠ j
     y_i  =   e α_i   +   X_i β   +   ε_i
    (T,1)  (T,1)(1,1)   (T,K)(K,1)   (T,1)

For example, with T = 3 and K = 2:

    ( y_i,1 )   ( 1 )         ( k_i,1  n_i,1 )           ( ε_i,1 )
    ( y_i,2 ) = ( 1 ) α_i  +  ( k_i,2  n_i,2 ) (β_k)  +  ( ε_i,2 )
    ( y_i,3 )   ( 1 )         ( k_i,3  n_i,3 ) (β_n)     ( ε_i,3 )
    (3,1)     (3,1)(1,1)      (3,2)(2,1)                 (3,1)
    Y = ẽ α + X β + ε

with

    Y = ( y_1 )        X = ( X_1 )        ε = ( ε_1 )
        ( y_2 )            ( X_2 )            ( ε_2 )
        (  ⋮  )            (  ⋮  )            (  ⋮  )
        ( y_n )            ( X_n )            ( ε_n )
      (Tn,1)             (Tn,K)             (Tn,1)
where 0_T is the null vector (T,1):

    ẽ = I_n ⊗ e = ( e    0_T  …  0_T )        α = ( α_1 )
                  ( 0_T  e    …  0_T )            ( α_2 )
                  (  ⋮    ⋮   ⋱   ⋮  )            (  ⋮  )
                  ( 0_T  0_T  …  e   )            ( α_n )
      (Tn,n)                                  (n,1)

    Y = ẽ α + X β + ε
For n = 2 and T = 3:

    ( y_1,1 )   ( 1 0 )          ( k_11  n_11 )          ( ε_1,1 )
    ( y_1,2 )   ( 1 0 )          ( k_12  n_12 )          ( ε_1,2 )
    ( y_1,3 ) = ( 1 0 ) (α_1) +  ( k_13  n_13 ) (β_k) +  ( ε_1,3 )
    ( y_2,1 )   ( 0 1 ) (α_2)    ( k_21  n_21 ) (β_n)    ( ε_2,1 )
    ( y_2,2 )   ( 0 1 )          ( k_22  n_22 )          ( ε_2,2 )
    ( y_2,3 )   ( 0 1 )          ( k_23  n_23 )          ( ε_2,3 )
Discussion
For Wooldridge (2010), these discussions about whether the αi should
be treated as random variables or as parameters to be estimated are
wrongheaded for microeconomic panels.
Mundlak Y. (1978), ”On the Pooling of Time Series and Cross Section
Data”, Econometrica, 46, 69-85
Remarks
Actually, a stronger conditional mean independence assumption is:

    E(α_i | x_i1, …, x_iT) = 0
Remarks
Wooldridge (2010) avoids referring to α_i as a random effect or a fixed
effect. Instead, he refers to α_i as an unobserved effect, unobserved
heterogeneity, and so on.
where RD_it is spending on R&D for firm i at time t and z_it contains other
explanatory variables. α_i is a firm heterogeneity term that may influence
patents_it and that may be correlated with current, past, and future R&D
expenditures:

    cov(α_i, RD_{i,t−k}) ≠ 0    ∀ k
Objectives
Notations

Let us denote

    y_i = ( y_i,1 )        X_i = ( x_1,i,1  x_2,i,1  …  x_K,i,1 )
          ( y_i,2 )              ( x_1,i,2  x_2,i,2  …  x_K,i,2 )
          (   ⋮   )              (    ⋮        ⋮     ⋱     ⋮    )
          ( y_i,T )              ( x_1,i,T  x_2,i,T  …  x_K,i,T )
      (T,1)                  (T,K)
    y_i = e α_i + X_i β + ε_i    ∀ i = 1, …, n
Assumptions (H1): The error terms ε_it are i.i.d. ∀ (i,t) with:

    E(ε_it) = 0

    E(ε_it ε_is) = σ²_ε if t = s, 0 otherwise,   or   E(ε_i ε_i′) = σ²_ε I_T

where I_T denotes the (T,T) identity matrix.
Theorem
Under assumptions H1 , the ordinary-least-squares (OLS) estimator of β is
the best linear unbiased estimator (BLUE).
with

    x̄_i = (1/T) Σ_{t=1}^T x_it        ȳ_i = (1/T) Σ_{t=1}^T y_it

Given the second FOC (with respect to β) and the previous result, we can
derive the formula for β̂_LSDV:

    β̂_LSDV = ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′ )⁻¹ ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i) )
Remarks

    y_i = e α_i + X_i β + ε_i

    Q = I_T − (1/T) e e′ = ( 1 − 1/T   −1/T     …   −1/T    )
                           (  −1/T    1 − 1/T   …   −1/T    )
                           (    ⋮        ⋮      ⋱     ⋮     )
                           (  −1/T     −1/T     …  1 − 1/T  )
      (T,T)
    Q X_i = X_i − (1/T) e e′ X_i

          = ( x_1,i,1  x_2,i,1  …  x_K,i,1 )          ( 1 )
            ( x_1,i,2  x_2,i,2  …  x_K,i,2 )  − (1/T) ( 1 ) ( Σ_{t=1}^T x_1,it   Σ_{t=1}^T x_2,it   …   Σ_{t=1}^T x_K,it )
            (    ⋮        ⋮     ⋱     ⋮    )          ( ⋮ )
            ( x_1,i,T  x_2,i,T  …  x_K,i,T )          ( 1 )
    Q e = ( I_T − (1/T) e e′ ) e = e − (1/T) e e′ e = e − e = 0

since

    e′e = ( 1 … 1 ) ( 1 )
                    ( ⋮ ) = T
                    ( 1 )
So, we have:

    y_i = e α_i + X_i β + ε_i

where

    Q = I_T − (1/T) e e′

    Q z_i = ( I_T − (1/T) e e′ ) z_i = z_i − (1/T) e e′ z_i = z_i − z̄_i e

which equals 0_T when z_it is constant over t (as for e α_i).
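These properties of Q are easy to verify numerically. A minimal NumPy sketch (my own check, not from the slides) confirms Qz = z − z̄e, Qe = 0, and idempotency:

```python
import numpy as np

T = 5
e = np.ones((T, 1))
Q = np.eye(T) - (e @ e.T) / T           # within (covariance) transformation

z = np.arange(1.0, T + 1.0).reshape(-1, 1)    # an arbitrary T x 1 vector
assert np.allclose(Q @ z, z - z.mean() * e)   # Qz = z - zbar * e
assert np.allclose(Q @ e, np.zeros((T, 1)))   # Qe = 0: constants are swept out
assert np.allclose(Q @ Q, Q)                  # Q is idempotent
print("all Q properties verified")
```

Because Qe = 0, premultiplying y_i = e α_i + X_i β + ε_i by Q removes the individual effect, which is exactly why the within transformation delivers the LSDV estimator.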
Example

Let us consider a simple panel regression model for the total number of
strike days in OECD countries. We have a balanced panel data set for 17
countries (n = 17) and annual data from 1951 to 1985 (T = 35). General
idea: evaluate the link between strikes and some macroeconomic factors
(inflation, unemployment, etc.)

s_it: the number of strike days per 1,000 workers for country i at
time t
u_it: the unemployment rate
p_it: the inflation rate
Theorem

The LSDV estimator β̂_LSDV is unbiased and consistent when either n, or T, or
both tend to infinity:

    β̂_LSDV →p β    as nT → ∞

Theorem

The estimator of the unobserved effects α̂_i, although unbiased, is
consistent only when T → ∞:

    α̂_i →p α_i    as T → ∞
Theorem

The asymptotic variance-covariance matrix of the LSDV estimator β̂_LSDV is
given by:

    V(β̂_LSDV) = σ²_ε ( Σ_{i=1}^n X_i′ Q X_i )⁻¹

    V̂(β̂_LSDV) = σ̂²_ε ( Σ_{i=1}^n X_i′ Q X_i )⁻¹

with

    σ̂²_ε = 1/(nT − K − n) Σ_{i=1}^n Σ_{t=1}^T ε̂²_it

In the strikes example:

    σ̂²_ε = 0.146958×10⁹ / (595 − 2 − 17) ≈ 255,135    so that    σ̂_ε ≈ 505.1093
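The arithmetic of this example can be replicated directly (values as on the slide; the interpretation of 505.1093 as the residual standard error is implied by the square root):

```python
rss = 0.146958e9              # sum of squared LSDV residuals (from the slide)
nT, K, n = 595, 2, 17
sigma2_eps = rss / (nT - K - n)   # degrees of freedom: 595 - 2 - 17 = 576
sigma_eps = sigma2_eps ** 0.5     # residual standard error
print(sigma2_eps, sigma_eps)
```

This gives σ̂²_ε ≈ 255,135, which matches the estimate of σ̂²_v = 0.25514×10⁶ reported later for the same data.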
Objectives
Terminology

    ε_it = α_i + λ_t + v_it

    E(α_i α_j) = σ²_α if i = j, 0 otherwise
    E(λ_t λ_s) = σ²_λ if t = s, 0 otherwise
    E(v_it v_js) = σ²_v if t = s and i = j, 0 otherwise
    E(α_i x_it′) = E(λ_t x_it′) = E(v_it x_it′) = 0
Remark

As suggested by Wooldridge (2001), the "fixed effect" specification can be
viewed as a case in which α_i is a random parameter with

    cov(α_i, x_it′) ≠ 0

whereas the "random effect" specification corresponds to

    cov(α_i, x_it′) = 0        E(α_i) = μ

Two-way error component: ε_it = α_i + λ_t + v_it.
One-way (individual effects only): ε_it = α_i + v_it.
Vectorial form

The vectorial expression of the individual effects model is then defined as:

     y_i  =    X̃_i γ     +   ε_i
    (T,1)  (T,K+1)(K+1,1)   (T,1)

     ε_i  =   e α_i   +   v_i
    (T,1)  (T,1)(1,1)    (T,1)

with X̃_i = (e : X_i) and γ′ = (μ : β′).
Remark

The presence of α_i produces a correlation among the errors of the same
cross-sectional unit (autocorrelation).
Remark

If we consider the nT×1 vector of errors ε = (ε_1′, …, ε_n′)′, we have

    V(ε) = E(ε ε′) = I_n ⊗ V

            ( V  0  …  0 )
    V(ε)  = ( 0  V  …  0 )      where each block V is of dimension (T,T)
            ( ⋮  ⋮  ⋱  ⋮ )
  (nT,nT)   ( 0  0  …  V )
Within transformation

Regardless of whether the α_i are treated as fixed or as random, the
individual-specific effects for a given sample can be swept out by the
idempotent (covariance) transformation matrix Q.
Since Q e = ( I_T − (1/T) e e′ ) e = 0, we have
Theorem

Under assumptions H2, when the α_i are treated as random, the LSDV
estimator is unbiased and consistent when either n, or T, or both tend to
infinity. However, whereas the LSDV estimator is the BLUE under the
assumption that the α_i are fixed constants, it is not the BLUE when the α_i
are assumed random. The BLUE in the latter case is the
Generalized-Least-Squares (GLS) estimator.
Summary

Notations

Let us consider the model

    y_i = X̃_i γ + ε_i    ∀ i = 1, …, n

where ε_i = α_i e + v_i, X̃_i = (e : X_i) and γ′ = (μ : β′).

    V⁻¹ = (1/σ²_v) ( Q + ψ (1/T) e e′ )

    ψ = σ²_v / (σ²_v + T σ²_α)
Proof

    V⁻¹ V = (1/σ²_v) ( Q + ψ (1/T) e e′ ) ( σ²_α e e′ + σ²_v I_T )

          = (1/σ²_v) ( σ²_α Q e e′ + σ²_v Q + ψ (σ²_α/T) e e′ e e′ + ψ (σ²_v/T) e e′ )

          = (1/σ²_v) ( σ²_v Q + ψ σ²_α e e′ + ψ (σ²_v/T) e e′ )    since Q e = 0 and e′e = T

          = Q + (1/σ²_v) (1/T) e e′ ψ ( T σ²_α + σ²_v )

          = I_T − (1/T) e e′ + (1/T) e e′ ψ ( T σ²_α + σ²_v ) / σ²_v

Proof (cont'd)

    V⁻¹ V = I_T − (1/T) e e′ + (1/T) e e′ ψ ( T σ²_α + σ²_v ) / σ²_v

Since

    ψ = σ²_v / (σ²_v + T σ²_α)

we have ψ (T σ²_α + σ²_v) / σ²_v = 1, and therefore

    V⁻¹ V = I_T − (1/T) e e′ + (1/T) e e′ = I_T
with X̃_i = (e : X_i) and γ′ = (μ : β′). The GLS estimator of β is:

    β̂_GLS = ( (1/T) Σ_{i=1}^n X_i′ Q X_i + ψ Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹
            ( (1/T) Σ_{i=1}^n X_i′ Q y_i + ψ Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )

with ψ = σ²_v (σ²_v + T σ²_α)⁻¹.
    ȳ_i = c + β′ x̄_i + ε̄_i    ∀ i = 1, …, n

    β̂_BE = ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )

    β̂_pooled = ( Σ_{t=1}^T Σ_{i=1}^n (x_it − x̄)(x_it − x̄)′ )⁻¹ ( Σ_{t=1}^T Σ_{i=1}^n (x_it − x̄)(y_it − ȳ) )
Theorem

Under assumptions H2, the GLS estimator β̂_GLS is a weighted average of
the between-group estimator β̂_BE and the within-group (LSDV) estimator β̂_LSDV:

    β̂_GLS = Δ β̂_BE + (I_K − Δ) β̂_LSDV

with

    Δ = ψT ( Σ_{i=1}^n X_i′ Q X_i + ψT Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )

When Δ = 0:

    β̂_GLS = β̂_LSDV
    β̂_BE = ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )

    β̂_LSDV = ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(x_it − x̄_i)′ )⁻¹ ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i) )
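On simulated data the two estimators can be compared directly. This is a sketch under my own simulation design (names and parameter values are assumptions): the within estimator removes the individual effects and recovers β, while the between estimator uses only the cross-sectional variation in the individual means and is contaminated by the α_i:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, K = 200, 10, 2
beta = np.array([1.5, -0.7])

X = rng.normal(size=(n, T, K))
alpha = rng.normal(size=n)                           # individual effects
y = alpha[:, None] + X @ beta + 0.1 * rng.normal(size=(n, T))

# Within (LSDV): demean within each individual, then pooled OLS on deviations.
Xw = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
yw = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
b_lsdv, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# Between: OLS of individual means on individual means (with a constant).
Xb = np.column_stack([np.ones(n), X.mean(axis=1)])
b_be, *_ = np.linalg.lstsq(Xb, y.mean(axis=1), rcond=None)

print(b_lsdv, b_be[1:])
```

Both point estimates are close to β here because the simulated α_i are uncorrelated with the regressors; the between estimate is noticeably noisier, since it uses only n observations.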
Proof: case 2 (ψ = 1)

    β̂_GLS = Δ β̂_BE + (I_K − Δ) β̂_LSDV

          = ( Σ_{i=1}^n X_i′ Q X_i + T Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹
            ( T Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) + Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄_i)(y_it − ȳ_i) )

Using the decompositions

    Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(x_it − x̄)′ = Σ_{i=1}^n X_i′ Q X_i + T Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′

and the analogous identity for the cross-products, we get

    β̂_GLS = ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(x_it − x̄)′ )⁻¹ ( Σ_{i=1}^n Σ_{t=1}^T (x_it − x̄)(y_it − ȳ) )

          = β̂_pooled
Fact

The procedure of treating the α_i as random provides an intermediate
solution between treating them all as different (fixed effects, LSDV) and
treating them all as equal (pooled model).

    lim_{T→∞} ψ = lim_{T→∞} σ²_v / (σ²_v + T σ²_α) = 0
Interpretation

    β̂_GLS → β̂_LSDV    as T → ∞

For each i, the individual effects α_i then behave just like fixed parameters.
    P = I_T − (1 − ψ^{1/2}) (1/T) e e′

We have

    V⁻¹ = (1/σ²_v) P P′

Premultiplying the model by the transformation matrix P, we obtain the
GLS estimator by applying the least-squares method to the transformed
model (Theil (1971, Chapter 6)).
Transformation matrix

The GLS estimator is equivalent to:
1 Transforming the data by subtracting a fraction (1 − ψ^{1/2}) of the
individual means ȳ_i and x̄_i from their corresponding y_it and x_it.
2 Regressing y_it − (1 − ψ^{1/2}) ȳ_i on a constant and x_it − (1 − ψ^{1/2}) x̄_i
using simple OLS.
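The quasi-demeaning in step 1 nests both polar cases, which a tiny sketch makes visible (the function name is mine): ψ = 1 leaves the data untouched (pooled OLS), and ψ → 0 reproduces the full within demeaning of the LSDV estimator.

```python
import numpy as np

def quasi_demean(a, psi):
    """Subtract a fraction (1 - sqrt(psi)) of the individual (time) means.

    a has shape (n, T); row i holds the T observations of individual i."""
    return a - (1 - np.sqrt(psi)) * a.mean(axis=1, keepdims=True)

a = np.arange(12.0).reshape(2, 6)
assert np.allclose(quasi_demean(a, 1.0), a)                                   # pooled OLS
assert np.allclose(quasi_demean(a, 0.0), a - a.mean(axis=1, keepdims=True))   # within
print("polar cases verified")
```

Intermediate values 0 < ψ < 1 remove only part of the individual mean, which is exactly how GLS interpolates between the pooled and within estimators.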
Remark

    V(β̂_GLS) = σ²_v ( Σ_{i=1}^n X_i′ Q X_i + ψT Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹

    V(β̂_LSDV) = σ²_ε ( Σ_{i=1}^n X_i′ Q X_i )⁻¹

    V(β̂_GLS) ≤ V(β̂_LSDV)
    ψ̂ = σ̂²_v / (σ̂²_v + T σ̂²_α)

    V̂⁻¹ = (1/σ̂²_v) ( Q + ψ̂ (1/T) e e′ )
Lemma

When the sample size is large (in the sense of either n → ∞ or T → ∞),
the two-step GLS estimator has the same asymptotic efficiency as the
GLS procedure with known variance components.

Lemma

Even for moderate sample sizes (for T ≥ 3, n − (K + 1) ≥ 9; for T = 2,
n − (K + 1) ≥ 10), the two-step procedure is still more efficient than the
LSDV estimator in the sense that the difference between the covariance
matrices of the covariance (LSDV) estimator and the two-step estimator is
nonnegative definite.
Example

Let us consider a simple panel regression model for the total number of
strike days in OECD countries. We have a balanced panel data set for 17
countries (n = 17) and annual data from 1951 to 1985 (T = 35).
Here we have

    σ̂²_v = 0.25514×10⁶ = 255,140
    σ̂²_α = 55,401

    ψ̂ = σ̂²_v / (σ̂²_v + T σ̂²_α) = 255,140 / (255,140 + 35 × 55,401) = 0.1163
1 Error-component model.
2 GLS, Between and pooled estimators.
3 Write the GLS estimator as a weighted average.
4 Feasible GLS estimator.
5 Properties of the GLS estimator.
6 Asymptotic variance-covariance matrix of the GLS estimator.
Objectives

    β̂_GLS → β̂_LSDV    as T → ∞
Mundlak's specification

Mundlak Y. (1978), "On the Pooling of Time Series and Cross Section
Data", Econometrica, 46, 69-85.

Mundlak's specification

    α_i = x̄_i′ a + α_i*

where x̄_i′ a is the component proportional to x̄_i and α_i* is the component
orthogonal to x̄_i:

    E(α_i* x_it′) = 0
    ε_it = α_i + v_it

Assumption H3: The error terms ε_it = α_i + v_it are i.i.d. ∀ (i,t) with:

    E(α_i) = E(v_it) = 0
    E(α_i v_it) = 0
    E(α_i α_j) = σ²_α if i = j, 0 otherwise
    E(v_it v_js) = σ²_v if t = s and i = j, 0 otherwise
    E(v_it x_it′) = E(α_i x_it′) = 0
Mundlak's specification

The model can be rewritten as follows:

     y_i = X̃_i γ + ε_i    ∀ i = 1, …, n
    (T,1)           (T,1)

with

    ε_i = α_i* e + v_i
    X̃_i = ( e x̄_i′ : e : X_i )
    γ′ = ( a′ : μ : β′ )
Mundlak's specification

The variance-covariance matrix of the error term is defined as:

    E(ε_i ε_j′) = E( (α_i* e + v_i)(α_j* e + v_j)′ )

                = σ²_α e e′ + σ²_v I_T = V    if i = j
                = 0                           if i ≠ j
GLS estimator

Utilizing the expression for the inverse of a partitioned matrix, we obtain
the GLS estimators of μ, β, and a as:

    μ̂_GLS = ȳ − x̄′ β̂_BE

    β̂_GLS = β̂_LSDV

    â_GLS = β̂_BE − β̂_LSDV
Between estimator

The between estimator corresponds to the OLS estimator θ̂_BE obtained in
the model:

    ȳ_i = c + (β + a)′ x̄_i + ε̄_i = c + θ′ x̄_i + ε̄_i    ∀ i = 1, …, n

    θ̂_BE = ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(ȳ_i − ȳ) )
    β̂_GLS →p β    as nT → ∞

Besides, we have

    β̂_GLS → β̂_LSDV    as T → ∞
    ε_it = α_i + v_it

    α_i = x̄_i′ a + α_i*
In general, we have:

    β̂_GLS = Δ β̂_BE + (I_K − Δ) β̂_LSDV

with

    Δ = ψT ( Σ_{i=1}^n X_i′ Q X_i + ψT Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )⁻¹ ( Σ_{i=1}^n (x̄_i − x̄)(x̄_i − x̄)′ )
When T is fixed and n tends to infinity, the GLS estimator is not consistent if
there is a correlation between the individual effects and the explanatory
variables:

    plim_{n→∞} β̂_GLS = Δ̄ plim_{n→∞} β̂_BE + (I_K − Δ̄) plim_{n→∞} β̂_LSDV

                     = Δ̄ (β + a) + (I_K − Δ̄) β    (2)

                     = β + Δ̄ a

with Δ̄ = plim_{n→∞} Δ.
Summary

                          E(α_i | x_i1,…,x_iK) = 0      E(α_i | x_i1,…,x_iK) ≠ 0
                          LSDV          GLS             LSDV          GLS
T fixed, n → ∞            Consistent    —               Consistent    Not consistent
T → ∞ and n → ∞           Consistent    BLUE            Consistent    Consistent
    y = f(x; β) + ε

Distance measure

This distance is naturally defined as follows:

    H = ( β̂₂ − β̂₁ )′ [ V( β̂₂ − β̂₁ ) ]⁻¹ ( β̂₂ − β̂₁ )

Generally we know V(β̂₂) and V(β̂₁), but not V(β̂₂ − β̂₁).
Theorem

From this lemma, it follows that

    V( β̂₂ − β̂₁ ) = V( β̂₂ ) − V( β̂₁ )

or equivalently

    H = q̂′ ( V(q̂) )⁻¹ q̂

with q̂ = β̂₂ − β̂₁ and, under H1, q̃ = plim_{n→∞} ( β̂₂ − β̂₁ ).
    H0: E(α_i | X_i) = 0
    H1: E(α_i | X_i) ≠ 0
    y_i = X_i β + e α_i + v_i

3 Under H1, β̂_GLS is not consistent.
    cov( β̂_GLS, β̂_LSDV − β̂_GLS ) = 0   ⟺   cov( β̂_LSDV, β̂_GLS ) = V( β̂_GLS )

Since

    V( β̂_LSDV − β̂_GLS ) = V( β̂_LSDV ) + V( β̂_GLS ) − 2 cov( β̂_LSDV, β̂_GLS )

we have:

    V( β̂_LSDV − β̂_GLS ) = V( β̂_LSDV ) − V( β̂_GLS )
Under H0: E(α_i | X_i) = 0, we have:

    H →d χ²(K)    as nT → ∞
Remarks

    H = ( β̂_LSDV − β̂_GLS )′ [ V( β̂_LSDV ) − V( β̂_GLS ) ]⁻¹ ( β̂_LSDV − β̂_GLS )

    V( β̂_LSDV ) − V( β̂_GLS ) > 0

since β̂_GLS is the BLUE.
Remarks
Example

Let us consider a simple panel regression model for the total number of
strike days in OECD countries. We have a balanced panel data set for 17
countries (n = 17) and annual data from 1951 to 1985 (T = 35).
1 Mundlak's specification.
2 Dependence between random effects and explanatory variables.
3 The GLS estimator may not be consistent (fixed T, n tends to
infinity).
4 The GLS is always consistent when T tends to infinity.
5 Hausman's lemma.
6 Hausman's specification test for fixed or random effects models.
Objectives
But some "links" between the individuals may require a panel regression
model:

    β_i  =  β  +  ξ_i
  (K,1)  (K,1)  (K,1)

where y_i and v_i are T×1 vectors (y_i1, …, y_iT)′ and (v_i1, …, v_iT)′, and X_i is
the T×K matrix of the time-series observations of the ith individual's
explanatory variables with the tth row equal to x_it′.
    E(β_i) = β        V(β_i) = Δ        ∀ i = 1, …, n

    y_it = Σ_{k=1}^K ( β_k + ξ_ki ) x_kit + v_it        i = 1, …, n

    β_ki = β_k + ξ_ki
Remarks

    E(α_i) = 0

Swamy P.A. (1970), "Efficient Inference in a Random Coefficient Regression
Model", Econometrica, 38, 311-323.

    E(ξ_i ξ_j′) = Δ if i = j, 0 otherwise

    E(v_i v_j′) = σ²_i I_T if i = j, 0 otherwise,   so   E(v_i v_i′) = σ²_i I_T

    E(β_i) = β        V(β_i) = E(ξ_i ξ_i′) = Δ
    (K,1)             (K,K)
Homogeneous moments

The variance-covariance matrix Δ of the random parameters
β_i = (β_1i, β_2i, …, β_Ki)′ is assumed to be common to all cross-section units:

    Δ = E( (β_i − β)(β_i − β)′ ) = ( σ²_β1      σ_β1,β2   …  σ_β1,βK )
  (K,K)                            ( σ_β2,β1    σ²_β2     …  σ_β2,βK )
                                   (    ⋮          ⋮      ⋱     ⋮    )
                                   ( σ_βK,β1    σ_βK,β2   …  σ²_βK   )
Vectorial form

For each cross-section unit, we have:

    y_i = X_i β + X_i ξ_i + v_i

    β_i = β + ξ_i

where X_i includes a constant term (i.e. the average of the random
individual effects, α).

    y_i = X_i β + ε_i

    ε_i = X_i ξ_i + v_i = X_i (β_i − β) + v_i
Covariance matrix

For a given cross-section unit, the covariance matrix of the composite
disturbance term ε_i = X_i ξ_i + v_i is defined by:

    Φ_i = E( ε_i ε_i′ )
        = E( (X_i ξ_i + v_i)(X_i ξ_i + v_i)′ )
        = X_i E(ξ_i ξ_i′) X_i′ + E(v_i v_i′)
        = X_i Δ X_i′ + σ²_i I_T

Definition

For a given cross-section unit, the covariance matrix of the composite
disturbance term ε_i = X_i ξ_i + v_i is defined by:

    Φ_i = X_i Δ X_i′ + σ²_i I_T
Remarks
1 Under Swamy's assumptions, the simple regression of y on X will yield
an unbiased and consistent estimator of β if (1/nT) X′X converges
to a nonzero constant matrix.
2 But the estimator is inefficient, and the usual least-squares formula
for computing the variance-covariance matrix of the estimator is
incorrect, often leading to misleading statistical inferences.
    β̂_i = ( X_i′ X_i )⁻¹ X_i′ y_i

    σ̂²_i = 1/(T − K) y_i′ ( I_T − X_i (X_i′ X_i)⁻¹ X_i′ ) y_i = 1/(T − K) Σ_{t=1}^T v̂²_it

with

    y_it = β_i′ x_it + v_it
Remarks
1 Swamy proved that substituting σ̂²_i and Δ̂ for σ²_i and Δ yields an
asymptotically normal and efficient estimator of β.
2 The speed of convergence of the GLS estimator is n^{1/2}.
2 Compute σ̂²_i and Swamy's estimator Δ̂ as follows:

    Δ̂ = 1/(n − 1) Σ_{i=1}^n ( β̂_i − (1/n) Σ_{j=1}^n β̂_j ) ( β̂_i − (1/n) Σ_{j=1}^n β̂_j )′
  (K,K)

    σ̂²_i = 1/(T − K) Σ_{t=1}^T v̂²_it

with

    Φ̂_i = X_i Δ̂ X_i′ + σ̂²_i I_T

In the example:

    Δ̂ = ( 0.0011  0.0002 )
         ( 0.0002  0.0187 )
This predictor is the best linear unbiased estimator in the sense that
E(β̂_i − β_i) = 0, where the expectation is an unconditional one.
Assumptions

      y   =    X β     +    W γ     +   v
    (nT,1)  (nT,nK)(nK,1)  (nT,nm)(nm,1)  (nT,1)

    X = ( X_1  0    …  0   )        W = ( W_1  0    …  0   )
        ( 0    X_2  …  0   )            ( 0    W_2  …  0   )
        ( ⋮     ⋮   ⋱  ⋮   )            ( ⋮     ⋮   ⋱  ⋮   )
        ( 0    0    …  X_n )            ( 0    0    …  W_n )

    β = A_1 β̄ + ζ,   where β̄ is an L×1 vector of constants,
    γ = A_2 γ̄,       where γ̄ is an n×1 vector of constants.
MFR model

Since A_2 is known, we can rewrite the model as

      y   =  X β  +    W̃ γ̄    +   v
    (nT,1)          (nT,n)(n,1)   (nT,1)

where W̃ = W A_2.

Many of the linear panel data models with unobserved individual-specific
but time-invariant heterogeneity can be treated as special cases of this
model.
Example

A common model for all cross-sectional units. If there is no interindividual
difference in behavioral patterns, we may let X = 0 and A_2 = e_n ⊗ I_m, so the
model becomes

    y_it = w_it′ γ̄ + v_it

Example

When each individual is considered different, then X = 0, A_2 = I_n ⊗ I_m,
and the model becomes

    y_it = w_it′ γ_i + v_it

Example

When the effects of the individual-specific, time-invariant omitted variables
are treated as random variables, just as in the assumption on the effects of
other omitted variables, we can let X_i = e_T, ζ′ = (ζ_1, …, ζ_n), A_1 = e_n,
C = I_n σ²_α, β̄ be an unknown constant, and w_it not contain an intercept
term. Then the model becomes:
Units in the same group share the same time profile α_gt.

Definition

The grouped fixed-effects estimator in this model is defined as the solution
of the following minimization problem:

    ( θ̂, α̂, γ̂ ) = arg min Σ_{i=1}^n Σ_{t=1}^T ( y_it − x_it′ θ − α_{g_i t} )²

where the minimum is taken over all possible groupings γ = {g_1, …, g_n} of
the n units into G groups, common parameters θ, and group-specific time
effects α.
Definition

For given values of θ and α, the optimal group assignment for each
individual unit is given by:

    ĝ_i(θ, α) = arg min_{g ∈ {1,…,G}} Σ_{t=1}^T ( y_it − x_it′ θ − α_gt )²
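This assignment step can be vectorized. The following sketch (array shapes and names are my own conventions) assigns each unit to the group whose time profile best fits its residuals:

```python
import numpy as np

def assign_groups(y, X, theta, alpha):
    """Optimal group assignment g_i(theta, alpha) for grouped fixed effects.

    y: (n, T) outcomes, X: (n, T, K) regressors,
    theta: (K,) common slopes, alpha: (G, T) group-specific time effects."""
    resid = y - X @ theta                                              # (n, T)
    dist = ((resid[:, None, :] - alpha[None, :, :]) ** 2).sum(axis=2)  # (n, G)
    return dist.argmin(axis=1)

# Two well-separated group profiles; units 0-1 belong to group 0, units 2-3 to group 1.
alpha = np.array([[0.0, 0.0], [10.0, 10.0]])
y = np.array([[0.0, 0.1], [-0.1, 0.0], [9.9, 10.0], [10.1, 9.9]])
X = np.zeros((4, 2, 1))
print(assign_groups(y, X, np.zeros(1), alpha))   # [0 0 1 1]
```

In the full estimator this step alternates with re-estimating θ and α given the current grouping, in the spirit of a k-means iteration.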