
Regression with Indicator Variables
Ani Katchova

© 2020 by Ani Katchova. All rights reserved.


Outline
• Single indicator variable
• Interaction terms with another indicator variable
• Several related indicator variables
• Interaction terms with a non-indicator variable
• F-test for differences across groups
• Chow test for differences across groups

Indicator variables
• Indicator variables or dummy variables are binary variables defined as 0 or
1.
• Indicator variables convey qualitative information (as opposed to
quantitative information).
• Examples: female, married, graduated, insured.
• Example: variable female where female=1 for females, female=0 for males.
• “gender” is not a good variable name because it is not clear if gender=1 for
male or female.
• An indicator variable can be an independent variable (discussed here) or
the dependent variable in a probit/logit model (discussed later).
Single indicator variable
• Regression model: \(wage = \beta_0 + u\)
• The estimated intercept \(\hat{\beta}_0\) is the average value of wage: \(\overline{wage} = \hat{\beta}_0\).
• Regression model: \(wage = \beta_0 + \delta_0\,female + u\)
• The coefficient \(\hat{\delta}_0\) is the difference in average wage for females in comparison
to males.
• This model has different intercepts for females (\(\hat{\beta}_0 + \hat{\delta}_0\)) and males (\(\hat{\beta}_0\)).
• The intercept for females is the average wage for females and the
intercept for males is the average wage for males.
• The regression t-test for significance of the coefficient on female is
identical to the two-sample t-test (assuming equal variances) for a significant
difference in mean wage between the female group and the male group, with the
same mean difference, t-statistic, and p-value.
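The link between the dummy coefficient and the group averages can be checked numerically. The sketch below uses a small made-up data set (the wage values are hypothetical, not the data behind the slides) and a hand-written simple regression; the helper names `mean` and `simple_ols` are illustrative.

```python
# Hypothetical toy data (not from the slides): hourly wages and a female dummy.
wage = [8.0, 6.0, 7.0, 4.0, 5.0, 4.5]
female = [0, 0, 0, 1, 1, 1]

def mean(values):
    return sum(values) / len(values)

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + d0 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

b0_hat, d0_hat = simple_ols(female, wage)
mean_male = mean([w for w, f in zip(wage, female) if f == 0])
mean_female = mean([w for w, f in zip(wage, female) if f == 1])

print(b0_hat, mean_male)             # intercept vs. male average
print(b0_hat + d0_hat, mean_female)  # intercept + coefficient vs. female average
```

In this toy example the intercept matches the male average and the intercept plus the coefficient on female matches the female average.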
Single indicator variable example
VARIABLES     (1) wage    (2) wage
female                    -2.51***
                          (0.30)
Constant      5.90***     7.10***
              (0.16)      (0.21)

[Figure: wage and fitted values (wagehat) plotted against female]

• \(wage = \beta_0 + u\): the intercept \(\hat{\beta}_0 = 5.90\) is the average value of wage, so the average wage is $5.90.
• \(wage = \beta_0 + \delta_0\,female + u\): the coefficient \(\hat{\delta}_0 = -2.51\) is the effect of the indicator variable female
on wage.
• Females have $2.51 lower wages than males. The interpretation of the coefficient is with respect
to the reference/base category of males.
• The intercept or average wage for males is $7.10 = \(\hat{\beta}_0\), and the intercept or average wage for
females is $4.59 = \(\hat{\beta}_0 + \hat{\delta}_0\) = 7.10 - 2.51.
• The figure shows the slope -2.51 and the intercepts $7.10 and $4.59.
Dummy variable trap
• The dummy variable trap refers to the problem that, in a regression with a constant, not all
categories can be included: one category needs to be left out, which is called the
base or reference category.
• For example, male and female cannot both be included in a regression with a constant because
male + female = 1, which creates perfect collinearity.
• Including male instead of female in the regression:
\(wage = \beta_0 + \delta_0\,female + u = \beta_0 + \delta_0 (1 - male) + u = (\beta_0 + \delta_0) - \delta_0\,male + u\)
• The coefficient \(-\delta_0\) on male has the same magnitude and significance as, but the
opposite sign from, the coefficient on female.
• The intercept in the model with male is \((\beta_0 + \delta_0)\), which can also be obtained
from the model with female.
• A regression can be estimated with both male and female but with no constant
(this approach is not commonly used).
Dummy variable trap example
VARIABLES     (1) wage    (2) wage    (3) wage
female        -2.51***                4.59***
              (0.30)                  (0.22)
male                      2.51***     7.10***
                          (0.30)      (0.21)
Constant      7.10***     4.59***
              (0.21)      (0.22)
• Regression 1: Females have $2.51 lower wages than males. The reference category is male.
• Regression 2: Males have $2.51 higher wages than females. The reference category is female.
• The intercept or average wage for females is $4.59 = \(\hat{\beta}_0 + \hat{\delta}_0\) = 7.10 - 2.51.
• The intercept or average wage for males is $7.10 = \((\hat{\beta}_0 + \hat{\delta}_0) - \hat{\delta}_0\) = 4.59 + 2.51.
• Regression 3: both female and male are included but there is no constant. The coefficients are
the average wage for females and males.
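The collinearity behind the dummy variable trap can be illustrated with a rank check. This is a minimal sketch on made-up data (the wage values are hypothetical), using NumPy's `matrix_rank` and `lstsq`; it is not the estimation code behind the table.

```python
import numpy as np

# Hypothetical toy data (not from the slides).
wage = np.array([8.0, 6.0, 7.0, 4.0, 5.0, 4.5])
female = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
male = 1.0 - female
const = np.ones_like(wage)

# Constant + female + male: three columns but only rank 2,
# because const = female + male (perfect collinearity).
X_trap = np.column_stack([const, female, male])
rank_trap = np.linalg.matrix_rank(X_trap)

# Dropping the constant removes the collinearity; the two coefficients
# are then the group averages (as in Regression 3).
X_no_const = np.column_stack([female, male])
coef, *_ = np.linalg.lstsq(X_no_const, wage, rcond=None)

print(rank_trap)
print(coef)
```

With the constant included, the design matrix has fewer independent columns than coefficients, so the three coefficients are not separately identified.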

Interactions with another indicator variable
• Interaction terms for the variables female and married can be specified in two different ways.
1) Include female, married, and female*married in the regression.

\(wage = \beta_0 + \beta_1\,female + \beta_2\,married + \beta_3\,female \cdot married + u\)

2) Create four categories: female*single, male*single, female*married, and male*married,
and include 3 of them in the regression model (the fourth/omitted category serves as a
base/reference category).

\(wage = \beta_0 + \beta_1\,female \cdot single + \beta_2\,female \cdot married + \beta_3\,male \cdot married + u\)
male*single is the reference category.

Interaction terms with indicator variable

female male single married female*single male*single female*married male*married
1 0 1 0 1 0 0 0
1 0 0 1 0 0 1 0
0 1 1 0 0 1 0 0
0 1 0 1 0 0 0 1
0 1 0 1 0 0 0 1
0 1 0 1 0 0 0 1
0 1 1 0 0 1 0 0
1 0 1 0 1 0 0 0
1 0 1 0 1 0 0 0
0 1 0 1 0 0 0 1
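The four category dummies in the table above are products of the underlying indicators. A minimal sketch of that construction follows; the function name `category_dummies` is illustrative, and the four input pairs are the first four rows of the table.

```python
# (female, married) pairs for the first four rows of the table above.
people = [(1, 0), (1, 1), (0, 0), (0, 1)]

def category_dummies(female, married):
    """Expand two 0/1 indicators into the four mutually exclusive group dummies."""
    male, single = 1 - female, 1 - married
    return {
        "female*single": female * single,
        "male*single": male * single,
        "female*married": female * married,
        "male*married": male * married,
    }

rows = [category_dummies(f, m) for f, m in people]
for (f, m), row in zip(people, rows):
    print(f, m, list(row.values()))
```

Exactly one of the four dummies equals 1 for each person, which is why one of them must be omitted when the regression includes a constant.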
Interaction terms with indicator variable
Model with four categories (male*single is the reference category):
VARIABLES          wage
female*single      -0.56
                   (0.47)
female*married     -0.60
                   (0.46)
male*married       2.82***
                   (0.44)
Constant           5.17***
                   (0.36)

Model with interaction term:
VARIABLES          wage
female             -0.56
                   (0.47)
married            2.82***
                   (0.44)
female*married     -2.86***
                   (0.61)
Constant           5.17***
                   (0.36)

Single females have $0.56 lower wages than single males, but the coefficient is not significant.
Effect of being female and single on wage: -0.56 in the category model, the same as the coefficient on female (-0.56) in the interaction model.
Married females have $0.60 lower wages than single males, but the coefficient is not significant.
Effect of being female and married on wage: -0.60 in the category model, the same as -0.56 + 2.82 - 2.86 = -0.60 in the interaction model.
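The equivalence of the two specifications can be verified by computing the predicted wage for each of the four groups from both sets of reported coefficients. This is a small arithmetic sketch using the rounded values from the tables; the function names are illustrative.

```python
# Coefficients as reported on the slide (rounded to two decimals).
interaction = {"const": 5.17, "female": -0.56, "married": 2.82, "female*married": -2.86}
categories = {"const": 5.17, "female*single": -0.56, "female*married": -0.60, "male*married": 2.82}

def wage_interaction(female, married):
    """Predicted wage from the model with female, married, and female*married."""
    c = interaction
    return (c["const"] + c["female"] * female + c["married"] * married
            + c["female*married"] * female * married)

def wage_categories(female, married):
    """Predicted wage from the model with three category dummies (male*single omitted)."""
    c = categories
    male, single = 1 - female, 1 - married
    return (c["const"] + c["female*single"] * female * single
            + c["female*married"] * female * married
            + c["male*married"] * male * married)

for female in (0, 1):
    for married in (0, 1):
        print(female, married,
              round(wage_interaction(female, married), 2),
              round(wage_categories(female, married), 2))
```

Both specifications give the same four group predictions; they differ only in how the group differences are split across coefficients.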
Several related indicator variables
• Regression with several related indicator variables needs to have one
reference/base category left out. Coefficients will be interpreted with
respect to the base category.
• Dummies for region: northcentral, south, west.
• Here, east = 1 - northcentral - south - west.
• \(wage = \beta_0 + \beta_1\,educ + \beta_2\,northcentral + \beta_3\,south + \beta_4\,west + u\),
where the base category is east.
• Alternatively, \(wage = \alpha_0 + \alpha_1\,educ + \alpha_2\,northcentral + \alpha_3\,south + \alpha_4\,east + u\),
where the base category is west.

Several related indicator variables
VARIABLES     (1) wage    (2) wage
northcen      -0.66       -0.90*
              (0.47)      (0.50)
south         -0.98**     -1.23***
              (0.43)      (0.47)
west          0.24
              (0.52)
east                      -0.24
                          (0.52)
Constant      6.37***     6.61***
              (0.34)      (0.39)

Region        Average wage
northcen      $5.71
south         $5.39
west          $6.61
east          $6.37
In a model with no other independent variables, the intercept is the average wage for the reference category.
Wages in the south are $0.98 lower than wages in the east, and wages in the northcentral and west are not
significantly different from those in the east region.
Wages in northcentral region are $0.90 lower and in the south are $1.23 lower than in the west region.
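Because the model has no other regressors, each dummy coefficient is the group's average wage minus the base group's average. The sketch below recomputes both columns from the reported regional averages; the function name `dummy_coefficients` is illustrative, and small discrepancies from the table (for example -1.22 versus -1.23 for south relative to west) come from rounding in the reported averages.

```python
# Average wage by region, as reported on the slide.
avg_wage = {"northcen": 5.71, "south": 5.39, "west": 6.61, "east": 6.37}

def dummy_coefficients(averages, base):
    """Intercept and dummy coefficients implied by group averages when `base` is omitted."""
    intercept = averages[base]
    coefs = {region: round(avg - intercept, 2)
             for region, avg in averages.items() if region != base}
    return intercept, coefs

print(dummy_coefficients(avg_wage, "east"))  # base category: east
print(dummy_coefficients(avg_wage, "west"))  # base category: west
```

Switching the base category changes the intercept and every dummy coefficient, but the implied group averages stay the same.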
Several related indicator variables
VARIABLES     (1) wage    (2) wage
educ          0.54***     0.54***
              (0.05)      (0.05)
northcen      -0.66       -1.01**
              (0.43)      (0.46)
south         -0.60       -0.94**
              (0.40)      (0.43)
west          0.34
              (0.47)
east                      -0.34
                          (0.47)
Constant      -0.50       -0.16
              (0.75)      (0.77)

Region        Average wage
northcen      $5.71
south         $5.39
west          $6.61
east          $6.37

With education as an independent variable, the intercept is not the average wage.
Wages in the northcentral, south, and west regions are not significantly different from those in the east region.
Wages in the northcentral region are $1.01 lower and in the south are $0.94 lower than in the west region.
Indicator variable in regression
• Regression model: \(wage = \beta_0 + \beta_1\,educ + u\)
• This model has the same intercept \(\hat{\beta}_0\) and slope \(\hat{\beta}_1\) on education
for females and males.
• Regression model: \(wage = \beta_0 + \beta_1\,educ + \delta_0\,female + u\)
• The coefficient \(\hat{\delta}_0\) is the effect of female on wage.
• Same slope for females and males = \(\hat{\beta}_1\)
• Intercept for males = \(\hat{\beta}_0\)
• Intercept for females = \(\hat{\beta}_0 + \hat{\delta}_0\)

Indicator variable in regression
              (1)         (2) Female dummy   (3) Male dummy
VARIABLES     wage        wage               wage
educ          0.54***     0.51***            0.51***
              (0.05)      (0.05)             (0.05)
female                    -2.27***
                          (0.28)
male                                         2.27***
                                             (0.28)
Constant      -0.91       0.62               -1.65**
              (0.69)      (0.67)             (0.65)

Regression 2: Females have $2.27 lower wages than males.
Intercept for males = 0.62
Intercept for females = 0.62 - 2.27 = -1.65
Regression 3: Males have $2.27 higher wages than females. Same magnitude
and significance but opposite sign.
Intercept for females = -1.65
Intercept for males = -1.65 + 2.27 = 0.62
Slope for education is 0.51. One additional year of education is associated with a $0.51
increase in wage.
Same slope for females and males.
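The parallel-lines interpretation can be made concrete by evaluating the fitted equation at a few education levels. This sketch plugs in the rounded coefficients from Regression 2; the function name `predicted_wage` and the chosen education levels are illustrative.

```python
# Coefficients from Regression 2 on the slide (female dummy, males as the base).
B0, B1, D0 = 0.62, 0.51, -2.27

def predicted_wage(educ, female):
    """Fitted wage with a common education slope and an intercept shift for females."""
    return B0 + B1 * educ + D0 * female

for educ in (8, 12, 16):
    gap = predicted_wage(educ, 1) - predicted_wage(educ, 0)
    print(educ,
          round(predicted_wage(educ, 0), 2),
          round(predicted_wage(educ, 1), 2),
          round(gap, 2))
```

The female-male gap is the same at every education level, since the dummy shifts only the intercept.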

Indicator variable in regression
[Figure, left panel: wage versus educ with a single fitted line (wagehat)]
[Figure, right panel: wage versus educ with separate fitted lines (wagehat) for females and males]

Left panel: \(wage = \beta_0 + \beta_1\,educ + u\)
Same intercept (-0.91), same slope (0.54) for females and males.

Right panel: \(wage = \beta_0 + \beta_1\,educ + \delta_0\,female + u\)
Same slope (0.51), different intercepts for females (-1.65) and males (0.62).
Line for females is 2.27 lower than line for males.
Interaction terms with non-indicator variable
• Regression model:
\(wage = \beta_0 + \beta_1\,educ + \delta_0\,female + \delta_1\,female \cdot educ + u\)
• This model has different slopes on education and different intercepts for
females and males.
• Slope for males = \(\hat{\beta}_1\)
• Slope for females = \(\hat{\beta}_1 + \hat{\delta}_1\)
• Intercept for males = \(\hat{\beta}_0\)
• Intercept for females = \(\hat{\beta}_0 + \hat{\delta}_0\)
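The intercept and slope formulas above imply that the interaction model reproduces what two separate group regressions would give. The sketch below checks this on synthetic data (generated here for illustration only, not the data behind the slides), using NumPy least squares; all variable names and the data-generating values are assumptions.

```python
import numpy as np

# Synthetic data for illustration only (not the data behind the slides).
rng = np.random.default_rng(0)
n = 200
educ = rng.integers(8, 19, size=n).astype(float)
female = (rng.random(n) < 0.5).astype(float)
wage = 0.2 + 0.55 * educ - 1.0 * female - 0.1 * female * educ + rng.normal(0, 1, n)

def ols(X, y):
    """Least-squares coefficients for y = X b."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(n)
# Pooled model: constant, educ, female, female*educ.
b0, b1, d0, d1 = ols(np.column_stack([ones, educ, female, female * educ]), wage)

# Separate models by group.
is_f = female == 1
a0_m, a1_m = ols(np.column_stack([ones[~is_f], educ[~is_f]]), wage[~is_f])
a0_f, a1_f = ols(np.column_stack([ones[is_f], educ[is_f]]), wage[is_f])

print(np.allclose([b0, b1], [a0_m, a1_m]))            # males: intercept and slope
print(np.allclose([b0 + d0, b1 + d1], [a0_f, a1_f]))  # females: intercept and slope
```

The point estimates coincide, although standard errors generally differ because the pooled model uses a single error variance for both groups.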

Model with interaction term versus
two separate models
                Interaction term   Model for females   Model for males
VARIABLES       wage               wage                wage
educ            0.54***            0.45***             0.54***
                (0.06)             (0.06)              (0.08)
female          -1.20
                (1.33)
female*educ     -0.09
                (0.10)
Constant        0.20               -1.00               0.20
                (0.84)             (0.73)              (1.02)
Observations    526                252                 274

Slope for males = \(\hat{\beta}_1\) = 0.54
Slope for females = \(\hat{\beta}_1 + \hat{\delta}_1\) = 0.54 - 0.09 = 0.45
Intercept for males = \(\hat{\beta}_0\) = 0.20
Intercept for females = \(\hat{\beta}_0 + \hat{\delta}_0\) = 0.20 - 1.20 = -1.00

The model with female and female*educ implies the same intercepts and slopes as the two
separate models for females and males.

The coefficients on female and on the interaction term between female and education are not
significant. So the intercepts and the slopes of returns to education are not significantly
different for females and males.
Model with different slopes and intercepts
[Figure: wage versus educ with separate fitted lines (wagehat) for females and males, based on the interaction model]

Different intercepts and different slopes.
F-test for differences across groups
• F-test to test whether the returns to education, experience, and tenure are the same for males
and females.
• Unrestricted regression model:
\(wage = \beta_0 + \beta_1\,educ + \beta_2\,exper + \beta_3\,tenure + \delta_0\,female + \delta_1\,female \cdot educ + \delta_2\,female \cdot exper + \delta_3\,female \cdot tenure + u\)
• H0: \(\delta_0 = 0\) and \(\delta_1 = 0\) and \(\delta_2 = 0\) and \(\delta_3 = 0\)
• Ha: \(\delta_0 \neq 0\) or \(\delta_1 \neq 0\) or \(\delta_2 \neq 0\) or \(\delta_3 \neq 0\); \(q = 4\) (restrictions)
• Restricted regression model:
\(wage = \alpha_0 + \alpha_1\,educ + \alpha_2\,exper + \alpha_3\,tenure + e\)
• \(F\text{-stat} = \dfrac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} = \dfrac{(4966 - 4394)/4}{4394/(526 - 7 - 1)} = 16.86\)
• F critical value (4, 518) = 2.39 < F-stat, p-value < 0.05
• Coefficients on female, female*educ, female*exper, and female*tenure are jointly significant.
• Females have significantly different wages than males.
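The F-statistic above can be reproduced directly from the sums of squared residuals. The sketch below is plain arithmetic on the numbers reported on the slide; the function name `f_statistic` is illustrative, and the critical value is taken from the slide instead of being computed.

```python
def f_statistic(ssr_r, ssr_ur, q, n, k):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

# Values from the slide: q = 4 restrictions, n = 526 observations,
# k = 7 regressors in the unrestricted model.
f_stat = f_statistic(ssr_r=4966, ssr_ur=4394, q=4, n=526, k=7)

F_CRITICAL = 2.39  # critical value for (4, 518) degrees of freedom, as quoted on the slide
print(round(f_stat, 2), f_stat > F_CRITICAL)
```

The statistic is far above the critical value, which is why the null of equal coefficients for females and males is rejected.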

F-test for differences across groups
                 Restricted model   Unrestricted model
VARIABLES        wage               wage
educ             0.60***            0.68***
                 (0.05)             (0.06)
exper            0.02*              0.05***
                 (0.01)             (0.02)
tenure           0.17***            0.16***
                 (0.02)             (0.03)
female                              2.08
                                    (1.40)
femaleXeduc                         -0.21**
                                    (0.10)
femaleXexper                        -0.05**
                                    (0.02)
femaleXtenure                       -0.10**
                                    (0.05)
Constant         -2.87***           -3.53***
                 (0.73)             (0.95)

Using t-tests, the coefficient on female is not significant, but the coefficients on
female*educ, female*exper, and female*tenure are individually significant.
Using the F-test, these four coefficients are jointly significant.
Different wages for females and males.
Chow test for differences across groups
• Chow test is an F-test for significantly different coefficients in two
models estimated with two different groups. Instead of one
unrestricted model, two separate models are estimated, one for each
group.
• Regression model for females:
\(wage = \beta_0 + \beta_1\,educ + \beta_2\,exper + \beta_3\,tenure + u\)
• Regression model for males:
\(wage = \alpha_0 + \alpha_1\,educ + \alpha_2\,exper + \alpha_3\,tenure + e\)
• H0: \(\beta_0 = \alpha_0\) and \(\beta_1 = \alpha_1\) and \(\beta_2 = \alpha_2\) and \(\beta_3 = \alpha_3\)
• Ha: \(\beta_0 \neq \alpha_0\) or \(\beta_1 \neq \alpha_1\) or \(\beta_2 \neq \alpha_2\) or \(\beta_3 \neq \alpha_3\)
• Restricted regression model with both groups in one model:
\(wage = \gamma_0 + \gamma_1\,educ + \gamma_2\,exper + \gamma_3\,tenure + v\)
Chow test
• After estimating the models:
• Obtain 𝑆𝑆𝑆𝑆𝑅𝑅𝑟𝑟 (sum of squared residuals) for restricted model.
• Calculate \(SSR_1\) and \(SSR_2\) for the models for females and males.
• \(\text{Chow stat} = F\text{-stat} = \dfrac{(SSR_r - SSR_1 - SSR_2)/(k+1)}{(SSR_1 + SSR_2)/(n - 2(k+1))} = \dfrac{(4966 - 1257 - 3137)/(3+1)}{(1257 + 3137)/(526 - 2(3+1))} = 16.86\)
• F critical value (4,518) = 2.39 < F-stat, p-value < 0.05
• Coefficients on intercept, educ, exper, and tenure are significantly different
for females and males.
• The Chow test is equivalent to the F-test.
• For the F-test, 𝑆𝑆𝑆𝑆𝑅𝑅𝑢𝑢𝑢𝑢 = 𝑆𝑆𝑆𝑆𝑅𝑅1 + 𝑆𝑆𝑆𝑆𝑅𝑅2 .
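The equivalence between the Chow test and the F-test on a fully interacted model can be checked numerically. The sketch below uses synthetic data (generated for illustration only, not the data behind the slides) and NumPy least squares; the regressors are random stand-ins for educ, exper, and tenure, and all names are assumptions.

```python
import numpy as np

# Synthetic data for illustration only (not the data behind the slides).
rng = np.random.default_rng(1)
n, k = 300, 3
X = rng.normal(size=(n, k))               # stand-ins for educ, exper, tenure
female = (rng.random(n) < 0.5).astype(float)
y = 1.0 + X @ np.array([0.6, 0.05, 0.15]) - 0.8 * female + rng.normal(size=n)

def ssr(design, outcome):
    """Sum of squared residuals from a least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    resid = outcome - design @ coef
    return float(resid @ resid)

Z = np.column_stack([np.ones(n), X])      # constant plus k regressors
is_f = female == 1
ssr_r = ssr(Z, y)                         # restricted: one model for both groups
ssr_1, ssr_2 = ssr(Z[is_f], y[is_f]), ssr(Z[~is_f], y[~is_f])
ssr_ur = ssr(np.column_stack([Z, Z * female[:, None]]), y)  # fully interacted model

chow = ((ssr_r - ssr_1 - ssr_2) / (k + 1)) / ((ssr_1 + ssr_2) / (n - 2 * (k + 1)))
f_stat = ((ssr_r - ssr_ur) / (k + 1)) / (ssr_ur / (n - (2 * k + 1) - 1))

print(np.isclose(ssr_ur, ssr_1 + ssr_2), np.isclose(chow, f_stat))
```

The fully interacted model has 2k + 1 regressors plus a constant, so its residual degrees of freedom, n - 2(k + 1), match the denominator of the Chow statistic.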

Chow test
                   Restricted model         Model if female=1   Model if female=0
                   (female and male)
VARIABLES          wage                     wage                wage
educ               0.60***                  0.46***             0.68***
                   (0.05)                   (0.06)              (0.07)
exper              0.02*                    0.01                0.05***
                   (0.01)                   (0.01)              (0.02)
tenure             0.17***                  0.06*               0.16***
                   (0.02)                   (0.03)              (0.03)
Constant           -2.87***                 -1.46*              -3.53***
                   (0.73)                   (0.80)              (1.11)
Observations (n)   526                      252                 274
SSR                4966                     1257                3137
Chow test for significant differences in coefficients between females and males. The coefficients are jointly
significantly different for females and males. Two separate models for females and males should be estimated.
Review questions
• Define an indicator or dummy variable.
• Describe the two different ways to estimate a regression with two
dummies and their interaction terms.
• Describe interactions of an indicator variable with a non-indicator
variable. How can different intercepts and slopes be obtained?
• Describe the F-test for differences across groups.
• Describe the Chow test for differences across groups.
