Econ 3044: Introduction To Econometrics Chapter-4: MLR: Further Issues and Dummy Variables

Econ 3044: Introduction to Econometrics
Chapter-4: MLR: Further Issues and Dummy Variables
Lemi Taye
Addis Ababa University

lemi.taye@aau.edu.et
December 28, 2019
Lemi Taye (AAUSC), Ch 4: MLR: Further Issues and Dummy Variables, December 28, 2019
Overview
1 More on Functional Form
2 MLR with Qualitative Information
More on Using Logarithmic Functional Forms
We begin by reviewing how to interpret the parameters in the following
model, which relates the median housing price (price) in the community
to various community characteristics: nox is the amount of nitrogen
oxide in the air; rooms is the average number of rooms in houses in
the community.
log(price) = β0 + β1 log(nox) + β2 rooms + u. (1)
The coefficient β1 is the elasticity of price with respect to nox
(pollution).
The coefficient β2 is the change in log(price), when ∆rooms = 1; as
we have seen many times, when multiplied by 100, this is the
approximate percentage change in price.
When estimated using the data in HPRICE2, we obtain
\[
\widehat{\log(\mathit{price})} = \underset{(.19)}{9.23} - \underset{(.066)}{.718}\,\log(\mathit{nox}) + \underset{(.019)}{.306}\,\mathit{rooms} \tag{2}
\]
n = 506, R² = .514.
Thus, when nox increases by 1%, price falls by .718%, holding only
rooms fixed.
When rooms increases by one, price increases by approximately
100(.306) = 30.6%.
The estimate that one more room increases price by about 30.6%
turns out to be somewhat inaccurate for this application.
The approximation error occurs because, as the change in log(y)
becomes larger and larger, the approximation %∆y ≈ 100 · ∆ log(y)
becomes more and more inaccurate.
Fortunately, a simple calculation is available to compute the exact
percentage change.
To describe the procedure, we consider the general estimated model
\[
\widehat{\log(y)} = \hat\beta_0 + \hat\beta_1 \log(x_1) + \hat\beta_2 x_2.
\]
Now, fixing x1, we have \(\Delta\widehat{\log(y)} = \hat\beta_2 \Delta x_2\).
Using simple algebraic properties of the exponential and logarithmic
functions gives the exact percentage change in the predicted y as
%∆y = 100 · [exp(β̂2 ∆x2 ) − 1], (3)
where the multiplication by 100 turns the proportionate change into a
percentage change.
When ∆x2 = 1,
%∆y = 100 · [exp(β̂2 ) − 1]. (4)
Applied to the housing price example with x2 = rooms and β̂2 = .306,
the exact change in predicted price is
\(\%\Delta\widehat{\mathit{price}} = 100[\exp(.306) - 1] = 35.8\%\), which is
notably larger than the approximate percentage change, 30.6%, obtained
directly from (2).
The adjustment in equation (3) is not as crucial for small percentage
changes.
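As a quick numerical check of equations (3) and (4), the following minimal Python sketch compares the approximate and the exact percentage change for the rooms coefficient in (2). The function name exact_pct_change is our own label for illustration, not part of the course material.

```python
import math

def exact_pct_change(beta_hat, delta_x=1.0):
    """Exact % change in predicted y when log(y) is the dependent variable (eq. 3)."""
    return 100.0 * (math.exp(beta_hat * delta_x) - 1.0)

beta_rooms = 0.306                      # coefficient on rooms in equation (2)
approx = 100.0 * beta_rooms             # approximation: 100 * change in log(y)
exact = exact_pct_change(beta_rooms)    # equation (4), one additional room

print(f"approximate: {approx:.1f}%, exact: {exact:.1f}%")
```

The gap between the two numbers shrinks as the coefficient (or the change in x2) gets closer to zero, which is why the adjustment matters less for small percentage changes.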
Models with Quadratics
Quadratic functions are also used quite often in applied economics
to capture decreasing or increasing marginal effects.
In the simplest case, y depends on a single observed factor x, but it
does so in a quadratic fashion:
y = β0 + β1 x + β2 x² + u.
For example, take y = wage and x = exper. As we discussed in
Chapter 3, this model falls outside of simple regression analysis but is
easily handled with multiple regression.
It is important to remember that β1 does not measure the change in y
with respect to x; it makes no sense to hold x² fixed while changing x.
If we write the estimated equation as
ŷ = β̂0 + β̂1 x + β̂2 x², (5)
then we have the approximation
∆ŷ ≈ (β̂1 + 2β̂2 x)∆x so ∆ŷ/∆x ≈ β̂1 + 2β̂2 x. (6)
This says that the slope of the relationship between x and y depends
on the value of x; the estimated slope is β̂1 + 2β̂2 x.
If we plug in x = 0, we see that β̂1 can be interpreted as the
approximate slope in going from x = 0 to x = 1. After that, the
second term, 2β̂2 x, must be accounted for.
If we are only interested in computing the predicted change in y given
a starting value for x and a change in x, we could use (5) directly:
there is no reason to use the calculus approximation at all.
However, we are usually more interested in quickly summarizing the
effect of x on y, and the interpretation of β̂1 and β̂2 in equation (6)
provides that summary.
Typically, we might plug in the average value of x in the sample, or
some other interesting values, such as the median or the lower and
upper quartile values.
In many applications, β̂1 is positive and β̂2 is negative.
For example, using the wage data in WAGE1, we obtain
\[
\widehat{\mathit{wage}} = \underset{(.35)}{3.73} + \underset{(.041)}{.298}\,\mathit{exper} - \underset{(.0009)}{.0061}\,\mathit{exper}^2 \tag{7}
\]
n = 526, R² = .093.
This estimated equation implies that exper has a diminishing effect on
wage.
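To see the diminishing effect numerically, the sketch below evaluates the approximate slope from (6) and the exact one-year change implied by (5) at a few experience levels, using the estimates in (7). The helper names and the chosen experience levels are illustrative assumptions.

```python
def quad_slope(b1, b2, x):
    """Approximate slope of y-hat with respect to x, from equation (6)."""
    return b1 + 2.0 * b2 * x

def quad_change(b1, b2, x0, x1):
    """Exact change in y-hat when x moves from x0 to x1, from equation (5)."""
    return b1 * (x1 - x0) + b2 * (x1 ** 2 - x0 ** 2)

b1, b2 = 0.298, -0.0061   # coefficients on exper and exper^2 in equation (7)

for exper in (0, 1, 10, 20):   # illustrative experience levels
    slope = quad_slope(b1, b2, exper)
    change = quad_change(b1, b2, exper, exper + 1)
    print(f"exper={exper:2d}  slope={slope:.3f}  exact one-year change={change:.3f}")
```

Both columns fall as exper rises, and the exact change is always slightly below the slope at the starting value because the curve keeps flattening over the year.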
When the coefficient on x is positive and the coefficient on x² is
negative, the quadratic has a parabolic shape.
There is always a positive value of x where the effect of x on y is
zero; before this point, x has a positive effect on y; after this point, x
has a negative effect on y.
In practice, it can be important to know where this turning point is.
In the estimated equation (5) with β̂1 > 0 and β̂2 < 0, the turning
point (or maximum of the function) is always achieved at the
coefficient on x over twice the absolute value of the coefficient on x²:
x∗ = |β̂1 /(2β̂2 )|. (8)
In the wage example, x∗ = exper∗ is .298/[2(.0061)] ≈ 24.4. (Note
how we just drop the minus sign on .0061 in doing this calculation.)
This quadratic relationship is illustrated in Figure 1.
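The turning point in (8) is a one-line calculation. A minimal sketch, assuming a helper named turning_point (our own name) and the estimates from (7):

```python
def turning_point(b1, b2):
    """Turning point of y-hat = b0 + b1*x + b2*x**2, as in equation (8)."""
    return abs(b1 / (2.0 * b2))

exper_star = turning_point(0.298, -0.0061)   # estimates from equation (7)
print(f"predicted wage peaks at about {exper_star:.1f} years of experience")
```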

Figure 1: Quadratic relationship between predicted wage and exper.

When a model has a dependent variable in logarithmic form and an
explanatory variable entering as a quadratic, some care is needed in
reporting the partial effects.
The following example also shows that the quadratic can have a
U-shape, rather than a parabolic shape.
A U-shape arises in equation (5) when β̂1 is negative and β̂2 is
positive; this captures an increasing effect of x on y.

Example (Effects of pollution on Housing prices)

We modify the housing price model to include a quadratic term in rooms:
log(price) = β0 + β1 log(nox) + β2 log(dist) + β3 rooms
             + β4 rooms² + β5 stratio + u. (9)
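The model estimated using the data in HPRICE2 is
\[
\begin{aligned}
\widehat{\log(\mathit{price})} = {} & \underset{(.57)}{13.39} - \underset{(.115)}{.902}\,\log(\mathit{nox}) - \underset{(.043)}{.087}\,\log(\mathit{dist}) \\
& - \underset{(.165)}{.545}\,\mathit{rooms} + \underset{(.013)}{.062}\,\mathit{rooms}^2 - \underset{(.006)}{.048}\,\mathit{stratio}
\end{aligned}
\]
n = 506, R² = .603.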

Example (Effects of pollution on Housing prices (continued ))

The quadratic term rooms² has a t statistic of about 4.77, and so it is
very statistically significant. But what about interpreting the effect of
rooms on log(price)? Initially, the effect appears to be strange. Because
the coefficient on rooms is negative and the coefficient on rooms² is
positive, this equation literally implies that, at low values of rooms, an
additional room has a negative effect on log(price). At some point, the
effect becomes positive, and the quadratic shape means that the
semi-elasticity of price with respect to rooms is increasing as rooms
increases. This situation is shown in Figure 2.
We obtain the turnaround value of rooms using equation (8) (even though
β̂1 is negative and β̂2 is positive). The absolute value of the coefficient on
rooms, .545, divided by twice the coefficient on rooms², .062, gives
rooms∗ = .545/[2(.062)] ≈ 4.4; this point is labeled in Figure 2.
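The U-shape can be made concrete by evaluating the approximate semi-elasticity, 100 · (β̂3 + 2β̂4 rooms), on either side of the turnaround value. The sketch below uses the estimated coefficients on rooms and rooms²; the function name and the room counts tried are illustrative assumptions.

```python
b_rooms, b_rooms_sq = -0.545, 0.062   # estimated coefficients on rooms and rooms^2

# Turnaround value from equation (8).
rooms_star = abs(b_rooms / (2.0 * b_rooms_sq))

def pct_effect_of_room(rooms):
    """Approximate % change in price from one more room, starting at `rooms`."""
    return 100.0 * (b_rooms + 2.0 * b_rooms_sq * rooms)

print(f"turnaround at rooms = {rooms_star:.1f}")
for rooms in (4, 5, 6, 7):            # illustrative starting values
    print(f"rooms={rooms}: about {pct_effect_of_room(rooms):+.1f}% per extra room")
```

Below the turnaround the approximate effect is negative; above it the effect is positive and grows with each additional room, which is the increasing semi-elasticity described in the example.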

Figure 2: Predicted log(price) as a quadratic function of rooms.

Models with Interaction Terms
Sometimes, it is natural for the partial effect, elasticity, or
semi-elasticity of the dependent variable with respect to an
explanatory variable to depend on the magnitude of yet another
explanatory variable.
For example, in the model
price = β0 + β1 sqrft + β2 bdrms + β3 sqrft · bdrms + β4 bthrms + u,
the partial effect of bdrms on price (holding all other variables fixed)
is
∆price/∆bdrms = β2 + β3 sqrft. (10)
If β3 > 0, then (10) implies that an additional bedroom yields a higher
increase in housing price for larger houses.

In other words, there is an interaction effect between square footage
and number of bedrooms.
The parameters on the original variables can be tricky to interpret
when we include an interaction term.
In summarizing the effect of bdrms on price, we must evaluate (10)
at interesting values of sqrft, such as the mean value, or the lower and
upper quartiles in the sample.

Example (Effects of attendance on Final Exam performance)

A model to explain the standardized outcome on a final exam (stndfnl) in
terms of percentage of classes attended, prior college grade point average,
and ACT score is
stndfnl = β0 + β1 atndrte + β2 priGPA + β3 ACT + β4 priGPA²
          + β5 ACT² + β6 priGPA × atndrte + u. (11)
(We use the standardized exam score to interpret a student’s performance
relative to the rest of the class.) In addition to quadratics in priGPA and
ACT, this model includes an interaction between priGPA and the
attendance rate. The idea is that class attendance might have a different
effect for students who have performed differently in the past, as measured
by priGPA. We are interested in the effects of attendance on final exam
score: ∆stndfnl/∆atndrte = β1 + β6 priGPA.

Example (Effects of attendance on Final Exam performance (continued))
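Using the 680 observations in ATTEND, for students in a course on
microeconomic principles, the estimated equation is
\[
\begin{aligned}
\widehat{\mathit{stndfnl}} = {} & \underset{(1.36)}{2.05} - \underset{(.0102)}{.0067}\,\mathit{atndrte} - \underset{(.48)}{1.63}\,\mathit{priGPA} - \underset{(.098)}{.128}\,\mathit{ACT} \\
& + \underset{(.101)}{.296}\,\mathit{priGPA}^2 + \underset{(.0022)}{.0045}\,\mathit{ACT}^2 + \underset{(.0043)}{.0056}\,\mathit{priGPA} \times \mathit{atndrte}
\end{aligned}
\]
n = 680, R² = .229, R̄² = .222.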
We must interpret this equation with extreme care. If we simply look at the
coefficient on atndrte, we will incorrectly conclude that attendance has a
negative effect on final exam score.

Example (Effects of attendance on Final Exam performance (continued))
But this coefficient supposedly measures the effect when priGPA = 0,
which is not interesting (in this sample, the smallest prior GPA is about
.86). We must also take care not to look separately at the estimates of β1
and β6 and conclude that, because each t statistic is insignificant, we
cannot reject H0 : β1 = 0, β6 = 0. In fact, the p-value for the F-test of
this joint hypothesis is .014, so we certainly reject H0 at the 5% level.
How should we estimate the partial effect of atndrte on stndfnl? We
must plug in interesting values of priGPA to obtain the partial effect. The
mean value of priGPA in the sample is 2.59, so at the mean priGPA, the
effect of atndrte on stndfnl is −.0067 + .0056(2.59) ≈ .0078. What does
this mean? Because atndrte is measured as a percentage, it means that a
10 percentage point increase in atndrte increases predicted stndfnl by
.078 standard deviations from the mean final exam score.
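The plug-in calculation can be sketched in a few lines of Python. The coefficients are the estimates on atndrte and priGPA × atndrte reported above; the function name is our own, and the value 3.5 is an arbitrary illustrative GPA, not a statistic from the sample.

```python
b_atndrte = -0.0067    # estimated coefficient on atndrte
b_interact = 0.0056    # estimated coefficient on priGPA x atndrte

def attendance_effect(pri_gpa):
    """Estimated partial effect of atndrte on stndfnl at a given priGPA."""
    return b_atndrte + b_interact * pri_gpa

# 0.86 is the sample minimum and 2.59 the sample mean quoted in the text;
# 3.5 is a made-up high value used only for illustration.
for gpa in (0.86, 2.59, 3.5):
    effect = attendance_effect(gpa)
    print(f"priGPA={gpa:.2f}: effect per point={effect:+.4f}, per 10 points={10 * effect:+.3f}")
```

The effect is close to zero for students at the bottom of the prior GPA range and rises with priGPA, which is what a positive interaction coefficient implies.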
Describing Qualitative Information
Qualitative factors often come in the form of binary information (e.g.,
gender, PC ownership, marital status).
The relevant information can be captured by defining a binary
variable or a zero-one variable.
In econometrics, binary variables are most commonly called dummy
variables.
In defining a dummy variable, we must decide which event is assigned
the value one and which is assigned the value zero.
For example, in a study of individual wage determination, we might
define female to be a binary variable taking on the value one for
females and the value zero for males.
The name in this case indicates the event with the value one.

A Single Dummy Independent Variable
To incorporate binary information with only a single dummy
explanatory variable, we just add it as an independent variable in the
equation.
For example, consider the following simple model of hourly wage
determination:
wage = β0 + δ0 female + β1 educ + u. (12)
In model (12), only two observed factors affect wage: gender and
education.
Because female = 1 when the person is female, and female = 0
when the person is male, the parameter δ0 has the following
interpretation: δ0 is the difference in hourly wage between females and
males, given the same amount of education.

Thus, the coefficient δ0 determines whether there is discrimination
against women: if δ0 < 0, then for the same level of other factors,
women earn less than men on average.
In terms of expectations, under the zero conditional mean
assumption E(u|female, educ) = 0, we have
δ0 = E(wage|female = 1, educ) − E(wage|female = 0, educ).
Because female = 1 corresponds to females and female = 0
corresponds to males, we can write this more simply as
δ0 = E(wage|female, educ) − E(wage|male, educ). (13)
The key here is that the level of education is the same in both
expectations; the difference, δ0 , is due to gender only.

The situation can be depicted graphically as an intercept shift
between males and females.
In Figure 3, the case δ0 < 0 is shown, so that men earn a fixed
amount more per hour than women.
The difference does not depend on the amount of education, and this
explains why the wage-education profiles for women and men are
parallel.

Figure 3: Graph of wage = β0 + δ0 female + β1 educ for δ0 < 0.

At this point, you may wonder why we do not also include in (12) a
dummy variable, say male, which is one for males and zero for females.
This would be redundant. In (12), the intercept for males is β0 , and
the intercept for females is β0 + δ0 . Because there are just two
groups, we only need two different intercepts.
This means that, in addition to β0 , we need to use only one dummy
variable; we have chosen to include the dummy variable for females.
Using two dummy variables would introduce perfect collinearity
because female + male = 1, which means that male is a perfect
linear function of female.
Including dummy variables for both genders is the simplest example of
the so-called dummy variable trap, which arises when too many
dummy variables describe a given number of groups.
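The dummy variable trap can be checked numerically: with an intercept and both dummies, X'X is singular, so the OLS normal equations have no unique solution. The sketch below uses a made-up sample of five people and plain Python; the helper names gram and det are our own.

```python
def gram(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]

def det(m):
    """Determinant by cofactor expansion; adequate for tiny matrices."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

female = [1, 0, 1, 0, 1]          # made-up sample, not course data
male = [1 - f for f in female]    # female + male = 1 for every person
const = [1] * len(female)         # intercept column

det_trap = det(gram([const, female, male]))   # intercept plus both dummies
det_ok = det(gram([const, female]))           # intercept plus one dummy

print("det(X'X) with both dummies:", det_trap)
print("det(X'X) with one dummy:   ", det_ok)
```

A zero determinant in the first case reflects the perfect collinearity; dropping either dummy (or the intercept) restores a nonsingular X'X.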

In (12), we have chosen males to be the base group or benchmark
group, that is, the group against which comparisons are made.
This is why β0 is the intercept for males, and δ0 is the difference in
intercepts between females and males.
We could choose females as the base group by writing the model as
wage = α0 + γ0 male + β1 educ + u,
where the intercept for females is α0 and the intercept for males is
α0 + γ0 ; this implies that α0 = β0 + δ0 and α0 + γ0 = β0 .
In any application, it does not matter how we choose the base group,
but it is important to keep track of which group is the base group.

Nothing much changes when more explanatory variables are involved.
Taking males as the base group, a model that controls for experience
and tenure in addition to education is
wage = β0 + δ0 female + β1 educ + β2 exper + β3 tenure + u. (14)
If educ, exper, and tenure are all relevant productivity
characteristics, the null hypothesis of no difference between men and
women is H0 : δ0 = 0.
The alternative that there is discrimination against women is
H1 : δ0 < 0.
To test for wage discrimination, we just estimate the model by OLS,
exactly as before, and use the usual t statistic.

Example (Hourly Wage Equation)

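Using the data in WAGE1, we estimate model (14). For now, we use wage,
rather than log(wage), as the dependent variable:
\[
\widehat{\mathit{wage}} = \underset{(.72)}{-1.57} - \underset{(.26)}{1.81}\,\mathit{female} + \underset{(.049)}{.572}\,\mathit{educ} + \underset{(.012)}{.25}\,\mathit{exper} + \underset{(.021)}{.141}\,\mathit{tenure}
\]
n = 526, R² = .364.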
The negative intercept—the intercept for men, in this case—is not very
meaningful because no one has zero values for all of educ, exper, and
tenure in the sample. The coefficient on female is interesting because it
measures the average difference in hourly wage between a man and a
woman who have the same levels of educ, exper, and tenure. If we take a
woman and a man with the same levels of education, experience, and
tenure, the woman earns, on average, $1.81 less per hour than the man.
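The test of H0 : δ0 = 0 against H1 : δ0 < 0 uses the usual t statistic, the estimate divided by its standard error. The sketch below applies it to the female coefficient; the critical value −1.645 is the one-sided 5% value from the standard normal distribution, used here as an approximation because the degrees of freedom are large (n = 526).

```python
coef_female = -1.81   # estimated coefficient on female
se_female = 0.26      # its standard error

t_stat = coef_female / se_female

# One-sided 5% critical value, standard normal approximation (an assumption
# that is reasonable only because the sample is large).
critical_value = -1.645
reject_h0 = t_stat < critical_value

print(f"t = {t_stat:.2f}, reject H0 at the 5% level: {reject_h0}")
```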
A common specification in applied work has the dependent variable
appearing in logarithmic form, with one or more dummy variables
appearing as independent variables.
In this case, the coefficients have a percentage interpretation.
Example (Housing price Regression)

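Using the data in HPRICE1, we obtain the equation
\[
\begin{aligned}
\widehat{\log(\mathit{price})} = {} & \underset{(.65)}{-1.35} + \underset{(.038)}{.168}\,\log(\mathit{lotsize}) + \underset{(.093)}{.707}\,\log(\mathit{sqrft}) \\
& + \underset{(.029)}{.027}\,\mathit{bdrms} + \underset{(.045)}{.054}\,\mathit{colonial}
\end{aligned}
\]
n = 88, R² = .649.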

Example (Housing price Regression (continued ))

All the variables are self-explanatory except colonial, which is a binary
variable equal to one if the house is of the colonial style. What does the
coefficient on colonial mean? For given levels of lotsize, sqrft, and
bdrms, the difference in log(price) between a house of colonial style and
that of another style is .054. This means that a colonial-style house is
predicted to sell for about 5.4% more, holding other factors fixed.
This example shows that, when log(y) is the dependent variable in a
model, the coefficient on a dummy variable, when multiplied by 100, is
interpreted as the percentage difference in y, holding all other factors
fixed.

When the coefficient on a dummy variable suggests a large
proportionate change in y, the exact percentage difference can be
obtained exactly as with the semi-elasticity calculation in (4).
Example (Log Hourly Wage Equation)

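Let us reestimate the wage equation, using log(wage) as the dependent
variable and adding quadratics in exper and tenure:
\[
\begin{aligned}
\widehat{\log(\mathit{wage})} = {} & \underset{(.100)}{.417} - \underset{(.055)}{.297}\,\mathit{female} + \underset{(.058)}{.080}\,\mathit{educ} + \underset{(.005)}{.029}\,\mathit{exper} \\
& - \underset{(.00010)}{.00058}\,\mathit{exper}^2 + \underset{(.007)}{.032}\,\mathit{tenure} - \underset{(.00023)}{.00059}\,\mathit{tenure}^2
\end{aligned}
\]
n = 526, R² = .441.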

Example (Log Hourly Wage Equation (continued ))
Using the approximation 100 · ∆ log(y) ≈ %∆y, the coefficient on female
implies that, for the same levels of educ, exper, and tenure, women earn
about 100(.297) = 29.7% less than men. We can do better than this by
computing the exact percentage difference in predicted wages. What we
want is the proportionate difference in wages between females and males,
holding other factors fixed:
\((\widehat{\mathit{wage}}_F - \widehat{\mathit{wage}}_M)/\widehat{\mathit{wage}}_M\).
What we have from the estimated model is
\[
\widehat{\log(\mathit{wage}_F)} - \widehat{\log(\mathit{wage}_M)} = -.297.
\]
Exponentiating and subtracting one gives
\[
(\widehat{\mathit{wage}}_F - \widehat{\mathit{wage}}_M)/\widehat{\mathit{wage}}_M = \exp(-.297) - 1 \approx -.257.
\]
This more accurate estimate implies that a woman’s wage is, on average,
25.7% below a comparable man’s wage.
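The same correction applies to any dummy coefficient in a log(y) model. The sketch below compares the approximate and exact percentage differences for the colonial and female coefficients reported above; the function name is our own. It also flips the sign of the female coefficient to show that the exact difference depends on which group is the base, a point the approximation hides.

```python
import math

def exact_pct_diff(coef):
    """Exact % difference in predicted y implied by a dummy coefficient in a log(y) model."""
    return 100.0 * (math.exp(coef) - 1.0)

for name, coef in (("colonial", 0.054), ("female", -0.297)):
    print(f"{name}: approximate {100 * coef:+.1f}%, exact {exact_pct_diff(coef):+.1f}%")

# With women as the base group the log difference is +.297, and the exact
# difference is no longer the mirror image of the figure for women.
men_vs_women = exact_pct_diff(0.297)
print(f"men relative to women: exact {men_vs_women:+.1f}%")
```

For the small colonial coefficient the two figures nearly coincide; for the female coefficient they differ by several percentage points.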
Using Dummy Variables for Multiple Categories
We can use several dummy independent variables in the same
equation. For example, we could add the dummy variable married to
equation (14).
The coefficient on married gives the (approximate) proportional
differential in wages between those who are and are not married,
holding gender, educ, exper, and tenure fixed.

Example (Log Hourly Wage Equation)
Let us estimate a model that allows for wage differences among four groups:
married men, married women, single men, and single women. To do this, we must
select a base group; we choose single men. Then, we must define dummy
variables for each of the remaining groups. Call these marrmale, marrfem, and
singfem. Putting these three variables into (14) (and, of course, dropping
female, since it is now redundant) gives
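\[
\begin{aligned}
\widehat{\log(\mathit{wage})} = {} & \underset{(.100)}{.321} + \underset{(.055)}{.213}\,\mathit{marrmale} - \underset{(.058)}{.198}\,\mathit{marrfem} \\
& - \underset{(.056)}{.110}\,\mathit{singfem} + \underset{(.007)}{.079}\,\mathit{educ} + \underset{(.005)}{.027}\,\mathit{exper} - \underset{(.00011)}{.00057}\,\mathit{exper}^2 \\
& + \underset{(.007)}{.029}\,\mathit{tenure} - \underset{(.00023)}{.00053}\,\mathit{tenure}^2
\end{aligned}
\]
n = 526, R² = .461.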

Example (Log Hourly Wage Equation (continued ))

All of the coefficients, with the exception of singfem, have t statistics well
above two in absolute value. The t statistic for singfem is about −1.96,
which is just significant at the 5% level against a two-sided alternative.
To interpret the coefficients on the dummy variables, we must remember
that the base group is single males. Thus, the estimates on the three
dummy variables measure the proportionate difference in wage relative to
single males. For example, married men are estimated to earn about 21.3%
more than single men, holding levels of education, experience, and tenure
fixed. A married woman, on the other hand, earns a predicted 19.8% less
than a single man with the same levels of the other variables.

Interactions Involving Dummy Variables
We have effectively seen interactions among dummy variables in the
previous example, where we defined four categories based on marital
status and gender.
In fact, we can recast that model by adding an interaction term
between female and married to the model where female and
married appear separately.
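The estimated model with the female-married interaction term is
\[
\begin{aligned}
\widehat{\log(\mathit{wage})} = {} & \underset{(.100)}{.321} - \underset{(.056)}{.110}\,\mathit{female} + \underset{(.055)}{.213}\,\mathit{married} \\
& - \underset{(.072)}{.301}\,\mathit{female} \times \mathit{married} + \dots,
\end{aligned} \tag{15}
\]
where the rest of the regression is necessarily identical to the previous
result.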
Equation (15) shows explicitly that there is a statistically significant
interaction between gender and marital status.
This model also allows us to obtain the estimated wage differential
among all four groups, but here we must be careful to plug in the
correct combination of zeros and ones.
Setting female = 0 and married = 0 corresponds to the group single
men, which is the base group, since this eliminates female, married,
and female × married.
We can find the intercept for married men by setting female = 0 and
married = 1 in (15); this gives an intercept of .321 + .213 = .534,
and so on.
Equation (15) is just a different way of finding wage differentials
across all gender-marital status combinations.
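Plugging in the four combinations of zeros and ones can be done mechanically. The sketch below recovers each group's intercept from (15) and its difference from the base group (single men); the function name and dictionary layout are our own choices.

```python
# Estimates from equation (15): intercept, female, married, female x married.
b0, b_female, b_married, b_inter = 0.321, -0.110, 0.213, -0.301

def group_intercept(female, married):
    """Intercept implied by (15) for a given female/married combination."""
    return b0 + b_female * female + b_married * married + b_inter * female * married

groups = {
    "single men": (0, 0),      # base group
    "married men": (0, 1),
    "single women": (1, 0),
    "married women": (1, 1),
}

base = group_intercept(0, 0)
for name, (female, married) in groups.items():
    intercept = group_intercept(female, married)
    print(f"{name:14s} intercept={intercept:.3f}  diff from base={intercept - base:+.3f}")
```

The differences from the base group reproduce the coefficients on marrmale, singfem, and marrfem in the four-group regression, as the text says they must.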

Example (Effects of Computer Usage on Wages)

Krueger (1993) estimates the effects of computer usage on wages. He
defines a dummy variable, which we call compwork, equal to one if an
individual uses a computer at work. Another dummy variable, comphome,
equals one if the person uses a computer at home. Using 13,379 people
from the 1989 Current Population Survey, Krueger (1993, Table 4) obtains
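\[
\begin{aligned}
\widehat{\log(\mathit{wage})} = {} & \hat\beta_0 + \underset{(.009)}{.177}\,\mathit{compwork} + \underset{(.019)}{.070}\,\mathit{comphome} \\
& + \underset{(.023)}{.017}\,\mathit{compwork} \times \mathit{comphome} + \text{other factors}.
\end{aligned}
\]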
(The other factors are the standard ones for wage regressions, including
education, experience, gender, and marital status; see Krueger’s paper for
the exact list.)
Example (Effects of Computer Usage on Wages (continued ))

Krueger does not report the intercept because it is not of any importance;
all we need to know is that the base group consists of people who do not
use a computer at home or at work.
It is worth noticing that the estimated return to using a computer at work
(but not at home) is about 17.7%. (The more precise estimate is 19.4%.)
Similarly, people who use computers at home but not at work have about a
7% wage premium over those who do not use a computer at all. The
differential between those who use a computer at both places, relative to
those who use a computer in neither place, is about 26.4% (obtained by
adding all three coefficients and multiplying by 100), or the more precise
estimate 30.2% obtained from equation (4).
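The figures quoted for Krueger's estimates can be reproduced as follows. This is only an arithmetic sketch: the coefficients are the three reported above, and the variable and function names are our own.

```python
import math

b_work, b_home, b_both = 0.177, 0.070, 0.017   # compwork, comphome, interaction

def exact_pct(log_diff):
    """Exact % wage differential implied by a difference in log(wage), as in (4)."""
    return 100.0 * (math.exp(log_diff) - 1.0)

# Log-wage differentials relative to people who use a computer in neither place.
differentials = {
    "work only": b_work,
    "home only": b_home,
    "work and home": b_work + b_home + b_both,
}

for group, log_diff in differentials.items():
    print(f"{group:14s} approximate {100 * log_diff:.1f}%, exact {exact_pct(log_diff):.1f}%")
```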

Reading Assignment
Read the following from Wooldridge (2015):
- Allowing for Different Slopes
- Testing for Differences in Regression Functions across Groups

************* End of Chapter Four *************

