Dummy-Variable Regression Model

Dummy-Variable Regression

Model

© 1999 Prentice-Hall, Inc. Chap. 14 - 1


Multiple Regression Models
 Linear: Linear, Dummy Variable, Interaction
 Non-Linear: Polynomial, Square Root, Log, Reciprocal, Exponential



Dummy-Variable
Regression Model
Involves categorical X variable with
2 or more levels
 e.g., male-female, college-no college etc.
 or firms or states or cities

Each level is coded 0 or 1


Assumes only intercept is different
 Slopes are constant across categories
The number of dummy variables included is (# of levels) - 1, because one level serves as the excluded (reference) group

Dummy-Variable Regression Model: Example Coding
Gender (2 levels): variable “MALE”, coded Male=1; Female=0

Marital Status (3 levels - requires 2 dummies):
 MARRIED: Single=0; Divorced=0; Married=1
 DIVORCED: Single=0; Divorced=1; Married=0
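
The coding scheme above can be sketched in a few lines of Python. This is a minimal illustration only; the helper name `dummy_code` and the sample data are assumptions made for the example, not something from the slides.

```python
def dummy_code(values, levels, reference):
    """Return one 0/1 column per non-reference level (k levels -> k - 1 columns)."""
    kept = [lv for lv in levels if lv != reference]
    return {lv.upper(): [1 if v == lv else 0 for v in values] for lv in kept}


# Hypothetical sample: marital status of four people, "single" is the excluded group.
status = ["single", "married", "divorced", "married"]
dummies = dummy_code(status, ["single", "married", "divorced"], reference="single")

for name, column in dummies.items():
    print(name, column)
```

A single person has 0 in both columns, which is why three levels need only two dummies.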



Interpreting Dummy-
Variable Model Equation
Given: Yi  b 0  b 2 X 2 i
Y  Starting salary of college grad' s

0 if Male
X2 
1 if Female

b0 = mean Y for men since for each man Y=b0+b2*(0)


b2= difference of means between men and women since for
women Y=b0+b2*(1).
b0+b2 = mean Y for women
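
A small numerical check of these interpretations, written as a sketch. The salaries are made up and the helper `simple_ols` is an assumption introduced for illustration.

```python
from statistics import mean


def simple_ols(x, y):
    """Least-squares intercept and slope for a single regressor."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Made-up starting salaries in $1000s; X2 = 0 for men, 1 for women.
x2 = [0, 0, 0, 1, 1, 1]
salary = [30, 32, 34, 35, 37, 39]

b0, b2 = simple_ols(x2, salary)
mean_men = mean(s for s, d in zip(salary, x2) if d == 0)
mean_women = mean(s for s, d in zip(salary, x2) if d == 1)

print(b0, mean_men)                # intercept equals the mean for men
print(b2, mean_women - mean_men)   # slope equals the difference of means
```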

Comparison to other techniques
• This is equivalent to a pooled-variance two-sample t-test for the difference of means. We test b2=0 to test if there is a significant difference of means.
• It is also equivalent to a one-way ANOVA for a difference of means.



Dummy-Variable Model
Relationships
[Figure: means of Y for males and females. The male mean is b0; the female mean is b0 + b2.]

Interpreting Dummy-Variable Model Equation
Given: Yi  b0  b1X 1i  b2 X 2i
Y  Starting salary of college grad' s
X 1  GPA
0 if Male
X2 
1 if Female
Males ( X 2  0):
Yi  b0  b1X 1i  b2 0  b0  b1X 1i



Interpreting Dummy-
Variable Model Equation

Females ($X_2 = 1$):
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2(1) = (b_0 + b_2) + b_1 X_{1i}$



Dummy-Variable Model
Relationships
[Figure: two parallel lines of Y against X1 with the same slope b1. The male line has intercept b0; the female line has intercept b0 + b2.]

Dummy-Variable Model Example
Computer output: $\hat{Y}_i = 3 + 5 X_{1i} + 7 X_{2i}$
$X_2 = 0$ if male, $1$ if female

Males ($X_2 = 0$):
$\hat{Y}_i = 3 + 5 X_{1i} + 7(0) = 3 + 5 X_{1i}$

Females ($X_2 = 1$):
$\hat{Y}_i = 3 + 5 X_{1i} + 7(1) = (3 + 7) + 5 X_{1i} = 10 + 5 X_{1i}$

Both groups have the same slope.
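
The fitted equation from the computer output can be evaluated directly. The sketch below uses the coefficients 3, 5 and 7 from the slide; the function name `predict` and the GPA values are assumptions chosen for illustration.

```python
def predict(gpa, female):
    """Fitted value from the slide's output: Y-hat = 3 + 5*X1 + 7*X2."""
    return 3 + 5 * gpa + 7 * female


# The female-minus-male gap is the same at every GPA because the slopes are equal.
for gpa in (2.0, 3.0, 4.0):
    gap = predict(gpa, 1) - predict(gpa, 0)
    print(gpa, predict(gpa, 0), predict(gpa, 1), gap)
```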



Interpretation

The difference in mean starting salary between women and men is 7 (women are higher, since X2 = 1 for females), holding GPA constant.
When there are more than two groups, the coefficient on each dummy is always the difference of means between that group and the EXCLUDED GROUP.



How many dummy
variables do you need?
• To compare union workers and
nonunion workers?
• To compare whites, blacks, Hispanics and Asians?
• To compare months of the year?



EXAMPLE



Interaction
Regression Model
Hypothesizes interaction between pairs of X variables
 Response to one X variable varies at different levels of another X variable
Contains two-way cross-product terms
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$

Can be combined with other models, e.g., the dummy-variable model



Effect of Interaction

Given:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$

Without the interaction term, the effect of X1 on Y is measured by $\beta_1$
With the interaction term, the effect of X1 on Y is measured by $\beta_1 + \beta_3 X_{2i}$
 The effect changes as X2i increases



Interaction Example

Y = 1 + 2X1 + 3X2 + 4X1X2

When X2 = 1: Y = 1 + 2X1 + 3(1) + 4X1(1) = 4 + 6X1
When X2 = 0: Y = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1

[Figure: both lines plotted as Y against X1, for X1 from 0 to 1.5.]

The effect (slope) of X1 on Y depends on the value of X2.
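
As a quick check of this example, the sketch below evaluates the slide's equation Y = 1 + 2X1 + 3X2 + 4X1X2 and the implied slope of X1 at each value of X2. The function names are assumptions for illustration.

```python
def y_hat(x1, x2):
    """The slide's example equation with an interaction term."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_x1(x2):
    """Effect of a one-unit change in X1 on Y: beta1 + beta3 * X2."""
    return 2 + 4 * x2


for x2 in (0, 1):
    # The change in Y from X1 = 0 to X1 = 1 matches the slope at that X2.
    print(x2, slope_x1(x2), y_hat(1, x2) - y_hat(0, x2))
```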

Interaction Regression Model Worksheet

Case, i   Yi   X1i   X2i   X1i X2i
1         1    1     3     3
2         4    8     5     40
3         1    3     2     6
4         3    5     6     30
:         :    :     :     :
Multiply X1 by X2 to get X1X2.
Run regression with Y, X1, X2 , X1X2
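
The worksheet step can be sketched as follows, using the four cases shown in the table. The variable names are assumptions; the resulting rows are what would be passed to whatever regression routine is used.

```python
# The four worksheet cases from the slide.
y = [1, 4, 1, 3]
x1 = [1, 8, 3, 5]
x2 = [3, 5, 2, 6]

# Multiply X1 by X2 case by case to get the cross-product column.
x1x2 = [a * b for a, b in zip(x1, x2)]

# One row per case: constant, X1, X2, X1*X2.
design = [[1, a, b, ab] for a, b, ab in zip(x1, x2, x1x2)]

for yi, row in zip(y, design):
    print(yi, row)
```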

Interpretation when there are 3+ levels
$Y = \alpha + \beta_1\,\mathrm{MALE} + \beta_2\,\mathrm{MARRIED} + \beta_3\,\mathrm{DIVORCED}$
(a, b1, b2, b3 below denote the estimates of $\alpha, \beta_1, \beta_2, \beta_3$)
a = mean Y for a single female (MALE, MARRIED, DIVORCED = 0)
b1 = difference in means between males and females (a + b1 = mean Y for single males)
b2 = difference in means between married and single (holding gender constant)
b3 = difference in means between divorced and single
b2 - b3 = difference in means between married and divorced

Interpretation when there are 3+ levels
It is possible to interact the dummy variables. This can give the same result as a two-way ANOVA.
• In this example, this would allow the effect of marital status to vary with gender.

$Y = \alpha + \beta_1\,\mathrm{MALE} + \beta_2\,\mathrm{MARRIED} + \beta_3\,\mathrm{DIVORCED} + \beta_4\,\mathrm{MALE} \times \mathrm{MARRIED} + \beta_5\,\mathrm{MALE} \times \mathrm{DIVORCED}$
 MALE = 1 if male; 0 if female
 MARRIED = 1 if married; 0 if divorced or single
 DIVORCED = 1 if divorced; 0 if single or married
 MALE*MARRIED = MALE times MARRIED = 1 if married male; 0 otherwise
 MALE*DIVORCED = MALE times DIVORCED = 1 if divorced male; 0 otherwise



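$Y = \alpha + \beta_1\,\mathrm{MALE} + \beta_2\,\mathrm{MARRIED} + \beta_3\,\mathrm{DIVORCED} + \beta_4\,\mathrm{MALE} \times \mathrm{MARRIED} + \beta_5\,\mathrm{MALE} \times \mathrm{DIVORCED}$

Mean of Y in each cell, measured relative to the baseline $\alpha$ (single female):

          SINGLE   MARRIED      DIVORCED
FEMALE    0        B2           B3
MALE      B1       B1+B2+B4     B1+B3+B5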



Interpreting Results

            FEMALE                MALE                                          Difference (male - female)
Single:     $\alpha$              $\alpha + \beta_1$                            $\beta_1$
Married:    $\alpha + \beta_2$    $\alpha + \beta_1 + \beta_2 + \beta_4$        $\beta_1 + \beta_4$
Divorced:   $\alpha + \beta_3$    $\alpha + \beta_1 + \beta_3 + \beta_5$        $\beta_1 + \beta_5$

Main Effects: MALE; (MARRIED and DIVORCED)
Interaction Effects: MALE*MARRIED and MALE*DIVORCED
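
The table of cell means can be reproduced from the coefficients. The sketch below uses hypothetical coefficient values (they do not come from the slides) and an assumed helper `cell_mean`.

```python
def cell_mean(coef, male, married, divorced):
    """Mean of Y for one gender-by-marital-status cell in the interaction model."""
    a, b1, b2, b3, b4, b5 = coef
    return (a + b1 * male + b2 * married + b3 * divorced
            + b4 * male * married + b5 * male * divorced)


# Hypothetical values for (alpha, beta1, ..., beta5), chosen only for illustration.
coef = (20, 3, 2, 1, 4, -1)

gaps = {}
for label, married, divorced in [("single", 0, 0), ("married", 1, 0), ("divorced", 0, 1)]:
    female = cell_mean(coef, 0, married, divorced)
    male = cell_mean(coef, 1, married, divorced)
    gaps[label] = male - female   # beta1, beta1 + beta4, beta1 + beta5
    print(label, female, male, gaps[label])
```

With these numbers the male-female gap differs by marital status (3, 7 and 2), which is exactly what the interaction terms allow.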



Interpreting results

Testing for interaction: must do an F-test of the joint hypothesis that $\beta_4 = \beta_5 = 0$
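
One common way to carry out such a joint test is the partial F statistic, which compares the error sum of squares of the model without the interaction terms (restricted) to that of the model with them (full). The sketch below assumes that approach; the function name and all numbers are hypothetical.

```python
def partial_f(sse_restricted, sse_full, q, n, k_full):
    """Partial F statistic for q joint restrictions.

    sse_restricted: SSE of the model with the q terms dropped
    sse_full:       SSE of the model containing all k_full X terms
    n:              number of observations
    """
    numerator = (sse_restricted - sse_full) / q
    denominator = sse_full / (n - k_full - 1)
    return numerator / denominator


# Hypothetical numbers: 56 observations, 5 X terms in the full model,
# 2 interaction terms tested jointly.
f_stat = partial_f(sse_restricted=120.0, sse_full=100.0, q=2, n=56, k_full=5)
print(f_stat)  # compare with the critical F value for (2, 50) degrees of freedom
```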

EXAMPLE



Polynomial (Curvilinear)
Regression Model



Curvilinear
Regression Model
• Relationship between 1 response variable and 2 or more explanatory variables is a polynomial function
• Useful when scatter diagram indicates non-linear relationship
• Curvilinear model:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
• The second explanatory variable is the square of the 1st.
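
A minimal sketch of fitting the curvilinear model by least squares, treating X1 and its square as two explanatory variables. It solves the normal equations in plain Python; the helper names and the noise-free data are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(x, y):
    """Least-squares b0, b1, b2 for Y = b0 + b1*X + b2*X^2 via the normal equations."""
    cols = [[1.0] * len(x), [float(v) for v in x], [float(v) ** 2 for v in x]]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


# Noise-free made-up data generated from Y = 1 + 2X + 3X^2.
x = [0, 1, 2, 3, 4]
y = [1 + 2 * v + 3 * v ** 2 for v in x]
b0, b1, b2 = fit_quadratic(x, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the data contain no noise, the fit recovers the generating coefficients up to rounding error; with real data the estimates would of course differ from any "true" values.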



Curvilinear Regression
Model
Curvilinear models may be considered when
scatter diagram takes on the following shapes:

[Figure: four scatter-diagram shapes of Y against X1, two with $\beta_2 > 0$ and two with $\beta_2 < 0$.]

$\beta_2$ = the coefficient of the quadratic term



Testing for Significance:
Curvilinear Model
• Testing for Overall Relationship
 Similar to test for linear model
 F test statistic = MSR / MSE
• Testing the Curvilinear Effect
 Compare curvilinear model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
with the linear model
$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$

Testing for Significance: Curvilinear Model
• May require testing a portion of the model (e.g. the linear and squared terms) when there are other variables in the model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \beta_3 X_{2i} + \varepsilon_i$
• Here, to test for the significance of X1, we must use an F-test of the joint hypothesis $\beta_1 = \beta_2 = 0$ for these two “variables”



Inherently Linear
Models
Non-linear models that can be expressed in
linear form
 Can be estimated by LS in linear form
Require data transformation
Multiplicative model example
$Y_i = \beta_0 X_{1i}^{\beta_1} X_{2i}^{\beta_2} \varepsilon_i$

$\ln(Y_i) = \ln(\beta_0) + \beta_1 \ln(X_{1i}) + \beta_2 \ln(X_{2i}) + \ln(\varepsilon_i)$
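
To illustrate the idea, the sketch below log-transforms a multiplicative model and fits it by ordinary least squares. It is simplified to a single regressor (the slide uses two), and the data, the helper `simple_ols` and the generating values 2 and 1.5 are assumptions made for the example.

```python
import math
from statistics import mean


def simple_ols(x, y):
    """Least-squares intercept and slope for a single regressor."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Noise-free made-up data from Y = 2 * X^1.5.
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 * v ** 1.5 for v in x]

# After taking logs the model is linear: ln(Y) = ln(beta0) + beta1 * ln(X).
intercept, beta1 = simple_ols([math.log(v) for v in x], [math.log(v) for v in y])
beta0 = math.exp(intercept)
print(round(beta0, 6), round(beta1, 6))
```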

Using Transformations

• Requires Data Transformation


• Either or Both Independent and
Dependent Variables May be
Transformed
• Can be based on theory, logic or scatter
diagrams



Square Root
Transformation
$Y_i = \beta_0 + \beta_1 \sqrt{X_{1i}} + \beta_2 \sqrt{X_{2i}} + \varepsilon_i$
[Figure: Y against X1, one curve for $\beta_1 > 0$ and one for $\beta_1 < 0$; similarly for X2.]
Transforms a model with one of the above shapes into one that appears linear.
Often used to overcome heteroscedasticity.

Logarithmic Transformation
$Y_i = \beta_0 + \beta_1 \ln(X_{1i}) + \beta_2 \ln(X_{2i}) + \varepsilon_i$
[Figure: Y against X1, one curve for $\beta_1 > 0$ and one for $\beta_1 < 0$; similarly for X2.]



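Exponential Transformation
Original model: $Y_i = e^{\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}} \varepsilon_i$
[Figure: Y against X1, one curve for $\beta_1 > 0$ and one for $\beta_1 < 0$; similarly for X2.]
Transformed into: $\ln Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ln \varepsilon_i$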



Interpretation of
coefficients
• The dependent variable is logged.
 The coefficient on the independent variable can be approximately interpreted as: a 1 unit change in X leads to a 100*b percent change in Y.
• The independent variable is logged.
 The coefficient on the independent variable can be approximately interpreted as: a 1 percent change in X leads to a b/100 unit change in Y.
 Both approximations are accurate only for small changes.



Interpretation of
coefficients
• Both dependent and independent
variables are logged.
 The coefficient on the independent variable
can be approximately interpreted as : a 1
percent change in X leads to a b percentage
change in Y. Therefore b is the elasticity of
Y with respect to a change in X.
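
The interpretations for logged variables are approximations. The sketch below compares them with the exact changes implied by each functional form; the function names and coefficient values are hypothetical and chosen only to show how close the approximation is.

```python
import math


def pct_change_log_y(b, dx=1.0):
    """Exact percent change in Y when X rises by dx in ln(Y) = a + b*X."""
    return 100 * (math.exp(b * dx) - 1)


def unit_change_log_x(b, pct=1.0):
    """Exact change in Y when X rises by pct percent in Y = a + b*ln(X)."""
    return b * math.log(1 + pct / 100)


def pct_change_log_log(b, pct=1.0):
    """Exact percent change in Y when X rises by pct percent in ln(Y) = a + b*ln(X)."""
    return 100 * ((1 + pct / 100) ** b - 1)


print(pct_change_log_y(0.05))     # rule of thumb: about 100 * 0.05 = 5 percent
print(unit_change_log_x(200.0))   # rule of thumb: about 200 / 100 = 2 units
print(pct_change_log_log(0.8))    # rule of thumb: about 0.8 percent (the elasticity)
```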



[Figure slides: Income and Experience]
 Scatter Plot
 Linear Model
 Log Independent Variable
 Income Logged, Log(Y)
 Double Log - Elasticity Model (Note: LFEXP is already logged in this example)
 Quadratic
 Log(Y) plus Quadratic
 All Specifications


Standardized and
Unstandardized
• Many disciplines report ONLY standardized coefficients
• The usual coefficients are then referred to as “unstandardized coefficients”
• The “standardized” coefficients are often referred to as “beta weights”
• The t-tests for significance of the slopes are identical for either of these two.

Interpretation of coefficients
• If both Y and X are measured in standardized form,
$y_i = \dfrac{Y_i - \bar{Y}}{\sigma_Y}$ and $x_i = \dfrac{X_i - \bar{X}}{\sigma_X}$
then the b’s are called standardized coefficients. They indicate the number of standard deviations Y will change when X changes by one standard deviation.
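
A short sketch of the two equivalent ways to obtain a standardized coefficient in a one-regressor model: rescale the unstandardized slope by the ratio of standard deviations, or regress standardized Y on standardized X. The data and helper names are assumptions for illustration.

```python
from statistics import mean, stdev


def simple_slope(x, y):
    """Least-squares slope for a single regressor."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Made-up data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b = simple_slope(x, y)                                     # unstandardized slope
beta = b * stdev(x) / stdev(y)                             # rescaled slope
beta_direct = simple_slope(standardize(x), standardize(y)) # slope on standardized data
print(b, beta, beta_direct)
```

In this single-regressor case the standardized coefficient equals the correlation between X and Y.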

BETA Coefficients Example



Comparison of coefficients

• In general, we should NOT compare coefficients unless they are measured in the same units (e.g. dollars or inches)
• Two “unit free” measures are sometimes used to compare coefficients:
 elasticities (percentage changes)
 standardized coefficients (standard deviation changes)
