Multiple Regression Analysis: The Problem of Estimation: Gujarati 5e, Chapter 7

The three-variable model

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$
Y is the dependent variable
X2 and X3 are the explanatory variables
u is the stochastic disturbance term
$\beta_1$ is the intercept term
$\beta_2$ and $\beta_3$ are the partial regression coefficients



CLRM Assumption (1)

1. Linear regression model, or linear in the
parameters.
2. Fixed X values or X values independent of the
error term. Here, this means we require zero
covariance between ui and each X variable.
cov (ui, X2i) = cov (ui, X3i) = 0
3. Zero mean value of disturbance ui.
E(ui | X2i, X3i) = 0 for each i
4. Homoscedasticity or constant variance of ui.
var (ui) = σ2



CLRM Assumption (2)

5. No autocorrelation, or serial correlation,
between the disturbances.
cov (ui, uj) = 0, i ≠ j
6. The number of observations n must be
greater than the number of parameters to be
estimated, which is 3 in our current case.
7. There must be variation in the values of the
X variables.



CLRM Assumption (3)

8. No exact collinearity between the X
variables: no exact linear relationship
between X2 and X3.
9. There is no specification bias: the model is
correctly specified.
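Assumption 8 can be checked numerically. The sketch below is illustrative only: the data are made up and `collinearity_determinant` is a hypothetical helper name. It computes $(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2$ in deviation form, the quantity that appears in the denominator of the OLS slope estimators. This quantity is zero when X3 is an exact linear function of X2, in which case the slope estimators are not defined.

```python
def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def collinearity_determinant(x2_raw, x3_raw):
    """(sum x2^2)(sum x3^2) - (sum x2*x3)^2, using deviations from means."""
    x2, x3 = deviations(x2_raw), deviations(x3_raw)
    s22 = sum(a * a for a in x2)
    s33 = sum(b * b for b in x3)
    s23 = sum(a * b for a, b in zip(x2, x3))
    return s22 * s33 - s23 ** 2


# Hypothetical data, chosen only for illustration.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3_exact = [2.0 * v + 1.0 for v in X2]   # exact linear function of X2
X3_free = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]  # correlated with X2, but not exactly

det_exact = collinearity_determinant(X2, X3_exact)
det_free = collinearity_determinant(X2, X3_free)
print("exact collinearity:", det_exact)
print("no exact collinearity:", det_free)
```

High but inexact correlation between X2 and X3 keeps this quantity positive, so the estimators still exist; only an exact linear relationship violates the assumption.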



Interpretation of multiple regression equation
$$E(Y_i \mid X_{2i}, X_{3i}) = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i}$$

• The conditional mean or expected value of Y,
conditional upon the given or fixed values of
the variables X2 and X3
• The average or mean value of Y, or the mean
response of Y, for the fixed values of the X
variables



The meaning of partial regression coefficients
• $\beta_2$ measures the change in the mean value of
Y, $E(Y \mid X_{2i}, X_{3i})$, per unit change in X2,
holding X3 constant
• $\beta_3$ measures the change in the mean value of
Y per unit change in X3, holding X2 constant
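One way to make "holding X3 constant" concrete is to remove the linear influence of X3 from both Y and X2, and then regress the two sets of residuals on each other. The sketch below uses made-up data and hypothetical helper names (`deviations`, `cross`, `residuals_on`); it compares that "partialled-out" slope with the closed-form three-variable OLS estimator of $\beta_2$.

```python
def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


def residuals_on(target, regressor):
    """Residuals from a simple regression (with intercept) of target on regressor."""
    t, r = deviations(target), deviations(regressor)
    slope = cross(t, r) / cross(r, r)
    return [ti - slope * ri for ti, ri in zip(t, r)]


# Hypothetical data, roughly Y = 1 + 2*X2 + 3*X3 plus small noise.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [9.1, 7.8, 19.2, 17.9, 29.0, 28.1]

# Step 1 and 2: purge Y and X2 of the linear influence of X3.
y_purged = residuals_on(Y, X3)
x2_purged = residuals_on(X2, X3)

# Step 3: regress purged Y on purged X2 (both have zero mean, so no intercept).
beta2_partial = cross(y_purged, x2_purged) / cross(x2_purged, x2_purged)

# Closed-form three-variable estimator of beta_2, in deviation form.
y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
denom = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
beta2_direct = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / denom

print(beta2_partial, beta2_direct)
```

The two numbers agree up to floating-point rounding, which is why $\beta_2$ is read as the effect of X2 net of X3.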



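OLS estimation of partial regression coefficients
• SRF:
$$Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + \hat u_i$$
• OLS chooses the estimates that minimize the residual sum of squares:
$$\min \sum \hat u_i^2 = \sum \left( Y_i - \hat\beta_1 - \hat\beta_2 X_{2i} - \hat\beta_3 X_{3i} \right)^2$$
• The normal equations:
$$\bar Y = \hat\beta_1 + \hat\beta_2 \bar X_2 + \hat\beta_3 \bar X_3$$
$$\sum Y_i X_{2i} = \hat\beta_1 \sum X_{2i} + \hat\beta_2 \sum X_{2i}^2 + \hat\beta_3 \sum X_{2i} X_{3i}$$
$$\sum Y_i X_{3i} = \hat\beta_1 \sum X_{3i} + \hat\beta_2 \sum X_{2i} X_{3i} + \hat\beta_3 \sum X_{3i}^2$$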

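The three normal equations form a linear system in the three unknown estimates, so they can be solved directly. The following is a minimal sketch under stated assumptions: the data are made up (generated without noise from Y = 1 + 2 X2 + 3 X3, so the solution is known in advance), `solve_linear_system` is a hypothetical helper, and the first normal equation is written in sum form (the mean form multiplied by n).

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def total(a, b=None):
    """Sum of a, or sum of element-wise products of a and b."""
    return sum(a) if b is None else sum(p * q for p, q in zip(a, b))


# Hypothetical noise-free data: Y = 1 + 2*X2 + 3*X3.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(X2, X3)]
n = len(Y)

# Coefficient matrix and right-hand side of the normal equations.
A = [
    [n,         total(X2),     total(X3)],
    [total(X2), total(X2, X2), total(X2, X3)],
    [total(X3), total(X2, X3), total(X3, X3)],
]
b = [total(Y), total(Y, X2), total(Y, X3)]

beta_hat = solve_linear_system(A, b)
print(beta_hat)  # approximately [1, 2, 3] for this noise-free data
```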


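OLS estimators
$$\hat\beta_1 = \bar Y - \hat\beta_2 \bar X_2 - \hat\beta_3 \bar X_3$$

$$\hat\beta_2 = \frac{\left(\sum y_i x_{2i}\right)\left(\sum x_{3i}^2\right) - \left(\sum y_i x_{3i}\right)\left(\sum x_{2i} x_{3i}\right)}{\left(\sum x_{2i}^2\right)\left(\sum x_{3i}^2\right) - \left(\sum x_{2i} x_{3i}\right)^2}$$

$$\hat\beta_3 = \frac{\left(\sum y_i x_{3i}\right)\left(\sum x_{2i}^2\right) - \left(\sum y_i x_{2i}\right)\left(\sum x_{2i} x_{3i}\right)}{\left(\sum x_{2i}^2\right)\left(\sum x_{3i}^2\right) - \left(\sum x_{2i} x_{3i}\right)^2}$$

Lowercase letters denote deviations from sample means, e.g. $y_i = Y_i - \bar Y$.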
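The closed-form estimators translate almost line by line into code. The sketch below is a plain-Python illustration, not a production routine: `ols_three_variable` is a hypothetical function name, and the data are generated without noise from Y = 1 + 2 X2 + 3 X3 so that the expected estimates are known.

```python
def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


def ols_three_variable(Y, X2, X3):
    """Return (b1, b2, b3) for Y = b1 + b2*X2 + b3*X3 + residual."""
    y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
    denom = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
    if denom == 0:
        raise ValueError("X2 and X3 are exactly collinear")
    b2 = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / denom
    b3 = (cross(y, x3) * cross(x2, x2) - cross(y, x2) * cross(x2, x3)) / denom
    n = len(Y)
    b1 = sum(Y) / n - b2 * sum(X2) / n - b3 * sum(X3) / n
    return b1, b2, b3


# Hypothetical noise-free data: Y = 1 + 2*X2 + 3*X3.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(X2, X3)]

b1, b2, b3 = ols_three_variable(Y, X2, X3)
print(b1, b2, b3)  # approximately 1, 2, 3
```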



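Variance of OLS estimators
$$var(\hat\beta_1) = \left[ \frac{1}{n} + \frac{\bar X_2^2 \sum x_{3i}^2 + \bar X_3^2 \sum x_{2i}^2 - 2 \bar X_2 \bar X_3 \sum x_{2i} x_{3i}}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2} \right] \sigma^2$$

$$var(\hat\beta_2) = \frac{\sum x_{3i}^2}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2} \, \sigma^2$$

$$var(\hat\beta_3) = \frac{\sum x_{2i}^2}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2} \, \sigma^2$$

$$cov(\hat\beta_2, \hat\beta_3) = \frac{-r_{23} \, \sigma^2}{\left(1 - r_{23}^2\right) \sqrt{\sum x_{2i}^2} \sqrt{\sum x_{3i}^2}}$$

where $r_{23}$ is the sample correlation coefficient between X2 and X3. The unknown $\sigma^2$ is estimated by

$$\hat\sigma^2 = \frac{\sum \hat u_i^2}{n - 3}$$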
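The variance formulas can be evaluated once $\sigma^2$ is replaced by its estimate, which gives estimated variances. The sketch below assumes made-up data and hypothetical helper names, and uses the identity $\sum \hat u_i^2 = \sum y_i^2 - \hat\beta_2 \sum y_i x_{2i} - \hat\beta_3 \sum y_i x_{3i}$ for the residual sum of squares. It also rewrites $var(\hat\beta_2)$ in terms of $r_{23}$ to show how the variance grows as $r_{23}$ approaches 1.

```python
import math


def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


# Hypothetical data, roughly Y = 1 + 2*X2 + 3*X3 plus small noise.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [9.1, 7.8, 19.2, 17.9, 29.0, 28.1]
n = len(Y)

y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
s22, s33, s23 = cross(x2, x2), cross(x3, x3), cross(x2, x3)
det = s22 * s33 - s23 ** 2

b2 = (cross(y, x2) * s33 - cross(y, x3) * s23) / det
b3 = (cross(y, x3) * s22 - cross(y, x2) * s23) / det

# Residual sum of squares and the estimate of sigma^2 (n - 3 degrees of freedom).
rss = cross(y, y) - b2 * cross(y, x2) - b3 * cross(y, x3)
sigma2_hat = rss / (n - 3)

# Estimated variances and covariance of the estimators.
mean2, mean3 = sum(X2) / n, sum(X3) / n
var_b1 = (1 / n + (mean2 ** 2 * s33 + mean3 ** 2 * s22
                   - 2 * mean2 * mean3 * s23) / det) * sigma2_hat
var_b2 = s33 / det * sigma2_hat
var_b3 = s22 / det * sigma2_hat
cov_b2_b3 = -s23 / det * sigma2_hat

# Equivalent forms in terms of the correlation r23 between X2 and X3.
r23 = s23 / math.sqrt(s22 * s33)
var_b2_via_r = sigma2_hat / (s22 * (1 - r23 ** 2))
cov_via_r = -r23 * sigma2_hat / ((1 - r23 ** 2) * math.sqrt(s22) * math.sqrt(s33))

print("se(b2) =", math.sqrt(var_b2), " se(b3) =", math.sqrt(var_b3))
print("r23 =", r23, " inflation factor 1/(1 - r23^2) =", 1 / (1 - r23 ** 2))
```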




Properties of OLS estimators

1. The regression line (surface) passes through the means $\bar Y$, $\bar X_2$, and $\bar X_3$.
2. The mean value of the estimated $Y_i$ is equal to the
mean value of the actual $Y_i$.
3. The sum of the residuals, and hence their mean,
is equal to zero: $\sum \hat u_i = \bar{\hat u} = 0$
4. The residuals are uncorrelated with X2i and X3i:
$\sum \hat u_i X_{2i} = \sum \hat u_i X_{3i} = 0$
5. The residuals are uncorrelated with $\hat Y_i$:
$\sum \hat u_i \hat Y_i = 0$

Properties of OLS estimators
6. As r23, the correlation coefficient between X2 and
X3, increases toward 1, the variances of $\hat\beta_2$ and
$\hat\beta_3$ increase for given values of $\sigma^2$ and
$\sum x_{2i}^2$ or $\sum x_{3i}^2$.
7. For given values of r23 and $\sum x_{2i}^2$ or $\sum x_{3i}^2$, the
variances of the OLS estimators are directly
proportional to $\sigma^2$.
8. Given the assumptions of the CLRM, they are
BLUE (best linear unbiased estimators).
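Properties 1 to 5 are algebraic consequences of the normal equations, so they hold in any sample and can be verified numerically. The sketch below uses made-up data and hypothetical helper names; each quantity in `checks` should be zero up to floating-point rounding.

```python
def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


def ols_three_variable(Y, X2, X3):
    """Return (b1, b2, b3) for Y = b1 + b2*X2 + b3*X3 + residual."""
    y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
    denom = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
    b2 = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / denom
    b3 = (cross(y, x3) * cross(x2, x2) - cross(y, x2) * cross(x2, x3)) / denom
    n = len(Y)
    b1 = sum(Y) / n - b2 * sum(X2) / n - b3 * sum(X3) / n
    return b1, b2, b3


# Hypothetical data, roughly Y = 1 + 2*X2 + 3*X3 plus small noise.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [9.1, 7.8, 19.2, 17.9, 29.0, 28.1]
n = len(Y)

b1, b2, b3 = ols_three_variable(Y, X2, X3)
fitted = [b1 + b2 * a + b3 * b for a, b in zip(X2, X3)]
resid = [obs - fit for obs, fit in zip(Y, fitted)]

checks = {
    "1. fit at the means minus mean of Y":
        b1 + b2 * sum(X2) / n + b3 * sum(X3) / n - sum(Y) / n,
    "2. mean of fitted minus mean of actual": sum(fitted) / n - sum(Y) / n,
    "3. sum of residuals": sum(resid),
    "4a. sum of residuals times X2": cross(resid, X2),
    "4b. sum of residuals times X3": cross(resid, X3),
    "5. sum of residuals times fitted Y": cross(resid, fitted),
}
for label, value in checks.items():
    print(f"{label}: {value:.2e}")
```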



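The multiple coefficient of
determination, R2
• R2 measures the goodness of fit of the regression
equation: the proportion of the variation in Y
explained by the variables X2 and X3 jointly
$$R^2 = \frac{ESS}{TSS} = \frac{\hat\beta_2 \sum y_i x_{2i} + \hat\beta_3 \sum y_i x_{3i}}{\sum y_i^2}$$
• The variance of a partial regression coefficient can be written in terms of R2:
$$var(\hat\beta_j) = \frac{\sigma^2}{\sum x_j^2} \left( \frac{1}{1 - R_j^2} \right)$$
where $R_j^2$ is the R2 from the regression of $X_j$ on the remaining explanatory variables.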
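R2 can be computed either from the explained sum of squares or as one minus the ratio of the residual to the total sum of squares. The sketch below, which uses made-up data and hypothetical helper names, computes it both ways in deviation form and shows that the two routes agree.

```python
def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


# Hypothetical data, roughly Y = 1 + 2*X2 + 3*X3 plus small noise.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [9.1, 7.8, 19.2, 17.9, 29.0, 28.1]

y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
det = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
b2 = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / det
b3 = (cross(y, x3) * cross(x2, x2) - cross(y, x2) * cross(x2, x3)) / det

tss = cross(y, y)                                # total sum of squares
ess = b2 * cross(y, x2) + b3 * cross(y, x3)      # explained sum of squares
r2_from_ess = ess / tss

# Residuals in deviation form: the intercept drops out.
resid = [yi - b2 * a - b3 * b for yi, a, b in zip(y, x2, x3)]
rss = cross(resid, resid)                        # residual sum of squares
r2_from_rss = 1 - rss / tss

print(r2_from_ess, r2_from_rss)
```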



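R2 and the adjusted R2

• An important property of R2 is that it is a
nondecreasing function of the number of
explanatory variables.
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum \hat u_i^2}{\sum y_i^2}$$
• The adjusted R2, which is adjusted for the df:
$$\bar R^2 = 1 - \frac{\sum \hat u_i^2 / (n - k)}{\sum y_i^2 / (n - 1)}$$
where k is the number of parameters in the model, including the intercept.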



R2 and the adjusted R2
n 1

R  1 1 R 2
 nk
1. For k>1, adjusted R2 < R2 which implies that as the
number of X variables increases, the adjusted R2
increases less than the unadjusted R2
2. Adjusted R2 can be negative

 Other criteria: Akaike’s Information Criterion

(AIC) and Amemiya’s Precidtion criterion
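Both points about the adjusted R2 are easy to see with numbers. The sketch below uses a hypothetical function name, `adjusted_r2`, and made-up values of R2, n, and k; it applies the formula in terms of R2 and also checks it against the degrees-of-freedom form for one made-up pair of RSS and TSS.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k parameters (intercept included)."""
    if n <= k:
        raise ValueError("need more observations than parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# A good fit: the adjustment lowers R2 only slightly.
high = adjusted_r2(0.90, n=20, k=3)

# A poor fit in a small sample: the adjusted R2 turns negative.
low = adjusted_r2(0.05, n=10, k=3)

# Same result from the degrees-of-freedom form, with made-up RSS and TSS.
rss, tss, n, k = 10.0, 100.0, 20, 3
df_form = 1.0 - (rss / (n - k)) / (tss / (n - 1))

print(high, low, df_form)
```

When the adjusted R2 comes out negative in practice, it is usually reported as zero.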



Comparing the two R2 values
• In comparing two models on the basis of R2 or
adjusted R2, the sample size n and the
dependent variable must be the same; the
explanatory variables may take any form.
• For the models
$$\ln Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$
$$Y_i = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + u_i$$
the R2 values cannot be compared, because the dependent variables (ln Y and Y) differ.

09/08/20 Prepared by Sri Yani K

The “game” of maximizing R2
• Sometimes researchers play the game of
maximizing adjusted R2, that is, choosing the
model that gives the highest adjusted R2. But
this may be dangerous.
• The researcher should be more concerned
about the logical or theoretical relevance of the
explanatory variables to the dependent variable
and their statistical significance.
• A low adjusted R2 does not necessarily mean
that the model is bad.



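The Cobb-Douglas production function
$$Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i}$$
where Y = output
X2 = labor input
X3 = capital input
u = stochastic disturbance term
e = base of natural logarithm
• Model transform (taking natural logs of both sides):
$$\ln Y_i = \beta_0 + \beta_2 \ln X_{2i} + \beta_3 \ln X_{3i} + u_i$$
where $\beta_0 = \ln \beta_1$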

The properties of the Cobb-Douglas
production function
1. $\beta_2$ is the partial elasticity of output with respect
to the labor input, holding the capital input constant.
2. $\beta_3$ is the partial elasticity of output with respect
to the capital input, holding the labor input constant.
3. The sum ($\beta_2 + \beta_3$) gives information about the
returns to scale, that is, the response of output
to a proportionate change in the inputs.
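Because the log-transformed model is linear in the parameters, the elasticities can be estimated with the same three-variable OLS formulas applied to ln Y, ln X2, and ln X3. The sketch below is illustrative only: the input series are made up, output is generated without noise from assumed parameter values (intercept term 2, labor elasticity 0.6, capital elasticity 0.3), and `ols_three_variable` is a hypothetical helper. A sum of elasticities equal to 1 indicates constant returns to scale, below 1 decreasing returns, and above 1 increasing returns.

```python
import math


def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


def ols_three_variable(Y, X2, X3):
    """Return (b1, b2, b3) for Y = b1 + b2*X2 + b3*X3 + residual."""
    y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
    denom = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
    b2 = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / denom
    b3 = (cross(y, x3) * cross(x2, x2) - cross(y, x2) * cross(x2, x3)) / denom
    n = len(Y)
    b1 = sum(Y) / n - b2 * sum(X2) / n - b3 * sum(X3) / n
    return b1, b2, b3


# Hypothetical inputs and noise-free output: Y = 2 * labor^0.6 * capital^0.3.
labor = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
capital = [5.0, 8.0, 6.0, 12.0, 9.0, 15.0]
output = [2.0 * L ** 0.6 * K ** 0.3 for L, K in zip(labor, capital)]

# Log-transform, then run OLS on the linearized model.
ln_y = [math.log(v) for v in output]
ln_labor = [math.log(v) for v in labor]
ln_capital = [math.log(v) for v in capital]
b0, labor_elasticity, capital_elasticity = ols_three_variable(ln_y, ln_labor, ln_capital)

scale_factor = math.exp(b0)          # recovers beta_1 from beta_0 = ln(beta_1)
returns_to_scale = labor_elasticity + capital_elasticity
print(scale_factor, labor_elasticity, capital_elasticity, returns_to_scale)
```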



Polynomial regression models

• The general kth degree polynomial regression
may be written as
$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \dots + \beta_k X_i^k + u_i$$
• The $\beta$'s can be estimated by OLS or ML, because the model is linear in the parameters.

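A second-degree polynomial is a three-variable model in disguise: treat X as one regressor and its square as the other. The sketch below assumes made-up, noise-free data generated from Y = 1 + 2 X + 0.5 X^2 and a hypothetical helper, `ols_three_variable`. Higher degrees need a general multiple-regression solver, which is not shown here.

```python
def deviations(values):
    """Return each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


def ols_three_variable(Y, X2, X3):
    """Return (b1, b2, b3) for Y = b1 + b2*X2 + b3*X3 + residual."""
    y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
    denom = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
    b2 = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / denom
    b3 = (cross(y, x3) * cross(x2, x2) - cross(y, x2) * cross(x2, x3)) / denom
    n = len(Y)
    b1 = sum(Y) / n - b2 * sum(X2) / n - b3 * sum(X3) / n
    return b1, b2, b3


# Hypothetical noise-free data: Y = 1 + 2*X + 0.5*X^2.
X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.0 + 2.0 * v + 0.5 * v ** 2 for v in X]

# X and X^2 are related, but not linearly, so there is no exact collinearity.
X_squared = [v ** 2 for v in X]
intercept, linear_coef, quadratic_coef = ols_three_variable(Y, X, X_squared)
print(intercept, linear_coef, quadratic_coef)  # approximately 1, 2, 0.5
```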



