Introduction to Econometrics, 5th Edition: Chapter 4: Nonlinear Models and Transformations of Variables

Dougherty
Introduction to Econometrics,
5th edition
Chapter 4: Nonlinear Models and
Transformations of Variables
© Christopher Dougherty, 2016. All rights reserved.

QUADRATIC EXPLANATORY VARIABLES
$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_2^2 + u$$
We will now consider models with quadratic explanatory variables of the type shown. Such
a model can be fitted using OLS with no modification.
However, the usual interpretation of a parameter, that it represents the effect of a unit
change in its associated variable, holding all other variables constant, cannot be applied. It
is not possible for $X_2$ to change without $X_2^2$ also changing.
$$\frac{dY}{dX_2} = \beta_2 + 2\beta_3 X_2$$
Differentiating the equation with respect to $X_2$, one obtains the change in $Y$ per unit change
in $X_2$. Thus, the impact of a unit change in $X_2$ on $Y$, $\beta_2 + 2\beta_3 X_2$, is a function of $X_2$.
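The derivative is a rate of change at a point, so it is only an approximation to the effect of a literal one-unit increase. The sketch below compares the two. It is illustrative only: the coefficient values are hypothetical, chosen for the example and not taken from the text.

```python
# Hypothetical coefficients, chosen only for illustration (not estimates from the text).
BETA_1, BETA_2, BETA_3 = 10.0, 2.0, -0.1


def fitted_y(x2):
    """Nonrandom part of the quadratic model: beta1 + beta2*X2 + beta3*X2^2."""
    return BETA_1 + BETA_2 * x2 + BETA_3 * x2 ** 2


def marginal_effect(x2):
    """Analytic derivative dY/dX2 = beta2 + 2*beta3*X2."""
    return BETA_2 + 2 * BETA_3 * x2


def unit_change_effect(x2):
    """Exact change in the fitted value when X2 rises from x2 to x2 + 1."""
    return fitted_y(x2 + 1) - fitted_y(x2)


for x2 in (0, 5, 10):
    print(x2, marginal_effect(x2), unit_change_effect(x2))
```

Algebraically, the exact one-unit change equals the derivative plus $\beta_3$, so the two agree closely whenever $\beta_3$ is small relative to the marginal effect.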
This means that $\beta_2$ has an interpretation that is different from that in the ordinary linear
model where it is the unqualified effect of a unit change in $X_2$ on $Y$.
In this model, $\beta_2$ should be interpreted as the effect of a unit change in $X_2$ on $Y$ for the
special case where $X_2 = 0$. For nonzero values of $X_2$, the marginal effect will be different.
$$Y = \beta_1 + (\beta_2 + \beta_3 X_2) X_2 + u$$
$\beta_3$ also has a special interpretation. If we rewrite the model as shown, $\beta_3$ can be interpreted
as the rate of change of the coefficient of $X_2$, per unit change in $X_2$.
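A short worked step, added here for clarity and not part of the original slides, separates two quantities that are easy to confuse. The coefficient of $X_2$ in the rewritten model changes at rate $\beta_3$, whereas the marginal effect $dY/dX_2$ changes at rate $2\beta_3$:

```latex
\begin{align*}
\frac{d}{dX_2}\left(\beta_2 + \beta_3 X_2\right) &= \beta_3, \\
\frac{d}{dX_2}\left(\frac{dY}{dX_2}\right)
  = \frac{d}{dX_2}\left(\beta_2 + 2\beta_3 X_2\right) &= 2\beta_3.
\end{align*}
```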
Only $\beta_1$ has a conventional interpretation. As usual, it is the value of $Y$ (apart from the
random component) when $X_2 = 0$.
There is a further problem. We know that the estimate of the intercept may have no
sensible meaning if $X_2 = 0$ is outside the data range. In that case, the same type of
distortion can affect the estimate of $\beta_2$, since it too is defined at $X_2 = 0$.
. gen SSQ = S*S

. reg EARNINGS S SSQ

      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F( 2, 497)    =   23.44
       Model |  6061.38243     2  3030.69122           Prob > F      =  0.0000
    Residual |  64267.5838   497  129.311034           R-squared     =  0.0862
-------------+------------------------------           Adj R-squared =  0.0825
       Total |  70328.9662   499  140.939812           Root MSE      =  11.372

------------------------------------------------------------------------------
    EARNINGS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |   .1910651   1.785822     0.11   0.915    -3.317626    3.699757
         SSQ |   .0366817   .0606266     0.61   0.545    -.0824344    .1557978
       _cons |   8.358401   12.86047     0.65   0.516    -16.90919    33.62599
------------------------------------------------------------------------------
We will illustrate this with the earnings function. The table gives the output of a quadratic
regression of earnings on schooling (SSQ is defined as the square of schooling).
The coefficient of S implies that, for an individual with no schooling, the impact of a year of
schooling is to increase hourly earnings by $0.19.
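To see how the implied effect of schooling varies, the marginal effect $\beta_2 + 2\beta_3 S$ can be evaluated at several schooling levels using the reported coefficients. This is a minimal sketch: the coefficients are copied from the output above, while the schooling levels chosen for evaluation are arbitrary illustrative values.

```python
# Coefficients copied from the quadratic earnings regression output.
B_S = 0.1910651     # coefficient of S
B_SSQ = 0.0366817   # coefficient of SSQ


def marginal_effect_of_schooling(s):
    """Estimated d(EARNINGS)/dS = b_S + 2*b_SSQ*S, in dollars per hour per year."""
    return B_S + 2 * B_SSQ * s


# Evaluation points are illustrative choices, not values singled out in the text.
for years in (0, 8, 12, 16, 20):
    effect = marginal_effect_of_schooling(years)
    print(f"S = {years:2d}: {effect:.2f} dollars per hour per extra year")
```

Because the coefficient of SSQ is positive, the estimated marginal effect rises with S, so the figure of $0.19 applies only at S = 0.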
It is also doubtful whether the intercept has any sensible interpretation. Literally, it implies
that an individual with no schooling would have hourly earnings of $8.36, which seems
implausibly high.
[Figure: hourly earnings ($) plotted against years of schooling (highest grade completed), with the fitted quadratic curve.]
The quadratic relationship is illustrated in the figure. Over the range of the actual data, it
fits the observations tolerably well. The fit is not dramatically different from those of the
linear and semilogarithmic specifications.
Most wage equation studies prefer the semilogarithmic specification. The slope coefficient
has a simple interpretation and the specification does not give rise to nonsensical
predictions outside the data range.
Average annual percentage growth rates

                Employment   GDP                     Employment   GDP
Australia          2.57      3.52   Korea               1.11      4.48
Austria            1.64      2.66   Luxembourg          1.34      4.55
Belgium            1.06      2.27   Mexico              1.88      3.36
Canada             1.90      2.57   Netherlands         0.51      2.37
Czech Republic     0.79      5.62   New Zealand         2.67      3.41
Denmark            0.58      2.02   Norway              1.36      2.49
Estonia            2.28      8.10   Poland              2.05      5.16
Finland            0.98      3.75   Portugal            0.13      1.04
France             0.69      2.00   Slovak Republic     2.08      7.04
Germany            0.84      1.67   Slovenia            1.60      4.82
Greece             1.55      4.32   Sweden              0.83      3.47
Hungary            0.28      3.31   Switzerland         0.90      2.54
Iceland            2.49      5.62   Turkey              1.30      6.90
Israel             3.29      4.79   United Kingdom      0.92      3.31
Italy              0.89      1.29   United States       1.36      2.88
Japan              0.31      1.85
The data on employment growth rate, e, and GDP growth rate, g, for 31 OECD countries in
Exercise 1.5 provide another example where one might consider the use of a quadratic
function.
. gen gsq = g*g

. reg e g gsq

      Source |       SS       df       MS              Number of obs =      31
-------------+------------------------------           F( 2, 28)     =    7.03
       Model |  6.05131556     2  3.02565778           Prob > F      =  0.0034
    Residual |  12.0579495    28  .430641052           R-squared     =  0.3342
-------------+------------------------------           Adj R-squared =  0.2866
       Total |   18.109265    30  .603642167           Root MSE      =  .65623

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           g |   .6616232   .2988805     2.21   0.035     .0493942    1.273852
         gsq |  -.0490589   .0336736    -1.46   0.156    -.1180362    .0199185
       _cons |  -.2576489   .5845635    -0.44   0.663    -1.455073     .939775
------------------------------------------------------------------------------
The output from a quadratic regression is shown. gsq has been defined as the square of g.
[Figure: employment growth rate plotted against GDP growth rate, with the fitted quadratic and hyperbolic curves.]
The quadratic specification appears to be an improvement on the hyperbolic function fitted
in a previous slideshow. It is more satisfactory than the latter for low values of g, in that it
does not yield implausibly large negative predicted values of e.
The only defect is that the fitted value of e starts to fall once g exceeds about 7 (the fitted curve peaks at g ≈ 6.7).
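The point at which the fitted curve turns down can be recovered from the reported coefficients by setting the derivative $\beta_2 + 2\beta_3 g$ to zero. A minimal sketch, using only the coefficients from the output above:

```python
# Coefficients copied from the quadratic regression of e on g and gsq.
B_CONS = -0.2576489
B_G = 0.6616232
B_GSQ = -0.0490589


def fitted_e(g):
    """Fitted employment growth rate from the quadratic specification."""
    return B_CONS + B_G * g + B_GSQ * g ** 2


# Setting de/dg = B_G + 2*B_GSQ*g equal to zero gives the turning point.
turning_point = -B_G / (2 * B_GSQ)

print(f"fitted e peaks at g = {turning_point:.2f}")
print(f"fitted e at the peak = {fitted_e(turning_point):.2f}")
```

Because the coefficient of gsq is negative, this turning point is a maximum, and fitted employment growth declines for higher values of g.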
[Figure: employment growth rate plotted against GDP growth rate, with the fitted quadratic, cubic, and quartic curves.]
Why stop at a quadratic? Why not consider a cubic, or quartic, or a polynomial of even
higher order? There are usually several good reasons for not doing so.
Diminishing marginal effects are standard in economic theory, justifying quadratic
specifications, at least as an approximation, but economic theory seldom suggests that a
relationship might sensibly be represented by a cubic or higher-order polynomial.
The second reason follows from the first. There will be an improvement in fit as
higher-order terms are added, but because these terms are not theoretically justified, the
improvement will be sample-specific.
Third, unless the sample is very small, the fits of higher-order polynomials are unlikely to
be very different from those of a quadratic over the main part of the data range.
These points are illustrated by the figure, which shows cubic and quartic regressions with
the quadratic regression. Over the main data range, from g = 1.5 to g = 5, the fits of the
cubic and quartic are very similar to that of the quadratic.
$R^2$ for the quadratic specification is 0.334. For the cubic and quartic it is 0.345 and 0.355,
relatively small improvements.
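One way to judge whether such gains are worth the extra terms is to compare adjusted $R^2$, which penalizes additional parameters. The sketch below is a rough check under stated assumptions: the sample size of 31 is inferred from the 30 total degrees of freedom in the quadratic regression output, and the cubic and quartic are assumed to have been fitted to the same sample. The $R^2$ values are the rounded figures quoted in the text.

```python
# Assumption: n = 31, inferred from the total degrees of freedom (30) in the
# quadratic regression output; the cubic and quartic are assumed to use the same sample.
N_OBS = 31

# (R-squared as quoted in the text, number of estimated parameters including the intercept)
SPECIFICATIONS = {
    "quadratic": (0.334, 3),
    "cubic": (0.345, 4),
    "quartic": (0.355, 5),
}


def adjusted_r_squared(r_squared, n_obs, n_params):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_params)


for name, (r2, k) in SPECIFICATIONS.items():
    adj = adjusted_r_squared(r2, N_OBS, k)
    print(f"{name}: R2 = {r2:.3f}, adjusted R2 = {adj:.3f}")
```

Under these assumptions, adjusted $R^2$ declines as the cubic and quartic terms are added, which is consistent with the view that the improvement in fit is not substantive.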
Further, the cubic and quartic curves both exhibit implausible characteristics.
As g increases, the slope of the cubic first diminishes and then increases. There is no
reasonable explanation. The quartic curve actually declines for values of g from 5 to 7, and
then exhibits a strange upward twist at its end.
Copyright Christopher Dougherty 2016.
These slideshows may be downloaded by anyone, anywhere for personal use.

Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.3 of C. Dougherty,
Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
www.oxfordtextbooks.co.uk/orc/dougherty5e/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2016.05.02
