
Extensions to OLS: NLLS, Quantile regression, Model specification
Miller GR5412
Non-linear regression
• Following Hansen, Ch. 9
• What if we want a non-linear regression? (a non-linear function of the parameters – θ in Hansen’s notation)
• For example:
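One common specification that is non-linear in θ (an illustrative choice, not necessarily the slide's own example):

```latex
% Exponential-in-parameters regression: \theta_3 enters inside the
% exponential, so the model cannot be estimated by linear methods.
y_i = \theta_1 + \theta_2 \exp(\theta_3 x_i) + e_i,
\qquad \mathbb{E}[e_i \mid x_i] = 0
```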
NLLS

S_n(θ) = Σ_{i=1}^n (y_i − m(x_i, θ))²

• Above is the objective function for least-squares estimation when m is linear in θ
• If we use it when the conditional mean function, m, is non-linear in θ, the estimator is called non-linear least squares (NLLS)
• Needs numerical optimization, generally
NLLS
If m, the conditional mean function, is differentiable, then the FOCs are:

Σ_{i=1}^n m_θ(x_i, θ̂)(y_i − m(x_i, θ̂)) = 0

where m_θ(x, θ) = (∂/∂θ) m(x, θ)
NLLS
Theorem: if the model is identified and m is differentiable, then:

√n (θ̂ − θ) →_d N(0, V),  V = Q⁻¹ Ω Q⁻¹

where Q = E[m_θ m_θ′] and Ω = E[m_θ m_θ′ e²], with m_θ evaluated at the true θ
NLLS
Can estimate the ACOV matrix in the natural way:

V̂ = (Σ_i m̂_θi m̂_θi′)⁻¹ (Σ_i m̂_θi m̂_θi′ ê_i²) (Σ_i m̂_θi m̂_θi′)⁻¹

where m̂_θi = m_θ(x_i, θ̂) and ê_i = y_i − m(x_i, θ̂).
Note the similarity to the ML variance estimator
NLLS
• Because of the numerical optimization, all the relevant caveats from ML also apply
• In highly non-linear models the asymptotic standard errors may be highly inaccurate in small samples because of the use of Taylor’s theorem
• When reporting parameters, we will often want to calculate marginal effects
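As a sketch of NLLS in practice, the following uses SciPy's `curve_fit` (a Levenberg–Marquardt-type routine) on a hypothetical exponential mean function; the model, parameter values, and data here are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical non-linear mean function: m(x, theta) = t0 + t1 * exp(t2 * x)
def m(x, t0, t1, t2):
    return t0 + t1 * np.exp(t2 * x)

# Simulate data from the model with known parameters (1.0, 2.0, -1.5)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=200)
y = m(x, 1.0, 2.0, -1.5) + rng.normal(0.0, 0.1, size=200)

# Numerical minimization of the least-squares objective; the starting
# value p0 matters in highly non-linear problems
theta_hat, acov = curve_fit(m, x, y, p0=[0.5, 1.0, -1.0])
se = np.sqrt(np.diag(acov))  # asymptotic standard errors
```

As the slide warns, these asymptotic standard errors rest on a local linearization and can be unreliable in small samples.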
Least absolute deviations (LAD)
• Another approach to estimation of the linear model is least absolute deviations estimation, AKA median regression, where we minimize the sum of the absolute errors, rather than the sum of squared errors
• Because it is not quadratic, LAD may behave better when there are outliers or highly non-normal error distributions
• Note that if the error distribution is symmetric, mean = median
Least absolute deviations (LAD)
One motivation for the LAD estimator is that for the median of a sample:

med(y_1, …, y_n) = argmin_c Σ_{i=1}^n |y_i − c|
Least absolute deviations (LAD)
Conditional median regression model:

med(y_i | x_i) = x_i′β,  with β̂ = argmin_β Σ_{i=1}^n |y_i − x_i′β|
Least absolute deviations (LAD)
• Minimization must be numerical – note that the objective is non-differentiable
• Estimation of the density at zero is a little tricky – need to check your software for the method used
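The density at zero matters because it appears in the LAD asymptotic variance; a sketch of the standard result (see Hansen), where f(0 | x) is the conditional density of the error at zero:

```latex
% LAD asymptotic distribution; the software must estimate f(0 | x).
\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N(0, V),
\qquad
V = \tfrac{1}{4}
    \bigl(\mathbb{E}[f(0 \mid x)\, x x']\bigr)^{-1}
    \mathbb{E}[x x']
    \bigl(\mathbb{E}[f(0 \mid x)\, x x']\bigr)^{-1}
```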
Quantile regression
• You can do something similar for any quantile (e.g., the 90th percentile)
• This is called quantile regression
• Many applications in micro
• See Hansen for details
Different approaches to specification
• No generally agreed-upon approach to specification searches
• Two main approaches are specific-to-general and general-to-specific
• Both have issues and there are fierce proponents of each
• There’s another recent current of thought that says we should be concentrating on the causality of specific factors and looking for quasi-experiments
Ramsey’s RESET
• Ramsey’s RESET test is a specification test
• Instead of adding functions of the x’s directly, we add and test functions of ŷ
• Estimate y = b0 + b1x1 + … + bkxk + d1ŷ² + d2ŷ³ + ε and test
• H0: d1 = 0, d2 = 0 using F ~ F(2, n−k−3) or LM ~ χ²(2)
Goodness of fit tests: AIC
There are several different information criteria, which are essentially goodness-of-fit measures. One of the most heavily used, historically, is the Akaike information criterion:

AIC = log(SSR_j / n) + 2(j + 1)/n

where j is the number of parameters (in this case p). Note that this is the form for OLS/NLLS. More generally:

AIC = −2[maximized log likelihood] + 2(# of parameters)
    = n·log(σ̂²_ε) + 2(# of parameters)
Goodness of fit tests: AIC (II)
Akaike presented a long, complex argument to motivate the AIC. Amemiya calls it “unconvincing” and Chow has noted that it contains mathematical errors
Lag selection – SIC and BIC
More used in the recent literature is the Schwarz criterion, which was derived based on Bayesian arguments, in particular the notion that you want to allow test size to shrink as the sample size gets bigger:

SIC = log(SSR_j / n) + (j + 1)·log(n)/n

More generally:

SIC = −2[maximized log likelihood] + (# of parameters)·log(n)

For both AIC and SIC as model selection criteria you want to minimize the criterion. Note that Hayashi says SIC is the same as the Bayesian information criterion (BIC), but this is not true in general
Non-nested alternative tests
• With the same dependent variable, but non-nested x’s, can still make a comprehensive model and test exclusions
• An alternative, the Davidson-MacKinnon test, uses ŷ from one model as a regressor in the second model and tests for significance
Issues with specification testing
• If your estimation strategy is to first test for specification and then test parameters on the selected model, you have a pre-test estimator
• The problem with pre-test estimators is that the distribution of the estimated parameters is not equal to the distribution of parameters estimated in the final model
• Generally, the distribution of pre-test estimators is unknown
