Chapter 9


Econometrics – II

BA (H) Econ Core – Spring 2024 (February to May)

Chapter 9: More on specification and data issues


Wooldridge, J.M. (2015) Introductory Econometrics, 6e

Tirtha Chatterjee
Topics we cover

• What is functional form misspecification or model misspecification


• Tests to detect model misspecification
• Nested and non-nested models
• Drawbacks with the methods
• Functional misspecification because of omission of variables
• Using proxy variables
• Measurement Error in dependent and explanatory variables
• Missing Data
• Outliers or Outlying observations

Functional form misspecification

• Occurs when the model does not properly account for the relationship between the dependent and the observed explanatory variables
• Can be because of omitting important explanatory variables
• Can also be because of a wrong functional form of the dependent variable
• Say we use wage instead of log(wage)
• Leads to biased estimators

Misspecification because of Omission

• Example- suppose hourly wage is determined by


log(wage) = β0 + β1 educ + β2 exper + β3 exper² + u
• but if we omit exper², then we are committing a functional form misspecification.
• biased estimators of all parameters
• Could arise from omitting any other variable – say some interaction variable
• Could be exper·educ

Misspecification because of Omission- nested models

• F test for joint exclusion restrictions can be used to detect functional form
misspecification
• Sometimes difficult to pinpoint the precise reason for the misspecification
• Regression Specification Error Test- RESET (Ramsey, 1969)

RESET

• Useful to detect general functional form misspecification


• Say the original model to be estimated is
y = β0 + β1x1 + β2x2 + ⋯ + βk xk + u
• RESET is implemented to test if we should include non-linear or higher order terms in
our model
• RESET adds polynomials of fitted values to equation to detect general kinds of
functional form misspecification.
• To implement RESET, we must decide how many functions of the fitted values to include in an
expanded regression.
• There is no right answer to this question, but the squared and cubed terms have proven to be useful
in most applications.

RESET - Implementation

• First, we estimate our original model: y = β0 + β1x1 + β2x2 + ⋯ + βk xk + u
• Let ŷ denote the OLS fitted values.
• Next, we regress y on x1, …, xk, ŷ², ŷ³
• ŷ² and ŷ³ are just nonlinear functions of the xj
• The expanded/auxiliary regression is
y = β0 + β1x1 + β2x2 + ⋯ + βk xk + δ1ŷ² + δ2ŷ³ + v
• We are not interested in the estimated parameters
• We only use this equation to test whether the original model has missed important nonlinearities.

RESET - Implementation

• The null hypothesis is that the original model is correctly specified.
• H0: δ1 = 0, δ2 = 0
• RESET is the F statistic for testing this null; under H0 it is distributed approximately as F(2, n − k − 3)
• A significant F statistic suggests some sort of functional form problem.
• The test can be made robust to heteroskedasticity using the methods discussed in Chapter 8
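The RESET steps above can be sketched with numpy/scipy. The data-generating process and all coefficients here are illustrative assumptions (not from the text): the truth is quadratic in x1, while the fitted model is purely linear, so the test should reject.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=(2, n))
# Truth contains x1^2, but the model fitted below omits it (misspecification).
y = 1 + 2 * x1 + 0.5 * x2 + 1.5 * x1**2 + rng.normal(size=n)

def ols(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

X = np.column_stack([np.ones(n), x1, x2])
b, resid_r = ols(X, y)
ssr_r = resid_r @ resid_r              # restricted SSR (original model)

# Expanded regression adds yhat^2 and yhat^3.
yhat = X @ b
X_ur = np.column_stack([X, yhat**2, yhat**3])
_, resid_ur = ols(X_ur, y)
ssr_ur = resid_ur @ resid_ur           # unrestricted SSR

q = 2                                  # H0: delta1 = delta2 = 0
df = n - X_ur.shape[1]
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df)
p = stats.f.sf(F, q, df)
print(F, p)   # a large F (tiny p) flags a functional form problem
```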

RESET- example
• We estimate two models for housing prices; sample size = 88
• The first one has all variables in level form:
price = β0 + β1 lotsize + β2 sqrft + β3 bdrms + u
• The second one uses the logarithms of all variables except bdrms:
lprice = β0 + β1 llotsize + β2 lsqrft + β3 bdrms + u
• First model
• RESET F statistic for the first model: F(2,82) = 4.67, p = 0.012
• We reject the null hypothesis at the 5% level of significance – evidence of misspecification
• Second model
• RESET F statistic for the second model: F(2,82) = 2.56, p = 0.084
• We do not reject the null hypothesis at the 5% level of significance – no evidence of misspecification
• On the basis of RESET, the log-log (second) model is the preferred specification

RESET- drawback

• It does not provide any direction on how to proceed if the model is rejected.
• In our previous example, when the first model was rejected, we could think of log-log model
because it is easy to interpret
• In this example, it so happens that it passes the functional form test as well.
• We might not be so lucky all the time- What if the second model got rejected too?
• RESET has no power for detecting omitted variables whose expectations are linear in the included regressors
• The bottom line is that RESET is a functional form test, and nothing more.

Non-nested models

• Following are two non-nested models


y = β0 + β1x1 + β2x2 + u

y = β0 + β1 log(x1) + β2 log(x2) + u
• We cannot simply use a standard F test, because neither model is a special case of the other.
• Two approaches have been suggested.
• Mizon and Richard (1986)
• Davidson-MacKinnon test

First approach-Mizon and Richard (1986)- Implementation

• Steps
• First construct a comprehensive model that contains each model as a special case and
• then test the restrictions that led to each of the models.
• The comprehensive model is
y = γ0 + γ1x1 + γ2x2 + γ3 log(x1) + γ4 log(x2) + u
• We can first test H0: γ3 = 0, γ4 = 0 as a test of the first model
• We can also test H0: γ1 = 0, γ2 = 0 as a test of the second model.
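A minimal numpy sketch of these two F tests, on simulated data generated from the log-log model (all names and numbers are illustrative assumptions, not from the text):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 300
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(1, 5, n)
# Data generated from the log-log specification (model 2).
y = 1 + 2 * np.log(x1) + 3 * np.log(x2) + rng.normal(scale=0.5, size=n)

def ssr(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones(n)
# Comprehensive model contains both models as special cases.
X_full = np.column_stack([ones, x1, x2, np.log(x1), np.log(x2)])
ssr_full = ssr(X_full, y)
df = n - X_full.shape[1]

def f_test(X_restricted):
    q = X_full.shape[1] - X_restricted.shape[1]
    F = ((ssr(X_restricted, y) - ssr_full) / q) / (ssr_full / df)
    return F, stats.f.sf(F, q, df)

# Test of model 1 (levels): H0: gamma3 = gamma4 = 0
F1, p1 = f_test(np.column_stack([ones, x1, x2]))
# Test of model 2 (logs):   H0: gamma1 = gamma2 = 0
F2, p2 = f_test(np.column_stack([ones, np.log(x1), np.log(x2)]))
print(p1, p2)  # p1 should be far smaller: the levels restrictions are rejected
```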

Second approach-Davidson-MacKinnon test

• If the 1st model holds with E(u|x1, x2) = 0, then the fitted values from the 2nd model should be insignificant when added to the first model
• Therefore, to test whether the first model is correct, we first estimate the 2nd model by OLS to obtain its fitted values; call these ŷ
• These ŷ are just nonlinear functions of x1 and x2
• Then estimate this auxiliary regression
y = β0 + β1x1 + β2x2 + θ1ŷ + error
• The Davidson-MacKinnon test is the t statistic on ŷ from this auxiliary regression
• A significant t statistic means rejection of the first model

Second approach-Davidson-MacKinnon test

• The same thing can be done to test whether the second model is correctly specified
• We estimate the first model, and its fitted values are then added as a regressor in the second model.
• A significant t statistic is a rejection of the second model
• This approach can be used to test any two non-nested models with the same dependent variable
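The procedure can be sketched as follows. The data-generating process is an illustrative assumption (the log model is the truth, so the test should reject the levels model):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 300
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(1, 5, n)
y = 1 + 2 * np.log(x1) + 3 * np.log(x2) + rng.normal(scale=0.5, size=n)

ones = np.ones(n)

def fit(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Step 1: estimate model 2 (logs) and get its fitted values.
X2 = np.column_stack([ones, np.log(x1), np.log(x2)])
yhat2 = X2 @ fit(X2, y)

# Step 2: add those fitted values to model 1 (levels); test theta1 = 0.
X_aux = np.column_stack([ones, x1, x2, yhat2])
b = fit(X_aux, y)
resid = y - X_aux @ b
s2 = resid @ resid / (n - X_aux.shape[1])
var_b = s2 * np.linalg.inv(X_aux.T @ X_aux)
t_theta = b[-1] / np.sqrt(var_b[-1, -1])
p = 2 * stats.t.sf(abs(t_theta), n - X_aux.shape[1])
print(t_theta, p)  # a significant t statistic rejects the levels model
```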

Problems with testing non-nested alternatives

• First, a clear winner need not emerge.


• Both models could be rejected or neither model could be rejected.
• If neither model is rejected, we can use the adjusted R-squared to choose between them.
• If both models are rejected, more work needs to be done.
• A second problem is that rejecting the first model does not directly imply that the second is correct
• First model can be rejected for a variety of functional form misspecifications.

Functional form misspecification- omission of key variables

• Omission of key explanatory variables could be because of data unavailability


• Suppose we want to estimate the following model, where the parameters of interest are β1 and β2:

log(wage) = β0 + β1 educ + β2 exper + β3 abil + u

• Now, it is very difficult to collect data on ability
• And educ is correlated with ability
• Putting ability in the error term causes the OLS estimator of β1 (and β2) to be biased
• This is nothing but omitted variable bias
• Possible solution is to use proxy variables for omitted variable

Proxy variables

• A proxy variable is something that is related to the unobserved variable that we would like to control for in our analysis.
• One proxy could be IQ
• IQ does not have to be the same thing as ability;
• what we need is for IQ to be correlated with ability
• Let us estimate the following model: y = β0 + β1x1 + β2x2 + β3x3* + u
• We assume that data are available on y, x1, and x2
• The explanatory variable x3* is unobserved, but we have a proxy variable for it
• Let us call the proxy variable x3.
• In our example, x3* is ability and x3 is IQ

Using proxy variables for unobserved explanatory variables

• What do we require of x3 (the proxy variable)?

• First, it should have some relationship to x3* (the unobserved explanatory variable).
• This is captured by the simple regression equation
x3* = δ0 + δ3x3 + v3
• where v3 is an error because x3* and x3 are not exactly related
• The parameter δ3 measures the relationship between x3* and x3
• If δ3 > 0, x3* and x3 are positively related; if δ3 < 0, they are negatively related.
• If δ3 = 0, then x3 is not a suitable proxy for x3*

Using proxy variables for unobserved explanatory variables

• How can we use x3 to get unbiased (or at least consistent) estimators of β1 and β2?
• We can plug in x3 for x3* and run the regression of y on x1, x2, x3.
• We call this the plug-in solution to the omitted variables problem
• because x3 is just plugged in for x3* before we run OLS.
• If x3 is truly related to x3*, this seems like a sensible strategy.
• However, since x3 and x3* are not the same, we need some assumptions

Assumptions needed for the plug-in solution to work

y = β0 + β1x1 + β2x2 + β3x3* + u

x3* = δ0 + δ3x3 + v3
• (1) E(u|x1, x2, x3*, x3) = 0 – the error u is uncorrelated with x1, x2, x3*, and x3
• E(u|x1, x2, x3*) = 0 is the standard assumption of the model we want to estimate
• In addition, u is uncorrelated with x3. This means that x3 is irrelevant in the population model once x1, x2, x3* have been included. x3 is a proxy for x3*; it is x3* that directly affects y, not x3
• (2) The error v3 is uncorrelated with x1, x2, and x3.
• v3 being uncorrelated with x1 and x2 requires x3 to be a "good" proxy for x3*
• In terms of conditional expectations:
E(x3*|x1, x2, x3) = E(x3*|x3) = δ0 + δ3x3
• Once x3 is controlled for, the expected value of x3* does not depend on x1 or x2.
• Alternatively, x3* has zero correlation with x1 and x2 once x3 is partialled out.

Proxy variable
y = β0 + β1x1 + β2x2 + β3x3* + u
x3* = δ0 + δ3x3 + v3
• If we substitute for x3* and do simple algebra, we get
y = (β0 + β3δ0) + β1x1 + β2x2 + β3δ3x3 + u + β3v3
• The composite error in this equation is e = u + β3v3; it depends on u and v3
• Write this equation as y = α0 + β1x1 + β2x2 + α3x3 + e
• where α0 = β0 + β3δ0 is the intercept and α3 = β3δ3 is the slope parameter on the proxy variable x3
• e has zero mean and is uncorrelated with x1, x2, and x3
• since u and v3 both have zero mean and each is uncorrelated with x1, x2, and x3
• Regressing y on x1, x2, and x3 thus gives unbiased (or at least consistent) estimators of α0, β1, β2, and α3.
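A small simulation can illustrate the result. All coefficients here (β3 = 4, δ3 = 0.8, the correlation between x1 and the proxy, etc.) are illustrative assumptions, not from the text:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x3 = rng.normal(size=n)                 # observed proxy (e.g. IQ)
x1 = 0.5 * x3 + rng.normal(size=n)      # x1 correlated with the proxy/ability
x2 = rng.normal(size=n)
v3 = rng.normal(size=n)                 # uncorrelated with x1, x2 (assumption 2)
x3_star = 0.5 + 0.8 * x3 + v3           # unobserved variable (e.g. ability)
y = 1 + 2 * x1 + 3 * x2 + 4 * x3_star + rng.normal(size=n)

ones = np.ones(n)
# Omitting ability entirely: the x1 coefficient is biased upward.
b_omit, *_ = np.linalg.lstsq(np.column_stack([ones, x1, x2]), y, rcond=None)
# Plug-in solution: regress y on x1, x2 and the proxy x3.
b_plug, *_ = np.linalg.lstsq(np.column_stack([ones, x1, x2, x3]), y, rcond=None)

print(b_omit[1], b_plug[1])  # biased slope vs approximately beta1 = 2
print(b_plug[3])             # approximately alpha3 = beta3*delta3 = 4*0.8 = 3.2
```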

Including a proxy could exacerbate problem of multicollinearity

• For example, including a variable like IQ as a proxy for ability in a regression that already includes educ
• But there are advantages
• First, the inclusion of IQ reduces the error variance because the part of ability explained by IQ has
been removed from the error- Reflected in a smaller standard error of the regression
• Second, the added multicollinearity is a necessary evil if we want to get an unbiased (or consistent) estimator of the coefficient on educ
• There could be increase in multicollinearity even if we had data on ability and had included it in the
model
• Ultimately, it is a trade-off

Using lagged dependent variable as proxy variables

• Sometimes we suspect that some of the x variables could be correlated with the error
but we are not aware of a good proxy
• In those cases, we can use observed dependent variable from a previous time point as
a proxy
• Lagged dependent variable
• Using past values increases data requirements but it is a way to account for historical
factors that could have an impact on the present
• Consider a simple equation to explain city crime rates:
crime = β0 + β1 unem + β2 expend + β3 crime₋₁ + u
• where crime is a measure of per capita crime, unem is the city unemployment rate, expend is per capita spending on law enforcement, and crime₋₁ indicates the crime rate measured in some earlier year (this could be the past year or several years ago).

Using lagged dependent variable as proxy variables

• We want to find the effects of unemployment and law enforcement expenditures on


crime rate in the city
• We expect that β3 > 0 because crime has inertia.
• What is the purpose of including crime₋₁ in the equation?
• Cities with high historical crime rates may spend more on crime prevention.
• Cities with higher historical crime rates may also have higher unemployment, e.g., if more people are engaged in criminal activity rather than the labour market.

• Thus, factors unobserved to us (the econometricians) that affect crime are likely to be
correlated with expend (and unem).
• Using cross-sectional data is unlikely to give us unbiased estimator of causal effect of law
enforcement expenditure on crime

Measurement error

• When we use an imprecise measure of an economic variable in a regression model,


then our model contains measurement error.
• Some reasons for measurement error
• Recorded measures might contain error- For example- reported annual income is a measure of
actual annual income but generally people do not report it correctly
• Coding errors, rounding errors, imprecise data collection, imperfect proxy
• Measurement error can be in the dependent variable, in the explanatory variables, or in both
• We will study the consequences of measurement error for ordinary least squares
estimation.

Measurement error in dependent variable

• Let 𝑦 ∗ denote the variable that we would like to explain- say, annual family savings.
• The regression model has the usual form
y* = β0 + β1x1 + ⋯ + βk xk + u
• and we assume it satisfies the Gauss-Markov assumptions.
• Say there is measurement error in 𝑦 ∗ or we do not measure it correctly
• Let y represent the observable measure of 𝑦 ∗ . In the savings case, y is reported annual
savings.
• Unfortunately, families are not perfect in their reporting of annual family savings

Measurement error in dependent variable

• Measurement error is the difference between the observed value and the actual value:
e0 = y − y*
• The important thing is how the measurement error in the population is related to other factors.
• To obtain an estimable model, we plug y* = y − e0 into the equation and rearrange:
y = β0 + β1x1 + ⋯ + βk xk + u + e0
• The error term is u + e0. We estimate this model by OLS.

Measurement error in dependent variable

• We can estimate the model by OLS and ignore the measurement error if e0, like u, has zero mean and is uncorrelated with each xj.
• The intercept estimator is biased if the measurement error does not have zero mean
• If we assume corr(xj, e0) = 0, then there is no endogeneity and the OLS estimators are unbiased and consistent.
• Measurement error in the dependent variable results in larger variances of the OLS estimators:
• if e0 and u are uncorrelated, then Var(u + e0) = σu² + σe0² > σu².
• Basically, if the measurement error is uncorrelated with the independent variables, then OLS estimation has good properties.
• Otherwise, we have endogeneity and biased estimators
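A Monte Carlo sketch of this point (all numbers illustrative): mismeasuring y leaves the slope estimator centred on the truth but with a larger sampling standard deviation.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 500, 2000
slopes_clean, slopes_noisy = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    y_star = 1 + 2 * x + u               # correctly measured outcome
    e0 = rng.normal(scale=1.5, size=n)   # measurement error: mean 0, indep. of x
    y = y_star + e0                      # observed (mismeasured) outcome
    X = np.column_stack([np.ones(n), x])
    slopes_clean.append(np.linalg.lstsq(X, y_star, rcond=None)[0][1])
    slopes_noisy.append(np.linalg.lstsq(X, y, rcond=None)[0][1])

mean_noisy = np.mean(slopes_noisy)
sd_clean, sd_noisy = np.std(slopes_clean), np.std(slopes_noisy)
print(mean_noisy)          # approx 2: still unbiased
print(sd_clean, sd_noisy)  # mismeasured y -> larger sampling variability
```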

Measurement error in explanatory variable

• A much more serious problem than measurement error in the dependent variable.

y = β0 + β1x1* + u
• The problem is that x1* is not observed. Instead, we have a measure of x1*; call it x1.
• For example, x1* could be actual income and x1 could be reported income
• The measurement error in the population is e1 = x1 − x1*
• This can be positive, negative, or zero.
• We assume that
• the average measurement error in the population is zero: E(e1) = 0.
• E(y|x1*, x1) = E(y|x1*), which just says that x1 does not affect y after x1* has been controlled for.

Measurement error in explanatory variable

• We can estimate the model by OLS after plugging in x1 in place of x1*:

y = β0 + β1(x1 − e1) + u

y = β0 + β1x1 + (u − β1e1)
• But additional assumptions are required for OLS to have good properties
• One possibility is that e1 is uncorrelated with the observed measure x1: Corr(x1, e1) = 0. In that case, OLS using x1 is consistent.
• The alternative is that e1 is uncorrelated with the unobserved variable x1* – the classical errors-in-variables (CEV) assumption

Classical errors-in-variables (CEV) assumption

• The measurement error is uncorrelated with the unobserved explanatory variable x1*:

Corr(x1*, e1) = 0
• But e1 = x1 − x1*, so x1 = x1* + e1
• Thus CEV implies that e1 is correlated with the observed measure x1: Cov(x1, e1) = Var(e1) > 0
• Thus, even under CEV, the OLS regression of y on x1 gives a biased and inconsistent estimator, since cov(x1, composite error u − β1e1) ≠ 0
• The slope is attenuated toward zero: the OLS estimator converges to β1·Var(x1*)/[Var(x1*) + Var(e1)]
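Under CEV, the standard attenuation result is that the OLS slope converges to β1·Var(x1*)/[Var(x1*) + Var(e1)]. A simulation sketch with illustrative numbers (Var(x1*) = Var(e1) = 1, so the plim is β1/2):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
x_star = rng.normal(size=n)       # true regressor, variance 1
e1 = rng.normal(size=n)           # measurement error, variance 1, CEV: indep. of x_star
x = x_star + e1                   # observed, mismeasured regressor
y = 1 + 2 * x_star + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
# Attenuation: slope converges to 2 * 1/(1 + 1) = 1, half the true beta1.
print(b[1])
```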

Missing data
• Can arise for various reasons – e.g., information was collected but later found to be missing for some units
• Missing data reduce the sample size
• Are there any statistical consequences of using the OLS estimator and ignoring the missing data?
• There is no statistical problem if the data are missing completely at random (MCAR)
• Under MCAR, units with complete data are not systematically different from units with missing data
• The reason the data are missing is independent, in a statistical sense, of the variables in the model
• Do not simply replace missing values with zeros – this can cause substantial bias in the OLS estimators.
• Under MCAR, one can instead create a missing data indicator
• a new regressor that equals the actual data where observed and zero where missing, plus a dummy equal to one when the data are missing
• All the variables, including this new dummy variable, are included in the regression
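The indicator approach above can be sketched as follows (illustrative data; 30% of x is made missing completely at random):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)

miss = rng.random(n) < 0.3           # missing completely at random (MCAR)
x_filled = np.where(miss, 0.0, x)    # variable set to zero where missing
m = miss.astype(float)               # missing-data indicator dummy

# Regress y on the filled-in variable plus the indicator.
X = np.column_stack([np.ones(n), x_filled, m])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b[1])   # slope recovered from the observed cases, approx 2
```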

Missing data- Non random samples

• Say the missing data is not random


• For example
• individuals with higher income do not report their savings/income
• Or say in the birth weight data set, probability that education is missing is higher for those people
with lower than average levels of education
• The random sampling assumption MLR.2 is violated, and we must worry about these
consequences for OLS estimation.
• Sample selection is of two types: exogenous and endogenous

Exogenous sample selection

• When sample is selected on the basis of independent variables


• Example: suppose that we are estimating a saving function,
saving = β0 + β1 income + β2 age + β3 size + u
• The data set comprises a survey of people over 35 years of age
• This is a nonrandom sample of adults, but it is a case of exogenous sample selection
• We can still get unbiased and consistent estimators of the parameters in the population
model using the nonrandom sample
• E(saving|income, age, size) is the same for any subset of the population described by income, age, and size.
• Not so serious problem if there is enough variation in the independent variables

Endogenous sample selection

• Sample selection is based on the dependent variable


• For example, suppose we wish to estimate the relationship between individual wealth
and several other factors in the population of all adults:
wealth = β0 + β1 educ + β2 exper + β3 age + u
• Suppose that only people with wealth below $250,000 are included in the sample.
• This is a nonrandom sample from the population of interest, and it is based on the
value of the dependent variable.
• This will result in biased and inconsistent estimators of the population model parameters
• because the population regression E(wealth|educ, exper, age) is not the same as the expected value conditional on wealth being less than $250,000.

Outliers

• Outlying observations can occur for two reasons.


• A mistake has been made in entering the data: adding extra zeros to a number or misplacing a decimal point can throw off the OLS estimates.
• Outliers can also arise when sampling from a small population if one or several members of the population are very different in some relevant aspect from the rest of the population.
• OLS is susceptible to outlying observations because it minimizes the sum of squared
residuals
• The Least Absolute Deviations (LAD) method of estimation is less sensitive to outliers

Least Absolute Deviation estimators

• The LAD estimators of the 𝛽2 in a linear model minimize the sum of the absolute
values of the residuals
• LAD does not give increasing weight to larger residuals
• it is much less sensitive to changes in the extreme values of the data than OLS.
• LAD is designed to estimate the parameters of the conditional median of y given
explanatory variables rather than the conditional mean.
• Because the median is not affected by large changes in the extreme observations, it
follows that the LAD parameter estimates are more resilient to outlying observations.
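A sketch comparing OLS with LAD on data containing a single gross data-entry outlier. The LAD fit here uses scipy's general-purpose minimizer on the sum of absolute residuals; in practice a dedicated LAD/median-regression routine would be used, and all data are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(scale=0.5, size=n)
y[0] += 100.0                        # one gross outlier (e.g. misplaced decimal)

X = np.column_stack([np.ones(n), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizes sum of squared residuals

def lad_loss(b):
    # LAD objective: sum of absolute residuals instead of squared residuals.
    return np.abs(y - X @ b).sum()

b_lad = minimize(lad_loss, x0=b_ols, method="Nelder-Mead").x

print(b_ols, b_lad)  # LAD should stay near (1, 2); OLS is pulled by the outlier
```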

What to do in case of outliers?

• We can report and compare results with and without outliers


• Differentially weight observations based on outlier quality (robust regression)
• A formal approach to dealing with outliers is beyond the scope of this course
• Excluding outliers might not be the best approach
• statistical properties of the resulting estimators are complicated.
• Outlying observations can provide important information by increasing the variation in the
explanatory variables

