
Bedru B. and Seid H

Chapter seven: Multicollinearity


7.1 The nature of Multicollinearity
Originally, multicollinearity meant the existence of a "perfect", or exact, linear
relationship among some or all explanatory variables of a regression model. For a
k-variable regression involving the explanatory variables x1, x2, ..., xk, an exact linear
relationship is said to exist if the following condition is satisfied:

λ1x1 + λ2x2 + ... + λkxk = 0       (1)

where λ1, λ2, ..., λk are constants such that not all of them are simultaneously zero.
Today, however, the term multicollinearity is used in a broader sense to include
the case of perfect multicollinearity, as shown by (1), as well as the case where the
x-variables are inter-correlated but not perfectly so, as follows:

λ1x1 + λ2x2 + ... + λkxk + vi = 0       (2)

where vi is a stochastic error term.


The nature of multicollinearity can be illustrated using the figures below. Let the
circles in the figures represent, respectively, the variation in Y (the dependent
variable) and in X1 and X2 (the explanatory variables). The degree of collinearity can be
measured by the extent of overlap (shaded area) of the X1 and X2 circles. In fig. (a)
below there is no overlap between X1 and X2 and hence no collinearity. In figs. (b)
through (e) there is a "low" to "high" degree of collinearity. In the extreme, if X1 and X2
were to overlap completely (or if X1 were completely inside X2, or vice
versa), collinearity would be perfect.


Note that multicollinearity refers only to linear relationships among the x-variables.
It does not rule out non-linear relationships among the x-variables.
For example, consider the polynomial cost function

Yi = β0 + β1Xi + β2Xi² + β3Xi³ + ui

where Y is total cost and X is output.
The variables Xi² and Xi³ are obviously functionally related to Xi, but the relationship
is non-linear. Strictly, therefore, such models do not violate the assumption
of no multicollinearity. However, in concrete applications, the conventionally
measured correlation coefficient will show Xi, Xi² and Xi³ to be highly correlated,
which, as we shall show, will make it difficult to estimate the parameters with
great precision (i.e. with small standard errors).

7.2 Reasons for Multicollinearity


1. The data collection method employed: for example, if we regress on a small
sample of values drawn from the population, there may be multicollinearity, but if we
take all the possible values, it may not show multicollinearity.
2. Constraint over the model or in the population being sampled.


For example, in the regression of electricity consumption on income (X1)
and house size (X2), there is a physical constraint in the population in that
families with higher incomes generally have larger homes than families with lower
incomes.
3. Overdetermined model: This happens when the model has more
explanatory variables than the number of observations. This could happen
in medical research where there may be a small number of patients about
whom information is collected on a large number of variables.

7.3 Consequences of Multicollinearity


Why does the classical linear regression model assume no
multicollinearity among the X's? It is because of the following consequences of
multicollinearity for the OLS estimators.
1. If multicollinearity is perfect, the regression coefficients of the X variables are
indeterminate and their standard errors are infinite.
Proof: Consider a multiple regression model with two explanatory variables,
where the dependent and explanatory variables are given in deviation form as
follows:

y = β̂1 x1 + β̂2 x2 + e

Dear distance student, do you recall the formulas of β̂1 and β̂2 from our discussion
of multiple regression?

β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)

β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)

Assume x2 = λx1 ------------------------(3.32)

where λ is a non-zero constant. Substituting (3.32) in the above β̂1 formula, we have
Σx2² = λ²Σx1², Σx2y = λΣx1y and Σx1x2 = λΣx1², so that

β̂1 = (λ²Σx1y·Σx1² − λ²Σx1y·Σx1²) / (λ²(Σx1²)² − λ²(Σx1²)²) = 0/0, which is indeterminate.

Applying the same procedure, we obtain a similar result (an indeterminate value) for
β̂2. Likewise, from our discussion of the multiple regression model, the variance of β̂1 is
given by:

var(β̂1) = σ²·Σx2² / (Σx1²·Σx2² − (Σx1x2)²)

Substituting x2 = λx1 in the above variance formula, we get:

var(β̂1) = σ²·λ²Σx1² / (λ²(Σx1²)² − λ²(Σx1²)²) = σ²λ²Σx1² / 0, which is infinite.

These are the consequences of perfect multicollinearity. One may then ask what the
consequences of less than perfect correlation are. In cases of near or high
multicollinearity, one is likely to encounter the following consequences.
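Before turning to those results one by one, the following Python fragment gives a small numerical sketch of both situations. It simulates y = 1 + 2x1 + 3x2 + u with x2 = 2x1 + v and shrinks the noise v toward zero; all variable names and parameter values are illustrative assumptions, not part of the module.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)

for noise_sd in (1.0, 0.1, 0.001, 0.0):            # 0.0 = perfect collinearity
    v = rng.normal(scale=noise_sd, size=n) if noise_sd > 0 else np.zeros(n)
    x2 = 2 * x1 + v                                 # x2 = lambda*x1 + v
    y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])

    # With exact collinearity X'X is singular and the coefficients are indeterminate
    if np.linalg.matrix_rank(X.T @ X) < X.shape[1]:
        print(f"noise_sd={noise_sd}: X'X is singular -> OLS coefficients indeterminate")
        continue

    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    print(f"noise_sd={noise_sd}: se(b1)={se[1]:.2f}, se(b2)={se[2]:.2f}")
```

As the noise shrinks, the standard errors of both slope estimates blow up, and at exact collinearity the normal equations cannot be solved at all.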

2. If multicollinearity is less than perfect (i.e. near or high multicollinearity), the
regression coefficients are determinate.
Proof: Consider the two explanatory variables model above in deviation form.
If we assume x2 = λx1, this indicates perfect correlation between x1 and x2, because
the change in x2 is completely due to the change in x1. Instead of exact
multicollinearity, we may have:

x2 = λx1 + v

where v is a stochastic error term such that Σx1v = 0. In this case x2 is not only
determined by x1, but is also affected by some other variables captured in v (the
stochastic error term).
Substituting x2 = λx1 + v in the formula of β̂1 above,

β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)
    = (Σx1y·Σv² − λΣx1²·Σvy) / (Σx1²·Σv²), which is determinate.

This proves that if we have less than perfect multicollinearity the OLS coefficients
are determinate.
The implication of the indeterminacy of the regression coefficients in the case of perfect
multicollinearity is that it is not possible to observe the separate influence of
x1 and x2 on Y. But such an extreme case is not very frequent in practical applications.
Most data exhibit less than perfect multicollinearity.
3. If multicollinearity is less than perfect (i.e. near or high multicollinearity), the OLS
estimators still retain the property of BLUE.
Explanation:
Note: while we were proving the BLUE property of the OLS estimators in the simple
and multiple regression models (Module-I), we did not make use of the assumption
of no multicollinearity. Hence, as long as the basic assumptions which are needed to
prove the BLUE property are not violated, the OLS estimators are BLUE whether
multicollinearity exists or not.
4. Although BLUE, the OLS estimators have large variances and covariances.

Dividing the numerator and the denominator of the variance formula above by Σx2², we get

var(β̂1) = σ² / [Σx1²(1 − r12²)]

where r12² is the square of the correlation coefficient between x1 and x2.


What happens to the variance of β̂1 as r12² rises?

As r12² tends to 1, i.e. as collinearity increases, the variance of the estimator increases,
and in the limit, when r12² = 1, the variance of β̂1 becomes infinite.

Similarly, cov(β̂1, β̂2) = −r12·σ² / [(1 − r12²)·√(Σx1²·Σx2²)]. (Why?)

As r12 increases toward one, the covariance of the two estimators increases in
absolute value. The speed with which variances and covariances increase can be
seen with the variance-inflating factor (VIF), which is defined as:

VIF = 1 / (1 − r12²)

VIF shows how the variance of an estimator is inflated by the presence of
multicollinearity. As r12² approaches 1, the VIF approaches infinity. That is, as the
extent of collinearity increases, the variance of an estimator increases and in the
limit the variance becomes infinite. As can be seen, if there is no multicollinearity
between x1 and x2, VIF will be 1.
Using this definition we can express var(β̂1) and var(β̂2) in terms of VIF:

var(β̂1) = (σ²/Σx1²)·VIF   and   var(β̂2) = (σ²/Σx2²)·VIF

which shows that the variances of β̂1 and β̂2 are directly proportional to the VIF.
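As an illustration of the VIF idea, the sketch below computes VIFj = 1/(1 − R²j) for each regressor of a made-up data set, once from the auxiliary regressions directly and once with the helper in statsmodels; the data and variable names are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.3 * rng.normal(size=n)     # highly collinear with x1
x3 = rng.normal(size=n)                       # roughly orthogonal regressor
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF from the auxiliary regression of each regressor on the remaining ones
for j, name in zip(range(1, X.shape[1]), ["x1", "x2", "x3"]):
    others = np.delete(X, j, axis=1)
    r2_aux = sm.OLS(X[:, j], others).fit().rsquared
    print(name, "VIF =", round(1.0 / (1.0 - r2_aux), 2),
          "(statsmodels:", round(variance_inflation_factor(X, j), 2), ")")
```

The two calculations agree because the statsmodels helper runs exactly this kind of auxiliary regression internally.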

5. Because of the large variances of the estimators, which mean large standard
errors, the confidence intervals tend to be much wider, leading to the acceptance of the
"zero null hypothesis" (i.e. that the true population coefficient is zero) more readily.

6. Because of the large standard errors of the estimators, the computed t-ratios will be
very small, leading one or more of the coefficients to appear statistically
insignificant when tested individually.


7. Although the t-ratio of one or more of the coefficients is very small (which
makes the coefficients statistically insignificant individually), R², the overall
measure of goodness of fit, can be very high.
In cases of high collinearity it is possible to find that one or more of the partial
slope coefficients are individually statistically insignificant on the basis of the t-test,
while the R² in such situations may be very high, say in excess of 0.9. In such a case, on
the basis of the F-test one can convincingly reject the hypothesis that
β2 = β3 = ... = βk = 0. Indeed, this is one of the signals of multicollinearity:
insignificant t-values but a high overall R² (i.e. a significant F-value).

8. The OLS estimators and their standard errors can be sensitive to small changes
in the data.

7.4 Detection of Multicollinearity

A recognizable set of symptoms for the existence of multicollinearity on which
one can rely is:
a. a high coefficient of determination (R²)
b. high correlation coefficients among the explanatory variables
c. large standard errors and small t-ratios of the regression parameters
Note: none of these symptoms by itself is a satisfactory indicator of
multicollinearity, because:
i. Large standard errors may arise for various reasons, and not only because of the
presence of linear relationships among the explanatory variables.
ii. A high pairwise correlation among the regressors is a sufficient but not a necessary condition
for the existence of multicollinearity, because multicollinearity can also exist even
if the pairwise correlation coefficients are low.


However, the combination of all these criteria should help the detection of
multicollinearity.

7.4.1 Test Based on Auxiliary Regressions:


Since multicollinearity arises because one or more of the regressors are exact or
approximately linear combinations of the other regressors, one way of finding out
which X variable is related to the other X variables is to regress each Xi on the
remaining X variables and compute the corresponding R², which we designate R²i.
Each of these regressions is called an auxiliary regression, auxiliary to the main
regression of Y on the X's. Then, following the relationship between F and R²
established in chapter three under overall significance, the variable

Fi = [R²i / (k − 2)] / [(1 − R²i) / (n − k + 1)]

follows the F distribution with (k − 2) and (n − k + 1) df,

where: - n is the number of observations
       - k is the number of parameters including the intercept
If the computed F exceeds the critical F at the chosen level of significance, it is
taken to mean that the particular Xi is collinear with the other X's; if it does not exceed
the critical F, we say that it is not collinear with the other X's, in which case we may
retain that variable in the model.
Even if Fi is statistically significant, we will still have to decide whether the particular Xi
should be dropped from the model.
Note that, according to Klein's rule of thumb, multicollinearity
may be a troublesome problem only if the R² obtained from an auxiliary regression is
greater than the overall R², that is, the R² obtained from the regression of Y on all the
regressors.
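A minimal sketch of this auxiliary-regression test on made-up data: each Xi is regressed on the remaining X's, R²i is converted into the Fi statistic above, and Fi is compared with the critical F value. The data, variable names and the 5% significance level are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.2 * rng.normal(size=n)   # collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

k = X.shape[1] + 1                         # number of parameters incl. the intercept
for i in range(X.shape[1]):
    others = sm.add_constant(np.delete(X, i, axis=1))
    r2_i = sm.OLS(X[:, i], others).fit().rsquared      # auxiliary regression
    F_i = (r2_i / (k - 2)) / ((1 - r2_i) / (n - k + 1))
    F_crit = stats.f.ppf(0.95, k - 2, n - k + 1)
    verdict = "collinear with the other X's" if F_i > F_crit else "not collinear"
    print(f"x{i+1}: R2_aux={r2_i:.3f}, F={F_i:.1f}, F_crit={F_crit:.2f} -> {verdict}")
```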
7.4.2 The Farrar-Glauber test: Farrar and Glauber use three statistics for testing
multicollinearity. These are a chi-square, an F-ratio and a t-ratio. The test may be
outlined in three steps.


A. Computation of χ² to test orthogonality: two variables are called orthogonal
if r_x1x2 = 0, i.e. if there is no collinearity between them. In our discussion of
multiple regression models, we have seen the matrix representation of a three
explanatory variable model, whose cross-product matrix is x'x.

Divide each element of x'x by √(Σxi²·Σxj²) and compute the determinant of the
resulting standardized matrix, which is simply the matrix of correlation coefficients
among the x's.

The value of the determinant is equal to zero in the case of perfect
multicollinearity (since then r_x1x2 = 1).
On the other hand, in the case of orthogonality of the x's, r_x1x2 = 0 and the
value of the determinant is unity. It follows, therefore, that if the value of this
determinant lies between zero and unity, there exists some degree of
multicollinearity. For detecting the degree of multicollinearity over the whole set
of explanatory variables, Farrar and Glauber suggest a χ² test carried out in the following
way.

H0: the x's are orthogonal (i.e. the standardized determinant equals 1)

H1: the x's are not orthogonal (i.e. the standardized determinant is less than 1)

Farrar and Glauber have found that the quantity

χ² = −[n − 1 − (1/6)(2k + 5)]·log_e{value of the standardized determinant}

has a χ² distribution with (1/2)k(k − 1) df. If the computed χ² is greater than the critical value
of χ², reject H0 in favour of multicollinearity. But if it is less, then accept H0.
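A small sketch of this χ² computation on assumed data: the determinant of the correlation matrix of the X's (the "standardized determinant") is plugged into the statistic above and compared with the χ² critical value with k(k − 1)/2 df.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 80
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
k = X.shape[1]                                  # number of explanatory variables

R = np.corrcoef(X, rowvar=False)                # standardized x'x matrix
det_R = np.linalg.det(R)                        # 1 = orthogonal, 0 = perfect collinearity

chi2_stat = -(n - 1 - (2 * k + 5) / 6.0) * np.log(det_R)
df = k * (k - 1) / 2
chi2_crit = stats.chi2.ppf(0.95, df)
print(f"|R| = {det_R:.4f}, chi2 = {chi2_stat:.1f}, critical = {chi2_crit:.2f}")
print("reject H0 (multicollinearity present)" if chi2_stat > chi2_crit
      else "do not reject H0 (x's orthogonal)")
```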

B. Computation of t-ratios to test the pattern of multicollinearity

The t-test helps to detect those variables which are the cause of multicollinearity.
This test is performed on the partial correlation coefficients through the
following hypotheses:

H0: the partial correlation between Xi and Xj (holding the other x's constant) is zero

H1: the partial correlation between Xi and Xj is different from zero

In the three variable model, the test statistic is the usual t-ratio for a partial
correlation coefficient,

t* = r_x1x2.x3·√(n − k) / √(1 − r²_x1x2.x3)   (How?)

If t* > t (tabulated), H0 is rejected, and we conclude that Xi and Xj are responsible for
multicollinearity; if t* < t (tabulated), H0 is accepted, and we conclude that Xi and Xj are
not the cause of multicollinearity (since the partial correlation coefficient is not significant).

7.4.3 Test of multicollinearity using eigenvalues and the condition index:

Using the eigenvalues of the X'X matrix we can derive a number called the condition
number k as follows:

k = (maximum eigenvalue) / (minimum eigenvalue)

In addition, using these values we can derive the condition index (CI), defined as

CI = √(maximum eigenvalue / minimum eigenvalue) = √k

Decision rule: if k is between 100 and 1000 there is moderate to strong
multicollinearity, and if it exceeds 1000 there is severe multicollinearity.
Alternatively, if CI (= √k) is between 10 and 30, there is moderate to strong
multicollinearity, and if it exceeds 30 there is severe multicollinearity.
Example: if k = 123,864 and CI = 352, this suggests the existence of severe multicollinearity.
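The condition number and condition index can be read directly from the eigenvalues of the (scaled) X'X matrix, as in this sketch with hypothetical data; scaling the columns to unit length is a common convention so the eigenvalues are not driven by the units of measurement.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 120
x1 = rng.normal(size=n)
x2 = 0.98 * x1 + 0.05 * rng.normal(size=n)       # nearly collinear pair
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2, x3])

# Scale each column to unit length, then take the eigenvalues of X'X
Xs = X / np.linalg.norm(X, axis=0)
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)

k_number = eigvals.max() / eigvals.min()         # condition number
ci = np.sqrt(k_number)                           # condition index
print("eigenvalues:", np.round(eigvals, 4))
print(f"condition number k = {k_number:.0f}, condition index CI = {ci:.1f}")
# Rule of thumb: 100 < k < 1000 (10 < CI < 30) moderate-to-strong; beyond that, severe.
```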

7.4.4 Test of multicollinearity using tolerance and the variance inflation factor

VIFj = 1 / (1 − R²j)

where R²j is the R² in the auxiliary regression of Xj on the remaining (k − 2)
regressors and VIF is the variance inflation factor.
Some authors therefore use the VIF as an indicator of multicollinearity: the
larger the value of VIFj, the more "troublesome" or collinear is the variable Xj.
However, how high should VIF be before a regressor becomes troublesome? As a
rule of thumb, if the VIF of a variable exceeds 10 (which will happen if R²j exceeds
0.90), the variable is said to be highly collinear.
Other authors use the measure of tolerance to detect multicollinearity. It is defined
as

TOLj = 1 / VIFj = 1 − R²j

Clearly, TOLj = 1 if Xj is not correlated with the other regressors, whereas it is
zero if it is perfectly related to the other regressors.

VIF (or tolerance) as a measure of collinearity is not free of criticism. As we have
seen earlier, var(β̂j) depends on three factors: σ², Σxj² and VIFj. A
high VIF can be counterbalanced by a low σ² or a high Σxj². To put it differently, a
high VIF is neither necessary nor sufficient to get high variances and high standard
errors. Therefore, high multicollinearity, as measured by a high VIF, may not
necessarily cause high standard errors.

7.5 Remedial measures
It is more difficult to deal with models indicating the existence of multicollinearity
than detecting the problem of multicollinearity. Different remedial measures have
been suggested by econometricians; depending on the severity of the problem,
availability of other sources of data and the importance of the variables, which are
found to be multicollinear in the model.
Some suggest that a minor degree of multicollinearity can be tolerated, although one
should be a bit careful while interpreting the model under such conditions. Others
suggest removing the variables that show multicollinearity if they are not important in
the model; but by doing so, the desired characteristics of the model may be
affected. However, the following corrective procedures have been suggested if the
problem of multicollinearity is found to be serious.
1. Increase the size of the sample: multicollinearity may be
avoided or reduced if the size of the sample is increased, because the variances and
covariances of the estimators are inversely related to the sample size. But we
should remember that this will be true only when the intercorrelation happens to exist
in the sample but not in the population of the variables. If the variables are
collinear in the population, the procedure of increasing the size of the sample will
not help to reduce multicollinearity.
2. Introduce an additional equation in the model: the problem of multicollinearity
may be overcome by expressing explicitly the relationship between the multicollinear
variables. Such a relationship, in the form of an equation, may then be added to the original
model. The addition of the new equation transforms our single-equation (original)
model into a simultaneous equations model. The reduced form method (which is
usually applied for estimating simultaneous equation models) can then be applied
to avoid multicollinearity.
3. Use extraneous information: extraneous information is information
obtained from any source outside the sample which is being used for the
estimation. Extraneous information may be available from economic theory or
from some empirical studies already conducted in the field in which we are
interested. There are three methods through which extraneous information can be utilized in
order to deal with the problem of multicollinearity.
a. Method of using prior information: suppose that the correct specification
of the model is Y = β0 + β1X1 + β2X2 + u, and that X1 and X2 are found to be
collinear. If it is possible to gather information on the exact value of β2
from an extraneous source, we then make use of such information in
estimating the influence of the remaining variable of the model in the
following way.
Suppose β2 is known a priori; then:

Y − β2X2 = β0 + β1X1 + u

Applying the OLS method to this equation, β̂1 is the OLS estimator of the slope of the regression of
the adjusted variable (Y − β2X2) on X1.

Thus, the estimating procedure described is equivalent to correcting the dependent
variable for the influence of those explanatory variables with known coefficients
(from the extraneous source of information) and regressing this residual on the
remaining explanatory variables.
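A sketch of this prior-information approach with simulated data (all numbers assumed): if β2 is known from an outside source, the dependent variable is corrected for β2X2 and the adjusted variable is regressed on X1 alone.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.2 * rng.normal(size=n)           # collinear with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

beta2_prior = 3.0                                   # assumed known from an extraneous source
y_star = y - beta2_prior * x2                       # correct Y for the influence of X2

fit = sm.OLS(y_star, sm.add_constant(x1)).fit()     # regress the adjusted Y on X1 only
print("estimated beta1:", round(fit.params[1], 3))  # should be close to the true 2.0
```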
b. Method of transforming variables: this method is used when the
relationship between certain parameters is known a priori. For instance,
suppose that we want to estimate the production function expressed in the
form Q = A·L^β2·K^β3, where Q is the quantity produced, L the labor input and K the
input of capital. It is required to estimate β2 and β3. On logarithmic
transformation, the function becomes:

Q* = A* + β2L* + β3K*

where the asterisk indicates logs of the variables. Suppose it is observed that K and L
move together so closely that it is difficult to separate the effect of changing
quantities of labor input on output from the effect of variation in the use of
capital. Again, let us assume that, on the basis of information from some other
source, we have solid evidence that the present industry is characterized by
constant returns to scale. This implies that β2 + β3 = 1. We can therefore, on the basis
of this information, substitute β3 = 1 − β2 in the transformed function. On combining
the results, the relationship becomes:

Q* − K* = A* + β2(L* − K*)

so that β2 is estimated from the regression of the log output-capital ratio on the log
labor-capital ratio, and β3 is then obtained as 1 − β̂2.
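A sketch of this transformation on simulated data (names and parameter values are assumptions for illustration): imposing β2 + β3 = 1 turns the collinear regression of lnQ on lnL and lnK into a simple regression of ln(Q/K) on ln(L/K).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 150
lnL = rng.normal(loc=3.0, size=n)
lnK = 0.95 * lnL + 0.1 * rng.normal(size=n)                     # K and L move closely together
lnQ = 0.5 + 0.6 * lnL + 0.4 * lnK + 0.05 * rng.normal(size=n)   # true CRS: 0.6 + 0.4 = 1

# Unrestricted regression: lnL and lnK are highly collinear
unrestricted = sm.OLS(lnQ, sm.add_constant(np.column_stack([lnL, lnK]))).fit()

# Restricted regression imposing beta2 + beta3 = 1: ln(Q/K) = lnA + beta2*ln(L/K)
restricted = sm.OLS(lnQ - lnK, sm.add_constant(lnL - lnK)).fit()
b2 = restricted.params[1]
print("unrestricted slope s.e.:", np.round(unrestricted.bse[1:], 3))
print("restricted beta2 =", round(b2, 3), " beta3 = 1 - beta2 =", round(1 - b2, 3))
```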

c. Method of combining cross-sectional and time-series data (pooling): for example,
in a time-series demand study where income and price are collinear, the income
coefficient may be obtained from cross-section data, while the price coefficient is then
derived from the time-series data after correcting the dependent variable for the
income effect. By following this pooling technique, we have skirted the
multicollinearity between income and price.

The methods described above are not sure ways to get rid of the problem of
multicollinearity. Which of these remedies works in practice will depend on the nature
of the data under investigation and the severity of the multicollinearity problem.

Chapter Five

Regression on Dummy Variables

5.1 The nature of dummy variables


In regression analysis the dependent variable is frequently influenced not only by


variables that can be readily quantified on some well-defined scale (e.g., income,
output, prices, costs, height, and temperature), but also by variables that are
essentially qualitative in nature (e.g., sex, race, color, religion, nationality, wars,
earthquakes, strikes, political upheavals, and changes in government economic
policy). For example, holding all other factors constant, female college professors
are found to earn less than their male counterparts, and nonwhites are found to
earn less than whites. This pattern may result from sex or racial discrimination, but
whatever the reason, qualitative variables such as sex and race do influence the
dependent variable and clearly should be included among the explanatory
variables. Since such qualitative variables usually indicate the presence or absence
of a “quality” or an attribute, such as male or female, black or white, or Christian
or Muslim, one method of “quantifying” such attributes is by constructing
artificial variables that take on values of 1 or 0, 0 indicating the absence of an
attribute and 1 indicating the presence (or possession) of that attribute. For
example, 1 may indicate that a person is a male, and 0 may designate a female; or
1 may indicate that a person is a college graduate, and 0 that he is not, and so on.
Variables that assume such 0 and 1 values are called dummy variables.
Alternative names are indicator variables, binary variables, categorical variables,
and dichotomous variables.

Dummy variables can be used in regression models just as easily as quantitative


variables. As a matter of fact, a regression model may contain explanatory
variables that are exclusively dummy, or qualitative, in nature.

Example:  Yi = α + βDi + ui ------------------------------------------(5.01)
where Y = annual salary of a college professor
Di = 1 if male college professor
   = 0 otherwise (i.e., female professor)


Note that (5.01) is like the two variable regression models encountered previously
except that instead of a quantitative X variable we have a dummy variable D
(hereafter, we shall designate all dummy variables by the letter D).

Model (5.01) may enable us to find out whether sex makes any difference in a
college professor's salary, assuming, of course, that all other variables such as age,
degree attained, and years of experience are held constant. Assuming that the
disturbances satisfy the usual assumptions of the classical linear regression
model, we obtain from (5.01):
Mean salary of female college professor: E(Yi | Di = 0) = α -------(5.02)
Mean salary of male college professor: E(Yi | Di = 1) = α + β
That is, the intercept term α gives the mean salary of female college professors and
the slope coefficient β tells by how much the mean salary of a male college
professor differs from the mean salary of his female counterpart, α + β reflecting
the mean salary of the male college professor. A test of the null hypothesis that
there is no sex discrimination (H0: β = 0) can be easily made by running regression
(5.01) in the usual manner and finding out whether, on the basis of the t test, the
estimated β is statistically significant.
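A sketch of model (5.01) on made-up salary data: with a single male dummy, the estimated intercept reproduces the female mean salary and the dummy coefficient the male-female difference; all figures are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 60
D = rng.integers(0, 2, size=n).astype(float)               # 1 = male professor, 0 = female
salary = 18.0 + 3.3 * D + rng.normal(scale=1.5, size=n)    # hypothetical salaries ('000s)

fit = sm.OLS(salary, sm.add_constant(D)).fit()
print("intercept (mean female salary):", round(fit.params[0], 2))
print("dummy coefficient (male - female):", round(fit.params[1], 2),
      " t =", round(fit.tvalues[1], 2))
print("check: female mean =", round(salary[D == 0].mean(), 2),
      " male mean =", round(salary[D == 1].mean(), 2))
```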

5.2 Regression on one quantitative variable and one qualitative variable with
two classes, or categories
Consider the model:  Yi = α1 + α2Di + βXi + ui ----------------------------(5.03)
Where: Yi = annual salary of a college professor
Xi = years of teaching experience
Di = 1 if male
   = 0 otherwise
Model (5.03) contains one quantitative variable (years of teaching experience) and
one qualitative variable (sex) that has two classes (or levels, classifications, or
categories), namely, male and female. What is the meaning of this equation?
Assuming, as usual, that E(ui) = 0, we see that
Mean salary of female college professor: E(Yi | Xi, Di = 0) = α1 + βXi ---------(5.04)
Mean salary of male college professor: E(Yi | Xi, Di = 1) = (α1 + α2) + βXi ------(5.05)
Geometrically, we have the situation shown in fig. 5.1 (for illustration, it is
assumed that α2 > 0). In words, model (5.03) postulates that the male and female
college professors' salary functions in relation to the years of teaching experience
have the same slope (β) but different intercepts. In other words, it is assumed that
the level of the male professor's mean salary is different from that of the female
professor's mean salary (by α2), but the rate of change in the mean annual salary by
years of experience is the same for both sexes.

If the assumption of common slopes is valid, a test of the hypothesis that the two
regressions (5.04) and (5.05) have the same intercept (i.e., there is no sex
discrimination) can be made easily by running the regression (5.03) and noting the
statistical significance of the estimated α2 on the basis of the traditional t test. If
the t test shows that α̂2 is statistically significant, we reject the null hypothesis that
the male and female college professors' levels of mean annual salary are the same.


Before proceeding further, note the following features of the dummy variable
regression model considered previously.

1. To distinguish the two categories, male and female, we have introduced
only one dummy variable Di. For if Di = 1 always denotes a male, then when
Di = 0 we know that it is a female, since there are only two possible
outcomes. Hence, one dummy variable suffices to distinguish two
categories. The general rule is this: if a qualitative variable has 'm'
categories, introduce only 'm − 1' dummy variables. In our example, sex
has two categories, and hence we introduced only a single dummy variable.
If this rule is not followed, we shall fall into what might be called the
dummy variable trap, that is, the situation of perfect multicollinearity
(see the sketch after this list).
2. The assignment of 1 and 0 values to two categories, such as male and
female, is arbitrary in the sense that in our example we could have assigned
D=1 for female and D=0 for male.
3. The group, category, or classification that is assigned the value of 0 is often
referred to as the base, benchmark, control, comparison, reference, or
omitted category. It is the base in the sense that comparisons are made
with that category.

4. The coefficient attached to the dummy variable D can be called the


differential intercept coefficient because it tells by how much the value of
the intercept term of the category that receives the value of 1 differs from
the intercept coefficient of the base category.
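A sketch of the m − 1 rule with pandas (hypothetical data): `drop_first=True` keeps one category as the omitted base; creating a dummy for every category alongside a constant reproduces the dummy variable trap mentioned in point 1.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 90
df = pd.DataFrame({
    "experience": rng.uniform(1, 30, size=n),
    "sex": rng.choice(["female", "male"], size=n),
})
df["salary"] = (15 + 0.6 * df["experience"] + 3.0 * (df["sex"] == "male")
                + rng.normal(scale=1.0, size=n))

# m - 1 dummies: 'female' is dropped and becomes the base category
dummies = pd.get_dummies(df["sex"], drop_first=True).astype(float)
X = sm.add_constant(pd.concat([df[["experience"]], dummies], axis=1))
print(sm.OLS(df["salary"], X).fit().params)

# Including dummies for BOTH categories plus the constant makes X'X singular
full = sm.add_constant(pd.get_dummies(df["sex"]).astype(float))
print("rank of X:", np.linalg.matrix_rank(full.to_numpy()), "of", full.shape[1])
```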

5.3 Regression on one quantitative variable and one qualitative variable with
more than two classes


Suppose that, on the basis of the cross-sectional data, we want to regress the
annual expenditure on health care by an individual on the income and education of
the individual. Since the variable education is qualitative in nature, suppose we
consider three mutually exclusive levels of education: less than high school, high
school, and college. Now, unlike the previous case, we have more than two
categories of the qualitative variable education. Therefore, following the rule that
the number of dummies be one less than the number of categories of the variable,
we should introduce two dummies to take care of the three levels of education.
Assuming that the three educational groups have a common slope but different
intercepts in the regression of annual expenditure on health care on annual income,
we can use the following model:
Yi = α1 + α2D2i + α3D3i + βXi + ui --------------------------(5.06)
Where Yi = annual expenditure on health care
Xi = annual income
D2 = 1 if high school education
   = 0 otherwise
D3 = 1 if college education
   = 0 otherwise
Note that in the preceding assignment of the dummy variables we are arbitrarily
treating the "less than high school education" category as the base category.
Therefore, the intercept α1 will reflect the intercept for this category. The
differential intercepts α2 and α3 tell by how much the intercepts of the other two
categories differ from the intercept of the base category, which can be readily
checked as follows. Assuming E(ui) = 0, we obtain from (5.06)


E(Yi | D2 = 0, D3 = 0, Xi) = α1 + βXi
E(Yi | D2 = 1, D3 = 0, Xi) = (α1 + α2) + βXi
E(Yi | D2 = 0, D3 = 1, Xi) = (α1 + α3) + βXi
which are, respectively, the mean health care expenditure functions for the three
levels of education, namely, less than high school, high school, and college.
Geometrically, the situation is shown in fig. 5.2 (for illustrative purposes it is
assumed that α3 > α2 > 0).

5.4 Regression on one quantitative variable and two qualitative variables


The technique of dummy variable can be easily extended to handle more than one
qualitative variable. Let us revert to the college professors’ salary regression
(5.03), but now assume that in addition to years of teaching experience and sex the
skin color of the teacher is also an important determinant of salary. For simplicity,
assume that color has two categories: black and white. We can now write (5.03)
as:
Yi = α1 + α2D2i + α3D3i + βXi + ui -------------------------------------------(5.07)
Where Yi = annual salary
Xi = years of teaching experience
D2 = 1 if male
   = 0 otherwise
D3 = 1 if white
   = 0 otherwise


Notice that each of the two qualitative variables, sex and color, has two categories
and hence needs one dummy variable for each. Note also that the omitted, or base,
category now is “black female professor.”
Assuming E(ui) = 0, we can obtain the following regressions from (5.07):
Mean salary for black female professor: E(Yi | D2 = 0, D3 = 0, Xi) = α1 + βXi

Mean salary for black male professor: E(Yi | D2 = 1, D3 = 0, Xi) = (α1 + α2) + βXi

Mean salary for white female professor: E(Yi | D2 = 0, D3 = 1, Xi) = (α1 + α3) + βXi

Mean salary for white male professor: E(Yi | D2 = 1, D3 = 1, Xi) = (α1 + α2 + α3) + βXi

Once again, it is assumed that the preceding regressions differ only in the intercept
coefficient but not in the slope coefficient β.
An OLS estimation of (5.07) will enable us to test a variety of hypotheses. Thus,
if α3 is statistically significant, it will mean that color does affect a professor's
salary. Similarly, if α2 is statistically significant, it will mean that sex also affects
a professor’s salary. If both these differential intercepts are statistically
significant, it would mean sex as well as color is an important determinant of
professors’ salaries.

From the preceding discussion it follows that we can extend our model to include
more than one quantitative variable and more than two qualitative variables. The
only precaution to be taken is that the number of dummies for each qualitative
variable should be one less than the number of categories of that variable.

5.5 Testing for structural stability of regression models


Until now, in the models considered in this chapter we assumed that the qualitative
variables affect the intercept but not the slope coefficient of the various subgroup
regressions. But what if the slopes are also different? If the slopes are in fact
different, testing for differences in the intercepts may be of little practical
significance. Therefore, we need to develop a general methodology to find out
whether two (or more) regressions are different, where the difference may be in
the intercepts or the slopes or both.

5.6 Interaction effects


Consider the following model:
Yi = α1 + α2D2i + α3D3i + βXi + ui ---------------------------------(5.08)
where Yi = annual expenditure on clothing
Xi = income
D2 = 1 if female
   = 0 if male
D3 = 1 if college graduate
   = 0 otherwise
Implicit in this model is the assumption that the differential effect of the sex
dummy is constant across the two levels of education and the differential effect
of the education dummy is also constant across the two sexes. That is, if, say,
the mean expenditure on clothing is higher for females than males this is so
whether they are college graduates or not. Likewise, if, say, college graduates on
the average spend more on clothing than non college graduates, this is so whether
they are female or males.

In many applications such an assumption may be untenable. A female college
graduate may spend more on clothing than a male graduate. In other words, there
may be interaction between the two qualitative variables D2 and D3, and therefore
their effect on mean Y may not be simply additive as in (5.08) but multiplicative
as well, as in the following model:

Yi = α1 + α2D2i + α3D3i + α4(D2iD3i) + βXi + ui -----------------(5.09)

From (5.09) we obtain

E(Yi | D2 = 1, D3 = 1, Xi) = (α1 + α2 + α3 + α4) + βXi ------------(5.10)
which is the mean clothing expenditure of graduate females. Notice that
α2 = differential effect of being a female
α3 = differential effect of being a college graduate
α4 = differential effect of being a female graduate
which shows that the mean clothing expenditure of graduate females is different
(by α4) from the mean clothing expenditure of females or of college graduates. If
α2, α3 and α4 are all positive, the average clothing expenditure of females is
higher (than the base category, which here is male nongraduate), but it is much
more so if the females also happen to be graduates. Similarly, the average
expenditure on clothing by a college graduate tends to be higher than the base
category but much more so if the graduate happens to be a female. This shows
how the interaction dummy modifies the effect of the two attributes considered
individually. Whether the coefficient of the interaction dummy is statistically
significant can be tested by the usual t test. If it turns out to be significant, the
simultaneous presence of the two attributes will attenuate or reinforce the
individual effects of these attributes. Needless to say, omitting a significant
interaction term incorrectly will lead to a specification bias.
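A sketch of model (5.09) with simulated clothing-expenditure data (all coefficient values assumed): the product D2·D3 is added as an extra regressor, and its t statistic tests whether the female and graduate effects are simply additive.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 300
income = rng.uniform(20, 80, size=n)
female = rng.integers(0, 2, size=n).astype(float)
graduate = rng.integers(0, 2, size=n).astype(float)

# True model: extra spending for female graduates beyond the two additive effects
spend = (2 + 0.05 * income + 1.0 * female + 1.5 * graduate
         + 0.8 * female * graduate + rng.normal(scale=0.5, size=n))

X = sm.add_constant(np.column_stack([female, graduate, female * graduate, income]))
fit = sm.OLS(spend, X).fit()
print("interaction coefficient:", round(fit.params[3], 3),
      " t =", round(fit.tvalues[3], 2))
```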

5.7 The use of dummy variables in seasonal analysis


Many economic time series based on monthly or quarterly data exhibit seasonal
patterns (regular oscillatory movements). Examples are sales of department stores
at Christmas time, demand for money (cash balances) by households at holiday
times, demand for ice cream and soft drinks during the summer, and prices of
crops right after the harvesting season. Often it is desirable to remove the seasonal
factor, or component, from a time series so that one may concentrate on the other
components, such as the trend. The process of removing the seasonal component
from a time series is known as deseasonalization, or seasonal adjustment, and the
time series thus obtained is called the deseasonalized, or seasonally adjusted, time
series. Important economic time series, such as the consumer price index, the
wholesale price index and the index of industrial production, are usually published in
seasonally adjusted form.
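A sketch of seasonal adjustment with quarterly dummies on a simulated series (the seasonal pattern and trend are assumptions): the series is regressed on three quarter dummies, with the first quarter as the base, and the residuals plus the overall mean give a simple deseasonalized series. A trend term could also be included in the same regression.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
quarters = np.tile([1, 2, 3, 4], 10)                        # 10 years of quarterly data
seasonal = np.array([0.0, 2.0, -1.0, 3.0])[quarters - 1]    # assumed seasonal pattern
sales = 50 + 0.2 * np.arange(quarters.size) + seasonal + rng.normal(scale=0.5, size=quarters.size)

# Three dummies for quarters 2-4; quarter 1 is the omitted base category
D = np.column_stack([(quarters == q).astype(float) for q in (2, 3, 4)])
fit = sm.OLS(sales, sm.add_constant(D)).fit()

deseasonalized = fit.resid + sales.mean()                   # residuals + mean = adjusted series
print("estimated quarter effects:", np.round(fit.params[1:], 2))
print("first 8 adjusted values:", np.round(deseasonalized[:8], 1))
```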

5. 8 Piecewise linear regression


To illustrate yet another use of dummy variables, consider fig 5.3, which shows
how a hypothetical company remunerates its sales representatives.

It pays commissions based on sales in such a manner that up to a certain level, the
target, or threshold, level X*, there is one (stochastic) commission structure and
beyond that level another. (Note: besides sales, other factors affect sales
commission. Assume that these other factors are represented by the stochastic
disturbance term.) More specifically, it is assumed that sales commission increases
linearly with sales until the threshold level X*, after which it increases
linearly with sales but at a much steeper rate. Thus, we have a piecewise linear
regression consisting of two linear pieces or segments, which are labeled I and II
in fig. 5.3, and the commission function changes its slope at the threshold value.
Given the data on commission, sales, and the value of the threshold level X*, the
technique of dummy variables can be used to estimate the (differing) slopes of the
two segments of the piecewise linear regression shown in fig. 5.3. We proceed as
follows:
Yi = α1 + β1Xi + β2(Xi − X*)Di + ui ------------------------------------(5.11)
where Yi = sales commission
Xi = volume of sales generated by the sales person
X* = threshold value of sales, also known as a knot (known in advance)
Di = 1 if Xi > X*
   = 0 if Xi < X*
Assuming E(ui) = 0, we see at once that
E(Yi | Di = 0, Xi, X*) = α1 + β1Xi ---------------------------------------(5.12)
which gives the mean sales commission up to the target level X*, and
E(Yi | Di = 1, Xi, X*) = (α1 − β2X*) + (β1 + β2)Xi ----------------------(5.13)
which gives the mean sales commission beyond the target level X*.

Thus, β1 gives the slope of the regression line in segment I, and β1 + β2 gives the
slope of the regression line in segment II of the piecewise linear regression shown
in fig. 5.3. A test of the hypothesis that there is no break in the regression at the
threshold value X* can be conducted easily by noting the statistical significance of
the estimated differential slope coefficient β̂2.
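A sketch of model (5.11) with a known knot X* (all values assumed): the constructed regressor (Xi − X*)Di lets OLS estimate the change in slope β2 directly.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 200
sales = rng.uniform(0, 100, size=n)
x_star = 60.0                                        # threshold (knot), known in advance
D = (sales > x_star).astype(float)

# True commission schedule: slope 0.10 below the knot, 0.25 above it
commission = 1.0 + 0.10 * sales + 0.15 * (sales - x_star) * D + rng.normal(scale=0.3, size=n)

X = sm.add_constant(np.column_stack([sales, (sales - x_star) * D]))
fit = sm.OLS(commission, X).fit()
b1, b2 = fit.params[1], fit.params[2]
print("slope in segment I :", round(b1, 3))
print("slope in segment II:", round(b1 + b2, 3),
      " (break test: t on b2 =", round(fit.tvalues[2], 2), ")")
```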

Summary:
1. Dummy variables taking values of 1 and 0 (or their linear transforms) are a
means of introducing qualitative regressors in regression analysis.


2. Dummy variables are a data-classifying device in that they divide a sample


into various subgroups based on qualities or attributes (sex, marital status,
race, religion, etc.) and implicitly allow one to run individual regressions
for each subgroup. If there are differences in the response of the regressand
to the variation in the quantitative variables in the various subgroups,
they will be reflected in the differences in the intercepts or slope
coefficients, or both, of the various subgroup regressions.
3. Although a versatile tool, the dummy variable technique needs to be
handled carefully. First, if the regression contains a constant term, the
number of dummy variables must be one less than the number of classes
of each qualitative variable. Second, the coefficient attached to the dummy
variables must always be interpreted in relation to the base, or reference,
group, that is, the group that gets the value of zero. Finally, if a model has
several qualitative variables with several classes, introduction of dummy
variables can consume a large number of degrees of freedom. Therefore,
one should always weigh the number of dummy variables to be introduced
against the total number of observations available for analysis.
4. Among its various applications, this chapter considered but a few. These
included (1) comparing two (or more) regressions, (2) deseasonalizing time
series data, (3) combining time series and cross-sectional data, and(4)
piecewise linear regression models.
5. Since the dummy variables are non stochastic, they pose no special
problems in the application of OLS. However, care must be exercised in
transforming data involving dummy variables. In particular, the problems
of autocorrelation and heteroscedasticity need to be handled very carefully.

Test yourself question


In studying the effect of a number of qualitative attributes on the prices


charged for movie admissions in a large metropolitan area for the period 1961-
1964, R. D.Lampson obtained the following regression for the year 1961:

where : theater location: 1 if suburban, 0 if city center


theater age: 1 if less than 10 years since construction or major
renovation, 0 otherwise.
type of theater: 1 if outdoor, 0 if indoor
parking: 1 if provided, 0 otherwise
Screening policy: 1 if first run, 0 otherwise
average percentage unused seating capacity per showing
Average film rental, cents per ticket charged by the distributor
adult evening admission price, cents
and where the figures in parentheses are standard errors.
a. Comment on the results.
b. How would you rationalize the introduction of the variable ?
c. How would you explain the negative value of the coefficient of ?

Chapter Six

Dynamic econometric models

6.1 Introduction
While considering the standard regression model, we did not pay attention to the
timing of the effect of the explanatory variable(s) on the dependent variable. The standard
linear regression implies that a change in one of the explanatory variables causes a
change in the dependent variable during the same time period, and during that
period alone. But in economics, such a specification is scarcely found. In economic
phenomena, generally, a cause often produces its effect only after a lapse of time;
this lapse of time (between a cause and its effect) is called a lag. Therefore, realistic
formulations of economic relations often require the insertion of lagged values of
the explanatory variables or of lagged values of the dependent variable.

6.2 Autoregressive and distributed lag models


In regression analysis involving time series data, if the regression model includes
not only the current but also the lagged (past) values of the explanatory variables
(the X’s), it is called a distributed-lag model. For example:

Ct = α + β0Yt + β1Yt−1 + β2Yt−2 + ut

is a distributed-lag model of the consumption function. This means that the value of
consumption expenditure (C) at any given time t depends on the current and
past values of disposable income (Y). The general form of a distributed-lag
model (with only lagged exogenous variables) is written as:

Yt = α + β0Xt + β1Xt−1 + β2Xt−2 + ... + βsXt−s + ut

The number of lags, s, may be either finite or infinite. But generally it is assumed
to be finite. The coefficient β0 is known as the short-run, or impact, multiplier
because it gives the change in the mean value of Y following a unit change in X in the
same time period t. If the change in X is maintained at the same level thereafter,
then (β0 + β1) gives the change in the (mean value of) Y in the next period, (β0 + β1 +
β2) in the following period, and so on. These partial sums are called interim, or
intermediate, multipliers. Finally, after s periods we obtain Σβi = β0 + β1 + ... + βs, which is
known as the long-run, or total, distributed-lag multiplier, provided the sum exists.

It should be noted that a distributed-lag model is not to be confused with an
autoregressive model. If the regression model includes lagged values of the
explanatory variables, it is called a distributed-lag model, whereas if the model
includes one or more lagged values of the dependent variable among its
explanatory variables, it is called an autoregressive model. Thus,

Yt = α + β0Xt + β1Xt−1 + β2Xt−2 + ut

represents a distributed-lag model, whereas

Yt = α + βXt + γYt−1 + ut

is an example of an autoregressive model.

The Reasons for Lags:


There are three major reasons why lags may occur.
1. Psychological reasons: as a result of the force of habit (inertia), people,
for example, do not change their consumption habits immediately following
a price decrease or an income increase, perhaps because the process of
change may involve some immediate disutility. Thus, those who become
instant millionaires by winning lotteries may not change the lifestyle to which
they were accustomed for a long time, because they may not know how to
react to such a windfall gain immediately. Of course, given reasonable time,
they may learn to live with their newly acquired fortune. Also, people may
not know whether a change is "permanent" or "transitory". Thus, my
reaction to an increase in my income will depend on whether or not the
increase is permanent. If it is only a nonrecurring increase and in
succeeding periods my income returns to its previous level, I may save the
entire increase, whereas someone else in my position might decide to "live
it up".
2. Technological reasons: suppose, for instance, the price of capital relative
to labor declines, making substitution of capital for labor economically
feasible. Of course, the addition of capital takes time (a gestation period).
Moreover, if the drop in price is expected to be temporary, firms may not
rush to substitute capital for labor, especially if they expect that after the
temporary drop the price of capital may increase beyond its previous
level.


3. Institutional reasons: These reasons also contribute to lags. For example,


contractual obligations may prevent firms from switching from one source
of labor or raw material to another. As another example, those who have
placed funds in long term saving accounts for fixed durations such as one
year, three years or seven years are essentially “locked” in even though
money market conditions may be such that higher yields are available.

6.3 Estimation of distributed lag models


Suppose we have the following distributed lag model in one explanatory variable:

Yt = α + β0Xt + β1Xt−1 + β2Xt−2 + ... + ut ------------(6.01)

In (6.01) the length of the lag, that is, how far back into the past we want to go,
has not been defined. Such a model is called an infinite (lag) model, whereas a
model with a specified lag length is called a finite (lag) distributed-lag model.
How do we estimate α and the β's in (6.01)? We may adopt two approaches:
I. Ad hoc estimation of distributed-lag models
II. A priori restrictions on the β's, by assuming that the β's follow some
systematic pattern.

I. Ad Hoc estimation of distributed lag models


Since the explanatory variable Xt is assumed to be non-stochastic (or at least
uncorrelated with the disturbance term Ut), Xt−1, Xt−2, and so on, are non-stochastic
too. Therefore, in principle, OLS can be applied to the above model
(6.01). The ad hoc approach is undertaken as follows.
First, regress Yt on Xt; then regress Yt on Xt and Xt−1; then on Xt, Xt−1 and Xt−2; and so on.
This procedure continues until the regression coefficients of the lagged variables start
becoming statistically insignificant and/or the coefficient of at least one of the
variables changes sign from positive to negative or vice versa. Consider the
following hypothetical example.


Proponents of this approach would choose the second regression as the "best" one because
in the last two equations the sign of one of the lagged coefficients was not stable and in the
last equation the sign of one of them was negative, which may be difficult to interpret economically.
Although seemingly straightforward, ad hoc estimation suffers from many
drawbacks, such as the following:
a. There is no a priori guide as to what the maximum lag length should be.
b. As one estimates successive lags, there are fewer degrees of freedom
left, making statistical inference somewhat shaky.
c. More importantly, in economic time series data, successive values
(lags) tend to be highly correlated; hence multicollinearity rears its
ugly head.
d. The sequential search for the lengths of lags opens the researcher to
the charge of data mining.

In view of the preceding problems, the ad hoc estimation procedure has very little
to recommend it. Some prior or theoretical considerations must be brought to bear upon
the various β's if we are to make headway with the estimation problem.

II. Methods based on a priori restrictions on the β's


II.1 The Koyck approach to distributed lag models
In order to reduce the number of parameters to be estimated in the distributed lag model, the
present approach makes the assumption that the impact of the explanatory variable (on
the dependent variable) in the most distant past is less than it is in more recent
periods. More specifically, the Koyck lag formulation assumes that the weights
(impacts) decline continuously in a geometric fashion.
Assume the original model is:

Yt = α + β0Xt + β1Xt−1 + β2Xt−2 + ... + Ut

According to Koyck: βk = β0λ^k, k = 0, 1, 2, ..., with 0 ≤ λ < 1.

λ is known as the rate of decline, or decay, of the distributed lag and 1 − λ is known
as the speed of adjustment. By assuming non-negative values for λ, Koyck rules
out the β's from changing sign, and by assuming λ < 1, lesser weight is assigned to
the distant β's than to the current one. Also, the long-run multiplier, Σβk = β0/(1 − λ),
is a finite amount in the Koyck scheme.

Substituting the values of βk in the original model we obtain:

Yt = α + β0Xt + β0λXt−1 + β0λ²Xt−2 + ... + Ut

Lagging by one period and multiplying by λ we get:

λYt−1 = λα + β0λXt−1 + β0λ²Xt−2 + ... + λUt−1

Subtracting the second equation from the first, we obtain:

Yt = α(1 − λ) + β0Xt + λYt−1 + (Ut − λUt−1)

Let α* = α(1 − λ) and Vt = Ut − λUt−1, so that

Yt = α* + β0Xt + λYt−1 + Vt

The above procedure of transformation is known as the Koyck transformation. If the
Koyck hypothesis concerning the lag scheme and the assumptions concerning Vt are
accepted, ordinary least squares can be applied to obtain estimates of α*, β0 and λ.


From these estimates, the estimates of the original parameters α, β0 and λ
can be easily obtained: β0 and λ are estimated directly, and α̂ = α̂*/(1 − λ̂).

However, the following features of the Koyck transformation may be taken note of:
a. Our original model was a distributed-lag model, but the transformed model is an
autoregressive model because Yt−1 appears as one of the explanatory
variables. The Koyck transformation, therefore, also helps to convert a
distributed-lag model into an autoregressive model.
b. In the new formulation the error term Vt = Ut − λUt−1 is found to be
autocorrelated, despite the fact that the disturbance term Ut of the original model is
non-autocorrelated. This can be seen as follows:
E(VtVt−1) = E[(Ut − λUt−1)(Ut−1 − λUt−2)] = −λE(U²t−1) = −λσ²u ≠ 0.
c. The lagged variable Yt−1 is also not independent of the error term Vt, i.e.
E(Yt−1Vt) ≠ 0. This is because Yt−1 depends directly on Ut−1, and Vt also contains
Ut−1. Since Yt−1 and Ut−1 are not independent, Yt−1 will obviously be
related to Vt.
Due to these two problems, applying OLS after the Koyck transformation of the distributed lag model
will give rise to biased and inconsistent estimates. In addition to these estimation
problems, the Koyck hypothesis is quite restrictive in the sense that it assumes that the
impact of past periods declines successively in one specific (geometric) way. But other lag patterns
are also possible.
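A sketch of estimating the Koyck-transformed equation Yt = α* + β0Xt + λYt−1 + Vt on simulated data (parameter values assumed), then recovering α and the long-run multiplier β0/(1 − λ). OLS is used purely for illustration; as noted above, the correlation between Yt−1 and Vt means these estimates are biased and inconsistent.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
T = 500
alpha, beta0, lam = 5.0, 2.0, 0.6           # assumed true values
x = rng.normal(size=T)
u = rng.normal(scale=0.5, size=T)

# Generate Y from the Koyck recursion Yt = a(1-lam) + b0*Xt + lam*Yt-1 + (Ut - lam*Ut-1)
y = np.zeros(T)
y[0] = alpha + beta0 * x[0] + u[0]
for t in range(1, T):
    y[t] = alpha * (1 - lam) + beta0 * x[t] + lam * y[t - 1] + u[t] - lam * u[t - 1]

X = sm.add_constant(np.column_stack([x[1:], y[:-1]]))   # regress Yt on Xt and Yt-1
fit = sm.OLS(y[1:], X).fit()
a_star, b0_hat, lam_hat = fit.params
print("beta0_hat =", round(b0_hat, 3), " lambda_hat =", round(lam_hat, 3))
print("alpha_hat =", round(a_star / (1 - lam_hat), 3),
      " long-run multiplier =", round(b0_hat / (1 - lam_hat), 3))
```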


II.2 Rationalization of the Koyck Model: Adaptive Expectations (AE) Model


The Koyck model is ad hoc since it was obtained by a purely algebraic process; it
is devoid of any theoretical underpinning. But this gap can be filled if we start
from a different perspective. Suppose we postulate the following model:
Yt = β0 + β1X*t + ut -----------------------------------------------------(i)
where: Y = demand for money (real cash balances)
X* = equilibrium, optimum, expected long-run or normal rate of interest
u = error term
Equation (i) postulates that the demand for money is a function of the expected (in the
sense of anticipated) rate of interest. Since the expectational variable X* is not
directly observable, let us propose the following hypothesis about how expectations
are formed:

X*t − X*t−1 = γ(Xt − X*t−1) ---------------------------------------------(ii)

where γ, such that 0 < γ ≤ 1, is known as the coefficient of expectation. The hypothesis is
known as the adaptive expectations, progressive expectations, or error learning
hypothesis, popularized by Cagan and Friedman.

What equation (ii) implies is that "economic agents will adapt their expectations
in the light of past experience and that in particular they will learn from their
mistakes." More specifically, (ii) states that expectations are revised each period
by a fraction γ of the gap between the current value of the variable and its
previous expected value. Thus, for our model this would mean that expectations
about interest rates are revised each period by a fraction γ of the discrepancy
between the rate of interest observed in the current period and what its anticipated
value had been in the previous period. Another way of stating this would be to
write (ii) as:

X*t = γXt + (1 − γ)X*t−1 -------------------------------------------------(iii)

which shows that the expected value of the rate of interest at time t is a weighted
average of the actual value of the interest rate at time t and its value expected in
the previous period, with weights of γ and 1 − γ, respectively. If γ = 1, X*t = Xt,
meaning that expectations are realized immediately and fully, that is, in the same
time period. If, on the other hand, γ = 0, X*t = X*t−1, meaning that expectations are
static, that is, "conditions prevailing today will be maintained in all subsequent
periods. Expected future values then become identified with current values."
Substituting (iii) into (i), we obtain

Yt = β0 + β1[γXt + (1 − γ)X*t−1] + ut ---------------------------(iv)

Now, lag equation (i) by one period, multiply it by 1 − γ, and subtract the product
from (iv). After simple algebraic manipulation, we obtain:

Yt = γβ0 + γβ1Xt + (1 − γ)Yt−1 + vt -----------------------------(v)

where vt = ut − (1 − γ)ut−1.

Let us note the difference between (i) and (v). In the former, β1 measures the
average response of Y to a unit change in X*, the equilibrium or long-run value of
X. In (v), on the other hand, γβ1 measures the average response of Y to a unit
change in the actual or observed value of X. These responses will not be the same
unless, of course, γ = 1, that is, unless the current and long-run values of X are the same.
In practice, we first estimate (v). Once an estimate of γ is obtained from the
coefficient of lagged Y (which equals 1 − γ), we can easily compute β1 by simply dividing the
coefficient of Xt (= γβ1) by γ.
Note that, like the Koyck model, the adaptive expectations model is autoregressive
and its error term is similar to the Koyck error term.

II. 3 Another Rationalization of the Koyck model: the stock adjustment, or


partial adjustment model


The adaptive expectation model is one way of rationalizing the Koyck model.
Another rationalization is provided by Marc Nerlove in the so-called stock
adjustment or partial adjustment model. To illustrate this model, consider the
flexible accelerator model of economic theory, which assumes that there is
equilibrium, optimal, desired, or long-run amount of capital stock needed to
produce a given output under the given state of technology, rate of interest, etc.
For simplicity, assume that this desired level of capital K*t is a linear function of
output X as follows:

K*t = β0 + β1Xt + ut ------------------------------------------------(1)

Since the desired level of capital is not directly observable, Nerlove postulates the
following hypothesis, known as the partial adjustment, or stock adjustment,
hypothesis:

Kt − Kt−1 = δ(K*t − Kt−1) --------------------------------------------(2)

where δ, such that 0 < δ ≤ 1, is known as the coefficient of adjustment and where
Kt − Kt−1 = actual change and K*t − Kt−1 = desired change.
Since Kt − Kt−1, the change in capital stock between two periods, is nothing but
investment, (2) can alternatively be written as:

It = δ(K*t − Kt−1) ----------------------------------------------------------------(3)

where It = investment in time period t.

Equation (2) postulates that the actual change in capital stock (investment) in any
given time period t is some fraction δ of the desired change for that period. If δ = 1,
it means that the actual stock of capital is equal to the desired stock; that is, actual
stock adjusts to the desired stock instantaneously (in the same period). However,
if δ = 0, it means that nothing changes, since the actual stock at time t is the same as
that observed in the previous time period. Typically, δ is expected to lie between
these extremes, since adjustment to the desired stock of capital is likely to be
incomplete because of rigidity, inertia, contractual obligations, etc.; hence the
name partial adjustment model. Note that the adjustment mechanism (2)
can alternatively be written as:

Kt = δK*t + (1 − δ)Kt−1 -------------------------------------------------(4)

showing that the observed capital stock at time t is a weighted average of the
desired capital stock at that time and the capital stock existing in the previous time
period, δ and (1 − δ) being the weights. Now substitution of (1) into (4) gives:

Kt = δβ0 + δβ1Xt + (1 − δ)Kt−1 + δut ----------------------------------(5)
This model is called the partial adjustment model.
Since (1) represents the long-run, or equilibrium, demand for capital stock, (5) can
be called the short-run demand function for capital stock, since in the short run the
existing capital stock may not necessarily be equal to its long-run level. Once we
estimate the short-run function (5) and obtain the estimate of the adjustment
coefficient δ (from the coefficient of Kt−1, which equals 1 − δ), we can easily derive the long-run
function by simply dividing δβ0 and δβ1 by δ and omitting the lagged term,
which will then give (1).

The partial adjustment model resembles both the Koyck and adaptive expectations
models in that it is autoregressive. But it has a much simpler disturbance term: the
original disturbance term ut multiplied by the constant δ. But bear in mind that
although similar in appearance, the adaptive expectation and partial adjustment
models are conceptually very different. The former is based on uncertainty (about
the future course of prices, interest rates, etc.), whereas the latter is due to
technical or institutional rigidities, inertia, cost of change, etc. However, both of
these models are theoretically much sounder than the Koyck model.

The important point to keep in mind is that since the Koyck, adaptive expectations,
and stock adjustment models – apart from the difference in the appearance of the
error term – yield the same final estimating model, one must be extremely careful
in telling the reader which model the researcher is using and why. Thus,
researchers must specify the theoretical underpinning of their model.

II. 4 Combination of adaptive expectations and partial adjustment models


Consider the following model:

K*t = β0 + β1X*t + ut ------------------------------------------------------ (a)

where K*t = desired stock of capital and X*t = expected level of output.
Since both K*t and X*t are not directly observable, one could use the partial
adjustment mechanism for K*t and the adaptive expectations model for X*t to arrive
at the following estimating equation:

Kt = α0 + α1Xt + α2Kt−1 + α3Kt−2 + vt ---------------------------------------------(b)

where vt = δ[ut − (1 − γ)ut−1] and the α's are combinations of the original parameters.
This model too is autoregressive, the only difference
from the purely adaptive expectations model being that Kt−2 appears along with Kt−1
as an explanatory variable. Like the Koyck and the AE models, the error term in
(b) follows a moving average process. Another feature of this model is that while it is linear in
the α's, it is nonlinear in the original parameters.

A celebrated application of (a) has been Friedman’s permanent income hypothesis,


which states that “permanent” or long-run consumption is a function of
“permanent” or long-run income.
The estimation of (b) presents the same estimation problems as the Koyck’s or the
AE model in that all these models are autoregressive with similar error structures.

II.5 The Almon approach to distributed lag models


The Almon lag model possesses two advantages over the Koyck procedure. First,
it does not violate any of the basic ordinary least squares assumptions concerning
the disturbance term. Second, it is far more flexible than the Koyck method in terms
of the form of the lag scheme, because this method does not hypothesize any
particular form of lag beforehand.

This model assumes that any pattern of the lag coefficients βi can be approximated by a
polynomial in i. This idea is based on a theorem in mathematics known as
Weierstrass' theorem, which states that under general conditions a curve may be
approximated by a polynomial whose degree is one more than the number of
turning points in the curve. Suppose that the βi in a given distributed lag model
are expected to decrease first, then increase and again decrease (two turning points);
a third-degree polynomial,

βi = a0 + a1i + a2i² + a3i³

would then describe the lag scheme.

Suppose our original model to be estimated is:

Yt = α + β0Xt + β1Xt-1 + β2Xt-2 + … + βkXt-k + vt

Then, assuming the above polynomial, βi = a0 + a1i + a2i² + a3i³, where a0, a1, a2
and a3 are parameters to be estimated. We are now in a position to obtain all the
βi's by setting i equal to the value of the subscript of the particular coefficient.

Naturally therefore, what needs to be estimated is only the four parameters of the
polynomial function: a0, a1, a2 and a3. Obtaining the values of a0, a1, a2 and a3,
we are able to estimate all the parameters of the original distributed lag model.
Substituting the values of βi in the original model gives

Yt = α + a0Z0t + a1Z1t + a2Z2t + a3Z3t + vt -----------------------------------------------(c)

where: Z0t = ΣXt-i, Z1t = Σ iXt-i, Z2t = Σ i²Xt-i and Z3t = Σ i³Xt-i (the sums
running over i = 0, 1, …, k).

This is the final form (or transformed form) of the Almon lag model. We can now
apply the OLS method to estimate a0, a1, a2 and a3 and then use
βi = a0 + a1i + a2i² + a3i³ to obtain the βi's in the original
form. Note that vt remains in its original form.
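
The transformation can be illustrated with a short Python sketch (simulated data; the lag length k = 6 and the polynomial coefficients are hypothetical). It builds the constructed variables Z0t to Z3t, estimates a0 to a3 by OLS and then recovers the βi's from the polynomial, which is exactly the sequence of steps described above.

import numpy as np

rng = np.random.default_rng(1)
n, k = 400, 6                                         # sample size and lag length (assumed)
i_idx = np.arange(k + 1)
a_true = np.array([0.3, 1.1, -0.3, 0.02])             # hypothetical polynomial coefficients
true_beta = np.column_stack([i_idx ** j for j in range(4)]) @ a_true   # beta_i = a0 + a1*i + a2*i^2 + a3*i^3

x = rng.normal(0.0, 1.0, n + k)
v = rng.normal(0.0, 0.5, n)
X_lags = np.column_stack([x[k - i : k - i + n] for i in range(k + 1)])  # column i holds X(t-i)
y = 2.0 + X_lags @ true_beta + v                      # Yt = alpha + sum_i beta_i*X(t-i) + vt

# Almon step: Zjt = sum_i i^j * X(t-i), so Yt = alpha + a0*Z0t + a1*Z1t + a2*Z2t + a3*Z3t + vt
Zmat = np.column_stack([X_lags @ (i_idx ** j) for j in range(4)])
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Zmat]), y, rcond=None)
a_hat = coef[1:]

beta_hat = np.column_stack([i_idx ** j for j in range(4)]) @ a_hat      # recover the beta_i's
print("true betas     :", np.round(true_beta, 3))
print("estimated betas:", np.round(beta_hat, 3))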

Chapter Seven

An Introduction to Simultaneous Equation models

7.1 Introduction
In all the previous chapters discussed so far, we have been focusing exclusively
on the problems and estimation of single equation regression models. In such
models, a dependent variable is expressed as a linear function of one or more
explanatory variables. The cause-and-effect relationship in such models between
the dependent and independent variables is unidirectional: the explanatory
variables are the cause and the dependent variable is the effect. But there are
situations where such one-way or unidirectional causation in the function is not
meaningful. This occurs if, for instance, Y (the dependent variable) is not only a
function of the X's (explanatory variables) but also all or some of the X's are, in turn,
determined by Y. There is, therefore, a two-way flow of influence between Y and
(some of) the X’s which in turn makes the distinction between dependent and
independent variables a little doubtful. Under such circumstances, we need to
consider more than one regression equation, one for each interdependent
variable, to understand the multi-flow of influence among the variables. This is
precisely what is done in simultaneous equation models.

A system describing the joint dependence of variables is called a system of


simultaneous equation or simultaneous equations model. The number of
equations in such models is equal to the number of jointly dependent or
endogenous variables involved in the phenomenon under analysis. Unlike the
single equation models, in simultaneous equation models it is not usually possible
(possible only under specific assumptions) to estimate a single equation of the
model without taking into account the information provided by the other equations of
the system. If one applies OLS to estimate the parameters of each equation
disregarding other equations of the model, the estimates so obtained are not only
biased but also inconsistent; i.e. even if the sample size increases indefinitely, the
estimators do not converge to their true values.

The bias arising from application of such procedure of estimation which treats
each equation of the simultaneous equations model as though it were a single
model is known as simultaneity bias or simultaneous equation bias. To avoid this
bias we will use other methods of estimation, such as Indirect Least Squares (ILS),
Two Stage Least Squares (2SLS), Three Stage Least Squares (3SLS), Maximum
Likelihood methods and the Method of Instrumental Variables (IV).
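
These estimators are beyond the scope of this module, but to give a flavour of the idea, the following Python sketch carries out the two stages of 2SLS "by hand" for a hypothetical two-equation system (the model, the parameter values and the variable names are all assumptions made for the illustration). The exogenous variable Z, which is excluded from the first equation, serves as the instrument for X.

import numpy as np

# Assumed system for the demonstration:
#   Y = a0 + a1*X + U          (equation of interest; Z is excluded from it)
#   X = b0 + b1*Y + b2*Z + V
rng = np.random.default_rng(2)
n, a0, a1, b0, b1, b2 = 5000, 1.0, 0.5, 2.0, 0.4, 1.5
Z = rng.normal(0, 1, n)
U = rng.normal(0, 1, n)
V = rng.normal(0, 1, n)
X = (b0 + b1 * a0 + b2 * Z + b1 * U + V) / (1 - a1 * b1)   # reduced form of X
Y = a0 + a1 * X + U

def ols(Xmat, y):
    coef, *_ = np.linalg.lstsq(Xmat, y, rcond=None)
    return coef

ones = np.ones(n)
print("OLS estimate of a1 (biased):", ols(np.column_stack([ones, X]), Y)[1])

# Stage 1: regress the endogenous regressor X on the exogenous variable(s)
X_hat = np.column_stack([ones, Z]) @ ols(np.column_stack([ones, Z]), X)
# Stage 2: regress Y on the fitted values from stage 1
print("2SLS estimate of a1        :", ols(np.column_stack([ones, X_hat]), Y)[1])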

What happens to the parameters of the relationship if we estimate by applying


OLS to each equation without taking into account the information provided by the
other equations in the system? The application of OLS to estimate the parameters
of economic relationships presupposes the classical assumptions discussed in
chapter one of this course. One of the crucial assumptions of OLS is that the
explanatory variables and the disturbance term are independent, i.e. the disturbance
term is truly exogenous. Symbolically: E[XiUi] = 0. As a result, the linear model


could be interpreted as describing the conditional expectation of the dependent
variable (Y) given a set of explanatory variables. In the simultaneous equation
models, such independence of explanatory variables and disturbance term is
violated, i.e. E[XiUi] ≠ 0. If this assumption is violated, the OLS estimator is
biased and inconsistent.
Simultaneity bias of OLS estimators: The two-way causation in a relationship
leads to violation of the important assumption of the linear regression model: one
variable can be the dependent variable in one equation while at the same time
appearing as an explanatory variable in other equations of the simultaneous-equation model.
In this case E[XiUi] may be different from zero. To show simultaneity bias, let’s
consider the following simple simultaneous equation model:

Yi = α0 + α1Xi + Ui
Xi = β0 + β1Yi + β2Zi + Vi --------------------------------------------------(10)

where X and Y are endogenous variables and Z is an exogenous variable.
Suppose that the following assumptions hold:

E(Ui) = E(Vi) = 0, E(Ui²) = σu², E(Vi²) = σv², E(UiVi) = 0, E(ZiUi) = E(ZiVi) = 0.

The reduced form of X is obtained by substituting the equation of Y into the
equation of X:

Xi = (β0 + α0β1)/(1 - α1β1) + [β2/(1 - α1β1)]Zi + (β1Ui + Vi)/(1 - α1β1) -----------(11)

Applying OLS to the first equation of the above structural model will result in a
biased estimator because Cov(Xi, Ui) ≠ 0. Now, let's prove this expression. Note that

Cov(Xi, Ui) = E{[Xi - E(Xi)][Ui - E(Ui)]} = E(XiUi) -----------(12)

Substituting the value of Xi from equation (11) into equation (12):

E(XiUi) = E{[(β0 + α0β1)/(1 - α1β1) + (β2/(1 - α1β1))Zi + (β1Ui + Vi)/(1 - α1β1)]Ui}
        = β1σu²/(1 - α1β1), since E(UiVi) = 0 and E(ZiUi) = 0.

That is, the covariance between X and U is not zero. As a consequence, if OLS is
applied to each equation of the model separately, the coefficients will turn out to be
biased. Now, let's examine how the non-zero covariance of the error term and
the explanatory variable leads to bias in the OLS estimates of the parameters.
If we apply OLS to the first equation of the above structural model (10),
we obtain

α̂1 = Σxiyi / Σxi²

where xi and yi denote deviations of Xi and Yi from their sample means. But we
know that yi = α1xi + (Ui - Ū) and ΣxiŪ = ŪΣxi = 0; hence

α̂1 = α1 + ΣxiUi / Σxi²

Taking the expected values on both sides:

E(α̂1) = α1 + E(ΣxiUi / Σxi²)

Since we have already proved that Cov(X, U) ≠ 0, which is the same as
E(ΣxiUi) ≠ 0, it follows that when Cov(X, U) ≠ 0, E(α̂1) ≠ α1; that is, α̂1 will be
biased by an amount equivalent to E(ΣxiUi / Σxi²).
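
A small Monte Carlo experiment (a sketch only, using hypothetical parameter values for the model (10)) makes the result concrete: the average OLS estimate of α1 settles close to α1 + Cov(X, U)/Var(X) rather than close to α1.

import numpy as np

rng = np.random.default_rng(3)
a0, a1, b0, b1, b2 = 1.0, 0.5, 2.0, 0.4, 1.5        # hypothetical structural parameters
sig_u, sig_v, sig_z, n, reps = 1.0, 1.0, 1.0, 200, 2000
den = 1 - a1 * b1

estimates = []
for _ in range(reps):
    Z = rng.normal(0, sig_z, n)
    U = rng.normal(0, sig_u, n)
    V = rng.normal(0, sig_v, n)
    X = (b0 + b1 * a0 + b2 * Z + b1 * U + V) / den   # reduced form (11)
    Y = a0 + a1 * X + U
    x, y = X - X.mean(), Y - Y.mean()                # deviation form
    estimates.append((x @ y) / (x @ x))              # OLS slope estimate of a1

cov_xu = b1 * sig_u**2 / den                         # Cov(X, U) as derived above
var_x = (b2**2 * sig_z**2 + b1**2 * sig_u**2 + sig_v**2) / den**2
print("true a1              :", a1)
print("average OLS estimate :", round(np.mean(estimates), 3))
print("a1 + Cov(X,U)/Var(X) :", round(a1 + cov_xu / var_x, 3))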

7.2 Definitions of Some Concepts


 Endogenous and exogenous variables
In simultaneous equation models variables are classified as endogenous and
exogenous. The traditional definition of these terms is that endogenous variables
are variables that are determined by the economic model (within the system) and
exogenous variables are those determined from outside. Exogenous variables are
also called predetermined. Predetermined variables can be divided into two
categories, both of which are considered in general as exogenous: current and
lagged exogenous variables, and lagged endogenous variables. For instance, Xt and
Xt-1 depict the current and lagged exogenous variables and Yt-1 depicts the lagged
endogenous variable. This is on the assumption that the X's symbolize the exogenous
variables and the Y's symbolize the endogenous variables. Thus, Xt, Xt-1 and Yt-1
are regarded as predetermined (exogenous) variables.

Since the exogenous variables are predetermined, they are supposed to be


independent of the error terms in the model.
Consider the demand and supply functions:

Demand: Q = α0 + α1P + α2Y + U1
Supply: Q = β0 + β1P + β2R + U2

where: Q = quantity, Y = income, P = price, R = rainfall, and U1, U2 are error terms.

Here P and Q are endogenous variables and Y and R are exogenous variables.
 Structural models


A structural model describes the complete structure of the relationships among the
economic variables. Structural equations of the model may be expressed in terms
of endogenous variables, exogenous variables and disturbances (random
variables). The parameters of structural model express the direct effect of each
explanatory variable on the dependent variable. Variables not appearing in any
function explicitly may have an indirect effect, which is taken into account by the
simultaneous solution of the system. For instance, a change in consumption affects
the investment indirectly and is not considered in the consumption function. The
effect of consumption on investment cannot be measured directly by any structural
parameter, but is measured indirectly by considering the system as a whole.
Example: The following simple Keynesian model of income determination can
be considered as a structural model.
C = α + βY + U -----------------------------------------------(16)
Y = C + Z ----------------------------------------------------(17)
for α > 0 and 0 < β < 1
where: C=consumption expenditure
Z=non-consumption expenditure
Y=national income
C and Y are endogenous variables while Z is exogenous variable.

 Reduced form of the model:


The reduced form of a structural model is the model in which the endogenous
variables are expressed as a function of the predetermined variables and the error
term only.
Illustration: Find the reduced form of the above structural model.
Since C and Y are endogenous variables and only Z is the exogenous variables,
we have to express C and Y in terms of Z. To do this, substitute Y = C + Z into
equation (16):

C = α + β(C + Z) + U
C = α/(1 - β) + [β/(1 - β)]Z + U/(1 - β) ----------------------------------(18)

Substituting again (18) into (17) we get:

Y = α/(1 - β) + [1/(1 - β)]Z + U/(1 - β) --------------------------------(19)

Equations (18) and (19) are called the reduced form of the above structural model.
We can write this more formally as:

Structural form equations:          Reduced form equations:
C = α + βY + U                      C = α/(1 - β) + [β/(1 - β)]Z + U/(1 - β)
Y = C + Z                           Y = α/(1 - β) + [1/(1 - β)]Z + U/(1 - β)
Parameters of the reduced form measure the total effect (direct and indirect) of a
change in an exogenous variable on the endogenous variable. For instance, in the
above reduced form equation (18), β/(1 - β) measures the total effect of a unit
change in the non-consumption expenditure on consumption. This total effect is
β, the direct effect of a unit change in income on consumption, times 1/(1 - β),
the multiplier that captures the indirect effect working through income.
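
A quick numerical check (illustration only, with hypothetical values α = 20 and β = 0.8) confirms this decomposition of the reduced form coefficient:

alpha, beta = 20.0, 0.8                      # hypothetical consumption function C = alpha + beta*Y
multiplier = 1 / (1 - beta)                  # indirect (multiplier) effect
total_effect = beta / (1 - beta)             # reduced form coefficient of Z in (18)

C = lambda Z: alpha / (1 - beta) + (beta / (1 - beta)) * Z
print(total_effect, beta * multiplier)       # 4.0  4.0 -> total = direct x multiplier
print(C(101.0) - C(100.0))                   # a unit rise in Z raises C by 4.0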

The reduced form equations can be obtained in two ways:


1) To express the endogenous variables directly as a function of the
predetermined variables.


2) To solve the structural system of endogenous variables in terms of the


predetermined variables, the structural parameters, and the disturbance
terms.
Consider the following simple model for a closed economy.
Ct = a1Yt + U1 ---------------------------------------------------------(i)
It = b1Yt + b2Yt-1 + U2-----------------------------------------------(ii)
Yt = Ct +It + Gt-------------------------------------------------------(iii)
This model has three equations in three endogenous variables (C t , It , and Yt ) and
two predetermined variables (Gt, andYt-1).
To obtain the reduced form of this model, we may use two methods (direct method
and solving the structural model method).
Direct Method: Express the three endogenous variables(Ct , It , and Yt ) as
functions of the two predetermined variables (G t, andYt-1) directly using ’s as the
parameters of the reduced form model as follows.
Ct = π11Yt-1 + π12Gt + V1 ------------------------------------(iv)
It = π21Yt-1 + π22Gt + V2 -------------------------------------(v)
Yt = π31Yt-1 + π32Gt + V3 ------------------------------------(vi)
Note: π11, π12, π21, π22, π31, and π32 are reduced form parameters. By solving the
structural system of endogenous variables in terms of predetermined variables,
structural parameters and disturbances, the expressions for the reduced parameters
can be obtained easily. For instance, the third structural equation (iii) can be
expressed in reduced form as follows:
Yt = [b2/(1-a1-b1)]Yt-1 + [1/(1-a1-b1)]Gt + (U1 + U2)/(1-a1-b1). This equation is
obtained by simply substituting structural equations (i) and (ii) into (iii). From this
expression: π31 = b2/(1-a1-b1)
π32 = 1/(1-a1-b1)
Test yourself Questions:


a) Determine the reduced form equations for the structural equations (ii) and
(iii).
b) Indicate the expressions for π11, π12, π21, and π22 from (a) above.
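
As a supplementary check (outside the module text), the reduced form can also be obtained symbolically; the following sympy sketch solves the structural system (i)-(iii) for the endogenous variables and confirms π31 and π32. The same approach can be adapted to verify your answers to questions (a) and (b).

import sympy as sp

a1, b1, b2 = sp.symbols('a1 b1 b2')
Ct, It, Yt, Ylag, Gt, U1, U2 = sp.symbols('C_t I_t Y_t Y_lag G_t U1 U2')

eq1 = sp.Eq(Ct, a1 * Yt + U1)                  # consumption function (i)
eq2 = sp.Eq(It, b1 * Yt + b2 * Ylag + U2)      # investment function (ii)
eq3 = sp.Eq(Yt, Ct + It + Gt)                  # income identity (iii)

sol = sp.solve([eq1, eq2, eq3], [Ct, It, Yt], dict=True)[0]
Yt_reduced = sp.simplify(sol[Yt])
print(Yt_reduced)                              # (b2*Y_lag + G_t + U1 + U2)/(1 - a1 - b1)
print(sp.simplify(Yt_reduced.diff(Ylag)))      # pi_31 = b2/(1 - a1 - b1)
print(sp.simplify(Yt_reduced.diff(Gt)))        # pi_32 = 1/(1 - a1 - b1)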
How to estimate the reduced form parameters?
The estimates of the reduced form coefficients (the π's) may be obtained in two ways.
1) Direct estimation of the reduced coefficients by applying OLS.
2) Indirect estimation of the reduced form coefficients:
Steps:
i) Solve the system of endogenous variables so that each equation contains
only predetermined explanatory variables. In this way we may obtain
the system of parameters' relations (relations between the π's and the
structural parameters).
ii) Obtain the estimates of the structural parameters by any appropriate
econometric method.
iii) Substitute the estimates of the structural coefficients into the system of
parameters' relations to find the estimates of the reduced form coefficients.
 Recursive models
A model is called recursive if its structural equations can be ordered in such a way
that the first equation includes only the predetermined variables in the right hand
side; the second equation contains predetermined variables and the first
endogenous variable (of the first equation) in the right hand side and so on. The
special feature of recursive model is that its equations may be estimated, one at a
time, by OLS without simultaneous equations bias.

OLS is not applicable if there is interdependence between the explanatory


variables and the error term. In the simultaneous equation models, the endogenous
variables may depend on the error terms of the model; hence the OLS technique is
not appropriate for the estimation of an equation in a simultaneous equations model.


However, in a special type of simultaneous equations model called Recursive,


Triangular or Causal model, the use of OLS procedure of estimation is
appropriate. Consider the following three-equation system to understand the
nature of such models:

Y1 = β10 + γ11X1 + γ12X2 + U1
Y2 = β20 + β21Y1 + γ21X1 + γ22X2 + U2
Y3 = β30 + β31Y1 + β32Y2 + γ31X1 + γ32X2 + U3

In the above illustration, as usual, the X's and Y's are exogenous and endogenous
variables respectively. The disturbance terms are assumed to satisfy

Cov(U1, U2) = Cov(U1, U3) = Cov(U2, U3) = 0

The above assumption is the most crucial assumption that defines the recursive
model. If this does not hold, the above system is no longer recursive and OLS is
also no longer valid. The first equation of the above system contains only
exogenous variables on the right hand side. Since, by assumption, the exogenous
variables are independent of U1, the first equation satisfies the critical assumption of
the OLS procedure. Hence OLS can be applied straightforwardly to this equation.

Consider the second equation. It contains the endogenous variable Y1 as one of the
explanatory variables along with the non-stochastic X's. OLS can be applied to this
equation only if it can be shown that Y1 and U2 are independent of each other.
This is true because U1, which affects Y1, is by assumption uncorrelated with U2, i.e.
Cov(U1, U2) = 0. Y1 therefore acts as a predetermined variable in so far as Y2 is concerned.
Hence OLS can be applied to this equation. A similar argument can be stretched to
the third equation because Y1 and Y2 are independent of U3. In this way, in the
recursive system OLS can be applied to each equation separately.

Let us build a hypothetical recursive model for an agricultural commodity, say
wheat. The production of wheat, Y1, may be assumed to depend on exogenous
factors: X1 = climatic conditions and X2 = last season's price. The retail price, Y2,
may be assumed to be a function of the production level Y1 and the exogenous factor
X3 = disposable income. Finally, the price obtained by the producer, Y3, can be
expressed in terms of the retail price Y2 and the exogenous factor X4 = the cost of
marketing borne by the producer.

The relevant equations of the model may be described as under:

Y1 = a1X1 + a2X2 + U1
Y2 = b1Y1 + b2X3 + U2
Y3 = c1Y2 + c2X4 + U3

In the first equation, there are only exogenous variables on the right hand side and
they are assumed to be independent of U1. In the second equation, the causal relation
between Y1 and Y2 is in one direction; also, Y1 is independent of U2 and can be treated
just like an exogenous variable. Similarly, since Y2 is independent of U3, OLS can be
applied to the third equation. Thus, we can rewrite the above equations as follows:

Y1           = a1X1 + a2X2 + U1
-b1Y1 + Y2   = b2X3 + U2
-c1Y2 + Y3   = c2X4 + U3

We can again rewrite this in matrix form as follows:

[  1    0    0 ] [Y1]   [ a1X1 + a2X2 + U1 ]
[ -b1   1    0 ] [Y2] = [ b2X3 + U2        ]
[  0   -c1   1 ] [Y3]   [ c2X4 + U3        ]
The coefficient matrix of endogenous variables is thus a triangular one; hence


recursive models are also called triangular models.
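
To illustrate (this sketch is not part of the module, and the coefficient values are hypothetical), the wheat system above can be simulated with mutually independent disturbances and each equation estimated separately by OLS; the estimates come out close to the true values, confirming that no simultaneity bias arises in a recursive system.

import numpy as np

rng = np.random.default_rng(4)
n = 2000
a1, a2, b1, b2, c1, c2 = 0.8, 0.5, 1.2, 0.3, 0.9, -0.4   # hypothetical true parameters

X1, X2, X3, X4 = (rng.normal(0, 1, n) for _ in range(4))
U1, U2, U3 = (rng.normal(0, 1, n) for _ in range(3))      # mutually independent by construction

Y1 = a1 * X1 + a2 * X2 + U1          # production
Y2 = b1 * Y1 + b2 * X3 + U2          # retail price
Y3 = c1 * Y2 + c2 * X4 + U3          # producer price

def ols(cols, y):
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y, rcond=None)
    return np.round(coef, 2)

print("equation 1:", ols([X1, X2], Y1))   # close to (a1, a2)
print("equation 2:", ols([Y1, X3], Y2))   # close to (b1, b2) -- no simultaneity bias
print("equation 3:", ols([Y2, X4], Y3))   # close to (c1, c2)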

7.3 Problems of simultaneous equation models


Simultaneous equation models create three distinct problems. These are:
1. Mathematical completeness of the model: any model is said to be
(mathematically) complete only when it possesses as many independent
equations as endogenous variables. In other words if we happen to know
values of disturbance terms, exogenous variables and structural
parameters, then all the endogenous variables are uniquely determined.


2. Identification of each equation of the model: Many times it so happens


that a given set of values of the disturbance terms and exogenous variables
yields the same values of the endogenous variables for different sets of structural
parameters, i.e. the equations are observationally indistinguishable. What is needed
is that the parameters of each equation in the system should be uniquely determined.
Hence, certain tests are required to examine the identification of each equation
before its estimation.
3. Statistical estimation of each equation of the model: Since application
of OLS yield biased and inconsistent estimates, different statistical
techniques are to be developed to estimate the structural parameters.
Some of the most common simultaneous methods* of estimation are:
i) The indirect least square method(ILS)
ii) The two-stage least square method(2SLS)
iii) The three-stage least square method(3SLS)
iv) Limited information maximum likelihood method (LIML)
v) The instrumental variable method (IV)
vi) The mixed estimation method; and
vii) The full information maximum likelihood method (FIML)
Of the three problems, we are going to discuss the second problem (the
identification problem) in the following section.

7. 4 The identification problem


In simultaneous equation models, the Problem of identification is a problem of
model formulation; it is not concerned with the estimation of the model. The
estimation of the model depends up on the empirical data and the form of the
model. If the model is not in the proper statistical form, it may turn out that the
parameters may not be uniquely estimated even though adequate and relevant data are
available.

* These methods of estimation are not discussed in this module as they are beyond the scope of this
introductory course.

In the language of econometrics, a model is said to be identified only when it is in
unique statistical form to enable us to obtain unique estimates of its
parameters from the sample data. To illustrate the problem of identification, let's
consider a simplified wage-price model.
W = α1 + α2P + α3E + U --------------------------------------(i)
P = β1 + β2W + V ------------------------------------------------(ii)
where W and P are the percentage rates of wage and price inflation respectively, E is a
measure of excess demand in the labor market, and U and V are disturbances. If E is
assumed to be exogenously determined, then (i) and (ii) represent two equations
determining two endogenous variables: W and P. Let's explain the problem of
identification with the help of these
two equations of a simultaneous equation model.
Let's use equation (ii) to express W in terms of P:

W = -(β1/β2) + (1/β2)P - (1/β2)V -------------------------------------------------(iii)

Now, suppose A and B are any two constants. Let's multiply equation (i) by A,
multiply equation (ii) by B and then add the two equations. This gives

(A - Bβ2)W = (Aα1 + Bβ1) + (Aα2 - B)P + Aα3E + (AU + BV)

or

W = (Aα1 + Bβ1)/(A - Bβ2) + [(Aα2 - B)/(A - Bβ2)]P + [Aα3/(A - Bβ2)]E + (AU + BV)/(A - Bβ2) -------------------(iv)

Equation (iv) is what is known as a linear combination of (i) and (ii). The point
about equation (iv) is that it is of the same statistical form as the wage equation (i).
That is, it has the form:
W = constant + (constant)P + (constant)E + disturbance
Moreover, since A and B can take any values we like, this implies that our wage
price model generates an infinite number of equations such as (iv), which are all
statistically indistinguishable from the wage equation (i). Hence, if we apply OLS
or any other technique to data on W, P and E in an attempt to estimate the wage
equation, we can't know whether we are actually estimating (i) rather than one of
the infinite number of possibilities given by (iv). Equation (i) is said to be
unidentified, and consequently there is now no way in which unbiased or even
consistent estimators of its parameters may be obtained.

Notice that, in contrast, price equation (ii) cannot be confused with the linear
combination (iv), because it is a relationship involving W and P only and does not,
like (iv), contain the variable E. The price equation (ii) is therefore said to be
identified, and in principle it is possible to obtain consistent estimates of its
parameters. A function (an equation) belonging to a system of simultaneous
equations is identified if it has a unique statistical form, i.e. if no other
equation in the system, or equation formed by algebraic manipulation of the other
equations of the system, contains the same variables as the function (equation) in
question.
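
The point can be verified numerically. The short Python sketch below (an illustration only, with hypothetical parameter values) computes the coefficients of the linear combination (iv) for several arbitrary choices of A and B; every choice produces an equation of exactly the same form as the wage equation (i).

alpha1, alpha2, alpha3 = 2.0, 0.6, 0.4        # hypothetical wage equation parameters
beta1, beta2 = 1.0, 0.5                       # hypothetical price equation parameters

for A, B in [(1.0, 0.5), (2.0, -1.0), (0.3, 4.0)]:
    d = A - B * beta2                         # assumed non-zero
    const = (A * alpha1 + B * beta1) / d
    coef_P = (A * alpha2 - B) / d
    coef_E = (A * alpha3) / d
    # Each (A, B) yields W = constant + (constant)P + (constant)E + disturbance,
    # indistinguishable in form from equation (i).
    print(f"A={A}, B={B}:  W = {const:.2f} + {coef_P:.2f}*P + {coef_E:.2f}*E + error")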

Identification problems do not arise only in two-equation models. Using the
above procedure, we can check identification problems easily if we have two or
three equations in a given simultaneous equation model. However, for ‘n’
equations simultaneous equation model, such a procedure is very cumbersome. In
general for any number of equations in a given simultaneous equation, we have
two conditions that need to be satisfied to say that the model is in general
identified or not. In the following section we will see the formal conditions for
identification.

7. 5 Formal Rules (Conditions) for Identification


Identification may be established either by the examination of the specification of
the structural model, or by the examination of the reduced form of the model.
Traditionally identification has been approached via the reduced form. Actually
the term 'identification' was originally used to denote the possibility (or
impossibility) of deducing the values of the parameters of the structural relations


from a knowledge of the reduced form parameters. In this section we will examine
both approaches. However, we think that the reduced form approach is
conceptually confusing and computationally more difficult than the structural
model approach, because it requires the derivation of the reduced from first and
then examination of the values of the determinant formed from some of the
reduced form coefficients. The structural form approach is simpler and more
useful.

In applying the identification rules we should either ignore the constant term, or, if
we want to retain it, we must include in the set of variables a dummy variable (say
X0) which would always take on the value 1. Either convention leads to the same
results as far as identification is concerned. In this chapter we will ignore the
constant intercept.

7.5.1 Establishing identification from the structural form of the model


There are two conditions which must be fulfilled for an equation to be identified.
1. The order condition for identification
This condition is based on a counting rule of the variables included and excluded
from the particular equation. It is a necessary but not sufficient condition for the
identification of an equation. The order condition may be stated as follows.
For an equation to be identified the total number of variables (endogenous and
exogenous) excluded from it must be equal to or greater than the number of
endogenous variables in the model less one. Given that in a complete model the
number of endogenous variables is equal to the number of equations of the model,
the order condition for identification is sometimes stated in the following
equivalent form. For an equation to be identified the total number of variables
excluded from it but included in other equations must be at least as great as the
number of equations of the system less one.


Let: G = total number of equations (= total number of endogenous variables)


K= number of total variables in the model (endogenous and predetermined)
M= number of variables, endogenous and exogenous, included in a
particular equation.
Then the order condition for identification may be symbolically expressed as:

(K - M) ≥ (G - 1)

For example, if a system contains 10 equations with 15 variables, ten endogenous
and five exogenous, an equation containing 11 variables is not identified, while
another containing 5 variables is identified.

a. For the first equation we have
G = 10, K = 15, M = 11
Order condition:
(15 - 11) = 4 < (10 - 1) = 9; that is, the order condition is not satisfied.

b. For the second equation we have
G = 10, K = 15, M = 5
Order condition:
(15 - 5) = 10 > (10 - 1) = 9; that is, the order condition is satisfied.
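
The counting rule is easy to mechanize; the small helper below (an illustrative sketch, not part of the module) applies it to the example just given. Remember that it checks the order condition only, which is necessary but not sufficient.

def order_condition(K, M, G):
    """Counting rule: compare K - M (excluded variables) with G - 1."""
    excluded, needed = K - M, G - 1
    if excluded < needed:
        return "not identified (order condition fails)"
    if excluded == needed:
        return "exactly identified (if the rank condition also holds)"
    return "overidentified (if the rank condition also holds)"

# The example above: G = 10 equations, K = 15 variables
print(order_condition(K=15, M=11, G=10))   # equation with 11 variables: not identified
print(order_condition(K=15, M=5, G=10))    # equation with 5 variables: overidentified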

The order condition for identification is necessary for a relation to be identified,


but it is not sufficient, that is, it may be fulfilled in any particular equation and yet
the relation may not be identified.
2. The rank condition for identification
The rank condition states that: in a system of G equations any particular equation
is identified if and only if it is possible to construct at least one non-zero
determinant of order (G-1) from the coefficients of the variables excluded from
that particular equation but contained in the other equations of the model. The
practical steps for tracing the identifiability of an equation of a structural model
may be outlined as follows.
Firstly. Write the parameters of all the equations of the model in a separate table,
noting that the parameter of a variable excluded from an equation is equal to zero.
For example, let a structural model be:

y1 = 3y2 - 2x1 + x2 + u1
y2 = y3 + x3 + u2
y3 = y1 - y2 - 2x3 + u3

where the y's are the endogenous variables and the x's are the predetermined
variables. This model may be rewritten in the form

-y1 + 3y2 + 0y3 - 2x1 + x2 + 0x3 + u1 = 0
0y1 - y2 + y3 + 0x1 + 0x2 + x3 + u2 = 0
y1 - y2 - y3 + 0x1 + 0x2 - 2x3 + u3 = 0

Ignoring the random disturbances, the table of the parameters of the model is as follows:

                 Variables
Equations        y1    y2    y3    x1    x2    x3
1st equation     -1     3     0    -2     1     0
2nd equation      0    -1     1     0     0     1
3rd equation      1    -1    -1     0     0    -2

Secondly. Strike out the row of coefficients of the equation which is being
examined for identification. For example, if we want to examine the identifiability
of the second equation of the model we strike out the second row of the table of
coefficients.
Thirdly. Strike out the columns in which a non-zero coefficient of the equation
being examined appears. By deleting the relevant row and columns we are left
with the coefficients of variables not included in the particular equation, but
contained in the other equations of the model. For example, if we are examining
for identification the second equation of the system, we will strike out the second,
third and the sixth columns of the above table, thus obtaining the following tables.


Table of structural parameters:

                 y1    y2    y3    x1    x2    x3
1st equation     -1     3     0    -2     1     0
2nd equation      0    -1     1     0     0     1   (struck out)
3rd equation      1    -1    -1     0     0    -2

Table of parameters of excluded variables:

                 y1    x1    x2
1st equation     -1    -2     1
3rd equation      1     0     0

Fourthly. Form the determinant(s) of order (G-1) and examine their value. If at
least one of these determinants is non-zero, the equation is identified. If all the
determinants of order (G-1) are zero, the equation is underidentified.
In the above example of exploration of the identifiability of the second structural
equation we have three determinants of order (G - 1) = 3 - 1 = 2. They are:

Δ1 = | -1  -2 | = 2      Δ2 = | -1   1 | = -1      Δ3 = | -2   1 | = 0
     |  1   0 |               |  1   0 |                |  0   0 |

(the symbol Δ stands for 'determinant'). We see that we can form two non-zero
determinants of order G - 1 = 3 - 1 = 2; hence the second equation of our system is
identified.
Fifthly. To see whether the equation is exactly identified or overidentified we use
the order condition (K - M) ≥ (G - 1). With this criterion, if the equality sign is
satisfied, that is if (K - M) = (G - 1), the equation is exactly identified. If the
inequality sign holds, that is, if (K - M) > (G - 1), the equation is overidentified.
In the case of the second equation we have:
G = 3, K = 6, M = 3
and the counting rule gives
(6 - 3) > (3 - 1)
Therefore the second equation of the model is overidentified.
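
The row-and-column deletions above amount to checking the rank of a sub-matrix, which can be done mechanically. The sketch below (illustration only) reproduces the result for the second equation of the example, using the table of structural parameters with columns ordered y1, y2, y3, x1, x2, x3.

import numpy as np

A = np.array([[-1,  3,  0, -2,  1,  0],
              [ 0, -1,  1,  0,  0,  1],
              [ 1, -1, -1,  0,  0, -2]], dtype=float)
G, eq = A.shape[0], 1                            # examine the 2nd equation (index 1)

rows = [i for i in range(G) if i != eq]          # strike out the equation's own row
cols = np.flatnonzero(A[eq] == 0)                # keep columns of variables excluded from it
sub = A[np.ix_(rows, cols)]
print(sub)                                       # [[-1. -2.  1.]  [ 1.  0.  0.]]
print("rank:", np.linalg.matrix_rank(sub))       # 2 = G - 1, so the rank condition holds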
The identification of a function is achieved by assuming that some variables of the
model have zero coefficient in this equation, that is, we assume that some
variables do not directly affect the dependent variable in this equation. This,
however, is an assumption which can be tested with the sample data. We will
examine some tests of identifying restrictions in a subsequent section. Some
examples will illustrate the application of the two formal conditions for
identification.
Example 1. Assume that we have a model describing the market of an agricultural
product. From the theory of partial equilibrium we know that the price in a market
is determined by the forces of demand and supply. The main determinants of the
demand are the price of the commodity, the prices of other commodities, incomes
and tastes of consumers. Similarly, the most important determinants of the supply
are the price of the commodity, other prices, technology, the prices of factors of
production, and weather conditions. The equilibrium condition is that demand be
equal to supply. The above theoretical information may be expressed in the form
of the following mathematical model:

Demand:       D = α0 + α1P1 + α2P2 + α3Y + α4t + u1
Supply:       S = β0 + β1P1 + β2P2 + β3t + β4C + u2
Equilibrium:  D = S

where: D = quantity demanded
S = quantity supplied
P1 = price of the given commodity
P2 = price of other commodities
Y = income
C = costs (index of prices of factors of production)
t = time trend. In the demand function it stands for 'tastes'; in the supply function it stands for
'technology'.

The above model is mathematically complete in the sense that it contains three
equations in three endogenous variables, D, S and P1. The remaining variables, Y,
P2, C, t are exogenous. Suppose we want to identify the supply function. We
apply the two criteria for identification:
1. Order condition:
In our example we have: K=7 M=5 G=3
Therefore, (K-M)=(G-1) or (7-5)=(3-1)=2
Consequently the second equation satisfies the first condition for identification.


2. Rank condition
The table of the coefficients of the structural model is as follows:

                 Variables
Equations        D     P1    P2    Y     t     S     C
1st equation    -1     α1    α2    α3    α4    0     0
2nd equation     0     β1    β2    0     β3   -1     β4
3rd equation     1     0     0     0     0    -1     0

Following the procedure explained earlier, we strike out the second row and the second,
third, fifth, sixth and seventh columns. Thus we are left with the table of the coefficients
of excluded variables:
Table of parameters of variables excluded from the second equation
(columns D and Y, rows 1 and 3):

                 D     Y
1st equation    -1     α3
3rd equation     1     0

From this table we can form only one determinant of order (G - 1) = (3 - 1) = 2:

Δ = | -1   α3 | = -α3
    |  1    0 |

The value of the determinant is non-zero, provided that α3 ≠ 0.


We see that both the order and rank conditions are satisfied. Hence the second
equation of the model is identified. Furthermore, we see that in the order
condition the equality holds: (7-5) = (3-1) = 2. Consequently the second
structural equation is exactly identified.
Example 2. Assume the following simple version of the Keynesian model of
income determination.
Consumption function:  Ct = a0 + a1Yt + a2Tt + u1
Investment function:   It = b0 + b1Yt-1 + u2
Taxation function:     Tt = c0 + c1Yt + u3
Definition:            Yt = Ct + It + Gt
This model is mathematically complete in the sense that it contains as many
equations as endogenous variables. There are four endogenous variables, C, I, T, Y,
and two predetermined variables, lagged income (Yt-1) and government
expenditure (G).
A. The first equation (consumption function) is not identified
1. Order condition:
There are six variables in the model (K=6) and four equations (G=4). The
consumption function contains three variables (M=3).
(K-M)=3 and (G-1)=3
Thus (K-M)=(G-1), which shows that the order condition for identification is
satisfied.
2. Rank condition
The table of structural coefficients is as follows
                 Variables
Equations        C     T     Y     I     Yt-1   G
1st equation    -1     a2    a1    0     0      0
2nd equation     0     0     0    -1     b1     0
3rd equation     0    -1     c1    0     0      0
4th equation     1     0    -1     1     0      1

We strike out the first row and the first three columns of the table and thus obtain
the table of coefficients of excluded variables:

                 I     Yt-1   G
2nd equation    -1     b1     0
3rd equation     0     0      0
4th equation     1     0      1
We evaluate the determinant of this table. Clearly the value of this determinant is
zero, since the second row contains only zeros. Consequently we cannot form any
nonzero determinant of order 3(=G-1). The rank condition is violated. Hence we
conclude that the consumption function is not identified, despite the satisfaction of
the order criterion.
B. The investment function is overidentified


1. Order condition
The investment function includes two variables. Hence
K-M = 6-2
Clearly (K-M) > (G-1), given that G-1=3. The order condition is fulfilled.
2. Rank condition
Deleting the second row and the fourth and fifth columns of the structural
coefficients table we obtain the table of coefficients of excluded variables:

                 C     T     Y     G
1st equation    -1     a2    a1    0
3rd equation     0    -1     c1    0
4th equation     1     0    -1     1

The value of the first 3×3 determinant of the parameters of excluded variables is

| -1   a2   a1 |
|  0   -1   c1 | = a1 + a2c1 - 1 ≠ 0   (provided a1 + a2c1 ≠ 1)
|  1    0   -1 |
The rank condition is satisfied since we can construct at least one non-zero
determinant of order 3=(G-1).
Applying the counting rule we see that the inequality sign holds:
4>3; hence the investment function is overidentified.

Self-test Question: Detect the identifiability of the tax equation.

7.5.2 Establishing identification from the reduced form


Like that of the identification conditions from structural equations, there are two
conditions for identification based on the reduced form of the model, an order
condition and a rank condition. The order condition is the same as in the
structural model. The rank condition here refers to the value of the determinant
formed from some of the reduced form parameters, π‘s.

1. Order condition (necessary condition), as applied to the reduced form


An equation belonging to a system of simultaneous equations is identified if

(K - M) ≥ (G - 1)

where K, M and G have the same meaning as before:


K= total number of variables, endogenous and exogenous, in the entire
model
M= number of variables, endogenous and exogenous, in any particular
equation
G= number of structural equations=number of all endogenous variables in
the model
If (K-M) = (G-1), the equation is exactly identified, provided that the rank
condition set out below is also satisfied. If (K-M)>(G-1), the equation is
overidentified, while if (K-M)<(G-1), the equation is underidentified, under the
same proviso.
2.Rank condition as applied to the reduced form
Let G* stand for the number of endogenous variables contained in a particular
equation. The rank condition as applied to the reduced form may be stated as
follows.
An equation containing G* endogenous variables is identified if and only if it is
possible to construct at least one non-zero determinant of order G*-1 from the
reduced form coefficients of the exogenous (predetermined) variables excluded
from that particular equation.


The practical steps involved in this method of identification may be outlined as


follows.
Firstly. Obtain the reduced form of the structural model. For example, assume that
the original model is (as in the earlier example):

y1 = 3y2 - 2x1 + x2 + u1
y2 = y3 + x3 + u2
y3 = y1 - y2 - 2x3 + u3

This model is complete in the sense that it contains three equations in three
endogenous variables. The model contains altogether six variables: three
endogenous (y1, y2, y3) and three exogenous (x1, x2, x3).
The reduced form of the model is obtained by solving the original equations for
the endogenous variables in terms of the exogenous variables. The reduced form in
the above example is:

y1 = π11x1 + π12x2 + π13x3 + v1
y2 = π21x1 + π22x2 + π23x3 + v2
y3 = π31x1 + π32x2 + π33x3 + v3

where the π's are functions of the structural parameters.


Secondly. Form the complete table of the reduced form coefficients.
                      Exogenous variables
Equations             x1     x2     x3
1st equation: y1      π11    π12    π13
2nd equation: y2      π21    π22    π23
3rd equation: y3      π31    π32    π33

Strike out the rows corresponding to endogenous variables excluded from the
particular equation being examined for identifiability. Also strike out all the
columns referring to exogenous variables included in the structural form of the
particular equation.
After these deletions we are left with the reduced form coefficients of exogenous
variables excluded (absent) from the structural equation. For example, assume that
we are investigating the identifiability of the second equation. The relevant
coefficients for the identification procedure are found by striking out the first
row (since y1 does not appear in the second equation) and the third column (since
x3 is included in this equation). We are left with the following table of reduced
form coefficients of the excluded exogenous variables:

                      x1     x2
2nd equation: y2      π21    π22
3rd equation: y3      π31    π32
Thirdly. Examine the order of the determinants of the π’s of excluded exogenous
variables and evaluate them. If the order of the largest non-zero determinant is
G*-1, the equation is identified. Otherwise the equation is not identified.


We hope you enjoyed the reading! Any comments are welcome!
