A Note on Identification in the Multinomial Probit Model

Author(s): Michael P. Keane


Source: Journal of Business & Economic Statistics, Apr. 1992, Vol. 10, No. 2, pp. 193-200
Published by: Taylor & Francis, Ltd. on behalf of American Statistical Association

Stable URL: https://www.jstor.org/stable/1391677

© 1992 American Statistical Association. Journal of Business & Economic Statistics, April 1992, Vol. 10, No. 2

A Note on Identification in the Multinomial Probit Model
Michael P. Keane
Industrial Relations Center and Department of Economics, University of Minnesota, Minneapolis, MN
and Research Department, Federal Reserve Bank of Minneapolis, Minneapolis, MN 55480

Although formal conditions for identification in the multinomial probit (MNP) model are now
clearly established, little is known about how various estimable MNP specifications perform in
practice. This article shows that parameter identification in the MNP model is extremely tenuous
in the absence of exclusion restrictions. This previously unnoticed fact is important because
formal identification of MNP models does not require exclusion restrictions, and many potential
economic applications of MNP are to situations in which exclusion restrictions are not readily
available. Thus, failure to be aware of the difficulties present in such situations may lead to
reporting of unreliable results.

KEY WORDS: Discrete choice; Latent variables; Parameter estimability.

The multinomial probit (MNP) model has rarely been used as a model of choice in applied work,
despite its well-known advantages over the popular logit model (i.e., its relaxation of the
restrictive independence of irrelevant alternatives assumption). The lack of use of MNP stems from
the computational burden involved in its estimation. The model generates choice probabilities that
are multivariate integrals of order $M - 1$, where $M$ is the number of alternatives. Thus, even
when $M = 3$, estimation by maximum likelihood (ML) is expensive given large data sets.

The recent development of a computationally practical method of simulated moments (MSM) estimator
for the MNP model (see McFadden 1989; Pakes and Pollard 1989), along with the development of
practical parameterization methods that avoid proliferation of covariance matrix elements (see
Ben-Akiva and Bolduc 1989; Bolduc 1991; Elrod and Keane 1991), has raised renewed interest in MNP
as a model of choice. Given the lack of practical experience with MNP models, however, there is a
need to develop a "folklore" concerning the conditions under which the model performs well. An
important step in that direction is the recent paper by Bunch and Kitamura (1991).

The purpose of this article is to demonstrate that parameter identification in MNP models is
extremely tenuous in the absence of exclusion restrictions. By exclusion restrictions I mean
restrictions that certain exogenous variables in the model do not affect the utility levels of
certain alternatives. This fact is important, because formal identification of MNP models does not
require exclusion restrictions, and (to my knowledge) the fact that identification is tenuous in
their absence has not been previously noted in the literature. Hence, given the new MSM
technology, researchers are likely to attempt simulation estimation of unrestricted multinomial
probits. This may, in turn, lead to the reporting of unreliable results.

For clarity of exposition, it is important to note that there are two types of identification
problems. For a formally nonidentified model, there is a range of parameter values that generate
the maximized value of the objective function. In addition, a model may be formally identified yet
exhibit very small variation in the objective function from its maximum over a wide range of
parameter values. I refer to identification in such cases as being tenuous or fragile. Common
symptoms of this problem in practice are a close-to-singular Hessian, large standard errors, and
inability of optimization algorithms to find steps that improve the objective or to achieve
convergence. As we shall see, these symptoms of fragile identification are quite severe in MNP
models without exclusion restrictions.

Note that in complex nonlinear models such as MNP, formal identification or nonidentification is
often very difficult to prove due to the complexity of the analytical Hessian. Thus, in this
article, rather than working with the analytical Hessian, I use a series of trial estimations of
MNP models on Monte Carlo and actual data to illustrate the nature of the identification problem.
Because of the difficulties in proving identification in nonlinear models, a common practice in
applied work is to simply attempt to estimate a model and see if the Hessian is singular (or nearly
singular). Such practice is dangerous when using simulation estimators of the type likely to be
applied to MNP models, because simulation error will generate contours where the true objective
function is flat and will generate a nonsingular Hessian when the true Hessian is singular. An
illustration of this danger was provided by Horowitz, Sparmann, and Daganzo (1982). They were
testing the accuracy of simulated ML based on the Clark approximation for purposes of estimating
the MNP model. One of the models they used for the tests was actually nonidentified, but
approximation errors introduced contours in the objective function that masked the problem and allowed
them to obtain estimates. (Their other tests did use identified models, however, and they indicate
that the Clark approximation does not work well for purposes of estimating the MNP model.) Because
of this problem, all of the estimation in this article is done using ML estimation based on
accurate numerical evaluations of the MNP choice probabilities.

1. THE MULTINOMIAL PROBIT MODEL

In this section, I describe the trinomial probit model (TNP). Extension to multinomial probit is
obvious. In the TNP model there are three alternatives with random utilities given by

$$
\begin{aligned}
U_{1i} &= \alpha_1 + \beta_1 X_i + \varepsilon_{1i} \\
U_{2i} &= \alpha_2 + \beta_2 X_i + \varepsilon_{2i} \\
U_{3i} &= \alpha_3 + \beta_3 X_i + \varepsilon_{3i}.
\end{aligned} \tag{1}
$$

Here, $X_i$ is a vector of regressors for person $i$, and $\beta_j$ is the corresponding
coefficient vector for alternative $j$. In this notation, there are exclusion restrictions if we
have $\beta_{jk} = 0$ for some element $k$ of $X_i$ but $\beta_{j'k} \neq 0$ for some $j' \neq j$.
For example, if $X_{ik}$ is an attribute of alternative 1, a common exclusion restriction would be
$\beta_{jk} = 0$ for $j \neq 1$. Such exclusion restrictions arise naturally in problems of
transportation-mode choice (the area where MNP has usually been applied), because mode attributes
are readily available in data. Then, it is natural to assume that attributes of mode $j$ (such as
price or travel time) affect only the utility generated by mode $j$ and do not affect the
utilities generated by other modes. In many economic applications, however (e.g., choice of
industry or occupation), the data available to the econometrician will typically include only
attributes of survey respondents and not of respondents' alternative choices.

The key feature separating probit from other choice models (i.e., logit or conditional logit) is
the assumption that the stochastic terms $(\varepsilon_{1i}, \varepsilon_{2i}, \varepsilon_{3i})$
have a multivariate normal distribution with covariance matrix

$$
\operatorname{COV}\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix}
= \begin{pmatrix} \sigma_1^2 & & \\ \sigma_{12} & \sigma_2^2 & \\ \sigma_{13} & \sigma_{23} & \sigma_3^2 \end{pmatrix}. \tag{2}
$$

In this model, the probability of individual $i$ choosing alternative 1 is given by

$$
\mathrm{Pr}_{1i} = \int_{w=-\infty}^{(\bar U_1 - \bar U_2)/(\sigma_1^2 + \sigma_2^2 - 2\sigma_{12})^{1/2}}
\phi(w)\, \Phi\!\left[ \frac{\bar U_1 - \bar U_3}{[(\sigma_1^2 + \sigma_3^2 - 2\sigma_{13})(1 - r_1^2)]^{1/2}}
- \frac{w r_1}{(1 - r_1^2)^{1/2}} \right] dw, \tag{3}
$$

where $\bar U_j \equiv \alpha_j + \beta_j X_i$,
$r_1 \equiv (\sigma_1^2 - \sigma_{13} - \sigma_{12} + \sigma_{23}) / [(\sigma_1^2 + \sigma_2^2 - 2\sigma_{12})(\sigma_1^2 + \sigma_3^2 - 2\sigma_{13})]^{1/2}$,
and $\phi(\cdot)$ and $\Phi(\cdot)$ are the normal density and distribution function, respectively
(see Hausman and Wise 1978). Here $r_1$ is the correlation between $\varepsilon_1 - \varepsilon_2$
and $\varepsilon_1 - \varepsilon_3$. The probabilities of choosing alternatives 2 and 3 are
expressed symmetrically.

This unrestricted model is not identified for two obvious reasons. First, a proportional change in
all elements of the covariance matrix and of the $(\alpha_j, \beta_j)$ for $j = 1, 2, 3$ leaves all
probabilities unaffected. Second, addition of a constant to all of the $\alpha_j$ leaves all
probabilities unchanged, because choice depends only on utility differences.

Thus, as described by Bunch (1991), Dansie (1985), and Albright, Lerman, and Manski (1977),
identification may be achieved by normalizing the utility of one alternative to 0 and by
restricting one covariance matrix element. (Other normalizations are possible.) Setting $U_3 = 0$
and $\sigma_1 = 1$, we have the model

$$
\begin{aligned}
U_{1i} &= \alpha_1 + \beta_1 X_i + \varepsilon_{1i} \\
U_{2i} &= \alpha_2 + \beta_2 X_i + \varepsilon_{2i} \\
U_{3i} &= 0
\end{aligned} \tag{1$'$}
$$

with error covariance matrix

$$
\operatorname{COV}\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \end{pmatrix}
= \begin{pmatrix} 1 & \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}. \tag{2$'$}
$$

Then we have

$$
\mathrm{Pr}_{1i} = \int_{w=-\infty}^{(\bar U_1 - \bar U_2)/(1 + \sigma_2^2 - 2\sigma_{12})^{1/2}}
\phi(w)\, \Phi\!\left[ \frac{\bar U_1}{(1 - r_1^2)^{1/2}} - \frac{w r_1}{(1 - r_1^2)^{1/2}} \right] dw, \tag{3$'$}
$$

where $r_1 = (1 - \sigma_{12})/(1 + \sigma_2^2 - 2\sigma_{12})^{1/2}$, the correlation between
$\varepsilon_1 - \varepsilon_2$ and $\varepsilon_1$. Again, $\mathrm{Pr}_{2i}$ and
$\mathrm{Pr}_{3i}$ are expressed symmetrically.

Note that here, for any particular $X$ vector, we can always find alternative values for
$(\alpha_j, \beta_j)$, $j = 1, 2$, and $(\sigma_{12}, \sigma_2)$ that give the same values for the
choice probabilities. We cannot, however, find alternative values for these parameters that give
the same values for the choice probabilities for all $X$. Thus, as was pointed out by Heckman and
Sedlacek (1985), the TNP model is identified so long as $X$ contains a single regressor that varies
over individuals. No exclusion restrictions are required for formal identification. Unfortunately,
even the proper conditions for formal identification of MNP models appear to be not widely known.
Bunch and Kitamura (1991) pointed out that nearly half of the existing applications of MNP have
used formally nonidentified models.

2. IDENTIFICATION PROBLEMS IN THE TRINOMIAL PROBIT MODEL

This section presents evidence that identification in the TNP model of Equations (1') and (2') is
extremely fragile in the absence of exclusion restrictions. This evidence is from both Monte Carlo
and actual data. Consider first the evidence from Monte Carlo data. A data set of size 8,000 was
constructed by drawing values for
a single regressor $X_i$ ($i = 1, \ldots, 8{,}000$) from the $N(6, 5)$ distribution, and by drawing
$(\varepsilon_{1i}, \varepsilon_{2i})$ for $i = 1, \ldots, 8{,}000$ with covariance matrix given
by (2').

TNP model estimates were obtained by ML using the algorithm of Berndt, Hall, Hall, and Hausman
(1974), which will be referred to as the BHHH algorithm. Let $\theta$ denote the vector of all
parameters of the model, and let $\theta_k$ denote the estimate of $\theta$ at iteration $k$. The
log-likelihood function $L$ for the TNP model is given by

$$
L(\theta_k) = \sum_{i=1}^{N} \sum_{j=1}^{3} d_{ij} \ln \mathrm{Pr}_{ji}(\theta_k), \tag{4}
$$

where $d_{ij}$ is an indicator equal to 1 if $i$ chooses alternative $j$ and 0 otherwise. The TNP
choice probabilities $\mathrm{Pr}_{ji}(\theta_k)$ were evaluated using 100-term tetrachoric
expansions. A modified Newton-Raphson step is given by

$$
\theta_{k+1} = \theta_k - \lambda_k H^{-1} \left. \frac{\partial L(\theta)}{\partial \theta} \right|_{\theta_k}, \tag{5}
$$

where

$$
H = \left. \frac{\partial^2 L(\theta)}{\partial \theta \, \partial \theta'} \right|_{\theta_k}
$$

is the Hessian evaluated at $\theta_k$ and $\lambda_k$ is a step size that is 1 on the first step
of an iteration but that is reduced if a unit step does not improve the log-likelihood function.
Of course, numerical evaluation of the Hessian is quite computationally expensive in this model.
In the BHHH algorithm, the fact that the expected value of the Hessian is (at the optimum) equal to
the negative of the expectation of the outer product of the gradient vectors is invoked to justify
approximation of the Hessian by

$$
H \approx - \sum_{i=1}^{N} \left. \frac{\partial L_i(\theta)}{\partial \theta}
\frac{\partial L_i(\theta)}{\partial \theta'} \right|_{\theta_k},
$$

where $L_i$ denotes the contribution of observation $i$ to $L$. In this article, this
approximation to the Hessian was used to obtain steps and in covariance matrix calculations. When
using Monte Carlo data, the outer product of the gradients was evaluated at the true $\theta$
vector. Since in the Monte Carlo data we know the true model and since the sample size of 8,000 is
rather large, this approximation to the Hessian should be a good one.

The first set of results using Monte Carlo data is presented in Table 1. The true values of the
parameters are reported in the column headed "True value" [here $\rho$ is defined as
$\operatorname{corr}(\varepsilon_1, \varepsilon_2)$]. The columns headed (1)-(4) report estimates
obtained with $\rho$ and $\sigma_2$ pegged at various values. The $\chi^2(2)$ tests are for the
null that the hypothesized constraints are valid [note that the optimized value of the
log-likelihood for the unrestricted model is -7,953.57 (Table 2, col. (1))]. The true values of
$\rho$ and $\sigma_2$ used in constructing the data were .60 and 1.50, respectively.

In column (2), $\rho$ is restricted to 0 (whereas $\sigma_2$ is restricted to the true value of
1.5). The deterioration of the log-likelihood resulting from this restriction is only .57 of a
point. Similarly, in column (3), $\sigma_2$ is restricted to 1.00 whereas $\rho$ is pegged at the
true value of .60. The deterioration of the log-likelihood resulting from this restriction is only
one point. In column (4), $\rho$ is restricted to 0, and $\sigma_2$ is restricted to 1.00. The
deterioration in the log-likelihood is 2.25 points, giving a $\chi^2(2)$ value of 4.5 that is not
significant at the 10% level (critical value = 4.605).

Table 1. Trinomial Probit Model: ML Estimates With $\rho$ and $\sigma_2$ Restricted

Parameter       True value  (1)         (2)         (3)         (4)
$\rho$          .60         .60         .00         .60         .00
$\sigma_2$      1.50        1.50        1.50        1.00        1.00
$\alpha_1$      -.80        -.7593      -.7411      -.7148      -.6875*
                            (.0503)     (.0494)     (.0476)     (.0471)
$\beta_1$       .20         .1943       .1260**     .1633**     .1029**
                            (.0093)     (.0089)     (.0086)     (.0083)
$\alpha_2$      -2.00       -1.9234     -2.2000*    -1.2490**   -1.4811**
                            (.0734)     (.0781)     (.0524)     (.0545)
$\beta_2$       .40         .3933       .3927       .2782**     .2759**
                            (.0124)     (.0129)     (.0091)     (.0092)
Log-likelihood              -7,953.57   -7,954.14   -7,954.57   -7,955.82
$\chi^2(2)$                 .00         1.14        2.00        4.50

NOTE: Standard errors are in parentheses. [...] 1% level. An asterisk indicates the 5% level.
[...] value is 4.605). Sample size is 8,000.

It is apparent that restrictions on $\rho$ and $\sigma_2$ produce some slight deterioration in fit
of the TNP model without exclusion restrictions, as they must because these parameters are
formally identified. The improvement in fit obtained by introducing these parameters is so minor,
however, as to render their identification in practice problematic. The problem is illustrated in
Table 2. Here TNP models were estimated with $\rho$ and $\sigma_2$ free. The starting values used
for the runs in columns (1)-(4) of Table 2 are, respectively, the estimates in columns (1)-(4) of
Table 1. Classic symptoms of fragile identification were found in all runs in Table 2. First, the
Hessian was so close to singular that, to obtain a sensible direction vector for use in BHHH
iterations, the Marquardt (1963) procedure of adding positive diagonal elements to the
(approximate) Hessian had to be applied (note, however, that this was not done when calculating
the covariance matrices). Second, despite using the Marquardt procedure, it was extremely
difficult for the algorithm to find improving steps, and in all four instances the parameters did
not move far from the starting values. Third, the estimated standard errors of the $\rho$ and
$\sigma_2$ estimates are very large, and the standard errors of all other parameters increase
dramatically when $\rho$ and $\sigma_2$ are unrestricted. For example, compare column (1) of Table
2, in which all estimates are close to the true values, with column (1) of Table 1, in which $\rho$
and $\sigma_2$ are fixed at their true values. In column (1), the estimated standard error of
$\rho$ is 1.6274, so all points in the -1 to 1 range are within one standard error of the
estimate. The standard error of $\sigma_2$ is 2.4405, so $\sigma_2$ is only .62 of a standard
error above 0. Furthermore, freeing $\rho$ and $\sigma_2$ to be estimated increases the standard
errors on the regressor coefficients by up to 6,000% (the standard error on $\alpha_1$ increases
from .05 to .18, that on $\beta_1$ from .009 to .107, that on $\alpha_2$ from .07 to 4.42, and that
on $\beta_2$ from .012 to .58).

Table 2. Trinomial Probit Model: ML Estimates With $\rho$ and $\sigma_2$ Unrestricted

Parameter       True value  (1)         (2)         (3)         (4)
$\rho$          .60         .60007      .0228       .6121       .0176
                            (1.6274)    (1.0462)    (1.3209)    (1.1423)
$\sigma_2$      1.50        1.5065      1.5106      1.0133      1.0122
                            (2.4405)    (1.5031)    (.6053)     (.7544)
$\alpha_1$      -.80        -.7590      -.7435      -.7161      -.6902
                            (.1797)     (.0701)     (.0966)     (.0711)
$\beta_1$       .20         .1945       .1288       .1658       .1052
                            (.1073)     (.0766)     (.1036)     (.0747)
$\alpha_2$      -2.00       -1.9303     -2.2069     -1.2582     -1.4932
                            (4.4203)    (2.3958)    (1.5697)    (1.3220)
$\beta_2$       .40         .3944       .3954       .2810       .2790
                            (.5830)     (.3169)     (.1719)     (.1609)
Log-likelihood              -7,953.57   -7,954.05   -7,954.34   -7,955.49

NOTE: Standard errors are in parentheses. Sample size is 8,000.

These results are not isolated to this particular data set but were consistently present in many
experiments (and also in actual data, as we shall see). The source of the fragility of
identification in the TNP model is that movements in the regressor coefficients can effectively
mimic the effects of changes in the covariance parameters. Thus, restricting the covariance
parameters has little effect on the fit of the model, and the values of the covariance parameters
are difficult to disentangle from the values of the regressor coefficients.

This is illustrated in Figure 1, in which the true $U_1$ and $U_2$ are plotted against those
estimated in Table 1, column (2). Here the true $\rho$ is .60, but $\rho$ is fixed at 0. The effect
of an increase in $\rho$ is twofold: (1) It increases the probability of alternative 2 when
$U_2 > U_1$ and reduces the probability of alternative 2 when $U_2 < U_1$, and (2) since both
$\varepsilon_1$ and $\varepsilon_2$ must be less than certain thresholds to generate choice 3, it
increases the probability of alternative 3. To see why the first effect is present, observe that
as $\rho \to 1$ we approach a deterministic rule whereby the alternative that gives the largest $U$
is always chosen. Concerning the second effect, observe that as $\rho \to 1$, $\varepsilon_1$ and
$\varepsilon_2$ collapse into a single random variable, so the probability of [...] less than a
certain threshold increases. [...] [in]correctly fixed at 0, both the $U_1$ and [...] mimic these
two effects of an increase [...] Figure 1, we see that the $U_1$ line flattens, [...] [in]creasing
the distance between $U_1$ and [...] distance to the right or left of the [...] increases the
probability of alternative [...] (and vice versa), thus mimicking effect [...] we see that both
the $U_1$ and $U_2$ lines [...] This increases the probability of alternative [...] [mim]icking
effect 2.

In Figure 2, the true $U_1$ and $U_2$ a[re plotted against] those estimated in Table 1, column
[(3). Here the true] $\sigma_2$ is 1.5, but $\sigma_2$ is fixed at 1.0. The effect [...] in
$\sigma_2$ is twofold: (1) It reduces the p[robability of alter]native 2 when $U_2 > U_1$ and
increases [...] alternative 2 when $U_2 < U_1$, and (2) [...] probability of alternative 3 when
$U$[...] the probability of alternative 3 when [...] 2, observe that the $U_2$ line flattens [...]
line, thus reducing the distance between [...] any given distance to the right or left [...]
point. This reduces the probability of [...] $U_2 > U_1$ (and vice versa), thus mimicking [...]
pronounced flattening of the $U_2$ [...] effect 2.

Note that changes in $\rho$ and $\sigma_2$ both a[...] probabilities of alternatives 1 and 2 to
[...] of the $U_1(X) = U_2(X)$ point. Increases [...] the probability that alternative $j$ [...]
$U_j(X) > U_{j'}(X)$. Increases in $\sigma_2$ reduce [...] that alternative 2 will be chosen
w[...] These effects can be mimicked by changes [...] slopes of the $U_1(X)$ and $U_2(X)$ lines
t[...] or reduce the distance $|U_1(X) - U_2(X)|$ [...] distance to the right or left of the [...]
(i.e., for all $X$). This suggests that d[...] $\rho$, $\sigma_2$, and the regressors may be
dise[ntangled ...] [intro]ducing exclusion restrictions. Then [...] $U_1(X) = U_2(X)$ points and
rotations [...] $U_2(X)$ lines that either increase or r[educe ...] $|U_1(X) - U_2(X)|$ for all
$X$ are impo[ssible].

[Figure 1 plots $U_1$ and $U_2$ against $X_i$. Legible line labels: $U_1 = -0.8 + 0.2 X_i$ and
$U_2 = -2.200 + 0.393 X_i$.]

Figure 1. Effect of Fixing $\rho$ at .0 When True $\rho$ = .6. Note that the solid lines are the
true $U_1$ and $U_2$ lines. The dashed lines are the $U_1$ and $U_2$ lines estimated with $\rho$
fixed at .0 when the true $\rho$ is .60 [see Table 1, col. (2)].

A TNP model with exclusion restrictions is estimated in Table 3. This model has the form

$$
\begin{aligned}
U_{1i} &= \alpha_1 + \beta_{11} X_{1i} + \beta_{12} X_{2i} \\
U_{2i} &= \alpha_2 + \beta_{21} X_{1i} + \beta_{23} X_{3i},
\end{aligned} \tag{6}
$$

where $X_{1i}$ is constructed as before, but $X_{2i}$ and $X_{3i}$ are dummy variables equal to 1
with probability .50. Thus there are four $U_1(X_{1i}) = U_2(X_{1i})$ points corresponding to the
four possible values of $(X_{2i}, X_{3i})$, and any pivoting to increase or reduce the distance
$|U_1(X_{1i}) - U_2(X_{1i})|$ for all $X_{1i}$ is impossible.

The true values of the parameters of the model are reported in the column headed "True value." All
values are the same as before except that the new parameters $\beta_{12}$ and $\beta_{23}$ equal
-.60. The unrestricted model results are reported in column (4). In contrast to the results in
Table 2, all of the parameters of the model are estimated with precision. Columns (1)-(3) report
estimates obtained with $\rho$ and $\sigma_2$ restricted as in Table 1, columns (2)-(4). Here, all
of the restrictions are overwhelmingly rejected. For example, the $\chi^2(2)$ statistic for the
restriction $\rho = .0$, $\sigma_2 = 1.0$ is 44.78, compared to a 1% critical value of 9.21.
Clearly, the problem of fragile identification in the TNP model is solved by introducing exclusion
restrictions.

Experimentation with a wide range of specifications revealed that in MNP models it is necessary to
have one exclusion from each utility index to avoid identification problems. Simply introducing
additional regressors, without introducing exclusion restrictions, does not solve the problem.
This is illustrated by the results in column (5) of Table 3, where estimates were obtained with the
$\beta_{13} = \beta_{22} = 0$ restrictions removed [the results from col. (4) were used as starting
values]. The symptoms of fragile identification, including increases in the standard errors of up
to 700%, are again apparent here.

It is also apparent that the identification problems found in the preceding Monte Carlo data do
not result from the particular distributional assumptions on the $X_i$. This is best seen by
considering a TNP model estimated on actual data. In Table 4, I report results of estimating a
model of industry choice on data from the National Longitudinal Survey of Young Men (NLS). This
application is motivated by the fact that the only application of MNP in labor economics is that
of Heckman and Sedlacek (1985), who considered industry choice in Current Population Survey data.
As in the work of Heckman and Sedlacek, industries are grouped into three alternatives:
manufacturing (M), nonmanufacturing (NM), and unemployment. The NLS sample used
here has 11,886 person-year observations from 1966-1981. For a complete description of the data,
see Keane, Moffitt, and Runkle (1988). The person-specific regressors are the national
unemployment rate (U-RATE), a time trend (TREND), years of education (EDUC), labor market
experience and its square (EXPER, EXPER2), a dummy to indicate if respondent is white (WHITE), a
dummy to indicate if respondent is married (WIFE), and the number of child dependents of the
respondent (KIDS).

[Figure 2 plots $U_1$ and $U_2$ against $X_i$. Legible line labels: $U_2 = -1.249 + 0.278 X_i$
and $U_2 = -2.0 + 0.4 X_i$; the $U_1$ labels are truncated.]

Figure 2. Effect of Fixing $\sigma_2$ at 1.0 When True $\sigma_2$ = 1.5. Note that the solid lines
are the true $U_1$ and $U_2$ lines. The dashed lines are the $U_1$ and $U_2$ lines estimated with
$\sigma_2$ fixed at 1.0 when the true $\sigma_2$ is 1.5 [see Table 1, col. (3)].

Following Heckman and Sedlacek (1985), an alternative-specific variable, sector-specific nonlabor
income (NLINC), is also included in the model. The rationale for this variable is that it captures
differences by sector in unemployment compensation and other benefits. Since nonlabor income is
only observed for the chosen sector, instruments of the type used by Heckman and Sedlacek are
constructed as follows: Nonlabor income for all workers in a particular sector is regressed on a
set of instruments. Instruments are respondent's age, years of education, a Standard Metropolitan
Statistical Area (SMSA) resident dummy, a South regional dummy, local labor-force size, TREND,
EXPER, EXPER2, WHITE, WIFE, KIDS, and the interactions of age with the SMSA and South dummies and
of the SMSA dummy with the South dummy. Fitted values of sector-specific nonlabor income are then
constructed for each sector for each worker. NLINC for a particular sector only enters the utility
index for that sector, so the estimated model has exclusion restrictions. The other restrictions
are that the utility index for the unemployment alternative is normalized to 0 and the
manufacturing-utility-index error is normalized to have unit variance. $\rho$ is the correlation
of the manufacturing and nonmanufacturing utility function errors, and $\sigma_2$ is the standard
deviation of the nonmanufacturing error.

The first column of Table 4 reports results obtained with $\rho$ pegged at 0 and $\sigma_2$ pegged
at 1. This model gives a log-likelihood value of -10,300.710. Unfortunately, the coefficient on
nonlabor income in nonmanufacturing is estimated to be negative. The column (2) estimates are
obtained with $\rho$ and $\sigma_2$ free, using the column (1) estimates as starting values. As
with the Monte Carlo data, this again resulted in a close-to-singular (approximate) Hessian, so
that the Marquardt procedure again had to be used to obtain a reasonable step matrix for the BHHH
algorithm. (Again, however, the procedure
was not used when calculating covariance [...] of the estimates remain very close to th[...]
values, and the standard errors for several [...] $\rho$ and $\sigma_2$) are very large. The
log-likelihood [...]

Table 3. Trinomial Probit Model With Exclusion Restrictions: ML Estimates

Parameter       True value  (1)         (2)         (3)         (4)         (5)
$\rho$          .60         .00         .60         .00         .6368       .7190
                                                                (.0786)     (.2846)
$\sigma_2$      1.50        1.50        1.00        1.00        1.5215      2.5786
                                                                (.2099)     (1.4086)
$\alpha_1$      -.80        -.7336      -.7649      -.6811      -.8110      -.7961
                            (.0516)     (.0489)     (.0497)     (.0522)     (.0839)
$\beta_{11}$    .20         .1338**     .1673**     .1125**     .2023       .2295
                            (.0088)     (.0083)     (.0083)     (.0151)     (.0234)
$\beta_{12}$    -.60        -.6501      -.5063**    -.6336      -.5324      -.4786
                            (.0350)     (.0269)     (.0339)     (.0451)     (.1284)
$\beta_{13}$    0                                                           -.0855
                                                                            (.1039)
$\alpha_2$      -2.00       -2.1767*    -1.3103**   -1.4698     -2.0146     -3.4868
                            (.0812)     (.0534)     (.0564)     (.2903)     (2.2865)
$\beta_{21}$    .40         .4041       .2832**     .2814**     .4086       .6431
                            (.0128)     (.0088)     (.0091)     (.0492)     (.3343)
$\beta_{22}$    0                                                           .2376
                                                                            (.4283)
$\beta_{23}$    -.60        -.7235**    -.4120**    -.4848      -.5984      -1.0880
                            (.0475)     (.0262)     (.0382)     (.0947)     (.7287)
Log-likelihood              -7,675.53   -7,661.55   -7,678.72   -7,656.33   -7,655.17
$\chi^2(2)$                 38.40**     10.40**     44.78**                 2.32

NOTE: Standard errors are in parentheses. Double asterisks [...] 1% level. An asterisk indicates
the 5% level. In columns (1)-(3) [...] valid. In column (5), the $\chi^2(2)$ statistic is for the
null that [...]

Table 4. Trinomial Probit Model of In[dustry ...]

                (1)                     (2)                     (3)
Parameter       M           NM          M           NM          M           NM

Regressor coefficients
NLINC           .0077**     -.0471**    .0081**     -.0473**    .0029       -.0367
                (.0033)     (.0061)     (.0036)     (.0229)     (.0028)     (.0255)
U-RATE          -.0760*     -.0484**    -.0752**    -.0484**    -.0978**    -.0804**
                (.0145)     (.0135)     (.0176)     (.0179)     (.0202)     (.0222)
TREND           -.0231**    .0462**     -.0233**    .0461*      -.0116      .0370
                (.0070)     (.0089)     (.0078)     (.0270)     (.0078)     (.0314)
EDUC            .0121*      .1083**     .0138       .1106**     .0328**     .1024*
                (.0071)     (.0074)     (.0140)     (.0496)     (.0152)     (.0552)
EXPER           .0252**     -.0290**    .0269**     -.0277      .0142       -.0229
                (.0109)     (.0110)     (.0111)     (.0205)     (.0115)     (.0249)
EXPER2          -.0017**    .0005       -.0018**    .0005       -.0015**    -.0000
                (.0005)     (.0005)     (.0005)     (.0007)     (.0005)     (.0008)
WHITE           .1047**     .0865*      .1117**     .0976*      .1455**     .1356*
                (.0495)     (.0479)     (.0565)     (.0567)     (.0585)     (.0695)
WIFE            .4711**     .9468**     .4782**     .9599**     .5080**     .9110**
                (.0390)     (.0922)     (.0639)     (.3610)     (.0678)     (.3911)
KIDS            .1164**     -.1777**    .1174**     -.1798*     .0984**     -.1092
                (.0225)     (.0321)     (.0276)     (.1053)     (.0259)     (.1184)
CONSTANT        -.0585      -.1268      -.0741      -.1346      .4552       .3131
                (.1434)     (.1156)     (.1730)     (.3379)     (.1814)     (.3499)

Covariance matrix
$\rho$          .0000                   .0315                   .6419*
                                        (.4093)                 (.3682)
$\sigma_2$      1.0000                  1.0506**                1.1596**
                                        (.5087)                 (.5843)
Log-likelihood  -10,300.710             -10,299.700             -10,299.645

NOTE: M denotes manufacturing utility index coefficients. NM denotes nonmanufacturing. Standard
errors are in parentheses. Double asterisks indicate significance at the 5% level. An asterisk
indicates significance at the 10% level. The estimates in column (1) were obtained by fixing
$\rho$ and $\sigma_2$ at 0 and 1, respectively. The estimates in column (2) were obtained using the
column (1) estimates as starting values. The estimates in column (3) were obtained using as
starting values the estimates from a model that included only CONSTANT, U-RATE, TREND, and NLINC
as regressors and that held $\rho$ and $\sigma_2$ fixed at 0 and 1, respectively.
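
The likelihood-ratio comparisons behind these tables are simple to reproduce. The sketch below is
an illustration rather than part of the article's code: it relies on the fact that a chi-square
variable with 2 degrees of freedom has survival function $\exp(-x/2)$, so the p-value and critical
values need no statistical library. The helper names `lr_test_2df` and `chi2_2df_critical` are
hypothetical; the log-likelihood values are those reported in Tables 1, 2, and 4.

```python
import math

def lr_test_2df(loglik_unrestricted, loglik_restricted):
    """Return (LR statistic, p-value) for a likelihood-ratio test of 2 restrictions."""
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    p_value = math.exp(-lr / 2.0)  # chi-square(2) survival function
    return lr, p_value

def chi2_2df_critical(alpha):
    """Critical value c with P(chi-square(2) > c) = alpha."""
    return -2.0 * math.log(alpha)

# Monte Carlo data: unrestricted optimum (Table 2, col. 1) vs. rho = 0, sigma_2 = 1 (Table 1, col. 4).
lr_mc, p_mc = lr_test_2df(-7953.57, -7955.82)

# NLS data: rho and sigma_2 free (Table 4, col. 2) vs. rho = 0, sigma_2 = 1 (Table 4, col. 1).
lr_nls, p_nls = lr_test_2df(-10299.700, -10300.710)

print(f"Monte Carlo: LR = {lr_mc:.2f}, p = {p_mc:.3f}")   # LR = 4.50, p = 0.105
print(f"NLS:         LR = {lr_nls:.2f}, p = {p_nls:.3f}")  # LR = 2.02, p = 0.364
print(f"10% and 1% critical values: {chi2_2df_critical(0.10):.3f}, {chi2_2df_critical(0.01):.3f}")
```

In both data sets the two covariance restrictions cost about one to two log-likelihood points, so
neither is rejected at the 10% level (critical value 4.605).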

constants, U-RATE, TREND, and NLINC and with ρ and σ₂ pegged at 0 and 1, respectively. (The Marquardt procedure was again used to obtain a step matrix.) This produced a value of ρ very far from 0 (the estimate is .6419) and several coefficient estimates far from the column (1) and column (2) values. The log-likelihood function value is -10,299.645, however, which is not significantly different from the column (1) and column (2) values, and the standard errors for a number of parameters are again quite large. Thus the parameters ρ and σ₂ do not seem to be well identified despite the inclusion of sector-specific regressors. The reason that identification problems remain is apparently that NLINC does not have sufficient correlation with sectoral choices in these data. In Heckman and Sedlacek's (1985) data, the sector-specific nonlabor income variable was very highly significant in both the manufacturing and nonmanufacturing utility indexes.

3. CONCLUSION

This article has illustrated, via Monte Carlo tests and an application to actual data, that identification of multinomial probit models in the absence of exclusion restrictions is extremely fragile, despite the fact that exclusion restrictions are not necessary for these models to be formally identified. A straightforward geometric intuition suggests that this fragility arises because it is difficult to disentangle covariance parameters from regressor coefficients in such models.

The reason that the fragility of identification in MNP models without exclusion restrictions has not been previously noted is that these models have rarely been applied, and almost all existing applications are to transportation-mode choice. There, exclusion restrictions arise naturally because there are mode-specific attributes that only affect the utility derived from choosing the specific mode. In other applications, such as in labor economics for example, exclusion restrictions do not arise naturally. Microeconomic data sets usually contain only attributes of survey respondents themselves. If one attempted to model, say, choice of industry or occupation, attributes of the alternatives would not usually be available in the data. This renders the application of MNP to such choice problems problematic. Because of the recent advent of practical simulation estimators for the MNP model, many more empirical researchers are likely to attempt applications of the model to many previously untried problems, in many of which exclusion restrictions do not naturally arise. Thus it is important that the practical limitations of MNP be realized.

ACKNOWLEDGMENTS

I thank the Alfred P. Sloan Foundation for its support and the two anonymous referees for their comments. Any views expressed here are mine and not necessarily those of the Federal Reserve Bank of Minneapolis or the Federal Reserve System.

[Received August 1990. Revised August 1991.]

REFERENCES

Albright, R., Lerman, S., and Manski, C. (1977), "Report on the Development of an Estimation Program for the Multinomial Probit Model," report prepared by Cambridge Systematics for the Federal Highway Administration.

Ben-Akiva, M., and Bolduc, D. (1989), "Multinomial Probit With Taste Variation and Generalized Autoregressive Alternatives," paper presented at the 69th Annual Meeting, Transportation Research Board, Washington.

Berndt, E., Hall, R., and Hausman, J. (1974), "Estimation and Inference in Nonlinear Structural Models," Annals of Economic and Social Measurement, 3, 653-666.

Bolduc, D. (1991), "Generalized Autoregressive Alternatives in the Multinomial Probit Model," unpublished manuscript, Université Laval, Département d'Économique. Submitted to Transportation Research B.

Bunch, D. S. (1991), "Estimability in the Multinomial Probit Model," Transportation Research B, 25, 1-12.

Bunch, D. S., and Kitamura, R. (1991), "Multinomial Probit Estimation Revisited: Testing Estimable Model Specifications, Maximum Likelihood Algorithms, and Probit Integral Approximations for Trinomial Models of Household Car Ownership," working paper, University of California, Davis, Graduate School of Management.

Dansie, B. (1985), "Parameter Estimability in the Multinomial Probit Model," Transportation Research B, 19, 526-528.

Elrod, T., and Keane, M. (1991), "Modelling Heterogeneity in Consumer Choice Behavior," unpublished manuscript, University of Alberta, Faculty of Business.

Hausman, J., and Wise, D. (1978), "A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences," Econometrica, 46, 403-426.

Heckman, J., and Sedlacek, G. (1985), "Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market," Journal of Political Economy, 93, 1077-1125.

Horowitz, J. L., Sparmann, J. M., and Daganzo, C. F. (1982), "An Investigation of the Clark Approximation of the Multinomial Probit Model," Transportation Science, 16, 382-401.

Keane, M., Moffitt, R., and Runkle, D. (1988), "Real Wages Over the Business Cycle: Estimating the Impact of Heterogeneity With Micro Data," Journal of Political Economy, 96, 1232-1266.

Marquardt, D. (1963), "An Algorithm for the Estimation of Non-Linear Parameters," Society for Industrial and Applied Mathematics Journal, 11, 431-441.

McFadden, D. (1989), "A Method of Simulated Moments for Estimation of Discrete Response Models Without Numerical Integration," Econometrica, 57, 995-1026.

Pakes, A., and Pollard, D. (1989), "Simulation and the Asymptotics of Optimization Estimators," Econometrica, 57, 1027-1058.
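The statement that a log-likelihood of -10,299.645 is "not significantly different" from the values at quite different parameter estimates can be made concrete with a small numerical check. The following is a minimal sketch, not the procedure used in the article: it assumes that ρ and σ₂ are treated as the two parameters of interest, so that twice the gap in log-likelihood between two parameter points is compared with the 95% point of a chi-square distribution with 2 degrees of freedom. The names `compare_to_best` and `chi2_df2_upper_tail` are hypothetical, and the two entries labeled "hypothetical" are placeholder values chosen only for illustration; they are not the column (1) or column (2) log-likelihoods.

```python
import math

# Exact 95% point of a chi-square distribution with 2 degrees of freedom.
# For 2 df the upper-tail probability is exp(-x / 2), so the cutoff is -2 * ln(0.05).
CHI2_DF2_95 = -2.0 * math.log(0.05)


def chi2_df2_upper_tail(x):
    """Upper-tail probability of a chi-square(2) variable (closed form for 2 df only)."""
    return math.exp(-0.5 * x) if x > 0.0 else 1.0


def compare_to_best(loglik_by_label):
    """For each parameter point, return (2 * gap to the best log-likelihood,
    chi-square(2) tail probability, whether the gap is within the 95% cutoff)."""
    best = max(loglik_by_label.values())
    results = {}
    for label, loglik in loglik_by_label.items():
        gap_stat = 2.0 * (best - loglik)
        results[label] = (
            gap_stat,
            chi2_df2_upper_tail(gap_stat),
            gap_stat <= CHI2_DF2_95,
        )
    return results


logliks = {
    "column (3)": -10299.645,      # value reported in the text
    "hypothetical A": -10300.500,  # placeholder, NOT from the article
    "hypothetical B": -10305.000,  # placeholder, NOT from the article
}

results = compare_to_best(logliks)
for label, (gap_stat, tail, within) in results.items():
    print(f"{label}: 2*gap = {gap_stat:.3f}, tail prob = {tail:.4f}, within cutoff = {within}")
```

Under these assumptions, any parameter point whose log-likelihood lies within about 3.0 of the best value falls inside the cutoff, however far its ρ and σ₂ are from the best-fitting ones. This is one way to express a likelihood that is nearly flat in the covariance parameters.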
