Some Probabilistic Models of Best, Worst, and Best Worst Choices-Marley
Abstract
Over the past decade or so, a choice design in which a person is asked to select both the best and the worst option in an available
set of options has been gaining favor over more traditional designs, such as where the person is asked, for instance, to: select the best
option; select the worst option; rank the options; or rate the options. In this paper, we develop theoretical results for three
overlapping classes of probabilistic models for best, worst, and best–worst choices, with the models in each class proposing specific
ways in which such choices might be related. The models in these three classes are called random ranking and random utility, joint
and sequential, and ratio scale. We include some models that belong to more than one class, with the best known being the
maximum-difference (maxdiff) model, summarize estimation issues related to the models, and formulate a number of open
theoretical problems.
© 2005 Elsevier Inc. All rights reserved.
0022-2496/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.jmp.2005.05.003
ARTICLE IN PRESS
2 A.A.J. Marley, J.J. Louviere / Journal of Mathematical Psychology ] (]]]]) ]]]–]]]
without clear guidelines on appropriate experimental designs, data analyses, and interpretation of results.

This paper develops theoretical results for three overlapping classes of probabilistic models for best, worst, and best–worst choices, with the models in each class proposing specific relationships between such choices. The model classes are called random ranking and random utility, joint and sequential, and ratio scale. We consider models that belong simultaneously to one or more of these classes, with the best known being the maximum-difference (maxdiff) model, which is introduced later in this section, and we formulate a number of open theoretical problems.

We now illustrate the framework, and the basic models, through the maximum-difference (maxdiff) model of best–worst choice. To do so we require some basic notation. Let $T$ with $|T| \ge 2$ denote the finite set of potentially available choice options, and for any subset $X \subseteq T$, with $|X| \ge 2$, let $B_X(x)$ denote the probability that the alternative $x$ is chosen as best in $X$, $W_X(y)$ the probability that the alternative $y$ is chosen as worst in $X$, and $BW_X(x,y)$ the probability that, jointly, the alternative $x$ is chosen as best in $X$ and the alternative $y \ne x$ is chosen as worst in $X$. Thus

$$0 \le B_X(x),\ W_X(y),\ BW_X(x,y) \le 1$$

and

$$\sum_{x \in X} B_X(x) = \sum_{y \in X} W_X(y) = \sum_{\substack{x,y \in X \\ x \ne y}} BW_X(x,y) = 1.$$

We assume throughout the paper that for each $x \in T$, $B_{\{x\}}(x) = W_{\{x\}}(x) = 1$.

For motivational purposes only, we assume now that best and worst choices are in some sense more basic than simultaneous best–worst choices, and develop a model of the latter based on the former. So suppose that when asked to choose the best and the worst option in a (finite) set $X$, the person simultaneously, but independently, chooses the best, respectively, the worst, option in $X$. If the resulting options are distinct, these are reported as the best–worst pair of options in $X$; otherwise the person re-samples. As we show later in detail, such a process gives rise to the following representation for the best–worst choice probabilities in terms of the best and the worst choice probabilities:² for $x, y \in X$, $x \ne y$,

$$BW_X(x,y) = \frac{B_X(x)\,W_X(y)}{\sum_{\substack{r,s \in X \\ r \ne s}} B_X(r)\,W_X(s)}. \tag{1}$$

² We discuss later the theoretical and empirical reasonableness of this representation when $|X| = 2$.

Now, suppose that the multinomial logit (MNL or Luce's choice) model holds separately for the best and the worst choice probabilities, i.e., there exist ratio scales $b$ and $w$ such that for $x, y \in X$,

$$B_X(x) = \frac{b(x)}{\sum_{r \in X} b(r)}, \qquad W_X(y) = \frac{w(y)}{\sum_{s \in X} w(s)}. \tag{2}$$

Then direct substitution of (2) in (1) yields that for $x \ne y$,

$$BW_X(x,y) = \frac{b(x)\,w(y)}{\sum_{\substack{r,s \in X \\ r \ne s}} b(r)\,w(s)}. \tag{3}$$

Notice that this combined set of representations has three interesting properties: the best–worst choice probabilities are represented in (1) directly in terms of the best and worst choice probabilities, and are represented in (3) in terms of the ratio scale values that determine the best and the worst choice probabilities, with the same functional form in both representations. Thus, this aggregation method is plausible, and interesting, in that it works both at the level of the choice probabilities and at the level of the scale values. We are immediately led to the first theoretical question, namely, are there other (ratio scale) models that satisfy this type of aggregation property, and, if so, how large is this class. The detailed formulation of this problem requires extensive further notation, and its solution the use of complex functional equation techniques. The conjectured final result is that the class of solutions is large, but in an interesting sense the individual solutions do not differ greatly from the above model.

It is important to note that the above example was for choices from a given fixed set $X$. Often probabilistic models of choice (or, briefly, probabilistic choice models) are assumed to be consistent over all the subsets of a finite master set $T$. If that is assumed in the present context, and we consider the binary choice probabilities $B_{\{x,y\}}(x)$ and $W_{\{x,y\}}(y)$, then it is reasonable to assume that, and to test whether,

$$B_{\{x,y\}}(x) = W_{\{x,y\}}(y),$$

which with (2) gives that³ for each $z \in T$,

$$b(z) = \frac{1}{w(z)} \tag{4}$$

and so in particular any scale transform $\alpha$ of $b$ is linked to the scale transform $1/\alpha$ of $w$. In the following work, we consider separately the cases where $b$ and $w$ are independent ratio scales, and where they are subject to common scale transforms.

³ (2) gives actually that $b(z) = c/w(z)$ for some constant $c > 0$, but no generality is lost by assuming that $c = 1$.

The paper has the following structure. Each of the three main sections develops one of three classes of models, gives examples of that class, and presents solved, and open, theoretical problems about that class. Section 2 introduces the terminology and basic conditions, Section 3 presents random ranking and random utility models, Section 4 joint and sequential choice
models, and Section 5 ratio scale models. Section 6 summarizes the results and restates the main open theoretical problems.

2. General terminology and conditions

Given a finite master set $T$, and a particular set $X$, $X \subseteq T$, we refer to the set $\{B_X(x),\ x \in X\}$, $\{W_X(y),\ y \in X\}$, $\{BW_X(x,y);\ x, y \in X,\ x \ne y\}$, respectively, as a set of best, worst, best–worst choice probabilities (on $X$). We have a complete set of best, worst, best–worst choice probabilities, respectively, (on a master set $T$) when we have a set of best, worst, best–worst choice probabilities on each $X$, $X \subseteq T$. Unless stated otherwise, we assume that we have a complete set of best, worst, and best–worst choice probabilities on a finite master set $T$ with $|T| \ge 2$.

One can, of course, study distinct models for each of the three types of choice paradigm, that is, for best, worst, and best–worst. However, this creates a very large number of model types. For instance, as indicated above, in the paper we discuss three classes of models: random ranking and random utility, joint and sequential choice, and ratio scale. If the model for each type of choice paradigm (best, worst, best–worst) is assumed to belong independently to each of these classes, then we have already 27 different patterns of assumptions, and, in fact, there will be even more possible patterns because of model subtypes within each model class. Partly because of this proliferation of models, but also because it makes conceptual, and one would hope, empirical, sense, we will assume normally that the three types of choices satisfy a common type of model. We explore [...]

[...] attention on the task in a way that separate best and worst choices do not.

Random ranking and random utility models are the most commonly studied probabilistic choice models, and frequently models in the other classes have alternative descriptions as models in this class. Thus, we include the general terminology related to random ranking and random utility models in the present section.

First, we need the idea of a ranking of a set $X$ from its best (most preferred) to its worst (least preferred) element, and similarly of a ranking of $X$ from its worst (least preferred) to its best (most preferred) element. We refer to these as best to worst, and worst to best, rankings, respectively. Such rankings may be empirical, i.e., resulting from a person's judgments, but of at least equal importance in this paper is the role such rankings play in the development of probabilistic choice models. For any set $X$, $X \subseteq T$, let $R(X)$ denote the set of rank orders of $X$. Then, with $|X| = n$, and a given rank order $\rho = \rho_1 \rho_2 \ldots \rho_{n-1} \rho_n$ of $X$, let $B_X(\rho)$ denote the probability that $\rho$ occurs as a best to worst rank order, and $W_X(\rho)$ the probability that $\rho$ occurs as a worst to best rank order; thus, in the former case $\rho_1$ is the best element in the rank order, in the latter case it is the worst element.⁴ The assumption that $B_X(\rho)$ and $W_X(\rho)$ are probabilities and that a ranking occurs at each choice opportunity is summarized by: for each $\rho \in R(X)$,

$$0 \le B_X(\rho),\ W_X(\rho) \le 1$$

and

$$\sum_{\rho \in R(X)} B_X(\rho) = 1 = \sum_{\rho \in R(X)} W_X(\rho).$$
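The resampling representation (1) of Section 1 and its ratio scale form (3) can be compared numerically. The following is a minimal sketch, not the authors' code: the option labels and scale values are hypothetical, and `w` is set to `1/b` as in relation (4), which gives the maxdiff form in which $BW_X(x,y)$ is proportional to $b(x)/b(y)$.

```python
from itertools import permutations
from math import isclose

def luce(scale, subset):
    """Luce (MNL) choice probabilities on `subset`, as in (2)."""
    total = sum(scale[z] for z in subset)
    return {z: scale[z] / total for z in subset}

def bw_from_choice_probs(best, worst):
    """Representation (1): built from best and worst choice probabilities."""
    pairs = list(permutations(best, 2))          # ordered pairs (x, y), x != y
    norm = sum(best[r] * worst[s] for r, s in pairs)
    return {(x, y): best[x] * worst[y] / norm for x, y in pairs}

def bw_from_scales(b, w, subset):
    """Representation (3): built directly from the ratio scale values."""
    pairs = list(permutations(subset, 2))
    norm = sum(b[r] * w[s] for r, s in pairs)
    return {(x, y): b[x] * w[y] / norm for x, y in pairs}

# Hypothetical scale values for three options; w = 1/b follows relation (4).
b = {"a": 2.0, "b": 1.0, "c": 0.5}
w = {z: 1.0 / bz for z, bz in b.items()}
X = list(b)

bw1 = bw_from_choice_probs(luce(b, X), luce(w, X))
bw3 = bw_from_scales(b, w, X)
```

Under these assumptions the two dictionaries agree pair by pair, which is the sense in which the aggregation works at both the probability level and the scale level.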
Thus, if we know that a set of best–worst choice data satisfies a random ranking model, then we can test whether or not an additional set of best (respectively, worst) choice data is consistent with the same random ranking model by comparing the relevant margins of the best–worst data with the best (respectively, worst) choice data.

Thus, an important open theoretical question is what are necessary and sufficient conditions for a complete set of best–worst choice probabilities to satisfy a random ranking model. The following are necessary conditions:

(i) The marginal choice probabilities $\sum_{z \in X - \{x\}} BW_X(x,z)$ satisfy a best random ranking model, and the marginal choice probabilities $\sum_{z \in X - \{y\}} BW_X(z,y)$ satisfy a worst random ranking model (with common rank order probabilities).

(ii) The best–worst choice probabilities satisfy regularity: for $x, y \in X \subseteq Y \subseteq T$, $x \ne y$,

$$BW_X(x,y) \ge BW_Y(x,y). \tag{9}$$

(iii) For distinct $\{x, y, z\} = X \subseteq T$,

$$BW_{\{x,y\}}(x,y) = BW_X(x,y) + BW_X(x,z) + BW_X(z,y). \tag{10}$$

When $|T| = 3$, i.e., we have only a 3 element set and its 2 element subsets, the above condition, (10), is necessary and sufficient for the best–worst choice probabilities to satisfy a random ranking model. This is easily seen by setting, for each $\rho = \rho_1 \rho_2 \rho_3 \in R(T)$, $p_T(\rho) = BW_T(\rho_1, \rho_3)$. Then the best–worst choice probabilities on $T$ are compatible with the random ranking model, and the compatibility of the best–worst choices for the 2 element subsets follows by substituting the rank order probabilities associated with the terms in the right hand side of (10) with $X = T$ in those terms.

One would hope that an approach similar to that used to solve the parallel general case for best (respectively, worst) choice probabilities (see Falmagne, 1978; Barbera and Pattanaik, 1986; Fiorini, 2004) would give a general set of necessary and sufficient conditions in the present case, though we do not currently see how to do so. It is quite possible that the relevant conditions will have a similar structure to those in the earlier literature; for instance, the first such condition beyond regularity that occurs in the characterization of best choice probabilities yields a parallel necessary condition in the present situation, though it does not seem to have an obvious interpretation. The condition is: for a master set $T = \{x, y, z, w\}$,

$$BW_{\{x,y\}}(x,y) - BW_{\{x,y,z\}}(x,y) - BW_{\{x,y,w\}}(x,y) + BW_{\{x,y,z,w\}}(x,y) \ge 0,$$

which is easily checked to be a necessary condition by substituting in the relevant rank orders from (8) and checking that the above expression reduces to one involving only nonnegative (rank order) terms.

We have been assuming so far that the best–worst choice probabilities are generated by a random ranking model. We can consider also the possibility that the rankings are generated by a sequence of best–worst choices to yield the following representation: for $\rho \in R(X)$, $|X| = n$,

$$p_X(\rho) = \begin{cases}
BW_X(\rho_1,\rho_n), & n = 2, 3, \\
BW_X(\rho_1,\rho_n)\,BW_{X-\{\rho_1,\rho_n\}}(\rho_2,\rho_{n-1}), & n = 4, 5, \\
BW_X(\rho_1,\rho_n)\,BW_{X-\{\rho_1,\rho_n\}}(\rho_2,\rho_{n-1}) \cdots BW_{\{\rho_j,\rho_{j+1}\}}(\rho_j,\rho_{j+1}), & n = 2j,\ j \ge 3, \\
BW_X(\rho_1,\rho_n)\,BW_{X-\{\rho_1,\rho_n\}}(\rho_2,\rho_{n-1}) \cdots BW_{\{\rho_j,\rho_{j+1},\rho_{j+2}\}}(\rho_j,\rho_{j+2}), & n = 2j+1,\ j \ge 3.
\end{cases}$$

The distinction between the cases with an even versus an odd number of elements arises because in the latter case, with $2j+1$ elements, the rank position of the final element is determined after $j$ best–worst choices.

We have the open problem of determining whether there are any complete sets of best–worst choice probabilities on (all the subsets of) a set $T$ that satisfy a random ranking model with the above relation between the probabilities of the rank orders in that random ranking model and the complete set of best–worst choice probabilities on the subsets $X \subseteq T$. This question parallels a solved classic one regarding Luce's choice model (Luce and Suppes, 1965).

We now introduce a general class of random utility models for best, worst, and best–worst choice probabilities, specialize it in a way that produces representations formally equivalent to the earlier random ranking framework, and then study related random utility models that do not retain this equivalence. Throughout the paper, we implicitly assume that there is a zero probability of two distinct random variables being equal, which ensures that a single option is selected at each choice opportunity.

Definition 5. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a (best, worst, best–worst) random utility model provided there are random variables $B_z$, $W_z$, $BW_{r,s}$, $r, s, z \in T$, $r \ne s$, such that for each $x, y \in X \subseteq T$,

$$B_X(x) = \Pr\Big(B_x = \max_{z \in X} B_z\Big),$$

$$W_X(y) = \Pr\Big(W_y = \max_{z \in X} W_z\Big)$$

and

$$BW_X(x,y) = \Pr\Big(BW_{x,y} = \max_{\substack{r,s \in X \\ r \ne s}} BW_{r,s}\Big) \quad (x \ne y).$$
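Definition 5 can be made concrete with a small simulation. The sketch below is illustrative only: the normal distributions, the location parameters, the option labels, and the function name `simulate_random_utility` are all assumptions introduced here, not part of the paper. Each of the three families of random variables is drawn separately and the option (or pair) with the largest value is recorded.

```python
import random
from itertools import permutations

def simulate_random_utility(options, n_draws=20000, seed=0):
    """Estimate B_X, W_X, BW_X for X = options under a hypothetical
    instance of Definition 5 with three unrelated families of variables."""
    rng = random.Random(seed)
    pairs = list(permutations(options, 2))       # ordered pairs (r, s), r != s
    # Hypothetical location parameters; nothing ties the families together.
    loc_b = {z: rng.uniform(-1, 1) for z in options}
    loc_w = {z: rng.uniform(-1, 1) for z in options}
    loc_bw = {p: rng.uniform(-1, 1) for p in pairs}
    best = dict.fromkeys(options, 0)
    worst = dict.fromkeys(options, 0)
    best_worst = dict.fromkeys(pairs, 0)
    for _ in range(n_draws):
        b = {z: loc_b[z] + rng.gauss(0, 1) for z in options}
        w = {z: loc_w[z] + rng.gauss(0, 1) for z in options}
        bw = {p: loc_bw[p] + rng.gauss(0, 1) for p in pairs}
        best[max(b, key=b.get)] += 1
        worst[max(w, key=w.get)] += 1            # Definition 5 uses max for W_z too
        best_worst[max(bw, key=bw.get)] += 1
    def to_prob(counts):
        return {k: c / n_draws for k, c in counts.items()}
    return to_prob(best), to_prob(worst), to_prob(best_worst)

B, W, BW = simulate_random_utility(["x", "y", "z"])
```

Each estimated set sums to one by construction, while no relation between the three sets is imposed by the definition itself.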
Note that this definition does not imply any relations between the best, worst, and best–worst choice probabilities (other than that each set separately satisfies a random utility model). As we have argued earlier, such generality is unwarranted at this time, so we now look at specializations that involve assumed relations between the three sets of random variables.

One might think that we should require that, when $|X| = 2$,

$$BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y)$$

or, more generally, that for some $\alpha$, $0 \le \alpha \le 1$,

$$BW_{\{x,y\}}(x,y) = \alpha B_{\{x,y\}}(x) + (1 - \alpha) W_{\{x,y\}}(y).$$

Neither property holds generally for random utility models that satisfy Definition 5. It is a theoretical and empirical issue to decide the appropriate resolution of this apparent inconsistency. Shafir (1993) presents data, in an accept versus reject design, that can be interpreted as showing that $B_{\{x,y\}}(x)$ may differ from $W_{\{x,y\}}(y)$.

Definition 6. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a (best, worst, best–worst) consistent random utility model provided it satisfies a random utility model, Definition 5, with a common set of random variables $U_z$, $z \in T$, such that for $r, s \in T$, $r \ne s$,

$$B_z = -W_z = U_z$$

and

$$BW_{r,s} = U_r - U_s.$$

It is important to note that the random variable notation above is intended to mean that the best–worst choice probabilities are derived on the basis of a single common set of sample values $U_z$, $z \in T$. Combining the various assumptions, a consistent random utility model becomes

$$B_X(x) = \Pr\Big(U_x = \max_{z \in X} U_z\Big), \tag{11}$$

$$W_X(y) = \Pr\Big(U_y = \min_{z \in X} U_z\Big), \tag{12}$$

$$BW_X(x,y) = \Pr\big(U_x > U_z > U_y,\ z \in X - \{x,y\}\big) \quad (x \ne y). \tag{13}$$

We have the following equivalence:

Proposition 7. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a consistent random utility model, Definition 6, iff it satisfies a consistent random ranking model, Definition 3.

This result follows routinely, using classic methods for showing such equivalences (Luce and Suppes, 1965; Falmagne, 1978). Note that such models have consistent margins, Definition 1, and so in particular satisfy $BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y)$. This class of models has been studied extensively for best (equivalently, worst) choice probabilities, with its strengths and weaknesses quite well understood, particularly for Thurstone and Luce (MNL) models. We now present the random utility versions of these two (classes of) models, with extensions to best–worst choices.

Definition 8. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a consistent (best, worst, and best–worst) Thurstone random utility model iff it satisfies a consistent random utility model, Definition 6, such that there exist interval scale values $v_z$, $z \in T$, and independent⁵ random variables $\varepsilon_z$, $z \in T$, with

$$U_z = v_z + \varepsilon_z.$$

A consistent extreme value random utility model is a Thurstone random utility model where each $\varepsilon_z$, $z \in T$, has the extreme value distribution.⁶

⁵ We include independence as part of the definition as we only consider that case in this paper.

⁶ This means that $\Pr(\varepsilon_z \le t) = \exp(-e^{-t})$ $(-\infty < t < \infty)$.

Rewriting the random variables in a consistent extreme value random utility model in the form of Definition 6, we have

$$B_z = v_z + \varepsilon_z,$$
$$W_z = -v_z - \varepsilon_z,$$
$$BW_{r,s} = v_r - v_s + \varepsilon_r - \varepsilon_s \tag{14}$$

with $\varepsilon_z$, $z \in T$, having extreme value distributions. It is important to note again that all the random variable notation above is intended to mean that the best–worst choice probabilities are derived on the basis of a single common set of sample values $U_z$, $z \in T$, equivalently $\varepsilon_z$, $z \in T$.

It follows from standard results, based on (11), that, given a consistent extreme value random utility model, Definition 8, the best choice probabilities satisfy the Luce (MNL) choice model, with scale values $\exp v_z$ (McFadden, 1974). However, and this is important, neither the worst nor the best–worst choice probabilities given by that model will then, in general, satisfy the Luce (MNL) choice model when the scale values $v_z$, $z \in T$, are not all equal. Nonetheless, the relationship between the best and the worst choice probabilities in this model is well known, and has led to much fascinating research (Yellott, 1977, 1980, 1997). Here, we add a result about the representation of best–worst choice probabilities that follows from that earlier work,
but has not been noted previously as others have not studied best–worst choice.⁷

It is known (e.g., Critchlow et al., 1991) that if a complete set of best, worst and best–worst choice probabilities on a finite set $T$ satisfies a consistent extreme value random utility model, Definition 8, then for each $Y \subseteq T$, $|Y| = m$, and each $\rho = \rho_1 \rho_2 \ldots \rho_{m-1} \rho_m \in R(Y)$,

$$\Pr\big(U_{\rho_1} > U_{\rho_2} > \cdots > U_{\rho_m}\big) = B_Y(\rho_1)\,B_{Y-\{\rho_1\}}(\rho_2) \cdots B_{\{\rho_{m-1},\rho_m\}}(\rho_{m-1}) \tag{15}$$

and so for each $x, y \in T$, $x \ne y$,

$$BW_{\{x,y\}}(x,y) = \Pr(U_x > U_y) = B_{\{x,y\}}(x),$$

and, using (13) and (15), for $x, y \in X \subseteq T$, $|X| = n > 2$, $x \ne y$, and $\eta = \eta_2 \ldots \eta_{n-1} \in R(X - \{x,y\})$,

$$\begin{aligned}
BW_X(x,y) &= \Pr\big(U_x > U_z > U_y,\ z \in X - \{x,y\}\big) \\
&= \sum_{\eta \in R(X-\{x,y\})} B_X(x)\,B_{X-\{x\}}(\eta_2) \cdots B_{\{\eta_{n-1},y\}}(\eta_{n-1}) \\
&= B_X(x) \sum_{\eta \in R(X-\{x,y\})} B_{X-\{x\}}(\eta_2) \cdots B_{\{\eta_{n-1},y\}}(\eta_{n-1}) \\
&= B_X(x)\,\Pr\big(U_z > U_y,\ z \in X - \{x,y\}\big) \\
&= B_X(x)\,W_{X-\{x\}}(y),
\end{aligned}$$

i.e.,

$$BW_X(x,y) = B_X(x)\,W_{X-\{x\}}(y), \tag{16}$$

an example of a (mixed) sequential best–worst choice model (Section 4, Definition 13).

On the other hand, suppose that we have a complete set of best, worst and best–worst choice probabilities on a finite set $T$, $|T| > 2$, that satisfies a Thurstone random utility model, Definition 8, and also satisfies (16) for all $X \subseteq T$. Then, in particular, for every $u, v \in T$, $u \ne v$, $W_{\{u,v\}}(u) = B_{\{u,v\}}(v)$, and so for any $X \subseteq T$, $|X| = 3$, and [...]

[...] probabilities satisfy Luce's choice model on all subsets $X \subseteq T$ with $|X| \le 3$, then the best choice probabilities satisfy Luce's choice model for all subsets $X \subseteq T$ (Theorem 5, Yellott, 1977).

Combining all of the above results we have the following proposition.

Proposition 9. Assume that a complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a Thurstone random utility model. Then the following conditions are equivalent:

(i) The best choice probabilities satisfy Luce's choice model.
(ii) For all $x, y \in X \subseteq T$, $x \ne y$, $BW_X(x,y) = B_X(x)\,W_{X-\{x\}}(y)$.

Note that the argument that (ii) implies (i) only uses subsets $X \subseteq T$ with $|X| \le 3$.

Also, given that the best choice probabilities satisfy the Luce (MNL) choice model, it follows from standard results (e.g., Yellott, 1977) that the worst choice probabilities do not satisfy a Luce (MNL) choice model, except in special cases such as when the best choice probabilities are equal for each $x \in X$. Nonetheless, for distinct $x, y \in X$ with $\eta = \eta_2 \ldots \eta_{n-1} \in R(X - \{x,y\})$ and so $\rho = x\eta y \in R(X)$, we have, by (15),

$$\begin{aligned}
W_{X-\{x\}}(y) &= \Pr\big(U_z > U_y,\ z \in X - \{x,y\}\big) \\
&= \Pr\Big(U_y = \min_{z \in X-\{x\}} U_z\Big) \\
&= \sum_{\eta \in R(X-\{x,y\})} \Pr\big(U_{\eta_2} > \cdots > U_{\eta_{n-1}} > U_y\big) \\
&= \sum_{\eta \in R(X-\{x,y\})} B_{X-\{x\}}(\eta_2) \cdots B_{\{\eta_{n-1},y\}}(\eta_{n-1}),
\end{aligned}$$

[...]
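Relation (16) can be checked exactly, without simulation, by taking the ranking decomposition (15) as given. In this sketch the interval scale values `v` and the option labels are hypothetical, the helper names are introduced here for illustration, and $BW_X(x,y)$ and $W_X(y)$ are obtained by summing rank order probabilities as in (13) and (12).

```python
from itertools import permutations
from math import exp

def luce_best(scale, subset, x):
    """Luce (MNL) best-choice probability B_X(x) with scale values exp(v_z)."""
    return scale[x] / sum(scale[z] for z in subset)

def ranking_prob(scale, order):
    """Probability of a best-to-worst rank order via the product form (15)."""
    remaining = list(order)
    p = 1.0
    for item in order[:-1]:
        p *= luce_best(scale, remaining, item)
        remaining.remove(item)
    return p

def worst_prob(scale, subset, y):
    """W_X(y): total probability of the rank orders of `subset` ending in y."""
    return sum(ranking_prob(scale, r)
               for r in permutations(subset) if r[-1] == y)

def best_worst_prob(scale, subset, x, y):
    """BW_X(x, y): rank orders that start with x and end with y, as in (13)."""
    return sum(ranking_prob(scale, r)
               for r in permutations(subset) if r[0] == x and r[-1] == y)

# Hypothetical interval scale values v_z for a four-element set X.
v = {"a": 0.5, "b": -0.2, "c": 1.1, "d": 0.0}
scale = {z: exp(vz) for z, vz in v.items()}
X = list(v)

# Largest absolute difference between the two sides of (16) over all pairs.
max_gap = max(
    abs(best_worst_prob(scale, X, x, y)
        - luce_best(scale, X, x) * worst_prob(scale, [z for z in X if z != x], y))
    for x, y in permutations(X, 2)
)
total = sum(best_worst_prob(scale, X, x, y) for x, y in permutations(X, 2))
```

Enumerating all rank orders is only practical for small sets; it is used here because it mirrors the summation over $R(X - \{x,y\})$ in the derivation.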
[...] using (12), we have

$$\begin{aligned}
W_X(y) &= \Pr\Big(U_y = \min_{z \in X} U_z\Big) \\
&= \Pr\Big(v_y - \varepsilon_y = \min_{z \in X}\,[v_z - \varepsilon_z]\Big) \\
&= \Pr\Big(-v_y + \varepsilon_y = \max_{z \in X}\,[-v_z + \varepsilon_z]\Big) = \frac{e^{-v_y}}{\sum_{z \in X} e^{-v_z}},
\end{aligned}$$

where the final equality corresponds to the standard result regarding the representation of the Luce (MNL) choice model in terms of random variables $\varepsilon_z$, $z \in T$, with each $\varepsilon_z$ having the extreme value distribution. If we now proceed in a manner exactly paralleling the previous example, but using (11)–(13) with the worst choice probabilities satisfying the Luce (MNL) choice [...]

Proposition 10. Assume that a complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a Thurstone random utility model. Then the following conditions are equivalent:

(i) The worst choice probabilities satisfy Luce's choice model.
(ii) For all $x, y \in X \subseteq T$, $x \ne y$, $BW_X(x,y) = W_X(y)\,B_{X-\{y\}}(x)$.

Note that the representation in (ii) is an example of a (mixed) sequential best–worst choice model (Section 4, Definition 13) where the roles of best and worst choices are interchanged relative to the first example.

Now we present a variant on the assumptions of the above examples that leads to a random utility model where all three sets of choice probabilities (best, worst, and best–worst) satisfy the Luce (MNL) model. In fact, the resulting choice probabilities are those of the example in Section 1.

Definition 11. A complete set of best, worst, and best–worst choice probabilities satisfies an inverse (best, worst, best–worst) Thurstone random utility model iff it satisfies a random utility model, Definition 5, for which there exist interval scale values $u_z$, $v_z$, $z \in T$, and independent random variables $\varepsilon_z$, $\varepsilon_{r,s}$, $r, s, z \in T$, $r \ne s$, such that

$$B_z = u_z + \varepsilon_z,$$
$$W_z = -v_z + \varepsilon_z,$$
$$BW_{r,s} = u_r - v_s + \varepsilon_{r,s}. \tag{17}$$

An inverse extreme value random utility model is an inverse Thurstone random utility model where each $\varepsilon_z$ and $\varepsilon_{r,s}$, $r, s, z \in T$, $r \ne s$, has the extreme value distribution.

An inverse extreme value random utility model is not a consistent random utility model, Definition 6, even when $u_z = v_z$ (see below).

If we let, for $z \in T$,

$$b(z) = \exp u_z, \qquad w(z) = \exp(-v_z),$$

then standard techniques show that the choice probabilities given by an inverse extreme value random utility model, Definition 11, yield the Luce (MNL) choice model for each of $B_X$, $W_X$, $BW_X$, $X \subseteq T$, with, for $x, y \in X \subseteq T$,

$$B_X(x) = \frac{b(x)}{\sum_{z \in X} b(z)},$$

$$W_X(y) = \frac{w(y)}{\sum_{z \in X} w(z)},$$

[...]

We now make several observations about the similarities and differences between the two extreme value models of (14) and (17):

(i) Each random variable $BW_{r,s}$ depends on a difference of extreme value random variables, $\varepsilon_r - \varepsilon_s$, in (14) and on a single extreme value random variable, $\varepsilon_{r,s}$, in (17). Obviously, this difference leads to differences in the predictions of each model for the best–worst choice probabilities. In particular, (14) predicts that the three types of binary choice probabilities agree, i.e., that $BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y)$, whereas the general case of (17) predicts that they all differ. It is an empirical question as to which pattern of results (or neither) is correct.

(ii) The extreme value random variables $B_z$ and $W_z$ in (14), but not in (17), satisfy $B_z \overset{d}{=} -W_z$, where $\overset{d}{=}$ means equal in distribution. Thus, a consistent extreme value model, Definition 8, is a consistent random utility model, Definition 6, but an inverse extreme value random utility model, Definition 11, is not; the latter result follows from the fact that the extreme value distribution is not symmetric. If the extreme value distributed random variables $\varepsilon_z$ are replaced by random variables with symmetric distributions (for instance, normals), we have $B_z \overset{d}{=} -W_z$ for both (14) and (17).

As noted above, the extreme value random variables $\varepsilon_z$ in Definition 8 and Definition 11 are not symmetric. Rephrasing the above results, we have the following results for consistent Thurstone and inverse Thurstone random utility models, i.e., those where the extreme value random variables are replaced by general random variables $\varepsilon_z$ with equal means and variances:

1. When $\varepsilon_z$ is symmetric, the representations of the best and worst probabilities given by (14) are of the same
form as are the representations of the best and worst probabilities given by (17), and (thus) (14) and (17) agree with each other for the best and the worst choice probabilities.
2. When $\varepsilon_z$ is not symmetric, the representations of the best and worst probabilities given by (14) are not of the same form, but the representations of the best and worst probabilities given by (17) are of the same form, and (thus) (14) and (17) do not agree with each other for the best and the worst choice probabilities.
3. It is an empirical question as to which pattern of results, (1) or (2) or neither, is correct.

4. Joint and sequential best–worst choice models

We now motivate models through the idea that a person, in selecting the best–worst pair of options, approaches the selection of the best and the worst option independently, and follows up on these choices, as needed, to ensure that the same option is not selected as both best and worst. We first consider models where the best and worst choices are made jointly, then models where these choices are made sequentially.

A referee asked whether such an independence assumption is reasonable and suggested that we should also consider models with some kind of correlation. The question and suggestion are eminently reasonable. We introduce the "independence" assumption mainly as a way to introduce the two models of this subsection, the first of which can be interpreted as a quasi-independent (log-linear) model⁹ of a type that has been extensively studied, and successfully applied, in the analysis of other types of categorical data, including best choices (see, e.g., Agresti, 2002; Louviere et al., 2000) and best–worst choices (Cohen, 2003; Cohen and Neira, 2003; Finn and Louviere, 1992). The second model involves a slight generalization of the first model. These two models have not received detailed theoretical analysis previously and, given the length of the present [...]

[...] options are distinct, these are reported as the best–worst pair in $X$. Otherwise, further decisions are required to select the best–worst pair. We now consider two possible processes for these later decisions, leaving other more complex possibilities for future study should such be required by data. The two cases we consider are:

1. The person chooses with equal probability amongst the possible distinct best–worst pairs. Thus, each pair $x, y \in X$, $x \ne y$, is chosen with probability $\frac{1}{|X|(|X|-1)}$.
2. The person re-samples for the best–worst pair.

We now study the details of models generated by each of these processes.

4.1.1. Case 1: Any choices after the first are equally probable

First we consider the case where, if no decision is reached at the first stage, then at the second stage the person chooses with equal probability amongst the possible distinct best–worst pairs. Then the best–worst choice probabilities are given by: for all $x, y \in X$, $x \ne y$,

$$BW_X(x,y) = B_X(x)\,W_X(y) + \frac{1}{|X|(|X|-1)} \sum_{z \in X} B_X(z)\,W_X(z). \tag{19}$$

We have the following interesting proposition:

Proposition 12. A set of best–worst choice probabilities on a finite set $X$ that satisfies (19) has consistent margins, Definition 1, iff for every $r, s \in X$, $r \ne s$,

$$B_X(r)\,W_X(r) = B_X(s)\,W_X(s). \tag{20}$$

Proof. From (19), we obtain

$$\sum_{y \in X-\{x\}} BW_X(x,y) = \sum_{y \in X-\{x\}} B_X(x)\,W_X(y) \ [\ldots]$$
It then follows from (21) and (22) that the best–worst and
choice probabilities have consistent margins iff, for
every x 2 X , X 1=bðyÞ
BW X ðz; yÞ ¼ P ,
z2X 1=bðzÞ
1 X z2X fyg
BX ðzÞW X ðzÞ ¼ BX ðxÞW X ðxÞ. (23)
jX j z2X which with (24) and (25) shows that it has consistent
margins. This is quite fascinating as the best and worst
Clearly, (20) is sufficient for (23) to hold. We now show, choice probabilities each satisfy a Luce (MNL) choice
by contradiction, that (20) is necessary for (23) to hold. model, and the best–worst choice probabilities do not,
So assume that (23) holds for all z 2 X , but that (20) yet the model has consistent margins. Later, we show
does not hold for all r; s 2 X , ras. Then, using the that the second proposed decision procedure has the
properties of an arithmetic average, there must be two ‘opposite’ property, namely that the best, worst and the
distinct elements of X that we denote by xmin ; xmax , with best–worst choice probabilities each satisfy a Luce
\[ B_X(x_{\max})\, W_X(x_{\max}) > \frac{1}{|X|} \sum_{z \in X} B_X(z)\, W_X(z) > B_X(x_{\min})\, W_X(x_{\min}), \]
which contradicts (23). Hence (20) must hold for all $x \in X$. □

We now illustrate Proposition 12 with the Luce (MNL) model holding for the best and worst choices. For each $x, y \in X$, we have
\[ B_X(x) = \frac{b(x)}{\sum_{z \in X} b(z)}, \qquad W_X(y) = \frac{w(y)}{\sum_{z \in X} w(z)}. \tag{24} \]
Then with the set of best–worst choice probabilities given by (19), Proposition 12 shows that this model has consistent margins iff there is a constant $c$ such that for every $z \in X$,
\[ b(z)\, w(z) = c, \]
i.e.,
\[ w(z) = \frac{c}{b(z)}. \tag{25} \]
Note that the resulting representation for $W_X$ in (24) will not depend on $c$. Then it follows from a routine calculation that (19) becomes: for $x, y \in X$, $x \neq y$,
\[ BW_X(x,y) = \frac{\dfrac{b(x)}{b(y)} + \dfrac{1}{|X|-1}}{\sum_{r,s \in X,\, r \neq s} \left[ \dfrac{b(r)}{b(s)} + \dfrac{1}{|X|-1} \right]}. \tag{26} \]
For reassurance, one can check that the representation (26) satisfies
\[ \sum_{z \in X - \{x\}} BW_X(x,z) = \frac{b(x)}{\sum_{z \in X} b(z)}. \]

(MNL) choice model, but these sets of choice probabilities do not have consistent margins.

Since the above model has consistent margins, we have that for two element sets $X = \{x, y\}$,
\[ BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y). \]
We have already discussed issues related to these equalities, and mentioned that Shafir (1993) presents data, in an accept versus reject design, that can be interpreted as showing that the second and third probabilities may be unequal.

An alternative way of writing (26) sets $u(z) = \log b(z)$, $z \in X$, giving, for $x, y \in X$, $x \neq y$,
\[ BW_X(x,y) = \frac{\exp[u(x) - u(y)] + \dfrac{1}{|X|-1}}{\sum_{r,s \in X,\, r \neq s} \left\{ \exp[u(r) - u(s)] + \dfrac{1}{|X|-1} \right\}}, \tag{27} \]
which is a biased form of the maximum-difference (maxdiff) best–worst choice model (see Section 4.1.2) that converges to the latter model for 'large' sets $X$.

Now we consider some estimation issues related to the representation (26), equivalently (27). An appropriate experimental design for testing this model, called a $2^j$ fractional factorial, ensures that each option, and each pair of distinct options, is presented equally often across the selected subsets of size $j$ of the master set $T$ (Finn and Louviere, 1992). We assume such a design, and thus we lose no generality by developing the results for a fixed set $X$, $X \subseteq T$, with $|X| = j$.

Suppose that we have a sample of $N$ best–worst choices from the set $X$. Denote these choices by $(x_i, y_i)$, $i = 1, \ldots, N$. These may be $N$ best–worst choices for a single individual, or one best–worst choice for each of $N$ individuals. For the samples $i = 1, \ldots, N$ and any $x, y \in X$, $x \neq y$, let
\[ \widehat{bw}_i(x,y) = \begin{cases} 1 & \text{if } x \text{ is best and } y \text{ is worst for sample } i, \\ 0 & \text{otherwise.} \end{cases} \]
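The representation (26), its log form (27), and the consistent-margin property can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the scale values in `b` are hypothetical, and the worst scale is taken as $w(z) = 1/b(z)$ (that is, $c = 1$ in (25)).

```python
import math
from itertools import permutations

def case1_bw(b):
    """Best-worst probabilities of representation (26); b maps option -> scale value > 0."""
    bias = 1.0 / (len(b) - 1)
    weights = {(x, y): b[x] / b[y] + bias for x, y in permutations(b, 2)}
    total = sum(weights.values())
    return {pair: wt / total for pair, wt in weights.items()}

def case1_bw_log(u):
    """The same model written as (27), with u(z) = log b(z)."""
    bias = 1.0 / (len(u) - 1)
    weights = {(x, y): math.exp(u[x] - u[y]) + bias for x, y in permutations(u, 2)}
    total = sum(weights.values())
    return {pair: wt / total for pair, wt in weights.items()}

def best_margin(bw, x):
    """Sum of BW(x, z) over all z different from x."""
    return sum(p for (best, _), p in bw.items() if best == x)

def worst_margin(bw, y):
    """Sum of BW(z, y) over all z different from y."""
    return sum(p for (_, worst), p in bw.items() if worst == y)

# Hypothetical scale values, chosen only for illustration.
b = {"x": 1.0, "y": 2.0, "z": 4.0}
bw = case1_bw(b)

# Luce (MNL) best and worst probabilities from (24) with w(z) = 1 / b(z).
luce_best = {x: b[x] / sum(b.values()) for x in b}
inv_total = sum(1.0 / v for v in b.values())
luce_worst = {y: (1.0 / b[y]) / inv_total for y in b}

for opt in b:
    print(opt, best_margin(bw, opt), luce_best[opt],
          worst_margin(bw, opt), luce_worst[opt])
```

For each option the two best columns agree, and so do the two worst columns, which is the consistent-margin property. With a two-element set the same function returns a best–worst probability equal to the binary best probability.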
For each $z \in X$, let
\[ \hat{b}(z) = \sum_{i=1}^{N} \sum_{y \in X - \{z\}} \widehat{bw}_i(z,y), \qquad \hat{w}(z) = \sum_{i=1}^{N} \sum_{x \in X - \{z\}} \widehat{bw}_i(x,z). \]
Note that the likelihood of obtaining the data $(x_i, y_i)$, $i = 1, \ldots, N$, is
\[ \prod_{i=1}^{N} \prod_{x,y \in X,\, x \neq y} \left[ BW_X(x,y) \right]^{\widehat{bw}_i(x,y)} \]
and it is clear from (26), or equivalently from (27), that neither $\hat{b}(z)$ nor $\hat{w}(z)$, $z \in X$, nor both together, are sufficient statistics for this likelihood. However, we know that this model has consistent margins that satisfy the Luce (MNL) choice model; in fact, it is routine to show, with $E$ denoting expectation, that for each $x \in X$,
\[ E[\hat{b}(x)] = N \cdot B(X)\, b(x), \tag{28} \]
where
\[ B(X) = \frac{1}{\sum_{z \in X} b(z)}. \]
Thus, since $b$ is a ratio scale, the scores $\hat{b}(x)$, $x \in X$, or rather $\frac{1}{N}\hat{b}(x)$, give unbiased estimates of the scale values for $b(x)$, $x \in X$.

Similarly, with
\[ W(X) = \frac{1}{\sum_{z \in X} \dfrac{1}{b(z)}}, \]
we obtain, for each $x \in X$,
\[ E[\hat{w}(x)] = N \cdot W(X)\, \frac{1}{b(x)}, \tag{29} \]
and so the scores $\hat{w}(x)$, $x \in X$, or rather $\frac{1}{N}\hat{w}(x)$, give unbiased estimates of the scale values $1/b(x)$.

However, it would be preferable to estimate each scale value $b(x)$, $x \in X$, from some combination of the scores $\hat{b}(x)$ and $\hat{w}(x)$. Louviere, Burgess, Street, and Marley (2004) explore the properties of related estimation procedures for the model of the following Section 4.1.2, and it will be useful in the future to study whether or not it is possible to discriminate between the model just presented and that of the next section on the basis of data.

4.1.2. Case 2: Any choices after the first involve re-sampling

Now we consider the case where, if no decision is reached at the first stage, then the person re-samples for the best–worst choice pair from the set $X$ of available choice options. In order for such a resampling process to terminate, it is necessary that there are $r, s \in X$, $r \neq s$, with $B_X(r) \neq 0$ and $W_X(s) \neq 0$ (see footnote 10), in which case $\sum_{r,s \in X,\, r \neq s} B_X(r) W_X(s) > 0$. The formula for the best–worst choice probabilities is then, with $x \neq y$,
\[
\begin{aligned}
BW_X(x,y) &= \sum_{k=0}^{\infty} \Big[ \sum_{r \in X} B_X(r) W_X(r) \Big]^{k} B_X(x) W_X(y) \\
&= B_X(x) W_X(y) \sum_{k=0}^{\infty} \Big[ 1 - \sum_{r,s \in X,\, r \neq s} B_X(r) W_X(s) \Big]^{k} \\
&= \frac{B_X(x) W_X(y)}{\sum_{r,s \in X,\, r \neq s} B_X(r) W_X(s)}.
\end{aligned}
\]

Footnote 10. An equivalent condition is that there is no $r \in X$ with $B_X(r) = W_X(r) = 1$.

As discussed earlier, if the best and the worst choice probabilities each satisfy the Luce (MNL) choice model, then the above representation of the best–worst choice probabilities is a Luce (MNL) choice model representation with scale values $b(x)w(y)$, $x, y \in X$, $x \neq y$. As with Case 1, when we impose the constraint that for all $z \in X$, $b(z)w(z) = c$, i.e., $w(z) = c/b(z)$, an alternative version of this representation is given by: for $z \in X$, let $u(z) = \log b(z)$, then for $x, y \in X$, $x \neq y$,
\[ BW_X(x,y) = \frac{\exp[u(x) - u(y)]}{\sum_{r,s \in X,\, r \neq s} \exp[u(r) - u(s)]}, \tag{30} \]
a representation that is usually called maximum-difference (maxdiff).

When $X = \{x, y\}$, we have
\[ BW_{\{x,y\}}(x,y) = \frac{b(x)/b(y)}{b(x)/b(y) + b(y)/b(x)}, \]
which does not in general equal $b(x)/[b(x) + b(y)]$, which is the value of each of $B_{\{x,y\}}(x)$ and $W_{\{x,y\}}(y)$. As we have discussed already, we can replace the prediction of the maxdiff model for the case $|X| = 2$ with the representation
\[ BW_{\{x,y\}}(x,y) = \alpha B_{\{x,y\}}(x) + (1 - \alpha) W_{\{x,y\}}(y). \]
However, this is now a sequential process and there is no reason why it should not be applied for all set sizes, leading to a different model that is discussed in Section 4.2.

We have just seen that the unmodified model does not have consistent margins when $|X| = 2$.
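The re-sampling argument of Case 2 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the function names and scale values are hypothetical, best and worst choices follow the Luce (MNL) model, and $w(z) = 1/b(z)$. It compares a partial sum of the geometric series with the closed form, checks that the closed form coincides with the maxdiff representation (30), and evaluates the two-element case.

```python
from itertools import permutations

def luce(scale):
    """Luce (MNL) choice probabilities for positive scale values."""
    total = sum(scale.values())
    return {z: v / total for z, v in scale.items()}

def resampling_bw(B, W):
    """Closed form: B(x) W(y) divided by the probability that a draw has best != worst."""
    accept = sum(B[r] * W[s] for r, s in permutations(B, 2))
    if accept <= 0:
        raise ValueError("the re-sampling process cannot terminate")
    return {(x, y): B[x] * W[y] / accept for x, y in permutations(B, 2)}

def resampling_bw_series(B, W, x, y, terms=200):
    """Partial sum of the geometric series: k failed draws (best == worst), then (x, y)."""
    tie = sum(B[r] * W[r] for r in B)
    return sum(tie ** k for k in range(terms)) * B[x] * W[y]

def maxdiff(b):
    """Representation (30), written with b(x) / b(y) = exp[u(x) - u(y)]."""
    weights = {(x, y): b[x] / b[y] for x, y in permutations(b, 2)}
    total = sum(weights.values())
    return {pair: wt / total for pair, wt in weights.items()}

# Hypothetical scale values, with w(z) = 1 / b(z).
b = {"x": 1.0, "y": 2.0, "z": 4.0}
w = {z: 1.0 / v for z, v in b.items()}
B, W = luce(b), luce(w)

closed = resampling_bw(B, W)
md = maxdiff(b)
series_xy = resampling_bw_series(B, W, "x", "y")

# Two-element case: the maxdiff value differs from the binary best probability.
b2 = {"x": 3.0, "y": 1.0}
print(maxdiff(b2)[("x", "y")], luce(b2)["x"])
```

In the two-element example the maxdiff probability is $3/(3 + 1/3) = 0.9$, whereas the binary best probability is $3/4$.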
The fact that it does not have consistent margins in general can be seen by noting that for each $x, y \in X$,
\[ \sum_{z \in X - \{x\}} BW_X(x,z) = \frac{1}{\sum_{r,s \in X,\, r \neq s} b(r)/b(s)} \left( b(x) \sum_{z \in X} \frac{1}{b(z)} - 1 \right) \neq B_X(x) \]
and
\[ \sum_{z \in X - \{y\}} BW_X(z,y) = \frac{1}{\sum_{r,s \in X,\, r \neq s} b(r)/b(s)} \left( \frac{1}{b(y)} \sum_{z \in X} b(z) - 1 \right) \]

4.2. Sequential best–worst choice processes

The following definition is based on the idea that a person, in choosing the best–worst pair from a set $X$, first decides in which order to choose the best and the worst option (best–worst order with probability $\alpha$, worst–best order with probability $1 - \alpha$), then proceeds to the actual choices. Note that the question format in the experiment will likely affect (though not necessarily completely determine) the value of $\alpha$.

Definition 13. A set of best, worst, and best–worst choice probabilities on a set $X$ satisfies a mixed sequential (best, worst, and) best–worst choice model iff there is a constant $\alpha$, $0 \leq \alpha \leq 1$, such that for all $x, y \in X$, $x \neq y$,
where the best and worst choice probabilities are as given in Proposition 15. Then by that theorem we have an example of a concordant best–worst choice model, Definition 14. In fact, Marley (1968) shows that, under the conditions of Proposition 15, this is the unique representation of a concordant best–worst choice model. However, note that this is really a "class" of models as it depends on the assumed representations of the binary choice probabilities.

Marley (1968) shows also that, when the binary choice probabilities are transitive and for all distinct $r, s \in X$, $b(r,s) = w(s,r)$, the above concordant best–worst choice model is the only one that is compatible with a structure of "best to worst" and "worst to best" ranking probabilities that he called a reversible ranking model (Marley, 1968, Theorem 8). However, we see next that a concordant best–worst choice model, Definition 14, is not in general compatible with a consistent random ranking model, Definition 3.

Proposition 16. Assume that a complete set of best, worst and best–worst choice probabilities on a three-element set $T$ satisfies a concordant best–worst choice model, Definition 14. Then the following are equivalent.

(i) The complete set of best, worst and best–worst choice probabilities satisfies a consistent random ranking model, Definition 3.

However, (i) with (38) implies that both the best and the worst choice probabilities satisfy Luce's choice model (Theorem 50, Luce and Suppes, 1965) and then, using (38) again, we obtain that, for each $x, y \in X \subseteq T$, $|X| \geq 2$, $x \neq y$, we have $B_X(x) = W_X(x) = \frac{1}{|X|}$ (Yellott, 1980). It then follows from the fact that the best, worst and best–worst choice probabilities on $T$ satisfy a concordant best–worst choice model, Definition 14, that $BW_X(x,y) = \frac{1}{|X|} \cdot \frac{1}{|X|-1}$. Combining these results, we have that (ii) holds. □

Returning to Proposition 15, note that, using (33), the best and worst probabilities in (34) can be rewritten in the form
\[
\begin{aligned}
B_X(x) &= \frac{\prod_{z \in X - \{x\}} b(x,z) \sum_{\sigma \in R(X - \{x\})} b_{X - \{x\}}(\sigma)}{\sum_{r \in X} \prod_{z \in X - \{r\}} b(r,z) \sum_{\sigma \in R(X - \{r\})} b_{X - \{r\}}(\sigma)}, \\
W_X(y) &= \frac{\prod_{z \in X - \{y\}} w(y,z) \sum_{\sigma \in R(X - \{y\})} w_{X - \{y\}}(\sigma)}{\sum_{r \in X} \prod_{z \in X - \{r\}} w(r,z) \sum_{\sigma \in R(X - \{r\})} w_{X - \{r\}}(\sigma)},
\end{aligned}
\tag{39}
\]
which can be given the following process interpretation: the person makes all possible paired comparisons according to the best paired comparison probabilities $b$. If the options end up as rank ordered by this process, then the person selects as best the (necessarily unique) option that beats every other option in these comparisons, that is, the option that is first (best) in the resulting rank order; otherwise the process starts over.
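The process interpretation of (39) can be sketched by exact enumeration. This is a hedged illustration only: it assumes that $b_Y(\sigma)$ is the product of the binary probabilities that agree with the rank order $\sigma$ (equation (33) is not reproduced in this section), and the function names, the Luce form of the binary probabilities, and the scale values are all hypothetical.

```python
from itertools import combinations, permutations
from math import prod

def luce_binary(scale):
    """Hypothetical binary best probabilities b(r, s) of Luce form."""
    return {(r, s): scale[r] / (scale[r] + scale[s]) for r, s in permutations(scale, 2)}

def ranking_weight(order, pair_prob):
    """Probability that every paired comparison agrees with the rank order `order`."""
    return prod(pair_prob[(order[i], order[j])]
                for i, j in combinations(range(len(order)), 2))

def best_by_ranking_process(options, pair_prob):
    """Best choice probabilities when comparisons are redone until they form a rank order."""
    weights = {x: 0.0 for x in options}
    for order in permutations(options):
        weights[order[0]] += ranking_weight(order, pair_prob)
    total = sum(weights.values())
    return {x: wt / total for x, wt in weights.items()}

def best_factored(options, pair_prob):
    """Factored form as in (39): x beats every z, and the remaining options are rank ordered."""
    weights = {}
    for x in options:
        rest = [z for z in options if z != x]
        beats_all = prod(pair_prob[(x, z)] for z in rest)
        rest_ranked = sum(ranking_weight(order, pair_prob) for order in permutations(rest))
        weights[x] = beats_all * rest_ranked
    total = sum(weights.values())
    return {x: wt / total for x, wt in weights.items()}

options = ["x", "y", "z"]
pair_prob = luce_binary({"x": 1.0, "y": 2.0, "z": 3.0})  # hypothetical scale values
by_process = best_by_ranking_process(options, pair_prob)
by_formula = best_factored(options, pair_prob)
print(by_process)
```

The two functions return the same probabilities. Replacing `pair_prob` with binary worst probabilities gives the parallel computation for worst choices.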
The process for worst choices is parallel, with the best binary choice probabilities $b$ replaced by the worst binary choice probabilities $w$.

It is interesting to compare the above process with a related process model for best choices (with a parallel interpretation for worst choices): the person makes all possible paired comparisons according to the best paired comparison probabilities $b$; if some (necessarily unique) option beats every other option in these comparisons, then that option is selected as best; otherwise the process starts over (see footnote 11). Note that, in this case, in contrast to the previous process, we do not require the results of the binary choices to be consistent with a rank order, only that they are consistent with the existence of a best option. This process, with a parallel one for worst choices, gives the following representation for the best and the worst choice probabilities:
\[ B_X(x) = \frac{\prod_{z \in X - \{x\}} b(x,z)}{\sum_{r \in X} \prod_{s \in X - \{r\}} b(r,s)}, \qquad W_X(y) = \frac{\prod_{z \in X - \{y\}} w(y,z)}{\sum_{r \in X} \prod_{s \in X - \{r\}} w(r,s)}. \]

Footnote 11. The following is an alternate description of the process. Some element $x \in X$ is chosen at random and compared pairwise with every other element of $X$, with the best option in each comparison being determined by the paired comparison probabilities $b$. If $x$ wins every comparison, it is selected as best, otherwise the process starts over.

By comparing these equations with those of (39), it is clear that the two processes give different representations when $|X| > 3$. It is routine, though tedious, to show also that the above representation does not satisfy a concordant best–worst choice model, Definition 14, when $|X| > 3$; one simple counter-example takes the set $X = \{x, y, z, w\}$ with the binary choice probabilities satisfying Luce's choice model with $x, y, z, w$ having scale values $1, 2, 3, 4$, respectively.

These are only two of a vast array of possible best–worst choice models based on binary comparisons. There are many alternate ways to combine a sequence of (probabilistic) best and worst choices that will lead to a final best–worst pair. Also, the best (respectively, worst) choices in a mixed sequential choice process can be assumed to satisfy any "standard" discrete choice model, not necessarily one motivated by a sequence of binary comparisons. Of course, for parsimony, we would expect some links between the best and worst choice processes, such as those that arise as a result of assuming a consistent extreme value model, Definition 8. Given the vast diversity of possible models, we leave their further systematic study for the future, at which time we expect to have data with which to challenge them.

5. Ratio scale models

We now return to the maxdiff model, i.e., the example in Section 1, and use it to motivate a theoretical question concerning when a set of best, worst and best–worst choice probabilities are "of the same form" with the best–worst choice probabilities determined by some "functions" of the best and worst choice probabilities. Unfortunately, the notation required for the general formulation is complex, so we use the earlier example to illustrate the ideas, and include details in the Appendix. We concentrate on a fixed choice set $X$, though the ideas can be extended to cover all subsets $X$, $X \subseteq T$, of a master set $T$.

We begin with the restatement of the formulae of the example: for $x, y \in X$,
\[ B_X(x) = \frac{b(x)}{\sum_{r \in X} b(r)}, \qquad W_X(y) = \frac{w(y)}{\sum_{s \in X} w(s)} \tag{40} \]
and
\[ BW_X(x,y) = \frac{B_X(x) W_X(y)}{\sum_{r,s \in X,\, r \neq s} B_X(r) W_X(s)} \quad (x \neq y). \tag{41} \]
Then direct substitution of (40) in (41) yields: for $x, y \in X$, $x \neq y$,
\[ BW_X(x,y) = \frac{b(x) w(y)}{\sum_{r,s \in X,\, r \neq s} b(r) w(s)}. \tag{42} \]
Thus, combining (41) with (42), we have
\[ BW_X(x,y) = \frac{B_X(x) W_X(y)}{\sum_{r,s \in X,\, r \neq s} B_X(r) W_X(s)} = \frac{b(x) w(y)}{\sum_{r,s \in X,\, r \neq s} b(r) w(s)}, \tag{43} \]
that is, the best–worst choice probabilities are functions of the best and worst choice probabilities, as well as functions of the ratio scale values that determine those choice probabilities, and, in fact, the two functional forms have identical structure. We wish to know what other functional forms, if any, have parallel properties. The full notation required to formulate, and the techniques to solve, this question are complex (see the appendix). Nonetheless, based on previous work on closely related aggregation problems (Aczel et al., 1997; Aczel et al., 2000), we conjecture that the representations given by (43) are the only ones that satisfy the full set of constraints, which include the important requirements that the two functional forms have identical structure and that $b$ and $w$ are independent ratio scales.

If we generalize the formulation by allowing the two functional forms to have somewhat different structures, then, again based on the previous work, we conjecture
that the class of solutions is larger, but still involves relatively simple functions of the best and worst choice probabilities (viz., the best and worst ratio scales). Finally, as in the maxdiff model, we need to consider the cases, where for each $z \in X$,
\[ b(z) = \frac{1}{w(z)}. \]
Nothing similar to this case has been studied in the previous work, and it raises several complexities in generalizing the earlier results to the present situation.

6. Summary and conclusions

We derived and discussed theoretical results for a number of best, worst and best–worst choice models, with the focus on the latter. Our results include a number of interesting theoretical relationships between these types of models, which in turn suggest a variety of tests to determine which model is most consistent with choice data. For example, the maximum-difference model is of the well-known Luce (1959), equivalently Multinomial Logit (McFadden, 1974), form with ratios of scale values, and has difference scores for best versus worst choices that are sufficient statistics for the parameters of the model, with the separate best and worst scores having biases that decrease with increasing choice set size. It will be interesting and important to undertake empirical studies for different set sizes, which can be achieved by constructing the sets using balanced incomplete block designs (see, e.g., Street and Street, 1987). One can then compare theoretically correct estimates with the obtained best, worst, and best minus worst scores to learn how much the biases actually matter in real applications.

General classes of best–worst choice models have not been studied previously in a systematic way, and, as far as we know, our results constitute the first formal presentation of the properties of some of these models. The results suggest that best–worst choice tasks and the associated models are both theoretically and empirically interesting, with the potential to provide important insights into preference and choice processes. We noted a number of open problems, the most pressing of which demand the axiomatization of important best–worst choice models, such as the best–worst ranking model, the maximum-difference model, and the concordant best–worst choice model, each without reference to representations of best or worst choice probabilities. It is also important to characterize the class of ratio scale models of best, worst, and best–worst choice probabilities where all three sets of choice probabilities are "of the same form." Finally, as already indicated, there are interesting practical questions regarding the usefulness of "simple" (sufficient) statistics in analyzing and summarizing the data obtained using best–worst choice in discrete choice designs.

Acknowledgments

This research has been supported by Australian Research Council Discovery Grant DP034343632 to the University of Technology, Sydney, for Louviere, Street and Marley, and Natural Science and Engineering Research Council Discovery Grant 8124-98 to the University of Victoria for Marley. It was begun while Marley was a Visiting Researcher at Systems, Organizations and Management of the University of Groningen and supported by the Netherlands' Organization for Scientific Research for the period July 1, 2003, to June 30, 2004. We appreciate useful feedback from Tom Wickens and J. C. Falmagne on earlier versions of this work, and the detailed evaluation by two referees (one anonymous, the other Reinhard Suck) on the original version of the paper.

Appendix A. The functional equations associated with ratio scale models of best, worst, and best–worst choice probabilities

We use a slightly different notation than previously for the options and the choice probabilities, namely we let $X = \{x_1, \ldots, x_n\}$, and let $B_X(x_i)$, $W_X(x_j)$, $BW_X(x_i, x_j)$, $i \neq j$, $i, j \in \{1, \ldots, n\}$, denote the best, worst, and best–worst choice probabilities, respectively. Thus, as before, we have the constraints
\[ 0 \leq B_X(x_i),\ W_X(x_j),\ BW_X(x_i, x_j) \leq 1 \quad (i, j = 1, \ldots, n)\ (i \neq j) \]
and
\[ \sum_{i=1}^{n} B_X(x_i) = \sum_{j=1}^{n} W_X(x_j) = \sum_{x_i, x_j \in X,\, i \neq j} BW_X(x_i, x_j) = 1. \]
Also, $b$ and $w$ will denote ratio scales defined over possible options, and so their values are nonnegative real numbers. For mathematical simplicity we eliminate options that have scale values of zero.

We now assume that there are functions $B_i$ and $W_j$ $(i, j = 1, \ldots, n)$, $BW_{ij}$, $H_{i,j}$ and $K_{i,j}$ $(i, j = 1, \ldots, n)$ $(i \neq j)$, such that
\[ B_X(x_i) = B_i(b(x_1), \ldots, b(x_n)) \quad (i = 1, \ldots, n), \tag{44} \]
\[ W_X(x_j) = W_j(w(x_1), \ldots, w(x_n)) \quad (j = 1, \ldots, n). \tag{45} \]
We need a matrix notation in order to formulate the form of the functional equations that we need to solve. Let $R_{++} = \,]0, \infty[$. For any pair of $n$-component vectors $r = (r_1, \ldots, r_n)$, $s = (s_1, \ldots, s_n)$ with $r_i, s_i \in R_{++}$,
and a function $\Omega : R_{++} \times R_{++} \to R_{++}$, define the matrix $\|\Omega(r_i, s_j)\|$ by
\[
\|\Omega(r_k, s_l)\| =
\begin{bmatrix}
0 & \Omega(r_1, s_2) & \cdots & \Omega(r_1, s_{n-1}) & \Omega(r_1, s_n) \\
\Omega(r_2, s_1) & 0 & \cdots & \Omega(r_2, s_{n-1}) & \Omega(r_2, s_n) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\Omega(r_{n-1}, s_1) & \Omega(r_{n-1}, s_2) & \cdots & 0 & \Omega(r_{n-1}, s_n) \\
\Omega(r_n, s_1) & \Omega(r_n, s_2) & \cdots & \Omega(r_n, s_{n-1}) & 0
\end{bmatrix}.
\]
\[ W_j[\nu w(x_1), \ldots, \nu w(x_n)] = W_j[w(x_1), \ldots, w(x_n)], \tag{48} \]
and

and where $F(u, v) = uv$, $G(r, s) = rs$, and for $i, j = 1, \ldots, n$, $i \neq j$,

Thus, the problem before us is, given reasonable technical assumptions, find all solutions of the set of functional equations (44)–(49).
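As a numerical illustration of the structure these equations are meant to capture, the sketch below checks two properties of the example (40)–(43): that applying the functional form to the choice probabilities, as in (41), gives the same result as applying it to the scale values, as in (42), and that the Luce form is unchanged when a ratio scale is multiplied by a positive constant, as in (48). It is not a solution method for the functional equations; the function names and scale values are hypothetical.

```python
from itertools import permutations

def luce(scale):
    """Luce form (40): scale values normalized to choice probabilities."""
    total = sum(scale.values())
    return {z: v / total for z, v in scale.items()}

def bw_from_choice_probs(B, W):
    """(41): best-worst probabilities from best and worst choice probabilities."""
    norm = sum(B[r] * W[s] for r, s in permutations(B, 2))
    return {(x, y): B[x] * W[y] / norm for x, y in permutations(B, 2)}

def bw_from_scales(b, w):
    """(42): the same functional form applied directly to the ratio scale values."""
    norm = sum(b[r] * w[s] for r, s in permutations(b, 2))
    return {(x, y): b[x] * w[y] / norm for x, y in permutations(b, 2)}

def rescale(scale, factor):
    """Multiply every scale value by a positive constant (an admissible ratio scale change)."""
    return {z: factor * v for z, v in scale.items()}

# Hypothetical, independently chosen best and worst scale values.
b = {"x1": 1.0, "x2": 2.5, "x3": 4.0}
w = {"x1": 3.0, "x2": 0.5, "x3": 1.5}

via_probs = bw_from_choice_probs(luce(b), luce(w))
via_scales = bw_from_scales(b, w)
rescaled_worst = luce(rescale(w, 7.5))
rescaled_bw = bw_from_scales(rescale(b, 3.0), rescale(w, 0.2))
print(via_probs[("x1", "x2")], via_scales[("x1", "x2")])
```

The two printed values agree, and `rescaled_worst` and `rescaled_bw` match `luce(w)` and `via_scales` up to floating-point error.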
Cohen, S., & Neira, L. (2003). Measuring preference for product benefits across countries: Overcoming scale usage bias with maximum difference scaling. Paper presented at the Latin American conference of the European society for opinion and marketing research, Punta del Este, Uruguay (pp. 1–22). Reprinted in Excellence in International Research: 2004. ESOMAR, Amsterdam, Netherlands, 2004.
Critchlow, D. E., Fligner, M. A., & Verducci, J. S. (1991). Probability models on rankings. Journal of Mathematical Psychology, 35, 294–318.
Falmagne, J. C. (1978). A representation theorem for finite random scale systems. Journal of Mathematical Psychology, 18, 52–72.
Finn, A., & Louviere, J. J. (1992). Determining the appropriate response to evidence of public concern: The case of food safety. Journal of Public Policy and Marketing, 11(1), 12–25.
Fiorini, S. (2004). A short proof of a theorem of Falmagne. Journal of Mathematical Psychology, 48, 80–82.
Fishburn, P. (1994). On "choice" probabilities derived from ranking distributions. Journal of Mathematical Psychology, 38, 274–285.
Fishburn, P. C. (2002). Stochastic utility. In S. Barbera, P. J. Hammond, & C. Seidl (Eds.). Handbook of utility theory, Vol. I, pp. 275–319. Dordrecht, Boston, London: Kluwer Academic Publishers.
Louviere, J. J., Burgess, L., Street, D., & Marley, A. A. J. (2004). Modelling the choices of single individuals by combining efficient choice experiment designs with extra preference information. Working paper no. 04-003, Centre for the Study of Choice, University of Technology, Sydney.
Louviere, J. J., Hensher, D. A., & Swait, J. D. (2000). Stated choice models. Cambridge: Cambridge University Press.
Luce, R. D. (1959). Individual choice behavior. New York, NY: Wiley.
Luce, R. D., & Suppes, P. (1965). Preference, utility, and subjective probability. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.). Handbook of mathematical psychology (Vol III, pp. 235–406). New York, NY: Wiley.
Marley, A. A. J. (1968). Some probabilistic models of simple choice and ranking. Journal of Mathematical Psychology, 5, 311–332.
McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Econometrics (pp. 105–142). New York, NY: Academic Press.
Shafir, E. (1993). Choosing versus rejecting: Why some options are both better and worse. Memory and Cognition, 21, 546–556.
Street, A. P., & Street, D. J. (1987). Combinatorics of experimental design. New York: Oxford University Press.
Yellott, J. I., Jr. (1977). The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology, 15, 109–144.
Yellott, J. I., Jr. (1980). Generalized Thurstone models for ranking: Equivalence and reversibility. Journal of Mathematical Psychology, 22, 48–69.
Yellott, J. I., Jr. (1997). Preference models and reversibility. In A. A. J. Marley (Ed.), Choice, decision, and measurement: Essays in honor of R. Duncan Luce (pp. 131–151). Mahwah, NJ: Lawrence Erlbaum Associates.