
Journal of Economic Literature

Vol. XLIII (March 2005), pp. 65–107

Economic Theory and Experimental Economics
LARRY SAMUELSON∗

∗ Samuelson: University of Wisconsin. I thank Jim Andreoni, Jakob K. Goeree, Roger Gordon, John McMillan, Georg Nöldeke, and three referees for helpful comments. I thank the Economic Science Association for the invitation to give a talk at the 2003 Pittsburgh meetings that developed into this paper. I thank the National Science Foundation (SES-0241506) and Russell Sage Foundation (82-02-04) for financial support.

1. Introduction

Game theory had its beginnings in economics as a separate topic of analysis, practiced by a cadre of specialists. It has since become commonplace. Every economist is acquainted with the basic ideas, often without notice, and there is free movement between the use of game theory and other techniques. This incorporation as a standard economic tool has helped shape the nature of game theory itself: the mix of questions has changed and more attention has been devoted to how game theoretic models are to be interpreted as capturing economic interactions.

Mathematical economics and econometrics have each similarly progressed from being a topic pursued by a band of specialists to becoming a sufficiently familiar tool as to be used without comment. In the process, each has been shaped by issues arising in economic applications.

Experimental economics is currently making its transition from topic to tool.¹ Once viewed skeptically by many economists, experiments have become commonplace. Once again, this transition has involved changes both in the way economists view experimental methods and in the experimental methods themselves.

¹ For example, the Journal of Economic Literature’s “Mathematical and Quantitative Methods” classification section includes a “Design of Experiments” subsection, and a Nobel prize has been given for experimental work. At the same time, training in experimental methods has not yet joined basic econometrics or game theory as a standard part of the first-year graduate curriculum. Alvin E. Roth (1993) provides a history of early work in experimental economics. Roth (1995) continues this history and provides a more detailed discussion of recent experimental work. Roth (1991) proceeds further with some thoughts on the future of experiments in economics.

This paper explores one aspect of this integration of experimental economics into economics. How can we usefully combine work in economic theory and experimental economics? What do economic theory and experimental economics have to contribute to one another, and how can we shape their interaction to enhance these contributions?

There is already plenty of work that insightfully integrates theory and
experiments.² However, the methods for putting the two together are still developing. The goal here is to examine the issues involved in this development. Much can be gained by combining economic theory and experiments, but doing so calls for thinking carefully about the way we do theory as well as experiments.

² Vincent P. Crawford (1997) and Roth (1988) explore the interaction between economic theory and experiments, each arguing (as does this paper) that there are good reasons for thinking about experiments when doing economic theory as well as thinking about theory when doing experiments.

2. An Example

It is helpful to begin with an example in which experimental results and economic theory have constructively mingled. This example illustrates the ideas that will be developed more generally in section 3, illustrated in section 4, and then extended in section 5.

In 1965, Reinhard Selten (1965) introduced the concept of a subgame-perfect equilibrium. Subgame perfection is now taken for granted, in the sense that a paper whose conclusion hinged upon an equilibrium that was not subgame perfect would have a great deal of explaining to do.

Some years later, Werner Güth, Rolf Schmittberger, and Bernd Schwarze (1982) performed a simple experiment, examining what has come to be known as the ultimatum game. Player 1 makes a proposal for how a sum of money is to be split between players 1 and 2. Player 2 then either accepts, implementing the proposal, or rejects, in which case the interaction ends with zero payoffs for each. This is the type of game (perfect information, two players, only one move per player) in which subgame perfection is often viewed as being obviously compelling. In any subgame-perfect equilibrium of the ultimatum game, player 1 makes and player 2 accepts a proposal that gives player 2 at most one penny (or one of whatever is the smallest monetary unit available). In contrast, Güth, Schmittberger, and Schwarze obtained results that have been echoed by an ever-growing list of subsequent studies. The modal proposal is typically to split the sum of money evenly. If player 1 asks for two-thirds or more of the surplus, he stands a good chance of being rejected.

We thus have a marked contrast between theory and experiment. A common initial reaction was to dismiss the laboratory environment as uninteresting. Why should we be interested in how experimental subjects play an artificial game for token amounts of money? Borrowing a term from experimental psychology, this is a question of external validity: is the experimental environment sufficiently close to the situation of interest to be informative? In this case, for example, is the laboratory environment close enough to the situations envisaged by contract theorists when they assume that subgame-perfect equilibria appear in the ultimatum games embedded in their models?

³ For example, Lisa A. Cameron (1999), Elizabeth Hoffman, Kevin A. McCabe and Vernon L. Smith (1996), and Robert L. Slonim and Roth (1998) (larger payoffs); Joseph Henrich (2000), Henrich, Robert Boyd, Samuel Bowles, Colin F. Camerer, Ernst Fehr, Herbert Gintis, and Richard McElreath (2001), and Roth, Vesna Prasnikar, Masahiro Okuno-Fujiwara, and Shmuel Zamir (1991) (different countries and cultures); Gary E. Bolton and Rami Zwick (1995), Hoffman, McCabe, and Smith (1996), and Hoffman, McCabe, Shachat, and Smith (1994) (anonymity); David Cooper, Nick Feltovich, Roth, and Zwick (2002) (experience); Robert Forsythe, Joel L. Horowitz, N. E. Savin, and Martin Sefton (1995) and Glenn W. Harrison and McCabe (1992) (length); and Sally Blount (1995), Harrison and McCabe (1996), and Eyal Winter and Zamir (1997) (opponents).

One way of gaining some perspective on such questions is to turn them around. How special is the laboratory environment generating the experimental results? Can we link the results to aspects of the experimental environment that appear to be especially artificial, or do they appear to be robust? In the case of the ultimatum game, a long string of experiments has investigated the effects of playing for larger amounts of money, playing in different countries and cultures, playing with differing degrees of anonymity, playing with different amounts of experience, playing games of different length, and playing with different types of opponents.³ Some of these
variations matter, and there is much to be learned about which matter more than others. However, no combination of conditions has been found that produces the subgame-perfect equilibrium outcome sufficiently reliably as to allow us to dismiss the remaining experimental results. The mounting evidence suggests that the ultimatum game has something to tell us about behavior. One can often find reasons to dismiss any single experiment, but cannot ignore such a large and varied body of work.

Attention then turns to the theory. What implications for economic theory do the experimental results have? Perhaps none. We know that any theory is a deliberate approximation, and hence that there must be some circumstances under which it fails. Could it be that the theory is meant for settings not captured by the experiments, and that the theory is still useful in the applications for which it is intended? In this spirit, Ken Binmore, Avner Shaked, and John Sutton (1985) argue that the ultimatum game features an atypically asymmetric division of bargaining power, making subgame perfection unrealistically demanding, and that models built around subgame perfection might be a better match for two-stage bargaining games that feature a less extreme (though still asymmetric) distribution of bargaining power. Their experiments produced outcomes much closer to the subgame-perfect equilibrium in two-stage bargaining games. Are we then to assume that the subgame-perfect equilibrium is a useful model of behavior in bargaining models, as long as we stay away from models with equilibria that are too asymmetric? And if so, what does “too asymmetric” mean?

Once again, we can seek insight in the ensuing body of experimental work. Subsequent experiments have examined bargaining games with varying degrees of asymmetry in bargaining power (see Douglas D. Davis and Charles A. Holt [1993] for a summary). Departures from equilibrium are often much less pronounced than in the ultimatum game, but the data still does not invariably reflect equilibrium play. However, we do not expect theories to make exact predictions. How close is close enough? When are experimental results within the margin of approximation that is inevitably built into a theory, and when do they indicate that the theory is on the wrong track? There is typically no obvious standard for answering these questions. One can then imagine Davis and Holt’s summary figure (Figure 5.6) being regarded as evidence for both the success and the failure of conventional bargaining models, depending upon one’s point of view.

A return to the theory is again helpful, this time with an eye toward finding within the theory some guide for evaluating the experimental results. Glenn W. Harrison (1989, 1992) (see also Robert Drago and John S. Heywood [1989]) suggests one approach. A cornerstone of the relevant theory is that people maximize their expected payoffs. In light of this, a natural measure for evaluating the theory is the payoff losses subjects incur as a result of not behaving as predicted. The larger are these losses, the stronger is the evidence that the theory has missed something. Harrison argues that in the case of auctions, seemingly large departures from equilibrium behavior often translate into very small payoff losses, suggesting that the contrast between theory and behavior is not nearly as large as it first appears. Drew Fudenberg and David K. Levine (1997) turn a similar eye toward a variety of other games, including the ultimatum game. They find that behavior in many of these games is consistent with subjects’ holding beliefs about others’ behavior that are consistent with their experience and against which they suffer relatively small payoff losses. This again suggests that the theory may capture important elements of behavior, despite seemingly unencouraging experimental results. At the same time, however, payoff losses in the ultimatum game are relatively large compared to many other experiments. Rejecting an offer often involves a significant sacrifice, regardless of what one believes about how others are playing. It is
then harder to argue here that one can rationalize nonequilibrium behavior simply by arguing that the players are nonetheless achieving approximately equilibrium payoffs.

Binmore, John Gale, and Larry Samuelson (1995) and Alvin E. Roth and Ido Erev (1995) suggest an alternative approach to assessing how close observed behavior is to the predictions of the theory. Why should we expect equilibrium behavior in the first place? The traditional answer in economics is not that equilibria spring to life as a result of sheer calculation or external organization, but rather that behavior is pushed toward equilibrium by an adjustment or learning process that continually puts pressure on players to alter nonequilibrium behavior. Adopting this view, how strong are the incentives for players to adjust nonequilibrium behavior in simple bargaining games?⁴ The stronger are these incentives, the stronger is the experimental evidence that the theory has missed something. In the case of bargaining games, it turns out that these incentives can be quite weak. Even small amounts of noise or imperfection can cause the learning process to get stuck, for long or even indefinite periods of time, far away from a neighborhood of the subgame-perfect equilibrium. We thus again have a suggestion that the observed behavior may not be too far from equilibrium, by at least one measure of “too far.” Motivated by similar considerations, a literature on learning and its relationship to experimental behavior has developed.⁵ However, two difficulties now arise. First, what can the subjects, especially responders, possibly have to learn in a game so simple as the ultimatum game? Without a clear answer to this question, learning models are difficult to interpret. We return to this question in section 5. Second, learning theories have proven to be cumbersome tools with which to examine strategic interactions. A successful theory trades off its explanatory power with its ease of use. It has not been easy to formulate learning models that rival equilibrium theories in terms of readily yielding sharp predictions.⁶

⁴ In spirit, this is close to asking how far realized payoffs fall short of the payoffs that could be obtained by playing a best response. The difference is that the incentives for adjusting one’s behavior are now taken to be not the payoffs promised by perfect optimization, but rather the incentives to pursue the potentially imperfect learning process that shapes behavior.

⁵ See Camerer (2003) and Drew Fudenberg and David K. Levine (1998) for examples. Raymond Battalio, Samuelson, and John Van Huyck (2001) report an experiment linking the speed of learning and the incentives for adjusting one’s strategy, providing some hint that learning can be important.

⁶ Ed Hopkins (2002) provides an indication of why it can be difficult to identify the learning process behind experimental behavior.

Perhaps one should view the connection between theory and experiment differently. Instead of asking whether the theory gets the behavior right, and then wrangling over how the distance between experimental and theoretical outcomes is to be measured and interpreted, let us ask whether the theory captures the important considerations shaping the behavior. This directs attention away from the point predictions of the theory and toward its comparative statics. For example, experimental behavior that consistently responds to changes in discount rates as predicted by the theory of bargaining might lead us to believe that the theory has identified an important role for impatience in shaping behavior, even if the theory is not complete enough to capture every aspect of behavior. This emphasis on comparative statics pushes experimental analysis closer to methods familiar in other areas of economics.

The results in this respect for bargaining theory are mixed. For example, Binmore, Peter Morgan, Shaked, and Sutton (1991) and Binmore, Shaked, and Sutton (1989) report experiments in which behavior responds to the difference between a voluntarily exercised and involuntarily exercised outside option in a direction consistent with theoretical predictions. However, Jack Ochs
and Roth (1989) report an experiment in which behavior does not respond to the discount factor and the length of the game consistently with the predictions of subgame perfection. This latter finding is all the more disconcerting because the role of impatience is viewed as one of the key insights of noncooperative bargaining models.⁷ These results suggest that rates of impatience may be less central, and the prospect of a breakdown in negotiations more important, than captured by the original models.

⁷ The contrast between these two results becomes sharper in the context of section 4, which suggests that one should be especially disappointed when a theory fails to exhibit behavior integral to its original structure, such as the appropriate sensitivity to discount rates, but especially pleased when the theory successfully extends to originally novel questions, such as the effect of outside options.

Taken together, the body of experimental evidence suggests that our simplest theories of bargaining leave some aspects of behavior unexplained. This is interesting, but is most useful if the experiments also suggest how we might construct a more encompassing account of behavior. This brings us to the question, again borrowed from experimental psychology, of internal validity. How do we assess whether our interpretation of an experimental result captures the relevant aspects of the experimental situation and the resulting behavior, and hence points the way to a better understanding of the behavior and to better theoretical models of that behavior? For example, Güth, Schmittberger, and Schwarze (1982) (see also Güth and Reinhard Tietz [1982]) interpret their results as indicating that subjects’ behavior is shaped primarily by considerations of fairness. If this is the case, then we may be on the road to a new “fairness theory” of behavior. We might work with familiar bargaining models, but with quite different views of how people behave in these models. Notice that this assessment differs markedly from the hints with which the previous paragraph concluded, under which we would retain the basic view of self-interested behavior but revise the structure of the model.

An appeal to fairness has an intuitive ring to it. It is hard to believe that fairness does not play a role in our lives, or that extremely asymmetric allocations would not strike one as unfair. It also seems quite natural that these considerations would carry over into behavior in bargaining experiments. Here, however, we return to a theme that appeared in connection with learning models and that runs throughout this essay. The relevant question for evaluating a theory is not so much whether it is “correct,” but whether it can be readily and usefully applied to a sufficiently broad range of settings. The difficulty with appeals to fairness is that they too often have an “I know it when I see it” quality that makes them particularly cumbersome to use. Vesna Prasnikar and Roth (1992) develop this idea, reporting experimental results showing that, under some circumstances, experimental subjects do settle on extremely asymmetric allocations (see also James Andreoni, Paul M. Brown, and Lise Vesterlund [2002] and Harrison and Jack Hirshleifer [1989]). This appears to suggest that we have been too hasty in concluding that concerns for fairness routinely push people away from asymmetric allocations. However, the extreme allocations in Prasnikar and Roth’s best-shot game Pareto dominate the less asymmetric allocations. In response, it is tempting to refine the notion of fairness, viewing it as inducing an antipathy to asymmetric allocations, but an antipathy that is tempered when asymmetric allocations have efficiency properties that symmetric allocations lack. Adding the best-shot experimental results to our portfolio may thus suggest that fairness is important after all, but is a more subtle notion than simply a concern for equality or symmetry.⁸

⁸ Prasnikar and Roth (1992) investigate these possibilities by examining a market game in which asymmetric equilibrium outcomes appear that do not Pareto dominate the symmetric outcomes.
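As a hedged illustration of how an "antipathy to asymmetric allocations" might be made precise in the ultimatum game, the following Python sketch uses a two-player inequality-aversion utility of the kind associated with Fehr and Schmidt (1999). The functional form as written here, the parameter names alpha and beta, the 100-unit pie, and the convention that an indifferent responder accepts are all assumptions introduced for illustration. They are not specifications taken from this article or from any particular experiment.

```python
# Illustrative sketch only: the functional form and parameter values are assumptions.

def inequity_averse_utility(own, other, alpha, beta):
    """Own payoff minus penalties for being behind (alpha) and being ahead (beta)."""
    disadvantage = max(other - own, 0)  # amount by which the other player is ahead
    advantage = max(own - other, 0)     # amount by which this player is ahead
    return own - alpha * disadvantage - beta * advantage


def responder_accepts(offer, pie, alpha, beta):
    """Accept if the offer is at least as good as rejecting, which pays both players 0."""
    return inequity_averse_utility(offer, pie - offer, alpha, beta) >= 0


def smallest_accepted_offer(pie, alpha, beta):
    """Scan offers 0, 1, ..., pie (in the smallest monetary unit) for the first one accepted."""
    for offer in range(pie + 1):
        if responder_accepts(offer, pie, alpha, beta):
            return offer
    return None


if __name__ == "__main__":
    pie = 100
    print(smallest_accepted_offer(pie, alpha=0, beta=0))  # purely money-motivated responder
    print(smallest_accepted_offer(pie, alpha=1, beta=0))  # responder averse to being behind
```

With alpha = 0 and beta = 0 the responder cares only about money and accepts every offer, in line with the subgame-perfect prediction. With alpha = 1 the same search returns 34 out of 100, so offers well short of an even split are rejected. Such a sketch captures only a concern for equality. It does not represent the efficiency considerations raised by the best-shot results.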

There is much that is appealing about this argument, but it illustrates the difficulties of working with such an elusive concept as fairness. The more subjective or context-dependent is the idea of fairness, the less useful it becomes as a component of a theory, regardless of how important it is in shaping behavior.

Making progress in interpreting seemingly anomalous experimental results thus requires making the idea of fairness, or whatever else it is that one imagines affecting players’ behavior, sufficiently precise. A first question is theoretical: can we do so with conventional theoretical techniques, or are we dealing with something quite different? Are we dealing with a world to which the underlying structure of economic models applies (people maximize, they balance competing objectives, they respond to variations in the constraints on how these objectives can be traded against one another), even if they are concerned with something other than simply how much money they make? Or are such models of behavior simply on the wrong track?

Andreoni and John Miller (2002) provide some insight into this question through experiments in which a dictator faces a variety of exchange rates between the payoffs that the dictator can keep or allocate to a recipient (while Andreoni, Marco Castillo, and Ragan Petrie [2003] do much the same for the ultimatum game). As is often the case in such games, their results are not consistent with the proposition that all subjects care only about how much money they receive. However, their results are consistent with the claim that most subjects have stable preferences satisfying revealed-preference axioms. Whatever motivates the subjects, whether money or fairness or something else, it is something that we can model with the familiar optimization tools of economics, without abandoning rational behavior as a unifying principle.

The next task is again theoretical: fitting some more encompassing model of individual behavior into standard models of bargaining. Gary E. Bolton and Axel Ockenfels (2000) and Ernst Fehr and Klaus M. Schmidt (1999) (see also Bolton [1991]) offer models of preferences that capture a concern for fairness. Each is centered around a utility function that involves one’s own payoff and the payoff of one’s opponent, and that exhibits some aversion to payoff inequality.

One tempting reaction to these models is that nothing so simple could possibly capture the complexity of human behavior. Pursuing this view, it is not hard to find evidence that some factors are missing from these models.⁹ However, such criticisms miss the point. It is again important to recall that one purpose of any theory is to judiciously choose considerations to neglect. The ability to find some circumstances in which the theory does not work perfectly is then not by itself cause to reject the theory. While they may still be incomplete, the models offered by Bolton and Ockenfels (2000) and Fehr and Schmidt (1999) have the key virtue that their predictions are clear and they can easily be extended to encompass novel situations. This allows us to confirm that these models predict behavior matching that of standard models in a wide variety of circumstances in which the latter appear to be applicable, to confirm that they capture the apparent fairness considerations that operate in the bargaining models that motivated their construction, and to investigate the extent to which they apply to new applications. This is just what we need to make progress, and is what economic theory must do if we are to effectively combine theory and experiments.

⁹ Among others, Ken Binmore, John McCarthy, Giovanni Ponti, Samuelson, and Avner Shaked (2002) and Armin Falk, Fehr, and Urs Fischbacher (2003) report experimental results indicating that preferences must depend upon more than simply payoffs, even the payoffs of all players.

A variety of alternative and more elaborate models have appeared, many enriching the
theory by incorporating elements in preferences beyond simply the final allocation of payoffs.¹⁰ There is considerable work to be done in assessing and synthesizing these models, work that will require a continual interplay between economic theory and experiments. How is this interplay to proceed? It will be helpful to develop some of the ideas raised in the course of this example more generally.

¹⁰ Bolton (1991) offers an early model in which preferences depend upon others’ payoffs, while Matthew Rabin (1993), building on the theory of psychological games (John Geanakoplos, David Pearce, and Ennio Stacchetti [1989]), is an early example of how one might make the idea of fairness theoretically operational. Other examples include Gary Charness and Rabin (2002), Martin Dufwenberg and Georg Kirchsteiger (2004), Fehr and Simon Gächter (2000), Levine (1998), and Uzi Segal and Joel Sobel (2003). Andreoni and Samuelson (2003) report experimental results that explore, in a somewhat different setting, some of the key features of these models.

3. A Theoretical Perspective

This section opens a more general discussion with a theoretical perspective, in the form of a model of economic theory and economic experiments. The idea is to provide a precise way of talking about what a theory is, what an experiment is, and how the two might be related.

3.1 A Model

The environment. The model begins with the assumption that there is an objective environment or “real world” to be studied, represented by a function

F : X^∞ → S^∞,

where X and S are finite sets and X^∞ and S^∞ are their infinite-dimensional products.¹¹ We think of the function F as taking in information, given by an element of the set X^∞, that defines a situation of interest. This information might define an extensive-form game, or a set of lotteries from which one is to choose, or a market or an economy. With each such situation, the function F associates an output from the set S^∞.¹² Depending on the situation, this output might be an equilibrium of the game, or a selected lottery, or a market price or a competitive equilibrium.

We can view each of the dimensions of X^∞ and S^∞ as corresponding to a property or characteristic that a situation or an output might have, with the sets X and S providing the language in which one describes such properties. The details of the sets X and S are not particularly important. What does matter is that there is a potentially endless list of relevant properties, sufficiently many that neither theoretical nor experimental work could ever hope to describe every aspect of reality. We ensure this in the model by assuming that there are infinitely many such properties, so that the sets of inputs and outputs are the infinite products X^∞ and S^∞.¹³

¹¹ This discussion thus avoids questions concerning the existence of an objective reality that have been raised from widely differing perspectives. Physicists argue that quantum phenomena are not determined independently of attempts to measure them, while some social scientists argue that nothing objective exists independently of the observer, who constructs reality as she observes it. Such concerns can be relevant for economics. For example, could experimental procedures designed to elicit valuations affect those valuations? We return to this set of issues in section 5.1.

¹² Again, the model skirts a philosophical issue, concerning whether the world is deterministic or random. We adopt the technical convenience that outcomes are deterministic (conditional on being able to identify the situation completely), though in practice we can identify only finite approximations of situations, with outcomes that then appear to be random (conditional on this information). A random-world view requires only additional notation in order to accommodate an extra layer of probability distributions in the model.

¹³ The sets X^∞ and S^∞ can be viewed as a short-hand for sets that are finite but prohibitively large. Daniel C. Dennett’s Library of Mendel (1995) provides the setting for an intriguing discussion of large finite sets.

We think of the environment as generating situations which are then transformed into outcomes by the function F. We let  denote a probability distribution describing the process that generates situations. We think of a theory or an experiment as being a tool for understanding the function F. In
the absence of any constraints, of course, rates may use information on wage rates,
one would simply work with F itself. marital status, age and educational attain-
Unfortunately, the function F is too com- ment, but may neglect information concern-
plicated to work with directly. The idea is ing foreign exchange rates. The theory’s
then to combine theory and experimental output may provide information about how
work to produce tools that are simple much time an individual devotes to leisure,
enough to be used, while capturing enough but may say nothing about which activities
of F to be useful. consume this time.
Theory. Like the function F that As a first approximation, we might think of
describes the environment, a theory takes in the theory as choosing an output from SM.
information concerning a situation and pro- However, given that the theory’s input leaves
vides information concerning the correspon- some details of the situation unspecified, it is
ding outcome. However, instead of taking in more natural to view the theory as produc-
all of the information contained in an element of the set X∞, we model the theory as making use only of the dimensions 1,…,N, for some finite N. Similarly, instead of specifying every detail of the output, the theory provides information only about the dimensions 1,…,M, for some finite M. Let XN be the set of N-tuples corresponding to the first N dimensions of the set X∞, and let SM similarly be M-tuples whose elements correspond to the first M dimensions of S∞. A theory is then a function

$$f : X_N \to \Delta\Delta S_M,$$

for some N and M, where ∆SM is the set of probability measures over SM and ∆∆SM is the set of probability measures over ∆SM.

The restriction to finite N and M captures the fact that a theory does not make use of all of the information defining a situation, nor does it specify every detail of the output.14 Instead, one of the challenges in crafting a theory is to choose its inputs and outputs, i.e., to choose N and M, so as to include relevant information and neglect relatively unimportant details. For example, a theory about labor force participation

ing a probability distribution over the outputs in SM (i.e., an element of ∆SM). We then interpret the random output as reflecting the uncontrolled realization of those aspects of the situation that are not captured by XN. For example, a theory of labor force participation may provide an expected participation rate, as a function of an individual’s age and education, but would view actual participation as being randomly distributed around this expected value, reflecting other, unobserved characteristics.15

It is useful to go one step further and allow the theory to produce an element of ∆∆SM, the space of distributions over distributions. We may have more information concerning the likely values and implications of some of the unmodeled features of a situation than of others. We may then have a distribution over the realizations of the features about which we have relatively good information, each in turn inducing a distribution over outcomes. For example, an analyst asked to predict the outcome of the next presidential election might begin with the question of whether the economy will then be healthy or in recession. The analyst’s theory may involve a distribution over which of these is likely to be the case and, conditional on either, a distribution

14 While it is intuitive that a theory cannot make use of all the information in the environment, there is in principle no reason why it should be restricted to the first N dimensions of X∞. Why not a theory that makes use of the information in dimensions 1, 3, and 14 and ignores the rest? There is no loss in assuming that, whatever theory we have, the dimensions of X∞ are arranged so that the theory makes use of an initial string of them.

15 The idea that the theory produces a distribution over outcomes is perhaps most familiarly exploited by weather forecasters, who regularly announce probabilities of rain, but also appears routinely in economics. We return to this idea in section 4.1.2.
over likely outcomes of the election. Similarly, an economist asked to analyze the market for skilled labor might begin with a distribution over likely macroeconomic conditions, each of which in turn induces a distribution over conditions in the relevant labor market. In a model of labor force participation, one’s education and age may induce a distribution over participation decisions that itself depends randomly on labor market conditions. Who is to say whether the probability that the weather forecaster attaches to rain is not itself chosen randomly?

Experiments. An experiment similarly associates an output with an input. The experiment again begins with an element of XN (for some finite N), which we denote by xN and refer to as the experimental design. This design fixes those features of the environment captured in the N dimensions of XN. For example, the design xN may specify how much money the subjects earn in various circumstances.

The actual input to the experiment is a situation, i.e., an element of X∞ (denoted by x∞), that matches xN on the first N dimensions.16 The idea here is that the experimental design fixes those details of the environment described by xN, while leaving others uncontrolled. For example, the design may leave uncontrolled the wealth levels of the subjects.

Let FM denote the function comprising the first M dimensions of F. Given an experimental design xN and a corresponding input x∞, the experiment consists of an observation of the form:

$$F_M(x_\infty)$$

for some M. Hence, an experiment consists of a partial description (FM(x∞)) of the output of the function F that describes the environment, evaluated at one of a collection of possible inputs (x∞ ∈ X∞) that share the features (xN) determined by the experimental design.17

It may seem counterintuitive to characterize the experiment as yielding realizations of F, since experiments are often viewed as (and criticized for) being artificial rather than “real.” However, a more precise formulation of this criticism is that the experimental situation involves a value of xN that is not precisely the one in which we are most interested.18 But given this value, the output is given by FM(x∞) for some x∞ that matches xN on the relevant dimensions.

There are many situations x∞ consistent with an experimental design xN, as must be the case when we are unable to specify every detail of the experimental situation. One hopes the experimental design determines most of the important aspects of the situation, but cannot control all of the dimensions of the experimental situation x∞. In effect, the experiment is a model of a situation, just as is a theory. The output of an experiment is similarly a model, given by FM(x∞) rather than F(x∞). We can hope to identify the salient points of the experimental outcome, but again cannot identify everything.

16 Hence, the input is drawn from the set {x∞ ∈ X∞ : x∞(n) = xN(n), n = 1,…,N}.

17 An experiment may yield many observations, but we can arrange the notation to represent the entire experimental outcome as a single observation. This model ignores an issue raised by Roth (1994): the tendency to concentrate on “successes” when reporting experimental results can cause useful information to be neglected. A report of a successful experiment, whether it involves a seeming confirmation or contradiction of a theory, may be less informative than a report that also details the process leading to that experiment. The latter may include investigations of alternative games, alternative experimental procedures, alternative presentations of the experiment to the subjects, and so on. As Roth notes, the line between having also run alternative (possibly unsuccessful) experiments and having run pilot or diagnostic trials is often ambiguous, so that even the best of intentions do not ensure the optimal provision of information. The discussion here assumes that this problem has been solved, so that we have precisely the information we would want from an experiment, and then asks how we combine that information with economic theory.

18 For example, the experiment may involve university undergraduates choosing between small-stakes lotteries while we may be interested in risk attitudes among large traders in financial markets.
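
To fix ideas, the following is a minimal sketch of this setup in Python, offered only as an illustration under simplifying assumptions and not as part of the formal model. Situations are truncated to finite tuples, and the environment `F`, the helper names `run_experiment`, `theory`, and `draw_prediction`, and all numerical values are hypothetical.

```python
import random

# Finite stand-in for the environment F: a situation is a tuple of numbers
# and the output is a tuple of features. Everything here is hypothetical.
def F(situation):
    total = sum(situation)
    return (total > 0, round(total, 2), len(situation))

def run_experiment(design, M, extra_dims=5, rng=None):
    """Observe F_M(x): fix the first N dimensions at the design x_N, let the
    remaining dimensions vary uncontrolled, and report the first M outputs."""
    rng = rng or random.Random()
    uncontrolled = tuple(rng.gauss(0.0, 1.0) for _ in range(extra_dims))
    situation = tuple(design) + uncontrolled  # matches x_N on the first N dims
    return F(situation)[:M]

def theory(design):
    """A toy theory f: X_N -> distributions over distributions on S_M (M = 1).
    Returns (weight, distribution) pairs: the weights describe unmodeled
    conditions, each of which induces its own distribution over outcomes."""
    lean = 0.9 if sum(design) > 0 else 0.1
    return [
        (0.5, {(True,): lean, (False,): 1 - lean}),  # unmodeled features matter little
        (0.5, {(True,): 0.5, (False,): 0.5}),        # unmodeled features dominate
    ]

def draw_prediction(design, rng):
    """Draw one distribution pi from the theory's distribution over distributions."""
    weights, dists = zip(*theory(design))
    return rng.choices(dists, weights=weights, k=1)[0]
```

Repeating `run_experiment` with the same design will generally yield different observations, since the uncontrolled dimensions are redrawn each time; this is the sense in which the design pins down only part of the situation.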
The Goal. The goal of both theoretical and experimental work is to understand the world, or in the context of our model, to understand the function F. How can economic theory, often seemingly quite removed from the world, be combined with experiments in pursuit of this goal?

It is not easy to make this goal more precise. How do we know when we have achieved some understanding of F? We might judge our understanding by the ability to make predictions that match the outputs generated by F. For example, Erev, Roth, Robert L. Slonim, and Greg Barron (2002) pose the following question. Suppose we have both a theory and some experimental evidence bearing on a question of economic behavior, perhaps making different predictions and each potentially subject to error. How do we combine the two to reach a more precise, joint prediction?

The perspective of this paper is different. Instead of asking how we use existing theoretical and experimental results to make predictions, our focus will be on how we can exploit experimental results in the development of more useful theory, and vice versa. We thus shift the emphasis from using existing theory and experiments in making predictions to using them in making new theory and experiments.

To be meaningful, of course, this process must be organized by the ultimate goal of understanding the world. At this point, we confront new difficulties. While we might hope that a theory’s predictions will be close, we again cannot expect them to be exact. Then how are we to judge whether a new theory is an improvement? This would be straightforward if there were only benefits and no costs to enhancing the predictive power of a theory, but this is not the case. We return to this issue in section 3.2.

More importantly, the ability to make predictions is only part of what is involved in using economic theory to understand the world. Robert J. Aumann (1985) and Ariel Rubinstein (1998), for example, argue that while the ability to predict behavior may be a good test of our understanding of the world, the ultimate goal is the understanding itself. Economic theory can then be helpful in making precise our intuition and establishing relationships between our ideas, even without adding to our predictive abilities. This is a popular view, but one that makes it all the harder to identify the criteria by which theory is to be evaluated.

3.2 Why Experiments?

How do experiments help us assess and design economic theory? It is useful to start by considering the limitations of economic theory, organized around four ideas:

• Economic theory may be inaccurate: given an input xN ∈ XN, the theory f(xN) may produce distributions over outputs that do not match the distribution induced by the environment. Hence, given the information on which it conditions and the results it predicts, the theory provides a result that we would change if we knew the true model.
• Economic theory may be imprecise: the theory may produce a random output of sufficient noisiness to be unhelpful. If possible, we might then seek more precision by increasing N to encompass more information than that captured by XN, bringing more of the relevant variation in the situation within the purview of the model.
• Economic theory may be uninformative: important information may be missing from the output of the theory. We may then need to expand the range of the theory (increase M).
• Economic theory can be too complicated: if the vectors N and M are large, then the informational demands of the theory may be so burdensome as to make the theory useless.

In practice, we must expect these categories to blur together, with any particular theory exhibiting some degree of each shortcoming.
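
The distinction between an inaccurate and an imprecise theory can be illustrated with a small numerical sketch, here in Python. It is only one possible reading of the two ideas: total variation distance is used as a stand-in for inaccuracy and Shannon entropy as a stand-in for imprecision, neither of which is specified in the text, and the joint probabilities over age, education, and participation are invented for the example.

```python
import math

# Hypothetical joint probabilities over (age_group, educated, participates).
# These numbers are invented purely for illustration.
JOINT = {
    ("young", True, True): 0.20, ("young", True, False): 0.05,
    ("young", False, True): 0.10, ("young", False, False): 0.15,
    ("old", True, True): 0.15, ("old", True, False): 0.10,
    ("old", False, True): 0.05, ("old", False, False): 0.20,
}

def induced_distribution(design):
    """Distribution over participation induced by the environment,
    conditional on the first N = len(design) input dimensions."""
    n = len(design)
    weights = {True: 0.0, False: 0.0}
    for (age, educated, participates), prob in JOINT.items():
        if (age, educated)[:n] == tuple(design):
            weights[participates] += prob
    total = sum(weights.values())
    return {outcome: w / total for outcome, w in weights.items()}

def total_variation(p, q):
    """Assumed inaccuracy measure: distance between two distributions."""
    return 0.5 * sum(abs(p[o] - q[o]) for o in p)

def entropy_bits(p):
    """Assumed imprecision measure: Shannon entropy of a prediction."""
    return -sum(x * math.log2(x) for x in p.values() if x > 0)

coarse = induced_distribution(("young",))      # theory conditioning on N = 1 inputs
fine = induced_distribution(("young", True))   # theory conditioning on N = 2 inputs
wrong = {True: 0.2, False: 0.8}                # a sharp prediction that misses

print(entropy_bits(coarse), entropy_bits(fine))  # larger N gives a less noisy prediction
print(total_variation(wrong, fine))              # distance from the induced distribution
```

In this sketch both `coarse` and `fine` are accurate, in that each matches the distribution the environment induces given the inputs used, but `fine` is more precise at the cost of a larger N. The prediction `wrong` is sharp yet inaccurate.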
How can economic experiments help address the shortcomings of economic theory? First, experiments can fill the gap when the theory is either too uninformative or too complicated to be useful.19 For example, the role of economists in designing and running Federal Communications Commission spectrum auctions in the United States, and subsequently throughout the world, has been offered as evidence for the usefulness of economic theory. Before running the auctions, however, the FCC commissioned experiments (spearheaded by Charles R. Plott) to explore their properties (cf. Paul Milgrom 2004). These experiments played an important role in verifying the internal consistency of the auction procedure and in making the case that the auction could work. Experiments were similarly important in designing the British spectrum auctions (Binmore and Paul Klemperer 2002). There are many other examples, from designing multiunit auctions (Jeffrey S. Banks, John O. Ledyard, and David P. Porter 1989) to designing procedures to allocate access to railroad tracks (Paul J. Brewer and Plott 1996), payload priority on the space shuttle (Ledyard, Porter and Randii Wessen 2002), and airport take-off and landing slots (Stephen J. Rassenti, Vernon L. Smith, and Robert L. Bulfin 1982).

Second, and of more relevance for our discussion, much of the work in experimental economics has centered around identifying inaccuracies and imprecisions in economic theory.20 For example, the standard economic model of individual behavior is that people maximize expected utility. However, ample experimental evidence suggests that people do not always maximize expected utility, and do not count upon others to do so.21 More generally, there are long-standing research programs in economics and psychology that serve as a conscience for economic theory, arguing that much of our theory does not provide a good match for behavior.22

The difficulty here is that theories are intended to be inaccurate and imprecise. As we have noted, a theory is a deliberate approximation of a world too complicated to be analyzed in complete detail. It is then no surprise to find that the theory does not always match behavior. Experimental confirmation of this fact is potentially helpful, but only if it also points the way toward an improved theory.23 The constructive role for experiments that challenge economic theories is thus not to simply argue that existing theories do not work, but to point the way to improvements.24 Perhaps paradoxically, it is when playing this role that experiments pose the greatest challenge to economic theorists. It is relatively easy to dismiss an experimental contention that a theory is sometimes off the mark, but much harder to ignore an indication of how it might be improved.

A new difficulty now appears. When assessing potentially improved theories, we must trade off competing features that leave us with only a partial order over alternatives. Theories are better if they are more accurate and precise, but also if they

19 This falls into Roth’s (1987) category of “Whispering in the Ears of Princes.”

20 This falls into Roth’s (1987) category of “Speaking to Theorists.”

21 See Camerer (1995) and Roth (1995) for surveys.

22 See, for example, Dan Ariely, George Loewenstein, and Drazen Prelec (2003), Daniel Kahneman and Amos Tversky (2000), Loewenstein and Prelec (1992), Richard H. Thaler (1992, 1994), and Tversky and Kahneman (1982).

23 Neglecting this last point makes it all too easy to fall into a state of tension in which the primary value of experiments is seen as debunking theory, and theory is viewed as having to defend itself from the challenge of experiments. Against this backdrop, it is noteworthy that economic experiments owe much of their prominence to their demonstration that economic theory can be surprisingly robust. For example, Vernon Smith’s work on auctions (see Theodore Bergstrom [2003] for a survey and Smith [1991, 2000] for collections of papers) showed that elementary supply-and-demand models, the bread-and-butter tool of much of economics, were surprisingly descriptive.

24 Binmore (1999) advocates such a “consolidating” view of the interaction between theory and experiments.
are more parsimonious (i.e., have smaller N, for fixed M, or in some cases that have smaller N and M). A theory that makes better predictions at the cost of more complexity is not necessarily more desirable. Nor is the goal necessarily a single “correct” theory. Instead, we can expect to work with a portfolio of theories that address different issues and that lie at different points along a frontier that trades off power and complication.

The idea that a more complicated theory may not be better is obscured by economic theory itself, which implicitly assumes that reasoning and inference are costless and automatic.25 In practice, however, it is a familiar idea that theories are costly to use, and hence that a more accurate or more precise theory is not always superior.26 This point is often illustrated in introductory economics classes by asking students to think of a road map as a metaphor for an economic theory, and then to note that a map on a scale of 1:1 would be more precise than is commonly found, but its very detail would render it useless.

One of the obstacles to the integration of economic theory and experiments is thus that we have no clear idea of what makes a theory good. For example, we have ample evidence of shortcomings of expected utility theory, as well as an ample collection of alternative models. However, while it is easy to find papers in theory journals working on the tools that might serve as alternatives to expected utility theory, it is much harder to find papers that use these tools. Why? The informal explanation typically is that for most applications, expected utility theory’s lack of realism is a reasonable price to pay for its simplicity. This assessment convinces some, while striking others as too easy an excuse.

David W. Harless and Colin F. Camerer (1994) provide a foundation for examining this issue, introducing the notion of an efficiency frontier for generalizations of expected utility theory, balancing predictive power and simplicity. They find that ordinary expected utility theory lies on this frontier, as do several more sophisticated theories. This at least provides some reassurance that expected utility theory is not dominated on every dimension. Depending upon our requirements, we might reasonably choose to work at various points on this frontier, including work with expected utility. But what is the criterion by which points along this frontier are evaluated, other than conventional wisdom and accepted practice? How much of conventional wisdom and accepted practice reflects inertia, historical accident, a lack of familiarity with new theories, fashion, and similar factors? If we are to insist that the goal of a theory is not to be right but to be useful, then one of the great difficulties with economic theory is that we have little consensus on what makes a theory useful, other than that it is customarily used.

This presents a challenge in two respects. Theorists need to be more explicit, both in their theory and in their reactions to experiments, as to how they assess the trade-offs between various limitations. Experimentalists, when interpreting results as supporting an elaboration of existing theory, must address not only the potentially increased precision and accuracy of the theory but also the increased complication.

Is there anything special about experiments in this discussion? In one sense, no. The ideas apply to the use of data in general, regardless of whether an experiment lies

25 Standard models of reasoning and knowledge begin with a set of states and a partition over these states representing the structure of the available information (e.g., Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi 1995). These models have the implication that one automatically knows every implication of any information received. Hence, knowledge of the rules of chess ensures that one knows an optimal strategy, while knowledge of the basic axioms of mathematics makes all of the theorems of mathematics instantly available. It is no surprise that such models do not encourage one to think about the costs of complicated reasoning.

26 Barton L. Lipman (1999) examines a formal model of the cost of using a theory.
at their source.27 However, the great attraction and relevance of experiments is the ability they provide to control inputs. If we are interested in assessing the output f(xN) produced by the theory f in response to input xN, it may be easier to create (or approximate) input xN in the laboratory than “in the field.”28 The value of this control becomes all the more apparent upon realizing that we typically can neither ensure that our inputs include all of the factors we would like to have, nor that they exclude all of the ones we would like to not have. At the same time, this advantage brings with it a new challenge. How do we know when the experimental setting has done its job, giving us observations from situations consistent with the desired input xN and not something else? We return to this issue in section 5.1.

3.3 Why Theory

What does economic theory have to contribute? Paralleling the preceding discussion, it is helpful to begin with the limitations of experimental work:

• Experiments may be inaccurate: the experimental procedure is itself a situation. This procedure has presumably been designed to control the key features of the situation, but we cannot expect to have controlled everything. How do we know that the design brings the experimental situation sufficiently close to the real-world situations in which we are interested to be informative about the latter? Experimental psychology refers to this as the question of external validity.29
• Experiments may be imprecise: our interpretation of an experiment may incorrectly identify the links between the situation and the results. The unrecognized links may make the resulting inferences too noisy to be useful. This is a question of internal validity.30
• Experiments may be uninformative. It may not be possible to bring the experimental design xN close enough to the situation to provide useful information.
• Experiments may be informative only at prohibitive cost. Though one of the obvious advantages of experiments is the ability to address otherwise intractable problems in a manageable way, there may be cases where this is not feasible.

Again, these are neither sharply defined nor mutually exclusive categories, and we

27 The line between experimental and field data is becoming increasingly blurred, as economists turn to “field experiments” designed to capture the best of both settings. See Harrison and John A. List (2004) for an introduction to field experiments and the methodological issues they raise.

28 If we cannot observe situations consistent with input xN in the field, why do we care about xN? The answer is that some values of xN may provide especially revealing conditions under which to evaluate the theory. For example, theories about bargaining may be more readily evaluated when complicating interpersonal factors are stripped away by examining anonymous bargaining. This in turn may be possible only in the laboratory. For similar reasons, scientists may endeavor to free their experimental environments of impurities, even though such an environment is not observed in nature.

29 In one view of economic theory, there would be no problem of external validity. Elon Kohlberg and Jean-François Mertens (1986, p. 1005) state: “We adhere to the classical point of view that the game under consideration fully describes the real situation—that any (pre)commitment possibilities, any repetitive aspect, any probabilities of error, or any possibility of jointly observing some random event, have already been modeled in the game tree.” Pushing this view as far as it will go, the theory then identifies the situation exactly. If the theory is simple enough that all of its aspects can be captured in the lab, then we literally have the situation of interest and not simply an approximation, leaving no room for questions of external validity. If not, then the laboratory investigation is irrelevant to the theory. Under this view, an experimental result at odds with the theory tells us only that the experimental design has not captured the conditions under which the theory applies. This classical approach is best viewed as a philosophical exploration of the idea of rationality. It contrasts with a positive approach, under which economic theory is viewed as a tool for modeling and understanding behavior, a tool that is more useful the broader is its applicability. In this case, experimental results at odds with the theory help identify circumstances under which the theory is not applicable.

30 For example, do the choices of experimental subjects reveal the values they place on the consequences of those choices or some other aspect of the process by which choices are made or values identified? See Harrison, Ronald M. Harstad, and Elisabet Rutström (2002) for a discussion of value elicitation.
can expect experiments to exhibit elements of each.

How can economic theory help? First, economic theory can fill the gap when experiments are not sufficiently informative (at a reasonable cost) to be useful. Oil companies maintain teams of geologists who supplement sampling data with theoretical models designed to predict the likelihood of finding oil beneath a tract of land or ocean bed. Why bother, when a single experimental observation would suffice to provide the result? The difficulty is that the experiment in question consists of drilling a well, which can be sufficiently expensive as to be undertaken only after a favorable theoretical assessment. Similarly, our primary means of assessing nuclear weapons is theoretical, in the form of computer simulations.31 The relevant experiments are too costly.

Second, and again of more relevance for the current discussion, economic theory can be useful in assessing the external and internal validity of experiments. Insight into links between experimental outcomes and uncontrolled aspects of the experimental situation (and hence external validity), or insight into the link between the experimental environment and the observed behavior (internal validity), can be provided by theoretical models of the behavior. For example, experimental outcomes in continuous double auction markets (e.g., Plott and Smith 1978; Smith 1962, 1964, 1965, 1976, 1982) have been surprisingly efficient, given the apparent thinness of the markets. How do the traders overcome the frictions of a thin market to achieve nearly efficient outcomes? Under what circumstances can we expect similar behavior in actual markets and when should we be less sanguine about efficiency? Addressing the latter question has become easier as theoretical models have tackled the former, showing that the continuous flow of offers, coupled with traders’ budget constraints, generates a mechanical but powerful push in the direction of efficient outcomes (Brewer, Maria Huang, Brad Nelson, and Plott 2002; Dhananjay K. Gode and Shyam Sunder 1993, 1993, 1997; Sunder 2004). Alternatively, inconsistent behavior in laboratory decision problems is often interpreted as reflecting preferences that violate the expected-utility axioms. How do we know when we have uncovered something about preferences and when we should seek some other explanation in the experimental design? We have more confidence in the links between behavior and preferences when we have models of the latter.

Once again, a difficulty arises. A model consistent with the observed behavior does not always identify the principles behind the behavior. Instead, experience has shown that economists can build a variety of models consistent with virtually any behavior.32 How do we know when we have hit upon a clever but irrelevant model and when our model captures something important?33

Revisiting a theme, one of the obstacles to the integration of economic theory and

31 Developing such a theory is itself quite costly, so much so that its provision to other countries is treasonous. But in this case, the relevant experiments involve nuclear detonations whose direct and political costs are even larger.

32 Difficulties in distinguishing between theories that are consistent with observations and theories that “explain” these observations are not special to economics. Similar considerations arise in the view that one can falsify, but cannot “prove,” a scientific theory.

33 One response to concerns over internal and external validity is to subject the relevant experimental protocol to scrutiny. For example, Thaler (1988) wonders whether Binmore, Shaked, and John Sutton (1985) might have influenced the behavior of their experimental subjects by stressing in their experimental instructions that subjects should maximize their monetary payoffs. As in other experimental sciences, however, a useful response to potential inaccuracy or imprecision in economic experiments is to rely on replication. The more readily an experimental result can be replicated, the less likely is it to hinge upon uncontrolled or unrecognized features of a situation. The evaluation of a new experimental situation then lies in the ability of its “control” treatment to replicate previous results. For two examples among many, Binmore, Shaked, and Sutton begin their experiment by replicating the results of Werner Güth, Rolf Schmittberger, and Bernd Schwarze (1982), and Charles R. Plott and Zeiler (2003) begin their investigation of the endowment effect by replicating previous findings.
experiments is thus that we have no clear idea of when we have a good match between theory and behavior. This difficulty again poses a challenge in two respects. For theorists, there is much to be done in terms of identifying behavior that would enhance one’s confidence that the theory in question has captured the relevant principles, or that would force one to question such a conclusion. A good start would be to consistently explain what behavior a theory cannot explain.34 For experimentalists, it can be important to argue not only that a model captures the outcomes of the experiment, but that it captures the appropriate links between the experimental situation and the outcome. Again, a good start would be to consistently explain what outcomes would lead to the opposite conclusion.

4. Combining Theory and Experiments

4.1 Using Experiments to Learn About Theory

4.1.1 Testing Theory: Accuracy

How can we use experiments to evaluate economic theory? Suppose we fix an experimental design xN and a set of possible outputs SM, identifying the features of the input and output that are considered salient in the experiment. The resulting experiment produces an output sM. Does this indicate that we should be more confident of economic theories that place relatively large probability on the outcome sM, or on similar outcomes, when faced with the input xN? Some useful insight into this question is given by the following argument, adapted from Alvaro Sandroni (2002), that is typical of the calibration literature.

Given the design xN, the experiment’s output sM is randomly determined by the environment. In particular, a situation x∞ is randomly drawn from the set of situations whose first N dimensions match xN, i.e., from the set of situations that match the experimental design in those features controlled by the design. This situation is then converted into an output according to the function F describing the environment, and we observe the first M dimensions of this output, giving the output sM. We let π∗ denote the resulting probability distribution over the set of possible experimental outputs SM, and refer to π∗ as the true distribution.35

Similarly, given the input xN, a theory f can be viewed as producing an output π ∈ ∆SM, i.e., a probability distribution over the set of possible outcomes. This output is itself randomly chosen according to a probability distribution over ∆SM that is determined by the theory.36 We let f∗ denote this distribution.

The task now is to describe the implications of the experiment for the theory. We think of running the experiment, producing a randomly-drawn output sM (from the distribution π∗ induced by the experiment), and choosing a randomly-drawn distribution π (from the distribution f∗ induced by the theory). We insert these realizations into an evaluation rule T(sM,π). The evaluation rule produces the output T(sM,π) = 1 if we accept the theory given realizations π and sM and T(sM,π) = 0 if we reject the theory given realizations π and sM. Clearly, of course, a single experiment does not suffice to evaluate a theory. The labels “accept” and “reject” might accordingly be more precisely (but also more cumbersomely) phrased as “regard this experiment as evidence in favor

34 Among many such examples, Timothy N. Cason and Daniel Friedman (1996) and John H. Kagel, Harstad, and Dan Levin (1987) begin their analysis with theoretical models, focusing on aspects of behavior the models cannot accommodate.

35 Formally, π∗(sM) is proportional (being rescaled to ensure a total probability of one) to ρ({x∞ ∈ X∞ : x∞(n) = xN(n), n = 1,…,N and FM(x∞) = sM}).

36 Recall that a theory is an element of ∆∆SM, being a distribution from which a distribution over SM is randomly drawn. Notice that the outcomes of the experiment and the theory are drawn from different spaces. This is familiar. For example, the experiment produces the outcome rain or no rain, according to a distribution that depends upon such factors as the location and the season. The theory is allowed to announce a probability of rain, which may itself be drawn from a distribution that depends upon similar factors.
of the theory” and “regard this experiment as evidence questioning the theory.”

How do we design a useful evaluation rule $T$? One desirable criterion is that if one were to offer the true distribution $\pi^*$ as the realized output of one’s theory, then our evaluation of the experimental evidence should be unlikely to reject it. Because the experimental outcome is random, we cannot expect the distribution $\pi^*$ to always prompt an acceptance. For example, the evidence will sometimes reject the theory that a fair coin yields heads on half of its flips, simply because we encounter an unusual and unlikely sequence of outcomes. However, we can reasonably ask that such rejections be rare. We make this idea precise by saying that an evaluation rule accepts the truth with probability at least $1-\varepsilon$ if, for any true distribution $\pi^*$,

$$\pi^*\left(\{s_M : T(s_M,\pi^*) = 1\}\right) \ge 1-\varepsilon.$$

Hence, with probability at least $1-\varepsilon$, the true distribution $\pi^*$ generates an experimental outcome $s_M$ that would not prompt us to reject the truth, if we were asked to evaluate the truth as a possible theory. Notice that we will typically not know the true distribution $\pi^*$ when designing an evaluation rule, and hence our requirement is that the evaluation rule be unlikely to reject the truth (given the distribution of experimental outcomes generated by the truth), regardless of what the truth happens to be.37

At the other end of the spectrum, an evaluation rule is not particularly helpful in assessing a theory if there are no experimental outcomes that would cause the theory to be rejected (even though this would be one way to accept the truth with high probability). For example, an experimental test of the theory that a coin is fair is not helpful if it always accepts the theory, but could be useful if it instead rejects the theory when the observed proportion of heads (or tails) is too large. To capture this distinction, we say that an evaluation rule is blindly passed by theory $f$ with probability $1-\varepsilon$ if, for every $s_M \in S_M$,

$$f^*\left(\{\pi : T(s_M,\pi) = 1\}\right) \ge 1-\varepsilon.$$

Hence, no matter what observation $s_M$ the experiment produces, with probability at least $1-\varepsilon$ the theory $f$ (via its induced distribution $f^*$) produces a distribution $\pi$ over possible experimental outcomes that causes the theory to be accepted (given the observation $s_M$). The phrase “blindly passed” here refers to the fact that the theory $f$ is accepted by the evaluation rule with probability at least $1-\varepsilon$ regardless of the experimental outcome or, equivalently, to the fact that $f$ embodies no understanding of the true process generating experimental outcomes. As a result, a theory may blindly pass an evaluation rule with high probability, but without providing any insight into the principles governing the outcome in this situation.

The main result (proven in section 7) is now:38

Proposition 1 Any evaluation rule that accepts the truth with probability $1-\varepsilon$ can be blindly passed with probability $1-\varepsilon$.

At first glance, it seems obvious that an evaluation rule that accepts the truth can be passed: one need only propose the truth as one’s theory. However, Proposition 1 makes a quite different assertion. If an evaluation rule accepts the truth sufficiently often (i.e., with probability $1-\varepsilon$), then one can find a theory that requires no knowledge of the truth and has the property that, no matter what the outcome of the experiment and no matter what the actual process generating the experimental outcomes, the theory is accepted with probability $1-\varepsilon$. The following illustrates:

37 For example, we can design an evaluation rule that can observe one hundred flips of a coin and simultaneously be quite likely to conclude that the coin is biased toward heads when it is, and quite likely to conclude that the coin is biased toward tails when it is, because these two biases (if true) generate quite different distributions over experimental outcomes.

38 This is a special case of Proposition 1 in Alvaro Sandroni (2002).

Example. Suppose that there are only two possible outcomes of an experiment, head and tail. The environment induces a true probability distribution over these two outcomes, which we denote as $\pi^* \in [0,1]$, where $\pi^*$ is the probability of the experimental outcome head. As the notation suggests, we can think of the experiment as a single flip of a (possibly biased) coin, with $\pi^*$ being the true probability of a head. The theory generates a (possibly randomly determined) candidate probability $\pi$, which we must then combine with the experimental outcome to evaluate the theory. A possible evaluation rule is:

$$T(s_M,\pi) = \begin{cases} 1 & \text{if } s_M = \text{tail and } \pi < \tfrac{1}{3} \\ 1 & \text{if } s_M = \text{head and } \pi \ge \tfrac{1}{3} \\ 0 & \text{otherwise.} \end{cases}$$

Hence, the theory is accepted if the experimental realization is tail and the realization $\pi$ of the theory attaches probability less than $\tfrac{1}{3}$ to head (the first line), and is accepted if the experimental realization is head and the realization of the theory attaches probability at least $\tfrac{1}{3}$ to head (the second line). This particular evaluation rule accepts the truth with probability at least $\tfrac{1}{3}$.39 Such a minimum acceptance probability does not sound very impressive. By altering the evaluation rule, we could manage to boost this probability to $\tfrac{1}{2}$, but could not go further in this case.40 Now suppose the theory $f$ draws $\pi$ uniformly from the set $[0,1]$. Hence, consistent with the model of Section 3.1, the probability $\pi$ with which the theory predicts the outcome head is itself drawn randomly according to a distribution $f^*$ over $\Delta S_M$. Given the uniform distribution we clearly work without any information as to what the truth might be. Then

$$f^*\left(\{\pi : T(\text{tail},\pi) = 1\}\right) = f^*\left(\{\pi < \tfrac{1}{3}\}\right) = \tfrac{1}{3},$$

$$f^*\left(\{\pi : T(\text{head},\pi) = 1\}\right) = f^*\left(\{\pi \ge \tfrac{1}{3}\}\right) = \tfrac{2}{3},$$

and hence the evaluation rule is blindly passed with probability $\tfrac{1}{3}$.

To see the intuition behind Proposition 1, think of playing a zero-sum game against a malevolent and possibly omniscient opponent, “Nature,” where Nature chooses the true theory $\pi^*$ generating the experimental outcomes and you choose a theory $f$, with Nature attempting to maximize the probability of an outcome that rejects your theory (here we see Nature’s malevolence) and you trying to minimize this probability. Suppose (counterfactually) that you had the luxury of observing Nature’s choice before making your own. Then you could always simply name Nature’s choice as your theory, and the requirement that the test accept the truth with probability $1-\varepsilon$ ensures that your success probability would be at least $1-\varepsilon$. Alternatively, the worst that could happen is that Nature gets to observe your proposed theory before choosing the truth (here we see Nature’s potential omniscience) and then chooses the truth to minimize your success rate.41 The minmax theorem then gives us a result that is familiar in the context of zero-sum games, namely that you can do as well in the second circumstance as in the first, and hence can succeed with probability at least $1-\varepsilon$ in the second circumstance. But your optimal performance in the actual game, in which neither side gets to observe the other’s move, must be somewhere between these best and worst cases, ensuring that the test can be blindly passed with probability $1-\varepsilon$.

39 If the true distribution is $\pi^* < \tfrac{1}{3}$, the evaluation rule accepts $\pi^*$ if the experiment generates outcome tail, which happens with probability $1-\pi^*$ ($> \tfrac{1}{3}$). If the true distribution is $\pi^* \ge \tfrac{1}{3}$, the evaluation rule accepts $\pi^*$ if the experiment generates outcome head, which happens with probability $\pi^*$ ($\ge \tfrac{1}{3}$).

40 It is to be expected that an experiment with only two outcomes provides rather crude information: how much information can one expect to extract about the probability of heads, from a coin of unknown bias, from a single flip? Higher minimum acceptance probabilities require richer outcome spaces. Whether we are better off in this case with a rule that accepts the truth with probability $\tfrac{1}{2}$ depends upon the relative costs of mistakenly accepting or rejecting the various values of $\pi$.

41 Here, it is clear that one is not simply predicting well by offering the truth as a prediction, since the prediction is chosen first and then a worst-case specification of the truth is chosen.
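
The acceptance probabilities in the Example can be checked numerically. The following is a minimal Python sketch rather than part of the original argument: the function names, the `cutoff` parameter (1/3 in the Example, 1/2 for the altered rule mentioned there), and the finite grid of candidate truths are all illustrative assumptions. It computes the worst-case probability with which the threshold rule accepts the truth, and the probability with which a theory that draws π uniformly from [0,1] is accepted after each outcome.

```python
from fractions import Fraction


def T(outcome, pi, cutoff):
    """Threshold evaluation rule from the Example: 1 = accept, 0 = reject."""
    if outcome == "tail" and pi < cutoff:
        return 1
    if outcome == "head" and pi >= cutoff:
        return 1
    return 0


def truth_acceptance(pi_star, cutoff):
    """Probability that an outcome drawn from pi_star leads T to accept pi_star."""
    return (pi_star * T("head", pi_star, cutoff)
            + (1 - pi_star) * T("tail", pi_star, cutoff))


def worst_case_truth_acceptance(cutoff, grid_size=600):
    """Minimum acceptance probability over a grid of truths (assumed to contain cutoff)."""
    grid = [Fraction(k, grid_size) for k in range(grid_size + 1)]
    return min(truth_acceptance(p, cutoff) for p in grid)


def blind_pass_uniform(cutoff):
    """Acceptance probability after each outcome when pi is uniform on [0, 1]."""
    # P(pi < cutoff) = cutoff and P(pi >= cutoff) = 1 - cutoff under the uniform law.
    return {"tail": cutoff, "head": 1 - cutoff}


if __name__ == "__main__":
    for cutoff in (Fraction(1, 3), Fraction(1, 2)):
        passes = blind_pass_uniform(cutoff)
        print(cutoff, worst_case_truth_acceptance(cutoff), min(passes.values()))
```

With `cutoff = Fraction(1, 3)` both quantities equal 1/3, as in the Example; with `cutoff = Fraction(1, 2)` both equal 1/2. In each case the blind theory does exactly as well as the truth, which is the content of Proposition 1 for this rule.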

The implication of this result is that the ability of an economic theory to match experimental data does not necessarily provide evidence in support of the theory. Instead, given any specification of questions that a theory could be asked, and any specification of how the answers to these questions are to be compared to the experimental evidence, one can devise a theory based on no understanding of the situation or the underlying principles that allows one to be as successful as knowing those principles precisely.

This result is not simply a restatement of the common view that it is somehow more instructive if one first commits to a theory and then compares it to data (rather than first observing the data and then constructing a theoretical rationalization). More importantly, this result is not simply a restatement of the observation that it is important for theories constructed in response to experimental observations to make “out of sample” predictions, i.e., predictions that could be assessed only with the collection of new data. Instead, the ability to blindly construct a theory $f$ that fares as well as the truth depends upon knowing the evaluation rule $T$ by which the theory is to be assessed. As long as we identify a fixed set of potential tests to which a theory is to be subjected, whether in or out of sample, we can blindly construct a theory that fares as well as the truth in these tests, regardless of whether we have seen the outcomes of the tests and regardless of what these outcomes might be.

Interpreting experimental evidence as supporting a theory, or offering a theory as an interpretation of experimental evidence, thus acquires some bite only if the theory is clear and complete enough that it can be extended to answer new questions and confront new tests that did not play a role in the construction of the theory.42 Is the theory clear enough that others could design new tests, and is one willing to risk the theory in such tests? If not, then it is not clear that progress has been made.

For example, Bolton and Ockenfels (2000) and Fehr and Schmidt (1999) offer models motivated by behavior in bargaining experiments, with each model consisting of an explicit specification of how utility depends upon (one’s own and one’s rival’s) payoffs (cf. section 2). In doing so, the authors are offering models that (like all others) cannot hope to capture every detail of human motivation, and hence are bound to fail some tests. However, these models exhibit the essential characteristic of being sufficiently precise and powerful that new tests can be devised. The authors are taking some risk in presenting their theories so explicitly, but in return they ensure that their models can be meaningfully investigated experimentally. If their models do not provide useful alternatives to the hypothesis that players maximize their expected monetary payoffs, they will be stepping stones to such alternatives. Either way, their models allow progress that would be impossible without the ability to venture beyond the experimental designs that prompted them.

4.1.2 The Margin of Error: Precision

We have modeled a theory as producing a probability distribution over probability distributions over outcomes.43 In most cases, an economic theory provides nothing of the sort, with deterministic outcomes being the rule. How do we put these two together?

Think first about how economists typically do empirical work. The underlying intuition and theoretical structure come from a model free of anything random. But before confronting this model with the

42 Eddie Dekel and Yossi Feinberg (2004) propose a test for whether one’s theory matches the environmental function $F$ that hinges upon asking one to design (rather than react to) an evaluation rule $T$.

43 This ability to mix is important, as without it one cannot be assured of blindly passing evaluation rules that accept the truth.

data generated by a noisy world, an error term is added. The characteristics of this error can be important, providing the foundations for the inferences to be drawn from the results.

Assumptions about errors play a similarly important role in interpreting experimental results. One argues not that the data and the theory are a perfect match, but rather that the errors required to reconcile the data with the model are not too large.

What does “not too large” mean? Auctions have received significant attention from experimentalists, with results that often appear to be at odds with theoretical predictions.44 One interpretation of the observed behavior is to assume that subjects invariably intend to identify and take their optimal actions, but that some sort of “tremble” translates this optimal action into a random choice.45 The evidence convinces most observers that by this standard, there is often a large gap between theoretical results and experimental behavior: the trembles required to reconcile the two are too large, and hence much of auction theory appears insufficiently accurate to be a useful description of behavior.

Alternatively, one might interpret the observed behavior by assuming that subjects are only $\varepsilon$-optimizers, being content with identifying and playing an action that is within some $\varepsilon$ of a best response. Section 2 touched on Harrison’s (1989, 1992) argument that, by this standard, very little error is required to reconcile the theory with the data. It turns out that one’s actions have relatively little effect on expected payoffs in many auctions (as long as actions are not too far from equilibrium), and hence that one can come close to maximizing one’s expected payoff with actions that seem far away from the equilibrium. If we are to view errors in this way, then we must be careful in concluding either that optimization is a poor description of individual behavior or that the outcome is not (approximately) in equilibrium.46

Yet another interpretation of the observed behavior assumes that subjects choose their actions not by optimizing but through a process of trial-and-error learning.47 Here, errors are measured in terms of the strength of the incentives embedded in the learning process.

The implication in each case is that the interpretation of experimental results requires not only a theory, but also some idea of what types of errors are most likely involved when the theory does not work perfectly. Richard D. McKelvey and Thomas R. Palfrey’s (1995) quantal response equilibrium is perhaps the best developed and most general such model, built around agents who maximize utility functions perturbed by random terms. Notice that the errors here are built into the model of individual behavior from the beginning rather than being added at the end.48 These errors can be interpreted as capturing unmodeled but (one hopes) small effects on preferences. Quantal response equilibria have been used to good

44 For surveys, see Douglas D. Davis and Charles A. Holt (1993), Kagel (1995), and Kagel and Levin (2002).

45 Such trembles may initially appear difficult to motivate, but similar ideas have played an important role in the equilibrium refinements literature. More importantly, the possibility that typing or other errors might lead to mistaken bids was a serious concern in the design of the FCC spectrum auctions (Paul Milgrom 2004).

46 At the same time, the experimental results still present a challenge for the theory. We no longer have evidence that the model is inaccurate, but we have evidence that it is not sufficiently precise to be a useful description of behavior. In response, we could restrict our attention to payoffs (effectively, shortening the list $M$ of outputs of the theory) or refine the theory in hopes of more precisely capturing behavior.

47 Binmore and Samuelson (1999) study learning models whose results depend importantly on the nature of (possibly very small) errors.

48 It is a familiar result that incorporating uncertainty into the construction of a model can yield results that differ from simply appending error terms to a deterministic model. For example, incorporating an error term into players’ choices in a game and then solving for a (perfect) equilibrium (Reinhard Selten 1975) can give results quite different from first solving for a (Nash) equilibrium and then adding an error term.

effect in analyzing a variety of experimental results.49

Once again, however, new challenges appear. Philip Haile, Ali Hortacsu, and Grigory Kosenok (2004) show that quantal response equilibrium is a sufficiently flexible notion that, by appropriately specifying the error terms, one can obtain equilibria consistent with any behavior that one might possibly observe. The unmodeled errors are thus important. Without further assumptions concerning their distribution, too much is left out of the model for its predictions to be usefully precise.50

There are then two possibilities for harnessing the potential power of quantal response equilibria. First, quantal response models can provide comparative static implications even without distributional assumptions.51 Alternatively, Haile, Hortacsu, and Kosenok’s (2004) result depends upon having sufficient freedom in specifying the errors in the individual utilities underlying the quantal response model. We may often have either intuition or experimental evidence about what forces are captured by the errors. We may then augment the underlying model with hypotheses about the distribution of errors sufficiently powerful to produce precise results. In effect, we are enhancing precision by expanding the set of inputs $X_N$ to capture more information. Jacob K. Goeree, Holt, and Palfrey (2004) note that applications of quantal response equilibria typically work with models that are monotonic, in the sense that increasing the expected payoff of an alternative increases the probability that it is chosen.52 Goeree, Holt, and Palfrey provide sufficient conditions for quantal response equilibria to be monotonic and show that monotonic quantal response equilibria can have substantive empirical content.53 The informativeness of experiments based on quantal response models is thus enhanced by a better theoretical understanding of such models.

Focusing attention on the specification of errors has the advantage of leading naturally to a provision for heterogeneity in players’ behavior. Perhaps one of the most robust findings to emerge from experimental economics is that such heterogeneity is widespread and substantial. Despite this, heterogeneity has often not played a prominent role in theoretical models. Instead, theoretical explanations often have the flavor of seeking “the” model of individual behavior that will account for the experimental behavior. This appears to be a holdover from the original presumption that monetary payoffs, common to all subjects, suitably captured preferences, an assumption that encourages a view of players as homogeneous.54 Error terms provide a natural vehicle for capturing heterogeneity.

The implication is that there is much to be gained by making our treatment of errors in individual decision-making more explicit, and hence much to be gained in the interpretation of experimental results by being more careful with our theory. However, this is a task made all the more daunting by the

49 See, for example, Simon P. Anderson, Jacob K. Goeree, and Charles A. Holt (1998, 1998, 2001), Goeree and Holt (2001), Goeree, Holt and Thomas R. Palfrey (2002), and Richard D. McKelvey and Palfrey (1992, 1995, 1998).

50 Though the technical details are different, this result is similar in spirit to John O. Ledyard’s (1986) observation that any behavior is consistent with the notion of Bayesian equilibrium.

51 A concentration on comparative statics requires that the error distributions do not vary (or vary sufficiently regularly) as the parameters of the problem vary, a requirement lying behind many an econometric inquiry.

52 For example, a logit choice model with independent, identically distributed extreme-value errors satisfies monotonicity.

53 Other examples of work focusing on the structure of errors include Mahmoud A. El-Gamal and David M. Grether (1995), David W. Harless and Camerer (1995), Harrison (1990), and Daniel Houser, Michael Keane, and McCabe (1995). Similarly, Ledyard’s (1986) analysis of Bayesian equilibrium suggests that we augment the model, perhaps with assumptions about players’ beliefs.

54 For work on subject heterogeneity, see Andreoni, Marco Castillo, and Ragan Petrie (2003), Andreoni and John Miller (2002), and the examples cited in note 53.

observation that the considerations relegated to error terms are often there because we know little about them. Once again, theorists are sent back to the drawing board in search of theories precise enough to be useful.

4.2 Using Theory to Learn about Experiments

4.2.1 External Validity

Having found an experimental regularity, how do we assess whether the experimental design from which it emerges is a good match for the intended application (the question of external validity) and whether we have linked the resulting behavior to the appropriate characteristics of the design? The obvious observation is that more experiments are always helpful, and one of the great advantages of the experimental method is the ability to collect more data. But economic theory has a role to play in conjunction with these experiments.

For example, the standard assumption when modeling intertemporal choice is that people maximize the sum of exponentially-discounted expected utilities. Expected utility theory derives much of its appeal from the fact that it rests upon a collection of axioms that can be interpreted as prescribing consistent behavior (Leonard J. Savage 1972). Extending this argument to intertemporal behavior, consistency is similarly ensured by exponential discounting.

The difficulty is that the experimental evidence has not been particularly supportive of exponential discounting.55 The consensus leans toward a model in which discounting departs from exponential in the direction of being biased toward the present, so that discount rates decline as one evaluates more distant payoffs. Hyperbolic discounting is the most prominent example.

The case for hyperbolic discounting (or other forms involving a bias toward the present) is often bolstered with results from (nonhuman) animal as well as human experiments. The use of hyperbolic discounting in interpreting results from animal experiments is routine (e.g., James E. Mazur 1984, 1986, 1987). The question to be considered here is one of external validity: how relevant is the animal evidence for human behavior?

Our approach to this question is to ask not how similar animals and humans are, but rather how similar are the typical discounting problems faced by animals and humans. In turn, the approach to this latter question is to examine theoretical models of these discounting problems.

Discounting in animals is commonly examined in the context of foraging behavior (e.g., Alasdair I. Houston and John M. McNamara (1999), Alex Kacelnik (1997), Michael Bulmer (1997)). It is helpful to begin with a highly simplified, deterministic model. Suppose an animal faces the problem of maximizing total food consumption over an interval of length $T$. A function $c:\mathbb{R}_+ \to \mathbb{R}_+$ identifies the quantity of consumption $c(t)$ that can be secured upon the investment of foraging time $t$. The animal is to make a succession of foraging-time/consumption pairs of the form $(t,c(t))$, where each choice $(t,c(t))$ allows consumption $c(t)$ but precludes another choice until time $t$ has passed.

The animal’s task is to choose an optimal pair $(t_\tau,c(t_\tau))$ for any length $\tau$ of time remaining in the foraging interval. Let $V(\tau)$ be the value of the optimal continuation consumption plan, given the length $\tau$ of time remaining.56 If $\tau$ is sufficiently large, then the optimal consumption plan will be nearly stationary, featuring a choice of some fixed, optimal $t^*$ at each opportunity. This allows the approximation

$$V(\tau) = \tau\,\frac{c(t^*)}{t^*}.$$

55 See Shane Frederick, Loewenstein, and Ted O’Donoghue (2002) for a survey and Maribeth Coller, Harrison, and Rutström (2003) for an alternative view.

56 The function $V$ is implicitly defined by $t_\tau \in \arg\max_t \{c(t) + V(\tau - t)\}$.

But then the optimal consumption plan $t^*$ maximizes

$$\frac{c(t)}{t}. \tag{1}$$

Hence, optimal foraging behavior induces a preference for consumption $c(t)$ at time $t$ over $c(t')$ at $t'$ if

$$\frac{c(t)}{t} > \frac{c(t')}{t'}.$$

Consumption at time $t$ is thus optimally discounted by $1/t$, i.e., is discounted by the hyperbolic function $1/t$. It then seems unsurprising that experiments with animals are suggestive of hyperbolic discounting.

How relevant is this evidence for humans? Hyperbolic discounting arises out of a model in which delayed consumption imposes an opportunity cost, in the sense that other consumption opportunities are precluded while waiting for the current realization. The variable $t$ measures the time spent foraging, during which consumption is precluded. There is nothing like this in the intertemporal decision problems typically associated with hyperbolic discounting in humans, where $t$ measures a delay during which other options are not closed. For example, when facing the canonical hyperbolic-discounting story of choosing between one sum of money now and another in a week, and then between the same sums in fifty-two and fifty-three weeks, there is no presumption that intervening consumption possibilities are sacrificed. We thus have reason to doubt that hyperbolic discounting in animals has sufficient external validity to be of relevance for human behavior.

This observation is only the first step of the story. There may still be good reasons for humans to engage in hyperbolic discounting. One possibility is that human intertemporal preferences were formed during a time in which people typically faced decision problems similar to the foraging choices thought to be typical of animal decisions, and that people now simply apply the resulting (hyperbolically discounted) preferences to current decisions without noting the different context.57 In effect, the opportunity costs of the time sacrificed while waiting for consumption may have been important in the ordinary lives of our ancestors, even if we do not commonly encounter them in our lives, potentially restoring the relevance of the animal experiments.58

A second difficulty now arises. Suppose we expand our simple foraging model to accommodate uncertainty. Let $\{X(1),\dots,X(n)\}$ be a collection of independent, positive-valued random variables. We interpret each of these as representing a foraging strategy, with each foraging strategy characterized by a random length of time until it yields a consumption opportunity. To keep the example transparent, we simplify our previous model by assuming that each consumption opportunity features one unit of food. The animal chooses a foraging strategy, waits until its payoff is realized, chooses another strategy (perhaps the same one), and so on, until a fixed foraging period of length $T$ has been exhausted.

This model gives what is commonly known as a renewal process. The intuition is that once a unit of food has been received, the process has been “renewed,” in the sense that the set of possible choices and outcomes has reverted (literally in the case of an infinite horizon and approximately in the case of a sufficiently long finite horizon) to its original configuration. For sufficiently long horizons, the optimal strategy will again be approximately stationary. Consider a stationary strategy, in which the same random variable $X(i)$ is chosen at each opportunity. Let $\mu_i$ be the mean time before food is realized under $X(i)$.

57 Peter D. Sozou (1998) and Partha Dasgupta and Eric Maskin (2003) explore evolutionary motivations for hyperbolic discounting that do not depend upon foraging as the standard decision problem.

58 This possibility provides one illustration of how elusive external validity can be. There is often no single or obvious external situation to which the model is to be applied. The question may then not be whether there are situations outside the laboratory that correspond to the experiment, but rather whether the corresponding situations are the “right” ones. We return to this point at the end of this section.

Let N(T) be the number of renewals (i.e., Two qualifications are relevant. First,
number of units of food) secured by time T. there are things about animal behavior that
Then the elementary renewal theorem we do not understand.59 More importantly,
(Sheldon Ross (1996, Proposition 3.3.1)) the point here is not to defend exponential
indicates that, as T gets large, discounting. Instead, it would be quite a sur-
N (T ) 1 prise if discounting were precisely exponen-

T
i .

As a result, the stationary strategy that chooses the random variable with the smallest mean time to renewal (µ_i) will be approximately optimal (among the set of all strategies, not just stationary ones), in the sense that it maximizes the number of renewals N(T) and hence consumption, for large values of T. This strategy chooses the random variable X(i) that maximizes

\[ \frac{1}{E\{t\}}, \tag{2} \]

where t is the renewal time and E{t} = µ_i is its expected value. In contrast, applications of hyperbolic discounting in economics typically assume that people maximize the expected value of hyperbolically-discounted utilities. In our simplified case, recalling that each random delay is terminated by the appearance of one unit of food, this calls for maximizing

\[ E\{1/t\}. \tag{3} \]

The objectives given by (2) and (3) can especially differ if the menu of foraging strategies includes alternatives with high mean renewal times but that attach some probability to very short waiting times. Such strategies may fare very well under (3), while being less attractive under (2).

We thus find that an appeal to our evolutionary background may or may not allow us to interpret animal evidence as bracing a belief in human hyperbolic discounting, but that in the process we also provide evidence against commonly-used models of (hyperbolically discounted) expected utility maximization. There appears to be no obvious way to interpret animal experiments as supporting both hyperbolic discounting and expected utility maximization.

tial. There is also evidence of hyperbolic discounting from human experiments, which the current discussion does not call into question.60 The point is that extending results from animal experiments to conclusions about human behavior raises questions of external validity that can be examined through the lens of economic theory. In connection with hyperbolic discounting, the accompanying theory is not immediately supportive of a link.

Assessments of external validity can be further complicated by the fact that the appropriate external environment for comparison is often not obvious. Consider one of the simplest experimental settings, the dictator game. Experiments find that dictators typically do not seize all of the money, despite the lack of any obvious reason for not doing so (Davis and Holt 1993; Robert Forsythe, Joel L. Horowitz, N. E. Savin, and Martin Sefton 1995). What should we make of this result? Each of us is constantly involved in a version of the dictator game, in that we constantly have opportunities to give away the money in our wallets, or anything else that we own. Typically, however, we hold on to what is ours. One might then view the experimental evidence as being swamped by a mass of practical experience with the dictator game, in which people for the most part tend to keep what they have.

59 One of the puzzles facing biologists is that observed behavior appears to match the objective given by (3) more closely than the simple theoretical prediction that (2) be maximized (Melissa Bateson and Alex Kacelnik (1996), Kacelnik (1997), Kacelnik and Fausto Brito e Abreu (1998)).
60 Again, see Frederick, Loewenstein, and O’Donoghue (2002). Here, as always, there are still questions of internal validity. Is the observed behavior a product of hyperbolic discounting, or something else? Halevy (2004) and Ariel Rubinstein (2003) explore alternatives.
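
The divergence between the foraging objectives (2) and (3) can be illustrated with a small numerical sketch. Everything in it is hypothetical: the two-option menu, the waiting times, and the function names are assumptions chosen only so that one option has a high mean renewal time but some probability of a very short wait.

```python
from fractions import Fraction

# Hypothetical menu of foraging options. Each option is a list of
# (waiting time, probability) pairs describing the renewal time t.
MENU = {
    "steady": [(Fraction(2), Fraction(1))],
    "risky": [(Fraction(1), Fraction(1, 2)), (Fraction(9), Fraction(1, 2))],
}


def long_run_rate(dist):
    """Objective (2): 1 / E{t}, the long-run rate of renewals."""
    return 1 / sum(t * p for t, p in dist)


def expected_inverse_delay(dist):
    """Objective (3): E{1/t}, the expected hyperbolically discounted unit of food."""
    return sum(p / t for t, p in dist)


def best_option(menu, objective):
    """Name of the option in the menu that maximizes the given objective."""
    return max(menu, key=lambda name: objective(menu[name]))
```

In this illustration the "risky" option has the higher mean waiting time (5 against 2) and so loses under (2), yet its chance of a very short wait makes it the better choice under (3).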

Then what do the experiments have to tell us? One message is clear: people do not always keep everything. This is a useful point of departure. Outside the laboratory, people also sometimes relinquish what they own, giving gifts and making contributions to charity. A variety of explanations have been offered for why this seemingly altruistic behavior is consistent with rational, selfish behavior.61 While often persuasive, and consistent with some aspects of behavior in dictator experiments,62 it seems a stretch to suggest that such explanations can cover every bit of generosity. One can then view dictator experiments as an attempt to strip away the confounding factors and isolate a situation in which rational, selfish behavior has a clear prediction, allowing us to conclude that people are not always relentlessly selfish.

This is instructive, but only the most extreme would claim that selfish preferences are a complete description in every circumstance. Do the dictator experiments have anything to contribute beyond challenging such extremists? Here we return squarely to the question of what is the appropriate context in which to evaluate the external validity of dictator experiments. Does the experimental allocation represent the continual decisions we implicitly make about whether to keep our wealth or give it away? If so, then the findings provide a serious challenge to the preferences commonly used in economic models. Does the experiment capture those rarer circumstances under which people make anonymous contributions to charity? If so, then the findings are commonplace. A useful point of departure in addressing this issue is again theoretical, aimed at identifying and modeling the features that distinguish the first set of circumstances from the second, and then interpreting these circumstances in terms of experimental designs and findings. Once again, the general point is that examining the relevant theory can help assess the interpretation and external validity of experimental results.

4.2.2 Internal Validity

Experiments in economics typically feature monetary payoffs. Can we assume that these monetary payoffs represent utilities? Section 2 touched on one reason why the answer might be no, namely that subjects might care about more than simply the amount of money they make. However, suppose that this is not the case. If subjects are risk averse, then monetary payoffs still do not provide a good representation of utility.

One of the early insights of experimental economics was that we can effectively eliminate risk aversion, as long as subjects are expected-utility maximizers. Suppose one has in mind an experiment that would make monetary payments ranging from 0 to 100. Then replace each payoff x∈[0,100] with a lottery that offers a prize of 100 with probability x/100 and a prize of zero otherwise. Expected payoffs are unchanged. However, for any expected utility maximizer, regardless of risk attitudes, the expected utility of a lottery that pays 100 with probability p (and 0 otherwise) is

\[ pU(100) + (1 - p)U(0) = U(0) + [U(100) - U(0)]\,p. \]

This expression is linear in p, meaning that the agent is risk neutral in the currency of probabilities. On the strength of this convenience, lottery payoffs have often been used in experimental economics.63

61 For example, people are said to give gifts in anticipation of reciprocation, to contribute to charity in order to gain esteem, to tip in order to advertise their generosity to fellow diners, and so on.
62 For example, the sensitivity of amounts retained by dictators to the degree of anonymity in the experiment (Hoffman, McCabe, Keith Shachat, and Smith 1994; Hoffman, McCabe, and Smith 1996) could be interpreted as indicating that one purpose of a seemingly altruistic act is to demonstrate one’s behavior to others.
63 See Cedric A. B. Smith (1961) for an early theoretical discussion of lottery payoffs, and Roth and Michael W. K. Malouf (1979), Roth and J. Keith Murnighan (1982), and Roth and Françoise Schoumaker (1983) for early experimental applications.
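
The way lottery payoffs induce risk neutrality can be seen in a short numerical sketch. It is an illustration under stated assumptions, not a description of any cited experiment: the square-root utility function, the function names, and the two example prospects are all hypothetical, and the agent is assumed to evaluate compound lotteries by expected utility.

```python
import math


def lottery_eu(p, utility, prize=100.0):
    """Expected utility of winning `prize` with probability p and 0 otherwise."""
    return p * utility(prize) + (1 - p) * utility(0.0)


def cash_eu(outcomes, utility):
    """Expected utility when each (amount, probability) outcome is paid in cash."""
    return sum(q * utility(x) for x, q in outcomes)


def points_eu(outcomes, utility, prize=100.0):
    """Expected utility when each payoff x is paid as an x/prize chance of the prize."""
    return sum(q * lottery_eu(x / prize, utility, prize) for x, q in outcomes)


# Hypothetical prospects: 40 for sure, or a fair coin flip between 0 and 100.
SURE_FORTY = [(40.0, 1.0)]
COIN_FLIP = [(0.0, 0.5), (100.0, 0.5)]
```

With the concave utility `math.sqrt`, the agent prefers the sure 40 when paid in cash but prefers the coin flip when paid in lottery points, because the points version ranks prospects by their expected payoff (40 against 50). This matches the linearity in p of the expression above.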

Against this background, Matthew Rabin (2000) (see also Rabin and Richard H. Thaler 2001) presents an argument that we illustrate with the following example. Suppose Alice would rather take $95.00 with certainty than face a lottery that pays nothing with probability 1/2 and $200 with probability 1/2. Suppose further that Alice would make this choice no matter what her wealth. Then either the standard model of utility maximization does not apply, or Alice is absurdly risk averse.

To see the reasoning behind this argument, assume that Alice has a differentiable utility function U(w) over her level of wealth w, with (at least weakly) decreasing marginal utility. Alice’s choice implies that the utility of an extra 95 dollars is more than half the utility of an extra 200 dollars. This implies that (1/2) · 200U′(w + 200) ≤ 95U′(w), where w is Alice’s current wealth and U′(w) is the largest marginal utility found in the interval [w, w + 200] and U′(w + 200) is the smallest marginal utility in that interval.64 Simplifying, we have, for any wealth w,

\[ U'(w + 200) \le \frac{19}{20}\, U'(w). \tag{4} \]

Now letting w_0 be Alice’s initial wealth level and stringing such inequalities together, it follows that, for any w, no matter how large, Alice’s utility U(w) satisfies

\[
\begin{aligned}
U(w) &\le U(w_0) + 200U'(w_0) + 200U'(w_0 + 200) + 200U'(w_0 + 400) + 200U'(w_0 + 600) + \cdots \\
&\le U(w_0) + 200\left[ U'(w_0) + \frac{19}{20}\, U'(w_0) + \left(\frac{19}{20}\right)^{2} U'(w_0) + \cdots \right] \\
&= U(w_0) + 200U'(w_0) \Big/ \left(1 - \frac{19}{20}\right) \\
&= U(w_0) + 4000U'(w_0),
\end{aligned}
\]

where the first inequality breaks [0, ∞) into intervals of length 200 and assumes that the maximum possible marginal utility holds throughout each interval, the second repeatedly uses (4), and the remainder is a straightforward calculation.

Now consider a loss of 3000. A similar argument shows that the utility U(w_0 − 3000) must satisfy

\[
\begin{aligned}
U(w_0 - 3000) &\le U(w_0) - 200U'(w_0) - 200U'(w_0 - 200) - \cdots - 200U'(w_0 - 2800) \\
&\le U(w_0) - 200\left[ U'(w_0) + \frac{20}{19}\, U'(w_0) + \cdots + \left(\frac{20}{19}\right)^{14} U'(w_0) \right] \\
&\le U(w_0) - 4400U'(w_0).
\end{aligned}
\]

The last step uses the geometric sum 1 + (20/19) + … + (20/19)^14 = 19[(20/19)^15 − 1] ≈ 22.0, so that 200 times the sum is at least 4400.

Comparing these two results, and averaging the two bounds so that the coefficient on U′(w_0) is (1/2)(4000) − (1/2)(4400) = −200, we have that for any X > 0,

\[ \frac{1}{2}\, U(w_0 - 3000) + \frac{1}{2}\, U(w_0 + X) \le U(w_0) - 200U'(w_0) < U(w_0). \]

Hence, there is no positive amount of money X, no matter how large, that would induce Alice, no matter how wealthy, to accept a fifty/fifty lottery of losing 3000 and winning X. Risk aversion over relatively small stakes thus implies absurd risk aversion over larger stakes.

Risk aversion over small stakes seems quite reasonable and is consistent with laboratory evidence (e.g., Holt and Susan K. Laury 2002). How do we reconcile this with the seeming absurdity of the implied behavior over larger stakes? Taking it for granted that people are not so risk averse over large stakes, Rabin (2000) and Rabin and Thaler (2001) suggest that the expected-utility model should be abandoned.

64 Hence, 95U′(w) is an upper bound on the utility of an extra 95 dollars, and 200U′(w + 200) a lower bound on the utility of an extra 200 dollars. Rabin (2000) contains additional examples and shows that the argument extends beyond the particular formulation presented here.

This conclusion poses a puzzle for experimental practice. The use of lottery payoffs appears to be either unnecessary (because
subjects are risk neutral over the relatively modest sums paid in experiments) or necessarily ineffective (because subjects are risk averse over small sums, and hence cannot be expected-utility maximizers). The argument is even more challenging for economic theory, where expected-utility maximization is firmly entrenched.

In response, our attention turns to questions of internal validity. Is the observed behavior appropriately interpreted as reflecting departures from expected-utility maximization? Addressing this question requires a more careful look at the theory. Let X be a set of consequences, let there also be a set of states, and let L be a set of acts, where an act is a function associating a consequence with each state. Savage (1972) shows that if an agent has preferences over the set of acts L satisfying certain axioms, then the agent chooses as if she has a probability distribution p over the set of states and a utility function U over X, and maximizes expected utility.

This theory makes no comment as to what is contained in the set X over which utilities are defined (cf. James C. Cox and Vjollca Sadiraj 2002). The argument that Alice’s risk aversion over small stakes implies implausible behavior over large stakes implicitly assumes that utility is a function of (only) Alice’s final wealth, that is, the amount of money she has after the outcome of the lottery has been realized. Hence, Alice must view winning a million-dollar lottery when initially penniless as equivalent to losing $9,000,000 of an initial $10,000,000. This is the most common way that expected utility appears in theoretical models, but nothing in expected utility theory precludes defining utility over pairs of the form (w, y), where w is an initial wealth level and y is a gain or loss by which this wealth level is adjusted. In this case, Alice may view the two final $1,000,000 outcomes described above quite differently. And once this is the case, there need no longer be any conflict between being an expected utility maximizer, being risk averse over small stakes, and still behaving plausibly over larger stakes.65

This argument can be taken a step further. Savage (1972, pp. 15–16, 82–91) views expected utility theory as applicable only to “small-worlds” problems, in which the sets of states, consequences and acts are simple enough that one can identify and explore every implication of each act. Savage notes that it is “utterly ridiculous” to encompass all of our decision-making within a single small-worlds model (1972, p. 16). Instead, his view (1972, pp. 82–91) is that decision makers break the world they face into small chunks that are simple enough to be approximated with a small-worlds view. We can expect behavior in these subproblems to be described by expected utility theory, but the theory tells us nothing about relationships between behavior across problems.

Duncan Luce and Howard Raiffa (1957, pp. 299–300) continue this argument, noting that, “one’s choices for a series of problems - no matter how simple - usually are not consistent.” They suggest that if one discovers an inconsistency, one should modify one’s decisions, with “this jockeying - making snap judgments, checking on their consistency, modifying them, again checking on consistency, etc.”, ultimately leading to consistent expected-utility maximizing behavior. We can thus expect consistent behavior only across sets of choices (or worlds) that are sufficiently small that we can expect the required adjustment to have been made.

65 There is then no inconsistency in believing that experimental subjects are expected utility maximizers while using lotteries to control for risk aversion over small stakes. The evidence on whether lottery payoffs successfully control for risk aversion is not entirely encouraging (e.g., Joyce E. Berg, John W. Dickhaut, and Thomas A. Rietz 2003; James C. Cox and Ronald L. Oaxaca 1995; Selten, Abdolkarim Sadrieh, and Klaus Abbink 1999; and James M. Walker, Smith, and Cox 1990). These findings present yet another challenge to the presumption that experimental subjects maximize expected utility.

Returning to our original setting, the set of all lotteries may be too large a world to encompass within a single expected-utility
formulation. If we define utility in terms of final wealth levels, Alice’s expected-utility maximization over small stakes may then not be consistent with her behavior over large stakes. But she may nonetheless be maximizing expected utility, though with a utility function in which wealth or some other variable indexes different small worlds problems, each of which is treated via a utility function over (some subset of) final wealth levels.

This discussion is not to be read as a defense of expected utility theory. There is every reason to believe that so stark a theory cannot always be a good approximation. This discussion is instead meant to provide a word of caution in assessing the internal validity of experimental results. Risk aversion over small gambles, one of the seemingly most powerful challenges to the theory, may in fact be consistent with expected utility.

More importantly, this argument does not diminish the strength of the small-stakes-risk-aversion challenge to economic theory. The evidence remains that we can save expected utility maximization as a useful theory only if something other than wealth enters utility functions. As Rubinstein (2001) notes, this opens the door to all manner of inconsistencies in decision making. Expected utility can be defended only by recognizing that economic theorists have a great deal of work to do.

Other illustrations of the importance of theory in assessing internal validity are easily found. Game-theoretic models featuring mixed Nash equilibria have been questioned on the grounds that individual play does not exhibit the identical, independent randomization required by the theory (e.g., James N. Brown and Robert W. Rosenthal 1990). But if the mixed equilibrium reflects either a population polymorphism (as suggested by John F. Nash 2002) or the result of an adaptive process, we would expect such independence to fail (e.g., Binmore, Joe Swierzbinski, and Chris Proulx 2001).

Alternatively, section 2 sketched the discussion of behavior in bargaining games up to the appearance of models in which preferences depend upon the vector of all payoffs, one’s own as well as the payoffs of others. Subsequent experiments have suggested that more is involved. Attitudes towards payoffs appear to depend not only on the payoffs themselves, but also on the context in which these payoffs were generated. A player is more likely to prefer a larger opponent payoff if the opponent’s play has been appropriate (kind, or fair, or generous, or expected) and more likely to prefer a smaller opponent payoff if the opponent’s play has been inappropriate. Experiments have provided evidence for positive reciprocity (the desire to reward those who have behaved appropriately) (Fehr and Simon Gächter 2000; Fehr, Gächter, and Georg Kirchsteiger 1997; Kevin A. McCabe, Rassenti, and Smith 1998) and negative reciprocity (the desire to punish those who have behaved inappropriately) (Fehr and Gächter 2002). However, we can expect subjects’ choices to reflect a mixture of concerns for one’s own payoff, inequality aversion, altruism, trust, and positive and negative reciprocity (cf. Cox 2004). How do we separate these forces, i.e., how do we assess the internal validity of the experiments? Once again, a useful point of departure is a model of preferences encompassing these forces and pointing to experiments that will distinguish them. Interpreting the experiments is again likely to rest upon careful theoretical modeling.

5. The Search for Theory

Where do we look for theoretical developments that will help integrate economic theory and experimental economics?

To approach this question, think of an experiment as being composed of three pieces. The game form (recognizing that the “game” may include only a single player) specifies the rules of play, including the number and characteristics of the players,
the choices available to the players, their timing and sequence, the information available to the players, the resulting consequences, and so on. To this, one adds a specification of how the outcomes are translated into utilities. It would typically be convenient if the monetary payoffs given by the game form could also be taken to represent players’ utilities, but this need not be the case. Let us refer to a game form and its associated utilities as a game protocol or simply protocol, and let us think of the “default” protocol as equating monetary payoffs and utilities.66 The third piece of the triad is a theory describing the behavior one would expect, given the game protocol.

In some cases, the game protocol leaves little to the discretion of the theory. If the protocol combines the dictator game with the assumption that monetary payoffs are equivalent to utilities, then a theory based on rational behavior leaves no room for maneuver: dictators must keep all of the money. Similarly, if the protocol pairs the ultimatum game with the assumption that monetary payoffs are utilities, then sequential rationality uniquely determines the implications of the theory. In other cases, the protocol leaves much to the discretion of the theory. Work on equilibrium refinements grew out of the fact that even if one restricts attention to relatively simple games and assumes that the payoffs are indeed utilities, sequential rationality in general puts relatively few restrictions on behavior.

Now consider how one might react if an economic theory and experimental results are consistently at odds. One possibility is that the theory should be refined, or extended, or altered, or abandoned. For example, the equilibrium refinements literature culminated in models of equilibrium selection centered around notions of forward induction. The experimental evidence has not been particularly supportive of forward induction,67 suggesting that theories based on forward induction could well be reconsidered.

In other cases, there is little to be gained by looking for alternative theories while maintaining the game protocol. In the bargaining games in section 2, for example, there appears to be no way to account for the observed behavior while clinging to a model based on rationality and the default protocol. The result, as we have seen, has been a flurry of work developing alternative models of preferences.68

We return to the modeling of preferences in section 5.2. First, however, section 5.1 considers another possible response when examining a protocol. There may be good reasons to question whether the game form perceived by the subjects matches that embedded in the experimental game protocol.

5.1 Perceived Protocols

How could subjects help but perceive the proper game form? The potential behavior in an experiment is typically tightly controlled, including quite precise rules for who gets to make what choices at what times. As noted in section 3.2, a great advantage of experiments is the ability to control these details. In assessing the effects of these controls, however, we return to the idea that people, including experimental subjects, use models to make decisions.

66 Jörgen W. Weibull (2004) introduces the concept of a game protocol, though drawing a somewhat different line between the game form and game protocol.
67 See, for example, Dieter Balkenborg (1994); Jordi Brandts and Holt (1992, 1993, 1995); and Cooper, Susan Garvin, and Kagel (1997, 1997).
68 Weibull (2004) stresses the possibility that an experiment’s monetary payoffs may not capture subjects’ preferences and discusses the resulting difficulties involved in drawing inferences from experiments. Roth (1991) notes that, given the difficulty in controlling every aspect of subjects’ preferences and expectations, it is hard to know precisely what game is involved in an experimental study.

Just as economists are forced to rely on models in their analysis, so can we expect people to rely on models when making their decisions. Given the many choices people
have to make in their everyday lives, most without the time and resources that economists devote to a problem, we cannot expect people to make use of all of the information in their environment.69 Instead, most aspects of most decisions are ignored because they are not important enough to bother with. In essence, people use models, stripping away unimportant considerations to focus on more important ones.

Similarly, we should expect experimental subjects to respond to the novelty of an experimental setting by modeling its key features. This need to rely on models when analyzing the real world ensures that researchers and experimental subjects both introduce a subjective element into their perceptions.70 Using the notation developed above, an experiment is designed to fix an experimental design x_N. The experiment itself, however, is a situation x_∞ with the property that x_∞(n) = x_N(n) for n = 1, …, N. The choice of the aspects of the situation to bring within the experimental design, captured by N, represents the experimenter’s model of the situation. Suppose that an experimental subject, confronted with the situation x_∞, similarly constructs a model. This model is itself a choice of finitely many dimensions of the infinitely-dimensioned x_∞ to take into consideration. Is there any reason to expect the subject’s model to coincide with the experimenter’s, i.e., to expect the subject to hit upon the same choice of salient information as did the experimenter?

We may often be able to expect the subject to come close. The experiment is typically designed so as to focus attention on x_N. However, it would be surprising if the two models matched exactly. We thus run the risk that subjects may ignore aspects of the situation that the experimenter deems critical or that the subjects may introduce aspects that the experimenter deems irrelevant.71

An illustration is provided by Douglas Dyer and John H. Kagel (1996). Their research is motivated by the observation that experimental subjects frequently bid too aggressively in common-value auctions. Even subjects who are experienced, professional bidders in auctions for construction contracts fall prey to the winner’s curse in laboratory experiments.

Dyer and Kagel note that the auctions in which the professionals routinely bid contain some potentially important features that did not appear in the laboratory experiments. For example, the real-world auctions typically allowed bidders to withdraw winning bids, without cost, when these bids contain mistakes that are formally characterized as “arithmetic errors” but in practice are allowed to cover virtually any request to withdraw a bid (on the principle that one does not want a contractor who does not want the job). As a result, the winner can withdraw a bid that is revealed (by comparison with other bids) to be too optimistic, providing some protection against the winner’s curse. It appears as if the bidders have developed rules of behavior that are effective in the context with which they are familiar, though perhaps without completely identifying the key features of the environment that make these rules work well. In bringing the resulting rules into the experiment, the subjects are reacting to a perceived protocol that appears to be familiar, but with results that appear to be anomalous when held to the standard of the protocol chosen by the experimenter.72

69 How long would it take to get through the grocery store if every detail of every purchase were analyzed?
70 Section 3.1 touched on the question of whether there is an objective reality (cf. note 11). The point here is that, regardless of whether there is, models of this reality are subjective.
71 Psychologists frequently run experiments based on the premise (often with the help of some deception) that the experimenter and subject will perceive different protocols.
72 This raises the question of when we can expect lessons learned in one context to transfer to other contexts. Such transfer will presumably be more effective the greater is the extent to which people learn not only which behavior works well, but also the reasons why the behavior works well. Cooper and Kagel (2003) provide an introduction to work on generalizing learning across contexts.

The general principle is that, just as subjects in an experiment may face effective payoffs that differ from those of the game protocol, so might they effectively play a different game. Our interpretation of experimental results can then depend importantly on how we imagine subjects perceive the game.73

A hypothetical illustration will be helpful. Suppose that biologists were interested in a theory that female birds preferred males with long tails, and that they did so rationally because long tails were a signal of other characteristics that make a mate particularly desirable.74 To test this theory, an experiment is designed in which some males have plastic feathers glued to their tails.75 Suppose that females indeed flock to the males with now strikingly long tails. How do we interpret the results? A biologist is likely to claim that the experimental results provide support for the theory. However, one can well imagine an economist claiming just the opposite, that the theory has been demonstrated to be nonsense. After all, the theory is founded on the presumption of rational behavior. This seems obviously inconsistent with exhibiting a preference for males with plastic tails, since the latter cannot be associated with the characteristics that make males desirable mates.

These differing conclusions are grounded in different assumptions about how the subjects perceive the experiment. The biologist assumes that the subject will not perceive the difference between a real tail and a plastic one or, in our terms, that the subject’s model of the experiment does not accommodate plastic tails. The typical assumption in economic contexts is that, provided the experiment is sufficiently transparent and effectively presented, the subjects’ model of the experiment matches the experimental design.

The example of plastic tails may seem a bit removed from human experiments. Suppose instead that the hypothesis in question is that human males are attracted to females with “hourglass” figures.76 An experimenter tests this by showing males a variety of pornographic pictures, checking which females prompt the most enthusiastic reaction. Many males are responsive to pornography, and many will be especially responsive to females with the appropriate figure. A biologist or psychologist is again likely to interpret the experimental results as support for the theory. Once more, however, one can imagine economists interpreting the results as another blow to the contention that people (or at least males) behave rationally. How can they be rational if they react the same way to fictitious females as to real ones?

73 Uri Gneezy and Aldo Rustichini (2000, 2000) report experiments showing that behavior may be counterintuitively nonmonotonic in the scale of monetary payments, with (for example) the incidence of late pickups at a day-care center increasing as the cost increases from zero to a small amount, but then being deterred by higher costs. Their interpretation is that the increase from zero to a positive cost causes agents to think about the interaction differently. In our terms, attention has been focused on different aspects of the situation, triggering the use of a different analytical model.
74 There is a rich body of work in biology on plant and animal signaling. See Alan Grafen (1990a, 1990b) and Rufus A. Johnstone and Grafen (1992) for theoretical models; H. C. J. Godfray and Johnstone (2000) and Johnstone (1998) for surveys; and Johnstone (1995) for an examination of the evidence.
75 See Malte Andersson (1982), J. Hoglund, M. Eriksson, and L. E. Lindell (1990), and Anders Møller (1988) for examples of similar experiments.
76 Again, there may be a biological basis for such tastes. See, for example, Bobbi S. Low, R. D. Alexander, and K. M. Noonan (1987) and Matt Ridley (1993).

An alternative interpretation is that people behave rationally, but that they use models that do not incorporate a distinction that the experimenter takes for granted. Just as birds may have a model of the world that makes no provision for plastic tails, so may people have models that do not distinguish perfectly between real and fictitious females (though obviously also not treating the two identically). Why might people persist in using such models? We turn to this question in section 5.2. Before doing so, one more
example is useful, pushing the setting closer to traditional economic experiments.

Return to the point of departure for section 2, the ultimatum game. A key component of the game form is that the proposer and responder will have no subsequent interactions. Experimenters have gone to great lengths to ensure that subjects understand this, including most notably ensuring that the subjects interact anonymously. But can we preclude the possibility that subjects model the situation as if there is some possibility of future interaction? If not, then the observed behavior might be consistent with rationality, without requiring any modification in how we model preferences.77

Fehr and Joseph Henrich (2003) shed some light on this “phantom future” explanation, pointing to experimental studies comparing behavior with and without the prospect of future interaction. For example, experimental behavior in one-shot and repeated games is markedly different (e.g., Fehr and Gächter 2000 and Gächter and Armin Falk 2002). As Fehr and Henrich note, this provides evidence that the inability to correctly account for the future is not a plausible explanation for the observed behavior. These results help fill in one piece of the puzzle, but leave some more to be explored. The experiments strongly suggest that people do not treat every situation as if it has the same prospects for subsequent interaction. It then remains to ask whether subjects may still model each situation as if there is some prospect of future interaction, perhaps on the strength of some reasoning to the effect that one can never absolutely preclude any possibility, while still recognizing that games with an explicit future are different than games without. The idea behind this middle ground is that people may not perfectly model one-shot interactions, while yet still recognizing and acting on the fact that the likelihood of future interaction is quite different in different situations (just as males may recognize that they are not dealing with real females when consuming pornography, and yet have a reaction to the latter shaped by their reaction to real females).

At this point, this possibility is a hypothesis awaiting further exploration and experimentation. Before expecting too much such experimentation, however, we must ask for some theoretical guidance on how people model the situations they face. What behavior could we observe that would bolster our belief in such an explanation, or that would call it into question? In essence, we need a theory of how people use theories in shaping their behavior. This is a relatively new but important direction for economic theory.78

The implication is that models of subjects’ perceptions of experimental game forms should take their place alongside models of preferences in explaining behavior. We see evidence for an important role in the way subjects perceive experimental protocols in the importance of framing effects.79 Why do seemingly innocuous differences in the description of a protocol make such a difference? Presumably because they prompt subjects to use different models in analyzing the experiment.

This perspective suggests that some caution is called for when working with especially complicated experiments, not simply because it may tax the abilities of the subjects (as stressed by Binmore 1999), but also because it may expand the range of models that subjects apply to the experiment. At the same time, Harrison and John A. List (2004) caution that the context-free framing of many experiments, designed to eliminate potentially confounding factors, may instead simply invite subjects to impose their own

77 Returning to the question raised in section 2, what do subjects have to learn in the ultimatum game? Perhaps that its futureless nature distinguishes it from other, more familiar situations, and accordingly calls for different behavior (and that enough others have also learned this).
78 Samuelson (2001) provides one example, in which the use of models makes an explicit appearance. Also see Philippe Jehiel (2004).
79 See Robyn M. Dawes (1988) for an early discussion of framing effects.
context.80 Despite an experimenter’s best efforts to ensure that subjects understand what they are dealing with, including careful presentations, questions, and preliminary quizzes, it is not clear when we can be confident that the subjects’ models match the experimenter’s.81

In many cases, it will be difficult to distinguish whether unexpected behavior is rooted in subjects’ payoffs or their perception of the game form. A tendency to analyze the ultimatum game with a model that implicitly builds in a future may give results that look as if an agent has a preference for fairness or an antipathy for asymmetric solutions. This may be more than a coincidence. As argued in Samuelson (2004), a likely response by Nature to limitations on our reasoning ability, the same sort of limitations that prompt our use of models, may be to compensate by building arguments into our preferences that we would not expect to find when agents are perfectly rational. Hence, the two arguments are likely to be complementary rather than contradictory. Then how do we choose between them, or what use is there in considering both? These questions again suggest a quest for richer theoretical models.

5.2 Evolutionary Foundations

One difficulty in modeling preferences is that once we move beyond a narrow conception of self-interest, there appear to be few restrictions on the features we can attribute to preferences, and hence the behavior we can explain (cf. Andrew Postlewaite 1998). Where do we find the discipline to ensure that our models are meaningful? This danger seems all the more real once we open the door to the possibility that subjects may form their own models of the experimental situation. Where do we look for a theory of people’s models of the world?

This section suggests an evolutionary approach to both questions. The idea is to view evolution as the biological process by which humans came to their modern form.82 This modern form includes a host of physical characteristics (our size, our relative lack of hair, our ability to walk upright) and behavioral characteristics (our diet) that we readily attribute to the forces of evolution. We can also expect our preferences and our decision-making to have been the products of evolution.83 The result is a “reverse engineering” approach to studying decision making. Can we plausibly make a case that a given specification of preferences, or rules for how situations are modeled and translated into decisions, might have evolved as part of a solution to an evolutionary design problem? The more easily one finds such evolutionary foundations, the more seriously should we be inclined to take the model in question.84

Evolutionary research abounds in maladaption stories.85 One first identifies a

80 Camerer and Keith Weigelt (1988) conduct an early experimental analysis of reputation models. Their results exhibit many features of reputation equilibria, but their subjects also appear to have a “homemade prior” about the information structure that is at odds with a strict interpretation of the experimental environment. It appears as if the subjects have provided a context for the experiment.
81 In the context of the hourglass-figure experiment discussed above, one can imagine an experimenter adding a variety of additional controls to ensure that the subjects understand that pornographic females are not real, perhaps stressing this in the instructions and quizzing the subjects on the difference. But this is news to no one, and is unlikely to eliminate the effect.
82 This distinguishes this exercise from the bulk of what has come to be known as “evolutionary game theory” or “evolutionary economics.” These latter bodies of (quite diverse) work share the guiding principle that instead of optimizing, people reach decisions and markets reach outcomes through an adaptive process involving varying degrees of learning, experimentation, and trial-and-error. “Evolution” is a metaphor for this adaptive process.
83 This view is familiar in evolutionary psychology (e.g., Leda Cosmides and John Tooby 1992), and has ample precedent in economics (e.g., Arthur J. Robson 1992, 1996, 1996, 2001a, 2001b).
84 Just as economists are adept at building models, evolutionary psychologists have been criticized for seemingly being able to rationalize any behavior with an evolutionary model. Steven Jay Gould and Richard C. Lewontin (1979) find these models sufficiently unpersuasive as to be deemed “just-so stories.” If the evolutionary approach is to be successful, it must do more than provide such stories.
85 Terry Burnham and Jay Phelan (2000) contains a wealth of examples.
behavior that is likely to have been an optimal response to the environment in which we evolved. One then notes that our modern environment is quite different, causing the behavior to now be quite surprising, if not counterproductive.86 In contrast, our concern here is with behavior that evolution has designed as an optimal response to our environment, recognizing that this is an environment in which we must rely on preferences and on models in making our decisions.

There is a growing body of work on how our evolutionary background may have shaped our preferences. Perhaps best developed is the link between reproduction and risk taking, and hence the implications for attitudes toward risk (Arthur J. Robson 1992, 1996). Much of this material is nicely covered in Robson (2001a).

More recently, experimental evidence has mounted that people will incur costs not only to bestow benefits on others, but also to penalize others, with the preference for reward or punishment hinging upon perceptions of whether the recipient has acted appropriately or inimically. What might be the evolutionary origins of such “prosocial” behavior? Henrich (2004) offers an explanation based on cultural group selection.87 An advantage of this model is that it provides the type of discipline required for further investigation. For example, the model suggests that we should expect multiple cultural equilibria, and hence considerable variation across cultures in the tendency to bestow benefits and costs on others. Second, a propensity to imitate or conform to the behavior of others plays an important role in the model, suggesting we look for a link between such behavior and prosocial behavior. Third, evolution is viewed as facing information constraints, so that we must in turn view preferences as tools for maximizing fitness while economizing on information. These features may be consistent with a variety of other models, and so they cannot be the end of the quest, but they provide a useful point of departure.88

Work on how evolution has shaped the way people model their environment is in an even earlier stage. Three illustrations will be useful.

First, the Wason selection test (1966) is now a standard example in evolutionary psychology. Experimental subjects are surprisingly prone to errors in evaluating abstract conditional statements.89 But if asked to evaluate conditional statements posed in terms of monitoring compliance with a standard of behavior, success is much higher.90 The suggested interpretation is that our reasoning about conditional statements evolved in a setting in which monitoring behavior was particularly important.91 This interpretation bolsters an argument of Fehr and Henrich (2003), that

86 For a simple example, it is likely that during most of our evolutionary history, food was both in chronically short supply and could be stored only in the form of body fat. As a result, it appears likely that an evolutionarily successful strategy was to eat as much as possible whenever possible. It is then no surprise that members of modern, wealthy societies find it difficult to avoid health-threatening overeating.
87 The model of cultural group selection avoids many of the difficulties that have made biologists skeptical of group selection arguments. Elliott Sober and David Sloan Wilson (1998) argue that group selection lies behind preferences for altruism.
88 In connection with the first feature, it is intriguing that the study of bargaining behavior in fifteen small-scale societies by Henrich, Boyd, Bowles, Camerer, Fehr, Gintis, and McElreath (2001) finds significant behavioral variation.
89 For example, if given a collection of cards with numbers on one side and letters on the other, along with the hypothesis that any card with a 3 on one side has a B on the other, and then shown four cards bearing 3, 7, A and B, only a minority correctly identify which cards must be turned over to check the hypothesis.
90 For example, told that a coke-drinker, beer-drinker, 17-year-old and 25-year-old are seated at a table, virtually every subject knows which ones to check for compliance with a 21-year-old drinking law.
91 See Cosmides and Tooby (1992). The ability to recognize faces (Steven Pinker 1997) is similarly interpreted as an evolutionary response to the importance of monitoring others’ behavior.
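The abstract card task and the drinking-age task described in footnotes 89 and 90 share one logical structure: a rule of the form "if P then Q" can be falsified only by a case in which P holds and Q fails. The following is a minimal sketch of that check, not anything taken from the experimental literature. The function name `cards_to_turn` and the encoding of each card as a visible face plus a label for which side is showing are assumptions made purely for illustration.

```python
def cards_to_turn(cards, antecedent, consequent):
    """Return the visible faces that must be checked to test 'if P then Q'.

    Each card is (visible_face, side): side 'p' means the face shown is the
    one the antecedent P speaks about; side 'q' means it is the one the
    consequent Q speaks about. (Hypothetical encoding, for illustration.)
    """
    must_turn = []
    for face, side in cards:
        if side == "p" and antecedent(face):
            # P holds on the visible side: the hidden side might violate Q.
            must_turn.append(face)
        elif side == "q" and not consequent(face):
            # Q fails on the visible side: the hidden side might satisfy P.
            must_turn.append(face)
    return must_turn


# Abstract version: "any card with a 3 on one side has a B on the other".
abstract_cards = [("3", "p"), ("7", "p"), ("A", "q"), ("B", "q")]
abstract = cards_to_turn(abstract_cards, lambda n: n == "3", lambda l: l == "B")

# Drinking-age version: "anyone drinking beer is at least 21".
bar_cards = [("coke", "p"), ("beer", "p"), (17, "q"), (25, "q")]
bar = cards_to_turn(bar_cards, lambda d: d == "beer", lambda age: age >= 21)

print(abstract)  # ['3', 'A']
print(bar)       # ['beer', 17]
```

Both versions call for the same two checks, the case where the antecedent is known to hold and the case where the consequent is known to fail, which is what makes the gap in subjects' success rates across the two framings notable.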
laboratory behavior exhibiting reciprocity should not be interpreted as an evolutionary maladaption.92 At the same time, however, it raises the possibility that subjects’ perceived protocols may not correctly capture incentives if their presentation is sufficiently unfamiliar.

Second, a variety of evidence suggests that people are not very good in dealing with probabilities. However, there is also evidence that people fare much better when probabilities are presented in terms of frequencies.93 This may indicate that we spent much of our history with a frequentist view of the world. This is consistent with the possibility that people are approximately expected utility maximizers, while performing quite poorly in laboratory experiments, if the latter present probabilistic information unfamiliarly.

Third, Plott (1996) presents the “discovered preference hypothesis,” suggesting that rather than coming to a decision problem with fixed and well defined preferences, people respond by combining contextual information and experience with an internal search process to discover their preferences. Similar ideas appear in Dan Ariely, George Loewenstein, and Drazen Prelec’s (2003) notion of “constructed” preferences, which they illustrate with a number of experiments.94 These results initially seem to strike at the core of economic theory, calling into question the idea of stable preferences. Notice, however, that the process by which preferences are discovered or constructed sounds much like the process Luce and Raiffa (1957) describe as the foundation for expected utility maximization (cf. section 4.2.2). Rather than suggesting that we abandon expected utility theory, the experimental results again remind us that the theory may not be as straightforward as one would like. We can expect consistent behavior in settings amenable to small-worlds modeling, but must expect anomalies to appear in other situations. In terms of experiments, the implication is again that behavior may be quite sensitive to seemingly irrelevant details of the experimental environment.

What is the common theme of these three examples? Echoing the ideas that opened section 5.1, it is that evolution has equipped us with a variety of models and rules and shortcuts for dealing with a dauntingly complex world.95 The possibility that people do not perfectly model futureless protocols may

92 The maladaption account of such behavior would be that we evolved in an environment in which repeated or kinship interactions, and hence the optimality of reciprocity, were sufficiently pervasive that there was no point in checking whether such behavior is warranted. However, it is not clear that our evolutionary past would have equipped us with a basic propensity to monitor for and detect cheating in an environment in which it was unimportant to distinguish situations meriting reciprocity from those that do not.
93 See Cosmides and Tooby (1996), Gerd Gigerenzer (1991, 1996, 1998), and Tversky and Kahneman (1983). For example, when told that 2 percent of the population has a disease and that a test produces no false negatives but 5 percent false positives, many subjects will struggle to ascertain the implications of a positive report. However, they fare better when told that out of every thousand people, all twenty who have the disease turn up positive but so do fifty others.
94 For example, they find that, if subjects are first asked whether they would be willing to purchase a product at a price equal to the last two digits of their social security number, and are then asked their valuation of the product, there is significant correlation between their social security numbers and reported valuations. The suggested interpretation is that the subjects subconsciously use the numbers involved in the first purchase decision as clues to the appropriate valuation in the second. This reliance on contextual information may work well in many applications, but leads to apparently absurd behavior in the experiment.
95 Evolutionary psychologists find evidence for constraints on evolution’s ability to simply enhance our reasoning powers and dispense with these devices in the relatively large amount of energy required to maintain the human brain (Katharine Milton 1988), the high risk of maternal death in childbirth posed by infants’ large heads (W. Leutenegger 1982), and the similarly-caused lengthy period of human postnatal development (Paul H. Harvey, R. D. Martin, and T. H. Clutton-Brock 1986). Andy Clark (1993) discusses the potential advantages of using contextual clues and specialized rules to conserve on generalized reasoning resources.
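The disease-testing example in footnote 93 can be worked through in both presentations. The sketch below uses only the figures stated in that footnote; the variable names are illustrative assumptions. It computes the probability of disease given a positive report first by Bayes' rule and then from the rounded counts per thousand people.

```python
# Figures from footnote 93: 2 percent prevalence, no false negatives,
# 5 percent false positives.
prevalence = 0.02
sensitivity = 1.0            # "no false negatives"
false_positive_rate = 0.05

# Probability format: Bayes' rule.
p_positive = prevalence * sensitivity + (1 - prevalence) * false_positive_rate
posterior = prevalence * sensitivity / p_positive

# Frequency format, using the footnote's rounded counts per thousand people:
# twenty sick people test positive, and so do fifty others.
sick_and_positive = 20
healthy_and_positive = 50
posterior_from_counts = sick_and_positive / (sick_and_positive + healthy_and_positive)

print(round(posterior, 3))              # 0.29
print(round(posterior_from_counts, 3))  # 0.286
```

Either way, a positive report implies a chance of disease below 30 percent. The two figures differ slightly only because the footnote rounds 5 percent of the 980 healthy people up to fifty. The frequency version reduces the calculation to a single ratio of counts, which may be part of why subjects find it easier.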
then arise not because evolution has erroneously designed us for a different environment, but because evolution has effectively designed us for an environment in which such shortcuts are valuable.

The first two examples are concerned with techniques for processing information. The third reflects a spillover into preferences, bringing us back to the fact that information constraints also feature in models of preference evolution. Samuelson and Jeroen Swinkels (2001) consider the relationship between these, examining the implications for preferences of an evolutionary process that must cope with scarce reasoning resources. The conclusion is that we can expect a variety of seemingly nonstandard features to be built into our preferences in response to imperfections and limitations in our information processing and reasoning.

What are the implications for economics? We can often expect people to act consistently and rationally, given their preferences. However, the preferences involved in the resulting optimizing behavior may involve all sorts of features that at first blush do not appear consistent with either the pursuit of individual self-interest or a narrow concept of consistency. Finally, we can expect context to be important. In this sense, evolution and the axiomatic approaches of Savage (1972) and Luce and Raiffa (1957) are on the same page. Both suggest that we can expect consistent behavior within sufficiently constrained contexts, though with preferences that may appear to go beyond a narrow conception of self-interest, but that the context will be important and that seeming anomalies may readily arise across contexts.

Smith (2003) argues that we can usefully view human behavior in terms of two types of rationality, “constructivist” and “ecological” rationality. Constructivist rationality resembles the rational choice models of traditional economic theory, though again allowing the possibility that preferences might reflect more than a narrow self-interest, while ecological rationality is concerned with “the possible intelligence embodied in the rules, norms, and institutions of our cultural and biological heritage...” (2003, p. 470). The argument here ties these concepts together with the vision of an evolutionary process that struggles with the constraints imposed by scarce reasoning resources. The quest to maximize fitness generates a motivation for behavior to reflect constructivist rationality, while the quest to relax constraints leads to the ready incorporation of ecological forces.

What are the implications for combining economic theory and experiments? A first one is that we must be careful in assessing both experimental findings and economic theory. For example, numerous experiments have found that subjects are willing to pay less to receive an object than they are willing to accept to relinquish the object. Some have interpreted this as reflecting a common feature of preferences, being an illustration of the more general principle that people value losses more heavily than gains.96 However, as Plott and Kathryn Zeiler (2003) argue, there have also been many cases in which such discrepancies do not appear, and one can identify experimental settings in which the effect reliably does or does not appear. This suggests that we should stop short of proclaiming the endowment effect a universal feature of preferences, and focuses attention on the internal validity of the experiments. What links do we draw between differences in experimental settings and the forces that shape valuations? At the same time, it indicates that something is missing from our theoretical repertoire, which currently

96 For example, Jack L. Knetsch, Fang-Fang Tang, and Thaler (2003), who also provide references to earlier work, comment that “The endowment effect and loss aversion have been among the most robust findings of the psychology of decision making. People commonly value losses much more than commensurate gains . . . ” (2001, p. 257). Kahneman and Tversky (1979) stress that people view gains and losses quite differently.
provides little insight into such forces. A first step in addressing this issue would then be the construction of theoretical models, especially models shedding insight into how and when it might have been evolutionarily valuable to condition valuations on ownership.

6. Conclusion

Economic theory and economic experiments can be combined to the benefit of both. By itself, this is a fairly uninformative “more is better” conclusion. There must be gains from considering experiments or theory more carefully when doing theory or experimentation. But what steps can we take to make it more likely that potential gains are realized?

The danger with the concerns raised in this essay is that they might be used to apologize away any potential interaction between theory and experiments. It is unlikely that we will usefully combine theory and experiments if we too freely respond to contrasts between the two with such statements as: “The results appear to be at odds with the theory, but we have no obvious way to measure how far away they are, and by my preferred measure they are pretty close.” “… but I suspect the subjects really perceived a different experimental protocol under which their behavior is consistent with the theory.” “… but the theory is an approximation that cannot be expected to apply everywhere, and the discovery of this exception tells us nothing about the theory in other applications.” How do we avoid working at such cross-purposes?

A good beginning would be for exercises in economic theory to routinely identify the behavior that would be consistent with the theory, and especially behavior that would distinguish the theory from contending explanations. Section 3.1 noted that predicting behavior is not the only goal of economic theory, and so we cannot expect all theoretical exercises to be in a position to point to such behavior.97 We must also allow the possibility that making connections to behavior is a goal that the theory will often not yet be sufficiently advanced to address. But at some point some connections must be made between theory and behavior if economic theory is not to fade into either philosophy or mathematics, and work that aspires to make this connection should be explicit about the implied behavior. In the course of doing so, it would be helpful to have some idea not only of the expected behavior itself, but also of how much noise we might expect to surround this behavior.

Perhaps more importantly, it would be useful for theory to identify behavior for which the theory cannot account, in the sense that the observations would force the theorist to reconsider. This would ensure that the theory is not performing well by “theorizing to the test,” as in section 4.1.1. The behavior relegated to this category might further be grouped in two categories. One, recognizing that theories can be useful without applying universally, would identify situations that are not a good match for the theory and in which contrary behavior would not shake one’s confidence in the usefulness of the theory. The second would consist of behavior that would force reconsideration of the theory. The strength of a theory will often be reflected in the content of this latter category, and we might move toward an explicit examination of what makes a theory useful.

Similarly, it would be helpful to have the experimental design indicate which outcomes would be regarded as a failure as well

97 For example, the theory of utility maximization occupies a prized place at the center of economics. However, the theory has very little predictive content. Given the freedom to define preferences, virtually any behavior can be reconciled with expected-utility maximization. Even apparent violations of the axioms of revealed preference can often be apologized away by noting that the data consist of choices made at different times and thus while the decision-maker is in different states, and hence possibly described by different preferences. Predictions then require some augmenting or extension of the revealed-preference axioms, as in Cox (1997).
as which would be considered a success. This question appears to be trivial in many cases, with success and failure riding on the statistical significance of an estimated parameter. However, one of the advantages of experimental work is the ability to control the environment and design the tests. This allows us to direct attention away from issues of statistical significance and toward issues of economic importance. The strength of the experiment will often be reflected in the content of this “failure” category.

Finally, again returning to section 4.1.1, it is important that both theoretical models and interpretations of experimental results be precise enough to apply beyond the experimental situation from which they emerge. This allows links to be made that multiply the power of single studies.

7. Appendix: Proof of Proposition 1

Proof. Given the design $x_N$, the experiment induces a true probability distribution $\pi^* \in \Delta(S_M)$ over elements $s_M$ of the set $S_M$, while the theory $f$ induces a probability distribution $f^* \in \Delta(\Delta(S_M))$ over elements $\pi$ of the set $\Delta(S_M)$. Given $f^*$ and $\pi^*$, let

$$H(f^*, \pi^*) = \int_{S_M} \int_{\Delta(S_M)} T(s_M, \pi) \, df^*(\pi) \, d\pi^*(s_M).$$

$H(f^*, \pi^*)$ is thus the probability the theory is accepted. Let $f_{\pi^*}$ be a measure over $\Delta(S_M)$ that puts probability one on the true distribution $\pi^*$. Then, from the assumption that $T$ accepts the truth with probability $1 - \varepsilon$, we have, for any $\pi^*$,

$$H(f_{\pi^*}, \pi^*) \geq 1 - \varepsilon. \tag{5}$$

We need to show there exists an $\hat{f} \in \Delta(\Delta(S_M))$ such that, for all $s_M \in S_M$,

$$H(\hat{f}, \pi_{s_M}) \geq 1 - \varepsilon, \tag{6}$$

where $\pi_{s_M}$ is a probability distribution over $S_M$ that puts unitary probability on outcome $s_M$. Using the definition of a maximum for the first inequality and (5) for the second, we have

$$\min_{\pi^* \in \Delta(S_M)} \max_{f^* \in \Delta(\Delta(S_M))} H(f^*, \pi^*) \geq \min_{\pi^* \in \Delta(S_M)} H(f_{\pi^*}, \pi^*) \geq 1 - \varepsilon. \tag{7}$$

From Ky Fan’s (1953, Theorem 1) first minmax theorem, we have:

$$\min_{\pi^* \in \Delta(S_M)} \max_{f^* \in \Delta(\Delta(S_M))} H(f^*, \pi^*) = \max_{f^* \in \Delta(\Delta(S_M))} \min_{\pi^* \in \Delta(S_M)} H(f^*, \pi^*). \tag{8}$$

Using (8) to replace the first term in (7) and deleting the middle term then gives

$$\max_{f^* \in \Delta(\Delta(S_M))} \min_{\pi^* \in \Delta(S_M)} H(f^*, \pi^*) \geq 1 - \varepsilon, \tag{9}$$

and hence, there is an $\hat{f}$ such that, for all $s_M \in S_M$,

$$H(\hat{f}, \pi_{s_M}) \geq 1 - \varepsilon,$$

which, from (6), is the desired result.

REFERENCES

Anderson, Simon P., Jacob K. Goeree, and Charles A. Holt. 1998. “Rent Seeking with Bounded Rationality: An Analysis of the All-Pay Auction.” Journal of Political Economy, 106(4): 828–53.
Anderson, Simon P., Jacob K. Goeree, and Charles A. Holt. 1998. “A Theoretical Analysis of Altruism and Decision Error in Public Goods Games.” Journal of Public Economics, 70(2): 297–323.
Anderson, Simon P., Jacob K. Goeree, and Charles A. Holt. 2001. “Minimum-Effort Coordination Games: Stochastic Potential and Logit Equilibrium.” Games and Economic Behavior, 34(2): 177–99.
Andersson, Malte. 1982. “Female Choice Selects for Extreme Tail Length in a Widowbird.” Nature, 299(5886): 818–20.
Andreoni, James, Paul M. Brown, and Lise Vesterlund. 2002. “What Makes an Allocation Fair? Some Experimental Evidence.” Games and Economic Behavior, 40(1): 1–24.
Andreoni, James, Marco Castillo, and Ragan Petrie. 2003. “What Do Bargainers’ Preferences Look Like? Experiments with a Convex Ultimatum Game.” American Economic Review, 93(3): 672–85.
Andreoni, James and John Miller. 2002. “Giving According to GARP: An Experimental Test of the Consistency of Preferences for Altruism.” Econometrica, 70(2): 737–53.
Andreoni, James and Larry Samuelson. 2003. “Building
Rational Cooperation,” SSRI Working Paper 2003-4.
Ariely, Dan, George Loewenstein, and Drazen Prelec. 2003. “‘Coherent Arbitrariness’: Stable Demand Curves without Stable Preferences.” The Quarterly Journal of Economics, 118(1): 73–105.
Aumann, Robert J. 1985. “What Is Game Theory Trying to Accomplish?” in Frontiers of Economics. Kenneth J. Arrow and Seppo Honkapohja, eds. Oxford: Blackwell, 28–76.
Balkenborg, Dieter. 1994. “An Experiment on Forward versus Backward Induction,” SFB Discussion Paper B-268, University of Bonn.
Banks, Jeffrey S., John O. Ledyard, and David P. Porter. 1989. “Allocating Uncertain and Unresponsive Resources: An Experimental Approach.” Rand Journal of Economics, 20(1): 1–25.
Bateson, Melissa and Alex Kacelnik. 1996. “Rate Currencies and the Foraging Starling: The Fallacy of the Averages Revisited.” Behavioral Ecology, 7(3): 341–52.
Battalio, Raymond, Larry Samuelson, and John Van Huyck. 2001. “Optimization Incentives and Coordination Failure in Laboratory Stag Hunt Games.” Econometrica, 69(3): 749–64.
Berg, Joyce E., John W. Dickhaut, and Thomas A. Rietz. 2003. “Diminishing Preference Reversals by Inducing Risk Preferences: Evidence for Noisy Maximization.” Journal of Risk and Uncertainty, 27(2): 139–70.
Bergstrom, Theodore C. 2003. “Vernon Smith’s Insomnia and the Dawn of Economics As Experimental Science.” Scandinavian Journal of Economics, 105(2): 181–205.
Binmore, Ken. 1999. “Why Experiment in Economics?” Economic Journal, 109(453): F16–24.
Binmore, Ken and Paul Klemperer. 2002. “The Biggest Auction Ever: The Sale of the British 3G Telecom Licenses.” Economic Journal, 112(478): C74–96.
Binmore, Ken, John McCarthy, Giovanni Ponti, Larry Samuelson, and Avner Shaked. 2002. “A Backward Induction Experiment.” Journal of Economic Theory, 104(1): 48–88.
Binmore, Ken, Peter Morgan, Avner Shaked, and John Sutton. 1991. “Do People Exploit Their Bargaining Power? An Experimental Study.” Games and Economic Behavior, 3(3): 295–322.
Binmore, Ken and Larry Samuelson. 1999. “Evolutionary Drift and Equilibrium Selection.” The Review of Economic Studies, 66(2): 363–93.
Binmore, Ken, Avner Shaked, and John Sutton. 1985. “Testing Noncooperative Bargaining Theory: A Preliminary Study.” American Economic Review, 75(5): 1178–80.
Binmore, Ken, Avner Shaked, and John Sutton. 1989. “An Outside Option Experiment.” The Quarterly Journal of Economics, 104(4): 753–70.
Binmore, Ken, Joe Swierzbinski, and Chris Proulx. 2001. “Does Minimax Work? An Experimental Study.” Economic Journal, 111(473): 445–64.
Preferences.” Organizational Behavior and Human Decision Processes, 63(2): 131–44.
Bolton, Gary E. 1991. “A Comparative Model of Bargaining: Theory and Evidence.” American Economic Review, 81(5): 1096–136.
Bolton, Gary E. and Axel Ockenfels. 2000. “ERC: A Theory of Equity, Reciprocity, and Competition.” American Economic Review, 90(1): 166–93.
Bolton, Gary E. and Rami Zwick. 1995. “Anonymity versus Punishment in Ultimatum Bargaining.” Games and Economic Behavior, 10(1): 95–121.
Brandts, Jordi and Charles A. Holt. 1992. “An Experimental Test of Equilibrium Dominance in Signaling Games.” American Economic Review, 82(5): 1350–65.
Brandts, Jordi and Charles A. Holt. 1993. “Adjustment Patterns and Equilibrium Selection in Experimental Signaling Games.” International Journal of Game Theory, 22(3): 279–302.
Brandts, Jordi and Charles A. Holt. 1995. “Limitations of Dominance and Forward Induction: Experimental Evidence.” Economics Letters, 49(4): 391–95.
Brewer, Paul J., Maria Huang, Brad Nelson, and Charles R. Plott. 2002. “On the Behavioral Foundations of the Law of Supply and Demand: Human Convergence and Robot Randomness.” Experimental Economics, 5(3): 179–208.
Brewer, Paul J. and Charles R. Plott. 1996. “A Binary Conflict Ascending Price (BICAP) Mechanism for the Decentralized Allocation of the Right to Use Railroad Tracks.” International Journal of Industrial Organization, 14(6): 857–86.
Brown, James N. and Robert W. Rosenthal. 1990. “Testing the Minimax Hypothesis: A Re-examination of O’Neill’s Game Experiment.” Econometrica, 58(5): 1065–81.
Bulmer, Michael. 1997. Theoretical Evolutionary Ecology. Sunderland, MA: Sinauer Associates, Inc.
Burnham, Terry and Jay Phelan. 2000. Mean Genes: From Sex to Money to Food: Taming Our Primal Instincts. Cambridge: Perseus Publishing.
Camerer, Colin. 1995. “Individual Decision Making,” in Handbook of Experimental Economics. John H. Kagel and Alvin E. Roth, eds. Princeton: Princeton University Press: 587–703.
Camerer, Colin. 2003. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton: Princeton University Press and Russell Sage Foundation.
Camerer, Colin and Keith Weigelt. 1988. “Experimental Tests of a Sequential Equilibrium Reputation Model.” Econometrica, 56(1): 1–36.
Cameron, Lisa A. 1999. “Raising the Stakes in the Ultimatum Game: Experimental Evidence from Indonesia.” Economic Inquiry, 37(1): 47–59.
Cason, Timothy N. and Daniel Friedman. 1996. “Price Formation in Double Auction Markets.” Journal of Economic Dynamics and Control, 20(8): 1307–37.
Binmore, Kenneth G., John Gale, and Larry Samuelson. Charness, Gary and Matthew Rabin. 2002.
1995. “Learning to Be Imperfect: The Ultimatum “Understanding Social Preferences with Simple
Game.” Games and Economic Behavior, 8(1): 56–90. Tests.” The Quarterly Journal of Economics, 117(3):
Blount, Sally. 1995. “When Social Outcomes Aren’t 817–69.
Fair: The Effect of Causal Attributions on Clark, Andy. 1993. Microcognition: Philosophy,
Cognitive Science and Parallel Distributed Dawes, Robyn M. 1988. Rational Choice in an
Processing. Cambridge: MIT Press. Uncertain World. New York: Harcourt Brace
Coller, Maribeth, Glenn W. Harrison, and E. Elisabet Jovanovich.
Rutström. 2003. “Are Discount Rates Constant? Dekel, Eddie and Yossi Feinberg. 2004. “A True
Reconciling Theory and Observation.” Mimeo. Expert Knows Which Questions Should be Asked,”
University of South Carolina and University of Mimeo. Institute for Advanced Study, Princeton.
Central Florida. Dennett, Daniel C. 1995. Darwin’s Dangerous Idea.
Cooper, David J., Nick Feltovich, Alvin E. Roth, and New York: Simon and Schuster.
Rami Zwick. 2003. “Relative versus Absolute Speed Drago, Robert and John S. Heywood. 1989.
of Adjustment in Strategic Environments: “Tournaments, Piece Rates, and the Shape of the
Responder Behavior in Ultimatum Games.” Payoff Function.” Journal of Political Economy,
Experimental Economics, 6(2): 181–207. 97(4): 992–98.
Cooper, David J., Susan Garvin, and John H. Kagel. Dufwenberg, Martin and Georg Kirchsteiger. 2004. “A
1997. “Adaptive Learning vs. Equilibrium Theory of Sequential Reciprocity.” Games and
Refinements in an Entry Limit Pricing Game.” Economic Behavior, 47(2): 268–98.
Economic Journal, 107(442): 553–75. Dyer, Douglas and John H. Kagel. 1996. “Bidding in
Cooper, David J., Susan Garvin, and John H. Kagel. Common Value Auctions: How the Commercial
1997. “Signalling and Adaptive Learning in an Entry Construction Industry Corrects for the Winner’s
Limit Pricing Game.” Rand Journal of Economics, Curse.” Management Science, 42(10): 1463–75.
28(4): 662–83. El-Gamal, Mahmoud A. and David M. Grether. 1995.
Cooper, David J. and John H. Kagel. 2003. “Lessons “Are People Bayesian? Uncovering Behavioral
Learned: Generalizing Learning across Games.” Strategies.” Journal of the American Statistical
American Economic Review, 93(2): 202–07. Association, 90(432): 1137–45.
Cosmides, Leda and John Tooby. 1992. “The Erev, Ido, Alvin E. Roth, Robert L. Slonim, and Greg
Psychological Foundations of Culture,” in The Barron. 2002. “Combining a Theoretical Prediction
Adapted Mind: Evolutionary Psychology and the With Experimental Evidence.” Mimeo. Harvard
Generation of Culture. Jerome H. Barkow, Leda University.
Cosmides, and John Tooby, eds. Oxford: Oxford Fagin, Ronald, Joseph Y. Halpern, Yoram Moses, and
University Press, 19–136. Moshe Y. Vardi. 1995. Reasoning About Knowledge.
Cosmides, Leda and John Tooby. 1996. “Are Humans Cambridge: MIT Press.
Good Intuitive Statisticians After All? Rethinking Falk, Armin, Ernst Fehr, and Urs Fischbacher. 2003.
Some Conclusions from the Literature on Judgement “On the Nature of Fair Behavior.” Economic
Under Uncertainty.” Cognition, 58(1): 1–73. Inquiry, 41(1): 20–26.
Cosmides, Leda and John Tooby. 1996. “Cognitive Fan, Ky. 1953. “Minimax Theorems.” Proceedings of
Adaptations for Social Exchange,” in The Adapted the National Academy of Sciences of the United
Mind: Evolutionary Psychology and the Generation of States of America, 39: 42–47.
Culture. Jerome H. Barkow, Leda Cosmides, and Fehr, Ernst and Simon Gächter. 2000. “Cooperation
John Tooby, eds. Oxford: Oxford University Press: and Punishment in Public Goods Experiments.”
163–228. American Economic Review, 90(4): 980–94.
Cox, James C. 1997. “On Testing the Utility Fehr, Ernst and Simon Gächter. 2000. “Fairness and
Hypothesis.” Economic Journal, 107(443): 1054–78. Retaliation: The Economics of Reciprocity.” Journal
Cox, James C. 2004. “How to Identify Trust and of Economic Perspectives, 14(3): 159–81.
Reciprocity.” Games and Economic Behavior, 46(2): Fehr, Ernst and Simon Gächter. 2002. “Altruistic
260–81. Punishment in Humans.” Nature, 415(6868): 137–40.
Cox, James C. and Ronald L. Oaxaca. 1995. “Inducing Fehr, Ernst, Simon Gächter, and Georg Kirchsteiger.
Risk-Neutral Preferences: Further Analysis of the 1997. “Reciprocity as a Contract Enforcement
Data.” Journal of Risk and Uncertainty, 11(1): 65–79. Device: Experimental Evidence.” Econometrica,
Cox, James C. and Vjollca Sadiraj. 2002. “Risk Aversion 65(4): 833–60.
and Expected Utility Theory: Coherence for Small- Fehr, Ernst and Joseph Henrich. 2003. “Is Strong
and Large-Stakes Gambles.” Mimeo. University of Reciprocity a Maladaptation? On the Evolutionary
Arizona and University of Amsterdam. Foundations of Human Altruism,” in The Genetic
Crawford, Vincent P. 1997. “Theory and Experiment in and Cultural Evolution of Cooperation. Peter
the Analysis of Strategic Interaction,” in Advances in Hammerstein, ed. Cambridge: MIT Press: 55–82.
Economics and Econometrics: Theory and Applications, Fehr, Ernst and Klaus M. Schmidt. 1999. “A Theory of
Volume 1. David M. Kreps and Kenneth F. Fairness, Competition, and Cooperation.” The
Wallis, eds. Cambridge: Cambridge University Press, Quarterly Journal of Economics, 114(3): 817–68.
206–42. Forsythe, Robert, Joel L. Horowitz, N. E. Savin, and
Dasgupta, Partha and Eric Maskin. 2003. “Uncertainty, Martin Sefton. 1994. “Fairness in Simple Bargaining
Waiting Costs, and Hyperbolic Discounting.” Experiments.” Games and Economic Behavior, 6(3):
Mimeo. Institute for Advanced Study, Princeton. 347–69.
Davis, Douglas D. and Charles A. Holt. 1993. Frederick, Shane, George Loewenstein, and Ted
Experimental Economics. Princeton: Princeton O’Donoghue. 2002. “Time Discounting and Time
University Press. Preference: A Critical Review.” Journal of Economic
Literature, 40(2): 351–401. “The Spandrels of San Marco and the Panglossian
Fudenberg, Drew and David K. Levine. 1997. Paradigm: A Critique of the Adaptationist
“Measuring Players’ Losses in Experimental Games.” Programme.” Proceedings of the Royal Society of
The Quarterly Journal of Economics, 112(2): 507–36. London, Series B, 205(1161): 581–98.
Fudenberg, Drew and David K. Levine. 1998. The Grafen, Alan. 1990a. “Biological Signals as Handicaps.”
Theory of Learning in Games. Cambridge: MIT Journal of Theoretical Biology, 144(4): 517–46.
Press. Grafen, Alan. 1990b. “Sexual Selection
Gächter, Simon and Armin Falk. 2002. “Reputation and Unhandicapped by the Fisher Process.” Journal of
Reciprocity: Consequences for the Labour Relation.” Theoretical Biology, 144(4): 473–516.
Scandinavian Journal of Economics, 104(1): 1–26. Güth, Werner, Rolf Schmittberger, and Bernd
Geanakoplos, John, David Pearce, and Ennio Schwarze. 1982. “An Experimental Analysis of
Stacchetti. 1989. “Psychological Games and Ultimatum Bargaining.” Journal of Economic
Sequential Rationality.” Games and Economic Behavior and Organization, 3(4): 367–88.
Behavior, 1(1): 60–79. Güth, Werner and Reinhard Tietz. 1990. “Ultimatum
Gigerenzer, Gerd. 1991. “How to Make Cognitive Bargaining Behavior: A Survey and Comparison of
Illusions Disappear: Beyond Heuristics and Biases,” Experimental Results.” Journal of Economic
in European Review of Social Psychology, Volume 2. Psychology, 11(3): 417–49.
Wolfgang Stroebe and Miles Hewstone, eds. New Haile, Philip, Ali Hortacsu, and Grigory Kosenok.
York: Wiley. 2004. “On the Empirical Content of Quantal
Gigerenzer, Gerd. 1996. “The Psychology of Good Response Equilibrium,” Mimeo. Yale University.
Judgment: Frequency Formats and Simple Halevy, Yoram. 2004. “Diminishing Impatience:
Algorithms.” Journal of Medical Decision Making, Disentangling Time Preference from Uncertain
16(3): 273–80. Lifetime,” Mimeo. University of British Columbia.
Gigerenzer, Gerd. 1998. “Ecological Intelligence: An Harless, David W. and Colin F. Camerer. 1994. “The
Adaptation for Frequencies,” in The Evolution of Predictive Utility of Generalized Expected Utility
Mind. D. Cummins and C. Allen, eds. Oxford: Theories.” Econometrica, 62(6): 1251–89.
Oxford University Press: 9–29. Harless, David W. and Colin F. Camerer. 1995. “An
Gneezy, Uri and Aldo Rustichini. 2000. “A Fine is a Error Rate Analysis of Experimental Data Testing
Price.” Journal of Legal Studies, 29(1): 1–18. Nash Refinements.” European Economic Review,
Gneezy, Uri and Aldo Rustichini. 2000. “Pay Enough or 39(3–4): 649–60.
Don’t Pay at All.” The Quarterly Journal of Harrison, Glenn W. 1989. “Theory and Misbehavior of
Economics, 115(3): 791–810. First-Price Auctions.” American Economic Review,
Gode, Dhananjay K. and Shyam Sunder. 1993. 79(4): 749–62.
“Allocative Efficiency of Markets with Zero- Harrison, Glenn W. 1990. “Risk Attitudes in First-Price
Intelligence Traders: Market as a Partial Substitute Auction Experiments: A Bayesian Analysis.” Review
for Individual Rationality.” Journal of Political of Economics and Statistics, 72(3): 541–46.
Economy, 101(1): 119–37. Harrison, Glenn W. 1992. “Theory and Misbehavior of
Gode, Dhananjay K. and Shyam Sunder. 1993. “Lower First-Price Auctions: Reply.” American Economic
Bounds for Efficiency of Surplus Extraction in Review, 82(5): 1426–43.
Double Auctions,” in The Double Auction Market: Harrison, Glenn W., Ronald M. Harstad, and E.
Institutions, Theories and Evidence, Proceedings Vol. Elisabet Rutström. 2002. “Experimental Methods
15. Daniel Friedman and John Rust, eds. Santa Fe: and Elicitation of Values,” University of South
Santa Fe Institute in the Science of Complexity. Carolina Working Paper B-95-11.
Gode, Dhananjay K. and Shyam Sunder. 1997. “What Harrison, Glenn W. and Jack Hirshleifer. 1989. “An
Makes Markets Allocationally Efficient?” The Experimental Evaluation of Weakest Link/Best Shot
Quarterly Journal of Economics, 112(2): 603–30. Models of Public Goods.” Journal of Political
Godfray, H. C. J. and R. A. Johnstone. 2000. “Begging Economy, 97(1): 201–25.
and Bleating: The Evolution of Parent-Offspring Harrison, Glenn W. and John A. List. 2004. “Field
Signalling.” Philosophical Transactions of the Royal Experiments,” Mimeo. University of Central Florida
Society of London B, 355(1403): 1581–91. and University of Maryland.
Goeree, Jacob K. and Charles A. Holt. 2001. “Ten Harrison, Glenn W. and Kevin A. McCabe. 1992.
Little Treasures of Game Theory and Ten Intuitive “Testing Non-Cooperative Bargaining Theory in
Contradictions.” American Economic Review, 91(5): Experiments,” in Research in Experimental
1402–22. Economics, Volume 5. R. Mark Isaac, ed. London:
Goeree, Jacob K., Charles A. Holt, and Thomas R. JAI Press, 137–69.
Palfrey. 2002. “Quantal Response Equilibrium and Harrison, Glenn W. and Kevin A. McCabe. 1996.
Overbidding in Private-Value Auctions.” Journal of “Expectations and Fairness in a Simple Bargaining
Economic Theory, 104(1): 247–72. Experiment.” International Journal of Game Theory,
Goeree, Jacob K., Charles A. Holt, and Thomas R. 25(3): 303–27.
Palfrey. 2004. “Regular Quantal Response Harvey, Paul H., R. D. Martin, and T. H. Clutton-
Equilibrium,” Mimeo. California Institute of Brock. 1986. “Life Histories in Comparative
Technology and University of Virginia. Perspective,” in Primate Societies. Barbara B. Smuts,
Gould, Stephen Jay and Richard C. Lewontin. 1979. Dorothy L. Cheney, Robert M. Seyfarth, Richard
W. Wrangham, and Thomas T. Struhsaker, eds. Gail Cardew, eds. New York: Wiley, 51–70.
Chicago: University of Chicago Press, 181–96. Kacelnik, Alex and Fausto Brito e Abreu. 1998. “Risky
Henrich, Joseph. 2000. “Does Culture Matter in Choice and Weber’s Law.” Journal of Theoretical
Economic Behavior? Ultimatum Game Bargaining Biology, 194(2): 289–98.
among the Machiguenga of the Peruvian Amazon.” Kagel, John H. 1995. “Auctions: A Survey of
American Economic Review, 90(4): 973–79. Experimental Research,” in The Handbook of
Henrich, Joseph. 2004. “Cultural Group Selection, Experimental Economics. John H. Kagel and Alvin
Coevolutionary Processes and Large-Scale E. Roth, eds. Princeton: Princeton University Press,
Cooperation.” Journal of Economic Behavior and 501–85.
Organization, 53(1): 3–35. Kagel, John H., Ronald M. Harstad, and Dan Levin.
Henrich, Joseph, Robert Boyd, Samuel Bowles, Colin 1987. “Information Impact and Allocation Rules in
Camerer, Ernst Fehr, Herbert Gintis, and Richard Auctions with Affiliated Private Values: A Laboratory
McElreath. 2001. “In Search of Homo Economicus: Study.” Econometrica, 55(6): 1275–304.
Behavioral Experiments in 15 Small-Scale Societies.” Kagel, John H. and Dan Levin. 2002. Common Value
American Economic Review, 91(2): 73–78. Auctions and the Winner’s Curse. Princeton and
Hoffman, Elizabeth, Kevin A. McCabe, and Vernon L. Oxford: Princeton University Press.
Smith. 1996. “On Expectations and the Monetary Kahneman, Daniel and Amos Tversky, eds. 2000.
Stakes in Ultimatum Games.” International Journal Choices, Values, and Frames. Cambridge: Cambridge
of Game Theory, 25(3): 289–301. University Press; New York: Russell Sage Foundation.
Hoffman, Elizabeth, Kevin McCabe, Keith Shachat, Kahneman, Daniel and Amos Tversky. 1979. “Prospect
and Vernon L. Smith. 1994. “Preferences, Property Theory: An Analysis of Decision under Risk.”
Rights, and Anonymity in Bargaining Games.” Econometrica, 47(2): 263–91.
Games and Economic Behavior, 7(3): 346–80. Knetsch, Jack L., Fang-Fang Tang, and Richard H.
Hoffman, Elizabeth, Kevin McCabe, and Vernon L. Thaler. 2001. “The Endowment Effect and Repeated
Smith. 1996. “Social Distance and Other-Regarding Market Trials: Is the Vickrey Auction Demand
Behavior in Dictator Games.” American Economic Revealing?” Experimental Economics, 4(3): 257–69.
Review, 86(3): 653–60. Kohlberg, Elon and Jean-Francois Mertens. 1986. “On
Höglund, J., M. Eriksson, and L. E. Lindell. 1990. the Strategic Stability of Equilibria.” Econometrica,
“Females of the Lek-Breeding Great Snipe, 54(5): 1003–37.
Gallinago media, Prefer Males with White Tails.” Ledyard, John O. 1986. “The Scope of the Hypothesis
Animal Behavior, 40: 23–32. of Bayesian Equilibrium.” Journal of Economic
Holt, Charles A. and Susan K. Laury. 2002. “Risk Theory, 39(1): 59–82.
Aversion and Incentive Effects.” American Ledyard, John O., David Porter, and Randii Wessen.
Economic Review, 92(5): 1644–55. 2000. “A Market-Based Mechanism for Allocating
Hopkins, Ed. 2002. “Two Competing Models of How Space Shuttle Secondary Payload Priority.”
People Learn in Games.” Econometrica, 70(6): Experimental Economics, 2(3): 173–95.
2141–66. Leutenegger, W. 1982. “Encephalization and
Houser, Daniel, Michael Keane, and Kevin McCabe. Obstetrics in Primates with Particular Reference to
2004. “Behavior in a Dynamic Decision Problem: An Human Evolution,” in Primate Brain Evolution:
Analysis of Experimental Evidence Using a Bayesian Methods and Concepts. Este Armstrong and Dean
Type Classification Algorithm.” Econometrica, 72(3): Falk, eds. New York: Plenum: 85–95.
781–822. Levine, David K. 1998. “Modeling Altruism and
Houston, Alasdair I. and John M. McNamara. 1999. Spitefulness in Experiments.” Review of Economic
Models of Adaptive Behavior: An Approach Based on Dynamics, 1(3): 593–622.
State. Cambridge: Cambridge University Press. Lipman, Barton L. 1999. “Decision Theory without
Jehiel, Philippe. 2004. “Analogy-Based Expectation Logical Omniscience: Toward an Axiomatic
Equilibrium.” Journal of Economic Theory, In Press. Framework for Bounded Rationality.” The Review of
Johnstone, Rufus A. 1995. “Sexual Selection, Honest Economic Studies, 66(2): 339–61.
Advertisement and the Handicap Principle: Loewenstein, George and Drazen Prelec. 1992.
Reviewing The Evidence.” Biological Reviews, “Anomalies in Intertemporal Choice: Evidence and
70(1): 1–65. an Interpretation.” The Quarterly Journal of
Johnstone, Rufus A. 1998. “Game Theory and Economics, 107(2): 573–97.
Communication,” in Game Theory and Animal Low, Bobbi S., R. D. Alexander, and K. M. Noonan.
Behavior. Lee Alan Dugatkin and Hudson Kern 1987. “Human Hips, Breasts, and Buttocks: Is Fat
Reeve, eds. Oxford: Oxford University Press, Deceptive?” Ethology and Sociobiology, 8(4): 249–57.
94–117. Luce, Duncan and Howard Raiffa. 1957. Games and
Johnstone, Rufus A. and Alan Grafen. 1992. “Error- Decisions. New York: Wiley.
Prone Signalling.” Proceedings of the Royal Society Mazur, James E. 1984. “Tests of an Equivalence Rule
of London, Series B, 248: 229–33. for Fixed and Variable Reinforcer Delays.” Journal of
Kacelnik, Alex. 1997. “Normative and Descriptive Experimental Psychology: Animal Behavior
Models of Decision Making: Time Discounting and Processes, 10(4): 426–36.
Risk Sensitivity,” in Characterizing Human Mazur, James E. 1986. “Fixed and Variable Ratios and
Psychological Adaptations. Gregory R. Bock and Delays: Further Tests of an Equivalence Rule.”
Journal of Experimental Psychology: Animal Experimental Data from Sequential Games.” The
Behavior Processes, 12(2): 116–24. Quarterly Journal of Economics, 107(3): 865–88.
Mazur, James E. 1987. “An Adjusting Procedure for Rabin, Matthew. 1993. “Incorporating Fairness into
Studying Delayed Reinforcement,” in Quantitative Game Theory and Economics.” American Economic
Analyses of Behaviour: The Effect of Delay and of Review, 83(5): 1281–302.
Intervening Events on Reinforcement Value. Michael Rabin, Matthew. 2000. “Risk Aversion and Expected-
L. Commons, James E. Mazur, John A. Nevin, and Utility Theory: A Calibration Theorem.”
Howard Rachlin, eds. Hillsdale, NJ: Lawrence Econometrica, 68(5): 1281–92.
Erlbaum, 55–73. Rabin, Matthew and Richard H. Thaler. 2001.
McCabe, Kevin A., Stephen J. Rassenti, and Vernon L. “Anomalies: Risk Aversion.” Journal of Economic
Smith. 1998. “Reciprocity, Trust, and Payoff Privacy Perspectives, 15(1): 219–32.
in Extensive Form Bargaining.” Games and Rassenti, Stephen J., Vernon L. Smith, and Robert L.
Economic Behavior, 24(1–2): 10–24. Bulfin. 1982. “A Combinatorial Auction Mechanism
McKelvey, Richard D. and Thomas R. Palfrey. 1992. for Airport Time Slot Allocation.” Bell Journal of
“An Experimental Study of the Centipede Game.” Economics, 13(2): 402–17.
Econometrica, 60(4): 803–36. Ridley, Matt. 1993. The Red Queen: Sex and the
McKelvey, Richard D. and Thomas R. Palfrey. 1995. Evolution of Human Nature. New York: Penguin
“Quantal Response Equilibria for Normal Form Books.
Games.” Games and Economic Behavior, 10(1): 6–38. Robson, Arthur J. 1992. “Status, the Distribution of
McKelvey, Richard D. and Thomas R. Palfrey. 1998. Wealth, Private and Social Attitudes to Risk.”
“Quantal Response Equilibria for Extensive Form Econometrica, 60(4): 837–57.
Games.” Experimental Economics, 1(1): 9–41. Robson, Arthur J. 1996a. “A Biological Basis for
Milgrom, Paul. 2004. Putting Auction Theory to Work. Expected and Non-expected Utility.” Journal of
Cambridge: Cambridge University Press. Economic Theory, 68(2): 397–424.
Milton, Katharine. 1988. “Foraging Behavior and the Robson, Arthur J. 1996b. “The Evolution of Attitudes
Evolution of Primate Cognition,” in Machiavellian to Risk: Lottery Tickets and Relative Wealth.” Games
Intelligence: Social Expertise and the Evolution of and Economic Behavior, 14(2): 190–207.
Intellect in Monkeys, Apes and Humans. Richard W. Robson, Arthur J. 2001. “The Biological Basis of
Byrne and Andrew Whiten, eds. Oxford: Clarendon Economic Behavior.” Journal of Economic
Press: 285–305. Literature, 39(1): 11–33.
Møller, Anders. 1988. “Female Choice Selects for Male Robson, Arthur J. 2001. “Why Would Nature Give
Sexual Tail Ornaments in the Monogamous Individuals Utility Functions?” Journal of Political
Swallows.” Nature, 332(6165): 640–42. Economy, 109(4): 900–14.
Nash, John F. 2002. “Noncooperative Games,” in The Ross, Sheldon. 1996. Stochastic Processes. New York:
Essential John Nash. Harold W. Kuhn and Sylvia Wiley.
Nasar, eds. Princeton: Princeton University Press: Roth, Alvin E. 1987. “Laboratory Experimentation in
85–98. Economics,” in Advances in Economic Theory: Fifth
Ochs, Jack and Alvin E. Roth. 1989. “An Experimental World Congress of the Econometric Society. Truman
Study of Sequential Bargaining.” American Bewley, ed. Cambridge: Cambridge University
Economic Review, 79(3): 355–84. Press, 269–99.
Pinker, Steven. 1997. How the Mind Works. New York: Roth, Alvin E. 1988. “Laboratory Experimentation in
W. W. Norton. Economics: A Methodological Overview.” Economic
Plott, Charles R. 1996. “Rational Individual Behavior Journal, 98(393): 974–1031.
in Markets and Social Choice Processes: The Roth, Alvin E. 1991. “Game Theory as a Part of
Discovered Preference Hypothesis,” in The Rational Empirical Economics.” Economic Journal, 101(404):
Foundations of Economic Behavior. Kenneth J. 107–14.
Arrow, Enrico Colombatto, Mark Perlman and Roth, Alvin E. 1993. “The Early History of
Christian Schmidt, eds. New York: St. Martin’s Press, Experimental Economics.” Journal of the History of
225–50. Economic Thought, 15(2): 184–209.
Plott, Charles R. and Vernon L. Smith. 1978. “An Roth, Alvin E. 1994. “Let’s Keep the Con out of
Experimental Examination of Two Exchange Experimental Econ.: A Methodological Note.”
Institutions.” The Review of Economic Studies, 45(1): Empirical Economics, 19(2): 279–89.
133–53. Roth, Alvin E. 1995. “Bargaining Experiments,” in
Plott, Charles R. and Kathryn Zeiler. 2003. “The Handbook of Experimental Economics. John H.
Willingness to Pay/Willingness to Accept Gap, the Kagel and Alvin E. Roth, eds. Princeton: Princeton
‘Endowment Effect,’ Subject Misconceptions and University Press, 253–348.
Experimental Procedures for Eliciting Valuations,” Roth, Alvin E. and Ido Erev. 1995. “Learning in
Mimeo. California Institute of Technology. Extensive-Form Games: Experimental Data and
Postlewaite, Andrew. 1998. “The Social Basis of Simple Dynamic Models in the Intermediate Term.”
Interdependent Preferences.” European Economic Games and Economic Behavior, 8(1): 164–212.
Review, 42(3–5): 779–800. Roth, Alvin E. and Michael W. K. Malouf. 1979. “Game-
Prasnikar, Vesna and Alvin E. Roth. 1992. Theoretic Models and the Role of Information in
“Considerations of Fairness and Strategy: Bargaining.” Psychological Review, 86(6): 574–94.
Roth, Alvin E. and J. Keith Murnighan. 1982. “The Markets and the Walrasian Hypothesis.” Journal of
Role of Information in Bargaining: An Experimental Political Economy, 73(4): 387–93.
Study.” Econometrica, 50(5): 1123–42. Smith, Vernon L. 1976. “Bidding and Auctioning
Roth, Alvin E., Vesna Prasnikar, Masahiro Okuno- Institutions: Experimental Results,” in Bidding and
Fujiwara, and Shmuel Zamir. 1991. “Bargaining and Auctioning for Procurement and Allocation. Y.
Market Behavior in Jerusalem, Ljubljana, Amihud, ed. New York: New York University Press,
Pittsburgh, and Tokyo: An Experimental Study.” 43–64.
American Economic Review, 81(5): 1068–95. Smith, Vernon L. 1982. “Microeconomic Systems as an
Roth, Alvin E. and Francoise Schoumaker. 1983. Experimental Science.” American Economic Review,
“Expectations and Reputations in Bargaining: An 72(5): 923–55.
Experimental Study.” American Economic Review, Smith, Vernon L. 1991. Papers in Experimental
73(3): 362–72. Economics. Cambridge: Cambridge University Press.
Rubinstein, Ariel. 1998. Modeling Bounded Smith, Vernon L. 2000. Bargaining and Market
Rationality. Cambridge: MIT Press. Behavior: Essays in Experimental Economics.
Rubinstein, Ariel. 2001. “Comments on the Risk and Cambridge: Cambridge University Press.
Time Preferences in Economics,” Mimeo. Tel Aviv Smith, Vernon L. 2003. “Constructivist and Ecological
University. Rationality in Economics.” American Economic
Rubinstein, Ariel. 2003. “‘Economics and Psychology’? Review, 93(3): 465–508.
The Case of Hyperbolic Discounting.” International Sober, Elliott and David Sloan Wilson. 1998. Unto
Economic Review, 44(4): 1207–16. Others: The Evolution and Psychology of Unselfish
Samuelson, Larry. 2001. “Analogies, Adaptation, and Behavior. Cambridge: Harvard University Press.
Anomalies.” Journal of Economic Theory, 97(2): Sozou, Peter D. 1998. “On Hyperbolic Discounting and
320–66. Uncertain Hazard Rates.” Proceedings of the Royal
Samuelson, Larry. 2004. “Information-Based Relative Society of London, Series B, 265(1409): 2015–20.
Consumption Effects.” Econometrica, 72(1): 93–118. Sunder, Shyam. 2004. “Markets as Artifacts: Aggregate
Samuelson, Larry and Jeroen Swinkels. 2001. Efficiency from Zero-Intelligence Traders,” in
“Information and the Evolution of the Utility Models of a Man: Essays in Honor of Herbert A.
Function,” SSRI Working Paper 2001-06, University Simon. Mie Augier and James G. March, eds.
of Wisconsin. Cambridge: MIT Press, 501–19.
Sandroni, Alvaro. 2003. “The Reproducible Properties Thaler, Richard H. 1988. “Anomalies: The Ultimatum
of Correct Forecasts.” International Journal of Game Game.” Journal of Economic Perspectives, 2(4):
Theory, 32(1): 151–59. 195–206.
Savage, Leonard J. 1972. The Foundations of Statistics. Thaler, Richard H. 1992. The Winner’s Curse.
New York: Dover Publications. Princeton: Princeton University Press.
Segal, Uzi and Joel Sobel. 2003. “Tit for Tat: Thaler, Richard H. 1994. Quasi-Rational Economics.
Foundations of Preferences for Reciprocity in New York: Russell Sage Foundation.
Strategic Settings,” Mimeo. Johns Hopkins Tversky, Amos and Daniel Kahneman. 1982.
University and University of California, San Diego. “Judgment under Uncertainty: Heuristics and
Selten, Reinhard. 1965. “Spieltheoretische Biases,” in Judgment under Uncertainty: Heuristics
Behandlung eines Oligopolmodells mit and Biases. Daniel Kahneman, Paul Slovic, and
Nachfrageträgheit.” Zeitschrift für die Gesamte Amos Tversky, eds. Cambridge: Cambridge
Staatswissenschaft, 121: 301–24 and 667–89. University Press, 3–22.
Selten, Reinhard. 1975. “Reexamination of the Tversky, Amos and Daniel Kahneman. 1983.
Perfectness Concept for Equilibrium Points in “Extensional Versus Intuitive Reasoning: The
Extensive-Form Games.” International Journal of Conjunction Fallacy in Probability Judgment.”
Game Theory, 4(1–2): 25–55. Psychological Review, 90(4): 293–315.
Selten, Reinhard, Abdolkarim Sadrieh, and Klaus Walker, James M., Vernon L. Smith, and James C. Cox.
Abbink. 1999. “Money Does Not Induce Risk 1990. “Inducing Risk-Neutral Preferences: An
Neutral Behavior, but Binary Lotteries Do Even Examination in a Controlled Market Environment.”
Worse.” Theory and Decision, 46(3): 211–49. Journal of Risk and Uncertainty, 3(1): 5–24.
Slonim, Robert and Alvin E. Roth. 1998. “Learning in Wason, Peter. 1966. “Reasoning,” in New Horizons
High Stakes Ultimatum Games: An Experiment in in Psychology. B. M. Foss, ed. Harmondsworth:
the Slovak Republic.” Econometrica, 66(3): 569–96. Penguin Books.
Smith, Cedric A. B. 1961. “Consistency in Statistical Weibull, Jörgen W. 2004. “Testing Game Theory,” in
Inference and Decision.” Journal of The Royal Advances in Understanding Strategic Behaviour:
Statistical Society, Series B, 23(1): 1–37. Game Theory, Experiments and Bounded
Smith, Vernon L. 1962. “An Experimental Study of Rationality: Essays in Honour of Werner Güth.
Competitive Market Behavior.” Journal of Political Steffen Huck, ed. Basingstoke: Palgrave.
Economy, 70(2): 111–37. Winter, Eyal and Shmuel Zamir. 1997. “An Experiment
Smith, Vernon L. 1964. “Effect of Market Organization with Ultimatum Bargaining in a Changing
on Competitive Equilibrium.” Quarterly Journal of Environment,” Discussion Paper 159, The Hebrew
Economics, 78(2): 181–201. University, Center for Rationality and Interactive
Smith, Vernon L. 1965. “Experimental Auction Decision Theory.
