Games Machines Play

Wynn C. Stirling
Electrical and Computer Engineering Department
459CB Brigham Young University, Provo, UT, USA 84602

Competition, which is the instinct of selfishness, is another word for dissipation of
energy, while combination is the secret of efficient production.
— Edward Bellamy, Looking Backward (1888)

Abstract

Individual rationality, or doing what is best for oneself, is a standard model used to

explain and predict human behavior, and von Neumann-Morgenstern game theory is the

classical mathematical formalization of this theory in multiple-agent settings. Individual

rationality, however, is an inadequate model for the synthesis of artificial social systems

where cooperation is essential, since it does not permit the accommodation of group in-

terests other than as aggregations of individual interests. Satisficing game theory is based

upon a well-defined notion of being good enough, and does accommodate group as well

as individual interests through the use of conditional preference relationships, whereby a

decision maker is able to adjust its preferences as a function of the preferences, and not

just the options, of others. This theory is offered as an alternative paradigm to construct

artificial societies that are capable of complex behavior that goes beyond exclusive self

interest.

Keywords
game theory, decision theory, individual rationality, group rationality, satisficing, altruism



1 Introduction

In an environment of rapidly increasing computer power and greatly increased scientific knowl-

edge of human cognition, it is inevitable that serious consideration will be given to designing

artificial systems that function analogously to the way humans function. Many researchers

in this field concentrate on four major metaphors: (a) brain-like models (neural networks), (b)

natural language models (fuzzy logic), (c) biological evolutionary models (genetic algorithms),

and (d) cognition models (rule-based systems). The assumption is that, by designing according

to these metaphors, machines can be made at least to imitate, if not replicate, human behavior.

Such systems are often claimed to be intelligent.

The word “intelligent” has been appropriated by many groups, and may mean anything

from nonmetaphorical cognition (for example, strong AI) to advertising hype (for example,

intelligent lawn mowers). Some of the definitions in use are quite complex, some are circular,

and some are self-serving. But when all else fails, we may appeal to etymology—it owns
the deed to the word; everyone else can only claim squatters’ rights. Intelligent comes from the

Latin roots inter (between) + legĕre (to choose). Ultimately, it seems, there is only one essential

characteristic of intelligence in man or machine—an ability to choose between alternatives.

Classifying “intelligent” systems in terms of anthropomorphic metaphors deals primar-

ily with the way knowledge is represented, rather than with the way decisions are made.

Whether knowledge is represented by neural connection weights, fuzzy set-membership func-

tions, genes, production rules, or even differential equations, is a choice that must be made

according to the context of the problem and the preferences of the system designer. The way

knowledge is represented, however, does not dictate the rational basis underlying the way de-

cisions are made, and therefore has little to do with this important aspect of intelligence.

One question, when designing a decision-making machine or a society of machines, is the

issue of just where the actual choosing mechanism lies—with the designer, who must supply

the machine with all of the rules it is to follow, or with the machine itself, in that it possesses

a degree of true autonomy. This essay does not address that question. Instead, it focuses

primarily on the issue of how decisions might be made, rather than who ultimately bears the



responsibility for making them.

Much current research is being devoted to the design and eventual implementation of

artificial social systems. The envisioned applications of this technology include automated

air-traffic control, automated highway control, automated shop floor management, computer

network control, and so forth. Although much effort has concentrated on such applications,

attempts to design artificial systems that are compatible with human behavior, and thus have

a chance of being accepted and trusted by people, have met with limited success. Perhaps the

most important (and most difficult) social attribute to imitate is that of coordinated behavior,

whereby autonomous, distributed machines coordinate their actions to accomplish

tasks that achieve the goals of both the society and its members.

This essay investigates rationality models that may be used by men or machines. If ef-

fective and trustworthy decision-making machines are to be constructed, they must function

according to an adequate model of human behavior, both in isolation and in social settings.

The decision-making mechanisms employed by such machines must be understandable to and

viewed as reasonable by the people who interface with such systems. In Section 2 we discuss

conventional rational choice theory as it is typically instantiated via game theory, and point out

its shortcomings. In Section 3 we turn our attention to the problem of synthesizing artificial

decision systems and motivate the need for a model of rationality that accounts for group as

well as individual interests. In Section 4 we introduce an alternative to conventional rational

choice, and then describe a new theory of games, which we call satisficing games, in Section

5. Section 6 describes a key feature of our approach, namely, the capability to condition on

preferences; this feature distinguishes our approach from approaches based upon conventional

notions of rationality. Section 7 then presents an example of a satisficing game, and Section 8

finishes with a discussion of our theory.

2 Individual Rationality

People make choices according to various criteria, ranging from capriciousness at the low end

of sophistication, then to random guesses, then to heuristics, and finally to rational choice as



perhaps the most sophisticated form of decision making. Rational choice requires the decision

maker to form a total ordering of his or her preferences, and then to select a most preferable

option. This procedure is termed the principle of individual rationality.

Individual rationality is the acknowledged standard for calculus/probability-based knowl-

edge representation and decision making, and it is also the de facto standard for the alternative

approaches based on anthropomorphic metaphors. When designing neural networks, algo-

rithms are designed to calculate the optimum weights, fuzzy sets are defuzzified to a crisp value

by choosing the element of the fuzzy set with the highest degree of set membership, genetic

algorithms are designed under the principle of survival of the fittest, and rule-based systems are

designed according to the principle that a decision maker will operate in its own best interest

according to what it knows.

When more than one decision maker is involved, each participant must take into consider-

ation the possible actions of the other participants when determining his or her best choice, but

the principle of individual rationality still applies—the decision maker’s best choice is merely

constrained by the possible choices of others. Game theory, as developed by (von Neumann

and Morgenstern, 1944), is a mathematical formalization of this activity. It takes into account

all possible actions and consequences for all participants, and determines what the best actions

are for all players. Game theory provides a normative rule that tells a rational person what he

or she should do in various circumstances. A solution to a game is called a strategy, which is

a set of rules the decision maker should follow in every possible circumstance, or state, of the

game. A strategy vector is a set of strategies, one for each participant.

Equilibrium strategy vectors are the most useful. A strategy vector is said to be a Nash

equilibrium if, should any one participant change his or her own strategy, the new strategy

would not increase the level of his or her satisfaction. If all players simultaneously feel that way,

then there is no incentive for any player to change unilaterally—hence the equilibrium. There

are two important issues with equilibrium solutions. First, there may be more than one equi-

librium state, and second, the solutions, though instructive, are not constructive, in that game

theory does not tell us how to obtain an equilibrium state—it merely identifies such states. A signif-



icant part of classical game theory literature is devoted to refinements of the theory in attempts

to address these key issues.

Recently, much attention has turned to the investigation of games in repeated-play situa-

tions. In this context, it is possible for the participants to adapt their strategies over time in

an attempt to improve their long-run performance. These are called evolutionary games, since

they invoke metaphors such as natural selection in an attempt to converge to a Nash equilibrium

or to discover some new concept that provides some form of mutual or individual benefit. At

the end of the day, however, evolutionary games are based upon exactly the same fundamen-

tal premise as is conventional game theory; namely, the premise of individual rationality—the

decision maker does the best thing for itself, regardless of the consequences to others. One

of the main distinctions between conventional game theory and evolutionary game theory is

that in the latter, preference orderings emerge through experience, while in the former they are

assumed to be known a priori, perhaps provided by some Laplacian daemon who has access

to the hearts and minds of all participants.

The individual rationality premise is perhaps the dominant concept when constructing mod-

els of multiple-agent decision making. It appears to be one of the favorite models of economics,

political science, and psychology. One of its virtues is its simplicity. It is the Occam’s razor

of interpersonal interaction, and relies only upon the minimal assumption that an individual

will put its own interests above everything and everyone else. It is understood, however, that

this model is an abstraction of reality and is not causal. That is, a participant may exhibit

behavior that is consistent with the individual rationality premise, but that does not mean that

the individual is actually computing numerical preference orderings according to some math-

ematical formula and consciously optimizing (in the single-agent case) or equilibrating (in the

multiple-agent case). The value of such models is that they provide insight into the workings

of a complex society, and can be used to explain past behavior or to predict future behavior.

In his celebrated book, The Evolution of Cooperation, (Axelrod, 1984) makes a com-

pelling argument that cooperation may emerge in repeated-play situations where, although

non-cooperation may be more satisfying in the short run, long run satisfaction is better served



by cooperating, even in situations that are conducive to exploitation. The main vehicle for

establishing this hypothesis is the famous Prisoner’s Dilemma game, which is taken by many

as a model to explain the actions of people in a wide variety of social situations, and provides

a normative basis for their behavior.

While this game may be a reasonable model in the context of human behavior, it is not

obvious that it should be applied to artificial systems that are to be designed from their inception

to cooperate. With the Prisoner’s Dilemma model, cooperation, if it occurs at all, emerges as

the participants gain experience with repeated play. Unfortunately, such cooperation is not

binding, and the participants are free to defect at any time. Thus, even if a machine were to

learn to cooperate, it would still be under no obligation to do so, and it could also learn in the

future that defection is more to its advantage, especially in an evolutionary environment where

the participants are susceptible to infection by agents with different value systems.

The Prisoner’s Dilemma game may be an appropriate model of behavior when the opportu-

nity for exploitation exists and cooperation, though possible, incurs great risk, while defection,

even though it offers diminished rewards, also protects the participant from catastrophe. Many

social situations, however, possess a strong cooperative flavor with very little incentive for ex-

ploitation. One prototypical game that captures this behavioral environment is the Battle of

the Sexes game.1 This is a game involving a man and a woman who plan to meet in town for

a social function. She (S) prefers to go to the ballet (B), while he (H) prefers the dog races

(D). Each also prefers to be with the other, however, wherever it may be. This is a coordina-

tion game in which the players act simultaneously, and is a prototype for many coordination

scenarios, such as might occur with multiple robots trying to achieve some task, or with allo-

cating resources on an automated shop floor. The classical way to formulate this game is via a

payoff matrix, as given below in ordinal form, with the payoff pairs which compose this matrix

representing the benefits to H and S, respectively.


                     S
                D         B
   H      D   (4,3)     (2,2)
          B   (1,1)     (3,4)

   Key: 4 = best; 3 = next best; 2 = next worst; 1 = worst



Rather than competing, these players wish to cooperate, but they must make their decisions

without benefit of communication. Both players lose if they make different choices, but the

choices are not all of equal value to the players. This game has two Nash equilibria, (D, D)

and (B, B). Unfortunately, this observation is of little help in defining a strategy.
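
As a mechanical check of this claim, the following Python sketch (our own illustration; the dictionary encoding and option labels simply reproduce the ordinal payoffs from the matrix above) enumerates the pure-strategy Nash equilibria:

    # Sketch: enumerate the pure-strategy Nash equilibria of the ordinal
    # Battle of the Sexes matrix above.  Payoff pairs are (H, S).
    payoff = {
        ("D", "D"): (4, 3), ("D", "B"): (2, 2),
        ("B", "D"): (1, 1), ("B", "B"): (3, 4),
    }
    options = ["D", "B"]

    def is_nash(h, s):
        uh, us = payoff[(h, s)]
        # Neither player may gain by deviating unilaterally.
        no_gain_h = all(payoff[(h2, s)][0] <= uh for h2 in options)
        no_gain_s = all(payoff[(h, s2)][1] <= us for s2 in options)
        return no_gain_h and no_gain_s

    print([(h, s) for h in options for s in options if is_nash(h, s)])
    # [('D', 'D'), ('B', 'B')]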

One of the perplexing aspects of this game is that it does not pay to be altruistic, since

both participants doing so guarantees the worst outcomes for both. Nor does it pay to be

selfish—that guarantees the next worst outcome for both. The best and next-best outcomes

obtain if one participant is selfish and the other altruistic. It seems, however, that an attitude

of moderation should be helpful, but there is no fool-proof way to express this attitude without

direct communication.

An approach that does not require communication is for each player to flip a coin and

choose according to the outcome of that randomizing experiment. If they were to repeat this

game many times, then, on average, each player would realize an outcome midway between

next best and next worst for each. But, for any given trial, they would be in one of the four

states with equal probability.

If the players could communicate, then a much better strategy would be to alternate be-

tween (D, D) and (B, B), thus ensuring an average level of satisfaction midway between best

and next best for each. If they possess the ability to learn from experience, they may also

converge to an alternation scheme under repeated play.
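
For a quick arithmetic check of these two averages, treat the ordinal ranks as if they were evenly spaced utilities (an assumption made only for this illustration). Independent coin flips place the players in each cell of the matrix with probability 1/4, while alternation splits time equally between (D, D) and (B, B):

$$\tfrac{1}{4}(4 + 2 + 1 + 3) = 2.5, \qquad \tfrac{1}{2}(4 + 3) = 3.5,$$

that is, midway between next worst and next best in the first case, and midway between next best and best in the second.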

Regardless of the strategies that may be employed, this game, as it is configured by the

payoff matrix, illustrates the shortcomings of conventional utility theory for the characteriza-

tion of behavior when cooperation is essential. Each player’s level of satisfaction is determined

completely as a function of his or her own enjoyment. For example, the strategy vector (D, D)

is best for H, but it is because he gets his way on both counts: he goes to his favorite event

and he is with S. Her feelings, however, are not taken into consideration. According to the

setup, it would not matter to H if S were to detest dog races and were willing to put up with

that event at great sacrifice of her own enjoyment, just to be with H. Such selfish attitudes,

though not explicit, are at least implied by the structure of the payoff matrix, and are likely to



send any budding romance to the dogs. The problem is that the solution concept, based as it is

upon individual rationality, fosters competition, even though cooperation is desired.

As an illustration of a subtle type of competition that may emerge from this game, consider

the operation of a shop floor. Producer X1 can manufacture lamps or tables, and Producer X2

can manufacture lamp shades or table cloths, but each must choose which product to manufac-

ture without direct communication. Coordinated behavior would make both of their products

more marketable, as indicated below, where the net profit accruing to each producer as a func-

tion of their joint decisions is displayed. Clearly, this is an instantiation of the Battle of the

Sexes game.

                          X2
                   Shades         Cloths
   X1     Lamps   ($20, $5)      ($10, $4)
          Tables  ($8, $3)       ($15, $10)

Using these numbers, X1 might reason that, since his profit for (Lamps, Shades) is twice

the profit to X2 for (Tables, Cloths) but the incremental change in profit is the same for both,

then his preference is stronger, and should prevail. On the other hand, however, X2 might

reason that, since it is worth twice as much to her if they produce (Tables, Cloths) rather
than (Lamps, Shades), and it is only 4/3 as valuable to X1 if they produce (Lamps, Shades)

rather than (Tables, Cloths), her preference is stronger, and should prevail. Of course, if side

payments are allowed, then this could help resolve the dilemma, but that would also create a

secondary game of how to arrive at equitable side payments—which may produce an infinite

regress of dilemmas.

Unfortunately, rational choice does not resolve such potential conflicts, although a com-

promise may emerge with repeated play. Fortunately, however, the payoff structure does not

tell the whole story. It does not capture the interrelationships that may exist between the two

producers. But interrelationships, such as deference to the other, are vital to human interaction,

and they should also be considerations for artificial social systems if they are to imitate human

behavior. These examples serve to motivate us to take a hard look at the basic premises of

decision making.



Taylor (Taylor, 1987) attempts to mitigate the effects of individual rationality with a formal
notion of altruism: the game is transformed into a new game whose utility array entries account
for the payoffs to others as well as to the player itself. He suggests that each player’s utility
function be expressed as a weighted average of the payoffs to itself and to others.

By adjusting the weights, the agent is able to take into consideration the needs of others. Taylor

even goes further, and suggests that each player’s utility might be a function of other players’

utilities as well as their payoffs, and shows how this potentially leads to a situation of infinite

regress, with the players trying to account for higher and higher orders of altruism.
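
A minimal sketch of this kind of transformation is given below, using the shop-floor dollar payoffs from the previous section. The weighted-average form follows Taylor's suggestion, but the particular weights (and the Python encoding) are hypothetical illustrations, not his.

    # Sketch of first-order altruism: each producer's utility is a weighted
    # average of its own profit and the other's profit.  Payoff pairs are
    # (X1's profit, X2's profit); the weights w1, w2 are made-up.
    payoff = {
        ("Lamps", "Shades"): (20, 5), ("Lamps", "Cloths"): (10, 4),
        ("Tables", "Shades"): (8, 3), ("Tables", "Cloths"): (15, 10),
    }

    def altruistic_game(w1, w2):
        return {cell: (w1 * p1 + (1 - w1) * p2,   # X1's transformed utility
                       w2 * p2 + (1 - w2) * p1)   # X2's transformed utility
                for cell, (p1, p2) in payoff.items()}

    print(altruistic_game(w1=0.7, w2=0.7)[("Tables", "Cloths")])
    # (0.7*15 + 0.3*10, 0.7*10 + 0.3*15) = (13.5, 11.5)

Setting w1 = w2 = 1 recovers the original egoistic game; lowering a weight shifts that player toward the categorical altruism discussed next, regardless of whether the other player actually needs the accommodation.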

Taylor’s form of altruism does not distinguish between the state of actually relinquishing

one’s own self-interest and the state of being willing to relinquish one’s own self interest under

the appropriate circumstances. The unconditional relinquishment of one’s own self interests

is a condition of categorical altruism, whereby a decision maker unconditionally modifies

its preferences to accommodate the preferences of others. A purely altruistic player would

completely replace its preferences with the preferences of others. A state of being willing to

modify one’s preferences to accommodate others if the need arises is a state of situational

altruism, whereby a decision maker is willing to accommodate the preferences of others in lieu

of its own preferences if doing so would actually benefit the other, but otherwise retains its own

egoistic preferences intact and avoids needless sacrifice.

Categorical altruism may be too much to expect from a reasonable decision maker, whereas

the same decision maker may be willing to permit a form of situational altruism. It is one thing

for an individual to modify its behavior if it is sure that doing so will benefit another individual,

but it is quite another thing for an individual to modify its behavior regardless of its effect on the

other. With the Battle of the Sexes, given that S has a very strong aversion to D (even though

she would be willing to put up with those extremely unpleasant surroundings simply to be

with him and thus receive her second-best payoff), H might be willing to manifest situational

altruism by preferring B to D, but if she did not have a strong aversion to D he would stick to

his egoistic preference for D over B.

The appeal of optimization and equilibration is a strongly entrenched attitude that domi-



nates much of decision making theory. There is great comfort in following traditional paths,

especially when those paths are founded on such a rich and enduring tradition as rational choice

affords. The justification for the individual rationality hypothesis is generally attributed to

Bergson (Bergson, 1938) and Samuelson (Samuelson, 1948), who assert that individual inter-

ests are fundamental, and that social welfare is an aggregation of individual welfares. While

this perception may be appropriate for environments of perfect competition, the individual ra-

tionality hypothesis loses much of its power in more general settings. As expressed by Arrow,

when the assumption of perfect competition fails, “the very concept of rationality becomes

threatened, because perceptions of others and, in particular, of their rationality become part

of one’s own rationality (Arrow, 1986).” Arrow further asserts that the use of the individual

rationality hypothesis is “ritualistic, not essential.”

What is essential is that any useful model of society be ecologically balanced in that it is

able to accommodate the various relationships that exist between agents and their environment,

including other agents. Many thoughtful scholars argue that individual rationality is not an

adequate model for group behavior (for example, see (Bicchieri, 1993; Raiffa, 1968)), and

perhaps Luce and Raiffa summarized the situation most succinctly when they observed that

general game theory seems to be in part a sociological theory which does not

include any sociological assumptions . . . it may be too much to ask that any so-

ciology be derived from the single assumption of individual rationality (Luce and

Raiffa, 1957, p. 196).

Often, the most articulate advocates of a theory are also its most insightful critics. Regardless

of the merits of individual rationality as a model of human societies, society is not bound to

comply with it or any other model that purports to characterize it. All such models are simply

analysis tools, and their use is limited to that role. They cannot dictate behavior to the members

of the society.



3 Synthesis

In the absence of truly nonmetaphorical artificial intelligence, any artificial society that is not

completely whimsical, random, or anarchic must be designed according to an externally im-

posed sociological model that accounts for inter-agent relationships. In contrast to natural

societies, where models are used for analysis purposes, models in an artificial society are used

for synthesis. The analyst is free to concoct any story-line that is not contradicted by the ob-

servations, and then to abstract this story-line into a game that captures the essence of the

interaction without the encumbrance of irrelevant details. In the synthesis context, however,

the story-line is critical. The participants must actually live the story as they function in their

environment. Thus, not only will the sociological models explain and predict behavior, they

will actually dictate behavior.

Most artificial social systems designs that appear in the literature are based on exactly the

same premise that governs models that are used to analyze natural systems—individual ratio-

nality. But Arrow’s impossibility theorem demonstrates that group preferences cannot always

be made to be consistent with individual preferences if reasonable assumptions regarding the

structure of the decision problem are made (Arrow, 1951). Despite the inability to guarantee

consistency between individual and group preferences, there are a number of approaches to

group decision making, based on the premise of individual rationality, that are seriously con-

sidered. One approach is to view the group itself as a higher-level decision making entity, or

superplayer, who is able to fashion a group expected utility to be maximized. This “group-

Bayesian” approach, however, fails to distinguish between the notion of group choices and

group preferences. While it is one thing to say that a group decides upon a group alternative,

it is quite another thing to assume that the group itself prefers the alternative in any meaning-

ful sense. Shubik refers to this conflation of concepts as an “anthropomorphic trap (Shubik,

1982)” and strongly suggests that it be avoided.

Another classical approach to group decision making is to invoke the Pareto principle,

which is to choose a group decision such that, if any individual were to make a unilateral

change to increase its level of satisfaction, the satisfaction level for at least one other member of



the group would be decreased. The obvious problem with this approach is that all participants

must agree to abide by the Pareto principle. It seems that, for this principle to be viable,

some notion of group preference must enter through the back door, if only to agree to be loyal

Paretians, not to mention the problem of deciding which Pareto choice to select if there is

not a unique one. Nevertheless, Pareto optimality is viewed by many as a rational means of

group decision making. But there is still some disquiet, as Raiffa confesses: “Not too long ago

this principle [Pareto optimality] seemed to me unassailable, the only solid cornerstone in an

otherwise swampy area. I am not so sure now, and I find myself in that uncomfortable position

in which the more I think the more confused I become . . . somehow the group entity is more

than the totality of its members (Raiffa, 1968, p. 233, 237).”

Perhaps the source of the concerns raised by Shubik and Raiffa is a critical flaw in the

doctrine of individual rationality, which flaw becomes a serious issue when designing a system

whose charter is cooperation. The problem is that it is not possible to accommodate both group

and individual interests simultaneously and still strictly adhere to the notion of rationality as

doing the best thing possible.

Much research has been devoted to ways, within the context of game theory, to overcome

the limitations of strict individual rationality (but without abandoning it) when dealing with

group-wide decision problems. (Schelling, 1960) advocates what he calls a “reorientation”

of game theory, wherewith he attempts to counterbalance the notion of conflict with a game-

theoretic notion of coordination. The antipodal extreme to a game of pure conflict is a game

of pure coordination, where all players have coincident interests and desire to take actions that

are simultaneously of self and mutual benefit. In contrast to pure conflict games where one

player wins only if the others lose, with a pure coordination game, either all players win or

all lose. In general, games may involve both conflict and coordination; Schelling terms such

games mixed-motive games. Although Schelling’s attempt at reorientation is useful, it is not a

fully adequate way to mitigate the notion of conflict, since his reorientation does not alter the

fundamental structure of game theory as ultimately dependent on the principle of individual

rationality. Lewis attempts to patch up the problem by introducing the notion of coordination



equilibrium as a refinement of Nash equilibrium. Whereas a Nash equilibrium is a combination

in which no one would be better off had he alone acted otherwise, Lewis strengthens this

concept and defines a coordination equilibrium as a combination in which no one decision

maker would have been better off had any one decision maker alone acted otherwise, either

himself or someone else (Lewis, 1969, p. 14). Coordination equilibria, however, are common

in situations of mixed opposition and coincidence of interest. In fact, even a zero-sum game of

pure conflict can have a coordination equilibrium.

There are many other attempts to overcome the problems that arise with the individual

rationality hypothesis. For example, (Shapley, 1953) suggests that players in an N -person

game should formulate a measure of how much their joining a coalition contributes to its

value, and should use this metric to justify their decision to join or not to join. This formulation,

however, requires the acceptance of additional axioms involving the play of composite games

(Rapoport, 1970). Another way to extend the notion of optimality is to form coalitions on

the basis of no player having a justifiable objection against any other member of the coalition,

resulting in what is called the bargaining set.

Yet another approach to the problem is to modify the decision maker’s utility function so

it is a function of the group reward as well as its individual reward. In effect, the participant

is “brainwashed” into substituting group interests for its personal interests. Then, when acting

according to its own supposed self-interest, it is actually accommodating the group (Wolpert

and Tumer, 2001). A somewhat similar, although less radical, approach is taken by (Glass

and Grosz, 2000) and (Cooper et al., 1996), who attempt to instill a social consciousness into

agents by rewarding them for good social behavior by adjusting their utility functions with

“brownie points” and “warm glow” utilities for doing the “right thing.” (Sen, 1996) introduces

a probabilistic reciprocity mechanism that elicits cooperation between self-interested agents.

Also, it is certainly possible to invoke various voting or auctioning protocols to address this

problem (for example, see (Sandholm, 1999)).

Granted, it is possible under the individual rationality regime for a decision maker to sup-

press its own egoistic preferences in deference to others, but doing so is only a device to trick



individual rationality into providing a response that can be interpreted as unselfish. Such an

artifice provides only an indirect way to simulate socially useful attributes of cooperation, un-

selfishness, and altruism with a model that is more naturally attuned to competition, exploita-

tion, and avarice. All of the approaches discussed above, however, are based, at the end of the

day, upon the premise of individual rationality, and serve as refinements or extensions of that

fundamental premise.2 Rapoport summarizes the situation succinctly:

If . . . we wish to construct a normative theory, i.e., be in a position of advising

“rational players” on what the outcomes of a game “ought” to be, we see that

we cannot do this without further assumptions about what explicitly we mean by

“rationality” (Rapoport, 1970, p. 136).

It seems that the vast majority of the decision-theoretic community is still committed to the

fundamental concept of individual rationality. Yet, researchers understand that achieving the

ideal is often impractical. Proponents of the so-called bounded rationality approach to deci-

sion making recognize that, in the real-world, strict optimization/equilibration often cannot be

achieved, due to either computational or informational constraints. (Sims, 1980) characterizes

this problem as a research “wilderness of disequilibrium economics” with no general theory

to guide the search for alternatives to equilibrium-based solutions. (Kreps, 1990) suggests

that bounded rationality should be defined as “intendedly [individually] rational, but limitedly

so,” and presents the case for short-term optimization and long-term adaptation through ret-

rospection of past results. (Sargent, 1993) offers that this might be done via the methods of

artificial intelligence such as neural computing, genetic algorithms, simulated annealing, and

pattern recognition, with the expectation that artificial boundedly rational agents will learn to

behave as if they were operating under conventional rational choice and at least achieve an “ap-

proximate” equilibrium. Machine learning researchers have had some success in uncovering

such processes (Bowling, 2000; Kalai and Lehrer, 1993; Fudenberg and Levine, 1993; Hu and

Wellman, 1998). Despite the successes of bounded rationality, however, it still holds optimiza-

tion as the ideal to be sought, and accepts a compromise only as an unavoidable consequence

of real-world constraints.



Although much research has been concentrated on ways to deal with the exigencies of

practical decision making, it is perhaps somewhat surprising that these exigencies have not also

prompted a re-examination of the most fundamental premise of all—individual rationality. At

present, there does not appear to be a body of theory that supports the systematic synthesis of

multiple-agent decision systems that does not rely upon the individual rationality premise, and

any relaxation of the demand for the best seems to place us even further into Sims’ wilderness.

But there is always the possibility that somewhere in the wilderness is a promised land.

4 Dichotomies

Let us begin our search at the headwaters of rationality. It is a platitude that decision makers

should make the best choices possible, but we cannot rationally choose an option, even if we

do not know of anything better, unless we know that it is good enough. Being good enough is

the fundamental desideratum of rational decision makers—being best is a bonus. Therefore,

let us replace the demand for the best, and only the best, with a desideratum of being good

enough. To be useful, we must, of course, precisely define what it means to be good enough.

Mathematically formalizing a concept of being good enough is not as straightforward as equi-

librating. Being best is an absolute concept—it does not come in degrees. Being good enough,

however, is not an absolute, and does come in degrees. Consequently, we must not demand or

expect a unique good-enough solution, but instead must be willing to accept multiple solutions

with varying degrees of adequacy.

Optimization is a very sophisticated concept, and requires that the decision maker be able to

accomplish three tasks. The first is to define all of its relevant options, the second is to rank-

order the options according to its preferences for the consequences of implementing them, and

the third is to search through the set of choices to identify the one that is the most preferable.

The starting point for much of conventional decision theory is that the first two tasks are

assumed to be given, and most attention is concentrated on ways to conduct the search. Thus,

much of what is usually termed “decision theory” might more properly be termed “search

theory.” In point of fact, once a decision maker has defined a rank-ordering and has determined



to optimize, it has made its decision—it will implement the highest-ranking option, and all that

remains is to find it.

Rank ordering requires that all options be evaluated according to their expected utility. Al-

though conventional utility theory maps options into numerical degrees of satisfaction, utilities

must also account for any dissatisfaction that may accrue to the decision maker. For example,

owning a luxury vehicle may be very satisfying, hence have high utility, but the cost of doing

so may be very dissatisfying, hence would have high dis-utility. Conventional utility theory

aggregates these two polarizing aspects into a single function that characterizes the net utility

of the option.

As a practical matter, however, people often separate these two aspects. Attached to virtu-

ally every nontrivial option are attributes that are desirable and attributes that are not desirable.

To increase performance, one usually expects to pay more. To win a larger reward, one expects

to take a greater risk. People are naturally wont to evaluate the upside versus the downside, the

pros versus the cons, the pluses versus the minuses, the benefits versus the costs. Operating this

way is almost intuitive; it may be the nearest thing to instinct. Perhaps the most fundamental

human decision-making activity is embodied in the process of evaluating tradeoffs, option by

option—putting the gains and the losses on the balance to see which way it tips. Evaluating
dichotomies in this way yields a simple acceptance criterion: the benefits must be at least as great as the costs. In

this sense, such evaluations provide an intuitive notion of being good enough.

Suppose two hamburgers are placed before you and you are to make a choice. One way to

frame the question before you is to compare the two hamburgers to each other and either select

the better product (presumably on the basis of appearance and cost). Another way to frame the

question is first to confine attention to, say, Hamburger A, and form the decision either to accept

it or to reject it, and then form a similar decision for Hamburger B. The difference is that the

first way to frame the question is extrinsic to Hamburger A, that is, it involves considerations

of issues in addition to that item (namely, its comparison with Hamburger B), while the second

framing is intrinsic, in that it involves consideration only of that particular hamburger. With

the extrinsic framing, one must combine appearance and cost into a single utility to be rank-



ordered, but under the intrinsic framing, one may keep appearance and cost separate and form

the binary evaluation of appearance versus cost. If only one of the hamburgers passes such a

test, the problem is resolved. If you conclude that neither hamburger’s appearance is worthy

of the cost, you are justified in rejecting them both. If you think both are worthy, you may

choose them both (assuming that your budget and appetite are both adequate). Suppose that

Hamburger A costs more than Hamburger B, but is also much larger and has more trimmings.

If you view both as being worth the price, but you only want one, then whatever your final

choice, you can’t make a bad decision. You at least get a good hamburger—you get your

money’s worth.

Let us take the notion of getting at least what you pay for as an operational definition of

being good enough. This definition circumvents the logical problem that occurs with other

concepts of being good enough, such as meeting minimum requirements. Simon has appropri-

ated the term satisficing to mean “good enough” and advocates the construction of “aspiration

levels” expressing how good a solution might reasonably be, and halting the search when these

expectations are met (Simon, 1955). Although aspiration levels at least superficially establish

minimum requirements, this approach relies primarily upon experience-derived expectations.

If the aspiration is too low, performance may needlessly be sacrificed, and if it is too high, there

may be no solution. It is difficult to establish a good and practically attainable aspiration level

without first exploring the limits of what is possible, that is, without first identifying optimal

solutions—the very activity that satisficing is intended to circumvent. For single-agent low-

dimensional problems, specifying the aspirations may be noncontroversial. But, with multiple-

agent systems, interdependence between decision makers can be complex and aspiration levels

can be conditional (what is satisfactory for me may depend upon what is satisfactory for you).

The current state of affairs regarding aspiration levels does not appear to address completely

the problem of specifying minimum requirements in multiple-agent contexts. Furthermore, this

latter definition of being good enough actually begs the question, because being good enough

requires us to define minimum requirements (aspirations), which immediately plunges us into

semantic nonsense, because the criterion for defining minimum requirements can only be that



they are good enough.

Let us retain the “satisficing” terminology3 because our usage is consistent with the issue

that motivated Simon’s original usage—to identify options that are good enough in the sense

of comparing attributes of the options to a standard. Our usage differs only in the standard

used for comparison. Whereas Simon’s approach is extrinsic and compares attributes to exter-

nally supplied aspiration levels, our usage is intrinsic and compares the positive and negative

attributes of each option.

Definition 1 A choice is satisficingly rational if the expected gains achieved by making it

equal or exceed the expected losses, provided the gains and losses can be expressed in
commensurable units. □
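
A minimal sketch of this intrinsic test, applied to the hamburger example above, might look as follows; the numeric gain and loss scores are invented and assumed to be already expressed in commensurable units:

    # Sketch of Definition 1: an option is satisficing if its expected gain
    # equals or exceeds its expected loss.  The scores below are hypothetical.
    options = {
        "Hamburger A": {"gain": 8.0, "loss": 6.0},  # larger, more trimmings, costs more
        "Hamburger B": {"gain": 5.0, "loss": 4.0},
    }
    satisficing = [name for name, v in options.items() if v["gain"] >= v["loss"]]
    print(satisficing)   # both pass, so either choice is "good enough"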

Getting at least what you pay for may be a natural concept for people but, surprisingly,

the notion has received little attention in the literature on the design of artificial systems.

The optimality paradigm is usually taken for granted as the standard against which all ap-

proaches should be measured, and there simply has not been a perceived need to explore other

paradigms. Nevertheless, if people are wont to make decisions by evaluating dichotomies, then

it is reasonable to investigate ways that machines may incorporate this same paradigm.

One of the benefits of replacing strict optimality as the ideal with the softer stance of

being good enough is to move closer to conforming to Arrow’s observation that other agents’

rationality is part of one’s own rationality. A willingness to be content with a choice that

is adequate, if not optimal, opens the way to expanding one’s sphere of interest to permit the

accommodation of others, thereby paving the way for the development of a notion of rationality

that is able to account for the interests of others as well as for one’s self interest. The key

enabling concept for this to happen is the notion of conditional preferences.

Our appreciation of conditional preferences may be strengthened by a brief summary of

conventional utility theory as it is employed in mathematical games. Utility theory was devel-

oped as a mathematical way to encode individual preference orderings. It is built on a set of

axioms that describe how a “rational man” would express his preference between two alter-



natives in a consistent way. An expected utility function is a mathematical expression that is

consistent with the preferences, conforms to the axioms, and orders the individual’s preferences

for the various options. Since, in a game-theoretic context, an individual’s preferences are gen-

erally dependent upon the actions of others, an individual’s expected utility function must be

a function of not only the individual’s own options, but of the options of all other individuals.

The individual may then compute the expected utility of its possible options conditioned on the

actions of other players. These expected utilities may then be juxtaposed into a payoff array,

and an equilibrium strategy may be identified.
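
To make this structure concrete, the short sketch below computes an expected utility conditioned on the other player's actions (not preferences) and picks a best response. It reuses H's ordinal Battle of the Sexes payoffs; the distribution over S's actions is an invented stand-in for whatever beliefs H might hold.

    # Sketch: H's expected utility is a function of H's own option and S's
    # option; conditioning is on S's *actions* only.  Numbers are the ordinal
    # payoffs to H; the probability over S's actions is hypothetical.
    payoff_H = {("D", "D"): 4, ("D", "B"): 2, ("B", "D"): 1, ("B", "B"): 3}
    p_S_action = {"D": 0.5, "B": 0.5}

    expected = {h: sum(p * payoff_H[(h, s)] for s, p in p_S_action.items())
                for h in ("D", "B")}
    best_response = max(expected, key=expected.get)
    print(expected, best_response)   # {'D': 3.0, 'B': 2.0} D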

The important thing to note about this structure is that it is not until the expected utilities

are juxtaposed into an array that the actual “game” aspects of the situation emerge. It is the

juxtaposition that reveals possibilities for conflict and coordination. These possibilities are not

explicitly reflected in the individual expected utility functions. In other words, although the

individual’s expected utility is a function of other players’ strategies, it is not a function of

other players’ preferences. This structure is completely consistent with exclusive self-interest,

where all a player cares about is its personal benefit as a function of its own and other players’

strategies, without any regard for the benefit to the others. Under this paradigm, the only

way the preferences of others factor into an individual’s decision-making deliberations is to

constrain behavior to limit the amount of damage they can do. This situation obtains if the

game is a game of pure competition (such as a zero-sum game), a game of mixed motives, or

even a game of pure coordination.

In societies that value cooperation, it is unlikely that the preferences of a given individ-

ual will be formed independently of the preferences of others. Knowledge about one agent’s

preferences may alter another agent’s preferences. Such preferences are conditioned on the

preferences of others. In contrast to conditioning only on the actions of other participants,

conditioning on the preferences of others permits a decision maker to adjust its preferences to

accommodate the preferences of others. It can bestow either deference or disfavor to others

according to their preferences as they relate to its own preferences. With the Battle of the

Sexes game, for example, it would not be unreasonable, say, for H to take into consideration



S’s attitude about D. Since traditional utility theory is a function of participant options, rather

than participant preferences, it cannot be used to express such relationships.

5 Satisficing Games

While the statement “What is best for me and what is best for you is also jointly best for us to-

gether” may be nonsense, the statement “What is good enough for me and what is good enough

for you is also jointly good enough for us together” may be perfectly sensible, especially when

we do not have inflexible notions of what it means to be good enough, and if decision makers

are able to accommodate the interests of others as well as of themselves. To generate a useful

theory, however, we must be able to define, in precise mathematical terms, what it means to

be good enough, we must develop a theory of decision making that is compatible with this

notion, and we must establish that it permits the accommodation of both group and individual

interests.

An alternative to von Neumann-Morgenstern N -player game theory is a new approach to

multiple-agent decision making called satisficing games (Stirling and Goodrich, 1999b; Stir-

ling and Goodrich, 1999a; Goodrich et al., 2000). Let X1 , . . . , XN be a society of decision

makers, and let Ui be the set of options available to Xi , i = 1, . . . , N . The joint action set is

the product set U = U1 × · · · × UN , and denote elements of this set as u = (u1 , . . . , uN ),

where ui ∈ Ui . Rather than defining a game in terms of a payoff array of expected utilities,

as is done with conventional normal-form games, a satisficing game incorporates the same in-

formation that is used to define expected utilities to form an interdependence function, denoted

pS1 ···SN R1 ···RN : U × U → [0, 1], which encodes all of the positive and negative interrela-

tionships between the members of the society. The interdependence function is a multivariate

mass function, that is, it is non-negative and normalized to unity. In this respect it is similar

to a probability mass function, but does not characterize uncertainty or randomness. Rather,

it jointly characterizes the selectability, that is, the degree to which the options achieve the

goals of the decision maker, and the rejectability, or the degree to which the options consume

resources (money, fuel, exposure to hazard, or other costs). From this function we may derive



two functions, called joint selectability and joint rejectability functions, denoted pS1 ···SN and

pR1 ···RN , respectively, according to the formulas


$$p_{S_1\cdots S_N}(u) = \sum_{v\in U} p_{S_1\cdots S_N R_1\cdots R_N}(u, v) \qquad (1)$$

$$p_{R_1\cdots R_N}(v) = \sum_{u\in U} p_{S_1\cdots S_N R_1\cdots R_N}(u, v) \qquad (2)$$

for all (u, v) ∈ U × U. These functions are also multivariate mass functions. The joint

selectability function characterizes the degree to which a joint option leads to success for the

group, in the sense of achieving the goal of the decision problem, and the joint rejectability

function characterizes the cost, or degree of resource consumption, that is associated with the

joint option. The two functions are compared for each possible joint outcome, and the set of

joint outcomes for which joint selectability is at least as great as joint rejectability form the

jointly satisficing set.

Definition 2 A satisficing game is a triple {U, pS1 ···SN , pR1 ···RN }. The joint solution to a

satisficing game is the set

$$\Sigma_q = \{u \in U : p_{S_1\cdots S_N}(u) \ge q\, p_{R_1\cdots R_N}(u)\}, \qquad (3)$$

where q is the index of caution, and parameterizes the degree to which the decision maker is

willing to accommodate increased costs to achieve success. An equivalent way of viewing this

parameter is as an index of boldness, characterizing the degree to which the decision maker is

willing to risk rejecting successful options in the interest of conserving resources. Nominally,

q = 1, which attributes equal weight to success and resource conservation interests. Σq is

termed the joint satisficing set, and elements of Σq are jointly satisficing actions. □

The jointly satisficing set provides a formal definition of what it means to be good enough

for the group; namely, a joint option is good enough if the joint selectability is greater than or

equal to the index of caution times the joint rejectability.

Definition 3 A decision-making group is jointly satisficingly rational if the members of the

group choose a vector of options for which joint selectability is greater than or equal to the

index of caution times joint rejectability. □
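
The mechanics of equations (1)-(3) can be illustrated with a small, entirely hypothetical two-agent example. In the sketch below the interdependence mass is generated at random and then normalized, purely so that there is a concrete mass function to marginalize; nothing about the particular numbers is meaningful.

    import itertools
    import random

    # Sketch of equations (1)-(3) for a hypothetical two-agent satisficing game.
    U1, U2 = ["a", "b"], ["c", "d"]
    U = list(itertools.product(U1, U2))          # joint option set

    random.seed(0)                               # arbitrary, reproducible masses
    raw = {(u, v): random.random() for u in U for v in U}
    total = sum(raw.values())
    interdependence = {k: x / total for k, x in raw.items()}   # sums to one

    # Equations (1) and (2): joint selectability and joint rejectability.
    p_S = {u: sum(interdependence[(u, v)] for v in U) for u in U}
    p_R = {v: sum(interdependence[(u, v)] for u in U) for v in U}

    # Equation (3): the jointly satisficing set at caution index q.
    def jointly_satisficing(q=1.0):
        return [u for u in U if p_S[u] >= q * p_R[u]]

    print(jointly_satisficing(q=1.0))            # at least one joint option survives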



The marginal selectability and marginal rejectability mass functions for each Xi may be

obtained by summing the joint selectability and joint rejectability over the options of all other

participants, yielding:
$$p_{S_i}(u_i) = \sum_{u_j\in U_j,\; j\neq i} p_{S_1\cdots S_N}(u_1, \ldots, u_N) \qquad (4)$$

$$p_{R_i}(u_i) = \sum_{u_j\in U_j,\; j\neq i} p_{R_1\cdots R_N}(u_1, \ldots, u_N). \qquad (5)$$

Definition 4 The individually satisficing solutions to the satisficing game {U, pS1 ···SN , pR1 ···RN }

are the sets

$$\Sigma_q^i = \{u_i \in U_i : p_{S_i}(u_i) \ge q\, p_{R_i}(u_i)\}. \qquad (6)$$

The product of the individually satisficing sets is the satisficing rectangle:

$$R_q = \Sigma_q^1 \times \cdots \times \Sigma_q^N = \{(u_1, \ldots, u_N) : u_i \in \Sigma_q^i\}. \qquad (7)$$

Definition 5 A decision maker is individually satisficingly rational if it chooses an option

for which the marginal selectability is greater than or equal to the index of caution times the

marginal rejectability. □
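
A companion sketch for equations (4)-(7): starting from made-up joint selectability and rejectability masses (each summing to one), it forms the marginals, the individually satisficing sets, and the satisficing rectangle.

    import itertools

    # Sketch of equations (4)-(7) for a hypothetical two-agent game.
    U1, U2 = ["a", "b"], ["c", "d"]
    p_S = {("a", "c"): 0.40, ("a", "d"): 0.10, ("b", "c"): 0.15, ("b", "d"): 0.35}
    p_R = {("a", "c"): 0.10, ("a", "d"): 0.30, ("b", "c"): 0.40, ("b", "d"): 0.20}

    # Equations (4)-(5): sum the joint masses over the other agent's options.
    pS1 = {u1: sum(p_S[(u1, u2)] for u2 in U2) for u1 in U1}
    pR1 = {u1: sum(p_R[(u1, u2)] for u2 in U2) for u1 in U1}
    pS2 = {u2: sum(p_S[(u1, u2)] for u1 in U1) for u2 in U2}
    pR2 = {u2: sum(p_R[(u1, u2)] for u1 in U1) for u2 in U2}

    # Equation (6): individually satisficing sets at q = 1.
    q = 1.0
    sigma1 = [u1 for u1 in U1 if pS1[u1] >= q * pR1[u1]]
    sigma2 = [u2 for u2 in U2 if pS2[u2] >= q * pR2[u2]]

    # Equation (7): the satisficing rectangle.
    print(sigma1, sigma2, list(itertools.product(sigma1, sigma2)))
    # ['a'] ['c'] [('a', 'c')]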

The individually satisficing sets identify the options that are good enough for the individ-

uals; namely, the options such that the marginal selectability is greater than or equal to the

index of caution times marginal rejectability. It remains, however, to reconcile, if possible, the

individual choices with the group choices. To do this, we need to establish the relationship

between the jointly satisficing set and the satisficing rectangle.

Definition 6 A compromise at q is a jointly satisficing solution such that each element is

individually satisficing. We denote this set

$$C_q = \Sigma_q \cap R_q. \qquad (8)$$



A compromise at q exists if Cq ≠ ∅, otherwise an impasse at q occurs. □

The following theorem expresses the relationship between the individual and jointly satis-

ficing sets.

Theorem 1 If ui is individually satisficing for Xi , that is, if ui ∈ Σiq , then it must be the ith

element of some jointly satisficing vector u ∈ Σq .

Proof This theorem is proven by establishing the contrapositive, namely, that if ui is not
the ith element of any u ∈ Σq , then ui ∉ Σiq . Without loss of generality, let i = 1. By
hypothesis, pS1 ···SN (u1 , v) < qpR1 ···RN (u1 , v) for all v ∈ U2 × · · · × UN , so
$$p_{S_1}(u_1) = \sum_{v} p_{S_1\cdots S_N}(u_1, v) < q \sum_{v} p_{R_1\cdots R_N}(u_1, v) = q\, p_{R_1}(u_1),$$
hence u1 ∉ Σ1q . □

The content of this theorem is that no one is ever completely frozen out of a deal—every

decision maker has a seat at the negotiating table. This is perhaps the weakest condition under

which negotiations are possible. Perhaps the simplest way to negotiate is to lower one’s

standards in a controlled way.

Corollary 1 There exists an index of caution value q0 ∈ [0, 1] such that a compromise exists

at q0 .

The proof of this corollary is trivial and is omitted. If the players are each willing to lower

their standards sufficiently by decreasing the level of caution, q, they may eventually reach a

compromise solution that is both individually and jointly satisficingly rational. The parameter

q0 is a measure of how much they must be willing to compromise to avoid an impasse. Note

that willingness to lower one’s standards is not total capitulation, since the participants are able

to control the degree of compromise by setting a limit on how small a value of q they can

tolerate. Thus, a controlled amount of altruism is possible with this formulation. But, if any

player’s limit is reached without a mutual agreement being obtained, the game has reached an

impasse.
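
The sketch below puts Definition 6 and Corollary 1 together for another invented two-agent example: at q = 1 the jointly satisficing set and the satisficing rectangle do not intersect (an impasse), but lowering the caution index to 0.9 produces a compromise. The masses and the search grid over q are hypothetical.

    import itertools

    # Sketch of Definition 6 and Corollary 1 for a hypothetical two-agent game.
    U1, U2 = ["a", "b"], ["c", "d"]
    U = list(itertools.product(U1, U2))
    p_S = {("a", "c"): 0.30, ("a", "d"): 0.20, ("b", "c"): 0.25, ("b", "d"): 0.25}
    p_R = {("a", "c"): 0.35, ("a", "d"): 0.10, ("b", "c"): 0.15, ("b", "d"): 0.40}

    def compromise(q):
        sigma_q = {u for u in U if p_S[u] >= q * p_R[u]}                    # eq. (3)
        pS1 = {u1: sum(p_S[(u1, u2)] for u2 in U2) for u1 in U1}            # eq. (4)
        pR1 = {u1: sum(p_R[(u1, u2)] for u2 in U2) for u1 in U1}
        pS2 = {u2: sum(p_S[(u1, u2)] for u1 in U1) for u2 in U2}            # eq. (5)
        pR2 = {u2: sum(p_R[(u1, u2)] for u1 in U1) for u2 in U2}
        rect = {(u1, u2) for u1 in U1 if pS1[u1] >= q * pR1[u1]             # eq. (7)
                         for u2 in U2 if pS2[u2] >= q * pR2[u2]}
        return sigma_q & rect                                               # eq. (8)

    for q in (1.0, 0.9):
        c_q = compromise(q)
        print(q, sorted(c_q) if c_q else "impasse")
    # 1.0 impasse
    # 0.9 [('a', 'd'), ('b', 'c')]

In this particular example q0 lies between 0.9 and 1.0; in general the tolerable lower bound on q is a design choice, as discussed above.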



6 Conditioning

The interdependence function provides a complete description of the interrelationships that

may exist between participants in a multiple-agent decision problem. This function, however,

can be rather complex but, fortunately, its structure as a mass function permits its decomposi-

tion into constituent parts according to the law of compound probability, or chain rule (Eisen,

1969).4 Applying the formalism (but not the usual probabilistic semantics) of the chain rule,

we may express the interdependence function as a product of conditional selectability and re-

jectability functions. To illustrate, consider a two-agent satisficing game involving decision

makers X1 and X2 , with option sets U1 and U2 , respectively. The interdependence function

may be factored in several ways, for example, we may write

$$p_{S_1 S_2 R_1 R_2}(x, y, z, w) = p_{S_1|S_2 R_1 R_2}(x|y, z, w) \cdot p_{S_2|R_1 R_2}(y|z, w) \cdot p_{R_1|R_2}(z|w) \cdot p_{R_2}(w). \qquad (9)$$

We interpret this expression as follows. pS1 |S2 R1 R2 (x|y, z, w) is the conditional selectability

mass associated with X1 taking action x, given that X2 places all of its selectability mass on

y, X1 places all of its rejectability mass on z, and X2 places all of its rejectability mass on

w. Similarly, pS2 |R1 R2 (y|z, w) is the conditional selectability mass associated with X2 taking

action y, given that X1 and X2 place all of their rejectability masses on z and w, respectively.

Continuing, pR1 |R2 (z|w) is the conditional rejectability mass associated with X1 rejecting z,

given that X2 places all of its rejectability mass on w. Finally, pR2 (w) is the unconditional

rejectability mass associated with X2 rejecting option w.

Many such factorizations are possible, but the appropriate factorization must be determined

by the context of the problem. These conditional mass functions are mathematical instantia-

tions of production rules. For example, we may interpret pR1 |R2 (z|w) as the rule: If X2 rejects

w, then X1 feels pR1 |R2 (z|w) strongly about rejecting z. In this sense, they express local behav-

ior, and such behavior is often much easier to express than global behavior. Furthermore, this

structure permits irrelevant interrelationships to be eliminated. Typically, there will be some

close relationships between some subgroups of agents, while other subgroups will func-

tion essentially independently of each other. For example, suppose that X1 ’s selectability has



nothing to do with X2 ’s rejectability. Then we may simplify pS1 |S2 R1 R2 (x|y, z, w) to become

pS1 |S2 R1 (x|y, z).

Such conditioning permits the designer of the decision problem to construct the interdepen-

dence function as a natural consequence of the relevant interdependencies that exist between

the participants. In extreme cases where all of the participants are closely interrelated, the

construction of the interdependence function may be very complex. Many multiple-agent sce-

narios, however, are not closely interconnected. Hierarchical systems and other Markovian-like

societies, for example, may be very economically characterized by such factorizations. At the

other extreme, anarchic systems (every man for himself) may be characterized by complete

inter-independence, yielding factorizations of the form

pS1 S2 R1 R2 (x, y, z, w) = pS1 R1 (x, z) · pS2 R2 (y, w). (10)

Conditioning is the key to the accommodation of the desires of others. For example, if X2

were very desirous of implementing y if X1 were not to implement x, X1 could accommodate

X2 ’s preference by setting pR1 |S2 (x|y) to a high value (close to unity). Then, x would be highly

rejectable to X1 if y were highly selectable to X2 . Note, however, that if X2 should turn out

not to highly prefer y and so sets pS2 (y) ≈ 0, then the joint selectability/rejectability of (x, y),

namely, pS2 R1 (y, x) = pR1 |S2 (x|y)pS2 (y) ≈ 0, so the joint event of X1 rejecting x and X2

selecting y has negligible interdependence mass. Thus, X1 is not penalized for being willing

to accommodate X2 when X2 does not need or expect that accommodation. By controlling the

conditioning weights, X1 is able to achieve a balance between its own egoistic interests and

the interests of others.
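A small numerical illustration of this point (the mass values below are invented for the example, not taken from the text) shows how the chain-rule product makes a strong conditional willingness to reject x essentially weightless when X2 places little selectability on y:

# Illustrative values only.
p_R1_given_S2 = 0.9   # p_R1|S2(x|y): X1 would strongly reject x if X2 selected y
p_S2_of_y = 0.05      # p_S2(y): X2 actually places little selectability on y

# Chain rule: p_S2R1(y, x) = p_R1|S2(x|y) * p_S2(y)
joint_mass = p_R1_given_S2 * p_S2_of_y
print(round(joint_mass, 3))   # 0.045 -- negligible interdependence mass, so the
                              # offered accommodation costs X1 almost nothing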

Preference conditioning cannot be accommodated by von Neumann-Morgenstern game

theory, since the utility functions required for that formalism are not conditioned on prefer-

ences. The only kind of conditioning possible with conventional utility theory is for X1 to

condition its preferences on the options that are available to X2 , but not on X2 ’s attitudes

about the options. For X1 to accommodate the preferences of X2 , it would be required to

change its entire utility structure; that is, it must be willing to throw the game, if need be, to

accommodate X2 , even if X2 does not require or expect such a sacrifice. This is the effect of



categorical altruism.

Structuring the interdependence function according to the axioms of probability theory

assures (i) consistency, in that none of the preference linkages are contradictory; (ii) com-

pleteness, in that all relevant linkages can be accommodated; (iii) non-redundancy, in that

duplication is avoided; and (iv) parsimony, in that the linkages are characterized by the fewest

number of parameters possible. Thus, although a system model may be complex, it will not be

more complex than it needs to be. As noted by Palmer, “Complexity is no argument against a

theoretical approach if the complexity arises not out of the theory itself but out of the material

which any theory ought to handle (Palmer, 1971, p. 176).”

7 Satisficing Battle of the Sexes

Let us now cast the Battle of the Sexes as a satisficing game. We must first establish each

player’s notions of success and resource consumption. In accordance with our previous dis-

cussion, success is related to the most important goal of the game, which is for the two players

to be with each other, regardless of where they go. Resource consumption, on the other hand,

deals with the costs of being at a particular function. Obviously, H would prefer D if he did

not take into consideration S’s preferences; similarly, S would prefer B. Thus, we may express

the myopic rejectabilities for H and S in terms of parameters h and s, respectively, as

pRH (D) = h
(11)
pRH (B) = 1 − h

and
pRS (D) = 1 − s
(12)
pRS (B) = s,

where h is H's rejectability of D and s is S's rejectability of B. The closer h is to zero, the more averse H is to B, with an analogous interpretation for s with respect to S attending D.


To be consistent with the stereotypical roles, we may assume that 0 ≤ h < 1/2 and 0 ≤ s < 1/2.

As will be subsequently seen, only the ordinal relationship need be specified, that is, either

s < h or h < s.



Selectability is a measure of the success support associated with the options. Since being

together is a joint, rather than an individual, objective, it is difficult to form unilateral assess-

ments of selectability, but it is possible to characterize individually the conditional selectability.

To do so requires the specification of the conditional mass functions pSH |RS and pSS |RH ; that

is, H’s selectability conditioned on S’s rejectability and S’s selectability conditioned on H’s

rejectability. If S were to place her entire unit mass of rejectability on D, H may account for

this, if he cares at all about S's feelings, by placing some portion of his conditional selectability

mass on B. S may construct her conditional selectability in a similar way, yielding

pSH |RS (D|D) = 1 − α

pSH |RS (B|D) = α


(13)
pSH |RS (D|B) = 1

pSH |RS (B|B) = 0


and
pSS |RH (D|D) = 0

pSS |RH (B|D) = 1


(14)
pSS |RH (D|B) = β

pSS |RH (B|B) = 1 − β.

We observe that the valuations pSH |RS (B|D) = α and pSS |RH (D|B) = β are conditions

of situational altruism. If S were to place all of her rejectability mass on D, then H may

defer to S’s strong dislike of D, by placing α of his selectability mass, as conditioned by her

preference, on B. Similarly, S shows a symmetric conditional preference for D if H were to

reject B strongly. The parameters α and β are H’s and S’s indices of altruism, respectively,

and serve as a way for each to control the amount of deference each is willing to grant to the

other. In the interest of simplicity, we shall assume that both players are maximally altruistic,

and set α = β = 1. In principle, however, they may be set independently to any value in [0, 1].

Notice that, even in this most altruistic case, these conditional preferences do not commit one

to unconditional abdication of his or her own unilateral preferences. He still myopically (that

is, without taking S into consideration) prefers D and she still myopically prefers B, and there



is no intimation that either participant must throw the game to accommodate the other.

With these conditional and marginal functions, we may factor the interdependence function

as follows:
pSH SS RH RS (x, y, z, w) = pSH |SS RH RS (x|y, z, w) · pSS |RH RS (y|z, w) · pRH RS (z, w)
(15)
= pSH |RS (x|w) · pSS |RH (y|z) · pRH (z)pRS (w),
where we have assumed that H’s selectability conditioned on S’s rejectability is dependent

only on S’s rejectability, that S’s selectability conditioned on H’s rejectability is dependent

only on H’s rejectability, and that the myopic rejectability values of H and S are independent.

Application of (1) and (2) results in joint selectability and rejectability values of
pSH SS (D, D) = (1 − h)s

pSH SS (D, B) = hs
(16)
pSH SS (B, D) = (1 − h)(1 − s)

pSH SS (B, B) = h(1 − s)


and
pRH RS (D, D) = h(1 − s)

pRH RS (D, B) = hs
(17)
pRH RS (B, D) = (1 − h)(1 − s)

pRH RS (B, B) = (1 − h)s.


The marginal selectability and rejectability values for H and S are

pSH (D) =s pRH (D) =h (18)

pSH (B) =1 − s pRH (B) =1 − h (19)

and

pSS (D) =1 − h pRS (D) =1 − s (20)

pSS (B) =h pRS (B) =s. (21)

Setting the index of caution, q, equal to unity, we obtain the jointly satisficing set as

Σq = {(D, B), (B, D), (B, B)} for s < h; {(D, D), (D, B), (B, D)} for s > h; {(D, D), (D, B), (B, D), (B, B)} for s = h,   (22)



the individually satisficing sets are

Σ^H_q = {B} for s < h; {D} for s > h; {B, D} for s = h,   (23)

Σ^S_q = {B} for s < h; {D} for s > h; {B, D} for s = h,   (24)

and the satisficing rectangle is

Rq = Σ^H_q × Σ^S_q = {(B, B)} for s < h; {(D, D)} for s > h; {(B, B), (D, D)} for s = h.   (25)

Thus, if S’s aversion to D is less than H’s aversion to B, then both players will go to

H's preference, namely, D, and conversely. This interpretation is an example of an interpersonal comparison of utility. Such comparisons are frowned upon by conventional game theorists, but are essential to social choice theory, so long as the utilities are expressed in the same units and have the same zero-level. Since the utilities here are mass functions, each apportioning a unit of value (selectability or rejectability) among the possible options, they share a common scale and the comparison is legitimate.
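The construction in (11) through (25) is compact enough to verify numerically. The sketch below (the function name and dictionary encodings are illustrative assumptions, not taken from the text) builds the joint selectability and rejectability from the conditional masses (13) and (14) with α = β = 1, and returns the jointly satisficing set (22) and the satisficing rectangle (25) at q = 1.

from itertools import product

def battle_of_the_sexes(h, s, alpha=1.0, beta=1.0, q=1.0):
    # Satisficing Battle of the Sexes: returns (jointly satisficing set, satisficing rectangle).
    opts = ['D', 'B']

    # Myopic rejectabilities, (11)-(12).
    pRH = {'D': h, 'B': 1 - h}
    pRS = {'D': 1 - s, 'B': s}

    # Conditional selectabilities, (13)-(14).
    pSH_RS = {('D', 'D'): 1 - alpha, ('B', 'D'): alpha, ('D', 'B'): 1.0, ('B', 'B'): 0.0}
    pSS_RH = {('D', 'D'): 0.0, ('B', 'D'): 1.0, ('D', 'B'): beta, ('B', 'B'): 1 - beta}

    # Joint selectability (16) and rejectability (17): sum the factored
    # interdependence function (15) over the rejectability arguments.
    pS = {(x, y): sum(pSH_RS[(x, w)] * pRS[w] for w in opts) *
                  sum(pSS_RH[(y, z)] * pRH[z] for z in opts)
          for x, y in product(opts, opts)}
    pR = {(z, w): pRH[z] * pRS[w] for z, w in product(opts, opts)}

    # Marginal selectabilities, (18)-(21).
    pSH = {x: sum(pS[(x, y)] for y in opts) for x in opts}
    pSS = {y: sum(pS[(x, y)] for x in opts) for y in opts}

    # Jointly satisficing set (22), individually satisficing sets (23)-(24),
    # and satisficing rectangle (25).
    jointly = {u for u in pS if pS[u] >= q * pR[u]}
    sigma_H = {x for x in opts if pSH[x] >= q * pRH[x]}
    sigma_S = {y for y in opts if pSS[y] >= q * pRS[y]}
    return jointly, set(product(sigma_H, sigma_S))

# Example: with s < h (say h = 0.4, s = 0.2) the rectangle is {('B', 'B')}.
print(battle_of_the_sexes(h=0.4, s=0.2))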

It is useful to discuss structural differences between the representation of this game as a

traditional game and its representation as a satisficing game. With the traditional game, all of

the information is encoded in the payoff matrix, with the values that are assigned representing

the importance, cost, or informational value to the players. The payoff matrix primarily cap-

tures information conditioned on options; that is, gains or losses to a player conditioned on the

choice possibilities of the other player.

With the satisficing game structure, all of the information is encoded into the interdepen-

dence function. This function may be factored into products of conditional interdependencies

that represent the joint and individual goals and preferences of the players. The joint selectabil-

ity function (16) and the joint rejectability function (17) characterize the state of the problem as represented by the conditional goals and individual preferences of the players. Specify-

ing numerical values for the preference parameters, h and s (also for α and β), is as natural, it

may be argued, as it would be to specify numerical values for the payoff matrix.

The essential difference between the von Neumann-Morgenstern and the satisficing rep-

resentations of this game is that the von Neumann-Morgenstern utilities do not permit the



preferences of one player to influence the preferences of the other player. But in the context of

this game, it is reasonable to assume that such preferences exist. If they do, then there should

be a mechanism to account for them.

Another important way to compare these two game formulations is in terms of the solution

concept. The classical solution to the traditional problem is to solve for mixed strategy Nash

equilibria corresponding to a numerically definite payoff matrix. The main problem with using

a mixed strategy, as far as this discussion is concerned, is that, for it to be successful, both

players must use exactly the same payoff matrix, and they must use the randomized decision

probabilities that are calculated to maximize their payoffs. Even small deviations in these quantities destroy the equilibrium completely, and the solution then has no claim even on rationality,

let alone optimality. The reason for this sensitivity is that the players are attempting to optimize,

and to do so, they must exploit the model structure and parameter values to the maximum extent

possible.

The satisficing solution concept, on the other hand, adopts a completely different approach

to decision making. There is no explicit attempt to optimize. Rather than ranking all of the

possible options according to their expected utility, the attributes (selectability and rejectabil-

ity) of each option are compared, and a binary decision is made with respect to each option,

and it is either rejected or it is not. For this problem, the comparison made in terms of the

parameters s and h, and all that is important is their ordinal relationship; numerically precise

values would not even be exploited were they available.

This game illustrates the common-sense attribute of the satisficing solution without the

need for randomizing decisions or for S to guess what H is guessing S is guessing, and so

forth, ad infinitum. The players go to the function that is considered by both to be the least

rejectable. This decision is easily justified by common sense reasoning. Furthermore, the sat-

isficing solution is more robust than the conventional solution. There is no need to define numerically precise payoffs, and there is no need to calculate precise probability distributions

according to which random decisions will be chosen. Finally, there is no need to specify more

than an ordinal relationship regarding player preferences to arrive at a decision.



To apply this result to the distributed manufacturing game, we interpret h as X1 ’s rejectabil-

ity of producing lamps and s as X2 ’s rejectability of producing table cloths. One reasonable

way to compute these rejectabilities is to argue that the rejectability ratio for each player for

its two options ought to be the reciprocal of the individual profit ratios for the two cooperative
solutions. This approach yields the ratios h/(1 − h) = 15/20 and s/(1 − s) = 5/10, or h = 3/7 and s = 1/3. Since

s < h in this case, the only jointly and individually satisficing solution is (Tables, Cloths).
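As a check, plugging these values into the Battle of the Sexes sketch from the previous section (again an illustration under the same assumed names) reproduces this conclusion:

# h = 3/7 and s = 1/3 give s < h; the unique jointly and individually
# satisficing pair is ('B', 'B'), i.e., (Tables, Cloths) in this interpretation.
jointly, rectangle = battle_of_the_sexes(h=3/7, s=1/3)
print(jointly & rectangle)   # {('B', 'B')}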

8 Discussion

Exclusive self-interest fosters competition and exploitation, and engenders attitudes of distrust

and cynicism. An exclusively self-interested decision maker would likely assume that the other

decision makers also will act in selfish ways. Such a decision maker might therefore impute

damaging behavior to others, and might respond defensively. While this may be appropriate in

the presence of serious conflict, many decision scenarios involve situations where coordinative

activity, even if it leads to increased vulnerability, may greatly enhance performance. Espe-

cially when designing artificial decision-making communities, individual rationality may not

be an adequate principle with which to characterize desirable group behavior.

Satisficing game theory provides a new approach to decision making in distributed multiple-agent settings, one that permits decision makers to take the preferences of others, as well as their own egoistic preferences, into account when forming their own preferences in social en-

vironments. This formulation permits the accommodation of group interests as well as of

individual interests via the mechanism of conditional preferences. The price for this accom-

modation, however, is that the individual rationality concept of demanding the best outcome

for oneself (in terms of equilibration) must be replaced by being content with a solution that is

merely good enough, in the sense that the benefits of implementing it outweigh the costs.

The ability to condition on the preferences of others, not just on their actions, is a key

enabling feature of satisficing game theory. Because of this capability, satisficing game theory

is based upon a more general notion of rationality than individual self interest, and this feature

distinguishes the satisficing game theoretic approach from all approaches that are based on



individual rationality, including von Neumann-Morgenstern game theory and, to the best of the author's knowledge, virtually

all instantiations of decision theory that appear in the literature. Satisficing game theory permits

decision makers to achieve a balance between their own egoistic interests and the interests

of others, and does so without requiring an unconditional sacrifice of their own egoistic interests.

The decision maker is able to control the degree to which it would be willing to sacrifice to

accommodate others.

When designing machines to perform tasks that humans might otherwise perform, it is im-

perative that the machines function in ways that are compatible with the ways humans function;

otherwise, humans will not be willing to rely upon the machines. The individual rationality

paradigm of demanding the best and only the best for oneself is a very rigid model of behav-

ior with which humans are not easily able to comply, and machine designs based upon this

paradigm are not likely to be compatible with human behavior. Satisficing, as defined in this

essay as getting what one pays for, on the other hand, is compatible with a great deal of human

behavior, and artificial decision-making systems designed according to that paradigm may be

more compatible with human behavior.

Individual rationality has dominated the decision-theoretic scene for a long time. Over a

century ago, the American pragmatist Charles Sanders Peirce observed the following:

Take, for example, the doctrine that man only acts selfishly—that is, from the

consideration that acting in one way will afford him more pleasure than acting in

another. This rests on no fact in the world, but it has had a wide acceptance as

being the only reasonable theory. (Peirce, 1877).

If exclusive self-interest is not an adequate model for human interaction, why should we sup-

pose it should be an adequate model for artificial decision-making societies that we may wish

to create to interface with humans? Perhaps one of the great appeals of the individual rational-

ity hypothesis is that it leads to simplified analysis of natural systems and systematic synthesis

of artificial systems. Search procedures based upon the calculus of maximization are of such

enormous value that one may be tempted to adopt the individual rationality hypothesis largely

out of convenience. Perhaps what is needed is an alternative systematic synthesis procedure



that does not rely upon the individual rationality hypothesis, yet admits a calculus that, while

perhaps more complicated, nevertheless provides a systematic synthesis tool. The calculus of

satisficing, as developed in this essay, points to such a tool.



Notes
1 Apologies to those who might be offended by gender stereotypes, but this game was widely

discussed long before sexist issues became sensitive and cell-phones became available.

2 Extra-game-theoretic considerations, such as friendship, habits, fairness, etc., may also

be applied to modify the behavior of decision makers, but these approaches typically lead to ad

hoc rules of behavior that may not be compatible with any well-defined notion of rationality.

3 Other researchers have appropriated this term to describe various notions of constrained

optimization. In this essay, however, usage is restricted to be consistent with Simon’s original

concept.

4 In the probability context, let X and Y be random variables and let x and y be possible values

for X and Y , respectively. By the law of compound probability, pXY (x, y) = pX|Y (x|y)pY (y)

expresses the joint probability of the occurrence of the event (X = x, Y = y) as the conditional

probability of the event X = x occurring conditioned on the event Y = y occurring, times the

probability of Y = y occurring. This relationship may be extended to the general multivariate

case by repeated applications, resulting in what is called the chain rule.
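For concreteness, the general form referred to here is the standard identity, written in the notation of this note:

pX1···Xn(x1, ..., xn) = pX1|X2···Xn(x1|x2, ..., xn) · pX2|X3···Xn(x2|x3, ..., xn) ··· pXn−1|Xn(xn−1|xn) · pXn(xn).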

References

Arrow, K. J. (1951). Social Choice and Individual Values. John Wiley, New York. 2nd ed.

1963.

Arrow, K. J. (1986). Rationality of self and others. In Hogarth, R. M. and Reder, M. W.,

editors, Rational Choice. Univ. of Chicago Press, Chicago.

Axelrod, R. (1984). The Evolution of Cooperation. Basic Books, New York.

Bergson, A. (1938). A reformulation of certain aspects of welfare economics. Quarterly

Journal of Economics, 52:310–334.

Bicchieri, C. (1993). Rationality and Coordination. Cambridge Univ. Press, Cambridge.



Bowling, M. (2000). Convergence problems of general-sum multiagent reinforcement learning.

In Proceedings of the Seventeenth International Conference on Machine Learning.

Cooper, R., DeJong, D. V., Forsythe, R., and Ross, T. W. (1996). Cooperation without rep-

utation: experimental evidence from prisoner’s dilemma games. Games and Economic

Behavior, 12(1):187–218.

Eisen, M. (1969). Introduction to Mathematical Probability Theory. Prentice-Hall, Englewood

Cliffs, NJ.

Fudenberg, D. and Levine, D. K. (1993). Steady state learning and Nash equilibrium. Econo-

metrica, 61(3):547–573.

Glass, A. and Grosz, B. (2000). Socially conscious decision-making. In Proceedings of

Agents2000 Conference, pages 217–224, Barcelona, Spain.

Goodrich, M. A., Stirling, W. C., and Boer, E. R. (2000). Satisficing revisited. Minds and

Machines, 10:79–109.

Hu, J. and Wellman, M. P. (1998). Multiagent reinforcement learning: Theoretical frame-

work and an algorithm. In Shavlik, J., editor, Proceedings of the Fifteenth International

Conference on Machine Learning, pages 242–250.

Kalai, E. and Lehrer, E. (1993). Rational learning leads to Nash equilibrium. Econometrica,

61(5):1019–1045.

Kreps, D. M. (1990). Game Theory and Economic Modelling. Clarendon Press, Oxford.

Lewis, D. K. (1969). Convention. Harvard Univ. Press, Cambridge, MA.

Luce, R. D. and Raiffa, H. (1957). Games and Decisions. John Wiley, New York.

Palmer, F. R. (1971). Grammar. Harmondsworth, Penguin, Harmondsworth, Middlesex.

Peirce, C. S. (1877). The fixation of belief. Popular Science Monthly, 12.



Raiffa, H. (1968). Decision Analysis. Addison-Wesley, Reading, MA.

Rapoport, A. (1970). N-Person Game Theory. The Univ. of Michigan Press, Ann Arbor, MI.

Samuelson, P. A. (1948). Foundations of Economic Analysis. Harvard University Press, Cam-

bridge, MA.

Sandholm, T. W. (1999). Distributed rational decision making. In Weiss, G., editor, Multiagent

Systems, chapter 5, pages 201–258. MIT Press, Cambridge, MA.

Sargent, T. J. (1993). Bounded Rationality in Macroeconomics. Oxford Univ. Press, Oxford.

Schelling, T. C. (1960). The Strategy of Conflict. Oxford Univ. Press, Oxford.

Sen, S. (1996). Reciprocity: a foundational principle for promoting cooperative behavior

among self-interested agents. In Proceedings of The Second International Conference

on Multi-Agent Systems, pages 322–329, Kyoto, Japan.

Shapley, L. S. (1953). A value for n-person games. In Kuhn, H. W. and Tucker, A. W., editors,

Contributions to the Theory of Games. Princeton Univ. Press, Princeton, NJ.

Shubik, M. (1982). Game Theory in the Social Sciences. MIT Press, Cambridge, MA.

Simon, H. A. (1955). A behavioral model of rational choice. Quart. J. Econ., 69:99–118.

Sims, C. (1980). Macroeconomics and reality. Econometrica, 48:1–48.

Stirling, W. C. and Goodrich, M. A. (1999a). Satisficing equilibria: A non-

classical approach to games and decisions. In Parsons, S. and Wooldridge,

M. J., editors, Workshop on Decision Theoretic and Game Theoretic

Agents, pages 56–70, University College, London, United Kingdom, 5 July.

http://www.ee.byu.edu/faculty/wynns/publications.html.

Stirling, W. C. and Goodrich, M. A. (1999b). Satisficing games. Information Sciences,

114:255–280.



Taylor, M. (1987). The Possibility of Cooperation. Cambridge Univ. Press, Cambridge.

von Neumann, J. and Morgenstern, O. (1944). The Theory of Games and Economic Behavior.

Princeton Univ. Press, Princeton, NJ. (2nd ed., 1947).

Wolpert, D. H. and Tumer, K. (2001). Reinforcement learning in distributed domains: An

inverse game theoretic approach. In Proceedings of the 2001 AAAI Symposium: Game

Theoretic and Decision Theoretic Agents, pages 126–133. March 26–28, Stanford Cali-

fornia. Technical Report SS-01-03.
