Eco610-Game Theory
FACULTY OF COMMERCE
5.0 INTRODUCTION
Game theory is the study of strategic decisions. It is concerned with the choice of the best or optimal strategy in conflict situations. For
example, game theory can help a firm determine the conditions under which lowering its
price would not trigger a ruinous price war, and whether the firm should build excess
capacity to discourage entry into the industry even though this lowers the firm’s short-run
profits.
All game theoretic models are defined by a common set of five parameters. They include:
I. The Players: are the decision-makers whose behavior we are trying to explain and
predict. The decisions of all players determine the outcome. Game theory models
describe both the identity of the players and their number, as changes in either can
alter play.
II. The Feasible Strategy Set: A player’s strategy is a complete contingent plan
describing the actions he or she will take in each conceivable evolution of the game.
They are all the possible actions of the players that can be given a nonzero probability of occurring: for example, the choices to change price, develop new products, undertake a new advertising campaign, build new capacity, and all other such actions that affect the sales and profitability of the firm and its rivals.
III. The Outcomes: Each combination of strategies chosen by the players produces an outcome. Consider, for example, a war between America and Iraq in which America chooses between an Air Strike (AS) and a Ground Attack (GA), while Iraq chooses between Weapons of Mass Destruction (WMD) and Conventional Weapons (CW). For each possible strategy that America could adopt, there were a number of strategies (reactions) available to Iraq. Note that because there are two players, and each has two strategic options, there are four possible outcomes (AS, WMD; AS, CW; GA, WMD; and GA, CW).
IV. The Payoffs: These are the players’ preferences (i.e., utility functions) over the
possible outcomes. Each possible combination of strategies (outcome) by the
players has a corresponding payoff. For example, the payoff is usually expressed in
terms of the profits or losses of a firm as a result of the firm’s strategies and the
rival’s responses. The table giving the payoffs from all strategies open to a player
and the rival’s responses is called the payoff matrix.
V. The Order of Play: The model specifies the order in which players reveal their
chosen strategy. Models are simultaneous if all players reveal their strategy without
knowing the strategy of others. In other words, if all players commit to a strategy
before learning the strategies of others, then the game is simultaneous.
Sequential games are non-simultaneous games which specify the order of play.
Table 5.1 Payoff Matrix for an Advertising Game

                                      Firm B
                           Advertise       Don't Advertise
Firm A   Advertise           4, 3               5, 1
         Don't Advertise     2, 5               3, 2
The first number in each of the four cells refers to the payoff (profit) for firm A,
while the second is the payoff (profit) for firm B.
From the table, if both firms advertise, firm A will earn a profit of 4, and firm B
will earn a profit of 3 (the top left cell of the payoff matrix).
The bottom left cell of the payoff matrix shows that if firm A doesn’t advertise
and firm B does, Firm A will have a profit of 2, and firm B will have a profit of 5.
The other payoffs in the second column of the table can be similarly interpreted.
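For concreteness, table 5.1 can be stored as a small lookup structure. The following Python sketch is illustrative only; the dictionary layout and variable names are ours, not part of the text:

```python
# Payoff matrix of Table 5.1 as a dict mapping a strategy pair to payoffs.
# The first entry of each value is Firm A's profit, the second Firm B's.
payoffs = {
    ("Advertise", "Advertise"): (4, 3),
    ("Advertise", "Don't Advertise"): (5, 1),
    ("Don't Advertise", "Advertise"): (2, 5),
    ("Don't Advertise", "Don't Advertise"): (3, 2),
}

# Reading the bottom-left cell: A doesn't advertise, B does.
a_profit, b_profit = payoffs[("Don't Advertise", "Advertise")]
print(a_profit, b_profit)  # 2 5
```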
Figure 5.1 shows the game tree of the war between America and Iraq. It is assumed that
the two players move sequentially, rather than simultaneously. In particular, America
attacks Iraq (Air Strike, AS, or Ground Attack, GA) first. Then, after seeing America’s
choice, Iraq chooses her strategy (use Weapon of Mass Destruction, WMD, or
Conventional Weapon, CW).
Figure 5.1: Game tree of the war between America and Iraq

[Game tree: America (player 1) moves first, choosing AS or GA; Iraq (player 2) then chooses WMD or CW after observing America's choice. Payoffs (America's, Iraq's) at the terminal nodes: (AS, WMD): 0, -2; (AS, CW): 3, -1; (GA, WMD): -3, -3; (GA, CW): 2, 1.]
The game starts at an initial decision node represented by an open square, where
America (player 1) makes her move, deciding whether to use air strike or ground
attack.
Each of the two possible choices for America is represented by a branch from this
decision node.
At the end of each branch is another decision node represented by a solid square, at
which Iraq (player 2) can choose between two actions, use weapon of mass
destruction or conventional weapon, after seeing America’s choice.
The initial decision node is referred to as player 1’s decision node; the latter two as
player 2’s decision nodes.
After Iraq’s (player 2’s) move, we reach the end of the game, represented by terminal
nodes. At each terminal node, we list the players’ payoffs arising from the sequence
of moves leading to that terminal node.
In contrast, all simultaneous games are played with imperfect information. That is, when it is a player's turn to make a move, he or she does not know at which decision node the other player is, because the player has had no opportunity to observe what has transpired in the game.
If the advertising game presented earlier (see table 5.1) is such that firm A and firm B simultaneously decide whether or not to advertise, the game tree would be as presented in figure 5.2 below.
Figure 5.2: Game tree of the simultaneous advertising game

[Game tree: Firm A moves first (Advertise or Don't Advertise); Firm B's two decision nodes are enclosed in a single information set. Payoffs (Firm A's, Firm B's) at the terminal nodes are those of table 5.1: (Advertise, Advertise): 4, 3; (Advertise, Don't Advertise): 5, 1; (Don't Advertise, Advertise): 2, 5; (Don't Advertise, Don't Advertise): 3, 2.]
The figure is identical to that of the sequential game except that a circle is drawn around firm B's two decision nodes to indicate that these two nodes are in a single information set. The meaning of this information set is that when it is firm B's turn to choose its action, it cannot tell which of the two nodes it is at, because firm A has not revealed its strategy.
Example Two: Matching Pennies
There are two players, denoted 1 and 2. Each player simultaneously puts a penny down,
either heads up or tails up. If the two pennies match (either both heads up or both tails
up), player 1 pays 1 dollar to player 2; otherwise, player 2 pays 1 dollar to player 1.
The game tree for this simultaneous game is presented in figure 5.3 below
Figure 5.3: Game tree of the matching pennies game

[Game tree: Player 1 puts down Head or Tail; Player 2's two decision nodes lie in a single information set. Payoffs (Player 1's, Player 2's): both Heads or both Tails: -1, 1; otherwise: 1, -1.]
A circle has been drawn around player 2’s two decision nodes to indicate that
these two nodes are a single information set.
Similarly, note that because the moves are made simultaneously, player 1 is equally unable to observe player 2's choice. This implies that as long as neither player can observe the other's choice, the timing of moves is irrelevant. By this logic we could equally describe this game with a game tree that reverses the decision nodes of players 1 and 2 in figure 5.3.
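The payoff rule of matching pennies can be written down directly. A Python sketch (the function name is ours, not part of the text):

```python
# Payoff rule for matching pennies: if the coins match, player 1 pays
# player 2 one dollar; otherwise player 2 pays player 1 one dollar.
def matching_pennies(choice1, choice2):
    """Return (player 1's payoff, player 2's payoff)."""
    if choice1 == choice2:      # both heads up or both tails up
        return (-1, 1)
    return (1, -1)

print(matching_pennies("Head", "Head"))  # (-1, 1)
print(matching_pennies("Head", "Tail"))  # (1, -1)
```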
Definition: A game is one of perfect information if each information set contains a single decision node (e.g., figure 5.1). Otherwise, it is a game of imperfect information (e.g., figures 5.2 and 5.3).
5.4 STRATEGIES
Definition: A strategy is a complete contingent plan, or decision rule, that specifies how
the player will act in every possible distinguishable circumstance in which he or she
might be called upon to move.
It is worthwhile to consider what the players’ possible strategies are for games of perfect
information, and those of imperfect information.
Case 1: Defining all Possible Strategies in the War between America and Iraq
America’s Strategy
A strategy for America (player 1) in the sequential game specifies her move at the game’s
initial node. America has two possible strategies: use air strike (AS) or ground attack
(GA).
Iraq’s Strategy
A strategy for Iraq (player 2) specifies how she will fight at each of her two information sets, that is, how she will attack if America uses AS and how she will attack if America picks GA.
Case 2: Defining all Possible Strategies in the Simultaneous Advertising Game

Firm A has two possible strategies in the simultaneous advertising game: advertise or don't advertise.
But firm B now only has two possible strategies, “advertise” and “don’t advertise,”
because the firm now has only one information set. Firm B can no longer condition its
action on firm A’s previous action.
Dominant Strategies
If a player has a dominant strategy, we should expect him or her to play it. Referring back
to the war game between America and Iraq:
Table 5.3 Payoff Matrix for War Game between America and Iraq

                           Iraq
                    WMD            CW
America   AS       0, -2          3, -1
          GA      -3, -3          2, 1
Air Strike is a dominant strategy for America: it yields a higher payoff than Ground Attack whether Iraq uses WMD (0 > -3) or CW (3 > 2). Given that America uses Air Strike, Iraq's best response is Conventional Weapon (-1 > -2). The equilibrium of the war game is therefore {America uses Air Strike, Iraq uses Conventional Weapon}.
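The dominance reasoning can be checked mechanically. A Python sketch (illustrative; the function and table names are ours) that tests whether one of a player's strategies strictly beats every alternative against each rival move:

```python
# America's payoffs from Table 5.3: america[own strategy][Iraq's strategy].
america = {
    "AS": {"WMD": 0, "CW": 3},
    "GA": {"WMD": -3, "CW": 2},
}

def is_dominant(player_payoffs, strategy):
    """True if `strategy` strictly beats every alternative for each rival move."""
    others = [s for s in player_payoffs if s != strategy]
    return all(
        player_payoffs[strategy][col] > player_payoffs[o][col]
        for o in others
        for col in player_payoffs[strategy]
    )

print(is_dominant(america, "AS"))  # True: 0 > -3 and 3 > 2
print(is_dominant(america, "GA"))  # False
```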
Firm A’s Equilibrium Strategy
If firm B chooses to advertise, firm A would maximize its payoff by also advertising
(payoff 4 > payoff 2).
If firm B chooses not to advertise, firm A would also maximize its payoff by advertising (payoff 5 > payoff 3).
The dominant strategy for firm A is therefore to advertise, irrespective of firm B's strategy.
Dominated Strategy
A player's strategy is said to be dominated if there exists some other strategy that yields him a greater payoff regardless of what the other players do.
One approach to finding a solution to a game that has no clear dominant strategy is to identify the dominated strategies and eliminate them. Note that no player would ever play a dominated strategy, because he or she would always be worse off playing it. This iterative process of eliminating dominated strategies may proceed until each player is left with only one playable strategy (in which case the game is said to be dominance solvable).
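The iterative elimination procedure can be sketched in Python (illustrative code; the function and variable names are ours). It is applied here to the product-choice game used in Example 2 below:

```python
# Iterated elimination of strictly dominated strategies, two-player game.
# Payoffs: payoff[(row, col)] = (row player's payoff, column player's payoff).
def dominated(strategies, rival_strategies, value):
    """Return one strictly dominated strategy, or None if there is none."""
    for s in strategies:
        for t in strategies:
            if t != s and all(value(t, r) > value(s, r) for r in rival_strategies):
                return s            # s is strictly dominated by t
    return None

def ieds(rows, cols, payoff):
    """Delete dominated rows and columns until no more can be removed."""
    rows, cols = list(rows), list(cols)
    while True:
        r = dominated(rows, cols, lambda s, c: payoff[(s, c)][0])
        if r is not None:
            rows.remove(r)
            continue
        c = dominated(cols, rows, lambda s, r2: payoff[(r2, s)][1])
        if c is not None:
            cols.remove(c)
            continue
        return rows, cols

product_game = {
    ("D", "A"): (3, 6), ("D", "B"): (7, 1), ("D", "C"): (10, 4),
    ("E", "A"): (5, 1), ("E", "B"): (8, 2), ("E", "C"): (14, 7),
    ("F", "A"): (6, 0), ("F", "B"): (6, 2), ("F", "C"): (8, 5),
}
print(ieds(["D", "E", "F"], ["A", "B", "C"], product_game))  # (['E'], ['C'])
```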
Example 1:
Table 5.4 shows a payoff matrix of a game involving Firm A and Firm B that must decide
on a pricing policy for a new product. Each firm knows that the rival firm will introduce a
similar competing product. Because Firm A’s product is expected to enter the market
slightly sooner than Firm B’s, Firm A announces its prices first. Firm A can choose one
of the following three prices: $1.00, $1.35, or $1.65. Firm B will reveal its price later.
Because it is second to market, its possible price points are two: $0.95 and $1.30. The
payoffs represent profits.
Steps towards identifying firm A’s and B’s optimal pricing strategies:
Note that to be able to reach an equilibrium we must assume that the players are rational and would never play a dominated strategy. Moreover, each player must know not only that his rival is rational but also that the rival knows that he is.
Check whether there exist any dominated strategies for each firm. To identify a dominated strategy for Firm A, consider each strategy of Firm B and determine which of Firm A's three pricing strategies yields the lowest payoff. For example, the $1.00 pricing policy is Firm A's most preferred if Firm B selects the $0.95 pricing policy, while the $1.35 pricing policy is Firm A's preferred policy given Firm B's $1.30 pricing policy. Note that the $1.65 pricing policy for Firm A is dominated by both of its other pricing policies ($1.00 and $1.35).
Eliminate Firm A’s dominated strategy (eliminate pricing policy of $1.65).
The same procedure could continue to be used to eliminate the dominated strategy
until it is possible to clearly identify each firm’s optimal strategy.
In this example we notice that there is no longer a dominated strategy in the reduced
matrix. Also note that no dominant strategy exists.
The reduced form matrix gives us a situation of pure conflict: what one player wins,
the other loses. Such a game is called a zero-sum game. There would be no unique
prediction for this game. Firm A’s price choice of $1.00 will be met by a counter
offer of $1.30 by Firm B. Alternatively, an offer of $1.35 by Firm A is met with a
counter offer of $0.95 by Firm B.
Example 2:
Consider the following payoff matrix for player 1 and player 2:
Player 2
Product A Product B Product C
Player 1 Product D 3, 6 7, 1 10, 4
Product E 5, 1 8, 2 14, 7
Product F 6, 0 6, 2 8, 5
Using iterative elimination of dominated strategies approach, we can find the equilibrium
of the game.
Starting from the viewpoint of player 1: if the player is rational, he will see that the strategy of producing product D is dominated (product E yields a higher payoff against every strategy of player 2), so he will never play it. If player 2 knows that player 1 is rational and will never produce product D, then player 2's product C strategy is dominant (this is observable after eliminating player 1's product D).
Player 1’s product D strategy is eliminated.
Player 2
Product A Product B Product C
Player 1
Product E 5, 1 8, 2 14, 7
Product F 6, 0 6, 2 8, 5
Having identified player 2’s dominant strategy we can eliminate strategy product A
and product B.
Finally, given player 2's dominant strategy (product C), player 1 only has to choose between products E and F. He will choose product E because it gives him the higher payoff (14 > 8). The equilibrium of the game is: {player 1 chooses product E, player 2 chooses product C}.
Few games have a dominant strategy equilibrium. In games without dominance, a prediction can be achieved by finding a Nash equilibrium (named after John Nash, the Princeton University mathematician who first formalized the notion in 1951).
In a Nash equilibrium, each player’s strategy choice is a best response to the strategies
actually played by his rival.
Example 1:
Consider the example presented above.
Player 2
Product A Product B Product C
Player 1 Product D 3, 6 7, 1 10, 4
Product E 5, 1 8, 2 14, 7
Product F 6, 0 6, 2 8, 5
Marking the best responses in the matrix: circle player 1's best response to each of player 2's strategies (product F against product A, product E against product B, and product E against product C), and put a square around player 2's best response to each of player 1's strategies (product A against product D, product C against product E, and product C against product F).
Any cell with both a circle and a square is a Nash equilibrium. In this game, the Nash
equilibrium is for player 1 to produce E (and receive 14) and player 2 to produce C
(and receive 7). Thus, we can say that a Nash equilibrium is a meeting of best
responses. Once the Nash equilibrium is reached, no player is willing to unilaterally
change behavior.
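The circle-and-square method amounts to checking, cell by cell, whether each player's payoff is maximal given the other's strategy. A Python sketch (illustrative; the names are ours):

```python
# Find all pure-strategy Nash equilibria: a cell is an equilibrium when it
# is a best response for the row player (given the column) and for the
# column player (given the row) -- the "circle and square" test.
def nash_equilibria(rows, cols, payoff):
    equilibria = []
    for r in rows:
        for c in cols:
            best_for_1 = payoff[(r, c)][0] == max(payoff[(r2, c)][0] for r2 in rows)
            best_for_2 = payoff[(r, c)][1] == max(payoff[(r, c2)][1] for c2 in cols)
            if best_for_1 and best_for_2:
                equilibria.append((r, c))
    return equilibria

product_game = {
    ("D", "A"): (3, 6), ("D", "B"): (7, 1), ("D", "C"): (10, 4),
    ("E", "A"): (5, 1), ("E", "B"): (8, 2), ("E", "C"): (14, 7),
    ("F", "A"): (6, 0), ("F", "B"): (6, 2), ("F", "C"): (8, 5),
}
print(nash_equilibria(["D", "E", "F"], ["A", "B", "C"], product_game))
# [('E', 'C')]
```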
Example 2:
Payoff Matrix for an Advertising Game

                                      Firm B
                           Advertise       Don't Advertise
Firm A   Advertise           4, 3               5, 1
         Don't Advertise     2, 5               3, 2
If firm A chooses to advertise, the best response for firm B is to advertise (put a
square around 3 in row 1).
If firm A does not advertise, the best response for firm B is to advertise (put a square
around 5 in row 2).
Now follow the same procedure for player A. If firm B chooses not to advertise, the
best response for player A is to advertise (put a circle around 5 in column 2).
If firm B chooses to advertise, the best response for player A is to advertise (put a
circle around 4 in column 1).
Both a circle and a square appear in the (Advertise, Advertise) cell. Therefore, the Nash equilibrium is for both firm A and firm B to advertise.
Note that we arrived at the same result when we identified each player's dominant strategy in section 5.4 above.
Assignment:
Q1. Consider a two-player simultaneous-move game shown in the payoff matrix table
below.
Player 2
L M R
Player 1 Product U 5, 3 0, 4 3, 5
Product N 4, 0 5, 5 4, 0
Product D 0, 4 0, 4 5, 3
Up to this point, we have assumed that players know all relevant information about each
other, including the payoffs that each receives from the various outcomes of the game.
Such games are known as games of complete information. This assumption is, however,
very strong. In many circumstances, players have what is known as incomplete
information about each other. For instance, when a bank is deciding whether or not to
give a loan to a borrower, it might not know with certainty whether it is dealing with an
honest borrower or a defaulter. The payoffs of the bank will, therefore, depend on the
type of borrower it is dealing with.
Another good example is when a new entrant firm is deciding whether or not to venture into a market. It may not be sure what type of incumbent firm (hostile or
accommodating) it is dealing with. Whether the new entrant firm will survive or not
depends on the type of incumbent firm existing in the market. That is, the payoff of the
entrant would vary depending on the type of incumbent firm.
In the two examples above, players possess asymmetric information, which incomplete
information game models summarize in the form of player “types.” A type consists of
player characteristics that are unknown to others (e.g., honest or defaulter; hostile or
accommodating; strong or weak). Specific types are represented by different payoff
functions. So, a hostile type has a different payoff function relative to an accommodating
type.
Example 1:
Firm E needs to decide whether it should enter a product market where Firm D is an
incumbent. Firm E is uncertain of the reaction of Firm D if Firm E decides to enter the
market. If Firm D is “hostile,” then Firm E expects it to lower price and defend its
market. If Firm D is “accommodating,” then Firm E expects it to keep its prices high, and
allow Firm E to enter the market. Owing to the uncertainty associated with Firm D’s type
(hostile or accommodating), assume that Firm D is a hostile type with probability μ,
while the firm is an accommodating type with probability (1-μ).
The extensive form (game tree) of this Bayesian game is represented in figure 5.4.
Figure 5.4: Game tree of Bayesian game between Firm D and Firm E
[Game tree: Nature moves first, choosing Firm D's type: hostile (part A) with probability μ, or accommodating (part B) with probability 1-μ. Firm E then chooses Enter or Don't Enter without observing Firm D's type, after which Firm D chooses its price. Firm E's payoffs are 4 from entering against the hostile type, 8 from entering against the accommodating type, and 6 from not entering in either case.]
In figure 5.4, nature moves first and determines the game types (in this case whether
Firm D is the hostile type or the accommodating type). The probability that Firm D
will be the hostile type is μ, while the probability that Firm D will be the
accommodating type is (1-μ).
Firm E's decision nodes are circled because it cannot observe Firm D's type (hostile or accommodating). However, Firm D knows whether it is the hostile type or the accommodating type.
Although Firm D knows its type (hostile or accommodating), it does not observe the
action choice (enter or don’t enter) of Firm E. That is why at each type Firm D’s
decision nodes are circled (the final sets of decision nodes).
Note that the payoffs for Firm E are identical across parts A and B. The incomplete
information is about Firm D, not Firm E. Only the payoffs of Firm D change.
Observe that if Firm E knew the true type of Firm D (hostile or accommodating), its
decision would be easy. When Firm D is the hostile type (part A), the Nash equilibrium
would be for Firm E not to enter the market and for Firm D to price low (to fight to
maintain market dominance). When Firm D is the accommodating type (part B), the Nash
equilibrium would be for Firm E to enter the market and for Firm D to price high
(allowing entry).
However, Firm E does not know the true type of Firm D. For this reason we find a
Bayesian Nash equilibrium for the incomplete information game.
We want to define an equilibrium concept for Bayesian games. To do so, we must first
define the players’ pure strategy spaces in such a game. Recall that a player’s strategy is a
complete contingent plan.
A pure strategy for player i in a Bayesian game is a function si(θi) that gives the player’s
strategy choice for each realization of his type θi.
In the Bayesian game presented above, a pure strategy (a complete contingent plan) for
Firm D can be viewed as a function that for each possible realization of Firm D’s
preference type indicates what action it will take. Hence, Firm D has four possible pure
strategies:
(i) Price low if it is the hostile type, price low if it is the accommodating type
(ii) Price low if it is the hostile type, price high if it is the accommodating type
(iii) Price high if it is the hostile type, price low if it is the accommodating type
(iv) Price high if it is the hostile type, price high if it is the accommodating type
Notice, however, that Firm E does not observe Firm D’s type, and so a pure strategy for
Firm E in this game is simply a choice of either “enter the market” or “don’t enter the
market.”
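The four type-contingent plans listed above can be enumerated mechanically. A Python sketch (the variable names are ours):

```python
# Firm D's pure strategies: one action for each realization of its type.
# itertools.product enumerates every type-contingent plan.
from itertools import product

types = ["hostile", "accommodating"]
actions = ["price low", "price high"]

strategies = [dict(zip(types, plan)) for plan in product(actions, repeat=len(types))]
for s in strategies:
    print(s)
# Four plans, e.g. {'hostile': 'price low', 'accommodating': 'price high'}
```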
In a Bayesian game, each player i has a payoff function ui(si, s-i, θi), where θi is called player i's type, which is chosen by nature and is observed only by player i. It is assumed that the probability distribution of the θi's is common knowledge among all players.
We find the Bayesian Nash Equilibrium by comparing the expected payoff from a given
pure strategy with that of alternative pure strategies. Using the example presented above:
If Firm D is the hostile type, it must choose “price low” with probability 1 because this is
that type’s dominant strategy.
If Firm E enters, it earns a payoff of 4
If Firm E does not enter it earns a payoff of 6
Likewise, the accommodating type of Firm D also has a dominant strategy: "price high."
If Firm E enters, it earns a payoff of 8
If Firm E does not enter it earns a payoff of 6
Given that Firm E has only two pure strategies (enter or not enter the market), Firm E
must compare the expected payoff from entering the market with the expected payoff
from not entering the market.
(i) Firm E chooses to enter the market if the expected payoff from entering exceeds the expected payoff from not entering:

μ(4) + (1 − μ)(8) > μ(6) + (1 − μ)(6)
4μ + 8 − 8μ > 6μ + 6 − 6μ
8 − 4μ > 6
2 > 4μ
μ < 1/2

(ii) Firm E chooses not to enter the market if μ > 1/2.

(iii) Firm E is indifferent as to whether or not to enter the market if μ = 1/2.
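The entry condition can be verified numerically, using the payoffs of the example. A Python sketch (the function names are ours):

```python
# Firm E's expected payoffs as a function of mu, the probability that
# Firm D is the hostile type (payoffs from the example above).
def expected_enter(mu):
    return mu * 4 + (1 - mu) * 8    # 4 vs. hostile, 8 vs. accommodating

def expected_stay_out(mu):
    return mu * 6 + (1 - mu) * 6    # 6 either way

for mu in (0.25, 0.5, 0.75):
    print(mu, expected_enter(mu) > expected_stay_out(mu))
# 0.25 True   -> enter when mu < 1/2
# 0.5 False   -> indifferent at mu = 1/2
# 0.75 False  -> stay out when mu > 1/2
```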
5.7 STRATEGIC FORESIGHT: THE USE OF BACKWARD INDUCTION
Good managers use strategic foresight in making decisions. We define strategic foresight
as the ability to make decisions today that are rational given what is anticipated will
happen in the future. For example, a manager builds extra capacity today because he or
she believes (correctly) that the demand will increase in the near future. Using strategic
foresight also helps managers understand that decisions have both short- and long-term
consequences.
Game theory formally models strategic foresight through what is called backward
induction. We use backward induction to solve games by looking to the future,
determining what strategy players will choose (anticipation) then choosing an action that
is rational, based on these beliefs. Backward induction is most easily seen in extensive-
form games, because of their ability to map out the choices of players.
Figure 5.5a: Game tree of the war between America and Iraq

[Game tree: America moves first, choosing AS or GA; Iraq then chooses WMD or CW. Payoffs (America's, Iraq's): (AS, WMD): 0, -2; (AS, CW): 3, -1; (GA, WMD): -3, -3; (GA, CW): 2, 1.]
Since America attacks first, it is assumed that it will reach its decision first. After
seeing the decision of America, Iraq then decides whether to use conventional
weapons (CW) or weapon of mass destruction (WMD).
America wants to make a decision today that maximizes its payoff, given its vision of
the future. This it can achieve by using backward induction. That is, America must
anticipate the future actions of others, and then choose actions that are rational.
(i) Start at the final decision nodes and determine Iraq's optimal action at each:
o Iraq would prefer CW if America were to use GA (payoff 1 for CW > payoff -3 for WMD).
o Iraq would prefer CW if America were to use AS (payoff -1 for CW > payoff -2 for WMD).
(ii) Given that these will be the actions taken at the final decision nodes, proceed
upwards to determine optimal actions to be taken by the first player (in this case
America). This second step is accomplished by replacing Iraq’s decision nodes by
payoffs that will result from Iraq’s optimal behavior.
Figure 5.5b: Reduced Game after Solving for Iraq's Optimal Behavior

[Reduced game: America chooses Air Strike, yielding payoffs (America 3, Iraq -1), or Ground Attack, yielding (America 2, Iraq 1).]
Notice that this reduced game is a very simple single-player decision problem in which America's optimal decision is identified by comparing its payoffs (3 and 2). America's optimal decision is to use Air Strike, since it guarantees the higher payoff of 3 compared with the payoff of 2 for Ground Attack.
Also note that by using backward induction we have identified the sequentially rational Nash equilibrium strategy profile (America uses Air Strike; Iraq uses Conventional Weapon if America uses Air Strike).
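The two backward-induction steps can be sketched in Python, using the payoffs of table 5.3 (the function and variable names are ours):

```python
# Backward induction for the sequential war game: for each first move by
# America, find Iraq's best reply; then America picks the branch whose
# anticipated outcome gives it the higher payoff. Payoffs: (America, Iraq).
war_payoffs = {
    ("AS", "WMD"): (0, -2), ("AS", "CW"): (3, -1),
    ("GA", "WMD"): (-3, -3), ("GA", "CW"): (2, 1),
}

def backward_induction():
    plan = {}
    for first in ("AS", "GA"):
        # Iraq's optimal reply at this decision node (maximize Iraq's payoff).
        plan[first] = max(("WMD", "CW"), key=lambda a: war_payoffs[(first, a)][1])
    # America anticipates Iraq's replies and maximizes its own payoff.
    best = max(("AS", "GA"), key=lambda f: war_payoffs[(f, plan[f])][0])
    return best, plan

print(backward_induction())  # ('AS', {'AS': 'CW', 'GA': 'CW'})
```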
Example 2
Consider the following predation game. Firm E (for entrant) is considering entering a
market that currently has a single incumbent (firm I). If it does so (playing “in”), the
incumbent can respond in one of two ways: It can either accommodate the entrant, giving
up some of its sales but causing no change in the market price, or it can fight the entrant,
engaging in a costly war of predation that dramatically lowers the market price.
The extensive and payoff matrix representations of this game are depicted in figure 5.6.
Figure 5.6a

[Game tree: Firm E chooses In or Out. Out yields Firm E a payoff of 0 and Firm I a payoff of 2. If Firm E plays In, Firm I chooses Fight or Accommodate; accommodating yields Firm I a payoff of 1 (and Firm E a payoff of 2), while fighting yields Firm I a payoff of -1.]
Although the first strategy profile (Out; Fight if firm E plays In) is a Nash equilibrium, it is not a sensible prediction for this game. If firm E uses strategic foresight, it can foresee that if it does enter, the incumbent will in fact find it optimal to accommodate (by doing so, firm I earns 1 rather than -1).
Hence, the incumbent’s strategy “fight” if firm E plays “in” is not credible.
This example illustrates that in dynamic games the Nash equilibrium concept may not give a sensible prediction, hence the need for backward induction.
Figure 5.6b

[Reduced game: Firm E chooses Out, yielding payoffs (Firm E 0, Firm I 2), or In, which after Firm I's optimal accommodation yields (Firm E 2, Firm I 1).]
Firm E’s optimal decision is to play “in” since the payoff of (2) is greater than 0
(payoff of staying out).
Therefore, the sequential rational Nash equilibrium strategy profile is (Firm E plays
in, Firm I accommodates if firm E plays “in”).
Example 3
Consider the three-player finite game of perfect information depicted in figure 5.7.
Figure 5.7a

[Game tree: Player 1 moves first, choosing L or R; at subsequent decision nodes Player 2 chooses between a and b, and Player 3 chooses between v and r. The payoffs at each terminal node are listed in order as Player 1's, Player 2's, and Player 3's.]
Use backward induction to identify sequentially rational Nash equilibrium strategy
profile for player 1, 2 and 3.
Solution
The arrows in figure 5.7a indicate the optimal play of player 3 at the final decision nodes of the game: player 3's payoffs (the third row of payoffs) are compared, and the strategy with the highest payoff is selected at each of that player's decision nodes.
Figure 5.7b

[First-round reduced game: Player 1 chooses L or R, and Player 2 chooses between a and b; Player 3's final decision nodes have been replaced by the payoffs resulting from his optimal play.]
In figure 5.7b, the first-round reduced game of the backward induction procedure is formed by replacing the final decision nodes with the payoffs that result from optimal play (by player 3) once those nodes have been reached. The arrows in figure 5.7b indicate the optimal play of player 2 at the final decision nodes of the reduced game (compare the second row of payoffs and select player 2's strategy with the highest payoff at each of that player's decision nodes).
Figure 5.7c

[Second-round reduced game: Player 1 chooses between L, which yields him a payoff of -1, and R, which yields him a payoff of 5.]
Figure 5.7c represents the reduced game derived in the next stage of the backward induction, in which the final decision nodes of the reduced game in figure 5.7b are replaced by the payoffs arising from optimal play at those nodes (again indicated by arrows). Finally, compare player 1's payoffs (-1 and 5 in the first row of payoffs) and select the player's strategy with the highest payoff (in this case R).
Note that this strategy profile is a Nash equilibrium of this three-player game.
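Backward induction generalizes to any finite game tree of perfect information. A Python sketch (illustrative; the tree below is a hypothetical two-player example, not the payoffs of figure 5.7):

```python
# Generic backward induction on a finite game tree of perfect information.
# An internal node is (player index, {action: subtree}); a terminal node
# is a tuple of payoffs, one entry per player.
def solve(node):
    """Return the payoff vector reached under backward-induction play."""
    if not isinstance(node[1], dict):   # terminal node: a payoff tuple
        return node
    player, branches = node
    # Replace each subtree by the payoffs from optimal play below it,
    # then let the mover pick the action maximizing his own payoff.
    outcomes = {a: solve(sub) for a, sub in branches.items()}
    best = max(outcomes, key=lambda a: outcomes[a][player])
    return outcomes[best]

# Hypothetical two-stage example: player 0 moves first, then player 1.
tree = (0, {
    "L": (1, {"a": (2, 1), "b": (0, 3)}),
    "R": (1, {"a": (1, 0), "b": (3, 2)}),
})
print(solve(tree))  # (3, 2): player 1 picks b on both branches; player 0 picks R
```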
Our earlier analysis has only concentrated on a one-shot game. That is, a game is played
between players only once without any chance of repeating the game. It is worthwhile to
note that the prospects of a game being repeated again and again can change player
actions.
We illustrate the strategic effect of repeated play using a stylized example from a class of
games known as prisoner’s dilemmas. The story behind this game is as follows: Two
individuals are arrested for allegedly engaging in a serious crime and are held in separate
cells. The interrogators try to extract a confession from each prisoner. Each is positively
told that if he is the only one to confess, then he will be rewarded with a light sentence of 1 year while the uncooperative prisoner will go to jail for 10 years. However, if he is the
only one not to confess, then it is he who will serve the 10-year sentence. If both confess,
they will both be shown some mercy: they will each get 5 years. Finally, if neither
confesses, it will still be possible to convict both of a lesser crime that carries a sentence
of 2 years. Each player wishes to minimize the time he spends in jail (or maximize the
negative of this, the payoffs that are depicted in table 5.6).
The Nash equilibrium of this game is for both prisoners to confess. To see why, note that
playing “confess” is each player’s best strategy regardless of what the other player does.
In other words “confess” is a dominant strategy for each player in a game that is only
played once.
Observe that both prisoners would be better off if neither confessed (each would serve a jail term of 2 years), but each fears that the other prisoner will confess and serve only 1 year, leaving the prisoner who did not confess with a 10-year jail term. Thus, if the prisoners interact only once, we expect the rational choice for both to be "confess."
However, if the same scenario is repeated every time they are caught committing crimes together, they will realize that failing to trust each other is expensive: every time they are caught and confess they serve 5 years in jail instead of the 2 years they would serve if they cooperated and did not confess.
Strategically, the key difference between one-shot games and those that are repeated is
the presence of a future. A future introduces behavior not possible in a one-shot game.
Reputation, trust, promises, threats, and reciprocity need a future to exist. Therefore, a
betrayal of trust may lead to gains in the present, but these may be outweighed by future
losses.
In infinitely repeated games, we can have equilibria in which both prisoners reach the mutually beneficial outcome by not confessing. The risk of one prisoner cheating and confessing, leaving the other to serve a 10-year jail term, is mitigated through the threat of future punishment. The threats to punish must, however, be credible. Cooperative behavior between the prisoners is easier to maintain in an infinite-horizon game because the future always looms; in finite-horizon games the future grows smaller as we approach the last period.
In a finite horizon prisoner’s dilemma game that is repeated for only a finite number of
times T, the unique Nash equilibrium involves T repetitions of the one-shot prisoner’s
dilemma equilibrium in which “confess” is the dominant strategy. This is a simple
consequence of backward induction: In the last period, T, we must be at the Nash
equilibrium (confess, confess) regardless of what has happened earlier. But, then, in
period T-1 we are, strategically speaking, at the last period, and the equilibrium (confess,
confess) solution must arise again. And so on, until we get to the first period. In
summary, backward induction rules out the possibility of more cooperative behavior in
the finitely repeated prisoner’s dilemma game.
However, in an infinitely repeated game, things can dramatically change. The prisoners
can follow the following strategies:
Prisoner i’s strategy calls for him not to confess in period t = 1. Then, in each period t
> 1, prisoner i chooses not to confess if in every previous period both prisoners have
not confessed.
Otherwise (that is, once his partner has defected and confessed in a previous period), prisoner i chooses to confess.
This type of strategy is called a Nash reversion strategy: players cooperate until someone deviates, and any deviation triggers permanent reversion to the one-shot Nash equilibrium (confess, confess). Therefore, if the prisoners use the Nash reversion strategy defined above, both will end up choosing "don't confess" in every period, so that they serve a jail term of 2 years each instead of the 5 years they would serve without cooperation.
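The effect of Nash reversion can be illustrated by simulation. A Python sketch (the function name and the deviation scenario are ours):

```python
# Simulate Nash reversion in a repeated prisoner's dilemma: cooperate
# (don't confess) until anyone confesses, then confess forever.
def play_repeated(periods, deviate_at=None):
    """Return each prisoner's total years in jail over `periods` rounds.

    If deviate_at is set, prisoner 1 confesses in that period."""
    years = [0, 0]
    punished = False            # has anyone ever confessed?
    for t in range(1, periods + 1):
        p1_confess = punished or t == deviate_at
        p2_confess = punished
        if p1_confess and p2_confess:
            years[0] += 5
            years[1] += 5
        elif p1_confess:        # lone confessor gets the light sentence
            years[0] += 1
            years[1] += 10
        elif p2_confess:
            years[0] += 10
            years[1] += 1
        else:                   # mutual cooperation: lesser crime
            years[0] += 2
            years[1] += 2
        punished = punished or p1_confess or p2_confess
    return years

print(play_repeated(10))                # [20, 20]: cooperation every period
print(play_repeated(10, deviate_at=1))  # [46, 55]: one-period gain, then punishment
```

The deviator ends up with 46 years instead of 20: the short-run gain from confessing is outweighed by the permanent reversion to (confess, confess).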
The Nash equilibrium of this game is for both firms to sell their products at competitive
low price (Pc) and earn $5 million. Both firms can be better off if each maintains a high
monopolistic price (Pm), but a problem arises from the fear that the other firm would
drop its price and steal the market. In the event that this happens, the firm that charges the
high price Pm earns $1 million, while the low pricing firm earns $23 million.
Therefore, in a one-shot pricing competition we expect both firms to keep price low. In
an infinitely repeated pricing competition, however, the firms would have an incentive to
collude so that they could both charge a monopoly price (Pm), and earn $10 million each.
Each firm adopts a Nash reversion strategy: firm i chooses monopoly price Pm in period t
= 1. Then, in each period t > 1, the firm chooses Pm if in every previous period both
firms chose Pm. Otherwise, firm i will choose the competitive price (Pc) if the other firm
defected in the previous period and chose Pc. The equilibrium, therefore, would be (Pm, Pm).