
MEEG 4104 – Decision-Making in Complex Systems Design

Lecture 15 – Repeated Games

Instructor: Zhenghui Sha (zsha@uark.edu)

Lecture Objectives:
• Finitely Repeated Games
• Infinitely Repeated Games

Department of Mechanical Engineering | University of Arkansas | Fayetteville, AR 72701


Repeated games vs. non-repeated games
Fundamental Difference
• Players interact not just once but many times.
• The prospect of “reciprocity”, either by way of rewards or
punishments, separates a repeated game from a one-time interaction.

If players believe that future behavior will be affected by the nature of
the current interaction, then they may behave in ways that they would not
otherwise.

In every repeated game, there is a component game, called the stage
game, that is played many times.

The total payoff is the sum of payoffs in each stage.


Dominant Strategy – A Motivating Example

Example: Prisoner’s Dilemma

Calvin \ Klein    Confess    Not Confess
Confess           0, 0       7, -2
Not Confess       -2, 7      5, 5

(Each cell lists Calvin's payoff first, then Klein's.)
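
The dominant-strategy argument for this table can be checked mechanically. The sketch below is only an illustration: the payoffs are copied from the table, while the function names (`payoff`, `strictly_dominant_action`) are hypothetical helpers introduced here, not part of the lecture.

```python
# Payoffs from the table: (Calvin's action, Klein's action) -> (Calvin, Klein).
ACTIONS = ["Confess", "Not Confess"]
PAYOFFS = {
    ("Confess", "Confess"): (0, 0),
    ("Confess", "Not Confess"): (7, -2),
    ("Not Confess", "Confess"): (-2, 7),
    ("Not Confess", "Not Confess"): (5, 5),
}


def payoff(player, own, other):
    """Payoff to `player` (0 = Calvin, 1 = Klein) from playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def strictly_dominant_action(player):
    """Return the action that beats every alternative against every opposing action, or None."""
    for a in ACTIONS:
        if all(
            payoff(player, a, o) > payoff(player, b, o)
            for b in ACTIONS if b != a
            for o in ACTIONS
        ):
            return a
    return None


calvin = strictly_dominant_action(0)
klein = strictly_dominant_action(1)
print(calvin, klein, PAYOFFS[(calvin, klein)])
```

With these payoffs, Confess is strictly dominant for both players, so the one-shot outcome is (Confess, Confess) with payoffs 0, 0, even though (Not Confess, Not Confess) would give each player 5.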

Example 1: Once-Repeated Prisoner’s Dilemma
There are five subgames: the whole game, plus one round-two subgame
for each of the four possible round-one outcomes.

Regardless of what the players do in round one, their round-two payoff is
going to be the payoff to (c, c).

Figure 14.1 on page 210 (Dutta)


Subgame: Definition

A subgame is a part of the extensive form; it is a collection of nodes and
branches that satisfies three properties: (1) It starts at a single decision
node. (2) It contains every successor to this node. (The successors of node
𝑥 are all the nodes that can be reached by following some sequence
of branches originating from 𝑥.) (3) If it contains any part of an
information set, then it contains all the nodes in that information set.

Subgame perfect equilibrium strategies must specify a pair of
actions that form a Nash equilibrium within each such subgame.
Repeated Game: Definition

A repeated game is defined by a stage game $G$ and the
number of its repetitions, say $T$. The stage game $G$ is a game
in strategic form: $G = \{S_i, \pi_i;\ i = 1, \dots, N\}$, where $S_i$ is
player $i$’s set of strategies and $\pi_i$ is that player’s payoff function,
which depends on $(s_1, s_2, \dots, s_N)$.

• Finitely repeated game: 𝑇 is finite


• Infinitely repeated game: No fixed end

Example 2: Finitely Repeated Modified Prisoner’s Dilemma

Stage Game – Modified Prisoner's Dilemma

Player 1 \ Player 2    c        n        p
c                      0, 0     7, -2    3, -1
n                      -2, 7    5, 5     0, 6
p                      -1, 3    6, 0     3, 3

c: confess
n: not confess
p: partly confess
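
To see what the extra action p changes, the pure-strategy Nash equilibria of this stage game can be enumerated by brute force. This is a minimal sketch under the payoffs in the table (Player 1 chooses the row, Player 2 the column); `pure_nash_equilibria` is a hypothetical helper name, and mixed strategies are not considered.

```python
ACTIONS = ["c", "n", "p"]

# (Player 1's action, Player 2's action) -> (Player 1 payoff, Player 2 payoff)
PAYOFFS = {
    ("c", "c"): (0, 0),  ("c", "n"): (7, -2), ("c", "p"): (3, -1),
    ("n", "c"): (-2, 7), ("n", "n"): (5, 5),  ("n", "p"): (0, 6),
    ("p", "c"): (-1, 3), ("p", "n"): (6, 0),  ("p", "p"): (3, 3),
}


def pure_nash_equilibria(actions, payoffs):
    """Profiles from which neither player can gain by deviating unilaterally."""
    equilibria = []
    for a1 in actions:
        for a2 in actions:
            u1, u2 = payoffs[(a1, a2)]
            # Player 1 cannot do better by switching rows, holding a2 fixed.
            best1 = all(u1 >= payoffs[(d, a2)][0] for d in actions)
            # Player 2 cannot do better by switching columns, holding a1 fixed.
            best2 = all(u2 >= payoffs[(a1, d)][1] for d in actions)
            if best1 and best2:
                equilibria.append((a1, a2))
    return equilibria


equilibria = pure_nash_equilibria(ACTIONS, PAYOFFS)
print(equilibria)
```

Under these payoffs the search returns (c, c) and (p, p). The second is a weak equilibrium, since against p a player is indifferent between c and p. Having more than one stage-game equilibrium is what distinguishes this example from the ordinary Prisoner's Dilemma.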
Example 2: Finitely Repeated Modified Prisoner’s Dilemma
Figure 14.2 on page 211 (Dutta)

The total payoffs in the repeated game are simply the
sum of payoffs to the T stage games.
Backward Induction in Finitely Repeated Games

The analysis of subgame perfect equilibria in finitely repeated games
proceeds by way of backward induction.

That is, we fold back the repeated game, one subgame at a time, starting
from the end.
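
The folding procedure can be sketched for the ordinary Prisoner's Dilemma (payoffs 0, 0 for (c, c); 7, -2 for (c, n); -2, 7 for (n, c); 5, 5 for (n, n)). The code below is an illustration only: it assumes the folded game has a unique pure equilibrium at every step, so the continuation payoff is the same constant after every history, and the names `pure_nash` and `fold_back` are hypothetical.

```python
ACTIONS = ["c", "n"]
STAGE = {
    ("c", "c"): (0, 0), ("c", "n"): (7, -2),
    ("n", "c"): (-2, 7), ("n", "n"): (5, 5),
}


def pure_nash(payoffs):
    """All action profiles from which neither player gains by deviating alone."""
    return [
        (a1, a2)
        for a1 in ACTIONS
        for a2 in ACTIONS
        if all(payoffs[(a1, a2)][0] >= payoffs[(d, a2)][0] for d in ACTIONS)
        and all(payoffs[(a1, a2)][1] >= payoffs[(a1, d)][1] for d in ACTIONS)
    ]


def fold_back(rounds):
    """Fold the game from the last round; return the plays (last round first) and total payoffs."""
    continuation = (0, 0)  # nothing is played after the final round
    plays = []
    for _ in range(rounds):
        # Add the (history-independent) continuation payoff to every cell.
        folded = {
            profile: (u1 + continuation[0], u2 + continuation[1])
            for profile, (u1, u2) in STAGE.items()
        }
        equilibria = pure_nash(folded)
        if len(equilibria) != 1:
            raise ValueError("sketch assumes a unique equilibrium at every step")
        plays.append(equilibria[0])
        continuation = folded[equilibria[0]]
    return plays, continuation


plays, totals = fold_back(2)
print(plays, totals)
```

Because adding the same constant to every cell does not change best responses, each folded game has the same equilibrium as the stage game, which is why (c, c) is selected in both rounds.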

Backward Induction in Finitely Repeated Games
• The players play each of the two stages as if they were playing each
stage by itself: they play the stage-game Nash equilibrium regardless of
what happened earlier.

• This observation can be confirmed more generally. Suppose the Prisoners'
Dilemma is played T + 1 times, where T is any positive integer. Folding
back from the last stage shows that (c, c) is played in every one of the
T + 1 stages.

Figure 14.4 on page 215 (Dutta)


Proposition

Consider a finitely repeated game $(G, T)$ with $G = \{S_i, \pi_i;\ i = 1, \dots, N\}$ …
Infinitely Repeated Game
When a game has no identifiable end, that is, when T is infinite, we cannot
simply add up payoffs, because two problems arise:

• The payoff numbers may add to infinity.

• The running total may not settle down: it depends on the stage T at which
we stop adding.
o Consider the following play: a repeated cycle comprising (n, c) five
times followed by (n, n) twice. For player 1, the total payoff over any
one cycle is 5 × (−2) + 2 × 5 = 0; hence, every seven stages the
total payoff comes back to 0. However, the total payoff after 8, or 15,
or 22 . . . stages is always −2.
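
The arithmetic in this example can be reproduced with a short running-total computation. This is a sketch for player 1 only, using the Prisoner's Dilemma payoffs −2 for (n, c) and 5 for (n, n); the helper name `running_totals` is hypothetical.

```python
from itertools import cycle, islice

# Player 1's stage payoffs for the two profiles that appear in the cycle.
P1_PAYOFF = {("n", "c"): -2, ("n", "n"): 5}

# One cycle: (n, c) five times, then (n, n) twice.
CYCLE = [("n", "c")] * 5 + [("n", "n")] * 2


def running_totals(num_stages):
    """Undiscounted total payoff to player 1 after each of the first `num_stages` stages."""
    totals, total = [], 0
    for profile in islice(cycle(CYCLE), num_stages):
        total += P1_PAYOFF[profile]
        totals.append(total)
    return totals


totals = running_totals(22)
after_full_cycles = [totals[k - 1] for k in (7, 14, 21)]
one_stage_later = [totals[k - 1] for k in (8, 15, 22)]
print(after_full_cycles, one_stage_later)
```

The totals after 7, 14 and 21 stages are all 0, while the totals after 8, 15 and 22 stages are all −2, so the undiscounted sum never converges to a single number.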

Discounted Payoffs
Discount factor: Amount by which future payments are discounted to get
their present-day equivalent.

For example, if $1 a month from now is equivalent to $0.99 today, then the
discount factor, $\delta$, is 0.99.
• Amount by which a payoff two stages from today is discounted = $\delta^2$
• Amount by which a payoff three stages from today is discounted = $\delta^3$
• …

The total discounted payoff for player $i$, where $\pi_{it}$ is player $i$’s
payoff in stage $t$, is:

$\pi_{i0} + \pi_{i1}\delta + \pi_{i2}\delta^2 + \cdots + \pi_{it}\delta^t + \cdots$
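
As a numerical illustration of the discounted sum, the sketch below evaluates it for a player who receives 5 in every stage (the (n, n) payoff in the Prisoner's Dilemma) with the discount factor 0.99 from the example. The function name `discounted_total` is hypothetical, and the infinite sum is approximated by a long finite one and compared with the standard geometric-series value 5 / (1 − δ).

```python
def discounted_total(stage_payoffs, delta):
    """Sum of payoff_t * delta**t over the given stages, with t starting at 0."""
    return sum(payoff * delta**t for t, payoff in enumerate(stage_payoffs))


delta = 0.99

# Three stages only: 5 + 5*delta + 5*delta**2.
three_stages = discounted_total([5, 5, 5], delta)

# A long horizon approximates the infinite sum, which is 5 / (1 - delta).
long_horizon = discounted_total([5] * 5000, delta)
closed_form = 5 / (1 - delta)

print(three_stages, long_horizon, closed_form)
```

Unlike the undiscounted total, this sum stays finite for any δ below 1, which is what makes payoffs comparable across infinitely repeated plays.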

Example 3: Infinitely Repeated Prisoner’s Dilemma
Figure 14.4 on page 212 (Dutta)

Suppose that at the $t$-th stage, player $i$ gets a payoff of $\pi_{it}$. The
likelihood that the $t$-th stage will even get played is $\delta^t$. Hence the
expected payoff to the $t$-th stage is $\delta^t \pi_{it}$. The total expected
payoff is the sum of these stage-game expected payoffs; that is, it equals
$\sum_{t=0}^{\infty} \delta^t \pi_{it}$.
The Behavior Cycle

A behavior cycle is a repeated cycle of actions: play $(n, n)$ for $T_1$
stages, then $(c, c)$ for $T_2$ stages, followed by $(n, c)$ for $T_3$ stages,
and then $(c, n)$ for $T_4$ stages. At the end of these $T_1 + T_2 + T_3 + T_4$
stages, start the cycle again, then yet again, and so on.

The behavior cycle is called individually rational if each
player gets a strictly positive payoff within a cycle; that is, for each
player the sum of stage payoffs over the $T_1 + T_2 + T_3 + T_4$
stages is positive.
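
The individual-rationality condition is a simple sum, which the following sketch checks for the Prisoner's Dilemma payoffs (5, 5 for (n, n); 0, 0 for (c, c); −2, 7 for (n, c); 7, −2 for (c, n)). The function names `cycle_totals` and `is_individually_rational` are hypothetical, and the totals are undiscounted sums over one cycle, as in the definition above.

```python
# Stage payoffs (player 1, player 2) for the four profiles in a behavior cycle.
STAGE = {
    ("n", "n"): (5, 5),
    ("c", "c"): (0, 0),
    ("n", "c"): (-2, 7),
    ("c", "n"): (7, -2),
}


def cycle_totals(t1, t2, t3, t4):
    """Sum of stage payoffs over one cycle: (n,n) t1 times, (c,c) t2, (n,c) t3, (c,n) t4."""
    lengths = {("n", "n"): t1, ("c", "c"): t2, ("n", "c"): t3, ("c", "n"): t4}
    total1 = sum(STAGE[profile][0] * k for profile, k in lengths.items())
    total2 = sum(STAGE[profile][1] * k for profile, k in lengths.items())
    return total1, total2


def is_individually_rational(t1, t2, t3, t4):
    """True if both players' cycle totals are strictly positive."""
    return all(total > 0 for total in cycle_totals(t1, t2, t3, t4))


print(cycle_totals(1, 1, 1, 1), is_individually_rational(1, 1, 1, 1))
print(cycle_totals(2, 0, 5, 0), is_individually_rational(2, 0, 5, 0))
```

For example, one stage of each profile gives both players 10, so that cycle is individually rational. A cycle with (n, n) twice and (n, c) five times gives player 1 exactly 0, which fails the strict inequality.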

The Folk Theorem
Theorem

• Equilibrium Behavior: Consider any individually rational behavior
cycle. Then this cycle is achievable as the play of a subgame
perfect equilibrium whenever the discount factor 𝛿 is close to 1.

• Equilibrium Strategy: One strategy that constitutes an
equilibrium is the grim trigger: start with the desired behavior cycle
and continue with it as long as neither player has done anything else.
If either player deviates, then play (𝑐, 𝑐) forever after.
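
To make "𝛿 close to 1" concrete, the sketch below works out the simplest case: the desired cycle is (n, n) in every stage of the Prisoner's Dilemma, enforced by the grim trigger. It is an illustration under stated assumptions, not the general proof. Payoffs are 5 for (n, n), 7 for confessing against n, and 0 for (c, c). Only a deviation in the current stage is checked, which is enough here because the cycle is the same in every stage and the punishment (c, c) is itself a stage-game equilibrium. All names are hypothetical.

```python
from fractions import Fraction

COOPERATE = 5  # stage payoff when both play n
DEVIATE = 7    # stage payoff from playing c while the other plays n
PUNISH = 0     # stage payoff when both play c (the grim-trigger punishment)


def follow_value(delta):
    """Discounted payoff from staying on the cycle: 5 + 5*delta + ... = 5 / (1 - delta)."""
    return COOPERATE / (1 - delta)


def deviate_value(delta):
    """Discounted payoff from deviating now: 7 today, then (c, c) forever after."""
    return DEVIATE + delta * PUNISH / (1 - delta)


def grim_trigger_sustains_cooperation(delta):
    """True if no player gains by deviating from (n, n) under the grim trigger."""
    return follow_value(delta) >= deviate_value(delta)


# With PUNISH = 0 the condition 5 / (1 - delta) >= 7 rearranges to delta >= 1 - 5/7.
threshold = 1 - Fraction(COOPERATE, DEVIATE)

print(threshold)
print(grim_trigger_sustains_cooperation(Fraction(1, 4)))
print(grim_trigger_sustains_cooperation(Fraction(9, 10)))
```

Under these assumptions the cutoff is 𝛿 = 2/7: below it the one-time gain of 7 outweighs the lost future payoffs, and at or above it the threat of (𝑐, 𝑐) forever is enough to keep both players on (n, n).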

The Folk Theorem
Some general remarks:
• All potential behaviors are equilibrium behaviors
o Not only are positive payoffs necessary for equilibrium but
they are sufficient as well; every behavior cycle with positive
payoffs is an equilibrium for high values of 𝛿.
• All payoffs are accounted for
o We are not excluding any possible payoffs. As we look at
different behavior cycles we get different payoffs per stage.
• Future needs to matter
o The result only works for high values of 𝛿. A high 𝛿 means
that future payoffs matter. In turn, that fact means future
promises, or threats, can affect current behavior.
The Folk Theorem
• Infinitely many equilibria
o An implication of the result is that there are an infinite
number of subgame perfect equilibria in the infinitely
repeated Prisoners' Dilemma.
• A more general conclusion
o By repeating any stage game, we can get all individually
rational behavior cycles as part of a subgame perfect
equilibrium.
• Observable Actions
o One shortcoming of the analysis so far is that it requires
deviations to be perfectly observable, and hence immediately
punishable. In many contexts this assumption is unrealistic
because other players may not have precise information on
what a rival has done in the past.
Thank You!
