Topic 3: Dynamic Games of Complete Information

We now turn from one-shot or static games to dynamic or multi-stage games which
are played over time. In this chapter, we assume that information is complete and
perfect. We will introduce another equilibrium concept called the subgame-perfect
Nash equilibrium.

3.1 A General Two-Stage Game with Observed Actions

We examine a 2-stage game with complete and perfect information: the payoffs of each player and the history of the game are known. We introduce two key concepts: backward induction and subgame-perfect Nash equilibrium. The game is as follows:

Stage 1. Player 1 chooses a1.

Stage 2. Player 2 observes a1 and chooses a2.

The game then ends and payoffs are u1(a1, a2) and u2(a1, a2) for Players 1 and
2 respectively.

This is a very general class of game and two specific examples follow. Key features are
(i) sequential moves, (ii) all previous moves are observed, and (iii) payoff functions
and the rationality assumption are common knowledge. (i) and (ii) imply perfect
information; (iii) implies complete information.

3.1.1 Solution by Backward Induction

To solve this game, we start at the end, i.e., we proceed by backward induction.

 Player 2 chooses a2 to maximize his payoff u2(a1, a2) given a1. We obtain
the best response function BR2(a1) for Player 2.
 Player 1 chooses a1 to maximize his payoff u1(a1, BR2(a1)). This means we
substitute Player 2's best response function BR2(a1) into Player 1's payoff
function and then maximize Player 1's payoff by choosing a1.

3.1.1.1 A Specific Example

To understand backward induction, let's consider the following example. Notice
that for static games we have the normal-form representation; correspondingly,
for dynamic games we have the extensive-form representation. The extensive-form
representation includes the players, the strategy sets, the payoffs, and the stages
of the game. The extensive form of the present example can be represented by the
following game tree.


              1
            /   \
           L     R
          /       \
         2         2
        / \       / \
       L   R     L   R
    (3,1) (1,2) (2,1) (0,0)

In this game, Player 1 moves first by choosing L or R. After observing Player 1's
choice, Player 2 chooses L or R. The nodes labeled 1 and 2 in the diagram are called
decision nodes. A decision node is a point where a player has to make a choice.

Backward Induction:
Imagine you are Player 2:
 If player 1 chooses L, what should you choose? Why?
 If player 1 chooses R, what should you choose? Why?

Imagine you are Player 1:


 Given Player 2’s best response, what should you choose? Why?

The backward induction outcome of the game is then: at stage 1, Player 1 plays R;
at stage 2, Player 2 plays L. The payoff is (2, 1).
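The two-step procedure above is easy to mechanize. Below is a minimal Python sketch (an illustration, not part of the original notes) that solves the example by backward induction; the payoff dictionary simply transcribes the four terminal nodes of the tree.

```python
# Terminal payoffs from the game tree: (Player 1's action, Player 2's action).
payoffs = {
    ("L", "L"): (3, 1), ("L", "R"): (1, 2),
    ("R", "L"): (2, 1), ("R", "R"): (0, 0),
}

def solve():
    # Stage 2: Player 2 best-responds to each observed action a1
    # (index 1 of the payoff tuple is Player 2's payoff).
    br2 = {a1: max(("L", "R"), key=lambda a2: payoffs[(a1, a2)][1])
           for a1 in ("L", "R")}
    # Stage 1: Player 1 maximizes his own payoff anticipating br2.
    a1 = max(("L", "R"), key=lambda a: payoffs[(a, br2[a])][0])
    return a1, br2[a1], payoffs[(a1, br2[a1])]

print(solve())  # -> ('R', 'L', (2, 1))
```

Running it reproduces the outcome derived above: Player 1 plays R, Player 2 responds with L, and the payoff is (2, 1).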

3.1.2 Subgames

A subgame is a subset of the original game that satisfies the property that it must start
from a decision node and must contain all its successor nodes.

To find the subgames, let’s start from the end of the game tree. We should start from
a decision node.
1. The middle-left decision node initiates a subgame that encompasses all of its
successor nodes. It is indicated by the red circle. In this subgame, Player 2 chooses
L or R given Player 1 chooses L.
2. The middle-right decision node initiates a subgame that encompasses all of its
successor nodes. It is indicated by the green circle. In this subgame, Player 2
chooses L or R given Player 1 chooses R.
3. The top decision node initiates a subgame that encompasses all of its successor
nodes. It is indicated by the purple circle. By definition, the original game is always
a subgame of itself.
4. Subgames that start from nodes other than the original game’s initial node are
called proper subgames.

              1
            /   \
           L     R
          /       \
         2         2
        / \       / \
       L   R     L   R
    (3,1) (1,2) (2,1) (0,0)


3.1.3 Credibility

The backward induction outcome ensures credibility, i.e., no empty threats. Why?
Returning to our specific example: is it possible for Player 2 to achieve a higher
return by threatening that he will choose R no matter what?

              1
            /   \
           L     R
          /       \
         2         2
        / \       / \
       L   R     L   R
    (3,1) (1,2) (2,1) (0,0)

By backward induction, we know that choosing R is not the best response in the
subgame on the right-hand side (the green circle). Therefore Player 2’s strategy to play
R even if Player 1 plays R is an empty threat and will not be believed by Player 1.

3.2 Equilibrium Concept 3: Subgame-perfect Nash Equilibrium

A subgame-perfect Nash equilibrium (SPNE) is an equilibrium concept in game theory
where players play their best responses in every subgame of the overall game. In
other words, it ensures that each player's strategy is optimal not just for the entire
game, but also for every smaller game that can be formed within the larger game.

3.2.1 Backward Induction and Subgame-perfect Nash Equilibrium

Strategies and outcomes isolated by the backward induction method are subgame-
perfect Nash equilibria. In other words, we can use the backward induction method
to find SPNE. To understand this better, let’s consider the following game.

[Game tree figure omitted. In the tree: Player 1 chooses U or D at stage 1; after U,
Player 2 chooses A or B; after D, Player 2 chooses C or D; after C, Player 1 chooses
E or F at stage 3.]
There are 3 proper subgames indicated by the three circles in different colors.

In the first step, start from the end of the game tree. Consider the proper subgame in
the red circle.

Player 1 will choose E to maximize his payoff; Player 1 plays his best response in
this subgame.

Then, we move to the upper stage which has two proper subgames (green and purple).
Let’s focus on the green one first.

Since Player 1 will play E (not F) in the stage-3 subgame, Player 2 gets 3 by
choosing C and 2 by choosing D. Player 2 chooses C to maximize his payoff; he plays
his best response in this subgame.

Now, let’s turn to the purple one.

Player 2 chooses A to maximize his payoff; he plays his best response in this
subgame.

Finally, Player 1's best response is to choose D. The SPNE is: Player 1 chooses D at
stage 1, Player 2 chooses C at stage 2, and Player 1 chooses E at stage 3. The payoff
is (3, 3).

Clearly, in the backward induction method, players play their best responses at every
subgame. Therefore, strategies and outcomes isolated by the backward induction
method are subgame-perfect Nash equilibria.

3.2.2 Nash Equilibrium and Subgame-perfect Nash Equilibrium



The set of SPNE is a subset of the set of Nash equilibria of the normal-form game.
To find the Nash equilibria, let's transform the extensive form into a normal-form
representation.

[Handwritten normal-form payoff table omitted; the construction is summarized below.]
Player 1 chooses from {U, D} at stage 1 and from {E, F} at stage 3. The strategy set
of Player 1 in the normal form is:
{U, D} × {E, F} = {UE, UF, DE, DF}
Player 2 chooses from {A, B} and from {C, D} at stage 2. The strategy set of Player 2
in the normal form is:
{A, B} × {C, D} = {AC, AD, BC, BD}
Now, let's fill in the payoffs. Consider the following examples.
 (UE, AC): When Player 1 chooses U, the stage-3 node is never reached, so we
ignore E. Given Player 1 chooses U, Player 2 can't choose C, so we ignore C.
(UE, AC) means Player 1 chooses U and Player 2 chooses A, and the payoff is (1, 4).
 (DE, AC): Player 1 can choose both D and E, i.e., choose D at stage 1 and E at
stage 3. Given Player 1 chooses D, Player 2 can't choose A, so we ignore A.
(DE, AC) means Player 1 chooses D at stage 1, Player 2 chooses C at stage 2, and
Player 1 chooses E at stage 3. The payoff is (3, 3).
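As a quick sanity check, the Cartesian products above can be generated with `itertools.product` (a small illustrative sketch, not part of the notes):

```python
from itertools import product

# Player 1 acts at stages 1 and 3; Player 2 has two decision nodes at stage 2.
s1 = ["".join(p) for p in product("UD", "EF")]
s2 = ["".join(p) for p in product("AB", "CD")]

print(s1)  # ['UE', 'UF', 'DE', 'DF']
print(s2)  # ['AC', 'AD', 'BC', 'BD']
```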

From the normal form, we find that there are three pure-strategy Nash equilibria,
with outcomes (3, 3), (6, 2), and (6, 2). However, only (3, 3) is a subgame-perfect
outcome.

3.2.3 ICE: SPNE in a Business Context

Consider the following firm entry game. Firm 2 is currently a monopoly in the market.
Firm 1 is deciding whether to enter the market or not. If Firm 1 enters the market,
Firm 2 decides whether to accept the entry or declare a price war.

(a) How many subgames are there?
(b) How many proper subgames are there?
(c) Find the subgame-perfect Nash equilibrium.

         Firm 1
        /      \
      Out       In
     (2,2)       \
               Firm 2
              /      \
          Accept     War
           (3,1)    (0,0)

(a)
There are two subgames.

(b)
There is one proper subgame.

(c)
By backward induction, (In, Accept) is the subgame-perfect Nash equilibrium.
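The same backward-induction recipe can be checked in code (a sketch mirroring the tree above, with the payoffs transcribed from the figure):

```python
# Entry game payoffs: Out ends the game; In leads to Firm 2's move.
payoffs = {"Out": (2, 2), ("In", "Accept"): (3, 1), ("In", "War"): (0, 0)}

# Stage 2: after entry, Firm 2 picks the action with the higher own payoff.
br2 = max(("Accept", "War"), key=lambda a: payoffs[("In", a)][1])

# Stage 1: Firm 1 compares staying out with entering, anticipating br2.
u_in, u_out = payoffs[("In", br2)][0], payoffs["Out"][0]
spne = ("In", br2) if u_in > u_out else ("Out", br2)
print(spne)  # -> ('In', 'Accept')
```

Because War yields Firm 2 only 0 against Accept's 1, the price-war threat is not credible, and Firm 1 enters (3 > 2).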


3.3 The Stackelberg Duopoly Model

This is a simple example of the general two-stage dynamic game of complete and
perfect information set out in the previous section. It is a dynamic version of the
Cournot game, in which the two firms choose quantities sequentially. You may want
to contrast this model with the Cournot game, which is the simultaneous-move
(static) game of quantity competition.

The story goes like this.

Stage 1: A leader, Firm 1, chooses output q1.
Stage 2: A follower, Firm 2, chooses output q2 knowing the history of the game,
i.e., the choice of output q1 by Firm 1.

To solve this game, we proceed as before by backward induction. We start with stage
2 and work out Firm 2's optimal strategy given its observation of q1*. Then, given
Firm 2's best response BR2(q1*), we proceed to stage 1 and calculate Firm 1's
optimal move.
Suppose the market demand curve is given by:

p = 130 − q1 − q2

Assume each firm has the same cost function (no fixed cost; marginal cost 10),
given by:

Ci = 10qi,  i = 1, 2

Their profit functions are given by:

π1 = pq1 − 10q1 = (130 − q1 − q2)q1 − 10q1 = 120q1 − q1² − q1q2
π2 = pq2 − 10q2 = (130 − q1 − q2)q2 − 10q2 = 120q2 − q2² − q1q2

At stage 2, Firm 2 observes Firm 1's decision q1*. Given q1*, Firm 2 maximizes its
profit. In other words, Firm 2 treats q1* as a constant:

π2 = 120q2 − q2² − q1*q2

The FOC of Firm 2 is:

∂π2/∂q2 = 120 − q1* − 2q2 = 0

Solving for q2, we obtain the best response function of Firm 2:

q2 = (120 − q1*)/2 = BR2(q1*)

Now, go to stage 1. Firm 1 knows that Firm 2 will set q2 = (120 − q1)/2 (complete
information: the profit functions are common knowledge, so Firm 1 can also calculate
Firm 2's best response function). Firm 1 will substitute q2 = (120 − q1)/2 into its
profit function and maximize its profit:

π1 = 120q1 − q1² − q1 · (120 − q1)/2 = 60q1 − (1/2)q1²

The FOC of Firm 1 is:

∂π1/∂q1 = 60 − q1 = 0
q1* = 60
q2* = (120 − 60)/2 = 30

Therefore, the SPNE is: Firm 1 produces 60 at stage 1 and Firm 2 produces 30 at
stage 2.
The corresponding profits are:

π1 = 120(60) − 60² − 60(30) = 1800
π2 = 120(30) − 30² − 60(30) = 900

Note that there is a first-mover advantage in this game: even though the two firms
are otherwise identical, the firm that moves first obtains a higher payoff than the
firm that moves second (contrast with 2.4.1.2 ICE, where the firms move at the same
time and earn the same profit).
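The derivation above goes through for any linear inverse demand p = a − b(q1 + q2) with common marginal cost c. A small Python sketch (the closed forms below come from redoing the backward induction with symbols; they are not stated in the notes):

```python
def stackelberg(a, b, c):
    """SPNE of the Stackelberg game with p = a - b*(q1 + q2) and C_i = c*q_i.

    Backward induction with symbols gives:
      BR2(q1) = (a - c - b*q1) / (2*b)   (Firm 2's FOC)
      q1*     = (a - c) / (2*b)          (Firm 1's FOC after substitution)
    """
    q1 = (a - c) / (2 * b)
    q2 = (a - c - b * q1) / (2 * b)
    p = a - b * (q1 + q2)
    return q1, q2, (p - c) * q1, (p - c) * q2

# The example in the text: p = 130 - q1 - q2, C_i = 10*q_i.
print(stackelberg(130, 1, 10))  # -> (60.0, 30.0, 1800.0, 900.0)
```

The same function with a = 120, b = 2, c = 2 reproduces the exercise in 3.3.1: quantities 29.5 and 14.75.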



3.3.1 ICE
Consider the following two-stage dynamic game of complete and perfect information.

Stage 1: A leader Firm 1 chooses output 𝑞1


Stage 2: A follower Firm 2 chooses output 𝑞2 knowing the history of the game, i.e.,
the choice of output 𝑞1 by Firm 1.

Suppose the market demand curve is given by:


𝑝 = 120 − 2𝑞1 − 2𝑞2
Assume each firm has the same marginal cost of 2; their cost functions are given by:
𝐶𝑖 = 2𝑞𝑖 , 𝑖 = 1,2
(a) Write down the profit function for each firm.

(b) Solve for the subgame-perfect Nash equilibrium.


(c) Compute the equilibrium profit for each firm.

(a)
𝜋1 = 118𝑞1 − 2𝑞12 − 2𝑞1 𝑞2
𝜋2 = 118𝑞2 − 2𝑞22 − 2𝑞1 𝑞2

(b)
Start from Firm 2. Given q1*, the FOC of Firm 2 is:

∂π2/∂q2 = 118 − 2q1* − 4q2 = 0

Solving for q2, we obtain the best response function of Firm 2:

q2 = (118 − 2q1*)/4 = BR2(q1*)

Then, given q2 = (118 − 2q1)/4, Firm 1's profit function is:

π1 = 118q1 − 2q1² − 2q1 · (118 − 2q1)/4 = 59q1 − q1²

The FOC of Firm 1 is:

∂π1/∂q1 = 59 − 2q1 = 0
q1* = 29.5
q2* = (118 − 2(29.5))/4 = 14.75

(c)
π1 = 118(29.5) − 2(29.5)² − 2(29.5)(14.75) = 870.25
π2 = 118(14.75) − 2(14.75)² − 2(29.5)(14.75) = 435.125
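The calculus can also be cross-checked numerically: substitute Firm 2's best-response function and grid-search Firm 1's profit (an illustrative sketch; the 0.25 grid step is an arbitrary choice that happens to contain the optimum):

```python
def br2(q1):
    # Firm 2's best response from its FOC: 118 - 2*q1 - 4*q2 = 0
    return (118 - 2 * q1) / 4

def pi1(q1):
    # Firm 1's profit with p = 120 - 2*(q1 + q2) and marginal cost 2
    q2 = br2(q1)
    return (120 - 2 * (q1 + q2) - 2) * q1

grid = [i * 0.25 for i in range(201)]  # candidate outputs 0, 0.25, ..., 50
q1 = max(grid, key=pi1)
q2 = br2(q1)
print(q1, q2, pi1(q1))  # -> 29.5 14.75 870.25
```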

3.4 Repeated Games

In many business strategic situations, players interact repeatedly over time.
Repetition of the same game might foster cooperation. Since the game is played in
the current as well as future periods, we should discount future periods' payoffs.
Let δ ∈ (0, 1) be the discount factor. For example, if δ = 0.9, a $1 payoff in the
next period is worth $1 × δ = $0.9 in the current period.

3.4.1 Finitely Repeated Prisoners' Dilemma

Let's start with a finitely repeated prisoners' dilemma. The game is played at each
date or period for some duration of T periods, i.e., the game is played in periods
1, 2, 3, …, T. Company A (row player) and Company B (column player) make pricing
decisions for their products simultaneously in each period, and all prior moves are
observable. Below is the normal-form representation of the stage game.
                        Company B
                    High (H)   Low (L)
Company A  High (H)  (5, 5)    (1, 8)
           Low (L)   (8, 1)    (3, 3)

Suppose the above game is repeated for 2 periods. We can use backward induction to
find the SPNE. Whether or not we discount the payoffs, (L, L) is the mutual best
response in period 2. Similarly, (L, L) is also the mutual best response in period 1.
The same result holds for any finite T. Hence, no cooperation is possible (in
equilibrium) if the game is repeated finitely, because backward induction leads to
play of (L, L) in each period.

One may think the firms can make a commitment to foster cooperation: As long as
(H, H) was played in the previous period, both firms play H in the current period,
otherwise, play L in the rest of the periods. This is called a “grim trigger strategy.”

However, a grim trigger strategy cannot foster cooperation in this finitely repeated
prisoners’ dilemma. It is because there is a profitable unilateral deviation opportunity
in the last period of the finitely repeated game.

In the last period, given that, say, Company A plays H, Company B will play L to get
a higher payoff (8 > 5) because Company A cannot retaliate (the game is over!).
Then, Company A will play L instead in the second-to-last period. Each firm has an
incentive to deviate earlier. By backward induction, (H, H) is not part of an SPNE.

3.4.2 Infinitely Repeated Prisoners' Dilemma

However, if we can repeat the prisoners' dilemma "infinitely often", then we can
hope to sustain cooperation for some values of δ. Again, we consider the same game
as before.

Company B
High (H) Low (L)
Company A High (H) (5, 5) (1, 8)
Low (L) (8, 1) (3, 3)

Since the game is played infinitely, there is no "endgame effect" as described above.
A grim trigger strategy is applied to the game: as long as (H, H) was played in the
previous period, both firms play H in the current period; otherwise, play L forever.

The infinitely repeated prisoners' dilemma will proceed as follows.
 Each firm starts the game by playing H in the first period.
 Play H in every period unless someone played L in the previous period.
 If someone played L in the previous period, play L forever.

Consider Company A. If both companies follow the trigger strategy and play (H, H)
forever, Company A gets:

5 + 5δ + 5δ² + ⋯ = 5/(1 − δ)

Here, we apply the formula for the sum of an infinite geometric sequence:
a + ar + ar² + ⋯ = a/(1 − r), valid for |r| < 1.

If Company A deviates today and plays L instead, then it gets 8 in the current
period but 3 in every period thereafter. So, Company A's payoff becomes:

8 + 3δ + 3δ² + ⋯ = 8 + 3δ/(1 − δ) = (8 − 5δ)/(1 − δ)

Play of H can be supported for Company A if:

5/(1 − δ) ≥ (8 − 5δ)/(1 − δ)

Since 1 − δ > 0, the condition 5/(1 − δ) ≥ 8 + 3δ/(1 − δ) simplifies to:

5 ≥ 8 − 5δ
5δ ≥ 3
δ ≥ 3/5

Similarly, for Company B we can obtain the same condition.
Interpretation: δ can be interpreted as the level of patience. For example, when
δ = 0.99, the player perceives today as roughly equivalent to the future ($1 today
≈ $0.99 in the future). Therefore, if players are patient enough, then infinite
repetition of the game can lead to mutual cooperation being supported as an
equilibrium, which was not possible with one-shot or finite repetition of the game.
(Note: "patient enough" does not mean a "high" value of δ; it means δ should be
higher than a certain threshold.)
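The one-shot-deviation comparison of the two payoff streams can be packaged into a general threshold: with per-period cooperation payoff R, one-period deviation payoff T, and punishment payoff P, grim trigger sustains cooperation iff δ ≥ (T − R)/(T − P). A one-line Python sketch (this rearrangement is not stated in the notes):

```python
def critical_delta(R, T, P):
    # R/(1-d) >= T + d*P/(1-d)  <=>  R >= T - d*(T - P)  <=>  d >= (T-R)/(T-P)
    return (T - R) / (T - P)

print(critical_delta(R=5, T=8, P=3))  # -> 0.6, i.e. delta >= 3/5
```

For the payoff table above (R = 5, T = 8, P = 3) this reproduces the condition δ ≥ 3/5.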

3.4.2.1 ICE

Company A (row player) and Company B (column player) make pricing decisions for
their products. Below is the normal-form representation of the game.

Company B
High (H) Low (L)
Company A High (H) (10, 10) (2, 16)
Low (L) (16, 2) (6, 6)

(a) If the game is played only once, what is the pure Nash equilibrium?
(b) If the game is repeated for exactly 200 rounds, can cooperation (H, H) be
maintained? Why?
(c) If the game is repeated infinitely, write down a trigger strategy where the outcome
of the game is (H, H).
(d) Let δ be the discount factor. If both players follow the trigger strategy, find
the condition on δ such that (H, H) can be supported as an SPNE of the infinitely
repeated game.

(a)
(L, L) is the pure Nash equilibrium.

(b)
No. In the last round, if, say, Company A plays H, Company B will play L to get a
higher payoff (16 > 10) because Company A cannot retaliate (the game is over!).
Then, Company A will play L instead in the second-to-last round. Each firm has an
incentive to deviate earlier. By backward induction, (H, H) is not part of an SPNE.

(c)
The trigger strategy will proceed as follows.
 Each firm starts the game by playing H in the first period.
 Play H in every period unless someone played L in the previous period.
 If someone played L in the previous period, play L forever.

(d)
If both companies follow the trigger strategy and play (H, H) forever, Company A
gets:

10 + 10δ + 10δ² + ⋯ = 10/(1 − δ)

If Company A deviates today and plays L instead, then it gets 16 in the current
period but 6 in every period thereafter. So, Company A's payoff becomes:

16 + 6δ + 6δ² + ⋯ = 16 + 6δ/(1 − δ) = (16 − 10δ)/(1 − δ)

The play of H can be supported for Company A if:

10/(1 − δ) ≥ (16 − 10δ)/(1 − δ)
10 ≥ 16 − 10δ
10δ ≥ 6
δ ≥ 3/5

Similarly, for Company B we can obtain the same condition.

