
Repeated games with incomplete information

Characterization and existence of Nash equilibrium payoffs in two-person undiscounted repeated games with a single informed player.

Re-interpretation of the results in (short-term) principal-agent problems and in one-shot games with long costless preplay communication ("cheap talk").

Extension of the results in two-person undiscounted and discounted games in which both players are privately informed but know their own payoff; relationship with reputation models.
Two-person one-shot Bayesian game with a single informed player: $B(p)$

$K$: finite set of types of player 1 (the informed player).

$p \in \Delta(K)$: probability distribution over $K$.

Player 1's type $k$ is chosen in $K$ according to $p$ at a virtual initial stage of the game; only player 1 is informed of $k$.

$A_i$: finite set of actions of player $i$, $i = 1, 2$; $|A_i| \ge 2$.

Player 1 and player 2 simultaneously choose an action in $A_1$ and $A_2$ respectively.

Their respective payoffs are $U^k(a)$ and $V^k(a)$ when player 1's type is $k$ and $a \in A = A_1 \times A_2$ is chosen.
Infinitely repeated game $B_\infty(p)$

Stage 0: player 1's type $k$ is chosen in $K$ according to $p$.

Stage $t = 1, 2, \ldots$: player 1 and player 2 simultaneously choose an action in $A_1$ and $A_2$ respectively; the actions are publicly revealed after every stage.

A strategy for player 1 in $B_\infty(p)$ is a sequence of mappings $\sigma = (\sigma_t)_{t \ge 1}$, $\sigma_t : K \times A^{t-1} \to \Delta(A_1)$.

A strategy for player 2 in $B_\infty(p)$ is a sequence of mappings $\tau = (\tau_t)_{t \ge 1}$, $\tau_t : A^{t-1} \to \Delta(A_2)$.
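Payoffs in the undiscounted game $B_\infty(p)$

Stage payoffs are evaluated by $U^k(\cdot)$ and $V^k(\cdot)$ and are only known to player 1.

Given $(a_t)_{t \ge 1} \in A^{\mathbb{N}}$, let us define, for player 1,
\[ U_T(k, (a_t)_{t \ge 1}) = \frac{1}{T} \sum_{t=1}^{T} U^k(a_t) \quad \text{for every } k \text{ and } T = 1, 2, \ldots \]
and define $V_T$ in the same way from $V^k$.

Player 1's interim expected payoff associated with a pair of strategies $(\sigma, \tau)$ is
\[ U_\infty^k(\sigma, \tau) = L\left[ E_{p,\sigma,\tau}\big( U_T(k, \tilde{a}) \mid k \big) \right], \quad \text{where } L \text{ is a Banach limit.} \]

Player 2's expected payoff associated with a pair of strategies $(\sigma, \tau)$ is
\[ V_\infty(\sigma, \tau) = L\left[ E_{p,\sigma,\tau}\big( V_T(k, \tilde{a}) \big) \right] = \sum_{k} p^k \, L\left[ E_{p,\sigma,\tau}\big( V_T(k, \tilde{a}) \mid k \big) \right]. \]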
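Feasible payoffs in $B_\infty(p)$

\[ F = \mathrm{co}\left\{ \big( (U^k(a))_{k \in K}, (V^k(a))_{k \in K} \big) : a \in A \right\} \subseteq \mathbb{R}^K \times \mathbb{R}^K \]

$(x, y) \in F$
$\iff \exists\, \pi \in \Delta(A)$: $x^k = \sum_a \pi(a) U^k(a)$, $y^k = \sum_a \pi(a) V^k(a)$, $k \in K$.

$\pi$ is a correlated strategy, corresponding to appropriate frequencies of actions.

$(u, \beta) = ((u^k)_{k \in K}, \beta) \in \mathbb{R}^K \times \mathbb{R}$ is an "enhanced" feasible payoff in $B_\infty(p)$
$\iff \exists\, (x, y) \in F$ s.t. $u \ge x$, $p \cdot u = p \cdot x$ and $\beta = p \cdot y$

(i.e., $u^k \ge x^k$ $\forall k$; $u^k = x^k$ $\forall k$: $p^k > 0$).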
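Useful mappings: values of nonrevealing game

Let, for $q \in \Delta(K)$,
\[ f_{NR}(q) = \min_{\tau_1 \in \Delta(A_2)} \max_{\sigma_1 \in \Delta(A_1)} \sum_{k \in K} q^k U^k(\sigma_1, \tau_1) = \max_{\sigma_1 \in \Delta(A_1)} \min_{\tau_1 \in \Delta(A_2)} \sum_{k \in K} q^k U^k(\sigma_1, \tau_1) \]
\[ g_{NR}(q) = \min_{\sigma_1 \in \Delta(A_1)} \max_{\tau_1 \in \Delta(A_2)} \sum_{k \in K} q^k V^k(\sigma_1, \tau_1) = \max_{\tau_1 \in \Delta(A_2)} \min_{\sigma_1 \in \Delta(A_1)} \sum_{k \in K} q^k V^k(\sigma_1, \tau_1) \]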
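Individual rationality (IR) in $B_\infty(p)$

(Blackwell 1956, Aumann and Maschler 1966)

$u = (u^k)_{k \in K} \in \mathbb{R}^K$ is IR for the (informed) player 1
$\iff$ player 2 has a strategy $\tau$ in $B_\infty(p)$ such that $\forall \sigma$, $\forall k \in K$: $U_\infty^k(\sigma, \tau) \le u^k$
$\iff \forall q \in \Delta(K)$: $q \cdot u \ge f_{NR}(q)$ $\iff \forall q \in \Delta(K)$: $q \cdot u \ge \mathrm{cav} f_{NR}(q)$.

$\beta \in \mathbb{R}$ is IR for the (uninformed) player 2
$\iff$ player 1 has a strategy $\sigma$ in $B_\infty(p)$ such that $\forall \tau$: $V_\infty(\sigma, \tau) \le \beta$
$\iff \beta \ge \mathrm{vex}\, g_{NR}(p)$,

where $\mathrm{cav} f_{NR}$ is the smallest concave function above $f_{NR}$ on $\Delta(K)$ and $\mathrm{vex}\, g_{NR}$ is the largest convex function below $g_{NR}$ on $\Delta(K)$.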
Example 1: zero-sum game

$K = \{1, 2\}$, $p = \Pr(k = 1) \in [0, 1]$, $A_1 = \{T, B\}$, $A_2 = \{L, R\}$,
\[ (U^1, V^1) = \begin{pmatrix} 1, -1 & 0, 0 \\ 0, 0 & 0, 0 \end{pmatrix} \qquad (U^2, V^2) = \begin{pmatrix} 0, 0 & 0, 0 \\ 0, 0 & 1, -1 \end{pmatrix} \]

Nonrevealing game:
\[ \begin{pmatrix} p, -p & 0, 0 \\ 0, 0 & (1-p), -(1-p) \end{pmatrix} \]

$f_{NR}(p) = \mathrm{cav} f_{NR}(p) = -g_{NR}(p) = -\mathrm{vex}\, g_{NR}(p) = p(1-p)$

$(1, 0)$, $(0, 1)$, $(\tfrac14, \tfrac14)$, ... are IR for player 1 in $B_\infty(p)$.

$\beta$ is IR for player 2 in $B_\infty(p)$ $\iff \beta \ge -p(1-p)$.

In the long run, player 1 cannot benefit from revealing information to player 2.

IR in the infinitely repeated game $B_\infty(p)$ $\ne$ IR in the one-shot game $B(p)$!

$(u^1, u^2)$ is IR for player 1 in $B(p)$
$\iff u^1 + u^2 \ge 1$
$\iff \forall q \in [0, 1]$: $q u^1 + (1-q) u^2 \ge f_1(q)$,
where $f_1(q) = \min\{q, 1-q\}$.
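The contrast between the two IR notions in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names (`value_2x2`, `f_nr`, `is_ir_repeated`, `is_ir_one_shot`) are hypothetical, and the IR conditions are only tested on a finite grid of priors, which is an approximation of the "for all $q$" statement.

```python
from fractions import Fraction

def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = m
    maxmin = max(min(a, b), min(c, d))   # best pure guarantee of the row player
    minmax = min(max(a, c), max(b, d))   # best pure guarantee of the column player
    if maxmin == minmax:                 # pure saddle point
        return maxmin
    return (a * d - b * c) / (a + d - b - c)  # completely mixed case

def f_nr(p):
    """Value of the nonrevealing game of Example 1: p*U^1 + (1-p)*U^2."""
    return value_2x2(((p, 0), (0, 1 - p)))

def grid(n=100):
    return [Fraction(i, n) for i in range(n + 1)]

def is_ir_repeated(u):
    """q.u >= f_NR(q), checked on a grid of q only."""
    return all(q * u[0] + (1 - q) * u[1] >= f_nr(q) for q in grid())

def is_ir_one_shot(u):
    """q.u >= f_1(q) = min(q, 1-q), checked on a grid of q only."""
    return all(q * u[0] + (1 - q) * u[1] >= min(q, 1 - q) for q in grid())

quarter = (Fraction(1, 4), Fraction(1, 4))
print(f_nr(Fraction(1, 2)), is_ir_repeated(quarter), is_ir_one_shot(quarter))
```

The vector $(\tfrac14, \tfrac14)$ passes the repeated-game test but fails the one-shot test, since $u^1 + u^2 < 1$.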


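Example 2: zero-sum game

$K = \{1, 2\}$, $p = \Pr(k = 1) \in [0, 1]$, $A_1 = \{T, B\}$, $A_2 = \{L, R\}$,
\[ (U^1, V^1) = \begin{pmatrix} 1, -1 & 2, -2 \\ 0, 0 & 0, 0 \end{pmatrix} \qquad (U^2, V^2) = \begin{pmatrix} 0, 0 & 0, 0 \\ 2, -2 & 1, -1 \end{pmatrix} \]

\[ f_{NR}(p) = -g_{NR}(p) = \begin{cases} 1 - p & \text{if } p \le \tfrac13 \\ 3p(1-p) & \text{if } \tfrac13 \le p \le \tfrac23 \\ p & \text{if } p \ge \tfrac23 \end{cases} \]

$\mathrm{cav} f_{NR}(p) = -\mathrm{vex}\, g_{NR}(p) = 1$ for every $p$.

$u = (u^1, u^2)$ is IR for player 1 in $B_\infty(p)$ $\iff u^1 \ge 1$, $u^2 \ge 1$.

$\beta$ is IR for player 2 in $B_\infty(p)$ $\iff \beta \ge -1$.

As in Example 1, IR in $B_\infty(p)$ $\ne$ IR in $B(p)$!

$(u^1, u^2)$ is IR for player 1 in $B(p)$
$\iff u^1 + u^2 \ge 3$
$\iff \forall q \in [0, 1]$: $q u^1 + (1-q) u^2 \ge f_1(q)$,
where $f_1(q) = \min\{q + 2(1-q),\ 2q + (1-q)\}$.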
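The claim that the concavification of $f_{NR}$ is identically 1 in Example 2 can be illustrated by computing an upper concave envelope on a grid. This is only a sketch under assumptions: `upper_hull` is a hypothetical helper (a standard monotone-chain upper hull), and the envelope is computed on 61 grid points rather than on all of $[0, 1]$.

```python
from fractions import Fraction as Fr

def f_nr(p):
    """Piecewise value of the nonrevealing game of Example 2."""
    if p <= Fr(1, 3):
        return 1 - p
    if p >= Fr(2, 3):
        return p
    return 3 * p * (1 - p)

def upper_hull(points):
    """Vertices of the upper concave envelope of points sorted by x."""
    hull = []
    for x, y in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # hull[-1] lies on or below the chord from hull[-2] to (x, y): drop it
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

grid = [Fr(i, 60) for i in range(61)]
vertices = upper_hull([(p, f_nr(p)) for p in grid])
print(vertices)  # only the two endpoints survive, both at height 1
```

Since only the endpoints $(0, 1)$ and $(1, 1)$ remain, the envelope on the grid is the constant 1, in line with $\mathrm{cav} f_{NR}(p) = 1$.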


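Uniform punishment strategy (UPS)

For every $k \in K$, let $\underline{u}^k$ be the individually rational level of player 1 of type $k$ when $k$ is complete information:
\[ \underline{u}^k = f_{NR}(\delta_k), \quad \text{where } \delta_k \in \Delta(K),\ \delta_k^k = 1,\ \delta_k^j = 0,\ j \ne k. \]

$\tau_1 \in \Delta(A_2)$ is a uniform punishment strategy of player 2 in $B(p)$
$\iff \forall k \in K$, $\forall \sigma_1 \in \Delta(A_1)$: $U^k(\sigma_1, \tau_1) \le \underline{u}^k$.

UPS means that player 2 can punish player 1 as if he knew player 1's type.

Under UPS, $\mathrm{cav} f_{NR}(q) = \sum_{k \in K} q^k \underline{u}^k$ $\forall q \in \Delta(K)$.

Hence $u = (u^k)_{k \in K}$ is IR for player 1 $\iff u^k \ge \underline{u}^k$ $\forall k \in K$.

UPS is not satisfied in Examples 1 and 2.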


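Nonrevealing equilibrium (NRE) payoffs in $B_\infty(p)$

Equilibrium $(\sigma, \tau) = ((\sigma_t)_{t \ge 1}, (\tau_t)_{t \ge 1})$, with $\sigma_t : K \times A^{t-1} \to \Delta(A_1)$, in which $\sigma_t(k, \cdot) = \sigma_t(k', \cdot)$ $\forall k, k' \in K$, on the equilibrium path.

$(u, \beta) = ((u^k)_{k \in K}, \beta) \in \mathbb{R}^K \times \mathbb{R}$ is a NRE payoff in $B_\infty(p)$ $\iff$
$(u, \beta)$ is "enhanced" feasible, $u$ is IR for player 1 and $\beta$ is IR for player 2.

NRE correspondence:
\[ E_0 = \{ (p, u, \beta) \in \Delta(K) \times \mathbb{R}^K \times \mathbb{R} : (u, \beta) \text{ is a NRE payoff in } B_\infty(p) \} \]
\[ E_0(p) = \{ (u, \beta) \in \mathbb{R}^K \times \mathbb{R} : (u, \beta) \text{ is a NRE payoff in } B_\infty(p) \} \]

$E_0(p)$ is convex $\forall p$, but there may exist $p$ such that $E_0(p) = \emptyset$.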
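In Example 1:
\[ (U^1, V^1) = \begin{pmatrix} 1, -1 & 0, 0 \\ 0, 0 & 0, 0 \end{pmatrix} \qquad (U^2, V^2) = \begin{pmatrix} 0, 0 & 0, 0 \\ 0, 0 & 1, -1 \end{pmatrix} \]
there is a NRE at every $p$, which is achieved by playing optimally (i.e., $(1-p, p)$) in the nonrevealing game at every stage.

In Example 2:
\[ (U^1, V^1) = \begin{pmatrix} 1, -1 & 2, -2 \\ 0, 0 & 0, 0 \end{pmatrix} \qquad (U^2, V^2) = \begin{pmatrix} 0, 0 & 0, 0 \\ 2, -2 & 1, -1 \end{pmatrix} \]
there is an obvious CRE (completely revealing equilibrium), but there is also a NRE at every $p$; the latter is achieved with
\[ \pi = \begin{pmatrix} 0 & \tfrac12 \\ \tfrac12 & 0 \end{pmatrix}. \]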
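Example 3: let us take other payoffs for player 1:
\[ U^1 = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \qquad U^2 = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} \]
and any payoffs for player 2.

$f_{NR}(p) = \max\{p, 1-p\}$; $\mathrm{cav} f_{NR}(p) = 1$.

$E_0(p) = \emptyset$ $\forall p \in (0, 1)$: there is no $\pi \in \Delta(A)$ such that $u^1 \ge 1$ and $u^2 \ge 1$.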

Signalling in $B_\infty(p)$

For $t = 1, 2, \ldots$, $(A_1)^t \supseteq S$ is a set of signals for the informed player 1.

$\sigma : K \to \Delta(S)$: signalling strategy for player 1;
$\sigma(s \mid k)$ := probability of sending $s$ given type $k$.

Assume that $\forall s \in S$, $\exists k$: $\sigma(s \mid k) > 0$;
$p \in \Delta(K)$ and $\sigma$ induce a posterior probability distribution $p_s = (p_s^k)_{k \in K}$ for every signal $s \in S$.
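The posteriors induced by a prior and a signalling strategy follow from Bayes' rule. The sketch below is illustrative only: the function `posteriors`, the signal names and the numerical values are assumptions, not taken from the slides.

```python
from fractions import Fraction as Fr

def posteriors(prior, sigma):
    """prior: {type: prob}; sigma: {type: {signal: prob}}.
    Returns (total, post) with total[s] = P(s) and post[s][k] = P(k | s)."""
    signals = sorted({s for k in sigma for s, w in sigma[k].items() if w > 0})
    total = {s: sum(prior[k] * sigma[k].get(s, 0) for k in prior) for s in signals}
    post = {s: {k: prior[k] * sigma[k].get(s, 0) / total[s] for k in prior}
            for s in signals if total[s] > 0}   # signals of probability 0 are skipped
    return total, post

# Hypothetical two-type, two-signal example
prior = {1: Fr(1, 2), 2: Fr(1, 2)}
sigma = {1: {"s": Fr(1, 4), "s'": Fr(3, 4)},
         2: {"s": Fr(3, 4), "s'": Fr(1, 4)}}
total, post = posteriors(prior, sigma)
print(total, post)
```

The posteriors average back to the prior, $p = \sum_s P(s)\, p_s$, which is the splitting property used in the joint plan construction.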

Joint plan equilibrium (Aumann, Maschler and Stearns 1968)

After signal $s$ has been sent, the players play a NRE of $B_\infty(p_s)$.
Incentive compatible (IC) joint plan

Player 1 sends his signal by himself; if he randomizes over signals $s, s' \in S$, he basically has to be indifferent between sending $s$ and sending $s'$.

Once player 1 has sent $s$, payoffs are determined by a NRE of $B_\infty(p_s)$, i.e., there exists $(x_s, y_s) \in F$ s.t. $u_s^k \ge x_s^k$ $\forall k$ and $u_s^k = x_s^k$ $\forall k$: $p_s^k > 0$.

$\sigma$ is incentive compatible $\iff$

$\forall k \in K$, $\forall s, s' \in S$ s.t. $\sigma(s \mid k) > 0$, $\sigma(s' \mid k) > 0$ $\Rightarrow$ $x_s^k = x_{s'}^k$, $u_s^k = u_{s'}^k$

$\forall k \in K$, $\forall s, s' \in S$ s.t. $\sigma(s \mid k) > 0$, $\sigma(s' \mid k) = 0$ $\Rightarrow$ $x_s^k \ge x_{s'}^k$, $u_s^k = u_{s'}^k$

Joint plan equilibrium payoff

$(u, \beta) = ((u^k)_{k \in K}, \beta) \in \mathbb{R}^K \times \mathbb{R}$ is a JPE payoff in $B_\infty(p)$ $\iff$
there exists $(p_s, u_s, \beta_s) \in E_0$, $s \in S$, s.t. $p \in \mathrm{conv}\{p_s, s \in S\}$ and $u_s = u$, i.e., $u_s^k = u^k$ for every $k$.

If $p = \sum_s \alpha_s p_s$, then $\beta = \sum_s \alpha_s \beta_s$.

The interim expected payoff $u^k$ of player 1 of type $k$ coincides with his posterior payoff $u_s^k$, given $s$. The vector payoff $u_s = u$ must be IR:
\[ \forall q \in \Delta(K): \quad q \cdot u \ge f_{NR}(q). \]

If UPS holds, IR means that $u^k \ge \underline{u}^k$ $\forall k \in K$.

Every nonrevealing equilibrium can obviously be viewed as a particular joint plan equilibrium.

The problem of existence of a Nash equilibrium in $B_\infty(p)$ remained open for a long time.

Theorem 1 (Sorin (1983) for $|K| = 2$; Simon, Spież and Toruńczyk (1995) in general)
For every $p \in \Delta(K)$, $B_\infty(p)$ has a JPE (joint plan equilibrium).

See also Renault (2000) and Simon (2002).


Extension: abstract Principal-Agent problem $D\{K, p, U, f, F\}$

Simon, Spież and Toruńczyk (2008)

$K$: set of agent's types, $p \in \Delta(K)$.

$X$: set of joint decisions.

$U : X \to \mathbb{R}^K$, $U(x) = (U^k(x))_{k \in K}$: agent's utility function.

$f : \Delta(K) \to \mathbb{R}$: mapping to describe the agent's IR payoffs.

$F : \Delta(K) \rightrightarrows X$: correspondence;
$F(p)$: set of joint decisions that are "acceptable" for the principal at $p$.

A solution for $D\{K, p, U, f, F\}$ consists of a set of signals $S$, a signalling strategy $\sigma : K \to \Delta(S)$ for the agent and a decision function $\xi : S \to X$, with $x_s = \xi(s)$, s.t.

- $\sigma$ is IC given $\xi$ ($\Rightarrow U^k(\xi(s)) \le u^k$)

- $\forall q \in \Delta(K)$: $q \cdot u \ge f(q)$ (e.g., $u^k \ge \underline{u}^k$ $\forall k \in K$)

- $\forall s \in S$: $x_s \in F(p_s)$

Assumption A: $\forall q, r \in \Delta(K)$ $\exists x \in F(r)$ such that $q \cdot U(x) \ge f(q)$.

Assumption A': $\forall k \in K$, $\forall r \in \Delta(K)$ $\exists x \in F(r)$ such that $U^k(x) \ge \underline{u}^k$.

Theorem 2 If $X$ is compact, convex, $f$ is lower semicontinuous, $F$ is non-empty, convex-valued, upper hemicontinuous and Assumption A holds, then for every prior $p \in \Delta(K)$, $D\{K, p, U, f, F\}$ has a solution.
Simple application of Theorem 2: cooperation in the one-shot game $B(p)$

Forges, Horst and Salomon (2014)

If the uninformed player has a uniform punishment strategy (UPS), then, for every $p \in \Delta(K)$, $B(p)$ has a feasible, posterior individually rational solution.

IR for player 1: $u^k \ge \underline{u}^k$ $\forall k \in K$.

Let $g_1(p)$ be the value of player 2's one-shot game;

IR for player 2: $\forall s \in S$: $\pi_s \in F(p_s) = \{ \pi \in \Delta(A) : \sum_k p_s^k V^k(\pi) \ge g_1(p_s) \}$.

The result does not necessarily hold without UPS (see Example 2).
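Back to $B_\infty(p)$: from Theorem 2 to Theorem 1

$X = \Delta(A)$

$f = f_{NR}$

$F(p) = \{ \pi \in \Delta(A) : \sum_k p^k V^k(\pi) \ge \mathrm{vex}\, g_{NR}(p) \}$

Assumption A: $\forall q, r \in \Delta(K)$, take $\pi(q, r) = \sigma_1(q) \otimes \tau_1(r)$, where

$\sigma_1(q)$ is optimal for player 1 in the NR game $q \cdot U$,
$\tau_1(r)$ is optimal for player 2 in the NR game $r \cdot V$.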
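Example 4: UPS holds, no NRE, no CRE at $p = \tfrac12$

$K = \{1, 2\}$. Player 1's payoffs are:
\[ U^1(\cdot) = \begin{array}{c|ccccc} & j_1 & j_2 & j_0 & j_3 & j_4 \\ \hline T & 4 & 3 & 4 & 3 & 4 \\ B & 0 & 3 & 0 & 3 & 8 \end{array} \qquad U^2(\cdot) = \begin{array}{c|ccccc} & j_1 & j_2 & j_0 & j_3 & j_4 \\ \hline T & 8 & 3 & 0 & 3 & 0 \\ B & 4 & 3 & 4 & 3 & 4 \end{array} \]

Player 2's payoffs do not depend on player 1's action:
\[ V^1(\cdot) = \begin{array}{c|ccccc} & j_1 & j_2 & j_0 & j_3 & j_4 \\ \hline T & 10 & 9 & 7 & 4 & 0 \\ B & 10 & 9 & 7 & 4 & 0 \end{array} \qquad V^2(\cdot) = \begin{array}{c|ccccc} & j_1 & j_2 & j_0 & j_3 & j_4 \\ \hline T & 0 & 4 & 7 & 9 & 10 \\ B & 0 & 4 & 7 & 9 & 10 \end{array} \]

$\underline{u}^1 = \underline{u}^2 = 3$; $j_2$ and $j_3$ are uniform punishment strategies of player 2.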
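The IR levels and the uniform punishment strategies of Example 4 can be recovered from the payoff tables. The sketch below is an illustration under assumptions: it only looks at pure actions of player 2 (mixed punishment strategies are not searched), and it relies on the fact that each type's matrix game happens to have a pure saddle point, so that the pure maxmin equals the value.

```python
COLS = ("j1", "j2", "j0", "j3", "j4")
# Player 1's payoffs in Example 4: U[k] = (row T, row B)
U = {
    1: ((4, 3, 4, 3, 4), (0, 3, 0, 3, 8)),
    2: ((8, 3, 0, 3, 0), (4, 3, 4, 3, 4)),
}

def pure_maxmin(m):
    """Best payoff player 1 can guarantee with a pure action."""
    return max(min(row) for row in m)

def pure_minmax(m):
    """Lowest payoff player 2 can hold player 1 to with a pure action."""
    return min(max(row[j] for row in m) for j in range(len(m[0])))

ir_level = {}
for k, m in U.items():
    # The two pure guarantees coincide, so this common number is the value
    assert pure_maxmin(m) == pure_minmax(m)
    ir_level[k] = pure_maxmin(m)

# Pure actions of player 2 that hold every type down to its IR level
ups = [COLS[j] for j in range(len(COLS))
       if all(max(row[j] for row in U[k]) <= ir_level[k] for k in U)]
print(ir_level, ups)
```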


The mapping gN R = g1 is convex.
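NRE at $p = \tfrac12$

A NRE would require $\pi \in \Delta(A)$ such that $U^1(\pi) \ge 3$, $U^2(\pi) \ge 3$ and $\tfrac12 V^1(\pi) + \tfrac12 V^2(\pi) \ge g_{NR}(\tfrac12) = 7$. The latter condition implies that player 2 must take the action $j_0$, namely that
\[ \pi = \begin{pmatrix} 0 & 0 & \lambda & 0 & 0 \\ 0 & 0 & 1 - \lambda & 0 & 0 \end{pmatrix} \quad \text{for some } 0 \le \lambda \le 1. \]

$U^1(\pi) \ge 3 \Rightarrow \lambda \ge \tfrac34$ and $U^2(\pi) \ge 3 \Rightarrow \lambda \le \tfrac14$: impossible.

There are NRE at $p = \tfrac14$ and $p = \tfrac34$.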
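The impossibility argument at $p = \tfrac12$ can be replayed with exact arithmetic. This is a sketch only; `column_payoffs` and `lam_interval` are hypothetical helper names, and $\lambda$ denotes the weight on $(T, j_0)$.

```python
from fractions import Fraction as Fr

COLS = ("j1", "j2", "j0", "j3", "j4")
V = {1: (10, 9, 7, 4, 0), 2: (0, 4, 7, 9, 10)}   # player 2's payoffs by column

def column_payoffs(p):
    """Player 2's expected payoff from each pure action at prior p = Pr(k = 1)."""
    return [p * V[1][j] + (1 - p) * V[2][j] for j in range(len(COLS))]

half = column_payoffs(Fr(1, 2))
g_half = max(half)
best = [COLS[j] for j, v in enumerate(half) if v == g_half]

def lam_interval(t_payoff, b_payoff, level):
    """Interval (low, high) of lam in [0, 1] with lam*t + (1-lam)*b >= level."""
    if t_payoff > b_payoff:
        return max(Fr(0), Fr(level - b_payoff, t_payoff - b_payoff)), Fr(1)
    if t_payoff < b_payoff:
        return Fr(0), min(Fr(1), Fr(b_payoff - level, b_payoff - t_payoff))
    return (Fr(0), Fr(1)) if t_payoff >= level else (Fr(1), Fr(0))

# Player 1's payoffs in column j0: type 1 gets (T: 4, B: 0), type 2 gets (T: 0, B: 4)
low1, high1 = lam_interval(4, 0, 3)
low2, high2 = lam_interval(0, 4, 3)
nre_exists = max(low1, low2) <= min(high1, high2)
print(g_half, best, (low1, high1), (low2, high2), nre_exists)
```

The two intervals $[\tfrac34, 1]$ and $[0, \tfrac14]$ do not intersect, so no nonrevealing equilibrium exists at this prior.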


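CRE at $p = \tfrac12$: let $\pi^k \in \Delta(A)$ be the frequency of moves when player 1 reports type $k$, $k = 1, 2$.

By IR for player 2, he chooses $j_1$ (resp., $j_4$) when he gets signal 1 (resp., 2):
\[ \pi^1 = \begin{pmatrix} \lambda & 0 & 0 & 0 & 0 \\ 1 - \lambda & 0 & 0 & 0 & 0 \end{pmatrix} \qquad \pi^2 = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 - \mu \\ 0 & 0 & 0 & 0 & \mu \end{pmatrix} \]
for some $\lambda, \mu \in [0, 1]$.

IR for player 1: $U^k(\pi^k) \ge 3$, $k = 1, 2$, i.e., $U^1(\pi^1) = 4\lambda$ with $\lambda \ge \tfrac34$ and $U^2(\pi^2) = 4\mu$ with $\mu \ge \tfrac34$.

IC for player 1: $U^1(\pi^1) \ge U^1(\pi^2)$ and $U^2(\pi^2) \ge U^2(\pi^1)$ $\iff$ $\lambda \ge 1 + \mu$ and $\mu \ge 1 + \lambda$, impossible.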
There exists a partially revealing solution at $p = \tfrac12$

At the first stage $t = 1$, player 1 plays $T$ or $B$ with type-dependent probability, so as to reach the posteriors $p_T = \tfrac14$ and $p_B = \tfrac34$. From then on, a NRE is played.

In order to satisfy player 2's IR condition, $\pi_T$ must give probability 1 to $j_3$ and $\pi_B$ must give probability 1 to $j_2$.

IC is trivially satisfied: player 1 gets 3 no matter what.

From $p = \tfrac12$, the NRE at $p = \tfrac14$ and $p = \tfrac34$ are reached thanks to signalling.

The expected payoffs are $(3, 3)$ for player 1 and $\tfrac{31}{4}$ for player 2.
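The payoffs of the partially revealing solution can be verified from the tables of Example 4. The following sketch assumes that each posterior is reached with probability $\tfrac12$ (which is what the splitting $\tfrac12 = \tfrac12 \cdot \tfrac14 + \tfrac12 \cdot \tfrac34$ requires); the helper names are hypothetical.

```python
from fractions import Fraction as Fr

COLS = ("j1", "j2", "j0", "j3", "j4")
V = {1: (10, 9, 7, 4, 0), 2: (0, 4, 7, 9, 10)}          # player 2's payoffs
U = {1: ((4, 3, 4, 3, 4), (0, 3, 0, 3, 8)),             # player 1's payoffs,
     2: ((8, 3, 0, 3, 0), (4, 3, 4, 3, 4))}             # rows (T, B)

def best_reply(p):
    """Player 2's best pure action and its payoff at posterior p = Pr(k = 1)."""
    values = [p * V[1][j] + (1 - p) * V[2][j] for j in range(len(COLS))]
    j = max(range(len(COLS)), key=values.__getitem__)
    return COLS[j], values[j]

def player1_payoffs(col):
    """Set of payoffs each type can get against the pure action col."""
    j = COLS.index(col)
    return {k: {U[k][0][j], U[k][1][j]} for k in U}

reply_T = best_reply(Fr(1, 4))   # after signal T
reply_B = best_reply(Fr(3, 4))   # after signal B
beta = Fr(1, 2) * reply_T[1] + Fr(1, 2) * reply_B[1]
print(reply_T, reply_B, beta, player1_payoffs("j3"), player1_payoffs("j2"))
```

Against $j_2$ or $j_3$ every type of player 1 gets exactly 3 whatever he plays, which is why incentive compatibility holds trivially.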
Gradual construction of equilibrium payoffs

Start with $E_0 = \{ (p, u, \beta) : (u, \beta) \text{ is a NRE payoff in } B_\infty(p) \}$;

$E_0(p)$ is convex, i.e., $E_0$ is convex in $(u, \beta)$ for every $p$.

$E_1 = \{ (p, u, \beta) : (u, \beta) \text{ is achieved with 1 signalling phase in } B_\infty(p) \}$
$\phantom{E_1} = \{ (p, u, \beta) : (u, \beta) \text{ is a convex combination of JPE payoffs in } B_\infty(p) \}$

$E_0 \subseteq E_1$.

$E_1$ is constructed by first convexifying $E_0$ in $(p, \beta)$ for every $u$ (signalling) and then convexifying the new set in $(u, \beta)$ for every $p$ (compromising).
Compromising in $B_\infty(p)$

Let $(u, \beta)$ and $(u', \beta')$ be equilibrium (e.g., NRE or JPE) payoffs in $B_\infty(p)$. By performing a jointly controlled lottery (Aumann, Maschler and Stearns 1968), the players can achieve any convex combination $\lambda (u, \beta) + (1 - \lambda)(u', \beta')$ as an equilibrium payoff in $B_\infty(p)$.

Let $\{T, B\} \subseteq A_1$, $\{L, R\} \subseteq A_2$; take $\lambda = \tfrac12$.

Player 1 plays $(\tfrac12 T, \tfrac12 B)$; player 2 plays $(\tfrac12 L, \tfrac12 R)$.

They go on as a function of the pair of moves:
\[ \begin{array}{c|cc} & L & R \\ \hline T & (u, \beta) & (u', \beta') \\ B & (u', \beta') & (u, \beta) \end{array} \]

Unilateral deviations have no effect.
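The robustness of the jointly controlled lottery to unilateral deviations can be seen from a one-line computation. In this sketch, `prob_first_outcome` is a hypothetical helper giving the probability of reaching a cell that leads to $(u, \beta)$, i.e. $(T, L)$ or $(B, R)$.

```python
from fractions import Fraction as Fr

def prob_first_outcome(x, y):
    """Probability of (T, L) or (B, R) when player 1 plays T with
    probability x and player 2 plays L with probability y."""
    return x * y + (1 - x) * (1 - y)

half = Fr(1, 2)
grid = [Fr(i, 10) for i in range(11)]
# Player 1 deviates to any x while player 2 sticks to 1/2, and vice versa
deviations_1 = {prob_first_outcome(x, half) for x in grid}
deviations_2 = {prob_first_outcome(half, y) for y in grid}
print(deviations_1, deviations_2)
```

As long as one player mixes uniformly, the lottery selects $(u, \beta)$ with probability $\tfrac12$ whatever the other player does.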


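More equilibrium payoffs (Hart 1985)

A process $(e_t)_{t \ge 0} = (p_t, u_t, \beta_t)_{t \ge 0} \in \Delta(K) \times \mathbb{R}^K \times \mathbb{R}$ is a di-martingale if it is a martingale (i.e., $E(e_{t+1} \mid H_t) = e_t$) such that:

for every $t$, either $p_{t+1} = p_t$ or $u_{t+1} = u_t$.

Define, from $E_0 = \{ (p, u, \beta) : (u, \beta) \text{ is a NRE payoff in } B_\infty(p) \}$,

$E_0^* = \{ e = (p, u, \beta) : \exists \text{ a di-martingale } (e_t) \text{ such that } e_0 = e \text{ and } e_\infty \in E_0 \}$

namely, $(e_t)_{t \ge 0} = (p_t, u_t, \beta_t)_{t \ge 0}$ starts at $(p_0, u_0, \beta_0) = (p, u, \beta)$ and its a.s. limit $(p_\infty, u_\infty, \beta_\infty)$ corresponds to a NRE.

Theorem 3 $(u, \beta)$ is a Nash equilibrium payoff in $B_\infty(p)$ $\iff$ $(p, u, \beta) \in E_0^*$.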
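For a finite process described by a tree, the di-martingale conditions can be checked mechanically. The sketch below rests on several assumptions of mine: the tree encoding, the name `is_dimartingale`, a scalar $p$ (two types), and the reading that at each node either $p$ or $u$ is kept along all branches. The numbers reuse the one-step split of Example 4.

```python
from fractions import Fraction as Fr

def is_dimartingale(node):
    """node = (e, children) with e = (p, u, beta), u a tuple, and
    children = [(prob, child_node), ...] (empty list at a leaf)."""
    (p, u, beta), children = node
    if not children:
        return True
    if sum(w for w, _ in children) != 1:
        return False
    # Martingale property, coordinate by coordinate
    mean_p = sum(w * c[0][0] for w, c in children)
    mean_u = tuple(sum(w * c[0][1][i] for w, c in children) for i in range(len(u)))
    mean_b = sum(w * c[0][2] for w, c in children)
    if (mean_p, mean_u, mean_b) != (p, tuple(u), beta):
        return False
    # Di-martingale property: p is kept (compromising) or u is kept (signalling)
    keeps_p = all(c[0][0] == p for _, c in children)
    keeps_u = all(tuple(c[0][1]) == tuple(u) for _, c in children)
    if not (keeps_p or keeps_u):
        return False
    return all(is_dimartingale(c) for _, c in children)

b = Fr(31, 4)
signalling = ((Fr(1, 2), (3, 3), b),
              [(Fr(1, 2), ((Fr(1, 4), (3, 3), b), [])),
               (Fr(1, 2), ((Fr(3, 4), (3, 3), b), []))])
both_move = ((Fr(1, 2), (3, 3), b),
             [(Fr(1, 2), ((Fr(1, 4), (2, 3), b), [])),
              (Fr(1, 2), ((Fr(3, 4), (4, 3), b), []))])
print(is_dimartingale(signalling), is_dimartingale(both_move))
```

The second tree is a martingale but moves $p$ and $u$ at the same step, so it is rejected.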


Theorem 3 is not artificial:

Let $E_2(p) = \{ (u, \beta) : (u, \beta) \text{ is achieved with 2 signalling phases in } B_\infty(p) \}$.

Aumann, Maschler and Stearns (1968) show that there are games in which, at some $p$'s, $E_1(p) \subsetneq E_2(p)$.

It may even happen that Nash equilibrium payoffs of $B_\infty(p)$ (thus in $E_0^*(p)$) cannot be achieved by a bounded di-martingale, namely require an unbounded number of signalling phases (Forges 1984).
Long cheap talk (Forges 1990, Aumann and Hart 2003)

Stage 0: player 1's type $k$ is chosen in $K$ according to $p$.

Stage $t = 1, 2, \ldots$: player 1 and player 2 exchange messages chosen in sets $M_1$ and $M_2$ respectively.

After the communication phase, player 1 and player 2 simultaneously choose an action in $A_1$ and $A_2$ respectively.

Their respective payoffs are $U^k(a)$ and $V^k(a)$ when player 1's type is $k$ and $a \in A = A_1 \times A_2$ is chosen. In particular, payoffs do not depend on messages.

$C_\infty(p)$: game with (possibly infinite) cheap talk.

Difference with $B_\infty(p)$: $\forall p$, $C_\infty(p)$ has a NRE ("babbling equilibrium").

$N_{sil}(p) = \{ (u, \beta) \in \mathbb{R}^K \times \mathbb{R} : (u, \beta) \text{ is a NRE payoff in } C_\infty(p) \}$
$\phantom{N_{sil}(p)} = \{ (u, \beta) \in \mathbb{R}^K \times \mathbb{R} : (u, \beta) \text{ is a NE payoff of the "silent game"} \}$

"Enhanced" correspondence $N_{sil}^+(p)$:

$(u, \beta) \in N_{sil}^+(p)$ $\iff$ $\exists\, (\hat{u}, \beta) \in N_{sil}(p)$ s.t. $u^k \ge \hat{u}^k$ $\forall k$, $u^k = \hat{u}^k$ if $p^k > 0$.

Graph of the correspondence: $N_0 =_{def} \{ (p, u, \beta) : (u, \beta) \in N_{sil}^+(p) \}$.

Characterization of cheap talk equilibrium payoffs:

Theorem 4 $(u, \beta)$ is a Nash equilibrium payoff in $C_\infty(p)$ $\iff$ $(p, u, \beta) \in N_0^*$.


Lack of information on both sides and known own payoffs (Koren 1992)

$K$, $L$: finite sets, $p \in \Delta(K)$, $q \in \Delta(L)$.

Undiscounted infinitely repeated game $B_\infty(p, q)$:

Stage 0: player 1's type $k$ is chosen in $K$ according to $p$ and, independently, player 2's type $\ell$ is chosen in $L$ according to $q$. Only player 1 (resp., 2) is informed of $k$ (resp., $\ell$).

Stage $t = 1, 2, \ldots$: player 1 and player 2 simultaneously choose an action in $A_1$ ($|A_1| \ge |K|$) and $A_2$ ($|A_2| \ge |L|$) respectively. If the pair of actions $a \in A = A_1 \times A_2$ is chosen, they get the respective payoffs $U^k(a)$ and $V^\ell(a)$.

Each player thus knows his own stage payoff.

Tractable characterization:

Theorem 5 All Nash equilibrium payoffs of $B_\infty(p, q)$ are payoff-equivalent to CRE (completely revealing equilibrium) payoffs.

BUT it may happen that $B_\infty(p, q)$ has no Nash equilibrium payoff at some $(p, q)$'s.

For every discount factor $\delta \in (0, 1)$ and every sequence of actions $(a_t)_{t \ge 1}$, player 1's discounted payoff is $(1 - \delta) \sum_{t=1}^{\infty} \delta^{t-1} U^k(a_t)$, $k \in K$ (resp., player 2's discounted payoff is $(1 - \delta) \sum_{t=1}^{\infty} \delta^{t-1} V^\ell(a_t)$, $\ell \in L$).
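The normalized discounted sum can be illustrated on a finite prefix of stage payoffs. This is a minimal sketch: `discounted_payoff` is a hypothetical helper, and truncating the infinite sum to a finite list is an assumption made for the example.

```python
from fractions import Fraction as Fr

def discounted_payoff(stage_payoffs, delta):
    """(1 - delta) * sum over t >= 1 of delta**(t-1) * payoff_t,
    restricted to the finite list of stage payoffs given."""
    # enumerate starts at 0, which matches the exponent t - 1 for t = 1, 2, ...
    return (1 - delta) * sum(delta ** t * x for t, x in enumerate(stage_payoffs))

delta = Fr(1, 2)
three_ones = discounted_payoff([1, 1, 1], delta)
print(three_ones)  # equals 1 - delta**3
```

The factor $(1 - \delta)$ normalizes the sum so that a constant stage payoff $c$ played forever yields exactly $c$.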

Let $B_\delta(p, q)$ be the discounted infinitely repeated game and let $N[B_\delta(p, q)]$ be the set of Nash equilibrium payoffs of $B_\delta(p, q)$. $\forall \delta, p, q$: $N[B_\delta(p, q)] \ne \emptyset$.

Relevant question: $\lim_{\delta \to 1} N[B_\delta(p, q)] \ne \emptyset$ ???

In the zero-sum case (with unknown own payoffs!), Mertens and Zamir (1971) show that $\lim_{\delta \to 1} v_\delta$ always exists ($= v_\infty$ when $v_\infty$ exists).

Hörner and Lovo (2009) characterize the subset $BF[B_\delta(\cdot)] \subseteq N[B_\delta(p, q)]$ of belief-free equilibrium payoffs in $B_\delta(\cdot)$.
$BF[B_\delta(\cdot)] \subseteq \bigcap_{p, q} N[B_\infty(p, q)]$. Hence, $BF[B_\delta(\cdot)]$ may be empty.

Peski (2014) introduces the set $FR[B_\delta(p, q)]$ of finitely revealing equilibrium payoffs in $B_\delta(p, q)$ and proves the following

Theorem 6 If there exists an open set of belief-free equilibrium payoffs, then for every $p, q$, $\lim_{\delta \to 1} N[B_\delta(p, q)] = \lim_{\delta \to 1} FR[B_\delta(p, q)]$. The latter set can be constructed gradually, but the construction is much more intricate than in Hart (1985).
Salomon and Forges (2015): class of public good games that are related to two-sided reputation models.

UPS is satisfied.

$N[B_\infty(p, q)] = \emptyset$ for a significant set of $(p, q)$'s $\Rightarrow$ $BF[B_\delta(\cdot)] = \emptyset$.

$\liminf_{\delta \to 1} N[B_\delta(p, q)] \ne \emptyset$.

Payoffs in $\liminf_{\delta \to 1} N[B_\delta(p, q)]$ are explicitly constructed as limits of equilibrium payoffs of $B_\delta(p, q)$, in which players behave as in a war of attrition.
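$A_i = \{c : \text{"contribute"},\ d : \text{"do not contribute"}\}$, $i = 1, 2$. The public good is produced as soon as one of the players contributes.

Each player can be "normal" ($n$) or "greedy" ($g$).

The value of the public good is normalized to 1 for both players. The value of the initial endowment for the normal type $n$ is $\omega$, $0 < \omega < 1$, while it is $z > 2$ for the greedy type $g$.

\[ \begin{array}{cc|cc|cc} & & \multicolumn{2}{c|}{\ell = n} & \multicolumn{2}{c}{\ell = g} \\ & & c & d & c & d \\ \hline k = n & c & 1, 1 & 1, 1 + \omega & 1, 1 & 1, 1 + z \\ & d & 1 + \omega, 1 & \omega, \omega & 1 + \omega, 1 & \omega, z \\ \hline k = g & c & 1, 1 & 1, 1 + \omega & 1, 1 & 1, 1 + z \\ & d & 1 + z, 1 & z, \omega & 1 + z, 1 & z, z \end{array} \]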
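The payoff table can be generated from a simple rule: each player gets 1 if the public good is produced, plus the value of his endowment if he did not contribute. This rule is my reading of the table, not a formula stated in the slides, and the function name and numerical values of $\omega$ and $z$ below are illustrative assumptions.

```python
from fractions import Fraction as Fr

def stage_payoffs(a1, a2, endow1, endow2):
    """Stage payoffs of the contribution game.
    a1, a2 in {"c", "d"}; endow_i is the value player i attaches to his
    endowment (omega for the normal type, z for the greedy type)."""
    good = 1 if "c" in (a1, a2) else 0      # produced as soon as someone contributes
    keep1 = endow1 if a1 == "d" else 0      # a contributor gives up his endowment
    keep2 = endow2 if a2 == "d" else 0
    return good + keep1, good + keep2

omega, z = Fr(1, 2), 3                      # illustrative values, 0 < omega < 1 < z
value = {"n": omega, "g": z}
table = {(k, l, a1, a2): stage_payoffs(a1, a2, value[k], value[l])
         for k in "ng" for l in "ng" for a1 in "cd" for a2 in "cd"}
print(table[("n", "g", "d", "d")], table[("g", "n", "d", "c")])
```

Since $\omega < 1$, a normal type prefers contributing alone to no contribution at all, whereas a greedy type ($z > 1$) never does, which is what drives the war-of-attrition behaviour mentioned above.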
Main references can be found in:

Forges, F. (1992), "Repeated games of incomplete information: non-zero-sum", in: R. Aumann and S. Hart, Handbook of Game Theory with Economic Applications, Elsevier Science Publishers, North-Holland, chapter 6, 155-177.

Forges, F. (1994), "Non-zero-sum repeated games and information transmission", in: N. Megiddo, Essays in Game Theory in Honor of Michael Maschler, Springer-Verlag, chapter 6, 65-95.

More recent references:

Simon, R., S. Spież and H. Toruńczyk (1995), "The existence of equilibria in certain games, separation for families of convex functions and a theorem of Borsuk-Ulam type", Israel Journal of Mathematics 92, 1-21.

Aumann, R. and S. Hart (2003), "Long cheap talk", Econometrica 71, 1619-1660.

Peski, M. (2008), "Repeated games with incomplete information on one side", Theoretical Economics 3, 29-84.

Simon, R., S. Spież and H. Toruńczyk (2008), "Equilibria in a class of games and topological results implying their existence", RACSAM 102, 161-179.

Peski, M. (2014), "Repeated games with incomplete information and discounting", Theoretical Economics 9, 651-694.

Salomon, A. and F. Forges (2015), "Bayesian repeated games and reputation", mimeo, Université Paris-Dauphine.
