
SCIS-ISIS 2012, Kobe, Japan, November 20-24, 2012

Behavioral Analysis in the Agent-Based Simulation of Centipede Games

Tomohiro Hayashida, Ichiro Nishizaki, Yuya Sugeo
Faculty of Engineering, Hiroshima University, Hiroshima, JAPAN
Email: hayashida@hiroshima-u.ac.jp, nisizaki@hiroshima-u.ac.jp, m103015@hiroshima-u.ac.jp

Abstract—The aim of this paper is to explain the behavior of human subjects in several laboratory experiments on the centipede game, behavior that differs from the equilibrium solution. A centipede game is a dynamic game in which multiple players make decisions in a prearranged order. Subgame perfect equilibrium (SPE) is a solution concept based on the concept of Nash equilibrium, and it is well known to predict how players play in most dynamic games, with the exception of a few kinds of games. Several experimental results on centipede games that deviate from the equilibrium have been reported. Contrary to the theoretical assumption that all players are rational and can discriminate between slight differences in payoffs, humans are thought to make decisions through a trial-and-error process. This paper conducts a simulation of the centipede game using artificial adaptive agents, and shows behavioral features of the human subjects.

I. INTRODUCTION
A sequential game in which all players make decisions in a prearranged order is called a dynamic game or an extensive form game. Subgame perfect equilibrium is a well-known solution concept that predicts a strategy set of a dynamic game. However, experimental results of dynamic games in which the human subjects deviate from the equilibrium have been reported, e.g., for centipede games [14], [11] and ultimatum bargaining games [3]. McKelvey and Palfrey [9], Fey et al. [4], and Nagel and Tang [12] conducted laboratory experiments on two-person centipede games and reported that the subgame perfect equilibrium is observed in some of the experiments. Rapoport et al. [14] and Murphy et al. [11] conducted three-person games; in their experiments, the subgame perfect equilibrium was rarely observed.
To explain such human behavior, McKelvey and Palfrey [9] and Zauner [17] proposed static stochastic choice models that include error in the decision making of human subjects, based on the Harsanyi model [5]. These choice models successfully explain the average results of the laboratory experiments; however, they do not describe dynamic change. Rapoport et al. [14] proposed a dynamic stochastic choice model, but this model is not necessarily suitable as a choice model of the human subjects, because it does not take into account the amounts of the payoffs that the subjects obtain in a game.
This paper provides a simulation analysis of centipede games. We focus on the assumption of the theoretical models corresponding to equilibrium, namely that each player is rational and the decision mechanisms of all players are symmetric.

978-1-4673-2743-5/12/$31.00 2012 IEEE

We consider that


a human makes decisions based on his experiences and learns through a trial-and-error process. In this paper, we employ an agent-based simulation system, proposed by Nishizaki [13], in which an agent makes decisions through a neural network (NN) and the agents evolve their NNs by a genetic algorithm (GA) based on their experiences.
Holland and Miller [7] showed that many systems can be interpreted as complex adaptive systems, and demonstrated the efficiency of simulation systems using artificial adaptive agents for engineering modeling. Here, we introduce several studies that use adaptive agents. Andreoni and Miller [1] applied a GA to construct a decision-making model in auctions, and reported that artificial adaptive agents whose decision-making mechanism is the same as that of human subjects can be constructed based on a GA. Axelrod [2] investigated whether an effective strategy for the prisoner's dilemma game can be obtained by artificial agents based on a GA. Leshno et al. [8] dealt with market entry games using artificial agents that make decisions by NN, and reported that the average behavioral features of the artificial adaptive agents are similar to those of the human subjects in the laboratory experiments conducted by Sundali et al. [16]. Hayashida et al. [6] constructed a simulation system for social networks using NNs and a GA, based on the simulation system proposed by Nishizaki [13]. Additionally, a number of studies show the effectiveness of adaptive models in game-theoretic situations, decision problems, and so forth.
We briefly introduce centipede games in Section II. We construct a simulation model using artificial adaptive agents in Section III, describe the simulation results in Section IV, and conclude the paper in Section V.
II. CENTIPEDE GAMES
Rosenthal [15] proposed the centipede game, in which all players can obtain larger payoffs than in the case where the subgame perfect equilibrium is achieved, i.e., the payoffs at the subgame perfect equilibrium are not Pareto optimal. Subgame perfect equilibrium is a strong solution concept for dynamic games. In centipede games, the players make decisions in a prearranged order, and the strategy of each player in his turn is take or pass. The game is represented by a tree as shown in


Fig. 1. In the game tree, each turn of a player is represented as a node. If a player chooses take, the game finishes and all players obtain the payoffs corresponding to the finishing node. If a player chooses pass, the next player in the decision sequence makes a decision. Fig. 1 represents the game tree of a centipede game with three players. The circles represent nodes, the numbers written in the circles are node numbers, and the set of three numbers written under each node shows the payoffs of players 1–3, from top to bottom. The numbers written above the nodes represent the player (e.g., p1 indicates player 1) who makes a decision at that node.
[Fig. 1 (game tree). Players p1, p2, p3 move in turn at nodes 1–9. Payoffs (player 1, player 2, player 3) when take is chosen at each node:
node 1: (10, 1, 1); node 2: (2, 20, 2); node 3: (4, 4, 40); node 4: (80, 8, 8); node 5: (16, 160, 16); node 6: (32, 32, 320); node 7: (640, 64, 64); node 8: (128, 1280, 128); node 9: (256, 256, 2560); all players pass: (0, 0, 0).]

Rapoport et al. [14] conducted laboratory experiments on a three-person centipede game. 60 subjects were randomly divided into four sessions of 15 subjects each, so that a session consists of 5 groups for the games. The number of trials in each session is 60, and the groups are reorganized at the beginning of each trial. As a result of the experiments, the fractions of games in which take is chosen at node 1, 2, or 3 are high. In 3 of the 4 sessions, the fraction of games ending in the subgame perfect equilibrium was 42.2%; in this paper, we call this result result R-I-A and these sessions sessions R-I-A. In the remaining session, the fraction of subgame perfect equilibrium outcomes is lower than in result R-I-A, 30.3%, and several games finish at nodes later than node 3; we call this result result R-I-B and this session session R-I-B. Figs. 2 and 3 show the average results of R-I-A and R-I-B, respectively. The horizontal axes indicate the term, and the vertical axes indicate the 10-term moving averages of the fractions of games finishing at nodes 1–3.
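The 10-term moving averages plotted in Figs. 2–4 can be reproduced with a simple trailing window. The helper below is an illustrative sketch; the paper does not say how the first few terms are smoothed, so a growing window is assumed there.

```python
def moving_average(xs, window=10):
    """Trailing moving average used for the figures: the value at term t
    averages the most recent `window` observations available so far."""
    out = []
    for t in range(len(xs)):
        lo = max(0, t - window + 1)
        out.append(sum(xs[lo:t + 1]) / (t + 1 - lo))
    return out

# Example: indicator series "game finished at node 1" over 12 terms.
finished_at_1 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1]
smoothed = moving_average(finished_at_1, window=10)
```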

A part of a dynamic game beginning from a node is called a subgame of the game, and a set of strategies is called a subgame perfect equilibrium if all players choose best-response strategies in every subgame of the game. A subgame perfect equilibrium is derived by backward induction. For example, the subgame perfect equilibrium of the centipede game represented in Fig. 1 is derived as follows. At node 9, the best response of player 3 is to choose take because 2560 > 0, so the payoffs of players 1–3 are (256, 256, 2560). The best response of player 2 at node 8 is take because 1280 > 256. Continuing in this way, the subgame perfect equilibrium of the game is that player 1 chooses take at node 1, and players 1–3 obtain the payoffs (10, 1, 1). The players can obtain larger payoffs at later nodes. We could argue that coordination between the players succeeds in a centipede game if the finishing node is late, because the aim of all players is to maximize their payoffs. Conversely, in a three-person centipede game, finishing at node 1, 2, or 3 means that no player makes a decision more than once, so the players fail to coordinate. In this paper, we say that the players succeed in coordinating if each player makes a decision at least once, i.e., the finishing node of a game is one of 4, 5, . . . , 9. In a number of experiments, human subjects not only deviate from the subgame perfect equilibrium but also succeed in coordinating in this sense.
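The backward-induction computation above can be carried out mechanically. The sketch below, with the illustrative name `solve_spe`, walks the tree of Fig. 1 from node 9 back to node 1 using the payoff vectors of the figure.

```python
# Backward induction for the three-person centipede game of Fig. 1.
# take_payoffs[k] is the payoff vector (player 1, 2, 3) if "take" is
# chosen at node k+1; pass_payoffs is the outcome if every node is passed.
take_payoffs = [
    (10, 1, 1), (2, 20, 2), (4, 4, 40),
    (80, 8, 8), (16, 160, 16), (32, 32, 320),
    (640, 64, 64), (128, 1280, 128), (256, 256, 2560),
]
pass_payoffs = (0, 0, 0)

def solve_spe(take_payoffs, pass_payoffs):
    """Return the first node at which 'take' is a best response along
    the backward-induction path, and the resulting payoff vector."""
    outcome = pass_payoffs
    first_take = None
    # Walk the tree from the last decision node back to the first.
    for node in range(len(take_payoffs) - 1, -1, -1):
        mover = node % 3           # players 1, 2, 3 move cyclically
        if take_payoffs[node][mover] > outcome[mover]:
            outcome = take_payoffs[node]
            first_take = node + 1  # 1-indexed node number
    return first_take, outcome

node, payoffs = solve_spe(take_payoffs, pass_payoffs)
# SPE: player 1 takes at node 1, payoffs (10, 1, 1)
```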
McKelvey and Palfrey [9] conducted laboratory experiments in which the number of players in a group is 2. Rapoport et al. [14] and Murphy et al. [11] conducted other kinds of experiments in which the number of players in a group is 2 or 3. We focus on the two kinds of laboratory experiments [11], [14] in which the human subjects deviate from the equilibrium and succeed in coordinating.


Fig. 1. Game tree: a centipede game (three players)

[Fig. 2. Result R-I-A: experiment R-I by Rapoport et al. [14]; 10-term moving averages of the fractions of games finishing at nodes 1, 2, and 3 over 50 terms]

A. Experiments by Rapoport et al. [14]

[Fig. 3. Result R-I-B: experiment R-I by Rapoport et al. [14]; 10-term moving averages of the fractions of games finishing at nodes 1, 2, and 3 over 50 terms]

10 games finish at node 9 in session R-I-B; in contrast, only 0–2 games finish there in the sessions of R-I-A. In session R-I-B, 5 particular subjects choose take at node 9. When 3 of these 5 subjects belong to the same group, the fraction of games finishing at node 9 is very high. From this result, there exist several subjects with a predilection to coordinate, and they actively choose pass in the games. Owing to the presence of such subjects, the fraction of games finishing at node 1, 2, or 3 in result R-I-B does not rise above that of result R-I-A, even as the games are repeated.
Rapoport et al. conducted additional experiments under the condition that the monetary payments to the subjects are 1/100 of those in the experiment above. For simple description, in this paper, we write the previously mentioned



experiment as experiment R-I and the experiment with 1/100 monetary payments as experiment R-II; all experimental conditions other than the monetary payment are the same. As a result, the fraction of games in which the subgame perfect equilibrium is observed is only 2.6%, and this fraction did not increase even as the games were played repeatedly. The fractions of games finishing at nodes 1, 2, . . . , 9 are 2.6%, 3.4%, 9.8%, 13.3%, 22.6%, 22.8%, 16.5%, 5.7%, and 3.0%, respectively. From this result, it can be considered that a coordination relationship is built between the subjects in experiment R-II. The fractions of games finishing at nodes 1, 2, and 3 are shown in Fig. 4; the horizontal axis indicates the term, and the vertical axis indicates the 10-term moving averages of the fractions of games finishing at nodes 1–3, as in Figs. 2 and 3.

[Fig. 5. Result: experiment by Murphy et al. [11]; 10-term moving averages of the median finishing node over 80 terms for conditions C6, NC6, NC3, and Base]

[Fig. 4. Result: experiment R-II by Rapoport et al. [14]; fractions of games finishing at nodes 1–3 over 50 terms]

B. Experiments by Murphy et al. [11]


As mentioned in the previous subsection, the experimental results by Rapoport et al. [14] indicate that the subjects deviate from the subgame perfect equilibrium because there exist several players who actively choose the cooperative strategy pass. Murphy et al. [11] conducted experiments with human subjects on a three-person centipede game in which the monetary payments are the same as in experiment R-II [14]. They introduced computer players (robots) that choose a strategy according to the following protocol: at nodes 1–8, a Robot C chooses pass with probability 95% and a Robot NC chooses take with probability 95%, and each chooses the other strategy with probability 5%; at node 9, both certainly choose take. Murphy et al. conducted 4 kinds of experiments including no robots, non-cooperative Robots NC, or cooperative Robots C, as shown in Table I. In this paper, we write the 4 kinds of experiments [11] as experiments M-Base, M-NC3, M-NC6, and M-C6, respectively.
TABLE I
EXPERIMENTAL DESIGN BY MURPHY ET AL. (NUMBER OF HUMANS AND ROBOTS IN EACH SESSION)

Experiment | Human | Robot C | Robot NC
M-Base     |  21   |    0    |    0
M-NC3      |  18   |    0    |    3
M-NC6      |  15   |    0    |    6
M-C6       |  15   |    6    |    0
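The robots' protocol can be stated directly in code. The sketch below implements the 95%/5% rule described above; the function name and interface are our own.

```python
import random

def robot_choice(kind, node, rng=random):
    """Strategy protocol of the scripted robots in Murphy et al. [11]:
    at nodes 1-8 a cooperative Robot C passes with probability 0.95 and
    a non-cooperative Robot NC takes with probability 0.95; at node 9
    both take with certainty."""
    if node == 9:
        return "take"
    if kind == "C":
        return "pass" if rng.random() < 0.95 else "take"
    if kind == "NC":
        return "take" if rng.random() < 0.95 else "pass"
    raise ValueError("kind must be 'C' or 'NC'")
```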

In the experiments, the subjects are not told that robots are included. In each group, the subjects and the robots play centipede games 90 times, and the decision order is decided at the beginning of each game. The results, as 10-term moving averages of the median finishing node, are shown in Fig. 5; the horizontal axis indicates the term, and the vertical axis indicates the median.


From the results shown in Fig. 5, the subjects in experiment M-C6 succeed in coordination most frequently among the 4 kinds of experiments. Moreover, the subjects in experiments M-NC3 and M-NC6 succeed in coordination more frequently than in M-Base, even though these 2 kinds of experiments include Robots NC. This feature is seen particularly in experiment M-NC6: in 1 of its 3 groups (group M-NC6-1), the subjects succeed in coordination more strongly than in the other 2 groups (M-NC6-2 and M-NC6-3). In group M-NC6-1, the ratio of games finishing at nodes 1–3, where a player makes the first decision of a game, is 10%; at nodes 4–6, where a player makes the second decision, 41%; and at nodes 7–9, where a player makes the third (final) decision, 49%. On the other hand, in groups M-NC6-2 and M-NC6-3, the ratios of games finishing at the nodes of the first decision are 63% and 54%, at the nodes of the second decision 29% and 38%, and at the nodes of the third decision 8% and 8%, respectively. The number of subjects who select pass with probability 95% at the nodes of the first and second decisions is 8 in group M-NC6-1, while it is 2 in each of groups M-NC6-2 and M-NC6-3. Generally, a player who fails to coordinate because another player in a game chooses take is thought to choose take in the next game to obtain a higher payoff. In group M-NC6-1, however, the particular 8 subjects keep choosing pass even when a Robot NC chose take in past games. Such cooperative actions lead to the construction of a coordinated relationship between the subjects in a group.
III. THE SIMULATION MODEL
In general, a human is thought to make decisions through a trial-and-error process. Therefore, neither a decision model that assumes payoff maximization, as in the equilibrium concepts of game theory, nor a stochastic choice model with noise or error in decision making has sufficient explanatory power for the behavior of the human subjects described in the previous section. In this paper, we construct a simulation model using artificial adaptive agents. The agents make


decisions based on NNs that evolve by the procedure of a GA.

A. Outline of Simulation Model


We describe the outline of the simulation model as follows:
Step 1: Generate the initial population, which consists of (3 − R)N adaptive agents and RN robots, and set t = 1.
Step 2: Divide all agents and robots into N groups, each containing a total of 3 agents and robots, by choosing randomly from the population.
Step 3: Randomly assign the player numbers (1–3) to the agents and robots in each group.
Step 4: Play a centipede game in each group.
Step 5: Evolve the adaptive agents by the GA; the fitness value of each agent is evaluated based on the results of the games.
Step 6: If t = T, the simulation run finishes; otherwise go back to Step 2 with t := t + 1.
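Steps 1–6 can be sketched as a loop. The callables `make_agent`, `make_robot`, `play_game`, and `evolve` are hypothetical placeholders for the components described in the rest of this section.

```python
import random

def run_simulation(N, R, T, make_agent, make_robot, play_game, evolve):
    """Skeleton of Steps 1-6: (3 - R)N adaptive agents and RN robots are
    repeatedly regrouped into N triples, play one centipede game per term,
    and the agents are evolved by the GA after every term."""
    agents = [make_agent() for _ in range(round((3 - R) * N))]   # Step 1
    robots = [make_robot() for _ in range(round(R * N))]
    population = agents + robots
    for t in range(1, T + 1):                                    # repeat until t = T (Step 6)
        # Steps 2-3: random regrouping; position in the triple
        # serves as the player number.
        random.shuffle(population)
        groups = [population[3 * i:3 * i + 3] for i in range(N)]
        results = [play_game(g) for g in groups]                 # Step 4
        evolve(agents, results)                                  # Step 5
    return agents
```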
In this paper, the procedure of the GA consists of roulette selection, one-point crossover, mutation, and elitist preservation.
B. Decision Making and Learning of Agents
An agent makes decisions by using an NN. The following 6 kinds of information are given to the NN: (i) the current node number in transformed form: (1, 0, 0) for nodes 1–3, (0, 1, 0) for nodes 4–6, or (0, 0, 1) for nodes 7–9; (ii) the player number in transformed form: (1, 0, 0) for player 1, (0, 1, 0) for player 2, or (0, 0, 1) for player 3; (iii) the utility value when the agent chooses take; (iv) the expected utility value when the agent chooses pass, computed with finishing probabilities defined from the agent's experience in past games; (v) the weighted averages of the 3 utility values of the agents corresponding to the player numbers over the past s games; and (vi) the frequencies of choosing take at each class of nodes, where the nodes are classified by the order of decisions in a game into the 3 classes 1–3, 4–6, and 7–9. Here, the utility function of an agent is assumed to be as described later. At the tth game, the information sequence of the past s games, x = (x_t, x_{t−1}, . . . , x_{t−s+1}), is aggregated into f(x; s), calculated by the following equation, reflecting the oblivion (forgetting) of the agents.
f (x; s) =

sk=1 xtk+1 k1
sk=1 k1

(1)

where λ ∈ (0, 1] denotes the oblivion rate, and an agent retains the information of the past s games. x_{t−k+1} indicates the numerically quantified information of the (t−k+1)th game in the way described above. Here, if an agent chooses take at his rth decision in the tth game, we let x^t_{take,r} = 1 and otherwise x^t_{take,r} = 0; the oblivion-weighted frequencies of choosing take are then calculated by equation (1).
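Equation (1) is an exponentially discounted average over the last s games, with the most recent game weighted most heavily. A minimal sketch, writing the oblivion rate λ as `lam`:

```python
def aggregate(history, lam):
    """Oblivion-weighted average of equation (1). history is
    (x_t, x_{t-1}, ..., x_{t-s+1}), most recent first; lam is the
    oblivion rate in (0, 1], so small lam forgets old games quickly."""
    num = sum(x * lam ** (k - 1) for k, x in enumerate(history, start=1))
    den = sum(lam ** (k - 1) for k in range(1, len(history) + 1))
    return num / den
```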
Let out_t and out_p denote the two output values of the NN, interpreted as the priorities of the strategies take and pass, respectively. To include error in the decisions of an agent, we employ Boltzmann selection as the decision protocol: an agent chooses take with probability p_t calculated by the following equation.

p_t = exp(out_t / τ) / ( exp(out_t / τ) + exp(out_p / τ) )        (2)

Here τ > 0 is a temperature parameter.
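Equation (2) is a two-alternative softmax. A numerically safe sketch, using the value 0.05 reported for the temperature in the simulation experiments of Section IV:

```python
import math

def p_take(out_t, out_p, tau=0.05):
    """Boltzmann selection of equation (2): probability of choosing take
    given NN outputs out_t and out_p and temperature tau. The maximum is
    subtracted before exponentiating to avoid overflow."""
    m = max(out_t, out_p)
    et = math.exp((out_t - m) / tau)
    ep = math.exp((out_p - m) / tau)
    return et / (et + ep)
```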

From the results of Rapoport et al. [14] with different scales of monetary payments, the human subjects' utility for the monetary payment can be considered a nonlinear function. Since the amount of the monetary payment is proportional to the payoff, we assume that the utility function of the payoff π_i of an agent i is represented as

u_i(π_i) = r1 + r2 exp(r3 π_i).        (3)

As shown in Fig. 1, the maximum payoff in a game is 2560, and an individual's preference for money is known to be risk averse. We therefore set u_i(π_max) = u_i(3000) = 1, u_i(π_min) = u_i(0) = 0, and u_i(1000) = 0.5, from which r1 = 1.309, r2 = −1.309, and r3 = −0.000481 are derived.
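The constants of equation (3) follow in closed form from the three calibration conditions; the check below reproduces them. (Incidentally, exp(1000 r3) turns out to be the golden-ratio conjugate (√5 − 1)/2.)

```python
import math

# Utility of equation (3): u(pi) = r1 + r2 * exp(r3 * pi).
# The conditions u(0) = 0, u(1000) = 0.5 and u(3000) = 1 pin the
# constants down: with a = exp(1000 * r3), (1 - a**3)/(1 - a) = 2,
# i.e. a**2 + a - 1 = 0, so a = (sqrt(5) - 1)/2.
a = (math.sqrt(5) - 1) / 2
r3 = math.log(a) / 1000          # ~ -0.000481
r2 = -0.5 / (1 - a)              # ~ -1.309
r1 = -r2                         # ~  1.309

def u(pi):
    return r1 + r2 * math.exp(r3 * pi)
```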
The agents evolve their NNs by a GA in which a gene consists of the weights and thresholds of the corresponding agent's NN. The genes of the initial population are generated by assigning random values in [−1, 1] to each locus. Let h_i(t; s) = (π_{i,t−s+1}, . . . , π_{i,t}) be the payoff sequence that an agent i obtains during s terms, from t − s + 1 to t, and let the utility value u_i(f(h_i; s)) of agent i be the fitness of the corresponding gene, where the aggregated payoff f(h_i; s) is calculated by equation (1). The GA consists of roulette selection, one-point crossover, mutation, and elitist preservation.
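As a concrete illustration of a gene encoding the weights and thresholds of an NN, the sketch below decodes a flat gene into a single-hidden-layer network and runs one forward pass. The layer sizes (14 inputs, as items (i)–(vi) suggest, and 5 hidden units) and the sigmoid activation are our assumptions; the excerpt does not specify the architecture.

```python
import math
import random

def forward(gene, inputs, n_hidden=5):
    """Decode a flat GA gene into NN weights/thresholds and run one
    forward pass: len(inputs) input units, n_hidden sigmoid hidden
    units, and two linear outputs (out_t, out_p) for the Boltzmann rule."""
    n_in = len(inputs)
    need = n_hidden * (n_in + 1) + 2 * (n_hidden + 1)
    assert len(gene) == need, f"gene must have {need} loci, got {len(gene)}"
    g = iter(gene)

    def sig(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Each hidden unit reads n_in weights followed by one threshold.
    hidden = [sig(sum(next(g) * x for x in inputs) - next(g))
              for _ in range(n_hidden)]
    # Each output unit reads n_hidden weights followed by one threshold.
    return [sum(next(g) * h for h in hidden) - next(g) for _ in range(2)]

# Initial genes: uniform random values in [-1, 1] at every locus.
n_in, n_hidden = 14, 5
size = n_hidden * (n_in + 1) + 2 * (n_hidden + 1)
gene = [random.uniform(-1.0, 1.0) for _ in range(size)]
out_t, out_p = forward(gene, [0.0] * n_in, n_hidden)
```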
C. Asymmetry Property of Agents
From the results of the laboratory experiments by Murphy et al. [11] including cooperative or noncooperative robots, there exists an asymmetry in decision making: some subjects do not change their strategy and keep choosing the cooperative strategy even after playing several games, while other subjects change their strategy immediately in response to the behavior of the other subjects. The former subjects can be thought to know that some opponents may change their strategies in response to cooperative behavior. In this paper, we consider 3 kinds of agents. The first kind, with short-term thinking, changes its strategy in response to the other agents' behavior. The second kind, with long-term thinking, does not change its strategy during a certain number of games. The third kind has features intermediate between the first and the second. To construct such asymmetric agents, the value of the oblivion rate λ in equation (1) characterizes each agent: λ indicates the strength of the influence of the results of past games. An agent with small λ is susceptible to the result of a single game; he is myopic and tends to change his strategy. On the contrary, an agent with large λ is insusceptible to the result of a single game and makes decisions with a long-term view. In this paper, we construct 3 kinds of agents with oblivion rates λ = 0.3, 0.5, 0.8, and they are separated according to the value of λ.

IV. SIMULATION EXPERIMENTS


We execute 6 kinds of simulation experiments corresponding to the laboratory experiments [11], [14], with the following conditions: the number of agents is 300, the length of the oblivion window is s = 10, the crossover probability is pc = 0.7, the mutation probability is pm = 0.003, the generation gap is G = 0.3, the Boltzmann temperature is τ = 0.05, and the length of a simulation run is T = 2000. We write the 6 kinds of experiments as simulations R-I, R-II, M-Base, M-NC3, M-NC6, and M-C6. The simulation experiments are executed with R = 0 for simulations R-I, R-II, and M-Base, R = 0.2 for simulation M-NC3 with Robots NC, R = 0.4 for simulation M-NC6 with Robots NC, and R = 0.4 for simulation M-C6 with Robots C, in accordance with the experiments using human subjects. The results of groups GS and GL in simulation R-I are shown in Fig. 6. The horizontal axes indicate the term, and the vertical axes indicate the fractions of games finishing at nodes 1–3, averaged over 100 simulation runs.

However, group GL includes many agents who are hardly influenced by the result of a single game, and for this reason the agents in group GL succeed in cooperating. From these results of simulation R-I, the reason that Rapoport et al. [14] obtained 2 different kinds of results in experiment R-I can be considered to be that there exist both myopic subjects and subjects with a long-term view, and that the composition ratios of the subject types, characterized by the oblivion rate, differ among the sessions.
The results of simulation R-II are shown in Fig. 7, where the axes are the same as in Fig. 6.

Let G_S be the set of agents with λ = 0.3, G_M the set of agents with λ = 0.5, and G_L the set of agents with λ = 0.8. We conduct the simulation experiments with the following 2 kinds of groups:
Group G_S: |G_S| : |G_M| : |G_L| = 4 : 1 : 1.
Group G_L: |G_S| : |G_M| : |G_L| = 1 : 1 : 4.
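The normalized weights that equation (1) places on past games make the asymmetry concrete: with λ = 0.3 about 70% of the weight falls on the most recent game, whereas with λ = 0.8 only about 22% does. A small sketch:

```python
def memory_weights(lam, s=10):
    """Normalized weights lam**(k-1) that equation (1) places on the k-th
    most recent game; small lam concentrates weight on the latest game."""
    raw = [lam ** (k - 1) for k in range(1, s + 1)]
    total = sum(raw)
    return [w / total for w in raw]

# A myopic agent (lam = 0.3) puts ~70% of its weight on the latest game,
# while a long-term agent (lam = 0.8) puts only ~22% there.
w_myopic = memory_weights(0.3)
w_longterm = memory_weights(0.8)
```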

[Fig. 7. Result: simulation R-II; fractions of games finishing at nodes 1–3 in groups GS and GL over 2000 terms]

[Fig. 6. Result: simulation R-I; fractions of games finishing at nodes 1–3 in groups GS and GL over 2000 terms]

As in the laboratory experiments R-I [14], the fraction of games finishing at node 1 is high in group GS; on the contrary, the fractions of games finishing at nodes 1–3 are low in group GL. In a simulation run, it often happens that the agents succeed in cooperating and a game finishes at a late node, although some agents sometimes choose take at early nodes because of the trial-and-error nature of the agents' decisions. Group GS includes many agents who are easily influenced by the result of a single game, and therefore the finishing nodes in group GS are thought to become earlier as the games are repeated.

As in the laboratory experiments R-II [14], Fig. 7 shows that the fractions of games finishing at early nodes are small and that the agents succeed in coordination in both groups GS and GL. Because of the risk-averse preference of the agents represented by equation (3), in simulation R-II the ratio of the utility of the payoff obtained in a game finishing at an early node to the utility obtained in a game finishing at a late node is larger than in simulation R-I. Although the fraction of games finishing at node 1 increases to 0.3 in the later stage of the simulation runs in group GS, the impact of the result of a single game in simulation R-II is smaller than in simulation R-I, even for the agents in GS. For the same reason, it can be thought that the subjects succeed in coordination in experiment R-II, whose monetary payments are smaller than in experiment R-I.
The results of simulations M-Base, M-NC3, M-NC6, and M-C6 are shown in Fig. 8. The horizontal axes are the same as in Figs. 6 and 7, and the vertical axes indicate the average number of the finishing node. Note that the results shown in Fig. 8 are computed after removing the games that include Robots NC


or C, i.e., they are the results of the games played by three adaptive agents that make decisions using NNs.
[Fig. 8(a). Simulations M-Base and M-C6: average number of the finishing node in groups GS and GL over 2000 terms]

[Fig. 8(b). Simulations M-NC3 and M-NC6: average number of the finishing node in groups GS and GL over 2000 terms]

Fig. 8. Result: simulations M-Base, M-NC3, M-NC6, and M-C6

From the result of simulation M-Base shown in Fig. 8(a), as in simulations R-I and R-II, the agents in group GS finish the games at earlier nodes than those in group GL. From the result of simulation M-C6 shown in Fig. 8(a), the ratio of games finishing at early nodes is small when a Robot C is included, and the finishing nodes are later than in simulation M-Base, although the results of groups GS and GL are almost the same. From the results of simulations M-NC3 and M-NC6 shown in Fig. 8(b), the ratio of games finishing at early nodes is large when a Robot NC is included. The inclusion of Robots NC affects the result of group GS, but it does not affect that of group GL. From the above observations, in group GL, which includes many agents with large λ, the ratio of agents with small λ choosing take at early nodes is also small; therefore, the agents succeed in cooperating in group GL.
The results shown in Fig. 8 agree with the laboratory experiments by Murphy et al. [11] in that the finishing nodes tend to be late when a session includes many cooperative subjects.
From the above observations, the cooperative behavior of the subjects can be explained by their long-term view; on the contrary, the uncooperative behavior can be explained by myopic decision making.

V. CONCLUSION
In this paper, we conduct a simulation analysis of centipede games using artificial adaptive agents. The agents, who make decisions through a trial-and-error process, are designed based on NNs and a GA. Additionally, we incorporate a risk-averse utility function and oblivion into the model, and we carry out a behavioral analysis of the subjects in the laboratory experiments by Rapoport et al. [14] and Murphy et al. [11]. The simulation results indicate that the subjects are risk averse with respect to monetary payment and that there exist multiple types of subjects, who make decisions myopically or with a long-term view.

REFERENCES
[1] J. Andreoni and J.H. Miller: Auctions with artificial adaptive agents, Games and Economic Behavior, 10, pp. 39–64, 1995.
[2] R. Axelrod: The Complexity of Cooperation, Princeton University Press, pp. 14–29, 1997.
[3] E. Fehr and K.M. Schmidt: A theory of fairness, competition, and cooperation, The Quarterly Journal of Economics, 114, pp. 817–868, 1999.
[4] M. Fey, R.D. McKelvey and T.R. Palfrey: An experimental study of constant-sum centipede games, International Journal of Game Theory, 25, pp. 269–287, 1996.
[5] J. Harsanyi: Games with incomplete information played by Bayesian players; Part I, II, III, Management Science, 14, pp. 468–502, 1967–1968.
[6] T. Hayashida, I. Nishizaki and H. Katagiri: Network formation and social reputation: a theoretical model and simulation analysis, International Journal of Knowledge Engineering and Soft Data Paradigms, 2, pp. 349–377, 2010.
[7] J.H. Holland and J.H. Miller: Artificial adaptive agents in economic theory, American Economic Review, 81, pp. 365–370, 1991.
[8] M. Leshno, D. Moller and P. Ein-Dor: Neural nets in a group decision process, International Journal of Game Theory, 31, pp. 447–467, 2002.
[9] R.D. McKelvey and T.R. Palfrey: An experimental study of the centipede game, Econometrica, 60, pp. 803–836, 1992.
[10] R.D. McKelvey and T.R. Palfrey: Quantal response equilibria for extensive form games, Experimental Economics, 1, pp. 9–41, 1998.
[11] R.O. Murphy, A. Rapoport and J.E. Parco: Population learning of cooperative behavior in a three-person centipede game, Rationality and Society, 16, pp. 91–120, 2004.
[12] R. Nagel and F.F. Tang: Experimental results on the centipede game in normal form: an investigation of learning, Journal of Mathematical Psychology, 42, pp. 356–384, 1998.
[13] I. Nishizaki: A general framework of agent-based simulation for analyzing behavior of players in games, Journal of Telecommunications and Information Technology, 2007/4, pp. 28–35, 2007.
[14] A. Rapoport, W.E. Stein, J.E. Parco and T.E. Nicholas: Equilibrium play and adaptive learning in a three-person centipede game, Games and Economic Behavior, 43, pp. 239–265, 2003.
[15] R. Rosenthal: Games of perfect information, predatory pricing, and the chain-store paradox, Journal of Economic Theory, 25, pp. 92–100, 1981.
[16] J.A. Sundali, A. Rapoport and D.A. Seale: Coordination in market entry games with symmetric players, Organizational Behavior and Human Decision Processes, 64, pp. 203–218, 1995.
[17] K.G. Zauner: A payoff uncertainty explanation of results in experimental centipede games, Games and Economic Behavior, 26, pp. 157–185, 1999.
