
Comparison of Parallel Adversarial Search Algorithms

February 14, 2011

Introduction

Adversarial Search
Introduction
Adversarial search is a tool often used in artificial intelligence and, more specifically, game theory [1]. It is a method for allowing agents to decide on a best course of action in a competitive environment, and it can be applied to topics ranging from chess to investment planning. Let's use tic-tac-toe as an example (see Figure 1). In adversarial search a game tree is built with states as nodes. The states in tic-tac-toe are board configurations, so the root node of the game tree is a blank tic-tac-toe board. The children of the root are the board configurations reachable by the legal moves from the root. The children of these nodes are the configurations reachable by legal moves from their own boards, and so on until there are no more successors. In tic-tac-toe there are no more successors when the board is full or when someone has won by placing three of their marks in the same row, column, or diagonal.

Figure 1: Tic-tac-toe Game Tree(1)

Each node in the game tree is scored by an evaluation function that estimates how desirable an intermediate position is. The leaf nodes are given a utility value based on whether the board configuration of the leaf is desirable or not. Tic-tac-toe is what is called a zero-sum game, meaning that either you win (a utility of +1), you lose (-1), or you play to a draw (0) [1]. Now that we have established what a game tree looks like, we need to decide how to make decisions given a tree. The algorithm used in this paper is minimax search.

Minimax
Minimax search is a recursive algorithm that traverses the game tree downward, expanding each node it visits, and then propagates the information gathered back up the tree. Each level of the tree (or move in the game) is either a max level or a min level. The max levels are levels in which the agent moves, and min levels are levels in which the adversary moves. The agent's goal is to reach a leaf node with a positive utility, while the adversary's goal is the opposite. To take this into account, minimax assumes the adversary will always make the move that is best for it, and thus worst for our agent.

The minimax algorithm starts by traversing to the leaf nodes of the tree. In Figure 1, we decide that our agent will play "X" and will go first. Next, the parents of the leaves take the minimum or the maximum (depending on whether the parents are on a min or max level) of their children's values. These values are then propagated to their parents, and so on until the root node is reached. Minimax keeps track of which nodes the values are passed from, so by the time the algorithm is done propagating utilities up the tree, it knows which move to make in order to reach a desirable utility.

In theory, traversing a tree to its leaves to find the utilities is fine. However, most game trees are too large to traverse to the bottom in a tractable amount of time. Minimax has a time complexity of O(b^m) and a space complexity of O(bm), where b is
the branching factor and m is the maximum depth of the tree. Tic-tac-toe has a branching factor of 9 - (lvl - 1), where lvl is the level of the tree, which is fairly large for such a simple game. Games like chess have much higher branching factors.

To work around the high space and time complexities, minimax can use the evaluation function values given to each board configuration to decide how desirable a position is. Instead of propagating the utility values of the leaf nodes up the tree, minimax can fix a maximum depth to traverse and pass the evaluation values of the nodes at that depth up the tree. In Figure 2, the tree's depth is set at two. Minimax sees that the values 3, 2, and 2 would be chosen by an optimal adversary. Of these, 3 is the highest, so it chooses to move to the node with the minimax value of 3.

Minimax search has a couple of good properties. If we assume our search space is not endless, that is, the maximum depth of the tree is finite, then minimax will always return a solution if there is one, i.e. it is complete. Also, if we assume that the adversary is optimal, then minimax is optimal, meaning that it will always find the most desirable solution.

Although in some cases simply setting a depth limit is sufficient, in most cases the branching factor of the tree is too high to search deep enough for the evaluation function to return useful information. One of the best ways to reduce the time and space complexities is to eliminate nodes that do not need to be expanded, through a process called pruning. By pruning some nodes, you can eliminate paths to solutions that the agent would never reach. The pruning algorithm discussed in this paper is alpha-beta pruning.

Alpha-Beta Pruning
Alpha-beta pruning eliminates paths that do not need to be expanded further, because a more optimal solution has already been found than what would be chosen by searching a subtree rooted at a sub-optimal solution. On a min level, if a node already explored has a more optimal solution than one that has been partially expanded on a min level,
then you can effectively prune that entire branch of the tree without worrying that a more optimal solution lies there [1]. The same situation holds on a max level of nodes. If a less optimal solution was found previously, then alpha-beta pruning knows that the min in the level above will never choose a more optimal solution.

Figure 2: Minimax Example(1)

Figure 3: Alpha-Beta Pruning Example(1)

Figure 3 is a good example of this. After exploring the first subtree, minimax knows that 3 will be chosen by min. The first leaf explored in the second subtree gives an evaluation of 2. This means that no matter what else is expanded in this subtree, the value passed up will be at most 2. The max level above will never choose this subtree, because the subtree with 3 is more optimal. Therefore, the algorithm does not need to explore this subtree further and can prune it.

Alpha-beta pruning is extremely useful. It can be applied to trees of any depth, and it often prunes entire subtrees [1]. There is a small amount of overhead, but it is clearly outweighed by the possible savings in space and time. Most importantly, it does not change the completeness or optimality of minimax. For a more rigorous explanation of alpha-beta pruning, see the first work cited in the bibliography section of this paper. Minimax with alpha-beta pruning works well sequentially, but there are also techniques to use its power on parallel machines.

Parallel Algorithms
Principal Variation Splitting
One of the first techniques used to parallelize minimax search with alpha-beta pruning was Principal Variation Splitting (PV Splitting). PV Splitting classifies each node in the search tree according to the taxonomy created by Knuth and Moore [6]:

• PV Nodes - The root node and every first child of PV nodes
• CUT Nodes - Every child of PV nodes except for the first, and all the children of ALL nodes
• ALL Nodes - The first child of CUT nodes
• Undefined Nodes - Any node not defined above

This taxonomy is important because each node has a specific purpose. PV nodes should be explored first in the subtrees to establish bounds for pruning. CUT nodes are good candidates for parallelization. Finally, ALL nodes are considered tough to parallelize (see Figure 4).

PV Splitting maps the search tree onto an underlying assignment of processors [5]. Each processor is responsible for the evaluation of its node and for passing the value up to the node's parent. Some sort of global entity is in charge of keeping a record of this assignment. The PV nodes in the tree are recursively expanded first. When the search reaches the terminal nodes or a given depth limit, the values of the PV nodes are propagated up the tree from child to parent. Once the root of a subtree has been passed a value from its PV Branch (its first branch, consisting of PV nodes), the CUT nodes, and thus their subtrees, can be explored in parallel as per the taxonomy.
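PV Splitting, like the other schemes discussed in this paper, parallelizes the sequential search described under Adversarial Search, so it may help to have that baseline written out. The following is a minimal Python sketch and is not taken from any of the cited implementations. It assumes a tree given as nested lists whose leaves are evaluation values, and the leaf values in TREE are hypothetical, chosen only so that the three min nodes evaluate to 3, 2, and 2 as in the minimax example above. The names minimax, alphabeta, and TREE are illustrative.

```python
import math


def minimax(node, maximizing=True):
    """Plain minimax on a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax with alpha-beta pruning; appends every leaf it evaluates to `visited`."""
    if visited is None:
        visited = []
    if not isinstance(node, list):
        visited.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:  # cut-off: the remaining children cannot change the result
            break
    return value


# Hypothetical depth-two tree: a max root over three min nodes.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

if __name__ == "__main__":
    seen = []
    print(minimax(TREE))                  # value without pruning
    print(alphabeta(TREE, visited=seen))  # same value with pruning
    print(seen)                           # leaves actually evaluated
```

On this assumed tree both functions return 3, but alpha-beta evaluates only seven of the nine leaves: after the first leaf of the second subtree returns 2, the rest of that subtree is skipped, which is the situation described for Figure 3.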

Figure 4: Node Taxonomy (Black is Type 1, Green is Type 2, Red is Type 3, Gray is undefined)
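
The taxonomy shown in Figure 4 depends only on a node's position in the tree, so it can be computed from the path of child indices that leads to the node. The sketch below is one hedged reading of the four rules in the PV/CUT/ALL list. The function name classify and the path encoding (index 0 means the first child) are assumptions made for illustration, and children of undefined nodes are treated as undefined.

```python
def classify(path):
    """Return 'PV', 'CUT', 'ALL', or 'UNDEFINED' for the node reached from
    the root by following `path`, a sequence of child indices (0 = first child)."""
    kind = "PV"  # the root is a PV node
    for index in path:
        if kind == "PV":
            # first child of a PV node is PV, every other child is CUT
            kind = "PV" if index == 0 else "CUT"
        elif kind == "CUT":
            # first child of a CUT node is ALL, the others are not defined
            kind = "ALL" if index == 0 else "UNDEFINED"
        elif kind == "ALL":
            # all children of an ALL node are CUT
            kind = "CUT"
        else:
            kind = "UNDEFINED"
    return kind


if __name__ == "__main__":
    for path in [(), (0, 0), (2,), (2, 0), (2, 1), (2, 0, 3)]:
        print(path, classify(path))
```

For example, the path (2, 0) leads to the first child of the root's third child, which is the first child of a CUT node and therefore an ALL node.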

Since there are rarely as many processors as there are nodes in the search tree, the available processors are used on the nodes that need to be evaluated first. Once a processor is done evaluating its node and propagating its value up, it can be assigned to a node that needs to be evaluated next. Once a value is propagated up to the parent, the parent compares it to its alpha and beta values. If the new value establishes better bounds, the parent propagates it up the tree, tightening the bounds for all the nodes to which it applies. If the value passed to the parent represents a cut-off condition, then the parent tells its children to stop searching. If a processor finishes with a node and there are no unassigned branches (nodes) available, then it goes idle until more nodes are available [3].

What makes PV Splitting a good parallel adversarial search algorithm is that it establishes meaningful bounds by exploring the PV branch first. If one were to take the naive approach and simply assign processors to the subtrees without establishing bounds first, there would be significantly less pruning. PV Splitting works especially well for strongly ordered trees, that is, trees where the first branch of each node is the best branch at least 70 percent of the time and the best move is in the first quarter of the branches being searched 90 percent of the time [5].

There are some serious flaws in the algorithm, however. First, if the PV branch does not produce good bounds, then there is little difference from simply parallelizing the tree naively. Also, the number of processors PV Splitting can use is limited by the branching factor. For example, when PV Splitting runs on chess, which has an average branching factor of 32, no more than 31 processors can be used [3]. Another problem is synchronizing the processors. Some processors will inherently receive more computationally expensive nodes, which means more idle time for processors that finish early. The next algorithm discussed takes the basic idea of PV Splitting and expands it to help remedy some of these problems.
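
The overall shape of PV Splitting can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than the implementation measured in [3] or [5]: trees are nested lists with numeric leaves, a Python thread pool stands in for the processors (so it shows the structure of the search, not a real speedup), bounds discovered by one sibling are not shared with the others while they run, and no stop messages are sent. The names pv_split and alphabeta are illustrative.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def alphabeta(node, alpha, beta, maximizing):
    """Sequential alpha-beta on a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:  # cut-off
            break
    return value


def pv_split(node, alpha, beta, maximizing, pool):
    """Search the first (PV) branch alone, then its siblings in parallel."""
    if not isinstance(node, list):
        return node
    # Step 1: recurse down the PV branch to establish a bound.
    value = pv_split(node[0], alpha, beta, not maximizing, pool)
    if maximizing:
        alpha = max(alpha, value)
    else:
        beta = min(beta, value)
    siblings = node[1:]
    if alpha >= beta or not siblings:
        return value
    # Step 2: hand every remaining child to the pool with the PV bound.
    futures = [pool.submit(alphabeta, child, alpha, beta, not maximizing)
               for child in siblings]
    scores = [value] + [future.result() for future in futures]
    return max(scores) if maximizing else min(scores)


if __name__ == "__main__":
    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical leaf values
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(pv_split(tree, -math.inf, math.inf, True, pool))
```

Only the recursion along the first branch submits work, and the submitted tasks are purely sequential searches, so the pool is never asked to wait on itself. The parallel step at each PV node can use at most one worker per sibling, which is the processor limit noted above.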

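The synchronization problem of PV Splitting can be made concrete with a small calculation. The sketch below is an illustration under assumed numbers, not a measurement: each sibling subtree is given a hypothetical search cost, subtrees are handed in order to whichever processor is free next, and every processor waits at the split until the last subtree is finished. The function name idle_fraction is an assumption made for this example.

```python
import heapq


def idle_fraction(costs, processors):
    """Fraction of total processor time spent idle at one split point.

    `costs` holds one (hypothetical) search cost per sibling subtree. Subtrees
    are assigned in order to the earliest available processor, and the split
    is complete only when the slowest processor has finished.
    """
    finish = [0.0] * processors        # time at which each processor becomes free
    heapq.heapify(finish)
    for cost in costs:
        start = heapq.heappop(finish)  # earliest available processor
        heapq.heappush(finish, start + cost)
    makespan = max(finish)
    if makespan == 0:
        return 0.0
    return 1.0 - sum(costs) / (makespan * processors)


if __name__ == "__main__":
    print(idle_fraction([3, 3, 3, 3], 4))     # evenly sized subtrees
    print(idle_fraction([8, 1, 1, 1, 1], 4))  # one expensive subtree
    print(idle_fraction([1] * 31, 64))        # more processors than siblings
```

With four equal subtrees on four processors nothing is idle. With one subtree of cost 8 and four of cost 1, the processors are idle 62.5 percent of the time while they wait for the expensive subtree. With 31 siblings and 64 processors, more than half of the machine has nothing to do.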
Young Brothers Wait Concept
The idea of having the PV Branch searched first can be expanded to gain more pruning power [7]. One of the algorithms that does this is the Young Brothers Wait Concept (YBWC). In this algorithm every subtree's first branch is considered a PV Branch and must be searched first. After a subtree's PV Branch is searched, the remaining children of the subtree's root (the siblings of the first child) can be searched in parallel.

Another difference between YBWC and PV Splitting is that YBWC does not have fixed processor assignments based on the tree. Instead, YBWC develops master/slave relationships between processors. The algorithm starts by assigning the root node to a processor. This processor now "owns" the root node. When a processor owns a node, it is responsible for the node's evaluation and for the propagation of the node's value up the tree.

The other, idle processors start requesting work from random processors. If a processor that owns a node and has explored its PV Branch gets such a request, it can assign the requesting processor one of its other children that is not in the PV Branch. These are called split-points [3]. This is why it is called the "Young Brothers Wait Concept": the younger brothers (siblings in the tree) have to wait for the eldest brother to be evaluated before they can be evaluated. These younger brothers are now owned by processors, and the algorithm recurses.

When a processor is finished with its assigned subtree, that is, it has evaluated its owned node and passed up its value, it goes idle and sends another random request for work. If a processor receives a request and does not have work, it forwards the request to another random processor. Usually there is a limit on the number of forwards that can be done before the request is returned with no work. In this case the requesting processor sends another request to a different random processor.

YBWC can be further improved by a concept of strength. Weak YBWC performs as described previously, whereas Strong YBWC assigns a value to each split-point to indicate how "promising" it is. This value is application dependent [3]. More promising split-points are assigned processors first, because they are seen as having more potential to establish stricter bounds for pruning.

The main advantage of YBWC over PV Splitting is that there is much less synchronization. All nodes except the first child of a root can be searched in parallel. On trees with high branching factors, this means most of the tree is searched in parallel. As a result, many more processors can be utilized, and load balancing is much better. The final algorithm to be discussed in this paper improves on this one by having a much stricter concept of the strength of split nodes.

Dynamic Tree Splitting
Dynamic Tree Splitting (DTS) starts similarly to YBWC in that it assigns a processor to the root, giving the processor ownership of the root. Here, ownership simply means evaluating the node. Once the PV branch is explored, the processor creates a list of split points and adds them to a shared global list of split points (SP List) that all the processors can reference. The idle processors consult the SP List and choose a node to own from the list.

If an idle processor checks the SP List for work and finds no split points, then it sends a broadcast to the other processors requesting work. A processor that receives the broadcast and has work but has not established split points copies the state of the subtree on which it is working into a shared space in memory. The requesting processor then analyzes the subtree for split-points. If the processor finds a suitable split-point, it becomes its owner. If the processor cannot find suitable split points from the state it analyzed, it goes idle and rebroadcasts.

As with the other algorithms, when a processor is done with a subtree, it goes idle. However, the last processor working on a subtree is responsible
for passing the value of the subtree upward. Also, if a processor finishes its split-point but the subtree is not finished, it will continue to help evaluate the subtree until it is complete. Much like the other two algorithms, values are passed upward in the tree to establish better bounds and cutoff conditions. If a cutoff condition happens, all the processors helping on a subtree go idle except for one that returns the value up the tree [3].

What makes this algorithm different from the rest, besides its peer-to-peer nature, is the complicated process it undertakes to find a split-point after a broadcast. First, each node of the subtree that a processor analyzes is classified [3]:

• D-PV Nodes - A node that has the same alpha and beta values as the root.
• D-CUT Nodes - A minimizing node with the same beta as the root or a maximizing node with the same alpha as the root.
• D-ALL Nodes - Any node that is not a D-CUT or D-PV node [3].

After the nodes are classified, there are two override phases [3]. In the first phase, all the D-CUT nodes are analyzed; if more than three children of a node have been evaluated and none of them created a cutoff condition, the node is changed to a D-ALL node. The second phase checks whether there are three D-ALL nodes in consecutive levels of the tree. If there are, then the third D-ALL node is changed to a D-CUT node, and the nodes alternate between D-CUT and D-ALL from that point on.

After the two override phases, all the D-CUT and D-ALL nodes are given a level of confidence similar to the promise value in YBWC. The more children that have been expanded in a D-CUT node, the less confidence is given to that node. The more children that have been expanded in a D-ALL node, the more confidence is given to that node. After that there are four final aspects to consider for a node to be a split point [3]:

• The node must be of type D-PV or D-ALL.
• The height of the node. Nodes that are higher up in the tree (closer to the root) represent more work.
• If it is a D-PV node, its first branch must have been searched.
• If it is a D-ALL node, the confidence factor should be relatively high [3].

Comparison of Algorithms
General Comparison
The general idea for parallelizing alpha-beta pruning is to get as close as possible to the amount of pruning the sequential algorithm does (or even better) while also limiting overhead. Often, these goals conflict with each other. If the algorithm finds better split-nodes, then it takes some added overhead to do so. In addition, there are three other sources of overhead besides split-node overhead to consider: communication overhead, synchronization overhead, and search overhead [3].

Communication overhead occurs when a processor propagates a value up the tree and that value establishes a new alpha or beta value for the entire subtree. It is then the responsibility of the root of the subtree to communicate the new bounds to its descendants. Since multiple processors are usually working on the nodes lower in the tree, there needs to be communication between the root's processor and all of the processors of its children.

Synchronization overhead occurs when processors are idle because they are waiting for the result of a calculation from another processor. In alpha-beta pruning this is usually the bottleneck. In order to establish the original bounds, the most common solution is to have some nodes explored first without much parallelism.

Finally, search overhead is the time it takes to search nodes that could have been pruned by some more intelligent means. This is usually in direct conflict with split-node overhead. If an algorithm
takes more time to choose better split nodes, then there is often better pruning, and vice versa.

Communication overhead relies heavily on two things: the platform and the ability to set better bounds. DTS was created for environments where communication costs were cheap, while YBWC was created for distributed environments [3]. PV Splitting was created with no particular environment in mind. Since DTS almost always finds better bounds sooner than the other two algorithms, more messages are passed between processors, creating more communication overhead. YBWC creates better bounds than simple PV Splitting, so it generally has greater communication overhead than PV Splitting. So, in the general case, PV Splitting has the least communication overhead of the three, while DTS has the most.

As stated previously, PV Splitting has a considerable amount of idle time for processors due to load imbalance. YBWC treats all nodes besides the first branch of subtrees as possible split-points, so there is much more opportunity for processors to stay busy. DTS uses an approach similar to YBWC, except with the added overhead of analyzing split-nodes. None of these algorithms makes perfect use of its processors, but PV Splitting is by far the worst as far as synchronization is concerned.

DTS has the distinct advantage of much less search overhead. In the most complex games, the ability to prune unnecessary nodes is vital for speedup. DTS is more selective in which nodes to explore first, which creates more pruning. PV Splitting takes the first branch as bounds, but the rest of the nodes may not necessarily be able to establish good cutoff conditions. YBWC is better than PV Splitting here because it establishes bounds on every subtree. However, Strong YBWC also has some level of confidence in the nodes it chooses as split-nodes, so it should be close to DTS in pruning power if done correctly.

Unfortunately, there is a drawback to searching for a split-node. Speculative loss is the overhead of trying to find a better split-point without finding one [4]. PV Splitting and Weak YBWC have no speculative loss, while Strong YBWC and DTS both have a chance to lose some time due to speculative loss.

Each of these algorithms has strengths and weaknesses, but what aspects of parallel alpha-beta pruning are most important? All of these algorithms have been implemented on the same problems, but not all on the same machine. It seems that this is because these algorithms were created in different decades and because YBWC is really meant for distributed systems. This creates a problem for comparison. Given more time, I would implement each algorithm on the same machine. Instead, I will compare trends as the number of processors increases.

Empirical Results
Figure 5: Speedups for PV Splitting [3]

Figure 6: Speedups for YBWC [8]

Figure 7: Speedups for DTS [3]

For comparison, all of the data collected for the algorithms is from the input problem chess. Chess is one of the most common problems for these types of algorithms to be run on. Figure 5 shows the speedups for PV Splitting running on a Cray C90 machine. Figure 6 shows the speedups of Strong YBWC running on a Parsytec GCel machine. Figure 7 shows the speedups of DTS running on a
Cray C916/1024 machine. These speedups are measured relative to the sequential algorithm run on each individual machine. Since all the machines have different specifications, the speedups cannot be compared directly. However, the trends in speedup as the number of processors increases can be compared.

Figure 8: Speedups for PV Splitting

Figure 9: Speedups for YBWC

Figure 10: Speedups for DTS

Figures 8, 9, and 10 show the efficiencies of the speedups for each algorithm, respectively.

Some interesting conclusions can be drawn from the data. First, the data indicate that the two algorithms that spend some overhead selecting which nodes are split-points are far superior to PV Splitting. The speedup measured for PV Splitting almost levels off from 8 to 16 processors, with a slope of only 0.0625. The efficiency nearly halves in the same interval. The other two algorithms increase speedup fairly consistently. DTS has a slope of 0.5625 from 8 to 16 processors, and YBWC has a slope of 0.1897 even over the much higher interval of 512 to 1024 processors.

If there were more comparable data between DTS and Strong YBWC, one could make stronger comparisons. YBWC does seem to lose efficiency more quickly than DTS, which would make sense given the algorithms and the problems they solve, but without a more even playing field, this is little more than conjecture. If there were more time to work on this project, one future goal would be to test the hypothesis that DTS is the best algorithm by running all the algorithms on the same problem, written in the same language, and, especially, on the same platform.

The most important conclusion, I believe, is that having some heuristic to decide on split-points is the most important part of an adversarial search algorithm. As stated previously, the more an algorithm can prune the tree, the less work the algorithm has to do, and thus the faster its runtime. DTS and Strong YBWC outperform PV Splitting simply because they prune more nodes.

Conclusion and Related Work
Adversarial search is a relevant problem in many disciplines. Simply doing a sequential search using minimax and alpha-beta pruning yields an intractable time complexity. Parallelizing the process allows for a considerable amount of speedup. Unfortunately, this is not a trivial process. There has been much advancement over the years, from Principal Variation Splitting to Dynamic Tree Splitting to more recent algorithms.

There are many approaches to improving upon these algorithms. There is some work being done to use machine learning algorithms to improve how one evaluates a state of the tree [9]. There is also work being done to use neural networks to enhance the algorithms [3]. Researchers are trying to resolve some of the problems with the current algorithms as well. The APHID system, for example, attempts to cut down some of the synchronization overhead of YBWC [7]. The field of game theory has come a long way in the last couple of decades. There have even been chess programs that have challenged the greatest human chess players in the world [2].

In my opinion, the future of this problem does not lie in chess, on which so many of these algorithms focus. It lies in problems that we now see as
hard or impossible. For instance, the best human players of Go, a popular game that gained more popularity by appearing in the movie A Beautiful Mind, are not even challenged by the best computer opponents. It is problems like these that will push the boundaries of adversarial search to its limits.

Figure 11: Speedups for PV Splitting and DTS

Figure 12: Speedups for YBWC

Figure 13: Efficiency for PV Splitting and DTS

Figure 14: Efficiency for YBWC

Bibliography

Cited Works

[1] Russell, Stuart J. and Norvig, Peter. Artificial Intelligence: A Modern Approach, Second Edition. Upper Saddle River, NJ: Pearson Education Inc. 2003. pp. 161-171.

[2] Hsu, Feng-hsiung. Behind Deep Blue: Building the Computer that Defeated the World Chess Champion. Princeton University Press. 2002.

[3] Manohararajah, Valavan. Parallel Alpha-Beta Search on Shared Memory Multiprocessors. University of Toronto. 2001. pp. 21-29.

[4] Steinberg, Igor and Solomon, Marvin. Searching Game Trees in Parallel. University of Wisconsin. pp. 6-9.

[5] Rezaie, Jaleh and Finkel, Raphael. A comparison of some parallel game-tree search algorithms (Revised version). University of Kentucky. pp. 6-9.

[6] Knuth, D. E. and Moore, R. W. An analysis of alpha-beta pruning. Artificial Intelligence 6. 1975. pp. 293-326.

[7] Brockington, Mark G. and Schaeffer, Jonathan. APHID Game-Tree Search. University of Alberta. 1996. pp. 3.

[8] Feldmann, R. Game Tree Search on Massively Parallel Systems. Ph.D. Thesis. University of Paderborn, Paderborn, Germany. 1993. pp. 12.

[9] Mandziuk, Jacek and Osman, Daniel. Alpha-Beta Search Enhancements with a Real-Value Game-State Evaluation Function. ICGA Journal. March 2004.

Cited Pictures

(1) Russell, Stuart J. and Norvig, Peter. Artificial Intelligence: A Modern Approach, Second Edition. Upper Saddle River, NJ: Pearson Education Inc. pp. 164.
