Adversarial Search

Adversarial
Search
Games vs. search problems
• "Unpredictable" opponent  specifying a move for every
possible opponent reply
• Time limits  unlikely to find goal, must approximate
Perfect Decisions in two person Games
• The general case of a game with two players, whom we

will call MAX and M1N,
• MAX moves first, and then they take turns moving until
the game is over.
• At the end of the game, points are awarded to the winning

player.
Formal definition of game as search
problem
• The initial state, which includes the board position and an
indication of whose move it is.
• A set of operators, which define the legal moves that a player can
make.
• A terminal test, which determines when the game is over. States
where the game has ended are called terminal states.
• A utility function (also called a payoff function), which gives a
numeric value for the outcome of a game. In chess, the outcome is a
win, loss, or draw, which we can represent by the values +1, —1, or 0
Perfect Decisions in two person
Games
• MAX would have to search for a sequence of moves that leads to a
terminal state that is a winner then go ahead and make the first
move in the sequence.
• Unfortunately, MIN has something to say about it.
• MAX therefore must find a strategy that will lead to a winning
terminal state regardless of what MIN does, where the strategy
includes the correct move for MAX for each possible move by MIN.
• We will begin by showing how to find the optimal (or rational)
strategy.
Tic-Tac-Toe Game tree (2-player,
deterministic, turns)
• From the initial state, has a choice of nine
possible moves.
• Play alternates between MAX placing x's and MIN
placing o's until we reach leaf nodes
corresponding to terminal states: states where
one player has three in a row or all the squares
are filled.
• The number on each leaf node indicates the utility
value of the terminal state from the point of view
of MAX; high values are assumed to be good for
MAX and bad for MIN.
Generate Game Tree
Searching with an opponent
Searching with an opponent
• Heuristic approximation: defining an evaluation function which indicates
how close a state is from a winning (or losing) move
• This function includes domain information.
• It does not represent a cost or a distance in steps.
• Conventionally:
• A winning move is represented by the value“+∞”.
• A losing move is represented by the value “-∞”.
• The algorithm searches with limited depth.
• Each new decision implies repeating part of the search.

Evaluation Function
• The game tree is fully developed to a given number of levels.
• An evaluation function is applied to each leaf node. These values
represent how good each node is for the max player.
• These values are then propagated upwards, indicating the next
best move for the max player
• The success of the minimax algorithm depends on how accurate
the evaluation function is.
• MAX plays with “X” and desires maximizing e.
• MIN plays with “0” and desires minimizing e.
Evaluation Function
• (3X + O) – (3Z + Y)
• X = Number of rows, columns or diagonals with two X’s and no
O’s
• O = Number of rows, columns or diagonals with one X and no O’s
• Z = Number of rows, columns or diagonals with two O’s and no
X’s
• Y = Number of rows, columns or diagonals with one O and no X’s
Evaluation Function
• (3X + O) – (3Z + Y)
X O X
X O O X X O
X O O X X O O
Another Evaluation Function
• e(n) =(number of rows, columns and diagonals available
to Max) - (number of rows, columns and diagonals
available to Min)
X O O X X
O
e(n) = 6 - 4 = 2 e(n) = 4 - 3 = 1
The minimax algorithm
• The minimax algorithm computes the minimax decision from
the current state.
• It uses a simple recursive computation of the minimax values of
each successor state:
• directly implementing the defining equations.
• The recursion proceeds all the way down to the leaves of the tree.
• Then the minimax values are backed up through the tree as the
recursion unwinds.
The minimax algorithm
• The algorithm first recurs down to the tree bottom-left
nodes
• and uses the Utility function on them to discover that their values
are 3, 12 and 8.
Tic Tac Toe Example
Tic Tac Toe Example
Tic Tac Toe Example
The minimax algorithm: problems
• For real games the time cost of minimax is totally impractical, but
this algorithm serves as the basis:
• for the mathematical analysis of games and
• for more practical algorithms
• Problem with minimax search:
• The number of game states it has to examine is exponential in the number of
moves.
• Unfortunately, the exponent can’t be eliminated, but it can be cut in
half.
Alpha-beta pruning
• It is possible to compute the correct minimax decision
without looking at every node in the game tree.
• Alpha-beta pruning allows to eliminate large parts of the
tree from consideration, without influencing the final
decision.
Alpha-beta pruning
• Alpha-beta pruning gets its name from two parameters.
• They describe bounds on the values that appear anywhere along
the path under consideration:
• α = the value of the best (i.e., highest value) choice found so far along
the path for MAX
• β = the value of the best (i.e., lowest value) choice found so far along
the path for MIN
Alpha-beta pruning
• Alpha-beta search updates the values of α and β as it goes
along.
• It prunes the remaining branches at a node (i.e.,
terminates the recursive call)
• as soon as the value of the current node is known to be worse
than the current α or β value for MAX or MIN, respectively.
Alpha-beta pruning
• The leaves below B have the values 3, 12 and 8.
• The value of B is exactly 3.
• It can be inferred that the value at the root is at least 3, because MAX has a
choice worth 3.
B
Alpha-beta pruning
• C, which is a MIN node, has a value of at most 2.
• But B is worth 3, so MAX would never choose C.
• Therefore, there is no point in looking at the other successors of C.
B C
Alpha-beta pruning
• D, which is a MIN node, is worth at most 14.
• This is still higher than MAX’s best alternative (i.e., 3), so D’s other
successors are explored.
B C D
Alpha-beta pruning
• The second successor of D is worth 5, so the exploration continues.
B C D
Alpha-beta pruning
• The third successor is worth 2, so now D is worth exactly 2.
• MAX’s decision at the root is to move to B, giving a value of 3
B C D
Effectiveness of Alpha-Beta
Search
• Worst-Case
• branches are ordered so that no pruning takes place. In this case
alpha-beta gives no improvement over exhaustive search
• Best-Case
• each player’s best move is the left-most child (i.e., evaluated first)
• in practice, performance is closer to best rather than worst-case
Alpha-Beta Pruning
• Pruning does not affect final results
• Entire subtrees can be pruned.
• Good move ordering improves effectiveness of pruning
Example
-which nodes can be pruned?
5 6
3 4 1 2 7 8
Answer to Example
Max
Answer: NONE! Because the most favorable
nodes for both are explored last (i.e., in the
diagram, are on the right-hand side).
Min
Max
5 6
3 4 1 2 7 8
Second Example
3 4
6 5 8 7 2 1
Second Example
Max -which nodes can be pruned?
Answer: LOTS! Because the most favorable
nodes for both are explored first (i.e., in the
diagram, are on the left-hand side).
Min
Max
3 4
6 5 8 7 2 1

Adversarial Search

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Adversarial Search

Uploaded by

Copyright:

Available Formats

Adversarial

• The general case of a game with two players, whom we

• At the end of the game, points are awarded to the winning

• Each new decision implies repeating part of the search.

You might also like