Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 24

Games

Points
Games and strategies
A logic-based approach to games
AND-OR trees/graphs
Static evaluation functions
Tic-Tac-Toe
Minimax
Alpha-beta cutoff
Extensions

CSI 4106, Winter 2005 Games, page 1


Definitions
We consider non-random or semi-random
games
• with full information,
• zero-sum,
• two-person (dual),
• with rational players.
That's mainly board games such as chess,
chequers, go -- sometimes called strategic
games; some paper-and-pencil games are
of this kind too.

CSI 4106, Winter 2005 Games, page 2


Definitions (2)
A (pure) strategy:
a complete set of advance instructions that
specifies a definite choice for every
conceivable situation in which the player
may be required to act.
In a two-player game, a strategy allows the
player to have a response to every move of
the opponent.
Game-playing programs implement a
strategy as a software mechanism that
supplies the right move on request.

CSI 4106, Winter 2005 Games, page 3


A logic-based approach to games

Find a winning strategy by proving that the


game can be won -- use backward chaining.
A very simple game: nim.
 initially, there is one stack of chips;
 a move: select a stack and divide it in two
unequal non-empty stacks;
 a player who cannot move loses the game.
(The player who moves first can win.)

CSI 4106, Winter 2005 Games, page 4


... Games in logic (2)

CSI 4106, Winter 2005 Games, page 5


... Games in logic (3)
The players are A and B. A_wins( P, X ) means "player X moves
in position P and there is a winning continuation for A".
A position is represented as a list of sizes of stacks with 3 or more
chips (only those can be still divided).
A_wins([6], A)

A_wins([5], B) A_wins([4], B)

A_wins([4], A) A_wins([3], A)

A_wins([3], B) A_wins([], B)

A_wins([], A)
Player B loses.
This is an AND/OR tree (actually a
directed, acyclic AND/OR graph).

CSI 4106, Winter 2005 Games, page 6


... Games in logic (4)
A winning strategy would always lead to a win. Here,
such a strategy is described by a subgraph with one
OR edge selected from each OR node. All leaves in
the subgraph must represent wins for player A.
A_wins([6], B)

A_wins([5], A) A_wins([4], A)

A_wins([4], B) A_wins([3], B)

A_wins([3], A) A_wins([], A)

A_wins([], B)

Now we cannot find a winning strategy: why?

CSI 4106, Winter 2005 Games, page 7


... Games in logic (5)
This kind of analysis only works for very small game trees:

A_wins([8], A)

A_wins([7], B)

A_wins([6], A) A_wins([4, 3], A) A_wins([5], A)

A_wins([4], B)

CSI 4106, Winter 2005 Games, page 8


The basic loop
Regularities:
moves of player A sprout from OR nodes,
moves of player B sprout from AND nodes.
Seen from A's perspective, this means that A
chooses one of the moves (the best move, if
possible), and is ready to react to all of B's moves.
The basic loop in a game program:
build as much of the complete tree as seems
reasonable (for example, within a given time limit);
evaluate the incomplete tree;
prune unpromising or bad moves;
make a move;
get the opponent's move.

CSI 4106, Winter 2005 Games, page 9


Static evaluation
A static evaluation function returns the
value of a move without trying to play
(which would mean simulating the rest of
the game but not playing it).
Usually a static evaluation function returns positive
values for positions advantageous to A, negative
values for positions advantageous to B.
If player A is rational, he will choose the
maximal value of a leaf.
Player B will choose the minimal value.

CSI 4106, Winter 2005 Games, page 10


Static evaluation (2)
If we can have (guess or calculate) the value of
an internal node N, we can treat it as if it were a
leaf. This is the basis of the minimax procedure.
No tree would be necessary if we could evaluate
the initial position statically. Normally we need a
tree, and we need look-ahead into it. Further
positions can be evaluated more precisely,
because there is more information, and a more
focussed search.
Minimax works best for large trees, but it can be
useful even in mini-games such as tic-tac-toe.

CSI 4106, Winter 2005 Games, page 11


Tic-Tac-Toe
Let player A be x and let open(x), open(o) mean
the number of lines open to x and o. There are 8
lines. An evaluation function for position P:
f(P) = - if o wins
f(P) = + if x wins
f(P) = open(x) - open(o) otherwise
o
Example: x
open(x) - open(o) = 6 - 4

Assumptions:
only one of symmetrical positions is generated;
we build 2 levels of the game tree (one move --
one response) to have 2-ply lookahead.

CSI 4106, Winter 2005 Games, page 12


Tic-Tac-Toe (2)
x
x

xo x o x x x o o
o o x x
o x
6-5 5-5 6-5 5-5 4-5 5-4 6-4

o x x x x x
o o
o o
5-6 5-5 5-6 6-6 4-6
Player B chooses the minimal backed-up value among level 1 nodes.
Player A chooses the maximal value, and makes the move.
Player B, as a rational agent, selects the optimal response.

CSI 4106, Winter 2005 Games, page 13


o Tic-Tac-Toe (3)
x

o x o x o o
x x x x x
x

o x o x o oo o o o
x x o x x x x xo
o o x x x
3-3 3-2 4-3 4-2 3-2 3-2

B's first three moves are blocking moves. Other moves


lead to + for A: the only finite value is the minimum.
For A this is a three-way tie in the evaluation; the chance
to get more information is to consider more plies.

CSI 4106, Winter 2005 Games, page 14


o
Tic-Tac-Toe (4)
x

xo o o o
x x x x x
x x

x o o oo oo o o
x o x x x x xo x
o x x x xo
2-2 3-2 4-2 4-3 4-3 3-3

Now, what happens if B chooses a weaker move?


The procedure finds a winning continuation: the best
position ensures a win by forced moves.

CSI 4106, Winter 2005 Games, page 15


oo
x Tic-Tac-Toe (5)
x

oo x oo oo oo oo
x x x x x x x
x x x x x x x

oo o ooo ooo oo o
x x x x x x
2-1 3-1 x x x x x x
2-1 3-1 - - - -
Building complete plies is usually not necessary. If we evaluate
a position when it is generated, we may save a lot.
Assume that we are at a minimizing level. If the evaluation
function returns -, we do not need to consider other positions:
- will be the minimum.
The same applies to + at a maximizing level.

CSI 4106, Winter 2005 Games, page 16


Tic-Tac-Toe (6)
This is possible because of the special properties of
the infinite values, but we can achieve a similar effect
for finite values.

x x

xo x o x x x o x
o o
o
6-5 5-5 6-5 5-5 4-5 5-6

CSI 4106, Winter 2005 Games, page 17


Tic-Tac-Toe (7)
x x

xo x o x x x o x
o o
o
6-5 5-5 6-5 5-5 4-5 5-6

The backed-up value of the first node at level 1 is -1,


so the value of the (maximizing) root must be ≥ -1.
When we see 5 - 6 = -1, we know that the value of the
(minimizing) node • must be ≤ -1. The whole subtree
sprouting from • cannot contribute anything and should
not even be built.

CSI 4106, Winter 2005 Games, page 18


Minimax with cut-off
In general, we keep a provisional value in every
node. This value can only increase in an OR
(maximizing) node, and decrease in an AND
(minimizing) node.
If an AND node with the provisional value V has
a child C with a value less than V, we abandon
C.
If an OR node with the provisional value V has
a child C with a value greater that V, we
abandon C.
Provisional values are established as soon as
we "know something", and are propagated up
the tree, from the leaves to the root.

CSI 4106, Winter 2005 Games, page 19


- cut-off (2)

i. We stop searching in and below a minimizing


node N with a provisional value PVN that is less
than or equal to the provisional values of its
maximizing ancestors.
The final value for N is PVN.
ii. We stop searching in and below a maximizing
node N with a provisional value PVN that is
greater than or equal to the provisional values
of its minimizing ancestors.
The final value for N is PVN.

CSI 4106, Winter 2005 Games, page 20


- cut-off (3)
A provisional value of an OR node is called its
alpha-value.
A provisional value of an AND node is called its
beta-value.
During search, the alpha-value of a node is set
to the currently largest of the final values for
descendants;
the beta-value of a node is set to the currently
smallest of the final values for descendants.
(i) is a shallow alpha-cutoff,
(ii) is a shallow beta-cutoff.

CSI 4106, Winter 2005 Games, page 21


- cut-off (4)

0 5 -3 3 3 -3 0 2 -2 3 5 2 5 -5 0 1 5 1 -3 0 -5 5 -3 3 2

CSI 4106, Winter 2005 Games, page 22


Extensions, modifications
"Waiting for quiescence" -- when we reach
the depth limit in the middle of a dynamic
exchange (large amplitude of values).
Secondary search ("feedover") -- "double
check" down a path that seems the best.
Book moves -- "canned" continuations
(openings, endgames), and forced moves.
Disadvantages of minimax:
relying on the optimality of the opponent's
play,
no spectacular sacrifices are possible Queen
lost
Pawn
lost
(winning back beyond the search limit),
the horizon effect. Search
limit
Queen
lost

CSI 4106, Winter 2005 Games, page 23


What next?

This is a technology for one kind of


games, and quite probably not the most
popular (-:). Adventure games require a
very different kind of Artificial
Intelligence. Visit
http://www.gameai.com/ai.html
for a nearly professional perspective.
And, in general, google it.

CSI 4106, Winter 2005 Games, page 24

You might also like