Game Theory
Number of players
Most board games have two players.
ply: one player's turn (a.k.a. half-move in a two-player game)
move: one round of all the players' turns (a.k.a. turn)
Goal
zero-sum game: your win is the opponent's loss (payoffs 1 and −1)
trying to win ≡ trying to make your opponent lose
non-zero-sum game: you could all win or all lose
focus on your own winning rather than on your opponent losing
With more than two players, even in a zero-sum game, the best strategy may not be
to make every opponent lose.
Information
perfect information: fully observable environment
complete knowledge of every move your opponent could possibly make
imperfect information: partially observable environment
e.g., a random element that makes it unforeseeable which move you and the opponent can make
Types of Games
deterministic vs. chance
terminal positions: positions with no possible move, representing the end of the game; a score is
given to the players
branching factor: number of branches at each branching point in the tree
tree depth: finite or infinite
transposition: the same board position reached by different sequences of moves (~ cycles)
Games-Typical case
2-person game
Players alternate moves
Zero-sum: one player’s loss is the other’s gain
Perfect information: both players have access to
complete information about the state of the game.
No information is hidden from either player.
No chance (e.g., using dice) involved
Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim,
Othello
Not: Bridge, Solitaire, Backgammon, ...
How to play a game
A way to play such a game is to:
Consider all the legal moves you can make
Compute the new position resulting from each move
Evaluate each resulting position and determine which is
best
Make that move
Wait for your opponent to move and repeat
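The steps above can be sketched as a one-move lookahead. This is only an illustration: the function `choose_move` and the toy single-pile game (take 1 to 3 objects, whoever takes the last one wins) are assumptions made for the example, not something defined in these notes.

```python
def choose_move(position, legal_moves, apply_move, evaluate):
    """Pick the move whose resulting position has the best static score."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves(position):              # consider all legal moves
        new_position = apply_move(position, move)   # compute the new position
        score = evaluate(new_position)              # evaluate it
        if score > best_score:                      # remember the best so far
            best_move, best_score = move, score
    return best_move                                # None if there is no legal move


# Hypothetical toy game: a pile of objects, each player removes 1, 2 or 3.
def pile_moves(pile):
    return [k for k in (1, 2, 3) if k <= pile]


def pile_apply(pile, k):
    return pile - k


def pile_eval(pile):
    # Score from the point of view of the player who just moved:
    # leaving a multiple of 4 is good, since the opponent cannot leave one in reply.
    return 1.0 if pile % 4 == 0 else 0.0


best = choose_move(10, pile_moves, pile_apply, pile_eval)
print(best)  # 2 (leaves a pile of 8)
```

The three callbacks correspond to the key problems listed next: representing the board, generating the legal next boards, and evaluating a position.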
Key problems are:
Representing the “board”
Generating all legal next boards
Evaluating a position
Evaluation function
Evaluation function or static evaluator is used to
evaluate the “goodness” of a game position.
Contrast with heuristic search where the evaluation
function was a non-negative estimate of the cost from the
start node to a goal and passing through the given node
The zero-sum assumption allows us to use a single
evaluation function to describe the goodness of a board
with respect to both players.
f(n) >> 0: position n good for me and bad for you
f(n) << 0: position n bad for me and good for you
f(n) near 0: position n is a neutral position
f(n) = +infinity: win for me
f(n) = -infinity: win for you
Evaluation function examples
Example of an evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for
you]
where a 3-length is a complete row, column, or diagonal
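As a rough sketch of this Tic-Tac-Toe evaluator, assume the board is a 9-character string read row by row, with `"X"`, `"O"` and `"."` for an empty square. That representation and the function names are choices made for this example. A line counts as open for a player when the opponent has no mark on it. Wins and losses (the ±infinity cases) are not handled here.

```python
# The eight 3-lengths: three rows, three columns, two diagonals.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


def open_lines(board, player):
    """Number of 3-lengths that contain no mark of the player's opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def evaluate(board, me="X"):
    """f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]."""
    you = "O" if me == "X" else "X"
    return open_lines(board, me) - open_lines(board, you)


# X in the top-left corner, O in the middle of the right-hand column.
board = "X....O..."
print(open_lines(board, "X"), open_lines(board, "O"), evaluate(board))  # 6 5 1
```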
Alan Turing’s function for chess
f(n) = w(n)/b(n) where w(n) = sum of the point value of
white’s pieces and b(n) = sum of black’s
Most evaluation functions are specified as a weighted
sum of position features:
f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)
Example features for chess are piece count, piece
placement, squares controlled, etc.
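A weighted sum of features might be coded along these lines. The two features (material difference and mobility difference), the dictionary keys, and the weights are hypothetical, chosen only to show the shape of the computation.

```python
def weighted_eval(position, features, weights):
    """f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)."""
    return sum(w * feat(position) for w, feat in zip(weights, features))


# Hypothetical features computed from a summary of the position.
def material_diff(position):
    return position["material_white"] - position["material_black"]


def mobility_diff(position):
    return position["moves_white"] - position["moves_black"]


# Illustrative numbers only: white is 3 points of material and 5 legal moves ahead.
position = {
    "material_white": 39, "material_black": 36,
    "moves_white": 30, "moves_black": 25,
}

score = weighted_eval(position, [material_diff, mobility_diff], [1.0, 0.1])
print(score)  # 3.5
```

In practice the weights are tuned by hand or learned. The values here are placeholders.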
Game trees
Problem spaces for typical games are represented as trees
Root node represents the current board configuration; player
must decide the best single move to make next
The static evaluator function rates a board position: f(board) is a real
number, with f > 0 favoring “white” (me) and f < 0 favoring “black” (you)
Arcs represent the possible legal moves for a player
If it is my turn to move, then the root is labeled a "MAX"
node; otherwise it is labeled a "MIN" node, indicating my
opponent's turn.
Each level of the tree has nodes that are all MAX or all MIN;
nodes at level i are of the opposite kind from those at level i+1
Games vs Search Problems
"Unpredictable" opponent: a solution must specify a move for every possible
opponent reply
Time limits: unlikely to search all the way to a goal, so we must approximate
Game Playing Strategy
Maximize your chance of winning, assuming that the opponent will try to
minimize it (Minimax Algorithm)
Ignore the portions of the search tree that cannot affect the decision (Alpha-Beta Pruning)
Evaluation (Utility) Function
A measure of the player's chance of winning
Minimax procedure
Create start node as a MAX node with current board
configuration
Expand nodes down to some depth (a.k.a. ply) of
lookahead in the game
Apply the evaluation function at each of the leaf nodes
“Back up” values for each of the non-leaf nodes until a
value is computed for the root node
At MIN nodes, the backed-up value is the minimum of
the values associated with its children.
At MAX nodes, the backed-up value is the maximum of
the values associated with its children.
Pick the operator associated with the child node whose
backed-up value determined the value at the root
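The procedure above can be sketched as a depth-limited recursion. The callbacks `successors` and `evaluate`, and the small hand-made tree used to exercise them, are assumptions for illustration. A real game would plug in its own move generator and static evaluator.

```python
def minimax(state, depth, is_max, successors, evaluate):
    """Back up static values from the lookahead frontier to `state`."""
    children = successors(state)
    if depth == 0 or not children:       # frontier or terminal position
        return evaluate(state)
    values = [minimax(c, depth - 1, not is_max, successors, evaluate)
              for c in children]
    return max(values) if is_max else min(values)


def best_child(state, depth, successors, evaluate):
    """The root is a MAX node: pick the child with the largest backed-up value."""
    return max(successors(state),
               key=lambda c: minimax(c, depth - 1, False, successors, evaluate))


# Hypothetical tree: A is the root, D..G are leaves; STATIC holds f(n) per node.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
STATIC = {"A": 0, "B": 1, "C": 4, "D": 3, "E": 5, "F": 2, "G": 9}


def successors(state):
    return TREE.get(state, [])


def evaluate(state):
    return STATIC[state]


print(minimax("A", 2, True, successors, evaluate))   # 3 = max(min(3, 5), min(2, 9))
print(best_child("A", 2, successors, evaluate))      # B
print(best_child("A", 1, successors, evaluate))      # C (shallower lookahead)
```

The last two lines show that the chosen move can change with the lookahead depth, since a shallow search trusts the static values of B and C directly.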
Tic-Tac-Toe
[Figure: example board position with X and O marks, for which e(p) = 6 - 5 = 1]
Initial State: Board position, a 3x3 matrix of O's and X's.
Operators: Putting O's or X's in vacant positions alternately
Terminal test: Determines whether the game is over
Utility function:
e(p) = (No. of complete rows, columns or diagonals that are
still open for the player) - (No. of complete rows, columns or
diagonals that are still open for the opponent)
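One possible encoding of the initial state, operators and terminal test is sketched below. It assumes the board is a tuple of nine cells read row by row and that X moves first. Both are choices made for this example, as are the function names.

```python
EMPTY = " "
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]


def initial_state():
    return (EMPTY,) * 9


def player_to_move(board):
    # Assumption: X starts, so X moves whenever the mark counts are equal.
    return "X" if board.count("X") == board.count("O") else "O"


def successors(board):
    """Operators: place the mover's mark in each vacant position."""
    mark = player_to_move(board)
    return [board[:i] + (mark,) + board[i + 1:]
            for i, cell in enumerate(board) if cell == EMPTY]


def winner(board):
    for a, b, c in LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(board):
    """Terminal test: someone has three in a line, or the board is full."""
    return winner(board) is not None or EMPTY not in board


start = initial_state()
print(len(successors(start)), is_terminal(start))  # 9 False
```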
Minimax Algorithm
Generate the game tree
Apply the utility function to each terminal state to get
its value
Use these values to determine the utility of the nodes
one level higher up in the search tree
From bottom to top
For a MAX level, select the maximum value of its
successors
For a MIN level, select the minimum value of its successors
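For a fully generated tree, the bottom-to-top backing up can be sketched with nested lists, where a number is a terminal utility and a list is an internal node. The nested-list encoding and the leaf values 2, 7, 1, 8 are illustrative assumptions.

```python
def minimax_value(node, is_max):
    """Utility of `node`, computed bottom-up from the terminal states."""
    if isinstance(node, (int, float)):      # terminal state: utility is given
        return node
    child_values = [minimax_value(child, not is_max) for child in node]
    return max(child_values) if is_max else min(child_values)


# MAX root with two MIN children, each with two terminal states.
tree = [[2, 7], [1, 8]]

min_level = [minimax_value(subtree, False) for subtree in tree]
root_value = minimax_value(tree, True)
print(min_level, root_value)  # [2, 1] 2
```

Each MIN node takes the smaller of its two leaves (2 and 1), and the MAX root then takes the larger of those, giving 2.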
If the top of this image represents the state of the game I see when it is
my turn, then I have some choices to make. There are three places I can
play, one of which clearly results in me winning and earning the 10
points.
If I don't make that move, O could very easily win. And I don't want O
to win, so my goal here, as the first player, should be to pick the
move with the maximum score.
What do we know about O? We should assume that O is also playing
to win this game. Relative to us, the first player, O obviously
wants to choose the move that results in the worst score for us: it wants to
pick a move that would minimize our ultimate score. Let's look at things
from O's perspective, starting with the two other game states from above
in which we don't immediately win:
The choice is clear: O would pick any of the moves that result in a score of -10.
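The reasoning for X and O can be checked with a small full-depth minimax for Tic-Tac-Toe. This is a sketch under stated assumptions: the board is a 9-character string read row by row, a win for X scores +10, a win for O scores -10, a draw scores 0, and the position used is a made-up one with three empty squares, not necessarily the one pictured.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board, player):
    """Score of `board` for X when `player` is about to move."""
    won = winner(board)
    if won == "X":
        return 10
    if won == "O":
        return -10
    if " " not in board:
        return 0                                   # draw
    other = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == " "]
    return max(values) if player == "X" else min(values)   # X maximizes, O minimizes


def best_move(board, player):
    other = "O" if player == "X" else "X"
    pick = max if player == "X" else min
    empty = [i for i, cell in enumerate(board) if cell == " "]
    return pick(empty, key=lambda i: minimax(board[:i] + player + board[i + 1:], other))


# Hypothetical position, X to move:   O . X
#                                     X . .
#                                     X O O
board = "O XX  XOO"
print(best_move(board, "X"), minimax(board, "X"))   # 4 10
print(minimax("OXXX  XOO", "O"))                    # -10: O wins after X plays square 1
```

Playing the centre (index 4) completes a diagonal for X. Either of the other two moves lets O take the centre and win, which is why those branches score -10.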
Describing Minimax
[Figure: minimax example tree shown in stages, with leaf values 2, 7, 1, 8 and backed-up values 2 and 1 at the level above]