
Game Theory

• Game theory is the science of strategy. It attempts to determine
  mathematically and logically the actions that “players” should take to
  secure the best outcomes for themselves in a wide array of “games.”

• Classification of games according to:
  • number of players
  • kinds of goal
  • information each player has about the game

Number of players
Most board games have two players.
ply: one player’s turn (a.k.a. a half-move with 2 players)
move: one round of all the players’ turns (a.k.a. a turn)
Goal
zero-sum game: your win is the opponent’s loss (1; −1);
trying to win ≡ trying to make your opponent lose.
non-zero-sum game: you could all win or all lose;
focus on your own winning rather than on your opponent losing.
With more than two players, even in a zero-sum game the best strategy may not be
to make every opponent lose.
Information
perfect information: fully observable environment;
complete knowledge of every move your opponent could possibly make.
imperfect information: partially observable environment;
e.g., a random element makes it unforeseeable which moves you and the opponent will be able to make.
Types of Games

                        deterministic                           chance
perfect information     chess, checkers, kalaha, go, Othello    backgammon, Monopoly
imperfect information   battleships, blind tictactoe            bridge, poker, scrabble
Game Tree
For turn-based games: each node in the tree represents a board position, and each
branch represents one possible move.

terminal positions: no possible moves; they represent the end of the game, where scores are given to the players.
branching factor: number of branches at each branching point in the tree.
tree depth: finite or infinite.
transposition: the same board position reached from different sequences of moves (~ cycles).
Games-Typical case
2-person game
Players alternate moves
Zero-sum: one player’s loss is the other’s gain
Perfect information: both players have access to
complete information about the state of the game.
No information is hidden from either player.
No chance (e.g., using dice) involved
Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim,
Othello
Not: Bridge, Solitaire, Backgammon, ...
How to play a game
A way to play such a game is to:
Consider all the legal moves you can make
Compute the new position resulting from each move
Evaluate each resulting position and determine which is
best
Make that move
Wait for your opponent to move and repeat
Key problems are:
Representing the “board”
Generating all legal next boards
Evaluating a position
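The play loop above maps onto a simple one-ply search. Below is a minimal Python
sketch of it; legal_moves, apply_move, and evaluate are hypothetical, game-specific
helpers assumed for the example, not part of any particular library.

```python
# One-ply play: consider every legal move, compute the resulting position,
# evaluate it, and make the best-scoring move.
# legal_moves, apply_move, and evaluate are assumed, game-specific helpers.

def choose_move(board, player, legal_moves, apply_move, evaluate):
    best_move, best_score = None, float("-inf")
    for move in legal_moves(board, player):
        new_board = apply_move(board, move, player)  # position after the move
        score = evaluate(new_board)                  # rate the resulting position
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```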
Evaluation function
An evaluation function (or static evaluator) is used to
evaluate the “goodness” of a game position.
Contrast this with heuristic search, where the evaluation
function was a non-negative estimate of the cost from the
start node to a goal passing through the given node.
The zero-sum assumption allows us to use a single
evaluation function to describe the goodness of a board
with respect to both players.
f(n) >> 0: position n good for me and bad for you
f(n) << 0: position n bad for me and good for you
f(n) near 0: position n is a neutral position
f(n) = +infinity: win for me
f(n) = -infinity: win for you
Evaluation function examples
Example of an evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for
you]
where a 3-length is a complete row, column, or diagonal
Alan Turing’s function for chess
f(n) = w(n)/b(n) where w(n) = sum of the point value of
white’s pieces and b(n) = sum of black’s
Most evaluation functions are specified as a weighted
sum of position features:
f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)
Example features for chess are piece count, piece
placement, squares controlled, etc.
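As a concrete illustration of the Tic-Tac-Toe evaluator above, here is a minimal
Python sketch. The board encoding (a 3x3 list of lists holding 'X', 'O', or None)
is an assumption made for the example, not something fixed by the slides.

```python
# f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you].
# A 3-length (row, column, or diagonal) is "open" for a player if the
# opponent has no mark in it.

LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]                    # rows
    + [[(r, c) for r in range(3)] for c in range(3)]                  # columns
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]    # diagonals
)

def open_lines(board, player):
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if all(board[r][c] != opponent for r, c in line))

def f(board, me='X'):
    you = 'O' if me == 'X' else 'X'
    return open_lines(board, me) - open_lines(board, you)
```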
Game trees
• Problem spaces for typical games are represented as trees
• Root node represents the current board configuration; the player
must decide the best single move to make next
• Static evaluator function rates a board position: f(board) = real
number, with f > 0 for “white” (me) and f < 0 for “black” (you)
• Arcs represent the possible legal moves for a player
• If it is my turn to move, then the root is labeled a "MAX"
node; otherwise it is labeled a "MIN" node, indicating my
opponent's turn.
• Each level of the tree has nodes that are all MAX or all MIN;
nodes at level i are of the opposite kind from those at level i+1
Games vs Search Problems
"Unpredictable" opponent : specifying a move for every possible
opponent reply
Time limits : unlikely to find goal, must approximate
Game Playing Strategy
Maximize winning possibility assuming that opponent will try to
minimize (Minimax Algorithm)
Ignore the unwanted portion of the search tree (Alpha Beta Pruning)
Evaluation (Utility) Function
A measure of the player's chance of winning from a given position
Minimax procedure
Create start node as a MAX node with current board
configuration
Expand nodes down to some depth (a.k.a. ply) of
lookahead in the game
Apply the evaluation function at each of the leaf nodes
“Back up” values for each of the non-leaf nodes until a
value is computed for the root node
At MIN nodes, the backed-up value is the minimum of
the values associated with its children.
At MAX nodes, the backed up value is the maximum of
the values associated with its children.
Pick the operator associated with the child node whose
backed-up value determined the value at the root
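The procedure above can be written compactly as a depth-limited recursion. The
sketch below assumes two game-specific helpers, children(state) for the successor
positions and evaluate(state) for the static evaluator; it is an illustration of
the procedure, not a reference implementation.

```python
# Depth-limited minimax: expand to a fixed ply, apply the static evaluator
# at the leaves, and back values up (min at MIN nodes, max at MAX nodes).

def minimax_value(state, depth, maximizing, children, evaluate):
    succs = children(state)
    if depth == 0 or not succs:          # leaf of the lookahead tree
        return evaluate(state)
    values = [minimax_value(s, depth - 1, not maximizing, children, evaluate)
              for s in succs]
    return max(values) if maximizing else min(values)
```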
Tic-Tac-Toe
[Board example with X and O: e(p) = 6 − 5 = 1]
• Initial State: Board position, a 3x3 matrix with O's and X's.
• Operators: Putting O's or X's in vacant positions, alternately
• Terminal test: Determines when the game is over
• Utility function:
e(p) = (No. of complete rows, columns, or diagonals still open for the player) –
(No. of complete rows, columns, or diagonals still open for the opponent)
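For completeness, a small Python sketch of this formulation, reusing the 3x3 board
encoding (and the LINES constant) assumed in the evaluator sketch above:

```python
# Operators and terminal test for the Tic-Tac-Toe formulation above.

def moves(board):
    """Operators: the vacant positions where a mark can be placed."""
    return [(r, c) for r in range(3) for c in range(3) if board[r][c] is None]

def winner(board):
    """Return 'X' or 'O' if some row, column, or diagonal is complete, else None."""
    for line in LINES:
        marks = {board[r][c] for r, c in line}
        if len(marks) == 1 and None not in marks:
            return marks.pop()
    return None

def terminal(board):
    """Terminal test: somebody has won, or no vacant positions remain."""
    return winner(board) is not None or not moves(board)
```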
Minimax Algorithm
Generate the game tree
Apply the utility function to each terminal state to get
its value
Use these values to determine the utility of the nodes
one level higher up in the search tree
From bottom to top:
For a MAX level, select the maximum value of its successors.
For a MIN level, select the minimum value of its successors.
From the root node, select the move which leads to the highest value.
General Minimax algorithm
For each move by the computer:
1. Perform depth-first search to a terminal state
2. Evaluate each terminal state
3. Propagate upwards the minimax values
   • if opponent's move, propagate up minimum value of children
   • if computer's move, propagate up maximum value of children
4. Choose the move at the root with the maximum of the
minimax values of children
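Step 4 (choosing the root move) can be sketched on top of the minimax_value
function above; moves and apply_move are again assumed, game-specific helpers.

```python
# Root move selection: the computer plays the move whose resulting position
# has the highest backed-up minimax value.

def best_root_move(state, depth, player, moves, apply_move, children, evaluate):
    return max(
        moves(state),
        key=lambda m: minimax_value(apply_move(state, m, player), depth - 1,
                                    False,   # after our move it is the opponent's (MIN) turn
                                    children, evaluate),
    )
```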
Game tree for Tic-Tac-Toe
Describing a Perfect Game of Tic Tac Toe

To begin, let's start by defining what it means to play a perfect game of tic tac toe:
If I play perfectly, every time I play I will either win
the game, or I will draw the game. Furthermore if I
play against another perfect player, I will always draw
the game.
How might we describe these situations
quantitatively? Let's assign a score to the "end game
conditions:"
I win, hurray! I get 10 points!
I lose: I lose 10 points (because the other player gets 10 points).
I draw, whatever. I get zero points; nobody gets any points.
Looking at a Brief Example
To apply this, let's take an example from near the end of a game, where
it is my turn. I am X. My goal here, obviously, is to maximize my end
game score.

• If the top of this image represents the state of the game I see when it is
my turn, then I have some choices to make: there are three places I can
play, one of which clearly results in me winning and earning the 10
points.
• If I don't make that move, O could very easily win. And I don't want O
to win, so my goal here, as the first player, should be to pick the move that maximizes my score.
But What About O?
• What do we know about O? Well, we should assume that O is also playing
to win this game, but relative to us, the first player, O obviously wants
to choose the move that results in the worst score for us; it wants to
pick a move that would minimize our ultimate score. Let's look at things
from O's perspective, starting with the two other game states from above
in which we don't immediately win:

• The choice is clear: O would pick any of the moves that result in a score of -10.
Describing Minimax

The key to the Minimax algorithm is a back and forth between the two players,
where the player whose "turn it is" desires to pick the move with the maximum score.
In turn, the scores for each of the available moves are determined by the opposing
player deciding which of its available moves has the minimum score.
And the scores for the opposing player's moves are again determined by the
turn-taking player trying to maximize its score, and so on all the way down the
move tree to an end state.
A description for the algorithm, assuming X is the "turn
taking player," would look something like:
If the game is over, return the score from X's perspective.
Otherwise get a list of new game states for every possible
move
Create a scores list
For each of these states add the minimax result of that state
to the scores list
If it's X's turn, return the maximum score from the scores
list
If it's O's turn, return the minimum score from the scores
list
You'll notice that this algorithm is recursive: it flips back
and forth between the players until a final score is found.
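The description above maps almost directly onto code. The following is a minimal
Python sketch under the +10 / -10 / 0 scoring convention, reusing the hypothetical
Tic-Tac-Toe helpers (moves, winner, terminal) sketched earlier.

```python
# Recursive minimax for Tic-Tac-Toe; scores are always taken from X's perspective.

def score(board):
    w = winner(board)
    return 10 if w == 'X' else -10 if w == 'O' else 0

def minimax(board, player):
    if terminal(board):                        # game over: return X's score
        return score(board)
    scores = []
    for r, c in moves(board):                  # one new game state per possible move
        board[r][c] = player
        scores.append(minimax(board, 'O' if player == 'X' else 'X'))
        board[r][c] = None                     # undo the move
    return max(scores) if player == 'X' else min(scores)
```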
Let's walk through the algorithm's execution with the full
move tree, and show why, algorithmically, the instant
winning move will be picked:
• It's X's turn in state 1. X generates the states 2, 3, and 4 and calls minimax
on those states.
• State 2 pushes the score of +10 to state 1's score list, because the game is
in an end state.
• States 3 and 4 are not in end states, so 3 generates states 5 and 6 and calls
minimax on them, while state 4 generates states 7 and 8 and calls minimax
on them.
• State 5 pushes a score of -10 onto state 3's score list, and the same happens
for state 7, which pushes a score of -10 onto state 4's score list.
• States 6 and 8 generate the only available moves, which are end states, and
so both of them add the score of +10 to the move lists of states 3 and 4.
• Because it is O's turn in both states 3 and 4, O will seek to find the
minimum score, and given the choice between -10 and +10, both states 3
and 4 will yield -10.
• Finally, the score lists for states 2, 3, and 4 are populated with +10, -10, and
-10 respectively, and state 1, seeking to maximize the score, will choose the
winning move with score +10: state 2.
• That is certainly a lot to take in. And that is why we have a computer
execute this algorithm.
Minimax Algorithm
[Figure: a two-ply minimax example. The leaf nodes carry static evaluator values
2, 7, 1, 8; the MIN nodes back up the minima 2 and 1; the MAX root backs up the
maximum, 2, so the move with backed-up value 2 is the one selected by minimax.]
Observation
The minimax algorithm, as presented above, requires expanding
the entire state space.
This is a severe limitation, especially for problems with a large
state space.
Some nodes in the search can be proven to be irrelevant
to the outcome of the search.
