Lecture 09 & 10: Game Playing, Minimax and Alpha-Beta
Search
LECTURE # 09 & 10
8/7/23
SPRING 2017
FAST – NUCES, FAISALABAD CAMPUS
Zain Iqbal
Zain.iqbal@nu.edu.pk
In which we examine the
problems that arise when we
try to plan ahead in a world
where other agents are
planning against us.
Game Playing
Why study games
• Clear criteria for success
• Offer an opportunity to study
problems involving {hostile,
adversarial, competing} agents.
• Historical reasons
• Fun
• Interesting, hard problems
Game-Playing Agent
[Figure: an agent connected to its environment through sensors and actuators; the agent's decision procedure is marked "?".]
What Kinds of Games?
Mainly games of strategy with the following
characteristics:
1. Sequence of moves to play
2. Rules that specify possible moves
3. Rules that specify a payment for each
move
4. Objective is to maximize your payment
Kinds of Games
• Deterministic
• Turn-taking
• 2-player
• Zero-sum
• Perfect information
Typical case
• 2-person game
• Players alternate moves
• Zero-sum: one player’s loss is the other’s gain
• Perfect information: both players have access to
complete information about the state of the game.
No information is hidden from either player.
• No chance (e.g., using dice) involved
• Examples: Tic-Tac-Toe, Checkers, Chess, Go,
Nim, Othello
• Not: Bridge, Solitaire, Backgammon, ...
Types of games
[Table: games classified along two axes. Columns: deterministic, chance. Rows: perfect information, imperfect information.]
Differences from problem
solving
• The opponent makes its own choices!
• Each choice the game-playing agent makes
depends on how the opponent responds
• Playing quickly may be important: we need a
good way of approximating solutions and
improving search
Game as Search Problem
• Initial State: the board position and the player to
move
• Player: which player has the move in a
state
• Actions: returns a list of legal (move, state)
pairs for a state
• Result: the state that results from a move
• Terminal Test: determines when the game
is over
• Utility function: gives a numeric value for
a terminal state
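The components above can be sketched in code. The following is a minimal illustration, assuming a hypothetical single-pile take-away game (each player removes 1, 2 or 3 stones; whoever takes the last stone wins). The game rules, the `State` type and all function names are assumptions chosen for the example, not part of the lecture.

```python
from typing import List, NamedTuple


class State(NamedTuple):
    stones: int   # stones left in the pile
    to_move: str  # "MAX" or "MIN"


def initial_state(stones: int = 5) -> State:
    """Initial state: the pile size and the player to move (MAX starts)."""
    return State(stones, "MAX")


def player(s: State) -> str:
    """Which player has the move in state s."""
    return s.to_move


def actions(s: State) -> List[int]:
    """Legal moves: take 1, 2 or 3 stones, but never more than remain."""
    return [k for k in (1, 2, 3) if k <= s.stones]


def result(s: State, a: int) -> State:
    """The state reached by taking a stones; the turn passes to the other player."""
    return State(s.stones - a, "MIN" if s.to_move == "MAX" else "MAX")


def terminal_test(s: State) -> bool:
    """The game is over when the pile is empty."""
    return s.stones == 0


def utility(s: State) -> int:
    """Value of a terminal state from MAX's point of view.

    The player who is NOT to move took the last stone and therefore won.
    """
    return 1 if s.to_move == "MIN" else -1
```

For example, `result(initial_state(5), 3)` gives a state with 2 stones and MIN to move; if MIN then takes both, the state is terminal with utility -1.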
Utility Function
Example: Tic Tac Toe
Game Trees
This is an example of a
partial game tree for the
game tic-tac-toe.
Even for this simple game,
the game tree is very large.
Minimax
• Minimax is a method used to evaluate game
trees.
• A static evaluator is applied to leaf nodes,
and values are passed back up the tree to
determine the best score the computer can
obtain against a rational opponent.
Two Agents
• MAX
• Wants to maximize the result of the utility
function
• Winning strategy if, on MIN's turn, a win is
obtainable for MAX for all moves that MIN can
make
• MIN
• Wants to minimize the result of the utility
function
• Winning strategy if, on MAX's turn, a win is
obtainable for MIN for all moves that MAX can
make
Minimax tree
[Figure: game tree whose levels alternate, from the root down: Max, Min, Max, Min.]
Starting point: look at the entire tree
Minimax Function
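\[
\text{MINIMAX-VALUE}(n) =
\begin{cases}
\text{UTILITY}(n) & \text{if } n \text{ is a terminal state} \\
\max_{s \in \text{Successors}(n)} \text{MINIMAX-VALUE}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \text{Successors}(n)} \text{MINIMAX-VALUE}(s) & \text{if } n \text{ is a MIN node}
\end{cases}
\]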
Minimax algorithm
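A minimal sketch of the recursion, assuming the game tree is given explicitly as nested Python lists whose leaves are integer utilities. This representation and the names `minimax_value` and `best_move` are assumptions for illustration; a real game would generate successors from a state instead.

```python
from typing import List, Union

# A tree is either a leaf utility (int) or a list of subtrees.
Tree = Union[int, List["Tree"]]


def minimax_value(node: Tree, is_max: bool = True) -> int:
    """Back up utilities from the leaves: max at MAX nodes, min at MIN nodes."""
    if isinstance(node, int):  # terminal state: return its utility
        return node
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


def best_move(root: List[Tree]) -> int:
    """Index of the root move (MAX to play) with the highest backed-up value."""
    values = [minimax_value(child, is_max=False) for child in root]
    return values.index(max(values))


# Three-ply tree with leaf values 5 2 1 3 6 2 0 7 (levels: Max, Min, Max, leaves).
tree = [[[5, 2], [1, 3]], [[6, 2], [0, 7]]]
print(minimax_value(tree))  # 6
print(best_move(tree))      # 1, i.e. the right-hand edge
```

The lower Max level backs up 5, 3, 6 and 7; the Min level backs up 3 and 6; the root takes the maximum, 6.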
Minimax – Animated Example
Max (root): 6
Min: 3 6
Max: 5 3 6 7
Leaves: 5 2 | 1 3 | 6 2 | 0 7
The computer can obtain 6 by choosing the right-hand edge from the first node.
Minimax algorithm: Complexity
• We need to explore the complete game tree
before making the decision
Properties of minimax
• Complete?
• Yes (if tree is finite).
• Optimal?
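• Yes (against an optimal opponent)
• Time complexity?
• O(b^m), where b is the branching factor and m is the maximum depth of the tree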
• Space complexity?
• O(bm) (depth-first search, generate all actions at once)
• O(m) (backtracking search, generate actions one at a
time)
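To make the exponential time bound concrete, here is a small sketch that counts the nodes a depth-first traversal touches in a uniform tree. The uniform-tree assumption and the function names are illustrative only.

```python
def count_visited(b: int, m: int) -> int:
    """Nodes visited by a full depth-first traversal of a uniform tree
    with branching factor b and depth m (the root is at depth 0)."""
    if m == 0:
        return 1
    return 1 + sum(count_visited(b, m - 1) for _ in range(b))


def closed_form(b: int, m: int) -> int:
    """Same count as the geometric series 1 + b + b^2 + ... + b^m."""
    return sum(b ** d for d in range(m + 1))


print(count_visited(2, 3))  # 15
print(count_visited(3, 2))  # 13
```

The count is dominated by the b^m leaves, while the recursion itself is never more than m + 1 calls deep, which is where the linear space bounds come from.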
Multiplayer games
• Each node must hold a vector of values
• For example, for three players A, B, C the vector is (v_A, v_B, v_C)
• The backed-up vector at node n will always be the one that
maximizes the payoff of the player choosing at n
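The vector backup can be sketched as follows, assuming players move in a fixed rotation (A, then B, then C), leaves are tuples of utilities and internal nodes are lists. The tree values and the name `maxn` are made up for illustration.

```python
from typing import List, Tuple, Union

Vector = Tuple[int, ...]
Node = Union[Vector, List["Node"]]


def maxn(node: Node, player: int = 0, n_players: int = 3) -> Vector:
    """Back up the utility vector that is best for the player choosing at this node."""
    if isinstance(node, tuple):  # leaf: one utility per player
        return node
    next_player = (player + 1) % n_players
    child_vectors = [maxn(child, next_player, n_players) for child in node]
    # The chooser compares only its own component of each vector.
    return max(child_vectors, key=lambda v: v[player])


# Player A (index 0) moves at the root, player B (index 1) at the next level.
tree = [[(1, 2, 6), (4, 3, 3)], [(6, 1, 2), (7, 4, 1)]]
print(maxn(tree))  # (7, 4, 1)
```

Here B backs up (4, 3, 3) on the left and (7, 4, 1) on the right, and A then prefers the vector with the larger first component.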
Searching Game Trees
• Exhaustively searching a game tree is not usually
a good idea.
• Even for a game as simple as tic-tac-toe there are
over 350,000 nodes in the complete game tree.
• An additional problem is that the computer only
gets to choose every other move along a path through the tree;
the opponent chooses the others.
Pruning
• We can use a branch-and-bound technique
to reduce the number of states that must be
examined to determine the value of a tree.
Minimax with
Alpha-Beta Cutoffs
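• Alpha is a lower bound on the value of
maximizing nodes: the best value MAX is
already guaranteed along the current path.
• Beta is an upper bound on the value of
minimizing nodes: the best value MIN is
already guaranteed along the current path.
• Both alpha and beta get passed down
the tree during the Minimax search.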
Usage of Alpha & Beta
• At minimizing nodes, we stop evaluating children as soon
as we get a child whose value is less than or equal to the
current lower bound (alpha).
• At maximizing nodes, we stop evaluating children as soon
as we get a child whose value is greater than or equal to the
current upper bound (beta).
• Some branches will never be played by rational players
since they include suboptimal decisions (for either
player)
Alpha & Beta
• At the root of the search tree, alpha is set to
−∞ and beta is set to +∞.
• Maximizing nodes update alpha from the
values of children.
• Minimizing nodes update beta from the
values of children.
• If alpha ≥ beta, stop evaluating children.
Movement of Alpha and
Beta
• Each node passes the current values of alpha and beta to
each child node it evaluates.
• Child nodes update their own copies of alpha and beta, but
do not pass alpha or beta back up the tree.
• Minimizing nodes return beta as the value of the node.
• Maximizing nodes return alpha as the value of the
node.
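A minimal sketch of minimax with alpha-beta cutoffs that follows the rules above: bounds are passed down by value, maximizing nodes return alpha and minimizing nodes return beta. It assumes the tree is given as nested lists with numeric leaves and that the root starts with alpha = −∞ and beta = +∞; the function name is an assumption for illustration.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, is_max=True):
    """Value of node under alpha-beta search.

    alpha and beta are local copies: updates made here are visible to the
    children evaluated afterwards, but never to the parent.
    """
    if not isinstance(node, list):  # leaf: return its static value
        return node
    if is_max:
        for child in node:
            alpha = max(alpha, alphabeta(child, alpha, beta, False))
            if alpha >= beta:  # MIN above would never allow this line of play
                break
        return alpha  # maximizing nodes return alpha
    for child in node:
        beta = min(beta, alphabeta(child, alpha, beta, True))
        if alpha >= beta:  # MAX above already has something at least as good
            break
    return beta  # minimizing nodes return beta


tree = [[[5, 2], [1, 3]], [[6, 2], [0, 7]]]
print(alphabeta(tree))  # 6, the same value plain minimax gives
```

Because nodes return the clamped bound rather than the raw child value, a pruned node's return value is only a bound; the value at the root, where the window is unbounded, is exact.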
Alpha-Beta Pruning: Example
[Figures: step-by-step alpha-beta pruning example, one slide per step.]
Another Example
The Effectiveness of Alpha-Beta
• The effectiveness depends on the order in
which children are visited.
• In the best case (the best child is always
examined first), the effective branching
factor will be reduced from b to sqrt(b), so the
search can go about twice as deep in the same time.
• In an average case (random values of
leaves) the branching factor is reduced to
about b / log b.
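The effect of move ordering can be observed by counting leaf evaluations. The sketch below assumes a two-ply tree given as nested lists; the two trees contain the same leaves grouped in the same way and differ only in the order children are visited. The tree values and the name `alphabeta_count` are illustrative assumptions.

```python
import math


def alphabeta_count(node, alpha=-math.inf, beta=math.inf, is_max=True):
    """Alpha-beta search that returns (value, number_of_leaves_evaluated)."""
    if not isinstance(node, list):
        return node, 1
    leaves = 0
    for child in node:
        value, n = alphabeta_count(child, alpha, beta, not is_max)
        leaves += n
        if is_max:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:  # cutoff: remaining children cannot matter
            break
    return (alpha if is_max else beta), leaves


# Best subtree first, and the lowest leaf first inside each MIN node.
good_order = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]
# Best subtree last, and the lowest leaf last inside each MIN node.
bad_order = [[6, 4, 2], [14, 5, 2], [12, 8, 3]]

print(alphabeta_count(good_order))  # (3, 5)
print(alphabeta_count(bad_order))   # (3, 9)
```

Both orderings return the same value, but the good ordering evaluates 5 of the 9 leaves while the bad ordering evaluates all 9. For a uniform tree with b = 3 and depth 2, five leaves is the minimum any ordering can achieve.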
Reading Material
• Russell & Norvig: Chapter 5
• George F. Luger: Chapter 4