
Unit 3 – Adversarial Search and Games

 Game Theory
 Optimal Decisions in Games
- MiniMax algorithm
- Heuristic alpha-beta tree pruning
 Stochastic games
 Monte Carlo tree search
 Limitations of game search algorithms
 Constraint Satisfaction Problems (CSP)
- N-Queen problem
- Graph (node) coloring problem
 Backtracking Search for CSP
 Constraint propagation
- Inference in CSPs

What is Adversarial Search?

 Adversarial search is a game-playing technique in which agents operate in a
competitive environment.
 These agents compete with one another and try to defeat one another in order to win the
game. Such conflicting goals give rise to adversarial search.
 It arises in situations where more than one agent is searching for the solution in the same
search space.

 Example: game playing (chess, tic-tac-toe).

 An environment with more than one agent (player) is termed a multi-agent environment,
in which each agent is an opponent of the others and plays against them.
 Real-life example: playing a soccer match is a multi-agent environment.
 Searches in which two or more players with conflicting goals explore the same
search space for a solution are called adversarial searches, often known as games.

What is game theory?

 Game theory is the process of modeling the strategic interaction between two or more
players in a situation with set rules and outcomes.
 Game theory studies interactive decision-making, where the outcome for each participant or
"player" depends on the actions of all.
 It covers any interaction between two or more players in which each player's payoff is
affected by their own decisions and the decisions made by others.

A game consists of:

 Players
 Actions
 Strategy
 Payoff (the reward the players receive when a certain outcome is achieved; it can
be positive or negative)
Game Playing:

Why do AI researchers study game playing?

1. It’s a good reasoning problem, formal and nontrivial.


2. Direct comparison with humans and other computer programs is easy.

Game Playing is an important domain of artificial intelligence. Games don’t require much
knowledge; the only knowledge we need to provide is the rules, legal moves and the conditions of
winning or losing the game.
Both players try to win the game. So, both of them try to make the best move possible at each turn.
Searching techniques like BFS (Breadth-First Search) are not practical for this, because the branching
factor is very high and the search would take far too long. So we need other search procedures that
improve the –
 Generate procedure so that only good moves are generated.
 Test procedure so that the best move can be explored first.

Games as Adversarial Search

• States: – board configurations.
• Initial state: – the board position and which player is to move.
• Successor function: – returns a list of (move, state) pairs, each indicating a legal move and the
resulting state.
• Terminal test: – determines when the game is over.
• Utility function: – gives a numeric value in terminal states.
A utility function (also called a payoff function) gives the final numeric value for a game that
ends in terminal state s for player p.
For chess, the outcomes are a win, loss, or draw, with payoff values +1, 0, and ½ respectively.
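As a concrete illustration, these components can be written out in code. The following minimal Python sketch models tic-tac-toe; the class and method names are our own choices for this example, not a standard API.

    class TicTacToe:
        # A minimal sketch of the formal game components for tic-tac-toe.

        def initial_state(self):
            # Empty 3x3 board (None = empty cell); 'X' is to move.
            return (tuple([None] * 9), 'X')

        def successors(self, state):
            # Yield (move, resulting_state) pairs for every legal move.
            board, player = state
            nxt = 'O' if player == 'X' else 'X'
            for i, cell in enumerate(board):
                if cell is None:
                    yield i, (board[:i] + (player,) + board[i + 1:], nxt)

        def is_terminal(self, state):
            # Terminal test: someone has won, or the board is full.
            board, _ = state
            return self._winner(board) is not None or all(board)

        def utility(self, state, player='X'):
            # Payoff from `player`'s viewpoint: +1 win, 0 loss, 1/2 draw.
            winner = self._winner(state[0])
            if winner is None:
                return 0.5
            return 1.0 if winner == player else 0.0

        def _winner(self, b):
            lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
                     (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]
            for i, j, k in lines:
                if b[i] is not None and b[i] == b[j] == b[k]:
                    return b[i]
            return None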

Major components of Game playing programs:

 Mini-Max tree generation


 Heuristic Improvements
 Iterative deepening
 Alpha-Beta Pruning
 Program Optimization
Game playing mainly involves games of strategy with the following characteristics:

1. Sequence of moves to play


2. Rules that specify possible moves
3. Rules that specify a payment for each move
4. Objective is to maximize your payment
Working of the MiniMax Algorithm:
The mini-max algorithm is a recursive (backtracking) algorithm used in decision-making
and game theory. It provides an optimal move for the player, assuming that the opponent also plays
optimally.
 The mini-max algorithm uses recursion to search through the game tree.
 The min-max algorithm is mostly used for game playing in AI, such as chess, checkers,
tic-tac-toe, Go, and various other two-player games. The algorithm computes the minimax
decision for the current state.
 In this algorithm two players play the game; one is called MAX and the other is called MIN.
 Each player tries to ensure that the opponent gets the minimum benefit while they
get the maximum benefit.
 Both players are opponents of each other: MAX selects the maximized
value and MIN selects the minimized value.
 The minimax algorithm performs a depth-first search to explore the
complete game tree.
 The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then
backtracks up the tree as the recursion unwinds.

Pseudo-code for MinMax Algorithm:

function minimax(node, depth, maximizingPlayer) is

    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then                // for the Maximizer player
        maxEva = -infinity
        for each child of node do
            eva = minimax(child, depth - 1, false)
            maxEva = max(maxEva, eva)       // take the maximum of the values
        return maxEva

    else                                    // for the Minimizer player
        minEva = +infinity
        for each child of node do
            eva = minimax(child, depth - 1, true)
            minEva = min(minEva, eva)       // take the minimum of the values
        return minEva

Initial call:
minimax(rootNode, 3, true)

https://www.youtube.com/watch?v=S7L4-KDTvEE Min max algorithm
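The pseudocode above translates directly into runnable Python. In this minimal sketch, the helper functions children, evaluate, and is_terminal are assumed to be supplied by the caller to describe the game tree; their names are illustrative only, not from any library.

    import math

    def minimax(node, depth, maximizing_player, children, evaluate, is_terminal):
        # Depth-limited minimax, mirroring the pseudocode above.
        if depth == 0 or is_terminal(node):
            return evaluate(node)
        if maximizing_player:
            max_eva = -math.inf
            for child in children(node):
                eva = minimax(child, depth - 1, False, children, evaluate, is_terminal)
                max_eva = max(max_eva, eva)     # take the maximum of the values
            return max_eva
        else:
            min_eva = math.inf
            for child in children(node):
                eva = minimax(child, depth - 1, True, children, evaluate, is_terminal)
                min_eva = min(min_eva, eva)     # take the minimum of the values
            return min_eva

    # Example: a tiny two-ply tree given as nested lists (leaves are scores).
    tree = [[3, 5], [2, 9]]
    value = minimax(tree, 3, True,
                    children=lambda n: n,
                    evaluate=lambda n: n,
                    is_terminal=lambda n: not isinstance(n, list))
    print(value)  # 3: MAX picks the branch whose MIN value is higher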


Alpha-Beta Pruning
Alpha-beta pruning is an optimization of the minimax algorithm: it skips (prunes) branches that
cannot affect the final decision. Alpha is the best value found so far along the path for MAX, and
beta is the best value found so far along the path for MIN.
Key points about alpha-beta pruning:
 The Max player will only update the value of alpha.
 The Min player will only update the value of beta.
 While retracing (backing up) the tree, node values, not alpha and beta values, are passed to
parent nodes.
 Alpha and beta values are passed down only to child nodes.

Pseudo-code for Alpha-beta Pruning:

function minimax(node, depth, alpha, beta, maximizingPlayer) is

    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then                // for the Maximizer player
        maxEva = -infinity
        for each child of node do
            eva = minimax(child, depth - 1, alpha, beta, false)
            maxEva = max(maxEva, eva)
            alpha = max(alpha, maxEva)
            if beta <= alpha then
                break                       // prune the remaining children
        return maxEva

    else                                    // for the Minimizer player
        minEva = +infinity
        for each child of node do
            eva = minimax(child, depth - 1, alpha, beta, true)
            minEva = min(minEva, eva)
            beta = min(beta, eva)
            if beta <= alpha then
                break                       // prune the remaining children
        return minEva

https://www.youtube.com/watch?v=dEs_kbvu_0s Alpha Beta pruning
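A runnable Python version of the alpha-beta pseudocode is sketched below, under the same assumption that the caller supplies children, evaluate, and is_terminal (illustrative names). Note how alpha and beta are passed down to the children while only the node value is returned upward, matching the key points listed earlier.

    import math

    def alphabeta(node, depth, alpha, beta, maximizing_player,
                  children, evaluate, is_terminal):
        # Minimax with alpha-beta pruning, mirroring the pseudocode above.
        if depth == 0 or is_terminal(node):
            return evaluate(node)
        if maximizing_player:
            value = -math.inf
            for child in children(node):
                value = max(value, alphabeta(child, depth - 1, alpha, beta, False,
                                             children, evaluate, is_terminal))
                alpha = max(alpha, value)
                if beta <= alpha:       # MIN already has a better option higher up
                    break               # prune the remaining children
            return value
        else:
            value = math.inf
            for child in children(node):
                value = min(value, alphabeta(child, depth - 1, alpha, beta, True,
                                             children, evaluate, is_terminal))
                beta = min(beta, value)
                if beta <= alpha:       # MAX already has a better option higher up
                    break               # prune the remaining children
            return value

    # Called like plain minimax, starting with the full (-inf, +inf) window:
    # alphabeta(root, 3, -math.inf, math.inf, True, children, evaluate, is_terminal)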


How does AlphaGo work?

 AlphaGo is an artificial intelligence (AI) agent that is specialized to play Go, a Chinese
strategy board game, against human competitors.
 AlphaGo is a Google DeepMind project.
 The ability to create a learning algorithm that can beat a human player at strategic games is a
measure of AI development.

 AlphaGo and its successors use a Monte Carlo tree search algorithm to find their moves,
guided by knowledge previously acquired through machine learning, specifically an artificial
neural network (a deep learning method) trained extensively on both human and computer
play.
 AlphaGo Zero combines a convolutional neural network with Monte Carlo tree search, and
it is trained by self-play with reinforcement learning algorithms.

AlphaGo involves both model-free methods (a Convolutional Neural Network (CNN)) and
model-based methods (Monte Carlo Tree Search (MCTS)). In fact, AlphaGo is quite similar to how
we humans think: it combines fast intuition (the evaluation function learned by the CNN) with
careful, slow deliberation (MCTS).

Monte Carlo Tree Search:

 Monte Carlo Tree Search is a search technique in Artificial Intelligence.


 Monte Carlo Tree Search is a heuristic search algorithm for some kinds of decision
processes, most notably those employed in software that plays board games.
 It has been used by artificial intelligence programs such as AlphaGo to play against
the world's top Go players.
 MCTS performs approximate inference (whereas propagation and pruning perform exact
inference), and it is related to machine learning.
 It is a probabilistic and heuristic-driven search algorithm.
 It combines classic tree search implementations with the machine-learning principles of
reinforcement learning.

 Monte Carlo methods have been used for decades to predict outcomes probabilistically.
 In MCTS, nodes are the building blocks of the search tree.
 These nodes are formed based on the outcome of a number of simulations.

The process of Monte Carlo Tree Search can be broken down into four distinct steps (a minimal
code sketch follows the list):
1. Selection:
Starting from the root node R, repeatedly select good child nodes, i.e., those that represent
states leading to a better overall outcome (a win), until a leaf node L is reached.
2. Expansion:
If L is not a terminal node (i.e., it does not end the game), create one or more child
nodes and select one of them (C).
3. Simulation:
Run a simulated (often random) playout from C until a result is achieved.
4. Backpropagation:
Update the statistics of the nodes on the current move sequence with the simulation result.
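The four steps can be put together in a compact Python sketch. The state interface assumed here (legal_moves, play, is_terminal, result) is illustrative, not a standard API, and selection uses the common UCB1 formula. For simplicity the reward is scored from a single player's viewpoint; a full two-player implementation must flip the reward at alternating tree levels.

    import math
    import random

    class Node:
        # One node of the MCTS search tree.
        def __init__(self, state, move=None, parent=None):
            self.state, self.move, self.parent = state, move, parent
            self.children = []
            self.untried = list(state.legal_moves())  # moves not yet expanded
            self.visits, self.wins = 0, 0.0

    def ucb1(parent, child, c=1.4):
        # Exploitation term + exploration term (UCB1 selection score).
        return (child.wins / child.visits
                + c * math.sqrt(math.log(parent.visits) / child.visits))

    def mcts(root_state, iterations=1000):
        root = Node(root_state)
        for _ in range(iterations):
            node = root
            # 1. Selection: descend while fully expanded and non-terminal.
            while not node.untried and node.children:
                node = max(node.children, key=lambda ch: ucb1(node, ch))
            # 2. Expansion: create one child for an untried move.
            if node.untried:
                move = node.untried.pop()
                child = Node(node.state.play(move), move=move, parent=node)
                node.children.append(child)
                node = child
            # 3. Simulation: random playout from the new node to a terminal state.
            state = node.state
            while not state.is_terminal():
                state = state.play(random.choice(list(state.legal_moves())))
            reward = state.result()  # e.g. 1 = win, 0 = loss (single viewpoint)
            # 4. Backpropagation: update visit/win statistics up to the root.
            while node is not None:
                node.visits += 1
                node.wins += reward
                node = node.parent
        # Recommend the most-visited move (a robust final choice).
        return max(root.children, key=lambda ch: ch.visits).move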
Stochastic Games (SG)
Stochastic games include an element of chance, such as dice rolls in backgammon or shuffled cards.
The game tree therefore contains chance nodes in addition to MAX and MIN nodes, and minimax
generalizes to the expectiminimax algorithm, which evaluates a chance node as the expected
(probability-weighted) value of its children.
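As a minimal sketch (the helper functions kind, children, evaluate, and probability are assumptions describing the game tree, not a standard API), expectiminimax can be written as:

    def expectiminimax(node, kind, children, evaluate, probability):
        # kind(node) is assumed to return 'max', 'min', 'chance', or 'terminal'.
        k = kind(node)
        if k == 'terminal':
            return evaluate(node)
        values = [(child, expectiminimax(child, kind, children, evaluate, probability))
                  for child in children(node)]
        if k == 'max':
            return max(v for _, v in values)
        if k == 'min':
            return min(v for _, v in values)
        # Chance node: expected value, weighting each outcome by its probability.
        return sum(probability(node, child) * v for child, v in values)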
What are the drawbacks of the min-max algorithm, and how are they overcome by alpha-beta pruning?
Limitations of the minimax algorithm:
1. The main drawback of the minimax algorithm is that it gets very slow for complex games
such as chess and Go. These games have a huge branching factor, and the player has
many choices to consider.
2. The mini-max algorithm performs a depth-first search over the complete
game tree.
3. The mini-max algorithm proceeds all the way down to the terminal nodes of the tree, then
backtracks up the tree as the recursion unwinds.
4. A further disadvantage of the minimax algorithm is that each board state has to be visited
twice: once to find its children and a second time to evaluate the heuristic value.

These limitations of the minimax algorithm can be mitigated by alpha-beta pruning.

Advantages:
1. Alpha-beta pruning greatly reduces the number of nodes that the minimax
algorithm must examine.
2. It stops assessing a move as soon as one possibility is found that proves the move to be
worse than a previously examined one.
3. This method also improves the search procedure in an effective way.
Disadvantages:

Even with alpha-beta pruning, it is in most cases not feasible to search the
whole game tree.
Constraint Satisfaction Problem (CSP)
https://www.youtube.com/watch?v=vA1Bz8sII1c

Refer to this link for CSP and the node (graph) coloring example.
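Since the notes defer the worked example to the video, here is a minimal backtracking sketch of graph (node) coloring in Python; the example graph and colors at the end are made up for illustration.

    def color_graph(graph, colors):
        # Backtracking search: assign each node a color such that no two
        # adjacent nodes share a color. `graph` maps a node to its neighbors.
        assignment = {}

        def consistent(node, color):
            # Constraint check: no already-colored neighbor has this color.
            return all(assignment.get(nb) != color for nb in graph[node])

        def backtrack():
            if len(assignment) == len(graph):
                return dict(assignment)       # every node colored: solution found
            node = next(n for n in graph if n not in assignment)
            for color in colors:
                if consistent(node, color):
                    assignment[node] = color  # try this value
                    result = backtrack()
                    if result is not None:
                        return result
                    del assignment[node]      # undo the assignment (backtrack)
            return None                       # no color works: fail upward

        return backtrack()

    # Hypothetical 4-node graph (a square with one diagonal) and 3 colors:
    graph = {'A': {'B', 'C'}, 'B': {'A', 'C', 'D'},
             'C': {'A', 'B', 'D'}, 'D': {'B', 'C'}}
    print(color_graph(graph, ['red', 'green', 'blue']))
    # {'A': 'red', 'B': 'green', 'C': 'blue', 'D': 'red'}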


N-Queen Problem:
8-queen problem solution:

https://www.youtube.com/watch?v=pnLFu0yJzN8

Note: Refer to this link for the N-queen CSP (4-queen and 8-queen).
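Alongside the video, the backtracking solution can be sketched in Python. This is one common formulation (not the only one): place one queen per row, and track attacked columns and diagonals in sets so that each constraint check is O(1).

    def n_queens(n):
        # Backtracking search: solution[row] = column of the queen in that row.
        solution, cols, diag1, diag2 = [], set(), set(), set()

        def place(row):
            if row == n:
                return True                  # all n queens placed
            for col in range(n):
                if col in cols or (row - col) in diag1 or (row + col) in diag2:
                    continue                 # square is attacked: skip it
                solution.append(col)
                cols.add(col); diag1.add(row - col); diag2.add(row + col)
                if place(row + 1):
                    return True
                solution.pop()               # undo the placement (backtrack)
                cols.discard(col); diag1.discard(row - col); diag2.discard(row + col)
            return False

        return solution if place(0) else None

    print(n_queens(4))  # [1, 3, 0, 2]: column of the queen in each row
    print(n_queens(8))  # one of the 92 eight-queens solutions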
