AI 3 Unit new savita
Game Theory
- Optimal Decisions in Games
- Minimax algorithm
- Heuristic alpha-beta tree pruning
- Stochastic games
- Monte Carlo tree search
- Limitations of game search algorithms
Constraint Satisfaction Problems (CSP)
- N-Queens problem
- Graph coloring problem
- Backtracking search for CSPs
- Constraint propagation
- Inference in CSPs
An environment with more than one agent (player) is termed a multi-agent environment; in a competitive setting, each agent is an opponent of the others and plays against them.
Real-life example: a soccer match is a multi-agent environment.
Searches in which two or more players with conflicting goals explore the same search space for a solution are called adversarial searches, often known as games.
A game consists of:
Players
Actions
Strategy
Payoff (the reward each player receives when a certain outcome is achieved; it can be either positive or negative)
Game Playing:
Game playing is an important domain of artificial intelligence. Games do not require much knowledge; the only knowledge we need to provide is the rules, the legal moves, and the conditions for winning or losing the game.
Both players try to win the game. So, both of them try to make the best move possible at each turn.
Searching techniques like BFS (Breadth-First Search) are not practical here because the branching factor is very high, so the search would take a lot of time. We therefore need search procedures that improve:
- the generate procedure, so that only good moves are generated;
- the test procedure, so that the best move is explored first.
Initial call:
Minimax(node, 3, true)
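The initial call above suggests a depth-limited minimax with a maximizing-player flag. A minimal sketch, assuming a simple tree representation (the `Node` class here is a hypothetical stand-in, not from the notes):

```python
# Depth-limited minimax matching the call Minimax(node, 3, True).
# Leaves carry a heuristic value; internal nodes carry children.
class Node:
    def __init__(self, value=None, children=None):
        self.value = value              # heuristic value (used at leaves)
        self.children = children or []

def minimax(node, depth, maximizing_player):
    # Stop at the depth limit or at a node with no successors.
    if depth == 0 or not node.children:
        return node.value
    if maximizing_player:
        return max(minimax(c, depth - 1, False) for c in node.children)
    else:
        return min(minimax(c, depth - 1, True) for c in node.children)

# Initial call on a tiny hand-built tree:
leaves = [Node(3), Node(5), Node(2), Node(9)]
tree = Node(children=[Node(children=leaves[:2]), Node(children=leaves[2:])])
print(minimax(tree, 3, True))   # MAX of (MIN of 3,5) and (MIN of 2,9) -> 3
```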
AlphaGo is an artificial intelligence (AI) agent that is specialized to play Go, a Chinese
strategy board game, against human competitors.
AlphaGo is a Google DeepMind project.
The ability to create a learning algorithm that can beat a human player at strategic games is a
measure of AI development.
AlphaGo and its successors use a Monte Carlo tree search algorithm to choose their moves, guided by knowledge previously acquired through machine learning, specifically by an artificial neural network (a deep learning method) trained extensively on both human and computer play.
AlphaGo Zero is built from a convolutional neural network and a Monte Carlo tree search.
It is trained in self-play with Reinforcement Learning algorithms.
AlphaGo involves both model-free methods (a Convolutional Neural Network (CNN)) and model-based methods (Monte Carlo Tree Search (MCTS)). In fact, AlphaGo works rather like human thinking: it combines fast intuition (the CNN's evaluation function) with careful, slow deliberation (MCTS).
Monte Carlo methods have been used for decades to predict outcomes probabilistically.
In MCTS, nodes are the building blocks of the search tree.
These nodes are formed based on the outcome of a number of simulations.
The process of Monte Carlo Tree Search can be broken down into four distinct steps, viz.,
1. Selection:
Selecting good child nodes, starting from the root node R, that represent states leading to
better overall outcome (win).
2. Expansion:
If L (the selected node) is not a terminal node (i.e., it does not end the game), create one or more child nodes and select one of them (C).
3. Simulation:
Run a simulated playout from C until a result is achieved.
4. Backpropagation:
Update the current move sequence with the simulation result.
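The four steps above can be sketched on a toy game. Everything in this example is an illustrative assumption, not from the notes: the game is "race to 10" (players alternately add 1 or 2 to a running total; whoever reaches 10 wins), and selection uses the common UCB1 rule.

```python
import math, random

WIN_TOTAL = 10

def legal_moves(total):
    return [m for m in (1, 2) if total + m <= WIN_TOTAL]

class MCTSNode:
    def __init__(self, total, player, parent=None, move=None):
        self.total, self.player = total, player   # player = who moves next
        self.parent, self.move = parent, move
        self.children, self.wins, self.visits = [], 0, 0
        self.untried = legal_moves(total)         # moves not yet expanded

    def ucb1(self, c=1.4):
        # Exploitation (win rate) plus exploration bonus.
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(root, iterations=2000):
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB1 while fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=MCTSNode.ucb1)
        # 2. Expansion: add one child for an untried move.
        if node.untried:
            m = node.untried.pop()
            child = MCTSNode(node.total + m, 1 - node.player, node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        total, player = node.total, node.player
        while total < WIN_TOTAL:
            total += random.choice(legal_moves(total))
            player = 1 - player
        winner = 1 - player   # the player who just moved reached 10
        # 4. Backpropagation: update statistics up to the root.
        while node:
            node.visits += 1
            # a node is good for the player who moved INTO it
            if winner == 1 - node.player:
                node.wins += 1
            node = node.parent
    # Return the most-visited first move.
    return max(root.children, key=lambda c: c.visits).move

print("best first move:", mcts(MCTSNode(total=0, player=0)))
```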
Stochastic Games (SG)
Stochastic games involve an element of chance, such as dice rolls in backgammon. Their game trees therefore contain chance nodes in addition to MAX and MIN nodes; the value of a chance node is the expected value of its children, weighted by the probability of each outcome.
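The role of chance in a stochastic game tree can be sketched with a tiny expectiminimax-style evaluator. This is an assumed illustration (the dictionary-based tree layout is invented for the example): chance nodes average their children's values weighted by probability, between the usual MAX and MIN layers.

```python
# Evaluate a game tree containing MAX, MIN, and chance nodes.
def expectiminimax(node):
    kind = node["kind"]
    if kind == "leaf":
        return node["value"]
    values = [expectiminimax(c) for c in node["children"]]
    if kind == "max":
        return max(values)
    if kind == "min":
        return min(values)
    # Chance node: expected value over outcomes.
    return sum(p * v for p, v in zip(node["probs"], values))

# MAX chooses between two chance events (e.g., two possible dice gambles):
tree = {"kind": "max", "children": [
    {"kind": "chance", "probs": [0.5, 0.5], "children": [
        {"kind": "leaf", "value": 4}, {"kind": "leaf", "value": 0}]},
    {"kind": "chance", "probs": [0.9, 0.1], "children": [
        {"kind": "leaf", "value": 1}, {"kind": "leaf", "value": 10}]},
]}
print(expectiminimax(tree))   # max(0.5*4 + 0.5*0, 0.9*1 + 0.1*10) -> 2.0
```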
What are the drawbacks of the minimax algorithm, and how are they overcome using alpha-beta pruning?
Limitations of the minimax algorithm:
1. The main drawback of the minimax algorithm is that it becomes very slow for complex games such as chess and Go. These games have a huge branching factor, giving the player many choices to evaluate.
2. The minimax algorithm performs a depth-first search (DFS) to explore the complete game tree.
3. It proceeds all the way down to the terminal nodes of the tree and then backtracks as the recursion unwinds.
4. Each board state has to be visited twice: once to generate its children and a second time to evaluate its heuristic value.
These limitations of the minimax algorithm can be mitigated by alpha-beta pruning.
Advantages:
1. Alpha-beta pruning greatly reduces the number of nodes examined by the minimax algorithm.
2. It stops evaluating a move as soon as one option proves the move worse than a previously examined alternative.
3. This makes the search procedure considerably more efficient.
Disadvantages:
Even with alpha-beta pruning, it is usually not feasible to search the whole game tree for complex games; pruning reduces, but does not eliminate, the exponential growth of the search.
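The pruning idea can be sketched as minimax with two extra bounds, alpha (best value MAX can guarantee so far) and beta (best value MIN can guarantee so far). A minimal sketch; the `Node` layout (leaf value plus children list) is an assumption for illustration.

```python
import math

class Node:
    def __init__(self, value=None, children=None):
        self.value = value
        self.children = children or []

def alphabeta(node, depth, alpha, beta, maximizing):
    if depth == 0 or not node.children:
        return node.value
    if maximizing:
        best = -math.inf
        for c in node.children:
            best = max(best, alphabeta(c, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:       # MIN already has a better option: prune
                break
        return best
    best = math.inf
    for c in node.children:
        best = min(best, alphabeta(c, depth - 1, alpha, beta, True))
        beta = min(beta, best)
        if beta <= alpha:           # MAX already has a better option: prune
            break
    return best

# A depth-3 tree (MAX at the root, then MIN, then MAX over the leaves):
leaves = [Node(v) for v in (3, 5, 6, 9, 1, 2, 0, -1)]
tree = Node(children=[
    Node(children=[Node(children=leaves[0:2]), Node(children=leaves[2:4])]),
    Node(children=[Node(children=leaves[4:6]), Node(children=leaves[6:8])]),
])
print(alphabeta(tree, 3, -math.inf, math.inf, True))   # -> 5
```

On this tree, pruning skips the leaf with value 9 and the entire second MAX subtree on the right, yet returns the same value plain minimax would.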
Constraint Satisfaction Problems (CSP)
https://www.youtube.com/watch?v=vA1Bz8sII1c
https://www.youtube.com/watch?v=pnLFu0yJzN8
Note: refer to the links above for the N-Queens CSP (4-queens and 8-queens).
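A small backtracking sketch for the N-Queens CSP (an illustrative implementation, not taken from the linked videos): place one queen per row, checking the column and diagonal constraints before recursing, and undoing the placement on backtrack.

```python
# Count all solutions to the N-Queens problem by backtracking search.
def solve_n_queens(n):
    solutions = []

    def safe(placed, row, col):
        # placed[r] is the column of the queen in row r.
        for r, c in enumerate(placed):
            if c == col or abs(c - col) == abs(r - row):
                return False
        return True

    def backtrack(placed):
        row = len(placed)
        if row == n:
            solutions.append(placed[:])   # record a complete assignment
            return
        for col in range(n):
            if safe(placed, row, col):
                placed.append(col)        # assign
                backtrack(placed)         # recurse
                placed.pop()              # undo (backtrack)

    backtrack([])
    return solutions

print(len(solve_n_queens(4)))   # -> 2
print(len(solve_n_queens(8)))   # -> 92
```

The 4-queens instance has 2 solutions and the 8-queens instance has 92, which makes a handy sanity check for the constraint tests.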