New CZ3005 Module 2 - Intelligent Agents and Search
Artificial Intelligence
https://personal.ntu.edu.sg/hanwangzhang/
Email: hanwangzhang@ntu.edu.sg
Office: N4-02c-87
Lesson Outline
[Diagram: the agent receives percepts from the environment through sensors and acts on the environment through effectors]
Rational Agents
• A rational agent is one that does the right thing
• Rational action: the action that maximises the expected value of an objective performance measure, given the percept sequence to date
• Rationality depends on:
  • the performance measure
  • everything that the agent has perceived so far
  • built-in knowledge about the environment
  • the actions that can be performed
Example: Google X2: Driverless Taxi
• Percepts: video, speed, acceleration, engine status, GPS, radar, …
• Actions: steer, accelerate, brake, horn, display, …
• Goals (Measure): safety, reach destination, maximise profits, obey laws, passenger comfort, …
• Environment: Singapore urban streets, highways, traffic, pedestrians, weather, customers, …
Image source: https://en.wikipedia.org/wiki/Waymo#/media/File:Waymo_Chrysler_Pacifica_in_Los_Altos,_2017.jpg
More Examples
• Medical diagnosis system: Percepts = symptoms, findings, patient's answers; Actions = questions, tests, treatments; Goals = healthy patient, minimize costs; Environment = patient, hospital
• Satellite image analysis system: Percepts = pixels of varying intensity, color; Actions = print a categorization of the scene; Goals = correct categorization; Environment = images from orbiting satellite
• Part-picking robot: Percepts = pixels of varying intensity; Actions = pick up parts and sort into bins; Goals = place parts in correct bins; Environment = conveyor belt with parts
• Refinery controller: Percepts = temperature, pressure readings; Actions = open, close valves; adjust temperature; Goals = maximize purity, yield, safety; Environment = refinery
• Interactive English tutor: Percepts = typed words; Actions = print exercises, suggestions, corrections; Goals = maximize student's score on test; Environment = set of students
Types of Environment
Accessible (vs inaccessible): the agent's sensory apparatus gives it access to the complete state of the environment
Deterministic (vs nondeterministic): the next state of the environment is completely determined by the current state and the actions selected by the agent
Episodic (vs sequential): each episode is not affected by previously taken actions
Static (vs dynamic): the environment does not change while the agent is deliberating
Discrete (vs continuous): a limited number of distinct percepts and actions
Example: Driverless Taxi
[Map of Romania with road connections between cities: Arad, Zerind, Oradea, Sibiu, Fagaras, Iasi, Vaslui, Urziceni, Mehadia, Bucharest, Eforie, Dobreta, Craiova, Giurgiu, …]
Example: Vacuum Cleaner Agent
[Figure: the eight vacuum-world states, numbered 1–8]
Single-state problem (accessible world state, deterministic actions):
– e.g.: start in #5
– Solution: right, suck
Multiple-State Problem
– Inaccessible world state (with limited sensory information): the agent only knows which set of states it is in
– Known outcome of action (deterministic)
[Figure: the eight vacuum-world states, numbered 1–8]
• Solution:
a path (connecting sets of states) that leads to a set of states all of which are
goal states
Example: Vacuum World (Multiple-state
Version)
States: subset of the eight states
Operators: left, right, suck
Goal test: all states in state set have no dirt
Path cost: 1 per operator
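The multiple-state formulation above can be sketched in Python. This is a minimal sketch under assumptions: a two-square world (squares A and B, giving the eight states), and helper names (`all_states`, `belief_result`, `is_goal`) invented here for illustration, not taken from the slides.

```python
from itertools import combinations

LOCS = ("A", "B")

def all_states():
    """Enumerate the eight vacuum-world states: (agent location, set of dirty squares)."""
    states = []
    for loc in LOCS:
        for r in range(len(LOCS) + 1):
            for dirty in combinations(LOCS, r):
                states.append((loc, frozenset(dirty)))
    return states

def result(state, action):
    """Deterministic outcome of one operator applied to one state."""
    loc, dirty = state
    if action == "left":
        return ("A", dirty)
    if action == "right":
        return ("B", dirty)
    if action == "suck":
        return (loc, dirty - {loc})
    raise ValueError(action)

def belief_result(belief, action):
    """Multiple-state problem: apply the operator to every state the agent might be in."""
    return frozenset(result(s, action) for s in belief)

def is_goal(belief):
    """Goal test: all states in the state set have no dirt."""
    return all(not dirty for _, dirty in belief)

# With no sensors the initial belief is all eight states; the action
# sequence right, suck, left, suck cleans the world from any start.
belief = frozenset(all_states())
for a in ("right", "suck", "left", "suck"):
    belief = belief_result(belief, a)
print(is_goal(belief))  # the belief set collapses to goal states only
```

Note how the path connects sets of states, not individual states: each operator maps the whole belief set to a new belief set, and the goal test must hold for every member.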
Example: 8-puzzle
• States: integer locations of the tiles
  • number of states = 9!
• Actions: move blank left, right, up, down
• Goal test: current state = the given goal state
• Path cost: 1 per move
[Figure: a start state and the goal state of the 8-puzzle]
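The 8-puzzle formulation can be sketched as a successor function in Python (a minimal sketch; the flat 9-tuple encoding with 0 for the blank and the `successors` name are assumptions for illustration):

```python
def successors(state):
    """8-puzzle successors: `state` is a 9-tuple in row-major order, 0 = blank.
    Yields (action, next_state) for each legal blank move; path cost is 1 per move."""
    i = state.index(0)
    r, c = divmod(i, 3)
    moves = {"left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0)}
    for action, (dr, dc) in moves.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:      # stay on the 3x3 board
            j = nr * 3 + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]          # slide the neighbouring tile into the blank
            yield action, tuple(s)

# Blank in the centre: all four moves are legal.
print(sorted(a for a, _ in successors((1, 2, 3, 4, 0, 5, 6, 7, 8))))
```

With the blank in a corner only two moves are legal, so the branching factor varies between 2 and 4.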
Real-World Problems
Route finding problems:
• Routing in computer networks
• Robot navigation
• Automated travel advisory
• Airline travel planning
Touring problems:
• Traveling Salesperson problem
• “Shortest tour”: visit every city exactly once
Question to Think About
AlphaGo
https://forms.gle/dmFLBxgx7kBktFPdA
Search Algorithms
• Exploration of the state space by generating successors of already-explored states
• Frontier: candidate nodes for expansion
• Explored set: nodes already expanded
[Figure: partial search tree on the Romania map rooted at Arad, expanding Zerind, Sibiu, Timisoara, then Oradea, Fagaras, Rimnicu Vilcea, then Bucharest]
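The frontier/explored-set scheme can be sketched in Python. A minimal sketch: a FIFO frontier (so this particular instance expands in breadth-first order), with a `parents` map doubling as the explored/seen set; the adjacency fragment of the Romania map below is an assumption for illustration.

```python
from collections import deque

def graph_search(start, goal, neighbors):
    """Generic graph search: pop a node from the frontier, test it, and
    generate successors that have not been seen before."""
    frontier = deque([start])      # candidate nodes for expansion
    parents = {start: None}        # records how each state was reached (explored/seen set)
    while frontier:
        node = frontier.popleft()
        if node == goal:           # reconstruct the path via the parent links
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in neighbors.get(node, ()):
            if nxt not in parents:           # do not re-generate explored states
                parents[nxt] = node
                frontier.append(nxt)
    return None

# Assumed fragment of the Romania road map (adjacency only, no costs):
romania = {
    "Arad": ["Zerind", "Sibiu", "Timisoara"],
    "Sibiu": ["Arad", "Oradea", "Fagaras", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
    "Rimnicu Vilcea": ["Sibiu", "Pitesti"],
    "Pitesti": ["Rimnicu Vilcea", "Bucharest"],
}
print(graph_search("Arad", "Bucharest", romania))  # fewest-hop route
```

Swapping the `deque` for a stack or a priority queue turns the same skeleton into depth-first or uniform-cost search, which is exactly what the strategies below vary.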
Search Strategies
• A strategy is defined by picking the order of node expansion.
• Strategies are evaluated along the following dimensions: completeness, time complexity, space complexity, and optimality
[Figure: step-by-step node expansion on a small tree (A; B, C; D, E, F, G) and on a weighted graph from start S to goal G]
Here we do not expand nodes that have already been expanded.
Uniform-Cost Search
Complete Yes
Time # of nodes with path cost g ≤ cost of the optimal solution (equivalently, # of nodes popped from the priority queue)
Space # of nodes with path cost g ≤ cost of the optimal solution
Optimal Yes
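Uniform-cost search can be sketched with a priority queue keyed by the path cost g. A minimal sketch: the step costs below are the Romania road distances used in the route-finding example (Arad–Sibiu 140, Sibiu–Rimnicu Vilcea 80, etc.); the function name and dict layout are assumptions.

```python
import heapq

def uniform_cost_search(start, goal, edges):
    """Expand the frontier node with the lowest path cost g.
    `edges[u]` is a list of (successor, step_cost) pairs."""
    frontier = [(0, start, [start])]    # priority queue ordered by g
    expanded = {}                       # nodes already popped, with their best g
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node in expanded:            # a cheaper copy was popped earlier
            continue
        expanded[node] = g
        if node == goal:                # goal test on pop => optimal cost
            return g, path
        for nxt, cost in edges.get(node, ()):
            if nxt not in expanded:
                heapq.heappush(frontier, (g + cost, nxt, path + [nxt]))
    return None

# Step costs from the Romania route-finding example:
edges = {
    "Arad": [("Sibiu", 140), ("Timisoara", 118), ("Zerind", 75)],
    "Sibiu": [("Fagaras", 99), ("Rimnicu Vilcea", 80)],
    "Fagaras": [("Bucharest", 211)],
    "Rimnicu Vilcea": [("Pitesti", 97)],
    "Pitesti": [("Bucharest", 101)],
}
print(uniform_cost_search("Arad", "Bucharest", edges))
# -> (418, ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'])
```

Testing the goal only when a node is popped (not when it is generated) is what makes the returned cost optimal.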
Depth-First Search
Expand the deepest unexpanded node; this can be implemented with a Last-In-First-Out (LIFO) stack. Backtrack only when no more expansion is possible.
[Figure: twelve snapshots of DFS on a binary tree, always descending to the deepest unexpanded node before backtracking]
Depth-First Search
Denote
• b: branching factor
• m: maximum depth of the state space
Time O(b^m)
Space O(bm)
Optimal No
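The LIFO-stack implementation can be sketched as follows (a minimal sketch; it assumes a finite tree-structured space, so no explored set is kept, and the `tree` example is made up):

```python
def depth_first_search(start, goal, neighbors):
    """DFS with an explicit LIFO stack: always expand the deepest unexpanded
    node, backtracking only when no expansion remains. Not optimal, and not
    complete on infinite-depth spaces; assumed finite tree here."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()                 # last in, first out => deepest first
        if node == goal:
            return path
        # Push children in reverse so the leftmost child is expanded first.
        for nxt in reversed(neighbors.get(node, [])):
            stack.append((nxt, path + [nxt]))
    return None

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
print(depth_first_search("A", "G", tree))
```

The space bound O(bm) shows up directly: the stack holds at most one partially-expanded branch of depth m, with b siblings per level.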
Iterative Deepening Search
Iteratively run depth-limited search (DLS), increasing the depth limit l one at a time: Limit = 0, 1, 2, 3, …
[Figure: the trees explored by DLS at limits 0 through 3]
Iterative Deepening Search...
function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution sequence
  inputs: problem, a problem
  for depth ← 0 to ∞ do
    if DEPTH-LIMITED-SEARCH(problem, depth) succeeds then return its result
  end
  return failure
Complete Yes
Time O(b^d), where d is the depth of the shallowest solution
Space O(bd)
Optimal Yes
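The pseudocode above translates to Python roughly as follows. A minimal sketch: the recursive DLS helper and the `max_depth` cap (added so the loop terminates on unsolvable inputs) are assumptions, not part of the slide's pseudocode.

```python
def depth_limited_search(node, goal, neighbors, limit, path=None):
    """DLS: depth-first search that cuts off below the depth limit."""
    path = path or [node]
    if node == goal:
        return path
    if limit == 0:                    # cutoff reached, backtrack
        return None
    for nxt in neighbors.get(node, []):
        found = depth_limited_search(nxt, goal, neighbors, limit - 1, path + [nxt])
        if found:
            return found
    return None

def iterative_deepening_search(start, goal, neighbors, max_depth=20):
    """IDS: run DLS with limit 0, 1, 2, ... until it succeeds."""
    for depth in range(max_depth + 1):
        result = depth_limited_search(start, goal, neighbors, depth)
        if result:
            return result
    return None   # failure

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
print(iterative_deepening_search("A", "F", tree))
```

Re-exploring the shallow levels at every iteration looks wasteful, but since each level has b times more nodes than the one above, the repeated work only changes the constant factor, keeping time at O(b^d) with DFS-like O(bd) space.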
Summary (we make assumptions for optimality)
Criterion: Breadth-First | Uniform-Cost | Depth-First | Depth-Limited | Iterative Deepening | Bidirectional (if applicable)
Time: O(b^d) | O(b^d) | O(b^m) | O(b^l) | O(b^d) | O(b^(d/2))
Space: O(b^d) | O(b^d) | O(bm) | O(bl) | O(bd) | O(b^(d/2))
Optimal: Yes | Yes | No | No | Yes | Yes
Complete: Yes | Yes | No | Yes, if l ≥ d | Yes | Yes
(b: branching factor; d: depth of the shallowest solution; l: depth limit; m: maximum depth)
General Search
Uninformed search strategies:
• Systematic generation of new states (goal test)
• Inefficient (exponential space and time complexity)
Informed search strategies:
• Use problem-specific knowledge to decide the order of node expansion
• Best-First Search: expand the most desirable unexpanded node
• Use an evaluation function to estimate the “desirability” of each node
Evaluation function
• Path-cost function g(n)
  • Cost from the initial state to the current state n (search node)
  • Information on the cost toward the goal is not required
• “Heuristic” function h(n)
  • Estimated cost of the cheapest path from n to a goal state
  • The exact cost cannot be determined
  • h(n) depends only on the state at that node (may not include history)
  • h(n) is not larger than the real cost (admissible)
Greedy Search
Expands the node that appears to be closest to the goal
• Evaluation function h(n): estimate of the cost from n to the goal
• Function GREEDY-SEARCH(problem) returns solution
  • Return BEST-FIRST-SEARCH(problem, h)  // f(n) = h(n)
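Greedy best-first search can be sketched by ordering the frontier on h(n) alone. A minimal sketch: the h values are the straight-line distances to Bucharest from the slides' table, while the adjacency fragment and function name are assumptions.

```python
import heapq

# h(n): straight-line distance to Bucharest (from the slides' table)
H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 98, "Bucharest": 0}

def greedy_search(start, goal, neighbors, h):
    """Always expand the frontier node with the smallest h(n), i.e. f(n) = h(n)."""
    frontier = [(h[start], start, [start])]
    explored = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in explored:
            continue
        explored.add(node)
        for nxt in neighbors.get(node, []):
            if nxt not in explored:
                heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None

romania = {
    "Arad": ["Zerind", "Sibiu", "Timisoara"],
    "Sibiu": ["Arad", "Oradea", "Fagaras", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
    "Rimnicu Vilcea": ["Sibiu", "Pitesti"],
    "Pitesti": ["Rimnicu Vilcea", "Bucharest"],
}
print(greedy_search("Arad", "Bucharest", romania, H))
# -> ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']
```

Note that this route is found quickly but is not the cheapest one, which is exactly the non-optimality the example below illustrates.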
a) The initial state: Arad 366
Straight-line distance to Bucharest: Arad 366, Bucharest 0, Craiova 160, Dobreta 242, Eforie 161, Fagaras 176, Giurgiu 77, Hirsova 151, Iasi 226, Lugoj 244, Mehadia 241, Neamt 234, Oradea 380, Pitesti 98, Rimnicu Vilcea 193, Sibiu 253, Timisoara 329, Urziceni 80, Vaslui 199, Zerind 374
Example
b) After expanding Arad: the frontier contains Sibiu (253), Timisoara (329), Zerind (374)
Example
c) After expanding Sibiu, then Fagaras: the path Arad → Sibiu → Fagaras → Bucharest is returned.
Is this route optimal? Answer: No
Greedy Search...
• m: maximum depth of the search space
Complete No
Time O(b^m)
Space O(b^m)
Optimal No
A* Search
• Uniform-cost search
• g(n): cost to reach n (Past Experience)
• optimal and complete, but can be very
inefficient
• Greedy search
• h(n): cost from n to goal (Future Prediction)
• neither optimal nor complete, but cuts search
space considerably
A* Search
Idea: combine Greedy search with Uniform-Cost search
Evaluation function: f(n) = g(n) + h(n)
(a) The initial state: Arad 366 = 0 + 366
(b) After expanding Arad: Sibiu 393 = 140 + 253; Timisoara 447 = 118 + 329; Zerind 449 = 75 + 374
(c) After expanding Sibiu: Arad 646 = 280 + 366; Fagaras 415 = 239 + 176; Oradea 671 = 291 + 380; Rimnicu Vilcea 413 = 220 + 193; Timisoara 447 = 118 + 329; Zerind 449 = 75 + 374
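A* can be sketched as uniform-cost search with the frontier ordered by f(n) = g(n) + h(n). A minimal sketch: the step costs and SLD values below reproduce the example's numbers (Sibiu 393 = 140 + 253, Rimnicu Vilcea 413 = 220 + 193, etc.); the function name and data layout are assumptions.

```python
import heapq

def a_star(start, goal, edges, h):
    """A*: expand the frontier node with the lowest f(n) = g(n) + h(n).
    With an admissible h, the first goal popped is optimal."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node in best_g and best_g[node] <= g:
            continue                              # a cheaper copy was expanded already
        best_g[node] = g
        if node == goal:
            return g, path
        for nxt, cost in edges.get(node, ()):
            heapq.heappush(frontier, (g + cost + h[nxt], g + cost, nxt, path + [nxt]))
    return None

H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Fagaras": 176, "Rimnicu Vilcea": 193, "Pitesti": 98, "Bucharest": 0}
edges = {
    "Arad": [("Sibiu", 140), ("Timisoara", 118), ("Zerind", 75)],
    "Sibiu": [("Fagaras", 99), ("Rimnicu Vilcea", 80)],
    "Fagaras": [("Bucharest", 211)],
    "Rimnicu Vilcea": [("Pitesti", 97)],
    "Pitesti": [("Bucharest", 101)],
}
print(a_star("Arad", "Bucharest", edges, H))
# -> (418, ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'])
```

Unlike greedy search, A* postpones Bucharest-via-Fagaras (f = 450) in favour of the Rimnicu Vilcea branch, and returns the optimal cost 418.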
Example: Route-finding from Arad to Bucharest
Best-first search with evaluation function f(n) = g(n) + h(n)
(d) After expanding Rimnicu Vilcea: Craiova 526 = 366 + 160; Pitesti 415 = 317 + 98; Sibiu 553 = 300 + 253
(e) After expanding Fagaras: Sibiu 591 = 338 + 253; Bucharest 450 = 450 + 0
Example: Route-finding from Arad to Bucharest
Best-first search with evaluation function f(n) = g(n) + h(n)
(f) After expanding Pitesti: Bucharest 418 = 418 + 0. A* returns the optimal route Arad → Sibiu → Rimnicu Vilcea → Pitesti → Bucharest with cost 418.
Why is Go hard? #Moves ≈ 250^150 (the #stone positions available per turn, raised to the #turns)
Another Example: Monte Carlo Tree Search (MCTS)
MCTS key steps: selection, expansion, simulation, backpropagation
Each node keeps statistics #Wins/#Games
MCTS for Go is very expensive & slow
Solution:
1. For selection/expansion, no longer raw statistics (#Wins/#Games) but f = g + h
2. For simulation, no longer real self-play but f = g + h
MCTS: update f = g(·) + h(·)
[Figure: tree of f-values; no real play, just an estimation]
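The selection step of MCTS is commonly driven by the UCB1 formula, which balances the #Wins/#Games statistics against an exploration bonus. A small sketch: UCB1 itself is standard, but the child statistics and the exploration constant below are made-up examples, not values from the slides.

```python
import math

def ucb1(wins, games, parent_games, c=1.4):
    """UCB1 score for MCTS selection: exploitation (win rate) plus an
    exploration bonus that grows for under-visited children."""
    if games == 0:
        return float("inf")   # unvisited children are tried first
    return wins / games + c * math.sqrt(math.log(parent_games) / games)

# Hypothetical #Wins/#Games statistics for three children of one node:
stats = {"a": (50, 60), "b": (1, 10), "c": (2, 11)}
parent_games = sum(g for _, g in stats.values())
best = max(stats, key=lambda k: ucb1(*stats[k], parent_games=parent_games))
print(best)   # the high-win-rate, well-visited child is selected
```

Replacing these raw statistics (and the random self-play rollouts) with a learned evaluation f = g + h is the speed-up described above.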