CS 416 Artificial Intelligence: Agents

CS 416
Artificial Intelligence
Lecture 22
Agents
Chess Article
Deep Blue (IBM)
• 418 processors, 200 million positions per second
Deep Junior (Israeli Co.)
• 8 processors, 3 million positions per second
Kasparov
• 100 billion neurons in brain, 2 moves per second
But there are 85 billion ways to play the first four moves
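A quick back-of-the-envelope calculation makes the comparison concrete. The rates and the 85 billion figure come from the slide; treating the problem as "look at every line once" is a simplifying assumption made only for illustration, not a description of how any of these players actually search.

```python
# Figures from the slide; the brute-force framing is an assumption.
FIRST_FOUR_MOVES = 85e9  # ways to play the first four moves

RATES = {  # positions examined per second
    "Deep Blue": 200e6,
    "Deep Junior": 3e6,
    "Kasparov": 2,
}


def seconds_to_enumerate(rate: float, positions: float = FIRST_FOUR_MOVES) -> float:
    """Time needed to look at every position once at a fixed rate."""
    return positions / rate


for name, rate in RATES.items():
    secs = seconds_to_enumerate(rate)
    print(f"{name}: {secs:,.0f} s (about {secs / 86_400:,.2f} days)")
```

Even the fastest machine needs minutes for four moves, and a human at 2 positions per second would need centuries, which is why humans cannot be playing by enumeration.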
Chess Article
1997 - Kasparov lost to Deep Blue
2002 - Kramnik (current World Champion) tied Deep Fritz
2003 - Kasparov (current number 1) plays Deep Junior, Jan 26 – Feb 7
Chess Article
Cognitive psychologists report chess is a game of pattern matching for humans
• But what patterns do we see?
• What rules do we use to evaluate perceived patterns?
What is an agent?
Perception
• Sensors receive input from environment
  – Keyboard clicks
  – Camera data
  – Bump sensor
Action
• Actuators impact the environment
  – Move a robotic arm
  – Generate output for computer display
Perception
Percept
• Perceptual inputs at an instant
• May include perception of internal state
Percept Sequence
• Complete history of all prior percepts
Do you need a percept sequence to play chess?
An agent as a function
Agent maps percept sequence to action
• Agent: f(ps) → a; ps ∈ p*
  – Set of all inputs known as state space
Agent Function
• If inputs are finite, a table can store mapping
• Scalable?
• Reverse Engineering?
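The idea of storing the agent function in a table can be sketched as follows. This is a minimal illustration, assuming a hypothetical two-square vacuum world with percepts of the form (location, status); the table contents, the class name TableDrivenAgent, and the helper table_size are all made up for the example.

```python
from typing import Dict, List, Tuple

Percept = Tuple[str, str]          # hypothetical: (location, status)
PerceptSeq = Tuple[Percept, ...]

# Hypothetical table keyed by the full percept sequence seen so far.
# Only a few rows are filled in.
TABLE: Dict[PerceptSeq, str] = {
    (("Left", "Dirty"),): "Suck",
    (("Left", "Clean"),): "Right",
    (("Right", "Dirty"),): "Suck",
    (("Right", "Clean"),): "Left",
    (("Left", "Dirty"), ("Left", "Clean")): "Right",
}


class TableDrivenAgent:
    """Looks up f(ps) -> a, where ps is the whole percept history."""

    def __init__(self, table: Dict[PerceptSeq, str]) -> None:
        self.table = table
        self.percepts: List[Percept] = []

    def act(self, percept: Percept) -> str:
        self.percepts.append(percept)
        # Fall back to doing nothing when the table has no matching row.
        return self.table.get(tuple(self.percepts), "NoOp")


def table_size(num_percepts: int, horizon: int) -> int:
    """Number of distinct percept sequences of length 1..horizon."""
    return sum(num_percepts ** t for t in range(1, horizon + 1))


agent = TableDrivenAgent(TABLE)
print(agent.act(("Left", "Dirty")), agent.act(("Left", "Clean")))
print(table_size(num_percepts=4, horizon=10))
```

The table_size helper speaks to the "Scalable?" question: the number of rows grows exponentially with the length of the percept sequence, even in a world with only four possible percepts.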
Evaluating agent programs
We agree on what an agent must do
Can we evaluate its quality?
Performance Metrics
• Very Important
• Frequently the hardest part of the research problem
• Design these to suit what you really want to happen
Rational Agent
For each percept sequence, a rational agent should select an action that maximizes its performance measure
Example: autonomous vacuum cleaner
• What is the performance measure?
• Penalty for eating the cat? How much?
• Penalty for missing a spot?
• Reward for speed?
• Reward for conserving power?
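One way to make the questions above concrete is a weighted sum over the things we decide to reward or penalize. The sketch below is only an illustration: the Episode fields and every weight are invented, and choosing those numbers is exactly the design problem the slide is pointing at.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical summary of one cleaning run."""
    squares_cleaned: int
    squares_missed: int
    cats_eaten: int
    seconds: float
    watt_hours: float


# Made-up weights: positive values reward, negative values penalize.
WEIGHTS = {
    "squares_cleaned": 10.0,
    "squares_missed": -5.0,
    "cats_eaten": -1000.0,
    "seconds": -0.1,
    "watt_hours": -0.5,
}


def performance(ep: Episode, weights=WEIGHTS) -> float:
    """Weighted sum of every quantity we chose to score."""
    return sum(w * getattr(ep, field) for field, w in weights.items())


careful = Episode(20, 0, 0, seconds=600, watt_hours=30)
reckless = Episode(20, 0, 1, seconds=300, watt_hours=20)
print(performance(careful), performance(reckless))
```

With these weights the slower, cat-friendly run scores higher. Drop the cat penalty to zero and the ranking flips, so the measure really does determine which behavior counts as rational.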
Learning and Autonomy
Learning
• To update the agent function in light of observed performance of percept-sequence-to-action pairs
  – Explore new parts of state space
    ▪ Learn from trial and error
  – Change internal variables that influence action selection
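The bullets above can be sketched with a very small trial-and-error learner. This is not a method taken from the lecture: it is an epsilon-greedy learner with running-average value estimates, simplified to a single state, and the function name, action names, and rewards are all hypothetical.

```python
import random


def learn_action_values(reward_fn, actions, trials=500, epsilon=0.1, seed=0):
    """Estimate the value of each action by trial and error."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in actions}   # internal variables that drive selection
    count = {a: 0 for a in actions}
    for _ in range(trials):
        if rng.random() < epsilon:
            action = rng.choice(actions)            # explore something new
        else:
            action = max(actions, key=value.get)    # exploit current knowledge
        reward = reward_fn(action)
        count[action] += 1
        # Running average: shift the estimate toward the observed reward.
        value[action] += (reward - value[action]) / count[action]
    return value


# Hypothetical environment: "fast" is better, but the agent must discover that.
REWARDS = {"slow": 1.0, "fast": 2.0}
values = learn_action_values(REWARDS.get, ["slow", "fast"])
print(max(values, key=values.get))
```

Without the exploration step the agent would settle on whichever action it tried first and never learn that another one is better.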
Adding intelligence to agent function
At design time
• Some agents are designed with a clear procedure to improve performance over time. This is really the engineer’s intelligence.
  – Camera-based user identification
At run-time
• Agent executes complicated equation to map input to output
Between trials
• With experience, agent changes its program (parameters)
How big is your percept?
Dung Beetle
• Largely feed forward
Sphex Wasp
• Reacts to environment (feedback) but not learning
A Dog
• Reacts to environment and can significantly alter behavior
Qualities of a task environment
Fully Observable
• Agent need not store any aspects of state
  – The Brady Bunch as intelligent agents
  – Volume of observables may be overwhelming
Partially Observable
• Some data is unavailable
  – Maze
  – Noisy sensors
Deterministic
• Always the same outcome for state/action pair
Stochastic
• Not always predictable – random
Partially Observable vs. Stochastic
• My cats think the world is stochastic
• Physicists think the world is deterministic
Markovian
• Future state only depends on current state
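The difference between deterministic and stochastic environments shows up in the transition function. The sketch below uses a made-up one-dimensional world (an integer position and an integer move) and a made-up slip probability. Both functions are Markovian, because the next state is computed from the current state and action alone.

```python
import random


def deterministic_step(state: int, action: int) -> int:
    """The same (state, action) pair always gives the same next state."""
    return state + action


def stochastic_step(state: int, action: int, rng: random.Random,
                    slip: float = 0.2) -> int:
    """With probability `slip` the action fails and the state is unchanged."""
    return state if rng.random() < slip else state + action


rng = random.Random(0)
print({deterministic_step(3, 1) for _ in range(1000)})    # one possible outcome
print({stochastic_step(3, 1, rng) for _ in range(1000)})  # more than one
```

An agent that cannot observe the slip would see the same behavior as genuine randomness, which is the point of the "Partially Observable vs. Stochastic" comparison.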
Episodic
• Percept sequence can be segmented into independent temporal categories
  – Behavior at traffic light independent of previous traffic
Sequential
• Current decision could affect all future decisions
Which is easiest to program?
Static
• Environment doesn’t change over time
  – Crossword puzzle
Dynamic
• Environment changes over time
  – Driving a car
Semi-dynamic
• Environment is static, but performance metrics are dynamic
  – Drag racing
Discrete
• Values of a state space feature (dimension) are constrained to distinct values from a finite set
  – Blackjack: f(your cards, exposed cards) = action
Continuous
• Variable has infinite variation
  – Antilock brakes: f(vehicle speed, wheel velocity) = unlock
  – Are computers really continuous?
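The two example functions above can be written out to show the contrast in their input domains. Both bodies are deliberately simplified assumptions: the blackjack rule is not a real strategy chart, and the slip threshold in the brake function is an invented number.

```python
def blackjack_action(hand_total: int, dealer_card: int) -> str:
    """Discrete inputs: a finite set of (hand_total, dealer_card) pairs."""
    if hand_total >= 17:
        return "stand"
    if hand_total <= 11:
        return "hit"
    # Hypothetical rule for the middle range.
    return "stand" if dealer_card <= 6 else "hit"


def abs_unlock(vehicle_speed: float, wheel_speed: float,
               slip_limit: float = 0.2) -> bool:
    """Continuous inputs: release the brake when the wheel slips too much."""
    if vehicle_speed <= 0:
        return False
    slip = (vehicle_speed - wheel_speed) / vehicle_speed
    return slip > slip_limit


# The discrete function can be enumerated completely.
table = {(h, d): blackjack_action(h, d)
         for h in range(4, 22) for d in range(2, 12)}
print(len(table))
print(abs_unlock(30.0, 20.0), abs_unlock(30.0, 29.0))
```

The blackjack function fits in a small table, while the brake function cannot be tabulated without first choosing a resolution. On the last bullet: a floating-point number has only finitely many values, so in practice a computer's "continuous" input is a very fine discrete set.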
Towards a terse description of problem domains
• State space: features, dimensionality, degrees of freedom
• Observable?
• Predictable?
• Dynamic?
• Continuous?
• Performance metric
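The checklist above maps naturally onto a small record type. In the sketch below the static and dynamic labels for the crossword and driving examples come from the slides. The remaining field values and the feature lists are assumptions filled in for illustration, and the class name TaskEnvironment is hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TaskEnvironment:
    name: str
    state_features: Tuple[str, ...]
    observable: str           # "fully" or "partially"
    predictable: str          # "deterministic" or "stochastic"
    dynamics: str             # "static", "dynamic", or "semi-dynamic"
    continuous: bool
    performance_metric: str

    def dimensionality(self) -> int:
        """Number of state space features."""
        return len(self.state_features)


crossword = TaskEnvironment(
    name="crossword puzzle",
    state_features=("grid letters", "clues"),
    observable="fully",
    predictable="deterministic",
    dynamics="static",
    continuous=False,
    performance_metric="squares filled correctly",
)

driving = TaskEnvironment(
    name="driving a car",
    state_features=("position", "speed", "heading", "other traffic"),
    observable="partially",
    predictable="stochastic",
    dynamics="dynamic",
    continuous=True,
    performance_metric="safe, fast, legal trip",
)

for env in (crossword, driving):
    print(env.name, env.dimensionality(), env.dynamics)
```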
Building Agent Programs
The table approach
• Build a table mapping states to actions
  – Chess has 10^150 entries (10^80 atoms in the universe)
  – I’ve said memory is free, but keep it within the confines of the boundable universe
• Still, tables have their place
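The gap between the two numbers on the slide is easier to appreciate as a ratio. The snippet below uses only the slide's figures. The "entries per atom" framing is an illustrative assumption: one table entry stored per atom.

```python
CHESS_TABLE_ENTRIES = 10 ** 150   # figure from the slide
ATOMS_IN_UNIVERSE = 10 ** 80      # figure from the slide

# Python integers are arbitrary precision, so this division is exact.
entries_per_atom = CHESS_TABLE_ENTRIES // ATOMS_IN_UNIVERSE
print(f"entries each atom would have to store: 10^{len(str(entries_per_atom)) - 1}")
```

Even if every atom in the universe stored one entry, the table would still be short by a factor of 10^70.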
Discuss four agent program principles
Simple Reflex Agents
• Sense environment
• Match sensations with rules in database
• Rule prescribes an action
Reflexes can be bad
• Don’t put your hands down when falling backwards!
Inaccurate information
• Misperception can trigger reflex when inappropriate
But rules databases can be made large and complex
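The sense, match, act loop above can be sketched as an ordered list of condition-action rules. The percept format (a dict with location and status keys) and the rules themselves are assumptions borrowed from a two-square vacuum world; nothing here comes from a specific library.

```python
# Hypothetical rule database: (condition on the current percept, action).
# Rules are checked in order and the first match wins.
RULES = [
    (lambda p: p["status"] == "Dirty", "Suck"),
    (lambda p: p["location"] == "Left", "Right"),
    (lambda p: p["location"] == "Right", "Left"),
]


def simple_reflex_agent(percept: dict) -> str:
    """Act on the current percept only; no memory of earlier percepts."""
    for condition, action in RULES:
        if condition(percept):
            return action
    return "NoOp"


print(simple_reflex_agent({"location": "Left", "status": "Dirty"}))
print(simple_reflex_agent({"location": "Left", "status": "Clean"}))

# A noisy sensor that wrongly reports "Dirty" still triggers the reflex.
misperceived = {"location": "Right", "status": "Dirty"}  # square is really clean
print(simple_reflex_agent(misperceived))
```

The last call illustrates the "Inaccurate information" point: the agent has no way to question its percept, so a misperception leads directly to an inappropriate action.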
Simple Reflex Agents
Randomization
• The vacuum cleaner problem
[Figure: vacuum world with squares labeled Left and Right; label "Dirty"]
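A small simulation shows why randomization helps a reflex agent. It assumes a variant of the vacuum world in which the agent senses only whether its current square is dirty and has no location sensor. The world model, the policy names, and the step counts are all made up for the example.

```python
import random


def run(policy, steps: int = 50, seed: int = 0) -> int:
    """Simulate the two-square world; return how many squares stay dirty."""
    rng = random.Random(seed)
    dirty = {"Left": True, "Right": True}
    location = "Left"
    for _ in range(steps):
        action = policy(dirty[location], rng)   # the agent sees dirt only
        if action == "Suck":
            dirty[location] = False
        else:
            location = action                   # "Left" or "Right"
    return sum(dirty.values())


def fixed_policy(is_dirty: bool, rng: random.Random) -> str:
    """Deterministic reflex: always heads Left when the square is clean."""
    return "Suck" if is_dirty else "Left"


def random_policy(is_dirty: bool, rng: random.Random) -> str:
    """Randomized reflex: picks a direction at random when clean."""
    return "Suck" if is_dirty else rng.choice(["Left", "Right"])


print(run(fixed_policy), run(random_policy))
```

The fixed rule keeps bumping into the left wall and never reaches the right square. The randomized rule wastes some moves but eventually gets there.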
Model-based Reflex Agents
So when you can’t see something, you model it!
• Create an internal variable to store your expectation of variables you can’t observe
• If I throw a ball to you and it falls short, do I know why?
  – Aerodynamics, mass, my energy levels…
  – I do have a model
    ▪ Ball falls short, throw harder
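The ball-throwing example can be sketched with one internal variable. The thrower never observes the physics, only whether the ball fell short, and adjusts its stored estimate of the effort needed. The class name, step size, and the hidden "needed" value are all hypothetical.

```python
class Thrower:
    """Keeps an internal estimate of the effort needed to reach the catcher."""

    def __init__(self, effort: float = 1.0, step: float = 0.5) -> None:
        self.effort = effort   # internal variable standing in for what is unseen
        self.step = step

    def update(self, result: str) -> None:
        # The whole model: ball falls short, throw harder (and the reverse).
        if result == "short":
            self.effort += self.step
        elif result == "long":
            self.effort -= self.step


def throw(effort: float, needed: float = 3.0, tolerance: float = 0.25) -> str:
    """The hidden world: the thrower never sees `needed` directly."""
    if effort < needed - tolerance:
        return "short"
    if effort > needed + tolerance:
        return "long"
    return "caught"


thrower = Thrower()
throws = 0
while True:
    throws += 1
    result = throw(thrower.effort)
    if result == "caught":
        break
    thrower.update(result)
print(throws, thrower.effort)
```

The thrower still does not know why the ball fell short, but the crude model is enough to act on.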
Model-based Reflex Agents
Admit it, you can’t see and understand everything
Models are very important!
• We all use models to get through our lives
  – Psychologists have many names for these context-sensitive models
• Agents need models too
Goal-based Agents
Lacking moment-to-moment performance measure
Overall goal is known
How to get from A to B?
• Current actions have future consequences
• Search and Planning are used to explore paths through state space from A to B
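Exploring paths through a state space from A to B can be sketched with breadth-first search, one of several search strategies. The graph below is invented for the example; only the endpoint names A and B come from the slide.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical state space: each state lists the states reachable in one action.
GRAPH: Dict[str, List[str]] = {
    "A": ["C", "D"],
    "C": ["E"],
    "D": ["E", "B"],
    "E": ["B"],
    "B": [],
}


def bfs_path(graph: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Return a path with the fewest actions from start to goal, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for nxt in graph.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None


print(bfs_path(GRAPH, "A", "B"))
```

The agent gets no reward along the way. It only knows when it has reached B, so it has to reason about sequences of actions rather than react to the current percept.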
Utility-based Agents
Goal-directed agents that have a utility function
• Function that maps internal and external states into a scalar
  – A scalar is a number
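A utility function turns each candidate outcome into a single number so that outcomes can be compared. The sketch below assumes a hypothetical route-choice problem. The state fields, the trade-off weight, and the predicted outcomes are all invented.

```python
def utility(state: dict) -> float:
    """Map a state to a scalar: fewer minutes and fewer dollars is better."""
    return -state["minutes"] - 2.0 * state["dollars"]


# Hypothetical predicted outcome of each available action.
OUTCOMES = {
    "highway": {"minutes": 20, "dollars": 5.0},
    "side_streets": {"minutes": 35, "dollars": 0.0},
}


def best_action(outcomes: dict, utility_fn) -> str:
    """Pick the action whose predicted outcome has the highest utility."""
    return max(outcomes, key=lambda action: utility_fn(outcomes[action]))


print(best_action(OUTCOMES, utility))
```

Both routes reach the goal, so a purely goal-based agent could not choose between them. The scalar is what lets the agent prefer one.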
Learning Agents
Learning Element
• Making improvements
Performance Element
• Selecting actions
Critic
• Provides learning element with feedback about progress
Problem Generator
• Provides suggestions for new tasks to explore state space
A taxi driver
Performance Element
• Knowledge of how to drive in traffic
Critic
• Observes tips from customers and horn honking from other cars
Learning Element
• Relates low tips to actions that may be the cause
Problem Generator
• Proposes new routes to try and improved driving skills
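The four components can be wired together in a toy version of the taxi example. Everything concrete below is an assumption made for illustration: the route names, the tip and honk numbers, the feedback formula, and the rule that the problem generator proposes each untried route once.

```python
class LearningTaxi:
    def __init__(self, routes):
        self.avg_feedback = {r: 0.0 for r in routes}   # what has been learned
        self.trips = {r: 0 for r in routes}

    def performance_element(self) -> str:
        """Select an action: the best route known so far."""
        return max(self.avg_feedback, key=self.avg_feedback.get)

    def problem_generator(self):
        """Suggest something new: a route never tried, if any remain."""
        untried = [r for r, n in self.trips.items() if n == 0]
        return untried[0] if untried else None

    @staticmethod
    def critic(tip: float, honks: int) -> float:
        """Turn raw observations into feedback (the weighting is made up)."""
        return tip - 0.5 * honks

    def learning_element(self, route: str, feedback: float) -> None:
        """Improve: fold the feedback into that route's running average."""
        self.trips[route] += 1
        self.avg_feedback[route] += (
            (feedback - self.avg_feedback[route]) / self.trips[route]
        )

    def choose_route(self) -> str:
        return self.problem_generator() or self.performance_element()


# Hypothetical world: route -> (tip in dollars, number of horn honks).
WORLD = {"main_st": (2.0, 2), "river_rd": (3.0, 0)}

taxi = LearningTaxi(list(WORLD))
for _ in range(4):
    route = taxi.choose_route()
    tip, honks = WORLD[route]
    taxi.learning_element(route, taxi.critic(tip, honks))
print(taxi.choose_route(), taxi.trips)
```

After trying each route once, the taxi settles on the one that earns better tips and fewer honks.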
Review
Outlined families of AI problems and solutions
Next class we study search problems
