A Course Material on
Artificial Intelligence
By
ASSISTANT PROFESSOR
QUALITY CERTIFICATE
This is to certify that the course material prepared by Mrs. J. Justina Princy Thilagavathy is of adequate quality and meets the knowledge requirement of the university curriculum. She has referred to more than five books, at least one of which is by a foreign author.
Signature of HD
Head & AP
TABLE OF CONTENTS
1. Introduction
2. Agents
3. Problem formulation
6. Constraint satisfaction
8. Inferences
10. Forward chaining
11. Backward chaining
12. Unification, Resolution
UNIT III - PLANNING
17. Probabilistic reasoning
18. Bayesian networks
UNIT V - LEARNING
22. Inductive learning
23. Decision trees
APPENDICES
A. Glossary
Aim: To learn the basics of designing intelligent agents that can solve general-purpose problems, represent and process knowledge, plan and act, reason under uncertainty, and learn from experience.
UNIT V LEARNING 9
Learning from observation - Inductive learning – Decision trees – Explanation based learning –
Statistical Learning methods - Reinforcement Learning
BOOK:
1. S. Russell and P. Norvig, “Artificial Intelligence – A Modern Approach”, Second Edition, Pearson Education, 2003.
REFERENCES:
UNIT-1
PROBLEM SOLVING
INTRODUCTION:
The objective of Artificial Intelligence is to understand how a system can perceive, understand, predict, and manipulate a world far larger and more complicated than itself. The goal of the field of Artificial Intelligence is to build intelligent entities.
DEFINITION:
Artificial Intelligence is the study of how to make computers do things at which, at the moment,
people are better.
SOME DEFINITIONS OF AI
“The exciting new effort to make computers think … machines with minds, in the full and
literal sense” -- Haugeland, 1985
“The art of creating machines that perform functions that require intelligence when
performed by people” -- Kurzweil, 1990
“The study of how to make computers do things at which, at the moment, people
are better” -- Rich and Knight, 1991
“The study of mental faculties through the use of computational models” -- Charniak
and McDermott, 1985
“The study of the computations that make it possible to perceive, reason, and act” --
Winston, 1992
“A field of study that seeks to explain and emulate intelligent behavior in terms of
computational processes” -- Schalkoff, 1990
AGENTS
Agent = perceive + act
Thinking
Reasoning
Planning
Definition:
An agent is anything that can be viewed as perceiving its environment through sensors and acting
upon the environment through actuators.
Ex: Robotic agent
Human agent
INTELLIGENT AGENT:
An agent uses perception of the environment to make decisions about actions to take.
The perception capability is usually called a sensor.
The actions can depend on the most recent perception or on the entire history (percept
sequence).
Fig: partial tabulation of a simple agent function for the vacuum-cleaner world (two locations, A and B)
Agent Function
1.The agent function is a mathematical function that maps a sequence of perceptions into
action.
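The agent function can be illustrated with a small sketch. The percepts, actions and table entries below are illustrative assumptions in the spirit of the vacuum-cleaner example, not part of the prescribed syllabus.

```python
# A minimal sketch of an agent function as a lookup table.
# Percepts, actions and table entries are illustrative assumptions,
# loosely based on the two-square vacuum-cleaner world (locations A and B).

table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("B", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
}

percept_sequence = []          # the entire percept history seen so far

def table_driven_agent(percept):
    """Agent function: maps the percept sequence seen so far to an action."""
    percept_sequence.append(percept)
    # Fall back to a do-nothing action for sequences not listed in the table.
    return table.get(tuple(percept_sequence), "NoOp")

print(table_driven_agent(("A", "Dirty")))   # -> Suck
```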
RATIONAL AGENT:
A rational agent is one that can take the right decision in every situation.
Performance measure: a set of criteria/test bed for the success of the agent's behavior.
The performance measures should be based on the desired effect of the agent on
the environment.
Rationality:
Definition: for every possible percept sequence, the agent is expected to take an
action that will maximize its performance measure.
Agent Autonomy:
An agent is omniscient if it knows the actual outcome of its actions; this is not possible in practice.
Autonomy: the capacity to compensate for partial or incorrect prior knowledge (usually by
learning).
NATURE OF ENVIRONMENTS:
An agent interacts with its environment through sensors and actuators. Environments can be classified along several dimensions:
– Fully observable: the agent's sensors give it the complete state of the environment at each point in time, and detect all aspects that are relevant to the choice of action.
– Strategic: the environment is deterministic except for the actions of other agents.
– Episodic: the agent's experience can be divided into episodes, each episode consisting of the agent perceiving and then performing a single action.
– Semi-dynamic
Partially observable:
If the agent's sensors give it access to only part of the state of the environment, the environment is partially observable.
Semi-dynamic:
If the environment itself does not change with the passage of time, but the agent's performance score does, the environment is semi-dynamic.
An agent solving a cross word puzzle by itself is clearly in a single agent environment.
Types of agents:
1. simple reflex agent
2. model-based reflex agent
3. goal-based agent
4. utility-based agent
Definition:
A simple reflex agent (SRA) works only if the correct decision can be made on the basis of the current percept alone, that is, only if the environment is fully observable.
Characteristics
– no plan, no goal
– do not know what they want to achieve
– act using condition-action rules
Algorithm explanation:
INTERPRET-INPUT:
This function generates an abstracted description of the current state from the percept.
RULE-MATCH:
This function returns the first rule in the set of rules that matches the given state description.
RULE-ACTION:
The action component of the matched rule is returned and executed.
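A minimal sketch of the simple reflex agent loop just described (INTERPRET-INPUT, RULE-MATCH, rule action). The condition-action rules shown are assumed vacuum-cleaner-world rules, added only for illustration.

```python
# Simple reflex agent: acts only on the current percept via condition-action rules.
# The rules below are illustrative assumptions for the vacuum-cleaner world.

RULES = [
    (lambda state: state["status"] == "Dirty", "Suck"),
    (lambda state: state["location"] == "A",   "Right"),
    (lambda state: state["location"] == "B",   "Left"),
]

def interpret_input(percept):
    """Generate an abstracted state description from the raw percept."""
    location, status = percept
    return {"location": location, "status": status}

def rule_match(state, rules):
    """Return the action of the first rule whose condition matches the state."""
    for condition, action in rules:
        if condition(state):
            return action
    return "NoOp"

def simple_reflex_agent(percept):
    state = interpret_input(percept)
    return rule_match(state, RULES)      # the matched rule's action part is returned

print(simple_reflex_agent(("A", "Dirty")))   # -> Suck
```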
Definition:
An agent which combines the current percept with the old internal state to
generate updated description of the current state.
If the world is not fully observable, the agent must remember observations about the
parts of the environment it cannot currently observe.
This usually requires an internal representation of the world (or internal state).
Since this representation is a model of the world, we call this model-based agent.
Characteristics
Algorithm explanation:
UPDATE-INPUT: This function is responsible for creating the new internal state description.
Goal-based agents:
The agent has a purpose and the action to be taken depends on the current state
and on what it tries to accomplish (the goal).
In some cases the goal is easy to achieve. In others it involves planning, sifting through a
search space for possible solutions, developing a strategy.
Characteristics
Utility-based agents
If one state is preferred over the other, then it has higher utility for the agent
The agent is aware of a utility function that estimates how close the current state is to the
agent's goal.
• Characteristics
Learning Agents
Learning element
Performance element
Critic
Problem generator
Agent Example
Purpose: compress and archive files that have not been used in a while.
Problem Formulation
• Problem formulation is the process of deciding what actions and states to consider,
given a goal
Search
Execute
PROBLEMS
– Possible Actions
• State Space – the state space forms a graph in which the nodes are
states and arcs between nodes are actions.
• Path
• Route finding
• Logistics
• VLSI layout
• Robot navigation
• Learning
TOY PROBLEM
Problem Formulation
• States
– 2 × 2² = 8 states (2 agent locations × 2² dirt configurations)
• Initial State
• Successor Function
– Legal states that result from three actions (Left, Right, Suck)
• Goal Test
• Path Cost
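This formulation can be written down directly as code. The sketch below encodes the eight-state vacuum world (states, initial state, successor function, goal test, unit path cost); the function names are our own, not from any standard library.

```python
# Problem formulation for the 2-location vacuum world (2 x 2^2 = 8 states).
# A state is (agent_location, dirt_at_A, dirt_at_B); names are illustrative.
from itertools import product

STATES = [(loc, a, b) for loc, a, b in product("AB", [True, False], [True, False])]
INITIAL_STATE = ("A", True, True)            # assumed initial state: both squares dirty

def successors(state):
    """Successor function: legal (action, next_state) pairs for Left, Right, Suck."""
    loc, dirt_a, dirt_b = state
    result = [("Left",  ("A", dirt_a, dirt_b)),   # Left moves to / stays at A
              ("Right", ("B", dirt_a, dirt_b))]   # Right moves to / stays at B
    if loc == "A":
        result.append(("Suck", ("A", False, dirt_b)))
    else:
        result.append(("Suck", ("B", dirt_a, False)))
    return result

def goal_test(state):
    """Goal: no dirt in either square."""
    _, dirt_a, dirt_b = state
    return not dirt_a and not dirt_b

def path_cost(path):
    """Each step costs 1, so the path cost is just the number of actions."""
    return len(path)

print(len(STATES), goal_test(("B", False, False)))   # -> 8 True
```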
• Uninformed strategies use only the information available in the problem definition
• Breadth-first search
• Uniform-cost search
• Depth-first search
• Depth-limited search
BREADTH-FIRST SEARCH
Definition:
The root node is expanded first, and then all the nodes generated by the node are expanded.
Implementation:
• Complete
• Time
– 1 + b + b² + … + b^d + b(b^d − 1) = O(b^(d+1))
– exponential in d
• Space
– O(b^(d+1))
– This is the big problem; an agent that generates nodes at 10 MB/sec will produce about 860 GB in 24 hours
• Optimal
• The memory requirements are a bigger problem for breadth-first search than is
execution time
Given: a search graph with start node S and goal node G (the step-by-step expansion figure is omitted).
Answer: The path found at the 2nd depth level is S-B-G (or S-C-G).
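A minimal breadth-first search sketch. The graph used below is an assumed example chosen to match the S-B-G / S-C-G answer above, since the original figure is not available.

```python
from collections import deque

# Assumed graph corresponding to the worked example: S is the start, G the goal.
GRAPH = {"S": ["A", "B", "C"], "A": ["D"], "B": ["G"], "C": ["G"], "D": [], "G": []}

def breadth_first_search(start, goal, graph):
    """Expand the shallowest unexpanded node first (FIFO queue of paths).
    This is a tree-search sketch; repeated-state checking is omitted."""
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path                      # shallowest solution found
        for child in graph[node]:
            frontier.append(path + [child])
    return None

print(breadth_first_search("S", "G", GRAPH))   # -> ['S', 'B', 'G']
```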
DEPTH-FIRST SEARCH
Definition:
Expand one node to the deepest level of the tree. If a dead end occurs, backtracking is done to the most recent previous node whose successors have not all been expanded.
• Nodes are placed on the fringe in LIFO (last-in, first-out) order; that is, a stack data structure is used to order the nodes.
• It needs to store only a single path from the root to a leaf node, along with
remaining unexpanded sibling nodes for each node on a path
Implementation:
• Complete
• Time
– O(b^m)
– But if the solutions are dense, this may be faster than breadth-first search
• Space
– O(bm)…linear space
• Optimal
– No
• When search hits a dead-end, can only back up one level at a time even if the
“problem” occurs because of a bad operator choice near the top of the tree.
Hence, only does “chronological backtracking”
Advantage:
• If more than one solution exists, or the number of levels is high, then DFS is best because only a small portion of the search space is explored.
Disadvantage:
Given: a search graph with start node S and goal node G (the step-by-step expansion figure is omitted).
Answer: The path found at the 3rd level is S-A-D-G.
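A minimal depth-first search sketch. Again the graph is an assumed example chosen to match the S-A-D-G answer above.

```python
# Assumed graph for the worked example above: the solution found is S-A-D-G.
GRAPH = {"S": ["A", "B", "C"], "A": ["D"], "B": [], "C": [], "D": ["G"], "G": []}

def depth_first_search(start, goal, graph):
    """Expand the deepest node first: the frontier is used as a LIFO stack."""
    frontier = [[start]]                     # stack of paths
    while frontier:
        path = frontier.pop()                # last in, first out
        node = path[-1]
        if node == goal:
            return path
        # Push children in reverse so the leftmost child is expanded first.
        for child in reversed(graph[node]):
            if child not in path:            # avoid trivial loops along a path
                frontier.append(path + [child])
    return None

print(depth_first_search("S", "G", GRAPH))   # -> ['S', 'A', 'D', 'G']
```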
DEPTH-LIMITED SEARCH
Definition:
A cut-off (maximum depth level) is introduced in this search technique to overcome the disadvantage of depth-first search. The cut-off value depends on the number of states. DLS can be implemented as a simple modification to the general tree-search algorithm or to the recursive DFS algorithm. DLS imposes a fixed depth limit on a DFS.
• Complete
– Only if l ≥ d (incomplete when l < d)
• Time
– O(b^l)
• Space
– O(bl)
• Optimal
– No if l > d
Advantage:
Disadvantage:
Given: a map with five states A, B, C, D and E.
The number of states in the given map is five, so it is possible to reach the goal state at a maximum depth of four. Therefore the cut-off value is four.
(Step-by-step expansion figure omitted.)
Definition:
• Iterative deepening depth-first search is a strategy that sidesteps the issue of choosing the best depth limit by trying all possible depth limits in turn.
A related idea is to use increasing path-cost limits instead of increasing depth limits; the resulting algorithm is called iterative lengthening search.
Implementation:
• Complete
– Yes
• Time: N(IDS) = (d)b + (d − 1)b² + … + (1)b^d
– O(b^d)
• Space
– O(bd)
• Optimal
Advantages:
• This method is preferred for large state space and when the depth of the search
is not known.
Disadvantages:
– If b = 4, then the worst case is 1.78 × 4^d, i.e., 78% more nodes are searched than exist at depth d (in the worst case).
Given: a search tree with root A, children B, C and F, and the goal G below F (expansion figure omitted).
Limit = 0: only A is expanded.
Limit = 1: A is expanded, generating B, C and F.
Limit = 2: the search reaches G below F.
Answer: Since it is an IDS tree, the solution at the lowest depth limit, i.e. A-F-G, is selected as the solution path.
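A sketch of iterative deepening search as repeated depth-limited searches. The graph is an assumed reconstruction of the example above, in which the shallowest solution is A-F-G.

```python
# Iterative deepening DFS: repeated depth-limited searches with limit 0, 1, 2, ...
# Assumed graph for the example above; the shallowest solution found is A-F-G.
GRAPH = {"A": ["B", "C", "F"], "B": ["D", "E"], "C": [], "F": ["G"],
         "D": [], "E": [], "G": []}

def depth_limited_search(node, goal, graph, limit, path=None):
    """Recursive DFS that treats nodes at depth `limit` as having no successors."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None                          # cutoff reached
    for child in graph[node]:
        result = depth_limited_search(child, goal, graph, limit - 1, path)
        if result is not None:
            return result
    return None

def iterative_deepening_search(start, goal, graph, max_depth=20):
    for limit in range(max_depth + 1):       # limit = 0, 1, 2, ...
        result = depth_limited_search(start, goal, graph, limit)
        if result is not None:
            return result
    return None

print(iterative_deepening_search("A", "G", GRAPH))   # -> ['A', 'F', 'G']
```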
BI-DIRECTIONAL SEARCH
Definition:
It is a strategy that simultaneously searches both the directions (i.e) forward from the
initial state and backward from the goal state and stops when the two searches meet
in the Middle.
• Alternate searching from the start state toward the goal and from the goal state
toward the start.
• Works well only when there are unique start and goal states.
3. Complete: Yes
4. Optimal: Yes
Advantages:
Disadvantages:
The space requirement is the most significant weakness of bi-directional search. If the two searches do not meet at all, complexity arises in the search technique. In the backward search, calculating predecessors is a difficult task. If more than one goal state exists, then explicit multiple-state searches are required.
• Completeness
• Time
• Space
• Optimal
Heuristic / Informed
It uses additional information about nodes (heuristics) that have not yet been explored to
decide which nodes to examine next
Can find solutions more efficiently than search strategies that do not use domain specific
knowledge.
Best-first search: node is selected for expansion based on an evaluation function f(n)
Implementation:
Definition:
A best-first search that uses the heuristic h(n), the estimated cost from node n to the goal, to select the next node to expand is called greedy search.
Ex:
Given,
Solution:
From the given graph and the estimated costs, the goal state B (Bucharest) is to be reached from A (Arad). Apply the evaluation function h(n) to find a path from A to B.
From F (Fagaras) the goal state B is reached. Therefore the path from A to B using greedy search is A-S-F-B, with path cost 450 (i.e. 140 + 99 + 211), for the problem of finding a route from Arad to Bucharest.
• Complete? No – can get stuck in loops, e.g., Iasi → Neamt → Iasi → Neamt
• Optimal? No
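A sketch of greedy best-first search on the Arad-to-Bucharest route-finding example above. The straight-line-distance values are the usual textbook figures and the map fragment is partial; both should be treated as illustrative.

```python
import heapq

# Fragment of the Romania map from the route-finding example above.
# h(n): straight-line distance to Bucharest (textbook values, used for illustration).
H = {"Arad": 366, "Zerind": 374, "Timisoara": 329, "Sibiu": 253,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193, "Bucharest": 0}
GRAPH = {
    "Arad": ["Zerind", "Sibiu", "Timisoara"],
    "Sibiu": ["Arad", "Oradea", "Fagaras", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
    "Zerind": ["Arad"], "Timisoara": ["Arad"], "Oradea": ["Sibiu"],
    "Rimnicu Vilcea": ["Sibiu"], "Bucharest": [],
}

def greedy_best_first(start, goal, graph, h):
    """Expand the node with the lowest heuristic value f(n) = h(n) first."""
    frontier = [(h[start], [start])]
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for child in graph[node]:
            heapq.heappush(frontier, (h[child], path + [child]))
    return None

print(greedy_best_first("Arad", "Bucharest", GRAPH, H))
# -> ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']
```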
• Standard search problem: the state is a "black box" – any data structure that supports the successor function, heuristic function, and goal test.
• CSP: the state is defined by variables Xi with values from a domain Di, and the goal test is a set of constraints specifying allowable combinations of values for subsets of variables.
• This allows useful general-purpose algorithms with more power than standard search algorithms.
Arc consistency:
An arc X → Y is consistent iff for every value x of X there is some allowed value y of Y.
Path consistency:
Path consistency means that any pair of adjacent variables can always be extended to a third neighbouring variable.
K-consistency:
A CSP is k-consistent if, for any set of k − 1 variables and any consistent assignment to those variables, a consistent value can always be assigned to any k-th variable.
Example: Map-Coloring
• Domains Di = {red,green,blue}
Constraint graph
Varieties of CSPs
• Discrete variables
– finite domains:
– infinite domains:
• e.g., job scheduling, variables are start/end days for each job
• Continuous variables
Varieties of constraints:
– e.g., SA ≠ green
– e.g., SA ≠ WA
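A minimal backtracking sketch of the map-colouring CSP described above; the neighbour relation encodes the Australia map, and the consistency check implements binary constraints such as SA ≠ WA.

```python
# Map-colouring CSP sketch: variables are regions, domains are {red, green, blue},
# and the constraint is that neighbouring regions take different colours.
NEIGHBOURS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"], "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}
DOMAIN = ["red", "green", "blue"]

def consistent(var, value, assignment):
    """Binary constraint check: no neighbour may already have the same colour."""
    return all(assignment.get(n) != value for n in NEIGHBOURS[var])

def backtracking_search(assignment=None):
    assignment = assignment or {}
    if len(assignment) == len(NEIGHBOURS):
        return assignment
    var = next(v for v in NEIGHBOURS if v not in assignment)   # pick an unassigned variable
    for value in DOMAIN:
        if consistent(var, value, assignment):
            result = backtracking_search({**assignment, var: value})
            if result is not None:
                return result
    return None                                                # no value worked: backtrack

print(backtracking_search())
```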
Knowledge representation
A variety of ways of representing knowledge (facts) have been exploited in AI programs. Facts are truths in some relevant world; these are the things we want to represent.
Propositional logic
Example
Inference
Inference is deriving new sentences from old.
Modus ponens
There are standard patterns of inference that can be applied to derive chains of
conclusions that lead to the desired goal. These patterns of inference are called inference
rules.
Entailment
Propositions tell about the notion of truth and can be applied to logical reasoning. We can have logical entailment between sentences: a sentence follows logically from another sentence. In mathematical notation we write KB ⊨ α.
Knowledge-based agents (logical agents): the central component of a knowledge-based agent is its knowledge base, or KB. Informally, a knowledge base is a set of sentences. Each sentence is expressed in a language called a knowledge representation language and represents some assertion about the world.
The syntax of propositional logic defines the allowable sentences. The atomic sentences - the indivisible syntactic elements - consist of a single proposition symbol. Each such symbol stands for a proposition that can be true or false. We will use uppercase names for symbols: P, Q, R, and so on.
The basic syntactic elements of first-order logic are the symbols that stand for objects, relations, and functions. The symbols come in three kinds: constant symbols (which stand for objects), predicate symbols (which stand for relations), and function symbols (which stand for functions).
We adopt the convention that these symbols will begin with uppercase letters.
Example: Constant symbols: Richard
Quantifiers
a) Universal (∀) and
b) Existential (∃)
Universal quantification
(∀x) P(x) means that P holds for all values of x in the domain associated with that variable.
Existential quantification
(∃x) P(x) means that P holds for some value of x in the domain associated with that variable.
E.g., (∃x) mammal(x) ∧ lays-eggs(x)
This permits one to make a statement about some object without naming it.
The sentence ∀x P, where P is a logical expression, says that P is true for every object x.
Example
The task will determine what knowledge must be represented in order to connect problem instances to answers. This step is analogous to the PEAS process for designing agents.
Once the choices have been made, the result is a vocabulary that is known as the ontology of the domain. The word ontology means a particular theory of the nature of being or existence.
The knowledge engineer writes down the axioms for all the vocabulary terms. This pins down
(to the extent possible) the meaning of the terms, enabling the expert to check the content.
Often, this step reveals misconceptions or gaps in the vocabulary that must be fixed by returning
to step 3 and iterating through the process.
For a logical agent, problem instances are supplied by the sensors, whereas a "disembodied"
knowledge base is supplied with additional sentences in the same way that traditional
programs are supplied with input data.
This is where the reward is: we can let the inference procedure operate on the axioms and
problem-specific facts to derive the facts we are interested in knowing.
We will develop an ontology and knowledge base that allow us to reason about digital Circuits
of the kind shown in Figure 8.4. We follow the seven-step process for knowledge engineering
There are many reasoning tasks associated with digital circuits. At the highest level, one
analyzes the circuit's functionality. For example, what are all the gates connected to the first
input terminal? Does the circuit contain feedback loops? These will be our tasks in this section.
What do we know about digital circuits? For our purposes, they are composed of wires and gates. Signals flow along wires to the input terminals of gates, and each gate produces a signal on its output terminal.
Decide on a vocabulary:
We now know that we want to talk about circuits, terminals, signals, and gates. The next
step is to choose functions, predicates, and constants to represent them. We will start from
individual gates and move up to circuits. First, we need to be able to distinguish a gate from
other gates. This is handled by naming gates with constants: X1, X2, and so on.
One sign that we have a good ontology is that there are very few general rules which need
to be specified. A sign that we have a good vocabulary is that each rule can be stated clearly
and concisely. With our example, we need only seven simple rules to describe everything we
need to know about circuits:
1. If two terminals are connected, then they have the same signal:
The circuit shown in Figure 8.4 is encoded as circuit C1 with the following description.
What combinations of inputs would cause the first output of C1 (the sum bit) to be 0 and the second output of C1 (the carry bit) to be 1?
We can perturb the knowledge base in various ways to see what kinds of erroneous
behaviors
emerge.
The best way to find usage of first-order logic is through examples. The examples can be taken from some simple domains. In knowledge representation, a domain is just some part of the world about which we wish to express some knowledge.
Sentences are added to a knowledge base using TELL, exactly as in propositional logic.
Such sentences are called assertions.
For example, we can assert that John is a king and that kings are persons:
Note:
(c) The facts inferred on the 2nd iteration are at the top level
ALGORITHM
Forward chaining applies a set of rules and facts to deduce whatever conclusions can be
derived. In backward chaining ,we start from a conclusion, which is the hypothesis we
wish to prove and we aim to show how that conclusion can be reached from the rules and
facts in the data base. The conclusion we are aiming to prove is called a goal, and the
reasoning in this way is known as goal-driven.
Note:
(a) To prove Criminal(West), we have to prove the four conjuncts below it.
(b) Some of them are in the knowledge base, and others require further backward chaining.
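The forward-chaining procedure described above can be sketched over propositional definite clauses. The rules and facts below are a propositionalised, simplified version of the Criminal(West) example and are assumptions made for illustration only.

```python
# Forward chaining over propositional definite clauses: repeatedly fire any rule
# whose premises are all known facts, until no new conclusions can be derived.
# Rules and facts below are illustrative assumptions.
RULES = [
    ({"american", "weapon", "sells", "hostile"}, "criminal"),
    ({"missile"}, "weapon"),
    ({"missile", "owns"}, "sells"),
    ({"enemy"}, "hostile"),
]
FACTS = {"american", "missile", "owns", "enemy"}

def forward_chaining(rules, facts):
    facts = set(facts)
    changed = True
    while changed:                      # keep iterating while new facts appear
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)   # rule fires: add its conclusion
                changed = True
    return facts

print("criminal" in forward_chaining(RULES, FACTS))   # -> True
```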
UNIFICATION:
UNIFY(p, q) = θ, where SUBST(θ, p) = SUBST(θ, q); that is, unification finds a substitution θ that makes two sentences identical.
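A compact sketch of the unification algorithm. The term encoding (tuples for compound terms, lowercase strings for variables) is our own convention, and the occurs check is omitted for brevity.

```python
# A small unification sketch for first-order terms. Variables are strings that
# start with a lowercase letter; compound terms are tuples (functor, arg1, ...).
# Note: the occurs check is omitted for brevity.
def is_variable(t):
    return isinstance(t, str) and t[:1].islower()

def substitute(t, theta):
    """Apply the substitution theta to a term."""
    while is_variable(t) and t in theta:
        t = theta[t]
    if isinstance(t, tuple):
        return tuple([t[0]] + [substitute(a, theta) for a in t[1:]])
    return t

def unify(x, y, theta=None):
    """Return a substitution theta with SUBST(theta, x) = SUBST(theta, y), or None."""
    if theta is None:
        theta = {}
    x, y = substitute(x, theta), substitute(y, theta)
    if x == y:
        return theta
    if is_variable(x):
        return {**theta, x: y}
    if is_variable(y):
        return {**theta, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y) and x[0] == y[0]:
        for a, b in zip(x[1:], y[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

# UNIFY(Knows(John, x), Knows(y, Mary)) -> {'y': 'John', 'x': 'Mary'}
print(unify(("Knows", "John", "x"), ("Knows", "y", "Mary")))
```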
RESOLUTION:
INF
CNF
INF WITH REFUTATION
CNF WITH REFUTATION
UNIT III-PLANNING
The agent first generates a goal to achieve and then constructs a plan to achieve it from the current state.
PROBLEMSOLVING TO PLANNING
Forward search
Backward search
Heuristic search
Solutions
Why Planning ?
Intelligent agents must operate in the world. They are not simply passive reasoners (knowledge representation, reasoning under uncertainty) or problem solvers (search); they must also act on the world.
We want intelligent agents to act in “intelligent ways”. Taking purposeful actions, predicting the
expected effect of such actions, composing actions together to achieve complex goals.
E.g. if we have a robot we want robot to decide what to do; how to act to achieve our goals
Planning Problem
Choose a step S from the plan, or a new step S obtained by instantiating an operator that has c as an effect.
• If there is no such step, Fail.
• Fail – go back to the most recent non-deterministic choice and try a different one that has not been tried before.
Resolve threats:
• A step S threatens a causal link Si →c Sj iff ¬c ∈ effects(S) and it is possible that Si < S < Sj.
• For each threat, choose a way to resolve it, e.g., by ordering S before Si (demotion) or after Sj (promotion).
Threats with Variables If c has variables in it, things are kind of tricky.
•We could possibly resolve the threat by adding a negative variable binding constraint,
saying that two variables or a variable and a constant cannot be bound to one another
• Another strategy is to ignore such threats until the very end, hoping that the variables will
become bound and make things easier to deal with
Shopping Domain
• Start (initial state): At(Home), Sells(SM, Milk), Sells(SM, Bananas), Sells(HDW, Drill)
• Goal: Have(Milk) ∧ Have(Bananas) ∧ Have(Drill)
• Actions:
– Buy(x, store)
  Pre: At(store), Sells(store, x)
  Eff: Have(x)
– Go(x, y)
  Pre: At(x)
  Eff: At(y), ¬At(x)
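The shopping operators above can be encoded in a STRIPS-like style, with precondition, add and delete lists; the dictionary-based encoding below is an assumption made for this sketch, not a fixed notation from the notes.

```python
# A STRIPS-style encoding of the shopping operators: each action has a
# precondition set and add/delete effect sets. Predicates are encoded as tuples.
def go(x, y):
    return {"name": f"Go({x},{y})",
            "pre":  {("At", x)},
            "add":  {("At", y)},
            "del":  {("At", x)}}

def buy(item, store):
    return {"name": f"Buy({item},{store})",
            "pre":  {("At", store), ("Sells", store, item)},
            "add":  {("Have", item)},
            "del":  set()}

def apply_action(state, action):
    """Apply an action if its preconditions hold; return the successor state."""
    if not action["pre"] <= state:
        raise ValueError(f"preconditions of {action['name']} not satisfied")
    return (state - action["del"]) | action["add"]

# Initial state of the shopping problem (HDW = hardware store, SM = supermarket).
state = {("At", "Home"), ("Sells", "HDW", "Drill"),
         ("Sells", "SM", "Milk"), ("Sells", "SM", "Bananas")}
state = apply_action(state, go("Home", "HDW"))
state = apply_action(state, buy("Drill", "HDW"))
print(("Have", "Drill") in state)   # -> True
```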
Shopping problem
(Partial-order planning diagrams omitted.) The plan grows from the start step, which supplies At(Home) and the Sells facts, toward the finish step, which requires Have(D), Have(M) and Have(B). Buy(Drill) requires At(HDW) and Sells(HDW, D); Buy(Milk) and Buy(Bananas) require At(SM) and the corresponding Sells facts. Go(HDW) supplies At(HDW) from At(x1) with the binding x1 = Home, and Go(SM) supplies At(SM) from At(x2) with the binding x2 = HDW, which also resolves the threat arising from ¬At(x2). Note that causal links imply an ordering of the steps.
Levels
Mutex between actions
Mutex holds between fluents
Graph plan algorithm
Continuous planning
Multiagent planning
UNIT-IV: UNCERTAINTY
4.1 UNCERTAINTY
To act rationally under uncertainty we must be able to evaluate how likely certain
things are. With FOL a fact F is only useful if it is known to be true or false. But we need
to be able to evaluate how likely it is that F is true. By weighing likelihoods of events
(probabilities) we can develop mechanisms for acting rationally under uncertainty.
When do we stop?
Cannot list all possible causes.
We also want to rank the possibilities. We don’t want to start drilling for a cavity before
checking for more likely causes first.
Axioms Of Probability
1.Pr(U) = 1
2.Pr(A) ∈[0,1]
3.Pr(A ∪B) = Pr(A) + Pr(B) –Pr(A ∩B)
Multiply connected graphs have 2 nodes connected by more than one path
Techniques for handling:
o Clustering: Group some of the intermediate nodes into one meganode.
Pro: Perhaps best way to get exact evaluation.
Con: Conditional probability tables may exponentially increase in size.
o Cutset conditioning: Obtain simpler polytrees by instantiating variables as constants.
Con: May obtain an exponential number of simpler polytrees.
Pro: It may be safe to ignore trees with low probability (bounded cutset conditioning).
o Stochastic simulation: run through the net with randomly chosen values for each node (weighted by prior probabilities).
Bayes’ nets:
A technique for describing complex joint distributions (models) using simple, local
distributions
(conditional probabilities)
More properly called graphical models
Local interactions chain together to give global indirect interactions
Such networks are called directed acyclic graphs, or simply dags. There are a
number of steps that a knowledge engineer must undertake when building a Bayesian
network. At this stage we will present these steps as a sequence; however it is important to
note that in the real-world the process is not so simple.
Boolean nodes, which represent propositions, taking the binary values true (T)
and false (F). In a medical diagnosis domain, the node Cancer would represent
the proposition that a patient has cancer.
Ordered values. For example, a node Pollution might represent a patient's pollution exposure and take the values low, medium, high.
Integral values. For example, a node called Age might represent a patient’s age
and have possible values from 1 to 120.
Even at this early stage, modeling choices are being made. For example, an
alternative to representing a patient’s exact age might be to clump patients into different
age groups, such as baby, child, adolescent, young, middleaged, old. The trick is to choose
values that represent the domain efficiently.
In general, the problem of Bayes Net inference is NP-hard (exponential in the size
of the graph).
For singly-connected networks or polytrees in which there are no undirected loops,
there are linear time algorithms based on belief propagation.
Each node sends local evidence messages to its children and parents.
Each node updates its belief in each of its possible values based on incoming messages from its neighbors and propagates evidence on to its neighbors.
There are approximations to inference for general networks based on loopy belief propagation, which iteratively refines probabilities and often converges to a good approximation.
TEMPORAL MODELS
1 Monitoring or filtering
2 Prediction
Bayes' Theorem
Many of the methods used for dealing with uncertainty in expert systems are based
on Bayes' Theorem.
Notation:
P(A) - probability of event A
P(A ∧ B) - probability of events A and B occurring together
P(A | B) - conditional probability of event A given that event B has occurred
If A and B are independent, then P(A | B) = P(A).
Expert systems usually deal with events that are not independent, e.g. a disease and
its symptoms are not independent.
Theorem
P(A ∧ B) = P(A | B) · P(B) = P(B | A) · P(A), therefore P(A | B) = P(B | A) · P(A) / P(B)
The desired diagnostic relationship on the left can be calculated based on the known
statistical quantities on the right.
Toothache ¬ Toothache
Cavity 0.04 0.06
¬ Cavity 0.01 0.89
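The diagnostic use of the joint table above can be checked directly, since P(A | B) = P(A ∧ B) / P(B); the numbers below are taken from the table.

```python
# Joint distribution taken from the table above.
P = {("Cavity", "Toothache"): 0.04, ("Cavity", "NoToothache"): 0.06,
     ("NoCavity", "Toothache"): 0.01, ("NoCavity", "NoToothache"): 0.89}

p_toothache = P[("Cavity", "Toothache")] + P[("NoCavity", "Toothache")]   # P(Toothache) = 0.05
p_cavity_given_toothache = P[("Cavity", "Toothache")] / p_toothache       # 0.04 / 0.05

print(round(p_toothache, 2), round(p_cavity_given_toothache, 2))          # -> 0.05 0.8
```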
Problems:
The size of the table is combinatoric: the product of the number of possibilities for
each random variable. The time to answer a question from the table will also be
combinatoric. Lack of evidence: we may not have statistics for some table entries, even
though those entries are not impossible.
Chain Rule
Bayesian Networks
Bayesian networks, also called belief networks or Bayesian belief networks, express
relationships among variables by directed acyclic graphs with probability tables stored at
the nodes.[Example from Russell & Norvig.]
1 A burglary can set the alarm off
2 An earthquake can set the alarm off
3 The alarm can cause Mary to call
4 The alarm can cause John to call
If a Bayesian network is well structured as a poly-tree (at most one path between
any two nodes), then probabilities can be computed relatively efficiently. One kind of
algorithm, due to Judea Pearl, uses a message-passing style in which nodes of the network
compute probabilities and send them to nodes they are connected to. Several software
packages exist for computing with belief networks.
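A small sketch of how the burglary/earthquake/alarm network defines a joint distribution as a product of local conditional probabilities. The CPT numbers are the commonly quoted illustrative values for this example and are assumptions here, not figures given in these notes.

```python
# Burglary / Earthquake / Alarm / JohnCalls / MaryCalls network from the text above.
# CPT numbers below are the commonly used illustrative values (assumptions here).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94, (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}     # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}     # P(MaryCalls | Alarm)

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as a product of the local conditional probabilities."""
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return pb * pe * pa * pj * pm

# P(JohnCalls, MaryCalls, Alarm, no burglary, no earthquake) ~= 0.00063
print(joint(False, False, True, True, True))
```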
A Hidden Markov Model (HMM) tagger chooses the tag for each word that maximizes:
P(word | tag) × P(tag | previous n tags)   [Jurafsky, op. cit.]
In practice, trigram taggers are most often used, and a search is made for the best
set of tags for the whole sentence; accuracy is about 96%.
The assumptions behind an HMM are that the state at time t+1 only depends on the
state at time t, as in the Markov chain. The observation at time t only depends on the state
at time t. The observations are modeled using the variable for each time t whose domain is
the set of possible observations. The belief network representation of an HMM is depicted
in Figure. Although the belief network is shown for four stages, it can proceed indefinitely.
Note that all state and observation variables after Si are irrelevant because they are not observed, and can be ignored when the conditional distribution P(Si | O0, ..., Ok) is computed.
UNIT-V
LEARNING
Introduction:
What is learning?
Learning denotes changes in the system that are adaptive in the sense that they enable the
system to do the same task or tasks drawn from the same population more effectively the next
time (Simon, 1983).
(Michalski, 1986).
A computer program learns if it improves its performance at some task through experience
(Mitchell, 1997).
So what is learning?
(1) acquire and organize knowledge (by building, modifying and organizing internal
representations of some external reality);
(2) discover new knowledge and theories (by creating hypotheses that explain some data or
phenomena);
(3) acquire skills (by gradually improving their motor or cognitive skills through repeated
practice,
sometimes involving little or no conscious thought).
(4) Learning results in changes in the agent (or mind) that improve its competence and/or
efficiency.
(5) Learning is essential for unknown environments, i.e., when the designer lacks omniscience.
Learning agents:
• Four Components
1. Performance Element: collection of knowledge and procedures to decide on the next action.
2. Learning Element: takes in feedback from the critic and modifies the performance element
accordingly.
3. Critic: provides the learning element with information on how well the agent is doing based on a
fixed performance standard. E.g. the audience
4. Problem Generator: provides the performance element with suggestions on new actions to take.
• Information about the results of possible actions the agent can take
Learning element
Type of feedback:
Inductive learning: in supervised learning we have a set of pairs {(xi, f(xi))} for 1 ≤ i ≤ n, and our aim is to determine f by some adaptive algorithm. It is a machine learning approach in which rules are inferred from facts or data; in logic, it is reasoning from the specific to the general. Theoretical results in machine learning mainly deal with a type of inductive learning called supervised learning, in which an algorithm is given samples that are labeled in some useful way. With inductive learning algorithms such as artificial neural networks, a real robot may learn only from previously gathered data. Another option is to let the robot learn everything around it by inducing facts from the environment; this is known as inductive learning. Finally, one could let the robot evolve and optimise its performance over several generations.
Simplest: Construct a decision tree with one leaf for every example = memory based learning.
Not very good generalization.
Advanced: Split on each variable so that the purity of each split increases (i.e. either only yes or
only no)
• Collect a complete set of examples (training set) from which the decision tree can derive a
hypothesis to define (answer) the goal predicate.
Problem: decide whether to wait for a table at a restaurant, based on the following attributes:
• Trivially, there is a consistent decision tree for any training set with one path to leaf for
each
Limitations
• Decision trees are good for some kinds of functions, and bad for others.
“The most likely hypothesis is the simplest one that is consistent with all
observations.”
• Idea: a good attribute splits the examples into subsets that are (ideally) "all positive" or "all
negative"
Attribute-based representations
A chosen attribute A divides the training set E into subsets E1, … , Ev according to their values for A, where A has v distinct values. The information gain (IG), or expected reduction in entropy, from the attribute test is
Gain(A) = I(p/(p+n), n/(p+n)) − remainder(A), where
remainder(A) = Σᵢ ((pᵢ + nᵢ)/(p + n)) · I(pᵢ/(pᵢ + nᵢ), nᵢ/(pᵢ + nᵢ)).
• Patrons has the highest IG of all attributes and so is chosen by the DTL algorithm as the
root
• A learning algorithm is good if it produces hypotheses that do a good job of predicting the classifications of unseen examples.
• Test the algorithm's prediction performance on a set of new examples, called a test set.
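The entropy and remainder calculation above can be made concrete. The sketch below computes the information gain of one Boolean attribute; the tiny data set is an illustrative assumption, not the restaurant data from the text.

```python
import math

def entropy(pos, neg):
    """Entropy I(p, n) of a Boolean classification with pos/neg example counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def information_gain(examples, attribute):
    """Gain(A) = I(parent) - remainder(A), where A splits the examples into subsets."""
    pos = sum(1 for e in examples if e["wait"])
    neg = len(examples) - pos
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        sp = sum(1 for e in subset if e["wait"])
        remainder += len(subset) / len(examples) * entropy(sp, len(subset) - sp)
    return entropy(pos, neg) - remainder

# Tiny illustrative data set: does "hungry" predict whether we wait?
EXAMPLES = [
    {"hungry": True,  "wait": True}, {"hungry": True,  "wait": True},
    {"hungry": False, "wait": False}, {"hungry": False, "wait": True},
]
print(round(information_gain(EXAMPLES, "hungry"), 3))   # -> 0.311
```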
• Basic idea
– Given an example, construct a proof for the goal predicate that applies using the background
knowledge.
– Construct a new rule, LHS with the leaves of the proof tree and RHS with the variabilized
goal.
– Drop any conditions that are always true regardless of value of variables in the goal.
• Any partial subtree can be used for the extracted general rule; how do we choose?
– Rules should provide a speed increase by eliminating dead-ends and shortening the proof.
View learning as Bayesian updating of a probability distribution over the hypothesis space
H is the hypothesis variable, with values h1, h2, . . . and prior P(H). The jth observation dj gives the outcome of random variable Dj; the training data are d = d1, . . . , dN.
P(hi|d) = αP(d|hi)P(hi)
Example
What kind of bag is it? What flavour will the next candy be?
2. The Bayesian prediction is optimal, whether the data set is small or large.
1. The hypothesis space is usually very large or infinite, so summing over the hypothesis space is often intractable.
2. Overfitting occurs when the hypothesis space is too expressive, so that some hypotheses fit the data set too well.
3. Use prior to penalize complexity.
Reinforcement learning
•Frequency of rewards:
(Figure: learning-agent architecture, showing the environment connected through sensors and actuators to the agent, whose components are the critic, learning element, performance element and problem generator; the critic compares feedback against a performance standard and the learning element sends changes to the performance element.)
• The reward is part of the input percept; the agent must be hardwired to recognize it as a reward and not as another sensory input. E.g., animal psychologists have studied reinforcement in animals.
– The agent starts in state (1,1) and experiences a sequence of state transitions until it reaches a terminal state.
• Idea: learn how states are connected - the adaptive dynamic programming (ADP) agent.
– This is a supervised learning task with input = state-action pair and output = resulting state; the transition model can be represented as a table of probabilities.
• Estimate the transition probability T(s, a, s') from the frequency with which s' is reached when executing a in s.
• E.g., from state (1,3), Right is executed three times and the resulting state is (2,3) two of those times, so T((1,3), Right, (2,3)) is estimated to be 2/3.
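The frequency estimate above, T((1,3), Right, (2,3)) ≈ 2/3, can be computed directly from observed transitions. The outcome of the third trial is not stated in the notes, so it is assumed below.

```python
from collections import Counter, defaultdict

# Observed (state, action, next_state) transitions: Right was executed three
# times in state (1,3) and led to (2,3) twice, as in the example above.
observations = [((1, 3), "Right", (2, 3)),
                ((1, 3), "Right", (2, 3)),
                ((1, 3), "Right", (1, 3))]   # assumed outcome of the third trial

def estimate_transition_model(observations):
    """Estimate T(s, a, s') as the observed frequency of s' after doing a in s."""
    counts = defaultdict(Counter)
    for s, a, s_next in observations:
        counts[(s, a)][s_next] += 1
    model = {}
    for (s, a), outcomes in counts.items():
        total = sum(outcomes.values())
        for s_next, c in outcomes.items():
            model[(s, a, s_next)] = c / total
    return model

T = estimate_transition_model(observations)
print(T[((1, 3), "Right", (2, 3))])   # -> 0.666...
```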
GLOSSARY
2. Turing test - Defines the intelligent behavior as the ability to achieve human-level
performance in all cognitive tasks, sufficient to fool an interrogator.
3. Agent - Anything that can be viewed as perceiving its environment through sensors
and acting upon that environment through actuators.
4. Rational agent - Rational agent is one that does the right thing. A system is rational
if it does the “right thing”, given what it knows.
5. Omniscience agent - It is one which knows the actual outcome of its actions & can
act accordingly.
6. Agent program - Takes the current percept as input from the sensors and returns an action to the actuators.
7. Agent function - Abstract mathematical description. That maps any given percept
sequence to an action.
10. Depth limited search - Supplying depth-first with a predetermined depth limit l.
That is, nodes at depth l are treated as if they have no successors. This approach is
called depth-limited search.
11. Uniformed search - Distinguish a goal state from a non-goal state. Also known as
blind search.
12. Informed search - It is one that uses problem-specific knowledge beyond the
definition of the problem itself and can find solutions more efficiently than an
uninformed strategy.
14. Breadth first search - The root node is expanded first then all the nodes generated
by the root node are expanded next and their successors and so on.
15. Greedy best-first search - Expands the node that is closest to the goal, on the
grounds that this is likely to lead to a solution quickly. Thus, it evaluates nodes by
using the heuristic function f(n) = h(n).
16. A* search - evaluates nodes by combining g(n), the cost to reach the node, and
h(n), the cost to get from the node to the goal. f (n)=g(n)+h(n)
18. Local maxima - Is a peak that is higher than each of its neighboring states, but
lower than the global maximum.
19. Ridges - Results in a sequence of local maxima that is very difficult for greedy
algorithms to navigate.
20. Plateaux - An area of the state space landscape where the evaluation function is flat.
21. Hill Climbing Search - Is simply a loop that continually moves in the direction of
increasing value that is uphill. It terminates when it reaches a “peak” where no
neighbor has a higher value.
22. Genetic algorithm - A variant of stochastic beam in which successor states are
generated by combining two parent states, rather than by modifying a single state.
23. Online search problems -Solved only by an agent executing actions, rather than by
a purely computational process. Assume that the agent knows the following:
25. Linear constraints - Constraints in which each variable appears only in linear
form.
26. Unary Constraints – Constraints that restrict the value of a single variable.
27. Binary Constraints - A binary constraint relates two variables. A CSP with only binary constraints can be represented as a constraint graph.
28. Game - Defined by the initial state, the legal actions in each state, a terminal test
and a utility function that applies to terminal states.
29. Offline search - Compute a complete solution before setting in the real world and
then execute the solution without recourse to their percepts.
31. Minimum remaining values - Choosing the variable with the fewest "legal" values. Otherwise called the "most constrained variable" or "fail first" heuristic.
32. Informed search strategy - Uses problem specific knowledge beyond the
definition of the problem itself.
33. Best First Search approach - An instance of the general TREE SEARCH
algorithm in which a node is selected for expansion based on an evaluation
function, f (n).
34. Nested Quantifier - Express the more complex sentences using multiple
quantifiers.
35. Equality symbol - Used to make the statements more effective that two terms refer
to the same object.
36. Higher Order Logic - allows quantifying over relations and functions as well as
over objects.
37. First Order Logic - Representation language that is far more powerful than
propositional logic.
39. Syntax - Describes the possible configuration that can constitute sentences.
40. Semantics - Determines the facts in the world to which the sentences refer.
41. Entailment - The generation of new sentences that are necessarily true given that the old sentences are true. This relation between sentences is called entailment.
42. Tuple - Collection of objects arranged in a fixed order and is written with angle
brackets surrounding the objects.
43. Symbols - The basic syntactic elements of first order logic are the symbols that
stand for objects, relations and functions. The symbols are in three kinds. Constant
symbols which stand for objects, Predicate symbols which stand for relations and
Function symbol which stand for functions.
46. Datalog - Set of first order definite clauses with no function symbols.
48. Prolog programs - set of definite clauses written in a notation somewhat different
from standard first-order logic.
50. Situations - logical terms consisting of the initial situation and all situations that
are generated by applying an action to a situation.
51. Fluent - functions and predicates that vary from one situation to the next, such as
the location of the agent.
52. Learning - takes many forms, depending on the nature of the performance element,
the component to be improved, and the available feedback.
53. Inductive learning - Learn a function from examples of its inputs and outputs.
54. PAC-learning algorithm - Any learning algorithm that returns hypothesis that are
probably approximately correct.
55. Sample Complexity - The number of required examples, as a function of ε.
56. Neuron - A cell in the brain whose principal function is the collection, processing
and dissemination of electrical signals.
59. Define language - enables us to communicate most of what we know about the
world.
60. Grammar -A finite set of rules that specifies a language. Formal languages always
have grammar. Natural languages have no grammar.
61. Metaphor - A figure of speech in which a phrase with one literal meaning is used
to suggest a different meaning by way of an analogy.
62. Discourse - any string of language usually that is more than one sentence long.
64. Information retrieval - Task of finding documents that are relevant to a user’s
need for information. The best known example of information retrieval systems are
search engines on the World Wide Web.
QUESTION BANK
Unit I
Possible 2 marks:
A rational agent is one that does the right thing. A system is rational if it
does the “right thing”, given what it knows.
7. State the needs of a computer to pass the turing test. (anna univ 2005)
It is one which knows the actual outcome of its actions & can act
accordingly.
2. Deterministic Vs Stochastic.
3. Episodic Vs Sequential
4. Static Vs Dynamic
5. Discrete Vs Continuous
The current decision does not affect whether the next part is defective.
17. What are the problems arises when knowledge of the states or actions is
incomplete?
2. Contingency problems
3. Exploration problems
1. Completeness
2. Optimality
3. Time Complexity
4. Space Complexity
ii) Touring
i) Initial state
ii) Actions
This term has no information about the number of steps or the path cost from the current state to the goal state. Such strategies can only distinguish a goal state from a non-goal state. Also known as blind search.
The time complexity is O(b^d), where d is the depth of the shallowest solution and b is the branching factor (the number of successors at each level).
The root node is expanded first then all the nodes generated by the root node
are expanded next and their successors and so on.
2. What is meant by PEAS? List out few agents types and describes their PEAS? (anna
univ 2004)
6. Explain in detail iterative deepening depth-first search. Write an algorithm for it.
7. Describe in brief the depth-first search and breadth-first search algorithms and also
mention their advantages. (anna univ 2005)
UNIT II
POSSIBLE 2 MARKS:
2. Define A* search.
A* search evaluates nodes by combining g(n), the cost to reach the node,
and h(n), the cost to get from the node to the goal.
f (n)=g(n)+h(n)
3. Define Consistency.
A heuristic h(n) is consistent if, for every node n and every successor n' of n generated by any action a, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n' plus the estimated cost of reaching the goal from n'.
5. What are the reasons that hill climbing often gets stuck? (anna univ 2004)
Local maxima:
Ridges:
Plateaux:
8. Why a hill climbing search is called a greedy local search? (anna univ 2004)
Hill climbing is sometimes called greedy local search because it grabs a
good neighbor state without thinking ahead about where to go next.
Linear constraints are the constraints in which each variable appears only in
linear form.
Unary Constraints:
Binary Constraints:
A heuristic h(n) is consistent if, for every node n and every successor n' of n generated by any action a, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n' plus the estimated cost of reaching the goal from n'.
A game can be defined by the initial state, the legal actions in each state, a
terminal test and a utility function that applies to terminal states.
The problem with minimax search is that the number of game states it has to examine is exponential in the number of moves. We can't eliminate the exponent, but we can effectively cut it in half. The trick is that it is possible to compute the correct minimax decision without looking at every node in the game tree. This technique is called alpha-beta pruning.
Backtracking search is used for a depth first search that chooses values for
one variable at a time and backtracks when a variable has no legal values left to assign.
Choosing the variable with the fewest “legal” values is called the minimum
remaining values heuristic. Otherwise called as “most constraint variable” or “fail first”
Best first search typically use a heuristic function h (n) that estimates the
cost of the solution from n.
h(n) = estimated cost of the cheapest path from node n to a goal node.
2. Trace the operation of A* search applied to the problem of getting to Bucharest from
Lugoj using the straight-line distance heuristic. (anna univ 2004)
3. Invent a heuristic function for the 8-puzzle that sometimes overestimates, and show
how it can lead to a suboptimal solution on a particular problem.
UNIT III
POSSIBLE 2 MARKS:
1. What are the standard quantifiers of First Order Logic? (Apr/May 2008)
They are:
i) Universal Quantifiers
ii) Existential Quantifiers
Thus it is true if and only if all the above sentences are true, that is, if P is true for all objects x in the universe. Hence, ∀ is called the universal quantifier.
To say, for example, that king john has a crown on his head, we write
The sentence says that P is true for at least one object x. Hence, ∃ is called the existential quantifier.
The Nested Quantifier is to express the more complex sentences using multiple
quantifiers. For example, “Brothers are siblings” can be written as
Consecutive quantifiers of the same type can be written as one quantifier with
several variables. For example, to say that siblinghood is a symmetric relationship, we can
write
The two quantifiers can be connected with each other through negation. It can be
explained through negation. It can be explained with the following example.
This means “Everyone likes ice cream” is equivalent to “there is no one who does
not like ice cream”.
The equality symbol is used to make the statements more effective that two terms
refer to the same object.
The Higher Order Logic allows quantifying over relations and functions as well as
over objects.
Eg: The two objects are equal if and only if, all the properties to them are
equivalent.
First Order Logic, a representation language that is far more powerful than
propositional logic. First Order Logic commits to the existence of objects and relations.
Relations - equals
Functions - plus
The representation language makes it easy to express the knowledge in the form of
sentences. This simplifies the construction problem enormously. This is called as
declarative approach.
ii) Semantics: It determines the facts in the world to which the sentences
refer.
The generations of new sentences that are necessarily true given the old sentences
are true. This relation between sentences is called entailment.
A tuple is a collection of objects arranged in a fixed order and is written with angle
brackets surrounding the objects.
{< Richard the Lionheart, King John>, <King John, Richard the
Lion heart>}
The basic syntactic elements of first order logic are the symbols that stand
for objects, relations and functions. The symbols are in three kinds. Constant symbols
which stand for objects, Predicate symbols which stand for relations and Function symbol
which stand for functions.
The task of deriving the new sentence from the old is called Inference.
The set of first order definite clauses with no function symbols is called
datalog.
Enemy(Nono, America)
The “inner loop” of the algorithm involves finding all possible unifiers such that the
premise of a rule unifies with a suitable set of facts in the knowledge base. This is called
Pattern Matching.
The first called OR-Parallelism comes from the possibility of a goal unifying with
many different clauses in the knowledge base. Each gives rise to an independent branch in
the search space that can lead to a potential solution and branches can be solved in parallel.
The second called AND-Parallelism comes from the possibility of solving each
conjunct in the body of an implication in parallel.
First-order resolution requires that sentences be in conjunctive normal form, that is, a conjunction of clauses, where each clause is a disjunction of literals. Literals can contain variables, which are assumed to be universally quantified.
For example, American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x) becomes, in CNF,
¬American(x) ∨ ¬Weapon(y) ∨ ¬Sells(x, y, z) ∨ ¬Hostile(z) ∨ Criminal(x).
Demodulation
Paramodulation
Situations, which denote the states resulting from executing actions. This approach
is called Situation Calculus.
Situations are logical terms consisting of the initial situation and all
situations that are generated by applying an action to a situation.
Fluent are functions and predicates that vary from one situation to the next,
such as the location of the agent.
Atemporal or eternal predicates and functions are also allowed.
1. Explain the various steps associated with the knowledge engineering process?
Discuss them by applying the steps to any real world application of your choice.
(May/June 2007)
2. What are the various ontologies involved in situation calculus?
(May/June 2007)
(Nov/Dec 2007)
6. Explain the steps involved in representing knowledge using first order logic.
(Nov/Dec 2007)
(Nov/Dec 2007)
8. How are the facts represented using propositional logic? Give an example.
(Nov/Dec 2005)
9. Describe Non-Monotonic logic with an example. (Nov/Dec 2005)
UNIT IV
TWO MARKS:
1. What is learning?
Learning takes many forms, depending on the nature of the performance
element, the component to be improved, and the available feedback.
b) Regression:
Learning a continuous function is called regression.
a) Alternate
b) Bar
c) Fri/Sat
d) Hungry
e) Patrons
f) Price
g) Raining
h) Reservation
i) Type
j) Wait Estimate
Parity function:
Majority function:
b) Divide it into two disjoint sets: the training set and the test set.
c) Apply the learning algorithm to the training set, generating a hypothesis h.
d) Measures the percentage of examples in the test set that are correctly
classified by h.
e) Repeat steps 1 to 4 for different sizes of training sets and different randomly
selected training sets of each size.
The agent’s policy is fixed and the task is to learn the utilities of states, this
could also involve learning a model of the environment.
1. Explain with proper example how EM algorithm can be used for learning with
hidden variables. (Nov/Dec 2007)
2. Describe how decision trees could be used for inductive learning. Explain its
effectiveness with a suitable example. (Nov/Dec 2007)
3. Explain the explanation-based learning. (Nov/Dec 2007)
4. Discuss on learning with hidden variables. (Nov/Dec 2007)
5. i) What do you understand by soft computing?
6. ii)Differentiate conventional and formal learning techniques / Theory and learning
via forms of reward and punishment. (Nov/Dec 2005)
7. Discuss partial order planning with unbound variables. (Nov/Dec 2005)
8. With reference to planning discuss progression and regression. (Nov/Dec 2005)
9. What are the languages suited for planning? (Nov/Dec 2005)
UNIT V
1. What is communication?
Communication is the intentional exchange of information brought about by
the production and perception of signs drawn from a shared system of conventional
signs. Most animals use signs to represent important messages.
2. Define language.
Language enables us to communicate most of what we know about the
world.
3. Why would an agent bother to perform a speech act when it could be doing a
“regular” action?
A group of agents exploring together gains an advantage by being able to do
the following.
Query
Inform
Request
Acknowledge
Promise
4. Differentiate formal language Vs natural language.
Formal language:
For example, a language in the first order logic, the terminal symbols include ^ and
P, and a typical string is “P ^ Q”. The String is not a member of the language.
Natural language:
5. Define Grammar.
A grammar is a finite set of rules that specifies a language. Formal
languages always have grammar. Natural languages have no grammar.
7. Define Lexicon.
The list of allowable words called lexicon. The words are grouped into the
categories or parts of speech familiar to dictionary users. Nouns, pronouns and names
to denote things, verbs to denote events, adjective to modify nouns and adverbs to
modify verbs.
Should return a parse tree with root S whose leaves are the “the wumpus is dead”
and whose internal nodes are nonterminal symbols from the grammar ε0.
1. A document collection
2. A query posed in a query language
3. A result set
4. A representation of the result set.
1. Explain the Machine Translation System with a neat sketch. Analyze its learning
probablities. (May/June 2007)
2. Perform Bottom Up and Top Down Parsing for the input “the wumpus is dead”.
(May/June 2007)
3. i) Describe the process involved in communication using the example sentence “the
wumpus is dead”
ii) Write short notes on semantic representation. (May/June 2007)
Or
(b) Explain the following search strategies(ANS: Page number-74)
(i) best first search
(ii) A* search
12. (a) Explain Min Max procedure (ANS: Page number-165)
Or
(b) Describe alpha beta pruning and give the other modifications to the min
max procedure to improve its performance, (ANS: Page number-167)
13. (a) Illustrate the use of predicate logic to represent knowledge with a suitable example. (ANS: Page number-240)
Or
(b) Consider the following sentences:
John likes all kinds of food.
Apples are food.
Chicken is food.
Anything anyone eats and isn't killed by is food.
Bill eats peanuts and is still alive.
Sue eats everything Bill eats.
(i) Translate these sentences into formulas in predicate logic.
(ii) Prove that John likes peanuts using backward chaining.
(iii) Convert the formulas of part (i) into clause form.
(iv) Prove that John likes peanuts using resolution. (ANS: Page number-253)
PART B – (5 × 16 = 80 marks)
11.(a) Explain in detail on the characteristics and applications of
13. (a) Explain the concept of planning with state space search using example
(Or)
(b) Explain the use of planning graph in providing better heuristic estimate
with suitable example.
15.(a) Explain the concept of learning using decision trees and neural network
approach
(Or)
(b) Write short notes on:
(1) Statistical learning
(2) Explanation based learning.