
2009.

b) (i)

[Figure: the answer to (i), drawn over the given search space with nodes A-G and edge costs; the drawing is garbled in this copy and not reproduced.]
(ii) At each step of the graph/tree search algorithms, the collection of nodes that have been generated but not yet
expanded is known as the fringe. During breadth-first search of the above search space, the contents of the fringe at
each step are presented below in the form Ni {N1, ..., Nk}, where Ni is the currently expanded node
and the FIFO queue within braces is the current fringe, N1 being the front node and Nk the rear one.
{A}, A {B, C}, B {C, D, E}, C {D, E, F}, D {E, F}, E {F, G}, F {G}, G and the corresponding cost is:
4+1+3+8+6+4+2 = 30 (without considering the branching cost).
[Do the other two searches in the same manner. However, in the case of Uniform Cost the nodes will be considered
in order of increasing path cost, and if there is a 0-cost edge, as with Cost(C, C) = 0 here, uniform-cost search
can get trapped in a loop.]
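The fringe bookkeeping above can be reproduced with a short program. The adjacency list below is a stand-in for the (garbled) figure, chosen so that the expansion order matches the trace A, B, C, D, E, F, G:

```python
from collections import deque

def bfs_trace(graph, start, goal):
    """Breadth-first search that records, at each expansion step, the
    expanded node and the current fringe (FIFO queue), in the
    Ni {N1, ..., Nk} style used above."""
    fringe = deque([start])
    visited = {start}
    trace = []
    while fringe:
        node = fringe.popleft()
        trace.append((node, list(fringe)))
        if node == goal:
            return trace
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                fringe.append(child)
    return trace

# Hypothetical adjacency list standing in for the garbled figure:
graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'E': ['G']}
for node, fringe in bfs_trace(graph, 'A', 'G'):
    print(node, fringe)
```

With this graph the printed expansion order is A, B, C, D, E, F, G, matching the hand trace.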
10. (a) A heuristic h(n) is consistent if, for every node n and every successor n′ of n generated by any action a,
the estimated cost of reaching the goal from n is no greater than the step cost of getting to n′ plus the estimated
cost of reaching the goal from n′:
i.e. h(n) ≤ c(n, a, n′) + h(n′)
If h(n) is consistent, then the values of f(n) along any path are nondecreasing. The proof follows directly from
the definition of consistency. Suppose n′ is a successor of n; then
g(n′) = g(n) + c(n, a, n′) for some action a, and we have
f(n′) = g(n′) + h(n′) = g(n) + c(n, a, n′) + h(n′) ≥ g(n) + h(n) = f(n).
i.e. f(n′) ≥ f(n)

(b)

Above is a section of a game tree for tic-tac-toe. Each node represents a board position, and the children of each
node are the legal moves from that position. To score each position, we give each position favorable for
player 1 a positive number (the more positive, the more favorable). Similarly, we give each position favorable
for player 2 a negative number (the more negative, the more favorable). In our tic-tac-toe example, player 1 is
'X', player 2 is 'O', and the only three scores we will have are +1 for a win by 'X', -1 for a win by 'O', and 0
for a draw. Note here that the blue scores are the only ones that can be computed by looking at the current
position.
(c) Proof of Admissibility of A*: An admissible heuristic is one that never overestimates the cost to reach
the goal. Because g(n) is the actual cost to reach n along the current path, and f(n) = g(n) + h(n), we have as an
immediate consequence that f(n) never overestimates the true cost of a solution along the current path through
n.
We will show that A* is admissible if it uses a monotone heuristic.
A monotone heuristic is such that along any path the f-cost never decreases.
If this property does not hold for a given heuristic function, we can make the f value monotone by using the
following trick (m is a child of n):
f(m) = max(f(n), g(m) + h(m))
Let G be an optimal goal state,
C* the optimal path cost, and
G2 a suboptimal goal state: g(G2) > C*.
Suppose A* has selected G2 from OPEN for expansion. Consider a node n on OPEN on an optimal path to G.
Thus C* ≥ f(n). Since n is not chosen for expansion over G2, f(n) ≥ f(G2).
G2 is a goal state, so f(G2) = g(G2).
Hence C* ≥ g(G2).
This contradicts g(G2) > C*. Thus A* could not have selected G2 for expansion before reaching the goal by an
optimal path.

Proof of Completeness of A*:


Let G be an optimal goal state. A* cannot reach a goal state only if there
are infinitely many nodes with f(n) ≤ C*. This can only happen if one of the following holds:
o There is a node with an infinite branching factor.
o There is a path with finite cost but infinitely many nodes. But we assumed that every arc in the graph has
a cost greater than some ε > 0. Thus if there were infinitely many nodes on a path, the cost of that
path would be infinite.
Lemma: A* expands nodes in increasing order of their f values.
A* is thus complete and optimal, assuming an admissible and consistent heuristic function (or using the
pathmax equation to simulate consistency). A* is also optimally efficient, meaning that it expands only the
minimal number of nodes needed to ensure optimality and completeness.
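A compact A* sketch incorporating the pathmax trick f(m) = max(f(n), g(m) + h(m)) discussed above. The graph, heuristic values and function names below are our own illustration, not taken from the question:

```python
import heapq

def a_star(graph, h, start, goal):
    """A* with pathmax; `graph` maps node -> list of (successor, step_cost),
    `h` is the heuristic function.  Returns (path, cost)."""
    open_list = [(h(start), 0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        for succ, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(succ, float('inf')):
                best_g[succ] = g2
                f2 = max(f, g2 + h(succ))         # pathmax: keep f nondecreasing
                heapq.heappush(open_list, (f2, g2, succ, path + [succ]))
    return None, float('inf')

# Toy graph with an admissible heuristic (our own example):
graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 2), ('G', 5)], 'B': [('G', 1)]}
h = {'S': 3, 'A': 2, 'B': 1, 'G': 0}.get
print(a_star(graph, h, 'S', 'G'))   # (['S', 'A', 'B', 'G'], 4)
```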
9. a) (i) Draw the complete search tree for NIM.


The search tree is shown below. The student might draw a search tree which duplicates some of the nodes. This
is acceptable.
Note : At this stage, only the search tree is required. The Min, Max and the Utility Functions are not
required.
[Figure: the complete NIM search tree, garbled in this copy. It starts from the single pile 7 and alternates MAX
and MIN levels; the positions reached include 6-1, 5-2, 4-3; 5-1-1, 4-2-1, 3-3-1, 3-2-2; 4-1-1-1, 3-2-1-1,
2-2-2-1; 3-1-1-1-1, 2-2-1-1-1; and 2-1-1-1-1-1, with leaf utilities 0 and 1.]

10. a) The solution is hinted at for the following problem.

Consider the function f(n) = -(number of tiles out of place). Our aim is to maximize f(n) (the maximum
being 0). In this setting the 8-Puzzle problem is logically identical to hill climbing, where f(n), which gives
the height at the nth step, needs to be maximized. Consider the following diagram for clarification.

Try to solve the problem in the same manner mentioned above.
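A minimal hill-climbing sketch for the 8-puzzle using f(n) = -(tiles out of place). The state encoding (a 9-tuple with 0 for the blank) is our own; note the usual hill-climbing caveat that harder start states can stall at a local maximum:

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)      # 0 is the blank

def f(state):
    """Height function: minus the number of tiles out of place."""
    return -sum(1 for s, g in zip(state, GOAL) if s != 0 and s != g)

def neighbours(state):
    """States reachable by sliding one tile into the blank."""
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r2, c2 = r + dr, c + dc
        if 0 <= r2 < 3 and 0 <= c2 < 3:
            j = 3 * r2 + c2
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def hill_climb(start):
    """Move to the best neighbour while f improves; stops at a local
    maximum, which may or may not be the goal."""
    current = start
    while True:
        best = max(neighbours(current), key=f)
        if f(best) <= f(current):
            return current
        current = best
```

For a start one move from the goal, e.g. `(1, 2, 3, 4, 5, 6, 7, 0, 8)`, the climb reaches `GOAL` with f = 0.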

A. Strategies for state space search - data-driven and goal-driven

Goal-driven search is suggested if:

1. A goal is given in the problem statement or can easily be formulated. In a mathematics theorem prover,
for example, the goal is the theorem to be proved. Many diagnostic systems consider potential diagnoses
in a systematic fashion.

2. There are a large number of rules that match the facts of the problem and thus produce an increasing
number of conclusions or goals. In a mathematics theorem prover, for example, the number of rules that
may be applied to the entire set of axioms is very large; reasoning backward from the theorem avoids this
explosion.
3. Problem data are not given and must be acquired by the problem solver. In a medical diagnosis program,
for example, a wide range of diagnostic tests can be applied. Doctors order only those that are necessary
to confirm or deny a particular hypothesis.
Data-driven search is suggested if:
1. All or most of the data are given in the initial problem statement. Interpretation problems often fit this
mold by presenting a collection of data and asking the system for an interpretation.
2. There are a large number of potential goals, but there are only a few ways to use the facts and given
information of a particular problem instance.
3. It is difficult to form a goal or hypothesis.
Data-driven search uses the knowledge and constraints found in the given data to guide search along lines
known to be true.
B. Deterministic vs. stochastic. If the next state of the environment is completely determined by the current
state and the action executed by the agent, then we say the environment is deterministic; otherwise, it is
stochastic. In principle, an agent need not worry about uncertainty in a fully observable, deterministic
environment. (In our definition, we ignore uncertainty that arises purely from the actions of other agents in a
multiagent environment; thus, a game can be deterministic even though each agent may be unable to predict
the actions of the others.) If the environment is partially observable, however, then it could appear to be
stochastic. Most real situations are so complex that it is impossible to keep track of all the unobserved
aspects; for practical purposes, they must be treated as stochastic. Taxi driving is clearly stochastic in this
sense, because one can never predict the behavior of traffic exactly; moreover, one's tires may blow out and
one's engine may seize up without warning. The vacuum world as we described it is deterministic, but
variations can include stochastic elements such as randomly appearing dirt and an unreliable suction
mechanism. We say an environment is uncertain if it is not fully observable or not deterministic. One final
note: our use of the word stochastic generally implies that uncertainty about outcomes is quantified in
terms of probabilities; a nondeterministic environment is one in which actions are characterized by their
possible outcomes, but no probabilities are attached to them. Nondeterministic environment descriptions are
usually associated with performance measures that require the agent to succeed for all possible outcomes of
its actions.

C. Single-layer feed-forward neural networks (perceptrons)


[Read from Russell-Norvig, 3rd edition, section 18.7.2, page 729]
D. Optimality of BFS:
For unit-step cost, breadth-first search is optimal. In general breadth-first search is not optimal since it
always returns the result with the fewest edges between the start node and the goal node. If the graph is a
weighted graph, and therefore has costs associated with each step, a goal next to the start does not have to be
the cheapest goal available. This problem is solved by improving breadth-first search to uniform-cost
search which considers the path costs. Nevertheless, if the graph is not weighted, and therefore all step costs
are equal, breadth-first search will find the nearest and the best solution.

E. Optimality of A*:
The main idea is that when A* finds a path, it has found a path whose estimate is lower than the estimate
of any other possible path. Since the estimates are optimistic, the other paths can be safely ignored. A* is
only optimal if two conditions are met: 1. The heuristic is admissible: it never over-estimates the cost.
2. The heuristic is monotonic (consistent): h(n) ≤ c(n, a, n′) + h(n′) for every successor n′ of n, so the
f-cost never decreases along a path. One can prove optimality by assuming the opposite and expanding the
implications: assume the path given by A* is not optimal although the heuristic is admissible and monotonic,
and derive a contradiction. From this one can conclude that the original assumption was false, that is, A* is
optimal under the above conditions.

2010
8. (b) (i) The heuristic function "sum of Manhattan distances" for the 8-puzzle problem is consistent.
Ans: Suppose n and n′ are two consecutive positions of a single tile of the 8-Puzzle in its transit to the goal
position. The cost of each horizontal and vertical step is 1 (a diagonal step, if it were allowed, would cost 2,
but the 8-puzzle permits only horizontal and vertical moves). Allowing only horizontal and vertical movement,
h(n) = |xg - xn| + |yg - yn|, where (xn, yn) and (xg, yg) are the current and final positions of the tile.
Now from n to n′ the only possible change is a single horizontal or vertical movement with cost 1, i.e. c(n, n′) = 1.
So either |xn - xn′| = 1 or |yn - yn′| = 1, the other difference being 0.
So h(n) - h(n′) ≤ |xn - xn′| + |yn - yn′| = c(n, n′),
i.e. h(n) ≤ c(n, a, n′) + h(n′).
This means that this heuristic is consistent.
[Here we have considered a single tile, but the argument extends to all 8 tiles and the sum of their distances.]
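The argument can be spot-checked numerically. Below is a sketch (state encoding is our own: 9-tuples in row-major order, 0 for the blank) that computes h as the sum of Manhattan distances and verifies h(n) ≤ 1 + h(n′) over every legal move from a sample position:

```python
def manhattan(state, goal):
    """Sum of Manhattan distances of the non-blank tiles."""
    dist = 0
    for tile in range(1, 9):
        r1, c1 = divmod(state.index(tile), 3)
        r2, c2 = divmod(goal.index(tile), 3)
        dist += abs(r1 - r2) + abs(c1 - c2)
    return dist

def successors(state):
    """All states reachable by one horizontal or vertical tile move."""
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < 3 and 0 <= c + dc < 3:
            j = 3 * (r + dr) + (c + dc)
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
n = (1, 2, 3, 4, 5, 6, 0, 7, 8)
# Each move costs 1, so consistency requires h(n) <= 1 + h(n') for every n'.
assert all(manhattan(n, goal) <= 1 + manhattan(s, goal) for s in successors(n))
```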
(ii)

Prove that if a heuristic is consistent it must be admissible but the converse is not True.

Ans: Part 1
Let k(n) be the cost of the cheapest path from n to the goal node. We will prove by induction on the number of steps to the
goal that h(n) ≤ k(n).
Base case: If there are 0 steps to the goal from node n, then n is a goal and therefore
h(n) = 0 ≤ k(n).
Induction step: If n is i steps away from the goal, there must exist some successor n′
of n generated by some action a s.t. n′ is on the optimal path from n to the goal (via
action a) and n′ is i-1 steps away from the goal. Therefore, by consistency,
h(n) ≤ c(n, a, n′) + h(n′).
But by the induction hypothesis, h(n′) ≤ k(n′). Therefore, h(n) ≤ c(n, a, n′) + k(n′) = k(n),
since n′ is on the optimal path from n to the goal via action a.
Part 2
Consider a search problem where the states are nodes along a path P = n0, n1, . . . , nm, where n0 is the start state, nm is the
goal state, and there is one action from each state ni which gives ni+1 as a successor with cost 1. The cheapest cost to the
goal from state ni is then k(ni) = m - i. Define a heuristic function as follows:
h(ni) = m - 2⌈i/2⌉
For all states ni, h(ni) ≤ k(ni), and so h is admissible. However, if i is even, then
h(ni) = 2 + h(ni+1) > 1 + h(ni+1) = c(ni, a, ni+1) + h(ni+1). Thus h is not consistent.
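The counterexample can be checked numerically. A small sketch (m = 10 is our arbitrary choice) confirming that h(ni) = m - 2⌈i/2⌉ never exceeds the true cost-to-go, yet violates the consistency inequality h(ni) ≤ 1 + h(ni+1) at the even indices, where h drops by 2 across a cost-1 step:

```python
import math

m = 10                                      # path length (arbitrary choice)
k = lambda i: m - i                         # true cheapest cost to the goal
h = lambda i: m - 2 * math.ceil(i / 2)      # the heuristic above

# Admissible: h never exceeds the true cost-to-go.
assert all(h(i) <= k(i) for i in range(m + 1))

# Inconsistent: consistency would require h(n_i) <= 1 + h(n_{i+1}).
violations = [i for i in range(m) if h(i) > 1 + h(i + 1)]
print(violations)   # the even indices
```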
2011
2. (2nd Part)
A common decision tree pruning algorithm (reduced-error pruning) is described here.
1. Consider all internal nodes in the tree.

2. For each node, check whether removing it (along with the subtree below it) and assigning the most common class to it
does not harm accuracy on the validation set.
3. Pick the node n* that yields the best performance and prune its subtree.
4. Go back to (2) until no more improvements are possible.
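The steps above can be sketched as follows. The tree representation and all names are our own illustration, not from any particular library:

```python
class Node:
    def __init__(self, attr=None, label=None):
        self.attr = attr          # attribute tested here (None means leaf)
        self.children = {}        # attribute value -> subtree
        self.label = label        # most common class seen at this node

    def predict(self, x):
        if self.attr is None or x.get(self.attr) not in self.children:
            return self.label
        return self.children[x[self.attr]].predict(x)

def accuracy(tree, data):
    return sum(tree.predict(x) == y for x, y in data) / len(data)

def internal_nodes(node):
    if node.attr is not None:
        yield node
        for child in node.children.values():
            yield from internal_nodes(child)

def prune(root, validation):
    """Reduced-error pruning as in steps 1-4: repeatedly replace with a
    leaf the internal node whose removal best helps (or least hurts)
    validation accuracy, until no replacement qualifies."""
    while True:
        base = accuracy(root, validation)
        best, best_acc = None, base
        for node in list(internal_nodes(root)):
            saved = node.attr, node.children
            node.attr, node.children = None, {}      # tentatively make a leaf
            acc = accuracy(root, validation)
            node.attr, node.children = saved         # undo
            if acc >= best_acc:                      # "does not harm accuracy"
                best, best_acc = node, acc
        if best is None:
            return root
        best.attr, best.children = None, {}          # commit the best prune
```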
[The principle for decision tree construction may be stated as follows: Order the splits (attribute and value of the
attribute) in decreasing order of information gain.
Practical issues while building a decision tree can be enumerated as follows:
1) How deep should the tree be?
2) How do we handle continuous attributes?
3) What is a good splitting function?
4) What happens when attribute values are missing?
5) How do we improve the computational efficiency?
The depth of the tree is related to the generalization capability of the tree. If not carefully chosen it may lead to
overfitting. A tree overfits the data if we let it grow deep enough so that it begins to capture aberrations in the data that
harm the predictive power on unseen examples. There are two main solutions to overfitting in a decision tree:
(1) Stop the tree early before it begins to overfit the data.
(2) Grow the tree until the algorithm stops even if the overfitting problem shows up. Then prune the tree as a
post processing step].
** Four approaches to defining AI systems:

    Systems that think like humans.   |   Systems that think rationally.
    Systems that act like humans.     |   Systems that act rationally.

Russell and Norvig define AI as the study and construction of rational (intelligent) agents. This is a broader definition of
intelligence than human intelligence (we should not expect computer intelligence to be the same as human intelligence
any more than we expect airplanes to fly the same way as birds). Intelligent behavior is also a broader concept than intelligent
thought (intelligent behavior may depend on reflex, not thinking).
Things AI systems still cannot do properly:

Understand natural language robustly (e.g., read and understand articles in a newspaper)
Surf the web
Interpret an arbitrary visual scene
Learn a natural language
Play Go well
Construct plans in dynamic real-time domains
Refocus attention in complex environments
Perform life-long learning

*** Basic structure of a rule-based expert system

[Diagram: the Knowledge Base (IF-THEN rules) and the Database (facts) both feed the Inference Engine; the
Inference Engine is linked to the Explanation Facilities and to the User Interface, through which the User
interacts with the system.]

The knowledge base contains the domain knowledge useful for problem solving. In a rule-based expert
system, the knowledge is represented as a set of rules. Each rule specifies a relation, recommendation,
directive, strategy or heuristic and has the IF (condition) THEN (action) structure. When the condition part of
a rule is satisfied, the rule is said to fire and the action part is executed.
The database includes a set of facts used to match against the IF (condition) parts of rules stored in the
knowledge base.
Inference engine
The inference engine carries out the reasoning whereby the expert system reaches a solution. It links the rules
given in the knowledge base with the facts provided in the database.
The inference engine is a generic control mechanism for navigating through and manipulating knowledge and
deducing results in an organized manner. It is the other key component of all expert systems: a knowledge base
alone is not of much use if there are no facilities for navigating through and manipulating the knowledge to
deduce something from it. Because a knowledge base is usually very large, it is necessary to have inferencing
mechanisms that search through it and deduce results in an organized manner.
Forward chaining, backward chaining and tree searches are some of the techniques used for drawing inferences
from the knowledge base.
Forward Chaining Algorithm
Forward chaining is a technique for drawing inferences from a rule base. Forward-chaining inference is often called data
driven.
The algorithm proceeds from a given situation toward a desired goal, adding new assertions (facts) as they are found.
A forward-chaining system compares data in the working memory against the conditions in the IF parts of the rules and
determines which rule to fire.

Example: Forward Chaining (Data Driven)


Given : A Rule base contains following Rule set
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
Problem : Prove
If A and B true Then D is true
Solution :
(i) Start with the given input: A, B are true. Then start at Rule 1 and go forward/down until a rule fires.
First iteration :
(ii) Rule 3 fires : conclusion E is true
new knowledge found
(iii) No other rule fires;
end of first iteration.
(iv) Goal not found;
new knowledge found at (ii);
go for second iteration
Second iteration :
(v) Rule 2 fires : conclusion G is true
new knowledge found
(vi) Rule 4 fires : conclusion D is true
Goal found;
Proved
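The iterations above can be reproduced in a few lines. The rule representation is our own; this version fires rules in order within each pass, so the iteration boundaries may differ slightly from the hand trace, but the conclusion is the same:

```python
def forward_chain(rules, facts, goal):
    """Forward chaining: repeatedly fire any rule whose IF-part is
    satisfied, adding its conclusion, until the goal is derived or no
    new fact can be inferred."""
    facts = set(facts)
    changed = True
    while changed and goal not in facts:
        changed = False
        for conditions, conclusion in rules:
            if set(conditions) <= facts and conclusion not in facts:
                facts.add(conclusion)         # the rule "fires"
                changed = True
    return goal in facts

rules = [(('A', 'C'), 'F'),   # Rule 1
         (('A', 'E'), 'G'),   # Rule 2
         (('B',), 'E'),       # Rule 3
         (('G',), 'D')]       # Rule 4

print(forward_chain(rules, {'A', 'B'}, 'D'))   # True
```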
Backward Chaining Algorithm
Backward chaining is a technique for drawing inferences from a rule base. Backward-chaining inference is often called
goal driven.
The algorithm proceeds from the desired goal, adding new assertions (sub-goals) as they are found.
A backward-chaining system looks for the action in the THEN clause of the rules that matches the specified goal.

Example: Backward Chaining (Goal Driven)


Given : Rule base contains following Rule set
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
Problem : Prove
If A and B true Then D is true
Solution :
(i) Start with the goal, i.e. D is true;
go backward/up until a rule fires.
First iteration :
(ii) Rule 4 fires :
new sub-goal: prove G is true
go backward
(iii) Rule 2 fires; conclusion: A is true (given as input)
new sub-goal: prove E is true
go backward;
(iv) no other rule fires; end of first iteration.
new sub-goal found at (iii);
go for second iteration
Second iteration :
(v) Rule 3 fires :
conclusion B is true (2nd input found)
both inputs A and B ascertained. (Proved)
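A recursive sketch of the same idea (representation our own): a goal is proved if it is a given fact, or if some rule concludes it and all of that rule's IF-conditions can themselves be proved.

```python
def backward_chain(rules, facts, goal, seen=None):
    """Goal-driven inference over (conditions, conclusion) rules."""
    seen = set() if seen is None else seen
    if goal in facts:
        return True                        # given as input
    if goal in seen:
        return False                       # avoid circular sub-goals
    seen.add(goal)
    return any(conclusion == goal and
               all(backward_chain(rules, facts, c, seen) for c in conditions)
               for conditions, conclusion in rules)

rules = [(('A', 'C'), 'F'),   # Rule 1
         (('A', 'E'), 'G'),   # Rule 2
         (('B',), 'E'),       # Rule 3
         (('G',), 'D')]       # Rule 4

print(backward_chain(rules, {'A', 'B'}, 'D'))   # True
```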
**Branch and Bound Algorithm:
General-purpose method to solve discrete optimization problems
Takes exponential time in the worst case
Useful to solve small instances of hard problems
Can be applied to all scheduling problems
Need to specify:
1. A lower bounding scheme
2. A branching rule
3. A search rule (subproblem selection rule)

Overview: For a minimization problem, at a given branch and bound (b&b) tree node i:
1. Obtain a lower bound, lbi
2. (Optional) Obtain a candidate solution and an upper bound ubi
3. If (ubi < GLOBAL UB) then set ubi as the GLOBAL UB
4. If (lbi >= GLOBAL UB) then prune the node and go to 6.
5. Branch into subproblems (create child nodes)
6. Pick the next active subproblem. If none exist, stop. Else, go to 1.
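The six steps can be sketched generically. The callback interface below is our own design, demonstrated on a toy problem (pick one number per row to minimize the total):

```python
import heapq
from itertools import count

def branch_and_bound(root, lower_bound, branch, solution_value):
    """Best-first b&b for minimization, following steps 1-6 above.
    `solution_value(node)` returns a cost for complete solutions, else None."""
    best, best_node = float('inf'), None
    tie = count()                               # tie-breaker for the heap
    active = [(lower_bound(root), next(tie), root)]
    while active:                               # step 6: pick next subproblem
        lb, _, node = heapq.heappop(active)
        if lb >= best:
            continue                            # step 4: prune
        value = solution_value(node)            # step 2: candidate solution
        if value is not None and value < best:
            best, best_node = value, node       # step 3: new global UB
        for child in branch(node):              # step 5: create child nodes
            clb = lower_bound(child)            # step 1: lower bound
            if clb < best:
                heapq.heappush(active, (clb, next(tie), child))
    return best, best_node

# Toy problem: choose one entry from each row, minimizing the sum.
rows = [[3, 1], [4, 2], [5, 0]]
lb = lambda n: sum(n) + sum(min(r) for r in rows[len(n):])
branch = lambda n: [n + (x,) for x in rows[len(n)]] if len(n) < len(rows) else []
value = lambda n: sum(n) if len(n) == len(rows) else None
print(branch_and_bound((), lb, branch, value))   # (3, (1, 2, 0))
```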

Example 1
Problem: Single machine, n jobs arriving at different times, preemptions not allowed, minimize maximum
lateness (Lmax).

A b&b node corresponds to a partial schedule.
At level k of the b&b tree, the first k jobs in the schedule have been fixed.
Branching:
Create a child node of a node at level k-1 by fixing each remaining job at the kth position, EXCEPT:
if a job c satisfies rc >= min over l in J of (max(t, rl) + pl), do not create a child for that job
(where J is the set of unscheduled jobs and t is the completion time of the partial schedule).
Lower bounding: Schedule the remaining jobs according to the preemptive EDD (earliest due date) rule:
every time the machine is freed or a new job is released,
pick the uncompleted job with the minimum due date.
Numeric example: [the job-data table and the b&b tree figure are garbled in this copy; the recoverable
information is summarized below.]
Four jobs are given with processing times pj, release dates rj and due dates dj (the listed due dates
include 12, 11 and 10).
B&b tree: at Level 1 the nodes are (1,-), (2,-), (3,-) and (4,-); node (1,-) has lb = 5 and node (2,-) has
lb = 7; two of the nodes are pruned. At Level 2, (1,2,-) has lb = 6 and (1,3,-) has lb = 5. At Level 3 the
schedule 1,3,4,2 gives ub = 5 and becomes the candidate solution.
The lower bound at node (1,-) is obtained by applying preemptive EDD to jobs 2, 3 and 4 with starting time 4:
At t = 4, available jobs: 2, 3; pick 3.
At t = 5, available jobs: 2, 3, 4; pick 4.
At t = 10, available jobs: 2, 3; pick 3.
At t = 15, available jobs: 2; pick 2 (completing at t = 17).
The resulting lateness values are L1 = -4, L2 = 5, L3 = 4, L4 = 0, so Lmax = 5.

Exercise: Calculate the lower bound at node (2,-).

Preemptive EDD at node (1,3,-) gives a non-preemptive schedule 1, 3, 4, 2 with Lmax = 5, and it is an optimal
solution to the problem.
