Algorithms of Artificial Intelligence

Lecture 2: Knowledge
E. Tyugu Spring 2003
Prolog is a logic-based programming language, i.e. a language for logic programming. Its statements are Horn clauses.

Examples:

A program:
    ancestor(X,Z):-parent(X,Z).
    ancestor(X,Z):-parent(Y,Z),ancestor(X,Y).

A program:
    state(1,0).
    state(S,T):-state(S1,pred(T)), nextstate(S,S1,pred(T)).
    nextstate(X+1,X,T).

A goal:
    ?- state(X,3).

Prolog interpreter
prog - program to be executed;

goals - list of goals, which initially contains the goal given by a user;

unifier(x,y) - produces the most general unifier of x and y, or nil;

apply(x,L) - applies a unifier x to each element of a list L, producing a new list.

A.1.3:
exec(prog,goals,success)=
  if empty(goals) then success( )
  else
    goal:= head(goals);
    goals:= tail(goals);
    L: {rest:= prog;
        while not empty(rest) do
          U:=unifier(goal,head(head(rest)));
          if U ≠ nil then
            goals1:= apply(U,tail(head(rest)));
            exec(prog,goals1,success);
            if success then exit L fi
          fi;
          rest:=tail(rest)
        od;
        failure( )
       };
    exec(prog,goals,success)
  fi
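The idea of A.1.3 can be tried out with a minimal Python sketch. It is not a transcription of A.1.3: it assumes that terms are tuples, that names starting with an upper-case letter are variables, and that a substitution is threaded through the search instead of being applied to the goal list; clause variables are renamed before each use and there is no occurs check. The names is_var, walk, unify, rename, solve and resolve are illustrative only.

```python
import itertools

def is_var(t):
    # Assumption: as in Prolog, names starting with an upper-case letter are variables.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow the bindings of substitution s until an unbound term is reached.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(x, y, s):
    # Extend s to a most general unifier of x and y; None plays the role of nil.
    x, y = walk(x, s), walk(y, s)
    if x == y:
        return s
    if is_var(x):
        return {**s, x: y}
    if is_var(y):
        return {**s, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

_counter = itertools.count()

def rename(t, n):
    # Give the variables of a clause fresh names for each use of the clause.
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(a, n) for a in t)
    return t

def solve(prog, goals, s):
    # Yield one substitution per solution; a clause is a list [head, body1, body2, ...].
    if not goals:
        yield s
        return
    goal, rest = goals[0], list(goals[1:])
    for clause in prog:
        n = next(_counter)
        head, *body = (rename(t, n) for t in clause)
        u = unify(goal, head, s)
        if u is not None:
            yield from solve(prog, body + rest, u)

def resolve(t, s):
    # Replace variables in t by their final values.
    t = walk(t, s)
    return tuple(resolve(a, s) for a in t) if isinstance(t, tuple) else t

PROGRAM = [
    [("parent", "pam", "bob")], [("parent", "tom", "bob")], [("parent", "tom", "liz")],
    [("parent", "bob", "ann")], [("parent", "bob", "pat")], [("parent", "pat", "jim")],
    [("ancestor", "X", "Z"), ("parent", "X", "Z")],
    [("ancestor", "X", "Z"), ("parent", "Y", "Z"), ("ancestor", "X", "Y")],
]

answers = [resolve("X", s) for s in solve(PROGRAM, [("ancestor", "X", "jim")], {})]
print(answers)  # ['pat', 'bob', 'pam', 'tom']
```

A generator is used here so that backtracking comes for free: asking for the next solution resumes the loop over the remaining clauses.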

Semantic networks
Linguists noticed long ago that the structure of a sentence can be represented as a network. Words of the sentence are nodes, and they are bound by arcs expressing relations between the words. The network as a whole thus represents the meaning of the sentence in terms of the meanings of words and the relations between the words. This meaning is an approximation of the meaning people can assign to the sentence, much as floating point numbers are an approximate representation of real numbers.

John must pick up his report in the morning and have a meeting after lunch. After the meeting he will give the report to me.
[Figure: semantic network of the two sentences. Its nodes include morning, lunch, have a meeting, pick up, report, his, John and me; its arcs are labelled before, at the time, after, what?, whose?, who? and to whom?]



Example continued
Inferences can be made, depending on the properties of the relations of a semantic network. Let us consider only time relations of the network in our example, and encode the time relations by atomic formulas as follows:
    before(lunch,morning)
    after(morning,lunch)
    after(lunch,have a meeting)
    after(have a meeting,give)
    at-the-time(pick up,morning)
(On the slide each formula is marked as either general knowledge or specific knowledge.)

Example continued
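Inference rules:

    before(x,y)  before(y,z)
    ------------------------
    before(x,z)

    after(x,y)
    -----------
    before(y,x)

    at-the-time(x,z)  before(y,z)
    -----------------------------
    before(y,x)

Applying these rules, we can infer

    after(lunch,have a meeting)
    ----------------------------
    before(have a meeting,lunch)

    at-the-time(pick up,morning)  before(lunch,morning)
    ---------------------------------------------------
    before(lunch,pick up)

etc.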
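The three inference rules can be applied mechanically until nothing new appears. The following is a minimal sketch of such forward chaining, assuming that formulas are stored as (relation, first argument, second argument) triples and that at-the-time is written with the event first and the time second, as in the rule at-the-time(x,z). The name infer is illustrative.

```python
def infer(facts):
    # Apply the three inference rules repeatedly until no new formula is produced.
    facts = set(facts)
    while True:
        new = set()
        for (r1, a, b) in facts:
            if r1 == "after":                       # after(x,y) gives before(y,x)
                new.add(("before", b, a))
            for (r2, c, d) in facts:
                if r1 == "before" and r2 == "before" and b == c:
                    new.add(("before", a, d))       # before(x,y), before(y,z) give before(x,z)
                if r1 == "at-the-time" and r2 == "before" and b == d:
                    new.add(("before", c, a))       # at-the-time(x,z), before(y,z) give before(y,x)
        if new.issubset(facts):
            return facts
        facts |= new

FACTS = [
    ("before", "lunch", "morning"),
    ("after", "morning", "lunch"),
    ("after", "lunch", "have a meeting"),
    ("after", "have a meeting", "give"),
    ("at-the-time", "pick up", "morning"),
]

closure = infer(FACTS)
print(("before", "have a meeting", "lunch") in closure)  # True
print(("before", "lunch", "pick up") in closure)         # True
```

The loop terminates because only finitely many formulas can be built from the given relation names and arguments.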
Frames

1. The essence of the frame is that it is a module of knowledge about something which we can call a concept. This can be a situation, an object, a phenomenon, a relation.
2. Frames contain smaller pieces of knowledge: components, attributes, actions which can be (or must be) taken when conditions for taking an action occur.
3. Frames contain slots which are places to put pieces of knowledge in. These pieces may be just concrete values of attributes, more complicated objects, or even other frames. A slot is being filled in when a frame is applied to represent a particular situation, object or phenomenon.

An essential idea developed in connection with frames was inheritance. Inheritance is a convenient way of reusing existing knowledge in describing new frames. Knowing a frame f, one can describe a new frame as a kind of f, meaning that the new frame inherits the properties of f, i.e. it will have these properties in addition to its newly described properties. The inheritance relation expresses very precisely the relation between super- and subconcepts.
[Figure: an inheritance hierarchy of concepts with the nodes ideas, events, actions, states, things, abstract things, polygons, triangles, quadrangles, parallelograms, rectangles and rhombuses.]
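Slots and inheritance can be illustrated with a minimal sketch. It assumes that a frame is an object with a dictionary of slots and an optional "kind of" link to a more general frame; the class Frame, its methods and the slot names (sides, closed, and so on) are hypothetical and chosen only for the example.

```python
class Frame:
    def __init__(self, name, kind_of=None, **slots):
        self.name = name
        self.kind_of = kind_of      # the frame whose properties are inherited
        self.slots = dict(slots)    # slots filled in for this frame itself

    def get(self, slot):
        # Look the slot up locally first, then along the chain of superconcepts.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.kind_of
        raise KeyError(slot)

    def is_a(self, other):
        # True if self is the frame other or one of its subconcepts.
        frame = self
        while frame is not None:
            if frame is other:
                return True
            frame = frame.kind_of
        return False

# Illustrative slot values (assumptions made for this example).
polygon = Frame("polygon", closed=True)
quadrangle = Frame("quadrangle", kind_of=polygon, sides=4)
parallelogram = Frame("parallelogram", kind_of=quadrangle, opposite_sides_parallel=True)
rectangle = Frame("rectangle", kind_of=parallelogram, right_angles=True)

print(rectangle.get("sides"))      # 4, inherited from quadrangle
print(rectangle.is_a(polygon))     # True
```

A slot defined in a more specific frame shadows the inherited one, which is one common way to let a subconcept override a property.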

Default theories
A default has the following form

    A : B1, ..., Bk
    ---------------
           C

where the formula A is a premise, the formula C is a conclusion and the formulas B1, ..., Bk are justifications. The conclusion of a default can be derived from its premise if the negation of none of its justifications has been derived.

Examples:

1.  bird(x) : flies(x)
    ------------------
    flies(x)

2. Closed world assumption (CWA):

    : not F
    -------
    not F

Derivation step with a default.

A - premise of a default
C - conclusion of a default
J - justifications of a default

A.1.4:
Default(A,C,J):
  for B ∈ J do
    if derivable(not B) then failure fi
  od;
  success( )
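The derivation step above can be sketched in Python under simplifying assumptions: the set of already derived formulas stands in for the derivability test, a formula is a string, and a negated formula is the pair ("not", formula). The sketch also checks the premise, which the derivation step above takes for granted. The names negation and default_step are illustrative.

```python
def negation(formula):
    # ("not", F) is the negation of F, and the negation of ("not", F) is F.
    if isinstance(formula, tuple) and formula[0] == "not":
        return formula[1]
    return ("not", formula)

def default_step(known, premise, conclusion, justifications):
    # Assumption: a formula counts as derivable exactly when it is in the set known.
    if premise is not None and premise not in known:
        return None                      # the default is not applicable
    for b in justifications:
        if negation(b) in known:
            return None                  # failure: a justification is contradicted
    return known | {conclusion}          # success: the conclusion is added

known = {"bird(tweety)"}

# Default with premise bird, justification flies and conclusion flies.
print(default_step(known, "bird(tweety)", "flies(tweety)", ["flies(tweety)"]))

# The same default is blocked once the negation of the justification is known.
blocked = default_step(known | {("not", "flies(tweety)")},
                       "bird(tweety)", "flies(tweety)", ["flies(tweety)"])
print(blocked)  # None

# Closed world assumption for a formula F that has not been derived.
cwa = default_step(known, None, ("not", "bird(sam)"), [("not", "bird(sam)")])
```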

Rules

Rules are a well-known form of knowledge which is easy to use. A rule is a pair (condition, action) which has the meaning: "If the condition is satisfied, then the action can be taken." Also other modalities for performing the action are possible - "must be taken", for instance.

Using rules
Let us have a set of rules called rules and functions cond(p) and act(p) which select the condition part and action part of a given rule p and present them in the executable form. The following is a simple algorithm for problem solving with rules:
    while not good do
      found := false;
      for p ∈ rules do
        if cond(p) then act(p); found:=true fi
      od;
      if not found then failure fi
    od
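The loop above translates almost directly into Python. The sketch below assumes that a rule is a pair of functions (condition, action) working on a shared state dictionary and that good is a predicate on that state; the tea-making rules are a made-up example.

```python
def solve(state, rules, good):
    # Repeat passes over the rules until the goal predicate holds.
    while not good(state):
        found = False
        for cond, act in rules:
            if cond(state):
                act(state)
                found = True
        if not found:
            raise RuntimeError("failure: no applicable rule")
    return state

# Hypothetical rules: heat cold water, then make tea with hot water.
RULES = [
    (lambda s: s["water"] == "cold", lambda s: s.update(water="hot")),
    (lambda s: s["water"] == "hot" and not s["tea"], lambda s: s.update(tea=True)),
]

result = solve({"water": "cold", "tea": False}, RULES, good=lambda s: s["tea"])
print(result)  # {'water': 'hot', 'tea': True}
```

As in the pseudocode, nothing prevents an endless loop when rules keep firing without ever reaching a good state; a practical version would need an additional stopping criterion.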
Decision trees
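A simple way to represent rules is a decision tree: a tree with nodes for attributes and arcs for attribute values. Example:

[Figure: a decision tree with the attribute nodes legs and furry, arcs labelled with the attribute values two, four, yes and no, and the leaves bird, man, monkey and animal.]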
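A decision tree of this kind can be stored as nested pairs (attribute, branches) with class names at the leaves. The sketch below is only an illustration: the branch layout and the attribute feathers are assumptions and are not taken from the example figure. The function rules shows how every path from the root to a leaf reads as one rule.

```python
# Hypothetical tree: inner nodes are (attribute, {value: subtree}), leaves are strings.
TREE = ("legs", {
    "two": ("feathers", {
        "yes": "bird",
        "no": ("furry", {"yes": "monkey", "no": "man"}),
    }),
    "four": ("furry", {"yes": "animal"}),
})

def classify(tree, obj):
    # Follow the arcs selected by the attribute values of obj down to a leaf.
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches.get(obj.get(attribute))
        if tree is None:
            return None          # no arc for this attribute value
    return tree

def rules(tree, path=()):
    # Each root-to-leaf path is one rule: (conditions, conclusion).
    if isinstance(tree, str):
        yield path, tree
    else:
        attribute, branches = tree
        for value, subtree in branches.items():
            yield from rules(subtree, path + ((attribute, value),))

print(classify(TREE, {"legs": "two", "feathers": "no", "furry": "yes"}))  # monkey
for conditions, conclusion in rules(TREE):
    print(conditions, "=>", conclusion)
```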

Rete algorithm
The Rete algorithm uses a data structure that enables fast search for applicable rules. We shall consider it in two parts: knowledge representation and knowledge management (i.e. introduction of changes into the knowledge base). Any rule that is reachable in the Rete graph (see below) via nonempty relation nodes can be fired.

The Rete algorithm is used in CLIPS (developed at NASA) and in its successor JESS (Java Expert System Shell, developed at Sandia National Laboratories).

Rete algorithm continued

Knowledge includes:

1. facts, e.g.
   (goal e1 simplify), (goal e2 simplify), (goal e3 simplify),
   (expr e1 0 + 3), (expr e2 0 + 5), (expr e3 0 * 2), ...

2. patterns, e.g.
   (goal ?x simplify)
   (expr ?y 0 ?op ?a2)
   (parent ?x ?y)
   ...

3. and rules, e.g.
   (R1 (goal ?x simplify) (expr ?x 0 + ?y) => (expr ?x ?y))
   (R2 (goal ?x simplify) (expr ?x 0 * ?y) => (expr ?x 0))
   ...
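To make the three kinds of knowledge concrete, the following sketch matches patterns against facts and applies the rules R1 and R2 by plain exhaustive search. This is deliberately not the Rete algorithm, only the matching that Rete is designed to speed up. It assumes that facts and patterns are tuples of strings, that variables start with "?", and that a rule is a tuple of condition patterns followed by its conclusion; the names match and fire are illustrative.

```python
def match(pattern, fact, bindings=None):
    # Return the variable bindings under which pattern equals fact, or None.
    bindings = dict(bindings or {})
    if len(pattern) != len(fact):
        return None
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if bindings.setdefault(p, f) != f:
                return None      # variable already bound to a different value
        elif p != f:
            return None
    return bindings

def fire(rule, facts):
    # Find all consistent bindings of the conditions and instantiate the conclusion.
    *conditions, conclusion = rule
    results = [{}]
    for pattern in conditions:
        results = [b2 for b in results for f in facts
                   if (b2 := match(pattern, f, b)) is not None]
    return [tuple(b.get(t, t) for t in conclusion) for b in results]

FACTS = [
    ("goal", "e1", "simplify"), ("goal", "e2", "simplify"), ("goal", "e3", "simplify"),
    ("expr", "e1", "0", "+", "3"), ("expr", "e2", "0", "+", "5"), ("expr", "e3", "0", "*", "2"),
]
R1 = (("goal", "?x", "simplify"), ("expr", "?x", "0", "+", "?y"), ("expr", "?x", "?y"))
R2 = (("goal", "?x", "simplify"), ("expr", "?x", "0", "*", "?y"), ("expr", "?x", "0"))

print(fire(R1, FACTS))  # [('expr', 'e1', '3'), ('expr', 'e2', '5')]
print(fire(R2, FACTS))  # [('expr', 'e3', '0')]
```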

Rete algorithm continued

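Knowledge is represented in the form of an acyclic graph. For the presented example the graph is as follows:

[Figure: Rete graph for the example. Below the root are the predicate nodes goal and expr. The goal node carries the relation x = e1, e2, e3. The pattern node "expr + y" carries the values y = 3, 5 and leads to the relation (x, y) = (e1, 3), (e2, 5) and on to the rule R1. The pattern node "expr * y" leads to the relation (x, y) = (e3, 2).]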


Rete algorithm continued

The overall structure of the Rete graph is the following:

    root
    predicate names layer
    patterns layer - alpha nodes (with one input)
    beta-nodes (with two inputs)
    rules layer - one node for every rule
Adding facts to Rete graph

When a fact arrives:
1. Select the predicate.
2. Select the pattern.
3. For every relation depending on the selected pattern, update the relation (add a new line to the relation).

Rete algorithm continued

The Rete graph is built, updated and used as follows:
1. One level down from the root are placed all predicate names.
2. The next level down contains alpha-nodes for all patterns of all rules as successors of their predicate names.
3. Beta-nodes of the following levels down (with two inputs each) include relations that unify with the patterns along the path from the root to the node.
4. The paths lead finally to nodes representing rules.
5. When a new knowledge item arrives, it is placed into the correct places. Finding the places is simple and straightforward, because it is guided by a relation in every node.
6. When a goal is given, the search is simple and straightforward, because it is guided by a relation in every node.
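The layered structure and the incremental update on arrival of a fact can be sketched as follows. This is a strongly simplified illustration, not the full Rete algorithm: it assumes that every rule has exactly two distinct condition patterns, so that one beta-node per rule suffices, and it handles only additions of facts. The class Rete and the helper names are hypothetical.

```python
from collections import defaultdict

def match(pattern, fact):
    # Bindings under which pattern equals fact, or None (variables start with "?").
    if len(pattern) != len(fact):
        return None
    row = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if row.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return row

def consistent(row, other):
    # Two rows can be joined if they agree on all shared variables.
    return all(other.get(k, v) == v for k, v in row.items())

class Rete:
    def __init__(self, rules):
        self.rules = rules                      # rule name -> (pattern, pattern)
        self.by_predicate = defaultdict(set)    # predicate name -> its patterns
        self.alpha = defaultdict(list)          # pattern -> relation (list of rows)
        self.beta = defaultdict(list)           # rule name -> joined relation
        for left, right in rules.values():
            for pattern in (left, right):
                self.by_predicate[pattern[0]].add(pattern)

    def add_fact(self, fact):
        for pattern in self.by_predicate[fact[0]]:       # 1. select the predicate
            row = match(pattern, fact)                   # 2. select the pattern
            if row is None:
                continue
            self.alpha[pattern].append(row)
            for name, (left, right) in self.rules.items():   # 3. update relations
                if pattern not in (left, right):
                    continue
                other_pattern = right if pattern == left else left
                for other in self.alpha[other_pattern]:
                    if consistent(row, other):
                        self.beta[name].append({**row, **other})

    def fireable(self):
        # A rule can be fired when its relation node is nonempty.
        return [name for name in self.rules if self.beta[name]]

net = Rete({
    "R1": (("goal", "?x", "simplify"), ("expr", "?x", "0", "+", "?y")),
    "R2": (("goal", "?x", "simplify"), ("expr", "?x", "0", "*", "?y")),
})
net.add_fact(("goal", "e1", "simplify"))
net.add_fact(("expr", "e1", "0", "+", "3"))
print(net.fireable())  # ['R1']
```

Each arriving fact touches only the relations below its own predicate and pattern, which is the point of the graph: no rule is re-matched from scratch.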

Rules with plausibilities

Rules can be extended by adding plausibility values to them. Let us associate with each rule p a plausibility value c(p) of application of the rule. These values can be in the range from 0 to 1. We shall consider as satisfactory only the results of application of a sequence of rules p, ..., q for which the plausibilities c(p), ..., c(q) satisfy the condition c(p) * ... * c(q) > cm, where cm is the minimal satisfactory plausibility of the result. When selecting a new applicable rule, it is now reasonable to select the rule with the highest plausibility value.

A.1.6:
  c:=1;
  while not good do
    x:=cm;
    for p ∈ rules do
      if cond(p) and c(p) > x then a:=act(p); x:=c(p) fi
    od;
    c:=c*x;
    if c > cm then a else failure fi
  od;
  success
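A.1.6 can be sketched in Python under the assumption that a rule is a triple (condition, action, plausibility) working on a state dictionary. The example rules and the names solve, advance and done are hypothetical.

```python
def solve(state, rules, good, cm):
    c = 1.0
    while not good(state):
        x, action = cm, None
        for cond, act, plaus in rules:
            # Keep the applicable rule with the highest plausibility above cm.
            if cond(state) and plaus > x:
                action, x = act, plaus
        c *= x
        if c > cm:
            action(state)        # a rule was selected, otherwise c could not exceed cm
        else:
            raise RuntimeError("failure: plausibility too low")
    return c                     # plausibility of the result

def advance(s):
    s["step"] += 1

def done(s):
    return s["step"] == 2

RULES = [
    (lambda s: s["step"] == 0, advance, 0.6),
    (lambda s: s["step"] == 0, advance, 0.9),
    (lambda s: s["step"] == 1, advance, 0.8),
]

c = solve({"step": 0}, RULES, done, cm=0.5)
print(round(c, 2))  # 0.72
```

With cm = 0.8 the same rules fail in the second step: the only applicable rule has plausibility 0.8, which is not greater than cm.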
Using a plausibility function

A.1.7:
  c:=1;
  while not good do
    x:=cm;
    for p ∈ rules do
      if cond(p) and plausibility(c(p),c) > x then
        a:=act(p); x:=plausibility(c(p),c)
      fi
    od;
    c:=x;
    if c > cm then a else failure( ) fi
  od;
  success( )
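The effect of the plausibility function can be seen by passing it as a parameter. In the sketch below, multiplication gives the product of plausibilities along the sequence of applied rules, while min is shown only as one possible alternative; the choice of min and the example rules are assumptions made for illustration.

```python
def solve(state, rules, good, cm, plausibility):
    c = 1.0
    while not good(state):
        x, action = cm, None
        for cond, act, cp in rules:
            value = plausibility(cp, c)   # plausibility of the result if this rule is applied
            if cond(state) and value > x:
                action, x = act, value
        c = x
        if c > cm:
            action(state)
        else:
            raise RuntimeError("failure: plausibility too low")
    return c

def advance(s):
    s["step"] += 1

def done(s):
    return s["step"] == 2

RULES = [
    (lambda s: s["step"] == 0, advance, 0.6),
    (lambda s: s["step"] == 0, advance, 0.9),
    (lambda s: s["step"] == 1, advance, 0.8),
]

by_product = solve({"step": 0}, RULES, done, 0.5, lambda cp, c: cp * c)
by_minimum = solve({"step": 0}, RULES, done, 0.5, min)
print(round(by_product, 2), by_minimum)  # 0.72 0.8
```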
Classification of knowledge systems


Symbolic (derivability, soundness, completeness)

Rules (efficient computability)

Semantic networks (eloquence, simplicity)

Frames (modularity, inheritance)

Facts:
    parent(pam,bob).
    parent(tom,bob).
    parent(tom,liz).
    parent(bob,ann).
    parent(bob,pat).
    parent(pat,jim).

Questions and answers:
    ?- parent(bob,pat).
    yes
    ?- parent(liz,pat).
    no
    ?- parent(X,liz).
    X = tom
    ?- parent(bob,X).
    X = ann ;
    X = pat ;
    no
Bratko, I. (2001) Prolog Programming for Artificial Intelligence. Addison Wesley.
(Jess and the Rete algorithm)
Genesereth, M., Nilsson, N. (1986) Logical Foundations of Artificial Intelligence. Morgan Kaufmann.

