
Glue-Nail: A Deductive Database System

Geoffrey Phipps    Marcia A. Derr    Kenneth A. Ross
Department of Computer Science, Stanford University, Stanford CA 94305 †
{phipps, mad, kar}@cs.stanford.edu
13th March, 1991

* Also AT&T Bell Laboratories, Murray Hill, NJ 07974
† This work supported by AFOSR-88-0266, NSF-87-12791, and a gift of Mitsubishi Electric.

Abstract

Glue is a procedural language for deductive databases. It is designed to complement the purely declarative NAIL! language, firstly by performing system functions impossible to write in NAIL!, and secondly by allowing the procedural specification of algorithms for critical code sections. The two languages together are sufficient to write a complete application. Glue was designed to be as close to NAIL! as possible, hence minimizing the impedance mismatch problem. In this paper we concentrate on Glue. Pseudo-higher order programming is used in both languages, in the style of HiLog [1]. In particular Glue-Nail can handle set valued attributes (non-1NF schemas) in a clean and efficient manner. NAIL! code is compiled into Glue code, simplifying the system design. An experimental implementation has been written, a more efficient version is under design.

1 Introduction

The Glue language grew out of our experiences of designing and implementing the first NAIL! system [3], and of using commercial database systems.

From a software engineering point of view, declarative logic based languages offer many advantages over traditional relational databases, primarily due to their simplicity and high-level approach (for example, see the introductions of [4] and [10]). Relational database systems free the programmer from worrying about the physical data representation and access methods. Deductive database systems do the same for views and recursion. One currently unsolved problem with declarative languages is how to integrate them into a full system without becoming tangled in the impedance mismatch problem (described below). This is a problem which has plagued traditional database systems. Glue-Nail is designed to solve it.

There is no precise definition of "declarative", although most people would agree that in a declarative language, the programmer states what is desired, not how to do it. Hence declarative languages significantly reduce the amount of code that a programmer must write for a given application. In addition, the code that the programmer must write is more strongly focused on the actual application, rather than on the technical details of a particular solution algorithm. This focusing effect should reduce the number of mismatches between the application specification and the programmer's implementation.

Unfortunately there are certain operations which implicitly have a notion of state, for example Input/Output (I/O) and EDB¹ updates. These operations have side effects which cannot be easily captured in logic.² Any real application language must be able to perform I/O and EDB update operations; they cannot be ignored. Computations involving side effects cannot be written declaratively because the programmer must be able to specify the order in which the side effects occur. In other words the programmer must give the intermediate steps in the computation. Procedural languages are well suited to describing such computations, declarative languages are not.

¹ The Extensional Data Base (EDB) stores tuples; in a relational system it would just be called the "database".
² One "solution" is to carry state variables around in the logical rules, but the programmer must ensure that the variables are strung together in the correct order, which is equivalent to specifying the order of computation.

We feel that the increase in programming efficiency provided by a declarative language is very important, and so we have preserved the declarative nature of NAIL!. Hence we need another (procedural) language
to complement NAIL!. This language is Glue.

Embedding a query language in a procedural language is common in databases, for example, embedding SQL in C. Unfortunately we then usually run into the impedance mismatch problem (for an example description, see [4]). This is the name for the collection of problems that arise when we interface two dissimilar languages. It has no formal academic definition, but it is a serious problem in real programming systems. The problems include differing type systems, set oriented versus tuple oriented computation, differing data lifetime, wildly differing syntax, and an inability to carry optimizations across the interface. For example, SQL uses the relation as the basic data type. C uses single valued variables, so C is a tuple oriented language. When SQL is merged with C the concept of a cursor has to be introduced so that C can iterate (or recurse) over all the elements in the SQL relation. SQL has efficient set oriented algorithms for dealing with relations, but C does not. An even more serious efficiency problem is that a two language system has two separate optimizers. Each individual SQL query is optimized independently, without any reference to other nearby SQL queries.

To avoid these problems, Glue was designed to resemble NAIL! as closely as possible. Both languages have tuples and relations as their basic data objects, "all-solution" computation, similar syntaxes, and identical type systems. For example, in Glue a subgoal can be a NAIL! predicate, or an EDB relation or a Glue procedure. The syntax and behavior is the same in all three cases. In each case the subgoal returns a set of tuples.

We expect that an application programmer will write the declarative, query oriented sections of the application in NAIL!, using Glue mostly for the interface and for EDB modifying code. Sometimes it might be useful to use Glue for a particularly speed-critical query, for which an especially efficient special purpose algorithm is known. Such a practice is analogous to writing speed critical sections of a C program in assembler; there is an increase in speed, but at the expense of clarity.

The remainder of the paper is organized as follows. Section 2 discusses the basic data and predicate types. Section 3 is a tutorial introduction to the basic Glue assignment statement. Section 4 describes Glue procedures. In Section 5 we describe Glue-Nail's set system. We also discuss higher-order programming in general. Section 6 briefly describes Glue's module system. Section 8 briefly describes three other deductive database systems (LDL, CORAL, and Aditi), and compares their approaches with Glue-Nail. In Section 9 we describe the current experimental implementation. Section 10 discusses some known problems with Glue-Nail, and what we are planning to do with Glue-Nail.

2 Predicates

There are four kinds of predicates in Glue-Nail:

EDB Relations: Ground tuples (facts) are stored in relations in the Extensional Data Base (EDB). The EDB is equivalent to the "database" of a traditional relational system.

Local Relations: Glue procedures can have local relations; in a sense these are temporary EDB relations with restricted lexical scope.

NAIL! Rules: These define Intensional Database (IDB) tuples, the appropriate parts of which are computed on demand using the current value of the EDB.

Glue Procedures: Glue procedures also belong to the IDB, in that they define tuples which are only computed on demand. Unlike NAIL! predicates, Glue procedures can also use EDB updates and input (I/O) in their computations.

Predicates do not have duplicates.³

In Prolog a subgoal P in rule R can unify with either a fact or a predicate; there is no syntactic or behavioral difference discernible within rule R. Either P is true immediately or it can be derived, it makes no difference⁴ to rule R. This usage equivalence of EDB and IDB is also true of languages like NAIL!; it is one of their great advantages over traditional embedded relational systems. Glue-Nail also has this advantage; a subgoal in Glue or NAIL! can reference an EDB relation, a NAIL! predicate, or a Glue procedure, and the syntax and semantics are identical in all three cases. The meaning is always: use the current value. For an EDB relation this value comes directly from the database. For a NAIL! predicate it is derived from the current state of the EDB. For a Glue procedure it is computed from the current EDB, and perhaps from input.

³ Or rather, the system must remove them if the programmer's code would behave differently in the presence of duplicates.
⁴ Ignoring nonterminating derivations.

An attribute of a tuple can be either an atom (a number or a string), or a compound term. In Glue there is no difference between atoms and strings. In Prolog-style languages, the two data types are distinct, and a programmer is forever converting atoms into strings and vice versa. Strings are first class data types in Glue, and the language has built-in operators (concatenation, length, and substring) to aid in their use. Strings are central to databases, so they must be well supported.

Relations may contain only completely ground tuples. Hence Glue only needs to use matching when comparing subgoals against a relation, rather than use
full unification. This restriction is also very important for the code optimizer, because it allows the system to know at compile time when a variable in an assignment statement becomes bound. Such knowledge is useful in many optimizations. If binding time analysis could not be performed at compile time, then it would have to be done at run time.

3 Assignment Statements

3.1 Basic Elements

The basic element of Glue is the assignment statement. Here is an example:

    r(X,Y)+= s(X,W) & t(f(W,X),Y).

The effect of executing this statement is that the tuple (X,Y) is added to relation r if there is a tuple (X,W) in relation s, and a tuple (f(W,X),Y) in relation t. All such (X,Y) tuples are added to relation r.

Glue assignment statements are not logical rules, they are operational directives. They do not define tuples, they command their creation (or destruction or modification).

In their basic form Glue assignment statements have a single head term, and a conjunction⁵ of subgoals in the body. The body is executed and produces a relation of tuples over the variables in the body. This relation is then used to modify the head relation. The subgoals and the head term may have compound terms as their arguments. Although Glue assignment statements may look a lot like Prolog rules, the control flow is completely different. The Prolog control strategy is "tuple at a time" with backtracking. Glue's strategy is "all solutions" with no backtracking. In Prolog, the binding for a variable is a single term. In Glue, the binding is a set of terms.

⁵ The body may contain control operators other than conjunction, but we will not discuss them in this paper.

For the purposes of side effects and aggregations, the order of evaluation of Glue subgoals is fixed and is from left to right. Each subgoal is completely solved before the next subgoal is processed. We will refer to side effecting or aggregating subgoals as fixed subgoals. A fixed subgoal is either an EDB updating subgoal, a group_by, an aggregator (see Section 3.3), or a call to a Glue procedure which is known to be fixed. A Glue procedure is fixed if it contains a fixed subgoal. The predefined I/O procedures are all fixed. A Glue system is free to reorder the non-fixed subgoals, although procedures must still have their input arguments bound, and subgoals cannot be moved past an aggregator.

There are four assignment operators in Glue:

:= Clearing assignment. The head relation is overwritten by the result of the body.

+= Insertion assignment. The tuples from the body are added to the head relation.

-= Deletion assignment. The tuples from the body are removed from the head relation.

+=[$\vec{Z}$] Modify assignment, meant to be used as "update by key." Analogous to UPDATE in SQL. The key is the variables in the vector $\vec{Z}$.

For example, suppose the unary relation row contains the integers 1 to N, and that the ternary relation matrix contains (row, column, value) triples. Then the code:

    matrix(X,X, 1.0):= row(X).
    matrix(X,Y, 0.0)+= row(X) & row(Y) & X != Y.

would create an identity matrix of size N in relation matrix.

3.2 The Supplementary Relation Model

To explain the semantics of assignment statements it is useful to employ the supplementary relation model. The supplementary relations of an assignment statement hold the bindings for the variables. If there are $n$ subgoals in an assignment statement, then there are $n + 1$ supplementary relations, named $\mathit{sup}_0$ to $\mathit{sup}_n$. The $i$th supplementary relation $\mathit{sup}_i$ has as its attributes all the variables occurring in the first $i$ subgoals. Note that the zeroth supplementary relation, $\mathit{sup}_0$, is a relation of arity zero. It contains a single tuple, the null tuple. The body of an assignment statement of the form:

$$b_1(\vec{B}_1) \;\&\; \cdots \;\&\; b_n(\vec{B}_n).$$

can be rewritten using supplementary relations as:

$$\begin{aligned}
\mathit{sup}_0() &:= \mathit{true}.\\
\mathit{sup}_1(\vec{S}_1) &:= b_1(\vec{B}_1).\\
\mathit{sup}_2(\vec{S}_2) &:= \mathit{sup}_1(\vec{S}_1) \;\&\; b_2(\vec{B}_2).\\
&\;\;\vdots\\
\mathit{sup}_n(\vec{S}_n) &:= \mathit{sup}_{n-1}(\vec{S}_{n-1}) \;\&\; b_n(\vec{B}_n).
\end{aligned}$$

The attributes of $\mathit{sup}_i$ are the union of the variables in $\vec{S}_{i-1}$ with the variables in $\vec{B}_i$, i.e. $\vec{S}_i = \mathit{vars}(\vec{S}_{i-1}) \cup \mathit{vars}(\vec{B}_i)$. Note that no variables are ever dropped from one supplementary relation to the next. However, variables that are not used further on in the assignment statement can be projected out, unless there are aggregators later in the assignment statement. For example, the supplementary relations of the code:

    h(X,W):= a(X,A,B) & b(A,C) & c(B,C,W).

are:
    sup_1(X,A,B):= a(X,A,B).
    sup_2(X,A,B,C):= sup_1(X,A,B) & b(A,C).
    sup_3(X,A,B,C,W):= sup_2(X,A,B,C) & c(B,C,W).

These supplementary relations need not actually exist in the implementation, but they are very useful when thinking about the meaning of an assignment statement. They emphasize the fact that the set (relation) of bound variables has tuples of bindings as its value. Execution of an assignment statement is from left to right, each supplementary relation being (conceptually) completed before the next is begun. Each subgoal is completely solved before executing the next subgoal. Execution of an assignment statement stops whenever a supplementary relation is empty.

3.3 Aggregation

It often happens that we want to find the "aggregate" value of a set of tuples, for example the minimum value of a particular attribute. In the version of Glue described so far, the value of a tuple in a supplementary relation is independent of all the other tuples. This is not true for statements containing aggregate operators. Here the values of tuples typically do depend on each other.

The aggregate operators (aggregators) available in Glue are: min, max, mean, sum, product, arbitrary, std_dev (standard deviation), and count. These operators take a single bound term as an argument, and return a single value. The operator arbitrary returns a single arbitrary value from the binding set of the argument term; the other operators have their usual meanings. A simple example:

    max_temp( MaxT ):=
        temperature( T ) & MaxT = max(T).

The max operator computes the maximum T over all the bindings it has for T at that point in the statement. For example, if the value of temperature were $\{(10), (35)\}$, then max would operate over $\mathit{sup}_1 = \{(10), (35)\}$, MaxT would be bound to 35, and $\mathit{sup}_2(T, \mathit{MaxT})$ would be $\{(10,35), (35,35)\}$.

To explain the semantics of the aggregate operators it is easiest to refer to the supplementary relation model. If the $j$th subgoal is an aggregate operator, then it operates over the tuples in the $(j-1)$th supplementary relation. If the argument term is $T$, the aggregator looks at the $T$ value for each tuple in the supplementary relation, rather than at each tuple in the relation formed by projecting the supplementary relation onto the variables of $T$ (i.e. $\pi_{\mathit{vars}(T)}(\mathit{sup}_{j-1})$). Choosing the second method would delete meaningful duplicates. For example, suppose we were computing the average temperature of a set of readings taken at various locations. If two temperature readings were identical, then that temperature reading would appear only once if we projected the supplementary relation onto the temperature column. The temperature reading appears twice (as it should) if we look at each tuple in the supplementary relation.

Note that the variable resulting from the aggregation is in the $j$th supplementary relation. Here we are free to equate it to other bound variables (perform a "join"), so as to select particularly interesting tuples. For example, suppose we want to find the coldest city.⁶ We want the name of the city, not the actual minimum temperature. The following code would provide the desired answer.

    coldest_city( Name ):=
        daily_temp( Name, T ) &
        MinT = min(T) & T = MinT.

⁶ or cities, in the case of a tie.

The third subgoal joins the T and MinT columns, hence the only tuples left in the supplementary relation after this subgoal are those with minimal temperatures. For example:

    sup_1
    Name            T
    San Francisco   12
    Madang          36
    Copenhagen      -2

    sup_2
    Name            T     MinT
    San Francisco   12    -2
    Madang          36    -2
    Copenhagen      -2    -2

    sup_3
    Name            T     MinT
    Copenhagen      -2    -2

In actual fact the third subgoal is not really necessary. We could perform the restriction immediately by combining the second and third subgoals, as in:

    coldest_cities( Name ):=
        daily_temp( Name, T ) & T = min(T).

3.3.1 Group by

By default, aggregation operators use the entire supplementary set in their computations. There are often occasions when we want to partition the supplementary relation's tuples into a number of groups, and calculate aggregates over each group. For example, the supplementary relation might contain course-student-grade triples, and we might want to calculate the average grade in each course. In Glue we write this as:
    course_average( C, Average ):=
        course_student_grade(C,S,G) &
        group_by(C) & Average = mean(G).

The effect of the second subgoal group_by(C) is to partition the supplementary set into groups, all the tuples in a group having the same C value. The groups are maximal, in that no two groups can have the same C value. All subsequent aggregate operators then operate over each of these groups independently.

group_by statements cascade; that is, if a group_by subgoal has split the supplementary relations into n different groups, then the next group_by subgoal will operate on each of these n groups separately, perhaps splitting them into smaller groups.

4 Glue Procedures

We will explain the structure of Glue procedures using the following example.

    procedure tc_e (X:Y)
    rels connected(X,Y);

        connected(X,Y):= in(X) & e(X,Y).
        repeat
            connected(X,Y)+= connected(X,Z) & e(Z,Y).
        until unchanged( connected(_,_) );
        return(X:Y):= connected(X,Y).
    end

The name of this procedure is tc_e. The procedure's arity is (1:1), meaning that it produces binary tuples, given one bound input argument. Whenever tc_e is used as a subgoal, the first argument must be bound. Informally, this procedure calculates the nodes Y reachable from X via edge relation e. More correctly, given a set of unary tuples (sole attribute X), the procedure tc_e extends these tuples to be a set of binary tuples (X,Y) such that Y is reachable from X via edge relation e. All Glue procedures declare a subset of their formal arguments to be bound when the procedure is called. This binding restriction is the only restriction on the use of a Glue procedure as a subgoal; otherwise they are identical in their use to NAIL! predicates or EDB relations.

The procedure has one local relation, connected, of arity two. Procedures may be called recursively. Each invocation of a procedure has its own copies of its local relations. Declarations of local relations "hide" the declarations of other predicates with which they unify.

There is a repeat-until loop, the termination condition being unchanged(connected(_,_)). The built-in predicate unchanged(P) is true if predicate P has not changed since the last time that particular unchanged statement was executed.⁷ The predicate unchanged(P) is always false the first time it is executed.

⁷ The semantics of unchanged are under review, and may be changed slightly.

All procedures have two special relations, in and return. The relation in holds the input tuples to the procedure. The relation in has an arity equal to the bound arity of the procedure, i.e. the arity to the left of the colon in the procedure definition (in this case it has an arity of one). The relation return is used to hold the output tuples for the procedure. Assigning to this relation also has the effect of exiting the procedure. The return relation has the same arity as the procedure.

An assignment statement that assigns to the return relation has an implicit in subgoal as its first subgoal. The arguments of the in subgoal are the same as the arguments to the left of the colon in the return head, for example:

    return(X:Y):= in(X) & connected(X,Y).

The implicit in relation has a natural meaning: it restricts the return relation to be only those tuples which extend the input relation.

When a Glue procedure is used as a subgoal it is called once on all of the bindings for its input arguments, rather than being called many times, once for each binding for its input arguments.

5 Sets and Meta-programming

5.1 Sets

Set valued attributes are useful; they give a language more practical expressiveness, and can lead to more space efficient relations. Accordingly we wanted Glue and NAIL! to have sets. In both LDL and CORAL sets and relations are different things, whereas in logical terms they are both just sets of tuples. An LDL or CORAL rule with a set-generating operator needs to be read differently from a standard LDL or CORAL rule. Rules often produce a set of sets when what one really wants is the union of the sets. These sets of sets then have to be explicitly flattened. The only type of set equality available is set unification, which can be expensive.

Glue-Nail borrows the second order syntax scheme of HiLog [1]. In this scheme a set valued attribute contains the name of a predicate (i.e. the name of a set), rather than the value (members) of a set. Sets are therefore just normal predicates. In addition, compound terms can have arbitrary terms as their functors, rather than being limited to atoms as in normal logic based languages. Hence subgoals may have variables for their
predicate names. In particular we can store the name of a predicate in a tuple, then extract it using a variable and use that variable as a subgoal name. For example:

    dept_employees( toy, E_set ) &
    E_set( Emp_name ) & ...

The second attribute of the dept_employees relation is a set valued attribute, it holds the name of the predicate which holds the employees in the toy department.

Although the syntax is second order, the semantics is first order. Predicate variables can only range over predicate names, not over all predicate extensions (values). This distinction is important, because the set of predicate names is always finite, whereas the set of possible predicates is infinite. The scoping rules of Glue's modules and procedures give the compiler a list of the predicates which a subgoal variable could possibly match, so much of the predicate selection analysis can be done at compile time.

Here is an example set definition in Glue:

    class_info( ID, Ins, Room,
                tas(ID), students(ID)):=
        class_instructor( ID, Ins ) &
        class_room( ID, Room ).
    tas(ID)(Grad_student):=
        class_subject( ID, Subject) &
        failed_exam( Grad_student, Subject ).
    students(ID)(S):=
        attends( S, ID ).

The predicate class_info contains information about a class: its identifying code, instructor, set of TA's (Teaching Assistants), and set of students. The predicate tas(ID) defines the TA's for a course, notably those graduate students who failed the graduate qualifying exam in the course's subject area. Observe that the name of this predicate is a compound term. The predicate students(ID) contains the names of the students who are taking the course ID. The predicates class_instructor, class_room, class_subject, failed_exam, and attends are defined elsewhere. Here is an example EDB:

    class_instructor( cs99, smith ).
    class_room( cs99, mjh460a ).
    class_subject( cs99, databases ).
    failed_exam( jones, databases ).
    attends( wilson, cs99 ).
    attends( green, cs99 ).

It implies the following IDB tuples:

    class_info( cs99, smith, mjh460a,
                tas(cs99), students(cs99)).
    tas(cs99)( jones ).
    students(cs99)( wilson ).
    students(cs99)( green ).

A typical use of the class_info predicate might be:

    class_info(C,I,R,T,S) & T(TA) & S(Student)

There is no automatic need for set unification in Glue-Nail; if two set valued attributes contain the same predicate name, then the two sets are identical. Hence much of the time a simple string-string matching suffices to determine equality. Of course, there will be times when the programmer needs to test whether two differently named sets have the same members. Here is a small procedure which compares two sets S and T.

    proc set_eq( S, T: )
    rels different(S,T);
        different(S,T):= in(S,T) & S(X) & !T(X).
        different(S,T)+= in(S,T) & T(X) & !S(X).
        return(S,T:):= !different(S,T).
    end

5.2 Meta-programming

The HiLog system also allows the writing of parameterized predicates; for example the following NAIL! code defines the transitive closure of an arbitrary edge relation E:

    tc(E,X,X).
    tc(E,X,Z):- tc(E,X,Y) & E(Y,Z).

Our example in Section 4 could have taken e as a formal argument, thus allowing us to write one universal transitive closure predicate.

6 Modules

Both logic programming and deductive database languages have had problems "programming in the large," partly due to their lack of large scale code organization structures. Hence, in common with several other languages, Glue-Nail has a module system. Modules are purely a compile time concept, they do not have any run time semantics. Besides offering the usual advantages of separate compilation, modularity etc; modules give the Glue compiler valuable information concerning which predicates are visible at any point in a program. This information can be used to perform much of the predicate dereferencing at compile time, work which would otherwise have to be done at run time. This is especially true for subgoals which use predicate variables.

Modules have:

• a name,
• a list of imported EDB predicates,
• a list of imported IDB predicates,
• a list of exported IDB predicates, and
• IDB predicate code, both for Glue procedures and NAIL! rules.
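
As a rough illustration of how the module parts listed above could let a compiler narrow down what a predicate variable may refer to, here is a minimal Python sketch. It is not the Glue compiler's algorithm: the `Pred` and `Module` classes, the `kind` tags, the `defined_idb` field, and the sample `graph` module are all hypothetical names introduced only for this example, and matching by arity alone is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pred:
    """A predicate signature. `kind` is a hypothetical tag: 'edb', 'nail', or 'glue'."""
    name: str
    arity: int
    kind: str


@dataclass
class Module:
    """Hypothetical model of a module: name, import lists, export list, and own code."""
    name: str
    imported_edb: list = field(default_factory=list)
    imported_idb: list = field(default_factory=list)
    exported_idb: list = field(default_factory=list)
    defined_idb: list = field(default_factory=list)  # Glue procedures and NAIL! rules

    def visible(self):
        """Predicates that code inside this module is able to name."""
        return set(self.imported_edb) | set(self.imported_idb) | set(self.defined_idb)


def candidates(module, arity):
    """Visible predicates that a subgoal with a variable name and this arity could match."""
    return sorted((p for p in module.visible() if p.arity == arity),
                  key=lambda p: p.name)


def may_call_procedure(module, arity):
    """True if at least one candidate of this arity is a Glue procedure."""
    return any(p.kind == "glue" for p in candidates(module, arity))


# Assumed sample module: one imported EDB relation, one imported NAIL! predicate,
# and one locally defined (and exported) Glue procedure.
graph = Module(
    name="graph",
    imported_edb=[Pred("e", 2, "edb")],
    imported_idb=[Pred("tc", 3, "nail")],
    exported_idb=[Pred("tc_e", 2, "glue")],
    defined_idb=[Pred("tc_e", 2, "glue")],
)

print([p.name for p in candidates(graph, 2)])  # ['e', 'tc_e']
print(may_call_procedure(graph, 3))            # False
```

Because the import and export lists are fixed when the module is compiled, a candidate set like this can be computed once per subgoal ahead of time, leaving only the final choice among the remaining candidates for run time.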
Notice that a module can contain both Glue procedures and NAIL! rules, thus allowing the programmer to group predicates by function, rather than by type.

7 A Larger Example

Space precludes us from including a large example, but Figure 1 gives some interface code lifted from a micro-CAD system; other examples may be found in [5]. We show a complete module, although this code was originally (and more sensibly) part of a larger module.

The procedure select allows the user to use the mouse to select a graphical element. The procedure first finds all elements within some tolerance of the user's mouse point. It then presents the elements to the user one at a time, in order of increasing distance from the mouse point. The procedure returns the key of the selected element, if any.

The NAIL! rule graphic_search is used to find the elements within the given tolerance of the mouse point.

    module example;
    export select(:Key);
    from windows import event( :Type, Data );
    from graphics import
        highlight( Key: ), dehighlight( Key: );
    edb element(Key, Origin, P1, P2, DS ),
        tolerance(T);

    proc select( :Key )
    rels
        possible(Key, D), try(Key), confirmed(Key);
        possible( Key, D ):=
            event( mouse, p(X,Y) ) &
            graphic_search( p(X,Y), Key, D ).
        repeat
            try(Key):=
                possible( Key, D ) &
                D= min(D) &
                It= arbitrary(Key) &
                --possible( It, D).
            confirmed(K):=
                try(K) &
                highlight(K) &
                write( 'This one?' ) &
                event( keyboard, KeyBuffer ) &
                dehighlight( K ) &
                KeyBuffer = 'y'.
        until {confirmed(K) | empty(possible(K)) };
        return(Et,Ed:Key):= confirmed( Key ).
    end

    graphic_search( p(X,Y), Key, Dist ):-
        element( Key, _, p(Xmin, Ymin), _,_ ) &
        tolerance( T ) &
        (X-Xmin)*(X-Xmin) + (Y-Ymin)*(Y-Ymin) < T.
    end

Figure 1: Cad Example

8 Comparison to Some Other Systems

There are several more systems than we mention here; space prevents us from mentioning them all.

It could be said that Aditi has started with the relational engine (the back end), CORAL with the query language (the front end), with Glue holding the middle ground. NAIL! has already covered the front end.

8.1 LDL

LDL ([2], [4]) does not have a separate procedural language; it can itself perform I/O and EDB updates. As in Glue, update and I/O subgoals are fixed in a rule and cannot be moved. Rules containing updates are not allowed to fail. There is a forever meta-predicate. This predicate iteratively executes some rule body if that rule body is forever true (i.e. will never fail). The forever predicate is specifically designed to be used in rules with updates, so that the update will never fail.

Sets in LDL use extensional semantics, so that a set-valued attribute has the elements of a set as its value. Aggregation operators only operate over sets. The set
grouping operator can only be understood if the usual tuple-based reading of a rule is abandoned. By "tuple-based," we mean that a tuple is true for a rule irrespective of all the other tuples which may or may not be true for that rule. With the set-grouping operator, one must implicitly gather up all the solutions of a rule, then form them into sets.

Sets (and therefore aggregations) must be stratified.

LDL does not have any meta-programming features.

LDL has a module system. LDL modules have effects at run time as well as at compile time.

LDL uses stratified negation, although it could probably be extended to modular stratification.

LDL is compiled into C, and it has a foreign language interface to C. The LDL implementation has progressed much further than the implementations of either Glue-Nail or CORAL.

Our experience of writing LDL programs is that the procedural parts of the program (updates, and sets to a lesser extent) tend to dominate the programmer's thinking, hence negating the theoretically declarative nature of the language.

8.2 CORAL

The query language of CORAL [8] is very similar to that of LDL, however it uses the same two language approach as Glue-Nail. CORAL has chosen to use the existing object-oriented language C++ as the procedural language. The idea here is that the flexible type system of C++ will allow the easy creation of relation and tuple types in C++, reducing the impedance mismatch problem to a tolerable level. Although we have conducted no formal experiments, we suspect that C++ will present more of an impedance problem than Glue. C++ is built on C, and so has inherited the strong philosophies of C; philosophies which are radically different to those present in a logic-based query language. Using two completely separate languages also makes optimization very difficult. Glue-Nail aims to avoid these problems.

CORAL has a powerful and complicated module system. CORAL modules have run time semantics, Glue modules are purely compile time. Module import lists can be bound at run time, allowing a form of meta programming. In Glue-Nail we use the higher-order system of HiLog, there is no separate system for meta-programming.

CORAL allows variables in the EDB, partly to allow the use of the Magic Templates query compilation algorithm [7]. Hence a database lookup in CORAL re-
seen whether the extra power provided by magic templates justifies the increased cost of a database lookup.

CORAL has the same set and aggregation scheme as LDL.

Like LDL, CORAL uses stratified negation.

CORAL has evaluable I/O predicates. They are given a logical semantics, but the semantics relies on state variables to ensure that the predicates are executed in the correct sequence. These extra variables carry no useful information, they merely exist to force a certain procedural reading.

8.3 Aditi

The Aditi project has concentrated on building an industrial strength back end for a deductive database language. Aditi-Prolog is the query language; it is pure Prolog with extensions for type and mode declarations, quantification and aggregate operations. At present Aditi-Prolog can be used interactively, although there are plans to embed Aditi-Prolog queries in Nu-Prolog or C.

Glue-Nail and Aditi have so far concentrated on different issues in deductive databases, so comparisons cannot yet be made.

9 Current Implementation

An experimental implementation has been written. Only the parser is written in C; the real meat of the compiler is written in Prolog. The compiler produces Prolog code for a small virtual machine, also written in Prolog. The virtual machine uses the Prolog database to store all relations. The parser is 3600 lines of C, the rest of the compiler is 4500 lines of Sicstus Prolog. The system compiles about two statements per Mips-second in compiled Sicstus Prolog on an IBM PC/RT.

The compiler is not a naive implementation; the aim has been to do as much as possible at compile time. For example, using the scope rules, in Glue it is possible at compile time to determine which predicate classes (i.e. EDB, IDB, Glue procedure, or reference⁸) a statically unbound name, such as X, could refer to at run time. A naive system would wait until X becomes bound at run time, and then check it against the four possible cases. The current compiler will have already eliminated those choices which were seen to be impossible at compile time. Procedure calls are expensive, so it is very important to identify at compile time those subgoals which cannot possibly be procedure calls.

We have used a pipelined (nested join) execution
quires uni cation, not just matching. Searching the strategy for the implementation, this being forced on us
database for a tuple match is the fundamental opera-
tion of any deductive database system. It remains to be 8 Not discussed in this paper.
by Prolog's tuple-at-a-time strategy. The experimental implementation has revealed a number of bottlenecks in a pipeline design. The main problem found was that certain language features force pipeline termination and the materialization of a supplementary relation. Breaking the pipeline and materializing the supplementary relation incurs some computational overhead, reduces the join order flexibility, may use extra space, and costs an extra load and store for each tuple. We can eliminate duplicates at this point, which is also expensive, but so far has proven to be cost effective. For various reasons (perhaps related to the particular application programs that we have run), the Glue assignment statements that we have examined have produced a large number of duplicates, so removing duplicates early has always been advantageous. However, in the worst case pipeline breakage is a loss. Breaks are required whenever a Glue procedure is called. We have to project the supplementary relation onto the input arguments, and call the Glue procedure once on all the input arguments. Breaks can also be required if we have an update operation in the body (see footnote 9) or an aggregator.

Footnote 9: A language feature which is not discussed in this paper.

Two undergraduate students are writing medium-sized test applications in Glue. Their experiences have helped in the development of the optimizer algorithms, in identifying problematic areas of the language design, and in debugging the compiler. More undergraduates will be writing senior projects in Glue-Nail.

10 Known Problems and Future Work

Glue is intended to be a complete application language, but to be one it probably needs a foreign language interface capability. Many applications use windowing systems, typically with a C interface. It is not reasonable to ask the programmer to write an entire windowing system in Glue, so we must provide some way of interfacing to languages such as C. It would also be difficult to write a windowing scheme in Glue, because Glue has such a simple type (and hence I/O) system. A window system might require talking to a device in terms of bitmaps or bytes, and Glue has no easy way of doing this.

We have written some non-trivial programs in Glue, but we plan to write several more so as to evaluate the system. In the process of designing Glue we wrote several small programs and one medium-sized (400 lines) micro-CAD program. It would be useful to take a subset of an existing CAD program (or some other application), rewrite it in Glue-Nail and then compare the two implementations.

It became clear when implementing the first version of NAIL! that it is a mistake to build a deductive database system on top of an existing relational database system. In a traditional relational database there are few relations, they live for a long time, and they usually have large numbers of tuples. These things are not true for deductive databases, where a query or program execution might produce hundreds of small, very short-lived temporary relations. Such relations do not need the level of protection that a relational database provides, and in fact the system wastes much of its time performing such tasks. All the usual impedance mismatch problems occur; in particular, the front end and the back end cannot integrate their optimization strategies. Hence we need our own back end.

Work is in progress on designing an efficient relational back end for Glue. The kinds of applications we envision for Glue are single-user on small-to-medium sized databases. Thus, the back end will ignore concurrency issues and will manage relations in main memory as much as possible, storing EDB relations on disk between runs. The back end will be tailored to properties of deductive database programs. For example, it will implement a "uniondiff" operator ([9]) in order to support compiled recursive NAIL! queries. Because Glue programs create and update many relations at run-time, queries involving those relations are difficult to optimize at compile-time. However, optimizing every statement each time it is executed would be too expensive. Furthermore, for some queries, performing optimization may be more expensive than executing the query. Therefore, the back end will employ adaptive optimization techniques that select appropriate storage structures and access methods at run-time based on changing properties of the database and patterns of access. For example, an index could be created for a relation after the cumulative cost of selection by scanning the relation reaches the cost of creating the index.

We are currently building the NAIL! to Glue compiler. We may need to tune Glue so as to evaluate NAIL! queries as efficiently as possible.

11 Conclusions

Much work has been done on declarative query languages for deductive database systems: on the forms of such languages, and the algorithms to implement them. Current systems are addressing the problem of turning them into full application languages. The Glue-Nail system has taken the approach of providing two tightly knit languages, one declarative and one procedural. We feel that this approach is sound.

There are also many questions involved in the design of an efficient relational back end. Some
headway has been made in reducing the cost of higher-order programming by compile time analysis, but much more work remains to be done.

We claim that Glue has effectively dealt with the impedance mismatch problem. Here follows a list of the main problems and Glue-Nail's solutions to them.

Separate optimization: NAIL! code is compiled into Glue procedures; the Glue optimizer runs over all the code.

Tuple oriented versus Set oriented: Both NAIL! and Glue are relation oriented, so the interface does not require buffering, back-tracking, or iterating schemes. The programmer does not have to constantly flip mental models.

Types: Identical systems (see footnote 10), both languages allow function symbols and use HiLog terms.

Syntax: Similar but not identical. The two types of code can occur in the same module, allowing code to be grouped according to function, rather than language.

Data Lifetimes: Explicit. Permanent data is stored in the EDB, Glue procedures and NAIL! rules both compute their values from the current state of the EDB.

Footnote 10: To be honest it could be said that neither language really has a type system.

A full description of Glue is available in [6].

12 Acknowledgements

The majority of the design of Glue is due to Geoffrey Phipps. Ken Ross provided the basic design of the assignment statement with respect to relations, and its coupling with the repeat loop. The implementation of the parser and compiler was done by Geoffrey Phipps. Marcia Derr is designing and implementing the relational back end. Compilation strategies for NAIL! are being investigated by Ashish Gupta, Geoffrey Phipps and Ken Ross. David Chen and Kathleen Fisher have written application programs in Glue. Besides the three authors, the following people also contributed to discussions concerning Glue's design: Hakan Jakobsson, Inderpal Mumick, Yehoshua Sagiv, and Jeffrey Ullman.

References

[1] Weidong Chen, Michael Kifer, and David S. Warren. HiLog: A First-Order Semantics for Higher-Order Logic Programming Constructs. In Proceedings 2nd Int Workshop on Database Programming Languages, 1989.

[2] Danette Chimenti and Ruben Gamboa. The SALAD Cookbook: A User/Programmer's Guide. Technical Report ACT-ST-346-89, Microelectronics and Computer Technology Corporation, 1989.

[3] Katherine Morris, Jeffrey Ullman, and Allen van Gelder. Design Overview of the NAIL! System. In Proceedings 3rd Int Conference on Logic Programming, pages 554-568, New York, 1986. Springer-Verlag.

[4] Shamim Naqvi and Shalom Tsur. A Logical Language for Data and Knowledge Bases. Computer Science Press, New York, 1989.

[5] Geoffrey Phipps. Glue - A Deductive Database Programming Language. In Jan Chomicki, editor, Proceedings of the NACLP'90 Workshop on Deductive Databases. Kansas State University Technical Report TR-CS-90-14, 1990.

[6] Geoffrey Phipps. The Glue Manual, Version 1.0. Technical Report STAN-CS-91-1353, Department of Computer Science, Stanford University, 1990.

[7] Raghu Ramakrishnan. Magic Templates: A Spellbinding Approach to Logic Programs. In Proceedings Fifth International Conference on Logic Programming, 1988.

[8] Raghu Ramakrishnan, Per Bothner, Divesh Srivastava, and S. Sudarshan. CORAL - A Database Programming Language. In Jan Chomicki, editor, Proceedings of the NACLP'90 Workshop on Deductive Databases. Kansas State University Technical Report TR-CS-90-14, 1990.

[9] Jayen Vaghani, Kotagiri Ramamohanarao, David B. Kemp, Zoltan Somogyi, and Peter J. Stuckey. The Aditi Deductive Database System. In Jan Chomicki, editor, Proceedings of the NACLP'90 Workshop on Deductive Databases. Kansas State University Technical Report TR-CS-90-14, 1990.

[10] Carlo Zaniolo. Deductive Databases - Theory Meets Practice. In Proceedings 2nd International Conference on Extending Database Technology, 1990.
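
As an illustration of the pipeline break described in Section 9, where the supplementary relation is projected onto the input arguments and the Glue procedure is called once on all of them, the following is a minimal Python sketch. It is not Glue code and not the authors' implementation. It assumes relations are sets of tuples and stands in for a Glue procedure with an ordinary function; the names call_procedure_at_break and successors are hypothetical.

```python
def call_procedure_at_break(supplementary, input_positions, procedure):
    """Break the pipeline at a procedure call (illustrative sketch only).

    supplementary   -- iterable of tuples built so far by the pipeline
    input_positions -- positions of the columns passed to the procedure
    procedure       -- hypothetical stand-in for a Glue procedure: takes a set
                       of input tuples and returns (input_tuple, output) pairs
    """
    # Materialize the supplementary relation; using a set removes duplicates.
    materialized = set(supplementary)

    # Project onto the input arguments; duplicate inputs collapse here too.
    inputs = {tuple(row[i] for i in input_positions) for row in materialized}

    # Call the procedure once on the whole set of inputs.
    results = {}
    for key, output in procedure(inputs):
        results.setdefault(key, set()).add(output)

    # Join the answers back onto the materialized supplementary relation.
    extended = set()
    for row in materialized:
        key = tuple(row[i] for i in input_positions)
        for output in results.get(key, ()):
            extended.add(row + (output,))
    return extended


def successors(inputs):
    # Toy procedure: for each input tuple (n,), produce n + 1.
    return {(key, key[0] + 1) for key in inputs}


sup = [("a", 1), ("b", 1), ("a", 1), ("c", 2)]
print(sorted(call_procedure_at_break(sup, [1], successors)))
# [('a', 1, 2), ('b', 1, 2), ('c', 2, 3)]
```

The four tuples in sup yield only two distinct inputs, so the toy procedure does its work twice rather than four times. This is the kind of saving early duplicate removal can give, paid for by materializing the relation.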

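
The adaptive indexing rule mentioned in Section 10, under which an index is created once the cumulative cost of scanning a relation reaches the cost of creating the index, can be sketched in Python as follows. The cost model is an assumption made purely for illustration: a scan costs one unit per tuple examined, and building an index costs build_factor units per tuple. The class AdaptiveRelation is hypothetical and treats the relation as static, so it does not show the index maintenance that run-time updates would require.

```python
class AdaptiveRelation:
    """In-memory relation that builds a hash index once scans have cost enough.

    Assumed cost model (not from the paper): a scan costs one unit per tuple,
    and building an index costs `build_factor` units per tuple.
    """

    def __init__(self, tuples, build_factor=2):
        self.tuples = list(tuples)
        self.build_factor = build_factor
        self.scan_cost = {}   # column -> cumulative cost of scans on it
        self.indexes = {}     # column -> {value: [matching tuples]}

    def select(self, column, value):
        """Return all tuples whose `column` equals `value`."""
        index = self.indexes.get(column)
        if index is not None:
            return list(index.get(value, []))

        # No index yet: scan the relation and charge the scan to this column.
        result = [t for t in self.tuples if t[column] == value]
        self.scan_cost[column] = self.scan_cost.get(column, 0) + len(self.tuples)

        # Build the index once cumulative scan cost reaches the build cost.
        if self.scan_cost[column] >= self.build_factor * len(self.tuples):
            self._build_index(column)
        return result

    def _build_index(self, column):
        index = {}
        for t in self.tuples:
            index.setdefault(t[column], []).append(t)
        self.indexes[column] = index


edges = AdaptiveRelation([(1, 2), (2, 3), (2, 4), (3, 4)])
edges.select(0, 2)            # first scan: cost 4, threshold 8, no index yet
print(0 in edges.indexes)     # False
edges.select(0, 2)            # second scan: cumulative cost 8, index is built
print(0 in edges.indexes)     # True
print(edges.select(0, 2))     # [(2, 3), (2, 4)], answered from the index
```

A relation that is selected on only once or twice never pays for an index, which suits the many small, short-lived temporary relations described above.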