Professional Documents
Culture Documents
The Intelligent Database Interface: Integrating AI and Database Systems
The Intelligent Database Interface: Integrating AI and Database Systems
The Intelligent Database Interface: Integrating AI and Database Systems
net/publication/221603722
CITATIONS READS
28 912
3 authors, including:
Tim Finin
University of Maryland, Baltimore County
566 PUBLICATIONS 28,885 CITATIONS
SEE PROFILE
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Tim Finin on 11 January 2014.
Extending the AI system not incorporate full DBMS technology. Rather, the
emphasis is on the AI system and the DBMS capabil-
Loose coupling
seconds
seconds
2
0.5
Collection
Translation
Without Empty Non-empty
Min Mean Max Caching Cache Cache
Figure 4: The aggregate processing time is broken down in Figure 5: Cache performance is measured for three cases:
terms of the three main stages of processing: translation, ex- (a) the without caching or base-line case where caching was
ecution, and collection. For each processing stage, the mini- disabled, (b) the empty cache case where caching was enabled
mum, mean, and maximum processing times are shown. but the cache was cleared before each repetition of the test set,
and (c) the non-empty cache case where the cache contained
Current Status the results for all queries in the test set.
Performance
The IDI, as described here, has been implemented in of the relative processing times for each stage could be
Common Lisp and tested as a stand-alone query pro- established.
cessor against two dierent databases running on RTI The dierences in translation time re
ect a depen-
INGRES and is also being used as a query server for dence on the number of relations in the IDIL query.
the Unisys spoken language project. The performance Similarly, the collection time is a function of the num-
results obtained thus far are, at best, preliminary since ber of tuples in the result relation. In both cases, the
the size of the test suite was comparatively small and processing times are signicantly less than the execu-
the IDI is just now being integrated with an AI system. tion time which is eected by the complexity of the
However, the results are encouraging and indicate the SQL query, the communication overhead, and the load
potential for ecient database access aorded by the on the remote DBMS host (since only elapsed time was
IDI. The following summarize some of the more inter- recorded).
esting of these performance results. Figure 5 indicates the eects of result caching on
One test set of IDIL queries used consisted of 48 performance. The results represent the mean process-
queries where there were 22 unique queries, i.e., each ing times (in seconds) for all queries. Three dierent
query was repeated at least once in the test set. The cases are represented: (a) the without caching or base-
queries ranged from simple (i.e., only project and select line case where caching was disabled, (b) the empty
operations were required) to complex (i.e., a four-way cache case where caching was enabled but the cache
join with two aggregation operations as well as projects was cleared before each repetition of the test set, and
and selects). The size of the result relations varied from (c) the non-empty cache case where the cache was con-
zero to 17 tuples. The statistics presented here are tained the results for all queries in the test set. The
based on the mean processing times for 20 repetitions dierence between the base-line and empty cache cases
of the test set of queries. is due to the number of repeated queries (i.e., 26 out
Figure 4 shows a breakdown of the aggregate process- of 48 were repeated). The fact that the base-line case
ing time in terms of the three main stages of processing: is more than twice the empty cache indicated that the
translation (i.e., the time to translate and IDIL query overhead required for result caching is not signicant.
into SQL), execution (i.e., the elapsed time between The non-empty cache case indicates the maximum po-
sending the SQL query to the DBMS and obtaining the tential benet of result caching, i.e., nearly two orders
rst tuple of the result relation), and collection (i.e., of magnitude improvement in performance. Clearly this
the time required to collect all the tuples in the result could only occur when the cache is \stacked" as in the
relation and convert them into internal form). For each test. However, it does help to establish an upper limit
processing stage, the minimum, mean, and maximum on the possible performance improvement aorded by
processing times are shown. The cache was disabled result caching. Obviously, as the number of repeated
for these measurements so that a more accurate picture queries increases so will the gain in performance.
Application of the IDI cache manager to allow it to make intelligent choices
about query generalizations, pre-fetching and replace-
Clearly, more detailed performance results need to be ment. We currently have an initial ATIS server running
obtained using more exhaustive test sets. It will be and will be collecting statistics on its transactions which
particularly important to integrate the IDI with an AI can then be used to dene a eective advice strategy.
system and measure its performance with a variety of
dierent applications. We are currently using the IDI Conclusion
to provide a database server for the Unisys spoken lan-
guage understanding system and are investigating the Although the implementation of the IDI is not com-
integration of the IDI with the Intelligent System Server plete, it does provide a solid foundation for easily cre-
and its Protem representation and reasoning engine, ating a sophisticated interface to existing DBMSs. The
The IDI and the ISS. Protem is a hybrid system key characteristics of IDI are eciency, simplicity of
containing both a frame-based representation system use, and a high degree of portability which make it an
and a logic-based reasoning component. The integra- ideal choice for supporting a variety of AI and related
tion of a frame-based representation system with a rela- applications which require access to remote DBMSs.
tional database management system is not straightfor- Among the various extensions to the IDI that have
ward. Our current approach labels some of the classes been planned for the future, most involve the Cache
in the frame system as \database classes". Any knowl- Manager. At present, the implementation of the CM
edge base activity which searches for the instances of has been focused on ecient result caching and most
this class will be handed a stream of \database in- other cache management functions have not been im-
stances" which will be the result of a query sent to the plemented. One of the rst steps will be to impose
database via the IDI. In order to avoid lling the knowl- a parameterized limit on the size of the cache and to
edge base memory with database information, these in- implement a cache replacement strategy. Other exten-
stances are not installed as persistent knowledge base sions to the CM include cache validation, and the abil-
objects but exist as \light weight objects" which are ity to perform DBMS-like operations on cache elements
garbage collected as soon as active processes stop ex- [O'Hare and Sheth, 1989].
amining them. They are also not \fully instantiated". If the IDI is extended so that it is capable of perform-
That is, the values for the frame's roles are not neces- ing DBMS-like operations on the contents of its cache
sarily installed. Instead, if an attempt is made to access then, given an IDIL query, it will have three general
their roles, additional database queries to retrieve the courses of action which it may take to produce the re-
information will generated automatically. Once again, sults: (a) the entire IDIL query can be translated into
this information is not added as permanent knowledge SQL and sent to the remote DBMS for execution; (b)
base data, but only last as long as the currently active the entire IDIL query can be executed locally by the
process is using it. IDI (including simple retrieval from the cache); and (c)
This approach has three advantages: it is relatively the IDIL query can be decomposed so that part of it
simple to implement, transparent to the user and is is executed on the remote DBMS and part of it is exe-
the key to isolating the data copy problem to cache cuted locally by the IDI. The decision of which action
validation as stated earlier. Once the relationship be- to take would depend on a number of factors includ-
tween a database class and its database tables is de- ing the current contents of the cache and the estimated
clared, the class and its instances can be treated as costs for each alternative.
any other knowledge base objects. However, without
the IDI cache implementation, it would be prohibitively References
slow. [Abarbanel and Williams, 1986] R. Abarbanel and M.
The IDI ATIS Server. The second AI system that Williams, \A Relational Representation for Knowledge
the IDI is being used to support is a spoken language in- Bases,", Technical Report, Intellicorp, Mountain View,
terface to an Air Travel Information System database. CA, April 1986.
In this project, spoken queries are processed by a speech [Bocca, 1986] J. Bocca, \EDUCE a Marriage of Conve-
recognition system and interpreted by the Unisys Pun- nience: Prolog and a Relational DBMS," Third Sympo-
dit natural language system [Hirschman, et. al., 1989]. sium on Logic Programming, Salt Lake City, Sept. 1986,
The resulting interpretation is translated into a IDIL pp. 36-45.
query which is then sent to the ATIS Server for evalua- [Brodie, 1988] M. Brodie, \Future Intelligent Informa-
tion. This server is a separate process running the IDI tion Systems: AI and Database Technologies Working
which, in turn, submits SQL queries to an INGRES Together" in Readings in Articial Intelligence and
database server. Databases, Morgan Kaufman, San Mateo, CA, 1988.
The \travel agent" domain is one in which there is a [Ceri, et. al., 1986] S. Ceri, G. Gottlob, and G. Wiederhold,
rich source of pragmatic information that can be used \Interfacing Relational Databases and Prolog Eciently,"
to infer the user's intentions underlying their queries. Proc. of the 1st Intl. Conf. on Expert Database Systems,
These intentions can be used to generate advice to the South Carolina, April 1986.
[Chakravarthy, et. al., 1982] U. Chakravarthy, J. Minker, Management System," in Proceedings of the 12th In-
and D. Tran, \Interfacing Predicate Logic Languages ternational Conference on Very Large Databases, Kyoto,
and Relational Databases," in Proceedings of the First Japan, 1986.
International Logic Programming Conference, pp. 91-98, [Li, 1984] D. Li, A Prolog Database System, Research Stud-
September 1982. ies Press, Letchworth, 1984.
[Chamberlin, et. al., 1976] D. Chamberlin, el al.. \SE- [Minker, 1978] J. Minker, \An Experimental Relational
QUEL2: A Unied Approach to Data Denition, Manip- Data Base System Based on Logic," in Logic and
ulation, and Control", IBM Journal of R&D, 20, 560-575, Databases, ed. J. Minker, Plenum Press, New York, 1978.
1976.
[Morris, 1988] K. Morris, J. Naughton, Y. Saraiya, J. Ull-
[Chang, 1978] C. Chang, \DEDUCE 2: Further investiga- man, and A. Van Gelder, \YAWN! (Yet Another Window
tions of deduction in relational databases," in Logic and on NAIL!)," IEEE Data Engineering, vol. 10, no. 4, De-
Databases, ed. H. Gallaire, pp. 201-236, New York, 1978. cember 1987, pp. 28-43.
[Chang and Walker, 1984] C. Chang and A. Walker, [Motro, 1986] Amihai Motro, \Query Generalization:
\PROSQL: A Prolog Programming Interface with A Method for interpreting Null Answers", in Ex-
SQL/DS," Proc. of the 1st Intl. Workshop on Expert pert Database Systems, ed. L. Kerschberg, Ben-
Database Systems, Kiawah Island, South Carolina, Octo- jamin/Cummings, Menlo Park CA, 1986.
ber 1984. [Naish and Thom, 1983] L. Naish and J. A. Thom, \The
[Chimenti, et. al., 1987] D. Chimenti, A. O'Hare, R. Krish- MU-Prolog Deductive Database," Technical Report 83-
namurthy, S. Naqvi, S. Tsur, C. West, and C. Zaniolo, 10, Department of Computer Science, University of Mel-
\An Overview of the LDL System," IEEE Data Engi- bourne, Australia, 1983.
neering, vol. 10, no. 4, December 1987, pp. 52{62. [O'Hare, 1987] A. O'Hare, Towards Declarative Control
[Dahl, et. al., 1990] D. Dahl, L. Norton, D. McKay, M. of Computational Deduction, University of Wisconsin{
Linebarger and L. Hirschman, \Management and Evalua- Madison PhD Thesis, June 1987.
tion of Interactive Dialogue in the Air Travel Information [O'Hare, 1989] A. O'Hare, \The Intelligent Database Inter-
System Domain", submitted to The DARPA Workshop face Language", Technical Report, Unisys Paoli Research
on Speech and Natural Language, June 24-27,1990, Hid- Center, June 1989.
den Valley, PA.
[Finin, et. al., 1989] Tim Finin, Rich Fritzson, Don Mckay, [O'Hare and Sheth, 1989] A. O'Hare and A. Sheth, \The
interpreted-compiled range of AI/DB systems", SIGMOD
Robin McEntire, and Tony O'Hare, \The Intelligent Sys- Record, 18(1), March 1989.
tem Server | Delivering AI to Complex Systems", Pro-
ceedings of the IEEE International Workshop on Tools [O'Hare and Travis, 1989] A O'Hare and L. Travis, \The
for Articial Intelligence { Architectures, languages and KMS Inference Engine: Rationale and Design Objec-
Algorithms, March 1990. tives", Technical Report TM-8484/003/00, Unisys - West
Coast Research Center, 1989.
[Fritzson and Finin, 1988] Rich Fritzson and Tim Finin,
\Protem { An Integrated Expert Systems Tool", Tech- [O'Hare and Sheth, 1989] Anthony B. O'Hare and Amit
nical Report LBS Technical Memo Number 84, Unisys Sheth. The architecture of BrAID: A system for e-
Paoli Research Center, May 1988. cient AI/DB Integration. Technical Report PRC-LBS-
8907, Unisys Paoli Research Center, June 1989.
[Hirschman, et. al., 1989] Lynette Hirschman, [Reiter, 1978] R. Reiter, \Deductive Question-Answering
Martha Palmer, John Dowding, Deborah Dahl, Marcia on Relational Data Bases," in Logic and Databases, ed.
Linebarger, Rebecca Passonneau, Francois-Michel Lang, J. Minker, Plenum Press, New York, 1978.
Catherine Ball, and Carl Weir. The pundit natural-
language processing system. In AI Systems in Govern- [Sheth, et. al., 1988] A. Sheth, D. van Buer, S. Russell, and
ment Conference. Computer Society of the IEEE, March S. Dao, \Cache Management System: Preliminary Design
1989. and Evaluation Criteria," Unisys Technical Report TM-
[Intellicorp, 1987] Intellicorp, \KEEConnection: A Bridge 8484/000/00, October 1988.
Between Databases and Knowledge Bases", An Intel- [Stonebreaker, et. al., 1987] M. Stonebraker, E. Hanson
licorp Technical Article, 1987. and S. Potamianos, \A Rule Manager for Relational
[Ioannidis, et. al., 1988] Y, Ioannidis, J. Chen, M. Fried- Database Systems," in The Postgres Papers, M. Stone-
man, and M. Tsangaris, \BERMUDA { An Architec- braker and L. Rowe (eds), Memo UCM/ERL M86/85,
tural Perspective on Interfacing Prolog to a Database Univ. of California, Berkeley, 1987.
Machine," Proceedings of the Second International Con- [Van Buer, et. al., 1985] D. Van Buer, D. McKay, D. Ko-
ference on Expert Database Systems, April 1988. gan, L. Hirschman, M. Heineman, and L. Travis, \The
[Jarke, et. al., 1984] M. Jarke, J. Cliord, and Y. Vassiliou, Flexible Deductive Engine: An Environment for Proto-
\An Optimizing Prolog Front-End to a Relational Query typing Knowledge Based Systems," Proceedings of the
System," Proceedings of the 1984 ACM-SIGMOD Con- Ninth International Joint Conference on Articial Intel-
ference on the Management of Data, Boston, MA, June ligence, Los Angeles CA, August 1985.
1984.
[Kellog, et. al., 1986] C. Kellogg, A. O'Hare, and L. Travis,
\Optimizing the Rule/Data Interface in a Knowledge