Cgworld - Architecture and Features: Lecture Notes in Computer Science July 2002
All content following this page was uploaded by Pavlin Dobrev on 23 April 2014.
The main motivation for creating CGWorld was the need for an application that
allowed Internet access to a knowledge base of Conceptual Graphs (CG). The goal
was to provide various facilities for remote browsing and editing of a KB that resides
on a central server.
Support of different representation formats for CGs was also a high priority. Similarly
to [9, 10, 11], we chose the graphical representation of conceptual graphs as the major
medium for browsing, editing and manipulation of the knowledge base, since it is
easier to use for knowledge engineers and end users who are not CG experts. The other
supported formats were CGIF [5], First Order Logic and a Prolog format [6,7,8].
CGWorld was first introduced at ICCS 2000 [3]. Further development was presented
at ICCS 2001 [1,2]. The main goals followed in the design and development of the
CGWorld workbench are:
(i) to allow for collaborative, distributed acquisition and editing of a CG
knowledge base;
(ii) to provide easy search and navigation in a large KB;
(iii) to maintain different representation languages, thus accommodating the
needs of different users of CGWorld and the different applications the
KB of CGs is used in;
(iv) to provide a graphical editor and viewer for CGs that is easy to use by
non-experts in CG theory;
(v) to integrate and add Web access to previously developed CG
applications, written in different programming languages.
The initial version of CGWorld met many of the needs that motivated its creation. It
had excellent browsing, searching and editing features for a KB of CGs. However, its
support for large volumes of data and for distributed development was not fully
satisfactory because of architectural limitations. Subsequently the architecture of
CGWorld was changed according to the latest developments in the area of multi-tier
web applications. This paper describes several enhancements to this architecture that
increase the scalability, reliability and usability of the application. It also reports on
the addition of new procedures for CG acquisition and the integration of a new
representation format for CGs. These features facilitate the development of a large KB
by multiple parties using different representation formats.
The need for an application like CGWorld arose in the context of projects that required
Natural Language Processing to be built on top of a Conceptual Graphs Knowledge
Base [1,2,3,4,6,7,8]. Initially we concentrated on the functionality required in
this area. We built different representation formats for Conceptual Graphs, the most
used one being the display form, which is understandable by non-specialists, and added
support for CG operations to be used for inference.
The graphical editing facilities are implemented in the Editor, which runs over the
Internet rather than being downloaded and installed locally as in [10]. Another
difference from [10] is that the Knowledge Base is distributed over the Internet and
not loaded from the local computer. CGWorld implements the canonical formation
rules, as does [9]. An added advantage of CGWorld is that its Editor is an applet,
which provides higher security and easier maintenance.
The application layer represents the end user logic. This layer uses the conceptual
layer to implement user-defined functionality (e.g. [2]).
3 Implementation View
In accordance with the new Java technologies, the current release of the CGWorld
workbench uses an application server with support for the Java 2 Enterprise Edition
(J2EE). The set of HTML and Java Server Pages (JSP) and most of the JavaBeans
components described in [3] were reused. Part of the application logic that was
previously developed as a set of JavaBeans is currently implemented as a set of
Session Enterprise JavaBeans. This facilitates the management of user sessions and
allows strict control of user rights. A set of Entity Enterprise JavaBeans represents
the persistent objects that are used to store concepts, relations, contexts, referents,
arcs and information about the knowledge base. The object model is very similar to the
UML model defined in [3, fig 2, p. 247]. This allows large amounts of data to be
maintained, while data integrity is enforced by the container's built-in mechanisms
for transaction maintenance.
The use of Enterprise Java Beans allows the manipulation of larger amounts of data
and supports a larger number of concurrent users. This in turn allows distributed
acquisition and editing of a CG knowledge base. Applications developed on top of J2EE
can be distributed over several computers because most application servers provide this
feature. The J2EE server that we used for development and test purposes was the
Orion Application Server (http://www.orionserver.com), licensed by Oracle and sold
under the name Oracle J2EE container. We are working on an implementation that
can be used with an Open Source J2EE server (e.g. http://www.jboss.org) and we
intend to provide this version (including source code) to the CG community at ICCS
2002.
The Data layer is defined as a set of container-managed persistence Entity Enterprise
Java Beans according to the Enterprise Java Beans 1.1 specification. This means that
it uses the built-in persistence mechanisms of the corresponding container.
An Enterprise Java Bean consists of the remote interface, the home interface and the
bean implementation. The remote interface exposes the methods of the EJB
to the outside world. The home interface specifies how to create and find a bean that
implements the remote interface. The bean implementation provides an
implementation of the methods specified by the remote and home interfaces.
[Fig. 2. UML model of the Arc Entity EJB: bean class ArcBean (arcId, cgId, fromId, toId : Integer), home interface ArcHome, remote interface Arc.]
Fig. 2 contains a UML model of the Arc Entity EJB. Arc is used to persistently store
information about arcs between concepts and relations in a given Conceptual Graph.
Arc is the remote interface of the Arc EJB. It contains methods for accessing the fields
that are stored persistently in the database. ArcBean is the bean implementation. There
is no need to write any code for data persistence: the EJB container handles this
automatically and manages transactions and data integrity. ArcHome is the home
interface of the Arc EJB. It defines methods for creating an Arc and for finding one by
different parameters such as cgId (Id of the Conceptual Graph), fromId (Id of the
Conceptual Object (Concept, Context or Relation) at the beginning of the Arc) and toId
(Id of the Conceptual Object (Concept, Context or Relation) at the end of the Arc).
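The division of roles among the three parts can be illustrated without an EJB container. The following is a minimal plain-Java sketch, not the CGWorld source. The names Arc, ArcHome and ArcBean and the fields arcId, cgId, fromId and toId come from Fig. 2; the method names (create, findByPrimaryKey, findByCgId), the class InMemoryArcHome and the in-memory map that stands in for the container-managed table are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Role of the remote interface: accessors for the persistent fields of Fig. 2.
interface Arc {
    Integer getArcId();
    Integer getCgId();
    Integer getFromId();
    Integer getToId();
}

// Role of the home interface: creating and finding arcs.
// The method names are assumptions, not taken from CGWorld.
interface ArcHome {
    Arc create(Integer arcId, Integer cgId, Integer fromId, Integer toId);
    Arc findByPrimaryKey(Integer arcId); // returns null if absent (sketch only)
    List<Arc> findByCgId(Integer cgId);
}

// Role of the bean implementation: holds the field values.
final class ArcBean implements Arc {
    private final Integer arcId;
    private final Integer cgId;
    private final Integer fromId;
    private final Integer toId;

    ArcBean(Integer arcId, Integer cgId, Integer fromId, Integer toId) {
        this.arcId = arcId;
        this.cgId = cgId;
        this.fromId = fromId;
        this.toId = toId;
    }

    @Override public Integer getArcId() { return arcId; }
    @Override public Integer getCgId() { return cgId; }
    @Override public Integer getFromId() { return fromId; }
    @Override public Integer getToId() { return toId; }
}

// Hypothetical stand-in for the container: one map entry per "row", keyed by arcId.
final class InMemoryArcHome implements ArcHome {
    private final Map<Integer, Arc> rows = new LinkedHashMap<>();

    @Override
    public Arc create(Integer arcId, Integer cgId, Integer fromId, Integer toId) {
        if (rows.containsKey(arcId)) {
            throw new IllegalStateException("duplicate arcId " + arcId);
        }
        Arc arc = new ArcBean(arcId, cgId, fromId, toId);
        rows.put(arcId, arc);
        return arc;
    }

    @Override
    public Arc findByPrimaryKey(Integer arcId) {
        return rows.get(arcId);
    }

    @Override
    public List<Arc> findByCgId(Integer cgId) {
        List<Arc> result = new ArrayList<>();
        for (Arc arc : rows.values()) {
            if (arc.getCgId().equals(cgId)) {
                result.add(arc);
            }
        }
        return result;
    }
}

class ArcDemo {
    public static void main(String[] args) {
        ArcHome home = new InMemoryArcHome();
        // Two arcs of graph 155: concept 55 -> relation 900 -> concept 53.
        // The relation id 900 is made up for this example.
        home.create(1, 155, 55, 900);
        home.create(2, 155, 900, 53);
        for (Arc arc : home.findByCgId(155)) {
            System.out.println(arc.getFromId() + " -> " + arc.getToId());
        }
    }
}
```

In an actual EJB 1.1 deployment the remote and home interfaces would extend javax.ejb.EJBObject and javax.ejb.EJBHome, the bean class would implement javax.ejb.EntityBean, and the container rather than a map would supply storage and transactions.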
The mapping to the database table is defined in the XML deployment descriptor of
the CGWorld application. For the JBoss application server this is defined in the files
ejb-jar.xml, jaws.xml and jboss.xml, located in the META-INF subdirectory of the
application. The Arc bean is stored persistently in the table ARC, which has the fields
ARC_ID, CG_ID, FROM_ID and TO_ID. There is a direct mapping
between EJB instances and rows in the table. For the Arc EJB this means that every
Arc instance in the container has a corresponding row in the table.
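The field-to-column correspondence can be pictured with a small sketch. The table name ARC and the column names are taken from the text above; the class ArcTableMapping, its methods and the generated SQL string are hypothetical and only illustrate the kind of query a container might issue for a finder. With container-managed persistence no such code is written by the developer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the bean-field to column mapping for the Arc EJB.
final class ArcTableMapping {
    static final String TABLE = "ARC";

    // Bean field name -> column name, in the order given in the text.
    static Map<String, String> columns() {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("arcId", "ARC_ID");
        mapping.put("cgId", "CG_ID");
        mapping.put("fromId", "FROM_ID");
        mapping.put("toId", "TO_ID");
        return mapping;
    }

    // Builds a parameterized query for a finder on a single bean field.
    static String finderSql(String field) {
        String column = columns().get(field);
        if (column == null) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return "SELECT " + String.join(", ", columns().values())
                + " FROM " + TABLE + " WHERE " + column + " = ?";
    }
}

class ArcTableMappingDemo {
    public static void main(String[] args) {
        // One row of ARC corresponds to one Arc instance; a finder by cgId
        // selects all rows that belong to one Conceptual Graph.
        System.out.println(ArcTableMapping.finderSql("cgId"));
    }
}
```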
The container loads EJB instances into memory only when they are needed. This
allows large amounts of data to be handled using this model. Another advantage that
we gain from using EJB is that the implementation of the application is independent
of the choice of the particular database. The current version of CGWorld uses the
MySQL database, which is Open Source.
Fig 3 shows a UML Model of the Data Layer that is used to store conceptual
information. Here is a short description of the Entity Enterprise Java Beans given in
the model:
[Fig. 3. UML model of the Data Layer: entity beans RegistryBean, FsBean, ArcBean, TypeBean, ReferentBean, RootsBean and HierarchyBean.]
4 Features
This section describes the procedures for extending the knowledge base with new
CGs and the additional representation format for CGs now supported by CGWorld.
The conceptual graphs formats currently supported by CGWorld are Display Form,
First Order Logic, CGPro format and the newly implemented XCG.
The basic way to add CGs is by manually creating and editing them with the graphical
CG editor. The latest version of CGWorld includes two additional methods for
creating CGs. These are automatic acquisition from natural language and derivation
from existing CGs through canonical formation rules.
The Conceptual Graphs Editor is a user-friendly graphical editor for CGs. It was
described in an earlier paper [3]. Until recently CGs were created only through this
editor.
CG operations can also be used for automatic generation of conceptual graphs from
other CGs in the knowledge base. The inference rules for conceptual graphs supported
by CGWorld are join, generalization, specialization, projection, type extraction and
type contraction. They were implemented for simple graphs, graphs with identity lines
and some special complex graphs. The user can request operations and specify their
arguments through a user-friendly interface. A detailed description and snapshots of
the web interface for the operations can be found in [1].
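Generalization and specialization both depend on the subtype relation of the type hierarchy. The following is a minimal Java sketch of that check, assuming a single-parent, acyclic hierarchy. It is not the CGWorld implementation, which performs CG operations in Prolog; the class TypeHierarchy, its methods and the sample hierarchy (convertible_bond below bond below security) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-parent type hierarchy; assumes no cycles.
final class TypeHierarchy {
    private final Map<String, String> parentOf = new HashMap<>();

    void addSubtype(String subtype, String supertype) {
        parentOf.put(subtype, supertype);
    }

    // True if 'type' equals 'ancestor' or lies below it in the hierarchy.
    boolean isSubtypeOf(String type, String ancestor) {
        for (String current = type; current != null; current = parentOf.get(current)) {
            if (current.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // Generalizing a concept replaces its type by one of its supertypes.
    String generalize(String type, String target) {
        if (!isSubtypeOf(type, target)) {
            throw new IllegalArgumentException(target + " is not a supertype of " + type);
        }
        return target;
    }
}

class TypeHierarchyDemo {
    public static void main(String[] args) {
        TypeHierarchy hierarchy = new TypeHierarchy();
        hierarchy.addSubtype("convertible_bond", "bond");
        hierarchy.addSubtype("bond", "security");
        System.out.println(hierarchy.isSubtypeOf("convertible_bond", "security"));
        System.out.println(hierarchy.generalize("convertible_bond", "bond"));
    }
}
```

Specialization is the reverse direction: a concept type may be replaced by a type for which isSubtypeOf(newType, oldType) holds.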
CGWorld now supports an additional representation format for CGs: XCG, an XML
linearization of a subset of the CG model. XML is widely used as a
platform-independent format for information exchange. Support for this format was
developed according to [12] and Peter Becker’s work in the CGXML
project (http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/tockit/cgxml).
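As a rough illustration of what an XML linearization involves, the sketch below serializes a graph with one binary relation and two concepts. The element and attribute names (relation, concept, number, type) follow the XCG example given in Section 5 of this paper; the class XcgWriter and its methods are hypothetical and are neither the CGWorld nor the CGXML API, and the number value is fixed to "single" as in that example.

```java
// Hypothetical writer for a single binary relation in an XCG-like layout.
final class XcgWriter {
    // Escapes the characters that are special inside XML attribute values.
    static String escape(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // One concept element with a nested number element.
    static String concept(String type, String number) {
        return "  <concept type=\"" + escape(type) + "\">\n"
                + "    <number type=\"" + escape(number) + "\" />\n"
                + "  </concept>\n";
    }

    // A relation element that contains its two argument concepts.
    static String binaryRelation(String relationType, String fromType, String toType) {
        return "<relation type=\"" + escape(relationType) + "\">\n"
                + concept(fromType, "single")
                + concept(toType, "single")
                + "</relation>\n";
    }
}

class XcgWriterDemo {
    public static void main(String[] args) {
        System.out.print(XcgWriter.binaryRelation("convert_into", "bond", "common_stock"));
    }
}
```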
5 Knowledge Base
CGWorld was used to develop a Knowledge Base from the financial domain [12].
This Knowledge Base is an excerpt from the KB of the LARFLAST (LeARning
Foreign LAnguage Scientific Terminology1) Project. Conceptual Graphs are used as a
knowledge representation core in the complex language-learning environment defined
in LARFLAST [4]. In [12] you can find the type hierarchy and Display, CGIF and
CGPro forms of the CGs in this Knowledge base.
As mentioned in [1] the main format used for processing of the Knowledge Base is
Java format. All other formats are translated to/from this format. For better
performance the CGIF and CGPro formats are stored in the database and access to
them is implemented through CgBean. This allows direct implementation of search
through the EJB find methods. The EJB container loads only EJBs that match a given
query. In the previous versions of CGWorld the whole knowledge base was loaded
into memory. Using the components that provide remote interfaces by default allows
the handling of large numbers of user requests without writing additional code. Most
of the current implementations of EJB containers allow clustering of EJBs. Using this
Fig. 4 is an example that shows the conceptual graphs "A convertible bond is one
which is convertible into the company's common stock", "When a bond is converted
to common stock, the corporate debt is reduced" and "A bond is converted into
common stock" in both display and CGIF form.
The other representations that CGWorld supports are CGPro, FOL and XCG. The
graph “A bond is converted into common stock” (Fig. 4) in CGPro is:
cgc(55,simple,'bond',[fs(num,sing)],[]).
cgc(53,simple,'common_stock',[fs(num,sing)],[]).
cg(155,[cgr(convert_into,[55,53],_)],
none,
[fs(kind,'body_of_context'),
fs(comment,'A bond is converted into common stock')]).
In FOL:
exists(A1,exists(A0,convert_into(A0,A1)
& bond(A0) & common_stock(A1)))
In XCG:
<relation type="convert_into">
  <concept type="bond">
    <number type="single" />
  </concept>
  <concept type="common_stock">
    <number type="single" />
  </concept>
</relation>
The XCG, CGPro, CGIF and Java representations are equivalent, and Conceptual
Graphs can be converted from one representation to another. Currently FOL is
supported for a limited number of graphs and only as an output format. The modules
of CGWorld process Conceptual Graphs in both Java and Prolog representations: the
conversion, searching, browsing and editing of Conceptual Graphs are
implemented in Java, while Conceptual Graph operations are implemented in Prolog.
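The FOL output given earlier in this section for "A bond is converted into common stock" follows a regular pattern: one existentially quantified variable per concept, the relation applied to those variables, and one type predicate per concept. The sketch below reproduces that pattern for a single relation. It is an assumption about the shape of the output only, not the CGWorld converter; the class FolWriter and its method are hypothetical, and referents, contexts and multiple relations are not handled.

```java
// Hypothetical generator for the FOL form of a graph with one relation.
final class FolWriter {
    // conceptTypes[i] is the type of the i-th argument of the relation.
    static String relation(String relationType, String... conceptTypes) {
        int n = conceptTypes.length;
        if (n == 0) {
            throw new IllegalArgumentException("a relation needs at least one concept");
        }
        StringBuilder formula = new StringBuilder();
        // Quantifiers are opened from the last variable down to A0.
        for (int i = n - 1; i >= 0; i--) {
            formula.append("exists(A").append(i).append(',');
        }
        // The relation applied to all variables in argument order.
        formula.append(relationType).append('(');
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                formula.append(',');
            }
            formula.append('A').append(i);
        }
        formula.append(')');
        // One type predicate per concept.
        for (int i = 0; i < n; i++) {
            formula.append(" & ").append(conceptTypes[i]).append("(A").append(i).append(')');
        }
        // Close every quantifier.
        for (int i = 0; i < n; i++) {
            formula.append(')');
        }
        return formula.toString();
    }
}

class FolWriterDemo {
    public static void main(String[] args) {
        System.out.println(FolWriter.relation("convert_into", "bond", "common_stock"));
    }
}
```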
6 Used Software
7 Conclusion
During the last three years CGWorld implemented different architectural concepts
and its development reflects the evolution of the authors’ understanding of enterprise
architectures. The general idea was to provide a set of components that can be used as
building blocks for CG applications and the authors continue to work in this direction.
8 Future Work
9 Acknowledgements
10 References