
Appeared in: Proceedings of the 12th International Conference on Entity Relationship Approach, December 15-17, 1993, Arlington, Texas, USA

Extending ER Model Clustering by Relationship Clustering


Peter Jaeschke, Andreas Oberweis, Wolffried Stucky
Institut für Angewandte Informatik und Formale Beschreibungsverfahren
Universität Karlsruhe(TH), D-76128 Karlsruhe, Germany
E-mail: {jaeschke|oberweis|stucky}@aifb.uni-karlsruhe.de
September 1993
Abstract
The entity relationship approach is a widely accepted method for conceptual database design. However,
some problems arise if ER modelling is applied to the design of really large databases concerning whole
enterprises. There is, e.g., no way to obtain a general view or to perceive the global context of a detailed
enterprise scheme with hundreds of entity and relationship types.
Different approaches use the method of ER model clustering to overcome these problems. Whole sections
of the detailed diagram are mapped into so-called entity-clusters, which are presented as (complex)
entity types in a higher level ER diagram. All approaches are based on an already existing detailed ER
diagram. Based on this, the abstraction layers are built bottom-up.
The approach proposed in this paper refines and extends the approaches mentioned above. It also takes
into account the Nested Entity Relationship Model in order to refine relationship types. It can be used in a
top-down database design process as well as in a bottom-up approach. Furthermore, standard diagrams for
industry branches can be designed and individually refined for a specific enterprise.

1. Introduction
The entity relationship approach [Chen76] is a widely accepted method for conceptual database design.
Tools like IEF, IEW, ADW and ORACLE*CASE supporting the information engineering approach of
James Martin [Mart89, Mart90a/b] use the binary entity relationship approach. However, some problems
arise if ER modelling is applied to the design of really large databases concerning whole enterprises.
There is, e.g., no way to obtain a general view or to perceive the global context of a detailed enterprise
scheme with hundreds of entity and relationship types.
[FeMi86, TWBK89, Mist91, RaSt92] propose the method of ER model clustering to gain a general view
and to recognise the global context of a detailed enterprise scheme. They collect whole sections of the
detailed diagram into so-called entity-clusters, which are then represented as (complex) entity types in a
higher level ER diagram. The higher level diagrams can be iteratively abstracted by this method. All
proposals are based on an already existing detailed ER diagram; the abstraction layers are built bottom-up.
The approach proposed in this paper refines and extends the approaches mentioned above. It also takes
into account the Nested Entity Relationship Approach of [CaJA89] in order to refine relationship types.
We distinguish between three kinds of clustering:
(1) Entity clustering, which is introduced by [FeMi86, TWBK89, Mist91, RaSt92].
(2) Simple relationship clustering, which is newly introduced here to refine relationship types by several
semantically similar relationship types. This also supports the modelling of additional integrity
constraints.
(3) Complex relationship clustering, which is proposed here to refine relationship types by whole ER
diagrams.
The extended approach presented in this paper has several advantages:
(1) It can be used especially in a top-down database design process in order to allow the stepwise
analysis and design of the universe of discourse.
(2) It can also be applied to database reengineering. Entity and relationship clusters can be built
bottom-up based on already existing database schemes. Then redesigning as well as extending the original
scheme can be done top-down.
(3) Modifications in one cluster do not have any unpredictable side effects in adjacent clusters.
(4) Standard diagrams for industry branches or special business functions can be designed and
individually refined for specific enterprises.
(5) Simple relationship clustering can be used to formulate additional integrity constraints on
relationship types. However, the presentation of these constraints can be faded out, if only the relationship
type itself without any additional constraints is relevant.
This paper is structured as follows: In section 2 the extended entity relationship model used in this paper
is described. In section 3 the concept of ER clustering is presented and extended by simple relationship
clustering as well as by complex relationship clustering. Rules concerning the different clustering
techniques are proposed. Because most of the CASE-tools supporting the Information Engineering approach
use a binary ER (BER) model, in section 4 the idea of relationship clustering is transferred to the BER
model and compared to the algorithm of [RaSt92].

2. Extended Entity Relationship Model


We assume that the reader is familiar with the basic notation of the ER model and its extensions in
general. An EER model with entity, relationship and aggregation types and generalisation is used.

Fig. 2(1): Entity and relationship types
[Figure: entity types E1 (attributes a1, a2, a3) and E2 (attributes c1, c2) connected by relationship type R
(attributes b1, b2) with the cardinalities (min1, max1) and (min2, max2)]

Entity types are described by their attributes, relationship types by participating entity types and relation-
ship attributes (cf. [ScSt83]). Fig. 2(1) shows the entity types E1 and E2 (rectangle) and relationship type
R (diamond) with single valued attributes (circle): E1:<a1, a2, a3>, E2:<c1, c2>, R:<E1, E2 / b1, b2>
Furthermore, (min, max)-cardinalities are used. The minimum number of relationships of type R in which
an entity of type E_i must participate is given by min_i. The maximum number of relationships of type R in
which an entity of type E_i may participate is given by max_i. A min-cardinality greater than 0 indicates an
existence dependency.
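
The meaning of a (min, max)-cardinality can be illustrated by checking it against a small extension. The following is only a sketch under assumptions: relationships are modelled as tuples of entity names, `position` selects the role of the entity type in the tuple, and `max_card=None` stands for '*'. The function name and the sample data are hypothetical and not part of the paper.

```python
from collections import Counter

def check_cardinality(entities, relationships, position, min_card, max_card=None):
    """Return the entities whose number of participations violates (min, max).

    entities      -- all entities of the entity type under consideration
    relationships -- tuples of entity names, one tuple per relationship
    position      -- index of the entity type within each tuple
    max_card      -- None stands for '*', i.e. no upper bound
    """
    counts = Counter(rel[position] for rel in relationships)
    violations = []
    for entity in entities:
        n = counts[entity]  # Counter yields 0 for entities that never participate
        if n < min_card or (max_card is not None and n > max_card):
            violations.append(entity)
    return violations

# Hypothetical extension: two of three employees work at an airport.
employees = ["e1", "e2", "e3"]
works_at = [("e1", "FRA"), ("e2", "FRA")]

print(check_cardinality(employees, works_at, 0, 0, 1))  # (0,1) is satisfied
print(check_cardinality(employees, works_at, 0, 1, 1))  # (1,1) is violated by e3
```

With a min-cardinality of 1 the entity `e3` is reported, which corresponds to the existence dependency described above.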
Also the concept of weak entity types is supported. E' in Fig. 2(2) is a weak entity type; it is graphically
denoted as a double-lined rectangle. An entity of type E' can neither exist without a counterpart of type E,
which is called strong entity type, nor be identified without knowing the relationship to that corresponding
counterpart. Therefore a weak entity type is existence dependent and identifier dependent. The identifier
dependency is represented as a directed edge in an ER diagram. A weak entity type can be identifier
dependent on more than one strong entity type. The (min, max)-cardinalities of a weak entity type are
always (1,1) with respect to the corresponding strong entity types.

Fig. 2(2): Strong entity type E with weak entity type E'
[Figure: E with cardinality (min, max) connected via R to E' with cardinality (1, 1)]

An entity identifier K of an entity type E:<A> is a subset of the set of attributes A which identifies each
entity of type E at every time. In addition to that K must be minimal, i.e. there is no proper subset K' of K
which also identifies the entities of type E at every time.
Let a weak entity type E':<A> be given which is identifier dependent with respect to the relationship
types R1, R2, ..., Rn; n ≥ 1. Then the weak entity identifier of E' is K = {R1, R2, ..., Rn} ∪ A'. A' de-
notes a subset of the set of attributes A, such that K identifies each entity of type E' at every time. In ad-
dition to that K must be minimal, i.e. there is no proper subset K' of K which also identifies the entities of
type E' at every time.
A relationship identifier K of a relationship type R is a subset of the participating entity types of which
the extension identifies each relationship of type R at every time. In addition to that K must be minimal,
i.e. there is no proper subset K' of K which identifies the relationships of type R at every time.
If several different identifiers for an entity type or relationship type exist, one of them is selected as pri-
mary identifier E# or R#.
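
The minimality condition on identifiers can be made concrete with a brute-force search over attribute subsets. This is a sketch under a strong simplifying assumption: it inspects one given extension only, whereas the definitions above require identification "at every time". The function names and the sample rows are hypothetical.

```python
from itertools import combinations

def identifies(rows, attrs):
    """True if the values of attrs are pairwise distinct over all rows."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def minimal_identifiers(rows, attributes):
    """Return all minimal attribute sets that identify every row.

    Candidates are enumerated by increasing size, so a candidate that
    contains an identifier found earlier is not minimal and is skipped.
    """
    found = []
    for size in range(1, len(attributes) + 1):
        for candidate in combinations(attributes, size):
            if any(set(k) <= set(candidate) for k in found):
                continue
            if identifies(rows, candidate):
                found.append(candidate)
    return found

# Hypothetical extension of an entity type E:<a1, a2, a3>.
rows = [
    {"a1": 1, "a2": "x", "a3": "p"},
    {"a1": 2, "a2": "x", "a3": "q"},
    {"a1": 3, "a2": "y", "a3": "p"},
]
print(minimal_identifiers(rows, ["a1", "a2", "a3"]))
```

For this extension both {a1} and {a2, a3} are minimal identifiers, so one of them would have to be selected as primary identifier E#.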

Now various extensions to the original ER model are considered [ScSc79, ScSW79, SaNF79]. A relationship
between entities can be regarded as a higher level entity called an aggregation. Aggregation types can
participate in other relationship types. In Fig. 2(3) the aggregation type is based on the relationship type
R1. The description of an aggregation type is equal to the description of the underlying relationship type:
R1:<A, B, C, D>, R2:<R1, E>.

Fig. 2(3): Aggregation type R1
[Figure: relationship type R1 between the entity types A, B, C and D, regarded as an aggregation type that
participates together with entity type E in relationship type R2]

Fig. 2(4): Generalisation
[Figure: supertype E annotated with (x,y) and its subtypes E1 ... Ei ... En]

Generalisation is an abstraction in which a set of similar entity types is regarded as a generic entity type.
All attributes of the supertype are defined on all subtypes, too. Entities of any subtype may participate in
any relationship type which is connected to the supertype. Additional attributes can be defined on each
subtype. Also additional relationship types, in which a subtype participates, can be introduced. The
subtypes are distinguished by a criterion (e.g. a value of an attribute or of a combination of attributes or
any other characteristic). The subtypes are either non-overlapping (y = N) or overlapping (y = O). If every
entity of type E also must be of at least one of the types E1, E2, ..., En then the generalisation is total
(x = T). If this is not required, then the generalisation is partial (x = P).
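
The (x,y) annotation of a generalisation can be illustrated by deriving it from a concrete extension. This is a minimal sketch, assuming that extensions are given as Python sets. It classifies one snapshot only, whereas the annotation in a diagram is a constraint on all database states. The function name and the data are hypothetical.

```python
def classify_generalisation(supertype, subtypes):
    """Return (x, y) for one extension of a generalisation.

    supertype -- set of all entities of the supertype E
    subtypes  -- dict mapping each subtype name to its set of entities
    x is 'T' (total) or 'P' (partial); y is 'N' (non-overlapping) or 'O'.
    """
    covered = set().union(*subtypes.values())
    x = "T" if supertype <= covered else "P"
    names = list(subtypes)
    overlapping = any(
        subtypes[a] & subtypes[b]
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    return x, ("O" if overlapping else "N")

# Hypothetical extension of 'Employee' with two subtypes.
employee = {"ann", "bob", "cy"}
print(classify_generalisation(
    employee, {"Crew Member": {"ann"}, "Ground Staff": {"bob", "cy"}}))
```

Removing `cy` from 'Ground Staff' would make the snapshot partial, and adding `ann` to both subtypes would make it overlapping.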

3. Entity and Relationship Clustering


3.1. Related Approaches
Fig. 3.1(1): Hierarchy of ER diagrams (Fig. 3.1(1a) is taken from [FeMi86])
[Figure: panels a and b, each showing three levels: high-level diagram, subject area diagram and
information area diagram. In panel b the major entity type A reappears on all levels.]

Entity clustering is first proposed in [FeMi86] as a bottom-up method. An overview diagram neglecting
several details is created from a detailed ER diagram. Whole sections of the detailed diagram are collected
into so-called entity-clusters, which are represented as (complex) entity types in a higher level ER
diagram. The detailed relationship types between entity types existing in one cluster disappear in the
higher level ER diagram. The others - so-called outside-relationship types - are transformed to relationship
types between the clusters containing the original detailed entity types. The higher level diagram can be
abstracted iteratively by this method. An abstraction hierarchy can be built as shown in Fig. 3.1(1b).
So-called major entity types are identified. Major entity types, e.g. A in Fig. 3.1(1b), are regarded as being
very important and reappear in different lower level diagrams. At the top level only the most important
major entity types and the relationship types between them are shown. In the approach of [FeMi86], the
major entity types are not refined, e.g. entity type A in Fig. 3.1(1b) is the same one at all levels.
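
The treatment of inside and outside relationship types described above can be sketched as a mapping from a detailed diagram to a higher level diagram. This is an illustration under assumptions, not the procedure of [FeMi86]: relationship types are given as tuples of participating entity types, `cluster_of` assigns entity types to clusters, and unclustered entity types are carried over unchanged. The entity type names are borrowed from the airline example of this paper. The function name is hypothetical.

```python
def lift_to_clusters(relationships, cluster_of):
    """Derive the relationship types of the higher level diagram.

    relationships -- dict: relationship type name -> tuple of entity types
    cluster_of    -- dict: entity type -> name of the cluster containing it
    Relationship types whose participants all lie in one cluster disappear;
    the remaining (outside) ones are re-attached to the clusters.
    """
    lifted = {}
    for name, participants in relationships.items():
        targets = tuple(cluster_of.get(p, p) for p in participants)
        inside_one_cluster = (
            len(set(targets)) == 1
            and all(p in cluster_of for p in participants)
        )
        if not inside_one_cluster:
            lifted[name] = targets
    return lifted

detailed = {
    "of type": ("Airplane", "Airplane Type"),
    "works at": ("Ground Staff", "Airport"),
}
clusters = {
    "Airplane": "Airplane",
    "Airplane Type": "Airplane",
    "Ground Staff": "Employee",
    "Crew Member": "Employee",
}
print(lift_to_clusters(detailed, clusters))
```

Here 'of type' lies completely inside the cluster 'Airplane' and is hidden, while 'works at' becomes a relationship type between the cluster 'Employee' and the entity type 'Airport'. Applying the function again with a coarser `cluster_of` mapping corresponds to the iterative abstraction.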
In contrast to that, the approaches of [Mist91, RaSt92, TWBK89] use the major entity types as the
centres of the clusters. The entity types around a major entity type are clustered with it. The clusters are
usually named like the major entity types which they are based on.
All three approaches [FeMi86, Mist91, TWBK89] require that the entities collected together in one clus-
ter are all owned by exactly one functional area. A functional area is a part of the ER diagram describing
the objects which are used by a specific business function. Business functions can be refined or decom-
posed hierarchically. The hierarchy of ER diagrams is built bottom-up based on the hierarchy of business
functions. [FeMi86, Mist91, RaSt92] use a binary ER approach while [TWBK89] uses an EER approach.
The approach of [FeMi86] uses business units or business functions and a more intuitive concept to form
the clusters. The approach of [Mist91] as well as the algorithm of [RaSt92] is based upon a hierarchically
organised detailed binary ER scheme, always fading out parts of the hierarchy corresponding to the
hierarchy of entity clusters. While [Mist91] requires non-overlapping clusters, [RaSt92] also allows
overlapping clusters. The algorithm of [RaSt92] is based only on the structure of the diagram; no
functional areas are considered. Because [Mist91, RaSt92] use the binary ER approach without any complex
relationship types, like (m:n)-relationship types, it can be argued that they reduce the approach of
[TWBK89] to the concept of iterative dominance grouping represented there.
The example used by [RaSt92] is also used in this paper. A stepwise top-down design of an ER-diagram
for an airline company, which has ground staff at different airports, is presented. First entity clustering as
used in this paper is explained, then simple relationship clustering and finally complex relationship
clustering is demonstrated. Bold lines are used to draw entity and relationship clusters. The top-level
diagram is shown in Fig. 3.1(2).

Fig 3.1(2): Top-level diagram of the airline scheme
[Figure: 'Employee', 'Airport', 'Flight' and 'Airplane' connected by 'works at', 'from to' and 'flies with']

3.2. Entity Clustering


In this paper dominance grouping and abstraction grouping which are introduced in [TWBK89] are used
for entity clustering.

Fig. 3.2(1): Dominance grouping
[Figure: cluster 'Airplane' containing 'Airplane Type' (0,*), 'of type' and 'Airplane' (1,1)]

Two kinds of dominance grouping have to be distinguished.
• A strong entity type and its identifier dependent weak entity types form a cluster which is regarded as a
higher level entity type. It must be emphasised that only weak entity types which are identifier dependent
on exactly one strong entity type are considered.
• Also normal entity types participating in (1:n)-relationship types can be clustered towards the
dominating entity type if they are existence dependent (min-cardinality > 0) on it. It is also required that
this existence dependency is only due to one single entity type. In Fig. 3.2(1) the entity type 'Airplane
Type' is the dominating entity type whereas 'Airplane' is the dominated one.
The cluster can be named either like the dominating or like the dominated entity type. If the dominating
entities are used to classify the entities of the dominated type as in our example, the cluster is named like
the dominated type. If the dominated entity types describe details of the dominating entity type, the
cluster is named like the dominating one. The latter is mostly used if a strong entity type and its weak en-
tity types are regarded.
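
The second kind of dominance grouping can be sketched as a search over binary relationship types with (min, max)-cardinalities. This is only an illustration under assumptions: the class `BinaryRel`, the function name and the encoding of '*' as `None` are hypothetical, and the cardinalities of the example are read from Fig. 3.2(1) and Fig. 3.1(2).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Card = Tuple[int, Optional[int]]  # (min, max); max None stands for '*'

@dataclass(frozen=True)
class BinaryRel:
    name: str
    left: str
    left_card: Card
    right: str
    right_card: Card

def dominated_types(rels):
    """Map each dominated entity type to its dominating entity type.

    An entity type qualifies if it is existence dependent (min > 0) in a
    (1:n)-relationship type and has no existence dependency on any other
    entity type.
    """
    depends_on = {}   # entity type -> entity types it is existence dependent on
    one_to_n = set()  # (dominated, dominating) candidates
    for r in rels:
        sides = (
            (r.left, r.left_card, r.right, r.right_card),
            (r.right, r.right_card, r.left, r.left_card),
        )
        for me, my_card, other, other_card in sides:
            if my_card[0] > 0:
                depends_on.setdefault(me, set()).add(other)
                if my_card[1] == 1 and other_card[1] != 1:
                    one_to_n.add((me, other))
    return {me: other for me, other in one_to_n if depends_on[me] == {other}}

rels = [
    BinaryRel("of type", "Airplane", (1, 1), "Airplane Type", (0, None)),
    BinaryRel("works at", "Employee", (0, 1), "Airport", (0, None)),
]
print(dominated_types(rels))
```

'Airplane' is reported as dominated by 'Airplane Type', while 'Employee' is not clustered towards 'Airport' because its min-cardinality is 0. Which name the resulting cluster receives is a separate decision, as discussed above.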

Abstraction grouping [TWBK89] is used
• to cluster the participating entity types of an aggregation type with the aggregation type.
• to cluster the element types of a grouping type with the grouping type.
• to cluster subtypes ('Crew Member', 'Ground Staff') with their supertypes ('Employee').
In this paper only the last case is considered, which is mainly used in [TWBK89], too. The resulting
clusters are named like the supertypes.

Fig 3.2(2): Abstraction grouping
[Figure: cluster 'Employee' containing the supertype 'Employee' (T, N) with the subtypes 'Crew Member'
and 'Ground Staff']
[TWBK89] distinguishes between dominance grouping, abstraction grouping, constraint grouping and
three different kinds of relationship grouping. Dominance grouping has the highest priority while rela-
tionship grouping has the lowest priority. However the notion of relationship clustering used in this paper
does not correspond to the notion of relationship grouping used there.
We use only the two concepts with the highest priority for entity clustering. Because of this the entity
clusters retain clear semantics very similar to the semantics of the major entity type. The other kinds of
grouping with lower priority integrate aspects of relationship types into entity clusters. For these aspects
we use simple or complex relationship clustering instead, in order to keep clear semantics. To avoid
integrating relationship aspects into entity clusters we give the following rules on entity clustering;
consequently, we use neither constraint grouping nor relationship grouping in general.
Rules on Entity Clustering:
• Only abstraction grouping based on generalisation and dominance grouping are used as described
above.
• Within an entity cluster only relationship types, simple relationship clusters, entity types and entity clus-
ters are allowed.
• Within the entity cluster only the entity types and entity clusters can have outside relationships, i.e. a
relationship type or cluster within an entity cluster must not be interpreted as aggregation type with
respect to relationship types or clusters outside the entity cluster.

3.3. Simple Relationship Clustering


Simple relationship clustering is used to formulate integrity constraints more precisely. In Fig. 3.3(1) the
relationship type 'works at' is refined by simple relationship clustering in the context of the refinement of
'Employee'. Now it is visible that only members of the ground staff work at airports and that every
member of the ground staff works at exactly one airport.

Fig 3.3(1): Refinement of 'Employee' based on entity clustering; refinement of 'works at' based on simple
relationship clustering
[Figure: upper part 'Employee' (0,1), 'works at', 'Airport' (0,*); lower part 'Employee' with the subtypes
'Crew Member' and 'Ground Staff', where 'Ground Staff' (1,1) is connected via 'works at' to 'Airport' (0,*)]
Simple relationship clustering can be also applied to represent additional integrity constraints in an entity
relationship diagram and to cluster semantically similar relationship types into one. To demonstrate this,
another example is used.
Sometimes there are two or more entity types with subtypes connected by one relationship type, as
shown in Fig. 3.3(2). The example is taken from an industrial enterprise.

Fig. 3.3(2): Section of an ER diagram of an industrial enterprise: Customers order goods in certain
packagings.
[Figure: 'Customer' (T, N) with the subtypes 'Manufacturer', 'Wholesaler' and 'Retailer'; 'Goods' (T, N)
with the subtypes 'Special Product' (T, N), 'Product' and 'Trade Goods', where 'Special Product' is divided
into 'Special Product for Manufacturers' and 'Special Product for Wholesalers'; 'Packaging' (T, N) with the
subtypes 'Standard Packaging' and 'Special Packaging'; all three connected by 'Order' with (0,*)]

Often there is an interdependency between the different subtypes with respect to the relationship type.
For this example, the interdependencies are defined as follows:
• Only retailers can order trade goods.
• Only manufacturers and wholesalers can order special products. Special products are either for manu-
facturers or for wholesalers.
• For any special product, at least one order must exist.
• Retailers can only order goods using standard packaging, while wholesalers and manufacturers can also
order goods using special packaging.
• Products can be ordered with special or with standard packaging.
• Special products can only be ordered with special packaging, trade goods only with standard packaging.
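
The constraints listed above can be written down as a predicate on a single order, which is roughly what an application module would have to enforce today. This is a sketch under assumptions: the subtype names are taken from Fig. 3.3(2), the reading that a special product for manufacturers can only be ordered by manufacturers (and likewise for wholesalers) is an interpretation of the second constraint, and the constraint that at least one order must exist for any special product is a cardinality on the whole extension and is therefore not covered. The function name is hypothetical.

```python
from itertools import product

CUSTOMERS = ("Manufacturer", "Wholesaler", "Retailer")
GOODS = (
    "Special Product for Manufacturers",
    "Special Product for Wholesalers",
    "Product",
    "Trade Goods",
)
PACKAGINGS = ("Standard Packaging", "Special Packaging")

# Assumption: each kind of special product is tied to one kind of customer.
SPECIAL_PRODUCT_FOR = {
    "Special Product for Manufacturers": "Manufacturer",
    "Special Product for Wholesalers": "Wholesaler",
}

def order_allowed(customer, goods, packaging):
    """Check one order against the interdependencies of the subtypes."""
    if goods == "Trade Goods":
        # Only retailers order trade goods, and only with standard packaging.
        return customer == "Retailer" and packaging == "Standard Packaging"
    if goods in SPECIAL_PRODUCT_FOR:
        # Special products need the matching customer and special packaging.
        return (customer == SPECIAL_PRODUCT_FOR[goods]
                and packaging == "Special Packaging")
    # Ordinary products: retailers are restricted to standard packaging.
    return customer != "Retailer" or packaging == "Standard Packaging"

allowed = [t for t in product(CUSTOMERS, GOODS, PACKAGINGS) if order_allowed(*t)]
pairs = sorted({(c, g) for c, g, _ in allowed})
print(len(allowed), len(pairs))
```

Under these assumptions 8 of the 24 possible combinations are admissible, spread over 6 combinations of customer subtype and goods subtype. A refinement of 'Order' into one relationship type per admissible combination makes such restrictions visible in the diagram itself.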
In the following, abstraction grouping is used to cluster the participating entity types with their subtypes.
Simple relationship clustering is used to represent the additional integrity constraints in an ER diagram
(Fig. 3.3(3)). In the upper part of Fig. 3.3(3), the overview diagram is shown, below the refined entity
clusters as well as the refined relationship cluster are shown.
Fig. 3.3(3): Refinement of Fig. 3.3(2) modelling the additional integrity constraints.
[Figure: upper part: overview diagram with 'Customer', 'Order', 'Goods' and 'Packaging'; lower part: the
refined entity clusters 'Customer', 'Goods' and 'Packaging' and the relationship cluster 'Order', refined
into the relationship types 'Order 1' to 'Order 6']
Normally at least the relationship cluster will be faded out as shown in Fig. 3.3(2). There is no necessity
to show these details in order to understand the relevant aspects of the universe of discourse. Nowadays,
the additional constraints of this kind are typically implemented in and controlled by application modules
and not by the database itself. Therefore they are important when the specific module is implemented or
maintained. Also it is possible that the modelled situation is the regular one but that exceptions are
allowed in special cases.
A refinement is called contextsensitive if not only one single relationship or entity cluster is refined but
all adjacent clusters are refined as well (Fig. 3.3(3)).
The following rules are given in order to restrict simple relationship clustering to cases in which (min,
max)-cardinalities or additional integrity constraints must be modelled more precisely.
Rules on Simple Relationship Clustering:
• A simple relationship cluster is refined by a set of relationship types or simple relationship clusters.
• In a simple relationship cluster only entity clusters (as in the examples above), entity types, simple
relationship clusters (interpreted as aggregation types) or relationship types (interpreted as aggregation
types) can participate. Conversely, complex relationship clusters - as described in Section 3.4. - must not
participate in a simple relationship cluster.
• In the contextsensitive refinement of a simple relationship cluster R the following requirements must be
fulfilled for every relationship type or simple relationship cluster of the refinement:
    • For every entity cluster E participating in R an entity type or cluster of E participates in every
    refinement of R.
    • For every simple relationship cluster R' participating in R a relationship type or simple relationship
    cluster of R' participates in every refinement of R.
    • Every entity type E participating in R also participates in every refinement of R.
    • Every relationship type R' participating in R also participates in every refinement of R.
The rules concerning relationship types or clusters must be applied in the same way to aggregation types
or clusters.
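
The requirements on a contextsensitive refinement of a simple relationship cluster can be checked mechanically. The following sketch assumes a very simple encoding: `members` maps each participating cluster to the types it contains, plain entity or relationship types stand for themselves, and `refinement` maps each refining relationship type to its participants. The function name and the encoding are hypothetical.

```python
def check_simple_refinement(participants, members, refinement):
    """Return (refining relationship type, participant of R) pairs that
    violate the rule that every participant of R must be represented in
    EVERY relationship type of the refinement.

    participants -- entity/relationship types or clusters participating in R
    members      -- dict: cluster name -> set of types contained in it
    refinement   -- dict: refining relationship type -> tuple of participants
    """
    violations = []
    for rel_name, rel_participants in refinement.items():
        for p in participants:
            representatives = members.get(p, {p})  # a plain type represents itself
            if not representatives & set(rel_participants):
                violations.append((rel_name, p))
    return violations

# 'works at' from Fig. 3.3(1): cluster 'Employee' and entity type 'Airport'.
members = {"Employee": {"Crew Member", "Ground Staff"}}
ok = {"works at": ("Ground Staff", "Airport")}
print(check_simple_refinement(["Employee", "Airport"], members, ok))
```

The refinement of Fig. 3.3(1) passes the check. A refining relationship type that omitted any member of 'Employee' would be reported together with the missing participant.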

3.4. Complex Relationship Clustering


Now we introduce the concept of complex relationship clustering as shown in Fig. 3.4(1). The aggregation
type 'Flight' is refined. Aggregation types are refined like relationship types. In contrast to simple
relationship clustering not only the relationship type is divided up into several similar relationship types,
but also additional new entity and relationship types are introduced. A flight consists of different flight
sections in order to allow travels with intermediate landings.

Fig 3.4(1): Refinement of 'Flight' based on complex relationship clustering
[Figure: upper part 'Airport' (0,*) (0,*) 'from to' with the aggregation type 'Flight'; lower part 'Airport'
(0,*) (0,*) 'from to' with 'Section', which is connected via 'is part of' to 'Flight']

Now the relationship type 'flies with' is to be refined. The contextsensitive refinement is shown in Fig.
3.4(2). Not only entity clusters but also relationship clusters can participate in another relationship
cluster; e.g. 'Flight' participates in 'flies with' and is refined based on relationship clustering.
We distinguish between contextsensitive and non-contextsensitive refinement. A single element can be
refined together with its neighbourhood as a whole (contextsensitive refinement, cf. Fig. 3.4(2)) or it can
be refined alone (non-contextsensitive refinement, cf. Fig. 3.4(3)). It is also possible that only 'Airplane'
or only 'Flight' is refined.
Fig 3.4(2): Contextsensitive refinement of 'flies with' based on complex relationship clustering
[Figure: upper part 'Employee', 'Flight' and 'Airplane' connected by 'flies with'; lower part the refined
clusters with 'Employee' (T, N), 'Crew Member', 'Ground Staff', 'Airplane Type', 'Airplane', 'Section',
'Flight' and 'Flight Section Execution', connected by 'qualified for', 'of type', 'take part', 'is part of',
'used for' and 'for']

Fig 3.4(3): Refinement of 'flies with' based on complex relationship clustering without considering
the participating clusters
[Figure: upper part 'Employee', 'Flight' and 'Airplane' connected by 'flies with'; lower part the unrefined
clusters 'Employee', 'Airplane' and 'Flight' together with 'Flight Section Execution', connected by
'qualified for', 'take part', 'used for' and 'for']

Rules on Complex Relationship Clustering:


• A complex relationship cluster is refined by relationship types as well as by simple and complex
relationship clusters. Also new entity types and new entity clusters can be introduced.
• In the contextsensitive refinement of a complex relationship cluster R the following requirements must
be fulfilled:
    • For every entity cluster E participating in R, an entity type or cluster of E participates in at least one
    relationship type or cluster of the refinement of R.
    • For every simple relationship cluster R' participating in R a relationship type or simple relationship
    cluster of R' participates in at least one relationship type or cluster of the refinement of R.
    • Every entity type E participating in R also participates in at least one relationship type or cluster of
    the refinement of R.
    • Every relationship type R' participating in R also participates in at least one relationship type or
    cluster of the refinement of R.
The rules concerning relationship types or clusters must be applied in the same way to aggregation types
or clusters.
Finally Fig. 3.4(4) shows the complete refinement, i.e. the detail diagram of our example.

Fig 3.4(4): Detail diagram of the airline company scheme
[Figure: 'Employee' (T, N) with 'Crew Member' and 'Ground Staff', 'Airport', 'Airplane Type', 'Airplane',
'Section', 'Flight' and 'Flight Section Execution', connected by 'works at', 'qualified for', 'from to',
'take part', 'of type', 'is part of', 'used for' and 'for']
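
For complex relationship clusters the requirement is weaker than for simple ones: each participant of R only has to be represented in at least one element of the refinement. A minimal sketch of this check follows. The encoding is hypothetical, and the participants assigned to the refining relationship types are an assumption, since they cannot be read reliably from the figures reproduced here.

```python
def check_complex_refinement(participants, members, refinement):
    """Return the participants of R that are represented in NO relationship
    type or cluster of the refinement of R.

    participants -- entity/relationship types or clusters participating in R
    members      -- dict: cluster name -> set of types contained in it
    refinement   -- dict: refining relationship type -> tuple of participants
    """
    used = set()
    for rel_participants in refinement.values():
        used.update(rel_participants)
    # A plain entity or relationship type represents itself.
    return [p for p in participants if not members.get(p, {p}) & used]

# 'flies with' with the clusters 'Employee', 'Flight' and 'Airplane'.
members = {
    "Employee": {"Crew Member", "Ground Staff"},
    "Flight": {"Flight", "Section"},
    "Airplane": {"Airplane", "Airplane Type"},
}
refinement = {  # assumed participants, for illustration only
    "qualified for": ("Crew Member", "Airplane Type"),
    "take part": ("Crew Member", "Flight Section Execution"),
    "used for": ("Airplane", "Flight Section Execution"),
    "for": ("Flight", "Flight Section Execution"),
}
print(check_complex_refinement(["Employee", "Flight", "Airplane"], members, refinement))
```

New types such as 'Flight Section Execution' may appear freely in the refinement; the check only ensures that no participant of the refined cluster is left unconnected.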

3.5. Comparison to Related Approaches


Now the bottom-up approach of [TWBK89] is applied to the example. Dominance grouping is used to
create the clusters 'Airplane Type' and 'Flight' (Fig. 3.5(1)). In order to create the cluster 'Employee'
dominance grouping, abstraction grouping on generalisation and relationship grouping on unary
relationship types are used. Fig. 3.5(1) shows the creation of entity clusters while Fig. 3.5(2) shows the
higher level diagram. Further clustering is difficult to enforce, because the remaining objects can be
interpreted as entities as well as relationships.

Fig. 3.5(1): The creation of entity clusters
[Figure: detail diagram of the airline company scheme with the entity clusters 'Employee', 'Airplane Type'
and 'Flight' marked]
Entity clustering in the sense of [FeMi86, TWBK89, Mist91, RaSt92] is rather a visual instrument to
present large ER diagrams than a tool to support the design or redesign of large databases directly. All
approaches are bottom-up oriented. Therefore the changes always have to be made at the detailed level
and the clustering process must be repeated.

Fig. 3.5(2): Higher level diagram
[Figure: 'Employee', 'Airplane Type', 'Flight' and 'Flight Section Execution' connected by 'qualified for',
'take part' and 'is part of']

The main idea of relationship clustering is to determine the major entity types and the coarse
relationship types between them. Then these relationship types are refined iteratively top down by
complex and simple relationship clustering, also involving entity clustering. After determining the major
entity types, the detailed design of the different relationship clusters can be realised simultaneously and
independently by different project groups.
The relationship clusters are non-overlapping; the connections to other relationship clusters are only
represented by the entity clusters or relationship clusters participating in other relationship clusters. So a
redesign of one cluster cannot have unpredictable side effects in other clusters if the changes are made in
context to the adjacent clusters.
The relationship clusters can be related to different business functions or functional areas. This is useful
because business functions tend to create, change or delete relationships between the major entities
rather than change the major entities themselves.

4. The Concept of Relationship Clustering in a BER approach


Most of the CASE-tools supporting the Information Engineering approach use a binary ER (BER) model.
Hence, the idea of relationship clustering is now transferred to the BER model.
In Fig. 4(1) the example of section 3 is transferred to a binary ER approach without generalisation. All
(m:n)-relationship types are transformed to weak entity types with two strong entity types. Therefore
relationship clustering can be called weak entity clustering and relationship clusters are transformed to
weak entity clusters. Fig. 4(2) shows the overview diagram of the first design step.

Fig 4(1): The scheme of the airline company modelled in a binary ER model.
[Figure: 'Ground Staff', 'Employee', 'Crew Member', 'Airport', 'Rating', 'Member Action', 'Airplane
Type', 'Section', 'Flight Section Execution', 'Airplane' and 'Flight', connected by binary relationship
types such as 'works at', 'from to', 'of type' and 'used for']

At the higher level diagram the edges being directed towards the weak entity types cannot be interpreted
as identifier dependencies; they are rather used to characterise the relationship aspects of weak entity
types with more than one strong entity type. The problem of identifiers and attributes of clusters
interpreted as complex objects is a topic of further research based on the Nested Entity Relationship
approach of [CaJA89].

Fig. 4(2): ER scheme of the airline company in the first design step
[Figure: 'Employee', 'Airport', 'Flight', 'Airplane' and the weak entity cluster 'Flight Section Execution',
connected by 'works at', 'from to' and 'used for']
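
The transformation of an (m:n)-relationship type into a weak entity type with two strong entity types, as described above, can be sketched as follows. The output format and the function name are hypothetical. That the weak entity type 'Rating' of Fig. 4(1) stems from 'qualified for' is an assumption made for the example. The cardinality (1,1) on the side of the weak entity type follows the rule given in section 2.

```python
def to_weak_entity(weak_name, left, left_card, right, right_card):
    """Replace an (m:n)-relationship type by a weak entity type.

    The former participants keep their (min, max)-cardinalities and become
    the strong entity types; the weak entity type is always (1,1) with
    respect to each of them. '*' denotes an unbounded maximum.
    """
    return {
        "weak_entity": weak_name,
        "relationships": [
            # (strong type, its cardinality, weak type, its cardinality)
            (left, left_card, weak_name, (1, 1)),
            (right, right_card, weak_name, (1, 1)),
        ],
    }

# Assumed origin of 'Rating': 'qualified for' between crew members and airplane types.
rating = to_weak_entity(
    "Rating", "Crew Member", (0, "*"), "Airplane Type", (0, "*"))
for rel in rating["relationships"]:
    print(rel)
```

Because every relationship cluster of the EER approach ends up as such a weak entity type or as a group of them, the clustering rules of section 3 carry over to weak entity clusters.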
The entity clusters are created as before. The weak entity clusters are created analogously to complex
relationship clustering. The entity types of the cluster 'Flight Section Execution' in Fig. 4(3) are not
shaded. The concept of simple weak entity clustering can be transferred from simple relationship
clustering, just as the concept of complex weak entity clustering can be derived from complex
relationship clustering.

Fig 4(3): Contextsensitive refinement of the weak entity cluster 'Flight Section Execution'
[Figure: upper part 'Employee', 'Flight', 'Airplane' and the weak entity cluster 'Flight Section Execution'
with 'used for'; lower part the refinement with 'Employee', 'Ground Staff', 'Crew Member', 'Rating',
'Member Action', 'Airplane Type', 'Section', 'Flight Section Execution', 'Airplane' and 'Flight']

Now the algorithm of [RaSt92] is applied to the example. Fig. 4(4) shows the overlapping entity clusters;
they are named like the bold lined entity types. Most of the entity types which are contained in more than
one cluster have either relationship characteristics or have been derived from (m:n)-relationship types.
Fig. 4(5) shows the overview diagram as created by the algorithm of [RaSt92].

[Figure: binary ER scheme of the airline company with overlapping entity clusters, accompanied by a
diagram of the clusters 'Employee', 'Airport', 'Airplane Type' and 'Flight' whose connecting edges are
numbered 1 to 9. Legend: 1 Rating; 2 Flight Section Execution; 3 Member Action; 4 Member Action;
5 Ground Staff; 6 Flight Section Execution; 7 Flight Section Execution; 8 Flight Section; 9 Flight Section]
Fig. 4(4): Entity clusters after applying the algorithm of [RaSt92]

5. Conclusions
In this paper the approaches to entity model clustering have been extended to allow top-down design.
The main idea is to determine the major entity types and the coarse relationship types between them.
Then these relationship types are refined iteratively by complex and simple relationship clustering, also
involving entity clustering. After determining the major entity types, the detailed design of the different
relationship clusters can be realised simultaneously and independently by different project groups. This
approach also supports database reengineering. The clusters can be built bottom-up based on already
existing database schemes while the redesign is realised top-down.
The relationship clusters are non-overlapping. Connections to other relationship clusters are represented only by the entity clusters or relationship clusters that participate in them. A redesign of one cluster therefore cannot have unpredictable side effects in other clusters, provided the changes are made with regard to the adjacent clusters.
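The non-overlap property can be illustrated with a small check. The following is only a minimal sketch under stated assumptions: the dictionary layout, the cluster names and the helper functions `overlapping_pairs` and `adjacent_clusters` are hypothetical and not part of the method described in this paper. Each relationship cluster is assumed to own a set of relationship types and to reference the entity clusters that participate in it.

```python
from itertools import combinations

# Hypothetical example data (not taken from the paper's figures):
# "owns" holds the relationship types refined inside the cluster,
# "participants" holds the entity clusters the cluster connects.
clusters = {
    "Order": {"owns": {"places", "contains"},
              "participants": {"Customer", "Product"}},
    "Delivery": {"owns": {"ships", "delivers to"},
                 "participants": {"Customer", "Carrier"}},
    "Staffing": {"owns": {"works at"},
                 "participants": {"Employee", "Airport"}},
}


def overlapping_pairs(clusters):
    """Return all pairs of clusters that own a common relationship type."""
    return [
        (a, b)
        for a, b in combinations(sorted(clusters), 2)
        if clusters[a]["owns"] & clusters[b]["owns"]
    ]


def adjacent_clusters(clusters, name):
    """Return the clusters sharing at least one participant with `name`."""
    own = clusters[name]["participants"]
    return sorted(
        other
        for other in clusters
        if other != name and clusters[other]["participants"] & own
    )
```

Under these assumptions, an empty result from `overlapping_pairs` means the clusters are non-overlapping, and `adjacent_clusters` lists the clusters whose context has to be considered when one cluster is redesigned.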
The relationship clusters can be related to different business functions or functional areas. This is useful because business functions tend to create, change or delete relationships between the major entities rather than change the major entities themselves.
In addition, standard diagrams for industry branches can be designed and individually refined for specific enterprises. The 'Order' relationship type (Fig. 3.2(2)), e.g., can be regarded as a standard diagram, while the refinement (Fig. 3.2(3)) may be individually designed for a specific enterprise. In this example simple relationship clustering enables the representation of additional integrity constraints without overloading the global diagram.
To use the suggested method, a powerful, repository-based tool is required. First, the processes of bottom-up clustering and top-down refinement must be supported. Second, it must be possible to reorganise clusters and to transfer objects between different clusters or abstraction levels. Furthermore, the tool must handle attributes when clusters are refined or coarsened. Some concepts of the nested entity relationship model [CaJA89] can be used for this purpose. These problems will be the topic of further research.
The work described here is part of the INCOME/STAR project. INCOME/STAR [OSS93] is an inte-
grated environment for the development of large, distributed information systems. It extends INCOME2
[INC92], an existing and commercially available tool for conceptual modelling and prototyping of
information systems. The underlying concepts were developed at our institute between 1985 and 1991
[LNO*89]. INCOME/STAR supports the conceptual modelling of structural aspects by the entity
relationship model and of dynamic aspects by high-level Petri nets (predicate/transition nets). The concepts described in this paper can be used to support an integrated top-down design of Petri net and entity relationship hierarchies.
Bibliography
[Bark90] Barker, R.: CASE*METHOD: Entity Relationship Modelling, Addison Wesley Publishing Company,
Wokingham, England, 1990
[CaJA89] Carlson, C. R.; Ji, W.; Arora, A. K.: The nested entity-relationship model - A pragmatic approach to
E-R comprehension and design layout, in Lochovsky, F.H. (Ed.): Proc. of the 8th Intern. Conf. on
Entity-Relationship Approach, Toronto, Canada 1989, North-Holland 1990, 43-58
[CaQu81] Caldiera, G.; Quitadamo, P.: Conceptual representation of data and logical IMS design, in Chen,
P.P. (Ed.): Proc. of the 2nd Intern. Conf. on Entity-Relationship Approach, Washington, D.C., USA
1981, North-Holland 1983, 299-318
[Chen76] Chen, P. P.: The entity-relationship model: Toward a unified view of data, ACM TODS 1 (1976),
No. 1, 9-36
[ChNC81] Chung, I.; Nakamura, F.; Chen, P. P.: A decomposition of relations using the entity-relationship
approach, in Chen, P.P. (Ed.): Proc. of the 2nd Intern. Conf. on Entity-Relationship Approach,
Washington, D.C., USA 1981, North-Holland 1983, 149-172
[ElWH85] Elmasri, R.; Weeldreyer, J.; Hevner, A.: The category concept: An extension to the entity-relationship
model, Data & Knowledge Engineering 1 (1985), 75-116
[FeMi86] Feldman, P.; Miller, D.: Entity model clustering: Structuring a data model by abstraction, The
Computer Journal 29(1986), No. 4, 348-360
[INC92] INCOME User Manuals: INCOME/Designer, INCOME/Dictionary, INCOME/Generator,
INCOME/Simulator, PROMATIS Informatik, Karlsbad/Germany, 1992
[LNO*89] Lausen, G.; Németh, T.; Oberweis, A.; Schönthaler, F.; Stucky, W.: The INCOME approach for
conceptual modelling and rapid prototyping of information systems, in Proc. First Nordic Conference
on Advanced Systems Engineering CASE89, Stockholm, 1989
[Mart89] Martin, J.: Information Engineering, Book I: Introduction, Prentice Hall, Englewood Cliffs, New
Jersey, 1989
[Mart90a] Martin, J.: Information Engineering, Book II: Planning and Analysis, Prentice Hall, Englewood Cliffs,
New Jersey, 1990
[Mart90b] Martin, J.: Information Engineering, Book III: Design and Construction, Prentice Hall, Englewood
Cliffs, New Jersey, 1990
[Mist91] Mistelbauer, H.: Datenmodellverdichtung: Vom Projektdatenmodell zur Unternehmensarchitektur,
Wirtschaftsinformatik 33(1991), No. 4, 289-299 (in German)
[OSS93] Oberweis, A.; Scherrer, G.; Stucky, W.: INCOME/STAR: Process model support for the
development of information systems, in Niedereichholz, J., Schuhmann, W. (Eds.): Wirtschafts-
informatik - Beiträge zur modernen Unternehmensführung, Campus-Verlag, Frankfurt, 1993, 145-165
[RaSt92] Rauh, O.; Stickel, E.: Entity tree clustering - A method for simplifying ER design, in Pernul, G.; Tjoa,
A M. (Eds.): Proc. of the 11th Intern. Conf. on Entity-Relationship Approach, Karlsruhe, Germany
1992, Springer-Verlag 1992, 62-78
[SaNF79] dos Santos, C. S.; Neuhold, E. J.; Furtado, A. L.: A data type approach to the entity-relationship
model, in Chen, P.P. (Ed.): Proc. of the Intern. Conf. on Entity-Relationship Approach, Los Angeles,
California, USA 1979, North-Holland 1980, 103-120
[ScSc79] Scheuermann, P.; Schiffner, G.: Multiple views and abstractions with an extended entity-relationship
model, Journal of Computer Languages 4 (1979), 139-154
[ScSt83] Schlageter, G.; Stucky, W.: Datenbanksysteme: Konzepte und Modelle, 2. Auflage, B. G. Teubner,
Stuttgart, 1983 (in German)
[ScSW79] Scheuermann, P.; Schiffner, G.; Weber, H.: Abstraction capabilities and invariant properties
modelling within the entity-relationship approach, in Chen, P.P. (Ed.): Proc. of the Intern. Conf. on
Entity-Relationship Approach, Los Angeles, California, USA 1979, North-Holland 1980, 121-140
[TeYF86] Teorey, T. J.; Yang, D.; Fry, J. P.: A logical design methodology for relational databases using the
extended entity-relationship model, ACM Computing Surveys 18 (1986), No. 2, 197-222
[TWBK89] Teorey, T. J.; Wei, G.; Bolton, D. L.; Koenig, J. A.: ER model clustering as an aid for user
communication and documentation in database design, Communications of the ACM 32 (1989), No. 8,
975-987

1 IEF is a product of Texas Instruments; IEW and ADW are products of KnowledgeWare;
ORACLE*CASE is a product of ORACLE Corporation.
2 INCOME is a product of PROMATIS Informatik, Karlsbad, Germany.
