
Principles of Object Oriented Database Design

Klaus-Dieter Schewe, Bernhard Thalheim


Cottbus Technical University, Computer Science Institute, Karl-Marx-Str. 17, D-03055 Cottbus, FRG

Abstract. The design of complex information systems requires a transparent model-based methodology. It has been claimed that object orientation will have a significant impact on the development of such a methodology, especially as far as reusability and naturality of conceptual modelling are concerned. The methodology presented in this paper concentrates on four significant principles of object oriented database (OODB) design. The basic constituent is stepwise refinement, i.e. to begin the design process with a partial model that is completed and concretized further on as application knowledge grows. Class abstraction, i.e. to support libraries of incomplete parameterized designs that are instantiated and specialized later, is a natural consequence hereof. Declarativity is achieved by constraint centered design with (up to some degree) automatic transformation into consistent transactions. Variations enable the design of information systems with heavy reuse of existing design components. The methodology is based on a theoretically founded object oriented datamodel (OODM). Hence it supports inferences such as deciding the identifiability of objects, detecting the relation of an intended design to components in existing design libraries, and checking operations for reducedness as a prerequisite for the automatic transformation of constraints into consistent transactions.

1 Philosophy of OODB Design

The design of data and knowledge intensive information systems requires a transparent model-based methodology. Classically there exist separate methods for database design and transaction design without a satisfactory integration [7, 9]. Therefore, it is a natural hope that the use of object oriented design methods will improve the situation. Object orientation involves the isolation of data in semi-independent modules in order to promote high software development productivity. This idea stems from programming languages, and most methods proposed so far [3, 6, 11, 20] are intended to support object oriented program development.

The main difference in object oriented database (OODB) design is due to the notion of object, which is now intended to serve as a basic unit of persistent data, a view that is influenced by semantic datamodels [9]. Since classes then serve not only as behaviour abstractions but also as (persistent) data collections, we have to cope with object identification, whereas in object oriented programming a simple identification mechanism via object names is sufficient. This makes OODB design a significantly different task from object oriented program development, although some ideas of the approaches to the latter field can be taken over.

Most object oriented datamodels are still very close to the language level [1, 10], no matter whether their development started from a semantic datamodel or from an object oriented programming language. For object oriented database design, however, it is necessary to shift the approach to the conceptual level, as also claimed in the work of the IS-Core group [13, 21]. Therefore, the primary goal of our methodology is to provide a conceptual object oriented model with greater naturality in application modelling. At the same time we want to improve the design quality and to raise the rate of software reuse.

The work presented in this paper is centered around the theoretically founded object oriented datamodel (OODM) introduced in [16] and partly based on the work in [2]. This model supports the uniform representation of designs at each level of concretion. In particular, there is no need to use different models for the conceptual and the logical design.

We regard requirements analysis and conceptual modelling as two activities running in parallel. We start with an initial design that is a one-to-one representation of the first knowledge about the intended application. The analysis task is to grasp and describe such knowledge with the formal representation tools. The subsequent design process is monotonic in the sense that the amount of application knowledge increases. Each knowledge increment then corresponds to some refinement, i.e. a change (not only an extension) of the design. However, this does not prejudice a particular design procedure, e.g. a "top-down" one. On the contrary, the OODM favours incomplete partial designs with the specification of details left for refinement. Keeping even such intermediate designs increases the spread of possible reuse. This is close to the Design-by-Units strategy [23].

Classical design methods are centered around data, processes or constraints respectively. Within the unified model in our approach we may regard all these aspects at the same time and only keep track of the dependencies among them, since constraints depend on the data, and processes depend on both other components. This implies the relative independence of refinement steps on data, processes or constraints as long as these dependencies are taken into consideration.

Since processes in data and knowledge intensive application systems change much faster than constraints, it is desirable to minimize the process design task and to achieve a maximum of declarativity. As shown in [17, 18, 19], it is possible (up to some degree) to compute maximal specializations of specified processes in order to enforce consistency.

The use of a uniform OODM during the whole design process makes it possible to build design libraries. Due to the support of abstract partial designs the components of such libraries can be more generic than usually assumed, but it is a truism that reusability does not imply reuse. We have to support mechanisms to retrieve a maximum of existing reusable library components for a given partial design. This leads to the concept of variation-based reuse, extending results on variant construction in semantic networks [14].

Such a methodology involves a high level of inferences. Some of these inferences are intrinsic to the datamodel used. Among them are the recognition of object identifiability, specialization and type correctness, and the verification of refinement correctness. Others are extrinsic, such as the proof of reducedness as a prerequisite for consistency enforcement or the ascertainment of the relationship to existing library components.

In the remainder of this paper we shall first describe the fundamental issues of the OODM in Section 2; then in Sections 3-6 we briefly concretize the basic principles of our design methodology. Section 7 presents a short outline of the required inferences and a discussion of open research problems.

2 The Object Oriented Datamodel: Basic Features

In the object-oriented approach we distinguish between objects and values. Whereas values are encoded by themselves, objects have to be encoded by object identifiers. In our approach each object consists of a unique, immutable identifier, a set of values of possibly different types, references to other objects and methods associated with the object.

Values can be grouped into types. In general, a type may be regarded as an immutable set of values of a uniform structure together with operations defined on such values. Subtyping is used to relate values in different types.

The class concept provides the grouping of objects having the same structure, which uniformly combines aspects of object values and references. Objects can belong to different classes, which guarantees that each object of our abstract object model is captured by the collection of possible classes.

Just as values are only defined via types, objects can only be defined via classes. Thus, a design consists of type and class definitions.

2.1 Type Definitions

We follow the classical view of types in [4], using a type system that consists of some basic types, type constructors and a subtyping relation. Moreover, recursive types, i.e. types defined by domain equations, and predicative types, i.e. types defined by restrictions, can be defined.

Definition 1.
- The base types are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$, where ID is an abstract identifier type without any non-trivial supertype and $\bot$ is the trivial type that is a supertype of every type.
- The type constructors are $e_1 \mid \cdots \mid e_n$ (enumeration), $(a_1 : \alpha_1, \ldots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a : \alpha) \cup (b : \beta)$ (union).

We may use base types and constructors to define new types by nesting. If there is no confusion, the field selectors in record or union types may be omitted. The semantics of such types as sets of values is defined as usual. Moreover, we assume the standard operators on base types and on records, sets, bags, etc. We omit the details here.

A type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff there is no occurrence of ID in $t$. If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence relation $o : t \times t' \to BOOL$. A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by the usual subtype relation [4].

Example 1. Let us define a type VZ and a simple subtype VZ' hereof.

Type VZ = ( begin : DATE ,
  end : DATE $\cup$ $\bot$ ,
  kind-of-insurance : "Main" | "Family" | "Interruption" )
End VZ

Type VZ' = ( begin : DATE ,
  end : DATE ,
  kind-of-insurance : "Main" | "Family" | "Interruption" )
End VZ'
□
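The subtype relation on records, enumerations and unions can be illustrated by a small structural checker. The following Python sketch is not part of the OODM itself: the class names `Base`, `Enum`, `Record`, `Union` and the function `is_subtype` are hypothetical, and the rules (a record subtype must offer every field of its supertype, an enumeration subtype has fewer literals, a type is a subtype of any union containing a supertype of it) are an assumed reading of "the usual subtype relation".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str  # e.g. "NAT", "DATE", or "TRIVIAL" for the trivial type


@dataclass(frozen=True)
class Enum:
    literals: frozenset  # the enumerated values


@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (selector, type) pairs


@dataclass(frozen=True)
class Union:
    alternatives: tuple  # tuple of types


TRIVIAL = Base("TRIVIAL")  # assumed encoding of the trivial supertype


def is_subtype(s, t):
    """Return True if s is a subtype of t under the assumed structural rules."""
    if t == TRIVIAL or s == t:
        return True
    if isinstance(t, Union):
        if isinstance(s, Union):
            return all(is_subtype(a, t) for a in s.alternatives)
        return any(is_subtype(s, a) for a in t.alternatives)
    if isinstance(s, Enum) and isinstance(t, Enum):
        return s.literals <= t.literals
    if isinstance(s, Record) and isinstance(t, Record):
        s_fields = dict(s.fields)
        return all(
            name in s_fields and is_subtype(s_fields[name], field_type)
            for name, field_type in t.fields
        )
    return False


# The types VZ and VZ' of Example 1 in this encoding.
DATE = Base("DATE")
KIND = Enum(frozenset({"Main", "Family", "Interruption"}))
VZ = Record((("begin", DATE), ("end", Union((DATE, TRIVIAL))), ("kind-of-insurance", KIND)))
VZ_PRIME = Record((("begin", DATE), ("end", DATE), ("kind-of-insurance", KIND)))
```

With this encoding, `is_subtype(VZ_PRIME, VZ)` holds while the converse does not, which matches the role of VZ' as a subtype of VZ.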

Predicative types are used to restrict the set of values given by some type definition to a subset. For this purpose a formula with exactly one free variable self is used. Clearly, the inclusion then gives a subtype function. In order to avoid inflationary use of quantifiers, other variables are also allowed to occur freely in such a formula. They are assumed to be universally quantified.

Definition 2. A predicative type $T$ consists of an underlying type $T'$ and a formula $P$ with exactly one free variable self of type $T'$.

Example 2. Let us define a predicative subtype of [VZ].

Type VZ-list = [ VZ ]
  Where ( self = concat($L_1$, [$V_1$, $V_2$ | $L_2$]) $\Rightarrow$ $V_2$ :: VZ' $\wedge$ $V_2$.end $\le$ $V_1$.begin )
  $\wedge$ ( self = concat($L_1$, [$V$ | $L_2$]) $\Rightarrow$ $V$.end $\ne \bot$ $\Rightarrow$ $V$.begin $\le$ $V$.end )
End VZ-list
□
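The restricting formula of a predicative type can be read as a membership test on values of the underlying type. The following Python sketch checks the two conditions of VZ-list on a list of insurance periods. It is only an illustration under stated assumptions: `None` stands for the undefined end date, both comparisons are taken to be non-strict, and the function name `is_vz_list` and the dictionary keys are hypothetical.

```python
from datetime import date


def is_vz_list(periods):
    """Check the assumed VZ-list conditions on a list of period dictionaries.

    Each period has a 'begin' date and an 'end' that is a date or None.
    """
    # Second conjunct: a period with a defined end must not end before it begins.
    for period in periods:
        if period["end"] is not None and period["begin"] > period["end"]:
            return False
    # First conjunct: for adjacent periods V1, V2 the later list element V2
    # must have a defined end that does not exceed V1.begin.
    for v1, v2 in zip(periods, periods[1:]):
        if v2["end"] is None or v2["end"] > v1["begin"]:
            return False
    return True


# Usage: the current (open-ended) period comes first, older periods follow.
history = [
    {"begin": date(2020, 1, 1), "end": None},
    {"begin": date(2010, 1, 1), "end": date(2019, 12, 31)},
]
```

Here `is_vz_list(history)` is true, whereas the reversed list violates the first condition because an open-ended period would follow another one.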

2.2 Class Definitions

Each object in a class consists of an identifier, a collection of values, references to other objects and methods. Let us postpone methods for a while. Identifiers can be represented using the unique identifier type ID. Values and references can be combined into a representation type, where each occurrence of ID denotes a reference to some other class. Therefore, we may define the structure of a class using parameterized types. Moreover, classes are arranged in IsA-hierarchies.

Definition 3.
- If $t$ is a value type with parameters $\alpha_1, \ldots, \alpha_n$ such that ID does not occur in $t$ and if some of the parameters are replaced by pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression. Note that a structure expression may still contain parameters.
- A class consists of a class name $C$, a structure expression $S$, a set of class names $D_1, \ldots, D_m$ (called superclasses) and a set of methods. We call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$.

Example 3. Let us consider a class Insurant for an insurance application.

Class Insurant =
  Structure ( contract-no : NAT ,
    name : NAME ,
    address : ADDRESS ,
    sex : SEX ,
    insurance-times : VZ-list ,
    agency : AGENCY )
  Method ...
End Insurant
□

In this example there are no references, hence the structure expression is simply a type. We could have defined this type, say INSURANT-DATA, separately from the class definition as in Section 2.1. Then the structure would simply be Structure INSURANT-DATA.

2.3 Method Definitions

Let us now turn to adding dynamics to the OODM. As required in the object oriented approach, operations will be associated with classes. This gives us the notion of a method. We shall distinguish between visible and hidden methods in order to separate the methods that can be invoked by the user from the others. However, all methods of a class, including the hidden ones, can be accessed by other methods. There are two reasons for such a weak hiding concept.

- Visible methods serve as a means to specify (nested) transactions. In order to build sequences of database instances we only regard these transactions, assuming a linear invocation order on them.
- Hidden methods can be used to handle identifiers. Since these identifiers do not have any meaning for the user, they must not occur within the input or output of a transaction.

Each method on a class $C$ consists of a signature and a body. The signature consists of a method name and sets of parameter/type pairs for input and output. The body is defined by the usual constructs of a procedural programming language.

Definition 4.
- A method signature consists of a method name $M$, a set of input-parameter/type pairs $\iota_i :: T_i$ and a set of output-parameter/type pairs $o_j :: T'_j$.
- A method on a class $C$ consists of a method signature and a body $S$ that is recursively built from the following constructs:
  - assignment $x := E$, where $x$ is either the class variable $C$ of type $\{U_C\}$ or a local variable within $S$ (including the output-parameters), and $E$ is an expression of the same type as $x$,
  - local variable declaration Let $x :: T$,
  - skip and fail,
  - sequencing $S_1 ; S_2$ and branching IF $P$ THEN $S_1$ ELSE $S_2$ ENDIF,
  - method call $C'$ :- $M'$ (in : $E'_1, \ldots, E'_j$ ; out : $x'_1, \ldots, x'_i$), where $M'$ is a method on class $C'$ with compatible signature, and
  - non-deterministic selection of values New.$f(x)$, where $f$ is a selector on the representation type of $C$.

If the class name is omitted in a method call, then we refer to the class $C$ itself or to the global method New_Id that denotes the selection of a new identifier. Clearly, we may regard this method as belonging to an abstract class Any with structure $\bot$ that is a superclass of all classes.

A method $M$ on a class $C$ is called value-defined iff all types occurring in its signature are proper value types. As already mentioned, we distinguish between methods visible to the user and hidden methods. We require each visible method to be value-defined. Subclasses inherit the methods of their superclasses, but overriding is allowed as long as the new method is a specialization of all its corresponding methods in its superclasses.

Example 4. Let us add the method add-insurant to the class Insurant of Example 3.

Method add-insurant ( in : request-data :: REQUEST-DATA , out : contract-no :: NAT ) =
  Insurant :- check-data ( in : request-data , out : acceptable :: BOOL ) ;
  IF acceptable
  THEN Let I :: ID , C :: NAT ;
    New.contract-no (C) ;
    New_Id ( out : I ) ;
    Insurant :- compute-insurant-data ( in : request-data , C , out : V ) ;
    Insurant := Insurant $\cup$ { ( I , V ) }
  ELSE fail
  ENDIF
□

Let us briefly discuss what it means that a method $N$ on a class $D$ specializes the method $M$ on a superclass $C$. First, we may assume (using records) that there is exactly one input-type and one output-type, say $I_N$ (resp. $I_M$) and $O_N$ (resp. $O_M$). The input-type is used for two purposes: object identification in $D$ (resp. $C$) and providing necessary parameters. Hence $I_N$ (resp. $I_M$) is a subtype of some $I_D \times I'_N$ (resp. $I_C \times I'_M$).

In order to "inherit" the behaviour of $M$ to $N$ we must be able to transform $N$ in such a way that it becomes applicable to the input of $M$. Hence we have to project the parameter parts, whereas identification may exploit object identifiers (see Definition 6). Hence $I'_M$ must be a subtype of $I'_N$. Note that this gives some kind of partial contravariance, whereas [11] requires covariance and [1] requires contravariance only. The differences are due to the mismatches between program and database design as already mentioned in Section 1.

For the output-types the situation is much simpler: $O_N$ is required to be a subtype of $O_M$. We may then transform $N$ in a canonical way to some $N'$ with the same signature as $M$. Both may be regarded as methods on $C$. Then, if $N'$ applied to some input-value yields some result, this should also result from applying $M$ (but not vice versa). A more formal discussion of this theme can be found in [17].
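The interplay of a visible, value-defined method with hidden identifier handling can be sketched in Python. This is only an illustration under stated assumptions, not the OODM semantics: the class `InsurantClass`, the exception `MethodFailure` and the two callables passed to the constructor are hypothetical, the undefined methods check-data and compute-insurant-data are supplied from outside, and the non-deterministic selections of a contract number and of a new identifier are replaced by simple counters.

```python
import itertools


class MethodFailure(Exception):
    """Models the 'fail' construct: the method call aborts without any update."""


class InsurantClass:
    def __init__(self, check_data, compute_insurant_data):
        self.extension = {}  # class extension: identifier -> value
        self._ids = itertools.count(1)  # assumed source of fresh identifiers
        self._contract_numbers = itertools.count(1000)  # assumed deterministic choice
        self._check_data = check_data
        self._compute_insurant_data = compute_insurant_data

    def _new_id(self):
        # Hidden method: identifiers never appear in user-visible input or output.
        return next(self._ids)

    def add_insurant(self, request_data):
        # Visible method: input and output consist of values only.
        if not self._check_data(request_data):
            raise MethodFailure("request data not acceptable")
        contract_no = next(self._contract_numbers)
        ident = self._new_id()
        self.extension[ident] = self._compute_insurant_data(request_data, contract_no)
        return contract_no


# Usage with placeholder implementations of the two undefined methods.
insurants = InsurantClass(
    check_data=lambda data: bool(data.get("name")),
    compute_insurant_data=lambda data, number: {"contract-no": number, **data},
)
first_contract = insurants.add_insurant({"name": "Ann"})
```

The caller only ever sees the contract number; the identifier chosen for the new object stays inside the class extension.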

2.4 Schema Definitions

Now we are prepared for the definition of a database schema, which is simply given by a finite collection of type and class definitions. Later we shall add constraint definitions. Thus, taking together Examples 1-4, we get a schema with only one class Insurant and only one method add-insurant. However, some of the types in this schema such as NAME, ADDRESS and REQUEST-DATA are undefined. The same applies to the methods check-data and compute-insurant-data called by add-insurant.

This style of allowing partiality in OODM schemata makes it possible to capture incomplete knowledge about an application area as well and will be essential for our methodology. In the next two sections we shall explain this feature in more detail and show how to exploit it for a standard refinement process. First let us have a closer look at schemata that are "complete", i.e. correspond to a final design of an application. This leads to the notion of closed schemata.

Definition 5. A schema $S$ is a finite collection of type, class and constraint definitions. It is closed iff all types, classes and methods occurring within type definitions, structure definitions and methods are defined in $S$.

Let us postpone constraints for a while. At each time, a class is given by a finite set of objects. More precisely, we need the notion of a database instance.

Definition 6. An instance $D$ of a closed schema $S$ assigns to each class $C$ a value $D(C)$ of type $\{(ident : ID, value : T_C)\}$ such that the following conditions are satisfied:

- uniqueness of identifiers: For every class $C$ we have

$$\forall i :: ID.\ \forall v, w :: T_C.\ (i, v) \in D(C) \wedge (i, w) \in D(C) \Rightarrow v = w . \tag{1}$$

- inclusion integrity: For a subclass $C$ of $C'$ we have

$$\forall i :: ID.\ i \in dom(D(C)) \Rightarrow i \in dom(D(C')) . \tag{2}$$

Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have

$$\forall i :: ID.\ \forall v :: T_C.\ (i, v) \in D(C) \Rightarrow (i, f(v)) \in D(C') . \tag{3}$$

- referential integrity: For each reference $r$ from $C$ to $C'$ with corresponding occurrence relation $o_r$ we have

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in D(C) \wedge o_r(v, j) \Rightarrow j \in dom(D(C')) . \tag{4}$$

Basic update methods, i.e. insertion, deletion and update of a single object in a class $C$, cannot always be derived in the object-oriented case, because the abstract identifiers have to be hidden from the user. However, in [16] it has been shown that for value-representable classes these operations are uniquely determined by the schema and consistent with respect to the implicit referential and inclusion constraints. Value-representability of all classes in a closed schema is implied if we can derive a (trivial) uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the extension of class $C$ to be unique:

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in D(C) \wedge (j, v) \in D(C) \Rightarrow i = j . \tag{5}$$

Finally, the semantics of a closed schema is given by database histories, where a database history on a schema $S$ is a sequence $D_0, D_1, \ldots$ of instances such that $D_0$ is the empty database and each transition from $D_{i-1}$ to $D_i$ is due to some visible method on some class $C \in S$.
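The conditions on a database instance translate directly into checks on sets of (identifier, value) pairs. The following Python sketch shows one such reading. The function names and the way the subtype function and the occurrence relation are passed in (as a projection `f` and a function `refs` listing the identifiers referenced by a value) are assumptions made for illustration only.

```python
def dom(pairs):
    """Identifiers occurring in a class extension given as (ident, value) pairs."""
    return {ident for ident, _ in pairs}


def unique_identifiers(pairs):
    """Uniqueness of identifiers: no identifier carries two different values."""
    seen = {}
    for ident, value in pairs:
        if ident in seen and seen[ident] != value:
            return False
        seen[ident] = value
    return True


def inclusion_integrity(sub_pairs, super_pairs, f):
    """Inclusion integrity: each subclass object appears, projected by f, in the superclass."""
    return all((ident, f(value)) in super_pairs for ident, value in sub_pairs)


def referential_integrity(pairs, target_pairs, refs):
    """Referential integrity: every identifier referenced by a value exists in the target."""
    targets = dom(target_pairs)
    return all(j in targets for _, value in pairs for j in refs(value))


def unique_values(pairs):
    """Trivial uniqueness constraint: no value is carried by two different identifiers."""
    seen = {}
    for ident, value in pairs:
        if value in seen and seen[value] != ident:
            return False
        seen[value] = ident
    return True


# Usage on a toy instance; values are tuples (name, agency reference[, account]).
agency = {(100, ("Cottbus",))}
insurant = {(1, ("Ann", 100)), (2, ("Bob", 100))}
main_insurant = {(1, ("Ann", 100, 4711))}
```

In this toy instance, `inclusion_integrity(main_insurant, insurant, lambda v: v[:2])` and `referential_integrity(insurant, agency, lambda v: [v[1]])` both hold.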

3 Class Abstraction
As we have seen in Section 2, the structure expression of a class in an OODM schema may contain parameters. These arise from parameterized types. Parameterized classes make it possible to abstract from concrete structures. Indeed, an instance of a parameterized class should not be regarded as a single set of pairs, but as a family of such sets indexed by the possible instantiations. Let us now extend and concretize this view to arbitrary schemata.

If we know that objects will have some attributes, but we still do not know the type of the corresponding values, we may leave the corresponding parameter uninstantiated. However, if we already know that we shall instantiate this parameter by some type, we may mark this parameter as a type parameter. If we know that there will be some reference $r_i : C_i$, but $C_i$ is undefined, then we have a class parameter.

For parameterized classes the possibilities to define methods and constraints are restricted. If $\alpha$ is a type parameter and we do not know anything about the type, there is no non-trivial way to express a term of that type, but terms are required in assignments as well as in constraints. However, we may have partial knowledge of that type, e.g. that it is a subtype of some other type, in which case we may use terms of that supertype. If $C$ is a class parameter, then each call of a method $m$ on $C$ is indeed undefined. Therefore, for the proof of properties of the calling method such as consistency we can only assume an arbitrary input-output-relation for $m$, unless we completely defer the proof.

Definition 7. Let $S$ be a schema, $T$ a type parameter, $C$ a class parameter and $M$ an undefined method. A parameter restriction is either $T \le T'$ with some value type expression $T'$, $C$ isa $C'$ with some class name $C'$, $C$.structure $\le S$ with some structure expression $S$, or a restriction on the types of the signature of $M$. Here $\le$ denotes the subtype relation and its canonical extension to structure expressions.

Note that some parameter restrictions may be inferred from context in the schema $S$. If a parameter is unrestricted, we may add the implicit parameter restrictions $T \le \bot$, $C$.structure $\le \bot$ and $T_i \le \bot$ for type parameters, class parameters and types in method signatures. However, if there is more than one restriction on a parameter, these may be inconsistent. In the case of a consistent set of parameter restrictions, the set of restrictions on one parameter may be unified to give only one restriction in the form of Definition 7. We then talk of the normalized set of parameter restrictions.

In order to define the semantics of open (i.e. not closed) schemata, we need the notion of instantiation.

Definition 8. Let $S$ be a schema with a consistent set of parameter restrictions. An instantiation $I$ is given by a closed schema $S'$ that results from $S$ by replacing each type parameter $T$ by a value type, each class parameter by a class and each undefined method by "Let ... $o_i :: O_i$ ..." such that all parameter restrictions are satisfied. $S'$ is called minimal iff the types and classes occurring in the normalized set of parameter restrictions are taken.

Example 5. Let us look again at Examples 1-4. The minimal instantiation of the type VZ (and VZ') gives

Type VZ = ( begin : $\bot$ ,
  end : $\bot$ ,
  kind-of-insurance : "Main" | "Family" | "Interruption" )
End VZ

The minimal instantiation of the class Insurant leads to the structure expression

Structure ( contract-no : NAT ,
  name : $\bot$ ,
  address : $\bot$ ,
  sex : $\bot$ ,
  insurance-times : VZ-list ,
  agency : $\bot$ )

The method add-insurant involves the call of check-data on the same class, but this method is undefined and hence can only be treated as the non-deterministic value selection "Let acceptable :: BOOL".
□

Finally, the full semantics of an open schema $S$ is given by families of history sets indexed by the possible instantiations of $S$, whereas the minimal semantics is the semantics of the minimal instantiation. Note that each instantiation can be projected naturally to the minimal one.

The principle of class abstraction is necessary for stepwise refinement as indicated in Section 3, since otherwise we would not be able to support partial designs. On the other hand, it increases the bandwidth of possible concrete designs that occur as instantiations. Therefore, it is desirable to provide libraries of abstract (partial) designs to achieve a higher rate of reusability.

4 Stepwise Refinement

Once an initial OODM schema is given, the subsequent design process is based on stepwise refinement. Roughly speaking, refinement means the reorganization of classes and methods such that the semantics of the old schema is "preserved" within the new one. This is captured by the next definition.

Let $S$ and $T$ be closed schemata and suppose there are (partial) functions

- $f_{inst}$ that is total, taking instances of $T$ to instances of $S$,
- $f_{class}$ that is partial, taking a class in $T$ to a class in $S$, and
- $f_{meth}$ that is total, taking a method in $T$ to a (possibly empty) set of methods in $S$,

such that for each method $M$ associated with a class $C$ in $T$ each method $M' \in f_{meth}(M)$ is associated with $f_{class}(C)$. If $S$ and $T$ are arbitrary schemata, assume these functions to be defined on the minimal instantiations.

Definition 9. $T$ is a refinement of $S$ iff for each pair $(D_{i-1}, D_i)$ in a database history of $T$ that corresponds to a method $M$ and each $M' \in f_{meth}(M)$ that is defined and terminating in $f_{inst}(D_{i-1})$, the pair $(f_{inst}(D_{i-1}), f_{inst}(D_i))$ corresponds to $M'$.

There exists a more elegant (but also strongly theoretical) characterization of refinement. We omit the details here. In [15] the following standard refinement steps in the OODM have been discussed on the basis of an application example.
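For a single transition of a database history, the refinement condition can be tested mechanically once the mapping on instances is fixed. The Python sketch below does this under simplifying assumptions that are not part of the definition: methods of the old schema are modelled as deterministic functions on instances that return `None` where they are undefined, and the names `step_is_preserved`, `forget_log` and `add_bob` are hypothetical.

```python
def step_is_preserved(before, after, f_inst, old_methods):
    """Check one transition (before, after) of the refined schema.

    f_inst maps instances of the refined schema to instances of the old one;
    old_methods plays the role of f_meth(M) for the method M that caused the step.
    """
    old_before, old_after = f_inst(before), f_inst(after)
    for method in old_methods:
        result = method(old_before)
        if result is None:
            # The old method is undefined in this state: nothing to check.
            continue
        if result != old_after:
            return False
    return True


# Usage: the refined schema adds a log that the old schema does not know.
def forget_log(instance):
    return {"insurants": instance["insurants"]}


def add_bob(old_instance):
    return {"insurants": {**old_instance["insurants"], 2: "Bob"}}


before = {"insurants": {1: "Ann"}, "log": []}
after = {"insurants": {1: "Ann", 2: "Bob"}, "log": ["added 2"]}
```

Here `step_is_preserved(before, after, forget_log, [add_bob])` holds, since forgetting the log turns the refined transition into exactly the transition of the old method.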

4.1 Instantiation
In Section 3 we discussed the possibility of parameterized (open) schemata and defined their semantics. Refinement by instantiation provides definitions for such parameters, but may also introduce new parameters.

Example 6. Let us instantiate the type parameters ADDRESS and AGENCY occurring in Example 3.

Type ADDRESS = ( zip : NAT Where self < 100,000 ,
  city : STRING ,
  street : STRING )
End ADDRESS

Type AGENCY = ( number : NAT Where self < 1,000 ,
  address : ADDRESS ,
  phones : { TELECOM_NO } ,
  fax : TELECOM_NO ,
  cares_for : { ( zip : NAT Where self < 100,000 , city : STRING ) } )
End AGENCY
□

Refinement by instantiation may also introduce bodies for methods that were undefined so far.

4.2 Splitting
Refinement by splitting leads to new classes with structure expressions that correspond to parts of an existing structure expression, which in turn are replaced by references. It is mainly used in the case of shared data.

Example 7. The class Agency stems from splitting Insurant in Example 3, assuming the instantiation of Example 6 to be already done. The new reference is agency : Agency.

Class Agency =
  Structure ( agency : AGENCY )
End Agency

Class Insurant =
  Structure ( contract-no : NAT ,
    ... ,
    agency : Agency )
  Methods ...
End Insurant
□

Clearly, the existing methods on the split class also have to be changed.
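On the level of instances, splitting can be pictured as moving a shared value into a class of its own and leaving a reference behind. The following Python sketch illustrates this for the agency part of an insurant. It rests on assumptions made for illustration: the function names are hypothetical, agency values are hashable tuples, and new identifiers are simple generated strings.

```python
import itertools


def split_agency(insurants):
    """Replace the embedded agency value of each insurant by a reference.

    Returns the refined Insurant extension and the new Agency extension.
    Equal agency values are stored once and shared by reference.
    """
    agencies = {}  # Agency extension: identifier -> agency value
    ident_of = {}  # agency value -> identifier
    refined = {}
    counter = itertools.count(1)
    for ident, value in insurants.items():
        agency_value = value["agency"]
        if agency_value not in ident_of:
            ident_of[agency_value] = f"agency-{next(counter)}"
            agencies[ident_of[agency_value]] = agency_value
        refined[ident] = {**value, "agency": ident_of[agency_value]}
    return refined, agencies


def merge_agency(refined, agencies):
    """Map a split instance back to the old structure by dereferencing."""
    return {
        ident: {**value, "agency": agencies[value["agency"]]}
        for ident, value in refined.items()
    }


# Usage: two insurants sharing the same agency data.
old = {
    1: {"name": "Ann", "agency": (17, "Cottbus")},
    2: {"name": "Bob", "agency": (17, "Cottbus")},
}
refined, agencies = split_agency(old)
```

The function `merge_agency` plays the role of the instance mapping of the refinement: applying it to the split instance yields the old instance again.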

4.3 Specialization
Refinement by specialization introduces subclasses and subtypes. Moreover, it may involve replacing a structure expression such that the new representation type is a subtype of the old one and the new implicit constraints imply the old ones.

Example 8. Let us introduce a new class Main-Insurant as a subclass of Insurant. Objects in this subclass have an additional reference to Company that need not exist for all insurants.

Class Main-Insurant =
  IsA Insurant
  Structure ( account-no : NAT ,
    employed-by : Company )
  Methods ...
End Main-Insurant

The new class Insurant results from specializing the old class with this name. We simply add a reference to the class Main-Insurant for the case of an insurant of kind "Family". The corresponding subtype function is a simple projection.

Class Insurant =
  Structure ( ... ,
    insurance-times : [ ( begin : DATE ,
      end : DATE $\cup$ $\bot$ ,
      ( kind : "Main" | "Interruption" ) $\cup$ ( kind : "Family" , associated-with : Main-Insurant ) ) ]
      Where ... ,
    agency : Agency )
  Methods ...
End Insurant
□

4.4 Extension

Refinement by extension is very simple, since it means the definition of new types, classes, constraints or methods that do not yet exist in the schema.

Example 9. A new class New_Insurant to capture persons that apply to become an insurant is introduced as follows.

Class New_Insurant =
  Structure ( name : NAME ,
    address : ADDRESS ,
    sex : SEX ,
    when_to_start : DATE ,
    initial-agency : Agency ,
    vocational-group : VOCATION-KEY ,
    income : NAT Where self < 1,000,000 )
  Methods ...
End New_Insurant

Objects may at the same time belong to both class Insurant and class New_Insurant with different names, addresses and so on. Object identifiers are used to relate different aspects of the same object.
□

5 Declarativity by Constraint Centered Design


As announced in Definition 5, we now concretize constraints associated with a schema. Particular attention will be paid to constraints that arise as generalizations of constraints known from the relational model, e.g. functional, inclusion and exclusion constraints [17, 18].

Definition 10.
- An integrity constraint on a schema $S$ is a formula $I$ over the underlying type system with free variables $fr(I) \subseteq \{C_1, \ldots, C_n\}$, where each class name $C_i$ is used as a variable of type $\{(ident : ID, value : T_{C_i})\}$.
- An instance $D$ of a schema is said to be consistent iff substituting $D(C)$ for each class variable $C$ in each integrity constraint $I$ evaluates to true, when interpreted in the usual way.

Note that the conditions for an instance in Definition 6 correspond to model inherent integrity constraints. We refer to these constraints as implicit identifier, IsA and referential constraints on the schema $S$. Other constraints that are already given implicitly by the structure of the schema arise from Where-clauses in predicative types. Indeed, we may replace such types by the underlying ground type (just omit the Where-clause) and add the clause as a constraint. From the designer's point of view this is not necessary, but it becomes necessary as soon as constraint maintenance comes into play (see below).

Example 10. Return to Example 8, where we introduced the class Main-Insurant as a subclass of Insurant. We would like to express that each object currently in Insurant with kind = "Main" must also belong to Main-Insurant. This gives the formula

$$\forall i, v, b, \ell.\ (i, v) \in \text{Insurant} \wedge \text{insurance-times}(v) = [(b, \bot, \text{"Main"}) \mid \ell] \Rightarrow \exists w.\ (i, w) \in \text{Main-Insurant}$$

with free variables Insurant and Main-Insurant.
□

In particular, we allow distinguished classes of constraints to be specified in OODM schemata. These comprise inclusion, exclusion, functional, uniqueness, object generating and path constraints and generalize relevant classes known in the relational field [22].

Definition 11. Let $C, C_1, C_2$ be classes in a schema $S$ and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$) and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.

- A functional constraint on $C$ is a constraint of the form

$$\forall i, i' :: ID.\ \forall v, v' :: T_C.\ c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \tag{6}$$

- An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form

$$\forall t :: T.\ \big(\exists i_1 :: ID, v_1 :: T_{C_1}.\ (i_1, v_1) \in x_{C_1} \wedge c_1(v_1) = t\big) \Rightarrow \big(\exists i_2 :: ID, v_2 :: T_{C_2}.\ (i_2, v_2) \in x_{C_2} \wedge c_2(v_2) = t\big) . \tag{7}$$

- An exclusion constraint on $C_1$ and $C_2$ is a constraint of the form

$$\forall i_1, i_2 :: ID.\ \forall v_1 :: T_{C_1}.\ \forall v_2 :: T_{C_2}.\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \ne c_2(v_2) . \tag{8}$$

Constraints increase the declarativity of designs. This is important, because in data and knowledge intensive application systems the data and the constraints on them usually live longer than the operations, i.e. the methods. The problem is then to guarantee the consistency of the methods with respect to the specified constraints. Sometimes this requires hard verification work, but for a wide spectrum of schemata an automatic transformation of constraints into methods is provided.

In [16] consistent generic update operations with respect to implicit constraints have been presented. In [18] this has been extended to the classes of constraints mentioned above. In [19] an algorithm for the transformation of constraints into transactions has been proven to be correct. This algorithm reduces the consistency enforcement task to basic updates. It can be shown that this operational approach to consistency enforcement is more powerful than the rule triggering approach [5, 8]. However, the verification of a very technical condition, called $I$-reducedness, is required, which limits the applicability of consistency enforcement in general.

We omit the details of the algorithm here, since they are hidden from the designer. The only thing a designer has to know is that constraint specifications will be made explicit in methods in a canonical way. If this leads to unexpected results, s/he may change the original design. It is an open research problem how to support the amelioration of a schema in case constraint enforcement leads to inefficient methods.
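Functional, inclusion and exclusion constraints can be evaluated on a given instance with a few lines of code. The Python sketch below only checks whether an instance is consistent; it does not implement the transformation of constraints into transactions. Class extensions are assumed to be sets of (identifier, value) pairs, and the subtype functions are passed as ordinary projection functions with hypothetical names.

```python
def functional(x_c, c1, c2):
    """Functional constraint: equal c1-projections imply equal c2-projections."""
    seen = {}
    for _, value in x_c:
        key, dependent = c1(value), c2(value)
        if key in seen and seen[key] != dependent:
            return False
        seen[key] = dependent
    return True


def inclusion(x_c1, c1, x_c2, c2):
    """Inclusion constraint: every c1-projection in C1 occurs as a c2-projection in C2."""
    return {c1(v) for _, v in x_c1} <= {c2(v) for _, v in x_c2}


def exclusion(x_c1, c1, x_c2, c2):
    """Exclusion constraint: the projections of C1 and C2 have no common value."""
    return {c1(v) for _, v in x_c1}.isdisjoint({c2(v) for _, v in x_c2})


# Usage: values are tuples (contract number, name).
insurant = {(1, (1000, "Ann")), (2, (1001, "Bob"))}
new_insurant = {(3, (None, "Eve"))}


def contract_no(value):
    return value[0]


def name(value):
    return value[1]
```

For this data the contract number functionally determines the name in `insurant`, and the names in `insurant` and `new_insurant` exclude each other.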

6 Variation Based Reuse: A Research Issue


The design process presented in Sections 3-5 implicitly assumes that we want to build a new application system from scratch. One promise of the object oriented approach, however, is an enormous increase in software reuse. This can be achieved if we keep the design components, i.e. type and class definitions, in libraries. The benefits hereof are apparent especially if we regard the scale of reusability of parameterized class definitions. Unfortunately reusability does not automatically imply reuse. Indeed, we have to provide mechanisms to relate the intended (new) designs with existing components in such a library. Existing type and class definitions are not independent from one another. The idea is now to exploit the hierarchies in OODM schemata due to instantiation, specialization and refinement. This extends the work in [14], where the specialization taxonomy in a KL-ONE like knowledge representation system has been exploited for a similar task. An intended design is given just as before by a first (partial) OODM schema. Then the following cases may occur.

- A class/type of the intended design is an instantiation, specialization or refinement of an existing design component. Then we may ask whether a rearrangement of requirements would enable the reuse of further instantiations, specializations or refinements that exist in the library.
- A class/type of the intended design is a variant of an existing library component, i.e. the first alternative is true for a reparameterization of this library component. Of course, this is always possible, since a pure parameter would satisfy this requirement. Hence we have to judge whether it is helpful to take the reuse of the reparameterization into consideration. This approach is similar to the use of a similarity measure in case-based reasoning.

Once we have discovered a reusable variant in the library, we may keep track of the difference to the intended design and propagate these changes along the existing hierarchies. Then we may ask whether the resulting components can be directly reused. This suggests a modification of the refinement-based design methodology. Before starting a refinement process, existing domain-specific libraries are examined and variants are built. Then the refinement process is based on selected variants. Moreover, variant construction is also required after standard refinement steps that introduce new types or classes, since for these there may also exist variants in some library.
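The idea of keeping track of the difference between a library component and an intended design can be pictured with a small Python sketch. It rests on strong simplifying assumptions: tuple types are encoded as flat dictionaries from field names to type names, two fields match only if name and type name are identical, and the nested type of insurance-times is replaced by the placeholder name `INSURANCE-TIMES`. The field names of `INSURANT_OLD` follow the class Insurant_old of Example 11, whereas `INSURANT_NEW` and the functions `common_supertype` and `difference` are hypothetical and not part of the OODM.

```python
from typing import Dict

TupleType = Dict[str, str]  # field name -> type name (deliberately flat)


def common_supertype(t1: TupleType, t2: TupleType) -> TupleType:
    """Keep the fields on which both tuple types agree in name and type."""
    return {field: ty for field, ty in t1.items() if t2.get(field) == ty}


def difference(component: TupleType, supertype: TupleType) -> TupleType:
    """Fields of a component that the common supertype omits."""
    return {f: ty for f, ty in component.items() if f not in supertype}


INSURANT_OLD: TupleType = {
    "contract-no": "NAT",
    "name": "NAME",
    "address": "ADDRESS",
    "sex": "SEX",
    "insurance-times": "INSURANCE-TIMES",  # placeholder for the nested type
    "account-no": "NAT",
    "employed-by": "Company",
    "family": "{NAME}",
    "agency": "AGENCY",
}

# Hypothetical stand-in for the intended design; not taken from the paper.
INSURANT_NEW: TupleType = {
    "contract-no": "NAT",
    "name": "NAME",
    "address": "ADDRESS",
    "sex": "SEX",
    "insurance-times": "INSURANCE-TIMES",
    "family": "{NAME}",
}

common = common_supertype(INSURANT_OLD, INSURANT_NEW)
dropped = difference(INSURANT_OLD, common)
```

Here `dropped` contains account-no, employed-by and agency, which is the information a variant of the old component would have to omit. A realistic comparison would also need subtyping on nested types and the data restructuring discussed in [12], which this sketch ignores.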

Example 11. The class Insurant in Example 3 corresponds to current legal requirements. Some years ago an initial schema for an insurance application would have looked slightly different, since only main insurants existed at that time. This could have been modelled by some class Insurant_old.

    Class Insurant_old =
      Structure ( contract-no : NAT ,
                  name : NAME ,
                  address : ADDRESS ,
                  sex : SEX ,
                  insurance-times : [ (begin : DATE , end : DATE ∪ ⊥ ) ] ,
                  account-no : NAT ,
                  employed-by : Company ,
                  family : { NAME } ,
                  agency : AGENCY )
      Method ...
    End Insurant_old

Where

Assume such an initial design and all refinements to be kept in some library. Omitting account-no, employed-by and agency in the structure expression above would give a common supertype of the representation types for Insurant_old and Insurant in Example 3. Then build variants of all the existing refinements just omitting this information and check whether these are compatible with the new requirements. This avoids repeating refinement steps that occurred (in modified form) already in the past. Finally, specialize Insurant as indicated in Example 8 and build variants of the refined classes Insurant and Main-Insurant with respect to the hierarchy developed so far. Again this should avoid repeating earlier refinement steps. □

The concretization and theoretical treatment of these ideas for the outlined methodology is a research issue under current investigation.

7 Inferences in OODB Design

The work reported in the preceding sections presents first principles of object oriented database design. The main scenario is centered around stepwise refinement on the basis of an object oriented datamodel supporting class abstraction, generic update operations and declarative constraint specification. The datamodel as well as the design process involve a lot of supporting inferences. These fall into two classes. Let us first describe those inferences that are intrinsic to the datamodel.

- The datamodel supports type and class hierarchies. Since methods on subclasses may override inherited methods, we have to check that these are indeed specializations in order to limit undesired arbitrariness.
- The datamodel supports strongly typed methods, hence the problem of checking type correctness. A more general problem is the verification of consistency for constraints that evade enforcement.
- The datamodel supports generic updates, but these only exist in the case of value-representability. This leads to the problem of whether a uniqueness constraint is implied.

The second class of inferences is required by the design methodology and is extrinsic to the datamodel.

- The main scenario is based on stepwise refinement. Hence the task to verify formal refinement conditions. However, for the standard refinement steps in Section 3 this is redundant, since they have already been proven to be correct.
- In order to enforce consistency, the formal requirement of I-reducedness [19] has to be satisfied. Hence the task to check it.
- Finally, we have to recognize the relation of an intended design to existing library components, i.e. whether it is an instantiation, specialization, refinement or variant. This may involve data restructuring as shown in [12]. Moreover, once a useful variant has been detected, we may want to propagate the changes along the different hierarchies. This kind of variation-based reuse is still a research issue that we are working on.
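The remark on value-representability and uniqueness constraints can be illustrated at the level of a single database state. The Python sketch below assumes one particular reading, namely that a uniqueness constraint forbids two distinct identifiers whose values agree under a subtype function c1; this reading, the encoding of a class content as a set of (identifier, value) pairs, and the function name `uniqueness_clashes` are assumptions made here. The sketch only inspects given data and does not decide whether a uniqueness constraint is implied by a schema, which is the actual inference problem named in the text.

```python
from typing import Any, Callable, Hashable, List, Set, Tuple

Extent = Set[Tuple[int, Any]]  # assumed encoding: (identifier, value) pairs


def uniqueness_clashes(
    x_c: Extent, c1: Callable[[Any], Hashable]
) -> List[Tuple[int, int]]:
    """Return pairs of distinct identifiers whose values agree under c1."""
    first_seen = {}
    clashes = []
    # Sort by identifier so that the reported pairs are deterministic.
    for ident, value in sorted(x_c, key=lambda pair: pair[0]):
        key = c1(value)
        if key in first_seen and first_seen[key] != ident:
            clashes.append((first_seen[key], ident))
        else:
            first_seen.setdefault(key, ident)
    return clashes


def identity(value):
    """Hypothetical subtype function that keeps the whole value."""
    return value
```

An empty result means that, in the inspected state, objects can be told apart by their c1-values alone.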

However, there are still open research problems. So far, we do not know the exact boundary of the inferences. Another problem is the integration of user interfaces and graphical support in order to make it easier to check whether the design fits the amount of knowledge resulting from the current stage of requirements analysis. Currently, there is a research project CODE (Computer-aided Object oriented Design Environment) that aims at solving these open problems. The main research topics of CODE will be the extension of the design method toward variation-based reuse and the support of the outlined methodology by a CASE tool.

References
1. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The object-oriented database system manifesto, Proc. 1st DOOD, Kyoto 1989
2. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol. 5 (4), 1990, pp. 353-382
3. G. Booch: Object-oriented design with applications, Benjamin Cummings, 1991
4. L. Cardelli, P. Wegner: On understanding types, data abstraction and polymorphism, ACM Computing Surveys, vol. 17 (4), pp. 471-522
5. S. Ceri, J. Widom: Deriving production rules for constraint maintenance, Proc. 16th Conf. on VLDB, Brisbane (Australia), August 1990, pp. 566-577
6. P. Coad, E. Yourdon: Object-oriented analysis, Prentice Hall, 1991
7. C. Floyd: A comparative evaluation of system development methods, in T. W. Olle, H. G. Sol, A. A. Verrijn-Stuart (Eds.): Information Systems Design Methodologies - Improving the Practice, Elsevier 1986
8. P. Fraternali, S. Paraboschi, L. Tanca: Automatic rule generation for constraint enforcement in active databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS
9. R. Hull, R. King: Semantic database modeling: survey, applications and research issues, ACM Computing Surveys, vol. 19 (3), September 1987
10. W. Kim: Object-oriented databases: definition and research directions, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (3), 1990, pp. 327-341
11. B. Meyer: Object-oriented software construction, Prentice-Hall, 1988
12. B. Piza, K.-D. Schewe, J. W. Schmidt: Term subsumption with type constructors, in Y. Yesha (Ed.): Proc. 1st Int. Conf. on Information and Knowledge Management, Baltimore, November 1992
13. G. Saake, R. Jungclaus: Specification of database applications in the TROLL language, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems, Glasgow, July 1991, Springer WICS, pp. 228-245
14. K.-D. Schewe: Variant construction using constraint propagation techniques over semantic networks, in J. Retti, K. Leidlmaier (Eds.): Proc. of 5th Austrian AI Conference, Igls (Austria) 1989, Springer IFB 208, pp. 188-197
15. B. Schewe, K.-D. Schewe, B. Thalheim: Verfeinerungsschritte für eine objektorientierte Entwurfsmethodik, in Proc. 23rd GI-Jahrestagung, Dresden (Germany), October 1993
16. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, genericity and consistency in object-oriented databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Berlin (Germany), October 1992, Springer LNCS 646, pp. 341-356
17. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity enforcement in object-oriented databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS
18. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity preserving updates in object oriented databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. Australian Database Conference, Brisbane (Australia), February 1993, World Scientific, pp. 171-185
19. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint CS-08-92, December 1992, submitted for publication
20. S. Shlaer, S. J. Mellor: An object-oriented approach to domain analysis, ACM Software Engineering Notes, vol. 14 (3), 1989
21. C. Sernadas, P. Gouveia, J. Gouveia, A. Sernadas, P. Resende: The reification dimension in object-oriented database design, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems, Glasgow, July 1991, Springer WICS, pp. 275-299
22. B. Thalheim: Dependencies in relational databases, Teubner, Leipzig 1991
23. B. Thalheim: Intelligent database design using an extended entity-relationship model, University of Rostock, Preprint CS-11-91, December 1991
