Readings in Fundamentals of Object Oriented Databases

- Selected Papers -

Klaus-Dieter Schewe 1 , Bernhard Thalheim 2 

1 Technische Universität Clausthal
Institut für Informatik, Erzstr. 1
38678 Clausthal-Zellerfeld

2 Technische Universität Cottbus
Institut für Informatik, Karl-Marx-Str. 17
03044 Cottbus

Table of Contents 

1 Fundamental Concepts of Object Oriented Databases
2 Identification as a Primitive of Database Models
3 Fundamentals of Object Oriented Database Modelling
4 Higher-Level Genericity in Object Oriented Databases
5 Towards a Theory of Consistency Enforcement
6 Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement
7 Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications
8 Consistency Enforcement in Entity-Relationship and Object-Oriented Models
9 Principles of Object Oriented Database Design
10 View-Centered Conceptual Modelling

Preface 

This report contains a collection of selected papers on "Fundamentals of Object Oriented
Databases". This work started with a small working group meeting monthly at Hamburg
University. The original participants were Ingrid Wetzel, Bernhard Thalheim and Klaus-Dieter
Schewe. The primary intention was to bring together conceptual modelling and formal
specification approaches to database design and to set up solid mathematical foundations. In
these meetings we detected rather early that the major points to focus on were identification,
genericity and consistency.

After first tentative papers addressing these problems (unpublished or documented in
technical reports, e.g. [2, 4, 19]), the first paper containing just the problem areas above in
its title was presented at ICDT '92 in Berlin [3]. In parallel, the GCS approach to consistency
enforcement was set up [9]. In collaboration with David Stemple we even discovered linguistic
reflection as the fundamental implementation issue for these tasks [20]. Chapter 1 contains
a reprint of a polished journal version of this work published in Acta Cybernetica [6], also
summarized in [5]. Chapter 2 contains a deeper investigation of the identification problem by
Catriel Beeri and Bernhard Thalheim [1]. Chapter 3 contains a follow-on paper [7], in which
the original idea from formal specifications to consider arbitrary underlying type systems was
taken up and the connection to higher-order intuitionistic logic was established. Chapter 4 is a
reprint of the paper presented at the 1994 Indian conference on "Management of Data" [20].
Subsequently the GCS approach was developed carefully, which after some preliminary
work [11, 10] led to the fundamental journal article [8] in Acta Informatica and a
follow-on article [14]. These are reprinted in Chapter 5 and Chapter 6.

From the beginning there was a struggle with the "Active Database" community. Researchers
believing in the unlimited power of the rule-based approach were not very enthusiastic
about our fundamental approach to consistency enforcement, which relies on specification
language semantics. In particular, our emphasis on the need for a formal definition of the
goal of consistency enforcement in databases that encompasses just termination, confluence
and consistency was (and still is) not generally accepted. Therefore, we started a side activity
on the limits of rule-based systems [12, 13, 15, 16, 17, 18] in the context of transaction
specifications. Chapter 7 contains a reprint of the polished article [13] published in Acta
Cybernetica, which contains the major theoretical issues. The Data & Knowledge Engineering
article [18], reprinted in Chapter 8, contains an application of part of that theory to simple
object oriented schemata.

Finally, we worked on the design of object oriented databases. The tie-in with formal
specifications initiated a refinement-based approach in [21], reprinted in Chapter 9. In the
meantime Bettina Schewe brought up the idea of an integrated user interface design through
the use of dialogue objects. This led to the work in [22, 23] and additional articles, reports
and books written in German. We reprint [23] in Chapter 10.


Acknowledgement 

We would like to thank our coauthors and all others who stimulated ideas presented in our 

articles. 

References 

1. C. Beeri, B. Thalheim. Identification as a Primitive of Database Models. In T. Polle, T. Ripke, K.-D. Schewe (Eds.). Fundamentals of Information Systems. Kluwer 1998.

2. K.-D. Schewe, B. Thalheim, I. Wetzel, J.W. Schmidt. Extensible, safe object oriented design of database applications. Preprint CS-09-91. Rostock University. 1991.

3. K.-D. Schewe, J.W. Schmidt, I. Wetzel. Identification, genericity and consistency in object oriented databases. In J. Biskup, R. Hull (Eds.). Proc. Int. Conf. on Database Theory - ICDT '92. Springer LNCS 646. Berlin 1992.

4. K.-D. Schewe, B. Thalheim, I. Wetzel. Foundations of object oriented database concepts. Technical Report FBI-HH-B-157/92. Hamburg University. 1992.

5. K.-D. Schewe, B. Thalheim. Towards a formal foundations of object oriented databases. SIGMOD workshop Combining declarative and object oriented databases. Washington 1993.

6. K.-D. Schewe, B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica, vol. 11 (4), 49-84. Szeged 1993.

7. K.-D. Schewe. Fundamentals of object oriented database modelling. Intelligent Systems. Moskau 1997.

8. K.-D. Schewe, B. Thalheim. Towards a theory of consistency enforcement. Acta Informatica (to appear).

9. K.-D. Schewe, B. Thalheim, J.W. Schmidt, I. Wetzel. Enforcing integrity in object oriented databases. In U. Lipeck, B. Thalheim (Eds.). Modelling Database Dynamics. Springer Workshops in Computer Science. London 1993.

10. K.-D. Schewe, B. Thalheim, I. Wetzel. Integrity preserving updates in object oriented databases. In M. Orlowska, M. Papazoglou (Eds.). Advances in Databases - ADC '93. World Scientific. Singapore 1993.

11. K.-D. Schewe, B. Thalheim. Computing Consistent Transactions. Preprint CS-08-92. Rostock University 1992.

12. K.-D. Schewe, B. Thalheim. Achieving Consistency in Active Databases. In S. Chakravarty, J. Widom (Eds.). Research Issues in Data Engineering - Active Database Systems. Houston 1994.

13. K.-D. Schewe, B. Thalheim. Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications. Acta Cybernetica (to appear).

14. K.-D. Schewe. Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Dagstuhl 1996.

15. K.-D. Schewe, B. Thalheim. Active Consistency Enforcement for Repairable Database Transitions. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Dagstuhl 1996.

16. K.-D. Schewe, B. Thalheim. On the strength of rule triggering systems for integrity maintenance. In C. McDonald (Ed.). Database Systems '98. Australian Computer Science Communications, vol. 20 (2), 77-88. Springer 1998.

17. K.-D. Schewe. Well-behaving rule systems for Entity-Relationship and object oriented models. In D. Embley, R. Goldstein (Eds.). Conceptual Modeling - ER '97. Springer LNCS 1331, 141-154. New York 1997.

18. K.-D. Schewe. Consistency Enforcement in Entity-Relationship and Object Oriented Models. Data & Knowledge Engineering 1998 (to appear).

19. K.-D. Schewe, J.W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel. A reflective approach to method generation in object oriented databases. Rostocker Informatik Berichte, vol. 14. Rostock University. 1992.

20. K.-D. Schewe, D. Stemple, B. Thalheim. Higher level genericity in object oriented databases. Proc. Int. Conf. Management of Data. Bangalore 1994.

21. K.-D. Schewe, B. Thalheim. Principles of object oriented database design. In H. Jaakkola et al. (Eds.). Information Modelling and Knowledge Bases V, 227-242. IOS Press. Amsterdam 1994.

22. B. Schewe, K.-D. Schewe. A user-centered method for the development of data-intensive dialogue systems. In E. Falkenberg, W. Hesse (Eds.). Information System Concepts, 88-103. Chapman & Hall 1995.

23. K.-D. Schewe, B. Schewe. View-centered conceptual modelling - an object oriented approach. In B. Thalheim (Ed.). Conceptual Modeling - ER '96. Springer LNCS. Berlin 1996.

Chapter 1 

Fundamental Concepts of Object Oriented Databases

Contents 

1.1 Introduction
1.2 A Motivating Example
1.3 A Core Object Oriented Datamodel
    1.3.1 A Simple Type System
    1.3.2 The Class Concept as a Structural Primitive
    1.3.3 User Defined Integrity Constraints
    1.3.4 Methods as a Basis for Behaviour Modelling
    1.3.5 Queries and Views
1.4 The Object Identification Problem
    1.4.1 The Notion of Value-Representability
    1.4.2 Value-Representability in the Case of Acyclic Reference Graphs
    1.4.3 Computation of Value Representation Types
    1.4.4 The Finiteness Property
    1.4.5 Weak Value-Representability
1.5 The Genericity Problem
    1.5.1 Generic Update Methods
    1.5.2 Generic Updates in the Case of Value-Representability
1.6 The Consistency Problem
    1.6.1 Greatest Consistent Specializations
    1.6.2 Enforcing Integrity in the OODM
1.7 Conclusion

This chapter contains a reprint of 

K.-D. Schewe, B. Thalheim. Fundamental Concepts of Object Oriented Databases. 

Acta Cybernetica, vol. 11, no. 1-2, 49 - 84. Szeged 1993. 


Abstract. It is claimed that object oriented databases (OODBs) overcome many of the
limitations of the relational model. However, the formal foundation of OODB concepts is
still an open problem. Even worse, for relational databases a commonly accepted datamodel
existed very early on, whereas for OODBs the unification of concepts is missing. The work
reported in this paper contains the results of our first investigations on a formally founded
object oriented datamodel (OODM) and is intended to contribute to the development of a
uniform mathematical theory of OODBs.

A clear distinction between objects and values turns out to be essential in the OODM.
Types and classes are used to structure values and objects respectively. Then the problem
of unique object identification occurs. We show that this problem can be solved
for classes with extents that are completely representable by values. Such classes are called
value-representable.

Another advantage of the relational approach is the existence of structurally determined
generic update operations. We show that this property can be carried over to object-oriented
datamodels if classes are value-representable. Moreover, in this case database consistency
with respect to implicitly specified referential and inclusion constraints will be automatically
preserved.

This result can be generalized with respect to distinguished classes of explicitly stated
static constraints. Given some arbitrary method and some integrity constraint there exists a
greatest consistent specialization (GCS) that behaves nicely in that it is compatible with the
conjunction of constraints. We present an algorithm for the GCS construction of user-defined
methods and describe the GCSs of generic update operations that are required herein.

1.1 Introduction 

The shortcomings of the relational database approach encouraged much research aimed at
achieving more appropriate data models. It has been claimed that the object-oriented approach
will be the key technology for future database systems and languages [8]. Several systems
[4, 6, 7, 9, 15, 16, 17, 19, 26, 36, 37, 38] arose from these efforts. However, in contrast to
research in the relational area there is no common formal agreement on what constitutes an
object-oriented database [10, 11, 13].

The basic question "What is an object?" seems to be trivial, but already here the variety
of answers is large. In object oriented programming the notion of an object was intended as
a generalization of the abstract data type concept with the additional feature of inheritance.
In this sense object orientation involves the isolation of data in semi-independent modules in
order to promote high software development productivity. The development of object oriented
databases regarded an object also as a basic unit of persistent data, a view that is heavily
influenced by existing semantic datamodels (SDMs) [2, 29, 31, 39, 40, 60]. Thus, object oriented
databases are composed of independent objects but must also provide for the maintenance of
inter-object consistency, a demand that is to some degree in dissonance with the basic style
of object orientation.

A view that is common in OODB research is that objects are abstractions of real world
entities and should have an identity [8]. This leads to a distinction between values and objects
[10, 11]. A value is identified by itself whereas an object has an identity independent of its
value. This object identity is usually encoded by object identifiers [1, 3, 34]. Abstracting from
the pure physical level, the identifier of an object can be regarded as being immutable during
the object's lifetime. Identifiers ease the sharing and update of data. However, such abstract
identifiers do not relieve us from the task of providing unique identification mechanisms for
objects. In object oriented programming object names are sufficient, but retrieving mass data
by name is senseless.

In most approaches to OODBs an object is coupled with a value of some fixed structure.
In our view this already contradicts the goal of objects being abstractions of reality.
In real situations an object has several and also changing aspects that should be captured by
the object model. Therefore, in our object model each object $o$ consists of a unique identifier
$id$, a set of (type, value) pairs $(T_i, v_i)$, a set of (reference, object) pairs $(ref_j, o_j)$ and a set
of methods $meth_k$.
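The object structure just described can be illustrated with a small sketch. It is only an assumption-laden illustration, not the formal model: the class name `OodmObject`, the use of dictionaries keyed by type and reference names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# eq=False / repr=False: objects are compared by identity, and cyclic
# references (spouse <-> spouse) must not trigger recursive comparisons.
@dataclass(eq=False, repr=False)
class OodmObject:
    ident: str                                                    # abstract identifier id
    values: Dict[str, Any] = field(default_factory=dict)          # type name T_i -> value v_i
    refs: Dict[str, "OodmObject"] = field(default_factory=dict)   # reference ref_j -> object o_j
    methods: Dict[str, Callable[..., Any]] = field(default_factory=dict)  # meth_k

john = OodmObject("i1")
mary = OodmObject("i2")

# One object, several aspects (type names here are illustrative).
john.values["PERSON"] = (123, ("John", "Denver"))
john.values["T_Prof"] = (123, 48, 8000)

# References may be cyclic.
john.refs["Spouse"] = mary
mary.refs["Spouse"] = john

# Aspects may change over time while the identifier stays the same.
del john.values["T_Prof"]
```

In this reading, dropping or adding an aspect is an ordinary value change on the same object, which is why no separate notion of object migration is needed.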

Types are used to structure values. Classes serve as structuring primitives for objects
having the same structure and behaviour. The multiple-aspects view of an
object allows objects to be members of more than one class simultaneously and to change class
memberships. This setting also makes every discussion on "object migration" unnecessary, as
migration is only a specific form of value change.

In our model a class structure uniformly combines aspects of object values and references.
The extent of classes varies over time, whereas types are immutable. Relationships between
classes are represented by references together with referential constraints on the object
identifiers involved. Moreover, each class is accompanied by a collection of methods. A schema is
given by a collection of class definitions together with explicit integrity constraints.

The Identification Problem. One important concept of object-oriented databases is object
identity. Following [1, 12] the immutable identity of an object can be encoded by the concept
of abstract object identifiers. The advantages of this approach are that sharing, mutability
of values and cyclic structures can be represented easily [42]. On the other hand, object
identifiers do not have a meaning for the user and should therefore be hidden.

We study whether equality of identifiers can be derived from the equality of values. In the
literature the notion of "deep" equality has been introduced for objects with equal values and
references to objects that are also "deeply" equal. This recursive definition becomes interesting
in the case of cyclic references.
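To see why cyclic references make the recursive definition interesting, consider the following sketch. It assumes one common reading of deep equality, namely that a pair of objects is taken as equal while its own references are being compared, so that cycles terminate; the encoding of objects as `(value, references)` pairs and all identifiers are hypothetical.

```python
def deep_equal(a, b, objects, assumed=None):
    """Deep equality on identifiers a and b.

    objects maps an identifier to (value, {reference name: identifier}).
    Pairs currently under comparison are assumed equal, which makes the
    recursion terminate on cyclic reference structures.
    """
    assumed = set() if assumed is None else assumed
    if a == b or (a, b) in assumed:
        return True
    assumed.add((a, b))
    (value_a, refs_a), (value_b, refs_b) = objects[a], objects[b]
    if value_a != value_b or refs_a.keys() != refs_b.keys():
        return False
    return all(deep_equal(refs_a[k], refs_b[k], objects, assumed) for k in refs_a)

objects = {
    "i1": ("x", {"next": "i2"}),
    "i2": ("y", {"next": "i1"}),   # cycle i1 -> i2 -> i1
    "i3": ("x", {"next": "i4"}),
    "i4": ("y", {"next": "i3"}),   # structurally identical cycle
    "i5": ("x", {"next": "i5"}),   # self-loop with a different shape
}
```

Here `i1` and `i3` are deeply equal although their identifiers differ, so equality of identifiers cannot in general be recovered from values and references alone.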

Therefore, we introduce uniqueness constraints, which express equality on identifiers as a
consequence of the equality of some values or references. On this basis we can address the
problem of how to characterize those classes that are completely representable (and hence also
identifiable) by values.

Generic Update Operations. The success of the relational data model is certainly due to the
existence of simple query and update languages. Preserving the advantages of the relational
approach in OODBs is a serious goal.

The generic querying of objects has been approached in [1, 12]. While querying is per se
a set-oriented operation, i.e. it is not necessary to select just one single object, and hence
does not raise any specific problems with object identifiers, things change completely in the case
of updates. If an object with a given value is to be updated (or deleted), this is only defined
unambiguously if there does not exist another object with the same value. If more than one
object exists with the same value or, more generally, with the same value and the same references
to other objects, then the user has to decide whether an update or delete operation is
applied to all these objects, to only one of these objects selected non-deterministically, or to
none of them, i.e. to reject the operation. However, it is not possible to specify a priori such
an operation that works in the same way for all objects in all situations. The same applies
to insert operations. Hence the problem arises in which cases operations for the insertion, deletion
and update of objects can be defined generically.

Some authors [43] have chosen the solution to abandon generic operations. Others [6, 7, 9]
use identifying values to represent object identity and thus embody a strict concept of surrogate
keys to avoid the problem. Our approach is different from both solutions in that we use the
concept of hidden abstract identifiers, but at the same time formally characterize those classes
for which unique generic operations for the insertion, deletion and update of single objects
can be derived automatically. It turns out that these are exactly the value-representable ones.

The Consistency Problem. One of the primary benefits that database systems offer is automatic
enforcement of database integrity. One type of integrity is maintained through automatic
concurrency control and recovery mechanisms; another is the automatic enforcement
of user-specified integrity constraints. Most commercial database systems, especially
relational database management systems, enforce only a bare minimum of constraints, largely
because of the performance overhead associated with updates.

The maintenance problem is the problem of how to ensure that the database satisfies its
constraints after certain actions. There are at present two approaches to this maintenance
problem. The first, more classical one is the modification of methods in accordance with the
specified integrity constraints. The second approach uses generation mechanisms for the specified
events. Upon occurrence of certain database events, such as update operations, the management
component is activated for integrity maintenance. The first research direction did not succeed
because of some limitations within the approach. The second one is at present one of the most
active database research areas. One of our objectives is to show that the first approach can
be extended to object-oriented databases using stronger mathematical fundamentals.

Accuracy is an obviously important and desirable feature of any database. To this end,
integrity constraints, conditions that data must satisfy before a database is updated, are
commonly employed as a means of helping to maintain consistency. In relational databases
the specification and enforcement of integrity constraints has a long tradition [61], whereas
in OODBs the integrity problem has only recently drawn attention [48].

In object oriented databases, integrity maintenance can be based on two different approaches.
The first one uses blind update operations. In this case, any update is allowed and
the system organizes the maintenance. The second approach is based on method rewriting.
This approach is more effective. Assuming a consistent database state, the modified method
cannot lead to an inconsistent state.

In relational databases distinguished classes of static integrity constraints have been discussed,
such as inclusion, exclusion, functional, key and multi-valued dependencies. All these
constraints can be generalized to the object oriented case. Then the result on the existence
of integrity preserving methods can be generalized to capture also these constraints. We shall
also describe the resulting methods.

The Organization of the Paper. We start with a motivating example in Section 1.2, then
introduce in Section 1.3 a core OODM to formalize the concepts used intuitively in the
example. In Section 1.4 the notions of (weak) value-representability are introduced in order
to handle the identification problem. The genericity problem will be approached in Section
1.5. We show the relationship between value-representability and the unique existence of
generic update operations. The consistency problem is dealt with in Section 1.6. We outline an
operational approach based on the computation of greatest consistent specializations (GCSs).
Since the algorithm used allows the problem to be reduced to basic update operations, we
describe the GCSs of these. We summarize our results and describe some open problems in
Section 1.7.

1.2 A Motivating Example 

In this section we start by giving a completely informal introduction to the OODM on the basis
of a simple university example. We first introduce types and classes, then show an example
of a database instance, i.e. the content of the database at a given timepoint. The representation
of an instance requires object identifiers. Then we extend the example by introducing
user-defined constraints. We shall see that this enables alternative representations without
using identifiers, and hence leads to the notion of value-representability. Finally, we indicate the
definition of methods as a means to model database dynamics. For the sake of simplicity we
only describe a generic update method that can be generated by the system.

As already said in the introduction, we distinguish between values and objects, the
main difference being that values identify themselves whereas objects require an additional
external identification mechanism. Types are used to structure values. Thus, let us first give
some examples of types.

Example. Basically, every type can be built from a few predefined basic types such as
BOOL, NAT, STRING, etc. and also predefined type constructors for records, finite sets,
lists, unions, etc.

The type definition for PERSONNAME uses both a set constructor {·} and a (tagged)
record constructor (·):

    Type PERSONNAME
      = ( FirstName : STRING ,
          SecondName : STRING ,
          Titles : { STRING } )
    End PERSONNAME

The definition of a type PERSON uses the type PERSONNAME.

    Type PERSON
      = ( PersonIdentityNo : NAT ,
          Name : PERSONNAME )
    End PERSON

The following defines STUDENT as a subtype of PERSON, i.e. we can naturally project
each value of type STUDENT onto a value of type PERSON.

    Type STUDENT
      = ( PersonIdentityNo : NAT ,
          StudNo : NAT ,
          Name : PERSONNAME )
    End STUDENT
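The natural projection from STUDENT onto PERSON can be sketched as follows. This is a minimal illustration under stated assumptions: it takes STUDENT to carry the components PersonIdentityNo, StudNo and Name (as the student tuples of the example instance suggest), and the Python class and function names are hypothetical.

```python
from typing import FrozenSet, NamedTuple

class PersonName(NamedTuple):
    first_name: str
    second_name: str
    titles: FrozenSet[str]      # set constructor applied to STRING

class Person(NamedTuple):
    person_identity_no: int
    name: PersonName

class Student(NamedTuple):
    person_identity_no: int
    stud_no: int
    name: PersonName

def student_to_person(s: Student) -> Person:
    """Subtype projection: forget the StudNo component."""
    return Person(s.person_identity_no, s.name)

john = Student(456, 1023, PersonName("John", "Stuart", frozenset()))
john_as_person = student_to_person(john)
```

Because the projection only drops components, every STUDENT value determines exactly one PERSON value, which is the sense in which STUDENT is a subtype of PERSON.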

Besides these definitions of types as sets of values we may also define new type constructors
as follows, where α is a parameter for this new constructor:

    Type MPERSON (α)
      = ( PersonIdentityNo : NAT ,
          Spouse : α )
    End MPERSON

□

Next we use these types to build the structural part of an OODM schema. We define a schema
as a collection of classes and a class as a variable collection of objects.

Example. Each object in a class has a structure, which combines aspects of values associated
with the object and references to other objects. This structure can be based on a type
definition as above or itself involve a (nameless) type definition. Moreover, class definitions
involve IsA relations in order to model objects in more than one class. We use ∘ to indicate
concatenation for record types.

    Schema University

      Class PersonC
        Structure PERSON
      End PersonC

      Class MarriedPersonC
        IsA PersonC
        Structure ( PersonIdentityNo : NAT ,
                    Spouse : MarriedPersonC )
      End MarriedPersonC

      Class StudentC
        IsA PersonC
        Structure STUDENT ∘
                  ( Supervisor : ProfessorC ,
                    Major : DepartmentC ,
                    Minor : DepartmentC )
      End StudentC

      Class ProfessorC
        IsA PersonC
        Structure ( PersonIdentityNo : NAT ,
                    Age : NAT ,
                    Salary : NAT ,
                    Faculty : DepartmentC )
      End ProfessorC

      Class DepartmentC
        Structure ( DeptName : STRING )
      End DepartmentC

□

In principle, we are now able to describe the content of the database at a given timepoint. For
such database instances we need a type ID of object identifiers that is used for two purposes,
first as a unique and efficient internal identification mechanism for objects and second for
modelling objects in different classes and references to other objects. In this case each class
will be associated with a representation type that can be used directly for storing objects.

Example.

We use D as a name for the instance.

    D(PersonC) = { (i_1, (123, ("John", "Denver", {"Professor", "Dr"}))),
                   (i_2, (124, ("Mary", "Stuart", {"Dr"}))),
                   (i_3, (456, ("John", "Stuart", {}))),
                   (i_4, (567, ("Laura", "James", {}))),
                   (i_5, (987, ("Dave", "Ford", {}))) }

    D(MarriedPersonC) = { (i_1, (123, i_2)),
                          (i_2, (124, i_1)) }

    D(ProfessorC) = { (i_1, (123, 48, 8000, i_6)) }

    D(StudentC) = { (i_3, (456, 1023, ("John", "Stuart", {}), i_1, i_6, i_7)),
                    (i_4, (567, 2134, ("Laura", "James", {}), i_1, i_6, i_7)) }

    D(DepartmentC) = { (i_6, ("Computer Science")),
                       (i_7, ("Philosophy")),
                       (i_8, ("Music")) }

□

Note that the following three conditions are satisfied by the instance:

- The object identifiers are unique within a class,
- the IsA relations in the schema give rise to set inclusion relationships for the underlying
  sets of identifiers (inclusion integrity), and
- the identifiers occurring within an object's value at a place corresponding to a reference
  always occur as an object identifier in the referenced class (referential integrity).

We shall always refer to these conditions as model inherent constraints that must be satisfied
by each instance. Other integrity constraints can be defined by the user and added to the
schema in order to capture more application semantics, as shown in the next example.
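The three model inherent constraints can be checked mechanically on the example instance. The following is a sketch under assumptions: the encoding of an instance as a dictionary of `(identifier, value)` lists, the tables `ISA` and `REFS` (reference positions inside the value tuples), and all function names are illustrative choices, not part of the OODM.

```python
ISA = {"MarriedPersonC": "PersonC", "StudentC": "PersonC", "ProfessorC": "PersonC"}

# class -> {position inside the value tuple: referenced class}
REFS = {
    "MarriedPersonC": {1: "MarriedPersonC"},
    "ProfessorC": {3: "DepartmentC"},
    "StudentC": {3: "ProfessorC", 4: "DepartmentC", 5: "DepartmentC"},
}

no_titles = frozenset()
D = {
    "PersonC": [
        ("i1", (123, ("John", "Denver", frozenset({"Professor", "Dr"})))),
        ("i2", (124, ("Mary", "Stuart", frozenset({"Dr"})))),
        ("i3", (456, ("John", "Stuart", no_titles))),
        ("i4", (567, ("Laura", "James", no_titles))),
        ("i5", (987, ("Dave", "Ford", no_titles))),
    ],
    "MarriedPersonC": [("i1", (123, "i2")), ("i2", (124, "i1"))],
    "ProfessorC": [("i1", (123, 48, 8000, "i6"))],
    "StudentC": [
        ("i3", (456, 1023, ("John", "Stuart", no_titles), "i1", "i6", "i7")),
        ("i4", (567, 2134, ("Laura", "James", no_titles), "i1", "i6", "i7")),
    ],
    "DepartmentC": [
        ("i6", ("Computer Science",)),
        ("i7", ("Philosophy",)),
        ("i8", ("Music",)),
    ],
}

def ids(db, cls):
    return [ident for ident, _ in db[cls]]

def unique_ids(db):
    """Identifiers are unique within each class."""
    return all(len(ids(db, c)) == len(set(ids(db, c))) for c in db)

def inclusion_integrity(db):
    """IsA relations induce inclusion of identifier sets."""
    return all(set(ids(db, sub)) <= set(ids(db, sup)) for sub, sup in ISA.items())

def referential_integrity(db):
    """Every referenced identifier occurs in the referenced class."""
    return all(
        value[pos] in set(ids(db, target))
        for cls, refs in REFS.items()
        for _, value in db[cls]
        for pos, target in refs.items()
    )
```

Replacing the professor `i1` by an object with an identifier unknown to PersonC would violate inclusion integrity and, through the students' Supervisor references, referential integrity as well.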

Example. First let us express that there are no two persons with the same PersonIdentityNo,
no two students with the same StudNo and no two departments with the same name. In
order to formulate this, we use $x_P$, $x_S$ and $x_D$ to refer to the content of the classes PersonC,
StudentC and DepartmentC, and let $c_P : \mathit{PERSON} \to (\mathit{PersonIdentityNo} : \mathit{NAT})$
and $c_S : \mathit{STUDENT} \times \mathit{ID}^3 \to (\mathit{StudNo} : \mathit{NAT})$ be the functions that arise from the natural
projection to the components PersonIdentityNo and StudNo in PERSON and STUDENT
respectively. This gives the following uniqueness constraints.

\[
\begin{aligned}
&\forall i, j :: \mathit{ID}.\ \forall v, w :: \mathit{PERSON}.\ (i,v) \in x_P \wedge (j,w) \in x_P \wedge c_P(v) = c_P(w) \Rightarrow i = j \\
&\forall i, j :: \mathit{ID}.\ \forall v, w :: \mathit{STUDENT} \times \mathit{ID}^3.\ (i,v) \in x_S \wedge (j,w) \in x_S \wedge c_S(v) = c_S(w) \Rightarrow i = j \\
&\forall i, j :: \mathit{ID}.\ \forall v, w :: (\mathit{DeptName} : \mathit{STRING}).\ (i,v) \in x_D \wedge (j,w) \in x_D \wedge v = w \Rightarrow i = j
\end{aligned}
\tag{1.1}
\]

Let us further assume that the salary of a professor is determined by his/her age. For this
purpose, let $\mathit{Age}, \mathit{Salary} : T_{Prof} \to \mathit{NAT}$ be the natural projections to the Age and
Salary values respectively. Then we have the following functional constraint on the class
ProfessorC:

\[
\forall i, j :: \mathit{ID}.\ \forall v, w :: T_{Prof}.\ (i,v) \in x_{Prof} \wedge (j,w) \in x_{Prof} \wedge \mathit{Age}(v) = \mathit{Age}(w) \Rightarrow \mathit{Salary}(v) = \mathit{Salary}(w)
\tag{1.2}
\]

Next assume that we want to guarantee that the spouse of a person's spouse is the person
itself, which gives (with the abbreviations understood) the formula

\[
\forall i, j :: \mathit{ID}.\ \forall v, w :: T_{MP}.\ (i,v) \in x_{MP} \wedge (j,w) \in x_{MP} \wedge \mathit{Spouse}(v) = j \Rightarrow \mathit{Spouse}(w) = i
\tag{1.3}
\]

Note that all these constraints are also satisfied by the instance above.

□
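The user-defined constraints of this example can be evaluated against class extents with a few lines of code. This is a sketch only: extents are assumed to be lists of `(identifier, value)` pairs with values as tuples in the order given by the class structures, and the function names are hypothetical.

```python
def unique_on(extent, key):
    """Uniqueness constraint: equal key values imply equal identifiers."""
    seen = {}
    for ident, value in extent:
        if seen.setdefault(key(value), ident) != ident:
            return False
    return True

def functional(extent, lhs, rhs):
    """Functional constraint: equal lhs values imply equal rhs values."""
    seen = {}
    for _, value in extent:
        if seen.setdefault(lhs(value), rhs(value)) != rhs(value):
            return False
    return True

def spouse_symmetric(extent):
    """Constraint (1.3): if j is the spouse of i, then i is the spouse of j."""
    spouse = {ident: value[1] for ident, value in extent}
    return all(spouse[j] == i for i, j in spouse.items() if j in spouse)

departments = [("i6", ("Computer Science",)), ("i7", ("Philosophy",)), ("i8", ("Music",))]
professors = [("i1", (123, 48, 8000, "i6"))]
married = [("i1", (123, "i2")), ("i2", (124, "i1"))]

age = lambda v: v[1]
salary = lambda v: v[2]
```

For instance, adding a second professor of age 48 with a different salary makes `functional(professors, age, salary)` fail, which is exactly a violation of constraint (1.2).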

Now that we have added uniqueness constraints, the object identifiers used in instances correspond
one-to-one to values of some types associated with the classes. These are the so-called value
identification types $V_C$. Hence we could remove identifiers and represent the same information
in a purely value-based fashion. In our example the value representation type for the class
PersonC is simply PERSON, but for the class MarriedPersonC we need the recursive
type

    V_MP = PERSON ∘ ( Spouse : V_MP )

with values that are rational trees [45, 47].

So far only structural aspects (types, classes, constraints) have been considered. Let us 

now add methods to classes in order to model the dynamics of the database. In the OODM 

methods will be modelled in a simple procedural style. 

Example.

Let us describe an insert-method for the class PersonC.

    insert_PersonC (in: P :: PERSON, out: I :: ID) =
      IF ∃ O ∈ PersonC . value(O) = P
      THEN I := ident(O)
      ELSE I := NewId
           PersonC := PersonC ∪ {(I, P)}
      ENDIF

For an insertion into the class MarriedPersonC we need a more complex input type V
recursively defined as

    V = PERSON ∘ (V ∪ ID)

For each P :: V let f(P) :: PERSON be the projection onto PERSON corresponding to
the subtype relation between V and PERSON. Then we have

    insert_MarriedPersonC (in: P :: V, out: I :: ID) =
      I := insert_PersonC(f(P))
      IF ∀ O ∈ MarriedPersonC . ident(O) ≠ I
      THEN P' := substitute(I, P, Spouse(P))
           IF P' :: ID
           THEN J := P'
           ELSE J := insert_MarriedPersonC(P')
           ENDIF
           MarriedPersonC := MarriedPersonC ∪ {(I, f(P) ∘ (J))}
      ENDIF

We used the global method NewId to denote the selection of a new identifier. The expression
substitute(I, P, T) denotes the result of substituting the value I for P in the expression T. Later
we shall use a more abstract syntax oriented toward guarded commands [20, 41, 46]. □
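The behaviour of the insert method for PersonC can be mimicked in a few lines. This is a hedged sketch, not the OODM method language: the class extent is assumed to be a dictionary from identifiers to PERSON values, and `new_id` is a hypothetical stand-in for NewId that simply picks an unused name.

```python
def new_id(used):
    """Stand-in for NewId: return an identifier not yet in use."""
    n = 1
    while f"i{n}" in used:
        n += 1
    return f"i{n}"

def insert_person(person_c, p):
    """Insert value p into the extent person_c and return its identifier.

    If an object with value p already exists, its identifier is returned
    and the extent is left unchanged; otherwise a new object is created.
    """
    for ident, value in person_c.items():
        if value == p:
            return ident
    ident = new_id(person_c)
    person_c[ident] = p
    return ident

person_c = {"i1": (123, ("John", "Denver"))}
first = insert_person(person_c, (124, ("Mary", "Stuart")))
second = insert_person(person_c, (124, ("Mary", "Stuart")))
```

Inserting the same value twice yields the same identifier, so the operation is idempotent; this only works because a PERSON value determines at most one object in the class.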

Later we shall see that methods as described in this example are canonical and can be automatically
derived from the schema. Corresponding generic update methods look quite similar,
with the only difference that there is no output. Such generic update methods only exist for
value-representable classes, in which case, however, they enforce integrity with respect to the
model inherent constraints. Generic update methods need not be consistent with
respect to the user-defined constraints. To achieve this, we have to apply the GCS algorithm
to user-defined methods.

In the following sections we formally define the concepts above and prove the main results
on value representation, generic updates and integrity enforcement.

1.3 A Core Object Oriented Datamodel 

In this section we present a slightly modified version of the object oriented datamodel (OODM)
of [45, 47, 49]. We observe that an object in the real world always has an identity. Therefore,
abstract (i.e. system-provided) object identifiers are introduced to capture identity. However,
neither the real world object that was the basis of the abstraction nor the abstract identifier
can be used for the identification of an object.

In contrast to existing object oriented datamodels [1, 3, 4, 6, 7, 8, 9, 16, 17, 26, 36, 37, 42, 

43, 54] an object is not coupled with a unique type. In contrast, we observe that real world 

objects can have dierent aspects that may change over time. Therefore, a primary decision 

was taken to let an object be associated with more than one type and to let these types even 

change during the object's lifetime. The same applies to references to other objects. 

In the following let N P , N T , N C , N R , N F , N M and V denote arbitrary pairwise disjoint, 

denumerable sets representing parameter-, type-, class-, reference-, function-, method- and 

variable-names respectively. 

1.3.1 A Simple Type System 

Relational approaches to data modelling are called value-oriented since in these models real world entities are completely represented by their values. In the object-oriented approach we distinguish between objects and values. Values can be grouped into types. In general, a type may be regarded as an immutable set of values of a uniform structure together with operations defined on such values. Subtyping is used to relate values in different types.

In [12, 47, 49] algebraic type specifications as in [21, 23] have been used to allow open type systems. For the sake of simplicity we deviate here from this approach and follow the more classical view of [14, 15, 45] using a type system that consists of some basic types such as BOOL, NATURAL, INTEGER, STRING, etc., type constructors for records, finite sets, bags, lists, etc. and a subtyping relation. Moreover, assume the existence of recursive types, i.e. types defined by (a system of) domain equations. In principle we could use one of the type systems defined in [4, 5, 14, 15, 19, 24, 38]. In addition we suppose the existence of an abstract identifier type ID in the type system $\mathcal{T}$ without any non-trivial supertype. Arbitrary types can then be defined by nesting. A type $T$ without occurrence of ID will be called a value-type.

We shall proceed giving a more formal definition of types.

Definition 1.1. (i) A base type is either BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$.

(ii) Let $a_i \in N_F$ and $\alpha_i \in N_P$ ($i = 1, \dots, n$). A type constructor is either $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $\cup$ (union).

(iii) A type $t$ is either a base type, a type constructor, a generalized constructor that results from replacing some parameters in a type constructor by types, or a recursive type defined by an equation $t = \{\alpha/t\}.t'$, where $t'$ is a generalized constructor and one of its parameters $\alpha$ is replaced by $t \in N_T$.

In the latter two cases the remaining parameters of the type constructor together with the parameters of the replacing types yield the parameters $\alpha_1, \dots, \alpha_n$ of $t$.

(iv) A type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff there is no occurrence of ID in $t$.

(v) A type form consists of a type name $t \in N_T$ and a type $t'$ with possibly some of its parameters replaced by type names.

(vi) A type specification $T$ is a finite collection of type forms $t_1, \dots, t_n$ such that the only type names occurring herein are the names of $t_1, \dots, t_n$.

The semantics of such types as sets of values is defined as usual. Moreover, we assume the standard operators on base types and on records, sets, bags, etc. We omit the details here.

If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence relation

$$o : t \times t' \to BOOL .$$

Finally, we introduce subtypes. For a more detailed introduction to types see either [14] or [49].

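Definition 1.2. (i) A subtype relation $\le$ on types is given by the following rules:

(a) Every type $t$ is its own subtype and a subtype of $\bot$.

(b) NAT $\le$ INT $\le$ FLOAT.

(c) $(\dots, a_{i-1} : \alpha_{i-1}, a_i : \alpha_i, a_{i+1} : \alpha_{i+1}, \dots) \le (\dots, a_{i-1} : \alpha'_{i-1}, a_{i+1} : \alpha'_{i+1}, \dots)$ whenever $\alpha_j \le \alpha'_j$.

(d) $\{\alpha\} \le \{\alpha'\}$, $[\alpha] \le [\alpha']$ and $\langle\alpha\rangle \le \langle\alpha'\rangle$ whenever $\alpha \le \alpha'$.

(e) $\{\alpha\} \le \langle\alpha\rangle$ and $[\alpha] \le \langle\alpha\rangle$.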

(f) [ . 

i . 

(ii) A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by (a)-(f) above.


1.3.2 The Class Concept as a Structural Primitive 

The class concept provides the grouping of objects having the same structure, which uniformly combines aspects of object values and references. Moreover, generic operations on objects such as object creation, deletion and update of its values and references are associated with classes provided these operations can be defined unambiguously. Objects can belong to different classes, which guarantees that each object of our abstract object model is captured by the collection of possible classes. Just as values are only defined via types, objects can only be defined via classes.

Each object in a class consists of an identifier, a collection of values and references to objects in other classes. Identifiers can be represented using the unique identifier type ID. Values and references can be combined into a representation type, where each occurrence of ID denotes a reference to some other class. Therefore, we may define the structure of a class using parameterized types.

Definition 1.3. (i) Let $t$ be a value type with parameters $\alpha_1, \dots, \alpha_n$. For distinct reference names $r_1, \dots, r_n \in N_R$ and class names $C_1, \dots, C_n \in N_C$ the expression derived from $t$ by replacing each $\alpha_i$ in $t$ by $r_i : C_i$ for $i = 1, \dots, n$ is called a structure expression.

(ii) A structural class consists of a class name $C \in N_C$, a structure expression $S$ and a set of class names $D_1, \dots, D_m \in N_C$ (in the following called the set of superclasses). We call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$, the type $U_C = (ident : ID, value : T_C)$ is called the class type of $C$.

(iii) A (structural) schema $\mathcal{S}$ is a finite collection of structural classes $C_1, \dots, C_n$ closed under references and superclasses.

(iv) An instance $\mathcal{D}$ of a structural schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{U_C\}$ such that the following conditions are satisfied:

uniqueness of identifiers: For every class $C$ we have

$$\forall i :: ID.\ \forall v, w :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w . \tag{1.4}$$

inclusion integrity: For a subclass $C$ of $C'$ we have

$$\forall i :: ID.\ i \in dom(\mathcal{D}(C)) \Rightarrow i \in dom(\mathcal{D}(C')) . \tag{1.5}$$

Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have

$$\forall i :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') . \tag{1.6}$$

referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence relation $o_r$ we have

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in dom(\mathcal{D}(C')) . \tag{1.7}$$
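The three instance conditions can be checked mechanically on a finite instance. The sketch below is an illustration only and rests on assumptions that are not in the text: a class extension is modelled as a Python set of `(ident, value)` pairs, the occurrence relation $o_r$ is replaced by a hypothetical helper `referenced(v)` returning the set of identifiers `j` with $o_r(v, j)$, and all function names are invented.

```python
def dom(ext):
    """Identifiers occurring in a class extension."""
    return {ident for ident, _ in ext}


def unique_identifiers(ext):
    """(1.4): no identifier is paired with two different values."""
    seen = {}
    for ident, value in ext:
        if seen.setdefault(ident, value) != value:
            return False
    return True


def inclusion_integrity(sub_ext, super_ext, f=None):
    """(1.5), and additionally (1.6) when a subtype function f is given."""
    if not dom(sub_ext) <= dom(super_ext):
        return False
    if f is None:
        return True
    return all((i, f(v)) in super_ext for i, v in sub_ext)


def referential_integrity(ext, target_ext, referenced):
    """(1.7): every identifier referenced by a value exists in the target."""
    target_ids = dom(target_ext)
    return all(referenced(v) <= target_ids for _, v in ext)


# Example: MarriedPersonC is a subclass of PersonC and references itself.
person_c = {(1, ("Ann",)), (2, ("Bob",))}
married_c = {(1, ("Ann", 2)), (2, ("Bob", 1))}
project = lambda v: v[:1]        # assumed subtype function onto the name
spouse_ids = lambda v: {v[1]}    # assumed stand-in for the occurrence relation

print(unique_identifiers(married_c),
      inclusion_integrity(married_c, person_c, project),
      referential_integrity(married_c, married_c, spouse_ids))
```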

1.3.3 User Defined Integrity Constraints

Let us now extend the notion of schema by the introduction of explicit user-defined integrity constraints. First we define the notion of constraint schema in general, then we restrict ourselves to distinguished classes of constraints that arise as generalizations of constraints known from the relational model, e.g. functional and key constraints, inclusion and exclusion constraints [48, 52].


Definition 1.4. Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a structural schema.

(i) An integrity constraint on $\mathcal{S}$ is a formula $I$ over the underlying type system with free variables $fr(I) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$, where each $x_{C_i}$ is a variable of type $\{U_{C_i}\}$. We call $x_{C_i}$ the class variable of $C_i$.

(ii) A constrained schema consists of a structural schema $\mathcal{S}$ and a finite set of integrity constraints on $\mathcal{S}$.

(iii) An instance of a constrained schema is an instance of the underlying structural schema. An instance $\mathcal{D}$ is said to be consistent with respect to the integrity constraint $I$ iff substituting $\mathcal{D}(C)$ for each class variable $x_C$ in $I$ evaluates to true, when interpreted in the usual way.

Note that the conditions for an instance in Definition 1.3 correspond to model inherent integrity constraints. We refer to these constraints as implicit identifier, IsA and referential constraints on the schema $\mathcal{S}$. Let us now define some distinguished classes of user-defined constraints.

Definition 1.5. Let $C, C_1, C_2$ be classes in a schema $\mathcal{S}$ and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$) and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.

(i) A functional constraint on $C$ is a constraint of the form

$$\forall i, i' :: ID.\ \forall v, v' :: T_C.\ c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \tag{1.8}$$

(ii) A uniqueness constraint on $C$ is a constraint of the form

$$\forall i, i' :: ID.\ \forall v, v' :: T_C.\ c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow i = i' . \tag{1.9}$$

A uniqueness constraint on $C$ is called trivial iff $T_C = T_1$ and $c^1 = id$ hold.

(iii) An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form

$$\forall t :: T.\ \big(\exists i_1 :: ID, v_1 :: T_{C_1}.\ (i_1, v_1) \in x_{C_1} \wedge c_1(v_1) = t\big) \Rightarrow \big(\exists i_2 :: ID, v_2 :: T_{C_2}.\ (i_2, v_2) \in x_{C_2} \wedge c_2(v_2) = t\big) . \tag{1.10}$$

(iv) An exclusion constraint on $C_1$, $C_2$ is a constraint of the form

$$\forall i_1, i_2 :: ID.\ \forall v_1 :: T_{C_1}.\ \forall v_2 :: T_{C_2}.\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \neq c_2(v_2) . \tag{1.11}$$
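Read over a finite instance, the four constraint classes become simple set comprehensions. The following sketch assumes, purely for illustration, that a class variable is bound to a Python set of `(ident, value)` pairs and that the subtype functions are ordinary Python functions; the function and variable names are hypothetical.

```python
from itertools import product


def functional(ext, c1, c2):
    """(1.8): objects that agree under c1 also agree under c2."""
    return all(c2(v) == c2(w)
               for (_, v), (_, w) in product(ext, repeat=2)
               if c1(v) == c1(w))


def uniqueness(ext, c1):
    """(1.9): objects that agree under c1 have the same identifier."""
    return all(i == j
               for (i, v), (j, w) in product(ext, repeat=2)
               if c1(v) == c1(w))


def inclusion(ext1, ext2, c1, c2):
    """(1.10): every c1-image from ext1 is also a c2-image from ext2."""
    return {c1(v) for _, v in ext1} <= {c2(v) for _, v in ext2}


def exclusion(ext1, ext2, c1, c2):
    """(1.11): c1-images from ext1 and c2-images from ext2 never coincide."""
    return {c1(v) for _, v in ext1}.isdisjoint({c2(v) for _, v in ext2})


employees = {(1, ("Ann", "R&D", "Berlin")),
             (2, ("Bob", "R&D", "Berlin")),
             (3, ("Cy", "Ops", "Bonn"))}
managers = {(7, ("Ann",))}
name = lambda v: v[0]
dept = lambda v: v[1]
city = lambda v: v[2]

print(functional(employees, dept, city),   # department determines city
      uniqueness(employees, name),         # names identify employees
      inclusion(managers, employees, name, name))
```

The pairwise checks are quadratic in the size of the extension, which is acceptable for a sketch but not meant as an efficient enforcement strategy.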

1.3.4 Methods as a Basis for Behaviour Modelling 

So far, only static aspects have been considered. A structural schema is simply a collection of data structures called classes. Let us now turn to adding dynamics to this picture. As required in the object oriented approach operations will be associated with classes. This gives us the notion of a method.

We shall distinguish between visible and hidden methods to separate those methods that can be invoked by the user from all others. This is not intended to define an interface of a class, since for the moment all methods of a class including the hidden ones can be accessed by other methods. There are two reasons for such a weak hiding concept:

- Visible methods serve as a means to specify (nested) transactions. In order to build sequences of database instances we only regard these transactions, assuming a linear invocation order on them.

- Hidden methods can be used to handle identifiers. Since these identifiers do not have any meaning for the user, they must not occur within the input or output of a transaction.

Definition 1.6. Let $\mathcal{S}$ be a structural schema. Let $T_1, \dots, T_n, T'_1, \dots, T'_m$ be types, $M \in N_M$ and $\iota_1, \dots, \iota_n, o_1, \dots, o_m \in V$.

(i) A method signature consists of a method name $M$, a set of input-parameter / input-type pairs $\iota_i :: T_i$ and a set of output-parameter / output-type pairs $o_j :: T'_j$. We write

$$o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n) .$$

(ii) Let $C$ be some structural class in $\mathcal{S}$. A method $M$ on $C$ consists of a method signature with name $M$ and a body that is recursively built from the following constructs:

(a) assignment $x := E$, where $x$ is either the class variable $x_C$ or a local variable within $S$, and $E$ is a term of the same type as $x$,

(b) skip, fail, loop,

(c) sequential composition $S_1 ; S_2$, choice $S_1 \,\Box\, S_2$, projection $x :: T \mid S$, guard $P \to S$, restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type $T$, and

(d) instantiation $x'_1, \dots, x'_i \leftarrow C' : S'(E'_1, \dots, E'_j)$, where $S'$ is a method on class $C'$ with input-parameters $\iota'_1, \dots, \iota'_j$ and output-parameters $o'_1, \dots, o'_i$, such that the variables $o'_f$, $x'_f$ have the same type and the term $E'_g$ has the same type as the variable $\iota'_g$.

(iii) A method $M$ on a class $C$ with signature $o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n)$ is called value-defined iff all $T_i$ ($i = 1, \dots, n$) and $T'_j$ ($j = 1, \dots, m$) are proper value types.

As already mentioned the OODM distinguishes between transactions, i.e. methods visible to 

the user, and hidden methods. We require each transaction to be value-dened. 

Subclasses inherit the methods of their superclasses, but overriding is allowed as long 

as the new method is a specialization of all its corresponding methods in its superclasses. 

Overriding becomes mandatory in the case of multiple inheritance with name conflicts. A

method that overrides a hidden method on some superclass must also be hidden. 

Definition 1.7. Let $\mathcal{S}$ be a structural schema and $C \in \mathcal{S}$ be a structural class as in Definition 1.3 with superclasses $D_1, \dots, D_k$. A method specification on $C$ consists of two sets of methods $S = \{M_1, \dots, M_n\}$ (called transactions) and $H = \{M'_1, \dots, M'_m\}$ (called hidden methods) such that the following properties hold:

(i) Each $M_i$ ($i = 1, \dots, n$) is value-defined.

(ii) For each transaction $M_l$ on some superclass $D_l$ there exists some $i \in \{1, \dots, n\}$ such that $M_i$ specializes $M_l$.

(iii) For each hidden method $M_l$ on some superclass $D_l$ there exists some $j \in \{1, \dots, m\}$ such that $M'_j$ specializes $M_l$.


Let us briefly discuss what specialization means for the input- and output-types. Sometimes it is required that the input-type for an overriding method should be a subtype of the original one (covariance rule), sometimes the opposite (contravariance rule) is required. The first rule applies e.g. if we want to override an insert method. In this case the inherited method has no effect on the subclass, but simply calls the "old" method. The second rule applies if input-types required on the superclass can be omitted on the subclass. Both rules are captured by the formal notion of specialization. We omit the details [44]. Now we are prepared to generalize the definition of classes and schemata.

Definition 1.8. (i) A class consists of a class name $C \in N_C$, a structure expression $S$, a set of class names $D_1, \dots, D_m \in N_C$ (called the set of superclasses) and a method specification $(S = \{M_1, \dots, M_n\}, H = \{M'_1, \dots, M'_{n'}\})$ on $C$.

(ii) A (behavioural) schema $\mathcal{S}$ is a finite collection of classes $\{C_1, \dots, C_n\}$ closed under references, superclasses and method call together with a collection of integrity constraints $I_1, \dots, I_n$ on $\mathcal{S}$.

(iii) An instance $\mathcal{D}$ of a behavioural schema $\mathcal{S}$ is an instance of the underlying structural schema. A database history on $\mathcal{S}$ is a sequence $\mathcal{D}_0, \mathcal{D}_1, \dots$ of instances such that each transition from $\mathcal{D}_{i-1}$ to $\mathcal{D}_i$ is due to some transaction on some class $C \in \mathcal{S}$.

Note the relation between database histories used here and the work on the semantics of 

object bases in [22, 28]. 

1.3.5 Queries and Views 

Roughly speaking the querying of a database is an operation on the database without changing 

its state. The emphasis of a query is on the output. While such a general view of queries can be 

subsumed by transactions, hence by methods in the OODM, query languages are in particular 

intended to be declarative in order to support an ad-hoc querying of a database without the 

need to write new transactions [8]. 

Querying a relational database can be expressed by terms in relational algebra. This view can be easily generalized to the OODM using its type system. Therefore, terms over such types occur naturally. Moreover, type specifications are based on other type specifications via constructors, selectors and functions. Hence, $\mathcal{T}$ allows arbitrary terms involving more than one class variable $x_C$ to be built. A query is then represented by a term $t$ over some type $T$ such that the free variables of $t$ are all class variables. This approach is in accordance with the algebraic approach in [12] and with so called universal traversal combinators [25].

In relational algebra a view may be regarded simply as a stored query (or derived relation). We shall try to generalize this view to the OODM as well.

However, things change dramatically when object identifiers come into play [13], since now we have to distinguish between queries that result in values and those that result in (collections of) objects. Therefore we distinguish in the OODM between value queries and general access expressions.

A value query on a schema $\mathcal{S}$ can then be represented by a term $t$ of some value type $T$ with $fr(t) \subseteq \{x_C \mid C \in \mathcal{S}\}$. Ad-hoc querying of a database should then be restricted to value queries. This is no loss of generality, because for any type $T$ in $\mathcal{T}$ involving identifiers there exists a corresponding type $T'$ allowing multiple occurrences. Take e.g. a class $C$. If we want to get all the objects in that class no matter whether they have the same values or not, the corresponding term would be $x_C$. This is not a value query, but if $T_C$ is a value type, we may take $T' = \langle T_C \rangle$ and the natural projection given by the subtype functions

$$\{(ident : ID, value : \alpha)\} \to \langle (ident : ID, value : \alpha) \rangle \to \langle \alpha \rangle .$$
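The projection from a set of identified objects to a bag of values can be pictured as follows. This is a minimal sketch, assuming a class extension is a Python set of `(ident, value)` pairs and using `collections.Counter` as a stand-in for the bag type; the name `value_bag` is hypothetical.

```python
from collections import Counter


def value_bag(ext):
    """Forget identifiers but keep multiplicities: {(ident, value)} -> <value>."""
    return Counter(value for _, value in ext)


# Two distinct objects carry the same value ("Ann", 30).
class_c = {(1, ("Ann", 30)), (2, ("Ann", 30)), (3, ("Bob", 41))}
bag = value_bag(class_c)
print(bag[("Ann", 30)], sum(bag.values()))
```

A plain set of values would collapse the two objects with value `("Ann", 30)` into one, which is why the bag type is needed to keep the query value-based without losing objects.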

In the case of arbitrary access expressions another problem occurs [13]. So far, we can only build terms $t$ that involve identifiers already existing in the database. Thus, such queries are called object preserving. If we want the result of a query to represent "new" objects, i.e. if we want to have object generating queries, we have to apply a mechanism to create new object identifiers. This can be achieved by object creating functions on the type ID with arity $ID \times \dots \times ID \to ID$ [32, 35].

The idea that a view is a stored query then carries over easily. However, the structure of a view should be compatible with the structure of the schema, i.e. each view may be regarded as a derived class. Summarizing, we get the following formal definition.

Definition 1.9. Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be some schema.

(i) A value query on $\mathcal{S}$ is a term $t$ over some proper value type $T$ with $fr(t) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$.

(ii) An access expression on $\mathcal{S}$ is a term $t$ over some proper type $T$ with $fr(t) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$.

(iii) A view on $\mathcal{S}$ consists of a view name $v \in N_C$ such that there is no class $C \in \mathcal{S}$ with this name, a structure expression $S(v)$ containing references to classes in $\mathcal{S}$ or to views on $\mathcal{S}$ and a defining access expression $t(v)$ of type $\{U_v\}$, where $T_v$ is the representation type corresponding to $S(v)$.

(iv) A (complete) schema is a behavioural schema together with a finite set of views. An instance of a complete schema is an instance $\mathcal{D}$ of the underlying structural schema such that for every view $v$ replacing each class variable $x_C$ in the access expression of $v$ by $\mathcal{D}(C)$ yields a value of type $\{U_v\}$ satisfying the uniqueness property for identifiers.

1.4 The Object Identification Problem

From an object oriented point of view a database may be considered as a huge collection of objects of arbitrarily complex structure. Hence the problem arises of uniquely identifying and retrieving objects in such collections.

Each object in a database is an abstraction of a real world object that has a unique identity. The representation of such objects in the OODM uses an abstract identifier $I$ of type ID to encode this identity. Such an identifier may be considered as being immutable. However, from a systems oriented view permutations or collapses of identifiers without changing anything else should not affect the behaviour of the database.

For the user the abstract identifier of an object has no meaning. Therefore, a different approach to the identification problem is required. We show that the unique identification of an object in a class leads to the notion of (weak) value-identifiability, where weak value-representability can be used to capture also objects that do not exist on their own, but depend on other objects. This is related to weak entities in entity-relationship models [62]. The stronger notion of value-representability is required for the unique definition of generic update operations.


1.4.1 The Notion of Value-Representability 

According to our definitions two objects in a class $C$ are identical iff they have the same identifier. By the use of constraints, especially uniqueness constraints, we could restrict this notion of equality.

Let us address the characterization of those classes whose objects are completely representable by values, i.e. we could drop the object identifiers and replace references by values of the referred object. We shall see in Section 1.5 that in case of value-representable classes we are able to preserve an important advantage of relational databases, i.e. the existence of structurally determined update operations.

Definition 1.10. Let $C$ be a class in a schema $\mathcal{S}$ with representation type $T_C$.

(i) $C$ is called value-identifiable iff there exists a proper value type $I_C$ such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a function $c : T_C \to I_C$ such that the uniqueness constraint on $C$ defined by $c$ holds for $\mathcal{D}$.

(ii) $C$ is called value-representable iff there exists a proper value type $V_C$ such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a function $c : T_C \to V_C$ such that for $\mathcal{D}$

(a) the uniqueness constraint on $C$ defined by $c$ holds and

(b) for each uniqueness constraint on $C$ defined by some function $c' : T_C \to V'_C$ with proper value type $V'_C$ there exists a function $c'' : V_C \to V'_C$ that is unique on $c(codom(\mathcal{D}(C)))$ with $c' = c'' \circ c$.

It is easy to see that each value-representable class $C$ is also value-identifiable. Moreover, the value-representation type $V_C$ in Definition 1.10 is unique up to isomorphism.

1.4.2 Value-Representability in the Case of Acyclic Reference Graphs 

Since value-representability is defined by the existence of a certain proper value type, it is hard to decide whether an arbitrary class is value-representable or not. In case of simple classes the problem is easier, since we only have to deal with uniqueness and value constraints. In this case it is helpful to analyse the reference structure of the class. Hence the following graph-theoretic definitions.

Definition 1.11. The reference graph of a class $C$ in a schema $\mathcal{S}$ is the smallest labelled graph $G_{ref} = (V, E, l)$ satisfying:

(i) There exists a vertex $v_C \in V$ with $l(v_C) = \{t, C\}$, where $t$ is the top-level type in the structure expression $S$ of $C$.

(ii) For each proper occurrence of a type $t \neq ID$ in $T_C$ there exists a unique vertex $v_t \in V$ with $l(v_t) = \{t\}$.

(iii) For each reference $r_i : C_i$ in the structure expression $S$ of $C$ the reference graph $G^i_{ref}$ is a subgraph of $G_{ref}$.

(iv) For each vertex $v_t$ or $v_C$ corresponding to $t(x_1, \dots, x_n)$ in $S$ there exist unique edges $e_t^{(i)}$ from $v_t$ or $v_C$ respectively to $v_{t_i}$ in case $x_i$ is the type $t_i$, or to $v_{C_i}$ in case $x_i$ is the reference $r_i : C_i$. In the first case $l(e_t^{(i)}) = \{S_i\}$, where $S_i$ is the corresponding selector name; in the latter case the label is $\{S_i, r_i\}$.


Definition 1.12. (i) Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a schema. Let $\mathcal{S}' = \{C'_1, \dots, C'_n\}$ be another schema such that for all $i$ there exists a uniqueness constraint on $C_i$ defined by some $c_i : T_{C_i} \to T_{C'_i}$. Then an identification graph $G_{id}$ of the class $C_i$ is obtained from the reference graph of $C'_i$ by changing each label $C'_j$ to $C_j$.

(ii) The identification graph $G_{id}$ resulting from the use of trivial uniqueness constraints is called the standard identification graph.

Clearly, there need not exist any identification graph, nor does the existence of one identification graph imply the existence of the standard one. However, if the standard identification graph exists, then it is equal to the reference graph.

Proposition 1.13. Let $C$ be a class in a schema $\mathcal{S}$ with acyclic reference graph $G_{ref}$ such that there exist uniqueness constraints for $C$ and each $C_i$ such that $C_i$ occurs as a label in $G_{ref}$. Then $C$ is value-representable.

Proof. We use induction on the maximum length of a path in $G_{ref}$. If there are no references in the structure expression $S$ of $C$, the type $T_C$ is a proper value type. Since there exists a uniqueness constraint on $C$, the identity function $id$ on $T_C$ also defines a uniqueness constraint. Hence $V_C = T_C$ satisfies the requirements of Definition 1.10.

If there are references $r_i : C_i$ in the structure expression $S$ of $C$, then the induction hypothesis holds for each such $C_i$, because $G_{ref}$ is acyclic. Let $V_C$ result from $S$ by replacing each $r_i : C_i$ by $V_{C_i}$. Then $V_C$ satisfies the requirements of Definition 1.10. □
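The inductive construction in this proof can be sketched as a recursive expansion of references. The encoding below is an assumption made for illustration: structure expressions are nested tuples, a reference $r_i : C_i$ is written `("ref", "ClassName")`, and the function name is hypothetical. The sketch only builds the type; it takes the required uniqueness constraints for granted and rejects cyclic reference graphs.

```python
def value_representation_type(schema, cls, _path=()):
    """Replace every reference r_i : C_i in the structure of cls by V_{C_i}."""
    if cls in _path:
        raise ValueError("reference graph is cyclic: "
                         + " -> ".join(_path + (cls,)))

    def expand(expr):
        if isinstance(expr, str):           # base type, nothing to replace
            return expr
        kind, arg = expr
        if kind == "ref":                   # induction step: use V of target
            return value_representation_type(schema, arg, _path + (cls,))
        if kind == "record":
            return ("record", {name: expand(t) for name, t in arg.items()})
        return (kind, expand(arg))          # set, list or bag constructor

    return expand(schema[cls])


schema = {
    "City": ("record", {"name": "STRING"}),
    "Person": ("record", {"name": "STRING", "home": ("ref", "City")}),
    "Club": ("record", {"title": "STRING",
                        "members": ("set", ("ref", "Person"))}),
}
cyclic_schema = {
    "MarriedPerson": ("record", {"name": "STRING",
                                 "spouse": ("ref", "MarriedPerson")}),
}
print(value_representation_type(schema, "Person"))
```

For `cyclic_schema` the function raises `ValueError`, which corresponds to the case that the induction argument cannot handle.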

Corollary 1.14. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist an acyclic identification graph $G_{id}$ and uniqueness constraints for $C$ and each $C_i$ occurring as a label in $G_{id}$. Then $C$ is value-identifiable.

1.4.3 Computation of Value Representation Types 

We want to address the more general case where cyclic references may occur in the schema $\mathcal{S} = \{C_1, \dots, C_n\}$. In this case a simple induction argument as in the proof of Proposition 1.13 is not applicable. So we take another approach. We define algorithms to compute types $V_C$ and $I_C$ that turn out to be proper value types under certain conditions. In the next subsection we then show that these types are the value representation type and the value identification type required by Definition 1.10.

Algorithm 1.15. Let $F(C_i) = T_i$ provided there exists a uniqueness constraint on $C_i$ defined by $c_i : T_{C_i} \to T_i$, otherwise let $F(C_i)$ be undefined. If ID occurs in some $F(C_i)$ corresponding to $r_j : C_j$ ($j \neq i$), we write $ID_j$.

Then iterate as long as possible using the following rules:

(i) If $F(C_j)$ is a proper value type and $ID_j$ occurs in some $F(C_i)$ ($j \neq i$), then replace this corresponding $ID_j$ in $F(C_i)$ by $F(C_j)$.

(ii) If $ID_i$ occurs in some $F(C_i)$, then let $F(C_i)$ be recursively defined by $F(C_i) == S_i$, where $S_i$ is the result of replacing $ID_i$ in $F(C_i)$ by the type name $F(C_i)$.

This iteration terminates, since there exists only a finite collection of classes. If these rules are no longer applicable, replace each remaining occurrence of $ID_j$ in $F(C_i)$ by the type name $F(C_j)$ provided $F(C_j)$ is defined. □


Note that the algorithm computes (mutually) recursive types. Now we give a sufficient condition for the result of Algorithm 1.15 to be a proper value type.

Lemma 1.16. Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a uniqueness constraint for all classes $C_i$ occurring as a label in some identification graph $G_{id}$ of $C$. Let $I_C$ be the type $F(C)$ computed by Algorithm 1.15 with respect to the uniqueness constraints used in the definition of $G_{id}$. Then $I_C$ is a proper value type.

Proof. Suppose $I_C$ were not a proper value type. Then there exists at least one occurrence of ID in $I_C$. This corresponds to a class $C_i$ without uniqueness constraint occurring as a label in $G_{id}$, which contradicts the assumption of the lemma. □

1.4.4 The Finiteness Property 

Let us now address the general case. The basic idea is that there is always only a finite number of objects in a database. Assuming the database to be consistent with respect to inclusion and referential constraints implies that there cannot exist infinite cyclic references. This will be expressed by the finiteness property. We show that this property allows the computation of value representation types.

Definition 1.17. Let $C$ be a class in a schema $\mathcal{S}$ and let $g_{kl}$ denote a path in $G_{ref}$ from $v_{C_k}$ to $v_{C_l}$ provided there is a reference $r_l : C_l$ in the structure expression of $C_k$. Then a cycle in $G_{ref}$ is a sequence $g_{01} \cdots g_{n-1,n}$ with $C_0 = C_n$ and $C_k \neq C_l$ otherwise.

Note that we use paths instead of edges, because the edges in $G_{ref}$ do not always correspond to references. According to our definition of a class there exists a referential constraint on $C_k$, $C_l$ defined by $o_{kl} : T_{C_k} \times ID \to BOOL$ corresponding to $g_{kl}$. Therefore, to each cycle there exists a corresponding sequence of functions $o_{01} \cdots o_{n-1,n}$. This can be used as follows to define a function $cyc : ID \times ID \to BOOL$ corresponding to a cycle in $G_{ref}$.

Definition 1.18. Let $C$ be a class in a schema $\mathcal{S}$ and let $g_{01} \cdots g_{n-1,n}$ be a cycle in $G_{ref}$. The corresponding cycle relation $cyc : ID \times ID \to BOOL$ is defined by $cyc(i, j) = true$ iff there exists a sequence $i = i_0, i_1, \dots, i_n = j$ ($n \neq 0$) such that $(i_l, v_l) \in C_l$ and $o_{l,l+1}(i_{l+1}, v_l) = true$ for all $l = 0, \dots, n-1$.

Given a cycle relation $cyc$, let $cyc^m$ be the $m$-th power of $cyc$.

Lemma 1.19. Let $C$ be a class in a schema $\mathcal{S}$. Then $C$ satisfies the finiteness property, i.e. for each instance $\mathcal{D}$ of $\mathcal{S}$ and for each cycle in $G_{ref}$ the corresponding cycle relation $cyc$ satisfies

8i 2 dom(C): 9n: 8j 2 dom(C): 9m

Lemma 1.20. Let $\mathcal{D}$ be an instance of schema $\mathcal{S} = \{C_1, \dots, C_n\}$. Then $\mathcal{D}$ satisfies at each stage of Algorithm 1.15 uniqueness constraints for all $i = 1, \dots, n$ defined by some $c'_i : T_{C_i} \to F(C_i)$.

Proof. It is sufficient to show that whenever a rule is applied replacing $F(C_i)$ by $F(C_i)'$, then $F(C_i)'$ also defines a uniqueness constraint on $C_i$.

Suppose that $(i, v) \in C_i$ holds in $\mathcal{D}$. Since it is possible to apply a rule to $F(C_i)$, there exists at least one value $j :: ID$ occurring in $c_i(v)$. Replacing $ID_j$ in $F(C_i)$ corresponds to replacing $j$ by some value $v_j :: F(C_j)$. Because of the finiteness property such a value must exist. Moreover, due to the uniqueness constraint defined by $c_j$ the function $f : F(C_i) \to F(C_i)'$ representing this replacement must be injective on $c_i(codom(\mathcal{D}(C_i)))$. Hence, $c'_i = f \circ c_i$ defines a uniqueness constraint on $C_i$. □

Now assume that we use only trivial uniqueness constraints in Algorithm 1.15. In order to 

distinguish this situation from the general case we write G(C i ) instead of F (C i ) to refer to 

this special case. 

Lemma 1.21. Let $\mathcal{D}$ be an instance of schema $\mathcal{S} = \{C_1, \dots, C_n\}$. Then at each stage of Algorithm 1.15 (applied with arbitrary uniqueness constraints and in parallel with trivial ones) there exists for all $i = 1, \dots, n$ a function $\bar{c}_i : G(C_i) \to F(C_i)$ that is unique on $c_i(codom(\mathcal{D}(C_i)))$ with $c'_i = \bar{c}_i \circ c_i$.

Proof. As in the proof of Lemma 1.20 it is sufficient to show that the required property is preserved by the application of a rule from either of the two versions of Algorithm 1.15. Therefore, let $\bar{c}_i$ satisfy the required property and let $g : G(C_i) \to G(C_i)'$ and $f : F(C_i) \to F(C_i)'$ be functions corresponding to the application of a rule to $G(C_i)$ and $F(C_i)$ respectively. Such functions were constructed in the proof of Lemma 1.20.

Then $f \circ \bar{c}_i$ satisfies the required property with respect to the application of $f$. In the case of applying $g$ we know that $g$ is injective on $c_i(codom(\mathcal{D}(C_i)))$. Let $h : G(C_i)' \to G(C_i)$ be any continuation of $g^{-1} : g(c_i(codom(\mathcal{D}(C_i)))) \to G(C_i)$. Then $\bar{c}_i \circ h$ satisfies the required property. □

Theorem 1.22. Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a uniqueness constraint for all classes $C_i$ occurring as a label in the reference graph $G_{ref}$ of $C$. Let $V_C$ be the type $G(C)$ computed by Algorithm 1.15 with respect to trivial uniqueness constraints and let $I_C$ be the type $F(C)$ computed by Algorithm 1.15 with respect to arbitrary uniqueness constraints. Then $C$ is value-representable with value representation type $V_C$ and each such $I_C$ is a value identification type.

Proof. $V_C$ is a proper value type by Lemma 1.16. From Lemma 1.20 it follows that if $\mathcal{D}$ is an instance of $\mathcal{S}$, then there exists a function $c : T_C \to V_C$ such that the uniqueness constraint defined by $c$ holds for $\mathcal{D}$. The same applies to $I_C$.

If $V'_C$ is another proper value type and $\mathcal{D}$ satisfies a uniqueness constraint defined by $c' : T_C \to V'_C$, then $V'_C$ is some value-identification type $I_C$. Hence by Lemma 1.21 there exists a function $c'' : V_C \to V'_C$ that is unique on $c(codom(\mathcal{D}(C)))$ with $c' = c'' \circ c$. This proves the theorem. □

Corollary 1.23. Let $\mathcal{S}$ be a schema such that all classes $C$ in $\mathcal{S}$ are value-identifiable. Then all classes $C$ in $\mathcal{S}$ are also value-representable. □


1.4.5 Weak Value-Representability 

Let us now ask whether there also exist identification mechanisms weaker than value-representability. In several papers, e.g. [42], a navigational approach on the basis of the reference structure has been favoured. This leads to dependent classes similar to "weak entities" in the entity-relationship model [62]. We shall show that such an approach requires at least a value-identifiable "entrance" of some path and the hard restriction on references to be representable by surjective functions.

Definition 1.24. Let $S$ be some schema.

(i) If $r$ is a reference from class $C$ to $D$ in $S$ and $o : T_C \times ID \to BOOL$ is the function of Definition 4 expressing the corresponding referential constraint, then $r$ satisfies the (SF)-condition iff

(a) $o(v, i) \wedge o(v, j) \Rightarrow i = j$ and

(b) $j \in \mathrm{dom}(x_D) \Rightarrow \exists v :: T_C \,.\, v \in \mathrm{codom}(x_C) \wedge o(v, j)$

hold for all $i, j :: ID$, $v :: T_C$.

(ii) An (SF)-chain from class $D$ to $C$ in $S$ is a sequence of classes $D = C_0, \dots, C_n = C$ such that for all $i$ ($i = 1, \dots, n$) either $C_i$ is a subclass of $C_{i-1}$ or there exists a reference $r_i$ from $C_{i-1}$ to $C_i$ satisfying the (SF)-condition.

(iii) A class $C$ in $S$ is called weakly value-identifiable iff there exists a value-identifiable class $D$ and an (SF)-chain from $D$ to $C$.

The notation (SF)-condition has been chosen to emphasize that such a reference represents a surjective function. It is easy to see, taking $n = 0$, that each value-identifiable class is also weakly value-identifiable.
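The two parts of the (SF)-condition can be checked mechanically on a finite instance. The following is a minimal sketch, assuming the referential constraint $o$ is given explicitly as a set of pairs $(v, j)$; the names `satisfies_sf`, `ref_pairs`, `source_values` and `target_ids` are hypothetical and not part of the OODM.

```python
def satisfies_sf(ref_pairs, source_values, target_ids):
    """Check the (SF)-condition for a finite reference relation (illustrative only).

    ref_pairs     -- set of (v, j): value v of the source class references id j
    source_values -- values present in the source class, i.e. codom(x_C)
    target_ids    -- identifiers present in the target class, i.e. dom(x_D)
    """
    # (a) functional: no value references two different identifiers
    seen = {}
    for v, j in ref_pairs:
        if seen.setdefault(v, j) != j:
            return False
    # (b) surjective: every target identifier is referenced by a present value
    referenced = {j for v, j in ref_pairs if v in source_values}
    return set(target_ids) <= referenced


if __name__ == "__main__":
    pairs = {("alice", 1), ("bob", 2)}
    print(satisfies_sf(pairs, {"alice", "bob"}, {1, 2}))
    print(satisfies_sf(pairs, {"alice", "bob"}, {1, 2, 3}))
```

The second call adds an unreferenced identifier to the target class, so part (b) fails for it.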

Lemma 1.25. If $C$ is a weakly value-identifiable class in a schema $S$, then there exists a proper value type $I_C$ such that for each instance $\mathcal{D}$ of $S$ there exists a function $c : ID \to I_C$ such that $c$ is injective on $\mathrm{dom}(\mathcal{D}(C))$.

Call $I_C$ a weak value-identification type of the class $C$.

Proof. Let $D = C_0, \dots, C_n = C$ be an (SF)-chain from the value-identifiable class $D$ to $C$ with corresponding references $r_i$ ($i = 1, \dots, n$). If $r_i$ satisfies the (SF)-condition, there exists a function $c_i : ID \to ID$ such that $j \in \mathrm{dom}(\mathcal{D}(C_i)) \Rightarrow (c_i(j), v) \in x_{C_{i-1}}$ for some $v$ with $o_i(v, j)$ (just take some inverse image of $j$ under the surjective reference function). Since $r_i$ defines a function, $c_i$ is clearly injective. If $C_i$ is a subclass of $C_{i-1}$, then take $c_i = id$.

If $c' : ID \to I_D$ is the function defined by the uniqueness constraint on $D$ and $c'' : ID \to ID$ is the concatenation $c_1 \circ \dots \circ c_n$, then $c = c' \circ c''$ satisfies the required property. □

Definition 1.26. A class $C$ in a schema $S$ is called weakly value-representable iff there exists a proper value type $V_C$ such that for each instance $\mathcal{D}$ of $S$ the following properties hold.

(i) There is a function $c : ID \to V_C$ that is injective on $\mathrm{dom}(\mathcal{D}(C))$.

(ii) For each proper value type $V_C'$ and each function $c' : ID \to V_C'$ that is injective on $\mathrm{dom}(\mathcal{D}(C))$ there exists a function $c'' : V_C \to V_C'$ that is unique on $c(\mathrm{dom}(\mathcal{D}(C)))$ with $c' = c'' \circ c$.


We call $V_C$ the weak value-representation type of the class $C$.

Note that the weak value-representation type is unique provided it exists. Again it is easy to see that value-representability implies weak value-representability. Moreover, due to Lemma 1.25 each weakly value-representable class is also weakly value-identifiable. We shall see that the converse of this fact is also true.

We want to compute weak value-representation types. This can be done using a slight modification of Algorithm 1.15 that completely ignores uniqueness constraints. We refer to this algorithm as the blind version of Algorithm 1.15 and, to emphasize this, we write $H(C_i)$ instead of $F(C_i)$. Analogous to Lemmata 1.16 and 1.20 the following results hold.

Lemma 1.27. Let $C$ be a class in a schema $S$ and let $I_C$ be the type $H(C)$ computed by the blind version of Algorithm 1.15. Then $I_C$ is a proper value type.

Lemma 1.28. Let $\mathcal{D}$ be an instance of the schema $S = \{C_1, \dots, C_n\}$. Let $C$, $D$ be classes such that $C$ is weakly value-identifiable, $D$ is value-identifiable and there exists some (SF)-chain from $D$ to $C$. Let $c : ID \to I_C$ be the function of Lemma 1.25 corresponding to this chain. Let $c' : ID \to H(D)$ be a function corresponding to the uniqueness constraint on $D$ and the instance $\mathcal{D}$. Then at each stage of the blind version of Algorithm 1.15 there exists a function $\hat{c} : H(D) \to I_C$ that is unique on $c'(\mathrm{dom}(\mathcal{D}(C)))$ with $c = \hat{c} \circ c'$.

Based on these two lemmata we can now state the main result on weak value representability. 

Theorem 1.29. Let $C$ be a weakly value-identifiable class in a schema $S$ and let $V_C$ be the product of all types $H(D)$, where $D$ is the leading value-identifiable class in some maximal (SF)-chain corresponding to $C$ and $H(D)$ is the result of the blind version of Algorithm 1.15. Then $C$ is weakly value-representable with weak value-representation type $V_C$.

Proof. $V_C$ is a proper value type by Lemma 1.27. From Lemmata 1.20 and 1.25 it follows that there exists a function $c' : ID \to V_C$ that is injective on $\mathrm{dom}(\mathcal{D}(C))$.

From Lemma 1.28 it follows that there exists a function $\hat{c} : V_C \to I_C$ that is unique on $c'(\mathrm{dom}(\mathcal{D}(C)))$ with $c = \hat{c} \circ c'$. This proves the Theorem. □

1.5 The Genericity Problem 

The preservation of advantages of relational databases requires generic operations for querying and for the insertion, deletion and update of single objects. While querying [1, 12, 30, 55] is per se a set-oriented operation, i.e. it is not necessary to select just one single object, and hence does not raise any specific problems with object identifiers, things change completely in the case of updates. If an object with a given value is to be updated (or deleted), this is only defined unambiguously if there does not exist another object with the same value. If more than one object exists with the same value, or more generally with the same value and the same references to other objects, then the user has to decide whether an update- or delete-operation is applied to all these objects, to only one of these objects selected non-deterministically, or to none of them, i.e. to reject the operation. However, it is not possible to specify a priori such an operation that works in the same way for all objects in all situations. The same applies to insert-operations. Hence the problem arises in which cases operations for the insertion, deletion and update of objects can be defined generically.
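The ambiguity can be made concrete with a small sketch. It assumes that one class of an instance is modelled as a dictionary from hidden identifiers to values, and it resolves the ambiguous case by rejecting the operation, which is only one of the three options named above. The function name `delete_by_value` is hypothetical.

```python
def delete_by_value(extent, value):
    """Delete the single object carrying `value`; refuse if the choice is not unique.

    extent -- dict mapping hidden identifiers to values (one class of an instance)
    """
    matches = [ident for ident, stored in extent.items() if stored == value]
    if len(matches) != 1:
        # zero or several candidates: a value-based delete is not well defined here
        raise ValueError(f"expected exactly one object with this value, found {len(matches)}")
    remaining = dict(extent)          # leave the caller's instance untouched
    del remaining[matches[0]]
    return remaining


if __name__ == "__main__":
    print(delete_by_value({1: "x", 2: "y"}, "x"))
    try:
        delete_by_value({1: "x", 2: "x"}, "x")
    except ValueError as err:
        print("rejected:", err)
```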


Some authors [43] have chosen the solution to abandon generic operations. Others [6, 7, 9] use identifying values to represent object identity, thus embodying a strict concept of surrogate keys to avoid the problem. Our approach is different from both solutions in that we use the concept of hidden abstract identifiers, but at the same time formally characterize those classes for which unique generic methods for the insertion, deletion and update of single objects exist. At the same time inclusion and referential integrity have to be enforced. We show that these classes are the value-representable ones.

1.5.1 Generic Update Methods 

The requirement that object identifiers have to be hidden from the user imposes the restriction on canonical update operations to be value-defined in the sense that the identifier of a new object has to be chosen by the system whereas all input- and output-data have to be values of proper value types.

We now formally define what we mean by generic update methods. For this purpose regard an instance $\mathcal{D}$ of a schema $S$ as a set of objects. For each recursively defined type $T$ let $\bar{T}$ denote the type obtained by replacing each occurrence of a recursive type $T'$ in $T$ by $UNION(T', ID)$.

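Definition 1.30. Let $C$ be a class in a schema $S$. Generic update methods on $C$ are $\mathit{insert}_C$, $\mathit{delete}_C$ and $\mathit{update}_C$ satisfying the following properties:

(i) Their input types are proper value types; their output type is the trivial type $\bot$.

(ii) In the case of insert applied to an instance $\mathcal{D}$ there exists some $o :: U_C$ such that

(a) the result is an instance $\mathcal{D}'$ with $o \in \mathcal{D}'$ and $\mathcal{D} \subseteq \mathcal{D}'$ hold and

(b) if $\bar{\mathcal{D}}$ is any instance with $\mathcal{D} \subseteq \bar{\mathcal{D}}$ and $o \in \bar{\mathcal{D}}$, then $\mathcal{D}' \subseteq \bar{\mathcal{D}}$.

(iii) In the case of delete applied to an instance $\mathcal{D}$ there exists some $o :: U_C$ such that

(a) the result is an instance $\mathcal{D}'$ with $o \notin \mathcal{D}'$ and $\mathcal{D}' \subseteq \mathcal{D}$ hold and

(b) if $\bar{\mathcal{D}}$ is any instance with $\bar{\mathcal{D}} \subseteq \mathcal{D}$ and $o \notin \bar{\mathcal{D}}$, then $\bar{\mathcal{D}} \subseteq \mathcal{D}'$.

(iv) In the case of update applied to an instance $\mathcal{D} = \mathcal{D}_1 \cup \mathcal{D}_2$, where $\mathcal{D}_2 = \{o\}$ if $o \neq o'$ and $\mathcal{D}_2 = \emptyset$ otherwise, there exist $o, o' :: U_C$ with $o = (i, v)$ and $o' = (i, v')$ such that

(a) the result is an instance $\mathcal{D}' = \mathcal{D}_1 \cup \mathcal{D}_2'$ with $\mathcal{D}_2 \cap \mathcal{D}_2' = \emptyset$,

(b) $o \in \mathcal{D}$, $o' \in \mathcal{D}'$,

(c) if $\bar{\mathcal{D}}$ is any instance with $\mathcal{D}_1 \subseteq \bar{\mathcal{D}}$ and $o' \in \bar{\mathcal{D}}$, then $\mathcal{D}' \subseteq \bar{\mathcal{D}}$.

Canonical update methods on $C$ are $\mathit{insert}'_C$, $\mathit{delete}'_C$ and $\mathit{update}'_C$, defined analogously with the only difference of their output type being $ID$ and their input type being $\bar{T}$ for some value type $T$.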

Note that this definition of genericity includes the consistency with respect to the implicit constraints on $S$. We show that value-representability is necessary and sufficient for the existence and uniqueness of such operations.

Lemma 1.31. Let $C$ be a class in a schema $S$ such that there exist canonical update methods on $C$. Then also generic update methods exist on $C$.

Proof. In the case of insert define $\mathit{insert}_C(V :: V_C) == I \leftarrow \mathit{insert}'_C(V)$, i.e. call the corresponding canonical operation and ignore its output. The same argument applies to delete and update. □
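The construction in this proof amounts to a thin wrapper. The sketch below assumes a single class without references, stored as a dictionary from identifiers to values; `canonical_insert`, `generic_insert` and the counter `_fresh` are hypothetical stand-ins, not the method language of the OODM.

```python
import itertools

_fresh = itertools.count(1)   # assumed source of system-chosen identifiers


def canonical_insert(extent, value):
    """Canonical insert: returns the identifier of the object carrying `value`."""
    for ident, stored in extent.items():
        if stored == value:
            return ident              # object already present, nothing changes
    ident = next(_fresh)              # identifier is chosen by the system
    extent[ident] = value
    return ident


def generic_insert(extent, value):
    """Generic insert: same state change, but the identifier stays hidden."""
    canonical_insert(extent, value)   # call the canonical operation, ignore its output
    return None


if __name__ == "__main__":
    persons = {}
    generic_insert(persons, "Ann")
    generic_insert(persons, "Ann")    # second call finds the existing object
    print(len(persons))
```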


Theorem 1.32. Let $C$ be a class in a schema $S$ such that there exist generic update methods on $C$. Then $C$ is value-representable. Moreover, all super- and subclasses of $C$ are also value-representable.

Proof. First consider the delete method with input type $I_C$, which is by definition a proper value type. We show that it is already a value identification type.

If not, then for all instances $\mathcal{D}$ and all functions $c : T_C \to I_C$ there exist $i, j :: ID$ and $v, w :: T_C$ with

$$i \neq j \;\wedge\; (i, v) \in \mathcal{D}(C) \;\wedge\; (j, w) \in \mathcal{D}(C) \;\wedge\; c(v) = c(w) . \tag{1.12}$$

Now take $o = (i, v)$ and $o' = (j, w)$. Then there exist two distinct instances $\mathcal{D}'$ and $\mathcal{D}''$ satisfying the conditions of Definition 1.30(iii) with respect to $o$ and $o'$ respectively, which contradicts the assumption of a unique generic delete-method on $C$.

The same argument applies to the input type $V_C$. Moreover, since insertion requires all values of referenced objects to be provided, we derive from Algorithm 1.15 and Theorem 1.22 that $V_C$ is a value representation type. Therefore, $C$ is value-representable.

The value-representability of superclasses is implied, since insert (and update) on $C$ involve the corresponding method on each superclass. The value-representability of subclasses follows from the propagation of update through them. We omit the technical details. □

1.5.2 Generic Updates in the Case of Value-Representability 

Our next goal is to reduce the existence problem of canonical update operations to schemata 

without IsA relations. 

Lemma 1.33. Let $C$, $D$ be value-representable classes in a schema $S$ such that $C$ is a subclass of $D$ with subtype function $g : T_C \to T_D$. Then there exists a function $h : V_C \to V_D$ such that for each instance $\mathcal{D}$ of $S$ with corresponding functions $c : T_C \to V_C$ and $d : T_D \to V_D$ we have $h(c(v)) = d(g(v))$ for all $v \in \mathrm{codom}(\mathcal{D}(C))$.

Proof. By Definition 1.10 $c$ is injective on $\mathrm{codom}(\mathcal{D}(C))$, hence any continuation $h$ of $d \circ g \circ c^{-1}$ satisfies the required property.

It remains to show that $h$ does not depend on $\mathcal{D}$. Suppose $\mathcal{D}_1$, $\mathcal{D}_2$ are two instances such that $w = c_1(v_1) = c_2(v_2) \in V_C$, where $c_1, d_1, h_1$ correspond to $\mathcal{D}_1$ and $c_2, d_2, h_2$ correspond to $\mathcal{D}_2$. Then there exists a permutation $\pi$ on $ID$ such that $v_2 = \pi(v_1)$. We may extend $\pi$ to a permutation on any type. Since $ID$ has no non-trivial supertype, $g$ permutes with $\pi$, hence $g(v_2) = \pi(g(v_1))$. From Definition 1.10 it follows $d_2(g(v_2)) = d_1(g(v_1))$, i.e. $h_2(w) = h_1(w)$. □

In the following let $S'$ be a schema derived from a schema $S$ by omitting all IsA relations.

Lemma 1.34. Let $C$ be a value-representable class in $S$ such that all its superclasses and subclasses $D_1, \dots, D_n$ are also value-representable. Then canonical update operations exist on $C$ in $S$ iff they exist on $C$ and all $D_i$ in $S'$.

Proof. By Theorem 1.22 the value-representation type $V_C$ is the result of Algorithm 1.15, hence $V_C$ does not depend on the inclusion constraints of $S$. Then we have

$$I :: ID \leftarrow \mathit{insert}'_C(V :: V_C) \;==\; I \leftarrow \mathit{insert}'_{D_1}(h_1(V)) \,;\, \dots \,;\, I \leftarrow \mathit{insert}'_{D_n}(h_n(V)) \,;\, I \leftarrow \mathit{insert}'_C(V)$$

where $h_i : V_C \to V_{D_i}$ is the function of Lemma 1.33 and the $\mathit{insert}'_C$ on the right-hand side denotes a canonical insert on $C$ in $S'$. Hence in this case the result for the insert follows by structural induction on the IsA-hierarchy.

If the subtype function $g$ required in Lemma 1.33 does not exist for some superclass $D$, then simply add $V_D$ to the input type. We omit the details for this case.

The arguments for delete and update are analogous. The value-representability of subclasses is required for the update case. □

From now on we use a global operation $\mathit{NewId}$ that produces a fresh identifier $I :: ID$. This can be represented as a method using projection.

Lemma 1.35. Let $C$ be a value-representable class in $S'$. Then there exist unique quasi-canonical update operations on $C$.

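Proof. Let $r_i : C_i$ ($i = 1, \dots, n$) denote the references in the structure expression of $C$. If $V$ is a value of type $V_C$, then there exist values $V_{ij} :: V_{C_i}$ ($i = 1, \dots, n$, $j = 1, \dots, k_i$) occurring in $V$. Let $\bar{V} = \{V_{ij}/J_{ij} \mid i = 1, \dots, n,\ j = 1, \dots, k_i\}.V$ denote the value of type $T_C$ that results from replacing each $V_{ij}$ by some $J_{ij} :: ID$. Moreover, for $I :: ID$ let

$$V_{ij}^{(I)} = \begin{cases} \{V/I\}.V_{ij} & \text{if } V \text{ occurs in } V_{ij} \\ V_{ij} & \text{else} \end{cases}$$

Then the canonical insert operation can be defined as follows:

$$\begin{array}{l}
I :: ID \leftarrow \mathit{insert}'_C(V :: V_C) \;==\; \\
\quad \exists I' :: ID, V' :: T_C \,.\, (\mathit{Pair}(I', V') \in C \wedge c(V') = V) \to I := I' \\
\quad \boxtimes\ \exists V' :: T_C \,.\, V = V' \to I \leftarrow \mathit{NewId} \,;\, x_C := x_C \cup \{(I, V)\} \\
\quad \boxtimes\ I \leftarrow \mathit{NewId} \,;\, J_{11} \leftarrow \mathit{insert}'_{C_1}(V_{11}^{(I)}) \,;\, \dots \,;\, J_{n k_n} \leftarrow \mathit{insert}'_{C_n}(V_{n k_n}^{(I)}) \,;\, x_C := x_C \cup \{(I, \bar{V})\}
\end{array}$$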

It remains to show that this operation is indeed canonical. Apply the method to some instance $\mathcal{D}$. If there already exists some $o = (I', V')$ in $C$ with $c(V') = V$, the result is $\mathcal{D}' = \mathcal{D}$ and the requirements of Definition 1.30 are trivially satisfied. Otherwise let $o = (I, \bar{V})$. If $\bar{\mathcal{D}}$ is an instance with $\mathcal{D} \subseteq \bar{\mathcal{D}}$ and $o \in \bar{\mathcal{D}}$, we have $J_{ij} \in \mathrm{dom}(C_i)$ for all $i = 1, \dots, n$, $j = 1, \dots, k_i$, since $\bar{\mathcal{D}}$ satisfies the referential constraints. Hence $\bar{\mathcal{D}}$ contains the distinguished objects corresponding to the involved quasi-canonical operations $\mathit{insert}'_{C_i}$. By induction on the length of call-sequences $\mathcal{D}_{ij} \subseteq \bar{\mathcal{D}}$ for all $i = 1, \dots, n$, $j = 1, \dots, k_i$, where $\mathcal{D}_{ij}$ is the result of $J_{ij} \leftarrow \mathit{insert}'_{C_i}(V_{ij}^{(I)})$. Hence $\mathcal{D}' = \bigcup_{ij} \mathcal{D}_{ij} \cup \{o\} \subseteq \bar{\mathcal{D}}$. The uniqueness follows from the uniqueness of $V_C$.

The definitions and proofs for delete and update are analogous. □
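The recursive shape of the canonical insert can be illustrated by a simplified sketch. It assumes acyclic references only, so the substitution $V^{(I)}_{ij}$ needed for cyclic values is left out, and it models values as dictionaries. The class `Store` and its methods are hypothetical illustrations, not the authors' method language.

```python
import itertools


class Store:
    """Tiny value-based store for classes with acyclic references (illustrative only)."""

    def __init__(self, refs):
        # refs: class name -> list of (field, referenced class name)
        self.refs = refs
        self.ext = {cls: {} for cls in refs}      # class -> {identifier: stored value}
        self._ids = itertools.count(1)

    def represent(self, cls, ident):
        """Value representation: referenced identifiers replaced by referenced values."""
        stored = self.ext[cls][ident]
        out = dict(stored)
        for field, target in self.refs[cls]:
            out[field] = self.represent(target, stored[field])
        return out

    def insert(self, cls, value):
        """Return the identifier of the object represented by `value`, creating it if needed."""
        for ident in self.ext[cls]:
            if self.represent(cls, ident) == value:
                return ident                      # first branch: object already exists
        stored = dict(value)
        for field, target in self.refs[cls]:
            # recursive branch: insert referenced objects first, keep their identifiers
            stored[field] = self.insert(target, value[field])
        ident = next(self._ids)                   # plays the role of NewId
        self.ext[cls][ident] = stored
        return ident


if __name__ == "__main__":
    store = Store({"Address": [], "Person": [("home", "Address")]})
    ann = store.insert("Person", {"name": "Ann", "home": {"city": "Kiel"}})
    bob = store.insert("Person", {"name": "Bob", "home": {"city": "Kiel"}})
    print(len(store.ext["Address"]), ann != bob)
```

Because the lookup compares value representations, both persons end up referencing the same address object, which is the sharing behaviour that hidden identifiers are meant to support.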

Theorem 1.36. Let $C$ be a value-representable class in a schema $S$ such that all its super- and subclasses are also value-representable. Then there exist unique generic update operations on $C$.

Proof. By Lemma 1.31 and Lemma 1.34 it is sufficient to show the existence of canonical update operations on $C$ and all its super- and subclasses in the schema $S'$. This follows from Lemma 1.35. □

In [50] it has been shown how linguistic reflection [56] can be exploited to generate the generic update operations for value-representable classes in an OODM schema.


1.6 The Consistency Problem 

In general a database may be considered as a triplet $(\mathcal{S}, \mathcal{O}, \mathcal{C})$, where $\mathcal{S}$ defines a structure, $\mathcal{O}$ denotes a collection of state changing operations and $\mathcal{C}$ is a set of constraints. Then the consistency problem is to guarantee that each specified operation $o \in \mathcal{O}$ will never violate any constraint $\mathcal{I} \in \mathcal{C}$. Integrity enforcement aims at the derivation of a new set $\mathcal{O}'$ of operations with $|\mathcal{O}'| = |\mathcal{O}|$ such that $(\mathcal{S}, \mathcal{O}', \mathcal{C})$ satisfies this property.

Suppose we are given a database schema $\mathcal{S}$ and a static integrity constraint $\mathcal{I}$ on that schema. Regard $\mathcal{I}$ as a logical formula defined on $\mathcal{S}$. Consistency requires that only those instances $\mathcal{D}$ of $\mathcal{S}$ are allowed that satisfy $\mathcal{I}$. Call the set of such instances $sat(\mathcal{S}, \mathcal{I})$. Each transaction is a database transformation. Such a database transformation $T$ takes an arbitrary instance $\mathcal{D}$ and possibly some input values $v_1, \dots, v_n$ and produces a new instance $\mathcal{D}'$ and possibly some output values $v_1', \dots, v_m'$. $T$ is consistent with respect to $\mathcal{I}$ iff for each $\mathcal{D} \in sat(\mathcal{S}, \mathcal{I})$ we also have $\mathcal{D}' \in sat(\mathcal{S}, \mathcal{I})$.
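On a finite set of candidate instances this notion of consistency can be tested by enumeration. The sketch below assumes deterministic transactions without input values and models instances as frozensets of (identifier, value) pairs; `is_consistent`, `unique_values`, `add_duplicate` and `add_if_new` are hypothetical names chosen for the illustration.

```python
def is_consistent(transaction, constraint, states):
    """True iff every state satisfying `constraint` is mapped to a state satisfying it."""
    return all(constraint(transaction(d)) for d in states if constraint(d))


def unique_values(d):
    """Trivial uniqueness constraint: no two objects carry the same value."""
    values = [v for _, v in d]
    return len(values) == len(set(values))


def add_duplicate(d):
    """Blindly adds an object with value 'a' under a new identifier."""
    return d | {(99, "a")}


def add_if_new(d):
    """Adds the object only if no object with value 'a' exists yet."""
    return d if any(v == "a" for _, v in d) else d | {(99, "a")}


if __name__ == "__main__":
    states = [frozenset(), frozenset({(1, "a")}), frozenset({(1, "b")})]
    print(is_consistent(add_duplicate, unique_values, states))
    print(is_consistent(add_if_new, unique_values, states))
```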

Classically, consistency is maintained at run-time by transaction monitors. Whenever an inconsistent instance is produced, the transaction that caused the inconsistency will be rolled back. This "everything or nothing" approach has been criticized, since it causes enormous run-time overhead for consistency checking and rollback. Moreover, it leaves the burden of writing consistent transactions to the user. In principle the first problem vanishes if verification techniques are used at design time [44, 57, 58], whereas the second one still remains.

As an alternative, a lot of attention has been paid to integrity enforcement. In most cases the envisioned solution is an active database [18, 27, 59, 64, 65], where production rules are used to repair inconsistencies instead of rolling back. Although this is sometimes coupled with design time (or even run-time) analysis of the rules [18, 27, 33, 63], the approach is not always successful. Moreover, a satisfying theory for rule triggering systems with respect to the integrity enforcement problem is still missing. Therefore, we favour an operational approach [51, 48, 52, 53], which aims at replacing inconsistent database transactions by consistent specializations.

1.6.1 Greatest Consistent Specializations 

In general non-deterministic partial state transitions $S$ as used in our method language can be described by a subset of $D \times D_\bot$, where $D$ denotes the set of possible states and $D_\bot = D \cup \{\bot\}$, where $\bot$ is a special symbol used to indicate non-termination. It can be shown [20, 41, 46, 44] that this is equivalent to defining two predicate transformers $wp(S)$ and $wlp(S)$ associated with $S$ satisfying the pairing condition $wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(true)$ and the universal conjunctivity of $wlp(S)$, i.e.

$$wlp(S)(\forall i \in I \,.\, R_i) \;\Leftrightarrow\; \forall i \in I \,.\, wlp(S)(R_i) .$$

The predicate transformers assign to some postcondition $R$ the weakest (liberal) precondition of $S$ to establish $R$. Clearly, pre- and postconditions are X-constraints. Informally these conditions can be characterized as follows:

– $wlp(S)(R)$ characterizes those initial states such that all terminating executions of $S$ will reach a final state characterized by $R$ provided $S$ is defined in that initial state, and

– $wp(S)(R)$ characterizes those initial states such that all executions of $S$ terminate and will reach a final state characterized by $R$ provided $S$ is defined.
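For a finite state space both transformers can be computed directly from the relational description. The following sketch assumes states are hashable Python values, predicates are sets of states, and `None` stands for $\bot$; the names `wp`, `wlp` and `BOTTOM` are chosen for illustration.

```python
BOTTOM = None   # stands for non-termination


def wlp(rel, post, states):
    """States from which every terminating execution of `rel` ends in `post`."""
    return {s for s in states
            if all(t in post for (a, t) in rel if a == s and t is not BOTTOM)}


def wp(rel, post, states):
    """States from which every execution of `rel` terminates and ends in `post`."""
    return {s for s in states
            if all(t is not BOTTOM and t in post for (a, t) in rel if a == s)}


if __name__ == "__main__":
    states = {0, 1, 2}
    # state 0 may go to 1 or 2, state 1 may loop forever or go to 2, state 2 is undefined
    rel = {(0, 1), (0, 2), (1, BOTTOM), (1, 2)}
    post = {2}
    print(wlp(rel, post, states), wp(rel, post, states))
    # pairing condition: wp(S)(R) = wlp(S)(R) and wp(S)(true)
    print(wp(rel, post, states) == wlp(rel, post, states) & wp(rel, states, states))
```

States in which the transition is undefined satisfy both transformers vacuously, which matches the proviso "provided $S$ is defined" in the informal reading above.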


The use of these predicate transformers for the definition of language semantics is usually called "axiomatic semantics". Based on this, consistency and specialization can be formally defined and used for the formal description of the consistency problem. For this purpose we define "extended operations" and therefore need to know for each operation $S$ the set of classes $\mathcal{S}'$ such that $S$ does neither read nor change the class variables $x_C$ with $C \notin \mathcal{S}'$. In this case we call $S$ an $\mathcal{S}'$-operation. We omit the formal definition [41, 51].

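Definition 1.37. Let $\mathcal{S}$ be a schema, $\mathcal{I}$ a constraint and $S$, $T$ methods defined on $\mathcal{S}_1 \subseteq \mathcal{S}$ and $\mathcal{S}_2 \subseteq \mathcal{S}$ respectively with $\mathcal{S}_1 \subseteq \mathcal{S}_2$.

(i) $S$ is consistent with respect to $\mathcal{I}$ iff $\mathcal{I} \Rightarrow wlp(S)(\mathcal{I})$ holds.

(ii) $T$ specializes $S$ iff $wp(S)(true) \Rightarrow wp(T)(true)$ and $wlp(S)(R) \Rightarrow wlp(T)(R)$ hold for all constraints $R$ with free variables $x_C$ such that $C \in \mathcal{S}_1$ (denoted $T \sqsubseteq S$).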

Hence the following definition of a greatest consistent specialization:

Definition 1.38. Let $\mathcal{S}$ be a schema, $\mathcal{I}$ a constraint and $S$ a method defined on $\mathcal{S}_1 \subseteq \mathcal{S}$. A method $S_{\mathcal{I}}$ is a Greatest Consistent Specialization (GCS) of $S$ with respect to $\mathcal{I}$ iff

(i) $S_{\mathcal{I}} \sqsubseteq S$,

(ii) $S_{\mathcal{I}}$ is consistent with respect to $\mathcal{I}$ and

(iii) for each method $T$ satisfying properties (i) and (ii) (instead of $S_{\mathcal{I}}$) we have $T \sqsubseteq S_{\mathcal{I}}$.

If only properties (i) and (ii) are satisfied, we simply talk of a consistent specialization.

Let us first state the main results from [48].

Theorem 1.39. Let $\mathcal{S}$ be a schema, $\mathcal{I}$, $\mathcal{J}$ constraints and $S$ a method defined on $\mathcal{S}_1 \subseteq \mathcal{S}$.

(i) There exists a greatest consistent specialization $S_{\mathcal{I}}$ of $S$ with respect to $\mathcal{I}$. Moreover, $S_{\mathcal{I}}$ is uniquely determined (up to semantic equivalence) by $S$ and $\mathcal{I}$.

(ii) The GCSs $(S_{\mathcal{I}})_{\mathcal{J}}$ and $S_{(\mathcal{I} \wedge \mathcal{J})}$ coincide on initial states satisfying $\mathcal{I} \wedge \mathcal{J}$.

The proof of these results heavily uses predicate transformers and is therefore omitted here.

In [51] it has been shown that a GCS, which is in general non-deterministic, can be written as a finite choice of maximal quasi-deterministic specializations (MQCSs), where quasi-determinism means determinism up to the selection of some values. In most cases this value selection can be shifted to the input, but the selection of object identifiers should be left to the system.

Next, we formally define quasi-determinism and then present the main result from [51], an algorithm for the computation of MQCSs.

Definition 1.40. A method $S$ is called quasi-deterministic iff there exist types $T_1, \dots, T_n$ such that $S$ is semantically equivalent to

$$y_1 :: T_1 \mid \dots \mid y_n :: T_n \mid S'$$

where $S'$ is a deterministic method.


Algorithm 1.41.

In: An X-operation $S$ and constraints $\mathcal{I}_1, \dots, \mathcal{I}_n$ defined on extensions $Y_1, \dots, Y_n$ of $X$.

Let $\ell$ be the list of the constraints. As long as $\ell \neq nil$ proceed as follows:

1. Set $S'_{\mathcal{I}} = S$.

2. Choose and remove one constraint $\mathcal{I}_i$ from $\ell$.

3. Check whether $S'_{\mathcal{I}}$ is $\mathcal{I}_i$-reduced. If not, stop with no result, otherwise continue.

4. Make $S'_{\mathcal{I}}$ $\boxtimes$-free by replacing each occurring $S_1 \boxtimes S_2$ by $S_1 \,\Box\, wlp(S_1)(false) \to S_2$.

5. Replace each basic assignment in $S'_{\mathcal{I}}$ by some (subsumption-free) MQCS with respect to $\mathcal{I}_i$.

6. Compute $P(S_{\mathcal{I}})$ as

$$P(S_{\mathcal{I}}) \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(\{x_1/z_1, \dots, x_n/z_n\}.S_{\mathcal{I}})(\neg wlp(S)(z_1 \neq x_1 \vee \dots \vee z_n \neq x_n))$$

where the $x_i$ are the class variables occurring in $\mathcal{I}$ or in $S$ and the $z_i$ are used as a disjoint copy of these.

7. Set $S = P(S_{\mathcal{I}}) \to S'_{\mathcal{I}}$. Set $S'_{\mathcal{I}} = S$.

Out: An operation $\mathcal{I} \to S'_{\mathcal{I}}$, where $S'_{\mathcal{I}}$ is a (subsumption-free) MQCS of the original $S$ with respect to the conjunction $\mathcal{I}$ of the constraints. □

An extension of the GCS algorithm to compute all (subsumption-free) MQCSs is easy. It has been shown in [51] that Algorithm 1.41 is correct. However, it depends on checking a very technical condition, $\mathcal{I}$-reducedness. We omit this condition here.

1.6.2 Enforcing Integrity in the OODM 

Since Algorithm 1.41 allows integrity enforcement to be reduced to the case of assignments, we may restrict ourselves to the case of a single explicit constraint in addition to the trivial uniqueness constraints that are required to assure value-representability and that are used to construct generic update operations. In the following we describe MQCSs with respect to the constraints introduced in Definition 1.5.

Inclusion Constraints. Let $\mathcal{I}$ be an inclusion constraint on $C_1$, $C_2$ defined via $c_i : T_{C_i} \to T$ ($i = 1, 2$). Then each insertion into $C_1$ requires an additional insertion into $C_2$, whereas a deletion on $C_2$ requires a deletion on $C_1$. Update on one of the $C_i$ requires an additional update on the other class.

Let us first concentrate on the insert-operation on $C_1$ (for an insert on $C_2$ there is nothing to do). Insertion into $C_1$ requires an input value of type $V_{C_1}$; an additional insert on $C_2$ then requires an input value of type $V_{C_2}$. However, these input values are not independent, because the corresponding values of type $T_{C_1}$ and $T_{C_2}$ must satisfy the general inclusion constraint. Therefore we first show that the constraint can be "lifted" to a constraint on the value-representation types. Note that this is similar to the handling of IsA-constraints in Lemma 1.33.


Lemma 1.42. Let $C_1$, $C_2$ be classes, $c_i : T_{C_i} \to T$ functions and let $V_{C_i}$ be the value-representation type of $C_i$ ($i = 1, 2$). Then there exist functions $f_i : V_{C_i} \to T$ such that for all database instances $\mathcal{D}$

$$f_1(d_1^{\mathcal{D}}(v_1)) = f_2(d_2^{\mathcal{D}}(v_2)) \;\Leftrightarrow\; c_1(v_1) = c_2(v_2) \tag{1.13}$$

holds for all $v_i \in \mathrm{codom}(\mathcal{D}(x_{C_i}))$ ($i = 1, 2$). Here $d_i^{\mathcal{D}} : T_{C_i} \to V_{C_i}$ denotes the function used in the uniqueness constraint on $C_i$ with respect to $\mathcal{D}$.

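Proof. Due to Definition 1.10 we may define $f_i = c_i \circ (d_i^{\mathcal{D}})^{-1}$ on $c_i(\mathrm{codom}(\mathcal{D}(x_{C_i})))$ ($i = 1, 2$). Then we have to show that this definition is independent of the instance $\mathcal{D}$. Suppose $\mathcal{D}_1$, $\mathcal{D}_2$ are two different instances. Then there exists a permutation $\pi$ on $ID$ such that $d_i^{\mathcal{D}_2} = d_i^{\mathcal{D}_1} \circ \pi$, where $\pi$ is extended to $T_{C_i}$. Then

$$c_i \circ (d_i^{\mathcal{D}_2})^{-1} = c_i \circ \pi^{-1} \circ (d_i^{\mathcal{D}_1})^{-1} = \pi^{-1} \circ c_i \circ (d_i^{\mathcal{D}_1})^{-1}$$

since $c_i$ permutes with $\pi^{-1}$. Then the stated equality follows. □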

Now let $V_{C_1,C_2} = V_{C_1} \times V_{C_2}$ and define the new insert-operation on $C_1$ by

$$(\mathit{insert}_{C_1})_{\mathcal{I}}((v_1, v_2) :: V_{C_1,C_2}) \;==\; f_1(v_1) = f_2(v_2) \to \mathit{insert}_{C_1}(v_1) \,;\, \mathit{insert}_{C_2}(v_2) \tag{1.14}$$

where the $f_i$ are the functions of Lemma 1.42. Note that there is no need to require $C_1 \neq C_2$. Delete- and update-operations can be defined analogously.
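The guarded insert of (1.14) can be mimicked in a small sketch. It assumes each class is modelled as a set of value representations and that a failing guard is reported by an exception, whereas in the method language the operation is simply undefined in that case. The name `insert_with_inclusion` and the example data are hypothetical.

```python
def insert_with_inclusion(ext1, ext2, v1, v2, f1, f2):
    """Insert v1 into C1 together with v2 into C2, guarded by f1(v1) = f2(v2).

    ext1, ext2 -- sets of value representations standing in for C1 and C2
    f1, f2     -- functions lifting the inclusion constraint to these values
    """
    if f1(v1) != f2(v2):
        raise ValueError("guard f1(v1) = f2(v2) fails")
    # both inserts are performed, so the inclusion of C1 in C2 is preserved
    return ext1 | {v1}, ext2 | {v2}


if __name__ == "__main__":
    name = lambda v: v[0]                 # assumed lifting: compare on the name component
    students, persons = insert_with_inclusion(
        set(), set(), ("Ann", "CS"), ("Ann", 1990), name, name)
    print(students, persons)
```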

Functional and Uniqueness Constraints. Now let $\mathcal{I}$ be a functional constraint on $C$ defined via $c_1 : T_C \to T_1$ and $c_2 : T_C \to T_2$. In this case nothing is required for the delete operation, whereas for inserts (and updates) we have to add a postcondition. Moreover, let $c^{\mathcal{D}} : T_C \to V_C$ denote the function associated with the value-representability of $C$ and the database instance $\mathcal{D}$ and let all other notations be as before. Let us again concentrate on the insert-operation. Let $\mathit{insert}'_C$ denote the canonical insert on $C$. Then we define

$$\begin{array}{l}
(\mathit{insert}_C)_{\mathcal{I}}(V :: V_C) \;==\; \\
\quad I :: ID \mid I \leftarrow \mathit{insert}'_C(V) \,;\, \\
\quad V' :: T_C \mid (I, V') \in x_C \to \\
\quad (\forall J :: ID, W :: T_C \,.\, ((J, W) \in x_C \wedge c_1(W) = c_1(V') \Rightarrow c_2(W) = c_2(V'))) \to \mathit{skip} .
\end{array} \tag{1.15}$$

Note that in this case there is no change of input type. For delete- and update-operations we have analogous definitions.
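The postcondition in (1.15) can be read as "insert, then admit the result only if the functional constraint still holds". A minimal sketch under the assumption that a class is a set of values and that a failing guard is signalled by an exception; `insert_with_fd` and the example functions are hypothetical.

```python
def insert_with_fd(extent, value, c1, c2):
    """Insert `value` only if the functional constraint c1 -> c2 holds afterwards."""
    candidate = extent | {value}
    for w in candidate:
        # any object agreeing with the new one on c1 must also agree on c2
        if c1(w) == c1(value) and c2(w) != c2(value):
            raise ValueError("functional constraint would be violated")
    return candidate


if __name__ == "__main__":
    city = lambda v: v[0]
    country = lambda v: v[1]
    towns = insert_with_fd({("Kiel", "DE")}, ("Bonn", "DE"), city, country)
    print(sorted(towns))
```

As in the text, the input type is unchanged: only the admissibility of the resulting state is checked.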

A uniqueness constraint defined via $c_1 : T_C \to T_1$ is equivalent to a functional constraint defined via $c_1$ and $c_2 = id : T_C \to T_C$ plus the trivial uniqueness constraint. Since trivial uniqueness constraints are already enforced by the canonical update operations, there is no need to handle arbitrary uniqueness constraints separately.


Exclusion Constraints. The handling of exclusion constraints is analogous to the handling of inclusion constraints. This means that an insert (update) on one class may cause a delete on the other, whereas delete-operations remain unchanged.

We concentrate again on the insert-operation. Let $\mathcal{I}$ be an exclusion constraint on $C_1$ and $C_2$ defined via $c_i : T_{C_i} \to T$ ($i = 1, 2$). Let $f_i : V_{C_i} \to T$ denote the functions from Lemma 1.42. Then we define a new insert-operation on $C_1$ by

$$\begin{array}{l}
(\mathit{insert}_{C_1})_{\mathcal{I}}(V :: V_{C_1}) \;==\; \\
\quad \mathit{insert}_{C_1}(V) \,;\, \\
\quad \mu S \,.\, ((I :: ID \mid V' :: T_{C_2} \mid (I, V') \in x_{C_2} \wedge c_2(V') = f_1(V) \to \mathit{delete}_{C_2}(V') \,;\, S) \boxtimes \mathit{skip}) .
\end{array} \tag{1.16}$$

For delete- and update-operations an analogous result holds.
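The effect of (1.16) is an insert into $C_1$ followed by the removal of every clashing object from $C_2$. The sketch below replaces the iteration by a single set comprehension and assumes classes are sets of values; `insert_with_exclusion` and the example data are hypothetical.

```python
def insert_with_exclusion(ext1, ext2, value, f1, c2):
    """Insert `value` into C1 and delete all objects of C2 that clash with it.

    f1 -- lifted constraint function on the value representation of C1
    c2 -- constraint function on the values of C2
    """
    new_ext1 = ext1 | {value}
    # keep only those objects of C2 whose image differs from that of the new object
    new_ext2 = {w for w in ext2 if c2(w) != f1(value)}
    return new_ext1, new_ext2


if __name__ == "__main__":
    name = lambda v: v[0]
    staff, guests = insert_with_exclusion(
        set(), {("Ann", "visitor"), ("Bob", "visitor")}, ("Ann", "lecturer"), name, name)
    print(staff, guests)
```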

Theorem 1.43. The methods $S_{\mathcal{I}}$ in (1.14), (1.15) and (1.16) are MQCSs of generic insert-methods with respect to inclusion, functional and exclusion constraints respectively.

The proof involves detailed use of predicate transformers and is therefore omitted here [48, 49]. Analogous results hold for delete and update.

1.7 Conclusion 

In this paper we describe first results concerning the formal foundations of object oriented database concepts. For this purpose we introduced a formal object oriented datamodel (OODM) with the following characteristics.

– Objects are considered to be abstractions of real world entities, hence they have an immutable identity. This identity is encoded by abstract identifiers that are assumed to form some type ID. This identifier concept eases the modelling of shared data and cyclic references; however, it does not relieve us from the problem to provide unique identification mechanisms for objects in a database.

– In our approach there is not only one value of a given type that is associated with an object. In contrast we allow several values of possibly different types to belong to an object, and even this collection of types may change.

– Classes are used to structure objects. At each time a class corresponds to a collection of objects with values of the same type and references to objects in a fixed set of classes. Inheritance is based on IsA relations that express an inclusion at each time of the sets of objects. Moreover, referential integrity is supported.

– We associate with each class a collection of methods. Methods are specified by guarded commands, hence the method language is computationally complete. In order to allow the handling of identifiers that are always hidden from the user as well as user-accessible transactions, a hiding operator on methods is introduced. Generic update operations, i.e. insert, delete and update on a class, are assumed to be automatically derived whenever this is possible.

– We associate integrity constraints to schemata. Certain kinds of such constraints can be obtained by generalizing corresponding constraints in the relational model. We assume that methods are automatically changed in order to enforce integrity.


On the basis of this formal OODM we study the problems of identification, genericity and integrity. We show that the unique identification of objects in a class requires the class to be


An advantage of database systems is to provide generic update operations. We show that the unique existence of such generic methods requires also value-representability. However, in this case referential and inclusion integrity can be enforced automatically. This result can be generalized with respect to distinguished classes of user-defined integrity constraints. Given some arbitrary method $S$ and some constraint $\mathcal{I}$ there exists a greatest consistent specialization (GCS) $S_{\mathcal{I}}$ of $S$ with respect to $\mathcal{I}$. Such a GCS behaves nicely in that it is compatible with the conjunction of constraints. For the GCS construction of a user-defined transaction we apply the GCS algorithm developed in [48, 51, 52, 53].

This work on mathematical foundations of OODB concepts is not yet completed. A lot of 

problems are still left open and are the matter of current investigations and future research. 

– In our approach classes are sets. What are other bulk types? Does it make sense to abstract from classes in this way?

– The problem of updatable views is still open.

– Our approach to genericity only handles the worst case expressed by the value representation type. We assume that polymorphism will help to generalize our results to the general case. Moreover, we must integrate communication aspects at least with respect to the user.

– The usual axiomatic semantics for guarded commands abstracts from an execution model. All results are true for semantic equivalence classes. However, we also need optimization, especially with respect to the derived GCSs.

– We only presented a formal OODM without looking into methodological aspects such as the characterization of good designs.

We express the hope that others will also contribute to solve open problems in OODB foundation 

or in the implementation of more sophisticated object oriented database languages on 

a sound mathematical basis. 

References for Chapter 1 

1. S. Abiteboul: Towards a deductive object-oriented database language, Data & Knowledge Engineering, vol. 5, 1990, pp. 263–287
2. S. Abiteboul, R. Hull: IFO: A Formal Semantic Database Model, ACM ToDS, vol. 12 (4), December 1987, pp. 525–565
3. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD, Portland Oregon, 1989, pp. 159–173
4. A. Albano, G. Ghelli, R. Orsini: Types for Databases: The Galileo Experience, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 27–37
5. A. Albano, A. Dearle, G. Ghelli, C. Marlin, R. Morrison, R. Orsini, D. Stemple: A Framework for Comparing Type Systems for Database Programming Languages, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, 1990
6. A. Albano, G. Ghelli, R. Orsini: Objects and Classes for a Database Programming Language, FIDE technical report 91/16, 1991


7. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented Database Programming Language, in A. Sernadas (Ed.): Proc. VLDB 91, Barcelona 1991
8. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented Database System Manifesto, Proc. 1st DOOD, Kyoto 1989
9. F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel, S. Gamerman, C. Lecluse, P. Pfeffer, P. Richard, F. Velez: The Design and Implementation of O2, an Object-Oriented Database System, Proc. of the ooDBS II workshop, Bad Münster, FRG, September 1988
10. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, pp. 370-395
11. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol. 5 (4), 1990, pp. 353-382
12. C. Beeri, Y. Kornatzky: Algebraic Optimization of Object-Oriented Query Languages, in S. Abiteboul, P. C. Kanellakis (Eds.): Proc. ICDT '90, Springer LNCS 470, pp. 72-88
13. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92
14. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM Computing Surveys 17 (4), pp. 471-522
15. L. Cardelli: Typeful Programming, Digital Systems Research Center Reports 45, DEC SRC Palo Alto, May 1989
16. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. ACM SIGMOD 88
17. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the Workshop on Database Programming Languages, Roscoff, France, September 1987
18. S. Ceri, J. Widom: Deriving Production Rules for Constraint Maintenance, Proc. 16th Conf. on VLDB, Brisbane (Australia), August 1990, pp. 566-577
19. A. Dearle, R. Connor, F. Brown, R. Morrison: Napier88 - A Database Programming Language?, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, 10-26
20. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer-Verlag, 1989
21. H.-D. Ehrich, M. Gogolla, U. Lipeck: Algebraische Spezifikation abstrakter Datentypen, Teubner-Verlag, 1989
22. H.-D. Ehrich, A. Sernadas: Fundamental Object Concepts and Constructors, in G. Saake, A. Sernadas (Eds.): Information Systems - Correctness and Reusability, TU Braunschweig, Informatik Berichte 91-03, 1991
23. H. Ehrig, B. Mahr: Fundamentals of Algebraic Specification, vol. 1, Springer 1985
24. L. Fegaras, T. Sheard, D. Stemple: The ADABTPL Type System, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, 45-56
25. L. Fegaras, T. Sheard, D. Stemple: Uniform Traversal Combinators: Definition, Use and Properties, University of Massachusetts, 1992
26. D. Fishman, D. Beech, H. Cate, E. Chow et al.: IRIS: An Object-Oriented Database Management System, ACM ToIS, vol. 5 (1), January 1987
27. P. Fraternali, S. Paraboschi, L. Tanca: Automatic Rule Generation for Constraint Enforcement in Active Databases, in U. Lipeck (Ed.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects "MODELLING DATABASE DYNAMICS", Volkse (Germany), October 19-22, 1992
28. G. Gottlob, G. Kappel, M. Schrefl: Semantics of Object-Oriented Data Models - The Evolving Algebra Approach, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991
29. M. Hammer, D. McLeod: Database Description with SDM: A Semantic Database Model, J. ACM, vol. 31 (3), 1984, pp. 351-386
30. A. Heuer, P. Sander: Classifying Object-Oriented Results in a Class/Type Lattice, in B. Thalheim et al. (Ed.): Proceedings MFDBS 91, Springer LNCS 495, pp. 14-28
31. R. Hull, R. King: Semantic Database Modeling: Survey, Applications and Research Issues, ACM Computing Surveys, vol. 19 (3), September 1987


32. R. Hull, M. Yoshikawa: ILOG: Declarative Creation and Manipulation of Object Identifiers, in Proc. 16th VLDB, Brisbane (Australia), 1990, pp. 455-467
33. A. P. Karadimce, S. D. Urban: Diagnosing Anomalous Rule Behaviour in Databases with Integrity Maintenance Production Rules, in Proc. 3rd Int. Workshop on Foundations of Models and Languages for Data and Objects, Aigen (Austria), September 1991, pp. 77-102
34. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon, 1986
35. M. Kifer, J. Wu: A Logic for Object-Oriented Logic Programming (Maier's O-Logic Revisited), in PODS '89, pp. 379-393
36. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented Programming System with a Database System, in Proc. OOPSLA 1988
37. D. Maier, J. Stein, A. Otis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA, September 1986
38. F. Matthes, J. W. Schmidt: Bulk Types - Add-On or Built-In?, in Proc. DBPL III, Nafplion 1991
39. J. Mylopoulos, P. A. Bernstein, H. K. T. Wong: A Language Facility for Designing Interactive Database-Intensive Applications, ACM ToDS, vol. 5 (2), April 1980, pp. 185-207
40. J. Mylopoulos, A. Borgida, M. Jarke, M. Koubarakis: Telos: Representing Knowledge About Information Systems, ACM ToIS, vol. 8 (4), October 1990, pp. 325-362
41. G. Nelson: A Generalization of Dijkstra's Calculus, ACM TOPLAS, vol. 11 (4), October 1989, pp. 517-561
42. A. Ohori: Representing Object Identity in a Pure Functional Language, Proc. ICDT 90, Springer LNCS, pp. 41-55
43. G. Saake, R. Jungclaus: Specification of Database Applications in the TROLL Language, in Proc. Int. Workshop on the Specification of Database Systems, Glasgow, 1991
44. K.-D. Schewe, I. Wetzel, J. W. Schmidt: Towards a Structured Specification Language for Database Applications, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems, Springer WICS, 1991, pp. 255-274 (an extended version appeared as FIDE technical report 1991/30, October 1991)
45. K.-D. Schewe, B. Thalheim, I. Wetzel, J. W. Schmidt: Extensible Safe Object-Oriented Design of Database Applications, University of Rostock, Preprint CS-09-91, September 1991
46. K.-D. Schewe: Spezifikation datenintensiver Anwendungssysteme (in German), lecture manuscript, University of Hamburg, Winter 1991/92
47. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in Object-Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 341-356
48. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity Enforcement in Object-Oriented Databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects "MODELLING DATABASE DYNAMICS", Volkse (Germany), October 19-22, 1992
49. K.-D. Schewe, B. Thalheim, I. Wetzel: Foundations of Object Oriented Database Concepts, University of Hamburg, Report FBI-HH-B-157/92, October 1992
50. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992
51. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint CS-08-92, December 1992
52. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity Preserving Updates in Object Oriented Databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. 4th Australian Database Conference, Brisbane, February 1993, World Scientific, pp. 171-185
53. K.-D. Schewe, B. Thalheim: Exceeding the Limits of Rule Triggering Systems to Achieve Consistent Transactions, submitted for publication
54. M. H. Scholl, H.-J. Schek: A Relational Object Model, in Proc. ICDT 90, Springer LNCS, pp. 89-105


55. G. M. Shaw, S. B. Zdonik: An Object-Oriented Query-Algebra, IEEE Data Engineering, vol. 12 (3), 1989, pp. 29-36
56. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages, in Proc. HICSS '92
57. D. Stemple, S. Mazumdar, T. Sheard: On the Modes and Meaning of Feedback to Transaction Designer, in Proc. SIGMOD 1987, pp. 375-386
58. D. Stemple, T. Sheard: Automatic Verification of Database Transaction Safety, ACM ToDS, vol. 14 (3), September 1989
59. M. Stonebraker, A. Jhingran, J. Goh, S. Potamianos: On Rules, Procedures, Caching and Views in Database Systems, in Proc. SIGMOD 1990, pp. 281-290
60. S. Y. W. Su: SAM*: A Semantic Association Model for Corporate and Scientific-Statistical Databases, Inf. Sci., vol. 29, 1983, pp. 151-199
61. B. Thalheim: Dependencies in Relational Databases, Teubner Leipzig, 1991
62. B. Thalheim: The Higher-Order Entity-Relationship Model, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991
63. S. D. Urban, L. Delcambre: Constraint Analysis: a Design Process for Specifying Operations on Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990
64. J. Widom, S. J. Finkelstein: Set-oriented Production Rules in Relational Database Systems, in Proc. SIGMOD 1990, pp. 259-270
65. Y. Zhou, M. Hsu: A Theory for Rule Triggering Systems, in Proc. EDBT '90, Springer LNCS 416, pp. 407-421


Chapter 2 

Identification as a Primitive of Database Models

Contents 

2.1 Introduction
2.2 The Identification Problem
2.3 Identification Concepts in Databases
2.4 Comparison of Identification Concepts
2.5 Conclusion


Catriel Beeri, Bernhard Thalheim. Identification as a Primitive of Database Models. In T. Polle, T. Ripke, K.-D. Schewe. Fundamentals of Information Systems. Kluwer 1998.


Abstract. Identification is one of the main primitives of database technology. Whereas identification of real world entities by humans is an extremely flexible mechanism, identification in a database system is severely restricted, since the identification mechanism used in it depends on the data model and the type system on which it is based. To understand the modelling power of a data model, it is necessary to understand the identification mechanism it supports. Thus, this paper surveys and analyses the identification mechanisms of database models.


2.1 Introduction

Databases are used to represent entities¹ of the real world. On a sufficiently high level of abstraction, every thing we deal with in our life, whether concrete or abstract, is an entity. However, to facilitate the construction of a world model, and certainly if one wants to use such a model as a basis of a database representation, it is useful to distinguish between entities, properties of entities, associations between entities, etc. In a computerized system, some entities are represented by atomic values (numbers, for example), whereas others are represented by structured, non-atomic values, such as tuples in the relational model, or by objects in object-oriented systems. Properties and associations are represented as part of the structures representing the entities, or as additional structures. For example, an employee tuple in a relational database contains the values for its properties of interest, and may also contain values that represent relationships, for example a department number. If the relationship between employees and projects is many-to-many, then a separate relation may store the tuples describing it. In an object-oriented database, properties of an entity represented by an object are stored with it as associated values or objects. In either case, we may say that properties and associations are described by structures.

Entities in the real world have the following properties:

- An entity is uniquely identified by its history, and by its properties and associations.
- Its set of properties and associations can be arbitrary.
- It has a life cycle: it is created, it exists, then it ceases to exist.
- An entity can exist independently of other entities.

Note that this holds even for entities that on first thought we may believe not to satisfy some of the above. For example, nails in a box exist, and each is a unique physical entity. Furthermore, each is uniquely identified at every point of time by its location in the box. Time-independent identification, for example by minute differences in lengths or weights, may also exist. The last property may not hold for abstract entities, i.e., entities that are conceptual, rather than physical.

The fact that the set of properties is arbitrary is important in real life. We recognize other people by many different properties. We may believe we know somebody by his hair colour. Meeting him after twenty years, the colour has changed, or the hair is gone, yet we do know him.

The flexibility that exists in the real world cannot be directly supported in computerized representations. When we choose to represent a universe of discourse in a database, we restrict the properties and associations we care to represent to a finite, pre-specified set. Although this set is arbitrary, in the sense that we can choose it as we like, it is fixed by the choice.

¹ We use 'entity' here in the normal natural language sense. It should not be confused with its (closely related but technical) use in the entity-relationship model.


Furthermore, our choices regarding what to represent are guided by feasibility and cost. While we can, if we wish, record the location of each nail at each point of time, in practice we are ready to pay the price of such a system for locating cars, but not nails. A primary goal of the representations we choose is to allow us to uniquely identify entities, as this is the basis for proper use and manipulation. The restrictions on representations impose severe limits on how we can uniquely identify the (representations of the) entities in the database. This applies not only to the cases where we have given up the option of unique representation, such as the nail box, but also to many of the cases where our representation is 'full'.

While identification in relational databases has been solved by the key concept, the issue is still vague in OODB's. By way of motivation, one of the authors has performed an experiment on a commercial OODB. Three objects with the name 'John' and cyclic references 'friends' between them were created. The query 'How many John's are in the database' was run several times. The results were 3, 6, 9 for the first three runs, respectively, and increased similarly for subsequent runs. It seems very plausible that the failure has to do with unique identifiability of the three objects.

Overview of the paper

In this paper we discuss the representation of entities in information systems, and the identification mechanisms of different database models. We distinguish several notions of identification, and in particular between identification and separability, and we show that currently implemented mechanisms are limited.

Section 2.2 discusses the identification problem in general and for object-oriented databases. Section 2.3 introduces different identification concepts. These concepts are compared in Section 2.4. Section 2.5 demonstrates that there are further concepts which can be used for identification as well.

2.2 The Identification Problem

Identification is intimately related to equality. In the real world, to say that entities $t_1$ and $t_2$ are equal means that they are the same, that is, they are identical. To uniquely identify an entity is to be able to separate it from any entity that is not identical to it. As mentioned above, in the real world, we may identify entities by arbitrary combinations of properties and associations, which may change over time. The situation is further complicated by the fact that entities are often related to roles and abstractions, and it is not always clear in a statement to which of those one relates.

Consider the following equalities: Clinton = Clinton, Cicero = Ford, and Clinton = The President of the USA. Clinton and Ford refer to physical entities that existed (each at a certain time). Thus, the first equality is trivially true, it is an identity, and the second is false. Is the third equality true or false? If we take it to mean that these two different names, denoting two conceptual personalities Clinton and The President of the USA, are an identical physical person, then it is true. If the intention is that the two conceptual entities are identical, it is false. And note that if we say it is true, then Ford = The President of the USA was also true at some time.

In the real world, such differences are resolved in a variety of ways: by context, by asking for clarification, by misunderstanding, and so on. In a database system using current technology the meaning has to be clear, or, at most, resolution should be obvious given schema information. Note that deducing that some objects in a database are identical can have nontrivial consequences. Consider the situation that in a database we have:

{books} <--- The President,    Clinton ---> {business friends}.

By the equality Clinton = The President the object Clinton can inherit the properties of the President, e.g. the books, and the President inherits the business friends of Clinton.

Logical Fundamentals of Identification

Computerized systems are one instance of formal systems for world representation. It is of interest to consider how equality and identification were treated in other domains that deal with such systems.

In philosophy and the study of logic various principles have been considered together with the equality concept (for a theory of equality see [5], [13], [17], [8]).

The indistinguishability principle [7] formulated by Leibniz [16] states that entities which cannot be distinguished by the unary formulas or predicates of the given language are equal, i.e.

$x = y$ iff $\forall P\, (P(x) \leftrightarrow P(y))$.

The characterization of entities is related to the abstraction principle in the sense in which it is used in database modelling, i.e. abstracting from most properties and concentrating on some properties. The presented indistinguishability depends on the chosen language and on the applicability in the case of partial predicates. Thus, denoting by $P(x)!$ the applicability of $P$ to $x$, we have two versions of Leibniz' principle:

1. $\forall P\, \big( (P(x)! \wedge P(y)!) \rightarrow (P(x) \leftrightarrow P(y)) \big)$
2. $\forall P\, \big( (P(x)! \leftrightarrow P(y)!) \wedge ((P(x)! \wedge P(y)!) \rightarrow (P(x) \leftrightarrow P(y))) \big)$

The first version restricts distinction to those predicates which are defined on $x$ and $y$ at the same time. If $P$ is not defined on one of the entities and defined on the other, then this difference is not used for distinction. The second version permits this difference to be used for distinction.
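The two versions of the principle can be sketched in Python for a finite set of predicates. This is only an illustration under assumptions that are not part of the text: a partial predicate is encoded as a function returning `None` where it is not applicable, and the employee records and predicate names are hypothetical.

```python
from typing import Callable, Iterable, Optional

# Encoding assumed for this sketch: a partial predicate returns True or False
# where it applies, and None where it does not apply.
PartialPredicate = Callable[[dict], Optional[bool]]


def applicable(p: PartialPredicate, x: dict) -> bool:
    """P(x)!: the predicate p is defined on x."""
    return p(x) is not None


def indistinguishable_v1(x, y, predicates: Iterable[PartialPredicate]) -> bool:
    """Version 1: only predicates defined on both x and y can separate them."""
    return all(
        p(x) == p(y)
        for p in predicates
        if applicable(p, x) and applicable(p, y)
    )


def indistinguishable_v2(x, y, predicates: Iterable[PartialPredicate]) -> bool:
    """Version 2: a difference in applicability also separates x and y."""
    return all(
        applicable(p, x) == applicable(p, y)
        and (not applicable(p, x) or p(x) == p(y))
        for p in predicates
    )


# Hypothetical example data: y has no department recorded.
def named_ann(e):
    return e["name"] == "Ann"


def in_dept_10(e):
    return e["dept"] == 10 if "dept" in e else None


x = {"name": "Ann", "dept": 10}
y = {"name": "Ann"}
predicates = [named_ann, in_dept_10]

same_v1 = indistinguishable_v1(x, y, predicates)  # in_dept_10 is ignored
same_v2 = indistinguishable_v2(x, y, predicates)  # in_dept_10 separates x, y
```

Under version 1 the two records count as indistinguishable, because the only predicate that could separate them is undefined on `y`; under version 2 the missing department is itself a distinguishing observation.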

The principle is also related to the observation property. If for a given entity its identity can be observed on the basis of a calculus then this observability can be used for identification as well. Observation is closely related to and crucially depends on scope. A scope defines what is visible from a given viewpoint, and only what is visible can be used for identification. For example, a view over a database often presents less information than the complete database, and that may prevent entity representations from being distinguishable from each other in the view. (This is closely related to the view update problem.)

Summarizing, (in)distinguishability depends on the languages which are used for representation of entities, and for querying them. These characteristics hold in information systems as well, as shown in the sequel.

Values and Objects 

Values are the basic building blocks of data. Atomic values represent universally known abstractions. For example, numbers are atomic types. By 'universally known abstraction' we mean that it has a standard meaning that is known to a large community; further, in this community, there are accepted denotation(s) for it. Certainly, numbers satisfy these requirements. Values can also be combined in various ways to form structures, such as tuples, lists or sets. These are non-atomic or structured values. Most often values are partitioned into sets, called domains, such as the set of integers, the set of characters, and so on; a value is an element of a domain of values. Each value has a fixed, user-visible denotation/representation, bound to an element of the domain, and these have the same form for all values in a domain. A system normally supports several domains of atomic values, several kinds of non-atomic values that can be constructed from them, and, depending on the intended semantics, a collection of functions, also called operations, on these domains. Values do not change; the operations only map values to other values. They are not created, nor do they cease to exist.

The other kind of data in a computerized system are objects. They normally represent entities or abstractions that are not necessarily universally known, and whose existence is not pre-wired into the system. Their properties are influenced by those of such entities and by the representation method, and are the following [6]:

- An object has an internal structure and has a state according to this internal structure.
- It has a life cycle: it is created, it can be modified and it is finally removed.
- Its identity cannot be changed during its lifetime. Identity is that property of an object which distinguishes each object from all others [12].
- An object can exist independently from other objects.²

The internal structure of an object serves as a representation of its properties, and possibly also of (some of) its associations.³ At any point of time, this structure provides the values of the properties at that time. Just as is the case for real world entities, properties can be changed; this is modification. However, the identity of the object never changes, throughout its lifetime. A value can be seen as a special kind of object that has no properties except itself (for an atomic value), or its components (for a structured value). Hence, a value is never modified. Whereas numbers are immutable and exist forever, an employee object is created when an employee is hired, its properties are subject to change (e.g., a salary raise), and when the employee quits the company, its object is removed.

While objects differ from values in several ways as described above, the difference that is often assumed as the primary concept that distinguishes OODB's from previous models is the existence of object identity. Simply stated, an object has an identity that is independent of its properties and associations, and is immutable. The object properties, or associations in which it participates, can change, but the identity never changes throughout its lifetime. This idea is considered as a cornerstone for proper representation of real world entities. In particular, each object represents a unique entity, and only one object represents each entity.

But how is this requirement accomplished in a database system? A common approach is to implement identification by means of object identifiers (o-id's). O-id's are system supplied (and implementation-dependent) atomic items, used solely for the purpose of identifying objects in the system.⁴ An identifier is assigned to an object upon creation, and it never changes. The uniqueness and immutability of identifiers guarantee that the system can uniquely identify each object throughout its lifetime. That means that they can be used for access structures, or to allow an object to be an attribute value of another (or of several other objects), by using the o-id as a surrogate. However, o-id's are considered internal, their values being meaningless to users, hence the only operation on them that is permitted to users is to ask whether two o-id's (equivalently, two objects) are identical. This is commensurate with the OODB philosophy of encapsulation: the internal state of an object can be observed and manipulated exclusively through its interface. Indeed, if o-id's were made available for users to view, they would simply be just another attribute value, like employee numbers.

² In some models that incorporate a notion of composite object, the existence of an object may depend on that of others.

³ In the currently accepted models, relationships as a separate concept are not supported. See [18] for an influential paper that suggests an extension of object models with relationships.

⁴ The o-id concept is similar to that of surrogate or tuple identifier in relational databases.

But now we observe that if the user cannot see the values of o-id's, they cannot serve him/her for identifying objects! That is, while o-id's may serve a useful role at the implementation level, they serve no such role at the conceptual level. Thus, as noted in a previous work [3], identifiers are an implementational, not a conceptual, concept. We note that some systems actually use a physical address as the o-id, and this may change with physical reorganization. In such systems, the o-id certainly cannot serve for conceptual identification, although from the system's point of view, since such changes guarantee integrity of references, the o-id's can be considered immutable.

The identification of objects at the conceptual and external levels must ultimately rely on values, just as in the value-based models. The difference (if any) is that an OODB has a rich structure, hence many more ways in which values can be associated with objects for identification. Further, the rich structure possibly allows the structure itself to serve as an identification or equality mechanism. For example, consider the two well-known notions of equality for objects, based on the values in this representation. In shallow equality, two objects are (shallow) equal if they have the same structure, and the values in the structures are pairwise equal. Note that a component of a structure may be an object, and then equality as identity is used. In deep equality, two objects are equal if their structures match, and for each pair of matching components, either they are equal values, or deep equal objects. Note that in the real world, entities are identified by value properties (such as hair colour, timbre of voice, height), or by associations with other entities that have value-based identification. Although identification in the real world is flexible and potentially complex, it is eventually value-based.
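The two notions of equality can be illustrated by a small Python sketch. It rests on assumptions that go beyond the text: an object is modelled as a hypothetical class `Obj` whose state is a dictionary of attributes, Python's own object identity plays the role of the o-id, and cyclic references are handled by treating a pair that is already under comparison as equal.

```python
class Obj:
    """Hypothetical object: identity is Python identity, state is a dict."""

    def __init__(self, **state):
        self.state = dict(state)


def shallow_equal(a: Obj, b: Obj) -> bool:
    """Same structure; values equal, object components identical."""
    if a.state.keys() != b.state.keys():
        return False
    for key in a.state:
        x, y = a.state[key], b.state[key]
        if isinstance(x, Obj) or isinstance(y, Obj):
            if x is not y:          # object components: equality as identity
                return False
        elif x != y:                # value components: value equality
            return False
    return True


def deep_equal(a, b, assumed=None) -> bool:
    """Matching structures; components are equal values or deep equal objects."""
    assumed = set() if assumed is None else assumed
    if not isinstance(a, Obj) or not isinstance(b, Obj):
        # A value is never equal to an object in this sketch.
        return not isinstance(a, Obj) and not isinstance(b, Obj) and a == b
    if a is b or (id(a), id(b)) in assumed:
        return True                 # pair already under comparison (cycle)
    if a.state.keys() != b.state.keys():
        return False
    assumed.add((id(a), id(b)))
    return all(deep_equal(a.state[k], b.state[k], assumed) for k in a.state)


# Two employees in two distinct but equal-looking departments.
d1, d2 = Obj(name="Sales"), Obj(name="Sales")
e1, e2, e3 = Obj(name="John", dept=d1), Obj(name="John", dept=d2), Obj(name="John", dept=d1)

# Two 'John' objects that reference each other as friends.
j1, j2 = Obj(name="John"), Obj(name="John")
j1.state["friend"], j2.state["friend"] = j2, j1
```

Here `e1` and `e3` are shallow equal because they share the department object, `e1` and `e2` are only deep equal, and the mutually referencing objects `j1` and `j2` are deep equal although no value distinguishes them.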

Among implemented or proposed OODB models we can distinguish three different kinds:

Value-based databases: All objects are value-identifiable, i.e. can be identified by values of their (public) attributes or by an unfolding or unnesting of the values. This means that a subset of the public attributes serves as a key for a class.

Value-representable databases: All objects are reference-identifiable, where reference-identifiability can be recursively defined as follows:

- Each value-identifiable object is also reference-identifiable.
- If an object is identified by a combination of attribute values and by references to or from a set of objects such that each object in this set is reference-identifiable, then the object is itself reference-identifiable.

Identifier-based databases: There are objects which are not reference-identifiable.
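The recursive definition of reference-identifiability can be read as a least fixpoint. The following Python sketch computes it under simplifying assumptions that are not part of the text: the set of value-identifiable objects is given, and the hypothetical mapping `identified_via` lists, for each remaining object, the objects whose references (together with attribute values) would identify it.

```python
def reference_identifiable(objects, value_identifiable, identified_via):
    """Least fixpoint of the recursive definition (a sketch).

    objects            -- iterable of object names
    value_identifiable -- set of objects identified by attribute values alone
    identified_via     -- dict: object -> set of objects it is identified through
    """
    known = set(value_identifiable)       # base case of the definition
    changed = True
    while changed:
        changed = False
        for o in objects:
            if o in known:
                continue
            deps = identified_via.get(o)
            # Recursive case: every object used for identification must
            # itself already be reference-identifiable.
            if deps is not None and deps <= known:
                known.add(o)
                changed = True
    return known


# Hypothetical example: 'emp' is identified through 'dept'; 'p' and 'q'
# are identified only through each other.
objects = ["dept", "emp", "p", "q"]
identifiable = reference_identifiable(
    objects,
    value_identifiable={"dept"},
    identified_via={"emp": {"dept"}, "p": {"q"}, "q": {"p"}},
)
```

In this example `p` and `q` depend only on each other and are never reached by the fixpoint, so a database containing them would fall into the identifier-based class.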

Figure 2.1 depicts the relationships between these classes. Note that we do not claim that no other methods for identification exist. Finding methods that are efficient yet expressive is a subject for research.

The Identification Problem in Object-Oriented Databases

In summary of the discussion above, the issue of identification of objects in OODB's is not solved by the use of o-id's and is far from being well understood. In particular, in the last class it is possible for a database to contain objects that cannot be distinguished from each other.

[Figure: hierarchy diagram with the nodes database, value-oriented database, object-oriented database, value-based database, value-representable database, non-value-based database, identifier-based database; drawing not recoverable]

Fig. 2.1. Classification of databases

We now illustrate the problem. For simplicity, we use a simple graph model (similar models have been used in, e.g., GOOD [11]). Object graphs are defined on a set $O \cup V$ of nodes, where $O$ is a set of (abstract) objects and $V$ is a set of (atomic) values, and a set $L$ of edge labels. Labels can be $\in$, state or names (used as attribute names). Type constructor names and class names are assumed to be elements of $V$. Thus, $V$ may contain values such as tuple, set, emp-class. Now an object graph $G$ is given by a finite set $N$ of nodes and a finite set $E$ of labeled edges, i.e. $E \subseteq N \times L \times N$. The label $\in$ appears on an edge that connects an object to its class, or an element to a set that contains it. The label state connects an object to a tuple value, which represents its state. A name label connects a tuple to a component. Thus, this simple model can describe complex types constructed by tuple and set constructors, object classes, and object states (without encapsulation).
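A minimal Python sketch of such an object graph may help to fix the notation. The class name `ObjectGraph`, the string `"in"` standing for the label $\in$, and the example nodes are assumptions of this sketch; tuple nodes for object states are left out.

```python
from dataclasses import dataclass, field

MEMBER = "in"  # stands for the edge label "element of" in this sketch


@dataclass
class ObjectGraph:
    objects: set = field(default_factory=set)  # nodes taken from O
    values: set = field(default_factory=set)   # nodes taken from V
    edges: set = field(default_factory=set)    # E, a subset of N x L x N

    @property
    def nodes(self):
        return self.objects | self.values

    def add_edge(self, source, label, target):
        # Edges may only connect nodes that belong to the graph.
        if source not in self.nodes or target not in self.nodes:
            raise ValueError("both end points must be nodes of the graph")
        self.edges.add((source, label, target))

    def outgoing(self, node):
        return {(label, t) for (s, label, t) in self.edges if s == node}


# Two objects of the class emp-class; the class name is a value node.
g = ObjectGraph(objects={"o1", "o2"}, values={"emp-class"})
g.add_edge("o1", MEMBER, "emp-class")
g.add_edge("o2", MEMBER, "emp-class")
g.add_edge("o1", "friend", "o2")
```

Storing edges as a set of triples mirrors the definition of $E$ directly and keeps the later notions (adjacency preservation, symmetry) easy to state.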

Let us consider the graphs shown in figure 2.2. In (a), the objects $o_1$, $o_2$, $o_3$ cannot be distinguished from one another. They have the same outgoing and incoming edges, i.e. the graph is completely symmetric. However, if somehow $o_1$ could be distinguished from $o_2$, then all three objects can be distinguished from each other. In (b), since there are value nodes and $a \neq b$, the objects $o_4$, $o_5$, $o_6$ can be distinguished either by their outgoing or incoming edges.

[Figure: two object graphs; (a) over the objects $o_1$, $o_2$, $o_3$, (b) over the objects $o_4$, $o_5$, $o_6$ and the values $a$, $b$; edges labeled $s$; drawing not recoverable]

Fig. 2.2. Identification in Object-Oriented Databases

We can use identification trees for presenting the local structure of the graph around each object. The trees in figure 2.3 show the similarity of the neighborhoods for the three objects. If $(o_i, s, o_j) \in E$, the inverse edge from $o_j$ to $o_i$ is used for reversing the direction.

[Figure: identification trees rooted at $o_1$, $o_2$ and $o_3$; drawing not recoverable]

Fig. 2.3. Trees of depth 1 for $o_1$, $o_2$, $o_3$

The graph in figure 2.4 also has non-trivial symmetries, and the objects cannot be uniquely identified. However, the objects are divided into two sets that can be distinguished from each other. Objects $o_2$, $o_3$ can be distinguished, that is separated from each other, iff objects $o_1$, $o_4$ can be distinguished.

[Figure: object graph over the objects $o_1$, $o_2$, $o_3$, $o_4$ with edge labels $s_1$, $s_2$, $s_3$; drawing not recoverable]

Fig. 2.4. Objects which cannot be distinguished

The examples demonstrate that the fact that objects have o-id's cannot serve for identification. They illustrate that objects may be identifiable conditional on the identifiability of others, and the close relationship between identifiability and distinguishability. Several general approaches to identification and distinguishability are introduced and discussed next.
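The idea of comparing identification trees can be sketched in Python as follows. The edge sets are assumptions, since the original drawings are not available here: graph (a) is taken to be a directed cycle over three objects, and graph (b) the same cycle with two additional edges to the values `a` and `b`. Object names are deliberately left out of the tree descriptions, because o-id's are not visible to the user, whereas values are.

```python
def neighbourhood(node, edges, values, depth):
    """Canonical description of the graph around `node` up to `depth` edges."""
    if node in values:
        return ("value", node)      # values are visible and identify themselves
    if depth == 0:
        return ("object",)          # objects stay anonymous
    outgoing = sorted(
        (label, neighbourhood(t, edges, values, depth - 1))
        for (s, label, t) in edges if s == node
    )
    incoming = sorted(
        (label, neighbourhood(s, edges, values, depth - 1))
        for (s, label, t) in edges if t == node
    )
    return ("object", tuple(outgoing), tuple(incoming))


def groups_with_equal_trees(objects, edges, values, depth):
    """Partition the objects by their identification tree of the given depth."""
    groups = {}
    for o in sorted(objects):
        groups.setdefault(neighbourhood(o, edges, values, depth), []).append(o)
    return sorted(groups.values())


# Assumed shape of graph (a): a symmetric directed cycle.
edges_a = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
groups_a = groups_with_equal_trees({"o1", "o2", "o3"}, edges_a, set(), 3)

# Assumed shape of graph (b): a cycle with edges to the values a and b.
edges_b = {("o4", "s", "o5"), ("o5", "s", "o6"), ("o6", "s", "o4"),
           ("o4", "s", "a"), ("o5", "s", "b")}
groups_b = groups_with_equal_trees({"o4", "o5", "o6"}, edges_b, {"a", "b"}, 1)
```

Objects with different trees can be told apart. Equal trees up to some depth only show that no difference is visible within that many steps, so the sketch gives a necessary condition for indistinguishability rather than a complete test.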

2.3 Identification Concepts in Databases

We have already mentioned the option of identifying objects by values of their attributes, or additionally by associations to other objects. Generally, objects can be identified by their position in the graph as well. We now consider concrete formalizations of these ideas.

The first two ideas concern homomorphic mappings on graphs. Let $G = (N, E)$ and
$G' = (N', E')$ be two object graphs. A mapping $h : N \to N'$ preserves node labels if, for
each $u \in N \cap V$, $h(u) = u$. It preserves adjacency if for all nodes $u, v$ in $G$, and for each
label $s$, if there exists an edge $(u, s, v)$ in $G$ then there exists an edge $(h(u), s, h(v))$ in $G'$. If,
additionally, whenever $(h(u), s, h(v))$ is in $G'$ there is an edge $(u, s, v)$ in $G$, then we say it
strongly preserves adjacency.

The mapping $h$ is called a g-homomorphism if it maps $N$ onto $N'$ and both preserves node
labels and strongly preserves adjacency. It is a g-isomorphism if it is, in addition, a bijective map. We
denote a g-homomorphism by $h : G \to G'$.
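The three conditions of a g-homomorphism can be checked mechanically on small graphs. The following is a minimal sketch, assuming a graph is given as a set of nodes plus a set of (source, label, target) triples and that the value nodes are listed separately; the function name, the node names and the label "name" are illustrative assumptions, not notation from the paper.

```python
def is_g_homomorphism(h, nodes, edges, values, target_nodes, target_edges):
    """Check that the dict h is a g-homomorphism from (nodes, edges) to the target graph."""
    # h must map N onto N'.
    if {h[u] for u in nodes} != set(target_nodes):
        return False
    # Node labels are preserved: every value node is mapped to itself.
    if any(h[u] != u for u in nodes if u in values):
        return False
    # Adjacency is preserved: every edge of G has an image edge in G'.
    if any((h[u], s, h[v]) not in target_edges for (u, s, v) in edges):
        return False
    # Adjacency is strongly preserved: an image edge implies the original edge.
    labels = {s for (_, s, _) in target_edges}
    for u in nodes:
        for v in nodes:
            for s in labels:
                if (h[u], s, h[v]) in target_edges and (u, s, v) not in edges:
                    return False
    return True


VALUES = {"a", "b"}

# Two objects with the same value: they can be merged into one node "o".
same_nodes = {"o1", "o2", "a"}
same_edges = {("o1", "name", "a"), ("o2", "name", "a")}
merge_same = {"o1": "o", "o2": "o", "a": "a"}
can_merge_same = is_g_homomorphism(
    merge_same, same_nodes, same_edges, VALUES, {"o", "a"}, {("o", "name", "a")}
)

# Two objects with different values: merging them violates strong adjacency.
diff_nodes = {"o1", "o2", "a", "b"}
diff_edges = {("o1", "name", "a"), ("o2", "name", "b")}
merge_diff = {"o1": "o", "o2": "o", "a": "a", "b": "b"}
can_merge_diff = is_g_homomorphism(
    merge_diff, diff_nodes, diff_edges, VALUES,
    {"o", "a", "b"}, {("o", "name", "a"), ("o", "name", "b")},
)
print(can_merge_same, can_merge_diff)
```

In the second graph the merged node would have edges to both values, but neither original object does, so the strong adjacency condition fails.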

The requirement that homomorphisms preserve node labels embodies our assumption that
a value is a uniquely identifiable entity that can be mentioned by users. Hence, a node labeled
by a value cannot be mapped to a node labeled by another value. Recall that class names and type
constructor names are also considered values, so they must be mapped to themselves. Thus,
since a g-homomorphism strongly preserves adjacency, objects in a class can only be mapped
to objects of the same class, because they are related to the node representing their class by an
edge.

Identifiability by homomorphisms:

We say that two nodes $o_1, o_2$ of $G$ are indistinguishable by a g-homomorphism if there
exists a graph $G'$ and a g-homomorphism $h : G \to G'$ such that $h(o_1) = h(o_2)$. An object
$o$ is H-uniquely identifiable if there is no other object $o'$ different from $o$ such that $o, o'$ are
indistinguishable by some g-homomorphism. The graph $G$ is H-identifiable if each of its objects
is H-uniquely identifiable.

 

Identifiability by automorphisms:

Given the object graph $G = (N, E)$, a mapping $h$ is called a g-automorphism if it is a
g-isomorphism from $G$ to itself. Denote the automorphism group of $G$ by $\Gamma(G)$. Two nodes
$u, v$ from $G$ are called A-equivalent (denoted by $u \cong v$) if there exists a g-automorphism $h$ in
$\Gamma(G)$ with $v = h(u)$. (It is easily seen that this is indeed an equivalence relation.) For each node
$u$ the set of all A-equivalent nodes is called the orbit of $u$ (denoted by $Or(u)$). A node $u$ is
called A-identifiable if $u = h(u)$ for each g-automorphism $h$, that is, if it is the only element in
$Or(u)$. The node is called A-unidentifiable otherwise. The graph is A-identifiable if its nodes
are A-identifiable.
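For very small graphs, orbits can be computed by brute force. The sketch below assumes the same triple representation of edges as before and simply enumerates all permutations of the object nodes (values stay fixed); it is meant as an illustration of the definition, not as an efficient procedure. The three-object cycle is a hypothetical example of a completely symmetric graph.

```python
from itertools import permutations


def automorphisms(nodes, edges, values):
    """Enumerate all g-automorphisms of a small graph by brute force."""
    edges = set(edges)
    objects = sorted(u for u in nodes if u not in values)
    found = []
    for image in permutations(objects):
        h = dict(zip(objects, image))
        h.update({v: v for v in nodes if v in values})  # values map to themselves
        # For a bijection, h(E) == E means adjacency is strongly preserved.
        if {(h[u], s, h[v]) for (u, s, v) in edges} == edges:
            found.append(h)
    return found


def orbits(nodes, edges, values):
    """Map each node u to its orbit Or(u)."""
    autos = automorphisms(nodes, edges, values)
    return {u: frozenset(h[u] for h in autos) for u in nodes}


# A symmetric cycle of three objects: no object is A-identifiable.
cycle_nodes = {"o1", "o2", "o3"}
cycle_edges = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
cycle_orbits = orbits(cycle_nodes, cycle_edges, set())

# Objects attached to different values: every node is A-identifiable.
val_nodes = {"o4", "o5", "a", "b"}
val_edges = {("o4", "s", "a"), ("o5", "s", "b")}
val_orbits = orbits(val_nodes, val_edges, {"a", "b"})
print(cycle_orbits["o1"], val_orbits["o4"])
```

A node is A-identifiable exactly when its orbit computed this way is a singleton.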

 

Identification by bisimulation:

A related idea is obtained by generalizing from mappings to relations. Given two graphs as
above, a bisimulation between them is a binary relation $R$ that preserves labels and adjacency,
i.e., if $R(u, v)$ and $u \in N \cap V$, then $u = v$, and if $R(u, v)$, then there exists an edge $(u, s, u')$
iff there exists an edge $(v, s, v')$ and $R(u', v')$. Bisimulations are closed under union, hence
there always exists a maximal bisimulation between two graphs. Two nodes $u, u'$ in $G$ are
B-indistinguishable if they are related by the maximal bisimulation between the graph and itself;
a node is B-identifiable if this bisimulation relates it only to itself.
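The maximal bisimulation of a graph with itself can be obtained as a greatest fixpoint: start from all candidate pairs and delete pairs that violate the conditions until nothing changes. The sketch below assumes the usual back-and-forth reading of the adjacency condition over outgoing edges, and it requires label preservation on both components of a pair; both choices are assumptions made for this illustration.

```python
def maximal_bisimulation(nodes, edges, values):
    """Greatest-fixpoint computation of the maximal bisimulation of a graph with itself."""
    succ = {u: set() for u in nodes}
    for (u, s, v) in edges:
        succ[u].add((s, v))
    # Start with every pair that respects value labels.
    rel = {(u, v) for u in nodes for v in nodes
           if u == v or (u not in values and v not in values)}
    changed = True
    while changed:
        changed = False
        for (u, v) in list(rel):
            # Every edge of u must be matched by an equally labeled edge of v ...
            forth = all(any(t == s and (u2, v2) in rel for (t, v2) in succ[v])
                        for (s, u2) in succ[u])
            # ... and every edge of v by an equally labeled edge of u.
            back = all(any(t == s and (u2, v2) in rel for (t, u2) in succ[u])
                       for (s, v2) in succ[v])
            if not (forth and back):
                rel.discard((u, v))
                changed = True
    return rel


bis_nodes = {"o1", "o2", "o3", "a", "b"}
bis_edges = {("o1", "name", "a"), ("o2", "name", "a"), ("o3", "name", "b")}
bisim = maximal_bisimulation(bis_nodes, bis_edges, {"a", "b"})
print(("o1", "o2") in bisim, ("o1", "o3") in bisim)
```

Here the two objects pointing to the same value remain related, while the object pointing to a different value is separated from them.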

 

The three previous definitions use the idea that values are the basis for identifiability,
but they rely on a global mechanism, namely the existence or non-existence of mappings with
certain properties. The next two definitions introduce ideas that are essentially local. Let us
define the local neighborhood of a node $u$ to consist of $u$, all the nodes $v$ such that $(u, s, v)$
or $(v, s, u)$ is in the graph, and all the edges connecting them. That is, it is the subgraph
of $G$ induced by the nodes whose distance from $u$ is at most one (where edge directions are
ignored). (This is essentially a neighborhood of radius one, in the terminology of [9].) We
denote the local neighborhood of $u$ by $ln(u)$. A g-isomorphism from $ln(u)$ onto $ln(v)$ is a
regular g-isomorphism between these two graphs that maps $u$ to $v$. Thus, we assume that
$u$ is a distinguished node in $ln(u)$. Given a set $P$ of pairs of nodes in $N$, a g-isomorphism
$f$ from $ln(u)$ to $ln(v)$ is a $P$-mapping if for any node $u'$ in $ln(u)$, the pair $(u', f(u'))$ is not in
$P$. The set $P$ should be thought of as a set of excluded pairs that cannot be related by the
mapping. A typical case is when $P$ is the set of pairs of distinct values, since these are all different from each
other. Another case is a set of objects in a view, where the information that they are pairwise
different cannot be deduced from the data in the view, but can be given as a summary of
information in the underlying database.

Identifiability by values:

The idea here is that values are distinguishable from each other, and from objects. Also,
nodes can be distinguished if they are connected to distinguishable nodes, or have a different
pattern of connectivity. In the following algorithm, this idea is repeatedly applied until a
fixpoint is reached. For generality, the algorithm is given in terms of an arbitrary initial set
$IE$ of pairs of nodes that are assumed to be known to be unequal (see footnote 5).

1. Input: $G = (N, E)$
2. Initialization: $NotId := IE$
3. Repeat until no further change:
   $NotId := NotId \cup \{(u, v), (v, u) \mid \text{there is no } NotId\text{-mapping of } ln(u) \text{ onto } ln(v)\}$
4. Output: $NotId$

For a graph $G$, let the canonical inequality set be

$IE_{can}(G) = \{(u, v) \mid u, v \text{ are different nodes},\ u, v \in N \cap V \vee (u \in N \setminus V \wedge v \in N \cap V)\}.$

A node $u \in N$ is called V-identifiable if for each node $v \in N$ different from $u$ the property $(u, v) \in NotId$ holds
when the algorithm is started with $IE_{can}(G)$. Otherwise the node is called V-unidentifiable. The
graph $G$ is V-identifiable if each of its nodes is V-identifiable.
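The fixpoint algorithm can be sketched directly for small graphs. The code below is an illustrative brute-force version under stated assumptions: edges are (source, label, target) triples, node names are strings, and the NotId-mapping test tries every bijection between the two local neighborhoods. All function names are hypothetical.

```python
from itertools import permutations


def local_neighborhood(u, edges):
    """ln(u): u, its direct neighbors, and the edges induced on these nodes."""
    nodes = {u} | {b for (a, _, b) in edges if a == u} | {a for (a, _, b) in edges if b == u}
    return nodes, {(a, s, b) for (a, s, b) in edges if a in nodes and b in nodes}


def has_mapping(u, v, edges, values, not_id):
    """Is there a NotId-mapping of ln(u) onto ln(v) that sends u to v?"""
    nu, eu = local_neighborhood(u, edges)
    nv, ev = local_neighborhood(v, edges)
    if len(nu) != len(nv) or len(eu) != len(ev):
        return False
    rest = sorted(nu - {u})
    for image in permutations(sorted(nv - {v})):
        f = dict(zip(rest, image))
        f[u] = v
        if any(a in values and f[a] != a for a in nu):
            continue  # value labels must be preserved
        if any((a, f[a]) in not_id for a in nu):
            continue  # an excluded pair would be related
        if {(f[a], s, f[b]) for (a, s, b) in eu} == ev:
            return True
    return False


def canonical_ie(nodes, values):
    """IE_can(G): any node paired with a different node that is a value."""
    return {(u, v) for u in nodes for v in nodes if u != v and v in values}


def v_algorithm(nodes, edges, values, initial):
    """Grow NotId until no further pair can be added."""
    not_id = set(initial)
    changed = True
    while changed:
        changed = False
        for u in nodes:
            for v in nodes:
                if u != v and (u, v) not in not_id \
                        and not has_mapping(u, v, edges, values, not_id):
                    not_id |= {(u, v), (v, u)}
                    changed = True
    return not_id


def v_identifiable(u, nodes, not_id):
    return all((u, v) in not_id for v in nodes if v != u)


v_nodes = {"o1", "o2", "o3", "a", "b"}
v_edges = {("o1", "name", "a"), ("o2", "name", "a"), ("o3", "name", "b")}
v_values = {"a", "b"}
v_not_id = v_algorithm(v_nodes, v_edges, v_values, canonical_ie(v_nodes, v_values))
print(v_identifiable("o3", v_nodes, v_not_id), v_identifiable("o1", v_nodes, v_not_id))
```

In this example the object attached to the value b becomes V-identifiable, while the two objects that share the value a cannot be separated.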

 

The close relationship between computation and logical inference suggests that the previous
definition can be brought into a logical form.

Identifiability by (dis)equational logics:

This logic is an analog of inequality systems used for ADT logics. We now define a Hilbert-type
deductive system for this logic. In addition to the predicate above, the system uses
another binary predicate, which we denote $\neq$. Also, rather than writing $(u, v) \in\ \neq$, we write
$u \neq v$. The set of axioms is assumed to be a given set $IE$ of pairs on $B = (O, L)$ (which may
be empty). The deductive system is denoted $D^{IE}_V$.

Axioms: $u \neq v$ if $(u, v) \in IE$

Rules: $\dfrac{\text{there is no } \neq\text{-mapping of } ln(u) \text{ onto } ln(v)}{u \neq v}$

We now define the derivation relationship $\vdash_{IE}$ on the basis of $D^{IE}_V$.
The node $n$ is E-identifiable if for every other node $n' \in N$ we can derive $\vdash_{IE_{can}(G)} n \neq n'$.
The graph $G$ is E-identifiable if each of its nodes is E-identifiable.

 

Identifiability by queries:

We define a simple query language on $B = (O, V, L)$. The set of queries $Q(B)$ is the
smallest set generated by the following formation rules.

(i) If $M$ is a subset of $V$ then $M$ is a query.
(ii) If $q$ is a query and $J$ is a subset of $L$, then $\rightarrow_J(q)$ and $\leftarrow_J(q)$ are queries.
(iii) If $q, q'$ are queries, then $q \cup q'$, $q \cap q'$ and $q \setminus q'$ are queries.

The semantics of queries, i.e., their meaning on a graph $G = (N, E)$, is defined as follows.

(i) $M(G) = \{u \in N \mid u \in M\}$
(ii) $\leftarrow_J(q)(G) = \{u \in N \mid (u, l, v) \in E,\ l \in J,\ v \in q(G)\}$
(iii) $\rightarrow_J(q)(G) = \{u \in N \mid (v, l, u) \in E,\ l \in J,\ v \in q(G)\}$
(iv) $(q \cup q')(G) = q(G) \cup q'(G)$, $(q \cap q')(G) = q(G) \cap q'(G)$, $(q \setminus q')(G) = q(G) \setminus q'(G)$

Using the first type of query, we can select any subset of the value nodes of a graph. Using the
next two kinds, we can express complex reachability patterns. Note that the language does
not contain iteration idioms. However, queries can be composed, so we can
write large queries, which compensates for the lack of such idioms.

Footnote 5: We discuss below a scenario where this is of interest.

Given a graph $G = (N, E)$ on $B$, let $Q_G$ be the subset of $Q(B)$ that mentions only values
and labels in $G$. Two nodes $u, v$ are Q-indistinguishable (denoted $u \cong_Q v$) if for all $q$ in $Q_G$,
$u \in q(G)$ iff $v \in q(G)$. A node $u$ in $N$ is Q-identifiable if there is a query $q$ in $Q_G$ such that
$q(G) = \{u\}$. $G$ is Q-identifiable if each node in $N$ is Q-identifiable.
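The query semantics translates almost literally into a recursive evaluator. In the sketch below, queries are encoded as nested tuples; the tags "values", "back" (edges pointing into the result of the inner query), "forth" (edges leaving it), "union", "inter" and "diff" are hypothetical names chosen for this illustration.

```python
def evaluate(query, nodes, edges):
    """Evaluate a query, given as a nested tuple, on the graph (nodes, edges)."""
    kind = query[0]
    if kind == "values":                      # M, a set of values
        return {u for u in nodes if u in query[1]}
    if kind in ("back", "forth"):
        labels = query[1]
        inner = evaluate(query[2], nodes, edges)
        if kind == "back":                    # sources of J-edges into q(G)
            return {u for (u, l, v) in edges if l in labels and v in inner}
        return {u for (v, l, u) in edges if l in labels and v in inner}
    if kind in ("union", "inter", "diff"):
        left = evaluate(query[1], nodes, edges)
        right = evaluate(query[2], nodes, edges)
        if kind == "union":
            return left | right
        if kind == "inter":
            return left & right
        return left - right
    raise ValueError(f"unknown query constructor: {kind}")


q_nodes = {"o1", "o2", "o3", "a", "b"}
q_edges = {("o1", "name", "a"), ("o2", "name", "b"), ("o3", "friend", "o1")}

named_a = ("back", {"name"}, ("values", {"a"}))   # objects whose name is a
friend_of_a = ("back", {"friend"}, named_a)       # objects with a friend named a

result_named_a = evaluate(named_a, q_nodes, q_edges)
result_friend = evaluate(friend_of_a, q_nodes, q_edges)
print(result_named_a, result_friend)
```

Each of the two queries returns a singleton, so the corresponding objects are Q-identifiable in this small graph.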

2.4 Comparison of Identification Concepts

We now proceed to compare the expressive power of the mechanisms introduced above. We
start with V- and E-identifiability.

Proposition 2.1. For each graph $G$, for any set of pairs $IE$, and for any nodes $u, v$ in $N$,
the algorithm terminates with $(u, v)$ in NotId if and only if $\vdash_{IE} u \neq v$.

Proof. Easy inductions on computations and deductions, respectively.

Corollary 2.2. A node $u$ is V-identifiable if and only if it is E-identifiable.

 

From now on, we use computation to refer to either computation or derivation. We note that
computations are non-deterministic in the sense that in each step we can consider an arbitrary
pair. However, when we consider the set of pairs of nodes that are inferred to be unequal, we
have:

Lemma 2.3. Computations are confluent: all computations on a given graph from an initial
set $IE$ terminate with the same set of pairs of nodes for $\neq$.

Proof. If at a given point in a computation it is possible to derive that $u \neq v$, then the
execution of other steps does not invalidate any of the prerequisites, since $\neq$ can only grow.
Hence this fact remains derivable.

Given a set of pairs, if we want to interpret it as a set of inequalities, then it is desirable that
its complement has the properties of equality, in particular that it is an equivalence relation
on the nodes of the graph. Let us call a set of pairs whose complement is an equivalence
relation well-behaved.

Proposition 2.4. Assume that the given initial set is well-behaved. Then the computation
produces a well-behaved set NotId.

Proof. By the lemma, the order of steps in a computation is irrelevant, so we consider steps
done in a certain order and organized into stages: in a given stage, we take the complement
of the set NotId computed so far, take a connected component of that complement, and test
each pair in this component for inclusion in NotId. We add all pairs that qualify at the end
of the stage to the set of pairs, then proceed to the next stage. This computation is still
non-deterministic in the choice of a component for a stage.

The claim now proceeds by induction on stages. By assumption, the given set is well-behaved,
so after initialization, NotId is well-behaved. Assuming that it is well-behaved after
$k$ stages, we show that the property holds after stage $k + 1$. Note that since it is well-behaved,
the complement is an equivalence relation, so a connected component of the complement is
an equivalence class. Now, a pair $u, v$ in the class is noted for inclusion in NotId if and only
if there is no NotId-isomorphism of $ln(u)$ onto $ln(v)$. Thus, two nodes $u, v$ will remain in
the complement of NotId iff there is such an isomorphism between $ln(u)$ and $ln(v)$. It is easy to
see that since NotId is well-behaved, the set of pairs that are not included in it in this stage
is a (disjoint) partition of the class.

We now proceed to compare V- and A-identifiability:

Proposition 2.5. If nodes $u, v$ are V-distinguishable, then they are A-distinguishable.

Proof. We claim that if $u, v$ are A-indistinguishable, that is, there is a g-automorphism $h$
that maps $u$ to $v$, then the pair $(u, v)$ will not be in NotId. The proof is by induction on
the stages of a computation. Clearly, the pairs in $IE_{can}(G)$ are A-distinguishable: each value is a
separate class in the complement of NotId and is mapped by $h$ to itself. So $(u, v)$ is not in
$IE_{can}(G)$ for any such pair. For the induction, we assume that for all $u, v$, if $h(u) = v$ then
$u, v$ are in the complement of NotId after $k$ stages, and we prove that this holds after the next
stage. Indeed, let $u, v$ be a pair such that $h(u) = v$. Then the restriction of $h$ to $ln(u)$ is a
NotId-isomorphism onto $ln(v)$, so the pair $(u, v)$ is not put into NotId.

The converse to the proposition does not hold. As an example, consider a database with $n$
classes, $C_1, \ldots, C_n$. Let class $C_i$ contain objects $x_i, y_i, z_i$, and let each of these have an $l$-labeled
outgoing edge, so that the graph contains $3n$ $l$-edges: the edges $(a_i, l, a_{i+1})$ for $i = 1, \ldots, n-1$ and $a = x, y, z$,
and, closing the cycle, the following three edges: $(x_n, l, x_1)$, $(y_n, l, z_1)$, $(z_n, l, y_1)$.
Finally, assume the following $3n$ $h$-edges: $(x_i, h, y_i)$, $(y_i, h, z_i)$, $(z_i, h, x_i)$. Now, since class nodes
are V-distinguishable, the algorithm for V-identification will partition the objects so that
$x_i, y_i, z_i$ will be in an equivalence class in the complement of NotId. Since the structure of each
class is symmetric, no additional partition can occur. However, no non-trivial automorphism
exists. Indeed, assume one exists, and call it $h$. Without loss of generality, assume $h(x_1) = y_1$.
Then necessarily, from the edge structure, $h(y_1) = z_1$, $h(z_1) = x_1$, and the same, namely
$h(x_i) = y_i, \ldots$, must hold for the objects in the classes $C_2, \ldots, C_n$. But now we reach a
contradiction, since $(y_n, l, z_1)$ is in $E$, but $(h(y_n), l, h(z_1)) = (z_n, l, x_1)$ is not in $E$.
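The absence of non-trivial automorphisms in this example can be confirmed by brute force for a small instance. The sketch below builds the graph for $n = 2$ and counts the g-automorphisms; the membership label "isa", the node naming scheme, and the `twisted` switch (which replaces the three closing edges by the untwisted ones $(a_n, l, a_1)$ for comparison) are assumptions made for this illustration.

```python
from itertools import permutations


def build_edges(n, twisted=True):
    """Edge set of the example with n classes; class nodes C1..Cn act as values."""
    edges = set()
    for i in range(1, n + 1):
        x, y, z = f"x{i}", f"y{i}", f"z{i}"
        edges |= {(a, "isa", f"C{i}") for a in (x, y, z)}   # class membership
        edges |= {(x, "h", y), (y, "h", z), (z, "h", x)}    # h-cycle inside the class
        if i < n:
            edges |= {(f"{a}{i}", "l", f"{a}{i + 1}") for a in "xyz"}
    closing = [("x", "x"), ("y", "z"), ("z", "y")] if twisted \
        else [("x", "x"), ("y", "y"), ("z", "z")]
    edges |= {(f"{a}{n}", "l", f"{b}1") for (a, b) in closing}
    return edges


def count_automorphisms(n, twisted=True):
    """Count g-automorphisms by trying every permutation of the objects."""
    edges = build_edges(n, twisted)
    objects = [f"{a}{i}" for i in range(1, n + 1) for a in "xyz"]
    classes = {f"C{i}": f"C{i}" for i in range(1, n + 1)}
    count = 0
    for image in permutations(objects):
        h = dict(zip(objects, image))
        h.update(classes)                     # class nodes are mapped to themselves
        if {(h[u], s, h[v]) for (u, s, v) in edges} == edges:
            count += 1
    return count


twisted_count = count_automorphisms(2, twisted=True)
plain_count = count_automorphisms(2, twisted=False)
print(twisted_count, plain_count)
```

With the twisted closing edges only the identity survives, whereas the untwisted variant admits the three simultaneous rotations of all classes.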

We now consider properties of H. Every g-homomorphism on a graph $G$ partitions the
nodes into equivalence classes. We can compare g-homomorphisms by comparing the partitions
they induce. We say $h$ dominates $h'$ if each equivalence class of $h'$ is contained in an
equivalence class of $h$. Clearly, dominance induces a partial order. A g-homomorphism of
$G$ that dominates all other g-homomorphisms of $G$ is called maximum. Any two maximum
mappings dominate each other, hence induce the same partitions. Thus, their images are
g-isomorphic.

Proposition 2.6. There exists a g-homomorphism which is maximum.

Proof. Let $H_N$ be the transitive closure of the binary relationship "$u$ and $v$ are H-indistinguishable".
It is an equivalence relation on the nodes $N$ of the graph $G$. We can create a graph $\hat{G}$ whose
nodes are the equivalence classes of $H_N$, and such that $([u], l, [v])$ is an edge if and only if there is an
edge $(u, l, v)$ in $G$, where $u$ is any member of the equivalence class $[u]$, and similarly for $v$.
We claim that the mapping $\hat{h}$ that maps all elements of $[u]$ in $N$ to $[u]$ is a g-homomorphism
from $G$ onto $\hat{G}$.

Assume that for some g-homomorphism $h$, $h(u) = h(v)$. Further assume that $(u, l, u')$ is
an edge in $G$. Then $(h(u), l, h(u')) = (h(v), l, h(u'))$ is an edge in the image under $h$. Since $h$
strongly preserves adjacency, $(v, l, u')$ must be an edge in $G$. Thus, $u$ and $v$ are connected by
$l$-edges to precisely the same nodes. The claim obviously holds also for other edge labels, and
also for back edges of the form $(u', l, u)$. In short, $u, v$ have the same connections.

Now, if $v$ and $w$ are identified by some $h'$, then $v$ and $w$ have the same connections. It
follows that $u$ and $w$ have the same connections. By induction, we now have that if we have a
sequence $u_1, \ldots, u_n$ such that each pair $u_i, u_{i+1}$ is identified by some $h_i$, then all elements in
the sequence have the same connections. In other words, all elements of an equivalence class
in $H_N$ have the same connections. It follows that $\hat{h}$ is indeed a g-homomorphism. It is clear
that it is a maximum g-homomorphism.

Proposition 2.7. H-identifiability is the same as B-identifiability.

Proof. A bisimulation on a graph induces an equivalence relation on its nodes, and it is
easy to see that there exists a g-homomorphism from the graph to its image modulo this
equivalence relation. In particular, taking the maximal bisimulation, we have that if objects
are H-identifiable, then they are B-identifiable. For the opposite direction, we observe that
the maximum g-homomorphism induces a bisimulation on the graph.

We now compare V and H. Let us call the classes in the partition of the nodes of $G$ into
equivalence classes by $\hat{h}$ the $\hat{h}$-classes. As noted above, the notion of V-identifiability also defines
equivalence classes on $G$, each consisting of objects that are pairwise V-indistinguishable.
Denote these as V-classes.

Proposition 2.8. Let $G$ be a graph. Then each $\hat{h}$-class is contained in some V-class.

Proof. We prove the claim by induction on the stages of the computation of the V-classes.
Initially, each $v \in N \cap V$ is an equivalence class by itself, and all other elements of $N$ are in
one class. Clearly, each $v \in N \cap V$ is also a singleton $\hat{h}$-class, so the claim holds. Assume now
that it holds after stage $k$. If the local neighborhoods of two nodes $u, v$ of the same $\hat{h}$-class are compared,
they will be found to be isomorphic. To see this, observe that, as we showed above, they have
precisely the same connections. Thus, the mapping that takes $u$ to $v$ and leaves all other nodes
in place is an isomorphism of $ln(u)$ and $ln(v)$. It follows that they will not be separated.

Corollary 2.9. V-identifiability implies H-identifiability.

The converse fails: in the example above, no two nodes have precisely the same connections,
hence $\hat{h}$ is the identity, so each node is H-identifiable, but as we saw, nodes are not
V-identifiable.

If we change the example by connecting $x_n$ to $x_1$, $\ldots$, then there is a non-trivial
g-automorphism, so nodes in the same class are not A-identifiable, but they are still
H-identifiable.

Proposition 2.10. A-identifiability implies H-identifiability.

Proof. Assume nodes $u, v$ are H-indistinguishable. We claim that the map that interchanges
$u$ with $v$ and is the identity elsewhere is a g-automorphism. This follows because, as shown
above, $u$ and $v$ have precisely the same connections.

We now consider Q. As mentioned above, when one starts from $IE_{can}(G)$, the complement
of the $\neq$ relation computed by the algorithm for V-identification is a partition of the nodes
into equivalence classes, the V-classes.

Proposition 2.11. Let $G$ be a graph, and $q$ be a query. Then $q(G)$ is a union of V-classes.

Proof. The proof uses induction on the structure of queries. If $M \subseteq V$, then $M(G) = M \cap N$,
which obviously is a union of V-classes, as each value is in a class by itself. Now, assume the
claim is true for a query $q$, let $J \subseteq L$, and consider the query $\rightarrow_J(q)$. Assume $u \in\ \rightarrow_J(q)(G)$, and
also assume that $u'$ is in the same V-class as $u$. Then there is a $\neq$-isomorphism from $ln(u)$
onto $ln(u')$. In particular, if there is an $l$-edge between $u$ and a V-class, there is an $l$-edge in the same direction between $u'$
and the same class. It follows that $u'$ is also in $\rightarrow_J(q)(G)$. The case for $\leftarrow_J(q)$ is similar. Since sets of
V-classes are closed under boolean operations, the proof is complete.

Corollary 2.12. Q-identifiability implies V-identifiability.

The converse does not hold, since the V-algorithm can count, while queries cannot.
For example, assume $o_1, o_2$ have $l$-edges to a node labeled 3, $o_3$ has a $k$-edge to $o_1$, while $o_4$
has $k$-edges to both $o_1$ and $o_2$. The V-algorithm will separate $o_3$ from $o_4$, since their local
neighborhoods are not isomorphic. Queries cannot separate $o_1$ from $o_2$, hence they also cannot
separate nodes related to them by the same kind of edges, like $o_3, o_4$.

Notice that for non-canonical sets of inequalities the generalized V-identifiability and the equivalent
E-identifiability do not imply H-identifiability or A-identifiability. Thus, identifiability
based on inequality sets can be a very powerful method.

Integrity constraints can be used for identifiability as well, since they can impose distinguishability
or indistinguishability of objects. As we have seen, sometimes two objects can be
distinguished from each other if a pair of others can. Thus, distinguishability that is deduced
from the integrity constraints in the system can propagate.

Using a generalized relational representation, equality generating dependencies are constraints
of the following form:

$$\forall x \, \bigl( (P_R(x_1) \wedge \ldots \wedge P_R(x_m) \wedge F(x_{m+1})) \rightarrow G(x) \bigr)$$

where $F, G$ are conjunctions of equalities of the form $x_{ij} = x_{i'j'}$, $P_R$ is the predicate symbol
associated with relation $R$, and $x_i \subseteq x$. Based on the transformation of the constraint to the
equivalent formula

$$\forall x \, \bigl( (P_R(x_1) \wedge \ldots \wedge P_R(x_m) \wedge \neg G(x)) \rightarrow \neg F(x_{m+1}) \bigr)$$

we can use the inequality set $IE$ in order to extend the deductive system $D^{IE}_V$. Thus, we
can express the identification properties on the basis of value-distinguishability or equational
logic in the case of equality generating dependencies. Notice that functional dependencies,
key constraints, and generalized functional dependencies are special equality generating dependencies.
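As an illustration of this transformation, consider a key constraint on a relation $R$. The attribute names $K$ (key) and $A$ (a dependent attribute) and the tuple variables $t, t'$ are hypothetical and only serve this worked instance.

```latex
% Key constraint as an equality generating dependency:
\forall t, t' \,\bigl( P_R(t) \wedge P_R(t') \wedge t.K = t'.K \;\rightarrow\; t.A = t'.A \bigr)

% Equivalent contrapositive form, read as an inequality generating rule:
\forall t, t' \,\bigl( P_R(t) \wedge P_R(t') \wedge t.A \neq t'.A \;\rightarrow\; t.K \neq t'.K \bigr)
```

Read in the second form, two tuples already known to differ on $A$ must also differ on $K$, which is the kind of inequality that can be fed into the set $IE$.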

Corollary 2.13. Identification extended by equality-generating dependencies can be expressed
by V-identifiability.

An exclusion dependency is an expression of the form

$$R[R.A_1, \ldots, R.A_n] \parallel S[S.B_1, \ldots, S.B_n].$$

The property specified by the exclusion dependency can be directly translated to inequalities
among objects.

A generalized inclusion dependency is an expression of the form

$$R_1[X_1] \cap \ldots \cap R_n[X_n] \subseteq S_1[Y_1] \cup \ldots \cup S_m[Y_m]$$

for compatible sequences $X_i, Y_j$. Similarly to equality-generating dependencies, generalized
inclusion dependencies can be transformed to negated formulas. These formulas are the basis
for the extension of the deductive system $D^{IE}_V$.

Corollary 2.14. Identification extended by generalized inclusion dependencies and exclusion
dependencies can be expressed by V-identifiability.

Disjunctive existence constraints $X \Rightarrow Y_1, Y_2, \ldots, Y_n$ specify that if a tuple is completely defined
on $X$ then it is completely defined on $Y_i$ for some $i$. There is an axiomatization for disjunctive
existence constraints. They can be represented by monotone Boolean functions.

Since existence has been treated explicitly in the definition of value-identifiability, we
conclude directly:

Corollary 2.15. Identification extended by existence dependencies can be expressed by
V-identifiability.

Summarizing the comparisons above, we obtain

Corollary 2.16. V-identifiability and E-identifiability are equivalent for generalized inclusion,
exclusion, existence and equality-generating dependencies.


This paper has reconsidered notions of identity and identifiability in OODBs. We have proposed
and justified the thesis that object identifiers, as proposed and used in most OODB
implementations, are system-related but do not address the problems of users. In particular, although
the o-id mechanism guarantees that objects do have a unique identity, as required
in the foundational postulates for OODBs, that by itself does not provide for identifiability,
namely the ability of a program or user to uniquely identify an object. For the latter,
as far as users of an OODB are concerned, a value-based mechanism must be used. We have
shown a close relationship between identifiability and separability of objects from each other.
In order to better understand identifiability, we have studied various notions that can be
used as a specification of this notion, and have classified their relative strengths. Our results
and discussion complement and augment previous discussion of object identification and its
complexity in the literature, e.g., [2, 10, 14, 15].

We have not considered the practical issues of identifiability. Since the mechanism must
be value-based, a notion of keys, as used in relational databases, certainly suffices. However,
OODBs offer a much richer structure, and it seems reasonable to expect that more of this
structure be used for identification. For example, one can use not only the value of a key attribute
but also its membership in a given class, and in particular distinguish between membership in
subclasses of a given class. Initial work in this direction has been reported in [19, 20]. We see
this as an important research direction.

A related problem concerns the representation of real-world entities by database objects.
A posteriori it is possible for two or more objects to represent the same real-world entity.
Thus, in addition to the primary notion of object equality, where objects are equal if they are
identical in the database, we have referential equality, where two database objects refer to the
same real-world entity. Although not discussed much in the literature, it is probably undesirable
to allow different database objects to represent the same real-world entity. Whether
such phenomena can be avoided depends on the identification mechanisms supported by the
system. A mechanism that is easy to use, simple to understand, and can be directly related
to properties of real-world entities can help avoid problems of multiple representations.

A final point concerns views. One can say that the reason o-id's cannot help users to
identify objects is the existence of an abstraction barrier between the system and its users.
The system knows the values of o-id's and can use them freely for all its needs. However, since
only an equality test is exported, these same o-id's are much less useful for the users. One
can say that the OODB as seen by the users is a view of the internal OODB, as seen by the
system. It follows that one should expect similar problems when dealing with views. Namely,
it is possible that the identification mechanism in an OODB uniquely identifies each object
in the current state. Yet, since views present a restricted viewpoint, it is possible that this
property does not hold for some views. This may be problematic for views that allow updates,
and possibly also for queries. Note that view definitions may form abstraction barriers in ways
that are much more sophisticated than the simple interface between the implementation and
conceptual levels of an OODB. The analysis of identifiability issues may therefore be more
difficult. The problem will become both more difficult and more important as distributed
access to distinct OODBs through the Web becomes common.


1. S. Abiteboul, P.C. Kanellakis, Object identity as a query language primitive. Proc. SIGMOD, 1989, 159-173.
2. S. Abiteboul, J. Van den Bussche, Deep equality revised. Proc. DOOD'95 (eds. T.W. Ling, A.O. Mendelzon, L. Vielle), LNCS 1013, 213-228.
3. C. Beeri, A formal approach to object-oriented databases. Data and Knowledge Engineering, 5, 1990, 4, 353-382.
4. C. Beeri, Some thoughts on the future evolution of object-oriented database concepts. Proc. BTW 93 (ed. W. Stucky), Springer, 1993, 18-32.
5. K. Berka, L. Kreiser, Texts on logics. Akademie-Verlag, Berlin, 1973.
6. J. Biskup, H.H. Brüggemann, An object-surrogate-value approach for database languages. Technical report 16-3-89, University Hildesheim, Dept. Computer Science.
7. H.B. Curry, Foundations of mathematical logic. McGraw-Hill, New York, 1963.
8. G. Frege, Funktion und Begriff. Jena 1891.
9. H. Gaifman, On local and non-local properties. Proc. of the Herbrand Symposium, Logic Colloq. '81, North-Holland, Amsterdam, 1982.
10. M. Gogolla, A declarative query approach to object identification. Proc. OO-ER95 (ed. M. Papazoglou), LNCS 1021, 65-76.
11. M. Gyssens, J. Paredaens, D. v. Gucht, A graph-oriented object database model. Proc. PODS, 1990, 417-424.
12. S.N. Khoshafian, G. Copeland, Object identity. Proc. OOPSLA-86, special issue of SIGPLAN Notices (ed. N. Meyrowitz), 21 (12), Dec. 1986, 406-416.
13. S.C. Kleene, Mathematical logic. John Wiley, New York, 1967.
14. H.-J. Klein, J. Rasch, Value based identification and functional dependencies for object databases. Proc. 3rd Basque Int. Workshop on Information Technology, IEEE Comp. Sci. Press, 1997, 22-34.
15. A. Kosky, Observational distinguishability. Proc. 5th DBPL, Electronic Report of Conferences in Computing, Springer, 1995.
16. G.W. Leibniz, Fragmente zur Logik. Edited by Fr. Schmidt, Berlin, 1960.
17. P.S. Poreckij, Theorie conjointe des egalites des non-egalites logiques. News of Physics Society of Kazan University, XVI, No. 1-2, 1908.
18. J. Rumbaugh, Controlling propagation of operations using attributes on relations. Proc. OOPSLA88, ACM Sigplan Notices (23,11), Nov. 1988, 285-296.
19. K.-D. Schewe, J.W. Schmidt, I. Wetzel, Identification, Genericity and Consistency in Object-Oriented Databases. In J. Biskup, R. Hull (eds.), Proc. 3rd International Conference on Database Theory, ICDT '92, Berlin (Germany), Lecture Notes in Computer Science, 341-356, 1992, Springer.
20. K.-D. Schewe, B. Thalheim, Fundamental Concepts of Object Oriented Databases. Acta Cybernetica, 11, No. 4, 1993, 49-81.
21. B. Thalheim, Reconsidering key and identification concepts in different database models. Technical Report CS-08-91, University of Rostock, 1991.
22. J. Van den Bussche, J. Paredaens, The expressive power of complex values in object-based data models. Inf. Comput. 120, 220-236.
23. J. Van den Bussche, D. van Gucht, M. Andries, M. Gyssens, On the completeness of object-creating database transformation languages. JACM 44:2, March 1997, 272-319.

Chapter 3

Fundamentals of Object Oriented Database Modelling

Contents

3.1 Introduction
3.2 Type Systems
3.3 OODM Schemata
3.4 Value Representability
3.5 Logical Reconstruction
3.6 Conclusion

Klaus-Dieter Schewe. Fundamentals of Object Oriented Database Modelling. Intelligent
Systems. Moskau 1997.

Abstract. Solid theoretical foundations of object oriented databases (OODBs) are still missing.
The work reported in this paper contains results on a formally founded object oriented
datamodel (OODM) and is intended to contribute to the development of a uniform mathematical
theory of OODBs.

A clear distinction between objects and values turns out to be essential in the OODM.
Types and classes are used to structure values and objects respectively. This can be founded
on top of any underlying type system. We outline different approaches to type systems and
their semantics and claim that OODB theory on top of arbitrary type systems leads to type
theory with topos-theoretically defined semantics.

On this basis the known solutions to the problems of unique object identification and
genericity can be generalized. It turns out that extents of classes must be completely representable
by values. Such classes are called value-representable. As a consequence object
identifiers degenerate to a pure implementation concept. This stimulates considerations that
do not depend on such identifiers.

In order to approach this problem, object oriented schemata and instances are reorganized
by means of general category-theoretical arguments to let them occur as theories in the higher-order
intuitionistic logic associated with a topos defined by the type system. Moreover, in the
case of value-representability it can be seen that object identifiers can be dispensed with at
the logical level. This allows queries to be approached algebraically as well as logically and sets up
a starting point for deduction within OODBs.


3.1 Introduction

The shortcomings of the relational database approach encouraged much research aimed at
achieving more appropriate data models. It has been claimed that the object-oriented approach
will be the key technology for future database systems and languages [8]. Several systems
[5, 6, 7, 9, 19, 20, 21, 22, 24, 27, 38, 40, 41, 70] arose from these efforts. However, in contrast
to research in the relational area there is no common formal agreement on what constitutes
an object-oriented database [11, 12, 14].

The basic question "What is an object?" seems to be trivial, but already here the variety
of answers is large. In object oriented programming the notion of an object was intended as
a generalization of the abstract data type concept with the additional feature of inheritance.
In this sense object orientation involves the isolation of data in semi-independent modules in
order to promote high software development productivity. The development of object oriented
databases regarded an object also as a basic unit of persistent data, a view that is heavily
influenced by existing semantic datamodels (SDMs) [2, 30, 31, 43, 44, 63]. Thus, object oriented
databases are composed of independent objects but must also provide for the maintenance of
inter-object consistency, a demand that is to some degree in dissonance with the basic style
of object orientation.

Theoretical investigations in the field of OODBs are rare. The few existing results in OODB theory can be classified in three groups. The first one [25, 65, 66, 67] studies expressiveness and complexity of query languages with object creation and duplicate elimination. This follows more or less the ideas of the IQL framework [3]. The second one [12, 14, 15, 16, 54, 55] asks for the fundamental features of object oriented datamodels and their semantical foundations. The third group [4, 37] continues the line of research in which databases occur as theories defined by logic programs.


A view that is common in OODB research is that objects are abstractions of real world entities and should have an identity [8]. This leads to a distinction between values and objects [11, 12]. A value is identified by itself, whereas an object has an identity independent of its value. This object identity is usually encoded by object identifiers [1, 3, 36]. Abstracting from the pure physical level, the identifier of an object can be regarded as being immutable during the object's lifetime. Identifiers ease the sharing and update of data. However, such abstract identifiers do not relieve us of the task of providing unique identification mechanisms for objects. In object oriented programming object names are sufficient, but retrieving mass data by name is senseless.

In most approaches to OODBs an object is coupled with a value of some fixed structure. In our view this already contradicts the goal of objects being abstractions of reality. In real situations an object has several, possibly changing, aspects that should be captured by the object model. Therefore, in our object model each object $o$ consists of a unique identifier $id$, a set of (type, value) pairs $(T_i, v_i)$, a set of (reference, object) pairs $(ref_j, o_j)$ and a set of operations $m_k$.

Types are used to structure values. The first problem then concerns the semantics of the type system, i.e. the variety of types that can be defined and used in schema definitions. We consider three different approaches based on a simple type system with set semantics, the typed $\lambda$-calculus and a slightly extended version of Girard-Reynolds polymorphism [17, 42, 48]. For the third case it is well known that there is no set-theoretic model. In this case, however, suitable models can be obtained in the effective topos [34, 32, 50] or even in Grothendieck topoi [47]. Moreover, we may always ask how good a model is with respect to computational aspects. Here again it may be argued that having an intuitionist's mind, i.e. taking a topos-theoretic point of view, may help to obtain effective computations [49].

Classes serve as structuring primitive for objects having the same structure and behaviour. It is obvious that the multiple aspects view of an object allows them to be simultaneously members of more than one class and to change class memberships. In the OODM a class structure uniformly combines aspects of object values and references. The extent of classes varies over time, whereas types are immutable. Relationships between classes are represented by references together with referential constraints on the object identifiers involved. Moreover, each class is accompanied by a collection of operations. A schema is given by a collection of class definitions together with explicit integrity constraints. It will be shown that the semantics of OODM schemata can be defined in a uniform way independently from the underlying type system.

Important OODB problems concern the unique identification of objects and the existence of generic update operations [55]. Following [1, 13] the immutable identity of an object can be encoded by the concept of abstract object identifiers. The advantages of this approach are that sharing, mutability of values and cyclic structures can be represented easily [46]. On the other hand, object identifiers do not have a meaning for the user and should therefore be hidden. The notion of value-representability is known to guarantee unique identification in the case of set semantics. This can be generalized to the general case. The same applies to the genericity problem.

Then we show, now using categorical terms, how classes, schemata and instances can be captured categorically. Using the internal logic of a topos, we may define schemata and instances by theories and even get rid of object identifiers, using the existence, identity and description predicates of intuitionistic logic instead. On this basis algebraic and logical queries can be defined. However, this last step depends on value-representability, a necessary property for genericity [55], whereas for the unique identification of objects weak value-identifiability would be sufficient [16, 55]. Some slight extensions, which are omitted here, allow this case to be captured as well.

Throughout the paper we assume some basic knowledge about category theory [10], elementary 

topos theory [35, 39] and their relation to higher-order intuitionistic logic [28, 39, 60]. 

3.2 Type Systems 

We start with a brief look at three different type systems and their semantics. The three approaches comprise a very simple type system with set semantics, typed $\lambda$-calculus with semantics in cartesian closed categories and a version of the polymorphic or second-order typed $\lambda$-calculus.

Common to all these cases is the view that types are basically given by base types and constructors. The latter will occur as types with free (type) variables. A type without free variables will be called proper. Among the base types we assume an abstract identifier type ID. A type T without occurrence of ID will be called a value-type.

A Simple Type System. In set-based modelling a type may be regarded as an immutable set of values of a uniform structure. Subtyping is used to relate values in different types. We use a type system that consists of some base types such as BOOL, NAT, INT, STRING, etc., and type constructors for records, finite sets, lists, etc. Arbitrary types can then be defined by nesting. Moreover, we assume recursive types with a semantics defined by rational trees. More formally, the type system can be defined as

$$t := b \mid x \mid (a_1 : t_1, \ldots, a_n : t_n) \mid \{t\} \mid [t] \mid \mu x.t \; .$$

The semantics of such types as sets of values is defined as usual. The type system may then be extended by a subtype relation $t' \le t$ [42], which semantically gives rise to subtype functions $t' \to t$. We omit the details here.
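The grammar of the simple type system and the notions of proper type and value-type can be mirrored in a few lines of Python. This is only an illustrative sketch: the class names `Base`, `Var`, `Record`, `SetOf`, `ListOf` and `Rec` and the helper functions are assumptions made for this example and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:            # base type b, e.g. NAT, STRING or the identifier type ID
    name: str


@dataclass(frozen=True)
class Var:             # type variable x
    name: str


@dataclass(frozen=True)
class Record:          # record type (a_1 : t_1, ..., a_n : t_n)
    fields: Tuple[Tuple[str, "Type"], ...]


@dataclass(frozen=True)
class SetOf:           # finite set type {t}
    elem: "Type"


@dataclass(frozen=True)
class ListOf:          # list type [t]
    elem: "Type"


@dataclass(frozen=True)
class Rec:             # recursive type binding the variable var in body
    var: str
    body: "Type"


Type = Union[Base, Var, Record, SetOf, ListOf, Rec]


def free_vars(t):
    """Free type variables of t."""
    if isinstance(t, Base):
        return frozenset()
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, Record):
        return frozenset().union(*(free_vars(ft) for _, ft in t.fields))
    if isinstance(t, (SetOf, ListOf)):
        return free_vars(t.elem)
    if isinstance(t, Rec):
        return free_vars(t.body) - {t.var}
    raise TypeError(t)


def mentions_id(t):
    """True if the identifier type ID occurs in t."""
    if isinstance(t, Base):
        return t.name == "ID"
    if isinstance(t, Var):
        return False
    if isinstance(t, Record):
        return any(mentions_id(ft) for _, ft in t.fields)
    if isinstance(t, (SetOf, ListOf)):
        return mentions_id(t.elem)
    if isinstance(t, Rec):
        return mentions_id(t.body)
    raise TypeError(t)


def is_proper(t):
    """A type without free variables is proper."""
    return not free_vars(t)


def is_value_type(t):
    """A type without occurrence of ID is a value-type."""
    return not mentions_id(t)
```

With this encoding a record such as `Record((("No", Base("NAT")), ("Spouse", Var("x"))))` is a value-type but not proper, since it still contains a free type variable.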

If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence relation

$$o : t \times t' \to \Omega ,$$

where $\Omega$ is the truth object in Set, i.e. $\Omega = BOOL$.

Typed $\lambda$-Calculus. In the typed $\lambda$-calculus the main emphasis is on function types, i.e. we can define the type system by

$$t := b \mid x \mid (a_1 : t_1, \ldots, a_n : t_n) \mid t_1 \to t_2 \mid \ldots$$

The semantics of the typed $\lambda$-calculus can be described by cartesian closed categories.


Polymorphism. As the third approach we choose some slightly enriched version of Girard-Reynolds polymorphism (GRP), i.e. types are given by the language

$$t := b \mid x \mid t_1 \times \cdots \times t_n \mid t_1 \to t_2 \mid \Pi x.t$$

where $b$ denotes some collection of base types including for our purposes a type ID of object identifiers, $x$ represents some type variable, $\times$ represents product types, $\to$ represents function types and $\Pi$ impredicative polymorphic abstraction with $x$ running over all types [42, 48].

First recall the notion of a topos. A topos $E$ is a finitely-complete cartesian-closed category with a subobject classifier, i.e. there is an object $\Omega$ and a global element $true : \mathbb{1} \to \Omega$ such that for each monomorphism $f : A \hookrightarrow B$ there is a unique classifying morphism $cl(f) : B \to \Omega$ such that $f$ and $triv$ define the pullback of $cl(f)$ and $true$. Here $\mathbb{1}$ denotes a terminal object.

Then we may define what we mean by a model of our type theory in a topos $E$. A model of GRP in a topos $E$ consists of an essentially small internal category $\mathbb{E}$ that is closed under finite products, exponents and $Ob(\mathbb{E})$-indexed products together with an embedding into $E$ which preserves these properties. The most commonly known such model is given by the category Per of partial equivalence relations in the effective topos.

For an exposition of various approaches to construct such models we refer to [34, 47].

3.3 OODM Schemata 

In this section we present a slightly modified version of the object oriented datamodel (OODM) of [52, 54, 58]. We observe that an object in the real world always has an identity. Therefore, abstract (i.e. system-provided) object identifiers are introduced to capture identity. However, neither the real world object that was the basis of the abstraction nor the abstract identifier can be used for the identification of an object.

In contrast to existing object oriented datamodels [1, 3, 5, 6, 7, 8, 9, 20, 21, 27, 38, 40, 46, 61] an object is not coupled with a unique type. Instead, we observe that real world objects can have different aspects that may change over time. Therefore, a primary decision was taken to let an object be associated with more than one type and to let these types even change during the object's lifetime. The same applies to references to other objects.

The Class Concept. The class concept provides the grouping of objects having the same structure, which uniformly combines aspects of object values and references. Moreover, generic operations on objects such as object creation, deletion and update of its values and references are associated with classes, provided these operations can be defined unambiguously. Objects can belong to different classes, which guarantees that each object of our abstract object model is captured by the collection of possible classes. Just as values are only defined via types, objects can only be defined via classes.

Each object in a class consists of an identifier, a collection of values and references to objects in other classes. Identifiers can be represented using the unique identifier type ID. Values and references can be combined into a representation type, where each occurrence of ID denotes references to some other classes. Therefore, we may define the structure of a class using types with free variables.

As to dynamics we distinguish between visible and hidden operations to separate those operations that can be invoked by the user from the others. All operations on a class, including the hidden ones, can be accessed by other operations, but only hidden operations can be used to handle identifiers.

In the following N denotes some (large enough) collection of names. 

(i) Let $t$ be a value type with free variables $\alpha_1, \ldots, \alpha_n$. For pairwise distinct reference names $r_1, \ldots, r_n \in N$ and class names $C_1, \ldots, C_n \in N$ the expression derived from $t$ by replacing each $\alpha_i$ in $t$ by $r_i : C_i$ for $i = 1, \ldots, n$ is called a structure expression.

(ii) A class consists of a class name $C \in N$, a structure expression $S$, a set of superclass names $\{D_1, \ldots, D_m\} \subseteq N$ and a set $\{m_1, \ldots, m_k\}$ of operations. We call $r_i$ a reference from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$; the type $U_C = (ident : ID, value :: T_C)$ is called the class type of $C$.

(iii) An operation signature consists of an operation name $M \in N$, a set of input-parameter / input-type pairs $\iota_i :: T_i$ ($\iota_i \in N$) and a set of output-parameter / output-type pairs $o_j :: T'_j$ ($o_j \in N$). We write

$$o_1 :: T'_1, \ldots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \ldots, \iota_n :: T_n) .$$

(iv) An operation $M$ on a class $C$ consists of an operation signature with name $M$ and a body that is recursively built from the following constructs:

(a) assignment $x := E$, where $x$ is either the class variable $x_C$ or a local variable within $S$, and $E$ is a term of the same type as $x$,

(b) skip, fail, loop,

(c) sequential composition $S_1 ; S_2$, choice $S_1 \Box S_2$, projection $x :: T \mid S$, guard $P \to S$, restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type $T$, and

(d) instantiation $x'_1, \ldots, x'_i \leftarrow C' : S'(E'_1, \ldots, E'_j)$, where $S'$ is an operation on class $C'$ with input-parameters $\iota'_1, \ldots, \iota'_j$ and output-parameters $o'_1, \ldots, o'_i$, such that the variables $o'_f$, $x'_f$ have the same type and the term $E'_g$ has the same type as the variable $\iota'_g$.

(v) An operation $M$ on a class $C$ with signature $o_1 :: T'_1, \ldots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \ldots, \iota_n :: T_n)$ is called value-defined iff all $T_i$ ($i = 1, \ldots, n$) and $T'_j$ ($j = 1, \ldots, m$) are proper value types.

(vi) A schema $S$ is a finite collection of classes $C_1, \ldots, C_n$ closed under references, superclasses and occurrences of class names in operations.
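To make items (i), (ii) and (vi) concrete, the following Python sketch derives the representation type and the class type from a structure expression and checks closure of a schema under references and superclasses. The encoding is an assumption chosen for this illustration: base types are strings, records are dictionaries whose labels double as reference names, `("set", t)` and `("list", t)` stand for the constructors, and the hypothetical marker `Ref("C")` stands for a reference to class C. Operations, and hence closure under class names occurring in operations, are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A reference to the class named target inside a structure expression."""
    target: str


def representation_type(struct):
    """Replace every reference by the identifier type ID."""
    if isinstance(struct, Ref):
        return "ID"
    if isinstance(struct, dict):                      # record type
        return {a: representation_type(t) for a, t in struct.items()}
    if isinstance(struct, tuple):                     # ("set", t) or ("list", t)
        return (struct[0], representation_type(struct[1]))
    return struct                                     # base type name


def class_type(struct):
    """Class type built from the identifier and the representation type."""
    return {"ident": "ID", "value": representation_type(struct)}


def references(struct, label=None):
    """Set of (reference name, class name) pairs occurring in struct."""
    if isinstance(struct, Ref):
        return {(label, struct.target)}
    if isinstance(struct, dict):
        return set().union(*(references(t, a) for a, t in struct.items()))
    if isinstance(struct, tuple):
        return references(struct[1], label)
    return set()


def is_closed(schema):
    """schema maps class names to (structure expression, superclass names)."""
    for struct, supers in schema.values():
        targets = {c for _, c in references(struct)}
        if not (targets | set(supers)) <= set(schema):
            return False
    return True


person = {"PersonIdentityNo": "NAT", "Name": "PERSONNAME"}
married = {"PersonIdentityNo": "NAT", "Spouse": Ref("MarriedPersonC")}
schema = {"PersonC": (person, []), "MarriedPersonC": (married, ["PersonC"])}
```

Here `representation_type(married)` yields the record with `Spouse` typed as `"ID"`, and dropping `PersonC` from `schema` makes `is_closed` fail because the superclass is no longer part of the schema.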

Semantics. First assume that the underlying type system has a set semantics. Then we can define instances of OODM schemata.

An instance $D$ of a schema $S$ assigns to each class $C$ a value $D(C)$ of type $U_C$ such that the following conditions are satisfied:





Moreover, if $T_C$ is a subtype of $T'_C$ with subtype function $f : T_C \to T'_C$, then we have



referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence relation $o_r$ we have

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in D(C) \wedge o_r(v, j) \Rightarrow j \in dom(D(C')) . \qquad (3.20)$$

On the basis of topos theory we can rephrase the definition of database instances. Instead of the set $D(C)$ we have to consider a subobject $\jmath : D_C \hookrightarrow ID \times T_C$, i.e. a monomorphism in $\mathbb{E}$. If $\pi : ID \times T_C \to ID$ is the canonical projection, then the uniqueness of identifiers means that $\pi \circ \jmath$ is monic. If $j_C \circ i$ is the image factorization of $\pi \circ \jmath$ with $j_C : im(D_C) \to ID$, then this must factor through $j_D$ if $C$ is a subclass of $D$. Thirdly, let $\bar{D}_C$ be the subobject of $D_C \times ID$ classified by

$$D_C \times ID \xrightarrow{\jmath \times id} ID \times T_C \times ID \xrightarrow{\pi_2 \times id} T_C \times ID \xrightarrow{o_r} \Omega ,$$

where $o_r$ corresponds to the reference from $C$ to $D$. Let $j_r : im(\bar{D}_C) \hookrightarrow ID$ result from the image factorization of $\pi_2 \circ \imath$ for $\imath : \bar{D}_C \hookrightarrow D_C \times ID$. Then $j_r$ must factor through $j_D$.

The semantics of operations can be dened via predicate transformers as shown in [26, 45] 

for the classical case and in [57] for the topos-based semantics. 

Example. Let us look at a simple university example based on the simple type system with set semantics. We first introduce types and classes, then show an example of an instance.

Type PERSONNAME = ( FirstName : STRING , SecondName : STRING , Titles : { STRING } )

Type PERSON = ( PersonIdentityNo : NAT , Name : PERSONNAME )

Type MPERSON = ( PersonIdentityNo : NAT , Spouse : ) 

Then let the schema consist of the following classes: 

Class PersonC

End PersonC

Class MarriedPersonC
IsA PersonC
Structure ( PersonIdentityNo : NAT , Spouse : MarriedPersonC )
End MarriedPersonC

Class StudentC
IsA PersonC
Structure ( StudentNumber : NAT , Supervisor : ProfessorC ,
Major : DepartmentC , Minor : DepartmentC )
End StudentC

Class ProfessorC
IsA PersonC
Structure ( PersonIdentityNo : NAT , Age : NAT ,
Salary : NAT , Faculty : DepartmentC )
End ProfessorC

Class DepartmentC
Structure ( DeptName : STRING )
End DepartmentC

Next use D as a name for the instance. 

D(PersonC) = { ( i_1 , ( 123 , ( "John" , "Denver" , { "Professor" , "Dr" } ))),
( i_2 , ( 124 , ( "Mary" , "Stuart" , { "Dr" } ))),
( i_3 , ( 456 , ( "John" , "Stuart" , {} ))),
( i_4 , ( 567 , ( "Laura" , "James" , {} ))),
( i_5 , ( 987 , ( "Dave" , "Ford" , {} ))) }

D(MarriedPersonC) = { ( i_1 , ( 123 , i_2 )),
( i_2 , ( 124 , i_1 )) }

D(ProfessorC) = { ( i_1 , ( 123 , 48 , 8000 , i_6 )) }

D(StudentC) = { ( i_3 , ( 456 , 1023 , ( "John" , "Stuart" , {} ), i_1 , i_6 , i_7 )),
( i_4 , ( 567 , 2134 , ( "Laura" , "James" , {} ), i_1 , i_6 , i_7 )) }

D(DepartmentC) = { ( i_6 , ( "Computer Science" ) ),
( i_7 , ( "Philosophy" ) ),
( i_8 , ( "Music" ) ) }
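For the set-based case the integrity conditions on instances can be checked mechanically. The Python sketch below encodes part of the example instance (StudentC is left out for brevity) and tests three conditions: uniqueness of identifiers, inclusion of the identifiers of a subclass in those of its superclass (read off from the factorization requirement in the categorical formulation) and referential integrity in the sense of (3.20). Identifiers are modelled as plain strings and each occurrence relation as a function returning the identifiers occurring in a value; both are assumptions of this illustration, as are all function names.

```python
# Each extent D(C) is a set of (identifier, value) pairs.
D = {
    "PersonC": {
        ("i1", (123, ("John", "Denver", frozenset({"Professor", "Dr"})))),
        ("i2", (124, ("Mary", "Stuart", frozenset({"Dr"})))),
        ("i3", (456, ("John", "Stuart", frozenset()))),
        ("i4", (567, ("Laura", "James", frozenset()))),
        ("i5", (987, ("Dave", "Ford", frozenset()))),
    },
    "MarriedPersonC": {("i1", (123, "i2")), ("i2", (124, "i1"))},
    "ProfessorC": {("i1", (123, 48, 8000, "i6"))},
    "DepartmentC": {
        ("i6", ("Computer Science",)),
        ("i7", ("Philosophy",)),
        ("i8", ("Music",)),
    },
}

# IsA relation: class name -> superclass names.
ISA = {"MarriedPersonC": ["PersonC"], "ProfessorC": ["PersonC"]}

# References as (source class, target class, occurrence function).
REFS = [
    ("MarriedPersonC", "MarriedPersonC", lambda v: {v[1]}),   # Spouse
    ("ProfessorC", "DepartmentC", lambda v: {v[3]}),          # Faculty
]


def dom(D, C):
    """Identifiers occurring in the extent of class C."""
    return {i for (i, _) in D[C]}


def identifiers_unique(D):
    """No identifier is paired with two different values in one class."""
    return all(len(dom(D, C)) == len(D[C]) for C in D)


def subclass_inclusion(D, isa):
    """Identifiers of a subclass also occur in each superclass."""
    return all(dom(D, C) <= dom(D, S) for C, supers in isa.items() for S in supers)


def referential_integrity(D, refs):
    """Every referenced identifier belongs to the target class, cf. (3.20)."""
    return all(
        j in dom(D, target)
        for (C, target, occurs) in refs
        for (_, v) in D[C]
        for j in occurs(v)
    )
```

On the instance above all three checks succeed. Changing the faculty of the professor to an identifier such as `"i9"` that does not occur in `DepartmentC` makes `referential_integrity` fail.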

3.4 Value Representability 

From an object oriented point of view a database may be considered as a huge collection of objects of arbitrarily complex structure. Hence the problem arises of uniquely identifying and retrieving objects in such collections.

Each object in a database is an abstraction of a real world object that has a unique identity. The representation of such objects in the OODM uses an abstract identifier $I$ of type ID to encode this identity. Such an identifier may be considered as being immutable. However, from a systems oriented view permutations or collapses of identifiers without changing anything else should not affect the behaviour of the database.

For the user the abstract identifier of an object has no meaning. Therefore, a different access to the identification problem is required. We show that the unique identification of an object in a class leads to the notion of value-identifiability. The stronger notion of value-representability is required for the unique definition of generic update operations. The set-based case has been handled in [54, 55].

(i) A class $C$ is called value-identifiable iff there exists a proper value type $I_C$, called value-identification type, such that for all instances $D$ of $S$ there is a morphism $c : T_C \to I_C$ such that the composition

$$D_C \hookrightarrow ID \times T_C \xrightarrow{\pi_2} T_C \xrightarrow{c} I_C$$

is monic.

(ii) $C$ is called value-representable iff there exists a value-identification type $V_C$ such that for all instances $D$ of $S$ there is a morphism $c^V : T_C \to V_C$ such that for all value-identification types $I_C$, with corresponding morphism $c^I : T_C \to I_C$, and the image factorization $T_C \xrightarrow{c^V} D_{V_C} \hookrightarrow V_C$ there exists a morphism $c' : D_{V_C} \to I_C$ with $c^I = c' \circ c^V$.
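In the set-based case the monicity required in (i) simply says that distinct objects of the extent are sent to distinct values of the identification type. A minimal Python sketch of this check for one given instance follows; the function name `identifies` and the sample extent are assumptions made for illustration, and the definition itself quantifies over all instances, which a check on a single instance cannot establish.

```python
def identifies(extent, c):
    """True if (i, v) |-> c(v) is injective on the given extent.

    extent is a set of (identifier, value) pairs; c maps values of the
    representation type to values of a candidate identification type.
    """
    images = [c(v) for (_, v) in extent]
    return len(set(images)) == len(images)


persons = {
    ("i1", (123, ("John", "Denver"))),
    ("i2", (124, ("Mary", "Stuart"))),
    ("i3", (456, ("John", "Stuart"))),
}

by_number = lambda v: v[0]        # project to PersonIdentityNo
by_surname = lambda v: v[1][1]    # project to SecondName
```

On this extent `identifies(persons, by_number)` holds, whereas `identifies(persons, by_surname)` fails because two persons share the surname "Stuart".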


It is easy to see that each value-representable class $C$ is also value-identifiable. Moreover, the value-representation type $V_C$ is unique up to isomorphism.

We want to define algorithms to compute types $V_C$ and $I_C$ that turn out to be proper value types under certain conditions. For this we extend subtyping to structure expressions in a natural way, taking care of IsA-relations. Then each super structure expression $S'$ and each instance define a morphism $I_{S'} : D_C \to D_{C'} \hookrightarrow ID \times T_{S'}$ using the representation type $T_{S'}$ of $S'$.

Algorithm. Let $F(C_i) = T_i$ provided there exists a super structure expression on $C_i$ defined by $c_i : T_{C_i} \to T_i$; otherwise let $F(C_i)$ be undefined. If ID occurs in some $F(C_i)$ corresponding to $r_j : C_j$ ($j \ne i$), we write $ID_j$.

Then iterate as long as possible using the following rules:

(i) If $F(C_j)$ is a proper value type and $ID_j$ occurs in some $F(C_i)$ ($j \ne i$), then replace this corresponding $ID_j$ in $F(C_i)$ by $F(C_j)$.

(ii) If $ID_i$ occurs in some $F(C_i)$, then let $F(C_i)$ be recursively defined by $F(C_i) == S_i$, where $S_i$ is the result of replacing $ID_i$ in $F(C_i)$ by the type name $F(C_i)$.

The iteration terminates, since there exists only a finite collection of classes. If these rules are no longer applicable, replace each remaining occurrence of $ID_j$ in $F(C_i)$ by the type name $F(C_j)$, provided $F(C_j)$ is defined. $\Box$

Note that the algorithm computes (mutually) recursive types.
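The substitution process of the Algorithm can be sketched in Python as follows. The encoding is an assumption made for this illustration only: base types are strings, records are dictionaries, `("set", t)` and `("list", t)` are constructors, `IdOf("C")` marks an occurrence of ID that stands for a reference to class C, and `Name("C")` stands for the type name of the type computed for C. Rule (i) is approximated by requiring that the substituted type contains no remaining identifier occurrence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdOf:
    """Occurrence of ID that corresponds to a reference to class cls."""
    cls: str


@dataclass(frozen=True)
class Name:
    """Type name of the type computed for class cls (recursive types)."""
    cls: str


def subst(t, cls, repl):
    """Replace every IdOf(cls) in t by repl."""
    if isinstance(t, IdOf):
        return repl if t.cls == cls else t
    if isinstance(t, dict):
        return {a: subst(u, cls, repl) for a, u in t.items()}
    if isinstance(t, tuple):                      # ("set", u) or ("list", u)
        return (t[0], subst(t[1], cls, repl))
    return t


def ids(t):
    """Names of the classes whose identifiers still occur in t."""
    if isinstance(t, IdOf):
        return {t.cls}
    if isinstance(t, dict):
        return set().union(*(ids(u) for u in t.values()))
    if isinstance(t, tuple):
        return ids(t[1])
    return set()


def compute(F):
    """F maps class names to types; classes with undefined F are absent."""
    F = dict(F)
    changed = True
    while changed:
        changed = False
        for i in F:
            for j in sorted(ids(F[i])):
                if j == i:
                    # rule (ii): self-reference becomes a recursive type name
                    F[i] = subst(F[i], i, Name(i))
                    changed = True
                elif j in F and not ids(F[j]):
                    # rule (i): inline a type that is free of identifiers
                    F[i] = subst(F[i], j, F[j])
                    changed = True
    # final step: remaining references to defined classes become type names
    for i in F:
        for j in ids(F[i]):
            if j in F:
                F[i] = subst(F[i], j, Name(j))
    return F


F0 = {
    "MarriedPersonC": {"PersonIdentityNo": "NAT", "Spouse": IdOf("MarriedPersonC")},
    "DepartmentC": {"DeptName": "STRING"},
    "ProfessorC": {"PersonIdentityNo": "NAT", "Faculty": IdOf("DepartmentC")},
}
result = compute(F0)
```

For the three classes above the spouse reference turns into the recursive type name of `MarriedPersonC`, and the faculty reference is replaced by the department record. A reference to a class for which `F` is undefined stays as an identifier, so the resulting type is not a value type in that case.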

The reference graph of a class $C$ in a schema $S$ is the smallest labelled graph $G_{ref} = (V, E, l)$ satisfying:

(i) There exists a vertex $v_C \in V$ with $l(v_C) = \{t, C\}$, where $t$ is the top-level type in the structure expression $S$ of $C$.

(ii) For each proper occurrence of a type $t \ne ID$ in $T_C$ there exists a unique vertex $v_t \in V$ with $l(v_t) = \{t\}$.

(iii) For each reference $r_i : C_i$ in the structure expression $S$ of $C$ the reference graph $G^i_{ref}$ is a subgraph of $G_{ref}$.

(iv) For each vertex $v_t$ or $v_C$ corresponding to $t(x_1, \ldots, x_n)$ in $S$ there exist unique edges $e^{(i)}_t$ from $v_t$ or $v_C$ respectively to $v_{t_i}$ in case $x_i$ is the type $t_i$, or to $v_{C_i}$ in case $x_i$ is the reference $r_i : C_i$. In the first case $l(e^{(i)}_t) = \{S_i\}$, where $S_i$ is the corresponding selector name; in the latter case the label is $\{S_i, r_i\}$.

Let $S = \{C_1, \ldots, C_n\}$ be a schema. Let $S' = \{C'_1, \ldots, C'_n\}$ be another schema such that for all $i$ there exists a super structure expression on $C_i$ defined by some $c_i : T_{C_i} \to T_{C'_i}$. Then an identification graph $G_{id}$ of the class $C_i$ is obtained from the reference graph of $C'_i$ by changing each label $C'_j$ to $C_j$.

With these notations it is easy to see the following: if for a class $C$ there exists a super structure expression for all classes $C_i$ occurring as a label in some identification graph $G_{id}$ of $C$, and $I_C$ is the type computed by the Algorithm with respect to the super structure expressions used in the definition of $G_{id}$, then $I_C$ is a proper value type.

Theorem. (i) Let $C$ be a class in a schema $S$ such that there exists a super structure expression for all classes $C_i$ occurring as a label in the reference graph $G_{ref}$ of $C$. Let $V_C$ be the type $G(C)$ computed by the Algorithm with respect to trivial super structure expressions and let $I_C$ be the type $F(C)$ computed by the Algorithm with respect to arbitrary super structure expressions. Then $C$ is value-representable with value representation type $V_C$ and each such $I_C$ is a value identification type.

(ii) Let $C$ be a class in a schema $S$ such that there exist generic update methods on $C$. Then $C$ is value-representable. Moreover, all super- and subclasses of $C$ are also value-representable.

(iii) Let $C$ be a value-representable class in a schema $S$ such that all its super- and subclasses are also value-representable. Then there exist unique generic update operations on $C$.

The proof mimics the set-based arguments in [55].

3.5 Logical Reconstruction 

So far, we have seen the decisive role of type semantics for OODBs. Given a topos of types, we may describe instances of a schema on top of it. The only assumption is the existence of a type ID of object identifiers. Moreover, it is known from [28, 35, 39] that topoi are inherently connected with higher-order intuitionistic logics.

In principle, there are two (equivalent) ways to approach the logic of a topos. The first one is given by the Mitchell-Bénabou language and Kripke-Joyal semantics [39]. The second one, based on Fourman-Scott languages [28], follows more the general line of logics, defining its syntax and interpretation in an (arbitrary) topos.

In our presentation we take the second approach, because it directly comes up with equality, existence and description [60]. Recall that a Fourman-Scott language $L$ consists of

- two sets $Sort$ and $Const$ of sorts and constants,

- a power sort map $[\,] : \bigcup_{n \in \mathbb{N}} Sort^n \to Sort$ written $(A_1, \ldots, A_n) \mapsto [A_1, \ldots, A_n]$,

- a family of countable sets $\{Var_s\}_{s \in Sort}$ indexed by the sorts and

- a map $\# : Const \to Sort$ assigning to each constant its sort.

We also use $Var = \bigcup_{s \in Sort} Var_s$ to refer to the set of all variables. Then for a given variable $x \in Var$ we write $\#x$ to refer to the sort of $x$. Moreover, we use f = [] as an abbreviation for the empty power sort, which will be regarded as consisting of truth values.

The terms $T_s(L)$ of sort $s \in Sort$ for a language $L$ are constructed from $L$ as the smallest set such that each variable $x$ of sort $s$, each constant $c$ with $\#c = s$ and $Ix.\varphi$ for each variable $x$ with $\#x = s$ and each formula $\varphi$ belong to $T_s(L)$.

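The formulae of $L$ build the smallest set $F(L)$ such that the following formulae are in $F(L)$:

- $E\tau$ for each term $\tau \in T(L)$,

- $\tau \equiv \sigma$ for terms $\tau$, $\sigma$ of the same sort $s$,

- $\tau(\tau_1, \ldots, \tau_n)$ for terms $\tau_i \in T_{s_i}(L)$ and $\tau \in T_{[s_1, \ldots, s_n]}(L)$,

- $\varphi \wedge \psi$ for formulae $\varphi$ and $\psi$,

- $\varphi \Rightarrow \psi$ for formulae $\varphi$ and $\psi$ and

- $\forall x.\varphi$ for variables $x \in Var$ and formulae $\varphi$.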


We may then introduce the other junctors $\neg$, $\vee$, $\Leftrightarrow$, the predicate $=$ and the quantifier $\exists$ as abbreviations.

The intention behind the description symbol $I$ needs some explanation. Informally $Ix.\varphi$ means the unique $x$ that satisfies $\varphi$. However, such an $x$ may not exist.

The logic deals with this problem by introducing a formal existence predicate $E$, where $E\tau$ means that $\tau$ exists. This is formalized by distinguishing domains $\tilde{A}$ of possible elements and letting $E$ pick out the subdomains of actual elements. Bound variables then range only over actual elements. When interpreting the logic in a topos this construction is related to partial morphism classification.

The introduction of an existence predicate also influences the equality predicate $=$, which is considered as a property of actual elements. In order to compare also possible elements the equivalence predicate $\equiv$ is introduced. Non-existing elements are all considered to be equivalent. Since equality can then be defined in terms of the equivalence and the existence predicates, only $\equiv$ is taken as a primitive in the logic.
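The interplay of description, existence, equivalence and equality can be illustrated over a finite domain. In the following Python sketch a term that does not denote an actual element is modelled by `None`; this convention, the function names and the restriction to finite domains of non-`None` values are assumptions of the illustration and not part of the logic itself.

```python
def describe(domain, phi):
    """Description: the unique x in domain with phi(x), else a non-existing term."""
    witnesses = [x for x in domain if phi(x)]
    return witnesses[0] if len(witnesses) == 1 else None


def exists(term):
    """Existence predicate: the term denotes an actual element."""
    return term is not None


def equivalent(s, t):
    """Equivalence: both terms do not exist, or both exist and coincide."""
    return s == t


def equal(s, t):
    """Equality, defined from equivalence and existence."""
    return exists(s) and exists(t) and equivalent(s, t)


seven = describe(range(10), lambda x: x * x == 49)      # exactly one witness
even = describe(range(10), lambda x: x % 2 == 0)        # several witnesses
none = describe(range(10), lambda x: x > 20)            # no witness
```

Here `seven` exists, while `even` and `none` do not; the latter two are equivalent to each other but not equal, since equality only holds between actual elements.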

We have mentioned above that the sort f will be considered as truth values. Then the formula $\tau()$ with $\tau \in T_f(L)$ is a formula that asserts $\tau$.

We dispense with a description of the axioms and rules that define the derivation operator $\vdash$ as well as with the interpretation of $L$ in an arbitrary topos. We only mention that each theory $T$ of $L$ canonically defines a topos $\mathbb{E}(T)$, called the topos of definable types and definable total functions, and that each topos $E$ can be written in this form. In particular, there is a canonical interpretation of $L$ in $\mathbb{E}(T)$, which is sound and complete.

In order to define $\mathbb{E}(T)$ we introduce types and relations as terms of specific syntactic forms. Such types reflect the many possible subdomains of domains associated with power sorts.

A type $A$ is a term of the form $Iy :: [s].\ \forall x :: s.\ (\varphi \Leftrightarrow y(x))$. A relation $f$ from $s$ to $t$ is a term of the form $Iz :: [s, t].\ \forall x :: s, y :: t.\ (\varphi \Leftrightarrow z(x, y))$. A type $A$ or a relation $f$ is said to be definable iff the defining formula is closed.

A more convenient notation for a type $A$ defined by the formula $\varphi$ is $A = \{x :: s \mid \varphi\}$. For a term $\tau$ of sort $s$ we then get the formula $\tau \in A$. For a variable $x$ with $\#x = s$ we may use the quantifiers $\forall x \in A$ and $\exists x \in A$.

For a relation $f$ we may use the notation $f^\#(\tau)$ for $Iy :: t.\ f(\tau, y)$ for $\tau \in T_s(L)$, even if we do not know whether $f$ is the graph of a function. Furthermore, we use functional abstraction, writing $\lambda x :: s.\ \tau$ as an abbreviation for $Iz :: [s, t].\ \forall x :: s, y :: t.\ (y = \tau \Leftrightarrow z(x, y))$.

Finally, two relations $f$, $g$ from type $A$ to type $B$ are equivalent with respect to $T$ iff $T \vdash \forall x \in A.\ (f^\#(x) \equiv g^\#(x))$ holds.

Let $T$ be a theory over $L$. The topos $\mathbb{E}(T)$ of definable types and definable total functions has as objects the definable types of $L$ and as morphisms from $A$ to $B$ equivalence classes of definable relations from $A$ to $B$ such that $T \vdash \forall x \in A.\ f^\#(x) \in B$ holds. For $f \in Hom(A, B)$ and $g \in Hom(B, C)$ the composition $g \circ f \in Hom(A, C)$ is defined by $\lambda x \in A.\ (g^\#(f^\#(x)))$.

Schemata and Instances as Theories. Given a topos $E$, let us now try to shift the categorical characterization of instances into the associated logic. Recall that the sorts of this logic are the objects of $E$, the constants of sort $A$ are the morphisms $c : \mathbb{1} \to \tilde{A}$, where $\eta_A : A \to \tilde{A}$ is the partial morphism classifier for $A$ [35, 39, 53], and the power sort map takes $A_1, \ldots, A_n$ to $\Omega^{A_1 \times \cdots \times A_n}$.

Now consider the monomorphism $\jmath : D_C \hookrightarrow ID \times T_C$ and the canonical projection $\pi_1 : ID \times T_C \to ID$. As above let $j_C : im(D_C) \hookrightarrow ID$ result from the image factorization of $\pi_1 \circ \jmath$. Since we assume $\pi_1 \circ \jmath$ to be monic, the universal property of images gives rise to a unique monomorphism $\imath : im(D_C) \to D_C$.

Since we assume value-representability, $\pi_2 \circ \jmath$ is also monic, hence $\pi_2 \circ \jmath \circ \imath$ gives a monomorphism from $im(D_C)$, a subobject of $ID$, to $T_C$. Then the universal property of the partial morphism classifier $\eta_{T_C}$ gives rise to a unique monomorphism $I(C) : ID \to \tilde{T}_C$.

Similarly, consider the morphism $o_r : T_C \times ID \to \Omega$ corresponding to a reference $r$ from class $C$ to class $D$. Since $\eta_{T_C} \times I(D) : T_C \times ID \to \tilde{T}_C \times \tilde{T}_D$ defines a monomorphism, we may again consider the partial morphism classifier for $\Omega$, which is $\eta_\Omega = id_\Omega$. This gives us a unique morphism $\tilde{o}_r : \tilde{T}_C \times \tilde{T}_D \to \Omega$. Then let $I(r) = \hat{\tilde{o}}_r : \tilde{T}_C \to \Omega^{\tilde{T}_D}$ be its exponential adjoint.

Then the morphisms $I(C)$ for all classes $C \in S$ and $I(r)$ for all references in $S$ (assuming for the moment unique reference names) are sufficient to describe objects. In fact, we may think of these morphisms as semantically associated with an instance, whereas syntactically we may use the class names $C$ and the reference names $r$ instead.

This gives rise to formulae of the form $E\,Io.\varphi$ as "ground facts" given by some instance. Moreover, the following formulae define the axioms of the schema $S$:

$$\forall o.\ EC(o) \Rightarrow ED(o) \quad \text{if $C$ is a subclass of $D$} \qquad (3.21)$$

$$\forall o.\ EC(o) \Rightarrow \forall o'.\ (r(C(o))(D(o')) \Rightarrow ED(o')) \quad \text{for a reference $r$ from $C$ to $D$} \qquad (3.22)$$

$$\forall o, o'.\ C(o) = C(o') \Rightarrow o = o' \qquad (3.23)$$

If $Ax(S)$ is the set of formulae (3.21), (3.22) and (3.23) defined for schema $S$, then this corresponds to the theory $T' = \{\varphi \mid Ax(S) \vdash \varphi\}$. If in addition $Ax(I)$ is a set of formulae given by some instance, maybe only "ground facts" as above, then the corresponding theory is $T' = \{\varphi \mid Ax(S) \cup Ax(I) \vdash \varphi\}$.

Note that each model of such a theory $T'$ in the underlying topos $\mathbb{E}(T)$ gives rise to a logical morphism $\mathbb{E}(T') \to \mathbb{E}(T)$ [28].

Let us finally remark that the construction of $I(C)$ is also possible if value-representability is not assumed, but in this case we shall not get a monomorphism. Then in general a fact such as $Io.\ C(o) = t$ may not exist, i.e. $E\,Io.\ C(o) = t$ may not factor through $true$. Then the only model would be the inconsistent topos. Nevertheless, a smooth extension to the case of weak value-identifiability is still possible.

Getting Rid of Identifiers. In [16] object identifiers have been identified as a pure implementation concept. This leads to the requirement of weak value-identifiability. In our construction above, where we assumed the stronger value-representability, this is already reflected by the fact that $I(C)$ is a morphism into $\tilde{T}_C$, whereas in the first categorical reformulation we had monomorphisms into $ID \times T_C$.

Nevertheless, the type $\tilde{T}_C$ still involves identifiers corresponding to references, but as shown in [53, 54, 55] the value types that can be used to identify objects can be effectively computed. Let us sketch a corresponding construction in $E$.

Thus, consider the pullback of $I(r) : \tilde{T}_C \to \Omega^{\tilde{T}_D}$ and $id_{\Omega^{\tilde{T}_D}}$. This defines an object $T_{CrD}$ and morphisms $exp'(r) : T_{CrD} \to \Omega^{\tilde{T}_D}$ and $exp(r) : T_{CrD} \to \tilde{T}_C$, the latter being monic. Since $I(r) \circ I(C) = id_{\Omega^{\tilde{T}_D}} \circ (I(r) \circ I(C))$, the universal property of pullbacks defines a unique monomorphism $I(CrD) : ID \to T_{CrD}$ with $exp(r) \circ I(CrD) = I(C)$.

We may repeat this construction with respect to all morphisms corresponding to references, including the $exp'(r)$ constructed above. This defines a diagram $D$ in $E$. Let $O$ denote the limit of $D$. Then there is also a unique monomorphism $I : ID \to O$ such that all the morphisms $I(C)$ are given by $I$ and $D$.

Note that we may also assume all objects in the diagram $D$ to be bounded, i.e. there exists a monomorphism into some fixed object $R$. Then $O$ will also turn out to be a subobject of $R$.

The construction of $O$ glues together types and references, but still does not yield object descriptions without any identifiers. For this let $\varepsilon_C : T_C \to T'_C$ result by elimination of all identifiers; formally $\varepsilon_C$ occurs as the pushout of $U \to \mathbb{1}$ and $U \to T_C$, where these two morphisms define the pullback of the exponential adjoint $\hat{o}_r$ and the exponential adjoint of $true \circ triv_{ID}$.

If $f$, $g$ define the pushout of $\varepsilon_C$ and $I(r)$, then we get also an object $T'_{CrD}$ by the pullback of $f$ and $g$, hence also morphisms $\varepsilon_{CrD} : T_{CrD} \to T'_{CrD}$. If $D'$ is a diagram that extends $D$ by these morphisms, then we obtain the required types without occurrences of ID that can be used to extend the logic.

Queries. In the relational model there are two basic approaches to queries based on the 

relational algebra and the relational calculus. We are now able to introduce analogous constructions 

in the OODM. 

In the algebraic perspective we may use all operations supplied by the type system. Syntactically this means to consider all closed value terms as queries with a semantics defined by morphisms $t : \mathbb{1} \to \tilde{T}$. In addition, each class $C$ defines a query with semantics given by $I(C) : ID \to \tilde{T}_C$ in an instance $I$. Combining these two basic queries using all operators of the type system gives a simple query language [55]. Note that in the relational subcase we obtain the operators of relational algebra without the join.

Furthermore, we need polymorphic operators to combine queries. For queries defined by morphisms $I_1 \to A$ and $I_2 \to B$ and functions $A \to C$ and $B \to C$ we may consider the "inner" pullback $A \times_C B \to A$, $A \times_C B \to B$ and in the same way the "outer" pullback $I = I_1 \times_C I_2$. Then by universality we obtain a unique morphism $I \to A \times_C B$, defining the semantics of a pullback query. In the relational algebra the join corresponds to such pullbacks.
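In the category of finite sets, the pullback of two maps into a common set consists of exactly those pairs that agree under both maps, which is how a join arises. The following is a minimal Python sketch under that finite-set assumption; the function `pullback` and the relations `employees` and `departments` are illustrative names that do not appear in the paper.

```python
from itertools import product

def pullback(a, b, f, g):
    """Pullback of f: A -> C and g: B -> C for finite sets A and B.

    Returns all pairs (x, y) with f(x) == g(y).
    """
    return {(x, y) for x, y in product(a, b) if f(x) == g(y)}

# Hypothetical relations: employees (name, dept) and departments (dept, city).
employees = {("Ada", "CS"), ("Bob", "Math")}
departments = {("CS", "Cottbus"), ("Physics", "Hamburg")}

# Both projections map into the common set of department names.
joined = pullback(
    employees,
    departments,
    f=lambda e: e[1],  # employee -> dept
    g=lambda d: d[0],  # department -> dept
)
```

The result pairs up matching tuples; a natural join would additionally merge the shared column, which is a projection applied afterwards.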

For classes containing references we may also consider queries $\{r/D\}.C$ defined by the substitution of class $D$ for reference $r : D$. Semantically we consider again a pullback $\tilde{T}_C \times_r \tilde{T}_D$ over $id : \tilde{T}_D \to \tilde{T}_D$ and $I(r) : \tilde{T}_C \to \tilde{T}_D$. Then the morphisms $I(C) : ID \to \tilde{T}_C$ and $I(r) \circ I(C) : ID \to \tilde{T}_D$ give rise to a unique monomorphism $ID \hookrightarrow \tilde{T}_C \times_r \tilde{T}_D$, which defines the semantics of reference substitution queries.

For the calculus things are much easier, since we may exploit the associated logic. Since classes and references have been incorporated into the logic, a query is simply given by a term $Ix.\varphi$ with a defining formula $\varphi$. This generalizes the relational approach.


In this paper we indicated some fundamentals and logical semantics for object oriented databases. The starting point was the consideration of building blocks in OODB schemata, i.e. types and classes. First we observed a decisive importance of type semantics. Objects are considered to be abstractions of real world entities, hence they have an immutable identity. This identity is first encoded by abstract identifiers that are assumed to form some type ID. There is not only one value of a given type that is associated with an object. In contrast we allow several values of possibly different types to belong to an object, and even this collection of types may change. Classes are used to structure objects. At each time a class corresponds to a collection of objects with values of the same type and references to objects in a fixed set of classes.

In general, it is reasonable to assume a semantics based on topos theory. Then all these 

considerations can be generalized using notions from category theory. On this basis the problems 

of identification and genericity have been solved in general. The unique identification

of objects and the existence of generic update operations in a class require the class to be 


Since topos theory is inherently connected with higher-order intuitionistic logic, we were able to first rephrase the notions of object oriented databases in category theory, then to transform them into logic. This allows the definition of query algebra and calculus. Taking value-representability as a desirable property into account, we could even show how to get rid of object identifiers that have already been detected as a pure implementation concept.

The results achieved so far seem to offer a reasonable logical foundation for object oriented databases. They even allow us to relate this field to recent investigations in foundations of computer science with respect to type theory and effective computation.

Nevertheless, it is just the beginning of a story concerning deductive capabilities in object 

oriented databases. To proceed, it will be interesting to investigate (higher-order) geometric 

theories [39, 68, 69]. Further research is planned in this direction. 

As to the dynamics of object oriented databases concerning the formalization of operation semantics, we may wish to exploit e.g. axiomatic semantics in the sense of Dijkstra's predicate transformers [23, 26, 45]. The problem with this theory is that it depends on the use of a suitable logic that guarantees the existence of predicate transformers with the intended semantics. Whilst the classical theory uses an infinitary first-order logic $L_{\omega_1 \omega}$, the required generalization to topos logic has been shown in [53, 57].

Finally, types can be handled in a much more flexible way if we extend algebraic data type specifications by higher-order functional and truth-value sorts and define topoi as models of such constructor theories. This approach is described in [53, 56].

Then it is an open problem how this kind of type theory relates to synthetic domain theory, which is roughly "domain theory within a topos" [29, 33, 51, 64]. The basic assumption of this theory is that "domains" are specific objects in a topos such that all morphisms between them are continuous and all constructions are solely based on categorical properties without recourse to order-theoretic properties. Again the effective topos turns out to be a reasonable source of examples of that kind of theory.


1. S. Abiteboul: Towards a deductive object-oriented database language, Data & Knowledge Engineering, vol. 5, 1990, pp. 263-287

2. S. Abiteboul, R. Hull: IFO: A Formal Semantic Database Model, ACM ToDS, vol. 12 (4), December 1987, pp. 525-565


Portland, Oregon, 1989, pp. 159-173

4. H. Aït-Kaci: An Overview of LIFE, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991, pp. 42-58

5. A. Albano, G. Ghelli, R. Orsini: Types for Databases: The Galileo Experience, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 27-37

6. A. Albano, G. Ghelli, R. Orsini: Objects and Classes for a Database Programming Language, FIDE technical report 91/16, 1991


Database Programming Language, in A. Sernadas (Ed.): Proc. VLDB 91, Barcelona 1991 



9. F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel, S. Gamerman, C. Lécluse, P. Pfeffer, P. Richard, F. Velez: The Design and Implementation of O2, an Object-Oriented Database System, Proc. of the ooDBS II workshop, Bad Münster, FRG, September 1988

10. M. Barr, C. Wells: Category Theory for Computing Science, Prentice-Hall 1990

11. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, pp. 370-395


5 (4), 1990, pp. 353-382

13. C. Beeri, Y. Kornatzky: Algebraic Optimization of Object-Oriented Query Languages, in S. Abiteboul, P. C. Kanellakis (Eds.): Proc. ICDT '90, Springer LNCS 470, pp. 72-88

14. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92

15. C. Beeri, T. Milo: Subtyping in OODBs, in Proc. PODS '91

16. C. Beeri, B. Thalheim: Can I see your Identification, please?, Proc. of the Workshop on Database Semantics, Rez, January 1995 (to appear)

17. K. B. Bruce, A. R. Meyer: The Semantics of Second Order Polymorphic Lambda Calculus, in 

G. Kahn, D. B. MacQueen, G. Plotkin (Eds.): Semantics of Data Types, Springer LNCS 173, 

1984, 131-144 


Computing Surveys 17 (4), pp. 471-522

19. L. Cardelli: Typeful Programming, Digital Systems Research Center Reports 45, DEC SRC Palo 

Alto, May 1989 

20. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. 

ACM SIGMOD 88 

21. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the


22. R.G.G. Cattell: Object Data Management: Object Oriented and Extended Relational Database 

Systems, Addison-Wesley, 1991 

23. P. Cousot: Methods and Logics for Proving Programs, in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B: "Formal Models and Semantics", Elsevier, 1990, 841-993

24. A. Dearle, R. Connor, F. Brown, R. Morrison: Napier88 - A Database Programming Language?, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 10-26

25. K. Denninghoff, V. Vianu: Database Method Schemas and Object Creation, in Proc. PODS '93, 265-275

26. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer-Verlag, 1989 

27. D. Fishman, D. Beech, H. Cate, E. Chow et al.: IRIS: An Object-Oriented Database Management 

System, ACM ToIS, vol. 5(1), January 1987 

28. M. P. Fourman: The Logic of Topoi, in J. Barwise (Ed.): Handbook of Mathematical Logic, North- 

Holland Studies in Logic, vol. 90, 1977, 1053-1090 

29. P. Freyd: Recursive Types reduced to Inductive Types, in J. Mitchell (Ed.): 5th Symposium on 

Logic in Computer Science, Philadelphia, 1990 

30. M. Hammer, D. McLeod: Database Description with SDM: A Semantic Database Model, J. ACM, vol. 31 (3), 1984, pp. 351-386

31. R. Hull, R. King: Semantic Database Modeling: Survey, Applications and Research Issues, ACM

32. J. Hyland: The Effective Topos, in A. Troelstra, D. van Dalen (Eds.): The L.E.J. Brouwer Centenary Symposium, North Holland, 1982, 165-216


33. J. Hyland: First Steps in Synthetic Domain Theory, in A. Carboni, M. Pedicchio, G. Rosolini (Eds.): Category Theory '90, Springer LNM, vol. 1488, 1992

34. J. Hyland, E. Robinson, G. Rosolini: The Discrete Objects in the Effective Topos, Proc. LMS 60 (1990), 1-60

35. P. Johnstone: Topos Theory, LMS Monographs vol. 10, Academic Press, 1977 


1986 

37. M. Kifer, G. Lausen. F-Logic: A Higher-order Language for Reasoning about Objects, Inheritance 

and Schema, in Proc. SIGMOD 1989, 134-146 



39. S. Mac Lane, I. Moerdijk: Sheaves in Geometry and Logic - A First Introduction to Topos Theory, Springer Universitext, 1992



41. F. Matthes, J. W. Schmidt: Bulk Types - Add-On or Built-In?, in Proc. DBPL III, Nafplion 1991

42. J. C. Mitchell: Type Systems for Programming Languages, in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B: "Formal Models and Semantics", Elsevier, 1990, 365-458

43. J. Mylopoulos, P. A. Bernstein, H. K. T. Wong: A Language Facility for Designing Interactive Database-Intensive Applications, ACM ToDS, vol. 5 (2), April 1980, pp. 185-207

44. J. Mylopoulos, A. Borgida, M. Jarke, M. Koubarakis: Telos: Representing Knowledge About Information Systems, ACM ToIS, vol. 8 (4), October 1990, pp. 325-362

45. G. Nelson: A Generalization of Dijkstra's Calculus, ACM TOPLAS, vol. 11 (4), October 1989, pp. 517-561

46. A. Ohori: Representing Object Identity in a Pure Functional Language, Proc. ICDT 90, Springer LNCS, pp. 41-55

47. A. M. Pitts: Polymorphism is Set Theoretic, Constructively, in D. H. Pitt, A. Poigné, D. E. Rydeheard (Eds.): Category Theory and Computer Science, Springer LNCS 283, 12-39

48. J. C. Reynolds: Polymorphism is not Set-Theoretic, in G. Kahn, D. B. MacQueen, G. Plotkin (Eds.): Semantics of Data Types, Springer LNCS 173, 1984, 145-156

49. G. Rosolini: Categories and Effective Computations, in D. H. Pitt, A. Poigné, D. E. Rydeheard (Eds.): Category Theory and Computer Science, Springer LNCS 283, 1-11

50. G. Rosolini, E. Robinson: Colimit Completions and the Effective Topos, Journal of Symbolic Logic 55 (1990), 678-699

51. G. Rosolini: Notes on Synthetic Domain Theory, University of Genova, February 1995 

52. K.-D. Schewe, B. Thalheim, I. Wetzel, J. W. Schmidt: Extensible Safe Object-Oriented Design of Database Applications, University of Rostock, Preprint CS-09-91, September 1991

53. K.-D. Schewe: Specification of Data-Intensive Application Systems, Habilitation Thesis, TU Cottbus, 1994


Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, 341-356 

55. K.-D. Schewe, B. Thalheim: Fundamental Concepts of Object Oriented Databases, Acta Cybernetica, 

vol. 11 (4), 1993, 49-84 

56. K.-D. Schewe: A Semantics for Type Specifications Based on Topos Theory, TU Cottbus, Technical Report I-5 / 1994

57. K.-D. Schewe: A Non-Classical Generalization of Dijkstra's Calculus - Axiomatic Semantics for Typed Program Specifications, TU Cottbus, Technical Report I-6 / 1994

58. K.-D. Schewe, B. Thalheim, I. Wetzel: Foundations of Object Oriented Database Concepts, University of Hamburg, Report FBI-HH-B-157/92, October 1992

59. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992

60. D. S. Scott: Identity and Existence in Intuitionistic Logic, in M. P. Fourman, C. J. Mulvey, 

D. S. Scott (Eds.): Applications of Sheaves, Springer LNM 753, 660-696 

61. M. H. Scholl, H.-J. Schek: A Relational Object Model, in Proc. ICDT 90, Springer LNCS, pp. 89-105



63. S. Y. W. Su: SAM*: A Semantic Association Model for Corporate and Scientific-Statistical Databases, Inf. Sci., vol. 29, 1983, pp. 151-199

64. P. Taylor: The Fixed Point Property in Synthetic Domain Theory, in G. Kahn: 6th Symposium on Logic in Computer Science, Amsterdam 1991, 152-160

65. J. Van den Bussche, Dirk Van Gucht: A Hierarchy of Faithful Set Creation in Pure OODBs, in 

J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, 326-340 

66. J. Van den Bussche, Dirk Van Gucht: Semi-determinism, in Proc. PODS '92, ACM Press, 191-201 

67. J. Van den Bussche: Formal Aspects of Object Identity in Database Manipulation, Ph.D. Thesis, University of Antwerp, 1993

68. S. Vickers: Geometric Theories and Databases, in M.P. Fourman, P.T. Johnstone, A.M. Pitts 

(Eds.): Applications of Category Theory in Computer Science, London Mathematical Society 

Lecture Notes Series, Cambridge University Press, 1992, 288-314 

69. S. Vickers: Geometric Logic in Computer Science, in G.L. Burn, S.J. Gray, M.D. Ryan (Eds.): 

Theory and Formal Methods 1993, Springer WiCS, 1993, 37-54 

70. S.B. Zdonik, D. Maier: Readings in Object Oriented Database Systems, Morgan Kaufmann Publishers, 

1990 


Chapter 4 

Higher-Level Genericity in Object 

Oriented Databases 

Contents 

4.1 Introduction

4.2 A Core Object Oriented Database Language

4.2.1 A Simple Type System

4.2.2 Specification of Structure

4.2.3 Database Instances

4.2.4 Specification of Behaviour

4.3 Genericity Beyond Polymorphism

4.3.1 Implicit Schema Extensions

4.3.2 Linguistic Reflection

4.3.3 Reflection Types

4.3.4 Generators for Generic Updates

4.4 Integrity Enforcement

4.4.1 User-Defined Integrity Constraints

4.5 Conclusion


Klaus-Dieter Schewe, David Stemple, Bernhard Thalheim. Higher-Level Genericity in 

Object Oriented Databases. Proc. COMAD 1994. 


Abstract. Object oriented databases (OODBs) are composed of semi-independent objects but must also provide for the maintenance of inter-object consistency, especially with respect to constraints arising from class hierarchies and inter-object references. Hence the problem of providing consistent generic update methods arises.

We address the problem of how to derive such methods from the structure of an OODB schema by the specification of generator macros for them. These generators are based on a strict mathematical formalization of OODB concepts including the possibility to represent syntactic components of the language as values within the language itself, which is known to form the basis of linguistic reflection.

Moreover, the approach can be extended to the enforcement of user-defined integrity constraints that give rise to context sensitive macros turning each user-defined method into branches of its greatest consistent specialization.

Keywords: object oriented databases, genericity, integrity constraints, consistency, linguistic reflection


The relational datamodel (RDM) was the first to support the complete abstraction from physical data organization. This was certainly one of its advantages in comparison to former hierarchical and network models and one of the reasons for its success. Another one is certainly due to the simplicity and elegance of query and update languages. In particular, each RDM schema is accompanied by operations to insert, delete or update a tuple. These operations are generic in the sense that they are applicable to each relation in the schema. To be even more precise, the kind of genericity required here can be obtained by parametric polymorphism, since it is sufficient to know the underlying RECORD-type of a relation.
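The relational case can be sketched with ordinary parametric polymorphism. The Python fragment below is only an illustration under the assumption that a relation is a finite set of tuples of one record type; the names `insert`, `delete` and `update` and the type variable `T` are chosen here and are not taken from the paper.

```python
from typing import Set, TypeVar

T = TypeVar("T")  # stands for the RECORD type of a relation

def insert(relation: Set[T], row: T) -> Set[T]:
    """Return the relation extended by one tuple."""
    return relation | {row}

def delete(relation: Set[T], row: T) -> Set[T]:
    """Return the relation without the given tuple."""
    return relation - {row}

def update(relation: Set[T], old: T, new: T) -> Set[T]:
    """Replace old by new if old is present; otherwise change nothing."""
    if old not in relation:
        return relation
    return insert(delete(relation, old), new)

# The same three functions work for any record type.
persons = insert(set(), ("Ada", 36))
persons = update(persons, ("Ada", 36), ("Ada", 37))
```

Nothing in these definitions depends on the schema, which is exactly why they do not carry over to classes with references and inclusion constraints.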

However, some shortcomings of the RDM encouraged much research aimed at achieving more flexible and efficient datamodels. It has been claimed in [1] that object orientation provides the key technology for future database systems and languages. In order to provide object oriented databases (OODBs) with the same grade of maturity as existing relational systems, it is a reasonable goal to try to preserve the advantages of the RDM herein.

In object oriented programming the notion of an object was intended as a generalization of 

the abstract data type concept with the additional feature of inheritance. In this sense object 

orientation involves the isolation of data in semi-independent modules in order to promote 

high software development productivity. Object oriented databases must regard an object 

also as a basic unit of persistent data, and therefore are composed of independent objects but

must also provide for the maintenance of inter-object consistency, a demand that is to some 

degree in dissonance with the basic style of object orientation. 

Therefore, it is not too surprising that many object oriented database systems do not 

provide generic update methods [12] or these fail to enforce model inherent inclusion and 

referential constraints. Another source of confusion is due to object identifiers [4, 14], a concept used for encoding the identity of objects. Making such identifiers visible to the user as done

in programming languages does not make much sense in databases. However, regarding them 

as a pure implementation concept as in [6] raises the problem, whether generic updates do 

actually exist. 

In fact, generic updates in OODBs are much more complicated than in the RDM due to the fact that identifiers may not occur within input- and output-values and that at least model inherent constraints have to be maintained, which requires context information in order to provide generic update methods. It has been shown in [21, 17] that parametric polymorphism [11] as used in most object oriented languages is insufficient for the genericity problem. Nevertheless, we shall address in this paper the problem of how to derive generic update methods in OODBs automatically and efficiently.

Our solution is based on the formally defined object oriented datamodel (OODM) introduced in [24] with a clear distinction between values and objects as required in [7, 8]. Types correspond to immutable sets of values, classes correspond to mutable collections of objects. In Section 4.2 we briefly describe the basic features of this model.

The main advantage of the used OODM is its theoretical basis [24]. Some competing 

models [13, 16] are closely oriented to a particular object oriented programming language and ignore certain mismatches in coupling these with a database. Others [9, 10] are basically behaviourally extended semantic datamodels. [3, 5] share the OODM property of a datamodel orientation, but still ignore the problems of object identification, genericity and consistency

with respect to model-inherent constraints. 

As shown in [22, 24] generic consistent update methods exist for value-representable classes and only for them. Hence, the construction of such methods depends on additional integrity constraints that are required for value-representability. Moreover, in order to capture cyclic references between objects, an extension to types is required that allows rational tree¹ structures to be defined by type equations. This corresponds to the $\psi$-terms introduced in [2]. Such generic consistent update methods as well as finitely representable, but infinite structures are missing in almost all competing OODB languages.

The efficient construction of generic update methods is based on linguistic reflection as described in [19, 20]. Type-safe linguistic reflection came up with the development of the ADABTPL language which laid a primary interest on the development of correct database

transactions [18]. It turned out that synthesizing common operations in the RDM such asor 

natural join would be helpful, but these are not polymorphically expressible.

The main idea is to provide macro facilities that allow to compute with syntactic representations 

of language constructs in a type-safe fashion within the database language itself. 

In Section 4.3 we describe this approach to generator macros for generic update methods. 

The approach includes the computation of the value-representation types for all classes 

in a given schema, hence genericity in this case exceeds the capability of simple parametric 

polymorphism. The implementation of the acyclic case without propagation is described in 

[27]. The implementation of the general case is currently investigated. 

The approach suggests an immediate extension to integrity enforcement with respect to explicit user-defined constraints, especially those that arise from generalizing corresponding constraints in the RDM such as functional, inclusion and exclusion constraints [26]. Enforcing constraints can be formalized by the computation of greatest consistent specializations (GCSs) of user-defined methods, an approach that occurs naturally in OODBs, since operational specialization is already present when overriding methods.

In [23] an algorithm has been presented that allows GCS construction (under certain 

technical prerequisites) to be reduced to basic operations. Greatest consistent specializations 

of generic update methods for the OODM were presented in [25]. We briefly describe GCS construction in Section 4.4.

¹ A rational tree is a finite or infinite tree with only a finite number of different subtrees.

4.2 A Core Object Oriented Database Language 

In the object-oriented approach we distinguish between objects and values. Values can be grouped into types that may be regarded as an immutable set of values of a uniform structure together with operations defined on them. Subtyping is used to relate values in different types.

The class concept provides the grouping of objects having the same structure which uniformly combines aspects of object values, references and subreferences, but objects can belong to different classes. As for values that are only defined via types, objects can only be defined via classes.

References and subreferences between classes give rise to implicit referential constraints. In addition, subreferences (part-of) define local referential constraints, and subclasses (IsA-relationships) require each database instance to satisfy inclusion constraints on object identifiers. We shall later extend this picture allowing additional constraints to be defined by the user.

As usual in object oriented approaches, methods are used to model the database dynamics. In the OODM these are associated with classes. In addition we shall later add macros with the difference that a macro produces new language expressions from language expressions.

4.2.1 A Simple Type System 

Here we follow the classical view of types in [11] using a type system that consists of some basic types, type constructors and a subtyping relation. Moreover, we assume the existence of recursive types, i.e. types defined by domain equations.

The base types are BOOL, NAT, INT, FLOAT, STRING, ID or $\perp$, where ID is an abstract identifier type without any non-trivial supertype and $\perp$ is the trivial type that is a supertype for every type. The type constructors are $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a_1 : \alpha_1) \cup \dots \cup (a_n : \alpha_n)$ (union). We may use base types and constructors to define new types by nesting.

It is easy to extend such a type system by adding e.g. a function type constructor $\to$. We abstained from this extension here because of the object identification and genericity problems. We shall discuss this problem at the end of the next subsection.

Example 4.1. The type definition for PERSONNAME uses both the set constructor $\{\cdot\}$ and the record constructor $(\cdot)$:

Type PERSONNAME =
( FirstName : STRING , SecondName : STRING , Titles : { STRING } )
End PERSONNAME

The definition of a type PERSON uses the type PERSONNAME.

Type PERSON =
( PersonIdentityNo : NAT , Name : PERSONNAME )
End PERSON


standard operators on base types and on records, sets, bags, ... We omit the details here. A type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff there is no occurrence of ID in $t$. If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence relation $o : t \times t' \to BOOL$.
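The value-type condition is a simple structural check on type expressions. The sketch below assumes a small type representation with base types, records and finite sets only; the classes `Base`, `Record`, `SetOf` and the function `is_value_type` are illustrative names introduced here, not part of the OODM language.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Base:
    name: str  # e.g. "NAT", "STRING" or "ID"

@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, "Type"], ...]  # (label, type) pairs

@dataclass(frozen=True)
class SetOf:
    elem: "Type"

Type = Union[Base, Record, SetOf]

def is_value_type(t: Type) -> bool:
    """True iff the base type ID does not occur anywhere in t."""
    if isinstance(t, Base):
        return t.name != "ID"
    if isinstance(t, SetOf):
        return is_value_type(t.elem)
    return all(is_value_type(ft) for _, ft in t.fields)

STRING, NAT, ID = Base("STRING"), Base("NAT"), Base("ID")
PERSONNAME = Record((
    ("FirstName", STRING),
    ("SecondName", STRING),
    ("Titles", SetOf(STRING)),
))
PERSON = Record((("PersonIdentityNo", NAT), ("Name", PERSONNAME)))
# Hypothetical type that contains an identifier and is therefore no value type.
WITH_ID = Record((("DeptName", STRING), ("Head", ID)))
```

Lists, bags and unions would be handled by further cases of the same recursion.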


A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by the usual subtype relation [11].

4.2.2 Specification of Structure

Each object in a class consists of an identifier, a collection of values and references / subreferences to other objects. Identifiers can be represented using the unique identifier type ID. Values and (sub)references can be combined in a representation type, where each occurrence of ID denotes references to some other classes. Therefore, we may define the structure of a class using parameterized types. Moreover, classes are arranged in IsA-hierarchies.

A structural class consists of a class name $C$, a set of class names $D_1, \dots, D_m$ (in the following called superclasses) and a value type expression $S$ with all parameters replaced either by a reference ref $r_i : C_i$ or by a subreference part $r_i : C_i$ with (sub)reference names $r_i$ and class names $C_i$.

If $r_i$ occurs within ref $r_i : C_i$ in the structure expression of a class $C$, we call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. If $r_i$ occurs within part $r_i : C_i$ in the structure expression of a class $C$, we call $r_i$ the subreference named $r_i$ from class $C$ to class $C_i$.

The type derived from the structure expression $S$ of a class by replacing each reference ref $r_i : C_i$ and each subreference part $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$; the type $U_C = (ident : ID, value : T_C)$ is called the class type of $C$.

Example 4.2. 

Let us now describe some structural classes in a simple university example. 

Class PersonC

End PersonC

Class ProfessorC

IsA PersonC

Structure ( PersonIdentityNo : NAT , Age : NAT ,
Salary : NAT , ref Faculty : DepartmentC )

End ProfessorC

Class DepartmentC

Structure ( DeptName : STRING ,
ref Head : ProfessorC ,
Phones : { NAT } )

End DepartmentC
□
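The step from a structure expression to the representation type $T_C$ and the class type $U_C$ can be sketched as a small recursive rewrite. The encoding below is an assumption made for illustration: records are Python dicts, base types are strings, `("set", t)` is a finite set type and `("ref", C)` or `("part", C)` marks a (sub)reference to class `C`; the function names are likewise hypothetical.

```python
def representation_type(structure):
    """Replace every reference or subreference by the identifier type ID."""
    if isinstance(structure, dict):  # record type
        return {a: representation_type(t) for a, t in structure.items()}
    if isinstance(structure, tuple) and structure[0] in ("ref", "part"):
        return "ID"
    if isinstance(structure, tuple) and structure[0] == "set":
        return ("set", representation_type(structure[1]))
    return structure  # base type name such as "NAT"

def class_type(structure):
    """Class type: an identifier paired with a value of the representation type."""
    return {"ident": "ID", "value": representation_type(structure)}

# Structure expression of DepartmentC from Example 4.2 in this encoding.
department_c = {
    "DeptName": "STRING",
    "Head": ("ref", "ProfessorC"),
    "Phones": ("set", "NAT"),
}

t_department = representation_type(department_c)
u_department = class_type(department_c)
```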

4.2.3 Database Instances 

A (structural) schema $\mathcal{S}$ is a finite collection of structural classes $C_1, \dots, C_n$ closed under references and superclasses. In order to define the semantics of structural schemata, we need the notion of a database instance.

An instance $\mathcal{D}$ of a structural schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{U_C\}$







Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have


referential integrity: For each reference or subreference from $C$ to $C'$ with corresponding occurrence relation $o_r$ we have


local referential integrity: For each subreference $r$ from $C$ to a class $C'$ with corresponding occurrence relation $o_r$ we have

$$\forall i_1, i_2, j :: ID.\ \forall v_1, v_2 :: T_C.\ (i_1, v_1) \in \mathcal{D}(C) \wedge (i_2, v_2) \in \mathcal{D}(C) \wedge j \in dom(\mathcal{D}(C')) \wedge o_r(v_1, j) \wedge o_r(v_2, j) \Rightarrow i_1 = i_2 \wedge v_1 = v_2 . \tag{4.28}$$

We know from [22] that schema-defined generic update operations only exist for value-representable classes. In turn, value-representability is implied by imposing a trivial uniqueness constraint on each class. Therefore, in order to guarantee the existence of generic update methods we also assume for each class $C$ the following condition:

value-identifiability:

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (j, v) \in \mathcal{D}(C) \Rightarrow i = j . \tag{4.29}$$

If we do not have function types, then for each database instance it is decidable whether the value-identifiability condition holds. If functions come into play, this is no longer true, since we then have to check the equality of functions. Introducing function types therefore requires a more sophisticated treatment of value-identifiability in the sense that we have to require a decidable uniqueness constraint. For the reflective generation of generic update operations, however, we need to know that they exist, not the reason why they exist. In order not to overload the presentation, we therefore decided to keep the type system as simple as possible.
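Without function types both conditions (4.28) and (4.29) can be checked by plain enumeration over a finite instance. The following Python sketch assumes that a class is given as a finite set of (identifier, value) pairs with hashable values and that the occurrence relation is passed in as a predicate; all function and variable names are hypothetical.

```python
def value_identifiable(objects):
    """Condition (4.29): no two distinct identifiers carry the same value."""
    seen = {}
    for ident, value in objects:
        if seen.setdefault(value, ident) != ident:
            return False
    return True

def locally_referentially_intact(objects, target_ids, occurs):
    """Condition (4.28): each target identifier j is referenced by at most
    one object (i, v) of the class via the given subreference."""
    owner = {}
    for ident, value in objects:
        for j in target_ids:
            if occurs(value, j):
                if owner.setdefault(j, (ident, value)) != (ident, value):
                    return False
    return True

# Hypothetical instance: cars own engines through a subreference
# stored in the second component of the value.
cars = {(10, ("Golf", 100)), (11, ("Polo", 101))}
engine_ids = {100, 101}
has_engine = lambda value, j: value[1] == j

ok_values = value_identifiable(cars)
ok_parts = locally_referentially_intact(cars, engine_ids, has_engine)
```

Both checks rely on equality of values, which is the step that becomes undecidable once values may contain functions.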

4.2.4 Specification of Behaviour

So far, only static aspects have been considered. A structural schema is simply a collection of 

data structures called classes. Let us now turn to adding dynamics to this picture. As required 

in the object oriented approach operations will be associated with classes. This gives us the 

notion of a method. 

We shall distinguish between visible and hidden methods to emphasize those methods that can be invoked by the user and others. Each method on a structural class $C$ consists of a signature and a body. The signature consists of a method name and sets of parameter/type pairs for input and output. The body is defined by the usual constructs of a procedural programming language. A method $M$ on a class $C$ is called value-defined iff all types occurring in its signature are proper value types.

Example 4.3. 

Let us describe an insert-method for the class PersonC. 

Method insert'_PersonC

( in : P :: PERSON , out : I :: ID ) =

IF ∃ O ∈ PersonC . value(O) = P


ELSE I := NewId

PersonC := PersonC ∪ { ( I , P ) }

ENDIF

We used the global method NewId to denote the selection of a new identifier. Note that this method is not value-defined, but we could simply drop the output to receive a value-defined method.

□

As already mentioned we distinguish between methods visible to the user and hidden methods. We require each visible method to be value-defined. In particular, we use the value-defined generic update methods $insert_C$, $delete_C$ and $update_C$ for each class $C$; these exist, since we require value-representability [22]. Moreover, we use the quasi-generic update methods $insert'_C$, $delete'_C$ and $update'_C$ for each class $C$ that are used to define generic updates. The only difference is that the generic updates suppress the output of type ID. The method in Example 4.3 is quasi-generic.
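The relation between a quasi-generic and a generic insert can be sketched as follows. This is an illustration only: a class is modelled as a Python set of (identifier, value) pairs, `itertools.count` stands in for NewId, and the branch taken when the value already exists (returning the existing identifier) is an assumption made here.

```python
import itertools

_fresh = itertools.count(1)  # stand-in for the global method NewId

def insert_prime(cls, value):
    """Quasi-generic insert: returns the identifier of the object with this value."""
    for ident, v in cls:
        if v == value:
            return ident  # assumed behaviour: reuse the existing identifier
    ident = next(_fresh)
    cls.add((ident, value))
    return ident

def insert(cls, value):
    """Generic insert: same effect, but the identifier is not exposed."""
    insert_prime(cls, value)

person_c = set()
first = insert_prime(person_c, ("Ada", 36))
second = insert_prime(person_c, ("Ada", 36))  # same value, no new object
```

Only `insert` is value-defined in the sense of the text, since its signature mentions no identifier.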

Subclasses inherit the methods of their superclasses, but overriding is allowed as long as 

the new method is a specialization of all its corresponding methods in its superclasses. 

A (behavioural) schema $\mathcal{S}$ is a finite collection of behavioural classes $\{C_1, \dots, C_n\}$ closed under references, superclasses and method calls, where behavioural class just means a structural class together with methods on it.

4.3 Genericity Beyond Polymorphism 

Our goal is to provide generic update methods $insert_C$, $delete_C$ and $update_C$ for each class $C$ of a database schema. These update methods are "generic" in the sense that they are applicable to each class of a schema. These methods demand the identification of objects without accessing the object identifier, since oids are an internal concept and do not have a meaning for the user of a database. Hence the need for value-representability.

Besides this identification problem we also have to cope with the enforcement of implicit integrity constraints. In [22] it has been shown that value-representability is a necessary and sufficient condition for the existence of consistent generic update methods.

4.3.1 Implicit Schema Extensions 

Since we assume the existence of a trivial uniqueness constraint for each class, generic and quasi-generic update methods always exist. Let us first illustrate this by an example.

Example 4.4. Consider again the schema of Example 4.2. The value-representation types for the classes ProfessorC and DepartmentC are

Type V_Prof =
    ( PersonIdentityNo : NAT , Age : NAT , Salary : NAT ,
      Faculty : ( DeptName : STRING , Head : V_Prof , Phones : { NAT } ) )
End V_Prof

Type V_Dept =
    ( DeptName : STRING , Head : V_Prof , Phones : { NAT } )
End V_Dept

Selecting the component Faculty(V) for a value V :: V_Prof gives the required value of type V_Dept. However, we have to choose a new identifier for a new object in ProfessorC, and due to the cycle in this schema this identifier also occurs in the value of some new object in DepartmentC. Hence we need the more complex type

Type VProf =
    ( PersonIdentityNo : NAT , Age : NAT , Salary : NAT ,
      Faculty : ( DeptName : STRING ,
                  Head : ( value : VProf ) ∪ ( ident : ID ) , Phones : { NAT } ) )
End VProf

 

Note that this is a supertype of V_Prof. Let the corresponding subtype function be f_Prof. Neglecting IsA-relations for the moment, the quasi-generic insert on the class ProfessorC is given by

Method insert'_Prof
    ( in : V :: VProf , out : I :: ID ) =
    IF ∃ O ∈ ProfessorC . f_Prof(value(O)) = V

    ELSE I := NewId
         IF ∃ J :: ID . Head(Faculty(V)) = ( ident : J )
         THEN ProfessorC := ProfessorC ∪ { (I, V) }
         ELSE Let V' :: VDept . V' := Faculty(V)
              Let K :: ID . K := insert'_Dept(V')
              Let V'' :: VProf . V'' := ( PersonIdentityNo(V), Age(V), Salary(V), K )
              ProfessorC := ProfessorC ∪ { (I, V'') }
         ENDIF
    ENDIF

Our aim now is to generate (quasi-)generic update methods from the structural schema and to add them to the corresponding classes, i.e. to implicitly change the behavioural schema. A natural first idea is to exploit polymorphism as in [11] for this task. However, generic consistent updates on a class C have to be value-defined, hence require an input-type V_C without any occurrence of ID. Such an input-type has to be computed from the schema, and hence the generation requires meta-information. It has been shown in [17] that the need for meta-information exceeds the capability of polymorphism. The alternative is to use linguistic reflection as proposed in [20].

□

4.3.2 Linguistic Reflection

The basic idea of linguistic reflection is to use reflection types such as SCHEMA_rep, CLASS_rep, TYPE_rep, METHOD_rep, COMMAND_rep, etc. for the representation of abstract syntax expressions representing schemata, classes, types, methods, commands (method bodies), etc. respectively. For each of these there exists a function raise associating with the syntactic expression a true schema, class, type, etc. respectively.


The types used for the representation of language constructs such as types, classes, constraints, methods, commands and schemata form the basis of linguistic reflection and will therefore be called reflection types².

Moreover, we need a macro value-rep with signature

    SCHEMA_rep × CLASS_rep → TYPE_rep

where value-rep(S, C) represents a value type needed for the unique identification (and representation) of some object, hence is needed for the generic insert-, delete- and update-methods on raise(C).

Such macros provide a more general way to specify database behaviour. They can be understood as transformations of language expressions. The main difference to macros in traditional programming languages, e.g. LISP, is that the expressions are abstract syntax expressions represented in additional predefined types. Hence macros are also strongly typed.

The core of the problem then is to define three macros with signatures

    insert : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep ,
    delete : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep and
    update : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep .

Clearly, there are also other macros used by these main macros, and there is also one macro generic with signature SCHEMA_rep → SCHEMA_rep that transforms a whole user-defined schema into an internal schema with generic update methods added to all classes.

4.3.3 Reflection Types

Let us now briefly indicate some of the reflection types that are needed to construct generic update methods. We follow the presentation in Section 4.2, starting with TYPE_rep. In general, each type was given by a type-name and a defining type-expression, hence

Type TYPE_rep =
    ( name : NAME_rep , type-exp : TYPE_EXP_rep )
End TYPE_rep

Type expressions are given by the base types and type constructors defined in Section 4.2.1, which leads to the following recursive definition

Type TYPE_EXP_rep =
    ( BoolT : ⊥ ) ∪ ... ∪ ( SetT : ( element-type : TYPE_FORM_rep ) ) ∪
    ( RecordT : [ ( tag : NAME_rep , field : TYPE_FORM_rep ) ] )
End TYPE_EXP_rep

with values of (reflection) type TYPE_FORM_rep being either type expressions (of type TYPE_EXP_rep) or simply type names.

Type TYPE_FORM_rep =
    ( type-name : NAME_rep ) ∪ ( type-exp : TYPE_EXP_rep )
End TYPE_FORM_rep

Next, let us describe CLASS_rep, which can be built analogously. In particular there is a close resemblance between TYPE_EXP_rep and STRUCTURE_rep.

² In the original work on linguistic reflection [20] the notion representation type was used instead of reflection type. Here we changed this notation in order to avoid confusion with representation types of classes.

Type CLASS_rep =
    ( name : NAME_rep , isa : { NAME_rep } ,
      structure : STRUCTURE_rep , methods : { METHOD_rep } )
End CLASS_rep

The difference between the representation of type expressions and the one for structure expressions is that the latter may contain references and subreferences, indicated by the use of TYPE_REF_FORM_rep instead of TYPE_FORM_rep.

Type STRUCTURE_rep =
    ( BoolT : ⊥ ) ∪ ... ∪ ( SetS : ( element-type : TYPE_REF_FORM_rep ) ) ∪
    ( RecordS : [ ( tag : NAME_rep , field : TYPE_REF_FORM_rep ) ] )
End STRUCTURE_rep

Then the extension of TYPE_REF_FORM_rep with respect to TYPE_FORM_rep simply consists in adding reference expressions.

Type TYPE_REF_FORM_rep =
    ( type-name : NAME_rep ) ∪ ( type-exp : TYPE_EXP_rep ) ∪
    ( ref-exp : ( ref-kind : ( REF : ⊥ ) ∪ ( PART : ⊥ ) ,
                  reference : NAME_rep , class : NAME_rep ) )
End TYPE_REF_FORM_rep

Finally, let us indicate the definition of the reflection type METHOD_rep, without going into too much detail. We omit the definitions for COMMAND_rep and EXPR_rep.

Type METHOD_rep =
    ( name : NAME_rep ,
      in-list : [ ( parameter : NAME_rep , type : TYPE_FORM_rep ) ] ,
      out-list : [ ( parameter : NAME_rep , type : TYPE_FORM_rep ) ] ,
      body : COMMAND_rep )
End METHOD_rep

Further details on the reflection types of the OODM are omitted here.
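To make the idea of reflection types more tangible, the following Python sketch encodes a small part of TYPE_rep and TYPE_EXP_rep as ordinary data. It is only an illustration under simplifying assumptions: base types are reduced to a name, unions are modelled by separate classes, and the function render (a hypothetical helper that is not the function raise of the text) merely turns a representation back into concrete syntax.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class BaseT:
    """A base type such as BOOL, NAT or STRING (simplified to its name)."""
    name: str


@dataclass(frozen=True)
class TypeName:
    """Corresponds to the alternative ( type-name : NAME_rep )."""
    name: str


@dataclass(frozen=True)
class SetT:
    """Corresponds to ( SetT : ( element-type : TYPE_FORM_rep ) )."""
    element_type: "TypeForm"


@dataclass(frozen=True)
class RecordT:
    """Corresponds to ( RecordT : [ ( tag , field ) ] )."""
    fields: Tuple[Tuple[str, "TypeForm"], ...]


TypeForm = Union[BaseT, TypeName, SetT, RecordT]


@dataclass(frozen=True)
class TypeRep:
    """Corresponds to TYPE_rep = ( name , type-exp )."""
    name: str
    type_exp: TypeForm


def render(t: TypeForm) -> str:
    """Turn a represented type expression back into concrete syntax."""
    if isinstance(t, (BaseT, TypeName)):
        return t.name
    if isinstance(t, SetT):
        return "{ " + render(t.element_type) + " }"
    if isinstance(t, RecordT):
        inner = " , ".join(f"{tag} : {render(f)}" for tag, f in t.fields)
        return "( " + inner + " )"
    raise TypeError(f"unknown type form: {t!r}")
```

Because such representations are plain values, macros over them are ordinary, typed functions from values to values.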

4.3.4 Generators for Generic Updates 

In order to build generator macros for generic update methods we follow the constructive proof of their existence in [22]. We then have to cope with the following problems.

(i) We have to provide value types for the input. This will be achieved by the macro value-rep already mentioned above.

(ii) Generic update methods are value-defined, but nevertheless have to cope with identifiers. This seeming contradiction can be resolved by constructing canonical update methods with ID as output-type. Clearly, these give hidden methods in the internal schema. The corresponding generic update method then just consists of a call to the canonical one and simply neglects its output.

(iii) Inclusion integrity has to be enforced. Therefore, each insert propagates through superclasses, whereas deletions propagate through subclasses and updates do both. For the macros insert, delete and update this means that we have to build methods ignoring all IsA-relations and then arrange these in a sequence.

(iv) The most difficult task is the enforcement of referential integrity, especially in the case of cycles as e.g. in Example 4.2. We have to propagate in both directions along these references, but for cycles we also have to choose new identifiers before starting this propagation.

At first glance it seems that our approach starts with the most complicated case, where all operations are propagated along references, whereas it would be much simpler for an insert-operation to require referenced objects already to exist and to disallow delete-operations as long as there still exist referencing objects. There are two reasons not to follow this simpler approach:

(i) Our approach only takes care of implicit, structurally defined constraints, whereas the simpler scenario arises when certain additional user-defined transition constraints are added. We briefly discuss the general handling of any kind of user-defined constraints on a solid theoretical ground in Section 4.4 (see also [23, 25]).

(ii) These additional constraints may discard the generic operations, in particular in the case of cycles in the schema. In these cases it is desirable to make the database designer aware of the consequences of adding certain constraints, whereas s/he only has the chance to completely change the schema (omitting all cycles) in the case where additional constraints are tacitly assumed.

An implementation of the simplified scenario on the basis of a strongly typed persistent programming language is reported in the Ph.D. thesis [27].

Let us now partly indicate the solution to these problems by generator macros for the canonical update methods.

The macro rep-type computes for a given class its representation type:

Macro rep-type ( in : C :: CLASS_rep , out : TC :: TYPE_EXP_rep ) =
    call rep-type-struct ( in : structure(C) , out : TC )

The called macro rep-type-struct involves several cases depending on the value E, which contains the representation of a structure expression. This stems from a type constructor, which leads to the case distinctions.

Macro rep-type-struct ( in : E :: STRUCTURE_rep ,
                        out : E' :: TYPE_EXP_rep ) =
    CASE E ...
      E = ( RecordS : [ ( tag : N , field : S ) | L ] ) →
          Let S' :: TYPE_REF_FORM_rep
          IF S = ( ref-exp : R )
          THEN S' := ( type-name : ID )
          ELSE S' := S
          ENDIF
          Let L' :: TYPE_EXP_rep .
          call rep-type-struct ( in : L , out : L' )
          E' := ( RecordT : [ ( tag : N , field : S' ) | L' ] )
    ENDCASE
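The record case of rep-type-struct can be mimicked by a short recursive function. The Python sketch below is an illustration only. It assumes that a record structure is a list of (tag, field) pairs, that a value type is given simply by a string, and that a reference is a tuple ("ref", reference_name, class_name); these encodings and the function name rep_type_struct are not taken from the OODM.

```python
def rep_type_struct(fields):
    """Compute the representation type of a record structure.

    `fields` is a list of (tag, field) pairs. A field is either a type,
    encoded here as a string, or a reference encoded as the tuple
    ("ref", reference_name, class_name). References are replaced by ID,
    all other components are kept unchanged.
    """
    if not fields:
        return []
    (tag, field), rest = fields[0], fields[1:]
    if isinstance(field, tuple) and field[0] == "ref":
        new_field = "ID"      # a reference is stored as an identifier
    else:
        new_field = field     # value components stay as they are
    return [(tag, new_field)] + rep_type_struct(rest)
```

As in the macro, the head of the field list is translated first and the remaining fields are handled by the recursive call.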

The macro value-rep is slightly more complicated, since it propagates through the whole schema.

Macro value-rep ( in : C :: CLASS_rep , S :: SCHEMA_rep ,
                  out : VC :: TYPE_EXP_rep ) =
    call vrep-type-struct ( in : structure(C) , S , [ name(C) ] , out : VC )

Macro vrep-type-struct
    ( in : E :: STRUCTURE_rep , S :: SCHEMA_rep , K :: [ NAME_rep ] ,
      out : E' :: TYPE_EXP_rep ) =
    CASE E ...
      E = ( RecordS : [ ( tag : N , field : F ) | L ] ) →
          Let F' :: TYPE_REF_FORM_rep
          IF F = ( ref-exp : ( ref-kind : x , reference : R , class : N' ) )
          THEN
              IF N' ∉ K
              THEN Let C' ∈ S . name(C') = N'
                   Let VC' :: TYPE_EXP_rep .
                   call vrep-type-struct ( in : structure(C') , S , [ N' | K ] , out : VC' )
                   F' := ( type-exp : VC' )
              ELSE F' := ( type-name : ( vrep : N' ) )
              ENDIF
          ELSE F' := F
          ENDIF
          Let L' :: TYPE_EXP_rep .
          call vrep-type-struct ( in : L , S , K , out : L' )
          E' := ( RecordT : [ ( tag : N , field : F' ) | L' ] )
    ENDCASE
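The way value-rep unfolds references and stops at cycles may be easier to follow in executable form. The following Python sketch covers the record case only and rests on assumptions: a schema is a dictionary from class names to lists of (tag, field) pairs, references are tuples ("ref", reference_name, class_name), and the example schema is only loosely modelled on the professor/department cycle of Example 4.4. The loop replaces the head/tail recursion of the macro.

```python
def vrep_type_struct(fields, schema, visited):
    """Unfold references into value types, stopping at visited classes.

    `visited` plays the role of the list K in the macro: it holds the
    names of all classes on the current path through the schema.
    """
    result = []
    for tag, field in fields:
        if isinstance(field, tuple) and field[0] == "ref":
            target = field[2]
            if target not in visited:
                # First visit: expand the structure of the referenced class.
                new_field = vrep_type_struct(
                    schema[target], schema, [target] + visited
                )
            else:
                # Cycle: refer to the value-representation type by name.
                new_field = ("vrep", target)
        else:
            new_field = field
        result.append((tag, new_field))
    return result


def value_rep(class_name, schema):
    """Value-representation type of a class (record structures only)."""
    return vrep_type_struct(schema[class_name], schema, [class_name])


# Hypothetical schema with a reference cycle between two classes.
SCHEMA = {
    "ProfessorC": [
        ("PersonIdentityNo", "NAT"),
        ("Age", "NAT"),
        ("Salary", "NAT"),
        ("Faculty", ("ref", "Faculty", "DepartmentC")),
    ],
    "DepartmentC": [
        ("DeptName", "STRING"),
        ("Head", ("ref", "Head", "ProfessorC")),
        ("Phones", "{NAT}"),
    ],
}
```

For ProfessorC the result has the shape of V_Prof: the Faculty component is expanded once, and its Head component refers back to the value-representation type of ProfessorC by name.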

This solves the first problem above. The trivial solution to the second problem has already been given. Now concentrate on the enforcement of IsA-constraints. The following macro insert-seq computes the body of the required canonical insert. It uses the macro class-description to get the definition of a class in a schema from the class name and the macro insert-ref to compute the core of the method body enforcing referential integrity.

Macro insert-seq
    ( in : C :: CLASS_rep , S :: SCHEMA_rep , out : P :: COMMAND_rep ) =
    Let P1, P2 :: COMMAND_rep .
    call command-list ( in : isa(C) , S , out : P1 )
    call insert-ref ( in : C , S , out : P2 )
    P := sequence(P1, P2)

Macro command-list
    ( in : NL :: { NAME_rep } , S :: SCHEMA_rep ,
      out : P :: COMMAND_rep ) =
    CASE NL
      NL = ∅ → P := 'skip'
      NL = { N } ∪ L ∧ N ∉ L →
          Let D :: CLASS_rep .
          call class-description ( in : N , S , out : D )
          Let P1, P2 :: COMMAND_rep .
          call insert-ref ( in : D , S , out : P1 )
          call command-list ( in : L , S , out : P2 )
          P := sequence(P1, P2)
    ENDCASE
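The sequencing performed by insert-seq and command-list can be sketched as follows. The Python code is an illustration under assumptions: a schema is a dictionary from class names to class descriptions with the keys "name" and "isa", commands are represented by plain strings, insert_ref is a placeholder that only names the command it would generate, and the superclasses are processed in sorted order to obtain a deterministic result.

```python
def class_description(name, schema):
    """Look up the description of a class by its name."""
    return schema[name]


def insert_ref(desc, schema):
    """Placeholder for insert-ref: returns a label instead of a method body."""
    return f"insert'_{desc['name']}"


def command_list(names, schema):
    """Commands for a set of class names, one insert per class."""
    commands = []
    for name in sorted(names):
        desc = class_description(name, schema)
        commands.append(insert_ref(desc, schema))
    return commands


def insert_seq(desc, schema):
    """Body of the canonical insert: superclass inserts first, then the class."""
    return command_list(desc["isa"], schema) + [insert_ref(desc, schema)]


# Hypothetical schema with a single IsA-relation.
SCHEMA = {
    "PersonC": {"name": "PersonC", "isa": set()},
    "ProfessorC": {"name": "ProfessorC", "isa": {"PersonC"}},
}
```

Like the macros shown above, the sketch only visits the classes listed in isa(C); an empty list of commands corresponds to skip.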

To solve the third problem we also have to construct the input- and output-lists of the methods. This is straightforward.

Finally, to solve the fourth problem let us briefly consider how to enforce referential integrity within insertions. Again, we have to build an input-type that extends the value-representation type by adding ID. The corresponding macros are completely analogous to value-rep and vrep-type-struct. Then the definition of insert-ref follows the spirit of Example 4.4 and the macros shown above. We omit further details.

4.4 Integrity Enforcement 

The reflective approach in the preceding section allows us to cope with the implicit integrity constraints and suggests an immediate generalization to arbitrary user-defined constraints. The problem then is to guarantee the consistency of a specified method with respect to such constraints.

4.4.1 User-Defined Integrity Constraints

Let us first extend the notion of schema S by the introduction of explicit user-defined integrity constraints, which are formulae I over the underlying type system with free variables $fr(I) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$, where each $x_{C_i}$ is a variable of type $\{U_{C_i}\}$. We call $x_{C_i}$ the class variable of $C_i$.

A constrained schema consists of a behavioural schema S and a finite set of integrity constraints on S. An instance D is said to be consistent with respect to the integrity constraint I iff substituting D(C) for each class variable $x_C$ in I evaluates to true, when interpreted in the usual way.

We use abbreviations for distinguished classes of constraints. For this let $C, C_1, C_2$ be classes and let $c_i : T_C \to T_i$ $(i = 1, 2, 3)$ and $c_i : T_{C_i} \to T$ $(i = 1, 2)$ be subtype functions.

A functional constraint on C is a constraint $C.c_1 \to C.c_2$ which abbreviates

$$\forall i, i' :: ID .\ \forall v, v' :: T_C .\ c_1(v) = c_1(v') \wedge (i, v), (i', v') \in x_C \Rightarrow c_2(v) = c_2(v') . \tag{4.30}$$

A uniqueness constraint on C is a constraint UNIQUE($c_1$) or $C.c_1 \to C.ident$ which abbreviates

$$\forall i, i' :: ID .\ \forall v, v' :: T_C .\ c_1(v) = c_1(v') \wedge (i, v), (i', v') \in x_C \Rightarrow i = i' . \tag{4.31}$$

An inclusion constraint on $C_1$ and $C_2$ is a constraint $C_1.c_1 \subseteq C_2.c_2$ which abbreviates

$$\forall t :: T .\ \bigl( \exists (i_1, v_1) \in x_{C_1} .\ c_1(v_1) = t \bigr) \Rightarrow \bigl( \exists (i_2, v_2) \in x_{C_2} .\ c_2(v_2) = t \bigr) . \tag{4.32}$$

An exclusion constraint on $C_1$, $C_2$ is a constraint $C_1.c_1 \parallel C_2.c_2$ which abbreviates

$$\forall i_1, i_2 :: ID .\ \forall v_1 :: T_{C_1} .\ \forall v_2 :: T_{C_2} .\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \neq c_2(v_2) . \tag{4.33}$$

Assume $c_1 \times c_2 \times c_3$ defines a uniqueness constraint on C. Then an object generating constraint on C is a constraint $C.c_1 \twoheadrightarrow C.c_2$ which abbreviates

$$\forall i_1, i_2 :: ID .\ \forall v_1, v_2 :: T_C .\ (i_1, v_1) \in x_C \wedge (i_2, v_2) \in x_C \wedge c_1(v_1) = c_1(v_2) \Rightarrow \exists (i, v) \in x_C .\ c_1(v) = c_1(v_1) \wedge c_2(v) = c_2(v_1) \wedge c_3(v) = c_3(v_2) . \tag{4.34}$$

These constraint notations can easily be extended to path constraints using the usual dot-notation. In addition we may use the notation $- \to r_i$ for a (sub)reference $r_i$. The difference to $- . r_i$ is that the latter refers to a value of type ID, whereas the former corresponds to the referenced object in class $C_i$, or equivalently to a value of type $U_{C_i}$. Paths can be abbreviated if this does not lead to confusion; in particular the selector value is usually omitted.
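On finite instances the abbreviations (4.30) to (4.33) can be checked directly. The following Python sketch is meant as an illustration, not as part of the formal model. It assumes that a class extent is a set of (identifier, value) pairs and that the subtype functions are ordinary Python functions with hashable results; all function names are hypothetical.

```python
def functional(extent, c1, c2):
    """C.c1 -> C.c2: equal c1-values imply equal c2-values, cf. (4.30)."""
    return all(
        c2(v) == c2(w)
        for _, v in extent
        for _, w in extent
        if c1(v) == c1(w)
    )


def unique(extent, c1):
    """UNIQUE(c1): equal c1-values imply equal identifiers, cf. (4.31)."""
    return all(
        i == j
        for i, v in extent
        for j, w in extent
        if c1(v) == c1(w)
    )


def inclusion(extent1, c1, extent2, c2):
    """Every c1-value in the first class occurs as c2-value in the second, cf. (4.32)."""
    return {c1(v) for _, v in extent1} <= {c2(v) for _, v in extent2}


def exclusion(extent1, c1, extent2, c2):
    """No c1-value of the first class is a c2-value of the second, cf. (4.33)."""
    return {c1(v) for _, v in extent1}.isdisjoint({c2(v) for _, v in extent2})
```

The inclusion and exclusion checks compare the images of the two subtype functions, which is what the quantified formulae express on a fixed instance.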

Example 4.5. Let us assume that the salary of a professor is determined by his/her age. For this purpose, let $Age, Salary : T_{Prof} \to NAT$ be the natural projections to the Age- and Salary-values respectively. Then we have the following functional constraint on the class ProfessorC:

Constraint ProfessorC.Age → ProfessorC.Salary, which abbreviates

$$\forall i, j :: ID .\ \forall v, w :: T_{Prof} .\ (i, v) \in x_{Prof} \wedge (j, w) \in x_{Prof} \wedge Age(v) = Age(w) \Rightarrow Salary(v) = Salary(w) .$$

As a second example take DepartmentC→Head.Faculty = DepartmentC.ident, which states that the head of a department also works in that department.

□


The problem of integrity enforcement can be formalized by greatest consistent specializations (GCSs). Given a method M and an integrity constraint I, the GCS $M_I$ satisfies

- $M_I$ is consistent with respect to I,
- $M_I$ specializes M and
- each consistent specialization of M also specializes $M_I$.

It has been shown in [23] that GCSs always exist. Moreover, if we consider more than one constraint, i.e. a conjunction of constraints, GCSs can be built successively and do not depend on the order of the constraints. It has also been shown how to compute GCS branches under some technical prerequisites. The restriction to GCS branches is due to practicality. We omit the algorithm, since it involves calculations with predicate transformers that can hardly be explained in a few lines. Instead, let us look at a simple example.

Example 4.6. Let us consider the insert-method on the class ProfessorC from Example 4.2 and the functional constraint in Example 4.5. The method insert'_Prof (see Example 4.4) has to be replaced by the following one [25].

Method insert'_Prof
    ( in : V :: VProf , out : I :: ID ) =
    IF ∃ O ∈ ProfessorC . f_Prof(value(O)) = V

    ELSE
        IF ∃ O ∈ ProfessorC . Age(value(O)) = Age(V) ∧ Salary(value(O)) ≠ Salary(V)
        THEN skip
        ELSE I := NewId
             IF ∃ J :: ID . Head(Faculty(V)) = ( ident : J )
             THEN ProfessorC := ProfessorC ∪ { (I, V) }
             ELSE Let V' :: VDept . V' := Faculty(V)
                  Let K :: ID . K := insert'_Dept(V')
                  Let V'' :: VProf .
                  V'' := ( PersonIdentityNo(V), Age(V), Salary(V), K )
                  ProfessorC := ProfessorC ∪ { (I, V'') }
             ENDIF
        ENDIF
    ENDIF

□
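The effect of the additional guard in Example 4.6 can be imitated on a simplified data model. The Python sketch below is only an illustration: it assumes that the class is a dictionary from integer identifiers to triples (person number, age, salary), it ignores the reference to the department, and it assumes that an already present value yields its existing identifier. The function name insert_prof is hypothetical.

```python
def insert_prof(extent, value):
    """Insert a professor unless Age -> Salary would be violated.

    `extent` maps identifiers to (person_no, age, salary) triples.
    Returns the identifier of the object, or None if the insert is
    replaced by skip because the functional constraint would break.
    """
    # Assumption: an existing object with the same value is reused.
    for ident, stored in extent.items():
        if stored == value:
            return ident
    _, age, salary = value
    # Guard: some professor of the same age earns a different salary.
    if any(a == age and s != salary for _, a, s in extent.values()):
        return None
    ident = max(extent, default=0) + 1   # stand-in for NewId
    extent[ident] = value
    return ident
```

The guard only restricts the original insert: whenever the insert is carried out, it performs the same state change as before.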

From this example it can be seen how to provide the generators for GCS construction. Indeed, we must have for each constraint a generator for the GCSs of generic update methods and, in addition, a generator for the precondition required in the GCS algorithm. We omit further details [23].


Object oriented databases differ from relational ones in that richer structures and implicit constraints, especially inclusion constraints (IsA) and referential constraints, are provided. This forbids a simple approach to genericity. Indeed, each generic method must enforce at least these implicit constraints. Consequently we must also be able to derive the necessary input types for these operations from a given schema.

Here we have solved this genericity problem using a reflective approach, i.e. the generators themselves can be represented in an object oriented database language using strongly typed macros. This form of linguistic reflection exceeds the limits of polymorphism. Reflection is based on the possibility to represent syntactic components of the language such as types, classes, methods, etc. as values within the language itself and to compute new schemata from these representations. This gives a practical solution to the genericity problem, whilst its theoretical justification was proven in [22, 24]. A partial implementation of the approach has been described in [27].

We also sketched how to extend the approach to integrity enforcement. Based on the theoretical results in [23], each constraint gives rise to a macro that transforms a user-defined method into its greatest consistent specialization with respect to the given constraint.

To summarize, genericity and integrity enforcement are not only theoretically well-founded, but can also be efficiently built into object oriented database languages. This allows a tremendous increase in declarativity in object oriented databases.

However, we have to ensure the value-representability of a schema, a demand that was granted for free in the RDM, and we have to provide type-safe reflective database languages. Of course, the second demand is only sufficient and in general not necessary, since we could build in algorithms for method generation and GCS construction as long as we have access to the schema definitions. The main advantage of the reflective approach is that the work of these algorithms is made explicit and type-safe in the schema by the use of the macro language. This allows e.g. schema changes or changes to integrity constraints to be easily maintained, since they affect only a few macros.

Another advantage of the outlined object oriented approach is that it allows us to cope with constraints that are either structurally determined or explicitly defined by the user. The traditional approach in the RDM usually buries such constraints in database programs.




2. H. Aït-Kaci: An Overview of LIFE, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS 504, 1991, 42-58

Database Programming Language, in Proc. VLDB 1991

Portland Oregon, 1989, 159-173

5. F. Bancilhon, C. Delobel, P. Kanellakis: Building an Object-Oriented Database System: The Story of O2, Morgan Kaufmann, 1992

6. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, 370-395

5(4), 1990, 353-382

8. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92

9. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. ACM SIGMOD 88

10. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the

Computing Surveys, vol. 17(4), 471-522

12. A. Heuer: Objektorientierte Datenbanken (in German), Addison Wesley, 1992

1986

15. B. Meyer: Object-Oriented Software Construction, Prentice-Hall, 1988

17. D. Stemple, L. Fegaras, T. Sheard, A. Socorro: Exceeding the Limits of Polymorphism in Database Programming Languages, in Proc. EDBT90, Springer LNCS 416, 1990

18. T. Sheard, D. Stemple: Automatic Verification of Database Transaction Safety, ACM ToDS vol. 14 (3), September 1989

19. D. Stemple, T. Sheard: A Recursive Base for Database Programming Primitives, in Proceedings of the First International East/West Database Workshop, Kiev, October 1990

21. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992

Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, 341-356

CS-08-92, December 1992

24. K.-D. Schewe, B. Thalheim: Fundamental concepts of object oriented databases, Acta Cybernetica, vol. 11 (4), 1993, 49-85

25. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity Preserving Updates in Object Oriented Databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. 4th Australian Database Conference, Brisbane, February 1993, World Scientific, 171-185

26. B. Thalheim: Dependencies in Relational Databases, Teubner Leipzig, 1991

27. I. Wetzel: Programmieren mit STYLE: Über die systematische Entwicklung von Programmierumgebungen (in German), Ph.D. Thesis, Hamburg University, 1994

Chapter 5 

Towards a Theory of Consistency Enforcement

Contents

5.1 Introduction
    5.1.1 The Consistency Enforcement Problem
    5.1.2 The Problem of GCS Construction
    5.1.3 The Practicality of GCS Construction
5.2 A Motivating Example
    5.2.1 Constraints in the Relational Model
    5.2.2 Stepwise Consistency Enforcement
5.3 Fundamental Features of State-Based Specifications
    5.3.1 Formal Specifications with Guarded Commands
    5.3.2 Axiomatic Semantics
    5.3.3 Consistency and Specialization
5.4 The Construction of GCSs
    5.4.1 I-reduced Guarded Commands
    5.4.2 An Upper Bound for GCSs
    5.4.3 The General Form of GCSs
5.5 Conclusion
5.6 A Normal Form for the Specialization Proof Obligation
5.7 Proof of the Upper Bound Theorem for Sequences
5.8 Proof of the Upper Bound Theorem in the Recursive Case


K.-D. Schewe, B. Thalheim. Towards a Theory of Consistency Enforcement. Acta 

Informatica 1998 (to appear). 


Abstract. Specifications with invariants occur in almost all formal specification languages. Hence the problem is to prove the consistency of the specified operations with respect to the invariants. Whilst the problem seems to be easily solvable in predicative specifications, it usually requires sophisticated verification efforts when specifications in the style of Dijkstra's guarded commands, as e.g. in the specification language B, are used.

As an alternative, a computational approach to consistency enforcement will be discussed in this paper. The basic idea is to replace inconsistent operations by new consistent ones, preserving at the same time the intention of the old ones. More precisely, this can be formalized by consistent specializations, where specialization is a specific partial order on operations defined via predicate transformers.

It can be shown that greatest consistent specializations (GCSs) always exist and are compatible with conjunctions of invariants. Then under certain prerequisites the general construction of such GCSs is possible. In general, GCS construction can be embedded in refinement calculi and therefore strengthens the systematic development of correct programs.


Invariants provide an excellent way to achieve declarativity in formal specifications. Therefore, almost all commonly used specification languages such as VDM [4, 14], Z [26, 27] and B [1, 2] as well as research prototypes [20] allow at least static invariants to be defined, i.e. conditions that have to be satisfied by all states. Then consistency of an operation¹ S with respect to the specified (static) invariant means that S transforms consistent states only into consistent ones. More generally, transition invariants restrict the allowed pairs of initial and final states for operations S, and dynamic constraints restrict the allowed sequences of states [16, 17].

Especially in the context of data-intensive application systems, where the final implementation will make use of persistent data stored in databases, most of the application semantics is expressible by static and dynamic integrity constraints, which is just another word for invariants [16, 17, 20, 24, 25, 28].
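For a finite state space the notion of consistency with respect to a static invariant can be checked by enumeration. The following Python sketch is a toy illustration under assumptions that are not made in the paper: states are small integers, an operation maps a state to the list of its possible successor states, and the guarded variant merely shows one consistent replacement in the spirit of consistency enforcement without claiming to be the greatest consistent specialization.

```python
def is_consistent(op, invariant, states):
    """Check that `op` maps consistent states only to consistent states.

    `op(s)` returns the possible successor states of `s`, so that
    non-deterministic operations are covered as well.
    """
    return all(
        invariant(t)
        for s in states
        if invariant(s)
        for t in op(s)
    )


# Hypothetical example: a counter that must never exceed 4.
STATES = range(0, 10)


def invariant(x):
    return x <= 4


def increment(x):
    """Unguarded operation: violates the invariant in state 4."""
    return [x + 1]


def guarded_increment(x):
    """Behaves like increment where this is safe, otherwise like skip."""
    return [x + 1] if invariant(x + 1) else [x]
```

Here increment is inconsistent, because it leads from the consistent state 4 to the inconsistent state 5, whereas guarded_increment keeps the state change wherever it is admissible.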

Consistency proofs are therefore an inherent and important task within the development of correct programs. However, as pointed out in [3], there is a fundamental difference in the way invariants are handled in VDM specifications (and similarly Z specifications) and specifications in B. The predicative style in the former languages allows invariants to be considered as part of the specification, hence finding a correct program that satisfies the specification is left to refinement. On the other hand, the axiomatic semantics associated with B operations in the style of Dijkstra [9, 12, 19] enables the definition of consistency proof obligations [1, 8, 20] in a suitable logic. At first glance the VDM and Z approach seems to be advantageous, because it avoids significant verification efforts.

From the authors' point of view, industrial applicability and acceptance of formal methods can only be expected if the whole refinement process is taken into consideration. Starting from a high-level specification, the application of provably correct refinement steps should not stop before a formal specification is reached that is equivalent to an executable program. Then the automatic derivation of such a program should be possible, as demonstrated in [13, 24]. As a consequence, proving consistency is an unavoidable problem, since at least once in the refinement process we shall leave the ground of purely predicative specifications [18, 24].

¹ To be precise, we should write operation specification to emphasize the independence from the implementation, but throughout the paper we drop this distinction.

The B approach allows static invariants to be specified and proof obligations to be derived. The consistency verification problem can be approached by using theorem provers or proof assistants, but the burden of writing consistent specifications is left to the user. In the context of database transactions the same applies to the extended Boyer-Moore approach in [25]. Hence the problem is to assist the user in this task and to provide solid and theoretically founded techniques for consistency enforcement as an alternative to verification. This problem is investigated in this paper.

Of course, we cannot expect to obtain a panacea for the development of correct programs, since any approach to consistency enforcement must rely on certain assumptions. We must take the specified operations and invariants as fixed, i.e., the specified operations and invariants reflect exactly the intention of the user. Otherwise enforcement may produce an undesired new operation. In any case, just as for the results of verification, specifications resulting from consistency enforcement may be used to give some feedback to the specifying user and may encourage changes to a specification.

5.1.1 The Consistency Enforcement Problem

Given an operation S and an invariant I, the basic idea is to replace S by a new operation $S_I$ which is consistent with respect to I. Since this alone is not a sufficient property because of its independence from S, we claim that $S_I$ should be as close to S as possible. The first problem is to find a suitable notion for "close". The intuition behind our work is that each operation has an "effect", i.e. performs certain state changes, and $S_I$ should "preserve the effect" of S.

Whatever the definition of effect preservation will be, it should lead to a partial order $\sqsubseteq$ on operations. With respect to this partial order we have $S_I \sqsubseteq S$, and $S_I$ should be the greatest (consistent) operation with this property.

In Section 5.2 we first look at a practical example in the relational model taken from [22] to motivate the construction of $S_I$, or more precisely of one of its deterministic branches. In Section 5.3 we recall fundamental features of state-based specifications with emphasis on predicate transformer semantics.

This is used to characterize consistency by a formula in infinitary first-order logic. In the same spirit we may formalize operational specialization, which defines a partial order on operations. Then it is natural to take this order for a formal definition of consistency enforcement. This leads directly to the notion of a greatest consistent specialization (GCS), which first appeared in [21] in an object oriented database context. The first results show the existence of GCSs and their compatibility with conjunctions, which allows consistency to be enforced step-by-step for any order of the invariants.

5.1.2 The Problem of GCS Construction 

These results are not at all surprising, because we know that for each specification using guarded commands we always find an equivalent predicative specification and vice versa. Hence conjunction would be sufficient to find at least one solution. In fact, we may translate a specification into a predicative form, join it with the invariant and then translate back to obtain the GCS. For specifications in the style of B this introduces unbounded choices and therefore destroys the "operational flavour" [3] of specifications with guarded commands. Therefore, we are looking for an approach to GCS construction which preserves this style.

An insufficient alternative would be to consider just the basic operations within the specification $S$, i.e. assignments or skip, and to replace them by their GCSs. In some cases this leads to over-specialization; in other cases we do not even get a specialization at all. The main result of this paper shows that under some technical prerequisites it is nevertheless possible to concentrate on basic operations. Replacing them by their GCSs in a given complex operation defines a new operation $S'_I$ which is specialized by $S_I$. We then get the GCS of the complex operation by adding a precondition. This fundamental result will be shown in Section 5.4, but parts of the rather lengthy proof of the "upper bound theorem" are shifted to the appendix.

5.1.3 The Practicality of GCS Construction 

GCSs are in general non-deterministic. Their deterministic branches reflect several alternative strategies for consistency enforcement. We may therefore ask whether it is possible to construct such deterministic branches directly. In Section 5.4 we show that if we build $S'_I$ by replacing the basic operations in $S$ by specializations of GCSs, we still achieve specializations of GCSs. This gives a second compatibility result which is of particular interest for practical applications. Especially in data-intensive applications, where we deal with sets, we may want to choose deterministic GCS branches with minimized symmetric difference for set values in the initial and final state. More liberally, this can be formalized by subsumption-free branches of the GCS.

This paper is a continuation of the work in [21], where the notion of GCS was introduced to give a solid theoretical basis for consistency enforcement, but there was no idea how to construct them. Especially in the context of data-intensive applications there are many competing approaches based on active rule management [6, 10, 11, 15, 29, 30], but in none of these has a complete definition of the problem been given.

The main new result of this paper concerns the construction of GCSs. It is shown how this can be reduced to basic operations, for which GCSs must still be detected case by case. On this basis, the practical paper [23] indicates an efficient implementation based on linguistic reflection. In addition the technical prerequisite of $I$-reducedness, which only occurs as a means for the proof of the main result, shows the limits of consistency enforcement. The main technical difficulty in the proof was to abstain from looking at specializations of $S$, but first to achieve a consistent generalization of $S_I$, which is in general not a specialization of $S$.

5.2 A Motivating Example 

Let us first illustrate consistency enforcement in the relational model. For this we consider a small fragment of the example used in [10, 22].

Recall that a relation schema is simply a set of attributes. Moreover, with each attribute $A$ in a relation schema $R$ we associate a data type $dom(A)$, but for our purposes here the data types are not important. A relational database schema $\mathcal{S}$ is a finite set of relation schemata. A tuple $t$ over a relation schema $R$ is a map $t : R \to \bigcup_{A \in R} dom(A)$ with $t(A) \in dom(A)$. We usually denote tuples as ordinary tuples with components named by the attributes. Sometimes, we even omit the attributes assuming a fixed order on them. Then a relation over $R$ is a finite set of such tuples. An instance of $\mathcal{S}$ associates with each relation schema $R \in \mathcal{S}$ a relation $r$ over $R$.


5.2.1 Constraints in the Relational Model 

An integrity constraint over a database schema $\mathcal{S}$ is a formula

$$I \equiv P_1(x_1) \wedge \dots \wedge P_n(x_n) \Rightarrow Q_1(y_1) \vee \dots \vee Q_m(y_m)$$

where the predicate symbols $P_i$, $Q_j$ either correspond to relation schemata $R \in \mathcal{S}$ or are

comparison predicates (= 6=

WIRE

    wire_id  connection  wire_type  voltage  power
    4711     HH-HB       Koax       12       600
    4814     HH-H        Tel        12       600

TUBE

    tube_id  connection  tube_type
    8314     HH-H        GX44
    8511     HH-HB       GX44
    023      HB-H        T33

CONNECTION

    connection  from     to
    HH-H        Hamburg  Hannover
    HH-HB       Hamburg  Bremen
    HB-H        Bremen   Hannover

It is easy to see that this instance satisfies the constraints above. □

5.2.2 Stepwise Consistency Enforcement 

With each relation schema $R$ we also associate basic update operations insert_R(t) and delete_R(t). If a tuple t to be inserted already exists in the relation, the insert operation does nothing. If a tuple t to be deleted does not exist, then the deletion also does nothing. Thus, these operations could also be written as assignments R := R ∪ {t} and R := R − {t} by slightly abusing the relation schemata as variables which take relations as values.

Example 5.2. Consider the schema $\mathcal{S}$ from Example 5.1 and the operation insert_WIRE(t). This may lead to a violation of constraint ID_1, in which case we must add a tuple to TUBE. Hence it can be replaced by

    WIRE := WIRE ∪ {t}
    IF connection(t) ∉ TUBE[connection]
    THEN TUBE := TUBE ∪ {(?, connection(t), ?)}
    ENDIF

Here the question marks stand for arbitrarily chosen values of the corresponding data type. Similarly, we may replace delete_TUBE(t) by

    TUBE := TUBE − {t}
    IF connection(t) ∈ WIRE[connection] − TUBE[connection]
    THEN WIRE := WIRE − {t′ | connection(t′) = connection(t)}
    ENDIF

In order to enforce FD_2 we may then replace insert_TUBE(t) by

    IF ∀t′ ∈ TUBE . tube_id(t) ≠ tube_id(t′)
    THEN TUBE := TUBE ∪ {t}
    ENDIF
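The three replacement operations above can be sketched in Python as functions on sets of tuples. This is only an illustrative sketch: the attribute positions, the function names and the use of `None` for the arbitrarily chosen "?" values are assumptions made here and are not part of the original specification.

```python
# Tuple layouts assumed for this sketch:
#   WIRE tuples: (wire_id, connection, wire_type, voltage, power)
#   TUBE tuples: (tube_id, connection, tube_type)
# None stands for an arbitrarily chosen value ("?" in the text).

def connections(rel):
    """Projection rel[connection]; connection is the second attribute."""
    return {t[1] for t in rel}

def insert_wire(wire, tube, t):
    """insert_WIRE(t), extended as in the text to compensate for ID_1."""
    wire = wire | {t}
    if t[1] not in connections(tube):
        tube = tube | {(None, t[1], None)}
    return wire, tube

def delete_tube(wire, tube, t):
    """delete_TUBE(t), extended as in the text to compensate for ID_1."""
    tube = tube - {t}
    if t[1] in connections(wire) - connections(tube):
        wire = {w for w in wire if w[1] != t[1]}
    return wire, tube

def insert_tube(tube, t):
    """insert_TUBE(t), guarded as in the text to preserve FD_2."""
    if all(t[0] != u[0] for u in tube):
        tube = tube | {t}
    return tube
```

Each function returns new relations instead of updating them in place, which keeps the initial and final states separate when the effect of an operation is inspected.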

Let us now add the exclusion constraint ED ≡ WIRE[wire_id] ∥ TUBE[tube_id]. In order to enforce this constraint, insertions into one of WIRE or TUBE should be followed by deletions in the other. The resulting operations are

    WIRE := WIRE ∪ {t}
    TUBE := TUBE − {t′ | tube_id(t′) = wire_id(t)}

and

    TUBE := TUBE ∪ {t}
    WIRE := WIRE − {t′ | wire_id(t′) = tube_id(t)} .

If we now take together FD_2, ID_1 and ED we must be very careful. E.g., if we execute insert_WIRE(8511, HH-HB, Koax, 12, 600) on the instance above, we may first delete the tuple (8511, HH-HB, GX44) in TUBE in order to enforce ED and then the two tuples (4711, HH-HB, Koax, 12, 600) and (8511, HH-HB, Koax, 12, 600) in WIRE in order to enforce ID_1. The resulting instance would be (omitting CONNECTION):

WIRE

    wire_id  connection  wire_type  voltage  power
    4814     HH-H        Tel        12       600

TUBE

    tube_id  connection  tube_type
    8314     HH-H        GX44
    023      HB-H        T33

Thus, the "effect" of the original operation, i.e. insertion of a tuple into WIRE, is completely destroyed. The new effect is a deletion in WIRE and TUBE.
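The destructive enforcement order just described can be replayed on the example instance. The following Python sketch is illustrative only; the tuple layout and the name `naive_insert_wire` are assumptions introduced here.

```python
# Example instance from the text (CONNECTION omitted); ids are kept as strings.
WIRE = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
TUBE = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}

def naive_insert_wire(wire, tube, t):
    """Insert t into WIRE, then repair violations purely by deletion."""
    wire = wire | {t}                                 # the intended insertion
    tube = {u for u in tube if u[0] != t[0]}          # repair ED by deleting in TUBE
    tube_connections = {u[1] for u in tube}
    wire = {w for w in wire if w[1] in tube_connections}  # repair ID_1 by deleting in WIRE
    return wire, tube

wire, tube = naive_insert_wire(WIRE, TUBE, ("8511", "HH-HB", "Koax", 12, 600))
# The inserted tuple is gone again, together with wire 4711 and tube 8511.
print(sorted(wire))
print(sorted(tube))
```

The final state is consistent, yet the inserted tuple does not survive, which is exactly the loss of effect that the notion of specialization is meant to exclude.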

The alternative works as follows: We start with the insert_WIRE operation and replace it by the one above used to enforce ID_1. The resulting operation involves an insertion into TUBE. Next we "enforce" FD_2 by replacing insert_TUBE. The resulting complex operation now involves insert_WIRE and insert_TUBE. These are both replaced in order to "enforce" ED. We obtain the following operation:

    WIRE := WIRE ∪ {t}
    TUBE := TUBE − {t′ | tube_id(t′) = wire_id(t)}
    IF connection(t) ∉ TUBE[connection]
    THEN SELECT i ∉ TUBE[tube_id] ∪ WIRE[wire_id]
         TUBE := TUBE ∪ {(i, connection(t), ?)}
    ENDIF

However, this operation is not consistent with respect to ID_1, which we enforced before. We therefore add a precondition which holds exactly in those cases where previous enforcement steps are preserved. This condition is wire_id(t) ∉ TUBE[tube_id]. Then the final operation will be

    IF wire_id(t) ∉ TUBE[tube_id]
    THEN WIRE := WIRE ∪ {t}
         IF connection(t) ∉ TUBE[connection]
         THEN SELECT i ∉ TUBE[tube_id] ∪ WIRE[wire_id]
              TUBE := TUBE ∪ {(i, connection(t), ?)}
         ENDIF
    ELSE fail
    ENDIF

Here fail is used to express undefinedness. If the condition in the IF-clause is not satisfied, the whole operation will be rejected. □

Example 5.2 reflects exactly the construction of a GCS branch. The presentation so far is completely informal. In the following sections we shall justify this approach. We shall see that the chosen order of the constraints is not important. We shall also see that the precondition arises naturally from the specialization order.


5.3 Fundamental Features of State-Based Specifications

In the following consider a specification to consist of a state space, invariants and operations. A state space is simply a collection of typed state variables, where the types are assumed to be sets. Operations on these sets are defined by functions. Invariants are defined by formulae in an underlying logic $\mathcal{L}$. Finally, operations will be specified by generalized guarded commands.

5.3.1 Formal Specifications with Guarded Commands

In the following assume a fixed many-sorted, infinitary, first-order logic $\mathcal{L}$ and a fixed interpretation structure $(D, \omega)$, where $D = \bigcup_{T \text{ type}} T$ is a set (semantic domain) and $\omega$ assigns type-compatible functions $\omega(f) : T_1 \times \dots \times T_n \to T$ and $\omega(p) : T_1 \times \dots \times T_n \to \{true, false\}$ to $n$-ary function symbols $f$ and $n$-ary predicate symbols $p$ respectively. Then $\omega$ can be extended in the usual way to the terms and formulae of $\mathcal{L}$ and we may assume the domain closure property, i.e. for each $d \in D$ there exists some closed term $t$ in $\mathcal{L}$ with $\omega(t) = d$.

Definition 5.1. (i) A state space $X$ is a finite set of variables of $\mathcal{L}$ such that for each $x \in X$ there is an associated type $T_x$. We write $x :: T_x$.

(ii) A (static) invariant on a state space $X$ (for short: $X$-invariant) is a formula $I$ of $\mathcal{L}$ with free variables in $X$ ($fr(I) \subseteq X$).

(iii) A transition invariant $J$ on a state space $X$ is a formula of $\mathcal{L}$ with $fr(J) \subseteq X \cup X'$, where $X'$ is a disjoint copy of $X$.

(iv) Given a state space $X$, a state on $X$ is a type-compatible variable assignment $x \mapsto \sigma(x) \in T_x$ for each $x \in X$.

Let $\Sigma$ denote the set of all states. Clearly, states $\sigma \in \Sigma$ are sufficient to interpret $X$-invariants, whereas state pairs $(\sigma, \tau) \in \Sigma \times \Sigma$ suffice to interpret transition invariants. We use the notations $\models_\sigma$ and $\models_{(\sigma,\tau)}$ in these cases. The disjoint copy $X'$ of the state space $X$ for transition invariants is used to distinguish between the values in initial and final states respectively.

Example 5.3. Let $\mathbb{Z}$ denote the set of integers. Consider the state space $X = \{x_1, x_2\}$, where $T_{x_1} = T_{x_2}$ is the set of finite subsets of the cartesian product $\mathbb{Z} \times \mathbb{Z}$. In addition, for $i = 1, 2$ we have projection functions $\pi_i : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$. By abuse of notation we also use $\pi_i$ to denote the elementwise extension to set arguments, resulting in a finite set of integers. Moreover, consider the static invariants:

$$I_1 \equiv \pi_1(x_1) \subseteq \pi_1(x_2)$$

$$I_2 \equiv \forall x, y :: \mathbb{Z} \times \mathbb{Z} .\; x \in x_2 \wedge y \in x_2 \wedge \pi_2(x) = \pi_2(y) \Rightarrow \pi_1(x) = \pi_1(y)$$

$$I_3 \equiv \pi_2(x_1) \cap \pi_2(x_2) = \emptyset$$

□

Note that Example 5.3 captures the essentials of Examples 5.1 and 5.2.

For operations we use guarded commands in the style of Dijkstra and Nelson [9, 12, 19, 24] including partiality and recursion. We dispense with more sophisticated constructs such as the dovetail operator $\nabla$ [5], since fairness is beyond the scope of this work.

Definition 5.2. Let $X$ be some state space. An operation $S$ on $X$ consists of a set of input parameters $\{\iota_1, \dots, \iota_k\}$, a set of output parameters $\{o_1, \dots, o_l\}$ and a body. To each input parameter $\iota_i$ corresponds a type $I_i$ and to each output parameter $o_j$ corresponds a type $O_j$. The body of $S$ is a guarded command, i.e. it is recursively built from the following constructs:

(i) assignment $x := E$, where $x$ is a state variable in $X$, an output parameter or a local variable within $S$ and $E$ is a term of the same type as $x$,

(ii) skip, fail, loop,

(iii) sequential composition $S_1 ; S_2$, choice $S_1 \,\Box\, S_2$, unbounded choice $@ x :: T \bullet S$, guard $P \to S$ and restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type $T$, and

(iv) the least fixpoint operator $\mu S . f(S)$, where $f(S)$ is an expression built as above using in addition the operation variable $S$.

We usually write $o_1, \dots, o_l \leftarrow S(\iota_1, \dots, \iota_k)$.

Let us first explain the informal (and rather procedural) meaning of guarded commands. Each operation may be partial, i.e. it is undefined on a subset of $\Sigma$, and it is in general non-deterministic, i.e. starting in an initial state $\sigma$ may result in more than one final state $\tau$, where $\tau$ may also be $\infty$ to denote non-termination.

Then the informal meaning of assignments, sequences and skip is the obvious one. Choices mean to arbitrarily select one of the operations, if it is defined. The intention behind the unbounded choice is the introduction of a new variable $x$ of the given type $T$, not occurring in $X$, and the execution of $S$ on the extended state space $X \cup \{x\}$. A guard $P \to S$ gives a precondition $P$ for $S$. If $P$ is not satisfied, the whole operation is undefined. Restricted choice $S \boxtimes T$ means to execute $S$ unless it is undefined, in which case $T$ is taken.

The basic operations fail and loop are only introduced for theoretical completeness: fail is always undefined, and loop never terminates. However, loop is the least element in the Nelson order $\preceq$ and hence is fundamental for recursion, whereas fail will occur as the least element in the specialization order $\sqsubseteq$. Recall that the Nelson order is defined by $T \preceq S$ iff

$$wp(T)(R) \Rightarrow wp(S)(R) \quad\text{and}\quad wlp(S)(R) \Rightarrow wlp(T)(R) \qquad (5.35)$$

hold for all $X$-invariants $R$ [19]. The definition of the specialization order $\sqsubseteq$ will occur in Definition 5.6.

Example 5.4. Let us extend Example 5.3 by some operations. Define $S(a, b :: \mathbb{Z})$ by $x_1 := x_1 \cup \{(a,b)\}$ and $S'(a, b :: \mathbb{Z})$ by

$$x_1 := x_1 \cup \{(a,b)\} \;;\; \bigl( (a \notin map(\pi_1)(x_2) \to @ c :: \mathbb{Z} \bullet x_2 := x_2 \cup \{(a,c)\}) \boxtimes skip \bigr) .$$

$S$ inserts a new pair $(a,b)$ into the set value of $x_1$, and $S'$ contains an additional insertion into $x_2$. Thus, $S'$ is a sequence $S ; T$, where $T$ compensates a violation of the invariant $I_1$. $T$ itself has the form $U \boxtimes skip$. The reason for the skip is that $U$ is a guard and thus is partial. Omitting the skip would lead to $S'$ being undefined in case of no violation of $I_1$, whilst now $S'$ coincides with $S$ in that case. □
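To make the role of the compensating part concrete, the operations of Example 5.4 can be simulated on finite sets of integer pairs. The Python sketch below is only an illustration under assumptions made here: the function names are hypothetical, and the unbounded choice of c is replaced by one fixed value passed as an argument.

```python
def pi(i, rel):
    """Elementwise projection of a set of pairs onto component i (1 or 2)."""
    return {p[i - 1] for p in rel}

def inv1(x1, x2):
    """I_1: pi_1(x1) is a subset of pi_1(x2)."""
    return pi(1, x1) <= pi(1, x2)

def s(x1, x2, a, b):
    """S(a, b): insert the pair (a, b) into x1 and leave x2 unchanged."""
    return x1 | {(a, b)}, x2

def s_prime(x1, x2, a, b, c=0):
    """S'(a, b): S followed by a guarded insertion into x2, or skip.

    The argument c fixes one outcome of the unbounded choice of c.
    """
    x1, x2 = s(x1, x2, a, b)
    if a not in pi(1, x2):      # guard holds: compensate the violation of I_1
        x2 = x2 | {(a, c)}
    return x1, x2               # guard fails: the skip branch is taken

x1, x2 = {(1, 10)}, {(1, 20)}
print(inv1(*s(x1, x2, 2, 11)))        # S alone violates I_1
print(inv1(*s_prime(x1, x2, 2, 11)))  # S' restores I_1
```

When the inserted first component already occurs in x2, `s_prime` returns the same state as `s`, which mirrors the remark that S' coincides with S if I_1 is not violated.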

We allow types to be omitted if they are known from the context or if they are not necessary for the understanding. Moreover, we allow cascaded unbounded choices $@ x_1 \bullet @ \dots \bullet @ x_n \bullet S$ to be abbreviated by $@ x_1, \dots, x_n \bullet S$.


5.3.2 Axiomatic Semantics 

In general, we may describe the semantics of an operation $S$ simply by a set $\Delta(S) \subseteq \Sigma \times (\Sigma \cup \{\infty\})$, where $\infty$ is the special symbol used to indicate non-termination. Since nobody wants to define the semantics of operations by explicit enumeration of all state pairs, we are looking for an equivalent logical characterization.

Let $R$ be an $X$-invariant and consider the set of states $\Sigma_R = \{\sigma \in \Sigma \mid \models_\sigma R\}$ satisfying $R$. If we take $R$ as a postcondition for an operation $S$, we want to associate with it the weakest (liberal) precondition of $S$ to establish $R$. Informally these conditions can be characterized as follows:

– $wlp(S)(R)$ characterizes those initial states $\sigma$ such that all terminating executions of $S$ will reach a final state $\tau$ characterized by $R$, i.e. $\models_\sigma wlp(S)(R)$ holds iff for all $\tau \in \Sigma$ with $(\sigma,\tau) \in \Delta(S)$ we have $\models_\tau R$, and

– $wp(S)(R)$ characterizes those initial states $\sigma$ such that all executions of $S$ terminate and will reach a final state $\tau$ characterized by $R$, i.e. $\models_\sigma wp(S)(R)$ holds iff for all $(\sigma,\tau) \in \Delta(S)$ we have $\tau \neq \infty$ and $\models_\tau R$.

Thus, $wlp(S)$ and $wp(S)$ are functions from $X$-invariants to $X$-invariants, which are usually called predicate transformers. It can be shown that these predicate transformers always exist and satisfy

$$wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(true) \qquad (5.36)$$

and

$$wlp(S)\Bigl(\bigwedge_{i \in I} R_i\Bigr) \Leftrightarrow \bigwedge_{i \in I} wlp(S)(R_i) . \qquad (5.37)$$

Call (5.36) the pairing condition and (5.37) the universal conjunctivity property.
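On a finite state space the informal characterization of wlp and wp can be computed directly from a relational semantics, and the pairing condition can be checked by enumeration. The following Python sketch is a toy model under assumptions made here: predicates are represented as sets of states, an operation is a function from a state to its set of outcomes, and all names are hypothetical.

```python
from itertools import chain, combinations

STATES = frozenset(range(4))   # a tiny hypothetical state space
BOTTOM = "∞"                   # marks non-termination

# An operation maps a state to its set of possible outcomes;
# the empty set means the operation is undefined in that state.
def skip(s): return {s}
def fail(s): return set()
def loop(s): return {BOTTOM}
def dec(s): return {s - 1} if s > 0 else set()        # partial, deterministic
def chaos(s): return set(STATES) | {BOTTOM}           # anything may happen

def wlp(op, post):
    """States from which every terminating outcome lies in post."""
    return {s for s in STATES if all(t in post for t in op(s) if t != BOTTOM)}

def wp(op, post):
    """States from which every outcome terminates and lies in post."""
    return {s for s in STATES if all(t != BOTTOM and t in post for t in op(s))}

def subsets(xs):
    xs = list(xs)
    return [set(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def pairing_holds(op):
    """Check (5.36) for every postcondition over STATES."""
    return all(wp(op, r) == wlp(op, r) & wp(op, set(STATES))
               for r in subsets(STATES))

print(all(pairing_holds(op) for op in (skip, fail, loop, dec, chaos)))
```

In this model fail satisfies every postcondition vacuously, while loop satisfies every liberal precondition and no strict one, in line with the informal reading given above.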

The existence proof is based on the assumption that $\mathcal{L}$ is an infinitary logic and on the domain closure assumption. The latter allows us to find, for a given state $\sigma$, a characterizing predicate $P_\sigma$, i.e. we have $\models_\tau P_\sigma$ iff $\tau = \sigma$.

Furthermore, the former property then allows us to write $R \Leftrightarrow \bigvee_{\sigma \in \Sigma_R} P_\sigma$, which is used to prove that the pairing condition and the universal conjunctivity are sufficient to define operations, i.e., the definition of predicate transformers $wlp(S)$ and $wp(S)$ satisfying (5.36) and (5.37) is equivalent to the definition of $\Delta(S)$ [9, 12, 19, 24].

Let us define the semantics of operations axiomatically via predicate transformers. The proofs of universal conjunctivity and the pairing condition are straightforward [19].

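Definition 5.3. Let $S$, $S_1$, $S_2$ be guarded commands on some state space $X$, $T$ some type, $E(x)$ some term of type $T_x$ and let $x \in X$ and $y \notin X$. Then we have for any formula $R$ of $\mathcal{L}$:

$$wlp(skip)(R) \Leftrightarrow wp(skip)(R) \Leftrightarrow R \qquad (5.38)$$

$$wlp(fail)(R) \Leftrightarrow wp(fail)(R) \Leftrightarrow true \qquad (5.39)$$

$$wlp(loop)(R) \Leftrightarrow true \qquad wp(loop)(R) \Leftrightarrow false \qquad (5.40)$$

$$wlp(x := E)(R) \Leftrightarrow wp(x := E)(R) \Leftrightarrow \{x/E\}.R \qquad (5.41)$$

where $\{x/E\}.R$ denotes the substitution of the variable $x$ in $R$ by the expression $E$,

$$wlp(S_1 ; S_2)(R) \Leftrightarrow wlp(S_1)(wlp(S_2)(R)) \qquad wp(S_1 ; S_2)(R) \Leftrightarrow wp(S_1)(wp(S_2)(R)) \qquad (5.42)$$

$$wlp(P \to S)(R) \Leftrightarrow P \Rightarrow wlp(S)(R) \qquad wp(P \to S)(R) \Leftrightarrow P \Rightarrow wp(S)(R) \qquad (5.43)$$

$$wlp(S_1 \,\Box\, S_2)(R) \Leftrightarrow wlp(S_1)(R) \wedge wlp(S_2)(R) \qquad wp(S_1 \,\Box\, S_2)(R) \Leftrightarrow wp(S_1)(R) \wedge wp(S_2)(R) \qquad (5.44)$$

$$wlp(S_1 \boxtimes S_2)(R) \Leftrightarrow wlp(S_1)(R) \wedge (wp(S_1)(false) \Rightarrow wlp(S_2)(R))$$
$$wp(S_1 \boxtimes S_2)(R) \Leftrightarrow wp(S_1)(R) \wedge (wp(S_1)(false) \Rightarrow wp(S_2)(R)) \qquad (5.45)$$

$$wlp(@ y :: T \bullet S)(R) \Leftrightarrow \forall y :: T . wlp(S)(R) \qquad wp(@ y :: T \bullet S)(R) \Leftrightarrow \forall y :: T . wp(S)(R) \qquad (5.46)$$

$$wlp(\mu S . f(S))(R) \Leftrightarrow \bigwedge_{\alpha} wlp(f^{\alpha}(loop))(R) \quad\text{and}\quad wp(\mu S . f(S))(R) \Leftrightarrow \bigvee_{\alpha} wp(f^{\alpha}(loop))(R) \qquad (5.47)$$

where $\alpha$ ranges over the ordinal numbers.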

The recursive guarded command $\mu S . f(S)$ is the least fixpoint of $f$ with respect to the Nelson order defined in (5.35). For this we need the monotonicity of the constructors in Definition 5.2 with respect to this order, which can be easily proven [19].

Note that operations may only affect parts of the state space. For consistency enforcement it will be necessary to "extend" such operations. Therefore, we need to know for each operation $S$ the subspace $Y \subseteq X$ such that $S$ does not change the values in $X - Y$. In this case we call $S$ a $Y$-operation on $X$. A formal definition is the following.

Definition 5.4. Let $X$ be a state space and $S$ an operation on $X$. $S$ is a $Y$-operation for $Y \subseteq X$ iff $wlp(S)(R) \Leftrightarrow R$ and $wp(S)(R) \Leftrightarrow R$ hold for each $(X - Y)$-invariant $R$ and $Y$ is minimal with this property.

Note that for each operation $S$ on $X$ there is always a $Y \subseteq X$ such that $S$ is a $Y$-operation.

Let us now give a characterization for deterministic operations. For this we need the notion of the dual or conjugate predicate transformers $wlp(S)^*$ and $wp(S)^*$, which are defined by

$$wlp(S)^*(R) \Leftrightarrow \neg wlp(S)(\neg R) \quad\text{and}\quad wp(S)^*(R) \Leftrightarrow \neg wp(S)(\neg R) . \qquad (5.48)$$

Definition 5.5. An operation $S$ on the state space $X$ is called deterministic iff $wlp(S)^*(R) \Rightarrow wp(S)(R)$ holds for all $X$-invariants $R$.

5.3.3 Consistency and Specialization 

General invariants and arbitrary operations on a state space $X$ raise the problem whether consistency as defined by the invariants is always satisfied by the operations. One approach to address this problem is to use general verification techniques, i.e. to derive (and prove) general proof obligations in the predicate transformer calculus. Let us first express these proof obligations.


Definition 5.6. Let $X = \{x_1, \dots, x_n\}$ be a state space, $Z \subseteq Y \subseteq X$ subspaces, $I$ an $X$-invariant, $J$ a transition invariant, $S$ a $Z$-operation and $T$ a $Y$-operation. Then

(i) $S$ is consistent with respect to $I$ iff $I \Rightarrow wlp(S)(I)$ holds,

(ii) $T$ specializes$^2$ $S$ iff $wp(S)(true) \Rightarrow wp(T)(true)$ and $wlp(S)(R) \Rightarrow wlp(T)(R)$ hold for all $Z$-invariants $R$ (denoted $T \sqsubseteq S$), and

(iii) $S$ is consistent with respect to $J$ iff $S \sqsubseteq J$ holds, where $J$, regarded as an operation, is defined as

$$loop \;\Box\; @ x'_1, \dots, x'_n \bullet J \to x_1 := x'_1 ; \dots ; x_n := x'_n .$$

Recall the intention behind these definitions. An $X$-invariant partitions $\Sigma$ into two disjoint subsets. If we consider the invariant $I$ we have $\Sigma = \Sigma_I \cup \Sigma_{\neg I}$. States not satisfying the invariant should never be reached, i.e. if $S$ is started in a state satisfying $I$, it should only reach states also satisfying $I$. That is, if we have $\sigma \in \Sigma_I$, then for all $\tau \in \Sigma$ with $(\sigma,\tau) \in \Delta(S)$ we should always have $\tau \in \Sigma_I$. Recall from the introduction to this section that the set of all initial states such that each terminating execution of $S$ reaches $\Sigma_I$ is $\Sigma_{wlp(S)(I)}$, i.e.,

$$\{\sigma \in \Sigma \mid (\sigma,\tau) \in \Delta(S) \text{ implies } \models_\tau I \text{ for all } \tau \in \Sigma\} = \Sigma_{wlp(S)(I)} .$$

Hence we have the requirement $\Sigma_I \subseteq \Sigma_{wlp(S)(I)}$, which is equivalent to (i) [8].

The intuition behind the definition of specialization is that whenever an execution of the specialized operation $T$ establishes some post-predicate $R$, then this execution should already be one of the general operation $S$. Clearly, $\sqsubseteq$ defines a partial order on operations.

Each transition invariant may be regarded as a very general operation $J$ that allows any state pair $(\sigma,\tau)$ satisfying $J$. Hence transition consistency for an operation $S$ is equivalent to $S \sqsubseteq J$. The loop-part in the definition of $J$ gives $wp(J)(true) \Leftrightarrow false$, which allows us to consider only $wlp(J)(R) \Rightarrow wlp(S)(R)$ for all $Z$-invariants $R$.
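The two proof obligations for static invariants can be checked by brute force on a small state space. The Python sketch below is illustrative only: it assumes a single integer state variable, represents predicates as sets of states, takes Z = Y = X so that all subsets serve as postconditions, and uses hypothetical names throughout.

```python
from itertools import chain, combinations

STATES = range(-2, 3)      # hypothetical values of one integer variable x
BOTTOM = None              # marks non-termination (unused by the operations below)

def wlp(op, post):
    return {s for s in STATES if all(t in post for t in op(s) if t is not BOTTOM)}

def wp(op, post):
    return {s for s in STATES if all(t is not BOTTOM and t in post for t in op(s))}

def subsets(xs):
    xs = list(xs)
    return [set(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def consistent(op, inv):
    """S is consistent with respect to I iff I implies wlp(S)(I)."""
    return inv <= wlp(op, inv)

def specializes(t, s):
    """T specializes S: termination and all liberal preconditions carry over."""
    everything = set(STATES)
    return (wp(s, everything) <= wp(t, everything)
            and all(wlp(s, r) <= wlp(t, r) for r in subsets(STATES)))

INV = {s for s in STATES if s >= 0}                      # invariant: x >= 0

def dec(s):                                              # x := x - 1 inside STATES
    return {s - 1} if s - 1 in STATES else set()

def guarded_dec(s):                                      # (x >= 1) -> x := x - 1
    return {s - 1} if s >= 1 else set()

print(consistent(dec, INV), consistent(guarded_dec, INV))
print(specializes(guarded_dec, dec), specializes(dec, guarded_dec))
```

The guarded decrement is a consistent specialization of the plain decrement but not the other way round. The sketch makes no claim that it is the greatest such specialization.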

There exists an equivalent characterization of specialization and hence also of transition consistency that avoids the quantification over all $X$-invariants, but uses conjugate predicate transformers as defined in (5.48). The result in Proposition 5.7 is assumed to be commonly known. E.g., [7] mentions a similar result in the wp-calculus without proof. The proof is rather technical and can be done by simple direct calculations. Since we do not know of any reference for such a proof, we have added it in Appendix 5.6.

Proposition 5.7. Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$. Let $Z = \{z_1, \dots, z_n\}$ be disjoint from $X$ with $T_{x_i} = T_{z_i}$. Then $wlp(S)(R) \Rightarrow wlp(T)(R)$ holds for all $X$-invariants $R$ iff

$$\{z_1/x_1, \dots, z_n/x_n\} . wlp(T')(wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n)) \qquad (5.49)$$

holds, where $T'$ results from $T$ by renaming all the variables $x_i$ by $z_i$. □

2 Some other authors would prefer to call $\sqsubseteq$ refinement. This is justified as long as refinement does not comprise the extension of specifications. This view of refinement as a more general methodological means underlies our overall work on formal methods, whereas we prefer the notion of specialization in this more restrictive context. From a technical point of view, we simply consider a partial order (with some nice properties) on operations.


Note that the order of the substitution is irrelevant. Then $T \sqsubseteq S$ holds iff we have (5.49) and $wp(S)(true) \Rightarrow wp(T)(true)$. This is a result in its own right, which enables mechanical or even automatic verification. In the context of consistency enforcement (5.49) defines an $X$-invariant, say $P$. If we restrict the operations $S$ and $T$ to $\Sigma_P$, then $T$ would become a specialization of $S$. If we have $wp(S)(true) \Leftrightarrow wp(T)(true)$, this implies that $P \to T$ with $P$ defined by (5.49) is the greatest common specialization of $S$ and $T$. This will later be used to find the precondition in the GCS.

Corollary 5.8. Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$ with $wp(S)(true) \Leftrightarrow wp(T)(true)$. Define an $X$-invariant $P_{S,T}$ by (5.49). Then $P_{S,T} \to T$ is the greatest common specialization of $S$ and $T$. □

In general the considered operations will be non-deterministic. Informally, non-determinism may be considered as gluing together infinitely many deterministic operations by a choice operator. Sometimes, however, we are interested only in these deterministic branches. We give a formal definition for this.

Definition 5.9. Let $S$ and $T$ be operations on $X$ with $T \sqsubseteq S$ and $wp(T)^*(true) \Leftrightarrow wp(S)^*(true)$. If $T$ is deterministic, it is called a deterministic branch of $S$.

If $T$ and $S$ are semantically equivalent to some $@ y_1 :: T_1, \dots, y_n :: T_n \bullet T'$ and $@ y_1 :: T_1, \dots, y_n :: T_n \bullet S'$ respectively such that $T'$ is a deterministic branch of $S'$, then we call $T$ a quasi-deterministic branch of $S$.


Suppose now that we are given an operation $S$ and a static invariant $I$. Assume that $S$ is a $Y$-operation, whereas $I$ is defined on $X$ with $Y \subseteq X$. The idea is to construct a "new" operation $S_I$ that is consistent with respect to $I$ and can be used to replace $S$. Roughly speaking, this means that the effect of $S_I$ on the state variables in $X$ should not differ from the effect of $S$. Formally this is expressed by consistent specialization. Since there will be more than one such specialization, we choose the "greatest", i.e. all others should specialize it.

Definition 5.10. Let $Y \subseteq X$ be state spaces, $S$ a $Y$-operation and $I$ an $X$-invariant. Then an operation $S_I$ on $X$ is called Greatest Consistent Specialization (GCS) of $S$ with respect to $I$ iff

(i) $S_I \sqsubseteq S$ holds,

(ii) $S_I$ is consistent with respect to $I$ and

(iii) for each operation $T$ on $X$ satisfying properties (i) and (ii) (instead of $S_I$) we have $T \sqsubseteq S_I$.

Example 5.5. Consider $S$ = loop, which is already consistent with respect to any invariant $I$. Hence $S_I$ must also be loop.

Similarly, $S$ = fail is consistent with respect to any invariant $I$, which gives $S_I$ = fail. □

Example 5.6. Let $\mathbb{Z}$ denote the set of integers. Take the state space $X = \{x\}$ with $x :: \mathbb{Z}$ and suppose the $X$-constraint $I \equiv x \geq 0$ and the $X$-operation $S = x := x - a$ for some constant $a \geq 0$. Then we have

S I = (x a _ x

(i) holds, since 

wlp(S)(R) ,fx=x ; ag:R 

) (x a _ x

Moreover, due to the construction of $\mathcal{T}$ and the definition of specialization, $S'$ is the least upper bound of $\mathcal{T}$ with respect to $\sqsubseteq$, hence it must also be a specialization of $S$. On the other hand, from the consistency proof obligation and the construction of $\mathcal{T}$ we obtain immediately that $S'$ is consistent with respect to $I$. Hence $S' \in \mathcal{T}$ must hold, which proves that $S'$ is a GCS $S_I$ of $S$ with respect to $I$. This completes the existence proof.

The uniqueness follows immediately, since each GCS $S'$ of $S$ with respect to $I$ must be the least upper bound of $\mathcal{T}$. □

We observe that the GCS with respect to the conjunction of invariants can be built successively. Similarly, we obtain a trivial compatibility result with respect to further specialization. Both results, given in the next proposition, have already been proven in [21].

Proposition 5.12. If $I_1$ and $I_2$ are static invariants on $X$, then for any operation $S$ on $Y \subseteq X$ the GCSs $(S_{I_1})_{I_2}$ and $S_{(I_1 \wedge I_2)}$ coincide on initial states satisfying $I_1 \wedge I_2$, i.e., $I_1 \wedge I_2 \to (S_{I_1})_{I_2}$ and $I_1 \wedge I_2 \to S_{(I_1 \wedge I_2)}$ are semantically equivalent.

For an $X$-invariant $I$ and a $Z$-operation $T \sqsubseteq S$ the GCS $T_I$ of $T$ with respect to $I$ is a specialization of $S_I$. □

5.4 The Construction of GCSs

The proof of the existence result in Proposition 5.11 is not constructive. Therefore, we have to find a way to construct the GCS of an operation with respect to some given invariant.

For the basic operations loop and fail we have already computed their GCSs in Example 5.5. For skip, which is a $\emptyset$-operation, we notice that each operation $T$ with $wp(T)(true) \Leftrightarrow true$ is already a specialization. Hence we have to find the greatest consistent operation with this property. Informally, when starting in a state $\sigma \in \Sigma_I$ the resulting state must also lie in $\Sigma_I$. When starting in a state $\sigma \in \Sigma_{\neg I}$ we may reach any final state $\tau \in \Sigma$. For $X = \{x_1, \dots, x_n\}$ this operation is given by

$$(I \to (@ x'_1, \dots, x'_n \bullet (\{x_1/x'_1, \dots, x_n/x'_n\}.I \to x_1 := x'_1 ; \dots ; x_n := x'_n))) \;\boxtimes\; (@ x'_1, \dots, x'_n \bullet x_1 := x'_1 ; \dots ; x_n := x'_n) .$$

The required properties are easily checked$^3$.
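For a very small state space the claim about skip can be confirmed by exhaustive search. The Python sketch below is an illustration under assumptions made here: three states, an arbitrary invariant given as a set of states, operations represented as tables from states to outcome sets, and specialization into an X-operation checked over all subsets as postconditions.

```python
from itertools import chain, combinations, product

STATES = (0, 1, 2)      # hypothetical three-element state space
INV = {0, 1}            # states satisfying the invariant I
BOTTOM = "∞"            # marks non-termination

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def wlp(op, post):
    return {s for s in STATES if all(t in post for t in op[s] if t != BOTTOM)}

def wp(op, post):
    return {s for s in STATES if all(t != BOTTOM and t in post for t in op[s])}

def consistent(op):
    return INV <= wlp(op, INV)

def always_terminates(op):
    """wp(T)(true) <=> true, which suffices for T to specialize skip."""
    return wp(op, set(STATES)) == set(STATES)

def specializes(t, s):
    return (wp(s, set(STATES)) <= wp(t, set(STATES))
            and all(wlp(s, r) <= wlp(t, r) for r in subsets(STATES)))

# Candidate GCS of skip: stay inside I when started in I, go anywhere otherwise.
skip_I = {s: (set(INV) if s in INV else set(STATES)) for s in STATES}

# Enumerate every operation on STATES as a table of outcome sets.
outcomes = subsets(STATES + (BOTTOM,))
all_ops = [dict(zip(STATES, choice))
           for choice in product(outcomes, repeat=len(STATES))]
candidates = [op for op in all_ops if always_terminates(op) and consistent(op)]
is_greatest = all(specializes(t, skip_I) for t in candidates)
print(len(all_ops), len(candidates), is_greatest)
```

Every consistent, always terminating operation in this model turns out to specialize `skip_I`, which is itself consistent and always terminating.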

For assignments, we assume a case-by-case analysis for selected classes of invariants. In a data-intensive context such work has been done in [24, 21], but also the rule-based approach in [10] can be exploited for this task.

In general, however, an operation is complex, built up from the basic operations and the constructors in Definition 5.2. It would be fine if the GCS could be built just by replacing the involved basic operations by their GCSs, but in general this is wrong.

Example 5.7. We have seen in Example 5.6 that GCSs may sometimes just arise from adding preconditions. Now let $X$ and $I$ be the same, but take

$$S = S_1 ; S_2 = x := x - a \;;\; x := x + a$$

for some integer $a \geq 0$. Clearly, $S$ is semantically equivalent to skip. As we have seen above, we obtain $wp(S_I)(true) \Leftrightarrow true$. However, if we replace $S_1$ and $S_2$ by their GCSs, i.e.

3 Formally, this also follows from Lemma 5.27 in Appendix 5.7. 


(S 1 ) I = (x

(ii) For all states with j= 

I we have, if 

P )fx 1 =x 0 1::: x l =x 0 l g:(8 i(i =1:::m):fy 1 = 1 ::: y m = m g::I) 

is a -constraint for S + 1 ,thenitisalsoa-constraint forS+ 1 S 2. 

In both cases the evaluation order in the substitutions is not important. 

Example 5.8. Let us continue Example 5.7 with $X = \{x :: \mathbb{Z}\}$, $I \equiv x \geq 0$, $S_1 = x := x - a$, $S_2 = x := x + a$ and $S = S_1 ; S_2$ for some integer $a \geq 0$. In this example $S_1$ is deterministic and hence its only deterministic branch.

(i) Take acharacterizing predicate P 

x = b for some b :: Z. Thenj= :I holds i 

b

Example 5.9. Now take X = fx yg with data types T x = T y being the set of nite subsets 

of some set T . Let I x y and S(a b :: T ) = S 1 S 2 with S 1 = y := y ;fag and 

S 2 = y := y [fbg. Then S 1 and S 2 are fyg-operations, and S 1 is deterministic. 

(i) Regard the -constraints of S 1 of the form P ) (8x 1 :: T x :x 1 y 0 ) with P y = y 0 

as required in Denition 5.14. We have 

fy 0 =yg:wlp(fy=y 0 g:S 1 )(P ) (8x 1 :: T x :x 1 y 0 )) , 

8x 1 :: T x :x 1 y 0 ;fag , 

false : 

Hence there is no such constraint and consequently the conjunction of these constraints 

is equivalent totrue, which is trivially a -constraint forS. 

(ii) Then regard the -constraints of S 1 of the form P ) (8x 1 :: T x :x 1 6 y 0 )withP y = 

y 0 as required in Denition 5.14. We have 

fy 0 =yg:wlp(fy=y 0 g:S 1 )(P ) (8x 1 :: T x :x 1 6 y 0 )) , 

:9x 1 :: T x :x 1 y 0 ;fag , 

false : 

Again the conjunction of these constraints is equivalent to true, which is trivially a - 

constraint forS. 

Hence, in this case S is -I-reduced. 

ut 

Note the fundamental difference between these two examples. In both cases the free variables in the invariant $I$ comprise all variables of $X$. In Example 5.8 the operation $S_1$ is an $X$-operation and $S$ was not -I-reduced, whereas in Example 5.9 $S_1$ is a $Y_1$-operation for a proper subset $Y_1$ of $X$ and $S$ is -I-reduced. The following lemma shows that this observation can be generalized.

Lemma 5.15. Let the notations be as in Definition 5.14. Suppose that $S = S_1 ; S_2$ is not -I-reduced. Then we have $Y_1 = X$.

Proof. Without loss of generality wemay assume that S 1 is deterministic. 

Let P ) fx 1 =x 0 1 ::: x l=x 0 l g:(8 i(i = 1:::m):fy 1 = 1 ::: y m = m g:K) be a -constraint 

for S 1 , where K is either I or :I. Furthermore, let P x 1 = d 1^:::^x n = d n with n = l+m 

and y 1 = x l+1 ::: y m = x n and assume j= :K. Thenweget 

fx 0 1 =x 1::: x 0 l =x lg:wlp(S 1 0 ) 

(P )fx 1 =x 0 1 ::: x l=x 0 l g:(8 i(i =1:::m):fy 1 = 1 ::: y m = m g:K)) , 

8 i (i =1:::m):(fx 0 1=x 1 ::: x 0 l =x lg:wlp(S 1 0 ) 

(P )fx 1 =x 0 1::: x l =x 0 l g:fy 1= 1 ::: y m = m g:K)) : 

For Y 1 6= X we have m 6= 0and at least one i will be bound in this formula. Since j= :K 

holds, this formula can never be satised. Hence the conjunction of -constraints for S 1 of the 

given form is true, which is trivially a -constraint for S. Hence S is -I-reduced. ut 


Finally, we may extend Denition 5.14 to arbitrary operations requiring all occurring sequences 

to be -I-reduced. 

Definition 5.16. Let $S$ be an $X$-operation and $I$ some $Y$-invariant with $X \subseteq Y$. $S$ is called $I$-reduced iff the following holds:

(i) If $S$ is one of fail, skip, loop or an assignment, then $S$ is always $I$-reduced.

(ii) If $S = S_1 ; S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ are $I$-reduced and $S$ is $\delta$-$I$-reduced.

(iii) If $S$ is one of $P \rightarrow T$, $@y :: T_y \bullet T$, $S_1 \,\Box\, S_2$ or $S_1 \boxtimes S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ or $T$ respectively are $I$-reduced.

(iv) If $S = \mu T.f(T)$, then $S$ is $I$-reduced iff $f^\alpha(loop)$ is $I$-reduced for each ordinal number $\alpha$.
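The structural recursion in Definition 5.16 can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the class names (`Basic`, `Seq`, `Guard`, `Unbounded`, `Choice`, `Restricted`) and the function `is_i_reduced` are hypothetical and not part of the paper, the test for reducedness of a sequence in the sense of Definition 5.14 is not implemented but passed in as an oracle `delta_reduced`, and case (iv) for recursive operations is left out because it quantifies over all ordinal numbers.

```python
from dataclasses import dataclass
from typing import Callable, Union


# Hypothetical syntax tree for the constructors named in Definition 5.16.
@dataclass(frozen=True)
class Basic:            # fail, skip, loop or an assignment
    name: str


@dataclass(frozen=True)
class Seq:              # sequence S1 ; S2
    first: "Op"
    second: "Op"


@dataclass(frozen=True)
class Guard:            # guarded operation P -> T
    pred: str
    body: "Op"


@dataclass(frozen=True)
class Unbounded:        # unbounded choice over a variable y
    var: str
    body: "Op"


@dataclass(frozen=True)
class Choice:           # choice between S1 and S2
    left: "Op"
    right: "Op"


@dataclass(frozen=True)
class Restricted:       # restricted choice between S1 and S2
    left: "Op"
    right: "Op"


Op = Union[Basic, Seq, Guard, Unbounded, Choice, Restricted]


def is_i_reduced(op: Op, delta_reduced: Callable[[Seq], bool]) -> bool:
    """Cases (i)-(iii) of Definition 5.16; `delta_reduced` is an assumed oracle."""
    if isinstance(op, Basic):                       # case (i)
        return True
    if isinstance(op, Seq):                         # case (ii)
        return (is_i_reduced(op.first, delta_reduced)
                and is_i_reduced(op.second, delta_reduced)
                and delta_reduced(op))
    if isinstance(op, (Guard, Unbounded)):          # case (iii), one operand
        return is_i_reduced(op.body, delta_reduced)
    if isinstance(op, (Choice, Restricted)):        # case (iii), two operands
        return (is_i_reduced(op.left, delta_reduced)
                and is_i_reduced(op.right, delta_reduced))
    raise TypeError(f"unsupported constructor: {op!r}")
```

The oracle keeps the sketch honest: deciding reducedness of a sequence is the hard part and, as the concluding remarks note, it is undecidable in general.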

5.4.2 An Upper Bound for GCSs 

Now we are prepared for our first goal. We prove that the GCS $S_I$ of an $I$-reduced operation $S$ specializes $S'_I$, which is built by replacing each primitive operation in $S$ by its GCS. As announced, the proof will be done by structural induction on $S$ using the constructors in Definition 5.2.

Proposition 5.17. Let $S' = P \rightarrow S$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S'$ is consistent with respect to $I$, then we have $T \sqsubseteq P \rightarrow S_I$.

Proof. Since $T \sqsubseteq S' \sqsubseteq S$ holds and $T$ is consistent with respect to $I$, we conclude $T \sqsubseteq S_I$ from Definition 5.10. In addition we have $\neg P \Rightarrow wp(S')(false) \Rightarrow wp(T)(false)$, hence $T \sqsubseteq P \rightarrow S_I$ follows. ⊓⊔

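Proposition 5.18. Let $S = S_1 \,\Box\, S_2$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I \,\Box\, (S_2)_I$.

Proof. $T$ is semantically equivalent to $T' \,\Box\, Q \rightarrow loop$ with $wp(T')(true) \Leftrightarrow true$, $wlp(T')(R) \Leftrightarrow wlp(T)(R)$ for all $R$ and $Q \Leftrightarrow wp(T)^*(false)$. Then $Q \rightarrow loop \sqsubseteq S$ implies

$$Q \rightarrow loop = (Q_1 \rightarrow loop) \,\Box\, (Q_2 \rightarrow loop)$$

with $Q_i \rightarrow loop \sqsubseteq S_i$ for $i = 1, 2$. If we show $T' \sqsubseteq (S_1)_I \,\Box\, (S_2)_I$, then also

$$T \sqsubseteq \underbrace{(S_1)_I \,\Box\, (Q_1 \rightarrow loop)}_{(S_1)'_I} \;\Box\; \underbrace{(S_2)_I \,\Box\, (Q_2 \rightarrow loop)}_{(S_2)'_I}$$

holds, but it is easy to see that $(S_i)'_I \sqsubseteq (S_i)_I$ holds for $i = 1, 2$. Hence the result.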

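From now on we may therefore assume without loss of generality that $wp(T)(true) \Leftrightarrow true$ holds. Then for any state $\sigma$ define $T_\sigma = T ; (P_\sigma \rightarrow skip)$ with a characterizing predicate $P_\sigma$ of the state $\sigma$. Clearly, $T_\sigma$ is a deterministic specialization of $T$, since

$$wlp(T_\sigma)^*(P_\tau) \Leftrightarrow wlp(T)^*(P_\sigma \wedge P_\tau) \Leftrightarrow \begin{cases} wlp(T)^*(P_\sigma) & \text{for } \sigma = \tau \\ false & \text{else} \end{cases} \;\Rightarrow\; wlp(T)^*(P_\tau) .$$

Since $wp(T_\sigma)(P_\sigma) \Leftrightarrow wp(T_\sigma)(true)$, we may also derive the stated determinism from this computation. Analogously we have

$$wp(T_\sigma)^*(P_\tau) \Leftrightarrow wp(T)^*(P_\sigma \wedge P_\tau) \Leftrightarrow \begin{cases} wp(T)^*(P_\sigma) & \text{for } \sigma = \tau \\ wp(T)^*(false) & \text{else} \end{cases} \;\Rightarrow\; wp(T)^*(P_\tau)$$

since predicate transformers are monotonic. Consequently $T_\sigma$ is also consistent with respect to $I$.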

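In addition the determinism implies that $T_\sigma$ is semantically equivalent to $T^1_\sigma \,\Box\, T^2_\sigma$ with $T^i_\sigma \sqsubseteq S_i$ for $i = 1, 2$. More precisely we have $T^i_\sigma = P^i_\sigma \rightarrow T_\sigma$ with

$$P^i_\sigma \equiv \{z/y\}.wlp(\{y/z\}.T_\sigma)(\{y/z\}.P_\sigma \Rightarrow wlp(S_i)^*(z = y)) .$$

From Proposition 5.7 we derive

$$P^1_\sigma \vee P^2_\sigma \Leftrightarrow \{z/y\}.wlp(\{y/z\}.T_\sigma)(P_\sigma \Rightarrow wlp(S)^*(z = y)) \Leftrightarrow true \qquad (5.49)$$

since $T_\sigma \sqsubseteq S$ holds, hence $T_\sigma = (P^1_\sigma \vee P^2_\sigma) \rightarrow T_\sigma = P^1_\sigma \rightarrow T_\sigma \,\Box\, P^2_\sigma \rightarrow T_\sigma$. Then it follows from Definition 5.10 that $T^i_\sigma \sqsubseteq (S_i)_I$ holds, hence also $T_\sigma \sqsubseteq (S_1)_I \,\Box\, (S_2)_I$.

Finally, the least upper bound of all $T_\sigma$ with respect to $\sqsubseteq$ must specialize $(S_1)_I \,\Box\, (S_2)_I$, but this least upper bound is semantically equivalent to $T$, which completes the proof. ⊓⊔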

Unbounded choice can be handled analogously to choice.

Proposition 5.19. Let $S' = @y :: T_y \bullet S$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S'$ is consistent with respect to $I$, then we have $T \sqsubseteq @y :: T_y \bullet S_I$. ⊓⊔

Proposition 5.20. Let $S = S_1 \boxtimes S_2$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have

$$T \sqsubseteq (S_1)_I \,\Box\, wp(S_1)(false) \rightarrow (S_2)_I \sqsubseteq (S_1)_I \boxtimes (S_2)_I .$$

Moreover, we have $T \sqsubseteq (S_1)_I \,\Box\, wlp(S_1)(false) \rightarrow (S_2)_I$.

Proof. Define $T_1 = wp(S_1)^*(true) \rightarrow T$ and $T_2 = wp(S_1)(false) \rightarrow T$. Since $wp(S_1)^*(true) \vee wp(S_1)(false) \Leftrightarrow true$, we certainly have $T = T_1 \,\Box\, T_2$. Moreover, $T_1 \sqsubseteq S_1$ and $T_2 \sqsubseteq wp(S_1)(false) \rightarrow S_2$ obviously hold.

Since $T_1$ and $T_2$ are consistent with respect to $I$, it follows that $T_1 \sqsubseteq (S_1)_I$ and $T_2 \sqsubseteq wp(S_1)(false) \rightarrow (S_2)_I \sqsubseteq wp((S_1)_I)(false) \rightarrow (S_2)_I$, hence also

$$T_1 \,\Box\, T_2 \sqsubseteq (S_1)_I \,\Box\, wp(S_1)(false) \rightarrow (S_2)_I \sqsubseteq (S_1)_I \,\Box\, wp((S_1)_I)(false) \rightarrow (S_2)_I = (S_1)_I \boxtimes (S_2)_I$$

which proves the first result. Since $wp(S_1)(false) \Rightarrow wlp(S_1)(false)$ holds, the second result is obvious. ⊓⊔

The most difficult part concerns sequences. In this case the proof is rather lengthy and requires several lemmata concerning a specific form of GCSs and a very technical result on $\delta$-$I$-reducedness. Therefore the proof is deferred to Appendix 5.7.

Proposition 5.21. Let $S = S_1 ; S_2$ be an $I$-reduced $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I ; (S_2)_I$. ⊓⊔


The remaining case is given by an $I$-reduced recursive operation $\mu S.f(S)$. For this we use induction on ordinal numbers. The main difficulty is to bring together two different partial orders, the specialization order $\sqsubseteq$ of Definition 5.6 and the Nelson-order in (5.35). The specialization order is fundamental for GCSs, whereas the Nelson-order is required for recursion.

For recursive guarded commands the monotonicity of all operation constructors with respect to the Nelson-order is fundamental [19]. Unfortunately, a similar result does not hold for the specialization order. More precisely, the result is false for the $\boxtimes$-constructor in its first component.

Lemma 5.22. Let $f(S)$ be a guarded command expression built from the constructors in Definition 5.2 except restricted choice $\boxtimes$. Then $f$ is monotonic with respect to the specialization order $\sqsubseteq$.

Proof. The proof is done by structural induction. For each constructor it is completely analogous to the corresponding proof for the Nelson-order in [19]. We omit the details. ⊓⊔

In Proposition 5.20 we have seen that $S_I$ may contain the choice-constructor instead of the restricted choice, provided we include some guard. Replacing within a recursive operation some $S_1 \boxtimes S_2$ by $(S_1)_I \boxtimes (S_2)_I$ would destroy the required result.

The next lemma follows from combining Propositions 5.17-5.21.

Lemma 5.23. Let $T$ be a consistent specialization of some $I$-reduced $f(S')$ with respect to $I$, where $f(S)$ is an expression built from the constructors in Definition 5.2. Construct $f_I(S)$ from $f(S)$ as follows:

(i) Each restricted choice $S_1 \boxtimes S_2$ occurring within $f(S)$ will be replaced by

$$S_1 \,\Box\, wlp(S_1)(false) \rightarrow S_2 .$$

(ii) Then each basic operation, i.e. skip and assignments $x := E(x)$, will be replaced by their GCSs with respect to $I$.

Then we have $T \sqsubseteq f_I(S'_I)$. ⊓⊔

Proposition 5.24. Let $S' = \mu S.f(S)$ be an $I$-reduced $Y$-operation and $T \sqsubseteq S'$ be a consistent specialization with respect to some $X$-invariant $I$ with $Y \subseteq X$. Then we also have $T \sqsubseteq \mu S.f_I(S)$, where $f_I(S)$ is built as in Lemma 5.23. ⊓⊔

Again, the proof is rather lengthy and requires additional lemmata. Hence it is deferred to Appendix 5.8. We may now summarize the results achieved so far, which gives the announced upper bound theorem.

Theorem 5.25. Let $S$ be some $I$-reduced $Y$-operation and $I$ some $X$-invariant with $Y \subseteq X$. Let $S'_I$ result from $S$ as follows:

(i) Each restricted choice $S_1 \boxtimes S_2$ occurring within $S$ will be replaced by $S_1 \,\Box\, wlp(S_1)(false) \rightarrow S_2$.

(ii) Then each basic operation, i.e. loop, fail, skip and assignments $x := E(x)$, will be replaced by the GCSs with respect to $I$.

Then $T \sqsubseteq S'_I$ holds for each consistent specialization $T \sqsubseteq S$ with respect to $I$. ⊓⊔
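The two rewriting steps of Theorem 5.25 are purely syntactic, so they can be illustrated by a small tree transformation. The Python sketch below is an assumption-laden illustration and not the authors' construction in executable form: operations are encoded as nested tuples with hypothetical tags (`"basic"`, `"seq"`, `"choice"`, `"rchoice"`, `"guard"`, `"unbounded"`, `"mu"`, `"var"`), the GCS of a basic operation is not computed but supplied by the caller as `gcs_of_basic`, and the guard $wlp(S_1)(false)$ is kept symbolic as `("wlp_false", s1)`, referring to the original, untransformed first operand.

```python
from typing import Callable, Tuple

Op = Tuple  # nested tuples such as ("seq", op1, op2); hypothetical encoding


def upper_bound(op: Op, gcs_of_basic: Callable[[Op], Op]) -> Op:
    """Rewrite an operation following steps (i) and (ii) of Theorem 5.25."""
    tag = op[0]
    if tag == "basic":                        # loop, fail, skip, x := E(x)
        return gcs_of_basic(op)               # step (ii): supplied by the caller
    if tag == "var":                          # recursion variable stays as it is
        return op
    if tag == "rchoice":                      # step (i): remove restricted choice
        _, s1, s2 = op
        guard = ("wlp_false", s1)             # symbolic wlp(S1)(false) of the original S1
        return ("choice",
                upper_bound(s1, gcs_of_basic),
                ("guard", guard, upper_bound(s2, gcs_of_basic)))
    if tag in ("seq", "choice"):              # binary constructors
        _, s1, s2 = op
        return (tag, upper_bound(s1, gcs_of_basic), upper_bound(s2, gcs_of_basic))
    if tag in ("guard", "unbounded", "mu"):   # (tag, label, body)
        _, label, body = op
        return (tag, label, upper_bound(body, gcs_of_basic))
    raise ValueError(f"unknown constructor: {tag!r}")
```

For instance, with a placeholder `gcs_of_basic` that merely marks basic operations, the term `("rchoice", ("basic", "skip"), ("var", "S"))` becomes a `"choice"` node whose second branch is guarded by `("wlp_false", ("basic", "skip"))`.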


5.4.3 The General Form of GCSs 

Now we are prepared to state the main result on the general form of GCSs. Informally, the GCS of an $I$-reduced operation $S$ results in two steps. First we have to remove all restricted choices and to replace basic update operations by their GCSs. Then we have seen that the GCS $S_I$ specializes the resulting $S'_I$. Now add a precondition $P(S'_I)$ that "filters" only those computations of $S'_I$ that specialize the original $S$. This precondition corresponds to the normal form of the specialization proof obligation in (5.49).

Theorem 5.26. Let $I$ be an $X$-invariant and $S$ some $I$-reduced $Y$-operation with $Y \subseteq X = \{x_1, \dots, x_n\}$. Let $S'_I$ result from $S$ by first replacing each restricted choice $S_1 \boxtimes S_2$ by $S_1 \,\Box\, (wlp(S_1)(false) \rightarrow S_2)$ and then each basic assignment operation by its GCS with respect to $I$. For a disjoint copy $\{z_1, \dots, z_n\}$ of $X$ define

$$P(S'_I) \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n))$$

where $T$ results from $S'_I$ by renaming all $x_i$ to $z_i$. Then the GCS of $S$ with respect to $I$ can be written in the form $S_I = P(S'_I) \rightarrow S'_I$.

Proof.

Let $R$ be an arbitrary $X$-invariant. Then we have

$$wlp(S_I)^*(R) \Leftrightarrow P(S'_I) \wedge wlp(S'_I)^*(R) \stackrel{(5.49)}{\Rightarrow} wlp(S)^*(R)$$

and the wp-condition can be proven analogously, which implies that $S_I$ as given in the theorem is a specialization of $S$. Since $S'_I$ is consistent with respect to $I$, the same holds for $S_I$. Hence the given operation $S_I$ is indeed a consistent specialization.

It remains to show that it is already the GCS. Let $T \sqsubseteq S$ be some arbitrary consistent specialization and assume without loss of generality that $wp(T)(true) \Leftrightarrow true$ holds. From Theorem 5.25 we know that $T \sqsubseteq S'_I$ holds. Hence the result follows from $wp(T)^*(true) \Rightarrow P(S'_I)$.

If j= :P (SI 0 ) holds, we conclude from Proposition 5.7 that there exists some state 0 

with 

j= wlp(S) (P 0) ^ :wlp(S 0 I)(P 0) 

hence also j= wlp(S) (P 0) ^ :wlp(T )(P 0), since T v SI 0 holds. But since T v S is 

assumed, we must have j= :wp(T ) (true) follows, which completes the proof. 

ut 

Let us finally come back to our starting point and look at practical applications. In general, GCSs are non-deterministic, which reflects various strategies for consistency enforcement. In most practical cases, however, we are only interested in one of these strategies, i.e., we usually select a deterministic or quasi-deterministic branch of the GCS. The selection of quasi-deterministic branches is related to interactive support for the values to be selected.

Consider the special case where we deal with finite sets as in Examples 5.1 and 5.3. One strategy would be to change values as little as possible, i.e. the symmetric difference between the old and the new values should be minimized. According to our general result on GCS construction a reasonable result can be achieved if we already choose such quasi-deterministic branches for the GCSs of the basic operations involved. In particular, we only have to take care of assignments. We demonstrate this approach by a final example.


Example 5.10. Let us continue Example 5.3, which comprises the essentials of the application example 5.1. The following calculations will justify the informal approach in Example 5.2. Let the notations be as in Example 5.3. Then we consider the $\{x_1\}$-operation $S(a, b :: Z) = x_1 := x_1 \cup \{(a, b)\}$. Proposition 5.12 allows the GCS to be built successively. Let us take the invariants in the order given in Example 5.3.

Step 1. First consider the inclusion invariant $I_1$. Since $S$ is just an assignment, it is $I_1$-reduced. We then replace $S$ by a branch of its GCS with respect to $I_1$ and obtain

$$S'_I(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\} ; (\, a \notin map(\pi_1)(x_2) \rightarrow @c :: INT \bullet x_2 := x_2 \cup \{(a, c)\} \boxtimes skip \,) \qquad (5.54)$$

which is an $X$-operation with $P(S'_I) \Leftrightarrow true$. Then we redefine $S$ by (5.54).

Step 2. Now consider the invariant $I_2$. Since $S$ is a sequence $S_1 ; S_2$ with a $\{x_1\}$-operation $S_1$, the $I_2$-reducedness follows from Lemma 5.15.

We have to remove the restricted choice and then replace the basic assignment to $x_2$ by the following GCS branch with respect to $I_2$:

$$(\, a \notin map(\pi_1)(x_2) \rightarrow c \notin map(\pi_2)(x_2) \rightarrow x_2 := x_2 \cup \{(a, c)\} \,) \,\Box\, (\, a \in map(\pi_1)(x_2) \rightarrow skip \,)$$

For the resulting operation $S'_I$ we compute $P(S'_I) \Leftrightarrow true$. After some rearrangements we obtain the following GCS branch with respect to $I_1 \wedge I_2$:

$$x_1 := x_1 \cup \{(a, b)\} ; ((\, a \notin map(\pi_1)(x_2) \rightarrow @c :: INT \bullet c \notin map(\pi_2)(x_2) \rightarrow x_2 := x_2 \cup \{(a, c)\} \,) \,\Box\, a \in map(\pi_1)(x_2) \rightarrow skip \,) . \qquad (5.55)$$

Then we redefine the body of $S$ by (5.55).

Step 3. Now regard the exclusion invariant $I_3$. Again $I_3$-reducedness follows from Lemma 5.15. Replace $S_1 = x_1 := x_1 \cup \{(a, b)\}$ in $S$ by the GCS branch

$$x_1 := x_1 \cup \{(a, b)\} ; x_2 := x_2 - \{x \in x_2 \mid \pi_2(x) = b\}$$

and analogously replace $S_2 = x_2 := x_2 \cup \{(a, c)\}$ by the GCS branch

$$x_2 := x_2 \cup \{(a, c)\} ; x_1 := x_1 - \{x \in x_1 \mid \pi_2(x) = c\} .$$

Then we compute

$$P(S'_I) \Leftrightarrow b \notin map(\pi_2)(x_2) \wedge (a \notin map(\pi_1)(x_2) \Rightarrow \forall c :: INT.\, (c \notin map(\pi_2)(x_2) \Rightarrow c \notin map(\pi_2)(x_1) \cup \{b\}))$$

hence after some rearrangements the final result is

$$S_I(a, b :: INT) = b \notin map(\pi_2)(x_2) \rightarrow x_1 := x_1 \cup \{(a, b)\} ; ((\, a \notin map(\pi_1)(x_2) \rightarrow @c :: INT \bullet c \notin map(\pi_2)(x_2) \wedge c \notin map(\pi_2)(x_1) \rightarrow x_2 := x_2 \cup \{(a, c)\} \,) \,\Box\, a \in map(\pi_1)(x_2) \rightarrow skip \,) .$$

Note that this result reflects exactly the informal considerations in Example 5.2. ⊓⊔
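The final operation of Example 5.10 can be executed on finite sets of pairs. The following Python sketch makes several assumptions that are not taken from the text: the state is modelled as a pair `(x1, x2)` of frozensets, the unbounded choice over INT is restricted to a finite iterable `candidates`, a failing guard is modelled by producing no outcome at all, and the helper names (`proj`, `insert_gcs`, `inclusion`, `exclusion`) are hypothetical. Since Example 5.3 is not reproduced here, the two checks reflect only one reading of the guards: inclusion of the first components of `x1` in those of `x2`, and disjointness of the second components.

```python
from typing import FrozenSet, Iterable, Iterator, Set, Tuple

Pair = Tuple[int, int]
State = Tuple[FrozenSet[Pair], FrozenSet[Pair]]  # (x1, x2)


def proj(i: int, rel: Iterable[Pair]) -> Set[int]:
    """Set of i-th components (i = 0 or 1), playing the role of map(pi_{i+1})."""
    return {t[i] for t in rel}


def insert_gcs(state: State, a: int, b: int,
               candidates: Iterable[int]) -> Iterator[State]:
    """Enumerate the outcomes of the final operation for inserting (a, b)."""
    x1, x2 = state
    if b in proj(1, x2):              # outer guard fails: no outcome
        return
    x1 = x1 | {(a, b)}                # the assignment to x1
    if a in proj(0, x2):              # second branch: skip
        yield (x1, x2)
        return
    for c in candidates:              # choice of c, restricted to the candidates
        if c not in proj(1, x2) and c not in proj(1, x1):
            yield (x1, x2 | {(a, c)})


def inclusion(state: State) -> bool:
    """Assumed reading of I_1: first components of x1 also occur in x2."""
    x1, x2 = state
    return proj(0, x1) <= proj(0, x2)


def exclusion(state: State) -> bool:
    """Assumed reading of I_3: second components of x1 and x2 are disjoint."""
    x1, x2 = state
    return proj(1, x1).isdisjoint(proj(1, x2))
```

Starting from `x1 = {(1, 10)}` and `x2 = {(1, 20)}`, inserting `(2, 11)` yields one outcome per admissible `c`, and every outcome satisfies both checks; inserting a pair whose second component already occurs in `x2` yields no outcome.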



The work reported in this paper deals with consistency enforcement in formal specifications. This approach formalizes the problem by greatest consistent specializations (GCSs). Under the technical prerequisite of reducedness the computation of such a GCS can be traced back to the definition of GCSs for basic update operations. It is possible to replace basic operations within a complex operation specification by their GCSs and to compute a specialization precondition. This result is a step towards a general and theoretically founded solution of consistency enforcement. However, a series of open problems is left for future research.

- The notion of a GCS can also be defined for transition constraints. As for static constraints, existence, uniqueness and compatibility results are already known [21]. The problem is to extend the results on GCS construction as well.

- The result on the construction of GCSs allows the problem of consistency enforcement to be reduced to basic operations, i.e. assignments, and simple constraints that are combined by conjunction. The remaining problem is to find GCSs for basic operations, which has to be done case by case for selected classes of constraints (cf. [24, 21]).

- GCS construction depends on checking for I-reducedness. This is equivalent to showing that certain first-order formulae derived from I are tautologies. Whilst this is undecidable in general, the problem is to characterize those invariants I for which I-reducedness is decidable.

- Even if we are able to decide I-reducedness, the problem is how to proceed in the case of non-reduced operations. It is not very satisfactory to break off with no result. The question is to find a reduction algorithm or at least to find conditions under which such an algorithm could exist. For inclusion, exclusion, functional and cardinality constraints in data-intensive applications such reductions have been worked out in [24].

- By using axiomatic semantics GCSs are only determined up to semantic equivalence. The construction of a GCS, however, will result in a concrete syntactic form using typed guarded commands. An operational interpretation of this form may involve backtracking [19] and may be totally inefficient. Hence optimization may be required.

All these problems are a bit technical in nature. The hardest problem, however, is concerned with the selection of the specialization order. As the use of the main result in practice demonstrates, this order might still be too coarse for enforcement purposes. On the other hand, multi-valued dependencies, which concern only one set-valued state variable, lead to preconditions, although we might expect additional changes instead [21].

One possible solution might be to choose an order based on $\delta$-constraints, but then the problem leads back to GCSs because of the relation between transition consistency and specialization. Closing the remaining gaps in this piece of work is a matter of current research.


Appendix 

5.6 A Normal Form for the Specialization Proof Obligation 

In this appendix we only give a proof of Proposition 5.7, which is first repeated here.

Proposition 5.7 Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$. Let $Z = \{z_1, \dots, z_n\}$ be disjoint from $X$ with $T_{x_i} = T_{z_i}$. Then $wlp(S)(R) \Rightarrow wlp(T)(R)$ holds for all $X$-invariants $R$ iff

$$\{z_1/x_1, \dots, z_n/x_n\}.wlp(T')(wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n)) \qquad (5.56)$$

holds, where $T'$ results from $T$ by renaming all the variables $x_i$ to $z_i$.


In [19] it has been shown that we may always write $wlp(T')(R)$ in the form

$$\forall z'_1, \dots, z'_n.\, (wlp(T')^*(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \Rightarrow \{z_1/z'_1, \dots, z_n/z'_n\}.R) .$$

In particular, (5.56) is equivalent to

$$\forall z'_1, \dots, z'_n.\, (wlp(T')^*(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \Rightarrow \{z_1/z'_1, \dots, z_n/z'_n\}.wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n)) .$$

Since $S$ is an $X$-operation, we conclude

$$\{z_1/z'_1, \dots, z_n/z'_n\}.wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n) \Leftrightarrow wlp(S)^*(x_1 = z'_1 \wedge \dots \wedge x_n = z'_n) .$$

Now assume $wlp(S)(R) \Rightarrow wlp(T)(R)$ for all $X$-predicates $R$. Then also $wlp(S')(R) \Rightarrow wlp(T')(R)$ holds for all $Z$-predicates $R$, where $S'$ results from $S$ by renaming all $x_i$ to $z_i$.

In particular, we maytake R as z 1 = (d 1 )^:::^z n = (d n ) for arbitrary constants d i 2 D 

and a selector function , which assigns closed terms (d) tosemantic constants d 2 D such 

that ! T = id D holds. 

But then wlp(S 0 )(R) can be rewritten as 

Hence 

fx 1 =z 1 ::: x n =z n g:wlp(S) (x 1 = (d 1 ) ^ :::^ x n = (d n )) 

8z 0 1 ::: z0 n : (wlp(T 0 ) (z 1 = z 0 1 ^ :::^ z n = z 0 n ) ) 

fx 1 =z 1 ::: x n =z n g:wlp(S) (x 1 = z1 0 ^ :::^ x n = zn 0 )) (5.57) 

holds, which implies (5.56). 

Conversely, (5.56) implies (5.57). For an arbitrary $X$-predicate $R$ we may then write $wlp(T)^*(R)$ as

$$\exists z'_1, \dots, z'_n.\, (\{z_1/x_1, \dots, z_n/x_n\}.wlp(T')^*(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \wedge \{x_1/z'_1, \dots, x_n/z'_n\}.R)$$

and by (5.57) we conclude

$$\{z_1/x_1, \dots, z_n/x_n\}.(\exists z'_1, \dots, z'_n.\, (wlp(S)^*(x_1 = z'_1 \wedge \dots \wedge x_n = z'_n) \wedge \{x_1/z'_1, \dots, x_n/z'_n\}.R))$$

which is $wlp(S)^*(R)$ by the normal form representation for $wlp(S)$. Hence $wlp(S)(R) \Rightarrow wlp(T)(R)$ as required. ⊓⊔
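On a finite state space the equivalence stated in Proposition 5.7 can be checked by brute force. The Python sketch below is an illustration under simplifying assumptions that are not in the text: states are the opaque values in `STATES`, an operation is modelled only by its map from a state to the set of possible successor states (so termination and wp are ignored), and the names `wlp`, `refines_all_predicates`, `normal_form` and `all_operations` are hypothetical. In this model the normal form (5.56) reads: every outcome of $T$ from a state is also an outcome of $S$ from that state.

```python
from itertools import combinations, product

STATES = (0, 1, 2)


def powerset(xs):
    """All subsets of xs as frozensets; each subset serves as a predicate."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


PREDICATES = powerset(STATES)


def wlp(op, post):
    """States from which every outcome of op satisfies post."""
    return frozenset(s for s in STATES if op[s] <= post)


def refines_all_predicates(s_op, t_op):
    """wlp(S)(R) implies wlp(T)(R) for every predicate R."""
    return all(wlp(s_op, r) <= wlp(t_op, r) for r in PREDICATES)


def normal_form(s_op, t_op):
    """Single check in the spirit of (5.56): T's outcomes are outcomes of S."""
    return all(t_op[s] <= s_op[s] for s in STATES)


def all_operations():
    """Every successor-set map on STATES (512 maps for three states)."""
    for images in product(PREDICATES, repeat=len(STATES)):
        yield dict(zip(STATES, images))
```

Comparing `refines_all_predicates` with `normal_form` over pairs of operations from `all_operations()` gives the same answer in every case, which is the finite counterpart of replacing the quantification over all predicates by a single proof obligation.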


5.7 Proof of the Upper Bound Theorem for Sequences 

In this appendix we give the proof of Proposition 5.21. For this we need two lemmata. The first of these shows a general syntactic form of GCSs based on unbounded choices. A similar result occurs if we exploit the equivalence to predicative specifications.

Lemma 5.27. Let $S$ be a $Y$-operation, $Y \subseteq X$ and $I$ an invariant on $X$. Then the greatest consistent specialization $S_I$ of $S$ with respect to $I$ is semantically equivalent to

$$(I \rightarrow S ; @\zeta \bullet z := \zeta ; I \rightarrow skip) \,\Box\, (\neg I \rightarrow S ; @\zeta \bullet z := \zeta) \qquad (5.58)$$

where $z$ has been used as an abbreviation of the collection of state variables in $X - Y$ and $\zeta$ ranges over values of the corresponding types. Moreover, $S_I$ is uniquely determined (up to semantic equivalence) by $S$ and $I$.

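Proof. We have to verify the conditions in Definition 5.10 for $S_I$ defined by (5.58). For an arbitrary $Y$-invariant $R$ we have

$$\begin{aligned} wlp(S_I)^*(R) &\Leftrightarrow (I \wedge wlp(S)^*(\exists \zeta.\{z/\zeta\}.(I \wedge R))) \vee (\neg I \wedge wlp(S)^*(\exists \zeta.\{z/\zeta\}.R)) \\ &\Leftrightarrow (I \wedge wlp(S)^*((\exists \zeta.\{z/\zeta\}.I) \wedge R)) \vee (\neg I \wedge wlp(S)^*(R)) \\ &\Rightarrow (I \wedge wlp(S)^*(R)) \vee (\neg I \wedge wlp(S)^*(R)) \\ &\Leftrightarrow wlp(S)^*(R) . \end{aligned}$$

Here we exploited the monotonicity of conjugate predicate transformers with respect to implication and the fact that the variables $z$ do not occur in $R$. Then the same calculation can be used with wlp replaced everywhere by wp, which shows (i). For the proof of (ii) we compute

$$\begin{aligned} wlp(S_I)(I) &\Leftrightarrow (I \Rightarrow wlp(S)(\forall \zeta.\{z/\zeta\}.(I \Rightarrow I))) \wedge (\neg I \Rightarrow wlp(S)(\forall \zeta.\{z/\zeta\}.I)) \\ &\Leftrightarrow (\neg I \Rightarrow wlp(S)(\forall \zeta.\{z/\zeta\}.I)) \end{aligned}$$

which implies $I \Rightarrow wlp(S_I)(I)$ as required.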

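For the proof of (iii) let $P_\sigma$ be a characterizing predicate and let $T \sqsubseteq S$ be an arbitrary consistent specialization of $S$. We distinguish two cases.

Case 1. Assume $P_\sigma \Rightarrow \neg I$. Then we also have $wlp(T)^*(P_\sigma) \Rightarrow wlp(T)^*(\neg I) \Rightarrow \neg I$, since $T$ is consistent with respect to $I$ and $wlp(S)^*$ is monotonic. It follows that

$$wlp(T)^*(P_\sigma) \Rightarrow \neg I \wedge wlp(S)^*(P_\sigma) \Rightarrow \neg I \wedge wlp(S)^*(\exists \zeta.\{z/\zeta\}.P_\sigma) \Rightarrow wlp(S_I)^*(P_\sigma) .$$

The last implication follows from the calculation of $wlp(S_I)^*$ in the proof of (i).

Case 2. Assume $P_\sigma \Rightarrow I$. Then we have $wlp(T)^*(P_\sigma) \Leftrightarrow wlp(T)^*(I \wedge P_\sigma)$. Since $T \sqsubseteq S$ holds, the monotonicity of $wlp(S)^*$ implies

$$wlp(S)^*(\exists \zeta.\{z/\zeta\}.(I \wedge P_\sigma)) \wedge wlp(S)^*(\exists \zeta.\{z/\zeta\}.P_\sigma) . \qquad (5.59)$$

In any case $(wlp(T)^*(P_\sigma) \wedge I) \vee (wlp(T)^*(P_\sigma) \Rightarrow \neg I)$ holds. Together with (5.59) we get

$$\begin{aligned} wlp(T)^*(P_\sigma) &\Rightarrow (I \wedge wlp(S)^*(\exists \zeta.\{z/\zeta\}.(I \wedge P_\sigma))) \vee (\neg I \wedge wlp(S)^*(\exists \zeta.\{z/\zeta\}.P_\sigma)) \\ &\Leftrightarrow wlp(S_I)^*(P_\sigma) . \end{aligned}$$

The universal conjunctivity property then implies $wlp(T)^*(R) \Rightarrow wlp(S_I)^*(R)$ for all $R$. In addition, it is easy to see that $wp(T)^*(false) \Rightarrow wp(S_I)^*(false)$ holds. Hence $T$ is a specialization of $S_I$. ⊓⊔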
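The syntactic form (5.58) has a simple reading on finite state spaces, which the following Python sketch illustrates. It is a simplified model and not the paper's typed guarded commands: a state is a pair `(y, z)` of the variables of the operation and the remaining variables, the operation `s_op` maps a value of `y` to its set of possible new values, termination is ignored, and the names `gcs`, `is_consistent` and `is_specialization` are hypothetical. Both branches run `s_op` and then choose the remaining variables arbitrarily; the branch guarded by the invariant additionally keeps only outcomes that satisfy the invariant again.

```python
from itertools import product


def gcs(s_op, inv, ys, zs):
    """Successor sets of the form (5.58) on the finite state space ys x zs."""
    result = {}
    for y, z in product(ys, zs):
        # run S, then choose the remaining variables arbitrarily
        outcomes = {(y2, z2) for y2 in s_op[y] for z2 in zs}
        if inv(y, z):
            # first branch: the trailing guard keeps outcomes satisfying inv
            outcomes = {t for t in outcomes if inv(*t)}
        result[(y, z)] = outcomes
    return result


def is_consistent(op, inv):
    """Every outcome reached from a state satisfying inv satisfies inv."""
    return all(inv(*t) for s, outs in op.items() if inv(*s) for t in outs)


def is_specialization(op, s_op):
    """Projected onto y, every transition of op is a transition of s_op."""
    return all(t[0] in s_op[s[0]] for s, outs in op.items() for t in outs)
```

For example, with `ys = zs = (0, 1, 2)`, the cyclic increment `s_op = {0: {1}, 1: {2}, 2: {0}}` and the invariant `y == z`, the state `(0, 0)` has the single outcome `(1, 1)`, whereas the inconsistent state `(0, 1)` may move to `(1, 0)`, `(1, 1)` or `(1, 2)`.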

The second technical lemma reformulates the properties of $\delta$-$I$-reducedness for deterministic $S_1$.

Lemma 5.28. Let the notations be as in Definition 5.14. Assume that $S$ is $\delta$-$I$-reduced and $S_1$ is deterministic. Then the following two conditions hold:

(i) For all states and with j= :I , j= :I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ) we 

have, if 

P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) 

is a -constraint for S 1 ,thenP )fx=x 0 g:8 Y ;X 1:fy=g:I) is a -constraint for S. 

(ii) For all states and with j= I , j= I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ) we 

have, if 

P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) ):I) 

is a -constraint for S 1 ,thenP )fx=x 0 g:8 Y ;X 1:fy=g::I) is a -constraint for S. 

Proof. Let and be states with j= :I , j= :I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ). 

Assume that 

P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) (5.60) 

is a -constraint forS 1 . Then we have 

j= wlp(S 1 ) (wlp(S 2 ) (9 Y ;X 1;X2 :fy=g:P )) ) (since S 1 is deterministic) 

j= wlp(S 1 )(wlp(S 2 ) (9 Y ;X 1;X2 :fy=g:P )) ) 

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P ) wlp(fx=x 0 g:S 2 ) (9 Y ;X 1;X2 :fy=g:fx=x0 g:P )) :(5.61) 

From (5.60) we have 

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) : 

Together with (5.61) this implies 

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:8 Y ;X 1 :fy=g:I) : 

Hence P ) fx=x 0 g:8 Y ;X 1 :fy=g:I is a -constraint for S 1. Since S is assumed to be - 

I-reduced, it follows that P ) fx=x 0 g:8 Y ;X 1:fy=g:I is also a -constraint for S, hence 

condition (i). The proof of condition (ii) is completely analogous. 

ut 

With the help of these two technical lemmata we can now approach the proof of the upper bound theorem for sequences. We use Lemma 5.27 to compute $S_I$ and $(S_1)_I ; (S_2)_I$. This allows us to compute their predicate transformers. Then we verify the required specialization, for which Lemma 5.28 must be exploited.


Proposition 5.21 Let $S = S_1 ; S_2$ be an $I$-reduced $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I ; (S_2)_I$.

Proof. We may assume without loss of generality that $wp(T)(true) \Leftrightarrow true$ holds. Then it suffices to show $wlp(S_I)^*(P_\sigma) \Rightarrow wlp((S_1)_I ; (S_2)_I)^*(P_\sigma)$ for all characterizing predicates $P_\sigma$.

Moreover, since $S_1$ is the least upper bound of its deterministic branches with respect to $\sqsubseteq$, we may assume without loss of generality that $S_1$ is deterministic. Hence the stronger properties in Lemma 5.28 can be used.

First we compute both sides of such an implication using (5.58). We have 

wlp(S I ) (P ) , (I^9:wlp(S) (fy=g:I ^fy=g:P )) _ 

(:I ^ 9:wlp(S) (fy=g:P )) 

(5.62) 

and 

wlp((S 1 ) I (S 2 ) I ) (P ) , 

(I ^wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp((S 2) I ) (P )))) _ 

(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:wlp((S 2) I ) (P ))) , 

(I ^wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:(I ^P ))))) _ 

(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:(I ^P ))))) _ 

(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:(:I ^ wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P )))) , 

9 Y ;X 1 :9 Y ;X2 0 :(wlp(S 1 ) (fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:(I ^P )))) _ 

9 Y ;X 1 :9 Y ;X2 0 ::I ^ (wlp(S 1 ) (:fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:P ))) : 

(5.63) 

Case 1. Assume P ):Iholds. Then we have wlp(S I ) (P ) ) wlp(S I ) (:I) ):I, since 

S I is consistent with respect to I. Hence, since we assume wlp(S I ) (P ), we have to consider 

only the second line of (5.62). We want to show that this implies the second line of (5.63), 

i.e. (due to consistency :I can be omitted) 

9 Y ;X 1 :9 Y ;X2 0 : (:I ^ (wlp(S 1 ) (:fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:P )))) : 

(5.64) 

Assume that (5.64) does not hold, i.e. there exists some state with 

j= wlp(S 1 )(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) : (5.65) 

We then calculate that (5.65) is equivalent to 

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(fx=x 0 g: 

8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) , 

| {z } 

R 

j= fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) )fx 0 =x 00 g:fx=x 0 g:R) , 

j= P )fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) )fx 0 =x 00 g:fx=x 0 g:R) , 

j= fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) ) (P )fx 0 =x 00 g:fx=x 0 g:R)) , 

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:R) : 


From this we conclude that 

P )fx=x 0 g:(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) (5.66) 

is a -constraint forS 1 . Due to Lemma 5.28(i), since j= :I and j= :I hold, this implies 

to be a -constraint forS, hence we get 

P )fx=x 0 g:(8 Y ;X 1:fy=g:I) (5.67) 

j= fx 0 =xg:wlp(fx=x 0 g:S)(P )fx=x 0 g:(8 Y ;X 1 :fy=g:I)) , 

(do the same calculation as above) 

j= fx 0 =xg:wlp(fx=x 0 g:S)(fx=x 0 g:8 Y ;X 1 :fy=g:I) , 

j= wlp(S)(8 Y ;X 1:fy=g:I) : (5.68) 

which leads to a contradiction, since P ):Iimplies 

wlp(S) (9 Y ;X 1 :fy=g::I) , :wlp(S)(8 Y ;X1 :fy=g:I) 

and on the other hand due to consistency we have 

wlp(S I ) (P ) ) wlp(S I ) (9 Y ;X 1;X2 :fy=g:P ) ) wlp(S) (9 Y ;X 1;X2 :fy=g:P ) : 

This proves the assertion in Case 1. 

Case 2. Now assume that P )Iand j= 

subcases. 

wlp(S I ) (P ) hold. From (5.62) we derive two 

Case 2.1 Assume j= :I ^ 9:wlp(S) (fy=g:P ). Then we conclude (always j= :::) 

9:wlp(S 1 ) (wlp(S 2 ) (fy=g:P )) , 

9:wlp(S 1 ) ((I _:I) ^ wlp(S 2 ) (fy=g:P )) , 

9:wlp(S 1 ) (I^wlp(S 2 ) (fy g:(P Î))) _9:wlp(S 1 ) (:I ^ wlp(S 2 ) (fy=g:P )) 

hence (5.63) follows. This proves Case 2.1. 

Case 2.2. Now assume j= I^9:wlp(S) (fy=g:(I ^P )). We show that this implies 

j= 9 Y ;X 1 :9 Y ;X2 0 :(wlp(S 1 ) (fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:(I ^P )))) 

(5.69) 

which implies the rst line of (5.63). 

As in Case 1 assume that (5.69) does not hold. We use analogous computations to derive 

that 

P )fx=x 0 g:(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) ):I)) 

is a -constraint forS 1 . According to Lemma 5.28, since j= I, j= I, this implies 

P )fx=x 0 g:(8 Y ;X 1 :fy=g::I) 


to be a -constraint forS. Thus, we get 

j= fx 0 =xg:wlp(fx=x 0 g:S)(P )fx=x 0 g:(8 Y ;X 1 :fy=g::I)) , 

j= fx 0 =xg:wlp(fx=x 0 g:S)(fx=x 0 g:8 Y ;X 1 :fy=g::I) , 

j= wlp(S)(8 Y ;X 1:fy=g::I) : (5.70) 

However, from our assumption and P )Iwe conclude 

wlp(S) (9 Y ;X 1 :fy=g:I) ,:wlp(S)(8 Y ;X1 :fy=g::I) 

contradicting (5.70). This proves the assertion in Case 2.2. 

ut 

5.8 Proof of the Upper Bound Theorem in the Recursive Case

In this appendix we prove the upper bound theorem for recursive operations. For this we need 

an additional lemma that deals with the compatibility of GCSs with the Nelson-order. Let 

F 

denote the least upper bound with respect to the Nelson-order and F the least upper 

bound with respect to the specialization order. 

Lemma 5.29. Let T , S and S for each ordinal number be Y -operations such that S 0 S 

holds for 0 and let I be some X-invariant for Y X. Then we have: 

(i) If T S holds, then T I F S I follows. 

(ii) The least upper bound exists and we have 

< 

S I 

0 

@ G 

< 

 

S 

 

1 

A 

I 

v 

G 

 

S 

 

I : 

< 

Proof. (i) follows, because all constructors of Denition 5.2 are monotonic in the Nelson 

order , hence the rst result follows from (5.58). 

(ii) Since S 0 S holds for 0 , the family (S ) < forms an ascending chain. From 

[19] we know that in this case the least upper bound with respect to the Nelson order exists. 

It remains to show the required specialization assertion. 

Let T 1 and T 2 denote the left hand side F and the right hand side of this assertion respectively. 

Since for all < we have S S we conclude that S I T 1 and hence also 

< 

T 2 T 1 .Thisproves the wp(T 2 )(R) ) wp(T 1 )(R) for all X-constraints R. 

It remains to show wlp(T 2 )(R) ) wlp(T 1 )(R), which follows directly from Proposition 

5.7. ut 

Now we can give the main proof. 

Proposition 5.24 Let $S' = \mu S.f(S)$ be an $I$-reduced $Y$-operation and $T \sqsubseteq S'$ be a consistent specialization with respect to some $X$-invariant $I$ with $Y \subseteq X$. Then we also have $T \sqsubseteq \mu S.f_I(S)$, where $f_I(S)$ is built as in Lemma 5.23.



Recall from [19] that we have 

S 0 

= f (loop) = f 

0 

@ G 

< 

 

f (loop) 

1 

A 

for some ordinal number , hence from Lemmata 5.23 and 5.29 we derive 

0 

T v f I 

@ 

0 

@ G 

< 

 

f (loop) 

0 1 

1 1 

T1 

A A v fI G z }| { 

 

f 

B 

(loop) I 

I @ 

C 

< 

| {z } 

The last inequality follows from Lemma 5.29(ii) applied to the operand and from the monotonicity 

off I with respect to the specialization order as stated in Lemma 5.22. 

Now dene T2 = f I (loop) and apply transnite induction to show T 1 v T 2 for all . 

For = 0 the result is obvious, since loop I is semantically equivalent toloop. 

Now assume T1 0 

v T2 0 

holds for all 0 < . For 0 < we have f 

F 

0 

I (loop) f I (loop). 

Hence the least upper bound in the Nelson order 

f 0 

(loop) exists and we have 

0 

f I 

B 

1 

T2 

z }| 0 

{ 

 

f 0 

I (loop) 

C 

0 < 

| {z } 

T2 

G 

B 

@ 

0 < 

I 

T1 

A = f I (loop) : 

Then by applying the induction hypothesis (change in T 1 to 0 and to ) for an arbitrary 

X-constraint R we get 

wlp(T 2 )(R) , 

and 

wp(T 2 )(R) , 

^ 

0 < 

_ 

0 < 

induction hypothesis 

wlp(T2 0 

)(R) ) 

induction hypothesis 

wp(T2 0 

)(R) ) 

^ 

0 < 

_ 

0 < 

A : 

wlp(T 0 

1 )(R) , wlp(T 1)(R) 

wp(T 0 

1 )(R) , wp(T 1)(R) : 

Consequently T 1 v T 2 holds. Finally, from Lemma 5.22 we conclude T v f I (T 1 ) v f I (T 2 )= 

(loop) as required. 

ut 

f I 


1. J. R. Abrial: "A Formal Approach to Large Software Construction", in J. L. A. Van de Snepscheut (Ed.): Mathematics of Program Construction, Springer LNCS 375, 1989, 1-20

2. J. R. Abrial: The B Method, Prentice Hall International (to appear)

3. J. Bicarregui, B. Ritchie: "Invariants, Frames and Postconditions: a Comparison of the VDM and B Notations", in J.C.P. Woodcock, P.G. Larsen (Eds.): Formal Methods Europe (FME'93), Springer LNCS 670, 1993, 162-182

4. D. Bjørner, C. B. Jones: Formal Specification and Software Development, Prentice Hall, 1982

5. M. Broy, G. Nelson: "Adding Fair Choice to Dijkstra's Calculus", ACM TOPLAS, vol. 16 (3), 1994, 924-938

6. S. Ceri, J. Widom: "Deriving Production Rules for Constraint Maintenance", in Proceedings 16th Conference on VLDB, 1990, 566-577

7. W. Chen, J. T. Udding: "Towards a Calculus of Data Refinement", in J. L. A. Van de Snepscheut (Ed.): Mathematics of Program Construction, Springer LNCS 375, 1989, 197-218

8. P. Cousot: "Methods and Logics for Proving Programs", in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B, Elsevier, 1990, 841-993

9. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer, Texts and Monographs in Computer Science, 1989

10. P. Fraternali, S. Paraboschi, L. Tanca: "Automatic Rule Generation for Constraint Enforcement in Active Databases", in U. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics, Springer WICS, 1993, 153-173

11. M. Gertz, U. W. Lipeck: "Deriving Integrity Maintaining Triggers from Transition Graphs", in Proceedings 9th ICDE, IEEE Computer Society Press, 1993, 22-29

12. D. Gries: The Science of Programming, Springer Texts and Monographs in Computer Science, 1981

13. T. Günther, K.-D. Schewe, I. Wetzel: "On the Derivation of Executable Database Programs from Formal Specifications", in J.C.P. Woodcock, P.G. Larsen (Eds.): Formal Methods Europe (FME'93), Springer LNCS 670, 1993, 351-366

14. C. B. Jones: Systematic Software Development using VDM, Prentice-Hall International, 1986

15. A. P. Karadimce, S. D. Urban: "Diagnosing Anomalous Rule Behaviour in Databases with Integrity Maintenance Production Rules", in Proceedings 3rd Int. Workshop on Foundations of Models and Languages for Data and Objects, 1991, 77-102

16. U. W. Lipeck: Dynamische Integrität von Datenbanken, Springer IFB 209, 1987

17. J.-J. Meyer, H. Weigand, R. Wieringa: "A Specification Language for Static, Dynamic and Deontic Integrity Constraints", in J. Demetrovics, B. Thalheim (Eds.): MFDBS 89, Springer LNCS 364, 347-366

18. C. Morgan: Programming from Specifications, Prentice Hall, 1988

19. G. Nelson: "A Generalization of Dijkstra's Calculus", ACM TOPLAS, vol. 11 (4), 1989, 517-561

20. K.-D. Schewe, I. Wetzel, J. W. Schmidt: "Towards a Structured Specification Language for Database Applications", in D. Harper, M. Norrie: Specification of Database Systems, Springer Workshops in Computing Science, 1992, 255-274

21. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: "Integrity Enforcement in Object-Oriented Databases", in U. W. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics, Springer WICS, 1993, 174-195

22. K.-D. Schewe, B. Thalheim: "Consistency Enforcement in Active Databases", in S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering - Active Databases, 1994

23. K.-D. Schewe, D. Stemple, B. Thalheim: "Higher Level Genericity in Object Oriented Databases", in S. Chakravarty (Ed.): Conference on the Management of Data, 1994

24. K.-D. Schewe: Specification and Development of Correct Relational Database Programs, book manuscript

25. T. Sheard, D. Stemple: "Automatic Verification of Database Transaction Safety", ACM ToDS, vol. 14 (3), 1989, 322-368

26. J. M. Spivey: Understanding Z, A Specification Language and its Formal Semantics, Cambridge University Press, 1988

27. J. M. Spivey: The Z Notation, A Reference Manual, Prentice Hall, 1989

28. D. Stemple, S. Mazumdar, T. Sheard: "On the Modes and Meaning of Feedback to Transaction Designer", in Proceedings SIGMOD 1987, 1987, 375-386

29. S. D. Urban, L. Delcambre: \Constraint Analysis: A Design Process for Specifying Operations on 

Objects", IEEE Transactions on Knowledge and Data Engineering, vol. 2 (4), 1990 


30. J. Widom, S. J. Finkelstein: "Set-oriented Production Rules in Relational Database Systems", in Proceedings SIGMOD, 1990, 259-270


Chapter 6 

Tailoring Consistent Specializations 

as a Natural Approach to 

Consistency Enforcement 

Contents 

6.1 Introduction
6.2 A Review on Transaction Transformation by Specialization
6.2.1 Greatest Consistent Specialization
6.2.2 The Construction of GCSs
6.2.3 Two Major Problems
6.3 Weaker Notions of Effect Preservation
6.3.1 Maximal Consistent Effect Preservers
6.3.2 Effective MCE Construction
6.4 Application Example
6.5 Conclusion
6.6 The Predicate Transformer Calculus
6.7 I-reducedness


Klaus-Dieter Schewe: Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in Databases. Available at http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html.


Abstract. Consistency enforcement may be regarded as a process of transaction transformation, where the modified transaction will be consistent with respect to a given set of constraints. The computational approach by Schewe and Thalheim requires the modified transaction to be the greatest consistent one below the original transaction with respect to some order. The order should express the preservation of the "effect" of the original transaction. Thus, the major problem is to find the right order.

The first choice, specialization, turns out to provide good computational properties, but on the one hand the order is too weak, because arbitrary changes to state variables not touched by the original transaction are allowed, and on the other hand it is too strong, as effect preservation by specialization means that further changes to the other state variables are forbidden.

In this paper, modifications of greatest consistent specializations are studied to avoid these problems. Weakening the notion of effect preservation leads to the definition of maximal consistent effect preservers (MCEs). This turns out to be a reasonable choice, since they preserve the computational strength achieved for consistent specializations. Moreover, for basic operations they are compatible with different consistency enforcement strategies chosen by users.


6.1 Introduction

Consistency enforcement is considered to be one of the major application fields of active database systems. It is expected that arbitrary sets of static integrity constraints allow repairing ECA-rules to be defined or even generated. The analysis of the resulting rule triggering system (RTS) concentrates on the termination of the rule system, the independence of the final database state from the chosen selection order of the rules, and on consistency. The work in [1] can be taken as a representative of this approach.

The mentioned requirements are not sufficient for reasonable rule behaviour, because they do not take the interaction of the rules into account. In general, given a complex database transition, rule systems may always invalidate the effect of that transition, e.g. an insertion may be turned into a deletion and vice versa. The work in [4] presents critical examples with respect to undesirable rule behaviour. In [2] the rule analysis is characterized as purely syntactical.

The basic problem seems to be that an accepted theory of consistency enforcement is still missing. Since it is easy to define an RTS that empties the database in case of any constraint violation, it is not sufficient to ensure consistency of the result. Therefore, the notion of greatest consistent specialization (GCS) was introduced in [6] as a theoretical means for a definition of consistency enforcement. The basic considerations of this approach are quite simple:

- Instead of a constraint set we may study a single constraint: just take the conjunction.
- Since consistency is basically a property of transactions, we may consider an arbitrary complex database transition.
- Then there should be a partial order, called specialization, on transactions such that it expresses the preservation of the effects of the original transition. With respect to this order a solution of the consistency enforcement problem should be the GCS.

Thus, the fundamental idea of the GCS approach is the transformation of arbitrary database transitions into GCSs which should then be handled as transactions. Both consistency and specialization can be defined in terms of the extended predicate transformer calculus [3].


The first results on GCSs demonstrated their existence, uniqueness and a commutativity property with respect to several constraints [6]. Thus, the restriction to a single constraint is only necessary for definitional purposes, since the GCS with respect to a conjunction of constraints can be built successively. Furthermore, the order of the constraints is not important for such a construction. Such a property does not hold for any rule-based approach. The price for this flexibility is the inherent non-deterministic nature of GCSs.

The interesting problem of how GCSs are to be constructed was investigated in detail in [5]. It could be shown that under mild technical restrictions the problem can be reduced to finding GCSs for basic operations. The GCS of a complex database transition results if, firstly, the basic operations involved are replaced by their GCSs and, secondly, a precondition is added. Hence, GCSs in general are partial, i.e. in certain cases there is no other choice than a rollback. This partiality cannot be achieved by rule systems. In particular, the computed precondition heavily depends on the original database transition, whilst the rule-based approach aims at a solution that is independent of user-defined database transitions.

Nevertheless, two major drawbacks exist for the GCS approach. The first one concerns the rigidity of the specialization order with respect to the part of the database affected by a transition. State changes related to the original transition can only be discarded, but not changed. E.g., with respect to a functional dependency in the relational data model (RDM) an insertion is only allowed if it is consistent. The same holds for multi-valued dependencies. This is only one reasonable enforcement strategy, but from an intuitive point of view alternatives are imaginable.

The second drawback concerns the arbitrariness with respect to the part of the database not affected by the original transition. Here any changes are allowed as long as consistency is achieved. In [5] this problem has been circumvented by allowing branches of GCSs to be computed. This pragmatic approach leads to reasonable consistent specializations and restricts the non-determinism of GCSs. In Section 6.2 we present a formal review of the GCS approach.

These two problems are taken up in this paper. In fact, the first problem is investigated, but the final solution covers both problems at once. The first idea with respect to the rigidity problem is to weaken the order and to preserve not all the effects of the original transition. Indeed, this was also the case for specialization, since only effects on the part of the database affected by the more general transition were considered. In this paper, we consider effects, formalized by specific transition constraints, that are compatible with the given static constraint. These transition constraints can be ordered by implication and we may consider minimal constraints with this property. This leads to the notion of maximal consistent effect preservers (MCEs).

Indeed, MCEs fit well with our intuition. To see this, we briefly discuss different enforcement strategies with respect to basic operations and selected classes of constraints in the RDM and convince ourselves of the naturalness of the MCE approach.

After the formal definition of MCEs in Section 6.3 we analyze the computational properties of MCEs. We shall see that existence, uniqueness and commutativity hold as they did for GCSs. Then, again under mild technical restrictions, we show that the problem can be reduced to finding MCEs for basic operations. As for GCSs, the MCE of a complex database transition results if the basic operations involved are replaced by some MCEs and a precondition is computed. Hence, MCEs preserve the computational properties of GCSs. Formal definitions and results for MCEs are presented in Section 6.3. Unfortunately, the proofs are more complicated than they were for GCSs. Therefore, proofs are omitted, but they follow the same approach as the proofs for GCSs in [5]. We conclude with a short summary.

Throughout the paper we assume some familiarity with guarded command notations and their axiomatic semantics by predicate transformers¹ [3]. Furthermore, proofs are omitted, because they tend to become rather lengthy.

6.2 A Review on Transaction Transformation by Specialization

The starting point of the computational approach to consistency enforcement was the use of axiomatic semantics in the extended style of Dijkstra [3] (see Appendix 6.6). We consider a state space X as a set of (typed) variables. In the relational model [5] each state variable corresponds to a relation schema; the possible relations define the associated type. In object oriented models [6] state variables correspond to classes with the associated class type.

A state over X is given by a map $\sigma$ which associates with each state variable a value of its type. We use $\Sigma$ to denote the set of states over X.

Then a (static) constraint I is defined as a formula (in a naturally defined many-sorted logic) with free variables in the state space X, i.e. $fr(I) \subseteq X$. It is clear that states are sufficient for the interpretation of constraints.

A database transition can then be defined by a guarded command over the state space X capturing non-determinism, partiality and general recursion. Note that such a formalization is much more general than usual definitions of transactions, but the involved orthogonality of constructors such as sequence, guard, choice etc. is significant for proofs to be kept simple. Furthermore, guarded commands are just one way to describe the syntax of transitions.

A transition constraint on a state space X is a formula J with free variables in $X \cup X'$ using a disjoint copy $X'$ of X, i.e. $fr(J) \subseteq X \cup X'$. If $x' \in X'$ corresponds to the state variable $x \in X$, then values associated to x correspond to before-states, whereas values associated to $x'$ correspond to after-states. In particular, state pairs $(\sigma, \tau) \in \Sigma \times \Sigma$ suffice to interpret transition constraints.

Each transition constraint J gives rise to a database transition S(J) in a simple way. All state pairs satisfying J are used to define S(J). In addition, since we have to decide how to handle termination, we choose also to take all pairs $(\sigma, \infty)$ into $\Delta(S(J))$. Then it can be shown that S(J) can be written as

\[ S(J) = ( @ x'_1, \ldots, x'_n \bullet J \to x_1 := x'_1 ; \ldots ; x_n := x'_n ) \Box loop . \]

The computational approach, however, abstracts from syntactic means. All definitions are given in terms of predicate transformers. This applies to the notions of operational specialization and consistency with respect to a static constraint. These are the necessary ingredients for the definition and investigation of greatest consistent specializations (GCSs).

6.2.1 Greatest Consistent Specialization 

As already said, the operational approach to consistency enforcement starts with a formal definition of the goal. The idea is quite simple. We choose an order on transitions which should model the preservation of "effects". This order is called specialization and denoted $\sqsubseteq$. Then a consistent specialization expresses both consistency and the preservation of effects. Finally we take the greatest consistent specialization, if it exists.

¹ We present a short summary of the version of Dijkstra's calculus used here in Appendix 6.6.

The intention behind specialization is quite simple. If we are given an execution of a database transition T working on a larger state space X than the database transition S working on $Y \subseteq X$, then we may restrict this computation to Y. Since states have been defined as mappings, this is just a restriction of mappings. Then specialization means that any execution of T (restricted to Y) should also be an execution of S. It is straightforward to show that this is exactly captured by the predicate transformer formulation in Definition 6.1 (i).

This also allows transition consistency of a database transition S with respect to a transition constraint J to be formalized by S specializing S(J).

As to static consistency with respect to some static constraint I, we require that any terminating execution of a transition T starting in a state satisfying I should also reach a state satisfying I, which is formalized by Definition 6.1 (ii).

Since specialization captures our intuitive notion of "preservation of effects", the definition of greatest consistent specializations in Definition 6.1 (iii) is now obvious.

Definition 6.1. Let S, T be database transitions on Y and X respectively with $Y \subseteq X$. Let I denote a static constraint on X.

(i) T specializes S ($T \sqsubseteq S$) iff for all static constraints R on Y the implications $wlp(S)(R) \Rightarrow wlp(T)(R)$ and $wp(S)(R) \Rightarrow wp(T)(R)$ hold.

(ii) T is consistent with respect to I iff $I \Rightarrow wlp(T)(I)$ holds.

(iii) $S_I$ is a greatest consistent specialization of S with respect to I iff $S_I \sqsubseteq S$ holds, $S_I$ is consistent with respect to I and $S_I$ is the greatest database transition with respect to $\sqsubseteq$ with these properties.

□
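To make the informal reading of Definition 6.1 concrete, the following Python sketch models database transitions over a tiny finite state space as sets of (before, after) state pairs. It is only an illustration under simplifying assumptions, not the predicate transformer formulation itself: non-termination is ignored, so the sketch checks just the intuitive reading (every execution of T, restricted to Y, is an execution of S) and consistency of terminating executions. The variables p and q, the domain {0, 1} and the invariant p <= q are hypothetical choices made for the example.

```python
from itertools import product


def states(variables, domain):
    """All states over `variables`; a state is a frozenset of (name, value) pairs."""
    return [frozenset(zip(variables, values))
            for values in product(domain, repeat=len(variables))]


def restrict(state, variables):
    """Restriction of a state (a mapping) to a subset of the state variables."""
    return frozenset((name, value) for name, value in state if name in variables)


def specializes(t, s, y_vars):
    """Informal reading of T ⊑ S: each execution of T, restricted to Y, is one of S."""
    return all((restrict(before, y_vars), restrict(after, y_vars)) in s
               for before, after in t)


def consistent(t, invariant):
    """Executions starting in a state satisfying the invariant end in such a state."""
    return all(invariant(dict(after))
               for before, after in t if invariant(dict(before)))


def assign(state, **updates):
    """After-state obtained by overwriting some variables."""
    return frozenset({**dict(state), **updates}.items())


def inv(state):
    """Hypothetical static constraint: p <= q."""
    return state["p"] <= state["q"]


X, Y, DOMAIN = ("p", "q"), ("p",), (0, 1)

# S works on Y only: p := 1.
S = {(st, assign(st, p=1)) for st in states(Y, DOMAIN)}
# Two candidate transitions on the larger state space X.
T_plain = {(st, assign(st, p=1)) for st in states(X, DOMAIN)}
T_repair = {(st, assign(st, p=1, q=1)) for st in states(X, DOMAIN)}
```

In this toy setting both candidates specialize S, but only `T_repair`, which also changes q, is consistent with the invariant.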

The first properties that were derived for GCSs concerned their existence, uniqueness and their relation to conjunctions (or equivalently sets) of constraints. Due to the very general approach to database transitions including non-determinism and partiality, their existence can be easily verified. Due to the abstract semantic nature of the definition the uniqueness (up to semantic equivalence) is obvious. Additionally, the first steps towards GCS theory detected a commutativity property with respect to conjunctions, at least if we restrict initial states to those satisfying the constraints [6]. We summarize:

Proposition 6.2. Let S be a database transition on Y and let I and J be static constraints on X with $Y \subseteq X$. Then the GCS $S_I$ exists and is uniquely determined up to semantic equivalence by S and I. Furthermore, $I \wedge J \to S_{I \wedge J}$ and $I \wedge J \to (S_I)_J$ are semantically equivalent.

□

This proposition is important for the practical computation of GCSs. If there exists an effective way to compute GCSs, then the proposition allows the computation to be restricted to simple constraints that cannot be decomposed as a conjunction. Then we can use the conjuncts of a more complex constraint in any order to build the GCS. Thus, the commutativity property allows consistency to be enforced stepwise taking any order of the constraints.

6.2.2 The Construction of GCSs 

In order to become practically interesting, GCSs must be constructible. How to achieve this seemed to be a hopeless problem at the beginning, at least for complex database transitions. It is clear that a naive approach (replacing just basic operations such as insertions and deletions by their GCSs) leads to wrong results. More precisely, we obtain a consistent specialization, but not the greatest, or even worse, we do not obtain a specialization at all.

The major breakthrough was achieved by requiring database transitions to be I-reduced. This is only a technical condition (see Appendix 6.7), in fact only a condition on sequences, which informally states that there is no self-repairing. We omit the technical definition here, since it is only understandable in connection with the proof of the main results [5]. Then it was shown that for I-reduced database transitions the GCS is itself a specialization of the database transition resulting from the replacement of basic operations by their GCSs (upper bound theorem), and it appears that adding a certain precondition gives the complete GCS (main theorem). We summarize:

Theorem 6.3. Let I be a static constraint on X and S some I-reduced database transition on Y with $Y \subseteq X = \{x_1, \ldots, x_n\}$. Let $S'_I$ result from S by first replacing each restricted choice $S_1 \boxtimes S_2$ by $S_1 \Box (wlp(S_1)(false) \to S_2)$ and then each basic database transition by its GCS with respect to I. For a disjoint copy $\{z_1, \ldots, z_n\}$ of X define

\[ P(S'_I) \equiv \{z_1/x_1, \ldots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \ldots \wedge z_n = x_n)) \]

where T results from $S'_I$ by renaming all $x_i$ to $z_i$. Then the GCS of S with respect to I can be written in the form $S_I = P(S'_I) \to S'_I$. □

The theorem needs some explanation concerning both its terminology and its impact. By a disjoint copy of $X = \{x_1, \ldots, x_n\}$ we mean a state space $Z = \{z_1, \ldots, z_n\}$ such that the types of $z_i$ and $x_i$ coincide and $X \cap Z = \emptyset$ holds. Then the notation $\{x/t\}.R$ with a variable x, a term t of the same type as x and a formula R denotes the result of the substitution of each free occurrence of x in R by t.

The formula $P(S'_I)$ results from a first order reformulation of the specialization condition in Definition 6.1 (i), which was basically second order. This is possible, since S works on X, T works on the disjoint copy Z, i.e. their parallel execution has the same effect as any sequence, and the formula $x_1 = z_1 \wedge \ldots \wedge x_n = z_n$ on $X \cup Z$ expresses a "glue" between X and Z. If the given formula were always true, T would be a specialization of S. Taking it as a precondition restricts T to those executions that may occur in a specialization of S. Since the T chosen in the theorem is already a consistent generalization of $S_I$, we really obtain the GCS. The lengthy proof is contained in [5].

Note that the theorem together with the commutativity result mentioned beforehand gives effective means for GCS construction and hence for consistency enforcement in the basic computational approach. The hard part of the proof is to show that $S_I \sqsubseteq S'_I$ holds (upper bound theorem), which requires a lengthy structural induction [5].

6.2.3 Two Major Problems 

Let us look at consistency enforcement from a more practical point of view and ask whether GCSs really coincide with our intuition. In general, GCSs are non-deterministic, which reflects various strategies for consistency enforcement. The approach in [5] selects a branch of the GCS which is related to an interactive support for the values to be selected.

E.g., take an inclusion constraint $x \in p \Rightarrow x \in q$ and an insertion into p; then GCS branches offer the freedom to choose any new value for q provided it is a superset of $p \cup \{x\}$. Intuitively we prefer this value to be $q \cup \{x\}$, i.e. to keep change propagation as simple as possible. For GCS branches, however, there is no such "preference", or in other words:

For a database transition on $Y \subseteq X$ the GCS approach is too liberal on $X - Y$.
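The liberality on the untouched variable q can be seen by brute-force enumeration. The sketch below is a hypothetical illustration over a small finite universe (the function names and the universe are assumptions, not part of the approach in [5]): it lists every after-state admitted for an insertion of x into p under the inclusion constraint, and compares it with the single intuitively preferred branch.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


def gcs_branches(p, x, universe):
    """After-states (p', q') with p' = p ∪ {x} and q' any superset of p ∪ {x}.

    The old value of q plays no role: any superset is admitted.
    """
    new_p = frozenset(p) | {x}
    return [(new_p, new_q) for new_q in powerset(universe) if new_p <= new_q]


def preferred_branch(p, q, x):
    """The intuitively preferred repair: propagate the insertion, q' = q ∪ {x}."""
    return frozenset(p) | {x}, frozenset(q) | {x}
```

For p = {1}, q = {1, 2}, x = 3 and the universe {1, 2, 3, 4} this yields four admissible values for q. The preferred one, {1, 2, 3}, is among them, but so is {1, 3}, which silently drops the element 2 from q.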

On the other hand, multi-valued dependencies, which concern only one set-valued state variable, lead to preconditions, although we might expect additional changes instead. In other words:

For a database transition on $Y \subseteq X$ the GCS approach is too restrictive on Y.

This demonstrates that the specialization order might still be too coarse for enforcement purposes. One possible solution originating from the work in [5] is to choose an order based on $\delta$-constraints. We shall follow this idea in Section 6.3.

6.3 Weaker Notions of Effect Preservation

Intuitively there exist various enforcement strategies with respect to basic operations that overcome the rigidity of GCSs. E.g., consider an insertion of a new tuple into a relation:

- For a functional dependency we may enforce consistency either by adding a precondition (the choice made in the practical example in [5]) or by propagating the deletion of other tuples.
- For a multivalued dependency we either propagate further insertions or add a precondition.
- For an inclusion dependency we may propagate an insertion in the other relation or add a precondition.
- For an exclusion dependency we may propagate the deletion of tuples in the other relation or add a precondition.

For the case of a deletion of a tuple the alternatives are similar.
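The two alternatives for a functional dependency can be sketched in a few lines of Python. This is a minimal illustration under assumptions of our own: the relation is a set of pairs, the dependency states that the first component determines the second, and a failed precondition (rollback) is modelled by returning None. The function names are hypothetical.

```python
def violates_fd(relation, new_tuple):
    """True if inserting new_tuple would break the dependency key -> value."""
    key, value = new_tuple
    return any(k == key and v != value for k, v in relation)


def insert_with_precondition(relation, new_tuple):
    """Strategy 1: only insert if the result is consistent, otherwise roll back."""
    if violates_fd(relation, new_tuple):
        return None  # precondition fails
    return relation | {new_tuple}


def insert_with_deletion(relation, new_tuple):
    """Strategy 2: propagate the deletion of the tuples that conflict."""
    key, _ = new_tuple
    kept = {t for t in relation if t[0] != key}
    return kept | {new_tuple}
```

With r = {(1, "a"), (2, "b")}, inserting (1, "c") is rejected by the first strategy, whereas the second replaces (1, "a") by (1, "c").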

Let us now try to characterize the relationship between the original database transition and those resulting from such rewriting efforts in order to find a weaker notion of effect preservation that allows us to overcome the problems we have with GCSs.

In general the effects of a database transition T may be expressed by transition constraints. Just take the characterizing predicate of a subset of $\Delta(T)$. Then T is certainly consistent with a constraint chosen in that way. Therefore, we introduce the notion of a $\delta$-constraint, i.e. a transition constraint that is satisfied by a database transition S [5]:

Definition 6.4. Let S be a database transition on $X = \{x_1, \ldots, x_n\}$. A $\delta$-constraint for S is a transition constraint J on X such that $\{x'_1/x_1, \ldots, x'_n/x_n\}.wlp(S')(J)$ holds, where $S'$ results from S by renaming all $x_i$ to $x'_i$.

□

Example 6.1. Look at the insertion S of a new tuple t into a relation r. Then the following formulae are $\delta$-constraints for S:

- $t \in r'$
- $\forall u.\ u \in r \Rightarrow u \in r'$
- $\forall u.\ u \in q \Leftrightarrow u \in q'$ for all relation schemata $q \neq r$
- $\forall u.\ u \neq t \wedge u \in r' \Rightarrow u \in r$

□
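The four formulae of Example 6.1 can be checked mechanically on a concrete state pair. The following Python sketch is only an illustration: a database state is assumed to be a dictionary from relation names to sets of tuples, the relation names "r" and "q" are hypothetical, and the dictionary keys used to label the four constraints are our own.

```python
def insert_tuple(state, t):
    """The insertion S: r := r ∪ {t}; all other relations stay untouched."""
    after = dict(state)
    after["r"] = state["r"] | {t}
    return after


def delta_constraints(before, after, t):
    """Evaluate the four transition constraints on a (before, after) state pair."""
    return {
        "t is in r'": t in after["r"],
        "r is contained in r'": before["r"] <= after["r"],
        "other relations unchanged": all(before[q] == after[q]
                                         for q in before if q != "r"),
        "only t is new in r'": all(u in before["r"]
                                   for u in after["r"] if u != t),
    }
```

For any before-state, the pair (before, insert_tuple(before, t)) satisfies all four constraints, whereas a transition that additionally deletes tuples from r violates the second one.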

If S is defined on $Y \subseteq X$ and we require all $\delta$-constraints J of S with $fr(J) \subseteq Y \cup Y'$ to be also $\delta$-constraints for T, then T will be a specialization of S. The converse also holds. Thus, we may replace the specialization condition by the preservation of certain $\delta$-constraints.

6.3.1 Maximal Consistent Effect Preservers

We have already seen that we may always associate a database transition S(J) with each transition constraint J. In order to preserve J we must require to specialize S(J). The basic idea of the tailored operational approach is now to consider not all $\delta$-constraints, but only some of them. Thus, we no longer build the GCS of S with respect to I, but the GCS of some S(J).

If some $\delta$-constraints of S are omitted in J, then S(J) will allow executions that do not occur in any specialization of S. In this way, we can capture the reasonable changes that were listed at the beginning of this section. However, taking any such $\delta$-constraint is much too weak. S(J) should only add executions that are consistent with I. This justifies defining $\delta$-constraints that are compatible with a given static constraint I on X in the sense that building the GCS $S(J)_I$ does not increase partiality.

Definition 6.5. A $\delta$-constraint J for a database transition S is compatible with a static constraint I iff $wp(S(J)_I)(false) \Rightarrow wp(S(J))(false)$ holds.

□

Example 6.2. It is easy to see that each of the $\delta$-constraints in the previous example is compatible with I chosen to be a multivalued dependency. Furthermore, the conjunction of three of these constraints is also compatible with I, but the conjunction of all four $\delta$-constraints is not.

□

The last example suggests considering the implication order on $\delta$-constraints. We say that $J_1$ is stronger than $J_2$ iff $J_1 \Rightarrow J_2$ holds. Unfortunately there is no smallest $\delta$-constraint compatible with I, so we cannot consider the "strongest" I-compatible $\delta$-constraint for S, but we may consider minimal elements in this order.

Definition 6.6. A $\delta$-constraint J for S is low with respect to I iff it is I-compatible and there is no strictly stronger I-compatible $\delta$-constraint.

□

Now we are prepared to define maximal consistent effect preservers for a database transition S. For these we choose a low $\delta$-constraint J which formalizes an effect of S to be preserved. Then we take a consistent database transition $S_I$ that preserves this effect, but remains undefined wherever S is undefined. Finally, we require $S_I$ to be a greatest database transition with these properties with respect to the specialization order.

Definition 6.7. Let S be a database transition and I a static constraint on X. Let J be a low $\delta$-constraint of S with respect to I. A database transition $S_I$ on X is called a maximal consistent effect preserver (MCE) of S with respect to I iff

(i) J is a $\delta$-constraint for $S_I$,

(ii) $wp(S)(false) \Rightarrow wp(S_I)(false)$ holds,

(iii) $S_I$ is consistent with respect to I and

(iv) any other database transition T with these properties specializes $S_I$.

□

Note that in this definition the state space on which S is defined is no longer important. It "vanishes" inside the chosen J. Then it is easy to see that the informal enforcement strategies at the beginning of this section are captured by MCEs for basic database transitions.

Furthermore, property (iv) employs the specialization order $\sqsubseteq$ again. This seems surprising at first, but it turns out to be a natural definition, as shown in the following lemma, which follows directly from the definitions.

Lemma 6.8. Let S be a database transition and I a static constraint on X. Let J be a low $\delta$-constraint of S with respect to I. Then $wp(S)^*(true) \to S(J)_I$ is the MCE with respect to I. □

From the lemma we may draw first conclusions:

- For a chosen low $\delta$-constraint with respect to I the MCE $S_I$ always exists and is uniquely determined (up to semantic equivalence) by S, I and J.
- MCEs are closely related to GCSs. Apart from the precondition $wp(S)^*(true)$ the MCE is the GCS of a slightly extended database transition, i.e. possible changes have been incorporated into S(J).

The lemma suggests applying the theory of GCS construction from Section 6.2 to the construction of MCEs. This idea, however, is misleading, since there is no effective way to construct S(J). Instead, we shall investigate effective MCE construction below. On the other hand, we can show that commutativity also holds for MCEs.

Proposition 6.9. For static constraints $I_1$, $I_2$ each preconditioned MCE $I_1 \wedge I_2 \to S_{I_1 \wedge I_2}$ is semantically equivalent to $I_1 \wedge I_2 \to (S_{I_1})_{I_2}$ and vice versa.

□

6.3.2 Effective MCE Construction

Let us now ask for the effective construction of MCEs for complex database transitions. Again a naive approach (replacing just basic operations such as insertions and deletions by some of their MCEs) leads to wrong results, but we observe that an MCE $S_I$ for a chosen low $\delta$-constraint J is a specialization of $S(J)'_I$, a database transition that is basically built by replacing basic database transitions by their GCSs. Hence, it seems promising not to consider the replacement by GCSs, but by selected MCEs.

As in the case of GCS construction we have to require I-reducedness, the purely technical condition which excludes self-repairing within sequences [5]. Then it can be shown that for I-reduced database transitions each MCE is itself a specialization of the database transition $(S_I)'$, a database transition that is basically built by replacing basic database transitions by MCEs (upper bound theorem). Then it appears that adding a precondition gives an MCE (main theorem). Thus, we obtain the following result:

Theorem 6.10. Let I be a static constraint on X and S some I-reduced database transition. Assume $X = \{x_1, \ldots, x_n\}$. Let $(S_I)'$ result from S by first replacing each restricted choice $S_1 \boxtimes S_2$ by $S_1 \Box (wlp(S_1)(false) \to S_2)$ and then each basic transition in S by one of its MCEs with respect to I. For a disjoint copy $\{z_1, \ldots, z_n\}$ of X define

\[ P((S_I)') \equiv \{z_1/x_1, \ldots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \ldots \wedge z_n = x_n)) \]

where T results from $(S_I)'$ by renaming all $x_i$ to $z_i$. Then

\[ S_I = wp(S)^*(true) \to ( P((S_I)') \to (S_I)' ) \]

is an MCE for S with respect to I.

□

This theorem again requires some informal explanation. Its basic impact is the reduction of the MCE construction problem to basic operations. Practically this means choosing an enforcement strategy for basic operations by means of an MCE. Then the theorem shows how to construct a corresponding MCE for any complex operation, i.e. for any "intended transaction".

This also works if alternatives for MCEs of basic operations, i.e. insertions, deletions etc., are permitted. In this case each MCE combination can be used in the theorem to define MCEs of complex transitions.

If we take the theorem together with the commutativity result mentioned beforehand, this gives effective means for MCE construction for arbitrary sets of constraints and hence for consistency enforcement in the tailored computational approach.

6.4 Application Example 

Let us now look at a simple application example for the (tailored) computational approach. We consider a simple MCE computation that is similar to the computation of a GCS branch in [5]. Consider the state space $X = \{x_1, x_2 :: FSET(INT \times INT)\}$, where $FSET(\cdot)$ is the finite set type constructor. Moreover, consider the static constraints:

$I_1 \equiv map(\pi_1)(x_1) \subseteq map(\pi_1)(x_2)$

$I_2 \equiv \forall x, y :: INT \times INT.\ x \in x_2 \wedge y \in x_2 \wedge \pi_2(x) = \pi_2(y) \Rightarrow \pi_1(x) = \pi_1(y)$

$I_3 \equiv \pi_2(x_1) \cap \pi_2(x_2) = \emptyset$

These are examples of an inclusion dependency, a functional dependency and an exclusion dependency.
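The three constraints can be checked mechanically on finite sets of integer pairs. The following is a minimal sketch, assuming that states are represented as Python sets of 2-tuples; the function names `pi1`, `pi2`, `inclusion`, `functional`, `exclusion` and `consistent` are illustrative and do not occur in the paper.

```python
def pi1(rel):
    """First-component projection of a finite set of pairs."""
    return {t[0] for t in rel}

def pi2(rel):
    """Second-component projection of a finite set of pairs."""
    return {t[1] for t in rel}

def inclusion(x1, x2):
    """I_1: every first component of x1 also occurs as a first component of x2."""
    return pi1(x1) <= pi1(x2)

def functional(x2):
    """I_2: in x2 the second component determines the first one."""
    return all(x[0] == y[0] for x in x2 for y in x2 if x[1] == y[1])

def exclusion(x1, x2):
    """I_3: x1 and x2 share no second component."""
    return pi2(x1).isdisjoint(pi2(x2))

def consistent(x1, x2):
    """Conjunction of the three static constraints."""
    return inclusion(x1, x2) and functional(x2) and exclusion(x1, x2)
```

A state such as `x1 = {(1, 10)}`, `x2 = {(1, 20)}` satisfies all three constraints, whereas replacing `x1` by `{(2, 10)}` violates the inclusion dependency.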

Example 6.3. Let the state space and constraints be as above. Now consider the $\{x_1\}$-operation $S(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\}$. Let us take the constraints in the given order.

Step 1. First consider the inclusion constraint $I_1$. We dispense with the proof of $I_1$-reducedness. $S$ is a deterministic basic assignment that can be replaced by its MCE with respect to $I_1$ and the low $\delta$-constraint

$$J \equiv x_1' = x_1 \cup \{(a, b)\} \wedge \exists c.\ x_2' = x_2 \cup \{(a, c)\}.$$

Then we compute

$$(S_I)'(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\};\ (a \notin \pi_1(x_2) \to @c :: INT \bullet x_2 := x_2 \cup \{(a, c)\} \boxtimes skip)$$

which is an $X$-operation with $P((S_I)') \Leftrightarrow true$. Define this as the new $S$.

Step 2. Now consider the invariant $I_2$. Again the reducedness proof is omitted. We have to remove the restricted choice and to replace the basic assignment to $x_2$ by the MCE with respect to $I_2$ and the low $\delta$-constraint

$$J \equiv x_1' = x_1 \wedge (x_2' = x_2 \cup \{(a, c)\} \vee x_2' = x_2),$$

which gives

$$(a \notin \pi_1(x_2) \to c \notin \pi_2(x_2) \to x_2 := x_2 \cup \{(a, c)\})\ \Box\ (a \in \pi_1(x_2) \to skip).$$

Then we compute $P(S_I') \Leftrightarrow true$. Hence the new $S$ is (after some rearrangements)

$$S(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\};\ ((a \notin \pi_1(x_2) \to @c :: INT \bullet c \notin \pi_2(x_2) \to x_2 := x_2 \cup \{(a, c)\})\ \Box\ a \in \pi_1(x_2) \to skip).$$

Step 3. Now regard the exclusion invariant $I_3$. Reducedness holds, but we omit the proof. Replace $S_1 = x_1 := x_1 \cup \{(a, b)\}$ in $S$ by the MCE

$$x_1 := x_1 \cup \{(a, b)\};\ x_2 := x_2 - \{x \in x_2 \mid \pi_2(x) = b\}$$

and analogously replace $S_2 = x_2 := x_2 \cup \{(a, c)\}$ by the MCE

$$x_2 := x_2 \cup \{(a, c)\};\ x_1 := x_1 - \{x \in x_1 \mid \pi_2(x) = c\}.$$

Then we compute

$$P(S_I') \Leftrightarrow b \notin \pi_2(x_2) \wedge (a \notin \pi_1(x_2) \Rightarrow \forall c :: INT.\ (c \notin \pi_2(x_2) \Rightarrow c \notin \pi_2(x_1) \cup \{b\})),$$

hence the final result is (after some rearrangements) semantically equivalent to

$$S_I(a, b :: INT) = b \notin \pi_2(x_2) \to x_1 := x_1 \cup \{(a, b)\};\ ((a \notin \pi_1(x_2) \to @c :: INT \bullet c \notin \pi_2(x_2) \wedge c \notin \pi_2(x_1) \to x_2 := x_2 \cup \{(a, c)\})\ \Box\ a \in \pi_1(x_2) \to skip).$$

□
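The behaviour of the final operation of Example 6.3 can be explored on small states. The sketch below is only an illustration under stated assumptions: the unbounded choice over INT is restricted to a finite iterable `candidates`, the operation returns the list of all possible successor states (an empty list when the outer guard fails), and the condition on $\pi_2(x_1)$ is evaluated after the assignment to $x_1$, as the sequential composition suggests. The name `enforce_insert` is hypothetical.

```python
def pi1(rel):
    return {t[0] for t in rel}

def pi2(rel):
    return {t[1] for t in rel}

def consistent(x1, x2):
    """I_1 (inclusion), I_2 (functional dependency on x2) and I_3 (exclusion)."""
    inclusion = pi1(x1) <= pi1(x2)
    functional = all(x[0] == y[0] for x in x2 for y in x2 if x[1] == y[1])
    exclusion = pi2(x1).isdisjoint(pi2(x2))
    return inclusion and functional and exclusion

def enforce_insert(x1, x2, a, b, candidates):
    """All successor states (x1', x2') of the consistent insertion of (a, b) into x1."""
    if b in pi2(x2):                      # outer guard: b must not occur in pi2(x2)
        return []
    new_x1 = x1 | {(a, b)}                # x1 := x1 u {(a, b)}
    if a in pi1(x2):                      # second branch of the choice: skip
        return [(new_x1, x2)]
    # first branch: choose some c that keeps I_2 and I_3 valid
    return [(new_x1, x2 | {(a, c)})
            for c in candidates
            if c not in pi2(x2) and c not in pi2(new_x1)]
```

Starting from a consistent state, every successor state returned by this sketch is again consistent, which is the property the construction is meant to guarantee.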


We investigated major problems of the computational approach to consistency enforcement. These problems are related to the specialization order chosen by this approach. We examined intuitive strategies for consistency enforcement with respect to basic operations and selected classes of constraints. In these cases the strict effect preservation property of the specialization approach can be restricted to so-called $I$-compatible $\delta$-constraints. In fact, choosing such a constraint that is minimal with respect to the implication order gives rise to the definition of maximal consistent effect preservers (MCEs) as a natural approach to consistency enforcement.

Fortunately, MCEs are closely related to greatest consistent specializations (GCSs) that were studied before. Each MCE is given by the GCS of a slightly extended transition and a precondition. This does not help directly in constructing MCEs, but it turns out that MCE construction can be done in the same way as GCS construction, i.e., the consistency enforcement problem can be reduced to finding MCEs for basic operations, which is only a problem of practical calculation.

Thus, the tailored approach presented in this paper may be considered as a general solution to (static) consistency enforcement. Moreover, as indicated in [8], an efficient and flexible implementation can be achieved by the use of linguistic reflection. The only problem that might be critical concerns the technical prerequisite of $I$-reducedness, which excludes badly written database transitions. As shown in [7] for selected classes of constraints, it is even possible to rewrite transitions in such a way that $I$-reducedness always holds.


References

1. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for Integrity Maintenance. ACM TODS 19(3), 1994, 367-422.

2. P. Fraternali, S. Paraboschi: Ordering and Selecting Production Rules for Constraint Maintenance: Complexity and Heuristic Solution. To appear in IEEE TKDE.

3. G. Nelson: A Generalization of Dijkstra's Calculus. ACM TOPLAS 11(4), 1989, 517-561.

4. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases. In S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering - Active Databases. Workshop Proceedings. Houston, February 1994.

5. K.-D. Schewe, B. Thalheim: A Computational Approach to Consistency Enforcement. Submitted for publication.

6. K.-D. Schewe, B. Thalheim, J. Schmidt, I. Wetzel: Integrity Enforcement in Object Oriented Databases. In U. W. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics. Springer Workshops in Computing. Volkse 1992, 174-195.

7. K.-D. Schewe: Specification and Development of Correct Relational Database Programs. Technical Report. Submitted for publication.

8. K.-D. Schewe, D. Stemple, B. Thalheim: Higher Level Genericity in Object Oriented Databases. In: Proc. Conference on the Management of Data (COMAD '94). Bangalore (India), December 1994.

Appendix 

6.6 The Predicate Transformer Calculus 

This section gives a brief review of Dijkstra's classical calculus [3]. Assume that $S$ is a program specification and that $X$ is the finite set of variables occurring in $S$. We usually call $X$ a state space. If $D$ is a set of values (or more generally a domain), then a state is simply a variable assignment $\sigma : X \to D$. Let $\Sigma$ be the set of all such states (more generally: a power domain). Then the overall meaning of $S$ can be given by a subset $\Delta(S) \subseteq \Sigma \times (\Sigma \cup \{\infty\})$, where $(\sigma, \tau) \in \Delta(S)$ means that starting $S$ in the initial state $\sigma$ may lead to the final state $\tau$, and $\infty$ represents non-termination. This description does not depend on the style of the specification $S$. Of course, this trivial semantics description comprises non-determinism and partiality.

Consider an infinitary logic $L$ and assume that there is an equality predicate. Regard formulae $R$ with free variables in $X$. These are called $X$-predicates. Let $F(X)$ be the set of all $X$-predicates. Let $St = (D, \omega)$ be a fixed structure for the interpretation of $L$ with semantic domain $D$ and assume that $St$ satisfies the domain closure property, i.e. for each $d \in D$ there is some closed term $t \in T(L)$ with $\omega(t) = d$. Obviously, a state $\sigma$ is sufficient to interpret an $X$-predicate. Write $\models_\sigma R$ if interpreting $R$ in state $\sigma$ yields true. Now define two mappings $wlp(S)$ and $wp(S)$ on equivalence classes of $X$-predicates:

$$\models_\sigma wlp(S)(R) \text{ iff } ((\sigma, \tau) \in \Delta(S) \wedge \tau \neq \infty \Rightarrow\ \models_\tau R) \text{ for all } \tau, \text{ and} \quad (6.71)$$

$$\models_\sigma wp(S)(R) \text{ iff } ((\sigma, \tau) \in \Delta(S) \Rightarrow \tau \neq \infty\ \wedge \models_\tau R) \text{ for all } \tau. \quad (6.72)$$

We call $w(l)p(S)(R)$ the weakest (liberal) precondition of $S$ for the postcondition $R$. Note that this definition precisely formalizes the informal meaning of $wlp(S)$ and $wp(S)$. Moreover, the predicate transformers are uniquely determined by $\Delta(S)$ up to equivalence.
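On a finite state space the two definitions can be evaluated directly. The following sketch assumes that states are hashable Python values, that a predicate is represented by the set of states satisfying it, and that the string `INF` stands for the non-termination outcome; all names are illustrative.

```python
from itertools import combinations

INF = "inf"   # stands for the non-termination outcome

def wlp(delta, states, post):
    """States from which every terminating outcome of S satisfies post."""
    return {s for s in states
            if all(t in post for (u, t) in delta if u == s and t != INF)}

def wp(delta, states, post):
    """States from which every outcome of S terminates and satisfies post."""
    return {s for s in states
            if all(t != INF and t in post for (u, t) in delta if u == s)}

STATES = {0, 1, 2}
# From 0 the specification may reach 1 or 2, from 1 it may reach 0 or diverge,
# and in state 2 it is undefined (no outcome at all).
DELTA = {(0, 1), (0, 2), (1, 0), (1, INF)}

def pairing_condition_holds(delta, states):
    """Check wp(S)(R) = wlp(S)(R) and wp(S)(true) for every predicate R."""
    everything = set(states)
    for n in range(len(states) + 1):
        for combo in combinations(sorted(states), n):
            r = set(combo)
            if wp(delta, states, r) != wlp(delta, states, r) & wp(delta, states, everything):
                return False
    return True
```

Calling `pairing_condition_holds(DELTA, STATES)` checks the pairing condition exhaustively for this small example; it is a sanity check for one relation, not a proof.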

Theorem 6.11. For a given program specification $S$ the predicate transformers $wlp(S)$ and $wp(S)$ exist. Moreover, they satisfy

$$wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(true) \quad \text{(pairing condition) and} \quad (6.73)$$

$$wlp(S)\Big(\bigwedge_{i \in I} R_i\Big) \Leftrightarrow \bigwedge_{i \in I} wlp(S)(R_i) \quad \text{(universal conjunctivity).} \quad (6.74)$$

The following inversion theorem shows that universal conjunctivity and the pairing condition already suffice to find a specification $S$ with corresponding predicate transformers. For this recall that the dual $f^*$ of a predicate transformer $f$ is defined as $f^*(R) = \neg f(\neg R)$.

Theorem 6.12. Let $f_{lp}$ and $f_p$ be predicate transformers satisfying (6.73) and (6.74) in place of $wlp(S)$ and $wp(S)$. Then for a program specification $S$ with

$$\Delta(S) = \{(\sigma, \tau) \mid\ \models_\sigma f_{lp}^*(P_\tau)\} \cup \{(\sigma, \infty) \mid\ \models_\sigma f_p^*(false)\}$$

we have $wlp(S)(R) \Leftrightarrow f_{lp}(R)$ and $wp(S)(R) \Leftrightarrow f_p(R)$. □

In [3] recursion has been investigated with respect to the order $\sqsubseteq$ defined by $S \sqsubseteq T$ iff $wlp(T)(R) \Rightarrow wlp(S)(R)$ and $wp(S)(R) \Rightarrow wp(T)(R)$ hold for all $X$-predicates $R$. Therefore, for monotonic $f$ with respect to $\sqsubseteq$ the program specification $T = \mu S.f(S)$ can be defined as a least fixpoint and $wlp(T)$ (resp. $wp(T)$) is defined by conjunction (disjunction).

Finally, regard the following language of guarded commands built recursively from the following constructs.

(i) assignments $x := E$ for a variable $x$ and a term $E$,

(ii) skip, fail, loop,

(iii) sequential composition $S_1; S_2$, choice $S_1 \Box S_2$, projection $@x \bullet S$, guard $P \to S$ and restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable.

The informal meaning of an assignment is the usual one. skip is an operation that "does nothing", loop is always defined, but never terminates, and fail is always undefined. The latter two commands are only justified as least elements with respect to the Nelson order (used for recursion) and the specialization order.

The intended meaning of a sequence is also the standard one: first execute $S_1$, then $S_2$. A guard defines a precondition. If $P$ is satisfied, $S$ is executed, otherwise there is no execution. Choice means demonic choice, i.e., choose any of $S_1$ or $S_2$ as long as it is defined, even if this leads to non-termination. Restricted choice on the other hand prefers the execution of $S_1$ unless it is undefined, in which case $S_2$ is taken. Finally, the unbounded choice operator introduces a new variable $x$ and executes $S$ on the state space extended by $x$.

For this language the axiomatic semantics can be defined by

$w(l)p(x := E)(R) \Leftrightarrow \{x/E\}.R$

$w(l)p(skip)(R) \Leftrightarrow R$

$w(l)p(fail)(R) \Leftrightarrow true$

$wlp(loop)(R) \Leftrightarrow true$ and $wp(loop)(R) \Leftrightarrow false$

$w(l)p(S_1; S_2)(R) \Leftrightarrow w(l)p(S_1)(w(l)p(S_2)(R))$

$w(l)p(P \to S)(R) \Leftrightarrow P \Rightarrow w(l)p(S)(R)$

$w(l)p(S_1 \Box S_2)(R) \Leftrightarrow w(l)p(S_1)(R) \wedge w(l)p(S_2)(R)$

$w(l)p(S_1 \boxtimes S_2)(R) \Leftrightarrow w(l)p(S_1)(R) \wedge (wp(S_1)(false) \Rightarrow w(l)p(S_2)(R))$ and

$w(l)p(@x \bullet S)(R) \Leftrightarrow \forall x.\ w(l)p(S)(R)$.

Then for every expression $f(S)$ built from the constructors above, $f$ will be monotonic with respect to $\sqsubseteq$ and we get

$$wlp(\mu S.f(S))(R) \Leftrightarrow \bigwedge_{\alpha} wlp(f^\alpha(loop))(R) \quad \text{and} \quad wp(\mu S.f(S))(R) \Leftrightarrow \bigvee_{\alpha} wp(f^\alpha(loop))(R)$$

where $\alpha$ ranges in both cases over the ordinal numbers.

For any guarded command $S$ we may also consider the conjugate predicate transformers $wp(S)^*$ and $wlp(S)^*$, which are defined by

$$w(l)p(S)^*(R) = \neg w(l)p(S)(\neg R).$$
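The axioms for sequence, guard, choice and restricted choice can be cross-checked against a relational reading of the commands. The sketch below is an illustration under several assumptions that are not taken from the paper: the state space consists of the four values 0 to 3 of a single variable, commands are modelled as finite relations between states and outcomes, `INF` marks non-termination, and predicates are sets of states. Projection and recursion are left out.

```python
from itertools import combinations

INF = "inf"                       # marker for non-termination
STATES = frozenset(range(4))      # values of a single variable x

def skip():  return {(s, s) for s in STATES}
def fail():  return set()         # undefined everywhere: no outcome at all
def loop():  return {(s, INF) for s in STATES}
def assign(f):                    # x := f(x); f is assumed to map STATES into STATES
    return {(s, f(s)) for s in STATES}

def seq(d1, d2):                  # S1; S2
    out = {(s, INF) for (s, t) in d1 if t == INF}
    out |= {(s, u) for (s, t) in d1 for (t2, u) in d2 if t == t2}
    return out

def guard(p, d):                  # P -> S
    return {(s, t) for (s, t) in d if p(s)}

def choice(d1, d2):               # S1 [] S2
    return d1 | d2

def rchoice(d1, d2):              # restricted choice: S2 only where S1 is undefined
    defined = {s for (s, _) in d1}
    return d1 | {(s, t) for (s, t) in d2 if s not in defined}

def wlp(d, post):
    return {s for s in STATES
            if all(t in post for (u, t) in d if u == s and t != INF)}

def wp(d, post):
    return {s for s in STATES
            if all(t != INF and t in post for (u, t) in d if u == s)}

def implies(a, b):                # pointwise implication of two predicates
    return (STATES - a) | b

def axioms_hold(d1, d2, p):
    """Compare the relational semantics with the axioms for all postconditions."""
    holds = {s for s in STATES if p(s)}
    subsets = [set(c) for n in range(len(STATES) + 1)
               for c in combinations(sorted(STATES), n)]
    for r in subsets:
        for w in (wlp, wp):
            ok = (w(seq(d1, d2), r) == w(d1, w(d2, r))
                  and w(guard(p, d1), r) == implies(holds, w(d1, r))
                  and w(choice(d1, d2), r) == w(d1, r) & w(d2, r)
                  and w(rchoice(d1, d2), r)
                      == w(d1, r) & implies(wp(d1, set()), w(d2, r)))
            if not ok:
                return False
    return True
```

In this reading `wp(d1, set())` is exactly the set of states in which the first command is undefined, which is why it appears as the premise in the rule for restricted choice.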


6.7 I-reducedness 

For all the constructors for a guarded command $S$ except the sequence, each computation of the resulting complex operation already occurs as a computation of one of the involved components. Therefore we may expect that GCS construction can be done componentwise. For sequences, however, this is not the case.

Let us now define the technical $I$-reducedness condition for sequences. Assume that the types of values are understood from the context.

Definition 6.13. Let $S = S_1; S_2$ be a $Y$-operation such that $S_i$ is a $Y_i$-operation for $Y_i \subseteq Y$ ($i = 1, 2$). Let $I$ be some $X$-invariant with $Y \subseteq X$. Let $X - Y_1 = \{y_1, \ldots, y_m\}$, $Y_1 = \{x_1, \ldots, x_l\}$ and assume that $\{x_1', \ldots, x_l'\}$ is a disjoint copy of $Y_1$, disjoint also from $X$. Then $S$ is called $\delta$-$I$-reduced iff the following two conditions hold:

(i) For all states $\sigma$ with $\models_\sigma \neg I$ we have: if

$$P_\sigma \Rightarrow \{x_1/x_1', \ldots, x_l/x_l'\}.(\forall \beta_i (i = 1, \ldots, m).\ \{y_1/\beta_1, \ldots, y_m/\beta_m\}.I)$$

is a $\delta$-constraint for $S_1$, then it is also a $\delta$-constraint for $S$.

(ii) For all states $\sigma$ we have: if

$$P_\sigma \Rightarrow \{x_1/x_1', \ldots, x_l/x_l'\}.(\forall \beta_i (i = 1, \ldots, m).\ \{y_1/\beta_1, \ldots, y_m/\beta_m\}.\neg I)$$

is a $\delta$-constraint for $S_1$, then it is also a $\delta$-constraint for $S$.

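Example 6.4. Take $X = \{x_1 :: FSET(T), x_2 :: FSET(T)\}$, $I \equiv x_1 \subseteq x_2$ and $S(x, y :: T) = S_1; S_2$ with $S_1 = x_2 := x_2 - \{x\}$ and $S_2 = x_1 := x_1 \cup \{y\}$.

(i) A $\delta$-constraint for $S_1$ in the form $P_\sigma \Rightarrow D' \subseteq C'$ with $P_\sigma \equiv C = C' \wedge D = D'$ is

$$C = C' \wedge D = D' \wedge D' \subseteq C' \wedge x \notin D' \Rightarrow D' \subseteq C'.$$

Since Definition 6.13 additionally requires $\models_\sigma \neg I$, i.e. $D' \not\subseteq C'$, the conjunction of such constraints is true, which is also a $\delta$-constraint of $S$.

(ii) Now take a $\delta$-constraint of $S_1$ in the form $P_\sigma \Rightarrow D' \not\subseteq C'$. Then we have

$$\{C'/C, D'/D\}.wlp(\{C/C', D/D'\}.S_1)(C = C' \wedge D = D' \Rightarrow D' \not\subseteq C') \Leftrightarrow$$

$$C = C' \wedge D = D' \Rightarrow D \not\subseteq C - \{x\} \Leftrightarrow$$

$$D' \not\subseteq C' - \{x\} \Leftrightarrow x \in D' \vee D' \not\subseteq C'.$$

Definition 6.13 additionally requires $\models_\sigma I$, i.e. $D' \subseteq C'$, so the conjunction $J$ of such constraints is

$$x \in D \wedge D \subseteq C \Rightarrow D' \not\subseteq C'.$$

Then

$$\{C'/C, D'/D\}.wlp(\{C/C', D/D'\}.S)(J) \Leftrightarrow x \in D \wedge D \subseteq C \Rightarrow D \not\subseteq (C - \{x\}) \cup \{y\},$$

which is true for $x \neq y$ and false for $x = y$.

Hence $S$ is $\delta$-$I$-reduced only if $x \neq y$, but the operation

$$S'(x, y :: T) = (x \neq y \to S_1; S_2) \boxtimes S_2$$

is always $\delta$-$I$-reduced and semantically equivalent to $S$. □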

We may extend this definition to arbitrary operations by requiring all occurring sequences to be $\delta$-$I$-reduced.

Definition 6.14. Let $S$ be an $X$-operation and $I$ some $Y$-invariant with $X \subseteq Y$. $S$ is called $I$-reduced iff the following holds:

(i) If $S$ is one of fail, skip, loop or an assignment, then $S$ is always $I$-reduced.

(ii) If $S = S_1; S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ are $I$-reduced and $S$ is $\delta$-$I$-reduced.

(iii) If $S$ is one of $P \to T$, $@y :: T_y \bullet T$, $S_1 \Box S_2$ or $S_1 \boxtimes S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ or $T$ respectively are $I$-reduced.

(iv) If $S = \mu T.f(T)$, then $S$ is $I$-reduced iff $f^\alpha(loop)$ is $I$-reduced for each ordinal number $\alpha$.

Chapter 7

Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications

Contents

7.1 Introduction
7.2 Unrepairable Transitions
7.3 Critical Paths
7.4 Stratified Constraint Sets
7.5 An Algorithm for Checking Stratification
7.6 Locally Stratified Constraint Sets
7.7 Complexity of Local Stratification
7.8 Conclusion

K.-D. Schewe, B. Thalheim. Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications. Acta Cybernetica 1998 (to appear).

Abstract. Integrity maintenance is considered one of the major application fields of rule triggering systems (RTSs). In the case of a given integrity constraint being violated by a database transition these systems trigger repairing actions. However, it will be shown that for any set of constraints there exist unrepairable transitions, which depend on the closure of the constraint set. This implies that integrity maintenance by RTSs is only possible if the constraint implication problem is decidable.

Even if unrepairable transitions are excluded, this does not prevent the RTS from producing undesired behaviour. If constraints are written as sets (conjunctions) of simple ones in implicative normal form, the approach behaves well if there is only one such constraint. In general, however, the rule triggering approach fails to solve the problem.

Analyzing the behaviour of RTSs leads to the definition of critical paths in associated rule hypergraphs and the requirement that such paths be absent. It will be shown that this requirement can be satisfied if the underlying set of constraints is stratified, but this notion turns out to be too strong to be also necessary. A sufficient and necessary condition for the absence of critical paths is obtained if sets of constraints are required to be locally stratified.

Keywords: active databases, integrity maintenance


7.1 Introduction

Active databases (ADBs) aim at extending relational (or object oriented) DBMS by rule triggering systems (RTSs), i.e. by sets of rules which on a given event and in the case of a condition being satisfied trigger actions on the database (ECA-rules). Events can be external events, time conditions or internal events resulting from operations on the database. Conditions are usually given by boolean queries that have to be evaluated against the database. The action part consists of a sequence of basic operations to insert, delete or update tuples (or objects respectively) in the database.

The current research on ADBs (see e.g. [3]) is dominated by implementational aspects, whilst foundations of RTSs are seldom approached. The work in [1, 2, 4, 9, 10] and partly in [3] considers the problem of enforcing database integrity by the use of RTSs. The results concern the generation of repairing ECA-rules and partly the analysis of the resulting RTS. This analysis concentrates on the termination of the rule system, the independence of the final database state from the chosen selection order of the rules (confluence) and on consistency.

These requirements are not sufficient for a reasonable rule behaviour, because it is easy to define an RTS that empties the database in case of any constraint violation. Therefore, we claim an additional requirement, which informally means that the intended effect of a transition may not be turned into its opposite by the RTS.

In this short paper we analyze the limits of the rule triggering approach. For a given set of constraints in implicational normal form we first investigate the existence of unrepairable transitions. These are determined by the closure of the constraint set. It turns out that the decidability of the constraint implication problem is necessary for integrity maintenance by RTSs.

Next we analyze how to obtain RTSs that definitely repair constraint violations by a (repairable) transition without invalidating its intended effect. Given an RTS we first associate with it a rule hypergraph which corresponds to the possible sequences of triggered rules. Next we define critical trigger paths in these hypergraphs that correspond to the propagation of conditions. Indeed it can be shown that the existence of a single critical trigger path makes the RTS work incorrectly for at least one transition.

Finally, we analyze constraint sets in order to detect whether it is possible to define an RTS of repairing actions such that the critical trigger paths in its associated hypergraph can only invalidate unrepairable transitions. For this we first introduce stratified constraint sets that satisfy this condition. Since the converse is not true, we finally weaken the concept to locally stratified constraint sets, which gives a necessary and sufficient condition for the RTS to work correctly.

7.2 Unrepairable Transitions 

In the following we consider the relational datamodel with integrity constraints given by formulae in implicative normal form

$$I \equiv p_1(x_1) \wedge \ldots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \ldots \vee q_m(y_m) \quad (7.75)$$

with predicate symbols $p_i$, $q_j$, which correspond either to a relation of the schema or are comparison predicates ($=$, $\neq$, ...).

Let us first demonstrate the insufficiency of a naive RTS approach by a simple example. In "real" applications the situation of Example 7.1 will not occur in such an obvious way, but there are always implied and in general not detectable constraints leading to analogous problems as shown in [6].

Example 7.1. Take two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv p(x) \wedge q(x) \Rightarrow false$. This implies $p$ to be always empty, hence insertions into $p$ should be abolished. Then we obtain the following repairing rules:

$R_1$: ON $insert_p(x)$ IF $\neg I_1$ DO $insert_q(x)$

$R_2$: ON $delete_q(x)$ IF $\neg I_1$ DO $delete_p(x)$

$R_3$: ON $insert_p(x)$ IF $\neg I_2$ DO $delete_q(x)$

$R_4$: ON $insert_q(x)$ IF $\neg I_2$ DO $delete_p(x)$

If we try to execute a transition $insert_p(x)$ on a database state satisfying $q(x)$, then we successively trigger the rules $R_3$ and $R_2$ with the effect of only deleting $x$ from $q$. This contradicts the original intention of the transition. □
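The behaviour described in Example 7.1 can be replayed with a small simulation. This is a sketch under stated assumptions, not an account of any particular active database system: rules are tried in the fixed order $R_1, \ldots, R_4$, conditions are evaluated for the single value that the triggering operation touched, events are processed first-in first-out, and the loop is assumed to terminate (which it does for these four rules). All names are illustrative.

```python
from collections import deque

def not_i1(db, x):
    """Violation of I_1 at x: p(x) holds but q(x) does not."""
    return x in db["p"] and x not in db["q"]

def not_i2(db, x):
    """Violation of I_2 at x: p(x) and q(x) both hold."""
    return x in db["p"] and x in db["q"]

# (name, triggering event, condition, action)
RULES = [
    ("R1", ("insert", "p"), not_i1, ("insert", "q")),
    ("R2", ("delete", "q"), not_i1, ("delete", "p")),
    ("R3", ("insert", "p"), not_i2, ("delete", "q")),
    ("R4", ("insert", "q"), not_i2, ("delete", "p")),
]

def apply_op(db, op, rel, x):
    if op == "insert":
        db[rel].add(x)
    else:
        db[rel].discard(x)

def run(db, op, rel, x, rules=RULES):
    """Execute one basic operation and then the triggered rules."""
    db = {r: set(v) for r, v in db.items()}   # work on a copy
    apply_op(db, op, rel, x)
    events = deque([(op, rel)])
    fired = []
    while events:
        event = events.popleft()
        for name, trigger, cond, (a_op, a_rel) in rules:
            if trigger == event and cond(db, x):
                apply_op(db, a_op, a_rel, x)
                fired.append(name)
                events.append((a_op, a_rel))
    return db, fired
```

Running `run({"p": set(), "q": {"a"}}, "insert", "p", "a")` fires $R_3$ and then $R_2$ and ends in a state where both relations are empty, so the only lasting change is the deletion from $q$.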

In order to analyze the unintended behaviour in Example 7.1 consider a set $\Sigma$ of constraints in implicational normal form. Let $\Sigma^*$ denote the (semantic) closure, i.e. $\Sigma^* = \{I \mid \Sigma \models I\}$. Now let $I \in \Sigma^*$ be non-trivial, i.e. it does not hold in all database states. Write $I$ in implicational normal form

$$I \equiv p_1(x_1) \wedge \ldots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \ldots \vee q_m(y_m)$$

and let $p_{i_1}, \ldots, p_{i_k}$ and $q_{j_1}, \ldots, q_{j_l}$ denote the relation symbols on the left and right hand sides of $I$ respectively. We may define a transition $T$ by

$$delete_{q_{j_1}}(y_{j_1}); \ldots; delete_{q_{j_l}}(y_{j_l}); insert_{p_{i_1}}(x_{i_1}); \ldots; insert_{p_{i_k}}(x_{i_k}).$$

If we start $T$ with values for the $x_i$ and $y_j$ such that the additional conditions on the left hand side of $I$ are satisfied, whilst the additional conditions on the right hand side are not, $T$ will always reach a database state satisfying $\neg I$. This effect of $T$ is intentional and hence the only reasonable approach to integrity maintenance in this case is to disallow such transitions.

More formally, the effect of a transition $T$ in a state $\sigma$ is given by the strongest (with respect to $\Rightarrow$) formula $E_\sigma(T) = \psi$ such that $\models_\sigma wp(T)(\psi)$ holds. Here $wp(T)(\psi)$ denotes the weakest precondition of $\psi$ under the transition $T$, i.e. starting $T$ in initial state $\sigma$ will reach a final state satisfying $\psi$.

Since we only consider sequences of insertions and deletions, $E_\sigma(T)$ can always be written as a conjunction of literals, i.e. in negated implicational normal form, with the positive literals corresponding to insertions and the negative ones to deletions. In addition, we may consider the effect of a sequence $T; RTS$, where $T$ is a transition and $RTS$ a system of rules. We say that $RTS$ invalidates the effect of $T$ iff $\not\models E_\sigma(T) \wedge E_\sigma(T; RTS)$ holds for some state $\sigma$.

Then it is justified to call a transition $T$ repairable with respect to the constraint set $\Sigma$ iff $\neg E_\sigma(T) \notin \Sigma^*$ holds for at least one state $\sigma$. Then a complete terminating system $RTS$ of ECA-rules always invalidates the effect of a non-repairable transition $T$. Hence the problem is to detect (and exclude) non-repairable transitions. In order to decide whether a given transition $T$ is repairable or not, we must be able to decide whether $\neg E_\sigma(T)$ is in the closure $\Sigma^*$. Hence the implication problem for constraints must be decidable.
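The reading of effects as conjunctions of literals can be illustrated for sequences of insertions and deletions. The sketch below makes a simplifying assumption that goes beyond the text: it ignores the initial state and records, for each ground fact, only the last operation applied to it, so that two effects are treated as contradictory when they assign opposite signs to the same fact. The function names are hypothetical.

```python
def effect(transition):
    """Map each touched fact (relation, value) to True (inserted) or False (deleted).

    transition is a sequence of steps (op, relation, value) with op being
    "insert" or "delete"; the last step on a fact determines its literal.
    """
    literals = {}
    for op, rel, value in transition:
        literals[(rel, value)] = (op == "insert")
    return literals

def contradicts(e1, e2):
    """True if some fact is required by one effect and forbidden by the other."""
    return any(fact in e2 and e2[fact] != sign for fact, sign in e1.items())

# The transition of Example 7.1 and the repair actions of rules R3 and R2.
T = [("insert", "p", "a")]
REPAIR = [("delete", "q", "a"), ("delete", "p", "a")]
```

With these definitions `contradicts(effect(T), effect(T + REPAIR))` evaluates to `True`: the literal $p(a)$ established by the transition is negated after the rules have run.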


Proposition 7.1. Let $\Sigma$ be a set of constraints. The problem to decide whether a transition $T$ is repairable with respect to $\Sigma$ is equivalent to the constraint implication problem for $\Sigma$, i.e. the problem to decide whether a given constraint $I$ is a member of $\Sigma^*$ or not. □

Proposition 7.1 defines the first limit on integrity maintenance by rule triggering systems. In the following sections we shall concentrate on repairable transitions.

Note that our treatment ignores the termination problem. Non-terminating transitions 

have to be excluded as well, but this problem is independent from the repairability problem, 

since non-termination of RTSs occurs as an orthogonal problem. 

7.3 Critical Paths 

Let us ask whether we can always find a complete set of repair rules for all repairable transitions. For this we introduce the notions of associated hypergraphs and critical trigger paths.

Definition 7.2. Let $S = \{p_1, \ldots, p_n\}$ be a relational database schema and $RTS = \{R_1, \ldots, R_m\}$ a system of ECA-rules on $S$. Then the associated rule hypergraph $(V, E)$ is constructed as follows:

- $V$ is the disjoint union of $S$ and $RTS$. We then talk of $S$-vertices and $RTS$-vertices respectively.

- If $R \in RTS$ has event-part $Ev$ on $p \in S$ and actions on $p_1, \ldots, p_k$, then we have a hyperedge from $p$ to $\{R\}$ labelled by $+$ or $-$ depending on $Ev$ being an insert or delete, and a hyperedge from $\{R\}$ to $\{p_1, \ldots, p_k\}$ analogously labelled by $k$ values $+$ or $-$. □

Figure 7.1 shows the associated rule hypergraph of Example 7.1, in which case we have a simple graph.

 

Fig. 7.1. Associated Rule Hypergraph
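The construction of Definition 7.2 is easy to mechanise. The following sketch assumes a hypothetical encoding in which each rule is given by its name, its event `(operation, relation)` and the list of its actions; a hyperedge is stored as a triple of source, tuple of targets and tuple of labels. Rule names are assumed to differ from relation names, so that the union of the two vertex sets is disjoint.

```python
def rule_hypergraph(schema, rules):
    """Build the vertex set and the labelled hyperedges of the rule hypergraph.

    rules maps a rule name to ((event_op, event_rel), [(action_op, action_rel), ...]).
    """
    sign = {"insert": "+", "delete": "-"}
    vertices = set(schema) | set(rules)
    edges = []
    for name, ((ev_op, ev_rel), actions) in rules.items():
        # hyperedge from the relation of the event-part to the rule
        edges.append((ev_rel, (name,), (sign[ev_op],)))
        # hyperedge from the rule to all relations touched by its actions
        targets = tuple(rel for _, rel in actions)
        labels = tuple(sign[op] for op, _ in actions)
        edges.append((name, targets, labels))
    return vertices, edges

# The four rules of Example 7.1.
EXAMPLE_RULES = {
    "R1": (("insert", "p"), [("insert", "q")]),
    "R2": (("delete", "q"), [("delete", "p")]),
    "R3": (("insert", "p"), [("delete", "q")]),
    "R4": (("insert", "q"), [("delete", "p")]),
}
```

For Example 7.1 every rule has exactly one action, so all hyperedges have a single target and the result is an ordinary labelled graph.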

Definition 7.2 ignores the condition part of the rules. These come into play if we consider critical trigger paths in associated hypergraphs. These are defined in several steps, starting from paths in the associated hypergraph which correspond to possible sequences of ECA-rules with respect only to their event- and action-parts. Secondly, we attach formulae to the $S$-vertices in the path in such a way that pre- and postconditions of the involved rules are expressed. Then we talk of trigger paths.

A maximal trigger path with contradicting initial and final condition will then be called critical. Then imagine a transition with an effect implied by the initial formula, i.e. that there is an initial state such that running the transition in this state results in a state which satisfies the initial condition of the trigger path. Executing this transition followed by the rule triggering system along the critical trigger path will then turn the effect of the transition into its opposite. This means that the $RTS$ invalidates the effect of at least one transition.

Definition 7.3. Let $G = (V, E)$ be the rule hypergraph associated with a system $RTS$ of rules. A trigger path in $G$ is a sequence $v_0, e_1, v_1', e_1', \ldots, e_l', v_l$ of vertices and hyperedges with the following conditions:

- $v_i \in S$ holds for all $i = 0, \ldots, l$,

- $v_i' \in RTS$ holds for all $i = 1, \ldots, l$,

- $e_i$ is a hyperedge from $v_{i-1}$ to $v_i'$ and

- $e_i'$ is a hyperedge from $v_i'$ to $V_i$ with $v_i \in V_i$ and the same label as $e_{i+1}$.

We call $l$ the length of the trigger path.

In addition we associate with each vertex $v_i \in S$ ($i = 0, \ldots, l$) a formula $\varphi_i$ in negated implication normal form such that $\models \varphi_i \Rightarrow cond(v_{i+1}')$ holds for the condition part $cond(v_{i+1}')$ of rule $v_{i+1}' \in RTS$ and $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ holds for the action-part $A_{i+1}$ of rule $v_{i+1}'$ ($i = 0, \ldots, l - 1$). Furthermore, there is no $e_{l+1} \in E$ from $v_l$ to $v_{l+1}'$ with the same label as $e_l'$ such that $\models \varphi_l \Rightarrow cond(v_{l+1}')$ holds.

Then a trigger path is critical iff $\models \neg(\varphi_0 \wedge \varphi_l)$ holds. Such a critical trigger path is called admissible iff there is a consistent state $\sigma$ and a repairable transition $T$ such that $E_\sigma(T) \Leftrightarrow \varphi_0$ holds. □

Critical trigger paths for the associated rule hypergraph in Figure 7.1 are sketched in Figure 

7.2. Note that in this case both critical trigger paths are not admissible. 

 

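Path 1: $p \xrightarrow{+} R_1 \xrightarrow{+} q \xrightarrow{+} R_4 \xrightarrow{-} p$, with the formulae $p(x) \wedge \neg q(x)$, $p(x) \wedge q(x)$ and $\neg p(x) \wedge q(x)$ attached to the $S$-vertices.

Path 2: $p \xrightarrow{+} R_3 \xrightarrow{-} q \xrightarrow{-} R_2 \xrightarrow{-} p$, with the formulae $p(x) \wedge q(x)$, $p(x) \wedge \neg q(x)$ and $\neg p(x) \wedge \neg q(x)$ attached to the $S$-vertices.

In both paths the vertices and hyperedges are named $v_0, e_1, v_1', e_1', v_1, e_2, v_2', e_2', v_2$.

Fig. 7.2. Critical Trigger Paths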

If a critical trigger path is not admissible, then only a non-repairable transition can be invalidated by running the rules in the trigger path. Since we exclude non-repairable transitions, we only have to consider admissible trigger paths. After these remarks we are able to prove our first result.

Proposition 7.4. Let $RTS$ be a complete set of rules associated with a set $\Sigma$ of constraints and let $G = (V, E)$ be the associated rule hypergraph. Then $G$ contains an admissible critical trigger path iff there exists a consistent database state $\sigma$ and a repairable transition $T$ such that executing $T$ in $\sigma$ and consecutively running $RTS$ invalidates the effect of $T$ without leaving the database unchanged.

Proof. Let us first assume that $G$ contains an admissible critical trigger path. Let $\varphi_0, \ldots, \varphi_l$ denote the formulae associated with the $S$-vertices in this trigger path.

Case 1. Assume that $e_1$ is labelled by $+$. Then $\varphi_0$ contains at least one positive literal $p(x)$. Let $\sigma$ be a consistent state and $T$ a repairable transition such that $E_\sigma(T)$ is given by $\varphi_0$. We may assume that $\models_\sigma \neg p(x)$ holds and that the final action in $T$ is an insertion into $p$. If we start $T$ in the initial state $\sigma$, then the resulting state $\sigma'$ satisfies $\varphi_0$.

$T$ followed by the $RTS$ may then result in a state $\tau$ satisfying $\varphi_l$. Hence the effect of $T; RTS$ in $\sigma$ is given by $\varphi_l$. Since $\models \neg(\varphi_0 \wedge \varphi_l)$ holds by the definition of critical trigger paths, this implies that $RTS$ invalidates the effect of $T$. Furthermore, $\tau$ is consistent with respect to all constraints in $\Sigma$, since $RTS$ is complete and there is no hyperedge $e_{l+1}$ from $v_l$ to some $v_{l+1}' \in RTS$ with the same label as $e_l'$ such that $\models \varphi_l \Rightarrow cond(v_{l+1}')$ holds.

It remains to show $\tau \neq \sigma$. If this does not hold, we get $\models_\sigma \varphi_l$ and consequently there exists some $\psi$ such that $\varphi_l \Leftrightarrow \neg p(x) \wedge \psi$ and $\varphi_0 \Leftrightarrow p(x) \wedge \psi$ hold. This implies $l > 1$, because otherwise the rule $v_1'$ would have the form ON $insert_p(x)$ IF $\neg I$ DO $delete_p(x)$, which we excluded.

If $l > 1$ holds, there is at least one other literal $q(y)$ (or $\neg q(y)$) in $\varphi_0$ such that $delete_q(y)$ (or $insert_q(y)$ respectively) occurs in the action-part of $v_1'$. Then we may consider the admissible critical trigger path $v_1, e_2, \ldots, v_l$ of length $l - 1$ instead. Following the argumentation above, we may choose $\sigma$ and $T$ in such a way that $\models_\sigma \neg q(y)$ (or $\models_\sigma q(y)$ respectively) holds. This implies $\tau \neq \sigma$ as required.

Case 2. If $e_1$ is labelled by $-$, then $\varphi_0$ contains a literal $\neg p(x)$. Thus, we have to consider a transition $T$ containing $delete_p(x)$ as its final action and a consistent state $\sigma$ with $\models_\sigma p(x)$ and $E_\sigma(T) \Leftrightarrow \varphi_0$. Then we may apply the same arguments as for case 1.

Conversely, assume that there is no admissible critical trigger path. Let $T$ be a repairable transition and $\sigma$ a database state which is consistent with respect to $\Sigma$. Now start $T$ in $\sigma$ and assume that the resulting state $\sigma'$ is not consistent. Then consider a trigger path of finite length such that $\models_{\sigma'} \varphi_0$ holds. The consecutive execution of the rules in this trigger path will result in a state $\tau$ satisfying $\varphi_l$. Thus, we have $E_\sigma(T) \Leftrightarrow \varphi_0$ and $E_\sigma(T; RTS) \Leftrightarrow \varphi_l$.

According to our assumption, the used trigger path cannot be critical, i.e. $\varphi_l \wedge \varphi_0$ is satisfiable. Hence $RTS$ does not invalidate the effect of $T$. □

7.4 Stratied Constraint Sets 

According to the result in Proposition 7.4 we may ask for constraint sets that allow us to define complete RTSs which exclude admissible critical trigger paths in their associated hypergraphs. Let us start with a simple example.

Example 7.2. Take again two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv q(x) \Rightarrow p(x)$, which implies $p$ to be always equal to $q$. Then we obtain the following repairing rules:

$R_1$: ON $insert_p(x)$ IF $\neg I_1$ DO $insert_q(x)$

$R_2$: ON $delete_q(x)$ IF $\neg I_1$ DO $delete_p(x)$

$R_3$: ON $insert_q(x)$ IF $\neg I_2$ DO $insert_p(x)$

$R_4$: ON $delete_p(x)$ IF $\neg I_2$ DO $delete_q(x)$

In this case there are no admissible critical paths in the associated rule hypergraph. We omit further details. □

Let us now investigate the reason for the absence of admissible critical trigger paths in Example 7.2. This leads us to the notion of a stratified set of constraints.

The motivation behind this is as follows: In Example 7.2 insertions (deletions) on a relation $p$ only trigger insertions (deletions) on $q$ and vice versa. This should be sufficient for not invalidating a once established effect. The corresponding constraints can therefore be grouped together.

Definition 7.5. Let $\Sigma$ be a set of constraints in implicative normal form (7.75) on a schema $S$. Then $\Sigma$ is called stratified iff we have a partition $\Sigma = \Sigma_1 \cup \ldots \cup \Sigma_n$ with pairwise disjoint constraint sets $\Sigma_i$ called strata such that the following conditions are satisfied:

(i) If $L$ is a literal on the left hand side (right hand side) of some constraint $I \in \Sigma_i$, then all constraints $J \in \Sigma$ containing a literal $L'$ on the right hand side (left hand side) such that $L$ and $L'$ are unifiable also lie in stratum $\Sigma_i$.

(ii) All constraints $I$, $J$ containing unifiable literals $L$ and $L'$ either on the left or the right hand side must lie in different strata $\Sigma_i$ and $\Sigma_j$. □
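Under a simplifying assumption, Definition 7.5 can be tested with a few lines of code. The sketch below is not the checking algorithm announced in the chapter contents; it assumes that two literals are unifiable exactly when they use the same relation symbol, ignores comparison predicates, and applies condition (ii) only to pairs of distinct constraints. It merges constraints that condition (i) forces into one stratum and then rejects the set if condition (ii) is violated inside a merged group. All names are hypothetical.

```python
def strata(constraints):
    """Return the strata as a sorted list of lists of names, or None if not stratified.

    constraints maps a constraint name to (lhs_symbols, rhs_symbols).
    """
    parent = {name: name for name in constraints}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path halving
            name = parent[name]
        return name

    names = list(constraints)
    # Condition (i): a symbol on opposite sides forces the same stratum.
    for i in names:
        for j in names:
            lhs_i, rhs_i = constraints[i]
            lhs_j, rhs_j = constraints[j]
            if set(lhs_i) & set(rhs_j) or set(rhs_i) & set(lhs_j):
                parent[find(i)] = find(j)
    # Condition (ii): a symbol on the same side forces different strata.
    for pos, i in enumerate(names):
        for j in names[pos + 1:]:
            lhs_i, rhs_i = constraints[i]
            lhs_j, rhs_j = constraints[j]
            same_side = set(lhs_i) & set(lhs_j) or set(rhs_i) & set(rhs_j)
            if same_side and find(i) == find(j):
                return None
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())

EXAMPLE_7_1 = {"I1": ({"p"}, {"q"}), "I2": ({"p", "q"}, set())}
EXAMPLE_7_2 = {"I1": ({"p"}, {"q"}), "I2": ({"q"}, {"p"})}
```

Using the groups produced by condition (i) is sufficient here, because any coarser partition would only place more constraints into a common stratum and could therefore only add violations of condition (ii).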

Now we can prove in general that stratied constraint sets always give rise to RTSs without 

admissible critical trigger paths in the associated rule hypergraph. 

Proposition 7.6. Let be a stratied constraint set on a schema S. Then there exists a 

complete RTS such that for any repairable transition T on S the RTS does not invalidate the 

eect of T . 

Proof. Given a constraint $I$ in implicative normal form (7.75), each relation symbol $p_i$ on the left hand side gives rise to rules

ON $insert_{p_i}(x_i)$ IF $\neg I$ DO $insert_{q_j}(y_j)$,

ON $insert_{p_i}(x_i)$ IF $\neg I$ DO $delete_{p_j}(y_j)$

with relation symbols $q_j$ occurring on the right hand side and $p_j$ ($j \neq i$) on the left hand side of $I$. Similarly, each predicate symbol $q_j$ on the right hand side gives rise to rules

ON $delete_{q_j}(y_j)$ IF $\neg I$ DO $insert_{q_i}(y_j)$ ($i \neq j$),

ON $delete_{q_j}(y_j)$ IF $\neg I$ DO $delete_{p_i}(y_j)$.

This defines a complete set $RTS$ of rules. Now assume there exists a critical trigger path $v_0\, e_1\, v_1'\, e_1' \dots e_\ell'\, v_\ell$ in the associated rule hypergraph. Each $RTS$-vertex $v_i'$ corresponds to a constraint $I_i \in \Sigma$. Since $e_i'$ and $e_{i+1}$ are equally labelled corresponding to the action- or event-part respectively, the construction of the rules above implies $I_i$ and $I_{i+1}$ to lie in the same stratum ($i = 0, \dots, \ell - 1$).

However, the condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ implies that $\varphi_0$ contains a literal $L$ and $\varphi_\ell$ its negation, hence the construction of rules implies $I_1$ and $I_\ell$ to lie in different strata. Hence, there are only critical trigger paths of length $\ell = 1$.

According to our construction of $RTS$ this implies $\models \varphi_0 \Rightarrow \neg I$ to hold for some $I \in \Sigma$. Thus, $\neg\varphi_0$ is a consequence of $\Sigma$. Due to the definition of admissible critical trigger paths and the definition of repairable transitions, we conclude that the trigger paths of length $\ell = 1$ cannot be admissible. Then the proposition follows from Proposition 7.4.

□
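The rule construction used in this proof can be sketched as a small generator. This is an illustration under simplifying assumptions, not the authors' implementation: a constraint is given by the lists of predicate names on its left and right hand side, each predicate occurs at most once per side, arguments are ignored, and the condition part of every rule is understood to be the negation of the constraint. The name `repair_rules` is hypothetical.

```python
def repair_rules(lhs, rhs):
    """Enumerate (event, relation, action, relation) tuples for one constraint
    lhs_1 & ... & lhs_m => rhs_1 | ... | rhs_k; the condition is always "not I".

    Assumption: predicate names are distinct within each side.
    """
    rules = []
    for p in lhs:
        # An insertion on the left may be repaired by inserting on the right ...
        rules += [("insert", p, "insert", q) for q in rhs]
        # ... or by deleting another left hand side fact.
        rules += [("insert", p, "delete", other) for other in lhs if other != p]
    for q in rhs:
        # A deletion on the right may be repaired by inserting another right
        # hand side fact or by deleting a left hand side fact.
        rules += [("delete", q, "insert", other) for other in rhs if other != q]
        rules += [("delete", q, "delete", p) for p in lhs]
    return rules

# The constraint p(x) & r(x) => q(x) yields six rules.
for rule in repair_rules(["p", "r"], ["q"]):
    print(rule)
```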


Finally, we may ask for cases where stratified constraint sets occur. Recall from [5] that a relational database schema $S$ with constraint set $\Sigma$ is in Entity-Relationship normal form (ERNF), and hence is equivalent to an ER-schema, iff

- all inclusion constraints in $\Sigma$ are key-based and non-redundant,

- there is no cycle of inclusion constraints in $\Sigma$,

- each relation schema $R \in S$ is in BCNF with respect to the functional dependencies in $\Sigma$ and

- there are only inclusion and functional dependencies in $\Sigma$.

If a relational database schema $S$ with constraint set $\Sigma$ is in ERNF, then it is easy to see that $\Sigma$ is stratified.

Corollary 7.7. Let $S$ be a database schema in ERNF with respect to the constraint set $\Sigma$. Then $\Sigma$ is stratified.

□

Hence, following the design approach of Mannila and Räihä in [5], if this is sufficient for the application, leads to schemata without any problems concerning consistency enforcement by RTSs.

Example 7.3.

Let us look at the following constraints

$I_1 : p(x, y) \Rightarrow q(x, z)$ and

$I_2 : q(x, z) \wedge q(y, z) \Rightarrow x = y$.

Then this set of constraints corresponds to the Entity-Relationship diagram [8] in Figure 7.3. Obviously, the constraint set is stratified.

□

[Figure: Entity-Relationship diagram with the components A, p, B, C, q and D and cardinality labels (0, 1)]

Fig. 7.3. Entity-Relationship constraints

7.5 An Algorithm for Checking Stratification

Before we analyze the converse of Proposition 7.6 and present the weaker notion of locally stratified constraint sets, let us first concentrate on an algorithm for checking stratification and its complexity. For this we consider the set

$$BW = \{\top, \bot\} \cup (\mathbb{N} - \{0\}) \cup \{\, \{j_1, \dots, j_n\} \mid n \geq 1,\ j_k \in \mathbb{N} - \{0\} \,\}.$$

In the algorithm we successively add labels from $BW$ to constraints. A label $i \in \mathbb{N}$ for a constraint $I$ is used to indicate that $I$ must lie in the stratum $\Sigma_i$. A label $\{j_1, \dots, j_n\}$ indicates that $I$ must not lie in $\Sigma_{j_k}$ for $k = 1, \dots, n$. $\bot$ represents no information and $\top$ an inconsistent assignment of stratum numbers.

For a more convenient terminology we call an element of $BW$ black, if it is in $(\mathbb{N} - \{0\}) \cup \{\top\}$, otherwise white. Furthermore, we use a commutative, associative binary operation $\oplus$ on $BW$ defined by

$$x \oplus \bot = x, \qquad x \oplus \top = \top,$$

$$i \oplus j = \begin{cases} i & \text{if } i = j \\ \top & \text{otherwise,} \end{cases}$$

$$\{j_1, \dots, j_n\} \oplus \{k_1, \dots, k_m\} = \{j_1, \dots, j_n\} \cup \{k_1, \dots, k_m\} \quad\text{and}$$

$$i \oplus \{j_1, \dots, j_n\} = \begin{cases} \top & \text{if } i = j_k \text{ for some } k \in \{1, \dots, n\} \\ i & \text{otherwise.} \end{cases}$$
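The label operation can be written down directly. The sketch below is one possible encoding and is an assumption throughout: the strings `"top"` and `"bot"` stand for the inconsistent and the empty label, stratum numbers are Python integers, exclusion labels are frozensets, and the names `combine` and `is_black` are hypothetical.

```python
TOP, BOT = "top", "bot"  # assumed stand-ins for the inconsistent / empty label

def is_black(label):
    """Black labels are stratum numbers and the inconsistent label."""
    return label == TOP or isinstance(label, int)

def combine(x, y):
    """Commutative combination of two labels from BW."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP:
        return TOP
    if isinstance(x, int) and isinstance(y, int):
        return x if x == y else TOP          # two different strata clash
    if isinstance(x, frozenset) and isinstance(y, frozenset):
        return x | y                         # collect forbidden strata
    number, forbidden = (x, y) if isinstance(x, int) else (y, x)
    return TOP if number in forbidden else number

print(combine(1, frozenset({2})))  # a stratum number absorbs a compatible exclusion set
```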

Algorithm 7.8 (Stratification Check).

Input: a set Σ = {I_1, ..., I_n} of constraints in clausal form I_i = L_{i1} ∨ ... ∨ L_{i n_i}

Output: a boolean value b

Method:

    VAR gather : ARRAY 1 ... n OF BW,
        mb, mb' : BW
    BEGIN
      FOR i = 1 TO n DO
        gather(i) := ⊥
      ENDFOR
      b := true
      mb := 1
      WHILE Σ ≠ ∅ DO
        CHOOSE i_0 ∈ {1, ..., n} WITH I_{i_0} ∈ Σ AND gather(i_0) is maximal
        Σ := Σ − {I_{i_0}}
        IF gather(i_0) is white
          THEN gather(i_0) := mb
               mb := mb + 1
        ENDIF
        mb' := gather(i_0)
        FOR j = 1 TO n_{i_0} DO
          FOR ALL I_k ∈ Σ DO
            FOR ℓ = 1 TO n_k DO
              IF L_{i_0 j} and L_{kℓ} are unifiable AND gather(i_0) ≠ ⊤
                THEN gather(k) := gather(k) ⊕ {gather(i_0)}
              ELSIF L_{i_0 j} and the negation of L_{kℓ} are unifiable
                THEN gather(k) := gather(k) ⊕ gather(i_0)
              ENDIF
            ENDFOR
          ENDFOR
        ENDFOR
      ENDDO
      FOR i = 1 TO n DO
        IF gather(i) = ⊤
          THEN b := false
        ENDIF
      ENDFOR
      RETURN(b)
    END

□
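A runnable sketch of the stratification check may help to follow the labelling. It is an illustration under assumptions, not a verified transcription: literals are pairs `(positive, predicate)`, two literals count as unifiable when their predicate names agree (all arguments are taken to be variables), and the order used for the CHOOSE step is assumed to be "inconsistent label first, then stratum numbers, then white labels". All identifiers are hypothetical.

```python
TOP, BOT = "top", "bot"  # assumed encodings of the inconsistent / empty label

def combine(x, y):
    """Label combination on BW (ints, frozensets of ints, TOP, BOT)."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP:
        return TOP
    if isinstance(x, int) and isinstance(y, int):
        return x if x == y else TOP
    if isinstance(x, frozenset) and isinstance(y, frozenset):
        return x | y
    number, forbidden = (x, y) if isinstance(x, int) else (y, x)
    return TOP if number in forbidden else number

def rank(label):
    """Assumed order for CHOOSE: TOP > stratum numbers > white labels."""
    if label == TOP:
        return 2
    return 1 if isinstance(label, int) else 0

def is_stratified(clauses):
    """clauses: list of clauses, each a list of (positive, predicate) literals."""
    gather = [BOT] * len(clauses)
    remaining = set(range(len(clauses)))
    next_stratum = 1
    while remaining:
        i0 = max(sorted(remaining), key=lambda i: rank(gather[i]))
        remaining.remove(i0)
        if rank(gather[i0]) == 0:            # white label: open a new stratum
            gather[i0] = next_stratum
            next_stratum += 1
        for sign, pred in clauses[i0]:
            for k in remaining:
                for sign_k, pred_k in clauses[k]:
                    if pred != pred_k:
                        continue
                    if sign == sign_k and gather[i0] != TOP:
                        # same-side literals: k must avoid the stratum of i0
                        gather[k] = combine(gather[k], frozenset({gather[i0]}))
                    elif sign != sign_k:
                        # complementary literals: k must join the stratum of i0
                        gather[k] = combine(gather[k], gather[i0])
    return all(label != TOP for label in gather)

# p => q and q => p in clausal form
print(is_stratified([[(False, "p"), (True, "q")], [(False, "q"), (True, "p")]]))
```

Restricting unification to equality of predicate names keeps the sketch short; constants or repeated variables would need a proper unification test.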

We have to check that the algorithm is correct. Then we analyze its time complexity. Before we do this let us first look at a simple example.

Example 7.4.

Consider the following constraints:

$I_1 = \neg p(x) \vee \neg q(x) \vee r(x) \vee s(x)$

$I_2 = \neg q(x) \vee r(x) \vee \neg t(x)$

$I_3 = p(x) \vee \neg r(x)$

$I_4 = \neg s(x) \vee t(x)$ and

$I_5 = q(x) \vee \neg t(x)$.

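Then consider Table 1. Each row corresponds to a constraint $I_i$ and lists the values added to gather($i$) during the execution of the algorithm. The chosen order of the constraints in the algorithm is $I_1$, $I_3$, $I_2$, $I_4$, $I_5$. Then $b$ will become false and hence $\Sigma$ is not stratifiable. □

Table 1. Stratification Check

i | L_11 | L_12 | L_13 | L_14 | L_32 | L_31 | L_21 | L_22 | L_23 | L_41 | L_42 | L_52 | L_51 | gather
1 | 1 | 1 | 1 | 1 | - | - | - | - | - | - | - | - | - | 1
3 | 1 | - | 1 | - | 1 | 1 | - | - | - | - | - | - | - | 1
2 | - | {1} | {1} | - | 1 | - | ⊤ | ⊤ | ⊤ | - | - | - | - | ⊤
4 | - | - | - | 1 | - | - | - | - | ⊤ | ⊤ | ⊤ | - | - | ⊤
5 | - | 1 | - | - | - | - | ⊤ | - | - | - | ⊤ | ⊤ | ⊤ | ⊤
chosen | I_1 | I_1 | I_1 | I_1 | I_3 | I_3 | I_2 | I_2 | I_2 | I_4 | I_4 | I_5 | I_5 | b = false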

Let us now address the correctness of Algorithm 7.8.

Proposition 7.9. Let $\Sigma$ be a set of constraints. Then $\Sigma$ is stratifiable iff Algorithm 7.8 applied to the input $\Sigma$ computes the output $b = true$.

Proof. Let us first assume that $\Sigma$ is stratified. Let $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ be a decomposition into strata and assume that the $\Sigma_i$ are taken minimal with the required properties. We use induction on $n$.

For $n = 1$ there are no unifiable literals $L$ and $L'$ in different constraints $I, J \in \Sigma$. Hence gather($i$) will become 1 for all $i$ and we obtain $b = true$.

For $n > 1$ we may assume without loss of generality that some constraint in $\Sigma_1$ will be chosen first. Then, due to our minimality assumption, we get gather($i$) $= 1$ for all $I_i \in \Sigma_1$, whereas gather($j$) will be white for all $I_j \notin \Sigma_1$. Thus, all constraints in $\Sigma_1$ will be chosen first.

Since gather($j$) was white for $I_j \notin \Sigma_1$ and gather($i$) $= 1$ for $I_i \in \Sigma_1$ before choosing the first constraint in $\Sigma_2 \cup \dots \cup \Sigma_n$, we may apply the induction hypothesis to $\Sigma_2 \cup \dots \cup \Sigma_n$, which gives gather($j$) $\neq \top$ for all $I_j \notin \Sigma_1$. This implies $b = true$ as claimed in the proposition.

Conversely, assume that the algorithm produces the result $b = true$. Then we must have gather($i$) $\in \mathbb{N} - \{0\}$. Define $\Sigma_k = \{ I_i \in \Sigma \mid \text{gather}(i) = k \}$. Assume that the partition $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ does not satisfy the conditions for strata in Definition 7.5. Then there are two possible cases:

(i) There are literals $L$ and $L'$ in constraints $I_i \in \Sigma_k$ and $I_j \in \Sigma_\ell$ with $k \neq \ell$ such that $L$ and $\bar{L}'$ are unifiable. Suppose that $I_i$ is chosen first by the algorithm. Then $k$ will be added to gather($j$), which gives gather($j$) $= \top$ contradicting our assumption.

(ii) There are unifiable literals $L$ and $L'$ in constraints $I_i, I_j \in \Sigma_k$. If $I_i$ is chosen first by the algorithm, $\{k\}$ will be added to gather($j$), which also gives gather($j$) $= \top$ contradicting our assumption.

Thus $\Sigma_1 \cup \dots \cup \Sigma_n$ is a partition into strata, which completes the proof.

□

Proposition 7.10. Let $\Sigma$ be a set of constraints in clausal form, $n = \#\Sigma$, $k$ the maximal arity of predicate symbols occurring in constraints $I \in \Sigma$ and let $\ell$ be the maximum number of literals in these constraints. Then the time complexity of Algorithm 7.8 for checking whether $\Sigma$ is stratified is in $O(k \cdot \ell^2 \cdot n^2)$.

Proof. The initialization and the final computation of $b$ can both be done in $O(n)$ steps.

In the inner FOR-loop the test for unifiability can be done in $O(k)$ steps, since there are no function symbols. All other operations have a complexity in $O(1)$. Hence the inner FOR-loop has a total complexity in $O(k)$. This loop is executed $\ell' \cdot \ell''$ times, where $\ell'$ is the number of literals in the chosen constraint $I_{i_0}$ and $\ell''$ is the total number of literals in the remaining constraints. If $I_{i_0}$ is the $i$'th constraint chosen by the algorithm, this can be estimated by $\ell^2 \cdot (n - i)$.

Since each $I \in \Sigma$ will be chosen by the algorithm, the outer WHILE-loop will be executed $n$ times. This gives the total complexity in

$$O(n) + O\Bigl(\ell^2 \cdot \sum_{i=1}^{n} (n - i)\Bigr) \cdot O(k) + O(n) = O(k \cdot \ell^2 \cdot n^2)$$

as claimed in the proposition.

□
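The quadratic factor in this bound comes from the arithmetic series over the iterations of the WHILE-loop. Spelled out, as a short worked step using only the quantities $n$, $\ell$ and $k$ from the proposition:

```latex
\sum_{i=1}^{n} (n - i) \;=\; \sum_{m=0}^{n-1} m \;=\; \frac{n(n-1)}{2},
\qquad\text{hence}\qquad
O(k) \cdot O\!\Bigl(\ell^2 \cdot \frac{n(n-1)}{2}\Bigr) \;=\; O(k \cdot \ell^2 \cdot n^2).
```

The two $O(n)$ terms for initialization and for the final computation of $b$ are dominated by this product.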

It is easy to see that $n \cdot \ell$ can be replaced by the total number $u = \sum_{i=1}^{n} n_i$ of literals in $\Sigma$ with $u \leq n \cdot \ell$. Thus, the time complexity of the stratification checking Algorithm 7.8 is in $O(k \cdot u^2)$.


From Proposition 7.6 we know that active mechanisms can be effectively applied, if the constraint set is stratified. In particular, this holds for schemata in ERNF [5], which are equivalent to Entity-Relationship schemata. From Proposition 7.10 we know that a stratification check can be done efficiently.

7.6 Locally Stratified Constraint Sets

Unfortunately, the converse of Proposition 7.6 does not hold, as seen in the next example. The reason for this is that in the proof of Proposition 7.6 we considered all repairing rules for a given constraint, whereas the constraint set in Example 7.5 allows us to select only a subset, thus gaining the required result without losing the completeness of the RTS.

Example 7.5. Take three unary relations $p$, $q$ and $r$ and the constraints $I_1 \equiv p(x) \wedge r(x) \Rightarrow q(x)$, $I_2 \equiv q(x) \Rightarrow p(x)$ and $I_3 \equiv p(x) \Rightarrow r(x)$. It is easy to see that this constraint set is not stratified.

However, we may consider the following system of ECA-rules:

$R_1$ : ON $insert_p(x)$ IF $\neg I_1$ DO $insert_q(x)$

$R_2$ : ON $delete_q(x)$ IF $\neg I_1$ DO $delete_p(x)$

$R_3$ : ON $insert_r(x)$ IF $\neg I_1$ DO $insert_q(x)$

$R_4$ : ON $delete_q(x)$ IF $\neg I_1$ DO $delete_r(x)$

$R_5$ : ON $insert_q(x)$ IF $\neg I_2$ DO $insert_p(x)$

$R_6$ : ON $delete_p(x)$ IF $\neg I_2$ DO $delete_q(x)$

$R_7$ : ON $insert_p(x)$ IF $\neg I_3$ DO $insert_r(x)$

$R_8$ : ON $delete_r(x)$ IF $\neg I_3$ DO $delete_p(x)$

We dispense with showing that there are no admissible critical trigger paths in the associated rule hypergraph.

Note that the construction in the proof of Proposition 7.6 would result in two more rules corresponding to insertions:

$R_9$ : ON $insert_p(x)$ IF $\neg I_1$ DO $delete_r(x)$

$R_{10}$ : ON $insert_r(x)$ IF $\neg I_1$ DO $delete_p(x)$

These give rise to admissible critical trigger paths. The one shown in Figure 7.4 allows the effect of the repairable transition $insert_p(x)$ to be invalidated.

□

[Figure: trigger path $v_0\, e_1\, v_1'\, e_1'\, v_1\, e_2\, v_2'\, e_2'\, v_2$ through the rules $R_9$ and $R_8$, with the state formulae $p(x) \wedge \neg q(x) \wedge r(x)$, $p(x) \wedge \neg q(x) \wedge \neg r(x)$ and $\neg p(x) \wedge \neg q(x) \wedge \neg r(x)$]

Fig. 7.4. An Admissible Critical Trigger Path
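The path of Figure 7.4 can be replayed on a tiny propositional state. This is a hand-written simulation for one fixed value of $x$, not a general rule engine: a state is the set of relations that currently hold, and the helper names are hypothetical.

```python
def violates_I1(state):
    """I1: p & r => q is violated when p and r hold but q does not."""
    return "p" in state and "r" in state and "q" not in state

def violates_I3(state):
    """I3: p => r is violated when p holds but r does not."""
    return "p" in state and "r" not in state

state = {"r"}                    # consistent start: only r holds
state = state | {"p"}            # the transition insert_p(x)
trace = [frozenset(state)]

if violates_I1(state):           # R9: ON insert p IF not I1 DO delete r
    state = state - {"r"}
    trace.append(frozenset(state))

if violates_I3(state):           # R8: ON delete r IF not I3 DO delete p
    state = state - {"p"}
    trace.append(frozenset(state))

print([sorted(s) for s in trace])
print("p" in state)              # whether the inserted fact survived
```

The three recorded states correspond to the three state formulae of the figure; the inserted fact is removed again by the second rule.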

The constraint set in Example 7.5 is not stratified, but nevertheless the associated RTS does not invalidate the effect of repairable transitions. This shows that a constraint set need not be stratified to allow a reasonable rule behaviour. Indeed, replacing $I_1$ in the example by $I_1' \equiv p(x) \Rightarrow q(x)$ gives an equivalent constraint set, which is stratified. However, equivalence of constraint sets is undecidable in general. Therefore, we introduce the weaker notion of being locally stratified. In this case we shall construct RTSs which only contain a subset of the set of rules constructed in the proof of Proposition 7.6.

Definition 7.11. Let $\Sigma$ be a set of constraints in implicative normal form on a schema $S$. A labelled subsystem consists of a subset $\Sigma' = \{ I \in \Sigma \mid \rho_L(I) \text{ is defined} \}$ together with a set of clauses $\Sigma'' = \{ \rho_L(I) \mid I \in \Sigma' \}$ and a literal $L$ (the label) such that each constraint $I \in \Sigma'$ can be written as the disjunction $\rho_L(I) \vee I'$ with $\models I' \Rightarrow L$.

Here $\rho_L(I)$ is defined iff the negation $\bar{L}$ does not occur in $I$ (written as a clause). Then $\rho_L(I)$ results from $I$ by omission of the literal $L$ if the result contains at least two literals. Otherwise $\rho_L(I)$ is simply $I$. We call $I'$ the label part and $\rho_L(I)$ the label-free part of the constraint $I$. If $L$ is understood from the context, we drop the subscript and write $\rho$ instead of $\rho_L$.

A labelled subsystem $(\Sigma', \Sigma'', L)$ is called stratified iff the set $\Sigma''$ is stratified in the sense of Definition 7.5 or locally stratified as defined below.

The constraint set $\Sigma$ is called locally stratified iff $\Sigma = \Sigma_1' \cup \dots \cup \Sigma_n'$ with stratified labelled subsystems $(\Sigma_i', \Sigma_i'', L_i)$ ($i = 1, \dots, n$) such that for each constraint $I \in \Sigma_i'$ and each literal $L$ occurring in its label part with respect to the $i$-th labelled subsystem there exists another $j$ with $I \in \Sigma_j'$ and $L$ occurring in the label-free part of $I$ with respect to the $j$-th labelled subsystem.

□

Example 7.6. For the constraint set in Example 7.5 we obtain the partition into $\Sigma_1' = \{I_1, I_3\}$ and $\Sigma_2' = \{I_1, I_2\}$.

For the first of these we have the label $L_1 \equiv \neg p(x)$ and the label-free parts defined by $\rho_{L_1}(I_1) \equiv q(x) \vee \neg r(x)$ and $\rho_{L_1}(I_3) \equiv I_3$.

For $\Sigma_2'$ we get the label $L_2 \equiv \neg r(x)$ and the label-free parts $\rho_{L_2}(I_1) \equiv \neg p(x) \vee q(x)$ and $\rho_{L_2}(I_2) \equiv I_2$.

This shows that the constraint set in Example 7.5 is indeed locally stratified. □
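The label-free parts in this example can be recomputed with a few lines of code. The sketch below follows the wording of Definition 7.11 for propositional clauses only; the encoding of literals as `(positive, atom)` pairs and the names `negate` and `label_free_part` are assumptions made for illustration.

```python
def negate(literal):
    positive, atom = literal
    return (not positive, atom)

def label_free_part(clause, label):
    """Label-free part of a clause with respect to a label.

    Returns None when the part is undefined (the negated label occurs in the
    clause). Otherwise the label is omitted, unless fewer than two literals
    would remain, in which case the clause itself is returned.
    """
    if negate(label) in clause:
        return None
    reduced = clause - {label}
    return reduced if len(reduced) >= 2 else clause

# Constraints of Example 7.5 in clausal form
I1 = frozenset({(False, "p"), (False, "r"), (True, "q")})
I2 = frozenset({(False, "q"), (True, "p")})
I3 = frozenset({(False, "p"), (True, "r")})

not_p = (False, "p")
for name, clause in (("I1", I1), ("I2", I2), ("I3", I3)):
    print(name, label_free_part(clause, not_p))
```

With the label $\neg p(x)$ the part is undefined for $I_2$, which is why $I_2$ does not belong to the first labelled subsystem.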

Note that each stratified constraint set is also locally stratified. In this case we define $depth(\Sigma) = 0$. If $\Sigma$ is locally stratified by a partition $\Sigma = \Sigma_1' \cup \dots \cup \Sigma_n'$, we define $depth(\Sigma) = \max_{i=1}^{n} depth(\Sigma_i'') + 1$. We call $depth(\Sigma)$ the depth of the locally stratified constraint set $\Sigma$.

Finally, we can strengthen Proposition 7.6, now dealing with locally stratified constraint sets. This condition turns out to be sufficient and also necessary for the absence of admissible critical trigger paths.

Theorem 7.12. Let $\Sigma$ be a constraint set on a schema $S$. Then $\Sigma$ is locally stratified iff there exists a complete RTS such that for any repairable transition $T$ the RTS does not invalidate the effect of $T$.

Proof. First assume that $\Sigma$ is locally stratified. Let the labelled subsystems in the partition be $(\Sigma_i', \Sigma_i'', L_i)$ for $i = 1, \dots, n$. We shall use induction on the depth of $\Sigma$. For $depth(\Sigma) = 0$ we are done by Proposition 7.6.

Let us now consider the case $depth(\Sigma) = 1$. As in the proof of Proposition 7.6 we construct an RTS for $\Sigma$. Since each $\Sigma_i''$ is stratified in the sense of Definition 7.5, we first construct a rule system $RTS_i'$ with respect to $\Sigma_i''$ as in the proof of Proposition 7.6. The condition parts in these rules have the form $\neg\rho_{L_i}(I)$ for $I \in \Sigma_i'$. Then let $RTS_i$ result from $RTS_i'$ by changing all condition parts, replacing $\neg\rho_{L_i}(I)$ by $\neg I$. Finally, take $RTS = \bigcup_{i=1}^{n} RTS_i$.

Due to the last property in the definition of locally stratified constraint sets in Definition 7.11 we conclude that $RTS$ is complete.

Now consider a critical trigger path $v_0\, e_1\, v_1'\, e_1'\, v_1 \dots e_\ell'\, v_\ell$ in the rule hypergraph associated with $RTS$. Without loss of generality assume $v_1' \in RTS_1$. According to Proposition 7.4 we have to show that this trigger path is not admissible.

We use induction on the length ` of this critical trigger path. For ` =1we may use the 

same argument asin the proof of Proposition 7.4. Therefore, assume `>1 and take a state 

with j= and a transition T with j= E (T ) , ' 0 .Thenwehave to show that T is not 

repairable. 

Assume that T is repairable. Then there exists a state with j= such that :E (T ) =2 

.We shall derive acontradiction from this. 

For this regard the critical trigger path v 1 e 2 v2 0 e0 2 v 2::: e 0`v` of length ` ; 1. By induction 

it is not admissible. If A 1 is the action in the rule v1 0 , we get j= E (T A 1 ) , ' 1 

and T A 1 cannot be repairable. In particular, this implies :E (T A 1 ) 2 . 

Since A 1 is a simple insertion or deletion, we getj= :E (T ) , '^L and j= :E (T A 1 ) 

, '^ L for some literal L and its negation L.From this we conclude ' 2 and L 2 . 

Then there must exist a resolution refutation for L from input . Any literal L 0 (except 

L) in this refutation must be selected at least once for building the resolvent. Therefore, due 

to our construction of L 1 (I) wemay cancel all clauses I2 containing the literal L 1 and 

simultaneously the literal L 1 in all clauses. Thus, there must also exist a resolution refutation 

for L from input 1 00. 

On the other hand, each clause in 1 00 contains at least two literals. Therefore, any resolvent 

will also contain at least two literals unless we have some I 1 2 1 00 with literals L 1 and L 2 

and another I 2 2 1 00 with literals L0 1 and L0 2 such that L 1, L 0 1 (and L 2, L 0 2 respectively) are 

uniable. 

This property, however, means that 1 00 is not stratied contradicting our assumptions. 

Hence T cannot be repairable and we are done. 

Next let $depth(\Sigma) > 1$. We proceed analogously. By induction, since $\Sigma_i''$ is (locally) stratified, there exists a rule system $RTS_i'$ for $\Sigma_i''$ with the required property. The condition parts in these rules have the form $\neg\rho_{L_i}(I)$ for $I \in \Sigma_i'$. Then let $RTS_i$ result from $RTS_i'$ by changing all condition parts from $\neg\rho_{L_i}(I)$ to $\neg I$. Finally, take $RTS = \bigcup_{i=1}^{n} RTS_i$.

Again due to the last property in the definition of locally stratified constraint sets (cf. Definition 7.11) $RTS$ must be complete.

Now consider a critical trigger path $v_0\, e_1\, v_1'\, e_1'\, v_1 \dots e_\ell'\, v_\ell$ in the rule hypergraph associated with $RTS$. According to Proposition 7.4 we have to show that this trigger path is

not admissible. Without loss of generality assume v1 0 2 RT S 1. Then take a maximal k such 

that v1 0 ::: v0 k 2 RT S 1 holds. Then for i =0::: k we may write ' i as a conjunction i ^J 

with j= i ): L 1 (I i) for some I i 2 1 0 .Hence,ifwe replace v0 i by the corresponding rule in 

RT S1 0 ,we obtain a critical trigger path for RT S0 1 . 

Now take a state with j= and a transition T with j= E (T ) , ' 0 . We have to 

show that T is not repairable. Assume the contrary. Then there exists a state with j= 

and :E =2 . 

Assume j= :L 1 . Since j= holds and each constraint I 2 1 0 can be written as a 

disjunction I 0 _ L 1 (I) with j= I0 ) L 1 ,we conclude j= 1 00. 


Since v 0 e 1 v1 0 e0 1 v 1::: e 0 k v k is a critical trigger path for RT S1 0 and j= E , ' 0 

holds, we may apply the induction hypothesis to 1 00 with depth(00 1 ) < depth(). Therefore, 

T cannot be repairable, i.e. for any state with j= 1 00 weget:E (T ) 2 (1 00) 

. 

In particular, take = . Then :E (T ) 2 (1 00) 

implies j= : L 1 (I) for some I 2 0 1 

and further 6j= contradicting our assumption on . Thus, we must have j= L 1 . 

Assume j= :L 1 . Then we must have j= 1 00 and consequently :E (T ) 2 (1 00) 

. As 

above this implies j= : L 1 (I) for some I20 1 and hence 6j= contradicting our assumption 

on . Hence,wemust have j= L 1 . 

Now let I 1 2 correspond to the rule v1 0 . Without loss of generality we may assume 

j= ' 0 ):L 1 . Otherwise, we must have j= :I1 0 and L1 (I 1)must not contain L 1 . This implies 

L 1 to occur in J , in which case we may change it to :L 1 without aecting the trigger path 

being critical. 

Since j= L 1 holds, T must involve an insertion (deletion) corresponding to a negative 

(positive) literal L 1 . Hence, j= E (T ) ,:L 1 ^: holds. Due to the independence of J 

from 1 00 wemaychoose in such away that 2 (1 00) 

holds. 

However, this implies j= :E (T ) , L 1 _ 2 contradicting the non-repairability of 

T with respect to RT S1 0 . This completes the suciency proof. 

Conversely, assume that we are given a complete RTS for $\Sigma$ which for any repairable transition $T$ does not invalidate its effect. According to Proposition 7.4 this implies that all critical trigger paths in the associated rule hypergraph are not admissible. From this we have to construct a partition of $\Sigma$ into stratified labelled subsystems.

First consider a single rule $R$ corresponding to a constraint $I \in \Sigma$. In particular, $I$ is the condition part of this rule. Since $RTS$ is complete, the event part of $R$ gives rise to a negative (positive) literal $L_{ev}$ in $I$ for the case of an insertion (deletion). Similarly, an insertion (deletion) in the action part of $R$ gives rise to a positive (negative) literal $L_a$ in $I$.

Let (I) = L ev _ L a . If I contains n 1 more literals L 1 ::: L n , let i (I) = (I) _ 

L 1 _ :::_ L i _ :::_ L n . Then dene i 0(R) = fJ 2 j L i 

(J ) is dened g and i 00(R) 

= 

|{z} 

omit 

f Li (J ) jJ 2 i 0(R)g. (For I,(I) letL 1 = L ev and L 2 = L a and dene i 0 (R) and00 i (R) 

analogously.) 

Dene (R) =f(i 0(R)00 

i (R)L i) j i 00 (R) is locally stratied g, if this satises the last 

condition of Denition 7.11. Otherwise let (R) = . Then the elements of (R) dene 

stratied labelled subsystems of . 

In order to check the local stratication for i 00 (R) rst check, whether it is stratied. If 

not, dene for each literal L in i (I) thesetsiL 0 (R) =fJ 2 00 i (R) j L(J ) is dened g and 

iL 00 (R) =f L(J ) jJ 2 iL 0 (R)g. Consider f(0 iL (R)00 iL 

(R)L) j 00 iL 

(R) is locally stratied g 

and check the last condition S of Denition 7.11. 

Now take LSS = 

R2RT S 

(R). If (R) 6= holds for all R 2 RT S, this satises the last 

condition of Denition 7.11 and we obtain a partition of into stratied labelled subsystems. 

Then LSS is the required partition. 

It remains to show (R) 6= in the construction above. Assume (R) =. Then there 

exists a sequence L 1 L 2 ::: L k of literals in I and a sequence (1 0 00 1 L 1)::: (k 0 00 k L k) 

of non-stratied labelled subsystems such that i+1 0 = fJ 2 00 i j Li+1 (J ) is dened g and 

k 00 contains two clauses Ik 1 and Ik 2 with literals L1 , L 10 and L 2 , L 20 respectively such that L 1 , 

L 2 and L 10 , L 20 are uniable. 

I1 k and Ik 2 correspond to rules with respect to 00 k 

that dene an admissible trigger path 

in the associated rule hypergraph. Since for i =1 2 I1 k is : L k 

(I1 k;1 ), we may successively 


replace these rules by rules corresponding to $\Sigma_{k-1}'', \dots, \Sigma_1''$ and simultaneously replace the

formulae ' k i by ' k;1 

i 

= ' k i ^:L k::: ' 0 i = ' 1 i ^:L 1. The resulting trigger path is still 

critical and due to our construction it is also admissible with respect to contradicting our 

assumption. This completes the necessity proof. 

ut 

Example 7.7.

Let us extend Example 7.3 and add a third constraint

$I_3 \equiv p(x, z) \wedge q(y, z) \Rightarrow false$.

In terms of the Entity-Relationship diagram in Figure 7.3, $I_3$ corresponds to an exclusion constraint $B \parallel D$. It is easy to see that the new set $\{I_1, I_2, I_3\}$ of constraints is not stratified.

In particular, any local stratification must contain a labelled subsystem with label $\neg q(x, z)$ with the reduced constraints $I_2' \equiv \neg q(y, z) \vee x = y$ and $I_3' \equiv I_3$. However, $\neg q(x, z)$ cannot occur in the label-free part of some $I_2'$, since this always defines the same labelled subsystem. Hence, the given constraint set is also not locally stratified. This shows that adding a single exclusion constraint to an Entity-Relationship schema may already destroy a reasonable rule behaviour.

□

7.7 Complexity of Local Stratification

Let us now look at the check whether a given set of constraints is locally stratified. In the second part of the proof of Theorem 7.12 we have seen that this check can be done by direct construction of the desired partition into maximal stratified labelled subsystems. The first part of that proof then indicates how to construct the corresponding RTS. In [7] we gave an explicit algorithm which also produces for each constraint the set of "reduced" constraints used in the RTS construction. However, the time complexity of that algorithm was beyond any practicality, since we could prove the following result.

Proposition 7.13. Let $\Sigma$ be a set of constraints in clausal form, $n = \#\Sigma$, $\ell$ the maximum number of literals in constraints $I \in \Sigma$ and $k$ the maximal arity of predicate symbols occurring in these constraints. Then checking $\Sigma$ to be locally stratified can be done with a time complexity

in O(k `2 n 2n2 `). 

ut 

We now want to show that this complexity result is not accidental. For this we first show a technical lemma.

Lemma 7.14. Let $\Sigma$ be a set of clauses containing only propositional atoms. Let $L$ be a literal such that $\bar{L}$ does not occur in any of the clauses in $\Sigma$. Assume $\Sigma = \Sigma_1 \cup \Sigma_2$ such that $L$ does not occur in any of the clauses in $\Sigma_1$, but in all clauses of $\Sigma_2$. Moreover, $\Sigma_2$ contains only clauses with exactly two literals. If $\Sigma_1$ is locally stratified and $\Sigma_2$ is stratified, then $\Sigma$ is locally stratified.

Proof. First assume that 2 contains a single clause C = L _ L 0 .If 1 is not stratied, there 

is a partition 1 = 11 0 [[0 1n (n>2) with stratied labelled subsystems (0 1i 00 1i L i). 

Then at most one L k can be L 0 and we may dene 

( 

i 0 1i 0 if L i = L 0 

= 

1i 0 [fCg otherwise : 


By induction ( 0 i 00 i L i) is a stratied labelled subsystem. Thus, = 0 1 [[0 n denes 

the required partition. 

Now assume that 1 is stratied. Let 1 = 11 [[ 1n be a partition into pairwise 

disjoint strata. If 1 contains just one clause C 0 with L 0 and no clause with L 0 ,we are done, 

since C may be added to the stratum of C 0 . Analogously, C may dene its own stratum, if 

such aC 0 does not exist at all. Therefore, we are reduced to the following two cases: 

{ There is more than one clause in 1 containing L 0 (and hence none containing L 0 ) and 

these clauses belong to dierent strata. 

{ There are exactly two clauses C 1 and C 2 containing L 0 or L 0 respectively. Inparticular, 

C 1 and C 2 belong to the same stratum 1i . 

In both cases we choose the literals L 1 = L and L 2 = L 0 to dene labelled subsystems 

( 1 1 L 1 ) and (fCg[ 1 ;fC 00 j C 00 contains L 0 g 

| {z } 

2 

0 

00 

2 L 2 ) 

where 2 0 (and hence also 00 2 ) are stratied by the previous remarks. 

In the rst case choose C 0 containing L 0 and another literal L 00 to dene a labelled 

subsystem 

( 0 1 [fCg 

| {z } 

3 

0 

00 

3 L 3) 

with L 3 = L 00 ,where1 0 is a proper subset of 1 not containing C 0 . By induction 3 00 must 

be locally stratied. 

In the second case choose C 2 = L 0 _ C2 0 , a literal L00 in C2 0 and L 3 = L 00 ,which denes a 

labelled subsystem (3 0 00 3 L 3) as before with 3 0 = 0 1 [fCg with a proper subset 0 1 ( 1 

containing C 1 , but not C 2 .Thus, 3 0 and 00 3 are stratied. 

In both cases we have obtained a partition = 1 [ 2 0 [ 0 3 with stratied labelled 

subsystems ( 1 1 L 1 ), (2 0 00 2 L 2) and (3 0 00 3 L 3). Since the additional condition for 

local stratication is easily veried, we conclude that is locally stratied. 

For the general case we may assume that 0 = 1 [ ( 2 ;fCg) is locally stratied by 

successive application of the constructions in the rst part of this proof. Then we observe 

that in the case of non-stratied 0 we do not change labels, when we add C. However, it 

may happen that one of these labels now is L. This label results (as label L 1 ) from adding 

C 0 to some stratied constraint set.From the construction of this local stratication and the 

fact that 2 is stratied we conclude that the other labels L 2 and L 3 are dierent from L, 

which guarantees the local stratication condition to hold also in the general case. 

For the case of 0 being stratied the arguments are the same as before except for the case 

that 0 contains exactly one clause C 0 with L 0 and none with L 0 . Then the corresponding 

stratum may also contain clauses C i with literals L i and L i+1 (i = 1:::m), where L 1 

occurs in C 0 and L m+1 = L. 

In particular, we have C m 2 2 and adding C to this stratum is no longer possible. Since 

2 is stratied, we must have m > 0, but then the literals L 0 , L 1 and L dene a local 

stratication with associated constraint sets 0 ;fC 0 g[fCg, 0 ;fC m g[fCg and 0 

respectively. 

ut 


We shall use Lemma 7.14 in the proof of NP-hardness to shrink propositional constraint sets. Another way to reduce the technical complexity of that proof is to drop the restriction on $\Sigma$ to contain only clauses with at least one negative literal. If $\Sigma$ is a set of propositional clauses containing neither the atom $q$ nor its negation, we add $\neg q$ to each clause to form the set $\Sigma_{ext}$ of clauses.

Lemma 7.15. Let $\Sigma$ be a set of propositional clauses each with at least two literals. Then $\Sigma_{ext}$ is locally stratified iff $\Sigma$ is satisfiable and locally stratified.

Proof. First let $\Sigma$ be locally stratified and satisfiable. If $\Sigma$ is not stratified, we may choose the same labels to obtain a local stratification for $\Sigma_{ext}$.

Thus, assume $\Sigma$ to be stratified. Then $(\Sigma_{ext}, \Sigma, \neg q)$ is a stratified labelled subsystem. Since all clauses in all other labelled subsystems contain the literal $\neg q$, we have to isolate these clauses. Therefore, take a model for $\Sigma$ which is given by a set $\{L_1, \dots, L_n\}$ of literals occurring in $\Sigma$ which must be interpreted as true. Taking $L_i$ as a label and the corresponding labelled subsystem $(\Sigma_i', \Sigma_i'', L_i)$, we obtain a proper subset $\Sigma_i' \subsetneq \Sigma_{ext}$. For $\#\Sigma_i'' > 1$ we may proceed with the other literals $L_j$. The last step results in unary sets $\{\neg q \vee L_k\}$ which are obviously stratified.

Conversely, given a local stratication for ext we can remove :q to obtain a local stratication 

for . It remains to show that is satisable. If ext is stratied, this is obvious, 

because a literal L with L occurring in some clause in cannot occur in any clauseof. 

If ext is not stratied, there is at least one stratied labelled subsystem ( 0 00 L) such 

that :q occurs in all clauses in 00 , i.e. 00 = 0 ext and 0 is satisable. This still holds if we 

put back the literal L and extend our interprete L as false to satisfy clauses in ; 0 . ut 

Theorem 7.16. Let $\Sigma$ be a set of constraints. Then checking that $\Sigma$ is locally stratified is NP-hard.

Proof. We show that the disjoint cover problem (DCP), which is known to be NP-complete, can be reduced in polynomial time to the local stratification problem. For this, let $(X, \mathcal{S})$ be an instance of DCP, i.e. $X$ is a finite set, say $X = \{x_1, \dots, x_n\}$, and $\mathcal{S} = \{S_1, \dots, S_m\}$ is a subset of the power set $\mathcal{P}(X)$. The problem is to decide whether a subset $\mathcal{S}' \subseteq \mathcal{S}$ exists such that $X$ is the disjoint union of the sets in $\mathcal{S}'$. Such an $\mathcal{S}'$ is called a solution for $(X, \mathcal{S})$.

Without loss of generality we may always assume that $X = \bigcup_{S_i \in \mathcal{S}} S_i$ holds. Moreover, we may allow $\mathcal{S}$ to be a multiset.

We now associate with $(X, \mathcal{S})$ a set of constraints $\Sigma$. For this let $p_{ij}$ be a propositional atom for all $x_i \in S_j$. For $S_i = \{x_{j_1}, \dots, x_{j_i}\} \in \mathcal{S}$ we define clauses $\neg p_{j_k i} \vee p_{j_\ell i}$ and $\neg p_{j_\ell i} \vee p_{j_k i}$ for $k, \ell \in \{1, \dots, i\}$, $k \neq \ell$. We refer to these clauses as connection clauses with respect to $S_i$. For $x_i \in S_j \cap S_k$ ($j \neq k$) we define an exclusion clause $\neg p_{ij} \vee \neg p_{ik}$. Finally, for each $x_i$ we define a cover clause $p_{i j_1} \vee \dots \vee p_{i j_m}$ for the sets $S_{j_1}, \dots, S_{j_m} \in \mathcal{S}$ containing $x_i$ provided $m \geq 2$. $\Sigma$ contains all these connection, exclusion and cover clauses.
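The clause set built from a DCP instance can be generated mechanically, which also makes the polynomial size of the reduction visible. The following sketch is an illustration under assumptions: sets are given as a Python list (so duplicates, i.e. a multiset, are possible), the pair `(i, j)` stands for the atom $p_{ij}$, a literal is a pair `(positive, atom)`, and the function name `dcp_clauses` is hypothetical.

```python
from itertools import combinations, permutations

def dcp_clauses(sets):
    """Connection, exclusion and cover clauses for a DCP instance.

    sets: list of sets of element ids; atom (i, j) means "x_i in S_j".
    Returns a set of clauses, each a frozenset of (positive, atom) literals.
    """
    clauses = set()
    # Connection clauses: all atoms of one set imply each other.
    for j, members in enumerate(sets):
        for a, b in permutations(sorted(members), 2):
            clauses.add(frozenset({(False, (a, j)), (True, (b, j))}))
    for x in set().union(*sets):
        holders = [j for j, members in enumerate(sets) if x in members]
        # Exclusion clauses: x is covered by at most one chosen set.
        for j, k in combinations(holders, 2):
            clauses.add(frozenset({(False, (x, j)), (False, (x, k))}))
        # Cover clause: x is covered by some set (only for two or more sets).
        if len(holders) >= 2:
            clauses.add(frozenset((True, (x, j)) for j in holders))
    return clauses

# One two-element set and the two singletons of its elements
instance = [{1, 2}, {1}, {2}]
print(len(dcp_clauses(instance)))
```

For a set with $s$ elements there are $s(s-1)$ connection clauses, so the number of clauses stays polynomial in the size of the instance.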

Then we have to show that $(X, \mathcal{S})$ has a solution iff $\Sigma$ is locally stratified and satisfiable. For this we introduce a partial order $<$ on DCP-instances letting $(X_1, \mathcal{S}_1) < (X_2, \mathcal{S}_2)$ iff

$$\sum_{S \in \mathcal{S}_1} |S| < \sum_{S \in \mathcal{S}_2} |S| \quad\text{or}\quad \Bigl( \sum_{S \in \mathcal{S}_1} |S| = \sum_{S \in \mathcal{S}_2} |S| \ \text{ and } \ |\mathcal{S}_1| > |\mathcal{S}_2| \Bigr)$$

holds.

First let $\mathcal{S}' = \{S_{i_1}, \dots, S_{i_k}\}$ be a solution for $(X, \mathcal{S})$. Then $\Sigma$ is obviously satisfiable. In order to use induction with respect to $<$ we consider the following two operations:

- Replace $S_j \in \mathcal{S}'$ by $S_j - \{x_\ell\}$ and add $S_{m+1} = \{x_\ell\}$ for some $x_\ell \in S_j$.

- Replace $S_j \notin \mathcal{S}'$ by $S_j - \{x_\ell\}$ for some $x_\ell \in S_j$.

In both cases we obtain a smaller DCP-instance which has a solution. By induction the corresponding constraint set $\Sigma_1'$ is locally stratified.

In the rst case we remove all clauses with literals p im+1 from 1 0 . The resulting subset 00 1 

is still locally stratied. Now build the labelled subsystem ( 0 00 L) with the label L = :p`j . 

The clauses in 0 (and hence in 00 ) do not contain p`j , i.e. we omit the cover clause with 

respect to x` and connection clauses containing p`j with respect to x` 2 S j . Clauses in 00 

containing :p`j arise from the restriction to keep at least two literals, hence must also lie in 

0 . Therefore, we obtain 00 = 1 [ 2 , where 2 is stratied and contains only clauses with 

two literals, one of them is :p`j , whereas clauses in 1 do not contain :p`j . 

Thus, the remaining connection clauses with respect to x` 2 S j and the exclusion clauses 

with respect to x` 2 S j occur in 2 . This implies 1 = 1 00 . From Lemma 7.14 we conclude 

that 00 is locally stratied. 

In the second case we build the labelled subsystem $(\Sigma', \Sigma'', L)$ with the label $L = p_{\ell j}$. The clauses in $\Sigma'$ (and hence in $\Sigma''$) do not contain $\neg p_{\ell j}$, i.e. we omit exclusion clauses and connection clauses containing $\neg p_{\ell j}$ with respect to $x_\ell \in S_j$. Again, the clauses in $\Sigma''$ containing $p_{\ell j}$ only arise from the restriction to keep at least two literals. Hence, these clauses define a stratified subset $\Sigma_2$ of $\Sigma''$ (and of $\Sigma'$) containing only clauses with two literals.

The remaining clauses form a subset $\Sigma_1$ and clauses in $\Sigma_1$ do not contain $p_{\ell j}$, i.e. the remaining connection clauses with respect to $x_\ell \in S_j$ and the cover clause with respect to $x_\ell$ (if it contains just two literals) occur in $\Sigma_2$, which implies $\Sigma_1 = \Sigma'_1$. From Lemma 7.14 we conclude that $\Sigma''$ is locally stratified.

Since in the first case ($x_\ell \in S_j \in \mathcal{S}'$) only the cover clause with respect to $x_\ell$ and connection clauses containing $p_{\ell j}$, and in the second case ($x_\ell \in S_j \notin \mathcal{S}'$) only exclusion clauses with respect to $x_\ell \in S_j$ and connection clauses containing $\neg p_{\ell j}$, are omitted in $\Sigma'$, the additional condition for local stratification is easily verified if all such choices are taken, provided there are at least three such possibilities. The only critical case arises if there are only three choices of the second kind, all with the same $x_\ell$. In this case we must have another $S_j = \{x_\ell\} \in \mathcal{S}'$ and we simply add the labelled subsystem $(\Sigma', \Sigma'', \neg p_{\ell j})$ to satisfy the additional local stratification condition.

If there are at most two choices, then either

– $\mathcal{S} = \mathcal{S}'$ and there is exactly one $S_j = \{x_k, x_\ell\}$ or

– $\mathcal{S}'$ contains only unary sets and these are exactly $S_j = \{x_j\} \notin \mathcal{S}'$ and $S_k = \{x_k\} \notin \mathcal{S}'$ or

– $\mathcal{S}'$ contains only unary sets and there is exactly one $S_j = \{x_k, x_\ell\} \notin \mathcal{S}'$.

In the first case $\Sigma$ contains only two connection clauses with respect to $S_j$ and hence is obviously stratified. In the second case $\Sigma$ contains only four clauses

$$\neg p_{kk} \vee \neg p_{kk'}, \quad p_{kk} \vee p_{kk'}, \quad \neg p_{jj} \vee \neg p_{jj'} \quad\text{and}\quad p_{jj} \vee p_{jj'}$$

for $S_{j'} = \{x_j\} \in \mathcal{S}'$ and $S_{k'} = \{x_k\} \in \mathcal{S}'$, hence is stratified.


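In the third case we obtain six clauses

$$\neg p_{kj} \vee p_{\ell j}, \quad \neg p_{\ell j} \vee p_{kj}, \quad \neg p_{kj} \vee p_{kk'}, \quad \neg p_{\ell j} \vee p_{\ell\ell'}, \quad p_{kj} \vee p_{kk'} \quad\text{and}\quad p_{\ell j} \vee p_{\ell\ell'}$$

for $S_{k'} = \{x_k\} \in \mathcal{S}'$ and $S_{\ell'} = \{x_\ell\} \in \mathcal{S}'$. Using Lemma 7.14 it is easily verified that the labels $p_{kj}$, $p_{\ell j}$, $\neg p_{kj}$ and $\neg p_{\ell j}$ define a partition into stratified labelled subsystems.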

For the converse let us first assume that $\Sigma$ is stratified, i.e. there cannot exist three clauses with literals $L$, $L$ and $L$ respectively. In connection clauses we may have $L = p_{\ell j}$ (or $L = \neg p_{\ell j}$) and it follows that $\Sigma$ does not contain exclusion or cover clauses for $x_\ell \in S_j$. This implies $x_\ell \notin S_k$ for all $k \neq j$. If we have an exclusion clause for $x_\ell \in S_j$, say $\neg p_{\ell j} \vee \neg p_{\ell k}$, then we also have a cover clause $p_{\ell j} \vee p_{\ell k} \vee C'$ and vice versa, but there cannot be further exclusion clauses nor connection clauses for $x_\ell \in S_j$, i.e. $C' \equiv \mathit{false}$ and $S_j = \{x_\ell\}$.

To summarize, if $x_\ell$ occurs in more than one $S_j$, then $\#S_j = 1$ and there are just two such sets. Therefore, for a solution $\mathcal{S}'$ we take all $S_j$ with $\#S_j \geq 2$ and select a singleton set $\{x_\ell\}$ for the remaining elements.

Next assume that $\Sigma$ is locally stratified, i.e. there is a local stratification with labels $L_1, \dots, L_n$ ($n \geq 3$). Again, we proceed by induction on DCP-instances.

For $L_1 = \neg p_{\ell j}$ and the stratified labelled subsystem $(\Sigma'_1, \Sigma''_1, L_1)$ the cover clause for $x_\ell$ and connection clauses for $x_\ell \in S_j$ containing $p_{\ell j}$ have been removed from $\Sigma$ to give $\Sigma'_1$, hence must occur in two other labelled subsystems such that for a label $\neg p_{ki}$ we must have $k \neq \ell$ and for a label $p_{ki}$ we must have $i \neq j$.

Analogously, for $L_1 = p_{\ell j}$ exclusion and connection clauses for $x_\ell \in S_j$, the latter ones containing $\neg p_{\ell j}$, have been omitted in $\Sigma'_1$ and must occur in two other labelled subsystems such that for another positive label $p_{ki}$ we must have $k \neq \ell$ and for a negative label $\neg p_{ki}$ we must have $i \neq j$. Hence, for the minimum number of three labels $L_1$, $L_2$ and $L_3$ we obtain the following four cases:

– $L_1 = \neg p_{\ell j}$, $L_2 = \neg p_{k_1 i_1}$, $L_3 = \neg p_{k_2 i_2}$ with pairwise different $\ell, k_1, k_2$,

– $L_1 = \neg p_{\ell j}$, $L_2 = \neg p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with $\ell \neq k_1$ and $j \neq i_2 \neq i_1$,

– $L_1 = \neg p_{\ell j}$, $L_2 = p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with $k_1 \neq k_2$ and $i_1 \neq j \neq i_2$ or

– $L_1 = p_{\ell j}$, $L_2 = p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with pairwise different $\ell, k_1, k_2$.

For a negative literal $L_i = \neg p_{\ell j}$ or a positive literal $L_i = p_{\ell j}$ it follows from Lemma 7.14 that replacing $S_j$ by $S_j - \{x_\ell\}$ and $\{x_\ell\}$ defines a locally stratified constraint set. Therefore, by induction in all four cases (with the restrictions for indices) we obtain solutions for smaller DCP-instances with

$$\mathcal{S}_1 = \{S_1, \dots, S_j - \{x_\ell\}, \dots, S_m, \{x_\ell\}\},$$

$$\mathcal{S}_2 = \{S_1, \dots, S_{i_1} - \{x_{k_1}\}, \dots, S_m, \{x_{k_1}\}\} \quad\text{and}$$

$$\mathcal{S}_3 = \{S_1, \dots, S_{i_2} - \{x_{k_2}\}, \dots, S_m, \{x_{k_2}\}\}$$

respectively. If any of these solutions contains both (or none) of the split components, e.g. $S_j - \{x_\ell\}$ and $\{x_\ell\}$, we also have a solution for the original problem.

Therefore, assume that all solutions for $(X, \mathcal{S}_i)$ must contain exactly one of the split components, denoted as $S_1$, $S_2$ and $S_3$. Let $\mathcal{S}'_i = \{S^i_1, \dots, S^i_{n_i}, S_i\}$ be a solution for $(X, \mathcal{S}_i)$.

For $i \neq j$ we proceed in the following way:

Start with $T_i = \mathcal{S}'_i - \mathcal{S}'_j$, $T_j = \mathcal{S}'_j - \mathcal{S}'_i$ and $T = \{S_j\}$ and execute the following steps until there are no more changes:

– Remove all sets from $T_i$ intersecting some set in $T$ and let these define a new $T$.

– Remove all sets from $T_j$ intersecting some set in $T$ and let these define a new $T$.

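Finally, if $T_i$ (and then also $T_j$) are non-empty, this means that we may replace $T_j \subseteq \mathcal{S}'_j$ by $T_i$ or $\mathcal{S}'_j - T_j$ by $\mathcal{S}'_i - T_i$. According to our assumption on solutions we always keep either $S_i$ or $S_j$. Consequently, the procedure above defines a chain

$$S_i \;\text{–}\; S^j_{i_1} \;\text{–}\; S^i_{i_1} \;\text{–}\; S^j_{i_2} \;\text{–}\; S^i_{i_2} \;\text{–}\; \cdots \;\text{–}\; S^j_{i_k} \;\text{–}\; S^i_{i_k} \;\text{–}\; S_j$$

where neighbouring sets have a common element. This is still true if we replace $S_i$ by the original $S_j$. Taking together all three choices for $(i, j)$ we obtain an odd-length cycle

$$S_{i_1} \;\text{–}\; S_{i_2} \;\text{–}\; S_{i_3} \;\text{–}\; \cdots \;\text{–}\; S_{i_m} \;\text{–}\; S_{i_1}$$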

with intersecting neighbouring sets $S_{i_j} \in \mathcal{S}$. Let $\Sigma'$ be the set of constraints corresponding to $\{S_{i_1}, \dots, S_{i_m}\}$. Then $\Sigma'$ differs from a subset of $\Sigma$ only by the fact that cover clauses may have been shortened. Since omitted (positive) literals in these cover clauses do not occur in any other clauses in $\Sigma'$, this subset must be locally stratified iff $\Sigma'$ is locally stratified. Therefore, the proof is completed if we can show that cycles as above always define constraint sets that are not locally stratified or not satisfiable.

With each neighbouring pair $(S_{i_j}, S_{i_{j+1}})$ we may associate a witness $x \in S_{i_j} \cap S_{i_{j+1}}$. Then without loss of generality (just rename indices) we can always assume a cycle

$$S_1 \overset{x_1}{\text{–}} S_2 \overset{x_2}{\text{–}} S_3 \;\text{–}\; \cdots \;\text{–}\; S_m \overset{x_m}{\text{–}} S_{m+1} = S_1$$

and show that the following conditions can be achieved:

– $m$ is odd,

– the $x_i$ are pairwise different,

– the $S_i$ are pairwise different and

– the cover clause in $\Sigma'$ for $x_\ell$ has the form $p_{\ell\ell} \vee p_{\ell,\ell+1} \vee C'_\ell$, where literals in $C'_\ell$ do not occur in any other clause in $\Sigma'$.

The last condition will allow us to assume without loss of generality that cover clauses in $\Sigma'$ only contain two literals.

In order to achieve such a cycle recall that our original cycle is composed of three subpaths (called flanks) corresponding to a solution of a smaller DCP-instance and each pair of flanks has a common set (called corner). If $S_i \;\text{–}\; S_j \;\text{–}\; S_k$ is such a corner, then the following cases may arise:

– The two neighbours $S_i$ and $S_k$ coincide, which allows us to remove the corner $S_j$ and to identify $S_i$ with $S_k$.

– If $S_i$, $S_j$ and $S_k$ are pairwise different, we either obtain a simple cycle of length 3 or leave the cycle unchanged.

– If one of the neighbours equals $S_j$, say $S_k = S_j$, then $S_k$ is not common in the solutions for the flank with $S_j$ and $S_k$, i.e. there must be some $S_{j'}$ in the same solution as $S_i$ with $S_j \cap S_{j'} \neq \emptyset$. In this case we may replace the even number of edges between $S_j$ and $S_{j'}$ by a single edge. By the same argument we may replace the even number of edges between the opposite edge $S_\ell$ (in the same flank) and some $S_{\ell'}$ by a single edge.

In all these cases the cycle length remains odd.

If $x_i$ occurs twice, say between $S_{i_1}$ and $S_{i_2}$ and between $S_{i_3}$ and $S_{i_4}$ respectively, we may assume paths from $S_{i_1}$ to $S_{i_4}$ and from $S_{i_2}$ to $S_{i_3}$ of length $n_1$ and $n_2$ respectively. Then there are cycles with $S_{i_2}$, $S_{i_3}$ and $S_{i_1}$, $S_{i_4}$ connected by $x_i$ respectively and one of the corresponding lengths $n_1 + 1$ or $n_2 + 1$ must be odd. The only critical cases occur for $S_{i_2} = S_{i_4}$ or $S_{i_1} = S_{i_3}$, but these correspond to corners that have already been removed.

Finally, in order to achieve the condition on cover clauses consider $S_i \cap S_j \neq \emptyset$.

– If $S_i$ and $S_j$ belong to different flanks, but to the same solution, then we have $S_i = S_j$ and we may identify them and remove the even number of edges between them.

– If $S_i$ and $S_j$ belong to different flanks and different solutions, then for $S_i \neq S_j$ we may replace the odd number of edges between them by a single new edge, whereas for $S_i = S_j$ we may consider the odd number of edges between them as our new cycle.

– If $S_i$ and $S_j$ belong to the same flank, then the number of edges between them is even iff $S_i = S_j$, thus may be removed or replaced by a single new edge.

The conditions on our cycle now allow clauses to be arranged in such a way that we have

$$\Sigma' = \{\, {\sim}L_1 \vee L_2,\ {\sim}L_2 \vee L_3,\ \dots,\ {\sim}L_{p-1} \vee L_p,\ {\sim}L_p \vee L_1 \,\}$$

for an even number $p$ with $L_{p/2+i} = {\sim}L_i$ for $i = 1, \dots, p/2$. Such a $\Sigma'$, however, is not satisfiable.

□


In this article we investigated the limits of rule triggering systems (RTSs) for maintaining database integrity. The first result assures the existence of non-repairable transitions. In order to disallow such transitions the constraint implication problem must be decidable.

Secondly, we analyzed critical trigger paths in rule hypergraphs associated with RTSs. We could show that the existence of critical trigger paths leads to RTSs which may invalidate the effect of some transitions, even if these are repairable. Such a behaviour can only be excluded for locally stratified constraint sets. In this case the needed RTS can be computed effectively, but checking local stratification is NP-hard.

To summarize, both results limit the applicability of RTSs for integrity maintenance under the assumption that the intended effects of user-defined transitions should be preserved.

Fortunately, there is a stronger condition on a constraint set to be stratified, which is only sufficient for reasonable rule behaviour, but not necessary. Stratified constraint sets occur if we have a relational database schema in Entity-Relationship normal form, which means that it is equivalent to an ER-schema without exclusion constraints. Checking stratification is not only effective, but also efficient.

On the other hand, the RTS approach to integrity maintenance completely ignores user-defined transitions. Thus, a second conclusion from our studies is that these should be taken into consideration.



VLDB, Brisbane (Australia), August 1990, 566-577

2. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for Integrity Maintenance. ACM ToDS, vol. 19(3), 1994, 367-422.

3. S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering – Active Databases, Proc., Houston, February 1994

4. M. Gertz, U. W. Lipeck: Deriving Integrity Maintaining Triggers from Transition Graphs, in Proc. 9th ICDE, IEEE Computer Society Press, 1993, 22-29

5. H. Mannila, K.-J. Räihä: The Design of Relational Databases, Addison-Wesley 1992

6. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases, in S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering – Active Databases, Proc., Houston, February 1994

7. K.-D. Schewe, B. Thalheim: Active Consistency Enforcement for Repairable Database Transitions, in S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in Databases, Proc. 6th Int. Workshop on Foundations of Models and Languages for Data and Objects, Schloss Dagstuhl, 1996, 87-102, available via http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html

8. B. Thalheim: Foundations of entity-relationship modeling, Annals of Mathematics and Artificial Intelligence, vol. 7, 1993, 197-256

Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990

Proc. SIGMOD 1990, 259-270

Chapter 8 

Consistency Enforcement in 

Entity-Relationship and 

Object-Oriented Models 

Contents 

8.1 Introduction

8.2 Rule Systems for Consistency Maintenance

    8.2.1 Motivation

    8.2.2 ECA-Rules

8.3 Problems with Rule-Based Integrity Enforcement

    8.3.1 Non-Repairable Transactions

    8.3.2 Critical Paths

    8.3.3 Extensions

8.4 Well-behaving Rule Systems

    8.4.1 Stratified Rule Systems

    8.4.2 Constraints Arising from Entity-Relationship Schemata

    8.4.3 Constraints Arising from Simple Object-Oriented Schemata

8.5 Conflict Resolution

    8.5.1 Problem of Ambiguity

    8.5.2 Decidability

8.6 Conclusion


K.-D. Schewe. Consistency Enforcement in Entity-Relationship and Object-Oriented Models. Data & Knowledge Engineering. 1998 (to appear).

Abstract. Integrity maintenance is considered one of the major application fields of rule triggering systems (RTSs). In the case of a given integrity constraint being violated by a database transaction these systems trigger repairing actions. However, it has been shown that for any set of constraints there exist non-repairable transactions, which depend on the closure of the constraint set. Even if non-repairable transactions are excluded, this does not restrain the RTS from producing undesired behaviour.

Analyzing the behaviour of RTSs leads to the definition of critical paths in associated rule hypergraphs and the requirement of such paths being absent. It is shown that this requirement can be satisfied if the underlying set of constraints is stratified and that this is always the case for the structural constraints in Entity-Relationship and simple object-oriented models. Moreover, in both cases there is no ambiguity for the selection of rules.

Keywords. integrity constraints, consistency enforcement, active databases, Entity-Relationship, object-orientation, analysis of rule systems

8.1 Introduction


Active databases (ADBs) aim at extending relational (or object-oriented) DBMS by rule triggering systems (RTSs), i.e. by sets of rules which on a given event and in the case of a condition being satisfied trigger actions on the database (ECA-rules). Events can be external events, time conditions or internal events resulting from operations on the database. Conditions are usually given by boolean queries that have to be evaluated against the database. The action part consists of a sequence of basic operations to insert, delete or update tuples (or objects respectively) in the database.

The work in [3, 4, 8, 16, 17] and partly in [5] considers the problem of enforcing database integrity by the use of RTSs. The results concern the generation of repairing ECA-rules and partly the analysis of the resulting RTS. This analysis concentrates on the termination of the rule system, the independence of the final database state from the chosen selection order of the rules (confluence) and on consistency.

These requirements are not sufficient for a reasonable rule behaviour, because it is easy to define an RTS that empties the database in case of any constraint violation. Therefore, we impose an additional requirement, which informally means that the intended effect of a transaction may not be turned into its opposite by the RTS.

In this paper we investigate general problems with RTSs and show that these cannot occur in simple Entity-Relationship and object-oriented schemata. The first problem concerns the existence of non-repairable transactions that are determined by the closure of the constraint set. The second problem arises from the analysis of how to obtain RTSs that definitely repair constraint violations by a (repairable) transaction without invalidating its intended effect. Given an RTS we associate with it a rule hypergraph which corresponds to the possible sequences of triggered rules. We define critical trigger paths in these hypergraphs that correspond to the propagation of conditions. Then it can be shown that the existence of a single critical trigger path makes the RTS work incorrectly for at least one transaction.

Next we analyze constraint sets in order to detect whether it is possible to define an RTS of repairing actions such that the critical trigger paths in its associated hypergraph can only invalidate non-repairable transactions. For this we introduce stratified constraint sets that satisfy this condition. We apply our results to the case of specific Entity-Relationship and simple object-oriented models and demonstrate that structurally determined constraint sets in these cases are always stratified. Furthermore, it will be shown that in these cases ambiguities arising from different execution orders can also be detected.

The work presented in this paper extends previous research in [12, 14] in that theoretical investigations about the strengths and weaknesses of the rule triggering approach for integrity maintenance have been directly tied in with consistency in Entity-Relationship and simple object-oriented models. A preliminary version was presented at the 1997 conference on Conceptual Modelling (ER '97) [13].

8.2 Rule Systems for Consistency Maintenance 

Let us first consider the relational data model with integrity constraints given by closed formulae $I$ in implicative normal form

$$\forall x_1 \dots x_k.\ \exists y_1 \dots y_\ell.\ p_1(\vec{x}_1) \wedge \dots \wedge p_n(\vec{x}_n) \Rightarrow q_1(\vec{y}_1) \vee \dots \vee q_m(\vec{y}_m) . \qquad (8.76)$$

The vectors $\vec{x}_i$ consist only of universally quantified variables $x_j$ and the vectors $\vec{y}_i$ consist of both universally quantified variables $x_j$ and existentially quantified variables $y_j$. The predicate symbols $p_i$, $q_j$ correspond either to a relation of the schema or are comparison predicates ($=$, $\neq$, …).

8.2.1 Motivation 

Let us first illustrate consistency enforcement using a small fragment of the example used in [4, 10].

Example 8.1. Let us define a schema with some simple functional and inclusion constraints. For simplicity we omit all types. The relation schemata are

    WIRE = { wire_id, connection, wire_type, voltage, power },
    TUBE = { tube_id, connection, tube_type } and
    CONNECTION = { connection, from, to }.

These are used to express that there are tubes between two locations and wires in these tubes. In addition consider the following constraints:

    FD$_1$ $\equiv$ WIRE : wire_id $\to$ connection, wire_type, voltage, power
    FD$_2$ $\equiv$ TUBE : tube_id $\to$ connection, tube_type
    FD$_3$ $\equiv$ CONNECTION : connection $\to$ from, to
    ID$_1$ $\equiv$ WIRE[connection] $\subseteq$ TUBE[connection]
    ID$_2$ $\equiv$ TUBE[connection] $\subseteq$ CONNECTION[connection]

The first three functional dependencies express that the values of wire_id, tube_id and connection are unique in relations over WIRE, TUBE and CONNECTION respectively. The latter inclusion constraints express that there is no wire nor tube without a corresponding tuple in a relation over CONNECTION.
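For illustration, two of these constraints can be written in the implicative normal form (8.76). This is only a sketch: the variable names are chosen here and do not appear in the original, and the functional dependency FD$_2$ is split into one formula per dependent attribute because the right hand side of (8.76) is a disjunction.

```latex
\begin{align*}
\mathrm{ID}_1 &\equiv \forall w, c, t, v, p.\ \exists x, t'.\
    \mathrm{WIRE}(w, c, t, v, p) \Rightarrow \mathrm{TUBE}(x, c, t') \\
\mathrm{FD}_2 &\equiv \forall x, c, t, c', t'.\
    \mathrm{TUBE}(x, c, t) \wedge \mathrm{TUBE}(x, c', t') \Rightarrow c = c' \\
  &\phantom{{}\equiv{}} \forall x, c, t, c', t'.\
    \mathrm{TUBE}(x, c, t) \wedge \mathrm{TUBE}(x, c', t') \Rightarrow t = t'
\end{align*}
```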

Then the following relations define an instance of the schema:

WIRE

    wire_id  connection  wire_type  voltage  power
    4711     HH-HB       Koax       12       600
    4814     HH-H        Tel        12       600

TUBE

    tube_id  connection  tube_type
    8314     HH-H        GX44
    8511     HH-HB       GX44
    023      HB-H        T33

CONNECTION

    connection  from     to
    HH-H        Hamburg  Hannover
    HH-HB       Hamburg  Bremen
    HB-H        Bremen   Hannover

It is easy to see that this instance satisfies the constraints above.
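The claim can be checked mechanically. The following Python sketch encodes the instance as sets of tuples and tests the three key dependencies and the two inclusion dependencies; the helper names (`satisfies_key`, `satisfies_inclusion`) and the tuple encoding are assumptions made for this illustration only.

```python
# Instance of Example 8.1; attribute order follows the relation schemata.
WIRE = {
    ("4711", "HH-HB", "Koax", 12, 600),
    ("4814", "HH-H", "Tel", 12, 600),
}
TUBE = {
    ("8314", "HH-H", "GX44"),
    ("8511", "HH-HB", "GX44"),
    ("023", "HB-H", "T33"),
}
CONNECTION = {
    ("HH-H", "Hamburg", "Hannover"),
    ("HH-HB", "Hamburg", "Bremen"),
    ("HB-H", "Bremen", "Hannover"),
}


def satisfies_key(relation, key_index=0):
    """Key dependency: no two distinct tuples share the key value."""
    keys = [t[key_index] for t in relation]
    return len(keys) == len(set(keys))


def project(relation, index):
    """Projection of a relation onto a single attribute position."""
    return {t[index] for t in relation}


def satisfies_inclusion(left, left_index, right, right_index):
    """Inclusion dependency left[attr] is a subset of right[attr]."""
    return project(left, left_index) <= project(right, right_index)


checks = {
    "FD1": satisfies_key(WIRE),
    "FD2": satisfies_key(TUBE),
    "FD3": satisfies_key(CONNECTION),
    "ID1": satisfies_inclusion(WIRE, 1, TUBE, 1),
    "ID2": satisfies_inclusion(TUBE, 1, CONNECTION, 0),
}
```

Every entry of `checks` evaluates to `True` for this instance; adding a wire on a connection without a tube would make the `ID1` check fail.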

Now consider the operation insert_WIRE(t). This may lead to a violation of constraint ID$_1$, in which case we must add a tuple to TUBE. Hence it can be replaced by

    insert_WIRE(t)
    IF connection(t) $\notin$ TUBE[connection]
    THEN insert_TUBE(?, connection(t), ?)
    ENDIF

Here the question marks stand for arbitrarily chosen values of the corresponding data type.

Similarly, the operation delete_TUBE(t) may also violate ID$_1$. Therefore, we may replace delete_TUBE(t) by

    delete_TUBE(t)
    IF connection(t) $\in$ WIRE[connection] $-$ TUBE[connection]
    THEN FOR ALL t' WITH connection(t') = connection(t) DO
             delete_WIRE(t')
         ENDFOR
    ENDIF

In order to enforce FD$_2$ we may then replace insert_TUBE(t) by

    IF $\forall$t' $\in$ TUBE . tube_id(t) $\neq$ tube_id(t')
    THEN insert_TUBE(t)
    ENDIF

Let us now add the exclusion constraint ED $\equiv$ WIRE[wire_id] $\parallel$ TUBE[tube_id]. In order to enforce this constraint insertions into one of WIRE or TUBE should be followed by deletions in the other. The resulting transactions are

    insert_WIRE(t)
    FOR ALL t' $\in$ TUBE WITH tube_id(t') = wire_id(t) DO
        delete_TUBE(t')
    ENDFOR

and

    insert_TUBE(t)
    FOR ALL t' $\in$ WIRE WITH wire_id(t') = tube_id(t) DO
        delete_WIRE(t')
    ENDFOR

If we now take together FD$_2$, ID$_1$ and ED we must be very careful. E.g., if we execute insert_WIRE(8511, HH-HB, Koax, 12, 600) on the instance above, we may first delete the tuple (8511, HH-HB, GX44) in TUBE in order to enforce ED and then the two tuples (4711, HH-HB, Koax, 12, 600) and (8511, HH-HB, Koax, 12, 600) in WIRE in order to enforce ID$_1$. The resulting instance would be (omitting CONNECTION):

WIRE

    wire_id  connection  wire_type  voltage  power
    4814     HH-H        Tel        12       600

TUBE

    tube_id  connection  tube_type
    8314     HH-H        GX44
    023      HB-H        T33

Thus, the "effect" of the original operation, i.e. insertion of a tuple into WIRE, is completely destroyed. The new effect is a deletion in WIRE and TUBE.

□
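The sequence of repairs described in this example can be replayed in a few lines of Python. This is a minimal sketch, assuming that ED is repaired by deleting the conflicting tube and ID$_1$ by deleting wires without a tube; the function name `insert_wire_with_repairs` is hypothetical.

```python
# WIRE and TUBE of the instance in Example 8.1 (CONNECTION is omitted).
wire = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
tube = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}


def insert_wire_with_repairs(wire, tube, t):
    """Insert t into WIRE, then repair ED and ID1 by deletions (assumed policy)."""
    wire = set(wire) | {t}
    # ED: delete every tube whose tube_id equals the wire_id of the new wire.
    tube = {u for u in tube if u[0] != t[0]}
    # ID1 repaired by deletion: drop wires whose connection has no tube left.
    tube_connections = {u[1] for u in tube}
    wire = {w for w in wire if w[1] in tube_connections}
    return wire, tube


new_tuple = ("8511", "HH-HB", "Koax", 12, 600)
new_wire, new_tube = insert_wire_with_repairs(wire, tube, new_tuple)
```

After the call, `new_wire` contains only the wire 4814 and `new_tube` the tubes 8314 and 023, so the inserted tuple is gone again.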

8.2.2 ECA-Rules 

Active databases approach integrity enforcement by using ECA-rules. The general form of these rules is

    ON $\langle\mathit{event}\rangle$ IF $\langle\mathit{condition}\rangle$ DO $\langle\mathit{action}\rangle$ .    (8.80)

$\langle\mathit{event}\rangle$ corresponds to an internal event, i.e. an insert- or delete-operation. $\langle\mathit{condition}\rangle$ is a formula to be evaluated against the actual database state, e.g. it could be the negation $\neg I$ of a constraint $I$ in implicative normal form (8.76). $\langle\mathit{action}\rangle$ is a sequence of basic insert- or delete-operations to be triggered, i.e. to be executed if the event occurred and the condition is satisfied.

In the sequel the assumed execution model for ECA-rules relies on a deferred mode, i.e. the system $\mathit{RTS}$ of rules is started after finishing a transaction. Furthermore, we do not assume any order of the rules. Instead, the execution model relies on demonic non-determinism, i.e. if the events of several rules $r_1, \dots, r_n$ occur and their conditions evaluate to true, any of these $r_i$ may be executed unless it is undefined.

Example 8.2. Let us look again at the schema used in Example 8.1. For the sake of simplicity we only consider the constraints ID$_1$ and ED. Then the changed operations can be expressed by rules. First consider insert_WIRE(t). The corresponding rule would be

    ON insert_WIRE(w, c, t, v, p) IF c $\notin$ TUBE[connection] DO insert_TUBE(?, c, ?)    (8.81)

with ? standing for any value to be selected. This form is not yet exactly the one in (8.80), but writing relations as predicates we obtain the following:

    ON insert_WIRE IF $\exists w, c, t, v, p.\ \forall x, t'.\ \mathrm{WIRE}(w, c, t, v, p) \wedge \neg\mathrm{TUBE}(x, c, t')$
    DO insert_TUBE(?, c, ?) .    (8.82)

Note that the condition part is exactly the negation of ID$_1$. Analogously, the other changes to operations discussed in Example 8.1 give rise to the following rules:

    ON delete_TUBE IF $\exists w, c, t, v, p.\ \forall x, t'.\ \mathrm{WIRE}(w, c, t, v, p) \wedge \neg\mathrm{TUBE}(x, c, t')$
    DO delete_WIRE(w, c, t, v, p)    (8.83)

    ON insert_WIRE IF $\exists w, c, t, v, p, c', t'.\ \mathrm{WIRE}(w, c, t, v, p) \wedge \mathrm{TUBE}(w, c', t')$
    DO delete_TUBE(w, c', t')    (8.84)

    ON insert_TUBE IF $\exists x, c, t, v, p, c', t'.\ \mathrm{WIRE}(x, c', t', v, p) \wedge \mathrm{TUBE}(x, c, t)$
    DO delete_WIRE(x, c', t', v, p)    (8.85)

In order to fit with the intended behaviour described in Example 8.1 it may occur that the same rule has to be executed several times. This can be achieved if the semantics of the IF-part is considered as a WHILE-condition.

□

Given a single constraint $I$ in implicative normal form (8.76) we already get minimum requirements for repairing rules. If a relation symbol $p$ occurs on the left hand side (right hand side) of (8.76), then each insert- (delete-)operation on $p$ may violate (8.76), hence give rise to event-parts. The corresponding condition-part is simply $\neg I$. However, for the action-part there are still several alternatives.

We call a system of ECA-rules complete iff for all these cases of events and conditions there exists at least one repairing rule, i.e. whenever the rule is selectable in some database state, the execution of the action part will establish $I$ as a postcondition. However, we exclude those rules which simply invalidate the event. For transactions we simply consider sequences of insert- and delete-operations.

Example 8.3. The four rules in the previous Example 8.2 form a complete system of ECA-rules, if we consider only the constraints ID$_1$ and ED from Example 8.1. However, if we also consider the other constraints in that example, we have to define at least five more rules to obtain a complete rule set, one rule for each of the three key constraints corresponding to the events insert_WIRE, insert_TUBE and insert_CONNECTION, respectively, and two rules for the inclusion constraint ID$_2$ corresponding to insert_TUBE and delete_CONNECTION.

□
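The rule count in this example follows directly from the minimum requirement stated above: one event per relation symbol on the left hand side (insertions) and on the right hand side (deletions) of each constraint. The sketch below derives these events for the six constraints of Example 8.1; the function `required_events` and the dictionary encoding of the constraints are illustrative assumptions, and comparison predicates are left out because they are not relations of the schema.

```python
from typing import Dict, List, Tuple


def required_events(lhs: List[str], rhs: List[str]) -> List[Tuple[str, str]]:
    """Events that may violate a constraint with relation symbols lhs => rhs."""
    events = []
    for rel in dict.fromkeys(lhs):  # insertions into left hand side relations
        events.append(("insert", rel))
    for rel in dict.fromkeys(rhs):  # deletions from right hand side relations
        events.append(("delete", rel))
    return events


# Relation symbols on the left and right hand sides (comparison predicates omitted).
CONSTRAINTS: Dict[str, Tuple[List[str], List[str]]] = {
    "FD1": (["WIRE", "WIRE"], []),
    "FD2": (["TUBE", "TUBE"], []),
    "FD3": (["CONNECTION", "CONNECTION"], []),
    "ID1": (["WIRE"], ["TUBE"]),
    "ID2": (["TUBE"], ["CONNECTION"]),
    "ED": (["WIRE", "TUBE"], []),
}

events = {name: required_events(lhs, rhs) for name, (lhs, rhs) in CONSTRAINTS.items()}
total = sum(len(evs) for evs in events.values())
```

For ID$_1$ and ED together this gives four events, and nine for all six constraints, which matches the four rules of Example 8.2 plus the five additional ones.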


8.3 Problems with Rule-Based Integrity Enforcement 

If we were given only a single constraint $I$, then any of the rule constructions discussed in the previous section would be sufficient to enforce consistency. However, real systems (like the tiny one in Example 8.1) contain many constraints and the interference of the rules may lead to problems.

8.3.1 Non-Repairable Transactions

Let us first demonstrate the insufficiency of a naive RTS approach using a second trivial example. In "real" applications as in the previous subsection the situation of Example 8.4 will not occur in such an obvious way, but there are always implied and in general not detectable constraints leading to analogous problems.

Example 8.4. Take two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv p(x) \wedge q(x) \Rightarrow \mathit{false}$. This implies $p$ to be always empty, hence insertions into $p$ should be abolished. Then we obtain the following repairing rules:

    $R_1$ : ON insert_p IF $\exists x.\ p(x) \wedge \neg q(x)$ DO insert_q(x)
    $R_2$ : ON delete_q IF $\exists x.\ p(x) \wedge \neg q(x)$ DO delete_p(x)
    $R_3$ : ON insert_p IF $\exists x.\ p(x) \wedge q(x)$ DO delete_q(x)
    $R_4$ : ON insert_q IF $\exists x.\ p(x) \wedge q(x)$ DO delete_p(x)

Here again the condition part in $R_1$ and $R_2$ is simply $\neg I_1$ and the condition part in $R_3$ and $R_4$ is $\neg I_2$.

If we try to execute a transaction insert_p(a) on a database state satisfying $q(a)$, then we successively trigger the rules $R_3$ and $R_2$ with the effect of only deleting $a$ in $q$. This contradicts the original intention of the transaction.

□
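The behaviour of these four rules can be replayed with a small simulator. The sketch below is an illustration under stated assumptions, not the execution model of any particular system: rules run in deferred mode after the transaction, each rule is triggered by the most recent operation, and the first applicable rule is chosen as one possible resolution of the non-determinism. The names `Rule`, `run` and `RULES` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, Set[str]]
Event = Tuple[str, str]  # (operation, relation)


@dataclass(frozen=True)
class Rule:
    name: str
    event: Event
    condition: Callable[[State, str], bool]  # holds for a witness value x
    action: Event                            # operation applied to the witness


def violates_i1(s: State, x: str) -> bool:
    return x in s["p"] and x not in s["q"]


def violates_i2(s: State, x: str) -> bool:
    return x in s["p"] and x in s["q"]


RULES = [
    Rule("R1", ("insert", "p"), violates_i1, ("insert", "q")),
    Rule("R2", ("delete", "q"), violates_i1, ("delete", "p")),
    Rule("R3", ("insert", "p"), violates_i2, ("delete", "q")),
    Rule("R4", ("insert", "q"), violates_i2, ("delete", "p")),
]


def apply(state: State, op: str, rel: str, x: str) -> None:
    if op == "insert":
        state[rel].add(x)
    else:
        state[rel].discard(x)


def run(state: State, transaction: List[Tuple[str, str, str]], max_steps: int = 20):
    """Execute the transaction, then fire rules until none is applicable."""
    state = {rel: set(values) for rel, values in state.items()}
    event = None
    for op, rel, x in transaction:
        apply(state, op, rel, x)
        event = (op, rel)
    trace = []
    for _ in range(max_steps):  # guard against non-terminating rule sets
        domain = sorted(state["p"] | state["q"])
        candidates = [(r, x) for r in RULES if r.event == event
                      for x in domain if r.condition(state, x)]
        if not candidates:
            break
        rule, x = candidates[0]  # one arbitrary choice among applicable rules
        apply(state, rule.action[0], rule.action[1], x)
        trace.append(rule.name)
        event = rule.action
    return state, trace


final_state, trace = run({"p": set(), "q": {"a"}}, [("insert", "p", "a")])
```

Starting from a state with $q(a)$, the run fires $R_3$ and then $R_2$ and ends with both relations empty, so compared with the initial state only $a$ has been deleted from $q$.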

In order to analyze the unintended behaviour in Example 8.4 consider a set $\Sigma$ of constraints in implicational normal form. Let $\Sigma^*$ denote the (semantic) closure, i.e. $\Sigma^* = \{ I \mid \Sigma \models I \}$. Now let $I \in \Sigma^*$ be non-trivial, i.e. it does not hold in all database states. Write $I$ in implicational normal form

$$I \equiv p_1(\vec{x}_1) \wedge \dots \wedge p_n(\vec{x}_n) \Rightarrow q_1(\vec{y}_1) \vee \dots \vee q_m(\vec{y}_m)$$

and let $p_{i_1}, \dots, p_{i_k}$ and $q_{j_1}, \dots, q_{j_\ell}$ denote the relation symbols on the left and right hand sides of $I$ respectively. We may define a transaction $T$ by

$$\mathit{delete}_{q_{j_1}}(\vec{y}_{j_1}); \dots; \mathit{delete}_{q_{j_\ell}}(\vec{y}_{j_\ell}); \mathit{insert}_{p_{i_1}}(\vec{x}_{i_1}); \dots; \mathit{insert}_{p_{i_k}}(\vec{x}_{i_k}) .$$

If we start $T$ with values for the $x_i$ and $y_j$ such that the additional conditions on the left hand side of $I$ are satisfied, whilst the additional conditions on the right hand side are not, $T$ will always reach a database state satisfying $\neg I$. This effect of $T$ is intentional and hence the only reasonable approach to integrity maintenance in this case is to disallow such transactions.
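A minimal Python sketch of this construction for the single constraint $p(x) \Rightarrow q(x)$: the transaction deletes from the right hand side relation and inserts into the left hand side relation, and the resulting state violates the constraint whatever the initial state was. The function names and the set encoding are assumptions made for this illustration.

```python
from itertools import product


def violating_transaction(p, q, a):
    """T = delete_q(a); insert_p(a), built for the constraint p(x) => q(x)."""
    q = q - {a}
    p = p | {a}
    return p, q


def holds(p, q):
    """Check the constraint p(x) => q(x) in the state (p, q)."""
    return all(x in q for x in p)


# Run T from every initial state over the single value "a".
results = []
for in_p, in_q in product([False, True], repeat=2):
    p = {"a"} if in_p else set()
    q = {"a"} if in_q else set()
    results.append(holds(*violating_transaction(p, q, "a")))
```

All four entries of `results` are `False`: no repair can both restore the constraint and keep the effect of this transaction.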

More formally, the effect of a transaction $T$ in a state $\sigma$ is given by the strongest (with respect to $\Rightarrow$) formula $E_\sigma(T) = \psi$ such that $\models_\sigma wp(T)(\psi)$ holds. Here $wp(T)(\psi)$ denotes the weakest precondition of $\psi$ under the transaction $T$, i.e. starting $T$ in initial state $\sigma$ will reach a final state satisfying $\psi$.

Since we only consider sequences of insertions and deletions, $E_\sigma(T)$ can always be written as a conjunction of literals, i.e. in negated implicational normal form, with the positive literals corresponding to insertions and the negative ones to deletions. In addition, we may consider the effect of a sequence $T; \mathit{RTS}$, where $T$ is a transaction and $\mathit{RTS}$ a system of rules. We say that $\mathit{RTS}$ invalidates the effect of $T$ iff $\not\models E_\sigma(T) \wedge E_\sigma(T; \mathit{RTS})$ holds for some state $\sigma$.

Then it is justified to call a transaction $T$ repairable with respect to the constraint set $\Sigma$ iff $\neg E_\sigma(T) \notin \Sigma^*$ holds for at least one state $\sigma$. Then a complete terminating system $\mathit{RTS}$ of ECA-rules always invalidates the effect of a non-repairable transaction $T$. Hence the first problem is to detect (and exclude) non-repairable transactions. In order to decide whether a given transaction $T$ is repairable or not, we must be able to decide whether $\neg E_\sigma(T)$ is in the closure $\Sigma^*$. Hence the implication problem for constraints must be decidable.

Note that our treatment ignores the termination problem. Non-terminating transactions 

have to be excluded as well, but this problem is independent from the repairability problem, 

since non-termination of RTSs occurs as an orthogonal problem. 

8.3.2 Critical Paths 

Let us ask whether we can always find a complete set of repair rules for all repairable transactions. For this we introduce the notions of associated hypergraphs and critical trigger paths.

Let $S = \{p_1, \dots, p_n\}$ be a relational database schema and $\mathit{RTS} = \{R_1, \dots, R_m\}$ a system of ECA-rules on $S$. Then the associated rule hypergraph $(V, E)$ is constructed as follows:

– $V$ is the disjoint union of $S$ and $\mathit{RTS}$. We then talk of $S$-vertices and $\mathit{RTS}$-vertices respectively.

– If $R \in \mathit{RTS}$ has event-part $\mathit{Ev}$ on $p \in S$ and actions on $p_1, \dots, p_k$, then we have a hyperedge from $p$ to $\{R\}$ labelled by $+$ or $-$ depending on $\mathit{Ev}$ being an insert or delete, and a hyperedge from $\{R\}$ to $\{p_1, \dots, p_k\}$ analogously labelled by $k$ values $+$ or $-$.
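The construction above can be sketched in Python as follows, using the four rules of Example 8.4 as input. The representation of a hyperedge as a triple (source vertices, target vertices, labels) and the names `EcaRule` and `rule_hypergraph` are assumptions chosen for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Op = Tuple[str, str]  # (label, relation) with label "+" for insert, "-" for delete


@dataclass(frozen=True)
class EcaRule:
    name: str
    event: Op
    actions: Tuple[Op, ...]


def rule_hypergraph(schema: List[str], rules: List[EcaRule]):
    """Build the vertices and labelled hyperedges of the associated rule hypergraph."""
    # V is the disjoint union of S and RTS; a tag keeps the two kinds apart.
    vertices = [("S", p) for p in schema] + [("RTS", r.name) for r in rules]
    edges = []
    for r in rules:
        label, p = r.event
        # Hyperedge from the event relation to the rule.
        edges.append(((p,), (r.name,), (label,)))
        # Hyperedge from the rule to all relations touched by its actions.
        targets = tuple(rel for _, rel in r.actions)
        labels = tuple(lab for lab, _ in r.actions)
        edges.append(((r.name,), targets, labels))
    return vertices, edges


rules_8_4 = [
    EcaRule("R1", ("+", "p"), (("+", "q"),)),
    EcaRule("R2", ("-", "q"), (("-", "p"),)),
    EcaRule("R3", ("+", "p"), (("-", "q"),)),
    EcaRule("R4", ("+", "q"), (("-", "p"),)),
]
vertices, edges = rule_hypergraph(["p", "q"], rules_8_4)
```

Since every rule here has a single action, all hyperedges have exactly one target and the hypergraph is an ordinary labelled graph with six vertices and eight edges.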

Example 8.5. Figure 8.1 shows the associated rule hypergraph of Example 8.4 in which case we have a simple graph. Note that whenever action-parts consist only of a single operation, the rule hypergraph degenerates to a graph.

As a more practical example Figure 8.2 contains the associated hypergraph for Example 8.1 with the rules discussed in Example 8.2. In particular, rules $R_1$, $R_2$ and $R_3$ correspond to the functional dependencies FD$_1$, FD$_2$ and FD$_3$, rules $R_4$ and $R_5$ to the inclusion dependency ID$_1$, rules $R_6$ and $R_7$ to the inclusion dependency ID$_2$ and rules $R_8$ and $R_9$ to the exclusion dependency ED. Furthermore, we used the abbreviations $W$, $T$ and $C$ for WIRE, TUBE and CONNECTION, respectively.

□

So far we have ignored the condition part of the rules. These come into play if we consider critical trigger paths in associated hypergraphs. These are defined in several steps starting from paths in the associated hypergraph which correspond to possible sequences of ECA-rules with respect only to their event- and action-parts. Secondly we attach formulae to the $S$-vertices in the path in such a way that pre- and postconditions of the involved rules are expressed. Then we talk of trigger paths.

A maximal trigger path with contradicting initial and final condition will then be called critical. Then imagine a transaction with an effect implied by the initial formula, i.e. that there is an initial state such that running the transaction in this state results in a state which satisfies the initial condition of the trigger path. Executing this transaction followed by the rule triggering system along the critical trigger path will then turn the effect of the transaction into its opposite. This means that the $\mathit{RTS}$ invalidates the effect of at least one transaction.

Fig. 8.1. Associated Rule Hypergraph for $\mathit{RTS} = \{R_1, R_2, R_3, R_4\}$ in Example 8.4

Fig. 8.2. Associated Rule Hypergraph for Example 8.1

Let $G = (V, E)$ be the rule hypergraph associated with a system $RTS$ of rules. A trigger path in $G$ is a sequence $v_0, e_1, v'_1, e'_1, \dots, e'_\ell, v_\ell$ of vertices and hyperedges with the following conditions:

- $v_i \in S$ holds for all $i = 0, \dots, \ell$,

- $v'_i \in RTS$ holds for all $i = 1, \dots, \ell$,

- $e_i$ is a hyperedge from $v_{i-1}$ to $v'_i$ and

- $e'_i$ is a hyperedge from $v'_i$ to $V_i$ with $v_i \in V_i$ and the same label as $e_{i+1}$.

We call $\ell$ the length of the trigger path.

In addition we associate with each vertex $v_i \in S$ ($i = 0, \dots, \ell$) a formula $\varphi_i$ in negated implication normal form such that $\models \varphi_i \Rightarrow cond(v'_{i+1})$ holds for the condition part $cond(v'_{i+1})$ of rule $v'_{i+1} \in RTS$ and $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ holds for the action-part $A_{i+1}$ of rule $v'_{i+1}$ ($i = 0, \dots, \ell - 1$). Furthermore, there is no $e_{\ell+1} \in E$ from $v_\ell$ to $v'_{\ell+1}$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.

Then a trigger path is critical iff $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds. Such a critical trigger path is called non-admissible iff there is a consistent state $\sigma$ and a repairable transaction $T$ such that $E_\sigma(T) \Leftrightarrow \varphi_0$ holds.


Critical trigger paths for the associated rule hypergraph in Figure 8.1 are sketched in 

Figure 8.3. Note that in this case both critical trigger paths are not non-admissible. 
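The criticality condition can be checked mechanically when the attached formulae are plain conjunctions of literals over the same variable, as in Figure 8.3. The following Python sketch rests on that assumption; the dictionary encoding (atom name mapped to its sign) and the function names are hypothetical, and only the final condition (the initial and final formula contradict each other) is tested, not the conditions on `cond` and `wp`.

```python
def contradicts(phi_a, phi_b):
    """True if two conjunctions of literals contain a complementary pair.

    A conjunction is encoded as {atom: sign}, e.g. p(x) and not q(x)
    becomes {"p": True, "q": False}."""
    return any(atom in phi_b and phi_b[atom] != sign
               for atom, sign in phi_a.items())


def is_critical(formulas):
    """Check the last condition of the definition: phi_0 and phi_l are contradictory."""
    return contradicts(formulas[0], formulas[-1])


# The two formula sequences shown for the trigger paths of Figure 8.3
path_1 = [{"p": True, "q": False}, {"p": True, "q": True}, {"p": False, "q": True}]
path_2 = [{"p": True, "q": True}, {"p": True, "q": False}, {"p": False, "q": False}]

print(is_critical(path_1), is_critical(path_2))
```

For arbitrary formulae in negated implication normal form a satisfiability test would be needed instead of this syntactic comparison.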

 

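Fig. 8.3. Critical Trigger Paths

[Figure: two trigger paths $v_0, e_1, v'_1, e'_1, v_1, e_2, v'_2, e'_2, v_2$ leading from $p$ via $q$ back to $p$. The first path uses the rules $R_1$ and $R_4$ with edge labels $+, +, +, -$ and the formulae $p(x) \wedge \neg q(x)$, $p(x) \wedge q(x)$, $\neg p(x) \wedge q(x)$. The second path uses the rules $R_3$ and $R_2$ with edge labels $+, -, -, -$ and the formulae $p(x) \wedge q(x)$, $p(x) \wedge \neg q(x)$, $\neg p(x) \wedge \neg q(x)$.]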

If a critical trigger path is not non-admissible, then only a non-repairable transaction can be invalidated by running the rules in the trigger path. Since we exclude non-repairable transactions, we only have to consider non-admissible trigger paths. After these remarks we are able to state our next result:

Let $RTS$ be a complete set of rules associated with a set $\Sigma$ of constraints and let $G = (V, E)$ be the associated rule hypergraph. Then $G$ contains a non-admissible critical trigger path iff there exists a consistent database state $\sigma$ and a repairable transaction $T$ such that executing $T$ in $\sigma$ and consecutively running $RTS$ invalidates the effect of $T$ without leaving the database unchanged.

To sketch a proof, consider the sequence $\varphi_0, \dots, \varphi_\ell$ of formulae associated with a critical trigger path. According to the label of $e_1$ being $+$ or $-$, $\varphi_0$ contains either a literal $p(x)$ or $\neg p(x)$. Choose a consistent state $\sigma$ with $\models_\sigma \neg p(x)$ or $\models_\sigma p(x)$, respectively, and a repairable transaction $T$ with $E_\sigma(T) \Leftrightarrow \varphi_0$. By the definition of critical trigger paths $RTS$ invalidates the effect of $T$. Finally, use induction on the length $\ell$ to show that the state resulting from $T$ followed by $RTS$ is different from $\sigma$.

Conversely, if there is no non-admissible critical trigger path, let $T$ be a repairable transaction and $\sigma$ a database state which is consistent with respect to $\Sigma$. Now start $T$ in $\sigma$ and assume that the resulting state $\sigma'$ is not consistent. Then consider a trigger path of finite length such that $\models_{\sigma'} \varphi_0$ holds. The consecutive execution of the rules in this trigger path will result in a state satisfying $\varphi_\ell$. Thus, we have $E_\sigma(T) \Leftrightarrow \varphi_0$ and $E_\sigma(T; RTS) \Leftrightarrow \varphi_\ell$. According to our assumption, the trigger path used cannot be critical. Hence $RTS$ does not invalidate the effect of $T$.

The full proof is contained in [12] and [14, p.82f.].

8.3.3 Extensions 

In our model the execution of a rule with condition-part $\neg I$ does not completely repair violations to the constraint $I$, since there may be more than just one violating tuple. There are two possible solutions to this problem:

- The first of these solutions considers a WHILE-semantics for the rules. In this case the second condition for critical trigger paths has to be replaced by $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ with $A_{i+1}$ representing the iteration of the action-part as long as the condition is satisfied. Example 8.6 shows a critical trigger path using WHILE-semantics.

- The second solution extends the rule hypergraph, as if the action-part of each rule repeated the event. Of course, this is not necessary for rules that definitely repair all violations to $I$. Figure 8.4 extends the one in Figure 8.2 with respect to the rules $R_5$, $R_7$, $R_8$ and $R_9$. In Example 8.6 we discuss critical trigger paths with respect to this extension.

Fig. 8.4. Extended Rule Hypergraph for Example 8.1

Example 8.6. The first picture in Figure 8.5 shows a critical trigger path corresponding to the rule hypergraph in Figure 8.2 using WHILE-semantics. The formulae used are

$\varphi_0 \equiv W(8511, \text{HH-HB}, \dots) \wedge W(4711, \text{HH-HB}, \dots) \wedge T(8511, \text{HH-HB}, \dots)$

$\varphi_1 \equiv W(8511, \text{HH-HB}, \dots) \wedge W(4711, \text{HH-HB}, \dots) \wedge \neg T(8511, \text{HH-HB}, \dots)$

$\varphi_2 \equiv \neg W(8511, \text{HH-HB}, \dots) \wedge \neg W(4711, \text{HH-HB}, \dots) \wedge \neg T(8511, \text{HH-HB}, \dots)$

Using extensions to the hypergraph instead, as shown in Figure 8.4, gives rise to the critical trigger path in the second picture in Figure 8.5 using

$\varphi'_2 \equiv \neg W(8511, \text{HH-HB}, \dots) \wedge W(4711, \text{HH-HB}, \dots) \wedge \neg T(8511, \text{HH-HB}, \dots)$

and the same formulae $\varphi_0$, $\varphi_1$ and $\varphi_2$ as above.

□

Neither extension affects the result stated above. To sketch a proof, the second solution is the same as adding "dummy" actions, i.e. those repeating the event, to the action part. Therefore, it corresponds to a slightly changed rule system with the same behaviour. Then the iteration in the first solution corresponds to rule iteration in the second solution.

Analogously, if action-parts contain more than one operation, the critical trigger paths considered so far do not completely reflect the sequences of rule executions. However, extending hyperedges from $RTS$-nodes to $S$-nodes according to previously triggered rules captures this situation. Just as before, this does not affect our main result on critical trigger paths. Since the practical rule systems we are interested in only comprise simple action parts, we do not discuss further examples for this extension.

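Fig. 8.5. Critical Trigger Paths for Example 8.1

[Figure: two critical trigger paths between $W$ and $T$. The first applies the rules $R_8$ and $R_5$ with the formulae $\varphi_0$, $\varphi_1$, $\varphi_2$. The second applies the rules $R_8$, $R_5$, $R_5$ with the formulae $\varphi_0$, $\varphi_1$, $\varphi'_2$, $\varphi_2$.]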

8.4 Well-behaving Rule Systems

Let us now ask for constraint sets that allow us to define complete RTSs which exclude non-admissible critical trigger paths in their associated hypergraphs. Let us start with a simple example.

Example 8.7. Take again two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv q(x) \Rightarrow p(x)$, which imply that $p$ is always equal to $q$. Then we obtain the following repairing rules:

$R_1$ : ON $\mathit{insert}_p$ IF $\exists x.\ p(x) \wedge \neg q(x)$ DO $\mathit{insert}_q(x)$ (8.86)

$R_2$ : ON $\mathit{delete}_q$ IF $\exists x.\ p(x) \wedge \neg q(x)$ DO $\mathit{delete}_p(x)$ (8.87)

$R_3$ : ON $\mathit{insert}_q$ IF $\exists x.\ \neg p(x) \wedge q(x)$ DO $\mathit{insert}_p(x)$ (8.88)

$R_4$ : ON $\mathit{delete}_p$ IF $\exists x.\ \neg p(x) \wedge q(x)$ DO $\mathit{delete}_q(x)$ (8.89)

First observe that all edges in critical trigger paths are equally labelled with either $+$ or $-$. For the case of $+$ and $v_0 = p$ consider all constants $a$ such that $\models \varphi_0 \Rightarrow p(a) \wedge \neg q(a)$ holds, but from the definition of effects such a pair can only result from a non-repairable transaction $T$ or an inconsistent starting state $\sigma$. The same argument applies to the other cases. Hence there are no non-admissible critical paths in the associated rule hypergraph.

□
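The behaviour of the rules (8.86) to (8.89) can be tried out with a small simulation. The sketch below makes two assumptions that are not fixed by the text: relations are Python sets of constants, and each rule repairs all violating tuples at once (a WHILE-style reading of the action-part). The function name `run_rules` and the event encoding are hypothetical.

```python
def run_rules(p, q, event):
    """Run R1-R4 of Example 8.7 after `event` = (operation, relation).

    Returns the resulting pair of relations (p, q)."""
    p, q = set(p), set(q)
    pending = [event]
    while pending:
        op, rel = pending.pop(0)
        if (op, rel) == ("insert", "p"):      # R1: p(x) and not q(x) -> insert q(x)
            violating = p - q
            if violating:
                q |= violating
                pending.append(("insert", "q"))
        elif (op, rel) == ("delete", "q"):    # R2: p(x) and not q(x) -> delete p(x)
            violating = p - q
            if violating:
                p -= violating
                pending.append(("delete", "p"))
        elif (op, rel) == ("insert", "q"):    # R3: not p(x) and q(x) -> insert p(x)
            violating = q - p
            if violating:
                p |= violating
                pending.append(("insert", "p"))
        elif (op, rel) == ("delete", "p"):    # R4: not p(x) and q(x) -> delete q(x)
            violating = q - p
            if violating:
                q -= violating
                pending.append(("delete", "q"))
    return p, q


# A transaction inserts 2 into p, starting from the consistent state p = q = {1}
after_insert = run_rules({1, 2}, {1}, ("insert", "p"))
# A transaction deletes 1 from q, starting from the same consistent state
after_delete = run_rules({1}, set(), ("delete", "q"))
```

In both runs the rules only propagate the change in the same direction (insertions trigger insertions, deletions trigger deletions), so the effect of the transaction survives.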

8.4.1 Stratified Rule Systems

Let us now investigate the reason for the absence of non-admissible critical trigger paths in Example 8.7. This leads us to the notion of a stratified set of constraints.

The motivation behind this is as follows: In Example 8.7 insertions (deletions) on a relation $p$ only trigger insertions (deletions) on $q$ and vice versa. This should be sufficient for not invalidating an effect once it has been established. The corresponding constraints can therefore be grouped together.

A set $\Sigma$ of constraints in implicative normal form (8.76) on a schema $S$ is called stratified iff we have a partition $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ with pairwise disjoint constraint sets $\Sigma_i$, called strata, such that the following two conditions hold:

(i) If $L_1, \dots, L_k$ is a sequence of literals on the left hand side (right hand side) of $I \in \Sigma_i$ and $J \in \Sigma$ contains a sequence $L'_1, \dots, L'_\ell$ of literals on the right hand side (left hand side) such that $\{L_1, \dots, L_k, L'_1, \dots, L'_\ell\}$ is unifiable, then $J$ must also lie in stratum $\Sigma_i$.

(ii) If $I \neq J$ contain sequences of literals $L_1, \dots, L_k$ and $L'_1, \dots, L'_\ell$ both on the left (right) hand side such that $\{L_1, \dots, L_k, L'_1, \dots, L'_\ell\}$ is unifiable with a most general unifier and $\neg I$, $\neg J$ contain unifiable literals on the right (left) hand side, then $I$ and $J$ must lie in different strata $\Sigma_i$ and $\Sigma_j$, unless one of the instances $\neg I$ or $\neg J$ is always satisfied.

Example 8.8. Consider the constraints in Example 8.1 except the exclusion constraint $ED$. Then the first condition above requires $ID_1$ and $ID_2$ to lie in the same stratum. The same applies to $FD_3$ and $ID_2$ (or $FD_2$ and $ID_1$, respectively).

We may also unify the left hand sides of $FD_2$ and $ID_2$ (or $FD_1$ and $ID_1$, respectively), but then the resulting instance of the functional constraint degenerates to true.

□

Our next result states that stratified constraint sets always give rise to RTSs without non-admissible critical trigger paths in the associated rule hypergraph.

If $\Sigma$ is a stratified constraint set on a schema $S$, then there exists a complete RTS such that for any repairable transaction $T$ on $S$ the RTS does not invalidate the effect of $T$.

To sketch a proof consider $I \in \Sigma$ in implicative normal form (8.76). For each relation symbol $p_i$ on the left hand side define rules

ON $\mathit{insert}_{p_i}$ IF $\neg I$ DO $\mathit{insert}_{q_j}(y_j)$ and

ON $\mathit{insert}_{p_i}$ IF $\neg I$ DO $\mathit{delete}_{p_j}(y_j)$

with relation symbols $q_j$ occurring on the right hand side and $p_j$ ($j \neq i$) on the left hand side of $I$. Similarly, each predicate symbol $q_j$ on the right hand side gives rise to rules

ON $\mathit{delete}_{q_j}$ IF $\neg I$ DO $\mathit{insert}_{q_i}(y_i)$ and

ON $\mathit{delete}_{q_j}$ IF $\neg I$ DO $\mathit{delete}_{p_i}(y_i)$.

This defines a complete set $RTS$ of rules. Due to this rule construction the constraints corresponding to the rules in a critical trigger path all belong to the same stratum. However, the condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ implies that $\varphi_0$ contains a literal $L$ and $\varphi_\ell$ its negation, hence the construction of rules implies $I_1$ and $I_\ell$ to lie in different strata. This shows that there are no non-admissible critical trigger paths. The full proof can be found in [12, 14].

8.4.2 Constraints Arising from Entity-Relationship Schemata 

Finally, we may ask for cases where stratified constraint sets occur. Recall from [9] that a relational database schema $S$ with constraint set $\Sigma$ is in Entity-Relationship normal form (ERNF), and hence is equivalent to an ER-schema, iff

- all inclusion constraints in $\Sigma$ are key-based and non-redundant,

- there is no cycle of inclusion constraints in $\Sigma$,

- each relation schema $R \in S$ is in BCNF with respect to the functional dependencies in $\Sigma$ and

- there are only inclusion and functional dependencies in $\Sigma$.
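One of the four conditions, the absence of cycles among inclusion constraints, is easy to test mechanically. The following Python sketch assumes that each inclusion constraint $R[X] \subseteq S[Y]$ is reduced to the pair of relation names (R, S); the function name `has_ind_cycle` is hypothetical, and the other three ERNF conditions are not checked here.

```python
def has_ind_cycle(inds):
    """Detect a cycle of inclusion constraints by depth-first search.

    `inds` is an iterable of (R, S) pairs, one per constraint R[X] included in S[Y]."""
    graph = {}
    for r, s in inds:
        graph.setdefault(r, set()).add(s)
        graph.setdefault(s, set())

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for w in graph[v]:
            # a grey successor closes a cycle; white successors are explored recursively
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)


acyclic = [("Driver", "Person"), ("Accident", "Driver")]
cyclic = acyclic + [("Person", "Accident")]
print(has_ind_cycle(acyclic), has_ind_cycle(cyclic))
```

A constraint from a relation to itself counts as a cycle under this encoding.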

If a relational database schema $S$ with constraint set $\Sigma$ is in ERNF, then a slight generalization of the argument given in Example 8.8 shows that $\Sigma$ is stratified. Indeed, property (i) forces inclusion dependencies $R[X_1] \subseteq S[Y_1]$ and $S[X_2] \subseteq T[Y_2]$ to belong to the same stratum. By the same property the key constraints defined by the $Y_i$ also belong to this stratum. Finally, property (ii) does not constrain pairs of constraints in $\Sigma$.

Furthermore, we only obtain an acyclic set of functional and inclusion constraints, for which the implication problem is decidable [2]. Hence we are also able to detect non-repairable transactions. Following the design approach of Mannila and Räihä in [9] leads to schemata without any problems concerning consistency enforcement by RTSs.

Fig. 8.6. Entity-Relationship constraints

Example 8.9. Let us look at the higher order Entity-Relationship diagram [15] in Figure 8.6, which leads to the constraints

$I_1 : p(x, y) \Rightarrow q(x, z)$ and

$I_2 : q(x, z) \wedge q(y, z) \Rightarrow x = y$.

Stratification property (i) applied to $q$ on the right hand side of $I_1$ and on the left hand side of $I_2$ forces $I_1$, $I_2$ to lie in the same stratum. Property (ii) is not applicable. Hence the constraint set is stratified. However, if we add a third constraint

$I_3 : p(x, z) \wedge q(y, z) \Rightarrow \mathit{false}$

which in terms of the Entity-Relationship diagram in Figure 8.6 corresponds to an exclusion constraint $B \parallel D$, the new set $\{I_1, I_2, I_3\}$ of constraints is no longer stratified. This is due to the fact that stratification property (i) forces $I_1$ and $I_3$ to lie in the same stratum, whereas now stratification property (ii) forces $I_1$ (or analogously $I_2$) to lie in a stratum different from the one of $I_3$. Thus, there is no stratification satisfying both properties.

□

Fig. 8.7. Entity-Relationship constraints corresponding to Example 8.1

[Figure: Entity-Relationship diagram with the types CONNECTION, TUBE and WIRE and the attributes connection, to, from, tube id, tube type, wire id, wire type, voltage and power.]

Example 8.10. Let us take another look at Example 8.1. We have already seen in Example 8.8 that the set of functional and inclusion constraints in this example is stratified. Again, adding the exclusion constraint $ED$ destroys this property, since the stratification property (i) forces $ED$ to belong to the same stratum as $ID_1$, whereas property (ii) implies that it lies in a different stratum. This is practically the same argument as in the previous Example 8.9.

Again the schema corresponds to the Entity-Relationship diagram in Figure 8.7 with $ED$ corresponding to the exclusion constraint $W[\mathit{wire\_id}] \parallel T[\mathit{tube\_id}]$ and $ID_1$ to the path inclusion constraint $W.C[\mathit{connection}] \subseteq T.C[\mathit{connection}]$.

□

8.4.3 Constraints Arising from Simple Object-Oriented Schemata 

A similar situation arises for simple schemata in object-oriented data models. The OODM investigated in [11] distinguishes between objects and values. Types are used to describe immutable sets of values with (type-)operations predefined on them. Type systems are prescriptions for the syntax and semantics of permitted type definitions. We may always consider type systems that consist of some base types, type constructors and a subtyping relation. E.g., base types could be BOOL, NAT, INT, STRING, ID or OK, where ID is an abstract identifier type without any non-trivial supertype and OK is the trivial type (which has exactly one value ok). Type constructors could be record types $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ and finite set types $\{\alpha\}$.

We may use base types and constructors to define new types by nesting. In addition, we may build parameterized types by leaving type variables in constructors uninstantiated. Then a type $T$ is called proper iff the number of its parameters is 0. $T$ is called a value type iff there is no occurrence of ID in $T$. If $T'$ is a proper type occurring in a type $T$, then there exists a corresponding occurrence relation $o : T \times T' \to BOOL$ with $o(v_1, v_2) = \mathit{true}$ iff $v_2$ occurs in $v_1$ at the position indicated by the position of $T'$ in $T$. Each subtype relation $T_1 \leq T_2$ as above defines a subtype function $T_1 \to T_2$ on the corresponding sets of values.

The class concept provides the grouping of objects having the same structure and behaviour. Structurally this uniformly combines aspects of object values and references. Behaviourally, this abstracts from operations on single objects including their creation and deletion.

Since identifiers can be represented using ID, values and references can be combined into a representation type, where each occurrence of ID denotes references to some other class. Therefore, we may define the structure of a class using parameterized types.

If $T$ is a value type with parameters $\alpha_1, \dots, \alpha_n$ and if the parameters are replaced by pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression. A class consists of a class name $C$, a structure expression $S$, a set of class names $D_1, \dots, D_m$ (called superclasses) and a set of operations. We call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$.

A database schema $S$ is given by a finite collection of type and class definitions such that all types, classes and operations occurring within type definitions, structure definitions and operations are defined in $S$.

Then an instance $\mathcal{D}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{(\mathit{ident} : ID, \mathit{value} : T_C)\}$ such that the following conditions are satisfied:

- For each class $C$ identifiers must be unique.

- The set of identifiers in a subclass $C$ is a subset of the one in the superclass $C'$. Moreover, if $T_C \leq T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then $(i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C')$ holds.

- For each reference $r$ from $C$ to $D$ identifiers $j$ occurring in a value $v$ of an object in $C$ with respect to the occurrence relation $o_r$, i.e. $(i, v) \in \mathcal{D}(C)$ and $o_r(v, j)$ hold, must occur in $\mathcal{D}(D)$.

Let us consider only simple schemata as they occur in most practical object-oriented systems. In such a schema structure expressions always have the form $(a_1 : T_1, \dots, a_n : T_n)$, where $T_i$ is either a value type or a class name. In the latter case $a_i$ is a reference. In accordance with many practical systems we may then call $a_i$ an attribute.

Example 8.11. Let us consider a simple university schema adapted from [11]:

Class PersonC
  Structure ( PersonIdentityNo : NAT, Address : STRING )

Class MarriedPersonC
  IsA PersonC
  Structure ( Spouse : MarriedPersonC )

Class StudentC
  IsA PersonC
  Structure ( StudNo : NAT, Name : STRING, Supervisor : ProfessorC,
              Major : DepartmentC, Minor : DepartmentC )

Class ProfessorC
  IsA PersonC
  Structure ( Age : NAT, Salary : NAT, Faculty : DepartmentC )

Class DepartmentC
  Structure ( DeptName : STRING, Head : ProfessorC )

This schema can be translated, using some self-explanatory abbreviations for the attribute names, into a relational one with the following relation schemata:

Person = (id, pino, address)
MPerson = (id, spouse)
Student = (id, sno, name, sup, major, minor)
Prof = (id, age, salary, faculty)
Dept = (id, dname, head)

In addition, we get the following functional and inclusion dependencies:

Person : id $\to$ pino, address
MPerson : id $\to$ spouse
Student : id $\to$ sno, name, sup, major, minor
Prof : id $\to$ age, salary, faculty
Dept : id $\to$ dname, head

MPerson[id] $\subseteq$ Person[id]
Student[id] $\subseteq$ Person[id]
Prof[id] $\subseteq$ Person[id]
MPerson[spouse] $\subseteq$ MPerson[id]
Student[sup] $\subseteq$ Prof[id]
Student[major] $\subseteq$ Dept[id]
Student[minor] $\subseteq$ Dept[id]
Prof[faculty] $\subseteq$ Dept[id]
Dept[head] $\subseteq$ Prof[id]

Note that all these relations have an attribute id with the type ID as its domain. Furthermore, all inclusion constraints defined by the schema are key-based with just this attribute occurring on the right hand side. The inclusion constraints that stem from subclassing also have id on their left hand sides. In particular all inclusion constraints are unary.

□
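The translation used in this example follows a fixed pattern, which the following Python sketch tries to capture. It is an illustration under assumptions: the input encoding (class name mapped to an optional superclass and a list of attribute/type pairs), the set `VALUE_TYPES` and the function name `translate` are hypothetical, attribute names are not abbreviated, and only single inheritance is handled.

```python
VALUE_TYPES = {"BOOL", "NAT", "INT", "STRING"}


def translate(classes):
    """Translate a simple object-oriented schema into relations, FDs and INDs.

    `classes` maps a class name to (superclass or None, [(attribute, type), ...])."""
    relations, fds, inds = {}, [], []
    for name, (parent, attrs) in classes.items():
        columns = ["id"] + [a for a, _ in attrs]
        relations[name] = columns
        fds.append((name, ["id"], columns[1:]))            # id is a key of the relation
        if parent is not None:
            inds.append(((name, "id"), (parent, "id")))    # inclusion from subclassing
        for attribute, type_name in attrs:
            if type_name not in VALUE_TYPES:               # attribute is a reference
                inds.append(((name, attribute), (type_name, "id")))
    return relations, fds, inds


classes = {
    "PersonC": (None, [("PersonIdentityNo", "NAT"), ("Address", "STRING")]),
    "MarriedPersonC": ("PersonC", [("Spouse", "MarriedPersonC")]),
}
relations, fds, inds = translate(classes)
```

Every inclusion dependency produced this way is unary and has `id` on its right hand side, which is the observation made at the end of the example.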

The observations made in Example 8.11 can be generalized. From the definition of conditions to be satisfied by instances, each inclusion constraint defined by a simple object-oriented schema is key-based with the identifier attribute id occurring on its right hand side. Furthermore, id defines a key for each relation. However, due to the use of the set-type-constructor, relations may not be in first normal form.

With these observations concerning the nature of the constraint sets $\Sigma$ arising from transforming simple object-oriented schemata we may repeat our arguments used for Entity-Relationship constraints to see that $\Sigma$ is stratified. Indeed, property (i) forces inclusion dependencies $R[a] \subseteq S[\mathit{id}]$ and $S[b] \subseteq T[\mathit{id}]$ to belong to the same stratum. By the same property the key constraints defined by the attributes id on $S$ or $T$ also belong to this stratum. Finally, property (ii) does not constrain pairs of constraints in $\Sigma$.

Since all inclusion constraints in $\Sigma$ are unary, the implication problem is decidable [7]. Therefore, we are also able to detect non-repairable transactions.

8.5 Conflict Resolution

Referential actions are special rules to cope with violations of a foreign key constraint $R_1[X] \subseteq R_2[Y]$. Note that all inclusion constraints in Entity-Relationship and object-oriented models considered so far have this form. As in SQL we only consider the case of delete- and update-operations on $R_2$, i.e. we consider the deletion (or update) of a tuple $t_2 \in \mathcal{I}(R_2)$. If this leads to a constraint violation, there must be at least one tuple $t_1 \in \mathcal{I}(R_1)$ with $t_1[X] = t_2[Y]$. The following actions have been suggested:

cascade: Also delete $t_1$ (or update the values for the attributes in $X$ such that the constraint violation disappears). If there is more than one such tuple, the action is applied to all of them.

set null: Set the values for the attributes in $X$ to a null value.

restrict: Reject the deletion or update on $R_2$ and roll back.

In the first two cases we have a reaction by propagation, since referencing tuples also disappear from the instance.
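For a single foreign key the three referential actions can be illustrated on in-memory tables. The Python sketch below is a toy model under assumptions: relations are lists of dictionaries, the foreign key consists of one attribute, only deletion is handled, and the function name `delete_referenced` and its parameters are hypothetical.

```python
def delete_referenced(parent, child, key, fk, value, action):
    """Delete the parent rows with parent[key] == value and react on the child rows.

    Returns (new_parent, new_child, committed)."""
    referencing = [row for row in child if row[fk] == value]
    if referencing and action == "restrict":
        # reject the deletion and leave both relations unchanged (rollback)
        return list(parent), list(child), False
    new_parent = [row for row in parent if row[key] != value]
    if action == "cascade":
        # referencing tuples are deleted as well
        new_child = [row for row in child if row[fk] != value]
    elif action == "set null":
        # referencing tuples keep their other values but lose the reference
        new_child = [{**row, fk: None} if row[fk] == value else row for row in child]
    else:
        # restrict without referencing tuples: nothing to repair
        new_child = list(child)
    return new_parent, new_child, True


persons = [{"pino": 1}, {"pino": 2}]
drivers = [{"pino": 1, "licence_no": "A"}]

cascaded = delete_referenced(persons, drivers, "pino", "pino", 1, "cascade")
nulled = delete_referenced(persons, drivers, "pino", "pino", 1, "set null")
rejected = delete_referenced(persons, drivers, "pino", "pino", 1, "restrict")
```

With several foreign keys the result depends on which action fires first, which is the ambiguity discussed next.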

8.5.1 Problem of Ambiguity

Assume that we have associated a referential action with all constraints in $I$. Then the problem occurs that the final result of an operation depends on the order of applying referential actions.

A propagation path (for short: p-path) is a sequence $R_n[X_n], \dots, R_1[X_1]$ such that there are constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$ in $I$ with $X_i \subseteq Y'_i \subseteq Y_i$ for $i = 2, \dots, n$, all these constraints are associated with a referential action of kind cascade or set null and $R_{i-1}[X_{i-1}] \subseteq R_i[X_i]$ is in $I^*$.

A restriction path (for short: r-path) is a sequence $R_n[X_n], \dots, R_1[X_1]$ such that there are constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$ in $I$ with $X_i \subseteq Y'_i \subseteq Y_i$ for $i = 2, \dots, n$, where $R_1[Y'_1] \subseteq R_2[Y_2]$ is associated with a referential action of kind restrict and all other constraints are associated with a referential action of kind cascade or set null, and $R_{i-1}[X_{i-1}] \subseteq R_i[X_i]$ is in $I^*$.

A p-path $R_n[X_n], \dots, R_1[X_1]$ is called a phantom iff there is an r-path $R'_m[X'_m], \dots, R'_1[X'_1]$ with $R'_m = R_n$, $X'_m = X_n$ and an inclusion constraint $R_1[X_1] \subseteq R'_1[X'_1]$ in $Dep^*$.

A schema $S$ has a conflict iff there is a p-path $R_n[X_n], \dots, R_1[X_1]$ corresponding to constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$, an r-path $R'_m[X'_m], \dots, R'_1[X'_1]$ corresponding to constraints $R'_{i-1}[Z'_{i-1}] \subseteq R'_i[Z_i]$ with $R_n[X_n] = R'_m[X'_m]$ and $R_1[X_1] = R'_1[X'_1]$, an instance $\mathcal{I}$ and tuples $t_n, \dots, t_1$ in $\mathcal{I}(R_n), \dots, \mathcal{I}(R_1)$ with $t_i[Y_i] = t_{i-1}[Y'_{i-1}]$ and tuples $t'_m, \dots, t'_1$ in $\mathcal{I}(R'_m), \dots, \mathcal{I}(R'_1)$ with $t'_i[Z_i] = t'_{i-1}[Z'_{i-1}]$ such that $t_n = t'_m$ and $t_1 = t'_1$ hold. A conflict is called a phantom iff the involved p-path is a phantom.

The condition $t_1 = t'_1$ could be omitted, since the existence of tuple sequences satisfying all other conditions implies the existence of tuples as claimed in the definition.

Fig. 8.8. Entity-Relationship schema leading to a conflict

[Figure: Entity-Relationship diagram with the types Person, Car, Clinic, Driver, Patient and Accident and the attributes pino, name, address, registration, licence no, date, costs and cl name.]

Example 8.12. Consider the Entity-Relationship diagram in Figure 8.8. Transforming it to a relational schema gives rise to the relation schemata

Person = (pino, name, address)
Driver = (pino, registration, licence_no)
Patient = (pino, cl_name, date)
Car = (registration)
Clinic = (cl_name)
Accident = (pino, registration, cl_name, costs)

together with the inclusion dependencies

Driver[pino] $\subseteq$ Person[pino]
Accident[pino, cl_name] $\subseteq$ Patient[pino, cl_name]
Accident[pino, registration] $\subseteq$ Driver[pino, registration] and
Patient[pino] $\subseteq$ Person[pino].

Now suppose that the last of these dependencies has been equipped with the referential action restrict, whilst all others are equipped with cascade. Then Person[pino], Driver[pino], Accident[pino, registration] is a p-path and Person[pino], Patient[pino], Accident[pino, cl_name] is an r-path. These two paths show that the schema has a conflict.

Now extend the schema as shown in Figure 8.9. We obtain the additional relation schema Bad_Driver = (pino) with the inclusion dependency Bad_Driver[pino] $\subseteq$ Person[pino]. Assume this dependency to be equipped with the referential action restrict. Then Person[pino], Bad_Driver[pino] constitutes another r-path.

If (for reasons beyond the small section of constraints considered) we derive the inclusion dependency Accident[pino] $\subseteq$ Bad_Driver[pino] $\in Dep^*$, then the above conflict will be a phantom.

□
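The "diamond" in this example can be found by a simple search over the foreign keys. The Python sketch below is a deliberate simplification and not the full definition: it works on relation names only and ignores attribute sets, instances and phantoms; it assumes an acyclic set of foreign keys; and it treats any chain with exactly one restrict constraint (all others propagating) as an r-path without checking where the restrict constraint sits. The names `chains` and `conflict_candidates` are hypothetical.

```python
def chains(fks, start):
    """Yield (relation, actions) for every chain of foreign keys leading to `start`.

    `fks` is a list of (child, parent, action) triples; the set is assumed acyclic."""
    stack = [(start, ())]
    while stack:
        relation, actions = stack.pop()
        for child, parent, action in fks:
            if parent == relation:
                chain = actions + (action,)
                yield child, chain
                stack.append((child, chain))


def conflict_candidates(fks, start):
    """Relations reachable from `start` both along a p-path and along an r-path."""
    by_propagation, by_restriction = set(), set()
    for relation, actions in chains(fks, start):
        restricts = actions.count("restrict")
        if restricts == 0:
            by_propagation.add(relation)      # only cascade / set null on the chain
        elif restricts == 1:
            by_restriction.add(relation)      # exactly one restrict on the chain
    return by_propagation & by_restriction


fks = [
    ("Driver", "Person", "cascade"),
    ("Accident", "Patient", "cascade"),
    ("Accident", "Driver", "cascade"),
    ("Patient", "Person", "restrict"),
]
print(conflict_candidates(fks, "Person"))
```

For the dependencies of the example the search reports Accident, the relation at which the two paths starting from Person meet.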

Fig. 8.9. Entity-Relationship schema leading to a phantom conflict

[Figure: the Entity-Relationship diagram of Figure 8.8 extended by the type Bad Driver with cardinality constraint (0,1).]

If there is a conflict, then a deletion or update for the tuple $t_n = t'_m$ violates the constraints $R_{n-1}[Y'_{n-1}] \subseteq R_n[Y_n]$ and $R'_{m-1}[Z'_{m-1}] \subseteq R'_m[Z_m]$. Executing the corresponding referential actions violates the "next" foreign key constraints along the p-path or r-path respectively. Depending on the order of the referential actions, the tuple $t_1 = t'_1$ is either deleted according to the actions along the p-path, so that no constraint violation for $R'_1[Z'_1] \subseteq R'_2[Z_2]$ may occur, or it leads to a rollback according to the actions along the r-path. This is the core of the ambiguity problem.

However, if it is a phantom conflict, we also have an r-path $R''_k[X''_k], \dots, R''_1[X''_1]$ with $R''_k = R_n$ and $X''_k = X_n$, with foreign key constraints $R''_{i-1}[U'_i] \subseteq R''_i[U_i]$, and an inclusion constraint $R_1[X_1] \subseteq R''_1[X''_1]$. Hence there are also tuples $t''_k, \dots, t''_1$ with $t''_i[U_i] = t''_{i-1}[U'_i]$. Hence the tuple $t''_1$ enforces a rollback and there is no ambiguity.

Thus, the ambiguity problem is to decide whether a given schema $S$ together with a set $Dep = K \cup I$ of minimal key and referential key constraints has a non-phantom conflict or not.

8.5.2 Decidability 

In order to show that ambiguity as defined above is decidable, we first recall that implication for inclusion dependencies alone is decidable [2]. Thus, we can compute all p-paths and r-paths. Since a conflict corresponds to a "diamond" with a p-path and an r-path, the existence of conflicts is obviously decidable and we only have to discard phantom p-paths. For this we have to decide whether an arbitrary inclusion constraint ($R_1[X_1] \subseteq R'_1[X'_1]$ in the definition of phantom p-paths) is in $Dep^*$. Thus, the existence of non-phantom conflicts is decidable iff for any inclusion constraint it is decidable whether it is implied by $Dep$.

We have seen in the previous section that constraint sets defined by Entity-Relationship or simple object-oriented schemata only contain functional and inclusion dependencies. We know that for arbitrary sets of functional and inclusion constraints the implication problem is undecidable [6], but for the Entity-Relationship case the inclusion dependencies are acyclic. Then it is well known [1] that the implication problem is decidable.

For the object-oriented case all inclusion dependencies are unary. For this case it is also well known [7] that the implication problem is decidable.
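To give a feeling for the unary case, the following Python sketch decides implication for unary inclusion dependencies taken on their own, where it amounts to reachability in a directed graph (reflexivity and transitivity are the only applicable inference rules). This is a restricted illustration and not the procedure of [7]: the interaction with functional dependencies is ignored, and the function name `implied` and the pair encoding are hypothetical.

```python
def implied(inds, candidate):
    """Decide whether a unary inclusion dependency follows from `inds` alone.

    Each dependency R[a] included in S[b] is encoded as ((R, a), (S, b))."""
    source, target = candidate
    seen, todo = {source}, [source]
    while todo:
        node = todo.pop()
        if node == target:                    # reached by reflexivity or transitivity
            return True
        for lhs, rhs in inds:
            if lhs == node and rhs not in seen:
                seen.add(rhs)
                todo.append(rhs)
    return False


# Some of the inclusion dependencies of Example 8.11
inds = [
    (("Student", "sup"), ("Prof", "id")),
    (("Prof", "id"), ("Person", "id")),
    (("Student", "id"), ("Person", "id")),
]
print(implied(inds, (("Student", "sup"), ("Person", "id"))))
print(implied(inds, (("Person", "id"), ("Prof", "id"))))
```

The first query succeeds because every supervisor is a professor and every professor is a person; the second fails because no chain of dependencies leads from Person back to Prof.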

Therefore, for both cases of constraint sets considered in this paper, those resulting from 

Entity-Relationship schemata and those arising from simple object-oriented schemata, the 

ambiguity problem for referential actions is decidable. 


In this paper we investigated rule triggering systems (RTSs) for maintaining consistency arising from implicit constraints in Entity-Relationship and object-oriented models. Unfortunately, there always exist non-repairable transactions. In order to disallow such transactions the constraint implication problem must be decidable, which is the case for both models. In the first case we are in the situation of acyclic inclusion constraints, whereas in the second case we only obtain unary inclusion constraints.

Secondly, we analyzed critical trigger paths in rule hypergraphs associated with RTSs. We could show that the existence of critical trigger paths leads to RTSs which may invalidate the effect of some transactions, even if these are repairable. Such a behaviour can be excluded for stratified constraint sets, which holds for the constraint sets arising from Entity-Relationship and object-oriented models.

Thirdly, we investigated the ambiguity problem for rules for the case that rollback is allowed in the action part. This again can be reduced to the decidability problem for constraint implication, hence it is decidable for the chosen models.

To summarize, the general applicability of RTSs for integrity maintenance is limited, if we assume that the intended effects of user-defined transactions should be preserved. Fortunately, conflicts do not occur or can be detected efficiently if we only consider constraints arising from conceptual design with Entity-Relationship and certain object-oriented models.


1. S. Abiteboul, R. Hull, V. Vianu. Foundations of databases. Addison-Wesley 1995.

2. M. A. Casanova, R. Fagin, C. H. Papadimitriou. Inclusion dependencies and their interaction with functional dependencies. Journal of Computer and System Sciences 28 (1), 29-59, 1984.


VLDB, Brisbane (Australia), August 1990, 566-577. 

4. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for 

Integrity Maintenance. ACM ToDS, vol. 19(3), 1994, 367-422. 

5. S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering | Active Databases, Proc., 

Houston, Februar 1994. 

6. A. K. Chandra, M. Y. Vardi. The implication problem for functional and inclusion dependencies is 

undecidable. SIAM Journal of Computing 14, 671-677, 1985. 

177

7. S. S. Cosmadakis, P. Kanellakis, M. Y. Vardi. Polynomial-time implication problems for unary inclusion 

dependencies. Journal of the ACM 37, 15-46, 1990. 

8. M. Gertz, U. W. Lipeck: Deriving Integrity Maintaining Triggers from transaction Graphs, in 

Proc. 9th ICDE, IEEE Computer Society Press, 1993, 22-29. 

9. H. Mannila, K.-J. Raiha: The Design of Relational Databases, Addison-Wesley 1992. 

10. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases, in S. Chakravarty, J. 

Widom (Eds.): Research Issues in Data Engineering | Active Databases, Proc., Houston, Februar 

1994. 

11. K.-D. Schewe and B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica, 

vol. 11(1/2), Szeged 1993, 49 - 84. 

12. K.-D. Schewe, B. Thalheim: Active Consistency Enforcement for Repairable Database Transitions,
in S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in Databases, Proc. 6th Int. Workshop
on Foundations of Models and Languages for Data and Objects, Schloss Dagstuhl, 1996, 87-102,
available via http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html.

13. K.-D. Schewe: Well-Behaving Rule Systems for Entity-Relationship and Object Oriented Models, 

in D. W. Embley, R. C. Goldstein (Eds.): Conceptual Modeling - ER '97, Springer LNCS 1331,

1997, 141-154. 

14. K.-D. Schewe, B. Thalheim: On the Strength of Rule Triggering Systems for Integrity Maintenance, 

in C. McDonald (Ed.): Database Systems, Proc. 9th Australasian Database Conference, Perth 

1998, published as Australian Computer Science Communications, vol. 20 (2), 77-88. 

15. B. Thalheim: Foundations of entity-relationship modeling, Annals of Mathematics and Artificial

Intelligence, vol. 7, 1993, 197-256. 


Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990.


Proc. SIGMOD 1990, 259-270. 


Chapter 9 

Principles of Object Oriented 

Database Design 

Contents 

9.1 Philosophy of OODB Design
9.2 The Object Oriented Datamodel: Basic Features
    9.2.1 Type Definitions
    9.2.2 Class Definitions
    9.2.3 Method Definitions
    9.2.4 Schema Definitions
9.3 Class Abstraction
9.4 Stepwise Refinement
    9.4.1 Instantiation
    9.4.2 Splitting
    9.4.3 Specialization
    9.4.4 Extension
9.5 Declarativity by Constraint Centered Design
9.6 Variation Based Reuse: A Research Issue
9.7 Inferences in OODB Design


Klaus-Dieter Schewe, Bernhard Thalheim. Principles of Object Oriented Database 

Design. In H. Jaakkola, H. Kangassalo, T. Kitahashi, A. Markus (Eds.). Information
Modelling and Knowledge Bases V, 227-242. IOS Press, Amsterdam, 1994.


Abstract. The design of complex information systems requires a transparent model-based
methodology. It has been claimed that object orientation will have a significant impact on
the development of such a methodology, especially as far as reusability and naturality of
conceptual modelling are concerned.

The methodology presented in this paper concentrates on four significant principles of
object oriented database (OODB) design. The basic constituent is stepwise refinement, i.e.
beginning the design process with a partial model that is completed and concretized later on,
depending on the growth of application knowledge. Class abstraction, i.e. the support of libraries
of incomplete parameterized designs that are instantiated and specialized later, is a natural
consequence hereof. Declarativity is achieved by constraint centered design with (up to some
degree) automatic transformation into consistent transactions. Variations enable the design
of information systems with heavy reuse of existing design components.

The methodology is based on a theoretically founded object oriented datamodel (OODM).
Hence it supports inferences such as deciding the identifiability of objects, detecting the
relation of an intended design to components in existing design libraries, and checking operations
for reducedness as a prerequisite for the automatic transformation of constraints into
consistent transactions.

9.1 Philosophy of OODB Design 

The design of data and knowledge intensive information systems requires a transparent
model-based methodology. Classically, there exist separate methods for database design and
transaction design without a satisfactory integration [7, 9]. It is therefore natural to hope
that the use of object oriented design methods will improve the situation.

Object orientation involves the isolation of data in semi-independent modules in order
to promote high software development productivity. This idea stems from programming languages,
and most methods proposed so far [3, 6, 11, 20] are intended to support object oriented
program development. The main difference in object oriented database (OODB) design is due
to the notion of object, which is now intended to serve as a basic unit of persistent data, a view
that is influenced by semantic datamodels [9]. Since classes then serve not only as behaviour
abstractions but also as (persistent) data collections, we have to cope with object identification,
whereas in object oriented programming a simple identification mechanism via object
names is sufficient. This makes OODB design a significantly different task from object oriented
program development, although some ideas of the approaches to the latter field can be taken
over.

Still most object oriented datamodels are very close to the language level [1, 10] no matter 

whether their development started from a semantic datamodel or an object oriented programming 

language. For object oriented database design, however, it is necessary to shift 

the approach to the conceptual level as also claimed in work of the IS-Core group [13, 21]. 

Therefore, the primary goal of our methodology is to provide a conceptual object oriented 

model with greater naturality in application modelling. At the same time we want to improve 

the design quality and to raise the rate of software reuse. 

The work presented in this paper is centered around the theoretically founded object 

oriented datamodel (OODM) introduced in [16] and partly based on the work in [2]. This 

model supports the uniform representation of designs at each level of concretion. In particular 

there is no need to use different models for the conceptual and logical design respectively.


We regard requirements analysis and conceptual modelling as two activities running in
parallel. We start with an initial design that is a one-to-one representation of first knowledge
about the intended application. The analysis task is to grasp and describe such knowledge
with the formal representation tools. The following design process is monotonic, as the amount
of application knowledge increases. Each knowledge increment then corresponds to some
refinement, i.e. a change (not only an extension) of the design. However, this does not prejudice a
particular, e.g. "top-down", design procedure. On the contrary, the OODM favours incomplete partial
designs with the specification of details left for refinement. Keeping even such intermediate
designs increases the scope for possible reuse. This is close to the Design-by-Units strategy
[23].

Classical design methods are centered around data, processes or constraints respectively.
Within the unified model in our approach we may regard all these aspects at the same
time and only keep track of the dependencies among them, since constraints depend on the
data, and processes depend on both other components. This implies the relative independence of
refinement steps on data, processes or constraints as long as these dependencies are taken
into consideration.

Since processes in data and knowledge intensive application systems change much faster 

than constraints, it is desirable to minimize the process design task and to achieve a maximum 

of declarativity. As shown in [17, 18, 19] it is possible (up to some degree) to compute maximal 

specializations of specified processes in order to enforce consistency.

The use of a uniform OODM during the whole design process makes it possible to build design
libraries. Due to the support of abstract partial designs the components of such libraries can
be more generic than usually assumed, but it is a truism that reusability does not imply
reuse. We have to support mechanisms to retrieve a maximum of existing reusable library
components for a given partial design. This leads to the concept of variation-based reuse
extending results on variant construction in semantic networks [14].

Such a methodology involves a high level of inferences. Some of these inferences are intrinsic
to the datamodel used. Among them are the recognition of object identifiability, specialization
and type correctness, and the verification of refinement correctness. Others are extrinsic, such as
the proof of reducedness as a prerequisite for consistency enforcement or the ascertainment
of the relationship to existing library components.

In the remainder of this paper we shall first describe the fundamental issues of the OODM

in Section 9.2, then in Sections 9.3-9.6 we briefly concretize the basic principles of our design 

methodology. Section 9.7 presents a short outline of the required inferences and a discussion 

of open research problems. 

9.2 The Object Oriented Datamodel: Basic Features 

In the object-oriented approach we distinguish between objects and values. Whereas values
are encoded by themselves, objects have to be encoded by object identifiers. In our approach
each object consists of a unique, immutable identifier, a set of values of possibly different
types, references to other objects and methods associated with the object.

Values can be grouped into types. In general, a type may be regarded as an immutable set
of values of a uniform structure together with operations defined on such values. Subtyping
is used to relate values in different types. The class concept provides the grouping of objects
having the same structure, which uniformly combines aspects of object values and references.
Objects can belong to different classes, which guarantees each object of our abstract object
model to be captured by the collection of possible classes. Just as values are only defined
via types, objects can only be defined via classes. Thus, a design consists of type and class
definitions.

9.2.1 Type Definitions

We follow the classical view of types in [4] using a type system that consists of some basic
types, type constructors and a subtyping relation. Moreover, recursive types, i.e. types defined
by domain equations, and predicative types, i.e. types defined by restrictions, can be defined.

Definition 9.1. - The base types are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$,
where ID is an abstract identifier type without any non-trivial supertype and $\bot$ is the
trivial type that is a supertype of every type.
- The type constructors are $e_1 \mid \dots \mid e_n$ (enumeration), $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record),
$\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a : \alpha) \cup (b : \beta)$ (union).

We may use base types and constructors to define new types by nesting. If there is no confusion,
the field selectors in record or union types may be omitted.


standard operators on base types and on records, sets, bags, ... We omit the details here. A
type t is called proper iff the number of its parameters is 0. t is called a value type iff there
is no occurrence of ID in t. If $t'$ is a proper type occurring in a type t, then there exists a
corresponding occurrence relation $o : t \times t' \to BOOL$.

A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by
the usual subtype relation [4].

Example 9.1. 

Let us define a type VZ and a simple subtype VZ' hereof.

Type VZ =
    ( begin : DATE ,
      end : DATE $\cup$ $\bot$ ,
      kind-of-insurance : "Main" | "Family" | "Interruption" )
End VZ

Type VZ' =

( begin : DATE , 

end : DATE , 


End VZ'

□
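The type system of Definition 9.1 can be mirrored by a small algebraic datatype. The following Python sketch is only an illustration under stated assumptions: the class names `Base`, `Enum` and `Con`, the name `BOT` for the trivial type, the union labels and the treatment of DATE as an opaque, already defined type are all hypothetical and not part of the OODM. It encodes the type VZ of Example 9.1 and checks the value-type condition (no occurrence of ID).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Base:
    name: str            # "BOOL", "NAT", ..., "ID", or "BOT" for the trivial type

@dataclass(frozen=True)
class Enum:
    literals: tuple      # e.g. ("Main", "Family", "Interruption")

@dataclass(frozen=True)
class Con:
    kind: str            # "record", "set", "list", "bag" or "union"
    parts: tuple         # (label, type) pairs for record/union, plain types otherwise

ID, BOT = Base("ID"), Base("BOT")

def components(t):
    """Yield the immediate component types of a constructed type."""
    if isinstance(t, Con):
        for p in t.parts:
            yield p[1] if t.kind in ("record", "union") else p

def is_value_type(t):
    """A value type has no occurrence of ID."""
    return t != ID and all(is_value_type(c) for c in components(t))

DATE = Base("DATE")      # assumption: DATE is treated as an opaque, defined type
VZ = Con("record", (
    ("begin", DATE),
    ("end", Con("union", (("date", DATE), ("none", BOT)))),   # labels are hypothetical
    ("kind-of-insurance", Enum(("Main", "Family", "Interruption"))),
))

print(is_value_type(VZ))                  # True
print(is_value_type(Con("set", (ID,))))   # False
```

Frozen dataclasses give structural equality for free, which is convenient when types are compared or used as dictionary keys.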

Predicative types are used to restrict the set of values given by some type definition to a
subset. For this purpose a formula with exactly one free variable self is used. Clearly, the
inclusion then gives a subtype function. In order to avoid inflationary use of quantifiers, other
variables are also allowed to occur freely in such a formula. They are assumed to be universally
quantified.

Definition 9.2. A predicative type T consists of an underlying type $T'$ and a formula P with
exactly one free variable self of type $T'$.


Example 9.2. Let us define a predicative subtype of [ VZ ].

Type VZ-list = [ VZ ] Where
    ( self = concat($L_1$, [$V_1$, $V_2$ | $L_2$]) $\Rightarrow$
        $V_2$ :: VZ' $\wedge$ $V_2$.end $\le$ $V_1$.begin ) $\wedge$
    ( self = concat($L_1$, [$V$ | $L_2$]) $\Rightarrow$
        ( $V$.end $\neq$ $\bot$ $\Rightarrow$ $V$.begin $\le$ $V$.end ) )
End VZ-list

□
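A predicative type is simply a membership test on the underlying type. The Python sketch below illustrates the Where-clause of VZ-list under several assumptions: VZ values are modelled as dictionaries, the undefined value is `None`, dates are `datetime.date` objects, the comparison between neighbouring periods is read as "less than or equal", and membership in VZ' is reduced to "end is defined". None of these choices is prescribed by the text.

```python
from datetime import date

def in_vz_list(periods):
    """Check both conjuncts of the Where-clause on a list of VZ records (dicts)."""
    # self = concat(L1, [V1, V2 | L2]): every adjacent pair V1, V2
    for v1, v2 in zip(periods, periods[1:]):
        if v2["end"] is None or v2["end"] > v1["begin"]:
            return False
    # self = concat(L1, [V | L2]): every single element V
    for v in periods:
        if v["end"] is not None and v["begin"] > v["end"]:
            return False
    return True

current = {"begin": date(2020, 1, 1), "end": None, "kind": "Main"}
earlier = {"begin": date(2018, 1, 1), "end": date(2019, 12, 31), "kind": "Family"}
print(in_vz_list([current, earlier]))   # True
print(in_vz_list([earlier, current]))   # False
```

Under this reading only the head of the list may be an open period, and the periods are listed from the most recent to the oldest.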

9.2.2 Class Definitions

Each object in a class consists of an identifier, a collection of values, references to other objects
and methods. Let us postpone methods for a while. Identifiers can be represented using the
unique identifier type ID. Values and references can be combined into a representation type,
where each occurrence of ID denotes a reference to some other class. Therefore, we may
define the structure of a class using parameterized types. Moreover, classes are arranged in
IsA-hierarchies.

Definition 9.3. - If t is a value type with parameters $\alpha_1, \dots, \alpha_n$ such that ID does not
occur in t and if some of the parameters are replaced by pairs $r_i : C_i$ with a reference
name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression.
Note that a structure expression may still contain parameters.
- A class consists of a class name C, a structure expression S, a set of class names
$D_1, \dots, D_m$ (called superclasses) and a set of methods. We call $r_i$ the reference named $r_i$
from class C to class $C_i$. The type derived from S by replacing each reference $r_i : C_i$ by
the type ID is called the representation type $T_C$ of the class C.

Example 9.3. 

Let us consider a class Insurant for an insurance application. 

Class Insurant = 

Structure (contract-no : NAT , 

name : NAME , 

address : ADDRESS , 

sex : SEX , 

insurance-times : VZ-list , 

agency : AGENCY ) 

Method ...

End Insurant

□

In this example there are no references, hence the structure expression is simply a type. We
could have defined this type, say INSURANT-DATA, separately from the class definition as
in Section 9.2.1. Then the structure would simply be Structure INSURANT-DATA.
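The step from a structure expression to its representation type in Definition 9.3 is a purely syntactic substitution. The following Python sketch shows it on a hypothetical encoding: records are dictionaries, the list constructor is a Python list, and a reference to a class is the pair `("ref", class_name)`. The variant of Insurant with an `agency` reference is invented for illustration, since Example 9.3 itself contains no references.

```python
def representation_type(structure):
    """Replace every reference ("ref", class_name) by the identifier type "ID"."""
    if isinstance(structure, tuple) and len(structure) == 2 and structure[0] == "ref":
        return "ID"
    if isinstance(structure, dict):      # record: field name -> component
        return {name: representation_type(s) for name, s in structure.items()}
    if isinstance(structure, list):      # list constructor: [component]
        return [representation_type(s) for s in structure]
    return structure                     # a type name or an open parameter

insurant = {"contract-no": "NAT", "name": "NAME", "agency": ("ref", "Agency")}
print(representation_type(insurant))
# {'contract-no': 'NAT', 'name': 'NAME', 'agency': 'ID'}
```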

9.2.3 Method Definitions

Let us now turn to adding dynamics to the OODM. As required in the object oriented
approach, operations are associated with classes. This gives us the notion of a method.
We distinguish between visible and hidden methods to separate the methods that
can be invoked by the user from all others. However, all methods of a class, including the hidden
ones, can be accessed by other methods. There are two reasons for such a weak hiding
concept.


- Visible methods serve as a means to specify (nested) transactions. In order to build



- Hidden methods can be used to handle identifiers. Since these identifiers do not have any
meaning for the user, they must not occur within the input or output of a transaction.

Each method on a class C consists of a signature and a body. The signature consists of a 

method name and sets of parameter/type pairs for input and output. The body is defined by

the usual constructs of a procedural programming language. 

Definition 9.4. - A method signature consists of a method name M, a set of input-parameter/type
pairs $\iota_i :: T_i$ and a set of output-parameter/type pairs $o_j :: T'_j$.
- A method on a class C consists of a method signature and a body that is recursively built
from the following constructs:
    - assignment x := E, where x is either the class variable C of type $\{U_C\}$ or a local
      variable within S (including the output-parameters), and E is an expression of the
      same type as x,
    - local variable declaration Let x :: T,
    - skip and fail,
    - sequencing $S_1 ; S_2$ and branching IF P THEN $S_1$ ELSE $S_2$ ENDIF,
    - method call $C'$ :- $M'$(in : $E'_1, \dots, E'_j$, out : $x'_1, \dots, x'_i$), where $M'$ is a method on class
      $C'$ with compatible signature, and
    - non-deterministic selection of values New.f(x), where f is a selector on the representation
      type of C.

If the class name is omitted in a method call, then we refer to the class C itself or to the
global method New_Id to denote the selection of a new identifier. Clearly, we may regard this
method as belonging to an abstract class Any that is a superclass of all classes with structure
$\bot$.

A method M on a class C is called value-defined iff all types occurring in its signature are
proper value types. As already mentioned we distinguish between methods visible to the user
and hidden methods. We require each visible method to be value-defined. Subclasses inherit
the methods of their superclasses, but overriding is allowed as long as the new method is a
specialization of all its corresponding methods in its superclasses.

Example 9.4. Let us add the method add-insurant to the class Insurant of Example 9.3. 


Method
add-insurant ( in : request-data :: REQUEST-DATA ,
               out : contract-no :: NAT ) =
    Insurant :- check-data ( in : request-data ,
                             out : acceptable :: BOOL )
    IF acceptable
    THEN Let I :: ID , C :: NAT
         New.contract-no (C)
         New_Id ( out : I )
         Insurant :- compute-insurant-data
             ( in : request-data, C , out : V )
         Insurant := Insurant $\cup$ { (I,V) }
    ELSE fail
    ENDIF

□

Let us briefly discuss what it means that a method N on a class D specializes the method M
on a superclass C. First, we may assume (taking records) that there is exactly one input-type and
one output-type, say $I_N$ (resp. $I_M$) and $O_N$ (resp. $O_M$). The input-type is used for two
purposes: object identification in D (resp. C) and providing necessary parameters, hence $I_N$
(resp. $I_M$) is a subtype of some $I_D \times I'_N$ (resp. $I_C \times I'_M$).

In order for N to "inherit" the behaviour of M we must be able to transform N in such a
way that it becomes applicable to the input of M. Hence we have to project the parameter
parts, whereas identification may exploit object identifiers (see Definition 9.6). Hence $I'_M$ must
be a subtype of $I'_N$.

Note that this gives some kind of partial contravariance, whereas [11] requires covariance 

and [1] requires contravariance only. The differences are due to the mismatches between

program and database design as already mentioned in Section 9.1. 

For the output-types the situation is much simpler: $O_N$ is required to be a subtype of $O_M$.
We may then transform N in a canonical way to some $N'$ with the same signature as M.
Both may be regarded as methods on C. Then, if $N'$ applied to some input-value yields some
result, this should also result from applying M (but not vice versa). A more formal discussion
of this topic can be found in [17].

9.2.4 Schema Definitions

Now we are prepared for the definition of a database schema, which is simply given by a finite
collection of type and class definitions. Later we shall add constraint definitions. Thus, taking
together Examples 9.1-9.4, we get a schema with only one class Insurant and only one
method add-insurant.

However, some of the types in this schema, such as NAME, ADDRESS and REQUEST-DATA,
are undefined. The same applies to the methods check-data and compute-insurant-data
called by add-insurant. Allowing such partiality in OODM schemata makes it possible to
capture incomplete knowledge about an application area and will be essential for our
methodology. In the next two sections we shall explain this feature in more detail and show
how to exploit it for a standard refinement process.

First let us have a closer look at schemata that are "complete", i.e. correspond to a final
design of an application. This leads to the notion of closed schemata.


Definition 9.5. A schema S is a finite collection of type, class and constraint definitions. It is
closed iff all types, classes and methods occurring within type definitions, structure definitions
and methods are defined in S.

Let us postpone constraints for a while. At each time, a class is given by a nite set of objects. 

More precisely, we need the notion of a database instance. 

Definition 9.6. An instance D of a closed schema S assigns to each class C a value D(C)
of type $\{(ident : ID, value : T_C)\}$ such that the following conditions are satisfied:





Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have


referential integrity: For each reference from C to $C'$ with corresponding occurrence
relation $o_r$ we have


Basic update methods, i.e. insertion, deletion and update of a single object in a class C,
cannot always be derived in the object-oriented case, because the abstract identifiers have
to be hidden from the user. However, in [16] it has been shown that for value-representable
classes these operations are uniquely determined by the schema and consistent with respect
to the implicit referential and inclusion constraints.

Value-representability of all classes in a closed schema is implied if we can derive a (trivial)
uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the
class extension C to be unique:

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in D(C) \wedge (j, v) \in D(C) \Rightarrow i = j . \tag{9.94}$$
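On a finite class extension the uniqueness constraint (9.94) can be checked in a single pass. The Python sketch below assumes, purely for illustration, that an extension D(C) is a set of (identifier, value) pairs with hashable values; the function name and the sample data are hypothetical.

```python
def satisfies_uniqueness(extension):
    """Check that (i, v) in D(C) and (j, v) in D(C) imply i = j."""
    owner = {}
    for ident, value in extension:
        # remember the first identifier seen for each value
        if owner.setdefault(value, ident) != ident:
            return False
    return True

d_c = {("i1", ("Smith", 4711)), ("i2", ("Jones", 4712))}
print(satisfies_uniqueness(d_c))                               # True
print(satisfies_uniqueness(d_c | {("i3", ("Smith", 4711))}))   # False
```

If the check succeeds, every object of the class can be addressed by its value alone, which is what allows the identifiers to stay hidden from the user.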

Finally, the semantics of a closed schema is given by database histories, where a database
history on a schema S is a sequence $D_0, D_1, \dots$ of instances such that $D_0$ is the empty
database and each transition from $D_{i-1}$ to $D_i$ is due to some visible method on some class
$C \in S$.

9.3 Class Abstraction 

As we have seen in Section 9.2, the structure expression of a class in an OODM schema may
contain parameters. These arise from parameterized types. Parameterized classes allow us to
abstract from concrete structures. Indeed, an instance of a parameterized class may not be
regarded as a single set of pairs, but as a family hereof indexed by the possible instantiations.
Let us now extend and concretize this view to arbitrary schemata.

If we know that objects will have some attributes, but we still do not know the type of the
corresponding values, we may leave the corresponding parameter uninstantiated. However, if
we already know that we shall instantiate this parameter by some type, we may mark this
parameter as a type parameter. If we know that there will be some reference $r_i : C_i$, but $C_i$ is
undefined, then we have a class parameter.

For parameterized classes the possibilities to define methods and constraints are restricted.
If $\alpha$ is a type parameter and we do not know anything about the type, there is no non-trivial
way to express a term of that type, but terms are required in assignments as well as in
constraints. However, we may have partial knowledge of that type, e.g. that it is a subtype of
some other type, in which case we may use terms of that supertype.

If C is a class parameter, then each call of a method m on C is indeed undefined. Therefore,
for the proof of properties of the calling method, such as consistency, we can only
assume an arbitrary input-output-relation for m unless we completely defer the
proof.

Definition 9.7. Let S be a schema, T a type parameter, C a class parameter and M an
undefined method. A parameter restriction is either $T \le T'$ with some value type expression
$T'$, C isa $C'$ with some class name $C'$, $C.structure \le S$ with some structure expression S
or a restriction on the types of the signature of M.

Here $\le$ denotes the subtype relation and its canonical extension to structure expressions. Note
that some parameter restrictions may be inferred from context in the schema S. If a parameter
is unrestricted, we may add the implicit parameter restrictions $T \le \bot$, $C.structure \le \bot$
and $T_i \le \bot$ for type parameters, class parameters and types in method signatures. However,
if there is more than one restriction on a parameter, these may be inconsistent. In the case
of a consistent set of parameter restrictions, the set of restrictions on one parameter may be
unified to give only one restriction in the form of Definition 9.7. We then talk of the normalized
set of parameter restrictions.

In order to define the semantics of open (i.e. not closed) schemata, we need the notion of
instantiations.

Definition 9.8. Let S be a schema with a consistent set of parameter restrictions. An instantiation
I is given by a closed schema $S'$ that results from S by replacing each type parameter
T by a value type, each class parameter by a class and each undefined method by "Let ...
$o_i :: O_i$ ..." such that all parameter restrictions are satisfied. $S'$ is called minimal iff we
take the types and classes occurring in the normalized set of parameter restrictions.

Example 9.5. Let us look again at Examples 9.1-9.4. The minimal instantiation of the type
VZ (and VZ') gives

Type VZ =
    ( begin : $\bot$ ,
      end : $\bot$ ,


End VZ 

The minimal instantiation of the class Insurant leads to the structure expression 



      name : $\bot$ ,
      address : $\bot$ ,
      sex : $\bot$ ,
      insurance-times : VZ-list ,
      agency : $\bot$ )

The method add-insurant involves the call of check-data on the same class, but this method is
undefined, hence could only be treated as the non-deterministic value selection "Let acceptable
:: BOOL". □
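For a flat record, computing the minimal instantiation amounts to replacing every undefined type name by its normalized restriction, or by the trivial type when it is unrestricted. The Python sketch below illustrates this for the structure of Insurant; the dictionary encoding, the string `"BOT"` for the trivial type and the parameter names `defined` and `restrictions` are assumptions made for the example, and nested structures are deliberately not handled.

```python
BOT = "BOT"   # stands for the trivial type

def minimal_instantiation(structure, defined, restrictions=None):
    """Replace each undefined type name by its restriction, or by BOT if unrestricted."""
    restrictions = restrictions or {}
    return {field: (t if t in defined else restrictions.get(t, BOT))
            for field, t in structure.items()}

insurant = {"contract-no": "NAT", "name": "NAME", "address": "ADDRESS",
            "sex": "SEX", "insurance-times": "VZ-list", "agency": "AGENCY"}
print(minimal_instantiation(insurant, defined={"NAT", "VZ-list"}))
```

A restriction such as `{"NAME": "STRING"}` would make the `name` field STRING instead of the trivial type, mirroring the use of the normalized set of parameter restrictions.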

Finally, the full semantics of an open schema S is given by families of history sets indexed 

by the possible instantiations of S, whereas the minimal semantics is the semantics of the 

minimal instantiation. 

Note that each instantiation can be projected naturally to the minimal one. The principle
of class abstraction is necessary for stepwise refinement as indicated in Section 9.3, since
otherwise we would not be able to support partial designs. On the other hand, it increases the
bandwidth of possible concrete designs that occur as instantiations. Therefore, it is desirable
to provide libraries of abstract (partial) designs to achieve a higher rate of reusability.

9.4 Stepwise Refinement

Once an initial OODM schema is given, the following design process is based on stepwise
refinement. Roughly speaking, refinement means the reorganization of classes and methods
such that the semantics of the old schema is "preserved" within the new one. This is captured
by the next definition.

Let S and T be closed schemata and suppose there are (partial) functions

- $f_{inst}$ that is total, taking instances of T to instances of S,
- $f_{class}$ that is partial, taking a class in T to a class in S, and
- $f_{meth}$ that is total, taking a method in T to a (possibly empty) set of methods in S,

such that for each method M associated with a class C in T each method $M' \in f_{meth}(M)$ is
associated with $f_{class}(C)$. If S and T are arbitrary schemata, assume these functions to be
defined on the minimal instantiations.

Definition 9.9. T is a refinement of S iff for each pair $(D_{i-1}, D_i)$ in a database history of
T that corresponds to a method M and each $M' \in f_{meth}(M)$ that is defined and terminating
in $f_{inst}(D_{i-1})$ the pair $(f_{inst}(D_{i-1}), f_{inst}(D_i))$ corresponds to $M'$.
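The condition of Definition 9.9 can be tested on any finite sample of transitions. The Python sketch below is a bounded check, not a proof of refinement, and its encoding is an assumption: instances are hashable values, an old method maps an instance to the set of its possible successor instances (or `None` where it is undefined or non-terminating), and the toy schemata, method names and data are hypothetical.

```python
def is_refinement_on(transitions, f_inst, f_meth):
    """Check the condition of Definition 9.9 on a finite list of T-transitions.

    transitions: triples (d_prev, d_next, method_name) taken from histories of T
    f_inst:      maps instances of T to instances of S
    f_meth:      maps a T-method name to a list of S-methods; an S-method maps an
                 instance to the set of possible successors, or None if undefined
    """
    for d_prev, d_next, name in transitions:
        for old_method in f_meth.get(name, ()):
            successors = old_method(f_inst(d_prev))
            if successors is not None and f_inst(d_next) not in successors:
                return False
    return True

# S stores names only; T additionally stores a zip code per name.
def add_name(d):                       # S-method: insert any name from a small pool
    return {d | {n} for n in ("ann", "bob")}

def project(d):                        # f_inst: forget the zip codes
    return frozenset(name for name, _ in d)

f_meth = {"add_person": [add_name]}
empty = frozenset()
ok = [(empty, frozenset({("ann", 38678)}), "add_person")]
bad = [(empty, frozenset({("eve", 3044)}), "add_person")]
print(is_refinement_on(ok, project, f_meth))    # True
print(is_refinement_on(bad, project, f_meth))   # False
```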

There exists a more elegant (but also strongly theoretical) characterization of refinement. We
omit the details here. In [15] the following standard refinement steps in the OODM have been
discussed on the basis of an application example.

9.4.1 Instantiation 

In Section 9.3 we discussed the possibility of parameterized (open) schemata and defined their
semantics. Refinement by instantiation provides definitions for such parameters, but may also
introduce new parameters.


Example 9.6. Let us instantiate the type parameters ADDRESS and AGENCY occurring 

in Example 9.3. 

Type ADDRESS =
    ( zip : NAT Where self < 100 000 ,
      city : STRING ,
      street : STRING )
End ADDRESS

Type AGENCY =
    ( number : NAT Where self < 1 000 ,


      phones : { TELECOM_NO } ,
      fax : TELECOM_NO ,
      cares_for : { ( zip : NAT Where self < 100 000 ,
                      city : STRING ) } )
End AGENCY

□

Refinement by instantiation may also introduce bodies for methods that were undefined so
far.

9.4.2 Splitting 

Refinement by splitting leads to new classes with structure expressions that correspond to
parts of an existing structure expression, which in turn are replaced by references. It is mainly
used in the case of shared data.

Example 9.7. The class Agency stems from splitting Insurant in Example 9.3 assuming 

the instantiation of Example 9.6 to be already done. The new reference is agency : Agency. 

Class Agency = 

Structure ( agency : AGENCY ) 

End Agency 



    ... ... ,
    agency : Agency )
Methods ...
End Insurant

Clearly, the existing methods on the split class also have to be changed.

□
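At the instance level, splitting can be pictured as moving a shared component into its own class and keeping only a reference. The Python sketch below is an illustration under assumptions: objects are (identifier, record) pairs, agency values are hashable, and new identifiers are drawn from a hypothetical counter. The function `merge_agency` plays the role of a candidate $f_{inst}$ that maps an instance of the split schema back to an instance of the old one.

```python
from itertools import count

def split_agency(insurants):
    """Move the agency value of each insurant into its own class Agency."""
    fresh = (f"a{n}" for n in count(1))      # hypothetical source of new identifiers
    agency_id = {}
    new_insurants = []
    for ident, rec in insurants:
        aid = agency_id.get(rec["agency"])
        if aid is None:
            aid = agency_id[rec["agency"]] = next(fresh)
        new_insurants.append((ident, {**rec, "agency": aid}))
    agencies = [(aid, value) for value, aid in agency_id.items()]
    return new_insurants, agencies

def merge_agency(new_insurants, agencies):
    """Candidate f_inst: follow each agency reference back to the shared value."""
    value_of = dict(agencies)
    return [(i, {**rec, "agency": value_of[rec["agency"]]}) for i, rec in new_insurants]

old = [("i1", {"name": "Smith", "agency": "Agency North"}),
       ("i2", {"name": "Jones", "agency": "Agency North"})]
new_insurants, agencies = split_agency(old)
print(agencies)                                       # [('a1', 'Agency North')]
print(merge_agency(new_insurants, agencies) == old)   # True
```

The shared agency value is stored once after the split, which is the situation of shared data that this refinement step is mainly used for.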

9.4.3 Specialization 

Refinement by specialization introduces subclasses and subtypes. Moreover, it may involve
replacing a structure expression such that the new representation type will be a subtype of the
old one and the new implicit constraints will imply the old ones.

Example 9.8. Let us introduce a new class Main-Insurant as a subclass of Insurant. 

Objects in this subclass have an additional reference to Company that need not exist for all 

insurants. 


Class Main-Insurant =
    IsA Insurant
    Structure ( account-no : NAT ,
                employed-by : Company )
    Methods ...
End Main-Insurant

The new class Insurant results from specializing the old class with this name. We simply
add a reference to the class Main-Insurant for the case of an insurant of kind "Family". The
corresponding subtype function is a simple projection.


    Structure ( ... ,
        insurance-times : [ ( begin : DATE ,
                              end : DATE $\cup$ $\bot$ ,
                              ( kind : "Main" | "Interruption" ) $\cup$
                              ( kind : "Family" ,
                                associated-with : Main-Insurant ) ) ]
            Where ... ,
        agency : Agency )
    Methods ...
End Insurant

□

9.4.4 Extension 

Refinement by extension is very simple, since it means the definition of new types, classes,
constraints or methods that do not yet exist in the schema.

Example 9.9. A new class New_Insurant to capture persons that apply to become an
insurant is introduced as follows.

Class New_Insurant =
    Structure ( name : NAME ,


        sex : SEX ,
        when_to_start : DATE ,
        initial-agency : Agency ,
        vocational-group : VOCATION-KEY ,
        income : NAT Where self < 1 000 000 )
    Methods ...
End New_Insurant

Objects may at the same time belong to both classes Insurant and New_Insurant with
different names, addresses and so on. Object identifiers are used to relate different aspects of
the same object.

□

9.5 Declarativity by Constraint Centered Design 

As announced in Definition 9.5 we now concretize constraints associated with a schema.
Particular attention is paid to constraints that arise as generalizations of constraints
known from the relational model, e.g. functional, inclusion and exclusion constraints [17, 18].


Definition 9.10. - An integrity constraint on a schema S is a formula I over the underlying
type system with free variables $fr(I) \subseteq \{C_1, \dots, C_n\}$, where each class name $C_i$ is used
as a variable of type $\{(ident : ID, value : T_{C_i})\}$.
- An instance D of a schema is said to be consistent iff substituting D(C) for each class
variable C in each integrity constraint I evaluates to true, when interpreted in the usual
way.

Note that the conditions for an instance in Definition 9.6 correspond to model inherent integrity
constraints. We refer to these constraints as implicit identifier, IsA and referential
constraints on the schema S. Other constraints that are already given implicitly by the structure
of the schema arise from Where-clauses in predicative types. Indeed, we may replace such
types by the underlying ground type (just omit the Where-clause) and add the clause as a
constraint. From the designer's point of view this is not necessary, but it will be as soon as
constraint maintenance comes into play (see below).

Example 9.10. Return to Example 9.8, where we introduced the class Main-Insurant as a subclass of Insurant. We would like to express that each object currently in Insurant with kind = "Main" must also belong to Main-Insurant. This gives the formula

$$\forall i, v, b, \ell .\ (i, v) \in \text{Insurant} \wedge \text{insurance-times}(v) = [(b, \bot, \text{"Main"}) \mid \ell] \Rightarrow \exists w .\ (i, w) \in \text{Main-Insurant}$$

with free variables Insurant and Main-Insurant.

□

In particular, we allow distinguished classes of constraints to be specified in OODM schemata. These comprise inclusion, exclusion, functional, uniqueness, object generating and path constraints and generalize relevant classes known in the relational field [22].

Definition 9.11. Let $C, C_1, C_2$ be classes in a schema $S$ and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$) and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.

- A functional constraint on $C$ is a constraint of the form

$$\forall i, i' :: ID .\ \forall v, v' :: T_C .\ c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \tag{9.95}$$

- An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form

$$\forall t :: T .\ \big( \exists i_1 :: ID, v_1 :: T_{C_1} .\ (i_1, v_1) \in x_{C_1} \wedge c_1(v_1) = t \big) \Rightarrow \big( \exists i_2 :: ID, v_2 :: T_{C_2} .\ (i_2, v_2) \in x_{C_2} \wedge c_2(v_2) = t \big) . \tag{9.96}$$

- An exclusion constraint on $C_1$, $C_2$ is a constraint of the form

$$\forall i_1, i_2 :: ID .\ \forall v_1 :: T_{C_1} .\ \forall v_2 :: T_{C_2} .\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \neq c_2(v_2) . \tag{9.97}$$
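The three constraint classes of Definition 9.11 translate into simple checks over class extensions. The following Python sketch assumes that an extension is a list of (identifier, value) pairs, that the subtype functions are passed as ordinary functions with hashable results, and that the function names are free choices; it only tests a given state and is not the enforcement algorithm of [19].

```python
def holds_functional(extension, c1, c2):
    """Functional constraint: equal c1-images force equal c2-images."""
    image = {}
    for _ident, value in extension:
        key, dependent = c1(value), c2(value)
        # setdefault stores the first dependent value seen for this key
        if image.setdefault(key, dependent) != dependent:
            return False
    return True


def holds_inclusion(extension1, extension2, c1, c2):
    """Inclusion constraint: every c1-image in C1 also occurs as c2-image in C2."""
    return {c1(v) for _i, v in extension1} <= {c2(v) for _i, v in extension2}


def holds_exclusion(extension1, extension2, c1, c2):
    """Exclusion constraint: the c1-images of C1 and c2-images of C2 are disjoint."""
    return {c1(v) for _i, v in extension1}.isdisjoint(
        {c2(v) for _i, v in extension2}
    )


# Hypothetical sample data: a zip code determines the city.
addresses = [
    (1, {"zip": "03044", "city": "Cottbus"}),
    (2, {"zip": "03044", "city": "Cottbus"}),
    (3, {"zip": "38678", "city": "Clausthal-Zellerfeld"}),
]
print(holds_functional(addresses, lambda v: v["zip"], lambda v: v["city"]))
```

Note that identifiers play no role in any of the three checks; the constraints only speak about the values reached through the subtype functions.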


Constraints increase the declarativity of designs. This is important, because in data and 

knowledge intensive application systems the data and constraints on them usually live longer 

than the operations, i.e. the methods. 

Then the problem is to guarantee the consistency of the methods with respect to the specified constraints. Sometimes this requires hard verification work, but for a wide spectrum of schemata automatic transformation of constraints into methods is provided.

In [16] consistent generic update operations with respect to implicit constraints have been presented. In [18] this has been extended to the classes of constraints mentioned above. In [19] an algorithm for the transformation of constraints into transactions has been proven to be correct. This algorithm reduces the consistency enforcement task to basic updates. It can be shown that this operational approach to consistency enforcement is more powerful than the rule triggering approach [5, 8]. However, it requires the verification of a very technical condition, called I-reducedness, which limits the applicability of consistency enforcement in general. We omit the details of the algorithm here, since they are hidden from the designer.

The only thing a designer has to know is that constraint specifications will be made explicit in methods in a canonical way. If this leads to unexpected results, s/he may change the original design. It is an open research problem how to support the improvement of a schema in case constraint enforcement leads to inefficient methods.

9.6 Variation Based Reuse: A Research Issue 

The design process presented in Sections 9.3-9.5 implicitly assumes that we want to build a new application system from scratch. One promise of the object oriented approach, however, is an enormous increase in software reuse. This can be achieved if we keep the design components, i.e. type and class definitions, in libraries. The benefits are apparent especially if we regard the scale of reusability of parameterized class definitions.

Unfortunately reusability does not automatically imply reuse. Indeed, we have to provide mechanisms to relate the intended (new) designs with existing components in such a library. Existing type and class definitions are not independent from one another. The idea is now to exploit the hierarchies in OODM schemata due to instantiation, specialization and refinement. This extends the work in [14], where the specialization taxonomy in a KL-ONE like knowledge representation system has been exploited for a similar task.

An intended design is given just as before by a first (partial) OODM schema. Then the following cases may occur.

- A class/type of the intended design is an instantiation, specialization or refinement of an existing design component. Then we may ask whether a rearrangement of requirements would enable the reuse of further instantiations, specializations or refinements that exist in the library.

- A class/type of the intended design is a variant of an existing library component, i.e. the first alternative is true for a reparameterization of this library component. Of course, this is always possible, since a pure parameter would satisfy this requirement. Hence we have to judge whether it is helpful to take the reuse of the reparameterization into consideration. This approach is similar to the use of a similarity measure in case-based reasoning.


Once we have discovered a reusable variant in the library, we may keep track of the difference to the intended design and propagate these changes along the existing hierarchies. Then we may ask whether the resulting components can be directly reused.

This suggests a modification of the refinement-based design methodology. Before starting a refinement process, existing domain-specific libraries are examined and variants are built. Then the refinement process is based on selected variants. Moreover, variant construction is also required after standard refinement steps that introduce new types or classes, since for these there may also exist variants in some library.

Example 9.11. The class Insurant in Example 9.3 corresponds to current legal requirements. Some years ago an initial schema for an insurance application would have looked slightly different, since only main insurants existed at that time. This could have been modelled by some class Insurant old.

Class Insurant old = 

Structure ( contract-no : NAT , 

name : NAME , 


sex : SEX , 

insurance-times : [ ( begin : DATE , end : DATE $\cup \bot$ ) ] Where ... ,

account-no : NAT ,

employed-by : Company ,

family : { NAME } ,

agency : AGENCY )

Method ...

End Insurant old 

Assume such an initial design and all refinements to be kept in some library. Omitting account-no, employed-by and agency in the structure expression above would give a common supertype of the representation types for Insurant old and Insurant in Example 9.3.

Then build variants of all the existing refinements just omitting this information and check whether these are compatible with the new requirements. This avoids repeating refinement steps that occurred (in modified form) already in the past.

Finally, specialize Insurant as indicated in Example 9.8 and build variants of the refined classes Insurant and Main-Insurant with respect to the hierarchy developed so far. Again this should avoid repeating earlier refinement steps.

□
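The step of omitting fields to obtain a common supertype can be sketched for flat record types. In the Python fragment below a record type is a dictionary from field names to type names, and a record with fewer fields counts as a supertype of one with more. The field list for the old class follows Example 9.11 as far as it is given there; the field list for the current Insurant class is purely hypothetical, since Example 9.3 is not repeated here, and the function names are assumptions of this sketch.

```python
def common_supertype(record_a, record_b):
    """Keep exactly the fields that both record types share with the same type."""
    return {f: t for f, t in record_a.items() if record_b.get(f) == t}


def project(value, record_type):
    """Subtype function for records: forget all fields not in the supertype."""
    return {f: value[f] for f in record_type}


INSURANT_OLD = {
    "contract-no": "NAT",
    "name": "NAME",
    "sex": "SEX",
    "insurance-times": "[PERIOD]",
    "account-no": "NAT",
    "employed-by": "Company",
    "family": "{NAME}",
    "agency": "AGENCY",
}

# Hypothetical stand-in for the representation type of Insurant in Example 9.3.
INSURANT_NEW = {
    "contract-no": "NAT",
    "name": "NAME",
    "sex": "SEX",
    "insurance-times": "[PERIOD]",
    "family": "{NAME}",
    "kind": "KIND",
}

print(sorted(common_supertype(INSURANT_OLD, INSURANT_NEW)))
```

Building a variant of an existing refinement then amounts to applying the same projection to the refined types and checking the result against the new requirements.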

The concretization and theoretical treatment of these ideas for the outlined methodology is 

a research issue under current investigation. 

9.7 Inferences in OODB Design 

The work reported in the preceding sections presents first principles of object oriented database design. The main scenario is centered around stepwise refinement on the basis of an object oriented datamodel supporting class abstraction, generic update operations and declarative constraint specification. The datamodel as well as the design process involve a lot of supporting inferences. These fall into two classes. Let us first describe those inferences that are intrinsic to the datamodel.


- The datamodel supports type and class hierarchies. Since methods on subclasses may override inherited methods, we have to check that these are indeed specializations in order to limit undesired arbitrariness.

- The datamodel supports strongly typed methods, hence the problem of checking type correctness. A more general problem is the verification of consistency for constraints that escape enforcement.

- The datamodel supports generic updates, but these only exist in the case of value-representability. This leads to the problem of deciding whether a uniqueness constraint is implied.

The second class of inferences is required by the design methodology and extrinsic to the 

datamodel. 

- The main scenario is based on stepwise refinement. Hence the task to verify formal refinement conditions. However, for the standard refinement steps in Section 9.3 this is redundant, since they have already been proven to be correct.

- In order to enforce consistency the formal requirement of I-reducedness [19] has to be satisfied. Hence the task to check it.

- Finally, we have to recognize the relation of an intended design to existing library components, i.e. whether it is an instantiation, specialization, refinement or variant. This may involve data restructuring as shown in [12]. Moreover, once a useful variant has been detected, we may want to propagate the changes along the different hierarchies. This kind of variation-based reuse is still a research issue that we are working on.

However, there are still open research problems. So far, we do not know the exact boundary of the inferences. Another problem is the integration of user interfaces and graphical support in order to facilitate checking whether the design fits the amount of knowledge resulting from the current stage of requirements analysis.

Currently, there is a research project CODE (Computer-aided Object oriented Design 

Environment) that aims at solving these open problems. The main research topics of CODE 

will be the extension of the design method toward variation-based reuse and the support of 

the outlined methodology by a CASE tool. 

References


1. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The object-oriented 

database system manifesto, Proc. 1st DOOD, Kyoto 1989 


5 (4), 1990, pp. 353-382 

3. G. Booch: Object-oriented design with applications, Benjamin Cummings, 1991 

4. L. Cardelli, P. Wegner: On understanding types, data abstraction and polymorphism, ACM Computing 

Surveys, vol. 17(4), pp. 471-522 

5. S. Ceri, J. Widom: Deriving production rules for constraint maintenance, Proc. 16th Conf. on 

VLDB, Brisbane (Australia), August 1990, pp. 566-577 

6. P. Coad, E. Yourdon: Object-oriented analysis, Prentice Hall, 1991

7. C. Floyd: A comparative evaluation of system development methods, in T. W. Olle, H. G. Sol, A. A. Verrijn-Stuart (Eds.): Information Systems Design Methodologies - Improving the Practice, Elsevier 1986

8. P. Fraternali, S. Paraboschi, L. Tanca: Automatic rule generation for constraint enforcement in 

active databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of 

Models and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS 


9. R. Hull, R. King: Semantic database modeling: survey, applications and research issues, ACM 


10. W. Kim: Object-oriented databases: definition and research directions, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (3), 1990, pp. 327-341

11. B. Meyer: Object-oriented software construction, Prentice-Hall, 1988

12. B. Piza, K.-D. Schewe, J. W. Schmidt: Term subsumption with type constructors, in Y. Yesha (Ed.): Proc. 1st Int. Conf. on Information and Knowledge Management, Baltimore, November 1992

13. G. Saake, R. Jungclaus: Specification of database applications in the TROLL language, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems, Glasgow, July 1991, Springer WICS, pp. 228-245

14. K.-D. Schewe: Variant construction using constraint propagation techniques over semantic networks, 

in J. Retti, K. Leidlmaier (Eds.): Proc. of 5th Austrian AI Conference, Igls (Austria) 1989, 

Springer IFB 208, pp. 188-197 

15. B. Schewe, K.-D. Schewe, B. Thalheim: Verfeinerungsschritte für eine objektorientierte Entwurfsmethodik, in Proc. 23rd GI-Jahrestagung, Dresden (Germany), October 1993

16. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, genericity and consistency in object-oriented databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Berlin (Germany), October 1992, Springer LNCS 646, pp. 341-356

17. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity enforcement in object-oriented databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS

18. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity preserving updates in object oriented databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. Australian Database Conference, Brisbane (Australia), February 1993, World Scientific, pp. 171-185


CS-08-92, December 1992, submitted for publication 

20. S. Shlaer, S. J. Mellor: An object-oriented approach to domain analysis, ACM Software Engineering Notes, vol. 14 (3), 1989

21. C. Sernadas, P. Gouveia, J. Gouveia, A. Sernadas, P. Resende: The reification dimension in object-oriented database design, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems, Glasgow, July 1991, Springer WICS, pp. 275-299

22. B. Thalheim: Dependencies in relational databases, Teubner, Leipzig 1991

23. B. Thalheim: Intelligent database design using an extended entity-relationship model, University of Rostock, Preprint CS-11-91, December 1991


Chapter 10 

View-Centered Conceptual 

Modelling 

Contents

10.1 Introduction
10.2 The data layer
10.2.1 Application-independent abstraction: types
10.2.2 Combined structure and behaviour: classes
10.2.3 Operations
10.2.4 OODM schemata and instances
10.3 The dialogue layer
10.3.1 Views in the datamodel
10.3.2 Dialogue classes
10.3.3 Operations on d-classes
10.3.4 The dialogue management level
10.3.5 The impact of genericity: selection, invocation, navigation, deletion
10.4 The presentation layer
10.4.1 Presentation of dialogue classes
10.4.2 Presentation of actions
10.5 Development Methods
10.5.1 Designing a New Application
10.5.2 Changing an Existing Application
10.6 Conclusion

The following is a reprint of

Klaus-Dieter Schewe, Bettina Schewe. View-Centered Conceptual Modelling - An Object Oriented Approach. In B. Thalheim (Ed.). Conceptual Modeling - Proc. ER '96. Springer LNCS.


Abstract. Information systems for highly skilled clerical workers present themselves as a collection of window-based processes with underlying procedures accessing databases. It is left to the users to continue or interrupt a certain piece of work or to switch from one application to another. Such systems can be supported by three layers: a database layer, a dialogue layer and a presentation layer.

In this paper an integrated object oriented model with a distinction between types and 

classes is outlined. In this model views on the datamodel can be extended to dialogue classes 

which enable a smooth integration of dialogue objects with the underlying datamodel. The 

only remaining task for the presentation layer consists of suitable ergonomic presentations of 

dialogue objects on the screen by means of a general UIMS. 

10.1 Introduction


Conceptual modelling for information systems depends on the intended application. If the work of highly skilled clerical workers is to be supported, we must be aware that they do not follow a monotone working scheme. For example, consider agencies of a health insurance company with emphasis on the service for clients, who behave differently from one another, demand optimal service and information without delay, address their demands to the agents either personally, by phone or by fax, and prefer not to be burdened with complicated terminology and forms. Therefore, a supporting information system must support workflow beyond strict regularity, permitting its users to examine additional circumstances, write specialized letters instead of using forms, escape or interrupt processes etc.

As a consequence such information systems have to be composed of several independently usable dialogues, leaving to the user the decision which one to use in a concrete situation. The dialogue system has to offer many quickly reachable dialogue objects without forcing its users to reach them in a specific way. Furthermore, it must offer a good overview of a client's situation as context to the special data to be actually processed. On the other hand, such systems must handle large amounts of data, hence should be supported by a well-designed database system without bothering the users with database details.

From a conceptual modelling point of view the description of dialogues can be divided into two major components. The first one comprises the pure representational aspects concerning windows, fields, menus, shortcuts etc., and its design is basically concerned with a UIMS and ergonomic criteria [4]. The second one deals with the abstract data contained in the dialogue objects.

The data-intensive nature of the intended applications makes (conceptual) database design a central task in their development. This task is governed by general requirements concerning the quality of databases, which must be free of redundancies, flexible with respect to future extensions, not limited to specific applications and achieve high performance. However, the data processed in the dialogues is far from satisfying these criteria, but gives rise to views.

The development process has to be understood as a learning process, where not all requirements are known at the beginning. This requires the participation of the users, because they are the only ones who can judge the usefulness of proposed solutions. As a consequence the dialogue objects and hence the views on the data defined by them become the driving force in conceptual modelling. This should not be taken as an accident, but as a challenge.

In this paper we present an integrated model on the basis of the object oriented datamodel (OODM) in [10]. This datamodel has been defined in the spirit of Beeri's fundamental idea


concerning the conceptual separation of values and objects [2]. Values are provided by means of type systems consisting of base types and constructors [6]. Objects are provided by means of classes which combine complex value and reference structures, operations and inheritance. This approach to object orientation is quite different from the work in [3, 7] which focuses on methods for object oriented programming. In particular, it is easy to see that certain classes of OODM schemata with only flat acyclic reference structures are equivalent to schemata in the higher-order entity relationship model [13]. We give an outline of the datamodel in Section 10.2.

The OODM has been extended by dialogue classes in [8, 12] in order to support the development of information systems as characterized above. These dialogue classes are defined analogously to classes in the datamodel, i.e. they provide structural and behavioural abstractions of dialogue objects as well as inheritance. The dialogue objects can then be handled in the same way as objects in the database, which turns the management of the dialogues into a database task. The relationship between the dialogue model and the datamodel is given by means of views. We present the dialogue model in Section 10.3.

The development of user interfaces then reduces to the task of finding suitable representations of dialogue objects on the screen. For this purpose we propose the use of a general UIMS. In Section 10.4 we present a brief outline of representational means with respect to our integrated model.

To that end, the work reported in this paper continues our previous work in [12]. With respect to that paper we now achieve some simplification concerning the definition of dialogue classes. This definition was first given independently from the datamodel and led to several additional notions such as selection classes, actions and different kinds of operations (selection, navigation, invocation, processing), and we already observed the relationship to views on the datamodel. Now this relationship is directly incorporated in the definition of dialogue classes. Furthermore, selection is enabled by exploiting uniqueness constraints in the datamodel that were introduced in handling the identification problem in OODBs [10], and navigation can be supported by references between dialogue classes. Finally, the variety of different operations can be simplified using the distinction between hidden and visible operations which is already present in the OODM. Actions then correspond to the head of visible operations, while some of their characteristics are shifted to presentations.

With respect to the modelling method we think of a refinement-based approach as presented in [9, 11] for the OODM and extended to the dialogue model in [8], i.e. the data schema and the dialogue schema have to be developed in parallel, taking care of their interrelationships. This is in contrast to the work in [1, 5], where the starting point for user interface design is a complete entity-relationship schema. This topic will be briefly sketched in Section 10.5.

10.2 The data layer 

In the object-oriented datamodel (OODM) [10] we distinguish between objects and values. Whereas values are common abstractions identified by themselves, objects depend on the particular application context and have to be encoded by object identifiers. In the OODM each object consists of a unique, immutable identifier, a set of describing values of possibly different types, references to other objects and operations associated with the object.


10.2.1 Application-independent abstraction: types 

Types are used to describe immutable sets of values with (type-)operations predefined on them. Type systems are prescriptions for the syntax and semantics of permitted type definitions. Consider a type system that consists of some basic types, type constructors and a subtyping relation. Moreover, recursive types, i.e. types defined by equations, and predicative types, i.e. types defined by restricting formulae, are included.

Base types used here are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$, where ID is an abstract identifier type without any non-trivial supertype and $\bot$ is the trivial type that is a supertype of every type.

The type constructors used here are $e_1 \mid \dots \mid e_n$ (enumeration), $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a : \alpha) \cup (b : \beta)$ (union), where $\alpha, \beta, \alpha_1, \dots, \alpha_n$ are already defined types, $e_1, \dots, e_n$ are constant values and $a_1, \dots, a_n, a, b$ are field selectors. We may use base types and constructors to define new types by nesting. If there is no confusion, the field selectors in record or union types may be omitted.


standard operations on base types and on records, sets, bags, ... We omit the details here. A type $T$ is called proper iff the number of its parameters is 0. $T$ is called a value type iff there is no occurrence of ID in $T$. If $T'$ is a proper type occurring in a type $T$, then there exists a corresponding occurrence relation $o : T \times T' \to BOOL$ with $o(v_1, v_2) = true$ iff $v_2$ occurs in $v_1$ at the position indicated by the position of $T'$ in $T$.

A subtype function is a function $T' \to T$ from a subtype to its supertype ($T' \leq T$) defined by the usual subtyping rules [6].

Predicative types are used to restrict the set of values given by some type definition to a subset. Formally, a predicative type $T$ consists of an underlying type $T'$ and a formula $P$ with exactly one free variable self of type $T'$. Clearly, the inclusion then gives a subtype function. In order to avoid inflationary use of quantifiers, other variables are also allowed to occur freely in such a formula. They are assumed to be universally quantified.

Example 10.1. We define a type PERIOD and a predicative subtype COURSE of [PERIOD]:

Type PERIOD = ( begin : DATE , end : DATE $\cup \bot$ )

Where $\text{self.end} \neq \bot \Rightarrow \text{self.begin} \leq \text{self.end}$

End PERIOD

Type COURSE = [ PERIOD ]

Where $\text{self} = \text{concat}(L_1, [P_1, P_2 \mid L_2]) \Rightarrow P_2.\text{end} \neq \bot \wedge P_2.\text{end} \leq P_1.\text{begin}$

End COURSE

$L_1$ and $L_2$ are lists with elements of type PERIOD, $P_1$ and $P_2$ are values of type PERIOD and `concat' is the concatenation of two lists. Informally, the formula requires for any two successive periods the begin date of the first one to be later than the end date of the second one.

□
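The two Where-clauses of Example 10.1 can be read as membership tests for the predicative types. The Python sketch below encodes a period as a pair (begin, end) with None for the undefined end; this encoding, the function names and the choice of a non-strict date comparison are assumptions of the sketch.

```python
from datetime import date


def is_period(p):
    """PERIOD: if the end is defined, the period must not end before it begins."""
    begin, end = p
    return end is None or begin <= end


def is_course(periods):
    """COURSE: for successive P1, P2 the end of P2 is defined and not after
    the begin of P1, so the list runs from the latest period to the earliest."""
    if not all(is_period(p) for p in periods):
        return False
    return all(
        p2[1] is not None and p2[1] <= p1[0]
        for p1, p2 in zip(periods, periods[1:])
    )


# Hypothetical sample data.
course = [
    (date(1995, 1, 1), None),                # current, still open period
    (date(1990, 1, 1), date(1994, 12, 31)),  # earlier, closed period
]
print(is_course(course))
```

Only the head of a course may have an undefined end, which matches the intuition that at most the current period of insurance is still open.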

10.2.2 Combined structure and behaviour: classes 

The class concept provides the grouping of objects having the same structure and behaviour. 

Structurally this uniformly combines aspects of object values and references. Behaviourally, 


this abstracts from operations on single objects including their creation and deletion. In the 

OODM objects usually belong to more than one class. 

References between classes give rise to implicit referential constraints. In addition, subclasses (IsA-relationships) require each database instance to satisfy inclusion constraints on object identifiers. As usual in object oriented approaches class operations are used to model the database dynamics. In the OODM these are associated with classes.

Since identifiers can be represented using ID, values and references can be combined into a representation type, where each occurrence of ID denotes references to some other class. Therefore, we may define the structure of a class using parameterized types. Moreover, classes are arranged in IsA-hierarchies.

More formally, if $T$ is a value type with parameters $\alpha_1, \dots, \alpha_n$ and if the parameters are replaced by pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression.

Then a class consists of a class name $C$, a structure expression $S$, a set of class names $D_1, \dots, D_m$ (called superclasses) and a set of operations. We call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$. The type $U_C = (ident : ID, value : T_C)$ is called the class type of class $C$.
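The derivation of the representation type from a structure expression is a plain structural recursion. The Python sketch below uses a hypothetical encoding of type expressions as nested tuples (base types are strings, references are tagged "ref"); neither the encoding nor the function names come from the OODM itself.

```python
def representation_type(structure):
    """Replace each reference r_i : C_i in a structure expression by ID."""
    tag = structure[0] if isinstance(structure, tuple) else None
    if tag == "ref":                    # ("ref", reference_name, class_name)
        return "ID"
    if tag == "record":                 # ("record", {field: type})
        return ("record",
                {f: representation_type(t) for f, t in structure[1].items()})
    if tag in ("set", "list", "bag"):   # (constructor, element_type)
        return (tag, representation_type(structure[1]))
    if tag == "union":                  # ("union", left_type, right_type)
        return ("union",
                representation_type(structure[1]),
                representation_type(structure[2]))
    return structure                    # base type name such as "NAT"


def class_type(structure):
    """The class type pairs an identifier with a value of the representation type."""
    return ("record", {"ident": "ID", "value": representation_type(structure)})


# Hypothetical structure expression with one reference to a class Company.
structure = ("record", {
    "employed by": ("ref", "employed by", "Company"),
    "account no": "NAT",
})
print(class_type(structure))
```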

Example 10.2. 

Let us consider a class Insurant for an insurance application. 


Class Insurant =

Structure ( insurance number : NAT , name : NAME , address : ADDRESS ,

course of insurance : [ ( kind : "self" , begin : DATE ,

end : ( date : DATE , reason : STRING ) $\cup \bot$ ) $\cup$

( kind : "fam" , begin : DATE , end : DATE $\cup \bot$ ,

self : Self Insurant , relation : "child" | "spouse" ) ] )

Operation ...

End Insurant

Class Self Insurant =

IsA Insurant

Structure ( employed by : Company , account no : NAT )

Operation ...

End Self Insurant

A period of insurance in this example is of one of two possible kinds: either the insurant is employed by a company and therefore pays his/her own fee, or (s)he is a family member of a self insurant and has no income of his/her own.

□

10.2.3 Operations 

The OODM distinguishes between visible and hidden operations on classes to emphasize those that can be invoked by the user. However, all operations on a class including the hidden ones can be accessed by other operations. There are two reasons for such a weak hiding concept:

- Visible operations serve as a means to specify (nested) transactions. In order to build




- Hidden operations can be used to handle identifiers. Since these identifiers do not have any meaning to the user, they must not occur within the input or output of a transaction.

Each operation on a class $C$ consists of a signature and a body. The signature consists of an operation name $O$, a set of input-parameter/type pairs $\iota_i :: T_i$ and a set of output-parameter/type pairs $o_j :: T'_j$. The body is recursively built of the following constructs:

- assignment $x := E$, where $x$ is the class variable $C$ of type $\{U_C\}$ or a local variable (including the output-parameters), and $E$ is an expression of the same type as $x$,

- local variable declaration Let $x :: T$,

- skip and fail,

- sequencing $S_1 ; S_2$ and branching IF $P$ THEN $S_1$ ELSE $S_2$ ENDIF,

- operation call $C' :\!- O'(\text{in} : E'_1, \dots, E'_j \ \text{out} : x'_1, \dots, x'_i)$, where $O'$ is an operation on class $C'$ with compatible signature, and

- non-deterministic selection of values New.$f(x)$, where $f$ is a selector on the representation type of $C$; New Id selects a new identifier.

An operation $O$ on a class $C$ is called value-defined iff all types occurring in its signature are proper value types. As already mentioned we require each visible operation to be value-defined. Subclasses inherit the operations of their superclasses. Overriding is allowed as long as the new operation is a specialization of all its corresponding operations in its superclasses; we dispense with a formal discussion of operational specialization.

10.2.4 OODM schemata and instances 

A database schema $S$ is given by a finite collection of type and class definitions such that all types, classes and operations occurring within type definitions, structure definitions and operations are defined in $S$.

At any time, a class represents a finite set of objects. More precisely this is captured by the notion of an instance (or database state). For a closed schema $S$ an instance $\mathcal{D}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{(ident : ID, value : T_C)\}$ such that the following conditions are satisfied:

- For each class $C$ identifiers must be unique.

- The set of identifiers in a subclass $C$ is a subset of the one in the superclass $C'$. Moreover, if $T_C \leq T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then $(i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C')$ holds.

- For each reference $r$ from $C$ to $D$, identifiers $j$ occurring in a value $v$ of an object in $C$ with respect to the occurrence relation $o_r$, i.e. $(i, v) \in \mathcal{D}(C)$ and $o_r(v, j)$ hold, must occur in $\mathcal{D}(D)$.
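The three conditions on an instance can be checked directly on a concrete database state. In the Python sketch below an instance maps class names to lists of (identifier, value) pairs; the parameters describing superclasses, subtype functions and references (each reference given as a function returning the identifiers that occur in a value, standing in for the occurrence relation) are hypothetical encodings chosen for this illustration.

```python
def check_instance(instance, superclasses, subtype_fns, references):
    """Return a list of violated implicit constraints (empty if none)."""
    violations = []
    for cls, objects in instance.items():
        # 1. identifiers must be unique within each class
        idents = [ident for ident, _value in objects]
        if len(idents) != len(set(idents)):
            violations.append(f"{cls}: identifiers are not unique")
        # 2. IsA: each object must reappear, projected, in every superclass
        for sup in superclasses.get(cls, []):
            f = subtype_fns.get((cls, sup), lambda v: v)
            for ident, value in objects:
                if (ident, f(value)) not in instance[sup]:
                    violations.append(f"{cls} IsA {sup}: object {ident}")
        # 3. references: occurring identifiers must exist in the target class
        for target, occurring_ids in references.get(cls, []):
            target_ids = {j for j, _value in instance[target]}
            for ident, value in objects:
                for j in occurring_ids(value):
                    if j not in target_ids:
                        violations.append(f"{cls} -> {target}: dangling {j}")
    return violations


# Hypothetical sample schema and state.
superclasses = {"Self Insurant": ["Insurant"]}
subtype_fns = {("Self Insurant", "Insurant"): lambda v: {"name": v["name"]}}
references = {"Self Insurant": [("Company", lambda v: [v["employed by"]])]}
state = {
    "Insurant": [(1, {"name": "A"}), (2, {"name": "B"})],
    "Self Insurant": [(1, {"name": "A", "employed by": 7})],
    "Company": [(7, {"title": "X"})],
}
print(check_instance(state, superclasses, subtype_fns, references))
```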

Basic update operations, i.e. insertion, deletion and update of a single object into a class $C$, cannot always be derived in the object-oriented case, because the abstract identifiers have to be hidden from the user. However, in [10] it has been shown that for value-representable classes these operations are uniquely determined by the schema and consistent with respect to the implicit referential and inclusion constraints.

Value-representability of all classes in a closed schema is implied, if we have a (trivial) uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the class extension $C$ to be unique.
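To see why a uniqueness constraint lets identifiers stay hidden, consider a class without references whose values are unique. The following Python sketch is a deliberately simplified illustration under that assumption; it is not the construction of generic updates from [10], and the class and method names are hypothetical.

```python
import itertools


class ClassExtension:
    """Extension of one class; users address objects by value only."""

    def __init__(self):
        self._objects = {}                 # hidden: ident -> value
        self._fresh = itertools.count(1)   # source of new identifiers

    def satisfies_uniqueness(self):
        """Trivial uniqueness constraint: no value occurs twice."""
        values = list(self._objects.values())
        return all(values[a] != values[b]
                   for a in range(len(values))
                   for b in range(a + 1, len(values)))

    def insert(self, value):
        """Insert by value; refuse duplicates so uniqueness is preserved."""
        if value in self._objects.values():
            return False
        self._objects[next(self._fresh)] = value
        return True

    def delete(self, value):
        """Delete by value; the value determines the object uniquely."""
        for ident, stored in list(self._objects.items()):
            if stored == value:
                del self._objects[ident]
                return True
        return False

    def values(self):
        return list(self._objects.values())


ext = ClassExtension()
ext.insert({"name": "A"})
ext.insert({"name": "A"})   # rejected, the value already exists
print(ext.values())
```

Neither the input nor the output of insert and delete mentions an identifier, which is what the signature of a visible operation requires.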


10.3 The dialogue layer 

Object orientation within dialogue systems means to enter or select values on the screen and to invoke actions on them. The dialogue system reacts by offering other data or by activating and deactivating entries in selection lists or possible actions in the action bar [4]. We call such a collection of data and possible actions a dialogue object (d-object). In graphical user interfaces d-objects are normally presented in a window.

Users invoke actions to change data in the database, to navigate to another, possibly new, dialogue object or to a modified presentation of the same dialogue object. Depending on selections or entries made in a d-object only a part of the possible actions are allowed. The processing of an action may require further preconditions depending on the state of the dialogue system, especially on other users' d-objects.

A dialog object consists of a unique abstract identier, a set of values v i in associated 

elds F 1 ::: F n which correspond to describing values of objects, a set of references to other 

dialogue objects in order to allow aquicknavigational access, asetofactions to change the 

data and to control the dialogue and a state with the values àctive' and ìnactive'. This 

means, that dialogue objects only exist as long as the dialogue object is visible on the screen. 

If a window is closed the corresponding dialogue object ist deleted. 

The identier serves to administrate the dialogue objects. It is not known to the user, 

cannot be used by him and is not visible. Only the active d-object allows manipulations of 

the represented data and only its actions can be invoked. 
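The components of a dialogue object listed above can be pictured as a small record. This is only a sketch under assumptions: the class name `DObject`, the counter used to produce abstract identifiers and the sample action `clear_name` are hypothetical and serve illustration only.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Callable, Dict, Set

_next_ident = count(1)  # hypothetical source of abstract identifiers


@dataclass
class DObject:
    fields: Dict[str, Any]                       # values v_i in fields F_1 ... F_n
    references: Set[int] = field(default_factory=set)   # idents of other d-objects
    actions: Dict[str, Callable[["DObject"], None]] = field(default_factory=dict)
    active: bool = False                         # state: active / inactive
    # The identifier is assigned internally and is not meant for the user.
    _ident: int = field(default_factory=lambda: next(_next_ident),
                        init=False, repr=False)

    def invoke(self, name: str) -> None:
        # Only the active d-object accepts actions.
        if not self.active:
            raise RuntimeError("only the active d-object accepts actions")
        self.actions[name](self)


def clear_name(d: DObject) -> None:
    # Sample action (an assumption): blank out the 'name' field.
    d.fields["name"] = ""
```

Closing a window would then amount to discarding the corresponding `DObject`; the sketch leaves that bookkeeping to the dialogue system.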

10.3.1 Views in the datamodel 

Roughly speaking, a view may be regarded as a stored query. In the relational datamodel queries can be expressed by terms in relational algebra. This can be generalized to the OODM using its type system. Then a query turns out to be represented by a term $t$ over some type $T$ such that the free variables of $t$ represent the classes.

Since objects employ identifiers, we have to distinguish between queries that result in values and those that result in (collections of) objects. Therefore we distinguish in the OODM between value queries and general access expressions. For a value query the type $T$ of the defining term $t$ must be a value type.

This allows terms $t$ to be built which involve only identifiers already existing in the database. Thus, such queries are called object preserving. If we want the result of a query to represent `new' objects, i.e. if we want to have object generating queries, we have to apply a mechanism to create new object identifiers. This can be achieved by object creating functions on the type $ID$ with arity $ID \times \cdots \times ID \to ID$ [10].
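The difference between object preserving and object generating queries can be illustrated with a small sketch. It assumes that identifiers are integers and that an object creating function is realised by memoisation, so that the same argument identifiers always yield the same new identifier; the class `ObjectCreator`, the starting value 1000 and the sample data are assumptions made for illustration.

```python
from itertools import count
from typing import Dict, Tuple


class ObjectCreator:
    """Sketch of an object creating function ID x ... x ID -> ID."""

    def __init__(self, first_fresh: int) -> None:
        self._fresh = count(first_fresh)
        self._table: Dict[Tuple[int, ...], int] = {}

    def __call__(self, *ids: int) -> int:
        # Being a function, equal arguments must give the same identifier.
        if ids not in self._table:
            self._table[ids] = next(self._fresh)
        return self._table[ids]


# Hypothetical class extension: ident -> name.
insurant = {1: "Neumann, Luise", 2: "Neumann, Marga"}

# Object preserving: the result only uses identifiers that already exist.
preserving = {(i, name) for i, name in insurant.items()
              if name.startswith("Neumann")}

# Object generating: one new object per pair of distinct insurants.
new_id = ObjectCreator(first_fresh=1000)
pairs = {(new_id(i, j), (i, j)) for i in insurant for j in insurant if i < j}
```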

The idea that a view is a stored query then carries over easily. Thus, a view on the schema $S$ consists of a view name $v \in N_C$ such that there is no class $C$ with this name, a structure expression $S(v)$ containing references to classes in $S$ or to views on $S$, and a defining access expression¹ $t(v)$ of type $\{(\mathit{ident} : ID, \mathit{value} : T_v)\}$, where $T_v$ is the representation type corresponding to $S(v)$.

¹ Assume for the moment that view definitions do not contain recursive definitions.

Example 10.3. Let us give a sample view on the schema of Example 10.2:

View Course of Insurant =
  Structure
    [ (kind: "self", begin: DATE,
       end: (date: DATE, reason: STRING) ∪ ⊥,
       fams: { (id: Insurant, name: NAME, relation: "child" | "spouse",
                begin: DATE, end: DATE ∪ ⊥) } ) ∪
      (kind: "fam", begin: DATE, end: DATE ∪ ⊥,
       self: (id: Insurant, name: NAME,
              begin: DATE, end: DATE ∪ ⊥)) ]
  Definition
    { (i, course) | ∃cou . (i, cou) ∈ Insurant ∧
      course = [ p ‖ ∃c ∈ cou.course of insurance .
        p.kind = c.kind ∧ p.begin = c.begin ∧ p.end = c.end ∧
        ( c.kind = "self" ⇒ p.fams = { (j, fam) | ∃cou′ .
            (j, cou′) ∈ Insurant ∧ fam.name = cou′.name ∧
            ("fam", fam.begin, fam.end, i, fam.relation) =
              cou′.course of insurance.first ∧
            p.begin ≤ fam.begin ∧
            (p.end ≠ ⊥ ∧ fam.end ≠ ⊥ ⇒ fam.end ≤ p.end) } ) ∧
        ( c.kind = "fam" ⇒ ∃(k, cou″) ∈ Insurant . c.self = k ∧
            p.self.name = cou″.name ∧ p.self.begin ≤ p.begin ∧
            ("self", p.self.begin, p.self.end) ∈ cou″.course of insurance ∧
            (p.self.end ≠ ⊥ ∧ p.end ≠ ⊥ ⇒ p.end ≤ p.self.end) ) ] }
End Course of Insurant

This view contains the course of insurance of one concrete insurant. Together with one period 

of kind `self' in that course there are also the latest insurance periods of the family members. 

Together with one period of kind `fam' there is also the period of the insurant to whose family 

the related insurant belongs. 

□

10.3.2 Dialogue classes 

Dialogue classes serve to group dialogue objects with the same structure and behaviour. As with objects, we may use the type $ID$ to represent identifiers of d-objects and combine values and references in a structure expression, now also containing references to other d-classes. This can be described by a view as defined in the previous subsection. In addition, there should be a visual type which describes the data shown on the screen. This should be a supertype of the representation type corresponding to the structure expression of the defining view. Then the content of a d-object may be split over more than one d-class, which leads to the introduction of super-d-classes. Actions can be expressed by d-operations.

Thus, a dialogue class (d-class) consists of a unique name $DC$, a set of names $DC_1, \ldots, DC_n$ of d-classes (called the super-d-classes of $DC$), a defining view with a structure expression $DT'_{DC}$ and a content definition $\mathit{def}_{DC}$, a value type $DT_{DC}$ which is a supertype of the representation type $T'_{DC}$ corresponding to $DT'_{DC}$, and a set of d-operations. We call $DT'_{DC}$ the content structure, $T'_{DC}$ the content type and $DT_{DC}$ the visual type of the d-class $DC$.

Example 10.4. We give a part of the formal definition of a d-class corresponding to the view in Example 10.3:

Dialogue class Course of insurance
  IsA IIP
  Visual
    [ (kind: "self", begin: DATE, end: (date: DATE, reason: STRING) ∪ ⊥,
       fams: { (name: NAME, relation: "child" | "spouse", begin: DATE,
                end: DATE ∪ ⊥) } ) ∪
      (kind: "fam", begin: DATE, end: DATE ∪ ⊥,
       self: (name: NAME, begin: DATE, end: DATE ∪ ⊥)) ]
  Content
    [ (kind: "self", begin: DATE, end: (date: DATE, reason: STRING) ∪ ⊥,
       fams: { (id: Insurant, name: NAME, relation: "child" | "spouse",
                begin: DATE, end: DATE ∪ ⊥) } ) ∪
      (kind: "fam", begin: DATE, end: DATE ∪ ⊥,
       self: (id: Insurant, name: NAME, begin: DATE, end: DATE ∪ ⊥)) ]
  Definition ...
  Operations ...
End Course of insurance

For the definition we refer to the view presented in Example 10.3.

□
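In Example 10.4 the visual type differs from the content type only in that the id components (the references to Insurant) are omitted, so the subtype function from content to visual values is a projection. The following sketch shows such a projection, assuming that periods are encoded as dictionaries, dates as strings and ⊥ as `None`; the function names `drop_id` and `to_visual` are hypothetical.

```python
from typing import Any, Dict, List

Period = Dict[str, Any]


def drop_id(record: Dict[str, Any]) -> Dict[str, Any]:
    # Forget the reference component, keep all describing values.
    return {k: v for k, v in record.items() if k != "id"}


def to_visual(content: List[Period]) -> List[Period]:
    """Map a content value to the corresponding visual value."""
    visual = []
    for p in content:
        q = dict(p)  # copy, so the content value stays untouched
        if p["kind"] == "self":
            q["fams"] = [drop_id(f) for f in p["fams"]]
        else:  # kind "fam"
            q["self"] = drop_id(p["self"])
        visual.append(q)
    return visual
```

The identifiers stay available in the content value, which is what later allows navigation to the referenced insurant even though they are never shown on the screen.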

10.3.3 Operations on d-classes 

If a user selects an action associated with an active d-object, (s)he initiates changes to that d-object including its deletion, the creation of a new d-object, modifications to the underlying database or switches to other d-objects. This is modelled by the d-operations on d-classes.

As with the datamodel we distinguish between visible and hidden d-operations. Only visible d-operations are accessible by user actions, whereas hidden d-operations can only be called from other d-operations. In contrast to the datamodel, the access to (visible) d-operations may be restricted by preconditions that express the status of a d-object by means of selected or non-selected parts. Such preconditions are given by supertypes of the visual type $DT_{DC}$ of the d-class $DC$.

Thus, a d-operation consists of a signature, a selection type and a body. The signature is the same as for classes in the datamodel. This also applies to the body, with the differences that operations to be called can be d-operations on d-classes as well as operations on classes, and that assignments are not allowed. In this way we circumvent the update problem for views.

Then, by analogy to the datamodel, we require visible d-operations to be value-defined. Sub-d-classes inherit d-operations from their super-d-classes, and overriding is restricted to specialization.

Example 10.5. The following defines a d-operation on the d-class Course of insurance of Example 10.4:

New Insurant [(fams: (name: NAME)) ∪ (self: (name: NAME)) ∪ ⊥]
    (in: ⊥, out: ⊥)
  System :- save (in: cont)
  Let ins :: ID
  Course of insurance :- Select (in: sel, out: ins)
  Course of insurance :- Delete (in: cont.ident)
  Course of insurance :- Invoke (in: ins)
End New Insurant

Here the selection supertype has been put into brackets. Furthermore, we used two standard variables: cont of type $(\mathit{ident} : ID, \mathit{value} : T'_{DC})$ for the identifier/content pair of the current d-object, and sel for the selected values with respect to the selection type.

This d-operation New Insurant stores the actual data in the database, retrieves the identifier of the selected insurant, deletes the current d-object and creates a new one associated with the course of insurance of the selected insurant.

For that purpose we used calls to a d-operation save defined on the d-class System (assume this has been defined as a super-d-class of IIP) and to generic d-operations on Course of insurance for the deletion and creation of d-objects. We shall investigate genericity below.

□
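The control flow of New Insurant (save, select, delete, invoke) can be mimicked in a few lines. This is a sketch under strong simplifications: the class `DialogueSystem`, its methods and the reduction of an insurant to a name are assumptions, and because the underlying view is object preserving the sketch reuses insurant identifiers as d-object identifiers.

```python
from typing import Dict, List, Optional, Set


class DialogueSystem:
    """Hypothetical stand-in for the database plus the active d-objects."""

    def __init__(self, insurants: Dict[int, str]) -> None:
        self.database = dict(insurants)   # ident -> name
        self.active: Set[int] = set()     # idents of active d-objects
        self.log: List[str] = []          # order of the calls, for inspection

    def save(self, ident: int, name: str) -> None:
        self.database[ident] = name
        self.log.append("save")

    def select(self, name: str) -> Optional[int]:
        # Value identification: find the identifier for a given value.
        self.log.append("select")
        hits = [i for i, n in self.database.items() if n == name]
        return hits[0] if len(hits) == 1 else None

    def delete(self, ident: int) -> None:
        self.active.discard(ident)
        self.log.append("delete")

    def invoke(self, ident: int) -> None:
        self.active.add(ident)
        self.log.append("invoke")


def new_insurant(system: DialogueSystem, cont_ident: int, cont_name: str,
                 sel_name: str) -> Optional[int]:
    system.save(cont_ident, cont_name)   # System :- save (in: cont)
    ins = system.select(sel_name)        # Select (in: sel, out: ins)
    if ins is None:
        return None                      # simplification: the sketch just stops
    system.delete(cont_ident)            # Delete (in: cont.ident)
    system.invoke(ins)                   # Invoke (in: ins)
    return ins
```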

10.3.4 The dialogue management level 

The notions of d-schema and d-instance generalize the corresponding notions for the datamodel. A d-schema is a finite collection of type, class and d-class definitions that do not contain undefined types, classes, operations, d-classes or d-operations occurring in structure expressions, references, superclasses, signatures or calls. In particular, each d-schema $DS$ has an underlying OODM schema $S$.

Then an instance $D$ of $S$ already defines the contents of the views underlying d-classes in $DS$ and hence also the contents with respect to the visual types. For a d-class $DC$ we write $D(DC)$ for the value of type $\{(\mathit{ident} : ID, \mathit{value} : T'_{DC})\}$ defined by the content definition part $\mathit{def}_{DC}$ on the instance $D$. We call $D(DC)$ the set of possible d-objects in d-class $DC$ with respect to the instance $D$.

However, we want to associate with a d-instance the set of actual (active) d-objects in d-class $DC$. This should lead to subsets of $D(DC)$ satisfying conditions analogous to those required for instances $D$.

Thus, a d-instance $DD$ for a d-schema $DS$ consists of an instance $D$ of the underlying OODM schema $S$ and a mapping $D_{act}$ which assigns to each d-class $DC \in DS$ a set $D_{act}(DC)$ such that the uniqueness of identifiers, the inclusion integrity and the referential integrity (as defined for instances) hold.

10.3.5 The impact of genericity: selection, invocation, navigation, deletion 

As for classes in the datamodel, we may ask for generic operations on d-classes. Since possible d-objects are already determined by instances, generic operations for d-classes can only provide the deletion of actual d-objects or the invocation of another d-object. Note that the latter case comprises the navigation to an existing (active) d-object as well as the creation of a new one (in $D_{act}(DC)$) combined with a switch to it.

Since sets of actual d-objects behave like sets of ordinary OODM objects, we may exploit the identification theory of the OODM in [10] to generate these generic operations. Even simpler, due to the definition via views we only need a value identification for d-objects.

Recall that such a value identification is given by uniqueness constraints for all classes. Since subtyping can be easily extended to structure expressions, such a uniqueness constraint on a class $C$ with structure expression $S_C$ is simply given by a super-structure-expression $S_C$. If the representation type $T_C$ of $S_C$ is a value type, then this means to determine a unique object in $D(C)$, i.e. its identifier, from a given value of type $T_C$ (if such an object exists at all). If $S_C$ contains a reference, we first have to identify a referenced object, i.e. to determine its identifier from a given value of some value type.

Thus value identification gives rise to a generic select-operation which may call select-operations on other classes. If there are several uniqueness constraints, we may combine the required input types using the union constructor.

Example 10.6. For the class Insurant in Example 10.2 we may obtain a select-operation with the signature

  select (in: sel :: (Isn: NAT) ∪ (name: NAME, date of birth: DATE, address: ADDRESS), out: i :: ID)

Since the view in Example 10.3 is object preserving and the d-class in Example 10.4 is defined on top of this view, the operation carries over to a select-operation for a d-object (used in Example 10.5).

□
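A generic select-operation of the kind used in Example 10.6 can be derived from the list of uniqueness constraints alone. The sketch below assumes that a class extension is a dictionary from identifiers to records and that each uniqueness constraint is given as a tuple of attribute names; the factory `make_select` and the attribute spelling with underscores are assumptions made for the sake of runnable code.

```python
from typing import Any, Callable, Dict, Optional, Sequence, Tuple

Record = Dict[str, Any]


def make_select(extension: Dict[int, Record],
                keys: Sequence[Tuple[str, ...]]
                ) -> Callable[[Record], Optional[int]]:
    """Build a select-operation from uniqueness constraints (keys)."""

    def select(sel: Record) -> Optional[int]:
        for key in keys:
            # The input type is a union: find the alternative that was supplied.
            if set(sel) == set(key):
                hits = [i for i, v in extension.items()
                        if all(v[a] == sel[a] for a in key)]
                # Uniqueness guarantees at most one hit.
                return hits[0] if hits else None
        raise ValueError("selection matches no uniqueness constraint")

    return select
```

For the class Insurant the keys would be `("Isn",)` and `("name", "date_of_birth", "address")`. A delete- or invoke-operation by value can then be composed from `select` and the simpler operation that takes an identifier.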

As seen in Example 10.6, value identification gives rise to a generic select-operation for d-classes defined by object preserving views. In this case the delete- and invoke-operations can be split into a selection part, i.e. a call to the select-operation, and a simpler delete- or invoke-operation with an input of type $ID$ (the identifier of the d-object).

In the case of an object generating view the new objects depend on others, and we may obtain a generic select-operation by first selecting these other objects. E.g., in our insurance application this applies to a d-class in which each period of an insurant is turned into a separate object.

10.4 The presentation layer 

The handling of a dialogue system is best performed using a User Interface Management System (UIMS). Such a system provides (among other features)

- windows and operations to open and close them, to move them on the screen, to scroll, to change their size etc.
- several representations of data, such as selection lists or buttons, text entry fields etc.
- a main menu where all dialogues start, often called the operation desk.

10.4.1 Presentation of dialogue classes 

For each d-class there is at least one representation on the screen. Normally there are actions 

with which the representation of the d-class on the screen can be modied without changing 

the state of the d-class. The representation of the d-class is given by the UIMS. The concrete 

description therefore depends on its functionality. 

Visual values are associated with fields consisting of a relation to a component of the content type of a d-class, field attributes like `protected' / `unprotected', `normal' / `emphasized', ..., the type of the field (text entry field, selection field, ...), a selection state with the values `selected' and `unselected', the information whether data have been entered in a field or not, the information where the cursor is placed and an optional name of the field.

System   History   Options   Windows

Course of the Insurance

1133557   Neumann, Luise   10.11.1948   273
+ more information about the insurant +

Kind   Begin        End          Reason of End
       Name                   Relation   Begin        End
self   01.04.1979
       Neumann, Marga         child      13.02.1984
       Neumann, Horst         child      27.04.1986
fam    10.11.1976   31.03.1979
       Meier-Neumann, Fritz              01.01.1975
self   01.10.1967   09.11.1976   Too old as student
fam    10.11.1948   30.09.1967
       Neumann, Wilhelm                  01.01.1919   16.08.1990

Fig. 10.1. The presentation of a d-object

Fields may be grouped together. Further properties of fields depend on the features of the UIMS. For each field there is at least one representation on the screen comprising a declaration of its length, its style of emphasis and its style of representation of protection. For each representation there is also a representation of the selection state of the field.

Example 10.7. The presentation of a d-object in the class Course of insurance consists of three parts corresponding to the d-classes Course of insurance, IIP and System (see Figure 10.1):

- The `insurant information part (IIP)' is part of most d-objects and gives an overview of the insurant.
- Besides the IIP the d-object contains a list of insurance periods. Each period is represented by a group of lines of which the first line contains the kind (self or as family member of another insurant), the begin and the end of the period. For periods of kind `self' several lines (maybe 0) follow with names of family members, the relation of the family member to the insurant and begin and end of the latest insurance period of the family member. For periods of kind `fam' one line follows with the name and the insurance period of the insurant whose family the member belongs to.
- The last line is used for messages and originates from the d-class System. □

Besides the d-classes which are invoked by the user there are dialogue boxes, in which data can be entered and processed [4]. Dialogue boxes are called by operations of d-classes if further data are needed to finish an operation.

10.4.2 Presentation of actions 

The user uses actions to change the data on the screen and to control the dialogue. These actions correspond to the d-operations in the d-classes of the d-object and therefore consist of a name used in the action bar, a shortcut symbol with which the action can be invoked alternatively and a selection criterion.

The name of the action is the name of the corresponding d-operation, but names of menus may be added if necessary. The selection criterion is given by fields that may or must be selected before invoking the action. It corresponds to the selection type of the corresponding d-operation. Invoking an action means executing the body of the corresponding d-operation.
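How a selection criterion enables or disables an action can be sketched as follows. The sketch simplifies the selection type to two sets of field names (fields that must be selected and fields that may additionally be selected), which is coarser than a union type; the class `Action` and its attribute names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Set


@dataclass(frozen=True)
class Action:
    name: str                      # name shown in the action bar
    shortcut: Optional[str]        # alternative way to invoke the action
    must_select: FrozenSet[str]    # fields that must be selected
    may_select: FrozenSet[str]     # fields that may additionally be selected
    body: Callable[[Set[str]], str]

    def enabled(self, selected: Set[str]) -> bool:
        # All required fields are selected and nothing outside the criterion.
        allowed = self.must_select | self.may_select
        return self.must_select <= selected <= allowed

    def invoke(self, selected: Set[str]) -> str:
        if not self.enabled(selected):
            raise ValueError(f"action {self.name!r} is not enabled")
        # Invoking the action executes the body of the d-operation.
        return self.body(selected)
```

A dialogue system could call `enabled` after every change of the selection state to activate or deactivate the entries of the action bar.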

Example 10.8. Let us explain some actions associated with the d-object in Figure 10.1:

- `History' shows earlier states of the course of the insurance.
- `System' and `Options' are pull-down menus (omitted in the example). E.g., `System' contains the following actions: New Insurant, Save, Cancel (Esc), Save and Quit (F3), Scroll Forward (Bild↓), Scroll Back (Bild↑), Desk (Strg + F4).
- `New Insurant' saves the data on the screen and shows the course of insurance of another insurant, which can be selected in the list of periods. If no insurant is selected, a dialogue box with entry fields for the search for a new insurant is activated.
- `Save' saves the changes of the data on the screen and shows the same dialogue object again.
- `Cancel' deletes the dialogue object and returns to the one which was active before or to the desk. Changes made to the data are discarded.
- `Windows' is a pull-down menu containing the list of all existing dialogue objects. It is offered by the UIMS and not described here.

□

10.5 Development Methods 

In the previous sections we presented an integrated data- and dialogue-model for conceptual 

modelling, but we did not investigate how to use this model in practice. Due to space limitations, 

the following presentation of development methods will be rather sketchy. We indicate 

the power of the chosen model on the basis of two scenarios. The first one captures the case

of a new system to be designed, the second one handles the case of changing or extending a 

working application. In both cases, we concentrate on the conceptual level. 

10.5.1 Designing a New Application 

In designing a new application we have to define types, classes and d-classes from scratch. We assume that purely presentational aspects are captured by the use of a UIMS. The method we propose assumes an almost monotonic growth of application knowledge by means of interviews with the intended users and analysis of their working processes to be supported. The first goal will be to gather the basic activities of the users and to outline the corresponding dialogues. At this level, representational aspects naturally come into play by means of restrictions on screens, facilities of the UIMS and basic hard- and software.

From a more abstract point of view this means starting with dialogue objects and abstracting to dialogue classes without knowing the underlying datamodel schema. E.g., we could decide to have a presentation of dialogue objects and actions (grouped into menus) as in Figure 10.1. The simplest way to obtain a first conceptual data schema is to take the view definition as trivial, i.e., the content type of the view coincides with the representation type of a class. Note that references only occur if we decided to split the data in the presentation among several dialogue objects.

As to the methods, we may either postpone them for later specification or define them on the basis of this first schema. The first alternative is generally recommended as long as the database schema is not stable. Thus, the first development step results in a schema with certain undefined types, classes and references and with redundancies concerning the data schema.

Usually there is not only one such dialogue class; otherwise we are done. Hence there will be several dependencies among the classes of the schema. The second step will be to make these dependencies explicit by the definition of constraints. Then the third step is to refine the schema in order to shift as many of these dependencies as possible into structures. Refinement rules for this purpose have been extensively discussed in [8, 9, 11]. These comprise

- the splitting of classes, thereby introducing references or IsA-relations,
- the introduction of new classes by specialization, omitting the old class or introducing an IsA-relation,
- the extension of the schema by new types, classes, d-classes etc. and
- the completion of the schema, adding definitions to undefined components.

All these refinement steps can be reversed. Applying one of them requires consequent changes to the constraints and the methods defined so far. Furthermore, each refinement guarantees that classes of the old schema become views on the new schema, which in turn shows how to achieve complete d-classes.

10.5.2 Changing an Existing Application 

When there already exists a running application and we want to change or extend it, the processing method is quite similar. We analyse the new processes, detect the dialogue objects and add d-classes to the schema. In addition, we let the underlying views be trivial, thereby introducing redundancies on the data layer. Thus, the first schema update is simply additive. In the following steps redundancies have to be made explicit using constraints, and these are shifted into structure definitions. As a result we again obtain the required view definition, which can finally be flattened. We omit further details and refer to [8] for an extensive discussion of an application example.


10.6 Conclusion

In this paper we presented an object oriented model which integrates a datamodel and a dialogue model by means of views. Objects are used as units of data in the database with describing values, references to other objects and operations. They are managed by an object oriented database management system.

In the same way d-objects define the basic units of dialogues. A d-object abstracts from presentational issues at the user interface and hence provides a description of data and actions presented in dialogues. The data in the database and the dialogues are related by means of views. This allows d-objects to be managed in the same manner as objects. Only screen presentations are left to a supporting user interface management system.

The conceptual building blocks for objects and d-objects then follow the same principles. We use classes and d-classes to describe the abstract structural and behavioural aspects of both. Then a view on the database describes the possible contents of d-objects. Selection and creation correspond to uniqueness constraints that were introduced in connection with value-representability. In contrast to previous work, the paper emphasizes just these relationships between the datamodel and the dialogue model.


References

1. H. Balzert. Der JANUS-Dialogexperte: Vom Fachkonzept zur Dialogstruktur. Softwaretechnik-Trends, 13(3), August 1993.
2. C. Beeri. A formal approach to object-oriented databases. In Data and Knowledge Engineering, Vol. 5, 353-382. North Holland, 1990.
3. P. Coad and E. Yourdon. Object-oriented analysis. Prentice Hall, Englewood Cliffs, N.J., 1991.
4. IBM (International Business Machines Corp.). Systems Application Architecture Common User Access / Advanced Interface Design Guide, 1991. No. SC34-4290.
5. C. Janssen, A. Weisbecker, and J. Ziegler. Generating user interfaces from data models and dialogue net specifications. In Human Factors in Computing Systems (INTERCHI), 418-423, Amsterdam, 1993. ACM.
6. J. C. Mitchell. Type systems for programming languages. In J. van Leeuwen, editor, The Handbook of Theoretical Computer Science, Vol. B, 365-458. Elsevier, 1990.
7. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall, Englewood Cliffs, New Jersey, 1991.
8. B. Schewe. Kooperative Softwareentwicklung - Ein objektorientierter Ansatz. Deutscher Universitätsverlag, Leverkusen, 1996.
9. B. Schewe, K.-D. Schewe, and B. Thalheim. Objektorientierter Datenbankentwurf in der Entwicklung datenintensiver Informationssysteme. Informatik - Forschung und Entwicklung, 10(3), 1995, 115-127.
10. K.-D. Schewe and B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica, Szeged, 11(1/2), 1993, 49-84.
11. K.-D. Schewe and B. Thalheim. Principles of object oriented database design. In H. Jaakkola, H. Kangassalo, T. Kitahashi, and A. Markus, editors, Information Modelling and Knowledge Bases V, 227-242. IOS Press, Amsterdam, 1994.
12. B. Schewe and K.-D. Schewe. A user-centered method for the development of data-intensive dialogue systems - an object oriented approach. In E. D. Falkenberg, W. Hesse, A. Olivé, editors, Information System Concepts, 88-103. Chapman & Hall, 1995.
13. B. Thalheim. Foundations of entity-relationship modeling. Annals of Mathematics and Artificial Intelligence, 7, 1993, 197-256.
