
Readings in Fundamentals of Object Oriented Databases
- Selected Papers -

Klaus-Dieter Schewe (1), Bernhard Thalheim (2)

(1) Technische Universität Clausthal, Institut für Informatik, Erzstr. 1, 38678 Clausthal-Zellerfeld
(2) Technische Universität Cottbus, Institut für Informatik, Karl-Marx-Str. 17, 03044 Cottbus


Table of Contents

1 Fundamental Concepts of Object Oriented Databases
2 Identification as a Primitive of Database Models
3 Fundamentals of Object Oriented Database Modelling
4 Higher-Level Genericity in Object Oriented Databases
5 Towards a Theory of Consistency Enforcement
6 Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement
7 Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications
8 Consistency Enforcement in Entity-Relationship and Object-Oriented Models
9 Principles of Object Oriented Database Design
10 View-Centered Conceptual Modelling


Preface

This report contains a collection of selected papers on "Fundamentals of Object Oriented Databases". The work started with a small working group that met monthly at Hamburg University. The original participants were Ingrid Wetzel, Bernhard Thalheim and Klaus-Dieter Schewe. The primary intention was to bring together conceptual modelling and formal specification approaches to database design and to set up solid mathematical foundations. In these meetings we detected rather early that the major points to focus on were identification, genericity and consistency.

After first tentative papers addressing these problems (unpublished or documented in technical reports, e.g. [2, 4, 19]), the first paper containing just the problem areas above in its title was presented at ICDT '92 in Berlin [3]. In parallel the GCS approach to consistency enforcement was set up [9]. In collaboration with David Stemple we even discovered linguistic reflection as the fundamental implementation issue for these tasks [20]. Chapter 1 contains a reprint of a polished journal version of this work published in Acta Cybernetica [6], also summarized in [5]. Chapter 2 contains a deeper investigation of the identification problem by Catriel Beeri and Bernhard Thalheim [1]. Chapter 3 contains a follow-on paper [7], in which the original idea from formal specifications to consider arbitrary underlying type systems was taken up and the connection to higher-order intuitionistic logic was established. Chapter 4 is a reprint of the paper presented at the 1994 Indian conference on "Management of Data" [20].

In the sequel the GCS approach has been developed carefully, which after some preliminary work [11, 10] led to the fundamental journal article [8] in Acta Informatica and a follow-on article [14]. These are reprinted in Chapter 5 and Chapter 6.

From the beginning there was a struggle with the "Active Database" community. Researchers believing in the unlimited power of the rule based approach were not very enthusiastic about our fundamental approach to consistency enforcement, which relies on specification language semantics. In particular, our emphasis on the need for a formal definition of the goal of consistency enforcement in databases that encompasses just termination, confluence and consistency was (and still is) not generally accepted. Therefore, we started a side activity on limits of rule based systems [12, 13, 15, 16, 17, 18] in the context of transaction specifications. Chapter 7 contains a reprint of the polished article [13] published in Acta Cybernetica, which contains the major theoretical issues. The Data & Knowledge Engineering article [18], reprinted in Chapter 8, contains an application of part of that theory to simple object oriented schemata.

Finally, we worked on the design of object oriented databases. The tie-in with formal specifications initiated a refinement-based approach in [21], reprinted in Chapter 9. In the meantime Bettina Schewe brought up the idea of an integrated user interface design through the use of dialogue objects. This led to the work in [22, 23] and additional articles, reports and books written in German. We reprint [23] in Chapter 10.


Acknowledgement

We would like to thank our coauthors and all others who stimulated ideas presented in our articles.

References

1. C. Beeri, B. Thalheim. Identification as a Primitive of Database Models. In T. Polle, T. Ripke, K.-D. Schewe. Fundamentals of Information Systems. Kluwer 1998.
2. K.-D. Schewe, B. Thalheim, I. Wetzel, J.W. Schmidt. Extensible, safe object oriented design of database applications. Preprint CS-09-91. Rostock University. 1991.
3. K.-D. Schewe, J.W. Schmidt, I. Wetzel. Identification, genericity and consistency in object oriented databases. In J. Biskup, R. Hull (Eds.). Proc. Int. Conf. on Database Theory - ICDT '92. Springer LNCS 646. Berlin 1992.
4. K.-D. Schewe, B. Thalheim, I. Wetzel. Foundations of object oriented database concepts. Technical Report FBI-HH-B-157/92. Hamburg University. 1992.
5. K.-D. Schewe, B. Thalheim. Towards a formal foundations of object oriented databases. SIGMOD workshop Combining declarative and object oriented databases. Washington 1993.
6. K.-D. Schewe, B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica, vol. 11 (4), 49-84. Szeged 1993.
7. K.-D. Schewe. Fundamentals of object oriented database modelling. Intelligent Systems. Moskau 1997.
8. K.-D. Schewe, B. Thalheim. Towards a theory of consistency enforcement. Acta Informatica (to appear).
9. K.-D. Schewe, B. Thalheim, J.W. Schmidt, I. Wetzel. Enforcing integrity in object oriented databases. In U. Lipeck, B. Thalheim (Eds.). Modelling Database Dynamics. Springer Workshops in Computer Science. London 1993.
10. K.-D. Schewe, B. Thalheim, I. Wetzel. Integrity preserving updates in object oriented databases. In M. Orlowska, M. Papazoglou (Eds.). Advances in Databases - ADC '93. World Scientific. Singapore 1993.
11. K.-D. Schewe, B. Thalheim. Computing Consistent Transactions. Preprint CS-08-92. Rostock University 1992.
12. K.-D. Schewe, B. Thalheim. Achieving Consistency in Active Databases. In S. Chakravarty, J. Widom (Eds.). Research Issues in Data Engineering - Active Database Systems. Houston 1994.
13. K.-D. Schewe, B. Thalheim. Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications. Acta Cybernetica (to appear).
14. K.-D. Schewe. Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Dagstuhl 1996.
15. K.-D. Schewe, B. Thalheim. Active Consistency Enforcement for Repairable Database Transitions. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Dagstuhl 1996.
16. K.-D. Schewe, B. Thalheim. On the strength of rule triggering systems for integrity maintenance. In C. McDonald (Ed.). Database Systems '98. Australian Computer Science Communications, vol. 20 (2), 77-88. Springer 1998.
17. K.-D. Schewe. Well-behaving rule systems for Entity-Relationship and object oriented models. In D. Embley, R. Goldstein (Eds.). Conceptual Modeling - ER '97. Springer LNCS 1331, 141-154. New York 1997.
18. K.-D. Schewe. Consistency Enforcement in Entity-Relationship and Object Oriented Models. Data & Knowledge Engineering 1998 (to appear).
19. K.-D. Schewe, J.W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel. A reflective approach to method generation in object oriented databases. Rostocker Informatik Berichte, vol. 14. Rostock University. 1992.
20. K.-D. Schewe, D. Stemple, B. Thalheim. Higher level genericity in object oriented databases. Proc. Int. Conf. Management of Data. Bangalore 1994.
21. K.-D. Schewe, B. Thalheim. Principles of object oriented database design. In H. Jaakkola et al. (Eds.). Information Modelling and Knowledge Bases V, 227-242. IOS Press. Amsterdam 1994.
22. B. Schewe, K.-D. Schewe. A user-centered method for the development of data-intensive dialogue systems. In E. Falkenberg, W. Hesse (Eds.). Information System Concepts, 88-103. Chapman & Hall 1995.
23. K.-D. Schewe, B. Schewe. View-centered conceptual modelling - an object oriented approach. In B. Thalheim (Ed.). Conceptual Modeling - ER '96. Springer LNCS. Berlin 1996.


Chapter 1

Fundamental Concepts of Object Oriented Databases

Contents

1.1 Introduction
1.2 A Motivating Example
1.3 A Core Object Oriented Datamodel
    1.3.1 A Simple Type System
    1.3.2 The Class Concept as a Structural Primitive
    1.3.3 User Defined Integrity Constraints
    1.3.4 Methods as a Basis for Behaviour Modelling
    1.3.5 Queries and Views
1.4 The Object Identification Problem
    1.4.1 The Notion of Value-Representability
    1.4.2 Value-Representability in the Case of Acyclic Reference Graphs
    1.4.3 Computation of Value Representation Types
    1.4.4 The Finiteness Property
    1.4.5 Weak Value-Representability
1.5 The Genericity Problem
    1.5.1 Generic Update Methods
    1.5.2 Generic Updates in the Case of Value-Representability
1.6 The Consistency Problem
    1.6.1 Greatest Consistent Specializations
    1.6.2 Enforcing Integrity in the OODM
1.7 Conclusion

This chapter contains a reprint of

K.-D. Schewe, B. Thalheim. Fundamental Concepts of Object Oriented Databases. Acta Cybernetica, vol. 11, no. 1-2, 49-84. Szeged 1993.


Abstract. It is claimed that object oriented databases (OODBs) overcome many of the limitations of the relational model. However, the formal foundation of OODB concepts is still an open problem. Even worse, for relational databases a commonly accepted datamodel existed very early on, whereas for OODBs the unification of concepts is missing. The work reported in this paper contains the results of our first investigations on a formally founded object oriented datamodel (OODM) and is intended to contribute to the development of a uniform mathematical theory of OODBs.

A clear distinction between objects and values turns out to be essential in the OODM. Types and classes are used to structure values and objects respectively. Then the problem of unique object identification occurs. We show that this problem can be solved for classes with extents that are completely representable by values. Such classes are called value-representable.

Another advantage of the relational approach is the existence of structurally determined generic update operations. We show that this property can be carried over to object-oriented datamodels if classes are value-representable. Moreover, in this case database consistency with respect to implicitly specified referential and inclusion constraints will be automatically preserved.

This result can be generalized with respect to distinguished classes of explicitly stated static constraints. Given some arbitrary method and some integrity constraint there exists a greatest consistent specialization (GCS) that behaves nicely in that it is compatible with the conjunction of constraints. We present an algorithm for the GCS construction of user-defined methods and describe the GCSs of generic update operations that are required herein.

1.1 Introduction

The shortcomings of the relational database approach encouraged much research aimed at achieving more appropriate data models. It has been claimed that the object-oriented approach will be the key technology for future database systems and languages [8]. Several systems [4, 6, 7, 9, 15, 16, 17, 19, 26, 36, 37, 38] arose from these efforts. However, in contrast to research in the relational area there is no common formal agreement on what constitutes an object-oriented database [10, 11, 13].

The basic question "What is an object?" seems to be trivial, but already here the variety of answers is large. In object oriented programming the notion of an object was intended as a generalization of the abstract data type concept with the additional feature of inheritance. In this sense object orientation involves the isolation of data in semi-independent modules in order to promote high software development productivity. The development of object oriented databases regarded an object also as a basic unit of persistent data, a view that is heavily influenced by existing semantic datamodels (SDMs) [2, 29, 31, 39, 40, 60]. Thus, object oriented databases are composed of independent objects but must also provide for the maintenance of inter-object consistency, a demand that is to some degree in dissonance with the basic style of object orientation.

A view that is common in OODB research is that objects are abstractions of real world entities and should have an identity [8]. This leads to a distinction between values and objects [10, 11]. A value is identified by itself whereas an object has an identity independent of its value. This object identity is usually encoded by object identifiers [1, 3, 34]. Abstracting from the pure physical level, the identifier of an object can be regarded as being immutable during the object's lifetime. Identifiers ease the sharing and update of data. However, such abstract identifiers do not relieve us from the task of providing unique identification mechanisms for objects. In object oriented programming object names are sufficient, but retrieving mass data by name is senseless.

In most approaches to OODBs an object is coupled with a value of some fixed structure. From our point of view this already contradicts the goal of objects being abstractions of reality. In real situations an object has several, and also changing, aspects that should be captured by the object model. Therefore, in our object model each object $o$ consists of a unique identifier $id$, a set of (type, value) pairs $(T_i, v_i)$, a set of (reference, object) pairs $(ref_j, o_j)$ and a set of methods $meth_k$.
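
The object structure just described can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class name `DBObject`, the use of dictionaries keyed by type name and reference name (which allows at most one value per type), and the `STUDENT` aspect with its `StudNo` field are hypothetical and not part of the model's definition.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass(eq=False)
class DBObject:
    """One object o = (id, {(T_i, v_i)}, {(ref_j, o_j)}, {meth_k})."""
    oid: int  # abstract identifier; treated as immutable by convention
    values: Dict[str, Any] = field(default_factory=dict)        # type name T_i -> value v_i
    refs: Dict[str, "DBObject"] = field(default_factory=dict)   # reference name ref_j -> object o_j
    methods: Dict[str, Callable[..., Any]] = field(default_factory=dict)  # name -> meth_k

    def __eq__(self, other: object) -> bool:
        # Identity is decided by the identifier alone, never by the values.
        return isinstance(other, DBObject) and self.oid == other.oid

    def __hash__(self) -> int:
        return hash(self.oid)


# One object seen under two aspects at once (STUDENT and StudNo are hypothetical).
alice = DBObject(oid=1)
alice.values["PERSON"] = {"PersonIdentityNo": 4711}
alice.values["STUDENT"] = {"StudNo": 123}

# A second object with exactly the same values is still a different object.
twin = DBObject(oid=2, values=dict(alice.values))
```

In this reading, adding or dropping an entry in `values` is what a change of class membership amounts to, while `oid` stays fixed.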

Types are used to structure values. Classes serve as a structuring primitive for objects having the same structure and behaviour. The multiple-aspects view of an object evidently allows objects to be members of more than one class simultaneously and to change class memberships. This setting also makes every discussion of "object migration" unnecessary, as migration is only a specific form of value change.

In our model a class structure uniformly combines aspects of object values and references. The extent of classes varies over time, whereas types are immutable. Relationships between classes are represented by references together with referential constraints on the object identifiers involved. Moreover, each class is accompanied by a collection of methods. A schema is given by a collection of class definitions together with explicit integrity constraints.

The Identification Problem. One important concept of object-oriented databases is object identity. Following [1, 12] the immutable identity of an object can be encoded by the concept of abstract object identifiers. The advantages of this approach are that sharing, mutability of values and cyclic structures can be represented easily [42]. On the other hand, object identifiers do not have a meaning for the user and should therefore be hidden.

We study whether equality of identifiers can be derived from the equality of values. In the literature the notion of "deep" equality has been introduced for objects with equal values and references to objects that are also "deeply" equal. This recursive definition becomes interesting in the case of cyclic references.
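
To see why cycles matter, consider the following Python sketch of a deep-equality test. It is one possible reading, not a definition taken from the literature cited here: objects are represented as plain dictionaries with hypothetical keys `"value"` and `"refs"`, and a pair of objects that is reached again during the comparison is assumed to be equal, so that the recursion terminates on cyclic reference structures.

```python
def deep_equal(a, b, assumed=None):
    """Deep equality for objects given as {"value": ..., "refs": {name: object}}."""
    if assumed is None:
        assumed = set()
    pair = (id(a), id(b))
    if pair in assumed:
        # This pair was already compared or is still under comparison,
        # and no difference has shown up so far: accept it.
        return True
    assumed.add(pair)
    if a["value"] != b["value"] or a["refs"].keys() != b["refs"].keys():
        return False
    return all(deep_equal(a["refs"][r], b["refs"][r], assumed) for r in a["refs"])


# Two distinct objects, each referring to itself, with equal values.
x = {"value": "n", "refs": {}}
x["refs"]["next"] = x
y = {"value": "n", "refs": {}}
y["refs"]["next"] = y

# A third self-referencing object with a different value.
z = {"value": "m", "refs": {}}
z["refs"]["next"] = z
```

Here `deep_equal(x, y)` holds although `x` and `y` are different objects, which is exactly the situation in which values alone cannot separate identifiers.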

Therefore, we introduce uniqueness constraints, which express equality on identifiers as a consequence of the equality of some values or references. On this basis we can address the problem of how to characterize those classes that are completely representable (and hence also identifiable) by values.
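
A uniqueness constraint of this kind can be checked on a class extent as in the following sketch. The representation of an extent as a dictionary from identifiers to values, the function name `uniqueness_violation` and the sample data are assumptions made for illustration only.

```python
def uniqueness_violation(extent, key):
    """Return a pair of identifiers that breaks the constraint, or None.

    extent: dict mapping object identifier -> value (a dict of fields)
    key:    function extracting the part of the value that must be unique
    """
    seen = {}
    for oid, value in extent.items():
        k = key(value)
        if k in seen:
            return (seen[k], oid)   # equal key values, different identifiers
        seen[k] = oid
    return None


# Hypothetical extent: two persons sharing a second name.
persons = {
    101: {"PersonIdentityNo": 1, "SecondName": "Smith"},
    102: {"PersonIdentityNo": 2, "SecondName": "Smith"},
}
```

With `PersonIdentityNo` as key the constraint holds on this extent, so equal key values would force equal identifiers; with `SecondName` as key it is violated by the pair `(101, 102)`.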

Generic Update Operations. The success of the relational data model is certainly due to the existence of simple query and update languages. Preserving the advantages of the relational model in OODBs is a serious goal.

The generic querying of objects has been approached in [1, 12]. Querying is per se a set-oriented operation, i.e. it is not necessary to select just one single object, and hence it does not raise any specific problems with object identifiers. Things change completely in the case of updates. If an object with a given value is to be updated (or deleted), this is only defined unambiguously if there does not exist another object with the same value. If more than one object exists with the same value, or more generally with the same value and the same references to other objects, then the user has to decide whether an update or delete operation is applied to all these objects, to only one of these objects selected non-deterministically, or to none of them, i.e. to reject the operation. However, it is not possible to specify a priori such an operation that works in the same way for all objects in all situations. The same applies to insert operations. Hence the problem arises in which cases operations for the insertion, deletion and update of objects can be defined generically.
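
The ambiguity can be made concrete with a small sketch of a value-driven delete operation. The function `generic_delete`, the dictionary representation of an extent and the sample values are hypothetical; of the three options named above, the sketch arbitrarily adopts the third and rejects the operation whenever the given value does not single out exactly one object.

```python
def generic_delete(extent, value):
    """Delete the single object carrying `value`; refuse if the choice is not unique.

    extent: dict mapping hidden object identifier -> value
    """
    matches = [oid for oid, v in extent.items() if v == value]
    if len(matches) != 1:
        # Zero or several candidates: the value does not identify one object.
        raise LookupError(f"{len(matches)} objects carry this value")
    del extent[matches[0]]


# Hypothetical extent in which two objects share the same value.
extent = {1: ("Smith",), 2: ("Smith",), 3: ("Jones",)}
```

Deleting `("Jones",)` succeeds, whereas deleting `("Smith",)` raises `LookupError`, because the caller has no access to the hidden identifiers that would tell the two objects apart.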

Some authors [43] have chosen the solution to abandon generic operations. Others [6, 7, 9] use identifying values to represent object identity, thus embodying a strict concept of surrogate keys to avoid the problem. Our approach is different from both solutions in that we use the concept of hidden abstract identifiers, but at the same time formally characterize those classes for which unique generic operations for the insertion, deletion and update of single objects can be derived automatically. It turns out that these are exactly the value-representable ones.

The Consistency Problem. One of the primary benefits that database systems offer is automatic enforcement of database integrity. One type of integrity is maintained through automatic concurrency control and recovery mechanisms; another is the automatic enforcement of user-specified integrity constraints. Most commercial database systems, especially relational database management systems, enforce only a bare minimum of constraints, largely because of the performance overhead associated with updates.

The maintenance problem is the problem of how to ensure that the database satisfies its constraints after certain actions. There are at present two approaches to this maintenance problem. The first, more classical one is the modification of methods in accordance with the specified integrity constraints. The second approach uses generation mechanisms for the specified events. Upon occurrence of certain database events like update operations the management component is activated for integrity maintenance. The first research direction did not succeed because of some limitations within the approach. The second one is at present one of the most active database research areas. One of our objectives is to show that the first approach can be extended to object-oriented databases using stronger mathematical fundamentals.

Accuracy is obviously an important and desirable feature of any database. To this end, integrity constraints, i.e. conditions that data must satisfy before a database is updated, are commonly employed as a means of helping to maintain consistency. In relational databases the specification and enforcement of integrity constraints has a long tradition [61], whereas in OODBs the integrity problem has only recently drawn attention [48].

In object oriented databases, integrity maintenance can be based on two different approaches. The first one uses blind update operations. In this case, any update is allowed and the system organizes the maintenance. The second approach is based on method rewriting. This approach is more effective. Assuming a consistent database state, the modified method cannot lead to an inconsistent state.
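
The guarantee stated for modified methods can be illustrated by a deliberately naive Python sketch. It is not the GCS construction developed later in this chapter: instead of rewriting the method it simply runs it on a copy and discards the result if the constraint fails. All names (`make_consistent`, `enroll`, `enrolled_are_persons`) and the state layout are hypothetical.

```python
import copy


def make_consistent(method, constraint):
    """Wrap `method` so that it can never leave a consistent state inconsistent."""
    def guarded(state, *args):
        candidate = copy.deepcopy(state)   # work on a copy, keep the old state intact
        method(candidate, *args)
        # Commit only if the constraint holds afterwards; otherwise do nothing.
        return candidate if constraint(candidate) else state
    return guarded


# Hypothetical inclusion constraint: every enrolled identifier is a known person.
def enrolled_are_persons(state):
    return state["enrolled"] <= state["persons"]


# Hypothetical user-defined method that ignores the constraint.
def enroll(state, pid):
    state["enrolled"].add(pid)


safe_enroll = make_consistent(enroll, enrolled_are_persons)
```

Starting from a consistent state, `safe_enroll` either returns a new consistent state or hands back the old one unchanged. A rewritten method in the sense of the text would achieve the same effect without the trial run.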

In relational databases distinguished classes of static integrity constraints have been discussed, such as inclusion, exclusion, functional, key and multi-valued dependencies. All these constraints can be generalized to the object oriented case. Then the result on the existence of integrity preserving methods can be generalized to capture these constraints as well. We shall also describe the resulting methods.

The Organization of the Paper. We start with a motivating example in Section 1.2, then introduce in Section 1.3 a core OODM to formalize the concepts used intuitively in the example. In Section 1.4 the notions of (weak) value-representability are introduced in order to handle the identification problem. The genericity problem will be approached in Section 1.5. We show the relationship between value-representability and the unique existence of generic update operations. The consistency problem is dealt with in Section 1.6. We outline an operational approach based on the computation of greatest consistent specializations (GCSs). Since the algorithm used allows the problem to be reduced to basic update operations, we describe the GCSs of these operations. We summarize our results and describe some open problems in Section 1.7.

1.2 A Motivating Example

In this section we give a completely informal introduction to the OODM on the basis of a simple university example. We first introduce types and classes, then show an example of a database instance, i.e. the content of the database at a given point in time. The representation of an instance requires object identifiers. Then we extend the example by introducing user-defined constraints. We shall see that this enables alternative representations without using identifiers, and hence leads to the notion of value-representability. Finally, we indicate the definition of methods as a means to model database dynamics. For the sake of simplicity we only describe a generic update method that can be generated by the system.

As already said in the introduction, we distinguish between values and objects. The main difference is that values identify themselves, whereas objects require an additional external identification mechanism. Types are used to structure values. Thus, let us first give some examples of types.

Example. Basically, every type can be built from a few predefined basic types such as BOOL, NAT, STRING, etc. and predefined type constructors for records, finite sets, lists, unions, etc.

The type definition for PERSONNAME uses both a set constructor {·} and a (tagged) record constructor (·):

Type PERSONNAME
  = ( FirstName : STRING ,
      SecondName : STRING ,
      Titles : { STRING } )
End PERSONNAME

The definition of a type PERSON uses the type PERSONNAME.

Type PERSON
  = ( PersonIdentityNo : NAT ,
      Name : PERSONNAME )
End PERSON

The following defines STUDENT as a subtype of PERSON, i.e. we can naturally project each value of type STUDENT onto a value of type PERSON.

Type STUDENT
  = ( PersonIdentityNo : NAT ,
      StudNo : NAT ,
      Name : PERSONNAME )
End STUDENT

Besides these definitions of types as sets of values we may also define new type constructors as follows, where α is a parameter for this new constructor:

Type MPERSON (α)
  = ( PersonIdentityNo : NAT ,
      Spouse : α )
End MPERSON
□

Next we use these types to build the structural part of an OODM schema. We define a schema as a collection of classes and a class as a variable collection of objects.

Example. Each object in a class has a structure, which combines aspects of values associated with the object and references to other objects. This structure can be based on a type definition as above or itself involve a (nameless) type definition. Moreover, class definitions involve IsA relations in order to model objects in more than one class. We use ∘ to indicate concatenation for record types.

Schema University
  Class PersonC
    Structure PERSON
  End PersonC
  Class MarriedPersonC
    IsA PersonC
    Structure ( PersonIdentityNo : NAT ,
                Spouse : MarriedPersonC )
  End MarriedPersonC
  Class StudentC
    IsA PersonC
    Structure STUDENT ∘
              ( Supervisor : ProfessorC ,
                Major : DepartmentC ,
                Minor : DepartmentC )
  End StudentC
  Class ProfessorC
    IsA PersonC
    Structure ( PersonIdentityNo : NAT ,
                Age : NAT ,
                Salary : NAT ,
                Faculty : DepartmentC )
  End ProfessorC
  Class DepartmentC
    Structure ( DeptName : STRING )
  End DepartmentC
□

In principle, we are now able to describe the content of the database at a given point in time. For such database instances we need a type ID of object identifiers that is used for two purposes: first as a unique and efficient internal identification mechanism for objects, and second for modelling objects in different classes and references to other objects. In this case each class will be associated with a representation type that can be used directly for storing objects.

Example. We use $\mathcal{D}$ as a name for the instance.

\[
\begin{aligned}
\mathcal{D}(\mathit{PersonC}) = \{\; & (i_1, (123, (\text{``John''}, \text{``Denver''}, \{\text{``Professor''}, \text{``Dr''}\}))), \\
 & (i_2, (124, (\text{``Mary''}, \text{``Stuart''}, \{\text{``Dr''}\}))), \\
 & (i_3, (456, (\text{``John''}, \text{``Stuart''}, \{\}))), \\
 & (i_4, (567, (\text{``Laura''}, \text{``James''}, \{\}))), \\
 & (i_5, (987, (\text{``Dave''}, \text{``Ford''}, \{\}))) \;\} \\
\mathcal{D}(\mathit{MarriedPersonC}) = \{\; & (i_1, (123, i_2)), \\
 & (i_2, (124, i_1)) \;\} \\
\mathcal{D}(\mathit{ProfessorC}) = \{\; & (i_1, (123, 48, 8000, i_6)) \;\} \\
\mathcal{D}(\mathit{StudentC}) = \{\; & (i_3, (456, 1023, (\text{``John''}, \text{``Stuart''}, \{\}), i_1, i_6, i_7)), \\
 & (i_4, (567, 2134, (\text{``Laura''}, \text{``James''}, \{\}), i_1, i_6, i_7)) \;\} \\
\mathcal{D}(\mathit{DepartmentC}) = \{\; & (i_6, (\text{``Computer Science''})), \\
 & (i_7, (\text{``Philosophy''})), \\
 & (i_8, (\text{``Music''})) \;\}
\end{aligned}
\]

Note that the following three conditions are satisfied by the instance:

- The object identifiers are unique within a class,
- the IsA relations in the schema give rise to set inclusion relationships for the underlying sets of identifiers (inclusion integrity), and
- the identifiers occurring within an object's value at a place corresponding to a reference always occur as an object identifier in the referenced class (referential integrity).
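The three conditions can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the dictionary encoding of the instance, the tables `ISA` and `REFS`, and all function names are hypothetical and not part of the OODM, and person names are left out of the values to keep the data short.

```python
# Instance D from the example; person names are omitted for brevity.
# Each class content is a set of (identifier, value) pairs.
D = {
    "PersonC": {("i1", (123,)), ("i2", (124,)), ("i3", (456,)),
                ("i4", (567,)), ("i5", (987,))},
    "MarriedPersonC": {("i1", (123, "i2")), ("i2", (124, "i1"))},
    "ProfessorC": {("i1", (123, 48, 8000, "i6"))},
    "StudentC": {("i3", (456, 1023, "i1", "i6", "i7")),
                 ("i4", (567, 2134, "i1", "i6", "i7"))},
    "DepartmentC": {("i6", ("Computer Science",)), ("i7", ("Philosophy",)),
                    ("i8", ("Music",))},
}

# IsA relations of the schema (assumed encoding: class -> superclasses).
ISA = {"MarriedPersonC": ["PersonC"], "StudentC": ["PersonC"],
       "ProfessorC": ["PersonC"]}

# REFS[c] maps a position in the value tuple to the referenced class.
REFS = {"MarriedPersonC": {1: "MarriedPersonC"},
        "ProfessorC": {3: "DepartmentC"},
        "StudentC": {2: "ProfessorC", 3: "DepartmentC", 4: "DepartmentC"}}


def ids(d, c):
    """Set of identifiers occurring in class c."""
    return {i for i, _ in d[c]}


def unique_identifiers(d):
    """No identifier is paired with two different values in one class."""
    return all(len(ids(d, c)) == len(d[c]) for c in d)


def inclusion_integrity(d, isa):
    """Identifiers of a subclass also occur in each of its superclasses."""
    return all(ids(d, c) <= ids(d, sup)
               for c, sups in isa.items() for sup in sups)


def referential_integrity(d, refs):
    """Every referenced identifier exists in the referenced class."""
    return all(v[pos] in ids(d, target)
               for c, positions in refs.items()
               for _, v in d[c]
               for pos, target in positions.items())


print(unique_identifiers(D), inclusion_integrity(D, ISA),
      referential_integrity(D, REFS))
```

Adding, say, a student with an identifier that does not occur in PersonC makes `inclusion_integrity` fail, which is the kind of violation the model is meant to rule out.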

We shall always refer to these conditions as model inherent constraints that must be satisfied by each instance. Other integrity constraints can be defined by the user and added to the schema in order to capture more application semantics, as shown in the next example.

Example. First let us express that there are no two persons with the same PersonIdentityNo, no two students with the same StudNo and no two departments with the same name. In order to formulate this, we use $x_P$, $x_S$ and $x_D$ to refer to the content of the classes PersonC, StudentC and DepartmentC, and let $c_P : \mathit{PERSON} \to (\mathit{PersonIdentityNo} : \mathit{NAT})$ and $c_S : \mathit{STUDENT} \circ \mathit{ID}^3 \to (\mathit{StudNo} : \mathit{NAT})$ be the functions that arise from the natural projection to the components PersonIdentityNo and StudNo in PERSON and STUDENT respectively. This gives the following uniqueness constraints.

\[
\begin{aligned}
& \forall i, j :: \mathit{ID} .\, \forall v, w :: \mathit{PERSON} .\; (i, v) \in x_P \wedge (j, w) \in x_P \wedge c_P(v) = c_P(w) \Rightarrow i = j \,. \\
& \forall i, j :: \mathit{ID} .\, \forall v, w :: \mathit{STUDENT} \circ \mathit{ID}^3 .\; (i, v) \in x_S \wedge (j, w) \in x_S \wedge c_S(v) = c_S(w) \Rightarrow i = j \,. \\
& \forall i, j :: \mathit{ID} .\, \forall v, w :: (\mathit{DeptName} : \mathit{STRING}) .\; (i, v) \in x_D \wedge (j, w) \in x_D \wedge v = w \Rightarrow i = j \,.
\end{aligned}
\tag{1.1}
\]

Let us further assume that the salary of a professor is determined by his/her age. For this purpose, let $\mathit{Age}, \mathit{Salary} : T_{\mathit{Prof}} \to \mathit{NAT}$ be the natural projections to the Age- and Salary-values respectively. Then we have the following functional constraint on the class ProfessorC:

\[
\forall i, j :: \mathit{ID} .\, \forall v, w :: T_{\mathit{Prof}} .\; (i, v) \in x_{\mathit{Prof}} \wedge (j, w) \in x_{\mathit{Prof}} \wedge \mathit{Age}(v) = \mathit{Age}(w) \Rightarrow \mathit{Salary}(v) = \mathit{Salary}(w) \,.
\tag{1.2}
\]

Next assume that we want to guarantee that the spouse of a person's spouse is the person itself, which gives (with the abbreviations understood) the formula

\[
\forall i, j :: \mathit{ID} .\, \forall v, w :: T_{\mathit{MP}} .\; (i, v) \in x_{\mathit{MP}} \wedge (j, w) \in x_{\mathit{MP}} \wedge \mathit{Spouse}(v) = j \Rightarrow \mathit{Spouse}(w) = i \,.
\tag{1.3}
\]

Note that all these constraints are also satisfied by the instance above. □

Now that we have added uniqueness constraints, the object identifiers used in instances correspond one-to-one to values of some types associated with the classes. These are the so-called value identification types $V_C$. Hence we could remove identifiers and represent the same information in a purely value-based fashion. In our example the value representation type for the class PersonC is simply PERSON, but for the class MarriedPersonC we need the recursive type

\[
V_{\mathit{MP}} = \mathit{PERSON} \circ (\mathit{Spouse} : V_{\mathit{MP}})
\]

with values that are rational trees [45, 47].

So far only structural aspects (types, classes, constraints) have been considered. Let us now add methods to classes in order to model the dynamics of the database. In the OODM methods will be modelled in a simple procedural style.

Example. Let us describe an insert-method for the class PersonC.

insert_PersonC (in: P :: PERSON, out: I :: ID) =
  IF ∃ O ∈ PersonC . value(O) = P
  THEN I := ident(O)
  ELSE I := NewId ;
       PersonC := PersonC ∪ { ( I , P ) }
  ENDIF

For an insertion into the class MarriedPersonC we need a more complex input type V recursively defined as

V = PERSON ∘ (V ∪ ID)

For each P :: V let f(P) :: PERSON be the projection onto PERSON corresponding to the subtype relation between V and PERSON. Then we have

insert_MarriedPersonC (in: P :: V, out: I :: ID) =
  I := insert_PersonC (f(P)) ;
  IF ∀ O ∈ MarriedPersonC . ident(O) ≠ I
  THEN P' := substitute(I, P, Spouse(P)) ;
       IF P' :: ID
       THEN J := P'
       ELSE J := insert_MarriedPersonC (P')
       ENDIF ;
       MarriedPersonC := MarriedPersonC ∪ { ( I , f(P) ∘ (J) ) }
  ENDIF

We used the global method NewId to denote the selection of a new identifier. The expression substitute(I, P, T) denotes the result of substituting the value I for P in the expression T. Later we shall use a more abstract syntax oriented toward guarded commands [20, 41, 46]. □
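The insert-method for PersonC can be sketched in Python as follows. This is a minimal illustration, not the paper's formal semantics: the names `new_id` and `insert_person` are hypothetical, the class content is assumed to be a mutable set of (identifier, value) pairs, and a simple counter stands in for NewId.

```python
import itertools

_counter = itertools.count(1)


def new_id():
    """Stand-in for the global method NewId: returns a fresh identifier."""
    return f"i{next(_counter)}"


def insert_person(person_c, p):
    """Insert the value p into person_c, a set of (identifier, value) pairs.

    If an object with value p already exists, its identifier is returned and
    the class is left unchanged; otherwise a new object is created. Unlike
    the assignment PersonC := PersonC ∪ {(I, P)}, this sketch updates the
    set in place.
    """
    for ident, value in person_c:
        if value == p:
            return ident
    i = new_id()
    person_c.add((i, p))
    return i


person_c = set()
john = (123, ("John", "Denver", frozenset({"Professor", "Dr"})))
first = insert_person(person_c, john)
second = insert_person(person_c, john)  # same value, so same identifier
```

Inserting the same value twice yields the same identifier and leaves a single object in the class, which is why the user never needs to see identifiers.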

Later we shall see that methods as described in this example are canonical and can be automatically derived from the schema. The corresponding generic update methods look quite similar, with the only difference that there is no output. Such generic update methods only exist for value-representable classes, in which case, however, they enforce integrity with respect to the model inherent constraints. However, generic update methods need not be consistent with respect to the user-defined constraints. To achieve this, we have to apply the GCS algorithm to user-defined methods.

In the following sections we formally define the concepts above and prove the main results on value representation, generic updates and integrity enforcement.

1.3 A Core Object Oriented Datamodel

In this section we present a slightly modified version of the object oriented datamodel (OODM) of [45, 47, 49]. We observe that an object in the real world always has an identity. Therefore, abstract (i.e. system-provided) object identifiers are introduced to capture identity. However, neither the real world object that was the basis of the abstraction nor the abstract identifier can be used for the identification of an object.

In contrast to existing object oriented datamodels [1, 3, 4, 6, 7, 8, 9, 16, 17, 26, 36, 37, 42, 43, 54] an object is not coupled with a unique type. Instead, we observe that real world objects can have different aspects that may change over time. Therefore, a primary decision was taken to let an object be associated with more than one type and to let these types even change during the object's lifetime. The same applies to references to other objects.

In the following let $N_P$, $N_T$, $N_C$, $N_R$, $N_F$, $N_M$ and $V$ denote arbitrary pairwise disjoint, denumerable sets representing parameter-, type-, class-, reference-, function-, method- and variable-names respectively.

1.3.1 A Simple Type System<br />

Relational approaches to data modelling are called value-oriented since in these models real world entities are completely represented by their values. In the object-oriented approach we distinguish between objects and values. Values can be grouped into types. In general, a type may be regarded as an immutable set of values of a uniform structure together with operations defined on such values. Subtyping is used to relate values in different types.

In [12, 47, 49] algebraic type specifications as in [21, 23] have been used to allow open type systems. For the sake of simplicity we deviate here from this approach and follow the more classical view of [14, 15, 45], using a type system that consists of some basic types such as BOOL, NATURAL, INTEGER, STRING, etc., type constructors for records, finite sets, bags, lists, etc. and a subtyping relation. Moreover, we assume the existence of recursive types, i.e. types defined by (a system of) domain equations. In principle we could use one of the type systems defined in [4, 5, 14, 15, 19, 24, 38]. In addition we suppose the existence of an abstract identifier type ID in $T$ without any non-trivial supertype. Arbitrary types can then be defined by nesting. A type $T$ without occurrence of ID will be called a value-type.

We shall proceed by giving a more formal definition of types.

Definition 1.1. (i) A base type is either BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$.

(ii) Let $a_i \in N_F$ and $\alpha_i \in N_P$ ($i = 1, \dots, n$). A type constructor is either $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $\alpha \cup \beta$ (union).

(iii) A type $t$ is either a base type, a type constructor, a generalized constructor that results from replacing some parameters in a type constructor by types, or a recursive type defined by an equation $t = \{\alpha / t\}.t'$, where $t'$ is a generalized constructor and one of its parameters $\alpha$ is replaced by $t \in N_T$.

In the latter two cases the remaining parameters of the type constructor together with the parameters of the replacing types yield the parameters $\alpha_1, \dots, \alpha_n$ of $t$.

(iv) A type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff there is no occurrence of ID in $t$.

(v) A type form consists of a type name $t \in N_T$ and a type $t'$ with possibly some of its parameters replaced by type names.

(vi) A type specification $T$ is a finite collection of type forms $t_1, \dots, t_n$ such that the only type names occurring herein are the names of $t_1, \dots, t_n$.

The semantics of such types as sets of values is defined as usual. Moreover, we assume the standard operators on base types and on records, sets, bags, etc. We omit the details here.

If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence relation

\[
o : t \times t' \to \mathit{BOOL} \,.
\]

Finally, we introduce subtypes. For a more detailed introduction to types see either [14] or [49].

Definition 1.2. (i) A subtype relation $\le$ on types is given by the following rules:

(a) Every type $t$ is its own subtype and a subtype of $\bot$.

(b) $\mathit{NAT} \le \mathit{INT} \le \mathit{FLOAT}$.

(c) $(\dots, a_{i-1} : \alpha_{i-1}, a_i : \alpha_i, a_{i+1} : \alpha_{i+1}, \dots) \le (\dots, a_{i-1} : \alpha'_{i-1}, a_{i+1} : \alpha'_{i+1}, \dots)$ whenever $\alpha_j \le \alpha'_j$.

(d) $\{\alpha\} \le \{\beta\}$, $[\alpha] \le [\beta]$ and $\langle\alpha\rangle \le \langle\beta\rangle$ iff $\alpha \le \beta$.

(e) $\{\alpha\} \le \langle\alpha\rangle$ and $[\alpha] \le \langle\alpha\rangle$.

(f) $\alpha \le \alpha \cup \beta$.

(ii) A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by (a)-(f) above.
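Rules (a)-(f) translate almost directly into a recursive check. The following Python sketch is an illustration only: the tuple encoding of types, the name `is_subtype` and the string `"BOTTOM"` for the trivial type are assumptions, parameters and recursive types are not handled, and the rules are applied transitively (for example, a set of NAT is accepted as a subtype of a bag of INT), which the definition suggests but does not spell out.

```python
NUMERIC = ["NAT", "INT", "FLOAT"]


def is_subtype(s, t):
    """Check s <= t for types encoded as strings (base types) or tuples:
    ("record", {label: type}), ("set", t), ("list", t), ("bag", t),
    ("union", t1, t2, ...)."""
    # (a) every type is its own subtype and a subtype of the trivial type
    if s == t or t == "BOTTOM":
        return True
    # (b) NAT <= INT <= FLOAT
    if s in NUMERIC and t in NUMERIC:
        return NUMERIC.index(s) <= NUMERIC.index(t)
    # (f) a type is a subtype of any union containing a supertype of it
    if isinstance(t, tuple) and t[0] == "union":
        return any(is_subtype(s, alt) for alt in t[1:])
    if isinstance(s, str) or isinstance(t, str):
        return False
    ks, kt = s[0], t[0]
    # (c) a record may drop fields; remaining fields compare componentwise
    if ks == "record" and kt == "record":
        fs, ft = s[1], t[1]
        return all(a in fs and is_subtype(fs[a], ft[a]) for a in ft)
    # (d) same constructor, (e) sets and lists are subtypes of bags
    if kt in ("set", "list", "bag") and (ks == kt or
                                         (kt == "bag" and ks in ("set", "list"))):
        return is_subtype(s[1], t[1])
    return False


PERSONNAME = ("record", {"FirstName": "STRING", "SecondName": "STRING",
                         "Titles": ("set", "STRING")})
PERSON = ("record", {"PersonIdentityNo": "NAT", "Name": PERSONNAME})
STUDENT = ("record", {"PersonIdentityNo": "NAT", "StudNo": "NAT",
                      "Name": PERSONNAME})

print(is_subtype(STUDENT, PERSON), is_subtype(PERSON, STUDENT))
```

With the types of Section 1.2 this confirms that STUDENT is a subtype of PERSON but not the other way round, since PERSON lacks the StudNo component.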



1.3.2 The Class Concept as a Structural Primitive<br />

The class concept provides the grouping of objects having the same structure, which uniformly combines aspects of object values and references. Moreover, generic operations on objects such as object creation, deletion and update of its values and references are associated with classes, provided these operations can be defined unambiguously. Objects can belong to different classes, which guarantees each object of our abstract object model to be captured by the collection of possible classes. Just as values are only defined via types, objects can only be defined via classes.

Each object in a class consists of an identifier, a collection of values and references to objects in other classes. Identifiers can be represented using the unique identifier type ID. Values and references can be combined into a representation type, where each occurrence of ID denotes references to some other classes. Therefore, we may define the structure of a class using parameterized types.

Definition 1.3. (i) Let $t$ be a value type with parameters $\alpha_1, \dots, \alpha_n$. For distinct reference names $r_1, \dots, r_n \in N_R$ and class names $C_1, \dots, C_n \in N_C$ the expression derived from $t$ by replacing each $\alpha_i$ in $t$ by $r_i : C_i$ for $i = 1, \dots, n$ is called a structure expression.

(ii) A structural class consists of a class name $C \in N_C$, a structure expression $S$ and a set of class names $D_1, \dots, D_m \in N_C$ (in the following called the set of superclasses). We call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$; the type $U_C = (\mathit{ident} : \mathit{ID}, \mathit{value} : T_C)$ is called the class type of $C$.

(iii) A (structural) schema $\mathcal{S}$ is a finite collection of structural classes $C_1, \dots, C_n$ closed under references and superclasses.

(iv) An instance $\mathcal{D}$ of a structural schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{U_C\}$ such that the following conditions are satisfied:

uniqueness of identifiers: For every class $C$ we have

\[
\forall i :: \mathit{ID} .\, \forall v, w :: T_C .\; (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w \,.
\tag{1.4}
\]

inclusion integrity: For a subclass $C$ of $C'$ we have

\[
\forall i :: \mathit{ID} .\; i \in \mathit{dom}(\mathcal{D}(C)) \Rightarrow i \in \mathit{dom}(\mathcal{D}(C')) \,.
\tag{1.5}
\]

Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have

\[
\forall i :: \mathit{ID} .\, \forall v :: T_C .\; (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') \,.
\tag{1.6}
\]

referential integrity: For each reference $r$ from $C$ to $C'$ with corresponding occurrence relation $o_r$ we have

\[
\forall i, j :: \mathit{ID} .\, \forall v :: T_C .\; (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in \mathit{dom}(\mathcal{D}(C')) \,.
\tag{1.7}
\]
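The step from a structure expression to the representation type and the class type in Definition 1.3(ii) is a purely syntactic substitution. The sketch below illustrates it in Python under an assumed encoding: types are nested tuples, a reference to a class is written as the hypothetical tag `("ref", class_name)` in place of $r_i : C_i$, and the record label plays the role of the reference name.

```python
def representation_type(structure):
    """Replace every reference in a structure expression by the type ID."""
    if isinstance(structure, str):          # base type or type name
        return structure
    kind = structure[0]
    if kind == "ref":
        return "ID"
    if kind == "record":
        return ("record", {a: representation_type(t)
                           for a, t in structure[1].items()})
    # set, list, bag, union: rebuild with converted arguments
    return (kind,) + tuple(representation_type(t) for t in structure[1:])


def class_type(structure):
    """Class type U_C = (ident : ID, value : T_C)."""
    return ("record", {"ident": "ID", "value": representation_type(structure)})


def referenced_classes(structure):
    """Names of all classes referenced by a structure expression."""
    if isinstance(structure, str):
        return set()
    if structure[0] == "ref":
        return {structure[1]}
    parts = structure[1].values() if structure[0] == "record" else structure[1:]
    return set().union(*(referenced_classes(t) for t in parts))


# Structure expression of ProfessorC from the University schema.
PROFESSOR_STRUCTURE = ("record", {"PersonIdentityNo": "NAT", "Age": "NAT",
                                  "Salary": "NAT",
                                  "Faculty": ("ref", "DepartmentC")})

T_PROF = representation_type(PROFESSOR_STRUCTURE)
```

For ProfessorC the representation type has `Faculty : ID`, matching the stored value (123, 48, 8000, i_6) of the example instance. `referenced_classes` yields the classes a schema must contain in order to be closed under references.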

1.3.3 User Defined Integrity Constraints

Let us now extend the notion of schema by the introduction of explicit user-defined integrity constraints. First we define the notion of constraint schema in general, then we restrict ourselves to distinguished classes of constraints that arise as generalizations of constraints known from the relational model, e.g. functional and key constraints, inclusion and exclusion constraints [48, 52].



Definition 1.4. Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a structural schema.

(i) An integrity constraint on $\mathcal{S}$ is a formula $I$ over the underlying type system with free variables $\mathit{fr}(I) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$, where each $x_{C_i}$ is a variable of type $\{U_{C_i}\}$. We call $x_{C_i}$ the class variable of $C_i$.

(ii) A constrained schema consists of a structural schema $\mathcal{S}$ and a finite set of integrity constraints on $\mathcal{S}$.

(iii) An instance of a constrained schema is an instance of the underlying structural schema. An instance $\mathcal{D}$ is said to be consistent with respect to the integrity constraint $I$ iff substituting $\mathcal{D}(C)$ for each class variable $x_C$ in $I$ evaluates to true, when interpreted in the usual way.

Note that the conditions for an instance in Definition 1.3 correspond to model inherent integrity constraints. We refer to these constraints as implicit identifier, IsA and referential constraints on the schema $\mathcal{S}$. Let us now define some distinguished classes of user-defined constraints.

Definition 1.5. Let $C, C_1, C_2$ be classes in a schema $\mathcal{S}$ and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$) and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.

(i) A functional constraint on $C$ is a constraint of the form

\[
\forall i, i' :: \mathit{ID} .\, \forall v, v' :: T_C .\; c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') \,.
\tag{1.8}
\]

(ii) A uniqueness constraint on $C$ is a constraint of the form

\[
\forall i, i' :: \mathit{ID} .\, \forall v, v' :: T_C .\; c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow i = i' \,.
\tag{1.9}
\]

A uniqueness constraint on $C$ is called trivial iff $T_C = T_1$ and $c^1 = \mathit{id}$ hold.

(iii) An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form

\[
\forall t :: T .\; \exists i_1 :: \mathit{ID}, v_1 :: T_{C_1} .\; (i_1, v_1) \in x_{C_1} \wedge c_1(v_1) = t \Rightarrow
\exists i_2 :: \mathit{ID}, v_2 :: T_{C_2} .\; (i_2, v_2) \in x_{C_2} \wedge c_2(v_2) = t \,.
\tag{1.10}
\]

(iv) An exclusion constraint on $C_1$, $C_2$ is a constraint of the form

\[
\forall i_1, i_2 :: \mathit{ID} .\, \forall v_1 :: T_{C_1} .\, \forall v_2 :: T_{C_2} .\; (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \ne c_2(v_2) \,.
\tag{1.11}
\]
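The four constraint classes can be read as executable predicates over class contents. The following Python sketch is an illustration under assumptions: class variables are sets of (identifier, value) pairs, the subtype functions are passed as ordinary Python functions, the function names are hypothetical, and the sample data reuses the University instance with person names omitted. The inclusion and exclusion constraints in the usage lines are made-up examples, not constraints stated for the University schema.

```python
def functional(x_c, c1, c2):
    """(1.8): equal c1-projections imply equal c2-projections."""
    return all(c2(v) == c2(w)
               for _, v in x_c for _, w in x_c if c1(v) == c1(w))


def uniqueness(x_c, c1):
    """(1.9): equal c1-projections imply equal identifiers."""
    return all(i == j
               for i, v in x_c for j, w in x_c if c1(v) == c1(w))


def inclusion(x_c1, c1, x_c2, c2):
    """(1.10): every c1-projection of C1 occurs as a c2-projection of C2."""
    return {c1(v) for _, v in x_c1} <= {c2(v) for _, v in x_c2}


def exclusion(x_c1, c1, x_c2, c2):
    """(1.11): no c1-projection of C1 equals a c2-projection of C2."""
    return {c1(v) for _, v in x_c1}.isdisjoint({c2(v) for _, v in x_c2})


# Sample class contents (person names omitted for brevity).
x_P = {("i1", (123,)), ("i2", (124,)), ("i3", (456,)),
       ("i4", (567,)), ("i5", (987,))}
x_S = {("i3", (456, 1023, "i1", "i6", "i7")),
       ("i4", (567, 2134, "i1", "i6", "i7"))}
x_Prof = {("i1", (123, 48, 8000, "i6"))}


def pid(v):
    """Projection to PersonIdentityNo (first component in this encoding)."""
    return v[0]


students_unique = uniqueness(x_S, lambda v: v[1])              # StudNo is a key
age_fixes_salary = functional(x_Prof, lambda v: v[1], lambda v: v[2])
students_are_persons = inclusion(x_S, pid, x_P, pid)           # hypothetical
no_student_professor = exclusion(x_S, pid, x_Prof, pid)        # hypothetical
```

All four checks hold on the sample data. Adding a second professor of age 48 with a different salary would make the functional constraint fail.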

1.3.4 Methods as a Basis for Behaviour Modelling

So far, only static aspects have been considered. A structural schema is simply a collection of data structures called classes. Let us now turn to adding dynamics to this picture. As required in the object oriented approach, operations will be associated with classes. This gives us the notion of a method.

We shall distinguish between visible and hidden methods, i.e. between those methods that can be invoked by the user and the others. This is not intended to define an interface of a class, since for the moment all methods of a class including the hidden ones can be accessed by other methods. There are two reasons for such a weak hiding concept.



- Visible methods serve as a means to specify (nested) transactions. In order to build sequences of database instances we only regard these transactions, assuming a linear invocation order on them.
- Hidden methods can be used to handle identifiers. Since these identifiers do not have any meaning for the user, they must not occur within the input or output of a transaction.

Definition 1.6. Let $\mathcal{S}$ be a structural schema. Let $T_1, \dots, T_n, T'_1, \dots, T'_m$ be types, $M \in N_M$ and $\iota_1, \dots, \iota_n, o_1, \dots, o_m \in V$.

(i) A method signature consists of a method name $M$, a set of input-parameter / input-type pairs $\iota_i :: T_i$ and a set of output-parameter / output-type pairs $o_j :: T'_j$. We write

\[
o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n) \,.
\]

(ii) Let $C$ be some structural class in $\mathcal{S}$. A method $M$ on $C$ consists of a method signature with name $M$ and a body that is recursively built from the following constructs:

(a) assignment $x := E$, where $x$ is either the class variable $x_C$ or a local variable within $S$, and $E$ is a term of the same type as $x$,

(b) skip, fail, loop,

(c) sequential composition $S_1 ; S_2$, choice $S_1 \,\Box\, S_2$, projection $x :: T \mid S$, guard $P \to S$, restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type $T$, and

(d) instantiation $x'_1, \dots, x'_i \leftarrow C' : S'(E'_1, \dots, E'_j)$, where $S'$ is a method on class $C'$ with input-parameters $\iota'_1, \dots, \iota'_j$ and output-parameters $o'_1, \dots, o'_i$, such that the variables $o'_f$, $x'_f$ have the same type and the term $E'_g$ has the same type as the variable $\iota'_g$.

(iii) A method $M$ on a class $C$ with signature $o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n)$ is called value-defined iff all $T_i$ ($i = 1, \dots, n$) and $T'_j$ ($j = 1, \dots, m$) are proper value types.

As already mentioned, the OODM distinguishes between transactions, i.e. methods visible to the user, and hidden methods. We require each transaction to be value-defined.

Subclasses inherit the methods of their superclasses, but overriding is allowed as long as the new method is a specialization of all its corresponding methods in its superclasses. Overriding becomes mandatory in the case of multiple inheritance with name conflicts. A method that overrides a hidden method on some superclass must also be hidden.

Definition 1.7. Let $\mathcal{S}$ be a structural schema and $C \in \mathcal{S}$ be a structural class as in Definition 1.3 with superclasses $D_1, \dots, D_k$. A method specification on $C$ consists of two sets of methods $S = \{M_1, \dots, M_n\}$ (called transactions) and $H = \{M'_1, \dots, M'_m\}$ (called hidden methods) such that the following properties hold:

(i) Each $M_i$ ($i = 1, \dots, n$) is value-defined.

(ii) For each transaction $M_l$ on some superclass $D_l$ there exists some $i \in \{1, \dots, n\}$ such that $M_i$ specializes $M_l$.

(iii) For each hidden method $M_l$ on some superclass $D_l$ there exists some $j \in \{1, \dots, m\}$ such that $M'_j$ specializes $M_l$.



Let us briefly discuss what specialization means for the input- and output-types. Sometimes it is required that the input-type for an overriding method should be a subtype of the original one (covariance rule), sometimes the opposite (contravariance rule) is required. The first rule applies e.g. if we want to override an insert method. In this case the inherited method has no effect on the subclass, but simply calls the "old" method. The second rule applies if input-types required on the superclass can be omitted on the subclass. Both rules are captured by the formal notion of specialization. We omit the details [44]. Now we are prepared to generalize the definition of classes and schemata.

Definition 1.8. (i) A class consists of a class name $C \in N_C$, a structure expression $S$, a set of class names $D_1, \dots, D_m \in N_C$ (called the set of superclasses) and a method specification $(S = \{M_1, \dots, M_n\}, H = \{M'_1, \dots, M'_{n'}\})$ on $C$.

(ii) A (behavioural) schema $\mathcal{S}$ is a finite collection of classes $\{C_1, \dots, C_n\}$ closed under references, superclasses and method call, together with a collection of integrity constraints $I_1, \dots, I_n$ on $\mathcal{S}$.

(iii) An instance $\mathcal{D}$ of a behavioural schema $\mathcal{S}$ is an instance of the underlying structural schema. A database history on $\mathcal{S}$ is a sequence $\mathcal{D}_0, \mathcal{D}_1, \dots$ of instances such that each transition from $\mathcal{D}_{i-1}$ to $\mathcal{D}_i$ is due to some transaction on some class $C \in \mathcal{S}$.

Note the relation between database histories used here and the work on the semantics of object bases in [22, 28].

1.3.5 Queries and Views<br />

Roughly speaking, querying a database is an operation on the database that does not change its state. The emphasis of a query is on the output. While such a general view of queries can be subsumed by transactions, hence by methods in the OODM, query languages are in particular intended to be declarative in order to support ad-hoc querying of a database without the need to write new transactions [8].

Querying a relational database can be expressed by terms in relational algebra. This view can be easily generalized to the OODM using its type system. Therefore, terms over such types occur naturally. Moreover, type specifications are based on other type specifications via constructors, selectors and functions. Hence, $\mathcal{T}$ allows arbitrary terms involving more than one class variable $x_C$ to be built. Then a query turns out to be represented by a term $t$ over some type $T$ such that the free variables of $t$ are all class variables. This approach is in accordance with the algebraic approach in [12] and with so-called universal traversal combinators [25].

In relational algebra a view may be regarded simply as a stored query (or derived relation). We shall try to generalize this view to the OODM as well.

However, things change dramatically when object identifiers come into play [13], since we now have to distinguish between queries that result in values and those that result in (collections of) objects. Therefore we distinguish in the OODM between value queries and general access expressions.

A value query on a schema $\mathcal{S}$ can then be represented by a term $t$ of some value type $T$ with $fr(t) \subseteq \{x_C \mid C \in \mathcal{S}\}$. Ad-hoc querying of a database should then be restricted to value queries. This is no loss of generality, because for any type $T$ in $\mathcal{T}$ involving identifiers there exists a corresponding type $T'$ allowing multiple occurrences. Take e.g. a class $C$. If we want to get all the objects in that class, no matter whether they have the same values or not, the corresponding term would be $x_C$. This is not a value query, but if $T_C$ is a value type, we may take $T' = \langle T_C \rangle$ and the natural projection given by the subtype functions

$$\{(ident : ID,\ value : \alpha)\} \to \langle (ident : ID,\ value : \alpha) \rangle \to \langle \alpha \rangle .$$
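The projection from a set of objects to a bag of values can be illustrated with a small sketch. The class instance `persons` and the helper `value_bag` are hypothetical names introduced only for this illustration; a `Counter` stands in for the bag type $\langle \alpha \rangle$.

```python
from collections import Counter

# Hypothetical instance of a class C: a set of (ident, value) pairs.
# Identifiers are unique, values may repeat.
persons = {(1, "Smith"), (2, "Jones"), (3, "Smith")}


def value_bag(instance):
    """Project a set of (ident, value) pairs to the bag (multiset) of values.

    Dropping the identifier must not collapse equal values, which is why
    the result is a bag rather than a set.
    """
    return Counter(value for _ident, value in instance)


bag = value_bag(persons)
```

The bag keeps one occurrence per object, so the number of objects is still recoverable from the value query although no identifier is exposed.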

In the case of arbitrary access expressions another problem occurs [13]. So far, we can only build terms $t$ that involve identifiers already existing in the database. Such queries are therefore called object preserving. If we want the result of a query to represent "new" objects, i.e. if we want to have object generating queries, we have to apply a mechanism to create new object identifiers. This can be achieved by object creating functions on the type $ID$ with arity $ID \times \cdots \times ID \to ID$ [32, 35].
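As a rough sketch of what an object creating function of arity $ID \times \cdots \times ID \to ID$ might look like, consider the following. The class name `ObjectCreator` is hypothetical, and the choice to return the same new identifier for the same argument tuple (memoisation) is an assumption of this sketch, not something stated above.

```python
import itertools


class ObjectCreator:
    """Maps tuples of existing identifiers to fresh identifiers (sketch)."""

    def __init__(self, existing_ids):
        # Fresh identifiers start above every identifier already in use.
        self._fresh = itertools.count(max(existing_ids, default=0) + 1)
        self._memo = {}

    def __call__(self, *ids):
        # Assumption: equal argument tuples denote the same generated object.
        if ids not in self._memo:
            self._memo[ids] = next(self._fresh)
        return self._memo[ids]


existing = {1, 2, 3}
create = ObjectCreator(existing)
pair_12 = create(1, 2)
pair_21 = create(2, 1)
```

Under these assumptions a query can pair up existing objects and obtain an identifier for each pair that does not clash with any identifier already in the database.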

The idea that a view is a stored query then carries over easily. However, the structure of a view should be compatible with the structure of the schema, i.e. each view may be regarded as a derived class. Summarizing, we get the following formal definition.

Definition 1.9. Let $\mathcal{S} = \{C_1, \ldots, C_n\}$ be some schema.

(i) A value query on $\mathcal{S}$ is a term $t$ over some proper value type $T$ with $fr(t) \subseteq \{x_{C_1}, \ldots, x_{C_n}\}$.

(ii) An access expression on $\mathcal{S}$ is a term $t$ over some proper type $T$ with $fr(t) \subseteq \{x_{C_1}, \ldots, x_{C_n}\}$.

(iii) A view on $\mathcal{S}$ consists of a view name $v \in N_C$ such that there is no class $C \in \mathcal{S}$ with this name, a structure expression $S(v)$ containing references to classes in $\mathcal{S}$ or to views on $\mathcal{S}$, and a defining access expression $t(v)$ of type $\{U_v\}$, where $T_v$ is the representation type corresponding to $S(v)$.

(iv) A (complete) schema is a behavioural schema together with a finite set of views. An instance of a complete schema is an instance of the underlying structural schema such that for every view $v$ replacing each class variable $x_C$ in the access expressions of $v$ yields a value of type $\{U_v\}$ satisfying the uniqueness property for identifiers.

1.4 The Object Identification Problem

From an object oriented point of view a database may be considered as a huge collection of objects of arbitrarily complex structure. Hence the problem arises of how to uniquely identify and retrieve objects in such collections.

Each object in a database is an abstraction of a real world object that has a unique identity. The representation of such objects in the OODM uses an abstract identifier $I$ of type $ID$ to encode this identity. Such an identifier may be considered as being immutable. However, from a systems oriented view, permutations or collapses of identifiers without changing anything else should not affect the behaviour of the database.

For the user the abstract identifier of an object has no meaning. Therefore, a different approach to the identification problem is required. We show that the unique identification of an object in a class leads to the notion of (weak) value-identifiability, where weak value-representability can also be used to capture objects that do not exist on their own, but depend on other objects. This is related to weak entities in entity-relationship models [62]. The stronger notion of value-representability is required for the unique definition of generic update operations.

1.4.1 The Notion of Value-Representability

According to our definitions two objects in a class $C$ are identical iff they have the same identifier. By the use of constraints, especially uniqueness constraints, we could restrict this notion of equality.

Let us address the characterization of those classes whose objects are completely representable by values, i.e. we could drop the object identifiers and replace references by values of the referred object. We shall see in Section 1.5 that in the case of value-representable classes we are able to preserve an important advantage of relational databases, i.e. the existence of structurally determined update operations.

Definition 1.10. Let $C$ be a class in a schema $\mathcal{S}$ with representation type $T_C$.

(i) $C$ is called value-identifiable iff there exists a proper value type $I_C$ such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a function $c : T_C \to I_C$ such that the uniqueness constraint on $C$ defined by $c$ holds for $\mathcal{D}$.

(ii) $C$ is called value-representable iff there exists a proper value type $V_C$ such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a function $c : T_C \to V_C$ such that for $\mathcal{D}$

(a) the uniqueness constraint on $C$ defined by $c$ holds and

(b) for each uniqueness constraint on $C$ defined by some function $c' : T_C \to V'_C$ with proper value type $V'_C$ there exists a function $c'' : V_C \to V'_C$ that is unique on $c(codom(\mathcal{D}(C)))$ with $c' = c'' \circ c$.

It is easy to see that each value-representable class $C$ is also value-identifiable. Moreover, the value-representation type $V_C$ in Definition 1.10 is unique up to isomorphism.
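To make the role of the function $c$ concrete, the following sketch checks a uniqueness constraint on one finite instance. It assumes that a uniqueness constraint defined by $c$ means: whenever two objects have the same image under $c$, they have the same identifier. The instance `employees` and the functions `by_name` and `by_dept` are hypothetical.

```python
def satisfies_uniqueness(instance, c):
    """Check the uniqueness constraint defined by c on a finite instance.

    instance is a set of (ident, value) pairs; c maps values to hashable
    keys. The constraint holds iff equal keys imply equal identifiers.
    """
    owner = {}
    for ident, value in instance:
        key = c(value)
        if key in owner and owner[key] != ident:
            return False
        owner[key] = ident
    return True


# Hypothetical instance: values are (name, department) pairs.
employees = {(1, ("Ann", "R&D")), (2, ("Bob", "R&D")), (3, ("Cid", "Ops"))}


def by_name(value):
    return value[0]


def by_dept(value):
    return value[1]
```

Here `by_name` identifies each employee, whereas `by_dept` does not, since two objects share the department. Note that the definition quantifies over all instances of the schema, while the sketch inspects only one.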

1.4.2 Value-Representability in the Case of Acyclic Reference Graphs

Since value-representability is defined by the existence of a certain proper value type, it is hard to decide whether an arbitrary class is value-representable or not. In the case of simple classes the problem is easier, since we only have to deal with uniqueness and value constraints. In this case it is helpful to analyse the reference structure of the class. Hence the following graph-theoretic definitions.

Definition 1.11. The reference graph of a class $C$ in a schema $\mathcal{S}$ is the smallest labelled graph $G_{ref} = (V, E, l)$ satisfying:

(i) There exists a vertex $v_C \in V$ with $l(v_C) = \{t, C\}$, where $t$ is the top-level type in the structure expression $S$ of $C$.

(ii) For each proper occurrence of a type $t \neq ID$ in $T_C$ there exists a unique vertex $v_t \in V$ with $l(v_t) = \{t\}$.

(iii) For each reference $r_i : C_i$ in the structure expression $S$ of $C$ the reference graph $G^i_{ref}$ is a subgraph of $G_{ref}$.

(iv) For each vertex $v_t$ or $v_C$ corresponding to $t(x_1, \ldots, x_n)$ in $S$ there exist unique edges $e^{(i)}_t$ from $v_t$ or $v_C$ respectively to $v_{t_i}$ in case $x_i$ is the type $t_i$, or to $v_{C_i}$ in case $x_i$ is the reference $r_i : C_i$. In the first case $l(e^{(i)}_t) = \{S_i\}$, where $S_i$ is the corresponding selector name; in the latter case the label is $\{S_i, r_i\}$.

Definition 1.12. (i) Let $\mathcal{S} = \{C_1, \ldots, C_n\}$ be a schema. Let $\mathcal{S}' = \{C'_1, \ldots, C'_n\}$ be another schema such that for all $i$ there exists a uniqueness constraint on $C_i$ defined by some $c_i : T_{C_i} \to T_{C'_i}$. Then an identification graph $G_{id}$ of the class $C_i$ is obtained from the reference graph of $C'_i$ by changing each label $C'_j$ to $C_j$.

(ii) The identification graph $G_{id}$ resulting from the use of trivial uniqueness constraints is called the standard identification graph.

Clearly, there need not exist any identification graph, nor does the existence of one identification graph imply the existence of the standard one. However, if the standard identification graph exists, then it is equal to the reference graph.

Proposition 1.13. Let $C$ be a class in a schema $\mathcal{S}$ with acyclic reference graph $G_{ref}$ such that there exist uniqueness constraints for $C$ and each $C_i$ such that $C_i$ occurs as a label in $G_{ref}$. Then $C$ is value-representable.

Proof. We use induction on the maximum length of a path in $G_{ref}$. If there are no references in the structure expression $S$ of $C$, the type $T_C$ is a proper value type. Since there exists a uniqueness constraint on $C$, the identity function $id$ on $T_C$ also defines a uniqueness constraint. Hence $V_C = T_C$ satisfies the requirements of Definition 1.10.

If there are references $r_i : C_i$ in the structure expression $S$ of $C$, then the induction hypothesis holds for each such $C_i$, because $G_{ref}$ is acyclic. Let $V_C$ result from $S$ by replacing each $r_i : C_i$ by $V_{C_i}$. Then $V_C$ satisfies the requirements of Definition 1.10. □
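The inductive construction in the proof can be sketched as a recursive expansion of references. The encoding of structure expressions used here (base types as strings, `("ref", class_name)`, `("set", t)` and `("record", ...)` tuples) and the example schema are assumptions made only for this illustration; the sketch also presupposes, as the proposition does, that every class involved carries a uniqueness constraint.

```python
def value_type(cls, schema, _path=()):
    """Replace each reference r_i : C_i in the structure of cls by V_{C_i}.

    Raises ValueError if the reference structure is cyclic, because the
    simple induction of the proof does not apply in that case.
    """
    if cls in _path:
        raise ValueError(f"cyclic reference through {cls}")

    def expand(t):
        if isinstance(t, str):  # base type such as "STRING"
            return t
        tag = t[0]
        if tag == "ref":  # reference to another class
            return value_type(t[1], schema, _path + (cls,))
        if tag == "set":
            return ("set", expand(t[1]))
        if tag == "record":
            return ("record", tuple((sel, expand(sub)) for sel, sub in t[1]))
        raise ValueError(f"unknown constructor {tag!r}")

    return expand(schema[cls])


# Hypothetical acyclic schema: Emp references Dept.
schema = {
    "Dept": ("record", (("dname", "STRING"),)),
    "Emp": ("record", (("ename", "STRING"), ("works_in", ("ref", "Dept")))),
}
v_emp = value_type("Emp", schema)
```

For the class without references the result equals its own structure, which corresponds to the base case $V_C = T_C$ of the induction.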

Corollary 1.14. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist an acyclic identification graph $G_{id}$ and uniqueness constraints for $C$ and each $C_i$ occurring as a label in $G_{id}$. Then $C$ is value-identifiable.

1.4.3 Computation of Value Representation Types

We want to address the more general case where cyclic references may occur in the schema $\mathcal{S} = \{C_1, \ldots, C_n\}$. In this case a simple induction argument as in the proof of Proposition 1.13 is not applicable. So we take another approach. We define algorithms to compute types $V_C$ and $I_C$ that turn out to be proper value types under certain conditions. In the next subsection we then show that these types are the value representation type and the value identification type required by Definition 1.10.

Algorithm 1.15. Let $F(C_i) = T_i$ provided there exists a uniqueness constraint on $C_i$ defined by $c_i : T_{C_i} \to T_i$, otherwise let $F(C_i)$ be undefined. If $ID$ occurs in some $F(C_i)$ corresponding to $r_j : C_j$ ($j \neq i$), we write $ID_j$.

Then iterate as long as possible using the following rules:

(i) If $F(C_j)$ is a proper value type and $ID_j$ occurs in some $F(C_i)$ ($j \neq i$), then replace this corresponding $ID_j$ in $F(C_i)$ by $F(C_j)$.

(ii) If $ID_i$ occurs in some $F(C_i)$, then let $F(C_i)$ be recursively defined by $F(C_i) == S_i$, where $S_i$ is the result of replacing $ID_i$ in $F(C_i)$ by the type name $F(C_i)$.

This iteration terminates, since there exists only a finite collection of classes. If these rules are no longer applicable, replace each remaining occurrence of $ID_j$ in $F(C_i)$ by the type name $F(C_j)$ provided $F(C_j)$ is defined. □
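A minimal sketch of the iteration in Algorithm 1.15 is given below. The tuple encoding of types (`("id", j)` for $ID_j$, `("name", j)` for the type name $F(C_j)$, plus `set` and `record` constructors), the helper names and the order in which the two rules are tried are all assumptions of this sketch; a type is treated as a proper value type when it contains no `("id", ...)` node.

```python
def record(**fields):
    """Build a record type from selector=type pairs (illustrative encoding)."""
    return ("record", tuple(fields.items()))


def contains_id(t, cls=None):
    """True if t contains an ID occurrence for class cls (or for any class)."""
    if isinstance(t, str) or t[0] == "name":
        return False
    if t[0] == "id":
        return cls is None or t[1] == cls
    if t[0] == "set":
        return contains_id(t[1], cls)
    return any(contains_id(sub, cls) for _sel, sub in t[1])


def substitute(t, cls, replacement):
    """Replace every occurrence of ID_cls in t by replacement."""
    if isinstance(t, str) or t[0] == "name":
        return t
    if t[0] == "id":
        return replacement if t[1] == cls else t
    if t[0] == "set":
        return ("set", substitute(t[1], cls, replacement))
    return ("record", tuple((s, substitute(u, cls, replacement)) for s, u in t[1]))


def compute_types(initial):
    """Apply rules (i) and (ii) until neither applies, then the final step.

    initial maps class names to F(C_i); classes without a uniqueness
    constraint are simply absent (F undefined).
    """
    F = dict(initial)
    changed = True
    while changed:
        changed = False
        for i in F:
            if contains_id(F[i], i):  # rule (ii): self-reference
                F[i] = substitute(F[i], i, ("name", i))
                changed = True
            for j in F:  # rule (i): inline proper value types
                if j != i and contains_id(F[i], j) and not contains_id(F[j]):
                    F[i] = substitute(F[i], j, F[j])
                    changed = True
    for i in F:  # final step: remaining ID_j become type names
        for j in F:
            F[i] = substitute(F[i], j, ("name", j))
    return F


acyclic = compute_types({
    "Dept": record(dname="STRING"),
    "Emp": record(ename="STRING", works_in=("id", "Dept"), boss=("id", "Emp")),
})
mutual = compute_types({
    "A": record(x="STRING", b=("id", "B")),
    "B": record(y="NAT", a=("id", "A")),
})
```

Each rule application removes at least one `("id", ...)` node, so the loop in the sketch terminates. In `mutual` neither rule applies and both references end up as type names, giving mutually recursive types.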

Note that the algorithm computes (mutually) recursive types. Now we give a sufficient condition for the result of Algorithm 1.15 to be a proper value type.

Lemma 1.16. Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a uniqueness constraint for all classes $C_i$ occurring as a label in some identification graph $G_{id}$ of $C$. Let $I_C$ be the type $F(C)$ computed by Algorithm 1.15 with respect to the uniqueness constraints used in the definition of $G_{id}$. Then $I_C$ is a proper value type.

Proof. Suppose $I_C$ were not a proper value type. Then there exists at least one occurrence of $ID$ in $I_C$. This corresponds to a class $C_i$ without uniqueness constraint occurring as a label in $G_{id}$, hence contradicts the assumption of the lemma. □

1.4.4 The Finiteness Property

Let us now address the general case. The basic idea is that there is always only a finite number of objects in a database. Assuming that the database is consistent with respect to inclusion and referential constraints, there cannot exist infinite cyclic references. This will be expressed by the finiteness property. We show that this property allows the computation of value representation types.

Definition 1.17. Let $C$ be a class in a schema $\mathcal{S}$ and let $g_{k,l}$ denote a path in $G_{ref}$ from $v_{C_k}$ to $v_{C_l}$ provided there is a reference $r_l : C_l$ in the structure expression of $C_k$. Then a cycle in $G_{ref}$ is a sequence $g_{0,1} \cdots g_{n-1,n}$ with $C_0 = C_n$ and $C_k \neq C_l$ otherwise.

Note that we use paths instead of edges, because the edges in $G_{ref}$ do not always correspond to references. According to our definition of a class there exists a referential constraint on $C_k$, $C_l$ defined by $o_{k,l} : T_{C_k} \times ID \to BOOL$ corresponding to $g_{k,l}$. Therefore, to each cycle there exists a corresponding sequence of functions $o_{0,1} \cdots o_{n-1,n}$. This can be used as follows to define a function $cyc : ID \times ID \to BOOL$ corresponding to a cycle in $G_{ref}$.

Definition 1.18. Let $C$ be a class in a schema $\mathcal{S}$ and let $g_{0,1} \cdots g_{n-1,n}$ be a cycle in $G_{ref}$. The corresponding cycle relation $cyc : ID \times ID \to BOOL$ is defined by $cyc(i, j) = true$ iff there exists a sequence $i = i_0, i_1, \ldots, i_n = j$ ($n \neq 0$) such that $(i_l, v_l) \in C_l$ and $o_{l,l+1}(v_l, i_{l+1}) = true$ for all $l = 0, \ldots, n-1$.

Given a cycle relation $cyc$, let $cyc^m$ denote the $m$-th power of $cyc$.
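Reading the $m$-th power as $m$-fold relational composition, powers of a finite cycle relation can be computed as in the following sketch. Representing $cyc$ as a set of identifier pairs, and the example relations `chain` and `loop`, are assumptions made for illustration.

```python
def compose(r, s):
    """Relational composition: (i, k) whenever (i, j) in r and (j, k) in s."""
    return {(i, k) for (i, j) in r for (j2, k) in s if j == j2}


def power(r, m):
    """m-th power of the relation r for m >= 1."""
    result = r
    for _ in range(m - 1):
        result = compose(result, r)
    return result


# Hypothetical cycle relations on identifiers 1, 2, 3.
chain = {(1, 2), (2, 3)}   # references that eventually end
loop = {(1, 2), (2, 1)}    # two objects referring to each other
```

For `chain` the powers become empty after two steps, whereas for `loop` they never do.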

Lemma 1.19. Let $C$ be a class in a schema $\mathcal{S}$. Then $C$ satisfies the finiteness property, i.e. for each instance $\mathcal{D}$ of $\mathcal{S}$ and for each cycle in $G_{ref}$ the corresponding cycle relation $cyc$ satisfies

$$\forall i \in dom(C).\ \exists n.\ \forall j \in dom(C).\ \exists m \ldots$$


Lemma 1.20. Let $\mathcal{D}$ be an instance of schema $\mathcal{S} = \{C_1, \ldots, C_n\}$. Then $\mathcal{D}$ satisfies at each stage of Algorithm 1.15 uniqueness constraints for all $i = 1, \ldots, n$ defined by some $c'_i : T_{C_i} \to F(C_i)$.

Proof. It is sufficient to show that whenever a rule is applied replacing $F(C_i)$ by $F(C_i)'$, then $F(C_i)'$ also defines a uniqueness constraint on $C_i$.

Suppose that $(i, v) \in C_i$ holds in $\mathcal{D}$. Since it is possible to apply a rule to $F(C_i)$, there exists at least one value $j :: ID$ occurring in $c_i(v)$. Replacing $ID_j$ in $F(C_i)$ corresponds to replacing $j$ by some value $v_j :: F(C_j)$. Because of the finiteness property such a value must exist. Moreover, due to the uniqueness constraint defined by $c_j$ the function $f : F(C_i) \to F(C_i)'$ representing this replacement must be injective on $c_i(codom(\mathcal{D}(C_i)))$. Hence, $c'_i = f \circ c_i$ defines a uniqueness constraint on $C_i$. □

Now assume that we use only trivial uniqueness constraints in Algorithm 1.15. In order to distinguish this situation from the general case we write $G(C_i)$ instead of $F(C_i)$ to refer to this special case.

Lemma 1.21. Let $\mathcal{D}$ be an instance of schema $\mathcal{S} = \{C_1, \ldots, C_n\}$. Then at each stage of Algorithm 1.15 (applied with arbitrary uniqueness constraints and in parallel with trivial ones) there exists for all $i = 1, \ldots, n$ a function $\hat{c}_i : G(C_i) \to F(C_i)$ that is unique on $c_i(codom(\mathcal{D}(C_i)))$ with $c'_i = \hat{c}_i \circ c_i$.

Proof. As in the proof of Lemma 1.20 it is sufficient to show that the required property is preserved by the application of a rule from either of the two versions of Algorithm 1.15. Therefore, let $\hat{c}_i$ satisfy the required property and let $g : G(C_i) \to G(C_i)'$ and $f : F(C_i) \to F(C_i)'$ be functions corresponding to the application of a rule to $G(C_i)$ and $F(C_i)$ respectively. Such functions were constructed in the proof of Lemma 1.20. Then $f \circ \hat{c}_i$ satisfies the required property with respect to the application of $f$. In the case of applying $g$ we know that $g$ is injective on $c_i(codom(\mathcal{D}(C_i)))$. Let $h : G(C_i)' \to G(C_i)$ be any continuation of $g^{-1} : g(c_i(codom(\mathcal{D}(C_i)))) \to G(C_i)$. Then $\hat{c}_i \circ h$ satisfies the required property. □

Theorem 1.22. Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a uniqueness constraint for all classes $C_i$ occurring as a label in the reference graph $G_{ref}$ of $C$. Let $V_C$ be the type $G(C)$ computed by Algorithm 1.15 with respect to trivial uniqueness constraints and let $I_C$ be the type $F(C)$ computed by Algorithm 1.15 with respect to arbitrary uniqueness constraints. Then $C$ is value-representable with value representation type $V_C$ and each such $I_C$ is a value identification type.

Proof. $V_C$ is a proper value type by Lemma 1.16. From Lemma 1.20 it follows that if $\mathcal{D}$ is an instance of $\mathcal{S}$, then there exists a function $c : T_C \to V_C$ such that the uniqueness constraint defined by $c$ holds for $\mathcal{D}$. The same applies to $I_C$.

If $V'_C$ is another proper value type and $\mathcal{D}$ satisfies a uniqueness constraint defined by $c' : T_C \to V'_C$, then $V'_C$ is some value-identification type $I_C$. Hence by Lemma 1.21 there exists a function $c'' : V_C \to V'_C$ that is unique on $c(codom(\mathcal{D}(C)))$ with $c' = c'' \circ c$. This proves the theorem. □

Corollary 1.23. Let $\mathcal{S}$ be a schema such that all classes $C$ in $\mathcal{S}$ are value-identifiable. Then all classes $C$ in $\mathcal{S}$ are also value-representable. □

1.4.5 Weak Value-Representability<br />

Let us now ask whether there also exist identification mechanisms weaker than value-representability. In several papers, e.g. [42], a navigational approach on the basis of the reference structure has been favoured. This leads to dependent classes similar to "weak entities" in the entity-relationship model [62]. We shall show that such an approach requires at least a value-identifiable "entrance" of some path and the hard restriction that references be representable by surjective functions.

Definition 1.24. Let $\mathcal{S}$ be some schema.

(i) If $r$ is a reference from class $C$ to $D$ in $\mathcal{S}$ and $o : T_C \times ID \to BOOL$ is the function of Definition 4 expressing the corresponding referential constraint, then $r$ satisfies the (SF)-condition iff

(a) $o(v, i) \wedge o(v, j) \Rightarrow i = j$ and

(b) $j \in dom(x_D) \Rightarrow \exists v :: T_C .\ v \in codom(x_C) \wedge o(v, j)$

hold for all $i, j :: ID$, $v :: T_C$.

(ii) An (SF)-chain from class $D$ to $C$ in $\mathcal{S}$ is a sequence of classes $D = C_0, \ldots, C_n = C$ such that for all $i$ ($i = 1, \ldots, n$) either $C_i$ is a subclass of $C_{i-1}$ or there exists a reference $r_i$ from $C_{i-1}$ to $C_i$ satisfying the (SF)-condition.

(iii) A class $C$ in $\mathcal{S}$ is called weakly value-identifiable iff there exists a value-identifiable class $D$ and an (SF)-chain from $D$ to $C$.

The notation (SF)-condition has been chosen to emphasize that such a reference represents a surjective function. It is easy to see, taking $n = 0$, that each value-identifiable class is also weakly value-identifiable.
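The two parts of the (SF)-condition can be checked on a finite instance as in the sketch below. The quantification over all $i, j :: ID$ and $v :: T_C$ is restricted here to the identifiers and values actually occurring in the instance, which is an assumption of the sketch; the classes `persons` and `passports` and the predicate `holds_passport` are hypothetical.

```python
def satisfies_sf_condition(x_C, x_D, o):
    """Check (a) functionality and (b) surjectivity of a reference.

    x_C and x_D are sets of (ident, value) pairs for the referencing and
    the referenced class; o(v, j) tells whether value v refers to ident j.
    """
    values = [v for _i, v in x_C]
    idents = [j for j, _w in x_D]
    # (a): a value refers to at most one identifier.
    functional = all(
        i == j
        for v in values for i in idents for j in idents
        if o(v, i) and o(v, j)
    )
    # (b): every object of D is referred to by some value of C.
    surjective = all(any(o(v, j) for v in values) for j in idents)
    return functional and surjective


# Hypothetical instance: each person value is (name, passport identifier).
persons = {(1, ("Ann", 10)), (2, ("Bob", 11))}
passports = {(10, "P-001"), (11, "P-002")}


def holds_passport(v, j):
    return v[1] == j
```

If a passport exists that no person refers to, part (b) fails, and such a passport could not be reached from the value-identifiable "entrance" class.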

Lemma 1.25. If $C$ is a weakly value-identifiable class in a schema $\mathcal{S}$, then there exists a proper value type $I_C$ such that for each instance $\mathcal{D}$ of $\mathcal{S}$ there exists a function $c : ID \to I_C$ such that $c$ is injective on $dom(\mathcal{D}(C))$.

Call $I_C$ a weak value-identification type of the class $C$.

Proof. Let $D = C_0, \ldots, C_n = C$ be an (SF)-chain from the value-identifiable class $D$ to $C$ with corresponding references $r_i$ ($i = 1, \ldots, n$). If $r_i$ satisfies the (SF)-condition, there exists a function $c_i : ID \to ID$ such that $j \in dom(\mathcal{D}(C_i)) \Rightarrow (c_i(j), v) \in x_{C_{i-1}}$ for some $v$ with $o_i(v, j)$ (just take some inverse image of $j$ under the surjective reference function). Since $r_i$ defines a function, $c_i$ is clearly injective. If $C_i$ is a subclass of $C_{i-1}$, then take $c_i = id$.

If $c' : ID \to I_D$ is the function defined by the uniqueness constraint on $D$ and $c'' : ID \to ID$ is the concatenation $c_1 \circ \cdots \circ c_n$, then $c = c' \circ c''$ satisfies the required property. □

Definition 1.26. A class $C$ in a schema $\mathcal{S}$ is called weakly value-representable iff there exists a proper value type $V_C$ such that for each instance $\mathcal{D}$ of $\mathcal{S}$ the following properties hold.

(i) There is a function $c : ID \to V_C$ that is injective on $dom(\mathcal{D}(C))$.

(ii) For each proper value type $V'_C$ and each function $c' : ID \to V'_C$ that is injective on $dom(\mathcal{D}(C))$ there exists a function $c'' : V_C \to V'_C$ that is unique on $c(dom(\mathcal{D}(C)))$ with $c' = c'' \circ c$.

We call $V_C$ the weak value-representation type of the class $C$.

Note that the weak value-representation type is unique provided it exists. Again it is easy to see that value-representability implies weak value-representability. Moreover, due to Lemma 1.25 each weakly value-representable class is also weakly value-identifiable. We shall see that the converse of this fact is also true.

We want to compute weak value representation types. This can be done using a slight modification of Algorithm 1.15 that completely ignores uniqueness constraints. We refer to this algorithm as the blind version of Algorithm 1.15 and, to emphasize this, we write $H(C_i)$ instead of $F(C_i)$. Analogous to Lemmata 1.16 and 1.20 the following results hold.

Lemma 1.27. Let $C$ be a class in a schema $\mathcal{S}$ and let $I_C$ be the type $H(C)$ computed by the blind version of Algorithm 1.15. Then $I_C$ is a proper value type.

Lemma 1.28. Let $\mathcal{D}$ be an instance of the schema $\mathcal{S} = \{C_1, \ldots, C_n\}$. Let $C$, $D$ be classes such that $C$ is weakly value-identifiable, $D$ is value-identifiable and there exists some (SF)-chain from $D$ to $C$. Let $c : ID \to I_C$ be the function of Lemma 1.25 corresponding to this chain. Let $c' : ID \to H(D)$ be a function corresponding to the uniqueness constraint on $D$ and the instance $\mathcal{D}$. Then at each stage of the blind version of Algorithm 1.15 there exists a function $\hat{c} : H(D) \to I_C$ that is unique on $c'(dom(\mathcal{D}(C)))$ with $c = \hat{c} \circ c'$.

Based on these two lemmata we can now state the main result on weak value-representability.

Theorem 1.29. Let $C$ be a weakly value-identifiable class in a schema $\mathcal{S}$ and let $V_C$ be the product of all types $H(D)$, where $D$ is the leading value-identifiable class in some maximal (SF)-chain corresponding to $C$ and $H(D)$ is the result of the blind version of Algorithm 1.15. Then $C$ is weakly value-representable with weak value-representation type $V_C$.

Proof. $V_C$ is a proper value type by Lemma 1.27. From Lemmata 1.20 and 1.25 it follows that there exists a function $c' : ID \to V_C$ that is injective on $dom(\mathcal{D}(C))$.

From Lemma 1.28 it follows that there exists a function $\hat{c} : V_C \to I_C$ that is unique on $c'(dom(\mathcal{D}(C)))$ with $c = \hat{c} \circ c'$. This proves the theorem. □

1.5 The Genericity Problem<br />

The preservation of the advantages of relational databases requires generic operations for querying<br />
and for the insertion, deletion and update of single objects. Querying [1, 12, 30, 55] is<br />
per se a set-oriented operation, i.e. it is not necessary to select just one single object, and<br />
hence it does not raise any specific problems with object identifiers. Things change completely<br />
in the case of updates. If an object with a given value is to be updated (or deleted), this is only<br />
defined unambiguously if there does not exist another object with the same value. If more than<br />
one object exists with the same value or, more generally, with the same value and the same<br />
references to other objects, then the user has to decide whether an update or delete operation<br />
is applied to all these objects, to only one of these objects selected non-deterministically, or to<br />
none of them, i.e. the operation is rejected. However, it is not possible to specify a priori such<br />
an operation that works in the same way for all objects in all situations. The same applies<br />
to insert operations. This raises the problem of determining in which cases operations for the<br />
insertion, deletion and update of objects can be defined generically.<br />
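The ambiguity described above can be made concrete with a small sketch. It assumes a toy model that is not part of the paper's formalism: a class extent is a set of (identifier, value) pairs, and the names `matches`, `is_unambiguous` and `update_by_value` are hypothetical. The rejection policy shown is only one of the three options listed above.<br />

```python
# Toy model (an assumption, not the paper's formalism): a class extent is a
# set of (identifier, value) pairs; identifiers stay hidden from the user.

def matches(extent, value):
    """All objects in the extent that carry the given value."""
    return {(i, v) for (i, v) in extent if v == value}

def is_unambiguous(extent, value):
    """A value-defined update or delete is well defined only for exactly one match."""
    return len(matches(extent, value)) == 1

def update_by_value(extent, old, new):
    """Replace the value of the single object carrying `old`; reject otherwise."""
    hits = matches(extent, old)
    if len(hits) != 1:
        return extent  # rejected: no object, or several indistinguishable ones
    ((ident, _),) = hits
    return (extent - hits) | {(ident, new)}

# Two distinct objects share the value "Smith", so an update addressed by
# that value cannot single out one of them.
persons = frozenset({(1, "Smith"), (2, "Jones"), (3, "Smith")})
```

Choosing "apply to all" or "pick one non-deterministically" instead of rejection would change only the branch taken when several objects match.<br />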



Some authors [43] have chosen to abandon generic operations. Others [6, 7, 9]<br />
use identifying values to represent object identity, and thus embody a strict concept of surrogate<br />
keys to avoid the problem. Our approach differs from both solutions in that we use the<br />
concept of hidden abstract identifiers, but at the same time formally characterize those classes<br />
for which unique generic methods for the insertion, deletion and update of single objects exist.<br />
At the same time inclusion and referential integrity have to be enforced. We show that these<br />
classes are the value-representable ones.<br />

1.5.1 Generic Update Methods<br />

The requirement that object identifiers have to be hidden from the user imposes the restriction<br />
on canonical update operations to be value-defined in the sense that the identifier of a new<br />
object has to be chosen by the system, whereas all input and output data have to be values<br />
of proper value types.<br />

We now formally define what we mean by generic update methods. For this purpose regard<br />
an instance $\mathcal{D}$ of a schema $\mathcal{S}$ as a set of objects. For each recursively defined type $T$ let $\bar{T}$<br />
denote the type obtained by replacing each occurrence of a recursive type $T'$ in $T$ by $UNION(T', ID)$.<br />

Definition 1.30. Let $C$ be a class in a schema $\mathcal{S}$. Generic update methods on $C$ are $\mathit{insert}_C$,<br />
$\mathit{delete}_C$ and $\mathit{update}_C$ satisfying the following properties:<br />

(i) Their input types are proper value types; their output type is the trivial type $\bot$.<br />

(ii) In the case of insert applied to an instance $\mathcal{D}$ there exists some $o :: U_C$ such that<br />
(a) the result is an instance $\mathcal{D}'$ with $o \in \mathcal{D}'$ and $\mathcal{D} \subseteq \mathcal{D}'$, and<br />
(b) if $\bar{\mathcal{D}}$ is any instance with $\mathcal{D} \subseteq \bar{\mathcal{D}}$ and $o \in \bar{\mathcal{D}}$, then $\mathcal{D}' \subseteq \bar{\mathcal{D}}$.<br />

(iii) In the case of delete applied to an instance $\mathcal{D}$ there exists some $o :: U_C$ such that<br />
(a) the result is an instance $\mathcal{D}'$ with $o \notin \mathcal{D}'$ and $\mathcal{D}' \subseteq \mathcal{D}$, and<br />
(b) if $\bar{\mathcal{D}}$ is any instance with $\bar{\mathcal{D}} \subseteq \mathcal{D}$ and $o \notin \bar{\mathcal{D}}$, then $\bar{\mathcal{D}} \subseteq \mathcal{D}'$.<br />

(iv) In the case of update applied to an instance $\mathcal{D} = \mathcal{D}_1 \cup \mathcal{D}_2$, where $\mathcal{D}_2 = \{o\}$ if $o \neq o'$ and<br />
$\mathcal{D}_2 = \emptyset$ otherwise, there exist $o, o' :: U_C$ with $o = (i, v)$ and $o' = (i, v')$ such that<br />
(a) the result is an instance $\mathcal{D}' = \mathcal{D}_1 \cup \mathcal{D}_2'$ with $\mathcal{D}_2 \cap \mathcal{D}_2' = \emptyset$,<br />
(b) $o \in \mathcal{D}$, $o' \in \mathcal{D}'$, and<br />
(c) if $\bar{\mathcal{D}}$ is any instance with $\mathcal{D}_1 \subseteq \bar{\mathcal{D}}$ and $o' \in \bar{\mathcal{D}}$, then $\mathcal{D}' \subseteq \bar{\mathcal{D}}$.<br />

Canonical update methods on $C$ are $\mathit{insert}'_C$, $\mathit{delete}'_C$ and $\mathit{update}'_C$, defined analogously with<br />
the only difference that their output type is $ID$ and their input type is $\bar{T}$ for some<br />
value type $T$.<br />

Note that this definition of genericity includes consistency with respect to the implicit constraints<br />
on $\mathcal{S}$. We show that value-representability is necessary and sufficient for the existence<br />
and uniqueness of such operations.<br />

Lemma 1.31. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist canonical update methods<br />
on $C$. Then generic update methods also exist on $C$.<br />

Proof. In the case of insert define $\mathit{insert}_C(V :: V_C) == I \leftarrow \mathit{insert}'_C(V)$, i.e. call the<br />
corresponding canonical operation and ignore its output. The same argument applies to delete<br />
and update. $\Box$<br />
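The construction in the proof of Lemma 1.31 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's method language: extents are sets of (identifier, value) pairs, identifiers are integers, and the function names are hypothetical.<br />

```python
def canonical_insert(extent, value):
    """Canonical insert: returns the new extent and the identifier of the object."""
    for ident, v in extent:
        if v == value:                 # value already present: reuse its identifier
            return extent, ident
    # Assumption: the system picks the next unused integer as identifier.
    ident = 1 + max((i for i, _ in extent), default=0)
    return extent | {(ident, value)}, ident

def generic_insert(extent, value):
    """Generic insert: call the canonical operation and ignore its output."""
    new_extent, _ident = canonical_insert(extent, value)
    return new_extent
```

The generic variant has only a value as input and no output, so the identifier never becomes visible to the caller.<br />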



Theorem 1.32. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist generic update methods<br />
on $C$. Then $C$ is value-representable. Moreover, all super- and subclasses of $C$ are also value-representable.<br />

Proof. First consider the delete method with input type $I_C$, which is by definition a proper<br />
value type. We show that it is already a value identification type.<br />

If not, then for all instances $\mathcal{D}$ and all functions $c : T_C \to I_C$ there exist $i, j :: ID$ and<br />
$v, w :: T_C$ with<br />
$$ i \neq j \;\wedge\; (i, v) \in \mathcal{D}(C) \;\wedge\; (j, w) \in \mathcal{D}(C) \;\wedge\; c(v) = c(w) . \qquad (1.12) $$<br />
Now take $o = (i, v)$ and $o' = (j, w)$. Then there exist two distinct instances $\mathcal{D}'$ and $\mathcal{D}''$<br />
satisfying the conditions of Definition 1.30(iii) with respect to $o$ and $o'$ respectively, which<br />
contradicts the assumption of a unique generic delete method on $C$.<br />

The same argument applies to the input type $V_C$. Moreover, since insertion requires all<br />
values of referenced objects to be provided, we derive from Algorithm 1.15 and Theorem 1.22<br />
that $V_C$ is a value representation type. Therefore, $C$ is value-representable.<br />

The value-representability of superclasses is implied, since insert (and update) on $C$<br />
involve the corresponding method on each superclass. The value-representability of subclasses<br />
follows from the propagation of update through them. We omit the technical details. $\Box$<br />

1.5.2 Generic Updates in the Case of Value-Representability<br />

Our next goal is to reduce the existence problem of canonical update operations to schemata<br />
without IsA relations.<br />

Lemma 1.33. Let $C$, $D$ be value-representable classes in a schema $\mathcal{S}$ such that $C$ is a subclass<br />
of $D$ with subtype function $g : T_C \to T_D$. Then there exists a function $h : V_C \to V_D$ such that<br />
for each instance $\mathcal{D}$ of $\mathcal{S}$ with corresponding functions $c : T_C \to V_C$ and $d : T_D \to V_D$ we have<br />
$h(c(v)) = d(g(v))$ for all $v \in codom(\mathcal{D}(C))$.<br />

Proof. By Definition 1.10 $c$ is injective on $codom(\mathcal{D}(C))$, hence any continuation $h$ of $d \circ g \circ c^{-1}$<br />
satisfies the required property.<br />

It remains to show that $h$ does not depend on $\mathcal{D}$. Suppose $\mathcal{D}_1$, $\mathcal{D}_2$ are two instances such<br />
that $w = c_1(v_1) = c_2(v_2) \in V_C$, where $c_1, d_1, h_1$ correspond to $\mathcal{D}_1$ and $c_2, d_2, h_2$ correspond to<br />
$\mathcal{D}_2$. Then there exists a permutation $\sigma$ on $ID$ such that $v_2 = \sigma(v_1)$. We may extend $\sigma$ to a<br />
permutation on any type. Since $ID$ has no non-trivial supertype, $g$ permutes with $\sigma$, hence<br />
$g(v_2) = \sigma(g(v_1))$. From Definition 1.10 it follows that $d_2(g(v_2)) = d_1(g(v_1))$, i.e. $h_2(w) = h_1(w)$.<br />
$\Box$<br />

In the following let $\mathcal{S}_0$ be a schema derived from a schema $\mathcal{S}$ by omitting all IsA relations.<br />

Lemma 1.34. Let $C$ be a value-representable class in $\mathcal{S}$ such that all its superclasses and<br />
subclasses $D_1, \dots, D_n$ are also value-representable. Then canonical update operations exist on<br />
$C$ in $\mathcal{S}$ iff they exist on $C$ and all $D_i$ in $\mathcal{S}_0$.<br />

Proof. By Theorem 1.22 the value-representation type $V_C$ is the result of Algorithm 1.15,<br />
hence $V_C$ does not depend on the inclusion constraints of $\mathcal{S}$. Then we have<br />
$$ I :: ID \leftarrow \mathit{insert}'_C(V :: V_C) \;==\; I \leftarrow \mathit{insert}'_{D_1}(h_1(V)) \,;\, \dots \,;\, I \leftarrow \mathit{insert}'_{D_n}(h_n(V)) \,;\, I \leftarrow \mathit{insert}'_C(V) $$<br />
where $h_i : V_C \to V_{D_i}$ is the function of Lemma 1.33 and $\mathit{insert}'_C$ on the right-hand side denotes a canonical insert<br />
on $C$ in $\mathcal{S}_0$. Hence in this case the result for the insert follows by structural induction on the<br />
IsA-hierarchy.<br />

If the subtype function $g$ required in Lemma 1.33 does not exist for some superclass $D$,<br />
then simply add $V_D$ to the input type. We omit the details for this case.<br />

The arguments for delete and update are analogous. The value-representability of subclasses<br />
is required for the update case.<br />
$\Box$<br />
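One possible reading of the insert in the proof of Lemma 1.34 is sketched below. Everything here is an illustrative assumption: a database is a dictionary from class names to extents, an extent maps integer identifiers to values, `h` holds the value mappings of Lemma 1.33 as plain functions, and the sketch further assumes that the identifier obtained for the superclasses is reused for the subclass so that the IsA inclusion holds.<br />

```python
def fresh_id(db):
    """Hypothetical NewId: an identifier not used in any extent."""
    return 1 + max((i for ext in db.values() for i in ext), default=0)

def insert_flat(db, cls, value, ident=None):
    """Canonical insert into one extent, ignoring IsA relations."""
    for i, v in db[cls].items():
        if v == value:                 # value already represented in this class
            return i
    if ident is None:
        ident = fresh_id(db)
    db[cls][ident] = value
    return ident

def insert_with_isa(db, cls, value, supers, h):
    """Insert into the superclasses first (values mapped by h), then into cls.

    Assumption: all superclasses agree on one identifier for the object.
    """
    ident = None
    for d in supers:
        ident = insert_flat(db, d, h[d](value), ident)
    return insert_flat(db, cls, value, ident)
```

For a class Student below Person, `h["Person"]` would project a student value onto its person part, and after the call every identifier in the Student extent also occurs in the Person extent.<br />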

From now on we use a global operation $NewId$ that produces a fresh identifier $I :: ID$. This<br />
can be represented as a method using projection.<br />

Lemma 1.35. Let $C$ be a value-representable class in $\mathcal{S}_0$. Then there exist unique quasi-canonical<br />
update operations on $C$.<br />

Proof. Let $r_i : C_i$ ($i = 1, \dots, n$) denote the references in the structure expression of $C$. If $V$ is<br />
a value of type $V_C$, then there exist values $V_{ij} :: V_{C_i}$ ($i = 1, \dots, n$, $j = 1, \dots, k_i$) occurring in $V$.<br />
Let $\bar{V} = \{V_{ij}/J_{ij} \mid i = 1, \dots, n,\ j = 1, \dots, k_i\}.V$ denote the value of type $T_C$ that results from<br />
replacing each $V_{ij}$ by some $J_{ij} :: ID$. Moreover, for $I :: ID$ let<br />
$$ V_{ij}^{(I)} = \begin{cases} \{V/I\}.V_{ij} & \text{if } V \text{ occurs in } V_{ij} \\ V_{ij} & \text{else} \end{cases} $$<br />
Then the canonical insert operation can be defined as follows:<br />
$$ \begin{array}{l} I :: ID \leftarrow \mathit{insert}'_C(V :: V_C) \;== \\ \quad \exists I' :: ID,\ V' :: T_C .\ (Pair(I', V') \in C \wedge c(V') = V) \to I := I' \\ \quad \boxtimes\ \exists V' :: T_C .\ V = V' \to I \leftarrow NewId \,;\, x_C := x_C \cup \{(I, V)\} \\ \quad \boxtimes\ I \leftarrow NewId \,;\, J_{11} \leftarrow \mathit{insert}'_{C_1}(V_{11}^{(I)}) \,;\, \dots \,;\, J_{n k_n} \leftarrow \mathit{insert}'_{C_n}(V_{n k_n}^{(I)}) \,;\, x_C := x_C \cup \{(I, \bar{V})\} \end{array} $$<br />
It remains to show that this operation is indeed canonical. Apply the method to some instance<br />
$\mathcal{D}$. If there already exists some $o = (I', V')$ in $C$ with $c(V') = V$, the result is $\mathcal{D}' = \mathcal{D}$ and<br />
the requirements of Definition 1.30 are trivially satisfied. Otherwise let $o = (I, \bar{V})$. If $\bar{\mathcal{D}}$<br />
is an instance with $\mathcal{D} \subseteq \bar{\mathcal{D}}$ and $o \in \bar{\mathcal{D}}$, we have $J_{ij} \in dom(C_i)$ for all $i = 1, \dots, n$, $j = 1, \dots, k_i$,<br />
since $\bar{\mathcal{D}}$ satisfies the referential constraints. Hence $\bar{\mathcal{D}}$ contains the distinguished objects<br />
corresponding to the involved quasi-canonical operations $\mathit{insert}'_{C_i}$. By induction on the length<br />
of call-sequences, $\mathcal{D}_{ij} \subseteq \bar{\mathcal{D}}$ for all $i = 1, \dots, n$, $j = 1, \dots, k_i$, where $\mathcal{D}_{ij}$ is the result of<br />
$J_{ij} \leftarrow \mathit{insert}'_{C_i}(V_{ij}^{(I)})$. Hence $\mathcal{D}' = \bigcup_{ij} \mathcal{D}_{ij} \cup \{o\} \subseteq \bar{\mathcal{D}}$. The uniqueness follows from the uniqueness of<br />
$V_C$.<br />

The definitions and proofs for delete and update are analogous. $\Box$<br />

Theorem 1.36. Let $C$ be a value-representable class in a schema $\mathcal{S}$ such that all its super- and<br />
subclasses are also value-representable. Then there exist unique generic update operations<br />
on $C$.<br />

Proof. By Lemma 1.31 and Lemma 1.34 it is sufficient to show the existence of canonical<br />
update operations on $C$ and all its super- and subclasses in the schema $\mathcal{S}_0$. This follows from<br />
Lemma 1.35.<br />
$\Box$<br />

In [50] it has been shown how linguistic reflection [56] can be exploited to generate the generic<br />
update operations for value-representable classes in an OODM schema.<br />
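The recursive structure of the canonical insert of Lemma 1.35 can be illustrated by the following sketch. It is restricted to acyclic references, so it does not model the substitution that handles an object referring back to itself. The schema encoding, the class names and all function names are hypothetical.<br />

```python
# Hypothetical schema: each class lists, per component, either None (a plain
# value) or the name of the referenced class.
SCHEMA = {
    "Department": [None],
    "Employee": [None, "Department"],
}

def fresh_id(db):
    """Stand-in for NewId: an integer not used as identifier in any extent."""
    return 1 + max((i for ext in db.values() for i in ext), default=0)

def represent(db, cls, ident):
    """Value representation: replace stored identifiers by referenced values."""
    stored = db[cls][ident]
    return tuple(x if ref is None else represent(db, ref, x)
                 for ref, x in zip(SCHEMA[cls], stored))

def canonical_insert(db, cls, V):
    """Return the identifier of the object with value representation V."""
    for ident in db[cls]:
        if represent(db, cls, ident) == V:   # first alternative: already present
            return ident
    # Otherwise insert the referenced values first and store their identifiers.
    stored = tuple(x if ref is None else canonical_insert(db, ref, x)
                   for ref, x in zip(SCHEMA[cls], V))
    ident = fresh_id(db)
    db[cls][ident] = stored
    return ident
```

Inserting two employees of the same department creates the department object only once, because the second call finds it through its value representation.<br />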


1.6 The Consistency Problem<br />

In general a database may be considered as a triplet $(\mathcal{S}, \mathcal{O}, \mathcal{C})$, where $\mathcal{S}$ defines a structure,<br />
$\mathcal{O}$ denotes a collection of state-changing operations and $\mathcal{C}$ is a set of constraints. Then the<br />
consistency problem is to guarantee that each specified operation $o \in \mathcal{O}$ will never violate any<br />
constraint $\mathcal{I} \in \mathcal{C}$. Integrity enforcement aims at the derivation of a new set $\mathcal{O}'$ of operations<br />
with $|\mathcal{O}'| = |\mathcal{O}|$ such that $(\mathcal{S}, \mathcal{O}', \mathcal{C})$ satisfies this property.<br />

Suppose we are given a database schema $\mathcal{S}$ and a static integrity constraint $\mathcal{I}$ on that<br />
schema. Regard $\mathcal{I}$ as a logical formula defined on $\mathcal{S}$. Consistency requires that only those<br />
instances $\mathcal{D}$ of $\mathcal{S}$ are allowed that satisfy $\mathcal{I}$. Call the set of such instances $sat(\mathcal{S}, \mathcal{I})$. Each<br />
transaction is a database transformation. Such a database transformation $T$ takes an arbitrary<br />
instance $\mathcal{D}$ and possibly some input values $v_1, \dots, v_n$ and produces a new instance $\mathcal{D}'$ and<br />
possibly some output values $v'_1, \dots, v'_m$. $T$ is consistent with respect to $\mathcal{I}$ iff for each $\mathcal{D} \in<br />
sat(\mathcal{S}, \mathcal{I})$ we also have $\mathcal{D}' \in sat(\mathcal{S}, \mathcal{I})$.<br />
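The notion of a consistent transaction can be checked exhaustively on a small finite example. The sketch below makes several assumptions: instances are finite sets of (key, value) pairs, the constraint is uniqueness of keys, and transactions are deterministic functions without inputs or outputs. All names are hypothetical.<br />

```python
from itertools import combinations

def all_instances(universe):
    """Every subset of the universe, used as the space of possible instances."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sat(instances, constraint):
    """The instances that satisfy the constraint."""
    return [d for d in instances if constraint(d)]

def is_consistent(transaction, instances, constraint):
    """T is consistent iff it maps every satisfying instance to a satisfying one."""
    return all(constraint(transaction(d)) for d in sat(instances, constraint))

def key_is_unique(d):
    keys = [k for k, _ in d]
    return len(keys) == len(set(keys))

def naive_insert(d):
    return d | {("a", 2)}          # may create a second object with key "a"

def guarded_insert(d):
    return d if any(k == "a" for k, _ in d) else d | {("a", 2)}

UNIVERSE = {("a", 1), ("a", 2), ("b", 1)}
INSTANCES = all_instances(UNIVERSE)
```

Here `naive_insert` is inconsistent because it turns the instance containing only `("a", 1)` into one with two objects for key `"a"`, whereas `guarded_insert` preserves the constraint on all satisfying instances.<br />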

Classically consistency is maintained at run-time by transaction monitors. Whenever an<br />
inconsistent instance is produced, the transaction that caused the inconsistency will be rolled<br />
back. This "everything or nothing" approach has been criticized, since it causes enormous run-time<br />
overhead for consistency checking and rollback. Moreover, it leaves the burden of writing<br />
consistent transactions to the user. In principle the first problem vanishes if verification<br />
techniques are used at design time [44, 57, 58], whereas the second one still remains.<br />

As an alternative a lot of attention has been paid to integrity enforcement. In most cases<br />
the envisioned solution is an active database [18, 27, 59, 64, 65], where production rules are<br />
used to repair inconsistencies instead of rolling back. Although this is sometimes coupled<br />
with design time (or even run-time) analysis of the rules [18, 27, 33, 63], the approach is not<br />
always successful. Moreover, a satisfying theory for rule triggering systems with respect to the<br />
integrity enforcement problem is still missing. Therefore, we favour an operational approach<br />
[51, 48, 52, 53], which aims at replacing inconsistent database transactions by consistent<br />
specializations.<br />

1.6.1 Greatest Consistent Specializations<br />

In general non-deterministic partial state transitions $S$ as used in our method language can<br />
be described by a subset of $D \times D_\bot$, where $D$ denotes the set of possible states and $D_\bot =<br />
D \cup \{\bot\}$, where $\bot$ is a special symbol used to indicate non-termination. It can be shown<br />
[20, 41, 46, 44] that this is equivalent to defining two predicate transformers $wp(S)$ and $wlp(S)$<br />
associated with $S$ satisfying the pairing condition $wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(true)$ and<br />
the universal conjunctivity of $wlp(S)$, i.e.<br />
$$ wlp(S)(\forall i \in I . R_i) \Leftrightarrow \forall i \in I . wlp(S)(R_i) . $$<br />
The predicate transformers assign to some postcondition $R$ the weakest (liberal) precondition<br />
of $S$ to establish $R$. Clearly, pre- and postconditions are $X$-constraints. Informally these<br />
conditions can be characterized as follows:<br />

- $wlp(S)(R)$ characterizes those initial states such that all terminating executions of $S$ will<br />
reach a final state characterized by $R$ provided $S$ is defined in that initial state, and<br />

- $wp(S)(R)$ characterizes those initial states such that all executions of $S$ terminate and<br />
will reach a final state characterized by $R$ provided $S$ is defined.<br />
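For a finite state space the two predicate transformers can be computed directly from the relational description. The following sketch assumes that predicates are represented by the sets of states satisfying them and that the string `"bottom"` stands for the non-termination symbol. The example relation `S` is made up for illustration.<br />

```python
BOTTOM = "bottom"   # stands for the non-termination symbol

def outcomes(S, s):
    """All results (final states or BOTTOM) that S can produce from state s."""
    return {t for (s0, t) in S if s0 == s}

def wlp(S, states, R):
    """States from which every terminating execution of S ends in R."""
    return {s for s in states
            if all(t == BOTTOM or t in R for t in outcomes(S, s))}

def wp(S, states, R):
    """States from which every execution of S terminates and ends in R."""
    return {s for s in states
            if all(t != BOTTOM and t in R for t in outcomes(S, s))}

STATES = {0, 1, 2}
# From 0: go to 1 or 2.  From 1: diverge or go to 2.  Undefined in state 2.
S = {(0, 1), (0, 2), (1, BOTTOM), (1, 2)}
```

With these definitions the pairing condition reads `wp(S, STATES, R) == wlp(S, STATES, R) & wp(S, STATES, STATES)`, and the conjunctivity of `wlp` corresponds to intersection of the postcondition sets. State 2, where `S` is undefined, belongs to every weakest precondition, which matches the proviso "provided S is defined".<br />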



The use of these predicate transformers for the definition of language semantics is usually<br />
called "axiomatic semantics". Based on this, consistency and specialization can be formally<br />
defined and used for the formal description of the consistency problem. For this purpose we<br />
define "extended operations" and therefore need to know for each operation $S$ the set of<br />
classes $\mathcal{S}'$ such that $S$ neither reads nor changes the class variables $x_C$ with $C \notin \mathcal{S}'$. In<br />
this case we call $S$ an $\mathcal{S}'$-operation. We omit the formal definition [41, 51].<br />

Definition 1.37. Let $\mathcal{S}$ be a schema, $\mathcal{I}$ a constraint and $S$, $T$ methods defined on $\mathcal{S}_1 \subseteq \mathcal{S}$<br />
and $\mathcal{S}_2 \subseteq \mathcal{S}$ respectively with $\mathcal{S}_1 \subseteq \mathcal{S}_2$.<br />

(i) $S$ is consistent with respect to $\mathcal{I}$ iff $\mathcal{I} \Rightarrow wlp(S)(\mathcal{I})$ holds.<br />

(ii) $T$ specializes $S$ iff $wp(S)(true) \Rightarrow wp(T)(true)$ and $wlp(S)(R) \Rightarrow wlp(T)(R)$ hold for<br />
all constraints $R$ with free variables $x_C$ such that $C \in \mathcal{S}_1$ (denoted $T \sqsubseteq S$).<br />

Hence the following definition of a greatest consistent specialization:<br />

Definition 1.38. Let $\mathcal{S}$ be a schema, $\mathcal{I}$ a constraint and $S$ a method defined on $\mathcal{S}_1 \subseteq \mathcal{S}$. A<br />
method $S_\mathcal{I}$ is a Greatest Consistent Specialization (GCS) of $S$ with respect to $\mathcal{I}$ iff<br />

(i) $S_\mathcal{I} \sqsubseteq S$,<br />

(ii) $S_\mathcal{I}$ is consistent with respect to $\mathcal{I}$ and<br />

(iii) for each method $T$ satisfying properties (i) and (ii) (instead of $S_\mathcal{I}$) we have $T \sqsubseteq S_\mathcal{I}$.<br />

If only properties (i) and (ii) are satisfied, we simply talk of a consistent specialization.<br />
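Definition 1.37 can be tried out on a finite relational model. The sketch below assumes that both methods act on the same state space (so the two sub-schemata coincide), that predicates are sets of states, and that quantification over all constraints is replaced by enumeration of all subsets. The example methods are invented for illustration, and no claim is made that `T` is the greatest consistent specialization.<br />

```python
from itertools import combinations

BOTTOM = "bottom"   # stands for the non-termination symbol

def outcomes(S, s):
    return {t for (s0, t) in S if s0 == s}

def wlp(S, states, R):
    return {s for s in states if all(t == BOTTOM or t in R for t in outcomes(S, s))}

def wp(S, states, R):
    return {s for s in states if all(t != BOTTOM and t in R for t in outcomes(S, s))}

def subsets(states):
    items = sorted(states)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_consistent(S, states, I):
    """S is consistent with respect to I iff I implies wlp(S)(I)."""
    return set(I) <= wlp(S, states, I)

def specializes(T, S, states):
    """T specializes S: termination and all liberal preconditions are preserved."""
    if not wp(S, states, states) <= wp(T, states, states):
        return False
    return all(wlp(S, states, R) <= wlp(T, states, R) for R in subsets(states))

STATES = frozenset(range(4))
EVEN = {0, 2}                                    # the constraint: state is even
S = {(s, (s + 1) % 4) for s in STATES} | {(s, (s + 2) % 4) for s in STATES}
T = {(s, (s + 2) % 4) for s in STATES}           # drops the "+1" alternative
```

In this example `S` is not consistent with respect to `EVEN`, while `T` is a consistent specialization of `S` because it keeps only the alternative that preserves evenness.<br />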

Let us first state the main results from [48].<br />

Theorem 1.39. Let $\mathcal{S}$ be a schema, $\mathcal{I}$, $\mathcal{J}$ constraints and $S$ a method defined on $\mathcal{S}_1 \subseteq \mathcal{S}$.<br />

(i) There exists a greatest consistent specialization $S_\mathcal{I}$ of $S$ with respect to $\mathcal{I}$. Moreover, $S_\mathcal{I}$<br />
is uniquely determined (up to semantic equivalence) by $S$ and $\mathcal{I}$.<br />

(ii) The GCSs $(S_\mathcal{I})_\mathcal{J}$ and $S_{(\mathcal{I} \wedge \mathcal{J})}$ coincide on initial states satisfying $\mathcal{I} \wedge \mathcal{J}$.<br />

The proof of these results heavily uses predicate transformers and is therefore omitted here.<br />

In [51] it has been shown that a GCS, which is in general non-deterministic, can be written<br />
as a finite choice of maximal quasi-deterministic consistent specializations (MQCSs), where<br />
quasi-determinism means determinism up to the selection of some values. In most cases this value<br />
selection can be shifted to the input, but the selection of object identifiers should be left to<br />
the system.<br />

Next, we formally define quasi-determinism and then present the main result from [51],<br />
an algorithm for the computation of MQCSs.<br />

Definition 1.40. A method $S$ is called quasi-deterministic iff there exist types $T_1, \dots, T_n$<br />
such that $S$ is semantically equivalent to<br />
$$ y_1 :: T_1 \mid \dots\; y_n :: T_n \mid S' $$<br />
where $S'$ is a deterministic method.<br />


Algorithm 1.41.<br />

In: An $X$-operation $S$ and constraints $\mathcal{I}_1, \dots, \mathcal{I}_n$ defined on extensions $Y_1, \dots, Y_n$ of $X$.<br />

Let $\ell$ be the list of the constraints. As long as $\ell \neq nil$ proceed as follows:<br />

1. Set $S'_\mathcal{I} = S$.<br />

2. Choose and remove one constraint $\mathcal{I}_i$ from $\ell$.<br />

3. Check whether $S'_\mathcal{I}$ is $\mathcal{I}_i$-reduced. If not, stop with no result, otherwise continue.<br />

4. Make $S'_\mathcal{I}$ $\boxtimes$-free by replacing each occurring $S_1 \boxtimes S_2$ by $S_1 \,\Box\, wlp(S_1)(false) \to S_2$.<br />

5. Replace each basic assignment in $S'_\mathcal{I}$ by some (subsumption-free) MQCS with respect to<br />
$\mathcal{I}_i$.<br />

6. Compute $P(S_\mathcal{I})$ as<br />
$$ P(S_\mathcal{I}) \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(\{x_1/z_1, \dots, x_n/z_n\}.S_\mathcal{I})(\neg wlp(S)(z_1 \neq x_1 \vee \dots \vee z_n \neq x_n)) , $$<br />
where the $x_i$ are the class variables occurring in $\mathcal{I}$ or in $S$ and the $z_i$ are used as a disjoint<br />
copy of these.<br />

7. Set $S = P(S_\mathcal{I}) \to S'_\mathcal{I}$.<br />

Set $S'_\mathcal{I} = S$.<br />

Out: An operation $\mathcal{I} \to S'_\mathcal{I}$, where $S'_\mathcal{I}$ is a (subsumption-free) MQCS of the original $S$ with<br />
respect to the conjunction $\mathcal{I}$ of the constraints.<br />
$\Box$<br />

An extension of the GCS algorithm to compute all (subsumption-free) MQCSs is easy.<br />
It has been shown in [51] that Algorithm 1.41 is correct. However, it depends on checking<br />
a very technical condition, $\mathcal{I}$-reducedness. We omit this condition here.<br />

1.6.2 Enforcing Integrity in the OODM<br />

Since Algorithm 1.41 allows integrity enforcement to be reduced to the case of assignments,<br />
we may restrict ourselves to the case of a single explicit constraint in addition to the trivial<br />
uniqueness constraints that are required to assure value-representability and that are used to<br />
construct generic update operations. In the following we describe MQCSs with respect to the<br />
constraints introduced in Definition 1.5.<br />

Inclusion Constraints. Let $\mathcal{I}$ be an inclusion constraint on $C_1$, $C_2$ defined via $c_i : T_{C_i} \to T$<br />
($i = 1, 2$). Then each insertion into $C_1$ requires an additional insertion into $C_2$, whereas a<br />
deletion on $C_2$ requires a deletion on $C_1$. Update on one of the $C_i$ requires an additional<br />
update on the other class.<br />

Let us first concentrate on the insert operation on $C_1$ (for an insert on $C_2$ there is nothing<br />
to do). Insertion into $C_1$ requires an input value of type $V_{C_1}$; an additional insert on $C_2$ then<br />
requires an input value of type $V_{C_2}$. However, these input values are not independent, because<br />
the corresponding values of type $T_{C_1}$ and $T_{C_2}$ must satisfy the general inclusion constraint.<br />
Therefore we first show that the constraint can be "lifted" to a constraint on the value-representation<br />
types. Note that this is similar to the handling of IsA-constraints in Lemma<br />
1.33.<br />


Lemma 1.42. Let $C_1$, $C_2$ be classes, $c_i : T_{C_i} \to T$ functions and let $V_{C_i}$ be the value-representation<br />
type of $C_i$ ($i = 1, 2$). Then there exist functions $f_i : V_{C_i} \to T$ such that for all database<br />
instances $\mathcal{D}$<br />
$$ f_1(d_1^{\mathcal{D}}(v_1)) = f_2(d_2^{\mathcal{D}}(v_2)) \;\Leftrightarrow\; c_1(v_1) = c_2(v_2) \qquad (1.13) $$<br />
holds for all $v_i \in codom(\mathcal{D}(x_{C_i}))$ ($i = 1, 2$). Here $d_i^{\mathcal{D}} : T_{C_i} \to V_{C_i}$ denotes the function used<br />
in the uniqueness constraint on $C_i$ with respect to $\mathcal{D}$.<br />

Proof. Due to Definition 1.10 we may define $f_i = c_i \circ (d_i^{\mathcal{D}})^{-1}$ on $c_i(codom(\mathcal{D}(x_{C_i})))$ ($i = 1, 2$).<br />
Then we have to show that this definition is independent of the instance $\mathcal{D}$. Suppose $\mathcal{D}_1$, $\mathcal{D}_2$<br />
are two different instances. Then there exists a permutation $\sigma$ on $ID$ such that $d_i^{\mathcal{D}_2} = d_i^{\mathcal{D}_1} \circ \sigma$,<br />
where $\sigma$ is extended to $T_{C_i}$. Then<br />
$$ c_i \circ (d_i^{\mathcal{D}_2})^{-1} = c_i \circ \sigma^{-1} \circ (d_i^{\mathcal{D}_1})^{-1} = \sigma^{-1} \circ c_i \circ (d_i^{\mathcal{D}_1})^{-1} , $$<br />
since $c_i$ permutes with $\sigma^{-1}$. Then the stated equality follows.<br />
$\Box$<br />

Now let $V_{C_1,C_2} = V_{C_1} \times V_{C_2}$ and define the new insert operation on $C_1$ by<br />
$$ (\mathit{insert}_{C_1})_\mathcal{I}((v_1, v_2) :: V_{C_1,C_2}) \;==\; f_1(v_1) = f_2(v_2) \to \mathit{insert}_{C_1}(v_1) \,;\, \mathit{insert}_{C_2}(v_2) \qquad (1.14) $$<br />
where the $f_i$ are the functions of Lemma 1.42. Note that there is no need to require $C_1 \neq C_2$.<br />
Delete and update operations can be defined analogously.<br />
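The guarded double insert of (1.14) can be sketched as follows. The sketch abstracts from identifiers and treats each extent as a set of value representations, which is an assumption made for brevity. The dictionary keys `"C1"` and `"C2"` and the function names are hypothetical.<br />

```python
def insert_with_inclusion(db, v1, v2, f1, f2):
    """Guarded insert: applicable only if the lifted values agree."""
    if f1(v1) != f2(v2):
        return False            # guard fails, the state stays unchanged
    db["C1"].add(v1)            # generic insert on C1
    db["C2"].add(v2)            # additional insert on C2
    return True

def inclusion_holds(db, f1, f2):
    """Every lifted value occurring in C1 also occurs in C2."""
    return {f1(v) for v in db["C1"]} <= {f2(v) for v in db["C2"]}
```

As a usage example, with employees `(name, department)` in `C1` and departments `(department, city)` in `C2`, one would pass `f1 = lambda e: e[1]` and `f2 = lambda d: d[0]`. Inserting an employee together with a department of a different name is then rejected by the guard.<br />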

Functional and Uniqueness Constraints. Now let $\mathcal{I}$ be a functional constraint on $C$<br />
defined via $c_1 : T_C \to T_1$ and $c_2 : T_C \to T_2$. In this case nothing is required for the delete<br />
operation, whereas for inserts (and updates) we have to add a postcondition. Moreover, let<br />
$c^{\mathcal{D}} : T_C \to V_C$ denote the function associated with the value-representability of $C$ and the<br />
database instance $\mathcal{D}$ and let all other notations be as before. Let us again concentrate on the<br />
insert operation. Let $\mathit{insert}'_C$ denote the canonical insert on $C$. Then we define<br />
$$ \begin{array}{l} (\mathit{insert}_C)_\mathcal{I}(V :: V_C) \;== \\ \quad I :: ID \mid I \leftarrow \mathit{insert}'_C(V) \,; \\ \quad V' :: T_C \mid (I, V') \in x_C \to \\ \quad (\forall J :: ID,\ W :: T_C .\ ((J, W) \in x_C \wedge c_1(W) = c_1(V') \Rightarrow c_2(W) = c_2(V'))) \to skip . \end{array} \qquad (1.15) $$<br />
Note that in this case there is no change of input type. For delete and update operations we<br />
have analogous definitions.<br />
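The effect of the postcondition in (1.15) can be sketched as an insert followed by a check of the functional constraint for the new object. The sketch assumes extents given as sets of values without identifiers, and it models a failing guard by returning the unchanged extent together with a flag. The function name is hypothetical.<br />

```python
def insert_with_functional_constraint(extent, v, c1, c2):
    """Insert v, then require: equal c1-values imply equal c2-values with v."""
    candidate = extent | {v}
    holds = all(c2(w) == c2(v) for w in candidate if c1(w) == c1(v))
    # If the postcondition fails, the operation is not applicable.
    return (candidate, True) if holds else (extent, False)
```

For instance, with values `(name, zip, city)`, `c1` selecting the zip code and `c2` the city, a second person with a known zip code is accepted only if the city matches the one already stored for that zip code.<br />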

A uniqueness constraint defined via $c_1 : T_C \to T_1$ is equivalent to a functional constraint<br />
defined via $c_1$ and $c_2 = id : T_C \to T_C$ plus the trivial uniqueness constraint. Since trivial<br />
uniqueness constraints are already enforced by the canonical update operations, there is no<br />
need to handle arbitrary uniqueness constraints separately.<br />


Exclusion Constraints. The handling of exclusion constraints is analogous to the handling<br />
of inclusion constraints. This means that an insert (update) on one class may cause a delete<br />
on the other, whereas delete operations remain unchanged.<br />

We concentrate again on the insert operation. Let $\mathcal{I}$ be an exclusion constraint on $C_1$ and<br />
$C_2$ defined via $c_i : T_{C_i} \to T$ ($i = 1, 2$). Let $f_i : V_{C_i} \to T$ denote the functions from Lemma<br />
1.42. Then we define a new insert operation on $C_1$ by<br />
$$ \begin{array}{l} (\mathit{insert}_{C_1})_\mathcal{I}(V :: V_{C_1}) \;== \\ \quad \mathit{insert}_{C_1}(V) \,; \\ \quad \mu S .\ ((I :: ID \mid V' :: T_{C_2} \mid (I, V') \in x_{C_2} \wedge c_2(V') = f_1(V) \to \mathit{delete}_{C_2}(V') \,;\, S) \boxtimes skip) . \end{array} \qquad (1.16) $$<br />
For delete and update operations an analogous result holds.<br />
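The insert of (1.16) first performs the insert on the first class and then repeatedly deletes clashing objects from the second class until none is left. A minimal sketch, assuming extents given as sets of values without identifiers and hypothetical function names, could look like this:<br />

```python
def insert_with_exclusion(ext1, ext2, v, f1, c2):
    """Insert v into the first extent and remove all clashing objects from the second."""
    new1 = set(ext1) | {v}
    new2 = set(ext2)
    while True:
        # Look for some object of the second class with the same lifted value.
        clash = next((w for w in new2 if c2(w) == f1(v)), None)
        if clash is None:
            return new1, new2          # no clash left: behave like skip
        new2.discard(clash)            # delete on the second class, then repeat
```

The loop mirrors the recursive structure of the definition: each iteration selects one clashing object, deletes it and starts over, and the operation ends once the guard can no longer be satisfied.<br />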

Theorem 1.43. The methods $S_\mathcal{I}$ in (1.14), (1.15) and (1.16) are MQCSs of generic insert methods<br />
with respect to inclusion, functional and exclusion constraints respectively.<br />

The proof involves detailed use of predicate transformers and is therefore omitted here [48, 49].<br />
Analogous results hold for delete and update.<br />

1.7 Conclusion<br />

In this paper we describe rst results concern<strong>in</strong>g the formal foundations <strong>of</strong> object oriented<br />

database concepts. For this purpose we<strong>in</strong>troduced a formal object oriented datamodel (OODM)<br />

with the follow<strong>in</strong>g characteristics.<br />

{ <strong>Object</strong>s are considered to be abstractions <strong>of</strong> real world entities, hence they have an immutable<br />

identity. This identity is encoded by abstract identiers that are assumed to form<br />

some type ID. This identifier concept eases the modelling of shared data and cyclic references;<br />

however, it does not relieve us of the problem of providing unique identification<br />

mechanisms for objects in a database.<br />

– In our approach, an object is not associated with only one value of a given type.<br />

Instead, we allow several values of possibly different types to belong to an<br />

object, and even this collection of types may change.<br />

– Classes are used to structure objects. At each time, a class corresponds to a collection of<br />

objects with values of the same type and references to objects in a fixed set of classes.<br />

Inheritance is based on IsA relations that express an inclusion, at each time, of the sets of<br />

objects. Moreover, referential integrity is supported.<br />

– We associate a collection of methods with each class. Methods are specified by guarded<br />

commands; hence the method language is computationally complete. In order to handle<br />

both identifiers, which are always hidden from the user, and user-accessible<br />

transactions, a hiding operator on methods is introduced. Generic update operations, i.e.<br />

insert, delete and update on a class, are assumed to be derived automatically whenever<br />

this is possible.<br />

– We associate integrity constraints with schemata. Certain kinds of such constraints can be<br />

obtained by generalizing corresponding constraints in the relational model. We assume<br />

that methods are automatically changed in order to enforce integrity.<br />
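The two structural conditions named in the list above, IsA inclusion and referential integrity, can be sketched as simple checks. This is only an illustration in Python, not the formal OODM: the schema tables `REFS` and `ISA`, the class names, and the encoding of an object as an identifier mapped to a pair (value, references) are all assumptions made for the example.<br />

```python
# Hypothetical schema: the target class of each reference, and IsA relations.
REFS = {"Person": {"spouse": "Person"}, "Student": {"supervisor": "Person"}}
ISA = {"Student": ["Person"]}


def isa_ok(db):
    """IsA inclusion: every identifier in a subclass also occurs in each superclass."""
    return all(
        set(db[sub]) <= set(db[sup])
        for sub, sups in ISA.items()
        for sup in sups
    )


def refs_ok(db):
    """Referential integrity: every reference points to an identifier of its target class."""
    return all(
        target in db[REFS[cls][ref]]
        for cls, objects in db.items()
        for _value, refs in objects.values()
        for ref, target in refs.items()
    )


# A database instance: class name -> {identifier: (value, references)}.
db = {
    "Person": {1: ({"name": "Ann"}, {}), 2: ({"name": "Bob"}, {"spouse": 1})},
    "Student": {2: ({"year": 3}, {"supervisor": 1})},
}
```

Both checks are meant to hold at each time, i.e. for every database state, which is why they take the whole instance `db` as argument.<br />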



On the basis of this formal OODM we study the problems of identification, genericity and<br />

integrity. We show that the unique identification of objects in a class requires the class to be<br />

value-representable.<br />

An advantage of database systems is that they provide generic update operations. We show that<br />

the unique existence of such generic methods also requires value-representability. However, in<br />

this case referential and inclusion integrity can be enforced automatically. This result can be<br />

generalized with respect to distinguished classes of user-defined integrity constraints. Given<br />

an arbitrary method $S$ and a constraint $I$, there exists a greatest consistent specialization<br />

(GCS) $S_I$ of $S$ with respect to $I$. Such a GCS behaves well in that it is compatible with the<br />

conjunction of constraints. For the GCS construction of a user-defined transaction we apply<br />

the GCS algorithm developed in [48, 51, 52, 53].<br />

This work on the mathematical foundations of OODB concepts is not yet complete. Many<br />

problems remain open and are the subject of current investigation and future research.<br />

– In our approach classes are sets. What other bulk types are there? Does it make sense to abstract<br />

from classes in this way?<br />

– The problem of updatable views is still open.<br />

– Our approach to genericity only handles the worst case expressed by the value representation<br />

type. We assume that polymorphism will help to generalize our results to the general<br />

case. Moreover, we must integrate communication aspects, at least with respect to the<br />

user.<br />

– The usual axiomatic semantics for guarded commands abstracts from an execution model.<br />

All results are true for semantic equivalence classes. However, we also need optimization,<br />

especially with respect to the derived GCSs.<br />

– We only presented a formal OODM without looking into methodological aspects such as<br />

the characterization of good designs.<br />

We hope that others will also contribute to solving open problems in OODB foundations<br />

or to implementing more sophisticated object oriented database languages on<br />

a sound mathematical basis.<br />

References for Chapter 1<br />

1. S. Abiteboul: Towards a deductive object-oriented database language, Data & Knowledge Engineering,<br />

vol. 5, 1990, pp. 263–287<br />

2. S. Abiteboul, R. Hull: IFO: A Formal Semantic Database Model, ACM ToDS, vol. 12 (4), December<br />

1987, pp. 525–565<br />

3. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD,<br />

Portland Oregon, 1989, pp. 159–173<br />

4. A. Albano, G. Ghelli, R. Orsini: Types for Databases: The Galileo Experience, in Type Systems<br />

and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and<br />

Computational Sciences, Research Report CS/90/3, pp. 27–37<br />

5. A. Albano, A. Dearle, G. Ghelli, C. Marlin, R. Morrison, R. Orsini, D. Stemple: A Framework for<br />

Comparing Type Systems for Database Programming Languages, in Type Systems and Database<br />

Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational<br />

Sciences, Research Report CS/90/3, 1990<br />

6. A. Albano, G. Ghelli, R. Orsini: Objects and Classes for a Database Programming Language, FIDE<br />

technical report 91/16, 1991<br />



7. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented<br />

Database Programming Language, in A. Sernadas (Ed.): Proc. VLDB 91, Barcelona 1991<br />

8. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented<br />

Database System Manifesto, Proc. 1st DOOD, Kyoto 1989<br />

9. F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel, S. Gamerman, C. Lecluse, P. Pfeffer,<br />

P. Richard, F. Velez: The Design and Implementation of O$_2$, an Object-Oriented Database System,<br />

Proc. of the ooDBS II workshop, Bad Münster, FRG, September 1988<br />

10. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, pp. 370–395<br />

11. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol.<br />

5 (4), 1990, pp. 353–382<br />

12. C. Beeri, Y. Kornatzky: Algebraic Optimization of Object-Oriented Query Languages, in S. Abiteboul,<br />

P. C. Kanellakis (Eds.): Proc. ICDT '90, Springer LNCS 470, pp. 72–88<br />

13. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92<br />

14. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM<br />

Computing Surveys, vol. 17 (4), pp. 471–522<br />

15. L. Cardelli: Typeful Programming, Digital Systems Research Center Reports 45, DEC SRC Palo<br />

Alto, May 1989<br />

16. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc.<br />

ACM SIGMOD 88<br />

17. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the<br />

Workshop on Database Programming Languages, Roscoff, France, September 1987<br />

18. S. Ceri, J. Widom: Deriving Production Rules for Constraint Maintenance, Proc. 16th Conf. on<br />

VLDB, Brisbane (Australia), August 1990, pp. 566–577<br />

19. A. Dearle, R. Connor, F. Brown, R. Morrison: Napier88 - A Database Programming Language?,<br />

in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of<br />

Mathematical and Computational Sciences, Research Report CS/90/3, pp. 10–26<br />

20. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer-Verlag, 1989<br />

21. H.-D. Ehrich, M. Gogolla, U. Lipeck: Algebraische Spezifikation abstrakter Datentypen,<br />

Teubner-Verlag, 1989<br />

22. H.-D. Ehrich, A. Sernadas: Fundamental Object Concepts and Constructors, in G. Saake, A. Sernadas<br />

(Eds.): Information Systems – Correctness and Reusability, TU Braunschweig, Informatik<br />

Berichte 91-03, 1991<br />

23. H. Ehrig, B. Mahr: Fundamentals of Algebraic Specification, vol. 1, Springer 1985<br />

24. L. Fegaras, T. Sheard, D. Stemple: The ADABTPL Type System, in Type Systems and Database<br />

Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational<br />

Sciences, Research Report CS/90/3, pp. 45–56<br />

25. L. Fegaras, T. Sheard, D. Stemple: Uniform Traversal Combinators: Definition, Use and Properties,<br />

University of Massachusetts, 1992<br />

26. D. Fishman, D. Beech, H. Cate, E. Chow et al.: IRIS: An Object-Oriented Database Management<br />

System, ACM ToIS, vol. 5 (1), January 1987<br />

27. P. Fraternali, S. Paraboschi, L. Tanca: Automatic Rule Generation for Constraint Enforcement<br />

in Active Databases, in U. Lipeck (Ed.): Proc. 4th Int. Workshop on Foundations of Models and<br />

Languages for Data and Objects "MODELLING DATABASE DYNAMICS", Volkse (Germany),<br />

October 19-22, 1992<br />

28. G. Gottlob, G. Kappel, M. Schrefl: Semantics of Object-Oriented Data Models – The Evolving<br />

Algebra Approach, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information<br />

Systems Technology, Springer LNCS, vol. 504, 1991<br />

29. M. Hammer, D. McLeod: Database Description with SDM: A Semantic Database Model, J. ACM,<br />

vol. 31 (3), 1984, pp. 351–386<br />

30. A. Heuer, P. Sander: Classifying Object-Oriented Results in a Class/Type Lattice, in B. Thalheim<br />

et al. (Ed.): Proceedings MFDBS 91, Springer LNCS 495, pp. 14–28<br />

31. R. Hull, R. King: Semantic Database Modeling: Survey, Applications and Research Issues, ACM<br />

Computing Surveys, vol. 19 (3), September 1987<br />



32. R. Hull, M. Yoshikawa: ILOG: Declarative Creation and Manipulation of Object Identifiers, in<br />

Proc. 16th VLDB, Brisbane (Australia), 1990, pp. 455–467<br />

33. A. P. Karadimce, S. D. Urban: Diagnosing Anomalous Rule Behaviour in Databases with Integrity<br />

Maintenance Production Rules, in Proc. 3rd Int. Workshop on Foundations of Models and Languages<br />

for Data and Objects, Aigen (Austria), September 1991, pp. 77–102<br />

34. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon,<br />

1986<br />

35. M. Kifer, J. Wu: A Logic for Object-Oriented Logic Programming (Maier's O-Logic Revisited), in<br />

PODS '89, pp. 379–393<br />

36. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented<br />

Programming System with a Database System, in Proc. OOPSLA 1988<br />

37. D. Maier, J. Stein, A. Ottis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA,<br />

September 1986<br />

38. F. Matthes, J. W. Schmidt: Bulk Types – Add-On or Built-In?, in Proc. DBPL III, Nafplion 1991<br />

39. J. Mylopoulos, P. A. Bernstein, H. K. T. Wong: A Language Facility for Designing Interactive<br />

Database-Intensive Applications, ACM ToDS, vol. 5 (2), April 1980, pp. 185–207<br />

40. J. Mylopoulos, A. Borgida, M. Jarke, M. Koubarakis: Telos: Representing Knowledge About Information<br />

Systems, ACM ToIS, vol. 8 (4), October 1990, pp. 325–362<br />

41. G. Nelson: A Generalization of Dijkstra's Calculus, ACM TOPLAS, vol. 11 (4), October 1989, pp.<br />

517–561<br />

42. A. Ohori: Representing Object Identity in a Pure Functional Language, Proc. ICDT 90, Springer<br />

LNCS, pp. 41–55<br />

43. G. Saake, R. Jungclaus: Specification of Database Applications in the TROLL Language, in Proc.<br />

Int. Workshop on the Specification of Database Systems, Glasgow, 1991<br />

44. K.-D. Schewe, I. Wetzel, J. W. Schmidt: Towards a Structured Specification Language for Database<br />

Applications, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database<br />

Systems, Springer WICS, 1991, pp. 255–274 (an extended version appeared as FIDE technical<br />

report 1991/30, October 1991)<br />

45. K.-D. Schewe, B. Thalheim, I. Wetzel, J. W. Schmidt: Extensible Safe Object-Oriented Design of<br />

Database Applications, University of Rostock, Preprint CS-09-91, September 1991<br />

46. K.-D. Schewe: Spezifikation datenintensiver Anwendungssysteme (in German), lecture manuscript,<br />

University of Hamburg, Winter 1991/92<br />

47. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in<br />

Object-Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 341–356<br />

48. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity Enforcement in Object-Oriented<br />

Databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models<br />

and Languages for Data and Objects "MODELLING DATABASE DYNAMICS", Volkse (Germany),<br />

October 19-22, 1992<br />

49. K.-D. Schewe, B. Thalheim, I. Wetzel: Foundations of Object Oriented Database Concepts, University<br />

of Hamburg, Report FBI-HH-B-157/92, October 1992<br />

50. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method<br />

Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte,<br />

no. 14, 1992<br />

51. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint<br />

CS-08-92, December 1992<br />

52. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity Preserving Updates in Object Oriented Databases,<br />

in M. Orlowska, M. Papazoglou (Eds.): Proc. 4th Australian Database Conference, Brisbane,<br />

February 1993, World Scientific, pp. 171–185<br />

53. K.-D. Schewe, B. Thalheim: Exceeding the Limits of Rule Triggering Systems to Achieve Consistent<br />

Transactions, submitted for publication<br />

54. M. H. Scholl, H.-J. Schek: A Relational Object Model, in Proc. ICDT 90, Springer LNCS, pp.<br />

89–105<br />



55. G. M. Shaw, S. B. Zdonik: An Object-Oriented Query-Algebra, IEEE Data Engineering, vol. 12<br />

(3), 1989, pp. 29–36<br />

56. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages,<br />

in Proc. HICSS '92<br />

57. D. Stemple, S. Mazumdar, T. Sheard: On the Modes and Meaning of Feedback to Transaction<br />

Designer, in Proc. SIGMOD 1987, pp. 375–386<br />

58. D. Stemple, T. Sheard: Automatic Verification of Database Transaction Safety, ACM ToDS, vol.<br />

14 (3), September 1989<br />

59. M. Stonebraker, A. Jhingran, J. Goh, S. Potamianos: On Rules, Procedures, Caching and Views in<br />

Database Systems, in Proc. SIGMOD 1990, pp. 281–290<br />

60. S. Y. W. Su: SAM*: A Semantic Association Model for Corporate and Scientific-Statistical<br />

Databases, Inf. Sci., vol. 29, 1983, pp. 151–199<br />

61. B. Thalheim: Dependencies in Relational Databases, Teubner Leipzig, 1991<br />

62. B. Thalheim: The Higher-Order Entity-Relationship Model, in J. W. Schmidt, A. A. Stognij (Eds.):<br />

Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991<br />

63. S. D. Urban, L. Delcambre: Constraint Analysis: a Design Process for Specifying Operations on<br />

Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990<br />

64. J. Widom, S. J. Finkelstein: Set-oriented Production Rules in Relational Database Systems, in<br />

Proc. SIGMOD 1990, pp. 259–270<br />

65. Y. Zhou, M. Hsu: A Theory for Rule Triggering Systems, in Proc. EDBT '90, Springer LNCS 416,<br />

pp. 407–421<br />



Chapter 2<br />

Identification as a Primitive of Database Models<br />

Contents<br />

2.1 Introduction<br />

2.2 The Identification Problem<br />

2.3 Identification Concepts in Databases<br />

2.4 Comparison of Identification Concepts<br />

2.5 Conclusion<br />

This chapter contains a reprint of<br />

Catriel Beeri, Bernhard Thalheim. Identification as a Primitive of Database Models.<br />

In T. Polle, T. Ripke, K.-D. Schewe. Fundamentals of Information Systems. Kluwer<br />

1998.<br />



Abstract. Identification is one of the main primitives of database technology. Whereas identification<br />

of real world entities by humans is an extremely flexible mechanism, identification in<br />

a database system is severely restricted, since the identification mechanism used in it depends<br />

on the data model and the type system on which it is based. To understand the modelling<br />

power of a data model, it is necessary to understand the identification mechanism it supports.<br />

Thus, this paper surveys and analyses the identification mechanisms of database models.<br />

2.1 Introduction<br />

Databases are used to represent entities<sup>1</sup> of the real world. On a sufficiently high level of<br />

abstraction, everything we deal with in our life, whether concrete or abstract, is an entity.<br />

However, to facilitate the construction of a world model, and certainly if one wants to use such<br />

a model as a basis of a database representation, it is useful to distinguish between entities,<br />

properties of entities, associations between entities, etc. In a computerized system, some entities<br />

are represented by atomic values (numbers, for example), whereas others are represented<br />

by structured, non-atomic values, such as tuples in the relational model, or by objects in<br />

object-oriented systems. Properties and associations are represented as part of the structures<br />

representing the entities, or as additional structures. For example, an employee tuple in a<br />

relational database contains the values for its properties of interest, and may also contain<br />

values that represent relationships, for example a department number. If the relationship between<br />

employees and projects is many-to-many, then a separate relation may store the tuples<br />

describing it. In an object-oriented database, properties of an entity represented by an object<br />

are stored with it as associated values or objects. In either case, we may say that properties<br />

and associations are described by structures.<br />
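The relational case described above can be made concrete with a small sketch. It is written in Python purely for illustration; the relation names `employees` and `works_on`, the attribute layout, and the sample data are hypothetical and do not come from the paper.<br />

```python
# Hypothetical employee relation: each tuple holds property values and,
# for the relationship to a department, a department number.
employees = {
    (1, "Ann", 10),  # (emp_no, name, dept_no)
    (2, "Bob", 10),
    (3, "Eve", 20),
}

# The many-to-many relationship between employees and projects is stored
# in a separate relation whose tuples pair the identifying values.
works_on = {
    (1, "P1"),
    (1, "P2"),
    (2, "P1"),
}


def projects_of(emp_no):
    """Return the sorted project codes recorded for one employee."""
    return sorted(p for (e, p) in works_on if e == emp_no)


def members_of(dept_no):
    """Return the sorted names of the employees in one department."""
    return sorted(name for (_, name, d) in employees if d == dept_no)
```

In both relations the association is expressed only through values (`emp_no`, `dept_no`, project codes), which is why identification by values matters in this model.<br />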

Entities in the real world have the following properties:<br />

– An entity is uniquely identified by its history, and by its properties and associations.<br />

– Its set of properties and associations can be arbitrary.<br />

– It has a life cycle: it is created, it exists, then it ceases to exist.<br />

– An entity can exist independently of other entities.<br />

Note that this holds even for entities that on first thought we may believe not to satisfy some of<br />

the above. For example, nails in a box exist, and each is a unique physical entity. Furthermore,<br />

each is uniquely identified at every point of time by its location in the box. Time-independent<br />

identification, for example by minute differences in lengths or weights, may also exist. The<br />

last property may not hold for abstract entities, i.e., entities that are conceptual, rather than<br />

physical.<br />

The fact that the set of properties is arbitrary is important in real life. We recognize other<br />

people by many different properties. We may believe we know somebody by his hair colour.<br />

Meeting him after twenty years, the colour has changed, or the hair is gone, yet we still know<br />

him.<br />

The flexibility that exists in the real world cannot be directly supported in computerized<br />

representations. When we choose to represent a universe of discourse in a database, we restrict<br />

the properties and associations we care to represent to a finite, pre-specified set. Although<br />

this set is arbitrary, in the sense that we can choose it as we like, it is fixed by the choice.<br />

Furthermore, our choices regarding what to represent are guided by feasibility and cost. While<br />

we can, if we wish, record the location of each nail at each point of time, in practice<br />

we are ready to pay the price of such a system for locating cars, but not nails. A primary<br />

goal of the representations we choose is to allow us to uniquely identify entities, as this is<br />

the basis for proper use and manipulation. The restrictions on representations impose severe<br />

limits on how we can uniquely identify the (representations of the) entities in the database.<br />

This applies not only to the cases where we have given up the option of unique representation,<br />

such as the nail box, but also to many of the cases where our representation is `full'.<br />

<sup>1</sup> We use `entity' here in the normal natural language sense. It should not be confused with its (closely related<br />

but technical) use in the entity-relationship model.<br />

While identification in relational databases has been solved by the key concept, the issue is<br />

still vague in OODBs. By way of motivation, one of the authors has performed an experiment<br />

on a commercial OODB. Three objects with the name `John' and cyclic references `friends'<br />

between them were created. The query `How many Johns are in the database' was run several<br />

times. The results were 3, 6, 9 for the first three runs, respectively, and increased similarly for<br />

subsequent runs. It seems very plausible that the failure has to do with unique identifiability<br />

of the three objects.<br />
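The setting of this experiment can be rebuilt in a few lines. The Python sketch below is not a reconstruction of the commercial system or of the cause of its behaviour, which the text leaves open; the class `Person` and both counting functions are hypothetical. It only illustrates that a count which follows references without tracking object identity reports visits rather than objects, whereas an identity-based count stays at three.<br />

```python
class Person:
    """Hypothetical object type: a name plus references to friends."""

    def __init__(self, name):
        self.name = name
        self.friends = []


def make_johns():
    """Create three objects named 'John' whose 'friends' references form a cycle."""
    a, b, c = Person("John"), Person("John"), Person("John")
    a.friends, b.friends, c.friends = [b], [c], [a]
    return [a, b, c]


def count_by_identity(roots, name):
    """Count distinct reachable objects with the given name, using identity to stop at cycles."""
    seen = {}
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen[id(obj)] = obj
        stack.extend(obj.friends)
    return sum(1 for o in seen.values() if o.name == name)


def count_visits(roots, name, depth):
    """Count visits (not objects) while following references for `depth` steps without tracking identity."""
    count = 0
    frontier = list(roots)
    for _ in range(depth + 1):
        count += sum(1 for o in frontier if o.name == name)
        frontier = [f for o in frontier for f in o.friends]
    return count
```

Note that the three objects carry the same value `John`, so a value-based comparison of names cannot separate them either; only the object identity does.<br />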

Overview of the paper<br />

In this paper we discuss the representation of entities in information systems, and the<br />

identification mechanisms of different database models. We distinguish several notions of identification,<br />

and in particular between identification and separability, and we show that currently<br />

implemented mechanisms are limited.<br />

Section 2 discusses the identification problem in general and for object-oriented databases.<br />

Section 3 introduces different identification concepts. These concepts are compared in Section<br />

4. Section 5 demonstrates that there are further concepts which can be used for identification<br />

as well.<br />

2.2 The Identification Problem<br />

Identification is intimately related to equality. In the real world, to say that entities $t_1$ and<br />

$t_2$ are equal means that they are the same, i.e. they are identical. To uniquely identify an<br />

entity is to be able to separate it from any entity that is not identical to it. As mentioned<br />

above, in the real world, we may identify entities by arbitrary combinations of properties<br />

and associations, which may change over time. The situation is further complicated by the<br />

fact that entities are often related to roles and abstractions, and it is not always clear in a<br />

statement to which of those one relates.<br />

Consider the following equalities: Clinton = Clinton, Cicero = Ford, and Clinton = The<br />

President of the USA. Clinton and Ford refer to physical entities that existed (each at a certain<br />

time). Thus, the first equality is trivially true, it is an identity, and the second is false. Is the<br />

third equality true or false? If we take it to mean that these two different names, denoting<br />

the two conceptual personalities Clinton and The President of the USA, refer to one and the same physical<br />

person, then it is true. If the intention is that the two conceptual entities are identical, it is<br />

false. And note that if we say it is true, then Ford = The President of the USA was also true<br />

at some time.<br />

In the real world, such differences are resolved in a variety of ways: by context, by<br />

asking for clarification, by misunderstanding, and so on. In a database system using current<br />

technology the meaning has to be clear, or, at most, resolution should be obvious given schema<br />

information. Note that deducing that some objects in a database are identical can have nontrivial<br />

consequences. Consider the situation that in a database we have:<br />

{books} ---- The President, Clinton ---- {business friends}.<br />

By the equality Clinton = The President the object Clinton can inherit the properties of the<br />

President, e.g. the books, and the President inherits the business friends of Clinton.<br />
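A minimal sketch of this consequence, assuming a toy database in which each object name maps to a dictionary of set-valued properties. The function `identify` and the sample values are hypothetical; Python is used only for illustration.<br />

```python
def identify(db, a, b):
    """Return a copy of db in which a and b are declared equal and share the union of their properties."""
    merged = {}
    for key in set(db[a]) | set(db[b]):
        merged[key] = db[a].get(key, set()) | db[b].get(key, set())
    result = dict(db)
    result[a] = merged
    result[b] = merged  # both names now denote the same property record
    return result


# Hypothetical database state before the equality is deduced.
db = {
    "The President": {"books": {"b1", "b2"}},
    "Clinton": {"business friends": {"f1"}},
}
```

After `identify(db, "Clinton", "The President")` both names carry the books and the business friends, which is exactly the non-trivial effect of deducing the equality; the original `db` is left unchanged.<br />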

Logical Fundamentals of Identification<br />

Computerized systems are one instance of formal systems for world representation. It is of<br />

interest to consider how equality and identification were treated in other domains that deal<br />

with such systems.<br />

In philosophy and the study of logic various principles have been considered together with<br />

the equality concept (for a theory of equality see [5], [13], [17], [8]).<br />

The indistinguishability principle [7] formulated by Leibniz [16] states that entities which<br />

cannot be distinguished by the unary formulas or predicates of the given language are equal,<br />

i.e.<br />

$x = y$ iff $\forall P\,(P(x) \leftrightarrow P(y))$.<br />

The characterization of entities is related to the abstraction principle in the sense in which it is used in<br />

database modelling, i.e. abstracting from most properties and concentrating on some properties.<br />

The indistinguishability presented here depends on the chosen language and, in the case of partial<br />

predicates, on applicability. Thus, denoting by $P(x)!$ the applicability of $P$ to $x$,<br />

we have two versions of Leibniz's principle:<br />

1. $\forall P\,\bigl((P(x)! \wedge P(y)!) \rightarrow (P(x) \leftrightarrow P(y))\bigr)$<br />

2. $\forall P\,\bigl((P(x)! \leftrightarrow P(y)!) \wedge ((P(x)! \wedge P(y)!) \rightarrow (P(x) \leftrightarrow P(y)))\bigr)$.<br />

The first version restricts distinction to those predicates which are defined on $x$ and $y$ at<br />

the same time. If $P$ is not defined on one of the entities and defined on the other then this<br />

difference is not used for distinction. The second version permits this possibility.<br />
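The difference between the two versions shows up as soon as a predicate is undefined on one of the entities. The following Python sketch is an illustration under stated assumptions: a partial predicate is modelled as a function that returns `None` where it is not applicable, the universal quantifier ranges over a finite list of predicates, and the entities and the predicates `is_adult` and `has_licence` are invented for the example.<br />

```python
def applicable(p, x):
    """P(x)! : the predicate p is defined on x (assumption: None means undefined)."""
    return p(x) is not None


def indistinguishable_v1(preds, x, y):
    """Version 1: only predicates defined on both x and y may distinguish them."""
    return all(
        p(x) == p(y)
        for p in preds
        if applicable(p, x) and applicable(p, y)
    )


def indistinguishable_v2(preds, x, y):
    """Version 2: applicability must agree as well, so 'defined on only one' distinguishes."""
    return all(
        applicable(p, x) == applicable(p, y)
        and (not applicable(p, x) or p(x) == p(y))
        for p in preds
    )


# Hypothetical partial predicates over entities represented as dictionaries.
def is_adult(e):
    return e["age"] >= 18 if "age" in e else None


def has_licence(e):
    return e["licence"] if "licence" in e else None


preds = [is_adult, has_licence]
x = {"age": 30, "licence": True}
y = {"age": 40}  # has_licence is not applicable to y
```

Here `x` and `y` are indistinguishable under version 1, because `has_licence` is ignored, but distinguishable under version 2, because the predicate is applicable to only one of them.<br />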

The principle is also related to the observation property. If for a given entity its identity<br />

can be observed on the basis of a calculus then this observability can be used for identification<br />

as well. Observation is closely related to and crucially depends on scope. A scope defines what<br />

is visible from a given viewpoint, and only what is visible can be used for identification. For<br />

example, a view over a database often presents less information than the complete database,<br />

and that may prevent entity representations from being distinguishable from each other in the view.<br />

(This is closely related to the view update problem.)<br />

Summarizing, (in)distinguishability depends on the languages which are used for the representation<br />

of entities and for querying them. These characteristics hold in information systems<br />

as well, as shown in the sequel.<br />
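The effect of scope can be seen with a projection view. In this Python sketch the relation `employees`, its attributes, and the helper functions are hypothetical; the point is only that two representations that are separable in the full relation can coincide once a view hides the distinguishing attribute.<br />

```python
def project(relation, attributes):
    """Return the view of a relation restricted to the given attributes, keeping duplicates."""
    return [tuple(row[a] for a in attributes) for row in relation]


def separable(rows):
    """True if every row differs from every other row."""
    return len(set(rows)) == len(rows)


# Hypothetical relation: emp_no distinguishes two otherwise equal tuples.
employees = [
    {"emp_no": 1, "name": "John", "dept": "Sales"},
    {"emp_no": 2, "name": "John", "dept": "Sales"},
]

full = project(employees, ["emp_no", "name", "dept"])
view = project(employees, ["name", "dept"])  # the view hides emp_no
```

With these data `separable(full)` holds while `separable(view)` does not, so an update issued against the view could not say which of the two underlying tuples it refers to.<br />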

Values and Objects<br />

Values are the basic building blocks of data. Atomic values represent universally known abstractions. For example, numbers are atomic values. By `universally known abstraction' we mean that it has a standard meaning that is known to a large community; further, in this community, there are accepted denotation(s) for it. Certainly, numbers satisfy these requirements. Values can also be combined in various ways to form structures, such as tuples, lists or sets. These are non-atomic or structured values. Most often values are partitioned into sets, called domains, such as the set of integers, the set of characters, and so on; a value is an element of a domain of values. Each value has a fixed, user-visible denotation/representation, bound to an element of the domain, and these have the same form for all values in a domain. A system normally supports several domains of atomic values, several kinds of non-atomic values that can be constructed from them, and, depending on the intended semantics, a collection of functions, also called operations, on these domains. Values do not change; the operations only map values to other values. They are not created, nor do they cease to exist.<br />

The other kind of data in a computerized system are objects. They normally represent entities or abstractions that are not necessarily universally known, and whose existence is not pre-wired into the system. Their properties are influenced by those of such entities and by the representation method, and are the following [6]:<br />

- An object has an internal structure and has a state according to this internal structure.<br />

- It has a life cycle: it is created, it can be modified and it is finally removed.<br />

- Its identity cannot be changed during its lifetime. Identity is that property of an object which distinguishes each object from all others [12].<br />

- An object can exist independently from other objects. 2<br />

The internal structure of an object serves as a representation of its properties, and possibly also of (some of) its associations. 3 At any point of time, this structure provides the values of the properties at that time. Just as is the case for real world entities, properties can be changed; this is modification. However, the identity of the object never changes throughout its lifetime. A value can be seen as a special kind of object that has no properties except itself (for an atomic value), or its components (for a structured value). Hence, a value is never modified. Whereas numbers are immutable and exist forever, an employee object is created when an employee is hired, its properties are subject to change (e.g., a salary raise), and when the employee quits the company, the object is removed.<br />

While objects differ from values in several ways as described above, the difference that is often assumed to be the primary concept distinguishing OODB's from previous models is the existence of object identity. Simply stated, an object has an identity that is independent of its properties and associations, and is immutable. The object's properties, or the associations in which it participates, can change, but the identity never changes throughout its lifetime. This idea is considered a cornerstone for the proper representation of real world entities. In particular, each object represents a unique entity, and only one object represents each entity.<br />

But how is this requirement accomplished in a database system? A common approach is to implement identification by means of object identifiers (o-id's). O-id's are system-supplied (and implementation-dependent) atomic items, used solely for the purpose of identifying objects in the system. 4 An identifier is assigned to an object upon creation, and it never changes. The uniqueness and immutability of identifiers guarantee that the system can uniquely identify each object throughout its lifetime. That means that they can be used for access structures, or to allow an object to be an attribute value of another object (or of several other objects), by using the o-id as a surrogate. However, o-id's are considered internal, their values being meaningless to users; hence the only operation on them that is permitted to users is to ask whether two o-id's (equivalently, two objects) are identical. This is commensurate with the OODB philosophy of encapsulation: the internal state of an object can be observed and manipulated exclusively through its interface. Indeed, if o-id's were made available for users to view, they would simply be just another attribute value, like employee numbers.<br />

2 In some models that incorporate a notion of composite object, the existence of an object may depend on that of others.<br />

3 In the currently accepted models, relationships as a separate concept are not supported. See [18] for an influential paper that suggests an extension of object models with relationships.<br />

4 The o-id concept is similar to that of surrogate or tuple identifier in relational databases.<br />

But now we observe that if the user cannot see the values of o-id's, they cannot serve him/her for identifying objects! That is, while o-id's may serve a useful role at the implementation level, they serve no such role at the conceptual level. Thus, as noted in a previous work [3], identifiers are an implementation concept, not a conceptual one. We note that some systems actually use a physical address as the o-id, and this may change with physical reorganization. In such systems, the o-id certainly cannot serve for conceptual identification, although from the system's point of view, since such changes guarantee integrity of references, the o-id's can be considered immutable.<br />

The identification of objects at the conceptual and external levels must ultimately rely on values, just as in the value-based models. The difference (if any) is that an OODB has a rich structure, hence many more ways in which values can be associated with objects for identification. Further, the rich structure possibly allows the structure itself to serve as an identification or equality mechanism. For example, consider the two well-known notions of equality for objects, based on the values in this representation. In shallow equality, two objects are (shallow) equal if they have the same structure, and the values in the structures are pairwise equal. Note that a component of a structure may be an object, and then equality as identity is used. In deep equality, two objects are equal if their structures match, and for each pair of matching components, either they are equal values, or deep equal objects. Note that in the real world, entities are identified by value properties (such as hair colour, timbre of voice, height), or by associations with other entities that have value-based identification. Although identification in the real world is flexible and potentially complex, it is eventually value-based.<br />
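The difference between shallow and deep equality can be sketched as follows. This is an illustration only: the class `Obj`, its `state` dictionary and the sample objects are assumptions made for the example, and Python object identity (`is`) stands in for the system's notion of identity.<br />

```python
class Obj:
    """Hypothetical object: identity is the Python object itself,
    state is a dict mapping component names to values or other objects."""
    def __init__(self, **state):
        self.state = state

def shallow_equal(a: Obj, b: Obj) -> bool:
    """Same structure; value components equal, object components identical."""
    if a.state.keys() != b.state.keys():
        return False
    for name in a.state:
        x, y = a.state[name], b.state[name]
        if isinstance(x, Obj) or isinstance(y, Obj):
            if x is not y:        # object components: equality as identity
                return False
        elif x != y:              # value components: value equality
            return False
    return True

def deep_equal(a, b, _assumed=None) -> bool:
    """Structures match; components are equal values or deep-equal objects."""
    assumed = set() if _assumed is None else _assumed
    if isinstance(a, Obj) != isinstance(b, Obj):
        return False
    if not isinstance(a, Obj):
        return a == b
    if (id(a), id(b)) in assumed:
        return True               # pair already under comparison (cyclic state)
    assumed.add((id(a), id(b)))
    if a.state.keys() != b.state.keys():
        return False
    return all(deep_equal(a.state[k], b.state[k], assumed) for k in a.state)

addr1 = Obj(city="Cottbus")
addr2 = Obj(city="Cottbus")
e1 = Obj(name="Ann", addr=addr1)
e2 = Obj(name="Ann", addr=addr2)   # refers to a different but equal address
e3 = Obj(name="Ann", addr=addr1)   # refers to the very same address object
```

Here `e1` and `e3` are shallow equal, whereas `e1` and `e2` are only deep equal, because their address components are distinct objects with equal states.<br />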

Among implemented or proposed OODB models we can distinguish three different kinds:<br />

Value-based databases: All objects are value-identifiable, i.e. can be identified by values of their (public) attributes or by an unfolding or unnesting of the values. This means that a subset of the public attributes serves as a key for a class.<br />

Value-representable databases: All objects are reference-identifiable, where reference-identifiability can be recursively defined as follows:<br />

- Each value-identifiable object is also reference-identifiable.<br />

- If an object is identified by a combination of attribute values and by references to or from a set of objects such that each object in this set is reference-identifiable, then the object is itself reference-identifiable.<br />

Identifier-based databases: There are objects which are not reference-identifiable.<br />
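The recursive definition of reference-identifiability amounts to a least fixpoint computation. The sketch below assumes a simplified input that is not part of the text: a set of value-identifiable objects, and for each remaining object the set of objects whose identification (together with attribute values) would suffice to identify it. The object names are hypothetical.<br />

```python
def reference_identifiable(value_identifiable, identified_via):
    """Least fixpoint of the two rules of the recursive definition.

    value_identifiable: set of objects identified by attribute values alone.
    identified_via: dict mapping an object to the (non-empty) set of objects
        it refers to, or is referred from, for its identification.
        Both inputs are assumptions of this sketch.
    """
    result = set(value_identifiable)          # rule 1
    changed = True
    while changed:
        changed = False
        for obj, deps in identified_via.items():
            # rule 2: all objects used for identification are already identifiable
            if obj not in result and deps and deps <= result:
                result.add(obj)
                changed = True
    return result

# Hypothetical database: a department has a key, an employee is identified
# by values plus its department, and x, y only refer to each other.
identifiable = reference_identifiable(
    value_identifiable={"dept"},
    identified_via={"emp": {"dept"}, "x": {"y"}, "y": {"x"}},
)
```

In this example `x` and `y` are not reference-identifiable, so the database would fall into the identifier-based class.<br />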

Figure 2.1 depicts the relationships among these classes. Note that we do not claim that no other methods for identification exist. Finding methods that are efficient yet expressive is a subject for research.<br />

The Identification Problem in Object-Oriented Databases<br />

Summarizing the discussion above, the issue of identification of objects in OODB's is not solved by the use of o-id's and is far from being well understood. In particular, in the last class, it is possible for a database to contain objects that cannot be distinguished from each other.<br />



[Figure: classification diagram with the nodes database, value-oriented database, object-oriented database, value-based database, value-representable database, non-value-based database, and identifier-based database.]<br />

Fig. 2.1. Classification of databases<br />

We now illustrate the problem. For simplicity, we use a simple graph model (similar models have been used in e.g. GOOD [11]). Object graphs are defined on a set $O \cup V$ of nodes, where $O$ is a set of (abstract) objects and $V$ is a set of (atomic) values, and a set $L$ of edge labels. Labels can be $\in$, state, or names (used as attribute names). Type constructor names and class names are assumed to be elements of $V$. Thus, $V$ may contain values such as tuple, set, emp-class. Now an object graph $G$ is given by a finite set $N$ of nodes and a finite set $E$ of labeled edges, i.e. $E \subseteq N \times L \times N$. The label $\in$ appears on an edge that connects an object to its class, or an element to a set that contains it. The label state connects an object to a tuple value, which represents its state. A name label connects a tuple to a component. Thus, this simple model can describe complex types constructed by tuple and set constructors, object classes, and object states (without encapsulation).<br />
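An object graph of this kind can be represented directly as sets of nodes and labeled edges. The following sketch is one possible encoding, not the authors' implementation: the class name `ObjectGraph`, the label string `"in"` for $\in$, and the sample employee are assumptions. In particular, the tuple node `t1` is treated as an abstract node, since the text does not spell out whether tuple nodes belong to $O$ or $V$.<br />

```python
from dataclasses import dataclass

IN = "in"        # assumed spelling of the membership label
STATE = "state"  # label connecting an object to its tuple

@dataclass(frozen=True)
class ObjectGraph:
    objects: frozenset   # nodes taken from O
    values: frozenset    # nodes taken from V (includes class and constructor names)
    edges: frozenset     # triples (source, label, target), a subset of N x L x N

    def __post_init__(self):
        if self.objects & self.values:
            raise ValueError("O and V must be disjoint")
        nodes = self.objects | self.values
        for (u, _label, v) in self.edges:
            if u not in nodes or v not in nodes:
                raise ValueError("edge endpoint is not a node of the graph")

    @property
    def nodes(self):
        return self.objects | self.values

    def out_edges(self, u):
        """Pairs (label, target) of the edges leaving node u."""
        return {(label, v) for (x, label, v) in self.edges if x == u}

# Hypothetical employee object o1 of class emp-class with a tuple state t1.
g = ObjectGraph(
    objects=frozenset({"o1", "t1"}),
    values=frozenset({"emp-class", "tuple", "Smith", 3000}),
    edges=frozenset({
        ("o1", IN, "emp-class"),
        ("o1", STATE, "t1"),
        ("t1", IN, "tuple"),
        ("t1", "name", "Smith"),
        ("t1", "salary", 3000),
    }),
)
```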

Let us consider the graphs shown in figure 2.2. In (a), the objects $o_1, o_2, o_3$ cannot be distinguished from one another. They have the same outgoing and incoming edges, i.e. the graph is completely symmetric. However, if somehow $o_1$ could be distinguished from $o_2$, then all three objects could be distinguished from each other. In (b), since there are value nodes and $a \neq b$, objects $o_4, o_5, o_6$ can be distinguished either by their outgoing or by their incoming edges.<br />

[Figure: two object graphs with edges labeled s; (a) contains the objects $o_1, o_2, o_3$, and (b) contains the objects $o_4, o_5, o_6$ and the value nodes $a$, $b$.]<br />

Fig. 2.2. Identification in Object-Oriented Databases<br />

We can use identification trees for presenting the local structure of the graph around each object. The trees in figure 2.3 show the similarity of the neighborhoods for the three objects. If $(o_i, s, o_j) \in E$, the edge $(o_j, s^{\rightarrow}, o_i)$ is used for inverting the order.<br />

The graph in figure 2.4 also has non-trivial symmetries, so its objects cannot be uniquely identified. However, the objects are divided into two sets that can be distinguished from each other. Objects $o_2, o_3$ can be distinguished, that is separated from each other, iff objects $o_1, o_4$ can be distinguished.<br />

[Figure: identification trees of depth 1 rooted at $o_1$, $o_2$ and $o_3$.]<br />

Fig. 2.3. Trees of depth 1 for $o_1, o_2, o_3$<br />

[Figure: an object graph on the objects $o_1, o_2, o_3, o_4$ with edges labeled $s_1$, $s_2$, $s_3$.]<br />

Fig. 2.4. Objects which cannot be distinguished<br />

The examples demonstrate that the mere fact that objects have o-id's cannot serve for identification. They illustrate that objects may be identifiable conditional on the identifiability of others, and the close relationship between identifiability and distinguishability. Several general approaches to identification and distinguishability are introduced and discussed next.<br />

2.3 Identification Concepts in Databases<br />

We have already mentioned the option of identifying objects by values of their attributes, or additionally by associations to other objects. Generally, objects can be identified by their position in the graph as well. We now consider concrete formalizations of these ideas.<br />

The first two ideas concern homomorphic mappings on graphs. Let two object graphs $G = (N, E)$ and $G' = (N', E')$ be given. A mapping $h : N \rightarrow N'$ preserves node labels if, for each $u \in N \cap V$, $h(u) = u$. It preserves adjacency if for all nodes $u, v$ in $G$ and for each label $s$, if there exists an edge $(u, s, v)$ in $G$ then there exists an edge $(h(u), s, h(v))$ in $G'$. If, additionally, whenever $(h(u), s, h(v))$ is in $G'$ there is an edge $(u, s, v)$ in $G$, then we say it strongly preserves adjacency.<br />

The mapping $h$ is called a g-homomorphism if it maps $N$ onto $N'$ and both preserves node labels and strongly preserves adjacency. It is a g-isomorphism if it is, in addition, a bijective map. We denote a g-homomorphism by $h : G \rightarrow G'$.<br />
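The three conditions on a g-homomorphism can be checked mechanically on finite graphs. The function below is a sketch under the assumption that graphs are given as plain sets of nodes and edge triples and that `h` is a Python dict; the function name and the two small graphs are hypothetical.<br />

```python
def is_g_homomorphism(h, nodes, edges, nodes2, edges2, values):
    """Check that h maps (nodes, edges) onto (nodes2, edges2) as a g-homomorphism."""
    if set(h) != set(nodes):
        return False                                   # h must be defined on N
    if {h[u] for u in nodes} != set(nodes2):
        return False                                   # h must be onto N'
    if any(h[u] != u for u in nodes if u in values):
        return False                                   # node labels are preserved
    labels = {s for (_, s, _) in edges} | {s for (_, s, _) in edges2}
    for u in nodes:
        for v in nodes:
            for s in labels:
                # strong preservation: edge in G iff image edge in G'
                if ((u, s, v) in edges) != ((h[u], s, h[v]) in edges2):
                    return False
    return True

values = {"a", "b"}

# Two objects pointing to the same value can be collapsed into one node.
g1_nodes = {"o1", "o2", "a"}
g1_edges = {("o1", "s", "a"), ("o2", "s", "a")}
collapse_ok = is_g_homomorphism(
    {"o1": "o", "o2": "o", "a": "a"},
    g1_nodes, g1_edges, {"o", "a"}, {("o", "s", "a")}, values)

# Two objects pointing to different values cannot be collapsed.
g2_nodes = {"o1", "o2", "a", "b"}
g2_edges = {("o1", "s", "a"), ("o2", "s", "b")}
collapse_bad = is_g_homomorphism(
    {"o1": "o", "o2": "o", "a": "a", "b": "b"},
    g2_nodes, g2_edges, {"o", "a", "b"},
    {("o", "s", "a"), ("o", "s", "b")}, values)
```

In the second graph the collapsed node would need an edge to `b` that `o1` does not have, so strong preservation of adjacency fails.<br />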

The requirement that homomorphisms preserve node labels embodies our assumption that a value is a uniquely identifiable entity that can be mentioned by users. Hence, a node labeled by a value cannot be mapped to a node labeled by another. Recall that class names and type constructor names are also considered values, so they must be mapped to themselves. Thus, since a g-homomorphism strongly preserves adjacency, objects in a class can only be mapped to objects of the same class, since they are related to the node representing their class by an edge.<br />

Identifiability by homomorphisms:<br />

We say that two nodes $o_1, o_2$ of $G$ are indistinguishable by a g-homomorphism if there exists a graph $G'$ and a g-homomorphism $h : G \rightarrow G'$ such that $h(o_1) = h(o_2)$. An object $o$ is H-uniquely identifiable if there is no other object $o'$ different from $o$ such that $o, o'$ are indistinguishable by some g-homomorphism. The graph $G$ is H-identifiable if each of its objects is H-uniquely identifiable.<br />

<br />

Identifiability by automorphisms:<br />

Given the object graph $G = (N, E)$, a mapping $h$ is called a g-automorphism if it is a g-isomorphism from $G$ to itself. Denote the automorphism group of $G$ by $\Gamma(G)$. Two nodes $u, v$ of $G$ are called A-equivalent (denoted by $u \cong v$) if there exists a g-automorphism $h$ in $\Gamma(G)$ with $v = h(u)$. (It is easily seen that this is indeed an equivalence relation.) For each node $u$, the set of all nodes A-equivalent to $u$ is called the orbit of $u$ (denoted by $Or(u)$). A node $u$ is called A-identifiable if $u = h(u)$ for each g-automorphism $h$, that is, if it is the only element in $Or(u)$. The node is called A-unidentifiable otherwise. The graph is A-identifiable if its nodes are A-identifiable.<br />
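For small graphs, orbits can be computed by brute force over all permutations of the nodes. This sketch is exponential and meant only as an illustration; the function names and the two sample graphs (a directed 3-cycle of objects and two objects attached to different values) are assumptions, and node names are strings so that they can be sorted.<br />

```python
from itertools import permutations

def automorphisms(nodes, edges, values):
    """All g-automorphisms of a small graph, found by exhaustive search."""
    nodes = sorted(nodes)
    edge_set = set(edges)
    found = []
    for image in permutations(nodes):
        h = dict(zip(nodes, image))
        if any(h[u] != u for u in nodes if u in values):
            continue                      # values must be mapped to themselves
        # For a bijection, mapping E exactly onto E is strong preservation.
        if {(h[u], s, h[v]) for (u, s, v) in edge_set} == edge_set:
            found.append(h)
    return found

def orbit(u, autos):
    """Or(u): the images of u under all g-automorphisms."""
    return {h[u] for h in autos}

# A symmetric cycle of three objects and no values.
cycle_nodes = {"o1", "o2", "o3"}
cycle_edges = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
cycle_autos = automorphisms(cycle_nodes, cycle_edges, values=set())

# Two objects that point to different values.
val_nodes = {"o4", "o5", "a", "b"}
val_edges = {("o4", "s", "a"), ("o5", "s", "b")}
val_autos = automorphisms(val_nodes, val_edges, values={"a", "b"})
```

The cycle admits its three rotations, so every object has the whole cycle as its orbit and none is A-identifiable. In the second graph only the identity survives, so every node is A-identifiable.<br />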

<br />

Identification by bisimulation:<br />

A related idea is obtained by generalizing from mappings to relations. Given two graphs as above, a bisimulation between them is a binary relation $R$ that preserves labels and adjacency, i.e., if $R(u, v)$ and $u \in N \cap V$, then $u = v$, and if $R(u, v)$, then there exists an edge $(u, s, u')$ iff there exists an edge $(v, s, v')$ and $R(u', v')$. Bisimulations are closed under union, hence there always exists a maximal bisimulation between two graphs. Two nodes $u, u'$ in $G$ are B-identifiable if they are related by the maximal bisimulation between the graph and itself.<br />
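The maximal bisimulation of a graph with itself can be computed as a greatest fixpoint: start with all candidate pairs and delete pairs that violate the conditions until nothing changes. The sketch below reads the adjacency condition in the usual back-and-forth way and applies the label condition to both components of a pair; both readings are assumptions of this illustration, as are the function name and the sample graphs.<br />

```python
def maximal_bisimulation(nodes, edges, values):
    """Greatest relation on nodes satisfying the bisimulation conditions."""
    out = {u: set() for u in nodes}
    for (u, s, v) in edges:
        out[u].add((s, v))
    # Label condition: a value may only be related to itself.
    rel = {(u, v) for u in nodes for v in nodes
           if (u not in values and v not in values) or u == v}
    changed = True
    while changed:
        changed = False
        for (u, v) in list(rel):
            # every edge of u is matched by an equally labeled edge of v ...
            forth = all(any(s2 == s and (u2, v2) in rel for (s2, v2) in out[v])
                        for (s, u2) in out[u])
            # ... and every edge of v is matched by one of u
            back = all(any(s2 == s and (u2, v2) in rel for (s2, u2) in out[u])
                       for (s, v2) in out[v])
            if not (forth and back):
                rel.discard((u, v))
                changed = True
    return rel

cycle = maximal_bisimulation(
    {"o1", "o2", "o3"},
    {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")},
    values=set())

with_values = maximal_bisimulation(
    {"o4", "o5", "a", "b"},
    {("o4", "s", "a"), ("o5", "s", "b")},
    values={"a", "b"})
```

On the symmetric cycle all pairs of objects remain related, while the two objects attached to different values are separated.<br />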

<br />

The three previous definitions use the idea that values are the basis for identifiability, but they rely on a global mechanism, namely the existence or non-existence of mappings with certain properties. The next two definitions introduce ideas that are essentially local. Let us define the local neighborhood of a node $u$ to consist of $u$, all the nodes $v$ such that $(u, s, v)$ or $(v, s, u)$ are in the graph, and all the edges connecting them. That is, it is the subgraph of $G$ induced by the nodes whose distance from $u$ is at most one (where edge directions are ignored). (This is essentially a neighborhood of radius one, in the terminology of [9].) We denote the local neighborhood of $u$ by $ln(u)$. A g-isomorphism from $ln(u)$ onto $ln(v)$ is a regular g-isomorphism between these two graphs that maps $u$ to $v$. Thus, we assume that $u$ is a distinguished node in $ln(u)$. Given a set $P$ of pairs of nodes in $N$, a g-isomorphism $f$ from $ln(u)$ to $ln(v)$ is a $P$-mapping if for any node $u'$ in $ln(u)$, the pair $(u', f(u'))$ is not in $P$. The set $P$ should be thought of as a set of excluded pairs that cannot be related by the mapping. A typical case is when $P$ consists of all pairs of distinct values, since values are all different from each other. Another case is a set of objects in a view, where the information that they are pairwise different cannot be deduced from the data in the view, but can be given as a summary of information in the underlying database.<br />
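The local neighborhood is simply an induced subgraph and is easy to compute. The following sketch assumes that edges are given as a set of triples; the function name and the small chain graph are hypothetical.<br />

```python
def local_neighborhood(u, edges):
    """Return (nodes, edges) of ln(u): u, its direct neighbours in either
    direction, and all edges of the graph between these nodes."""
    near = {u}
    for (a, _label, b) in edges:
        if a == u:
            near.add(b)
        if b == u:
            near.add(a)
    induced = {(a, s, b) for (a, s, b) in edges if a in near and b in near}
    return near, induced

# A chain o1 -> o2 -> o3 -> o4 with s-labeled edges.
chain = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o4")}
ln_o1 = local_neighborhood("o1", chain)
ln_o2 = local_neighborhood("o2", chain)
```

For the chain, the neighborhood of `o1` contains only `o1` and `o2`, whereas that of `o2` also contains `o3`, so no g-isomorphism between the two neighborhoods can exist.<br />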

Identifiability by values:<br />

The idea here is that values are distinguishable from each other, and from objects. Also, nodes can be distinguished if they are connected to distinguishable nodes, or have a different pattern of connectivity. In the following algorithm, this idea is repeatedly applied until a fixpoint is reached. For generality, the algorithm is given in terms of an arbitrary initial set $IE$ of pairs of nodes that are assumed to be known to be unequal. 5<br />

1. Input: $G = (N, E)$<br />

2. Initialization: $NotId := IE$<br />

3. Repeat until no further change: $NotId := NotId \cup \{ (u, v), (v, u) \mid \text{there is no } NotId\text{-mapping of } ln(u) \text{ onto } ln(v) \}$<br />

4. Output: $NotId$<br />

For a graph $G$, let the canonical inequality set be<br />

$IE_{can}(G) = \{ (u, v) \mid u, v \text{ are different nodes},\ (u, v \in N \cap V) \vee (u \in N \setminus V \wedge v \in N \cap V) \}.$<br />

A node $u \in N$ is called V-identifiable if for each other node $v \in N$ the property $(u, v) \in NotId$ holds when the algorithm is started with $IE_{can}(G)$. Otherwise the node is called V-unidentifiable. The graph $G$ is V-identifiable if each of its nodes is V-identifiable.<br />
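The fixpoint algorithm can be prototyped for small graphs as follows. This is a brute-force sketch, not an efficient implementation: the search for a $NotId$-mapping tries all bijections between the two neighborhoods, node names are assumed to be sortable strings, and the canonical inequality set is built in symmetrized form (both orders of each pair), which is an assumption of this illustration.<br />

```python
from itertools import permutations

def local_neighborhood(u, edges):
    near = {u}
    for (a, _label, b) in edges:
        if a == u:
            near.add(b)
        if b == u:
            near.add(a)
    return near, {(a, s, b) for (a, s, b) in edges if a in near and b in near}

def has_mapping(u, v, edges, values, excluded):
    """Is there a g-isomorphism ln(u) -> ln(v) with u -> v that relates
    no pair from `excluded`?"""
    nu, eu = local_neighborhood(u, edges)
    nv, ev = local_neighborhood(v, edges)
    if len(nu) != len(nv) or len(eu) != len(ev):
        return False
    rest_u = sorted(nu - {u})
    for image in permutations(sorted(nv - {v})):
        f = dict(zip(rest_u, image))
        f[u] = v
        if any(x in values and f[x] != x for x in nu):
            continue                      # values are mapped to themselves
        if any((x, f[x]) in excluded for x in nu):
            continue                      # an excluded pair would be related
        if {(f[a], s, f[b]) for (a, s, b) in eu} == ev:
            return True
    return False

def not_id(nodes, edges, values, initial):
    """Steps 2 to 4 of the algorithm: grow NotId until nothing changes."""
    notid = set(initial)
    changed = True
    while changed:
        changed = False
        for u in nodes:
            for v in nodes:
                if u != v and (u, v) not in notid \
                        and not has_mapping(u, v, edges, values, notid):
                    notid |= {(u, v), (v, u)}
                    changed = True
    return notid

def canonical_inequalities(nodes, values):
    """Symmetrized canonical set: distinct pairs involving at least one value."""
    return {(u, v) for u in nodes for v in nodes
            if u != v and (u in values or v in values)}

def v_identifiable(u, nodes, notid):
    return all((u, v) in notid for v in nodes if v != u)

cycle_nodes = ["o1", "o2", "o3"]
cycle_edges = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
cycle_notid = not_id(cycle_nodes, cycle_edges, set(), set())

val_nodes = ["o4", "o5", "a", "b"]
val_edges = {("o4", "s", "a"), ("o5", "s", "b")}
val_values = {"a", "b"}
val_notid = not_id(val_nodes, val_edges, val_values,
                   canonical_inequalities(val_nodes, val_values))
```

On the symmetric cycle the algorithm derives no inequality at all, whereas in the second graph the two objects are separated because their neighborhoods contain different values.<br />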

<br />

The close relationship between computation and logical inference suggests that the previous definition can be brought into a logical form.<br />

Identifiability by (dis)equational logics:<br />

This logic is an analog of inequality systems used for ADT logics. Now we define a Hilbert type deductive system for this logic. In addition to the predicate above, the system uses another binary predicate, that we denote $\neq$. Also, rather than writing $(u, v) \in\ \neq$, we write $u \neq v$. The set of axioms is assumed to be a given set $IE$ of pairs on $B = (O, V, L)$ (which may be empty). The deductive system is denoted $D^{IE}_V$.<br />

Axioms: $u \neq v$ if $(u, v) \in IE$.<br />

Rules: $\dfrac{\text{there is no } \neq\text{-mapping of } ln(u) \text{ onto } ln(v)}{u \neq v}$<br />

Now we define the derivation relationship $\vdash_{IE}$ on the basis of $D^{IE}_V$.<br />

The node $n$ is E-identifiable if for every other node $n' \in N$ we can derive $\vdash_{IE_{can}(G)} n \neq n'$. The graph $G$ is E-identifiable if each of its nodes is E-identifiable.<br />

<br />

Identifiability by queries<br />

We define a simple query language on $B = (O, V, L)$. The set of queries $Q(B)$ is the smallest set generated by the following formation rules.<br />

(i) If $M$ is a subset of $V$ then $M$ is a query.<br />

(ii) If $q$ is a query, and $J$ is a subset of $L$ then $\rightarrow_J(q)$, $\leftarrow_J(q)$ are queries.<br />

(iii) If $q, q'$ are queries, then $q \cup q'$, $q \cap q'$ and $q \setminus q'$ are queries.<br />

The semantics of queries, i.e., their meaning on a graph $G = (N, E)$, is defined as follows.<br />

(i) $M(G) = \{ u \in N \mid u \in M \}$<br />

(ii) $\leftarrow_J(q)(G) = \{ u \in N \mid (u, l, v) \in E,\ l \in J,\ v \in q(G) \}$<br />

(iii) $\rightarrow_J(q)(G) = \{ u \in N \mid (v, l, u) \in E,\ l \in J,\ v \in q(G) \}$<br />

(iv) $(q \cup q')(G) = q(G) \cup q'(G)$, $(q \cap q')(G) = q(G) \cap q'(G)$, $(q \setminus q')(G) = q(G) \setminus q'(G)$<br />

5 We discuss below a scenario where this is of interest.<br />

Using the first type of query, we can select any subset of the value nodes of a graph. Using the next two kinds, we can express complex reachability patterns. Note that the language does not contain iteration idioms. However, since queries can be composed, we can write large queries, which compensates for the lack of such idioms.<br />

Given a graph $G = (N, E)$ on $B$, let $Q_G$ be the subset of $Q(B)$ that mentions only values and labels in $G$. Two nodes $u, v$ are Q-indistinguishable (denoted $u \cong_Q v$) if for all $q$ in $Q_G$, $u \in q(G)$ iff $v \in q(G)$. A node $u$ in $N$ is Q-identifiable if there is a query $q$ in $Q_G$ such that $q(G) = \{u\}$. $G$ is Q-identifiable if each node in $N$ is Q-identifiable.<br />
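The query semantics translates almost literally into a recursive evaluator. In this sketch queries are encoded as nested tuples; the constructor tags `"values"`, `"sources"`, `"targets"`, `"union"`, `"inter"` and `"diff"` are assumptions of this encoding, chosen to avoid committing to a particular arrow notation. `"sources"` selects the nodes with a $J$-labeled edge into the result of $q$, and `"targets"` the nodes reached by a $J$-labeled edge from it.<br />

```python
def evaluate(q, nodes, edges):
    """Evaluate a query, given as a nested tuple, on the graph (nodes, edges)."""
    kind = q[0]
    if kind == "values":                      # M(G) = {u in N | u in M}
        return {u for u in nodes if u in q[1]}
    if kind == "sources":                     # {u | (u, l, v) in E, l in J, v in q(G)}
        inner = evaluate(q[2], nodes, edges)
        return {u for (u, l, v) in edges if l in q[1] and v in inner}
    if kind == "targets":                     # {u | (v, l, u) in E, l in J, v in q(G)}
        inner = evaluate(q[2], nodes, edges)
        return {u for (v, l, u) in edges if l in q[1] and v in inner}
    left = evaluate(q[1], nodes, edges)
    right = evaluate(q[2], nodes, edges)
    if kind == "union":
        return left | right
    if kind == "inter":
        return left & right
    if kind == "diff":
        return left - right
    raise ValueError(f"unknown query constructor: {kind!r}")

nodes = {"o4", "o5", "a", "b"}
edges = {("o4", "s", "a"), ("o5", "s", "b")}

# All nodes with an s-edge to the value a: this query singles out o4.
points_to_a = ("sources", {"s"}, ("values", {"a"}))
answer = evaluate(points_to_a, nodes, edges)

# Nodes with an s-edge to a or b, minus those pointing to a: singles out o5.
points_to_b_only = ("diff",
                    ("sources", {"s"}, ("values", {"a", "b"})),
                    points_to_a)
answer_b = evaluate(points_to_b_only, nodes, edges)
```

Since each of the two queries returns exactly one object, both objects of this small graph are Q-identifiable. In a graph without value nodes every query of this language evaluates to the empty set, so no object could be singled out.<br />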

2.4 Comparison of Identification Concepts<br />

We now proceed to compare the expressive power of the mechanisms introduced above. We start with V- and E-identifiability.<br />

Proposition 2.1. For each graph $G$, for any set of pairs $IE$, and for any nodes $u, v$ in $N$, the algorithm terminates with $(u, v)$ in $NotId$ if and only if $\vdash_{IE} u \neq v$.<br />

Proof. Easy inductions on computations and deductions, respectively.<br />

Corollary 2.2. A node $u$ is V-identifiable if and only if it is E-identifiable.<br />

<br />

From now on, we use computation to refer to either computation or derivation. We note that computations are non-deterministic in the sense that in each step we can consider an arbitrary pair. However, when we consider the set of pairs of nodes that are inferred to be unequal, we have:<br />

Lemma 2.3. Computations are confluent: All computations on a given graph from an initial set $IE$ terminate with the same set of pairs of nodes for $\neq$.<br />

Proof. If at a given point in a computation it is possible to derive that $u \neq v$, then the execution of other steps does not invalidate any of the prerequisites, since $\neq$ can only grow. Hence this fact remains derivable.<br />

Given a set of pairs, if we want to interpret it as a set of inequalities, then it is desirable that its complement has the properties of equality, in particular that it is an equivalence relation on the nodes of the graph. Let us call a set of pairs whose complement is an equivalence relation well-behaved.<br />

Proposition 2.4. Assume that the given initial set is well-behaved. Then the computation produces a well-behaved set $NotId$.<br />

Proof. By the lemma, the order of steps in a computation is irrelevant, so we consider steps done in a certain order, and organized into stages: In a given stage, we take the complement of the set $NotId$ computed so far, take a connected component of that complement, and test each pair in this component for inclusion in $NotId$. We add all pairs that qualify at the end of the stage to the set of pairs, then proceed to the next stage. This computation is still non-deterministic in the choice of a component for a stage.<br />

The proof now proceeds by induction on stages. By assumption, the given set is well-behaved, so after initialization, $NotId$ is well-behaved. Assuming that it is well-behaved after $k$ stages, we show that the property holds after stage $k + 1$. Note that since it is well-behaved, the complement is an equivalence relation, so a connected component of the complement is an equivalence class. Now, a pair $u, v$ in the class is noted for inclusion in $NotId$ if and only if there is no $NotId$-isomorphism of $ln(u)$ onto $ln(v)$. Thus, two nodes $u, v$ will remain in the complement of $NotId$ iff there is such an isomorphism between $ln(u)$ and $ln(v)$. It is easy to see that since $NotId$ is well-behaved, the set of pairs that are not included in it in this stage is a (disjoint) partition of the class.<br />

We now proceed to compare V- and A-identifiability:<br />

Proposition 2.5. If nodes $u, v$ are V-distinguishable, then they are A-distinguishable.<br />

Proof. We claim that if $u, v$ are A-indistinguishable, that is, there is a g-automorphism $h$ that maps $u$ to $v$, then the pair $(u, v)$ will not be in $NotId$. The proof is by induction on the stages of a computation. Clearly, the pairs in $IE_{can}(G)$ are A-distinguishable; each is a separate class in the complement of $NotId$ and is mapped by $h$ to itself. So $(u, v)$ is not in $IE_{can}(G)$ for any such pair. For the induction, we assume that for all $u, v$, if $h(u) = v$ then $u, v$ are in the complement of $NotId$ after $k$ stages, and we prove this holds after the next stage. Indeed, let $u, v$ be a pair such that $h(u) = v$. Then the restriction of $h$ to $ln(u)$ is a $NotId$-isomorphism onto $ln(v)$, so the pair $(u, v)$ is not put into $NotId$.<br />

The converse to the proposition does not hold. As an example, consider a database with n classes, $C_1, \dots, C_n$. Let class $C_i$ contain objects $x_i, y_i, z_i$, and let each of these have an l-labeled outgoing edge, so the graph contains 3n l-edges: the edges $(a_i, l, a_{i+1})$ for $i = 1, \dots, n-1$ and $a = x, y, z$, together with the following three edges out of $C_n$: $(x_n, l, x_1)$, $(y_n, l, z_1)$, $(z_n, l, y_1)$. Finally, assume the following 3n h-edges: $(x_i, h, y_i)$, $(y_i, h, z_i)$, $(z_i, h, x_i)$. Now, since class nodes are V-distinguishable, the algorithm for V-identification will partition the objects so that $x_i, y_i, z_i$ form an equivalence class in the complement of NotId. Since the structure of each class is symmetric, no further partitioning can occur. However, no non-trivial automorphism exists. Indeed, assume one exists, and call it h. Without loss of generality, assume $h(x_1) = y_1$. Then necessarily, from the edge structure, $h(y_1) = z_1$, $h(z_1) = x_1$, and the same, namely $h(x_i) = y_i, \dots$, must hold for the objects in the classes $C_2, \dots, C_n$. But now we reach a contradiction, since $(y_n, l, z_1)$ is in E, but $(h(y_n), l, h(z_1)) = (z_n, l, x_1)$ is not in E.<br />

We now consider properties of H. Every g-homomorphism on a graph G partitions the nodes into equivalence classes. We can compare g-homomorphisms by comparing the partitions they induce. We say h dominates h' if each equivalence class of h' is contained in an equivalence class of h. Clearly, dominance induces a partial order. A g-homomorphism of G that dominates all other g-homomorphisms of G is called maximum. Any two maximum mappings dominate each other, hence induce the same partitions. Thus, their images are g-isomorphic.<br />

Proposition 2.6. There exists a g-homomorphism which is maximum.<br />



Proof. Let HN be the transitive closure of the binary relation 'u and v are H-indistinguishable'. It is an equivalence relation on the nodes N of the graph G. We can create a graph $\hat{G}$ whose nodes are the equivalence classes of HN, and such that $([u], l, [v])$ is an edge if and only if there is an edge (u, l, v) in G, where u is any member of the equivalence class [u], and similarly for v. We claim that the mapping $\hat{h}$ that maps all elements of [u] in N to [u] is a g-homomorphism from G onto $\hat{G}$.<br />

Assume that for some g-homomorphism h, h(u) = h(v). Further assume that (u, l, u') is an edge in G. Then (h(u), l, h(u')) = (h(v), l, h(u')) is an edge in the image under h. Since h strongly preserves adjacency, (v, l, u') must be an edge in G. Thus, u and v are connected by l-edges to precisely the same nodes. The claim obviously holds also for other edge labels, and also for back edges of the form (u', l, u). In short, u, v have the same connections.<br />

Now, if v and w are identified by some h', then v and w have the same connections. It follows that u and w have the same connections. By induction, we now have that if we have a sequence $u_1, \dots, u_n$ such that each pair $u_i, u_{i+1}$ is identified by some $h_i$, then all elements in the sequence have the same connections. In other words, all elements of an equivalence class in HN have the same connections. It follows that $\hat{h}$ is indeed a g-homomorphism. It is clear that it is a maximum g-homomorphism.<br />
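The "same connections" property used in this proof can be illustrated with a small sketch. It groups nodes whose labeled outgoing and incoming edges coincide exactly and then builds the quotient edges $([u], l, [v])$. Treating "same connections" as the defining criterion for the classes is an assumption of this sketch (the proof only shows it is necessary), and the function names and the toy graph are hypothetical.

```python
from collections import defaultdict


def same_connection_classes(nodes, edges):
    """Group nodes whose labeled out-edges and in-edges coincide exactly.

    edges is an iterable of (source, label, target) triples.
    """
    out_edges = defaultdict(set)
    in_edges = defaultdict(set)
    for source, label, target in edges:
        out_edges[source].add((label, target))
        in_edges[target].add((source, label))

    groups = defaultdict(list)
    for node in nodes:
        # Two nodes land in the same group only if both edge sets are equal.
        signature = (frozenset(out_edges[node]), frozenset(in_edges[node]))
        groups[signature].append(node)
    return sorted(sorted(group) for group in groups.values())


def quotient_edges(classes, edges):
    """Edges ([u], l, [v]) of the quotient graph; classes are named by tuples."""
    class_of = {node: tuple(cls) for cls in classes for node in cls}
    return {(class_of[s], label, class_of[t]) for s, label, t in edges}


# Hypothetical toy graph: a and b both point to c via an l-edge.
toy_nodes = ["a", "b", "c"]
toy_edges = [("a", "l", "c"), ("b", "l", "c")]
toy_classes = same_connection_classes(toy_nodes, toy_edges)
toy_quotient = quotient_edges(toy_classes, toy_edges)
```

In the toy graph, a and b are merged into one class, and the two original edges collapse into a single quotient edge.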

Proposition 2.7. H-identifiability is the same as B-identifiability.<br />

Proof. A bisimulation on a graph induces an equivalence relation on its nodes, and it is easy to see that there exists a g-homomorphism from the graph to its image modulo this equivalence relation. In particular, taking the maximal bisimulation, we have that if objects are H-identifiable, then they are B-identifiable. For the opposite direction, we observe that the maximum g-homomorphism induces a bisimulation on the graph.<br />

We now compare V and H. Let us call the equivalence classes into which $\hat{h}$ partitions the nodes of G the $\hat{h}$-classes. As noted above, the notion of V-identifiability also defines equivalence classes on G, each consisting of objects that are pairwise V-indistinguishable. Denote these as V-classes.<br />

Proposition 2.8. Let G be a graph. Then each $\hat{h}$-class is contained in some V-class.<br />

Proof. We prove the claim by induction on the stages of the computation of the V-classes. Initially, each $v \in N \cap V$ is an equivalence class by itself, and all other elements of N are in one class. Clearly, each $v \in N \cap V$ is also a singleton $\hat{h}$-class, so the claim holds. Assume now it holds after stage k. If the local neighborhoods of u, v of the same $\hat{h}$-class are compared, they will be found to be isomorphic. To see this, observe that, as we showed above, they have precisely the same connections. Thus, the mapping that takes u to v and leaves all other nodes in place is an isomorphism of ln(u) and ln(v). It follows that they will not be separated.<br />

Corollary 2.9. V-identifiability implies H-identifiability.<br />

The converse fails: in the example above, no two nodes have precisely the same connections, hence $\hat{h}$ is the identity, so each node is H-identifiable, but as we saw, nodes are not V-identifiable.<br />

If we change the example by connecting $x_n$ to $x_1$, ..., then there is a non-trivial g-automorphism, so nodes in the same class are not A-identifiable, but they are still H-identifiable.<br />
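The two variants of the n-class example can be checked by brute force for small n. The sketch below is only an illustration under stated assumptions: it reads the l-edges as running from class i to class i + 1 for i < n, with the three wrap-around edges leaving class n; it only considers permutations that keep every object inside its own class (class nodes being distinguishable); and all function names are hypothetical.

```python
from itertools import permutations, product


def build_example(n, twisted=True):
    """Return (classes, edges) for the n-class example.

    twisted=True uses the wrap-around edges (x_n,l,x_1), (y_n,l,z_1), (z_n,l,y_1);
    twisted=False connects x_n to x_1, y_n to y_1 and z_n to z_1 instead.
    """
    classes = [[f"{a}{i}" for a in "xyz"] for i in range(1, n + 1)]
    edges = set()
    for i in range(1, n):  # l-edges from class i to class i + 1
        for a in "xyz":
            edges.add((f"{a}{i}", "l", f"{a}{i + 1}"))
    if twisted:
        wrap = {"x": "x", "y": "z", "z": "y"}
    else:
        wrap = {"x": "x", "y": "y", "z": "z"}
    for a, b in wrap.items():  # l-edges from class n back to class 1
        edges.add((f"{a}{n}", "l", f"{b}1"))
    for i in range(1, n + 1):  # h-edges forming a 3-cycle inside each class
        for a, b in (("x", "y"), ("y", "z"), ("z", "x")):
            edges.add((f"{a}{i}", "h", f"{b}{i}"))
    return classes, edges


def count_automorphisms(classes, edges):
    """Count class-preserving permutations that map the edge set onto itself."""
    count = 0
    per_class = [list(permutations(cls)) for cls in classes]
    for choice in product(*per_class):
        mapping = {}
        for cls, image in zip(classes, choice):
            mapping.update(zip(cls, image))
        mapped = {(mapping[s], label, mapping[t]) for s, label, t in edges}
        if mapped == edges:
            count += 1
    return count


twisted_count = count_automorphisms(*build_example(2, twisted=True))
plain_count = count_automorphisms(*build_example(2, twisted=False))
```

Under these assumptions the twisted graph admits only the identity, while the modified graph admits the three rotations of x, y, z applied uniformly to all classes.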



Proposition 2.10. A-identifiability implies H-identifiability.<br />

Proof. Assume nodes u, v are H-indistinguishable. We claim that the map that interchanges u with v and is the identity elsewhere is a g-automorphism. This follows because, as shown in the proof of Proposition 2.6, u and v have the same connections.<br />

We now consider Q. As mentioned above, when one starts from $IE_{can}(G)$, the complement of the $\neq$ relation computed by the algorithm for V-identification is a partition of the nodes into equivalence classes, the V-classes.<br />

Proposition 2.11. Let G be a graph, and q be a query. Then q(G) is a union of V-classes.<br />

Proof. The proof uses induction on the structure of queries. If $M \subseteq V$, then $M(G) = M \cap N$, which obviously is a union of V-classes, as each value is in a class by itself. Now, assume the claim is true for a query q, let $J \subseteq L$, and consider the query $\rightarrow q$. Assume $u \in {\rightarrow} q(G)$, and also assume that u' is in the same V-class as u. Then there is a $\neq$-isomorphism from ln(u) onto ln(u'). In particular, if there is an l-edge from u to a V-class, there is an l-edge from u' to the same class. It follows that u' is also in ${\rightarrow} q(G)$. The case for $\leftarrow q$ is similar. Since sets of V-classes are closed under boolean operations, the proof is complete.<br />

Corollary 2.12. Q-identifiability implies V-identifiability.<br />

The converse does not hold, since the V-algorithm can count, while queries do not count. For example, assume $o_1, o_2$ have l-edges to a node labeled 3, $o_3$ has a k-edge to $o_1$, while $o_4$ has k-edges to both $o_1$ and $o_2$. The V-algorithm will separate $o_3$ from $o_4$, since their local neighborhoods are not isomorphic. Queries cannot separate $o_1$ from $o_2$, hence they also cannot separate nodes related to them by the same kind of edges, like $o_3$ and $o_4$.<br />
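The difference between counting and non-counting separation in this example can be made concrete with a simplified partition refinement. This is not the V-identification algorithm of the chapter and not its query language: it is a colour-refinement stand-in that looks only at outgoing edges, where the `counting` flag decides whether a node sees the multiset or merely the set of (label, class) pairs it points to. All names are hypothetical.

```python
from collections import Counter


def refine(nodes, edges, initial, counting):
    """Refine an initial colouring until the number of classes stops growing."""
    color = dict(initial)
    while True:
        signature = {}
        for node in nodes:
            out = [(label, color[t]) for s, label, t in edges if s == node]
            if counting:
                # Multiset view: how many edges go to each class.
                summary = tuple(sorted(Counter(out).items()))
            else:
                # Set view: only which classes are reachable.
                summary = tuple(sorted(set(out)))
            signature[node] = (color[node], summary)
        # Rename signatures to compact integer colours.
        distinct = sorted(set(signature.values()), key=repr)
        names = {sig: idx for idx, sig in enumerate(distinct)}
        new_color = {node: names[signature[node]] for node in nodes}
        if len(set(new_color.values())) == len(set(color.values())):
            return new_color
        color = new_color


nodes = ["3", "o1", "o2", "o3", "o4"]
edges = [
    ("o1", "l", "3"),
    ("o2", "l", "3"),
    ("o3", "k", "o1"),
    ("o4", "k", "o1"),
    ("o4", "k", "o2"),
]
# The value node starts in its own class; all objects start together.
initial = {"3": "value:3", "o1": "object", "o2": "object",
           "o3": "object", "o4": "object"}

with_count = refine(nodes, edges, initial, counting=True)
without_count = refine(nodes, edges, initial, counting=False)
```

With counting, o3 and o4 end up in different classes because o4 has two k-edges into the class of o1 and o2; without counting they stay together. In both runs o1 and o2 remain in the same class.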

Notice that for non-canonical sets of inequalities, the generalized V-identifiability and the equivalent E-identifiability do not imply H-identifiability or A-identifiability. Thus, identifiability based on inequality sets can be a very powerful method.<br />

Integrity constraints can be used for identifiability as well, since they can impose distinguishability or indistinguishability of objects. As we have seen, sometimes two objects can be distinguished from each other if a pair of others can. Thus, distinguishability that is deduced from the integrity constraints in the system can propagate.<br />

Using a generalized relational representation, equality generating dependencies are constraints of the following form:<br />

$$\forall x\,\bigl(P_R(x_1) \wedge \dots \wedge P_R(x_m) \wedge F(x_{m+1}) \rightarrow G(x)\bigr)$$<br />

where F, G are conjunctions of equalities of the form $x_{ij} = x_{i'j'}$, $P_R$ is the predicate symbol associated with relation R, and $x_i \subseteq x$. Based on the transformation of the constraint to the equivalent formula<br />

$$\forall x\,\bigl(P_R(x_1) \wedge \dots \wedge P_R(x_m) \wedge \neg G(x_{m+1}) \rightarrow \neg F(x)\bigr)$$<br />

we can use the inequality set IE in order to extend the deductive system $D^{IE}_{V}$. Thus, we can express the identification properties on the basis of value-distinguishability or equational logic in the case of equality generating dependencies. Notice that functional dependencies, key constraints, and generalized functional dependencies are special equality generating dependencies.<br />
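For the special case of a functional dependency, the contrapositive reading can be sketched as follows: if two tuples differ on the right-hand side, their left-hand sides must differ, so an inequality between objects is derived from an inequality between values. This is only an illustration for a single-attribute dependency; the relation, the attribute names and the function are hypothetical and not part of the formal system above.

```python
from itertools import combinations


def distinct_by_fd(rows, lhs, rhs):
    """Derive inequalities between lhs entries from differing rhs values.

    Assumes the functional dependency lhs -> rhs holds on rows.
    Returns a set of frozensets {a, b}, each meaning 'a and b are distinct'.
    """
    distinct = set()
    for r1, r2 in combinations(rows, 2):
        if r1[rhs] != r2[rhs]:
            if r1[lhs] == r2[lhs]:
                raise ValueError("rows violate the assumed dependency")
            distinct.add(frozenset((r1[lhs], r2[lhs])))
    return distinct


# Hypothetical relation: opaque object names with a social security number.
rows = [
    {"person": "o1", "ssn": "111"},
    {"person": "o2", "ssn": "222"},
    {"person": "o3", "ssn": "111"},
]
derived = distinct_by_fd(rows, lhs="person", rhs="ssn")
```

Here o2 is separated from both o1 and o3, while nothing is derived about o1 and o3, which share the same value.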



Corollary 2.13. Identification extended by equality-generating dependencies can be expressed by V-identifiability.<br />

An exclusion dependency is an expression of the form<br />

$$R[R.A_1, \dots, R.A_n] \parallel S[S.B_1, \dots, S.B_n].$$<br />

The property specified by the exclusion dependency can be directly translated to inequalities among objects.<br />
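A minimal sketch of this translation, assuming the exclusion dependency holds: every projected R-tuple must differ from every projected S-tuple, so each such pair yields one inequality. The relations, attribute names and the function name are hypothetical.

```python
from itertools import product


def exclusion_inequalities(r_rows, r_attrs, s_rows, s_attrs):
    """Pairs (r, s) of projected tuples that the exclusion dependency keeps apart."""
    r_proj = {tuple(row[a] for a in r_attrs) for row in r_rows}
    s_proj = {tuple(row[b] for b in s_attrs) for row in s_rows}
    return set(product(r_proj, s_proj))


# Hypothetical relations whose entries are opaque object names.
employees = [{"emp": "o1"}]
managers = [{"mgr": "o2"}, {"mgr": "o3"}]
inequalities = exclusion_inequalities(employees, ["emp"], managers, ["mgr"])
```

Each returned pair reads as "these two projected tuples are not equal", which is the form of information an inequality set records.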

A generalized inclusion dependency is an expression of the form<br />

$$R_1[X_1] \cap \dots \cap R_n[X_n] \subseteq S_1[Y_1] \cup \dots \cup S_m[Y_m]$$<br />

for compatible sequences $X_i$, $Y_j$. Similarly to equality-generating dependencies, generalized inclusion dependencies can be transformed to negated formulas. These formulas are the basis for the extension of the deductive system $D^{IE}_{V}$.<br />

Corollary 2.14. Identification extended by generalized inclusion dependencies and exclusion dependencies can be expressed by V-identifiability.<br />

Disjunctive existence constraints $X \Rightarrow Y_1, Y_2, \dots, Y_n$ specify that if a tuple is completely defined on X then it is completely defined on $Y_i$ for some i. There is an axiomatization for disjunctive existence constraints. They can be represented by monotone Boolean functions.<br />

Since existence has been treated explicitly in the definition of value-identifiability, we conclude directly:<br />

Corollary 2.15. Identification extended by existence dependencies can be expressed by V-identifiability.<br />

Summarizing the comparisons above we obtain<br />

Corollary 2.16. V-identifiability and E-identifiability are equivalent for generalized inclusion, exclusion, existence and equality-generating dependencies.<br />

2.5 Conclusion<br />

This paper has reconsidered notions of identity and identifiability in OODB's. We have proposed and justified the thesis that object identifiers, as proposed and used in most OODB implementations, are system-related but do not address problems of users. In particular, although the o-id mechanism guarantees that objects do have a unique identity, as required in the foundational postulates for OODB's, that by itself does not provide for identifiability, namely the ability of a program or user to uniquely identify an object. For the latter, as far as users of an OODB are concerned, a value-based mechanism must be used. We have shown a close relationship between identifiability and separability of objects from each other. In order to better understand identifiability, we have studied various notions that can be used as a specification of this notion, and have classified their relative strengths. Our results and discussion complement and augment previous discussion of object identification and its complexity in the literature, e.g., [2, 10, 14, 15].<br />

We have not considered the practical issues of identifiability. Since the mechanism must be value-based, a notion of keys, as used in relational databases, certainly suffices. However, OODB's offer a much richer structure, and it seems reasonable to expect that more of this structure be used for identification. E.g., one can use not only the value of a key attribute, but also its membership in a given class, in particular distinguish between membership in subclasses of a given class. Initial work in this direction has been reported in [19, 20]. We see in this an important research direction.<br />

A related problem concerns the representation of real-world entities by database objects. A posteriori it is possible for two or more objects to represent the same real world entity. Thus, in addition to the primary notion of object equality, where objects are equal if they are identical in the database, we have referential equality, where two database objects refer to the same real world entity. Although not discussed much in the literature, it is probably undesirable to allow different database objects to represent the same real world entity. Whether such phenomena can be avoided depends on the identification mechanisms supported by the system. A mechanism that is easy to use, simple to understand, and can be directly related to properties of real-world entities can help avoid problems of multiple representations.<br />

A final point concerns views. One can say that the reason o-id's cannot help users to identify objects is the existence of an abstraction barrier between the system and its users. The system knows the values of o-id's and can use them freely for all its needs. However, since only an equality test is exported, these same o-id's are much less useful for the users. One can say that the OODB as seen by the users is a view of the internal OODB, as seen by the system. It follows that one should expect similar problems when dealing with views. Namely, it is possible that the identification mechanism in an OODB uniquely identifies each object in the current state. Yet, since views present a restricted viewpoint, it is possible that this property does not hold for some views. This may be problematic for views that allow updates, possibly also for queries. Note that view definitions may form abstraction barriers in ways that are much more sophisticated than the simple interface between the implementation and conceptual levels of an OODB. The analysis of identifiability issues may therefore be more difficult. The problem will become both more difficult and more important as distributed access to distinct OODB's through the Web becomes common.<br />

References for Chapter 2<br />

1. S. Abiteboul, P.C. Kanellakis, Object identity as a query language primitive. Proc. SIGMOD, 1989, 159-173.<br />

2. S. Abiteboul, J. Van den Bussche, Deep equality revisited. Proc. DOOD'95 (eds. T.W. Ling, A.O. Mendelzon, L. Vieille), LNCS 1013, 213-228.<br />

3. C. Beeri, A formal approach to object-oriented databases. Data and Knowledge Engineering, 5, 1990, 4, 353-382.<br />

4. C. Beeri, Some thoughts on the future evolution of object-oriented database concepts. Proc. BTW 93 (ed. W. Stucky), Springer, 1993, 18-32.<br />

5. K. Berka, L. Kreiser, Texts on logics. Akademie-Verlag, Berlin, 1973.<br />

6. J. Biskup, H.H. Brüggemann, An object-surrogate-value approach for database languages. Technical report 16-3-89, University Hildesheim, Dept. Computer Science.<br />

7. H.B. Curry, Foundations of mathematical logic. McGraw-Hill, New York, 1963.<br />

8. G. Frege, Funktion und Begriff. Jena, 1891.<br />

9. H. Gaifman, On local and non-local properties. Proc. of the Herbrand Symposium, Logic Colloq. '81, North-Holland, Amsterdam, 1982.<br />

10. M. Gogolla, A declarative query approach to object identification. Proc. OO-ER95 (ed. M. Papazoglou), LNCS 1021, 65-76.<br />



11. M. Gyssens, J. Paredaens, D. van Gucht, A graph-oriented object database model. Proc. PODS, 1990, 417-424.<br />

12. S.N. Khoshafian, G. Copeland, Object identity. Proc. OOPSLA-86, special issue of SIGPLAN Notices (ed. N. Meyrowitz), 21 (12), Dec. 1986, 406-416.<br />

13. S.C. Kleene, Mathematical logic. John Wiley, New York, 1967.<br />

14. H.-J. Klein, J. Rasch, Value based identification and functional dependencies for object databases. Proc. 3rd Basque Int. Workshop on Information Technology, IEEE Comp. Sci. Press, 1997, 22-34.<br />

15. A. Kosky, Observational distinguishability. Proc. 5th DBPL, Electronic Report of Conferences in Computing, Springer, 1995.<br />

16. G.W. Leibniz, Fragmente zur Logik. Edited by Fr. Schmidt, Berlin, 1960.<br />

17. P.S. Poreckij, Theorie conjointe des egalites des non-egalites logiques. News of Physics Society of Kazan University, XVI, No. 1-2, 1908.<br />

18. J. Rumbaugh, Controlling propagation of operations using attributes on relations. Proc. OOPSLA88, ACM Sigplan Notices (23,11), Nov. 1988, 285-296.<br />

19. K.-D. Schewe, J.W. Schmidt, I. Wetzel, Identification, Genericity and Consistency in Object-Oriented Databases. In J. Biskup, R. Hull (eds.), Proc. 3rd International Conference on Database Theory, ICDT '92, Berlin (Germany), Lecture Notes in Computer Science, 341-356, Springer, 1992.<br />

20. K.-D. Schewe, B. Thalheim, Fundamental Concepts of Object Oriented Databases. Acta Cybernetica, 11, No. 4, 1993, 49-81.<br />

21. B. Thalheim, Reconsidering key and identification concepts in different database models. Technical Report CS-08-91, University of Rostock, 1991.<br />

22. J. Van den Bussche, J. Paredaens, The expressive power of complex values in object-based data models. Inf. Comput. 120, 220-236.<br />

23. J. Van den Bussche, D. van Gucht, M. Andries, M. Gyssens, On the completeness of object-creating database transformation languages. JACM 44:2, March 1997, 272-319.<br />



Chapter 3<br />

Fundamentals of Object Oriented Database Modelling<br />

Contents<br />

3.1 Introduction<br />

3.2 Type Systems<br />

3.3 OODM Schemata<br />

3.4 Value Representability<br />

3.5 Logical Reconstruction<br />

3.6 Conclusion<br />

This chapter contains a reprint of<br />

Klaus-Dieter Schewe. Fundamentals of Object Oriented Database Modelling. Intelligent Systems. Moscow 1997.<br />



Abstract. Solid theoretical foundations of object oriented databases (OODBs) are still missing. The work reported in this paper contains results on a formally founded object oriented datamodel (OODM) and is intended to contribute to the development of a uniform mathematical theory of OODBs.<br />

A clear distinction between objects and values turns out to be essential in the OODM. Types and classes are used to structure values and objects respectively. This can be founded on top of any underlying type system. We outline different approaches to type systems and their semantics and claim that OODB theory on top of arbitrary type systems leads to type theory with topos-theoretically defined semantics.<br />

On this basis the known solutions to the problems of unique object identification and genericity can be generalized. It turns out that extents of classes must be completely representable by values. Such classes are called value-representable. As a consequence object identifiers degenerate to a pure implementation concept. This stimulates considerations that do not depend on such identifiers.<br />

In order to approach this problem, object oriented schemata and instances are reorganized by means of general category-theoretical arguments to let them occur as theories in the higher-order intuitionistic logic associated with a topos defined by the type system. Moreover, in the case of value-representability it can be seen that object identifiers can be dispensed with at the logical level. This allows queries to be approached algebraically as well as logically and sets up a starting point for deduction within OODBs.<br />

3.1 Introduction<br />

The shortcomings of the relational database approach encouraged much research aimed at achieving more appropriate data models. It has been claimed that the object-oriented approach will be the key technology for future database systems and languages [8]. Several systems [5, 6, 7, 9, 19, 20, 21, 22, 24, 27, 38, 40, 41, 70] arose from these efforts. However, in contrast to research in the relational area there is no common formal agreement on what constitutes an object-oriented database [11, 12, 14].<br />

The basic question "What is an object?" seems to be trivial, but already here the variety of answers is large. In object oriented programming the notion of an object was intended as a generalization of the abstract data type concept with the additional feature of inheritance. In this sense object orientation involves the isolation of data in semi-independent modules in order to promote high software development productivity. The development of object oriented databases regarded an object also as a basic unit of persistent data, a view that is heavily influenced by existing semantic datamodels (SDMs) [2, 30, 31, 43, 44, 63]. Thus, object oriented databases are composed of independent objects but must also provide for the maintenance of inter-object consistency, a demand that is to some degree in dissonance with the basic style of object orientation.<br />

Theoretical investigations in the field of OODBs are rare. The few existing results in OODB theory can be classified in three groups. The first one [25, 65, 66, 67] studies expressiveness and complexity of query languages with object creation and duplicate elimination. This follows more or less the ideas of the IQL framework [3]. The second one [12, 14, 15, 16, 54, 55] asks for the fundamental features of object oriented datamodels and their semantical foundations. The third group [4, 37] continues the line of research in which databases occur as theories defined by logic programs.<br />



A view that is common in OODB research is that objects are abstractions of real world entities and should have an identity [8]. This leads to a distinction between values and objects [11, 12]. A value is identified by itself whereas an object has an identity independent of its value. This object identity is usually encoded by object identifiers [1, 3, 36]. Abstracting from the pure physical level, the identifier of an object can be regarded as being immutable during the object's lifetime. Identifiers ease the sharing and update of data. However, such abstract identifiers do not relieve us of the task of providing unique identification mechanisms for objects. In object oriented programming object names are sufficient, but retrieving mass data by name is senseless.<br />

In most approaches to OODBs an object is coupled with a value of some fixed structure. In our view this already contradicts the goal of objects being abstractions of reality. In real situations an object has several and also changing aspects that should be captured by the object model. Therefore, in our object model each object o consists of a unique identifier id, a set of (type, value) pairs $(T_i, v_i)$, a set of (reference, object) pairs $(ref_j, o_j)$ and a set of operations $m_k$.<br />
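The four components of an object listed above can be pictured with a small data structure. This is a hedged sketch, not the formal OODM definition: the class name `DBObject`, its field names and the choice to compare objects by identifier only are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass(eq=False)
class DBObject:
    """An object: identifier, typed values, named references, operations."""
    oid: int
    values: List[Tuple[str, Any]] = field(default_factory=list)             # (type, value) pairs
    references: List[Tuple[str, "DBObject"]] = field(default_factory=list)  # (reference, object) pairs
    operations: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def __eq__(self, other):
        # Identity is independent of the value: only the identifier counts.
        if not isinstance(other, DBObject):
            return NotImplemented
        return self.oid == other.oid

    def __hash__(self):
        return hash(self.oid)


# Two objects with equal values but different identities.
ann = DBObject(oid=1, values=[("Name", "Ann")])
twin = DBObject(oid=2, values=[("Name", "Ann")])

# An object may gain further aspects and references over its lifetime.
ann.values.append(("Age", 30))
ann.references.append(("colleague", twin))
twin.references.append(("colleague", ann))  # cyclic structures are possible
```

Because equality is tied to the identifier, ann and twin stay distinct although they started with equal values, and ann keeps its identity after its values change.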

Types are used to structure values. Then the first problem concerns the semantics of the type system, i.e. the variety of types that can be defined and used in schema definitions. We consider three different approaches based on a simple type system with set semantics, the typed λ-calculus and a slightly extended version of Girard-Reynolds polymorphism [17, 42, 48]. For the third case it is well-known that there is no set-theoretic model. In this case, however, suitable models can be obtained in the effective topos [34, 32, 50] or even in Grothendieck topoi [47]. Moreover, we may always ask how good a model is with respect to computational aspects. Here again it may be argued that having an intuitionist's mind, i.e. taking a topos-theoretic point of view, may help to have effective computations [49].<br />

Classes serve as structuring primitive for objects having the same structure and behaviour. It is obvious that the multiple aspects view of an object allows objects to be simultaneously members of more than one class and to change class memberships. In the OODM a class structure uniformly combines aspects of object values and references. The extent of classes varies over time, whereas types are immutable. Relationships between classes are represented by references together with referential constraints on the object identifiers involved. Moreover, each class is accompanied by a collection of operations. A schema is given by a collection of class definitions together with explicit integrity constraints. It will be shown that the semantics of OODM schemata can be defined in a uniform way independently of the underlying type system.<br />

Important OODB problems concern the unique identification of objects and the existence of generic update operations [55]. Following [1, 13] the immutable identity of an object can be encoded by the concept of abstract object identifiers. The advantages of this approach are that sharing, mutability of values and cyclic structures can be represented easily [46]. On the other hand, object identifiers do not have a meaning for the user and should therefore be hidden. The notion of value-representability is known to guarantee unique identification in the case of set semantics. This can be generalized to the general case. The same applies to the genericity problem.<br />

Then we show how classes, schemata and instances can be captured in categorical terms. Using the internal logic of a topos, we may define schemata and instances by theories and even get rid of object identifiers by using the existence, identity and description predicates of intuitionistic logic instead. On this basis algebraic and logical queries can be defined. However, this last step depends on value-representability, a necessary property for genericity [55], whereas for the unique identification of objects weak value-identifiability would be sufficient [16, 55]. Some slight extensions, which are omitted here, also allow this case to be captured.<br />

Throughout the paper we assume some basic knowledge about category theory [10], elementary topos theory [35, 39] and their relation to higher-order intuitionistic logic [28, 39, 60].<br />

3.2 Type Systems<br />

We start with a brief look at three different type systems and their semantics. The three approaches comprise a very simple type system with set semantics, typed $\lambda$-calculus with semantics in cartesian closed categories and a version of the polymorphic or second-order typed $\lambda$-calculus.<br />

Common to all these cases is the view that types are basically given by base types and constructors. The latter will occur as types with free (type) variables. A type without free variables will be called proper. Among the base types we assume an abstract identifier type ID. A type $T$ without occurrence of ID will be called a value-type.<br />

A Simple Type System. In set-based modelling a type may be regarded as an immutable set of values of a uniform structure. Subtyping is used to relate values in different types. We use a type system that consists of some base types such as BOOL, NAT, INT, STRING, etc., and type constructors for records, finite sets, lists, etc. Arbitrary types can then be defined by nesting. Moreover, we assume recursive types with a semantics defined by rational trees. More formally, the type system can be defined as<br />
$$t := b \mid x \mid (a_1 : t_1, \dots, a_n : t_n) \mid \{t\} \mid [t] \mid \mu x.t \; .$$<br />
The semantics of such types as sets of values is defined as usual. Then the type system may be extended by a subtype relation $t' \le t$ [42], which semantically gives rise to subtype functions $t' \to t$. We omit the details here.<br />
If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence relation<br />
$$o : t \times t' \to \Omega$$<br />
where $\Omega$ is the truth object in Set, i.e. $\Omega$ = BOOL.<br />
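The grammar of the simple type system and the two notions "proper" and "value-type" can be sketched as a small abstract syntax in Python. This is a minimal sketch under assumptions: the constructor names (`Base`, `Var`, `Record`, `SetOf`, `ListOf`, `Rec`), the helper functions and the variable name `alpha` are chosen here for illustration only.<br />

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:            # base type b, e.g. NAT, STRING or ID
    name: str


@dataclass(frozen=True)
class Var:             # type variable x
    name: str


@dataclass(frozen=True)
class Record:          # record type (a_1 : t_1, ..., a_n : t_n)
    fields: Tuple[Tuple[str, "Type"], ...]


@dataclass(frozen=True)
class SetOf:           # finite set type {t}
    elem: "Type"


@dataclass(frozen=True)
class ListOf:          # list type [t]
    elem: "Type"


@dataclass(frozen=True)
class Rec:             # recursive type binding the variable var in body
    var: str
    body: "Type"


Type = Union[Base, Var, Record, SetOf, ListOf, Rec]


def free_vars(t):
    """Return the names of the free type variables of t."""
    if isinstance(t, Base):
        return frozenset()
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, Record):
        return frozenset().union(*(free_vars(ft) for _, ft in t.fields))
    if isinstance(t, (SetOf, ListOf)):
        return free_vars(t.elem)
    if isinstance(t, Rec):
        return free_vars(t.body) - {t.var}
    raise TypeError(f"not a type: {t!r}")


def mentions_id(t):
    """Return True if the base type ID occurs somewhere in t."""
    if isinstance(t, Base):
        return t.name == "ID"
    if isinstance(t, Var):
        return False
    if isinstance(t, Record):
        return any(mentions_id(ft) for _, ft in t.fields)
    if isinstance(t, (SetOf, ListOf)):
        return mentions_id(t.elem)
    if isinstance(t, Rec):
        return mentions_id(t.body)
    raise TypeError(f"not a type: {t!r}")


def is_proper(t):
    return not free_vars(t)        # proper: no free type variables


def is_value_type(t):
    return not mentions_id(t)      # value-type: no occurrence of ID


NAT, STRING = Base("NAT"), Base("STRING")
PERSONNAME = Record((("FirstName", STRING), ("SecondName", STRING), ("Titles", SetOf(STRING))))
MPERSON = Record((("PersonIdentityNo", NAT), ("Spouse", Var("alpha"))))
STREAM = Rec("x", Record((("head", NAT), ("tail", Var("x")))))
WITH_ID = Record((("PersonIdentityNo", NAT), ("Spouse", Base("ID"))))
```

Here `PERSONNAME` and `STREAM` are proper, `MPERSON` is a value-type with one free variable, and `WITH_ID` is proper but not a value-type.<br />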

Typed $\lambda$-Calculus. In the typed $\lambda$-calculus the main emphasis is on function types, i.e. we can define the type system by<br />

t := b j x j (a 1 : t 1 :::a n : t n ) j t 1 ! t 2 j :<br />

The semantics of the typed $\lambda$-calculus can be described by cartesian closed categories.<br />



Polymorphism. As the third approach we choose a slightly enriched version of Girard-Reynolds polymorphism (GRP), i.e. types are given by the language<br />
$$t := b \mid x \mid t_1 \times \dots \times t_n \mid t_1 \to t_2 \mid \Pi x.t$$<br />
where $b$ denotes some collection of base types including, for our purposes, a type ID of object identifiers, $x$ represents some type variable, $\times$ represents product types, $\to$ represents function types and $\Pi$ represents impredicative polymorphic abstraction with $x$ running over all types [42, 48].<br />

First recall the notion of a topos. A topos $E$ is a finitely complete cartesian closed category with a subobject classifier, i.e. there is an object $\Omega$ and a global element $true : \mathbf{1} \to \Omega$ such that for each monomorphism $f : A \hookrightarrow B$ there is a unique classifying morphism $cl(f) : B \to \Omega$ such that $f$ and $triv$ define the pullback of $cl(f)$ and $true$. Here $\mathbf{1}$ denotes a terminal object and $triv : A \to \mathbf{1}$ the unique morphism into it.<br />
Then we may define what we mean by a model of our type theory in a topos $E$. A model of GRP in a topos $E$ consists of an essentially small internal category $\mathbb{E}$ that is closed under finite products, exponents and $Ob(\mathbb{E})$-indexed products, together with an embedding into $E$ which preserves these properties. The best-known such model is given by the category Per of partial equivalence relations in the effective topos.<br />
For an exposition of various approaches to constructing such models we refer to [34, 47].<br />

3.3 OODM Schemata<br />

In this section we present a slightly modified version of the object oriented datamodel (OODM) of [52, 54, 58]. We observe that an object in the real world always has an identity. Therefore, abstract (i.e. system-provided) object identifiers are introduced to capture identity. However, neither the real world object that was the basis of the abstraction nor the abstract identifier can be used for the identification of an object.<br />
In contrast to existing object oriented datamodels [1, 3, 5, 6, 7, 8, 9, 20, 21, 27, 38, 40, 46, 61] an object is not coupled with a unique type. Instead, we observe that real world objects can have different aspects that may change over time. Therefore, a primary decision was taken to let an object be associated with more than one type and to let these types even change during the object's lifetime. The same applies to references to other objects.<br />

The Class Concept. The class concept provides the grouping of objects having the same structure, which uniformly combines aspects of object values and references. Moreover, generic operations on objects such as object creation, deletion and update of values and references are associated with classes, provided these operations can be defined unambiguously. Objects can belong to different classes, which guarantees that each object of our abstract object model is captured by the collection of possible classes. Just as values are only defined via types, objects can only be defined via classes.<br />
Each object in a class consists of an identifier, a collection of values and references to objects in other classes. Identifiers can be represented using the unique identifier type ID. Values and references can be combined into a representation type, where each occurrence of ID denotes a reference to some other class. Therefore, we may define the structure of a class using types with free variables.<br />

As to dynamics we distinguish between visible and hidden operations to separate those operations that can be invoked by the user from all others. All operations on a class, including the hidden ones, can be accessed by other operations, but only hidden operations can be used to handle identifiers.<br />
In the following $N$ denotes some (large enough) collection of names.<br />

(i) Let $t$ be a value type with free variables $\alpha_1, \dots, \alpha_n$. For pairwise distinct reference names $r_1, \dots, r_n \in N$ and class names $C_1, \dots, C_n \in N$ the expression derived from $t$ by replacing each $\alpha_i$ in $t$ by $r_i : C_i$ for $i = 1, \dots, n$ is called a structure expression.<br />
(ii) A class consists of a class name $C \in N$, a structure expression $S$, a set of superclass names $\{D_1, \dots, D_m\} \subseteq N$ and a set $\{m_1, \dots, m_k\}$ of operations. We call $r_i$ a reference from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$; the type $U_C = (ident : ID, value :: T_C)$ is called the class type of $C$.<br />
(iii) An operation signature consists of an operation name $M \in N$, a set of input-parameter / input-type pairs $\iota_i :: T_i$ ($\iota_i \in N$) and a set of output-parameter / output-type pairs $o_j :: T'_j$ ($o_j \in N$). We write<br />
$$o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n) \; .$$<br />
(iv) An operation $M$ on a class $C$ consists of an operation signature with name $M$ and a body that is recursively built from the following constructs:<br />
(a) assignment $x := E$, where $x$ is either the class variable $x_C$ or a local variable within $S$, and $E$ is a term of the same type as $x$,<br />
(b) skip, fail, loop,<br />
(c) sequential composition $S_1 ; S_2$, choice $S_1 \Box S_2$, projection $x :: T \mid S$, guard $P \to S$, restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type $T$, and<br />
(d) instantiation $x'_1, \dots, x'_i \leftarrow C' : S'(E'_1, \dots, E'_j)$, where $S'$ is an operation on class $C'$ with input-parameters $\iota'_1, \dots, \iota'_j$ and output-parameters $o'_1, \dots, o'_i$, such that the variables $o'_f$, $x'_f$ have the same type and the term $E'_g$ has the same type as the variable $\iota'_g$.<br />
(v) An operation $M$ on a class $C$ with signature $o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n)$ is called value-defined iff all $T_i$ ($i = 1, \dots, n$) and $T'_j$ ($j = 1, \dots, m$) are proper value types.<br />
(vi) A schema $\mathcal{S}$ is a finite collection of classes $C_1, \dots, C_n$ closed under references, superclasses and occurrences of class names in operations.<br />
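The step from a structure expression to the representation type and the class type in definition (ii) can be sketched as follows. The encoding is an assumption made for illustration: base types are plain strings, records are dictionaries from selector names to structure expressions, and the helper class `Ref` stands for a reference $r : C$.<br />

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Ref:
    """A reference r : C inside a structure expression (hypothetical encoding)."""
    name: str
    target: str


# A structure expression is a base-type name, a Ref, or a record given as a
# dict from selector names to structure expressions.
Structure = Union[str, Ref, Dict[str, "Structure"]]


def representation_type(s):
    """Replace every reference r : C by the identifier type ID."""
    if isinstance(s, Ref):
        return "ID"
    if isinstance(s, dict):
        return {sel: representation_type(sub) for sel, sub in s.items()}
    return s


def references(s):
    """Collect (reference name, target class) pairs in order of occurrence."""
    if isinstance(s, Ref):
        return [(s.name, s.target)]
    if isinstance(s, dict):
        return [pair for sub in s.values() for pair in references(sub)]
    return []


def class_type(s):
    """The class type pairs an identifier with a value of the representation type."""
    return {"ident": "ID", "value": representation_type(s)}


professor = {
    "PersonIdentityNo": "NAT",
    "Age": "NAT",
    "Salary": "NAT",
    "Faculty": Ref("Faculty", "DepartmentC"),
}
```

For the structure `professor` the call `representation_type(professor)` yields the same record with `"ID"` in place of the reference, and `references(professor)` lists the single reference to `DepartmentC`.<br />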

Semantics. First assume that the underlying type system has a set semantics. Then we can define instances of OODM schemata.<br />
An instance $\mathcal{D}$ of a schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $U_C$ such that the following conditions are satisfied:<br />

uniqueness of identifiers: For every class $C$ we have<br />
$$\forall i :: ID.\ \forall v, w :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w \; . \tag{3.17}$$<br />
inclusion integrity: For a subclass $C$ of $C'$ we have<br />
$$\forall i :: ID.\ i \in dom(\mathcal{D}(C)) \Rightarrow i \in dom(\mathcal{D}(C')) \; . \tag{3.18}$$<br />
Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />
$$\forall i :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') \; . \tag{3.19}$$<br />
referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence relation $o_r$ we have<br />
$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in dom(\mathcal{D}(C')) \; . \tag{3.20}$$<br />

On the basis of topos theory we can rephrase the definition of database instances. Instead of the set $\mathcal{D}(C)$ we have to consider a subobject $\jmath : \mathcal{D}_C \hookrightarrow ID \times T_C$, i.e. a monomorphism in $\mathbb{E}$. If $\pi : ID \times T_C \to ID$ is the canonical projection, then the uniqueness of identifiers means that $\pi \circ \jmath$ is monic. If $j_C \circ i$ is the image factorization of $\pi \circ \jmath$ with $j_C : im(\mathcal{D}_C) \to ID$, then this must factor through $j_D$ if $C$ is a subclass of $D$. Thirdly, let $\mathcal{D}_C^r$ be the subobject of $\mathcal{D}_C \times ID$ classified by<br />
$$\mathcal{D}_C \times ID \xrightarrow{\jmath \times id} ID \times T_C \times ID \xrightarrow{\pi_2 \times id} T_C \times ID \xrightarrow{o_r} \Omega$$<br />
where $\pi_2 : ID \times T_C \to T_C$ is the second projection and $o_r$ corresponds to the reference from $C$ to $D$. Let $j_r : im(\mathcal{D}_C^r) \hookrightarrow ID$ result from the image factorization of the composite of $\imath : \mathcal{D}_C^r \hookrightarrow \mathcal{D}_C \times ID$ with the projection onto ID. Then $j_r$ must factor through $j_D$.<br />

The semantics of operations can be defined via predicate transformers as shown in [26, 45] for the classical case and in [57] for the topos-based semantics.<br />
Example. Let us look at a simple university example based on the simple type system with set semantics. We first introduce types and classes, then show an example of an instance.<br />

Type PERSONNAME = ( FirstName : STRING , SecondName : STRING , Titles : { STRING } )<br />
Type PERSON = ( PersonIdentityNo : NAT , Name : PERSONNAME )<br />
Type MPERSON = ( PersonIdentityNo : NAT , Spouse : α )<br />
Then let the schema consist of the following classes:<br />

Class PersonC<br />
Structure PERSON<br />
End PersonC<br />

Class MarriedPersonC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT , Spouse : MarriedPersonC )<br />
End MarriedPersonC<br />

Class StudentC<br />
IsA PersonC<br />
Structure ( StudentNumber : NAT , Supervisor : ProfessorC ,<br />
Major : DepartmentC , Minor : DepartmentC )<br />
End StudentC<br />

Class ProfessorC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT , Age : NAT ,<br />
Salary : NAT , Faculty : DepartmentC )<br />
End ProfessorC<br />

Class DepartmentC<br />
Structure ( DeptName : STRING )<br />
End DepartmentC<br />

Next we use D as a name for the instance.<br />

D(PersonC) = { ( i_1 , ( 123 , ( "John" , "Denver" , { "Professor" , "Dr" } ))),<br />
( i_2 , ( 124 , ( "Mary" , "Stuart" , { "Dr" } ))),<br />
( i_3 , ( 456 , ( "John" , "Stuart" , { } ))),<br />
( i_4 , ( 567 , ( "Laura" , "James" , { } ))),<br />
( i_5 , ( 987 , ( "Dave" , "Ford" , { } ))) }<br />

D(MarriedPersonC) = { ( i_1 , ( 123 , i_2 )),<br />
( i_2 , ( 124 , i_1 )) }<br />

D(ProfessorC) = { ( i_1 , ( 123 , 48 , 8000 , i_6 )) }<br />

D(StudentC) = { ( i_3 , ( 456 , 1023 , ( "John" , "Stuart" , { } ) , i_1 , i_6 , i_7 )),<br />
( i_4 , ( 567 , 2134 , ( "Laura" , "James" , { } ) , i_1 , i_6 , i_7 )) }<br />

D(DepartmentC) = { ( i_6 , ( "Computer Science" )),<br />
( i_7 , ( "Philosophy" )),<br />
( i_8 , ( "Music" )) }<br />
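The integrity conditions (3.17), (3.18) and (3.20) can be checked mechanically on this instance. The following Python sketch makes several simplifying assumptions: extents are lists of (identifier, value) pairs, the Titles component is omitted, the subtype condition (3.19) is not checked, and the occurrence relation of a reference is replaced by a fixed tuple position recorded in the hypothetical table `REFS`.<br />

```python
# Extents as lists of (identifier, value) pairs; lists rather than dicts, so
# that a violation of identifier uniqueness can actually be represented.
D = {
    "PersonC": [
        ("i1", (123, ("John", "Denver"))),
        ("i2", (124, ("Mary", "Stuart"))),
        ("i3", (456, ("John", "Stuart"))),
        ("i4", (567, ("Laura", "James"))),
        ("i5", (987, ("Dave", "Ford"))),
    ],
    "MarriedPersonC": [("i1", (123, "i2")), ("i2", (124, "i1"))],
    "ProfessorC": [("i1", (123, 48, 8000, "i6"))],
    "StudentC": [
        ("i3", (456, 1023, ("John", "Stuart"), "i1", "i6", "i7")),
        ("i4", (567, 2134, ("Laura", "James"), "i1", "i6", "i7")),
    ],
    "DepartmentC": [
        ("i6", ("Computer Science",)),
        ("i7", ("Philosophy",)),
        ("i8", ("Music",)),
    ],
}

# Subclass -> list of superclasses.
ISA = {"MarriedPersonC": ["PersonC"], "StudentC": ["PersonC"], "ProfessorC": ["PersonC"]}

# Class -> list of (position of the reference in the value tuple, target class).
REFS = {
    "MarriedPersonC": [(1, "MarriedPersonC")],
    "ProfessorC": [(3, "DepartmentC")],
    "StudentC": [(3, "ProfessorC"), (4, "DepartmentC"), (5, "DepartmentC")],
}


def dom(extent):
    """Identifiers occurring in an extent."""
    return {i for i, _ in extent}


def unique_identifiers(extent):
    """Condition (3.17): one identifier never carries two different values."""
    seen = {}
    for i, v in extent:
        if i in seen and seen[i] != v:
            return False
        seen[i] = v
    return True


def inclusion_integrity(db, isa):
    """Condition (3.18): identifiers of a subclass also occur in each superclass."""
    return all(dom(db[c]) <= dom(db[sup]) for c, sups in isa.items() for sup in sups)


def referential_integrity(db, refs):
    """Condition (3.20): every referenced identifier occurs in the target class."""
    return all(
        v[pos] in dom(db[target])
        for c, rs in refs.items()
        for pos, target in rs
        for _, v in db[c]
    )


# A variant of the instance with a dangling spouse reference.
broken = dict(D)
broken["MarriedPersonC"] = D["MarriedPersonC"] + [("i9", (999, "i42"))]
```

The instance `D` passes all three checks, whereas `broken` fails inclusion integrity (the identifier `i9` is missing from `PersonC`) and referential integrity (`i42` is not a married person).<br />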

3.4 Value Representability<br />

From an object oriented point of view a database may be considered as a huge collection of objects of arbitrarily complex structure. Hence the problem arises of uniquely identifying and retrieving objects in such collections.<br />
Each object in a database is an abstraction of a real world object that has a unique identity. The representation of such objects in the OODM uses an abstract identifier $I$ of type ID to encode this identity. Such an identifier may be considered as being immutable. However, from a systems oriented view permutations or collapses of identifiers without changing anything else should not affect the behaviour of the database.<br />
For the user the abstract identifier of an object has no meaning. Therefore, a different approach to the identification problem is required. We show that the unique identification of an object in a class leads to the notion of value-identifiability. The stronger notion of value-representability is required for the unique definition of generic update operations. The set-based case has been handled in [54, 55].<br />

(i) A class $C$ is called value-identifiable iff there exists a proper value type $I_C$, called value-identification type, such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a morphism $c : T_C \to I_C$ such that the composition<br />
$$\mathcal{D}_C \hookrightarrow ID \times T_C \xrightarrow{\pi_2} T_C \xrightarrow{c} I_C$$<br />
is monic.<br />
(ii) $C$ is called value-representable iff there exists a value-identification type $V_C$ such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a morphism $c : T_C \to V_C$ such that for all value-identification types $I_C$ (with morphism $c_I : T_C \to I_C$ as in (i)) and the image factorization $T_C \xrightarrow{c_V} \mathcal{D}_{V_C} \hookrightarrow V_C$ there exists a morphism $c' : \mathcal{D}_{V_C} \to I_C$ with $c_I = c' \circ c_V$.<br />



It is easy to see that each value-representable class $C$ is also value-identifiable. Moreover, the value-representation type $V_C$ is unique up to isomorphism.<br />
We want to define algorithms to compute types $V_C$ and $I_C$ that turn out to be proper value types under certain conditions. For this we extend subtyping to structure expressions in a natural way, taking care of IsA-relations. Then each super structure expression $S'$ and each instance define a morphism $I_{S'} : \mathcal{D}_C \to \mathcal{D}_{C'} \hookrightarrow ID \times T_{S'}$ using the representation type $T_{S'}$ of $S'$.<br />

Algorithm. Let $F(C_i) = T_i$ provided there exists a super structure expression on $C_i$ defined by $c_i : T_{C_i} \to T_i$, otherwise let $F(C_i)$ be undefined. If ID occurs in some $F(C_i)$ corresponding to $r_j : C_j$ ($j \neq i$), we write $ID_j$.<br />
Then iterate as long as possible using the following rules:<br />
(i) If $F(C_j)$ is a proper value type and $ID_j$ occurs in some $F(C_i)$ ($j \neq i$), then replace this corresponding $ID_j$ in $F(C_i)$ by $F(C_j)$.<br />
(ii) If $ID_i$ occurs in some $F(C_i)$, then let $F(C_i)$ be recursively defined by $F(C_i) == S_i$, where $S_i$ is the result of replacing $ID_i$ in $F(C_i)$ by the type name $F(C_i)$.<br />
The iteration terminates, since there exists only a finite collection of classes. If these rules are no longer applicable, replace each remaining occurrence of $ID_j$ in $F(C_i)$ by the type name $F(C_j)$ provided $F(C_j)$ is defined. $\Box$<br />
Note that the algorithm computes (mutually) recursive types.<br />
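The iteration can be sketched in Python under a few stated assumptions. Types are encoded as base-type names (strings) or records (tuples of selector/type pairs); the helper classes `IdOf` and `Name` stand for an occurrence $ID_j$ and for a type name $F(C_j)$; a type counts as a proper value type here as soon as no `IdOf` occurrence remains. The initial mapping `F` is taken as given rather than derived from super structure expressions, and the rule for self-references is tried before the inlining rule within each pass.<br />

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdOf:
    """An occurrence ID_j of the identifier type that refers to class j."""
    cls: str


@dataclass(frozen=True)
class Name:
    """The type name F(C_j), used for (mutually) recursive types."""
    cls: str


def substitute(t, target, replacement):
    """Replace every IdOf(target) inside t by replacement."""
    if t == IdOf(target):
        return replacement
    if isinstance(t, tuple):  # record: tuple of (selector, type) pairs
        return tuple((sel, substitute(sub, target, replacement)) for sel, sub in t)
    return t


def id_classes(t):
    """Classes C_j such that ID_j still occurs in t."""
    if isinstance(t, IdOf):
        return {t.cls}
    if isinstance(t, tuple):
        return set().union(*(id_classes(sub) for _, sub in t))
    return set()


def value_types(initial):
    """Apply the two rules as long as possible, then the final replacement step."""
    F = dict(initial)
    changed = True
    while changed:
        changed = False
        for i in F:
            # Rule (ii): a self-reference becomes the recursive type name F(C_i).
            if i in id_classes(F[i]):
                F[i] = substitute(F[i], i, Name(i))
                changed = True
            # Rule (i): inline referenced classes that are already proper value types.
            for j in sorted(id_classes(F[i]) - {i}):
                if j in F and not id_classes(F[j]):
                    F[i] = substitute(F[i], j, F[j])
                    changed = True
    # Final step: remaining ID_j become type names whenever F(C_j) is defined.
    for i in F:
        for j in sorted(id_classes(F[i])):
            if j in F:
                F[i] = substitute(F[i], j, Name(j))
    return F


F = {
    "DepartmentC": (("DeptName", "STRING"),),
    "ProfessorC": (("PersonIdentityNo", "NAT"), ("Faculty", IdOf("DepartmentC"))),
    "MarriedPersonC": (("PersonIdentityNo", "NAT"), ("Spouse", IdOf("MarriedPersonC"))),
}
R = value_types(F)
M = value_types({"A": (("b", IdOf("B")),), "B": (("a", IdOf("A")),)})
```

In `R` the department record is inlined into the professor type and the spouse reference becomes the recursive name `Name("MarriedPersonC")`; in `M` the two classes refer to each other by name, which illustrates the mutually recursive case.<br />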

The reference graph of a class $C$ in a schema $\mathcal{S}$ is the smallest labelled graph $G_{ref} = (V, E, l)$ satisfying:<br />
(i) There exists a vertex $v_C \in V$ with $l(v_C) = \{t, C\}$, where $t$ is the top-level type in the structure expression $S$ of $C$.<br />
(ii) For each proper occurrence of a type $t \neq ID$ in $T_C$ there exists a unique vertex $v_t \in V$ with $l(v_t) = \{t\}$.<br />
(iii) For each reference $r_i : C_i$ in the structure expression $S$ of $C$ the reference graph $G^i_{ref}$ is a subgraph of $G_{ref}$.<br />
(iv) For each vertex $v_t$ or $v_C$ corresponding to $t(x_1, \dots, x_n)$ in $S$ there exist unique edges $e_t^{(i)}$ from $v_t$ or $v_C$ respectively to $v_{t_i}$ in case $x_i$ is the type $t_i$, or to $v_{C_i}$ in case $x_i$ is the reference $r_i : C_i$. In the first case $l(e_t^{(i)}) = \{S_i\}$, where $S_i$ is the corresponding selector name; in the latter case the label is $\{S_i, r_i\}$.<br />

Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a schema. Let $\mathcal{S}' = \{C'_1, \dots, C'_n\}$ be another schema such that for all $i$ there exists a super structure expression on $C_i$ defined by some $c_i : T_{C_i} \to T_{C'_i}$. Then an identification graph $G_{id}$ of the class $C_i$ is obtained from the reference graph of $C'_i$ by changing each label $C'_j$ to $C_j$.<br />
With these notations it is easy to see the following. Let $C$ be a class such that there exists a super structure expression for all classes $C_i$ occurring as a label in some identification graph $G_{id}$ of $C$, and let $I_C$ be the type computed by the Algorithm with respect to the super structure expressions used in the definition of $G_{id}$. Then $I_C$ is a proper value type.<br />

Theorem. (i) Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a super structure expression for all classes $C_i$ occurring as a label in the reference graph $G_{ref}$ of $C$. Let $V_C$ be the type $G(C)$ computed by the Algorithm with respect to trivial super structure expressions and let $I_C$ be the type $F(C)$ computed by the Algorithm with respect to arbitrary super structure expressions. Then $C$ is value-representable with value-representation type $V_C$ and each such $I_C$ is a value-identification type.<br />
(ii) Let $C$ be a class in a schema $\mathcal{S}$ such that there exist generic update methods on $C$. Then $C$ is value-representable. Moreover, all super- and subclasses of $C$ are also value-representable.<br />
(iii) Let $C$ be a value-representable class in a schema $\mathcal{S}$ such that all its super- and subclasses are also value-representable. Then there exist unique generic update operations on $C$.<br />
The proof mimics the set-based arguments in [55].<br />

3.5 Logical Reconstruction<br />

So far, we have seen the decisive role of type semantics for OODBs. Given a topos of types, we may describe instances of a schema on top of it. The only assumption is the existence of a type ID of object identifiers. Moreover, it is known from [28, 35, 39] that topoi are inherently connected with higher-order intuitionistic logics.<br />
In principle, there are two (equivalent) ways to approach the logic of a topos. The first one is given by the Mitchell-Bénabou language and Kripke-Joyal semantics [39]; the second one, based on Fourman-Scott languages [28], follows more closely the general approach of logic, defining a syntax and its interpretation in an (arbitrary) topos.<br />

In our presentation we take the second approach, because it directly comes with equality, existence and description [60]. Recall that a Fourman-Scott language $\mathcal{L}$ consists of<br />
- two sets Sort and Const of sorts and constants,<br />
- a power sort map $[\,] : \bigcup_{n \in \mathbb{N}} Sort^n \to Sort$ written $(A_1, \dots, A_n) \mapsto [A_1, \dots, A_n]$,<br />
- a family of countable sets $\{Var_s\}_{s \in Sort}$ indexed by the sorts and<br />
- a map $\# : Const \to Sort$ assigning to each constant its sort.<br />
We also use $Var = \bigcup_{s \in Sort} Var_s$ to refer to the set of all variables. Then for a given variable $x \in Var$ we write $\#x$ to refer to the sort of $x$. Moreover, we use $f = [\,]$ as an abbreviation for the empty power sort, which will be regarded as consisting of truth values.<br />
The terms $\mathcal{T}_s(\mathcal{L})$ of sort $s \in Sort$ for a language $\mathcal{L}$ are constructed from $\mathcal{L}$ as the smallest set such that each variable $x$ of sort $s$, each constant $c$ with $\#c = s$ and $Ix.\varphi$ for each variable $x$ with $\#x = s$ and each formula $\varphi$ belong to $\mathcal{T}_s(\mathcal{L})$.<br />
The formulae of $\mathcal{L}$ build the smallest set $\mathcal{F}(\mathcal{L})$ such that the following formulae are in $\mathcal{F}(\mathcal{L})$:<br />
- $E\tau$ for each term $\tau \in \mathcal{T}(\mathcal{L})$,<br />
- $\tau \equiv \sigma$ for terms $\tau$, $\sigma$ of the same sort $s$,<br />
- $\tau(\tau_1, \dots, \tau_n)$ for terms $\tau_i \in \mathcal{T}_{s_i}(\mathcal{L})$ and $\tau \in \mathcal{T}_{[s_1, \dots, s_n]}(\mathcal{L})$,<br />
- $\varphi \wedge \psi$ for formulae $\varphi$ and $\psi$,<br />
- $\varphi \Rightarrow \psi$ for formulae $\varphi$ and $\psi$ and<br />
- $\forall x.\varphi$ for variables $x \in Var$ and formulae $\varphi$.<br />



We may then introduce the other junctors $\neg$, $\vee$, $\Leftrightarrow$, the predicate $=$ and the quantifier $\exists$ as abbreviations.<br />
The intention behind the description symbol $I$ needs some explanation. Informally $Ix.\varphi$ means the unique $x$ that satisfies $\varphi$. However, such an $x$ may not exist.<br />
The logic deals with this problem by introducing a formal existence predicate $E$, where $E\tau$ means that $\tau$ exists. This is formalized by distinguishing domains $\tilde{A}$ of possible elements and letting $E$ pick out the subdomains of actual elements. Then bound variables will range only over actual elements. When interpreting the logic in a topos this construction is related to partial morphism classification.<br />
The introduction of an existence predicate also influences the equality predicate $=$, which is considered as a property of actual elements. In order to compare possible elements as well, the equivalence predicate $\equiv$ is introduced. Non-existing elements are all considered to be equivalent. Since equality can then be defined in terms of the equivalence and the existence predicates, only $\equiv$ is taken as a primitive in the logic.<br />
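As an illustration of the last remark, the usual way of recovering equality in logics with an existence predicate can be written out as follows. This is a sketch in the style of the standard presentation of such logics, not a quotation from the source; the symbol $\equiv$ for the equivalence predicate and the term names $\tau$, $\sigma$ are assumptions of this note.<br />

```latex
% Equality holds between actual elements that are equivalent:
\tau = \sigma \;:\Longleftrightarrow\; E\tau \,\wedge\, E\sigma \,\wedge\, \tau \equiv \sigma
% Two non-existing elements are equivalent although they are not equal:
\neg E\tau \,\wedge\, \neg E\sigma \;\Longrightarrow\; \tau \equiv \sigma
```

Under this reading $\tau = \tau$ is the same as $E\tau$, so equality of a term with itself already asserts its existence.<br />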

We have mentioned above that the sort $f$ will be considered as truth values. Then the formula $\tau()$ with $\tau \in \mathcal{T}_f(\mathcal{L})$ is a formula that asserts $\tau$.<br />
We dispense with a description of the axioms and rules that define the derivation operator $\vdash$ as well as with the interpretation of $\mathcal{L}$ in an arbitrary topos. We only mention that each theory $T$ of $\mathcal{L}$ canonically defines a topos $\mathbb{E}(T)$, called the topos of definable types and definable total functions, and that each topos $E$ can be written in this form. In particular, there is a canonical interpretation of $\mathcal{L}$ in $\mathbb{E}(T)$, which is sound and complete.<br />

In order to define $\mathbb{E}(T)$ we introduce types and relations as terms of specific syntactic forms. Such types reflect the many possible subdomains of domains associated with power sorts.<br />
A type $A$ is a term of the form $Iy :: [s].\ \forall x :: s.\ (\varphi \Leftrightarrow y(x))$. A relation $f$ from $s$ to $t$ is a term of the form $Iz :: [s, t].\ \forall x :: s, y :: t.\ (\varphi \Leftrightarrow z(x, y))$. A type $A$ or a relation $f$ is said to be definable iff the defining formula $\varphi$ is closed.<br />
A more convenient notation for a type $A$ defined by the formula $\varphi$ is $A = \{x :: s \mid \varphi\}$. For a term $\tau$ of sort $s$ we then get the formula $\tau \in A$. For a variable $x$ with $\#x = s$ we may use the quantifiers $\forall x \in A$ and $\exists x \in A$.<br />
For a relation $f$ we may use the notation $f^\#(\tau)$ for $Iy :: t.\ f(\tau, y)$ for $\tau \in \mathcal{T}_s(\mathcal{L})$ even if we do not know whether $f$ is the graph of a function. Furthermore, we use functional abstraction, writing $\lambda x :: s.\ \tau$ as an abbreviation for $Iz :: [s, t].\ \forall x :: s, y :: t.\ (y = \tau \Leftrightarrow z(x, y))$.<br />
Finally, two relations $f$, $g$ from type $A$ to type $B$ are equivalent with respect to $T$ iff $T \vdash \forall x \in A.\ (f^\#(x) \equiv g^\#(x))$ holds.<br />
Let $T$ be a theory over $\mathcal{L}$. The topos $\mathbb{E}(T)$ of definable types and definable total functions has as objects the definable types of $\mathcal{L}$ and as morphisms from $A$ to $B$ equivalence classes of definable relations from $A$ to $B$ such that $T \vdash \forall x \in A.\ f^\#(x) \in B$ holds. For $f \in Hom(A, B)$ and $g \in Hom(B, C)$ the composition $g \circ f \in Hom(A, C)$ is defined by $\lambda x \in A.\ (g^\#(f^\#(x)))$.<br />

Schemata and Instances as Theories. Given a topos $E$, let us now try to shift the categorical characterization of instances into the associated logic. Recall that the sorts of this logic are the objects of $E$, the constants of sort $A$ are the morphisms $c : \mathbf{1} \to \tilde{A}$, where $\eta_A : A \to \tilde{A}$ is the partial morphism classifier for $A$ [35, 39, 53], and the power sort map takes $A_1, \dots, A_n$ to $\Omega^{A_1 \times \dots \times A_n}$.<br />

Now consider the monomorphism $\jmath : D_C \hookrightarrow ID \times T_C$ and the canonical projection $\pi_1 : ID \times T_C \to ID$. As above, let $j_C : \mathrm{im}(D_C) \hookrightarrow ID$ result from the image factorization of $\pi_1 \circ \jmath$.<br />



Since we assume $\pi_1 \circ \jmath$ to be monic, the universal property of images gives rise to a unique monomorphism $\imath : \mathrm{im}(D_C) \to D_C$.<br />

Since we assume value-representability, $\pi_2 \circ \jmath$ is also monic, hence $\pi_2 \circ \jmath \circ \imath$ gives a monomorphism from $\mathrm{im}(D_C)$, a subobject of $ID$, to $T_C$. Then the universal property of the partial morphism classifier $\eta_{T_C}$ gives rise to a unique monomorphism $I(C) : ID \to \tilde{T}_C$.<br />

Similarly, consider the morphism $o_r : T_C \times ID \to \Omega$ corresponding to a reference $r$ from class $C$ to class $D$. Since $\eta_{T_C} \times I(D) : T_C \times ID \to \tilde{T}_C \times \tilde{T}_D$ defines a monomorphism, we may again consider the partial morphism classifier for $\Omega$, which is $\eta_\Omega = \mathrm{id}_\Omega$. This gives us a unique morphism $\tilde{o}_r : \tilde{T}_C \times \tilde{T}_D \to \Omega$. Then let $I(r) = \hat{\tilde{o}}_r : \tilde{T}_C \to \Omega^{\tilde{T}_D}$ be its exponential adjoint.<br />

Then the morphisms $I(C)$ for all classes $C \in S$ and $I(r)$ for all references in $S$ (assuming for the moment unique reference names) are sufficient to describe objects. In fact, we may think of these morphisms as semantically associated with an instance, whereas syntactically we may use the class names $C$ and the reference names $r$ instead.<br />

This gives rise to formulae of the form $E\,Io.\,\varphi$ as "ground facts" given by some instance. Moreover, the following formulae define the axioms of the schema $S$:<br />

$\forall o.\; E\,C(o) \Rightarrow E\,D(o)$ if $C$ is a subclass of $D$ (3.21)<br />

$\forall o.\; E\,C(o) \Rightarrow \forall o'.\,(r(C(o))(D(o')) \Rightarrow E\,D(o'))$ for a reference $r$ from $C$ to $D$ (3.22)<br />

$\forall o, o'.\; C(o) = C(o') \Rightarrow o = o'$ (3.23)<br />

If $Ax(S)$ is the set of formulae (3.21), (3.22) and (3.23) defined for schema $S$, then this corresponds to the theory $T' = \{\varphi \mid Ax(S) \vdash \varphi\}$. If in addition $Ax(I)$ is a set of formulae given by some instance, maybe only "ground facts" as above, then the corresponding theory is $T' = \{\varphi \mid Ax(S) \cup Ax(I) \vdash \varphi\}$.<br />
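To make the role of the axioms (3.21), (3.22) and (3.23) more concrete, the following sketch checks them against a finite instance. It is an illustration under simplifying assumptions, not the construction of the text: each class is modelled as a partial map from identifiers to hashable values (a Python dictionary), each reference as a set of identifier pairs, and the names `check_schema_axioms`, `subclasses`, `references` and `links` are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

Instance = Dict[str, Dict[int, Hashable]]  # class name -> (identifier -> value)


def check_schema_axioms(
    instance: Instance,
    subclasses: List[Tuple[str, str]],        # pairs (C, D): C is a subclass of D
    references: Dict[str, Tuple[str, str]],   # r -> (C, D): r leads from C to D
    links: Dict[str, Set[Tuple[int, int]]],   # r -> pairs (o, o') related by r
) -> List[tuple]:
    """Return the list of axiom violations; an empty list means consistency."""
    violations: List[tuple] = []
    # (3.21): every object of a subclass C also exists in the superclass D.
    for c, d in subclasses:
        for o in instance[c]:
            if o not in instance[d]:
                violations.append(("3.21", c, d, o))
    # (3.22): an object referenced from C via r must exist in class D.
    for r, (c, d) in references.items():
        for o, o2 in links.get(r, set()):
            if o in instance[c] and o2 not in instance[d]:
                violations.append(("3.22", r, o, o2))
    # (3.23): within a class, equal values imply equal identifiers.
    for c, objs in instance.items():
        seen: Dict[Hashable, int] = {}
        for o, v in objs.items():
            if v in seen:
                violations.append(("3.23", c, seen[v], o))
            else:
                seen[v] = o
    return violations
```

Under these assumptions, an instance satisfying all three checks plays the role of a model of the theory generated by the schema axioms together with the ground facts of the instance.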

Note that each model of such a theory $T'$ in the underlying topos $\mathbb{E}(T)$ gives rise to a logical morphism $\mathbb{E}(T') \to \mathbb{E}(T)$ [28].<br />

Let us finally remark that the construction of $I(C)$ is also possible if value-representability is not assumed, but in this case we shall not get a monomorphism. Then in general a fact such as $Io.\,C(o) = t$ may not exist, i.e. $E\,Io.\,C(o) = t$ may not factor through true. Then the only model would be the inconsistent topos. Nevertheless, a smooth extension to the case of weak value-identifiability is still possible.<br />

Getting Rid of Identifiers. In [16] object identifiers have been identified as a pure implementation concept. This leads to the requirement of weak value-identifiability. In our construction above, where we assumed the stronger value-representability, this is already reflected by the fact that $I(C)$ is a morphism into $\tilde{T}_C$, whereas in the first categorical reformulation we had monomorphisms into $ID \times T_C$.<br />

Nevertheless, the type $\tilde{T}_C$ still involves identifiers corresponding to references, but as shown in [53, 54, 55] the value types that can be used to identify objects can be effectively computed. Let us sketch a corresponding construction in $E$.<br />

Thus, consider the pullback of $I(r) : \tilde{T}_C \to \Omega^{\tilde{T}_D}$ and $\mathrm{id}_{\Omega^{\tilde{T}_D}}$. This defines an object $T_{CrD}$ and morphisms $\exp'(r) : T_{CrD} \to \Omega^{\tilde{T}_D}$ and $\exp(r) : T_{CrD} \to \tilde{T}_C$, the latter being monic. Since $I(r) \circ I(C) = \mathrm{id}_{\Omega^{\tilde{T}_D}} \circ (I(r) \circ I(C))$, the universal property of pullbacks defines a unique monomorphism $I(CrD) : ID \to T_{CrD}$ with $\exp(r) \circ I(CrD) = I(C)$.<br />

We may repeat this construction with respect to all morphisms corresponding to references, including the $\exp'(r)$ constructed above. This defines a diagram $\mathcal{D}$ in $E$. Let $O$ denote the limit of $\mathcal{D}$. Then there is also a unique monomorphism $I : ID \to O$ such that all the morphisms $I(C)$ are given by $I$ and $\mathcal{D}$.<br />

Note that we may also assume all objects in the image of $\mathcal{D}$ to be bounded, i.e. there exists a monomorphism into some fixed object $R$. Then $O$ will also turn out to be a subobject of $R$.<br />

The construction of $O$ glues together types and references, but still does not introduce object descriptions without any identifiers. For this let $\varepsilon_C : T_C \to T'_C$ result from the elimination of all identifiers. Formally, $\varepsilon_C$ occurs as the pushout of $U \to \mathbf{1}$ and $U \to T_C$, where these two morphisms define the pullback of the exponential adjoint $\hat{o}_r$ and the exponential adjoint of $\mathit{true} \circ \mathit{triv}_{ID}$.<br />

If $f$, $g$ define the pushout of $\varepsilon_C$ and $I(r)$, then we also get an object $T'_{CrD}$ by the pullback of $f$ and $g$, hence also morphisms $\varepsilon_{CrD} : T_{CrD} \to T'_{CrD}$. If $\mathcal{D}'$ is a diagram that extends $\mathcal{D}$ by these morphisms, then we obtain the required types without occurrences of $ID$ that can be used to extend the logic.<br />

Queries. In the relational model there are two basic approaches to queries, based on the relational algebra and the relational calculus. We are now able to introduce analogous constructions in the OODM.<br />

In the algebraic perspective we may use all operations supplied by the type system. Syntactically this means considering all closed value terms as queries with a semantics defined by morphisms $t : \mathbf{1} \to \tilde{T}$. In addition, each class $C$ defines a query with semantics given by $I(C) : ID \to \tilde{T}_C$ in an instance $I$. Combining these two basic queries using all operators of the type system gives a simple query language [55]. Note that in the relational subcase we obtain the operators of relational algebra without the join.<br />

Furthermore, we need polymorphic operators to combine queries. For queries defined by morphisms $I_1 \to A$ and $I_2 \to B$ and functions $A \to C$ and $B \to C$ we may consider the "inner" pullback $A \times_C B \to A$, $A \times_C B \to B$ and in the same way the "outer" pullback $I = I_1 \times_C I_2$. Then by universality we obtain a unique morphism $I \to A \times_C B$, defining the semantics of a pullback query. In the relational algebra the join corresponds to such pullbacks.<br />
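In the category of finite sets the pullback query reduces to the familiar join on equal key values. The following sketch illustrates this under that assumption only; queries are modelled as dictionaries from index elements to values, and the names `pullback` and `pullback_query` are hypothetical.

```python
from typing import Callable, Dict, Hashable, Set, Tuple


def pullback(
    A: Set[Hashable],
    B: Set[Hashable],
    f: Callable[[Hashable], Hashable],
    g: Callable[[Hashable], Hashable],
) -> Set[Tuple[Hashable, Hashable]]:
    """Inner pullback A x_C B of f: A -> C and g: B -> C in finite sets."""
    return {(a, b) for a in A for b in B if f(a) == g(b)}


def pullback_query(
    q1: Dict[Hashable, Hashable],   # query I1 -> A
    q2: Dict[Hashable, Hashable],   # query I2 -> B
    f: Callable[[Hashable], Hashable],
    g: Callable[[Hashable], Hashable],
) -> Dict[Tuple[Hashable, Hashable], Tuple[Hashable, Hashable]]:
    """Mediating map I -> A x_C B, where I = I1 x_C I2 is the outer pullback."""
    return {
        (i1, i2): (a, b)
        for i1, a in q1.items()
        for i2, b in q2.items()
        if f(a) == g(b)
    }
```

With `f` and `g` chosen as projections onto a shared attribute, the keys of the result form the outer pullback and the values are exactly the joined pairs, which is the set-theoretic reading of the relational join.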

For classes containing references we may also consider queries $\{r/D\}.C$ defined by the substitution of class $D$ for reference $r : D$. Semantically we consider again a pullback $\tilde{T}_C \times_r \Omega^{\tilde{T}_D}$ over $\mathrm{id} : \Omega^{\tilde{T}_D} \to \Omega^{\tilde{T}_D}$ and $I(r) : \tilde{T}_C \to \Omega^{\tilde{T}_D}$. Then the morphisms $I(C) : ID \to \tilde{T}_C$ and $I(r) \circ I(C) : ID \to \Omega^{\tilde{T}_D}$ give rise to a unique monomorphism $ID \hookrightarrow \tilde{T}_C \times_r \Omega^{\tilde{T}_D}$, which defines the semantics of reference substitution queries.<br />

For the calculus things are much easier, since we may exploit the associated logic. Since classes and references have been incorporated into the logic, a query is simply given by a term $Ix.\,\varphi$ with a defining formula $\varphi$. This generalizes the relational approach.<br />

3.6 Conclusion<br />

In this paper we indicated some fundamentals and logical semantics for object oriented databases. The starting point was the consideration of building blocks in OODB schemata, i.e. types and classes. First we observed the decisive importance of type semantics. Objects are considered to be abstractions of real world entities, hence they have an immutable identity. This identity is first encoded by abstract identifiers that are assumed to form some type $ID$. There is not only one value of a given type that is associated with an object. In contrast, we allow several values of possibly different types to belong to an object, and even this collection of types may change. Classes are used to structure objects. At each time a class corresponds to a collection of objects with values of the same type and references to objects in a fixed set of classes.<br />

In general, it is reasonable to assume a semantics based on topos theory. Then all these considerations can be generalized using notions from category theory. On this basis the problems of identification and genericity have been solved in general. The unique identification of objects and the existence of generic update operations in a class require the class to be value-representable.<br />

Since topos theory is inherently connected with higher-order intuitionistic logic, we were able first to rephrase the notions of object oriented databases in category theory and then to transform them into logic. This allows the definition of a query algebra and a query calculus. Taking value-representability into account as a desirable property, we could even show how to get rid of object identifiers, which have already been identified as a pure implementation concept.<br />

The results achieved so far seem to offer a reasonable logical foundation for object oriented databases. They even allow us to relate this field to recent investigations in the foundations of computer science with respect to type theory and effective computation.<br />

Nevertheless, this is just the beginning of a story concerning deductive capabilities in object oriented databases. To proceed, it will be interesting to investigate (higher-order) geometric theories [39, 68, 69]. Further research is planned in this direction.<br />

As to the dynamics of object oriented databases concerning the formalization of operation semantics, we may wish to exploit e.g. axiomatic semantics in the sense of Dijkstra's predicate transformers [23, 26, 45]. The problem with this theory is that it depends on the use of a suitable logic that guarantees the existence of predicate transformers with the intended semantics. Whilst the classical theory uses an infinitary first-order logic, the required generalization to topos logic has been shown in [53, 57].<br />

Finally, types can be handled in a much more flexible way if we extend algebraic data type specifications by higher-order functional and truth-value sorts and define topoi as models of such constructor theories. This approach is described in [53, 56].<br />

It is then an open problem how this kind of type theory relates to synthetic domain theory, which is roughly "domain theory within a topos" [29, 33, 51, 64]. The basic assumption of this theory is that "domains" are specific objects in a topos such that all morphisms between them are continuous and all constructions are solely based on categorical properties without recourse to order-theoretic properties. Again the effective topos turns out to be a reasonable source of examples of that kind of theory.<br />

References for Chapter 3<br />

1. S. Abiteboul: Towards a deductive object-oriented database language, Data & Knowledge Engineering, vol. 5, 1990, pp. 263–287<br />
2. S. Abiteboul, R. Hull: IFO: A Formal Semantic Database Model, ACM ToDS, vol. 12 (4), December 1987, pp. 525–565<br />
3. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD, Portland, Oregon, 1989, pp. 159–173<br />
4. H. Aït-Kaci: An Overview of LIFE, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991, pp. 42–58<br />
5. A. Albano, G. Ghelli, R. Orsini: Types for Databases: The Galileo Experience, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 27–37<br />



6. A. Albano, G. Ghelli, R. Orsini: Objects and Classes for a Database Programming Language, FIDE technical report 91/16, 1991<br />
7. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented Database Programming Language, in A. Sernadas (Ed.): Proc. VLDB 91, Barcelona 1991<br />
8. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented Database System Manifesto, Proc. 1st DOOD, Kyoto 1989<br />
9. F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel, S. Gamerman, C. Lécluse, P. Pfeffer, P. Richard, F. Vélez: The Design and Implementation of O2, an Object-Oriented Database System, Proc. of the ooDBS II workshop, Bad Münster, FRG, September 1988<br />
10. M. Barr, C. Wells: Category Theory for Computing Science, Prentice-Hall 1990<br />
11. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, pp. 370–395<br />
12. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol. 5 (4), 1990, pp. 353–382<br />
13. C. Beeri, Y. Kornatzky: Algebraic Optimization of Object-Oriented Query Languages, in S. Abiteboul, P. C. Kanellakis (Eds.): Proc. ICDT '90, Springer LNCS 470, pp. 72–88<br />
14. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92<br />
15. C. Beeri, T. Milo: Subtyping in OODBs, in Proc. PODS '91<br />
16. C. Beeri, B. Thalheim: Can I see your Identification, please?, Proc. of the Workshop on Database Semantics, Řež, January 1995 (to appear)<br />
17. K. B. Bruce, A. R. Meyer: The Semantics of Second Order Polymorphic Lambda Calculus, in G. Kahn, D. B. MacQueen, G. Plotkin (Eds.): Semantics of Data Types, Springer LNCS 173, 1984, pp. 131–144<br />
18. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM Computing Surveys 17 (4), pp. 471–522<br />
19. L. Cardelli: Typeful Programming, Digital Systems Research Center Reports 45, DEC SRC Palo Alto, May 1989<br />
20. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. ACM SIGMOD 88<br />
21. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the Workshop on Database Programming Languages, Roscoff, France, September 1987<br />
22. R. G. G. Cattell: Object Data Management: Object Oriented and Extended Relational Database Systems, Addison-Wesley, 1991<br />
23. P. Cousot: Methods and Logics for Proving Programs, in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B: "Formal Models and Semantics", Elsevier, 1990, pp. 841–993<br />
24. A. Dearle, R. Connor, F. Brown, R. Morrison: Napier88 - A Database Programming Language?, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 10–26<br />
25. K. Denninghoff, V. Vianu: Database Method Schemas and Object Creation, in Proc. PODS '93, pp. 265–275<br />
26. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer-Verlag, 1989<br />
27. D. Fishman, D. Beech, H. Cate, E. Chow et al.: IRIS: An Object-Oriented Database Management System, ACM ToIS, vol. 5 (1), January 1987<br />
28. M. P. Fourman: The Logic of Topoi, in J. Barwise (Ed.): Handbook of Mathematical Logic, North-Holland Studies in Logic, vol. 90, 1977, pp. 1053–1090<br />
29. P. Freyd: Recursive Types reduced to Inductive Types, in J. Mitchell (Ed.): 5th Symposium on Logic in Computer Science, Philadelphia, 1990<br />
30. M. Hammer, D. McLeod: Database Description with SDM: A Semantic Database Model, J. ACM, vol. 31 (3), 1984, pp. 351–386<br />
31. R. Hull, R. King: Semantic Database Modeling: Survey, Applications and Research Issues, ACM Computing Surveys, vol. 19 (3), September 1987<br />
32. J. Hyland: The Effective Topos, in A. Troelstra, D. van Dalen (Eds.): The L.E.J. Brouwer Centenary Symposium, North Holland, 1982, pp. 165–216<br />



33. J. Hyland: First Steps in Synthetic Domain Theory, in A. Carboni, M. Pedicchio, G. Rosolini (Eds.): Category Theory '90, Springer LNM, vol. 1488, 1992<br />
34. J. Hyland, E. Robinson, G. Rosolini: The Discrete Objects in the Effective Topos, Proc. LMS 60 (1990), pp. 1–60<br />
35. P. Johnstone: Topos Theory, LMS Monographs vol. 10, Academic Press, 1977<br />
36. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon, 1986<br />
37. M. Kifer, G. Lausen: F-Logic: A Higher-order Language for Reasoning about Objects, Inheritance and Schema, in Proc. SIGMOD 1989, pp. 134–146<br />
38. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented Programming System with a Database System, in Proc. OOPSLA 1988<br />
39. S. Mac Lane, I. Moerdijk: Sheaves in Geometry and Logic – A First Introduction to Topos Theory, Springer Universitext, 1992<br />
40. D. Maier, J. Stein, A. Otis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA, September 1986<br />
41. F. Matthes, J. W. Schmidt: Bulk Types – Add-On or Built-In?, in Proc. DBPL III, Nafplion 1991<br />
42. J. C. Mitchell: Type Systems for Programming Languages, in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B: "Formal Models and Semantics", Elsevier, 1990, pp. 365–458<br />
43. J. Mylopoulos, P. A. Bernstein, H. K. T. Wong: A Language Facility for Designing Interactive Database-Intensive Applications, ACM ToDS, vol. 5 (2), April 1980, pp. 185–207<br />
44. J. Mylopoulos, A. Borgida, M. Jarke, M. Koubarakis: Telos: Representing Knowledge About Information Systems, ACM ToIS, vol. 8 (4), October 1990, pp. 325–362<br />
45. G. Nelson: A Generalization of Dijkstra's Calculus, ACM TOPLAS, vol. 11 (4), October 1989, pp. 517–561<br />
46. A. Ohori: Representing Object Identity in a Pure Functional Language, Proc. ICDT 90, Springer LNCS, pp. 41–55<br />
47. A. M. Pitts: Polymorphism is Set Theoretic, Constructively, in D. H. Pitt, A. Poigné, D. E. Rydeheard (Eds.): Category Theory and Computer Science, Springer LNCS 283, pp. 12–39<br />
48. J. C. Reynolds: Polymorphism is not Set-Theoretic, in G. Kahn, D. B. MacQueen, G. Plotkin (Eds.): Semantics of Data Types, Springer LNCS 173, 1984, pp. 145–156<br />
49. G. Rosolini: Categories and Effective Computations, in D. H. Pitt, A. Poigné, D. E. Rydeheard (Eds.): Category Theory and Computer Science, Springer LNCS 283, pp. 1–11<br />
50. G. Rosolini, E. Robinson: Colimit Completions and the Effective Topos, Journal of Symbolic Logic 55 (1990), pp. 678–699<br />
51. G. Rosolini: Notes on Synthetic Domain Theory, University of Genova, February 1995<br />
52. K.-D. Schewe, B. Thalheim, I. Wetzel, J. W. Schmidt: Extensible Safe Object-Oriented Design of Database Applications, University of Rostock, Preprint CS-09-91, September 1991<br />
53. K.-D. Schewe: Specification of Data-Intensive Application Systems, Habilitation Thesis, TU Cottbus, 1994<br />
54. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in Object-Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 341–356<br />
55. K.-D. Schewe, B. Thalheim: Fundamental Concepts of Object Oriented Databases, Acta Cybernetica, vol. 11 (4), 1993, pp. 49–84<br />
56. K.-D. Schewe: A Semantics for Type Specifications Based on Topos Theory, TU Cottbus, Technical Report I-5 / 1994<br />
57. K.-D. Schewe: A Non-Classical Generalization of Dijkstra's Calculus – Axiomatic Semantics for Typed Program Specifications, TU Cottbus, Technical Report I-6 / 1994<br />
58. K.-D. Schewe, B. Thalheim, I. Wetzel: Foundations of Object Oriented Database Concepts, University of Hamburg, Report FBI-HH-B-157/92, October 1992<br />
59. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992<br />



60. D. S. Scott: Identity and Existence in Intuitionistic Logic, in M. P. Fourman, C. J. Mulvey, D. S. Scott (Eds.): Applications of Sheaves, Springer LNM 753, pp. 660–696<br />
61. M. H. Scholl, H.-J. Schek: A Relational Object Model, in Proc. ICDT 90, Springer LNCS, pp. 89–105<br />
62. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages, in Proc. HICSS '92<br />
63. S. Y. W. Su: SAM*: A Semantic Association Model for Corporate and Scientific-Statistical Databases, Inf. Sci., vol. 29, 1983, pp. 151–199<br />
64. P. Taylor: The Fixed Point Property in Synthetic Domain Theory, in G. Kahn (Ed.): 6th Symposium on Logic in Computer Science, Amsterdam 1991, pp. 152–160<br />
65. J. Van den Bussche, D. Van Gucht: A Hierarchy of Faithful Set Creation in Pure OODBs, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 326–340<br />
66. J. Van den Bussche, D. Van Gucht: Semi-determinism, in Proc. PODS '92, ACM Press, pp. 191–201<br />
67. J. Van den Bussche: Formal Aspects of Object Identity in Database Manipulation, Ph.D. Thesis, University of Antwerp, 1993<br />
68. S. Vickers: Geometric Theories and Databases, in M. P. Fourman, P. T. Johnstone, A. M. Pitts (Eds.): Applications of Category Theory in Computer Science, London Mathematical Society Lecture Notes Series, Cambridge University Press, 1992, pp. 288–314<br />
69. S. Vickers: Geometric Logic in Computer Science, in G. L. Burn, S. J. Gray, M. D. Ryan (Eds.): Theory and Formal Methods 1993, Springer WiCS, 1993, pp. 37–54<br />
70. S. B. Zdonik, D. Maier: Readings in Object Oriented Database Systems, Morgan Kaufmann Publishers, 1990<br />



Chapter 4<br />

Higher-Level Genericity in Object Oriented Databases<br />

Contents<br />

4.1 Introduction<br />
4.2 A Core Object Oriented Database Language<br />
4.2.1 A Simple Type System<br />
4.2.2 Specification of Structure<br />
4.2.3 Database Instances<br />
4.2.4 Specification of Behaviour<br />
4.3 Genericity Beyond Polymorphism<br />
4.3.1 Implicit Schema Extensions<br />
4.3.2 Linguistic Reflection<br />
4.3.3 Reflection Types<br />
4.3.4 Generators for Generic Updates<br />
4.4 Integrity Enforcement<br />
4.4.1 User-Defined Integrity Constraints<br />
4.4.2 Greatest Consistent Specializations<br />
4.5 Conclusion<br />

This chapter contains a reprint of<br />

Klaus-Dieter Schewe, David Stemple, Bernhard Thalheim. Higher-Level Genericity in Object Oriented Databases. Proc. COMAD 1994.<br />



Abstract. Object oriented databases (OODBs) are composed of semi-independent objects but must also provide for the maintenance of inter-object consistency, especially with respect to constraints arising from class hierarchies and inter-object references. Hence the problem arises of providing consistent generic update methods.<br />

We address the problem of how to derive such methods from the structure of an OODB schema by the specification of generator macros for them. These generators are based on a strict mathematical formalization of OODB concepts, including the possibility to represent syntactic components of the language as values within the language itself, which is known to form the basis of linguistic reflection.<br />

Moreover, the approach can be extended to the enforcement of user-defined integrity constraints that give rise to context sensitive macros turning each user-defined method into branches of its greatest consistent specialization.<br />

Keywords: object oriented databases, genericity, integrity constraints, consistency, linguistic reflection<br />

4.1 Introduction<br />

The relational datamodel (RDM) was the first to support the complete abstraction from<br />

physical data organization. This was certainly one of its advantages in comparison to former<br />

hierarchical and network models and one of the reasons for its success. Another one is certainly<br />

the simplicity and elegance of its query and update languages. In particular, each RDM<br />

schema is accompanied by operations to insert, delete or update a tuple. These operations are<br />

generic in the sense that they are applicable to each relation in the schema. To be even more<br />

precise, the kind of genericity required here can be obtained by parametric polymorphism,<br />

since it is sufficient to know the underlying RECORD-type of a relation.<br />
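To illustrate why parametric polymorphism suffices in the relational case, the following is a minimal Python sketch, not taken from the paper. The class `Relation`, the type variable `R` and the record type `Employee` are hypothetical names chosen for illustration. The three operations are written once and work for any record type.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

R = TypeVar("R")  # stands for the underlying RECORD type of a relation


class Relation(Generic[R]):
    """A relation modelled as a mutable set of tuples of one record type R."""

    def __init__(self):
        self.tuples = set()

    def insert(self, t: R) -> None:
        self.tuples.add(t)        # set semantics: duplicates are absorbed

    def delete(self, t: R) -> None:
        self.tuples.discard(t)    # no effect if t is absent

    def update(self, old: R, new: R) -> None:
        if old in self.tuples:
            self.tuples.remove(old)
            self.tuples.add(new)


@dataclass(frozen=True)
class Employee:                   # hypothetical record type, not from the text
    emp_no: int
    name: str


employees = Relation()
employees.insert(Employee(1, "Ada"))
employees.update(Employee(1, "Ada"), Employee(1, "Ada L."))
```

Nothing in `Relation` inspects the fields of `R`, which is exactly what makes the operations generic here. The object oriented case discussed below needs schema information and does not fit this pattern.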

However, some shortcomings of the RDM encouraged much research aimed at achieving<br />

more flexible and efficient datamodels. It has been claimed in [1] that object orientation<br />

provides the key technology for future database systems and languages. In order to provide<br />

object oriented databases (OODBs) with the same grade of maturity as existing relational<br />

systems, it is a reasonable goal to try to preserve the advantages of the RDM therein.<br />

In object oriented programming the notion of an object was intended as a generalization of<br />

the abstract data type concept with the additional feature of inheritance. In this sense object<br />

orientation involves the isolation of data in semi-independent modules in order to promote<br />

high software development productivity. Object oriented databases must regard an object<br />

also as a basic unit of persistent data, and therefore are composed of independent objects, but<br />

must also provide for the maintenance of inter-object consistency, a demand that is to some<br />

degree in dissonance with the basic style of object orientation.<br />

Therefore, it is not too surprising that many object oriented database systems do not<br />

provide generic update methods [12], or that these fail to enforce model-inherent inclusion and<br />

referential constraints. Another source of confusion is due to object identifiers [4, 14], a concept<br />

used for encoding the identity of objects. Making such identifiers visible to the user, as done<br />

in programming languages, does not make much sense in databases. However, regarding them<br />

as a pure implementation concept as in [6] raises the question of whether generic updates<br />

actually exist.<br />

In fact, generic updates in OODBs are much more complicated than in the RDM due<br />

to the fact that identifiers may not occur within input- and output-values and that at least<br />

model-inherent constraints have to be maintained, which requires context information in<br />

order to provide generic update methods. It has been shown in [21, 17] that parametric<br />

polymorphism [11] as used in most object oriented languages is insufficient for the genericity<br />

problem. Nevertheless, we shall address in this paper the problem of how to derive generic<br />

update methods in OODBs automatically and efficiently.<br />

Our solution is based on the formally defined object oriented datamodel (OODM) introduced<br />

in [24] with a clear distinction between values and objects as required in [7, 8]. Types<br />

correspond to immutable sets of values, classes correspond to mutable collections of objects.<br />

In Section 4.2 we briefly describe the basic features of this model.<br />

The main advantage of the OODM used here is its theoretical basis [24]. Some competing<br />

models [13, 16] are closely oriented to a particular object oriented programming language<br />

and ignore certain mismatches in coupling these with a database. Others [9, 10] are basically<br />

behaviourally extended semantic datamodels. The models in [3, 5] share the OODM property of a datamodel<br />

orientation, but still ignore the problems of object identification, genericity and consistency<br />

with respect to model-inherent constraints.<br />

As shown in [22, 24], generic consistent update methods exist for value-representable classes<br />

and only for them. Hence, the construction of such methods depends on additional integrity<br />

constraints that are required for value-representability. Moreover, in order to capture cyclic<br />

references between objects, an extension to types is required that allows rational tree¹ structures<br />

to be defined by type equations. This corresponds to the -terms introduced in [2]. Such<br />

generic consistent update methods as well as finitely representable, but infinite structures are<br />

missing in almost all competing OODB languages.<br />

The efficient construction of generic update methods is based on linguistic reflection as<br />

described in [19, 20]. Type-safe linguistic reflection came up with the development of the<br />

ADABTPL language, which placed a primary interest on the development of correct database<br />

transactions [18]. It turned out that synthesizing common operations in the RDM such as or<br />

natural join would be helpful, but these are not polymorphically expressible.<br />

The main idea is to provide macro facilities that allow computing with syntactic representations<br />

of language constructs in a type-safe fashion within the database language itself.<br />

In Section 4.3 we describe this approach to generator macros for generic update methods.<br />

The approach includes the computation of the value-representation types for all classes<br />

in a given schema, hence genericity in this case exceeds the capability of simple parametric<br />

polymorphism. The implementation of the acyclic case without propagation is described in<br />

[27]. The implementation of the general case is currently under investigation.<br />

The approach suggests an immediate extension to integrity enforcement with respect to<br />

explicit user-defined constraints, especially those that arise from generalizing corresponding<br />

constraints in the RDM such as functional, inclusion and exclusion constraints [26]. Enforcing<br />

constraints can be formalized by the computation of greatest consistent specializations (GCSs)<br />

of user-defined methods, an approach that occurs naturally in OODBs, since operational<br />

specialization is already present when overriding methods.<br />

In [23] an algorithm has been presented that allows GCS construction (under certain<br />

technical prerequisites) to be reduced to basic operations. Greatest consistent specializations<br />

of generic update methods for the OODM were presented in [25]. We briefly describe GCS<br />

construction in Section 4.4.<br />

¹ A rational tree is a finite or infinite tree with only a finite number of different subtrees.<br />



4.2 A Core Object Oriented Database Language<br />

In the object-oriented approach we distinguish between objects and values. Values can be<br />

grouped into types that may be regarded as immutable sets of values of a uniform structure<br />

together with operations defined on them. Subtyping is used to relate values in different types.<br />

The class concept provides the grouping of objects having the same structure, which uniformly<br />

combines aspects of object values, references and subreferences, but objects can belong<br />

to different classes. Just as values are only defined via types, objects can only be defined<br />

via classes.<br />

References and subreferences between classes give rise to implicit referential constraints.<br />

In addition, subreferences (part-of) define local referential constraints, and subclasses (IsA-relationships)<br />

require each database instance to satisfy inclusion constraints on object identifiers.<br />

We shall later extend this picture, allowing additional constraints to be defined by the<br />

user.<br />

As usual in object oriented approaches, methods are used to model the database dynamics.<br />

In the OODM these are associated with classes. In addition we shall later add macros, with<br />

the difference that a macro produces new language expressions from language expressions.<br />

4.2.1 A Simple Type System<br />

Here we follow the classical view of types in [11] using a type system that consists of some<br />

basic types, type constructors and a subtyping relation. Moreover, assume the existence of<br />

recursive types, i.e. types defined by domain equations.<br />

The base types are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$, where ID is an<br />

abstract identifier type without any non-trivial supertype and $\bot$ is the trivial type that is a<br />

supertype for every type. The type constructors are $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite<br />

set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a_1 : \alpha_1) \cup \dots \cup (a_n : \alpha_n)$ (union). We may use base types and<br />

constructors to define new types by nesting.<br />

It is easy to extend such a type system by adding e.g. a function type constructor $\alpha \to \beta$.<br />

We abstained from this extension here because of the object identification and genericity<br />

problems. We shall discuss this problem at the end of the next subsection.<br />

Example 4.1. The type definition for PERSONNAME uses both the set constructor $\{\cdot\}$<br />

and the record constructor $(\cdot)$:<br />

Type PERSONNAME =<br />

( FirstName : STRING , SecondName : STRING , Titles : { STRING } )<br />

End PERSONNAME<br />

The definition of a type PERSON uses the type PERSONNAME.<br />

Type PERSON =<br />

( PersonIdentityNo : NAT , Name : PERSONNAME )<br />

End PERSON<br />

The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />

standard operators on base types and on records, sets, bags, etc. We omit the details here. A<br />

type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff there<br />

is no occurrence of ID in $t$. If $t'$ is a proper type occurring in a type $t$, then there exists a<br />

corresponding occurrence relation $o : t \times t' \to \mathit{BOOL}$.<br />



A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by<br />

the usual subtype relation [11].<br />

4.2.2 Specification of Structure<br />

Each object in a class consists of an identifier, a collection of values and references / subreferences<br />

to other objects. Identifiers can be represented using the unique identifier type ID.<br />

Values and (sub)references can be combined in a representation type, where each occurrence<br />

of ID denotes a reference to some other class. Therefore, we may define the structure of a<br />

class using parameterized types. Moreover, classes are arranged in IsA-hierarchies.<br />

A structural class consists of a class name $C$, a set of class names $D_1, \dots, D_m$ (in the<br />

following called superclasses) and a value type expression $S$ with all parameters replaced<br />

either by a reference ref $r_i : C_i$ or by a subreference part $r_i : C_i$ with (sub)reference names<br />

$r_i$ and class names $C_i$.<br />

If $r_i$ occurs within ref $r_i : C_i$ in the structure expression of a class $C$, we call $r_i$ the<br />

reference named $r_i$ from class $C$ to class $C_i$. If $r_i$ occurs within part $r_i : C_i$ in the structure<br />

expression of a class $C$, we call $r_i$ the subreference named $r_i$ from class $C$ to class $C_i$.<br />

The type derived from the structure expression $S$ of a class by replacing each reference<br />

ref $r_i : C_i$ and each subreference part $r_i : C_i$ by the type ID is called the representation type<br />

$T_C$ of the class $C$; the type $U_C = (\mathit{ident} : \mathit{ID}, \mathit{value} : T_C)$ is called the class type of $C$.<br />

Example 4.2.<br />

Let us now describe some structural classes in a simple university example.<br />

Class PersonC<br />

Structure PERSON<br />

End PersonC<br />

Class ProfessorC<br />

IsA PersonC<br />

Structure ( PersonIdentityNo : NAT , Age : NAT ,<br />

Salary : NAT , ref Faculty : DepartmentC )<br />

End ProfessorC<br />

Class DepartmentC<br />

Structure ( DeptName : STRING ,<br />

ref Head : ProfessorC ,<br />

Phones : { NAT } )<br />

End DepartmentC<br />

□<br />
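The passage from a structure expression to the representation type $T_C$ and the class type $U_C$ can be sketched in Python as follows. The encoding is an assumption made only for this illustration: base types are strings, records are dictionaries, set types are pairs `("set", element)`, and `Ref` is a hypothetical helper for (sub)references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    kind: str      # "ref" or "part"
    name: str      # (sub)reference name r_i
    target: str    # class name C_i


def representation_type(structure):
    """Replace every reference or subreference in a structure expression by ID."""
    if isinstance(structure, Ref):
        return "ID"
    if isinstance(structure, dict):                      # record constructor
        return {label: representation_type(s) for label, s in structure.items()}
    if isinstance(structure, tuple):                     # ("set" | "list" | "bag", element)
        constructor, element = structure
        return (constructor, representation_type(element))
    return structure                                     # base type name


def class_type(structure):
    """Class type U_C = (ident : ID, value : T_C)."""
    return {"ident": "ID", "value": representation_type(structure)}


# Structure expression of DepartmentC from Example 4.2.
department = {
    "DeptName": "STRING",
    "Head": Ref("ref", "Head", "ProfessorC"),
    "Phones": ("set", "NAT"),
}
T_department = representation_type(department)
```

In this encoding the only place where a structure expression differs from its representation type is at the `Ref` leaves, which mirrors the definition given above.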

4.2.3 Database Instances<br />

A (structural) schema $\mathcal{S}$ is a finite collection of structural classes $C_1, \dots, C_n$ closed under<br />

references and superclasses. In order to define the semantics of structural schemata, we need<br />

the notion of a database instance.<br />

An instance $\mathcal{D}$ of a structural schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{U_C\}$<br />

such that the following conditions are satisfied:<br />

uniqueness of identifiers: For every class $C$ we have<br />

\[ \forall i :: \mathit{ID}.\ \forall v, w :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w . \tag{4.24} \]<br />

inclusion integrity: For a subclass $C$ of $C'$ we have<br />

\[ \forall i :: \mathit{ID}.\ i \in \mathrm{dom}(\mathcal{D}(C)) \Rightarrow i \in \mathrm{dom}(\mathcal{D}(C')) . \tag{4.25} \]<br />

Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />

\[ \forall i :: \mathit{ID}.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') . \tag{4.26} \]<br />

referential integrity: For each reference or subreference $r$ from $C$ to $C'$ with corresponding<br />

occurrence relation $o_r$ we have<br />

\[ \forall i, j :: \mathit{ID}.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in \mathrm{dom}(\mathcal{D}(C')) . \tag{4.27} \]<br />

local referential integrity: For each subreference $r$ from $C$ to a class $C'$ with corresponding<br />

occurrence relation $o_r$ we have<br />

\[ \forall i_1, i_2, j :: \mathit{ID}.\ \forall v_1, v_2 :: T_C.\ (i_1, v_1) \in \mathcal{D}(C) \wedge (i_2, v_2) \in \mathcal{D}(C) \wedge j \in \mathrm{dom}(\mathcal{D}(C')) \wedge o_r(v_1, j) \wedge o_r(v_2, j) \Rightarrow i_1 = i_2 \wedge v_1 = v_2 . \tag{4.28} \]<br />

We know from [22] that schema-defined generic update operations only exist for value-representable<br />

classes. In turn, value-representability is implied by imposing a trivial uniqueness<br />

constraint on each class. Therefore, in order to guarantee the existence of generic update<br />

methods we also assume for each class $C$ the following condition:<br />

value-identifiability:<br />

\[ \forall i, j :: \mathit{ID}.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (j, v) \in \mathcal{D}(C) \Rightarrow i = j . \tag{4.29} \]<br />

If we do not have function types, then for each database instance it is decidable whether the<br />

value-identifiability condition holds. If functions come into play, this is no longer true, since<br />

we then have to check the equality of functions. Introducing function types therefore requires<br />

a more sophisticated treatment of value-identifiability in the sense that we have to require a<br />

decidable uniqueness constraint. For the reflective generation of generic update operations,<br />

however, we need to know that they exist, not the reason why they exist. In order not to<br />

overload the presentation, we therefore decided to keep the type system as simple as possible.<br />
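Without function types, the instance conditions can be checked directly. The following Python sketch does so for (4.24), (4.25), (4.27) and (4.29). It assumes that $\mathcal{D}(C)$ is given as a list of `(identifier, value)` pairs with hashable values, and that the occurrence relation $o_r$ is supplied as a function `referenced_ids` returning the identifiers that occur in a value. These names and the sample data are illustrative assumptions.

```python
def unique_identifiers(pairs):
    """(4.24): an identifier determines its value."""
    seen = {}
    for ident, value in pairs:
        if seen.setdefault(ident, value) != value:
            return False
    return True


def value_identifiable(pairs):
    """(4.29): a value determines its identifier."""
    seen = {}
    for ident, value in pairs:
        if seen.setdefault(value, ident) != ident:
            return False
    return True


def inclusion_integrity(sub_pairs, super_pairs):
    """(4.25): every identifier of the subclass occurs in the superclass."""
    super_ids = {ident for ident, _ in super_pairs}
    return all(ident in super_ids for ident, _ in sub_pairs)


def referential_integrity(pairs, referenced_ids, target_pairs):
    """(4.27): every referenced identifier exists in the target class.

    referenced_ids(value) plays the role of the occurrence relation o_r:
    it returns the identifiers j with o_r(value, j).
    """
    target_ids = {ident for ident, _ in target_pairs}
    return all(j in target_ids
               for _, value in pairs
               for j in referenced_ids(value))


# Hypothetical instance: values are tuples, identifiers are integers.
departments = [(10, ("Informatik", 1))]                   # (DeptName, Head)
professors = [(1, (4711, 52, 10)), (2, (4712, 47, 10))]   # (PersonIdentityNo, Age, Faculty)
persons = [(1, (4711,)), (2, (4712,))]
```

Each check is a finite scan over the instance, which is why decidability is unproblematic as long as values can be compared for equality.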

4.2.4 Specification of Behaviour<br />

So far, only static aspects have been considered. A structural schema is simply a collection of<br />

data structures called classes. Let us now turn to adding dynamics to this picture. As required<br />

in the object oriented approach, operations will be associated with classes. This gives us the<br />

notion of a method.<br />

We shall distinguish between visible and hidden methods to separate those methods<br />

that can be invoked by the user from the others. Each method on a structural class $C$ consists of<br />

a signature and a body. The signature consists of a method name and sets of parameter/type<br />

pairs for input and output. The body is defined by the usual constructs of a procedural<br />

programming language. A method $M$ on a class $C$ is called value-defined iff all types occurring<br />

in its signature are proper value types.<br />



Example 4.3.<br />

Let us describe an insert-method for the class PersonC.<br />

Method insert'_PersonC<br />

( in : P :: PERSON , out : I :: ID ) =<br />

IF ∃ O ∈ PersonC . value(O) = P<br />

THEN I := ident(O)<br />

ELSE I := NewId ;<br />

PersonC := PersonC ∪ { ( I, P ) }<br />

ENDIF<br />

We used the global method NewId to denote the selection of a new identifier. Note that this<br />

method is not value-defined, but we could simply drop the output to obtain a value-defined<br />

method.<br />

□<br />

As already mentioned we distinguish between methods visible to the user and hidden methods.<br />

We require each visible method to be value-defined. In particular, we use the value-defined<br />

generic update methods insert_C, delete_C and update_C for each class C; these exist, since<br />

we require value-representability [22]. Moreover, we use the quasi-generic update methods<br />

insert'_C, delete'_C and update'_C<br />

for each class C that are used to define generic updates. The<br />

only difference is that the generic updates suppress the output of type ID. The method in<br />

Example 4.3 is quasi-generic.<br />

Subclasses inherit the methods of their superclasses, but overriding is allowed as long as<br />

the new method is a specialization of all its corresponding methods in its superclasses.<br />

A (behavioural) schema $\mathcal{S}$ is a finite collection of behavioural classes $\{C_1, \dots, C_n\}$ closed<br />

under references, superclasses and method calls, where behavioural class just means a structural<br />

class together with methods on it.<br />

4.3 Genericity Beyond Polymorphism<br />

Our goal is to provide generic update methods insert_C, delete_C and update_C for each class<br />

C of a database schema. These update methods are "generic" in the sense that they are<br />

applicable to each class of a schema. These methods demand the identification of objects<br />

without accessing the object identifier, since oids are an internal concept and do not have a<br />

meaning for the user of a database. Hence the need for value-representability.<br />

Besides this identification problem we also have to cope with the enforcement of implicit<br />

integrity constraints. In [22] it has been shown that value-representability is a necessary and<br />

sufficient condition for the existence of consistent generic update methods.<br />

4.3.1 Implicit Schema Extensions<br />

Since we assume the existence of a trivial uniqueness constraint for each class, generic and<br />

quasi-generic update methods always exist. Let us first illustrate this by an example.<br />

Example 4.4. Consider again the schema of Example 4.2. The value-representation types for<br />

the classes ProfessorC and DepartmentC are<br />

Type V_Prof =<br />

( PersonIdentityNo : NAT , Age : NAT , Salary : NAT ,<br />

Faculty : ( DeptName : STRING , Head : V_Prof , Phones : { NAT } ) )<br />

End V_Prof<br />

Type V_Dept =<br />

( DeptName : STRING , Head : V_Prof , Phones : { NAT } )<br />

End V_Dept<br />

Selecting the component Faculty(V) for a value V :: V_Prof gives the required value of type<br />

V_Dept. However, we have to choose a new identifier for a new object in ProfessorC, and<br />

due to the cycle in this schema this identifier also occurs in the value of some new object in<br />

DepartmentC; hence we need the more complex type<br />

Type VProf =<br />

( PersonIdentityNo : NAT , Age : NAT , Salary : NAT ,<br />

Faculty : ( DeptName : STRING ,<br />

Head : ( value : VProf ) ∪ ( ident : ID ) , Phones : { NAT } ) )<br />

End VProf<br />

Note that this is a supertype of V_Prof. Let the corresponding subtype function be f_Prof.<br />

Neglecting for the moment IsA-relations, the quasi-generic insert on the class ProfessorC<br />

is given by<br />

Method insert'_Prof<br />

( in : V :: VProf , out : I :: ID ) =<br />

IF ∃ O ∈ ProfessorC . f_Prof(value(O)) = V<br />

THEN I := ident(O)<br />

ELSE I := NewId ;<br />

IF ∃ J :: ID . Head(Faculty(V)) = ( ident : J )<br />

THEN ProfessorC := ProfessorC ∪ { ( I, V ) }<br />

ELSE Let V' :: VDept . V' := Faculty(V) ;<br />

Let K :: ID . K := insert'_Dept(V') ;<br />

Let V'' :: VProf . V'' := ( PersonIdentityNo(V), Age(V), Salary(V), K ) ;<br />

ProfessorC := ProfessorC ∪ { ( I, V'' ) }<br />

ENDIF<br />

ENDIF<br />

Our aim now is to generate (quasi-)generic update methods from the structural schema and<br />

to add them to the corresponding classes, i.e. to implicitly change the behavioural schema.<br />

A natural first idea is to exploit polymorphism as in [11] for this task. However, generic<br />

consistent updates on a class $C$ have to be value-defined, hence require an input-type $V_C$<br />

without any occurrence of ID. Such an input-type has to be computed from the schema and<br />

hence the generation requires meta-information. It has been shown in [17] that the need for<br />

meta-information exceeds the capability of polymorphism. The alternative is to use linguistic<br />

reflection as proposed in [20].<br />


4.3.2 Linguistic Reflection<br />

The basic idea of linguistic reflection is to use reflection types such as SCHEMA_rep, CLASS_rep,<br />

TYPE_rep, METHOD_rep, COMMAND_rep, etc. for the representation of abstract syntax expressions<br />

representing schemata, classes, types, methods, commands (method bodies), etc.<br />

respectively. For each of these, there exists a function raise associating with this syntactic<br />

expression a true schema, class, type, etc. respectively.<br />

The types used for the representation of language constructs such as types, classes, constraints,<br />

methods, commands and schemata form the basis of linguistic reflection and will<br />

therefore be called reflection types².<br />

Moreover, we need a macro value-rep with signature<br />

SCHEMA_rep × CLASS_rep → TYPE_rep<br />

where value-rep(S, C) represents a value type needed for the unique identification (and representation)<br />

of some object, hence is needed for the generic insert-, delete- and update-methods<br />

on raise(C).<br />

Such macros provide a more general way to specify database behaviour. They can be<br />

understood as transformations of language expressions. The main difference to macros in<br />

traditional programming languages, e.g. LISP, is that the expressions are abstract syntax<br />

that is represented in additional predefined types. Hence macros are also strongly typed.<br />

The core of the problem then is to define three macros with signatures<br />

insert : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep ,<br />

delete : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep and<br />

update : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep .<br />

Clearly, there are also other macros used by these main macros, and there is also one macro<br />

generic with signature SCHEMA_rep → SCHEMA_rep that transforms a whole user-defined<br />

schema into an internal schema with generic update methods added to all classes.<br />
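The shape of these macros can be sketched in Python as follows. This only illustrates the signatures. Schema and class representations are plain dictionaries, the method bodies are left empty, and `make_macro` is a hypothetical helper, none of which is prescribed by the text.

```python
def make_macro(kind):
    """Build a placeholder macro S x C -> METHOD_rep for one kind of update."""
    def macro(schema, cls):
        return {
            "name": f"{kind}_{cls['name']}",
            "in_list": [("V", f"value-rep({cls['name']})")],
            "out_list": [],          # generic updates are value-defined
            "body": None,            # body generation is not sketched here
        }
    return macro


insert, delete, update = (make_macro(k) for k in ("insert", "delete", "update"))


def generic(schema):
    """SCHEMA_rep -> SCHEMA_rep: return a copy with update methods added."""
    return [
        {**cls, "methods": cls["methods"] + [m(schema, cls) for m in (insert, delete, update)]}
        for cls in schema
    ]


user_schema = [
    {"name": "PersonC", "isa": [], "methods": []},
    {"name": "ProfessorC", "isa": ["PersonC"], "methods": []},
]
internal_schema = generic(user_schema)
```

Each macro receives the whole schema in addition to the class, since the generated method depends on context information such as references and superclasses. The user-defined schema itself is left unchanged.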

4.3.3 Reflection Types<br />

Let us now briefly indicate some of the reflection types that are needed to construct generic<br />

update methods. We follow the presentation in Section 4.2 starting with TYPE_rep. In general,<br />

each type was given by a type-name and a defining type-expression, hence<br />

Type TYPE_rep =<br />

( name : NAME_rep , type-exp : TYPE_EXP_rep )<br />

End TYPE_rep<br />

Type expressions are given by the base types and type constructors defined in Section 4.2.1<br />

which leads to the following recursive definition<br />

Type TYPE_EXP_rep =<br />

( BoolT : ⊥ ) ∪ ... ∪ ( SetT : ( element-type : TYPE_FORM_rep ) ) ∪<br />

( RecordT : [ ( tag : NAME_rep , field : TYPE_FORM_rep ) ] )<br />

End TYPE_EXP_rep<br />

with values of (reflection) type TYPE_FORM_rep being either type expressions (of type<br />

TYPE_EXP_rep) or simply type names.<br />

Type TYPE_FORM_rep =<br />

( type-name : NAME_rep ) ∪ ( type-exp : TYPE_EXP_rep )<br />

End TYPE_FORM_rep<br />

² In the original work on linguistic reflection [20] the notion representation type was used instead of reflection<br />

type. Here we changed this notation in order to avoid confusion with representation types of classes.<br />

Next, let us describe CLASS_rep, which can be built analogously. In particular there is a close<br />

resemblance between TYPE_EXP_rep and STRUCTURE_rep.<br />

Type CLASS_rep =<br />

( name : NAME_rep , isa : { NAME_rep } ,<br />

structure : STRUCTURE_rep , methods : { METHOD_rep } )<br />

End CLASS_rep<br />

The difference between the representation of type expressions and the one for structure expressions<br />

is that the latter may contain references and subreferences, indicated by the use of<br />

TYPE_REF_FORM_rep instead of TYPE_FORM_rep.<br />

Type STRUCTURE_rep =<br />

( BoolT : ⊥ ) ∪ ... ∪ ( SetS : ( element-type : TYPE_REF_FORM_rep ) ) ∪<br />

( RecordS : [ ( tag : NAME_rep , field : TYPE_REF_FORM_rep ) ] )<br />

End STRUCTURE_rep<br />

Then the extension of TYPE_REF_FORM_rep with respect to TYPE_FORM_rep simply consists<br />

in adding reference expressions.<br />

Type TYPE_REF_FORM_rep =<br />

( type-name : NAME_rep ) ∪ ( type-exp : TYPE_EXP_rep ) ∪<br />

( ref-exp : ( ref-kind : ( REF : ⊥ ) ∪ ( PART : ⊥ ) ,<br />

reference : NAME_rep , class : NAME_rep ) )<br />

End TYPE_REF_FORM_rep<br />

Finally, let us indicate the definition of the reflection type METHOD_rep, but without going<br />

too much into details. We omit the definitions for COMMAND_rep and EXPR_rep.<br />

Type METHOD_rep =<br />

( name : NAME_rep ,<br />

in-list : [ ( parameter : NAME_rep , type : TYPE_FORM_rep ) ] ,<br />

out-list : [ ( parameter : NAME_rep , type : TYPE_FORM_rep ) ] ,<br />

body : COMMAND_rep )<br />

End METHOD_rep<br />

Further details on the reflection types of the OODM will be omitted here.<br />
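To make the role of these reflection types concrete, the following Python sketch encodes a fragment of STRUCTURE_rep and CLASS_rep as dataclasses and computes a value type for the acyclic case by unfolding references. It is a simplification under explicit assumptions: only records, sets and references are covered, base types are plain strings, methods and IsA-propagation are omitted, cyclic schemata are rejected, and the classes `RoomC` and `CourseC` are invented for the illustration. The function `value_rep` is a guess at the flavour of the macro value-rep, not its definition from [22].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefExp:                 # the ref-exp branch of TYPE_REF_FORM_rep
    ref_kind: str             # "REF" or "PART"
    reference: str
    cls: str


@dataclass(frozen=True)
class RecordS:                # RecordS branch of STRUCTURE_rep
    fields: tuple             # tuple of (tag, form) pairs


@dataclass(frozen=True)
class SetS:                   # SetS branch of STRUCTURE_rep
    element_type: object


@dataclass(frozen=True)
class ClassRep:               # CLASS_rep, methods omitted
    name: str
    isa: frozenset
    structure: object


def value_rep(schema, cls_name):
    """Value type for an acyclic class: references are unfolded recursively."""
    def unfold(form, visiting):
        if isinstance(form, RefExp):
            if form.cls in visiting:
                raise ValueError("cyclic reference; not handled in this sketch")
            return unfold(schema[form.cls].structure, visiting | {form.cls})
        if isinstance(form, RecordS):
            return RecordS(tuple((tag, unfold(f, visiting)) for tag, f in form.fields))
        if isinstance(form, SetS):
            return SetS(unfold(form.element_type, visiting))
        return form           # a base type name such as "NAT"
    return unfold(schema[cls_name].structure, frozenset({cls_name}))


# Hypothetical acyclic schema: a course refers to a room.
schema = {
    "RoomC": ClassRep("RoomC", frozenset(),
                      RecordS((("Building", "STRING"), ("Number", "NAT")))),
    "CourseC": ClassRep("CourseC", frozenset(),
                        RecordS((("Title", "STRING"),
                                 ("Room", RefExp("REF", "Room", "RoomC"))))),
}
v_course = value_rep(schema, "CourseC")
```

The result contains no reference expression, so it corresponds to a type without any occurrence of ID. The cyclic case of Example 4.4 would instead require recursive type names and is deliberately left out.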

4.3.4 Generators for Generic Updates<br />

In order to build generator macros for generic update methods we follow the constructive<br />

proof of their existence in [22]. Then we have to cope with the following problems.<br />

(i) We have to provide value types for the input. This will be achieved by the macro value-rep<br />

already mentioned above.<br />

(ii) Generic update methods are value-defined, but nevertheless have to cope with identifiers.<br />

This seeming contradiction can be resolved by constructing canonical update methods<br />

with ID as output-type. Clearly, these give hidden methods in the internal schema. The<br />

corresponding generic update method then just consists of a call to the canonical one and<br />

simply neglects its output.<br />

(iii) Inclusion integrity has to be enforced. Therefore, each insert propagates through superclasses,<br />

whereas deletions propagate through subclasses and updates do both. For the<br />

macros insert, delete and update this means that we have to build methods ignoring all<br />

IsA-relations and then arrange these in a sequence.<br />

(iv) The most difficult task is the enforcement of referential integrity, especially in the case<br />

of cycles as e.g. in Example 4.2. We have to propagate in both directions along these<br />

references, but for cycles we also have to choose new identifiers before starting this propagation.<br />

At first glance it seems that our approach starts with the most complicated case, where all operations are propagated along references, whereas it would be much simpler for an insert-operation to require referenced objects to exist already and to disallow delete-operations as long as referencing objects still exist. There are two reasons not to follow this simpler approach:

(i) Our approach only takes care of implicit, structurally defined constraints, whereas the simpler scenario arises when certain additional user-defined transition constraints are added. We briefly discuss the general handling of any kind of user-defined constraints on a solid theoretical ground in Section 4.4 (see also [23, 25]).

(ii) These additional constraints may discard the generic operations, in particular in the case of cycles in the schema. In these cases it is desirable to let the database designer become aware of the consequences of adding certain constraints, whereas s/he only has the chance to change the schema completely (omitting all cycles) in the case where additional constraints are tacitly assumed.

An implementation of the simplified scenario on the basis of a strongly typed persistent programming language is reported in the Ph.D. thesis [27].

Let us now partly indicate the solution to these problems by generator macros for the canonical update methods.

The macro rep-type computes for a given class its representation type:

Macro rep-type (in : C :: CLASS_rep, out : TC :: TYPE_EXP_rep) =
    call rep-type-struct(in : structure(C), out : TC)

The called macro rep-type-struct involves several cases depending on the value E, which contains the representation of a structure expression. Since E stems from a type constructor, this leads to the case distinctions.

Macro rep-type-struct (in : E :: STRUCTURE_rep,
                       out : E′ :: TYPE_EXP_rep) =
    CASE E ...
        E = (RecordS : [ (tag : N, field : S) | L ]) →
            Let S′ :: TYPE_REF_FORM_rep ;
            IF S = (ref-exp : R)
                THEN S′ := (type-name : ID)
                ELSE S′ := S
            ENDIF ;
            Let L′ :: TYPE_EXP_rep .
            call rep-type-struct(in : L, out : L′) ;
            E′ := (RecordT : [ (tag : N, field : S′) | L′ ])
    ENDCASE
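The record case of rep-type-struct can be sketched in Python as follows. This is a minimal illustration, assuming that field forms are encoded as tagged tuples and that a class is a dictionary with a `structure` entry; both encodings, the function names and the sample class are assumptions of this sketch and not the macro language itself.

```python
# Assumed encoding of TYPE_REF_FORM_rep values as tagged tuples:
#   ("type-name", name) | ("type-exp", exp) | ("ref-exp", kind, reference, cls)

def rep_type_struct(fields):
    """Map the field list of a RecordS to the field list of a RecordT,
    replacing every reference expression by the identifier type ID."""
    if not fields:                        # empty tail L
        return []
    (tag, form), rest = fields[0], fields[1:]
    if form[0] == "ref-exp":              # S = (ref-exp : R)
        new_form = ("type-name", "ID")
    else:                                 # any other form is kept unchanged
        new_form = form
    return [(tag, new_form)] + rep_type_struct(rest)

def rep_type(cls):
    """Representation type of a class (record structures only)."""
    return ("RecordT", rep_type_struct(cls["structure"]))

# Made-up class description with one reference field.
professor = {
    "name": "ProfessorC",
    "structure": [
        ("Age", ("type-name", "NAT")),
        ("Faculty", ("ref-exp", "REF", "Faculty", "DepartmentC")),
    ],
}
print(rep_type(professor))
```

As in the macro, the recursion runs over the tail of the field list; only the record constructor is covered here.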

The macro value-rep is slightly more complicated, since it propagates through the whole schema.

Macro value-rep (in : C :: CLASS_rep, S :: SCHEMA_rep,
                 out : VC :: TYPE_EXP_rep) =
    call vrep-type-struct(in : structure(C), S, [name(C)], out : VC)

Macro vrep-type-struct
    (in : E :: STRUCTURE_rep, S :: SCHEMA_rep, K :: [ NAME_rep ],
     out : E′ :: TYPE_EXP_rep) =
    CASE E ...
        E = (RecordS : [ (tag : N, field : F) | L ]) →
            Let F′ :: TYPE_REF_FORM_rep ;
            IF F = (ref-exp : (ref-kind : x, reference : R, class : N′))
            THEN
                IF N′ ∉ K
                THEN Let C′ ∈ S . name(C′) = N′ ;
                     Let VC′ :: TYPE_EXP_rep .
                     call vrep-type-struct(in : structure(C′), S, [N′ | K], out : VC′) ;
                     F′ := (type-exp : VC′)
                ELSE F′ := (type-name : (vrep : N′))
                ENDIF
            ELSE F′ := F
            ENDIF ;
            Let L′ :: TYPE_EXP_rep .
            call vrep-type-struct(in : L, S, K, out : L′) ;
            E′ := (RecordT : [ (tag : N, field : F′) | L′ ])
    ENDCASE

This solves the first problem above. The trivial solution to the second problem has already been given. Now let us concentrate on the enforcement of IsA-constraints. The following macro insert-seq computes the body of the required canonical insert. It uses the macro class-description to get the definition of a class in a schema from the class name, and the macro insert-ref to compute the core of the method body and to enforce referential integrity.

Macro insert-seq
    (in : C :: CLASS_rep, S :: SCHEMA_rep, out : P :: COMMAND_rep) =
    Let P1, P2 :: COMMAND_rep .
    call command-list(in : isa(C), S, out : P1) ;
    call insert-ref(in : C, S, out : P2) ;
    P := sequence(P1, P2)

Macro command-list
    (in : NL :: { NAME_rep }, S :: SCHEMA_rep,
     out : P :: COMMAND_rep) =
    CASE NL
        NL = ∅ → P := 'skip'
        NL = { N } ∪ L ∧ N ∉ L →
            Let D :: CLASS_rep .
            call class-description(in : N, S, out : D) ;
            Let P1, P2 :: COMMAND_rep .
            call insert-ref(in : D, S, out : P1) ;
            call command-list(in : L, S, out : P2) ;
            P := sequence(P1, P2)
    ENDCASE
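The arrangement of the inserts into a sequence can be sketched as follows. The sketch assumes that commands are represented as lists of strings, that `sequence` is list concatenation and that `insert_ref` is a mere stand-in producing a named command; the class `PersonC` is a made-up superclass used only for the illustration.

```python
def insert_ref(cls):
    # Stand-in for the macro insert-ref: just names the generated command.
    return f"insert-ref({cls['name']})"

def command_list(names, schema):
    """One insert command per superclass name; the empty set gives 'skip'."""
    if not names:
        return ["skip"]
    # Pick some N with NL = {N} ∪ L; sorting only makes the choice deterministic.
    first, *rest = sorted(names)
    return [insert_ref(schema[first])] + command_list(rest, schema)

def insert_seq(cls, schema):
    """Inserts into the superclasses first, then into the class itself."""
    return command_list(cls["isa"], schema) + [insert_ref(cls)]

schema = {
    "PersonC": {"name": "PersonC", "isa": set()},
    "ProfessorC": {"name": "ProfessorC", "isa": {"PersonC"}},
}
print(insert_seq(schema["ProfessorC"], schema))
```

As in the macro, the superclass commands precede the command for the class itself, so inclusion integrity holds when the last insert is executed.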

To solve the third problem we also have to construct the input- and output-lists of the methods. This is straightforward.

Finally, to solve the fourth problem let us briefly consider how to enforce referential integrity within insertions. Again, we have to build an input type that extends the value-representation type by adding ID. The corresponding macros are completely analogous to value-rep and vrep-type-struct. The definition of insert-ref then follows the spirit of Example 4.4 and the macros shown above. We omit further details.

4.4 Integrity Enforcement

The reflective approach in the preceding section allows us to cope with the implicit integrity constraints and hence suggests an immediate generalization to arbitrary user-defined constraints. The problem is then to guarantee the consistency of a specified method with respect to such constraints.

4.4.1 User-Defined Integrity Constraints

Let us first extend the notion of schema S by the introduction of explicit user-defined integrity constraints, which are formulae $I$ over the underlying type system with free variables $fr(I) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$, where each $x_{C_i}$ is a variable of type $\{U_{C_i}\}$. We call $x_{C_i}$ the class variable of $C_i$.

A constrained schema consists of a behavioural schema S and a finite set of integrity constraints on S. An instance $D$ is said to be consistent with respect to the integrity constraint $I$ iff substituting $D(C)$ for each class variable $x_C$ in $I$ evaluates to true, when interpreted in the usual way.

We use abbreviations for distinguished classes of constraints. For this let $C$, $C_1$, $C_2$ be classes and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$) and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.

A functional constraint on $C$ is a constraint $C.c^1 \to C.c^2$ which abbreviates

$$\forall i, i' :: ID .\ \forall v, v' :: T_C .\ c^1(v) = c^1(v') \wedge (i, v), (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \tag{4.30}$$

A uniqueness constraint on $C$ is a constraint $UNIQUE(c^1)$ or $C.c^1 \to C.ident$ which abbreviates

$$\forall i, i' :: ID .\ \forall v, v' :: T_C .\ c^1(v) = c^1(v') \wedge (i, v), (i', v') \in x_C \Rightarrow i = i' . \tag{4.31}$$

An inclusion constraint on $C_1$ and $C_2$ is a constraint $C_1.c_1 \subseteq C_2.c_2$ which abbreviates

$$\forall t :: T .\ \bigl( \exists (i_1, v_1) \in x_{C_1} .\ c_1(v_1) = t \bigr) \Rightarrow \bigl( \exists (i_2, v_2) \in x_{C_2} .\ c_2(v_2) = t \bigr) . \tag{4.32}$$

An exclusion constraint on $C_1$, $C_2$ is a constraint $C_1.c_1 \parallel C_2.c_2$ which abbreviates

$$\forall i_1, i_2 :: ID .\ \forall v_1 :: T_{C_1} .\ \forall v_2 :: T_{C_2} .\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \neq c_2(v_2) . \tag{4.33}$$

Assume $c^1 \times c^2 \times c^3$ defines a uniqueness constraint on $C$. Then an object generating constraint on $C$ is a constraint $C.c^1 \twoheadrightarrow C.c^2$ which abbreviates

$$\forall i_1, i_2 :: ID .\ \forall v_1, v_2 :: T_C .\ (i_1, v_1) \in x_C \wedge (i_2, v_2) \in x_C \wedge c^1(v_1) = c^1(v_2) \Rightarrow \exists (i, v) \in x_C .\ c^1(v) = c^1(v_1) \wedge c^2(v) = c^2(v_1) \wedge c^3(v) = c^3(v_2) . \tag{4.34}$$

These constraint notations can easily be extended to path constraints using the usual dot-notation. In addition we may use the notation $-{\to}r_i$ for a (sub)reference $r_i$. The difference to $-.r_i$ is that the latter refers to a value of type ID, whereas the former corresponds to the referenced object in class $C_i$ or, equivalently, to a value of type $U_{C_i}$. Paths can be abbreviated if this does not lead to confusion; in particular the selector value is usually omitted.

Example 4.5. Let us assume that the salary of a professor is determined by his/her age. For this purpose, let $Age, Salary : T_{Prof} \to NAT$ be the natural projections to the Age- and Salary-values respectively. Then we have the following functional constraint on the class ProfessorC:

Constraint ProfessorC.Age → ProfessorC.Salary

which abbreviates

$$\forall i, j :: ID .\ \forall v, w :: T_{Prof} .\ (i, v) \in x_{Prof} \wedge (j, w) \in x_{Prof} \wedge Age(v) = Age(w) \Rightarrow Salary(v) = Salary(w) .$$

As a second example take DepartmentC→Head.Faculty = DepartmentC.ident, which states that the head of a department also works in that department. □
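The abbreviations (4.30) to (4.33) can be read as executable checks on a class instance. The following sketch assumes that an instance of a class is given as a list of (identifier, value) pairs with dictionaries as values; the function names and the sample data are hypothetical and serve only to illustrate the definitions.

```python
def functional(xc, c1, c2):
    """(4.30): equal c1-values imply equal c2-values."""
    return all(c2(v) == c2(w)
               for _, v in xc for _, w in xc if c1(v) == c1(w))

def unique(xc, c1):
    """(4.31): equal c1-values imply equal identifiers."""
    return all(i == j
               for i, v in xc for j, w in xc if c1(v) == c1(w))

def inclusion(xc1, c1, xc2, c2):
    """(4.32): every c1-value in C1 occurs as a c2-value in C2."""
    return {c1(v) for _, v in xc1} <= {c2(v) for _, v in xc2}

def exclusion(xc1, c1, xc2, c2):
    """(4.33): no c1-value in C1 equals a c2-value in C2."""
    return {c1(v) for _, v in xc1}.isdisjoint({c2(v) for _, v in xc2})

age = lambda v: v["Age"]
salary = lambda v: v["Salary"]

# Made-up instance of ProfessorC as (identifier, value) pairs.
professors = [
    (1, {"Age": 40, "Salary": 5000}),
    (2, {"Age": 40, "Salary": 5000}),
    (3, {"Age": 50, "Salary": 6000}),
]
print(functional(professors, age, salary))   # Age -> Salary holds
print(unique(professors, age))               # Age is not unique
```

Here the functional constraint of Example 4.5 holds, while Age does not define a uniqueness constraint, because two different identifiers share the same age.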

4.4.2 Greatest Consistent Specializations

The problem of integrity enforcement can be formalized by greatest consistent specializations (GCSs). Given a method $M$ and an integrity constraint $I$, the GCS $M_I$ satisfies

- $M_I$ is consistent with respect to $I$,
- $M_I$ specializes $M$ and
- each consistent specialization of $M$ also specializes $M_I$.

It has been shown in [23] that GCSs always exist. Moreover, if we consider more than one constraint, i.e. a conjunction of constraints, GCSs can be built successively and do not depend on the order of the constraints. It has also been shown how to compute GCS branches under some technical prerequisites. The restriction to GCS branches is due to practicality. We omit the algorithm, since it involves calculations with predicate transformers that can hardly be explained in a few lines. Instead, let us look at a simple example.

Example 4.6. Let us consider the insert-method on the class ProfessorC from Example 4.2 and the functional constraint in Example 4.5. The method $insert'_{Prof}$ (see Example 4.4) has to be replaced by the following one [25].

Method insert′_Prof (in : V :: VProf, out : I :: ID) =
    IF ∃ O ∈ ProfessorC . f_Prof(value(O)) = V
    THEN I := ident(O)
    ELSE
        IF ∃ O ∈ ProfessorC . Age(value(O)) = Age(V) ∧ Salary(value(O)) ≠ Salary(V)
        THEN skip
        ELSE I := NewId ;
            IF ∃ J :: ID . Head(Faculty(V)) = (ident : J)
            THEN ProfessorC := ProfessorC ∪ { (I, V) }
            ELSE Let V′ :: VDept . V′ := Faculty(V) ;
                 Let K :: ID . K := insert′_Dept(V′) ;
                 Let V″ :: VProf .
                 V″ := (PersonIdentityNo(V), Age(V), Salary(V), K) ;
                 ProfessorC := ProfessorC ∪ { (I, V″) }
            ENDIF
        ENDIF
    ENDIF
□
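The effect of the additional guard can be tried out in a strongly simplified sketch. It assumes that the class ProfessorC is a dictionary from identifiers to values, that values are plain dictionaries, that $f_{Prof}$ is the identity and that the propagation to the department is left out; `insert_prof` and the way new identifiers are chosen are hypothetical simplifications, not the generated method itself.

```python
def insert_prof(professors, v):
    """Simplified insert for ProfessorC that preserves Age -> Salary.
    Returns the identifier of the object, or None where the method skips."""
    # The object already exists: return its identifier.
    for ident, w in professors.items():
        if w == v:
            return ident
    # Inserting v would violate Age -> Salary: behave like 'skip'.
    if any(w["Age"] == v["Age"] and w["Salary"] != v["Salary"]
           for w in professors.values()):
        return None
    # Otherwise choose a new identifier and insert.
    ident = max(professors, default=0) + 1
    professors[ident] = v
    return ident

db = {}
print(insert_prof(db, {"PersonIdentityNo": 1, "Age": 40, "Salary": 5000}))
print(insert_prof(db, {"PersonIdentityNo": 2, "Age": 40, "Salary": 6000}))
print(len(db))
```

In the second call an object with the same age but a different salary already exists, so the state is left unchanged; this corresponds to the skip-branch in the method above.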

From this example it can be seen how to provide the generators for GCS construction. Indeed, we must have for each constraint a generator for the GCSs of generic update methods and, in addition, a generator for the precondition required in the GCS algorithm. We omit further details [23].

4.5 Conclusion

Object oriented databases differ from relational ones in that richer structures and implicit constraints, especially inclusion constraints (IsA) and referential constraints, are provided. This forbids a simple approach to genericity. Indeed, each generic method must enforce at least these implicit constraints. Consequently we must also be able to derive the necessary input types for these operations from a given schema.

Here we have solved this genericity problem using a reflective approach, i.e. the generators themselves can be represented in an object oriented database language using strongly typed macros. This form of linguistic reflection exceeds the limits of polymorphism. Reflection is based on the possibility to represent syntactic components of the language such as types, classes, methods, etc. as values within the language itself and to compute new schemata from these representations. This gives a practical solution to the genericity problem, whilst its theoretical justification was proven in [22, 24]. A partial implementation of the approach has been described in [27].

We also sketched how to extend the approach to integrity enforcement. Based on the theoretical results in [23], each constraint gives rise to a macro that transforms a user-defined method into its greatest consistent specialization with respect to the given constraint.

To summarize, genericity and integrity enforcement are not only theoretically well-founded, but can also be efficiently built into object oriented database languages. This allows a tremendous increase in declarativity in object oriented databases.

However, we have to ensure the value-representability of a schema, a demand that was granted for free in the RDM, and we have to provide type-safe reflective database languages.



Of course, the second demand is only sufficient and in general not necessary, since we could build in algorithms for method generation and GCS construction as long as we have access to the schema definitions. The main advantage of the reflective approach is that the work of these algorithms is made explicit and type-safe in the schema by the use of the macro language. This allows e.g. schema changes or changes to integrity constraints to be easily maintained, since they affect only a few macros.

Another advantage of the outlined object oriented approach is that it allows us to cope with constraints that are either structurally determined or explicitly defined by the user. The traditional approach in the RDM usually buries such constraints in database programs.

References for Chapter 4

1. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented Database System Manifesto, Proc. 1st DOOD, Kyoto 1989

2. H. Aït-Kaci: An Overview of LIFE, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS 504, 1991, 42-58

3. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented Database Programming Language, in Proc. VLDB 1991

4. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD, Portland Oregon, 1989, 159-173

5. F. Bancilhon, C. Delobel, P. Kanellakis: Building an Object-Oriented Database System: The Story of O2, Morgan Kaufmann, 1992

6. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, 370-395

7. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol. 5(4), 1990, 353-382

8. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92

9. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. ACM SIGMOD 88

10. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the Workshop on Database Programming Languages, Roscoff, France, September 1987

11. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM Computing Surveys, vol. 17(4), 471-522

12. A. Heuer: Objektorientierte Datenbanken (in German), Addison Wesley, 1992

13. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented Programming System with a Database System, in Proc. OOPSLA 1988

14. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon, 1986

15. B. Meyer: Object-Oriented Software Construction, Prentice-Hall, 1988

16. D. Maier, J. Stein, A. Ottis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA, September 1986

17. D. Stemple, L. Fegaras, T. Sheard, A. Socorro: Exceeding the Limits of Polymorphism in Database Programming Languages, in Proc. EDBT90, Springer LNCS 416, 1990

18. T. Sheard, D. Stemple: Automatic Verification of Database Transaction Safety, ACM ToDS vol. 14 (3), September 1989

19. D. Stemple, T. Sheard: A Recursive Base for Database Programming Primitives, in Proceedings of the First International East/West Database Workshop, Kiev, October 1990

20. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages, in Proc. HICSS '92

21. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992

22. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in Object-Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, 341-356

23. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint CS-08-92, December 1992

24. K.-D. Schewe, B. Thalheim: Fundamental concepts of object oriented databases, Acta Cybernetica, vol. 11 (4), 1993, 49-85

25. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity Preserving Updates in Object Oriented Databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. 4th Australian Database Conference, Brisbane, February 1993, World Scientific, 171-185

26. B. Thalheim: Dependencies in Relational Databases, Teubner Leipzig, 1991

27. I. Wetzel: Programmieren mit STYLE: Über die systematische Entwicklung von Programmierumgebungen (in German), Ph.D. Thesis, Hamburg University, 1994



Chapter 5

Towards a Theory of Consistency Enforcement

Contents

5.1 Introduction
    5.1.1 The Consistency Enforcement Problem
    5.1.2 The Problem of GCS Construction
    5.1.3 The Practicality of GCS Construction
5.2 A Motivating Example
    5.2.1 Constraints in the Relational Model
    5.2.2 Stepwise Consistency Enforcement
5.3 Fundamental Features of State-Based Specifications
    5.3.1 Formal Specifications with Guarded Commands
    5.3.2 Axiomatic Semantics
    5.3.3 Consistency and Specialization
    5.3.4 Greatest Consistent Specializations
5.4 The Construction of GCSs
    5.4.1 I-reduced Guarded Commands
    5.4.2 An Upper Bound for GCSs
    5.4.3 The General Form of GCSs
5.5 Conclusion
5.6 A Normal Form for the Specialization Proof Obligation
5.7 Proof of the Upper Bound Theorem for Sequences
5.8 Proof of the Upper Bound Theorem in the Recursive Case

This chapter contains a reprint of

K.-D. Schewe, B. Thalheim. Towards a Theory of Consistency Enforcement. Acta Informatica 1998 (to appear).



Abstract. Specifications with invariants occur in almost all formal specification languages. Hence the problem is to prove the consistency of the specified operations with respect to the invariants. Whilst the problem seems to be easily solvable in predicative specifications, it usually requires sophisticated verification efforts when specifications in the style of Dijkstra's guarded commands, as e.g. in the specification language B, are used.

As an alternative, a computational approach to consistency enforcement will be discussed in this paper. The basic idea is to replace inconsistent operations by new consistent ones, preserving at the same time the intention of the old ones. More precisely, this can be formalized by consistent specializations, where specialization is a specific partial order on operations defined via predicate transformers.

It can be shown that greatest consistent specializations (GCSs) always exist and are compatible with conjunctions of invariants. Then under certain prerequisites the general construction of such GCSs is possible. In general, GCS construction can be embedded in refinement calculi and therefore strengthens the systematic development of correct programs.

5.1 Introduction

Invariants provide an excellent way to achieve declarativity in formal specifications. Therefore, almost all commonly used specification languages such as VDM [4, 14], Z [26, 27] and B [1, 2] as well as research prototypes [20] allow at least static invariants to be defined, i.e. conditions that have to be satisfied by all states. Then consistency of an operation¹ S with respect to the specified (static) invariant means that S transforms consistent states only into consistent ones. More generally, transition invariants restrict the allowed pairs of initial and final states for operations S, and dynamic constraints restrict the allowed sequences of states [16, 17].
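For a finite set of states, this notion of consistency with respect to a static invariant can be checked by brute force. The sketch below is only an illustration and not the proof-theoretic treatment developed in this chapter: it assumes that an operation is modelled as a function from a state to the set of its possible successor states, and the account-balance example as well as all function names are made up.

```python
def is_consistent(op, invariant, states):
    """True iff op leads from every consistent state in `states`
    only to consistent states."""
    return all(invariant(t)
               for s in states if invariant(s)
               for t in op(s))

def invariant(balance):
    # Static invariant: the balance never becomes negative.
    return balance >= 0

def withdraw(balance):
    # Unguarded operation: always takes one unit.
    return {balance - 1}

def guarded_withdraw(balance):
    # Takes one unit only if the invariant is preserved, else does nothing.
    return {balance - 1} if balance >= 1 else {balance}

states = range(-3, 10)
print(is_consistent(withdraw, invariant, states))
print(is_consistent(guarded_withdraw, invariant, states))
```

The unguarded operation leads from the consistent state 0 to the inconsistent state -1, whereas the guarded variant preserves the invariant on all states considered.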

Especially in the context of data-intensive application systems, where the final implementation will make use of persistent data stored in databases, most of the application semantics is expressible by static and dynamic integrity constraints, which is just another word for invariants [16, 17, 20, 24, 25, 28].

Consistency proofs are therefore an inherent and important task within the development of correct programs. However, as pointed out in [3], there is a fundamental difference in the way invariants are handled in VDM specifications (and similarly Z specifications) and specifications in B. The predicative style in the former languages allows invariants to be considered as part of the specification, hence finding a correct program that satisfies the specification is left to refinement. On the other hand, the axiomatic semantics associated with B operations in the style of Dijkstra [9, 12, 19] enables the definition of consistency proof obligations [1, 8, 20] in a suitable logic. At first glance the VDM and Z approach seems to be advantageous, because it avoids significant verification efforts.

From the authors' point of view, industrial applicability and acceptance of formal methods can only be expected if the whole refinement process is taken into consideration. Starting from a high-level specification, the application of provably correct refinement steps should not stop before a formal specification is reached that is equivalent to an executable program. Then the automatic derivation of such a program should be possible, as demonstrated in [13, 24]. As a consequence, proving consistency is an unavoidable problem, since at least once in the refinement process we shall leave the ground of purely predicative specifications [18, 24].

¹ To be precise, we should write operation specification to emphasize the independence from the implementation, but throughout the paper we drop this distinction.



The B approach allows static invariants to be specified and proof obligations to be derived.<br />
The consistency verification problem can be approached by using theorem provers or proof<br />
assistants, but the burden of writing consistent specifications is left to the user. In the context<br />
of database transactions the same applies to the extended Boyer-Moore approach in [25].<br />
Hence the problem is to assist the user in this task and to provide solid and theoretically<br />
founded techniques for consistency enforcement as an alternative to verification. This problem<br />
is investigated in this paper.<br />

Of course, we cannot expect to obtain a panacea for the development of correct<br />
programs, since any approach to consistency enforcement must rely on certain assumptions.<br />
We must take the specified operations and invariants as fixed, i.e., the specified operations<br />
and invariants reflect exactly the intention of the user. Otherwise enforcement may produce<br />
an undesired new operation. In any case, just as for the results of verification, specifications<br />
resulting from consistency enforcement may be used to give some feedback to the specifying<br />
user and may encourage changes to a specification.<br />

5.1.1 The Consistency Enforcement Problem<br />

Given an operation $S$ and an invariant $\mathcal{I}$, the basic idea is to replace $S$ by a new operation<br />
$S_{\mathcal{I}}$ which is consistent with respect to $\mathcal{I}$. Since consistency alone is not a sufficient property, because<br />
it is independent of $S$, we require that $S_{\mathcal{I}}$ should be as close to $S$ as possible. The first<br />
problem is to find a suitable notion of "close". The intuition behind our work is that each<br />
operation has an "effect", i.e. performs certain state changes, and $S_{\mathcal{I}}$ should "preserve the<br />
effect" of $S$.<br />

Whatever the definition of effect preservation will be, it should lead to a partial order<br />
$\sqsubseteq$ on operations. With respect to this partial order we have $S_{\mathcal{I}} \sqsubseteq S$, and $S_{\mathcal{I}}$ should be the<br />
greatest (consistent) operation with this property.<br />

In Section 5.2 we first look at a practical example in the relational model taken from [22]<br />
to motivate the construction of $S_{\mathcal{I}}$ or, more precisely, of one of its deterministic branches.<br />
In Section 5.3 we recall fundamental features of state-based specifications with emphasis on<br />
predicate transformer semantics.<br />

This is used to characterize consistency by a formula in infinitary first-order logic. In<br />
the same spirit we may formalize operational specialization, which defines a partial order<br />
on operations. Then it is natural to take this order for a formal definition of consistency<br />
enforcement. This leads directly to the notion of a greatest consistent specialization (GCS),<br />
which first appeared in [21] in an object oriented database context. The first results show<br />
the existence of GCSs and their compatibility with conjunctions, which allows consistency to be<br />
enforced step by step, in any order of the invariants.<br />

5.1.2 The Problem of GCS Construction<br />

These results are not at all surprising, because we know that for each specification using<br />
guarded commands we always find an equivalent predicative specification and vice versa.<br />
Hence conjunction would be sufficient to find at least one solution. In fact, we may translate<br />
a specification into a predicative form, join it with the invariant and then translate back to<br />
obtain the GCS. For the case of specifications in the style of B this introduces unbounded<br />
choices and therefore destroys the "operational flavour" [3] of specifications with guarded<br />
commands. Therefore, we are looking for an approach to GCS construction which preserves<br />
this style.<br />

An insufficient alternative would be to consider just the basic operations within the specification<br />
$S$, i.e. assignments or skip, and to replace them by their GCSs. In some cases this<br />
leads to over-specialization; in other cases we do not even get a specialization at all. The main<br />
result of this paper shows that under some technical prerequisites it is nevertheless possible to<br />
concentrate on basic operations. Replacing them by their GCSs in a given complex operation<br />
defines a new operation $S'_{\mathcal{I}}$ which is specialized by $S_{\mathcal{I}}$. We then get the GCS of the complex<br />
operation by adding a precondition. This fundamental result will be shown in Section 5.4, but<br />
parts of the rather lengthy proof of the "upper bound theorem" are shifted to the appendix.<br />

5.1.3 The Practicality of GCS Construction<br />

GCSs are in general non-deterministic. Their deterministic branches reflect several alternative<br />
strategies for consistency enforcement. We may therefore ask whether it is possible to<br />
construct such deterministic branches directly. In Section 5.4 we show that if we build $S'_{\mathcal{I}}$ by<br />
replacing the basic operations in $S$ by specializations of GCSs, we still achieve specializations<br />
of GCSs. This gives a second compatibility result which is of particular interest for practical<br />
applications. Especially in data-intensive applications, where we deal with sets, we may want<br />
to choose deterministic GCS branches with minimized symmetric difference for set values in<br />
the initial and final state. More liberally, this can be formalized by subsumption-free branches<br />
of the GCS.<br />

This paper is a continuation of the work in [21], where the notion of GCS was introduced<br />
to give a solid theoretical basis for consistency enforcement, but there was no idea how<br />
to construct them. Especially in the context of data-intensive applications there are many<br />
competing approaches based on active rule management [6, 10, 11, 15, 29, 30], but in none of<br />
these has a complete definition of the problem been given.<br />

The main new result of this paper concerns the construction of GCSs. It is shown how this<br />
can be reduced to basic operations, for which GCSs must still be detected case by case. On<br />
this basis, the practical paper [23] indicates an efficient implementation based on linguistic<br />
reflection. In addition, the technical prerequisite of $\mathcal{I}$-reducedness, which only occurs as a<br />
means for the proof of the main result, shows the limits of consistency enforcement. The main<br />
technical difficulty in the proof was to abstain from looking at specializations of $S$, but first<br />
to achieve a consistent generalization of $S_{\mathcal{I}}$, which is in general not a specialization of $S$.<br />

5.2 A Motivating Example<br />

Let us first illustrate consistency enforcement in the relational model. For this we consider a<br />
small fragment of the example used in [10, 22].<br />

Recall that a relation schema is simply a set of attributes. Moreover, with each attribute<br />
$A$ in a relation schema $R$ we associate a data type $dom(A)$, but for our purposes here the data<br />
types are not important. A relational database schema $\mathcal{S}$ is a finite set of relation schemata.<br />
A tuple $t$ over a relation schema $R$ is a map $R \to \bigcup_{A \in R} dom(A)$ with $t(A) \in dom(A)$. We<br />
usually denote tuples as ordinary tuples with components named by the attributes. Sometimes,<br />
we even omit the attributes, assuming a fixed order on them. Then a relation over $R$<br />
is a finite set of such tuples. An instance of $\mathcal{S}$ associates with each relation schema $R \in \mathcal{S}$ a<br />
relation $r$ over $R$.<br />



5.2.1 Constraints in the Relational Model<br />

An integrity constraint over a database schema $\mathcal{S}$ is a formula<br />
$\mathcal{I} \equiv P_1(x_1) \wedge \dots \wedge P_n(x_n) \Rightarrow Q_1(y_1) \vee \dots \vee Q_m(y_m)$,<br />
where the predicate symbols $P_i$, $Q_j$ either correspond to relation schemata $R \in \mathcal{S}$ or are<br />
comparison predicates ($=$, $\neq$


WIRE<br />
wire_id | connection | wire_type | voltage | power<br />
4711 | HH-HB | Koax | 12 | 600<br />
4814 | HH-H | Tel | 12 | 600<br />

TUBE<br />
tube_id | connection | tube_type<br />
8314 | HH-H | GX44<br />
8511 | HH-HB | GX44<br />
023 | HB-H | T33<br />

CONNECTION<br />
connection | from | to<br />
HH-H | Hamburg | Hannover<br />
HH-HB | Hamburg | Bremen<br />
HB-H | Bremen | Hannover<br />

It is easy to see that this instance satisfies the constraints above.<br />
□<br />

5.2.2 Stepwise Consistency Enforcement<br />

With each relation schema $R$ we also associate basic update operations $insert_R(t)$ and<br />
$delete_R(t)$. If a tuple $t$ to be inserted already exists in the relation, the insert-operation<br />
does nothing. If a tuple $t$ to be deleted does not exist, then a deletion also does nothing.<br />
Thus, these operations could also be written as assignments $R := R \cup \{t\}$ and $R := R - \{t\}$<br />
by slightly abusing the relation schemata as variables which take relations as values.<br />

Example 5.2. Consider the schema $\mathcal{S}$ from Example 5.1 and the operation $insert_{WIRE}(t)$.<br />
This may lead to a violation of constraint $ID_1$, in which case we must add a tuple to TUBE.<br />
Hence it can be replaced by<br />
  WIRE := WIRE ∪ {t} ;<br />
  IF connection(t) ∉ TUBE[connection]<br />
  THEN TUBE := TUBE ∪ {(?, connection(t), ?)}<br />
  ENDIF<br />
Here the question marks stand for arbitrarily chosen values of the corresponding data type.<br />
Similarly, we may replace $delete_{TUBE}(t)$ by<br />
  TUBE := TUBE − {t} ;<br />
  IF connection(t) ∈ WIRE[connection] − TUBE[connection]<br />
  THEN WIRE := WIRE − {t' | connection(t') = connection(t)}<br />
  ENDIF<br />
In order to enforce $FD_2$ we may then replace $insert_{TUBE}(t)$ by<br />
  IF ∀t' ∈ TUBE . tube_id(t) ≠ tube_id(t')<br />
  THEN TUBE := TUBE ∪ {t}<br />
  ENDIF<br />

Let us now add the exclusion constraint ED ≡ WIRE[wire_id] ∥ TUBE[tube_id]. In order to<br />
enforce this constraint, insertions into one of WIRE or TUBE should be followed by deletions<br />
in the other. The resulting operations are<br />
  WIRE := WIRE ∪ {t} ;<br />
  TUBE := TUBE − {t' | tube_id(t') = wire_id(t)}<br />
and<br />
  TUBE := TUBE ∪ {t} ;<br />
  WIRE := WIRE − {t' | wire_id(t') = tube_id(t)} .<br />

If we now take $FD_2$, $ID_1$ and ED together, we must be very careful. E.g., if we execute<br />
$insert_{WIRE}$(8511,HH-HB,Koax,12,600) on the instance above, we may first delete the tuple<br />
(8511,HH-HB,GX44) in TUBE in order to enforce ED and then the two tuples<br />
(4711,HH-HB,Koax,12,600) and (8511,HH-HB,Koax,12,600) in WIRE in order to enforce $ID_2$. The resulting<br />
instance would be (omitting CONNECTION):<br />

WIRE<br />
wire_id | connection | wire_type | voltage | power<br />
4814 | HH-H | Tel | 12 | 600<br />

TUBE<br />
tube_id | connection | tube_type<br />
8314 | HH-H | GX44<br />
023 | HB-H | T33<br />

Thus, the "effect" of the original operation, i.e. insertion of a tuple into WIRE, is completely<br />
destroyed. The new effect is a deletion in WIRE and TUBE.<br />
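The loss of the effect can be replayed on the instance above. The following is only a minimal sketch: the function name `naive_insert_wire` is hypothetical, identifiers are stored as strings, and the two repair rules (ED, and the inclusion of WIRE connections in TUBE connections) are assumptions read off from the text of Example 5.2, since the constraint definitions themselves are not reproduced here.<br />

```python
# Instance of Example 5.1 (CONNECTION omitted); identifiers are kept as strings.
WIRE = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
TUBE = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}


def naive_insert_wire(wire, tube, t):
    """Insert t into WIRE, then repair violations by deletions only.

    Assumed constraints (inferred from Example 5.2, not taken from a definition):
      ED:        wire ids and tube ids are disjoint
      inclusion: every connection occurring in WIRE also occurs in TUBE
    """
    wire = wire | {t}
    # ED repair: delete TUBE tuples whose tube_id equals the new wire_id.
    tube = {u for u in tube if u[0] != t[0]}
    # Inclusion repair: delete WIRE tuples whose connection no longer occurs in TUBE.
    tube_connections = {u[1] for u in tube}
    wire = {w for w in wire if w[1] in tube_connections}
    return wire, tube


t = ("8511", "HH-HB", "Koax", 12, 600)
new_wire, new_tube = naive_insert_wire(WIRE, TUBE, t)
print(sorted(new_wire))
print(sorted(new_tube))
```

Under these assumptions the inserted tuple is not contained in the resulting WIRE relation, which is exactly the destroyed effect described above.<br />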

The alternative works as follows: We start with the $insert_{WIRE}$ operation and replace<br />
it by the one above used to enforce $ID_1$. The resulting operation involves an insertion into<br />
TUBE. Next "enforce" $FD_2$ by replacing $insert_{TUBE}$. The resulting complex operation now<br />
involves $insert_{WIRE}$ and $insert_{TUBE}$. These are both replaced in order to "enforce" ED. We<br />
obtain the following operation:<br />

  WIRE := WIRE ∪ {t} ;<br />
  TUBE := TUBE − {t' | tube_id(t') = wire_id(t)} ;<br />
  IF connection(t) ∉ TUBE[connection]<br />
  THEN SELECT i ∉ TUBE[tube_id] ∪ WIRE[wire_id] ;<br />
       TUBE := TUBE ∪ {(i, connection(t), ?)}<br />
  ENDIF<br />

However, this operation is not consistent with respect to $ID_1$, which we enforced before. We<br />
therefore add a precondition which holds exactly in those cases where previous enforcement<br />
steps are preserved. This condition is wire_id(t) ∉ TUBE[tube_id]. Then the final operation<br />
will be<br />

  IF wire_id(t) ∉ TUBE[tube_id]<br />
  THEN WIRE := WIRE ∪ {t} ;<br />
       IF connection(t) ∉ TUBE[connection]<br />
       THEN SELECT i ∉ TUBE[tube_id] ∪ WIRE[wire_id] ;<br />
            TUBE := TUBE ∪ {(i, connection(t), ?)}<br />
       ENDIF<br />
  ELSE fail<br />
  ENDIF<br />

Here fail is used to express undefinedness. If the condition in the IF-clause is not satisfied,<br />
the whole operation will be rejected.<br />
□<br />

Example 5.2 reflects exactly the construction of a GCS branch. The presentation so far is<br />
completely informal. In the following sections we shall justify this approach. We shall see that<br />
the chosen order of the constraints is not important. We shall also see that the precondition<br />
arises naturally from the specialization order.<br />
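The final operation of Example 5.2 can be sketched in executable form. This is one deterministic branch only, under stated assumptions: the names `insert_wire` and `Fail` are hypothetical, `SELECT` is resolved by taking the smallest unused numeric identifier, and the arbitrary value `?` for the tube type is represented by the placeholder string `"?"`.<br />

```python
from itertools import count

WIRE = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
TUBE = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}


class Fail(Exception):
    """Models 'fail': the operation is undefined and the update is rejected."""


def insert_wire(wire, tube, t, unknown="?"):
    """One deterministic branch of the final operation in Example 5.2."""
    wire_id, connection = t[0], t[1]
    # Precondition: wire_id(t) must not occur as a tube_id.
    if wire_id in {u[0] for u in tube}:
        raise Fail("wire_id already occurs as a tube_id")
    wire = wire | {t}
    # Compensate a missing connection by inserting a fresh TUBE tuple.
    if connection not in {u[1] for u in tube}:
        used = {u[0] for u in tube} | {w[0] for w in wire}
        fresh = next(str(i) for i in count(1) if str(i) not in used)
        tube = tube | {(fresh, connection, unknown)}
    return wire, tube


# The critical insertion from the text is now rejected instead of destroying data.
try:
    insert_wire(WIRE, TUBE, ("8511", "HH-HB", "Koax", 12, 600))
except Fail as err:
    print("rejected:", err)
```

Choosing the smallest free identifier is a design decision of this sketch; any other value outside TUBE[tube_id] ∪ WIRE[wire_id] would give another branch of the same non-deterministic operation.<br />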



5.3 Fundamental Features of State-Based Specifications<br />

In the following consider a specification to consist of a state space, invariants and operations.<br />
A state space is simply a collection of typed state variables, where the types are assumed to be<br />
sets. Operations on these sets are defined by functions. Invariants are defined by formulae in<br />
an underlying logic $\mathcal{L}$. Finally, operations will be specified by generalized guarded commands.<br />

5.3.1 Formal Specifications with Guarded Commands<br />

In the following assume a fixed many-sorted, infinitary, first-order logic $\mathcal{L}$ and a fixed interpretation<br />
structure $(D, \omega)$, where $D = \bigcup_{T\ \mathrm{type}} T$ is a set (semantic domain) and $\omega$ assigns<br />
type-compatible functions $\omega(f) : T_1 \times \dots \times T_n \to T$ and $\omega(p) : T_1 \times \dots \times T_n \to \{true, false\}$<br />
to $n$-ary function symbols $f$ and $n$-ary predicate symbols $p$ respectively. Then $\omega$ can be extended<br />
in the usual way to the terms and formulae of $\mathcal{L}$, and we may assume the domain<br />
closure property, i.e. for each $d \in D$ there exists some closed term $t$ in $\mathcal{L}$ with $\omega(t) = d$.<br />

Definition 5.1. (i) A state space $X$ is a finite set of variables of $\mathcal{L}$ such that for each $x \in X$<br />
there is an associated type $T_x$. We write $x :: T_x$.<br />
(ii) A (static) invariant on a state space $X$ (for short: $X$-invariant) is a formula $\mathcal{I}$ of $\mathcal{L}$ with<br />
free variables in $X$ ($fr(\mathcal{I}) \subseteq X$).<br />
(iii) A transition invariant $\mathcal{J}$ on a state space $X$ is a formula of $\mathcal{L}$ with $fr(\mathcal{J}) \subseteq X \cup X'$,<br />
where $X'$ is a disjoint copy of $X$.<br />
(iv) Given a state space $X$, a state on $X$ is a type-compatible variable assignment $x \mapsto \sigma(x) \in T_x$<br />
for each $x \in X$.<br />

Let $\Sigma$ denote the set of all states. Clearly, states $\sigma \in \Sigma$ are sufficient to interpret $X$-invariants,<br />
whereas state pairs $(\sigma, \tau) \in \Sigma \times \Sigma$ suffice to interpret transition invariants. We use the<br />
notations $\models_{\sigma}$ and $\models_{(\sigma,\tau)}$ in these cases. The disjoint copy $X'$ of the state space $X$ for transition<br />
invariants is used to distinguish between the values in initial and final states respectively.<br />

Example 5.3. Let $\mathbb{Z}$ denote the set of integers. Consider the state space $X = \{x_1, x_2\}$ where<br />
$T_{x_1} = T_{x_2}$ is the set of finite subsets of the cartesian product $\mathbb{Z} \times \mathbb{Z}$. In addition, for $i = 1, 2$<br />
we have projection functions $\pi_i : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$. By abuse of notation we also use $\pi_i$ to denote<br />
the elementwise extension to set arguments, resulting in a finite set of integers. Moreover, consider<br />
the static invariants:<br />
$\mathcal{I}_1 \equiv \pi_1(x_1) \subseteq \pi_1(x_2)$,<br />
$\mathcal{I}_2 \equiv \forall x, y :: \mathbb{Z} \times \mathbb{Z} .\ x \in x_2 \wedge y \in x_2 \wedge \pi_2(x) = \pi_2(y) \Rightarrow \pi_1(x) = \pi_1(y)$,<br />
$\mathcal{I}_3 \equiv \pi_2(x_1) \cap \pi_2(x_2) = \emptyset$.<br />
□<br />

Note that Example 5.3 captures the essentials of Examples 5.1 and 5.2.<br />
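To make the correspondence concrete, the three invariants of Example 5.3 can be written as executable predicates over finite sets of integer pairs. The encoding below is an assumption made only for illustration: connections are coded as integers (HH-H as 1, HH-HB as 2, HB-H as 3), `x1` holds (connection, wire_id) pairs and `x2` holds (connection, tube_id) pairs; the helper names are hypothetical.<br />

```python
def proj(i, pairs):
    """Elementwise projection pi_i lifted to a set of pairs (i is 1 or 2)."""
    return {p[i - 1] for p in pairs}


def inv1(x1, x2):
    # I_1: pi_1(x1) is a subset of pi_1(x2)
    return proj(1, x1) <= proj(1, x2)


def inv2(x1, x2):
    # I_2: within x2 the second component determines the first one
    return all(x[0] == y[0] for x in x2 for y in x2 if x[1] == y[1])


def inv3(x1, x2):
    # I_3: pi_2(x1) and pi_2(x2) are disjoint
    return not (proj(2, x1) & proj(2, x2))


# Assumed integer encoding of the WIRE and TUBE relations of Example 5.1.
x1 = {(2, 4711), (1, 4814)}
x2 = {(1, 8314), (2, 8511), (3, 23)}
print(inv1(x1, x2), inv2(x1, x2), inv3(x1, x2))
```

Adding the pair (2, 8511) to `x1` mirrors the critical insertion of Example 5.2: under this encoding it keeps the first invariant but breaks the third one.<br />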

For operations we use guarded commands in the style of Dijkstra and Nelson [9, 12, 19, 24]<br />
including partiality and recursion. We dispense with more sophisticated constructs such as<br />
the dovetail-operator $\nabla$ [5], since fairness is beyond the scope of this piece of work.<br />

Definition 5.2. Let $X$ be some state space. An operation $S$ on $X$ consists of a set of input-parameters<br />
$\{\iota_1, \dots, \iota_k\}$, a set of output-parameters $\{o_1, \dots, o_l\}$ and a body. To each input-parameter<br />
$\iota_i$ corresponds a type $I_i$ and to each output-parameter $o_j$ corresponds a type $O_j$.<br />
The body of $S$ is a guarded command, i.e. it is recursively built from the following constructs:<br />
(i) assignment $x := E$, where $x$ is a state variable in $X$, an output parameter or a local<br />
variable within $S$ and $E$ is a term of the same type as $x$,<br />
(ii) skip, fail, loop,<br />
(iii) sequential composition $S_1 ; S_2$, choice $S_1 \,\Box\, S_2$, unbounded choice $@x :: T \bullet S$, guard $\mathcal{P} \to S$<br />
and restricted choice $S_1 \boxtimes S_2$, where $\mathcal{P}$ is a well-formed formula and $x$ is a variable of type<br />
$T$, and<br />
(iv) the least fixpoint operator $\mu S. f(S)$, where $f(S)$ is an expression built as above using in<br />
addition the operation variable $S$.<br />
We usually write $o_1, \dots, o_l \leftarrow S(\iota_1, \dots, \iota_k)$.<br />

Let us first explain the informal (and rather procedural) meaning of guarded commands.<br />
Each operation may be partial, i.e. it is undefined on a subset of $\Sigma$, and it is in general<br />
non-deterministic, i.e. starting in an initial state $\sigma$ may result in more than one final state $\tau$,<br />
where $\tau$ may also be $\infty$ to denote non-termination.<br />

Then the informal meaning of assignments, sequences and skip is the obvious one. Choices<br />
mean to arbitrarily select one of the operations, if it is defined. The intention behind the<br />
unbounded choice is the introduction of a new variable $x$ not occurring in $X$ of the given type<br />
$T$ and to execute $S$ on the extended state space $X \cup \{x\}$. A guard $\mathcal{P} \to S$ gives a precondition<br />
$\mathcal{P}$ for $S$. If $\mathcal{P}$ is not satisfied, the whole operation is undefined. Restricted choice $S \boxtimes T$ means<br />
to execute $S$ unless it is undefined, in which case $T$ is taken.<br />

The basic operations fail and loop are only introduced for theoretical completeness: fail<br />
is always undefined, and loop never terminates, but loop is the least element in the Nelson<br />
order $\preceq$, hence is fundamental for recursion, whereas fail will occur as the least element in<br />
the specialization order $\sqsubseteq$. Recall that the Nelson order is defined by $T \preceq S$ iff<br />
$wp(T)(\mathcal{R}) \Rightarrow wp(S)(\mathcal{R})$ and $wlp(S)(\mathcal{R}) \Rightarrow wlp(T)(\mathcal{R})$ (5.35)<br />
hold for all $X$-invariants $\mathcal{R}$ [19]. The definition of the specialization order $\sqsubseteq$ will occur in<br />
Definition 5.6.<br />

Example 5.4. Let us extend Example 5.3 by some operations. Define $S(a, b :: \mathbb{Z})$ by<br />
$x_1 := x_1 \cup \{(a, b)\}$ and $S'(a, b :: \mathbb{Z})$ by<br />
$x_1 := x_1 \cup \{(a, b)\} ;\ ((a \notin map(\pi_1)(x_2) \to @c :: \mathbb{Z} \bullet x_2 := x_2 \cup \{(a, c)\}) \boxtimes skip)$.<br />
$S$ inserts a new pair $(a, b)$ into the set value of $x_1$, and $S'$ contains an additional insertion<br />
into $x_2$. Thus, $S'$ is a sequence $S ; T$, where $T$ compensates a violation of the invariant $\mathcal{I}_1$.<br />
$T$ itself has the form $U \boxtimes skip$. The reason for the skip is that $U$ is a guard and thus<br />
is partial. Omitting the skip would lead to $S'$ being undefined in case of no violation of $\mathcal{I}_1$,<br />
whilst now $S'$ coincides with $S$ in that case.<br />
□<br />

We allow types to be omitted if they are known from the context or if they are not necessary<br />
for the understanding. Moreover, we allow cascaded unbounded choices $@x_1 \bullet @ \dots @x_n \bullet S$<br />
to be abbreviated by $@x_1, \dots, x_n \bullet S$.<br />



5.3.2 Axiomatic Semantics<br />

In general, we may describe the semantics of an operation $S$ simply by a set $\Delta(S) \subseteq \Sigma \times (\Sigma \cup \{\infty\})$,<br />
where $\infty$ is the special symbol used to indicate non-termination. Since nobody<br />
wants to define the semantics of operations by explicit enumeration of all state pairs, we are<br />
looking for an equivalent logical characterization.<br />

Let $\mathcal{R}$ be an $X$-invariant and consider the set of states $\Sigma_{\mathcal{R}} = \{\sigma \in \Sigma \mid\ \models_{\sigma} \mathcal{R}\}$ satisfying $\mathcal{R}$.<br />
If we take $\mathcal{R}$ as a postcondition for an operation $S$, we want to associate with it the weakest<br />
(liberal) precondition of $S$ to establish $\mathcal{R}$. Informally these conditions can be characterized as<br />
follows:<br />
- $wlp(S)(\mathcal{R})$ characterizes those initial states $\sigma$ such that all terminating executions of $S$ will<br />
reach a final state $\tau$ characterized by $\mathcal{R}$, i.e. $\models_{\sigma} wlp(S)(\mathcal{R})$ holds iff for all $\tau \in \Sigma$ with<br />
$(\sigma, \tau) \in \Delta(S)$ we have $\models_{\tau} \mathcal{R}$, and<br />
- $wp(S)(\mathcal{R})$ characterizes those initial states $\sigma$ such that all executions of $S$ terminate and<br />
will reach a final state $\tau$ characterized by $\mathcal{R}$, i.e. $\models_{\sigma} wp(S)(\mathcal{R})$ holds iff for all $(\sigma, \tau) \in \Delta(S)$<br />
we have $\tau \neq \infty$ and $\models_{\tau} \mathcal{R}$.<br />

Thus, $wlp(S)$ and $wp(S)$ are functions from $X$-invariants to $X$-invariants, which are usually<br />
called predicate transformers. It can be shown that these predicate transformers always exist<br />
and satisfy<br />
$wp(S)(\mathcal{R}) \Leftrightarrow wlp(S)(\mathcal{R}) \wedge wp(S)(true)$ and (5.36)<br />
$wlp(S)(\bigwedge_{i \in I} \mathcal{R}_i) \Leftrightarrow \bigwedge_{i \in I} wlp(S)(\mathcal{R}_i)$. (5.37)<br />
Call (5.36) the pairing condition and (5.37) the universal conjunctivity property.<br />
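On a finite state set both predicate transformers can be computed directly from the relational semantics, which makes (5.36) and (5.37) easy to check by enumeration. The sketch below is an illustration only: states are the integers 0 to 2, predicates are represented by the sets of states satisfying them, the string `"inf"` stands for the non-termination symbol, and the relation `DELTA` is a made-up operation.<br />

```python
from itertools import combinations

INF = "inf"                      # stands for the non-termination symbol
STATES = frozenset({0, 1, 2})    # a tiny, hypothetical state set


def wlp(delta, post):
    """States from which every terminating execution ends in post."""
    return frozenset(s for s in STATES
                     if all(t in post for (s0, t) in delta if s0 == s and t != INF))


def wp(delta, post):
    """States from which every execution terminates and ends in post."""
    return frozenset(s for s in STATES
                     if all(t != INF and t in post for (s0, t) in delta if s0 == s))


def predicates():
    """All predicates over STATES, each given as its set of satisfying states."""
    items = sorted(STATES)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


# From 0 go to 1 or 2; from 1 stay in 1 or diverge; undefined on 2.
DELTA = {(0, 1), (0, 2), (1, 1), (1, INF)}

pairing = all(wp(DELTA, r) == wlp(DELTA, r) & wp(DELTA, STATES) for r in predicates())
conjunctive = all(wlp(DELTA, r1 & r2) == wlp(DELTA, r1) & wlp(DELTA, r2)
                  for r1 in predicates() for r2 in predicates())
print(pairing, conjunctive)
```

Note that a state on which the operation is undefined (state 2 here) satisfies every weakest precondition vacuously, in line with the equations for fail.<br />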

The existence proof is based on the assumption that $\mathcal{L}$ is an infinitary logic and the domain<br />
closure assumption. The latter allows us, for a given state $\sigma$, to find a characterizing predicate $\mathcal{P}_{\sigma}$,<br />
i.e. we have $\models_{\tau} \mathcal{P}_{\sigma}$ iff $\tau = \sigma$.<br />
Furthermore, the former property then allows us to write $\mathcal{R} \Leftrightarrow \bigvee_{\sigma \in \Sigma_{\mathcal{R}}} \mathcal{P}_{\sigma}$, which is used<br />
to prove that the pairing condition and the universal conjunctivity are sufficient to define<br />
operations, i.e., the definition of predicate transformers $wlp(S)$ and $wp(S)$ satisfying (5.36)<br />
and (5.37) is equivalent to the definition of $\Delta(S)$ [9, 12, 19, 24].<br />

Let us define the semantics of operations axiomatically via predicate transformers. The<br />
proofs of universal conjunctivity and the pairing condition are straightforward [19].<br />

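Definition 5.3. Let $S$, $S_1$, $S_2$ be guarded commands on some state space $X$, $T$ some type,<br />
$E(x)$ some term of type $T_x$ and let $x \in X$ and $y \notin X$. Then we have for any formula $\mathcal{R}$ of $\mathcal{L}$:<br />
$wlp(skip)(\mathcal{R}) \Leftrightarrow wp(skip)(\mathcal{R}) \Leftrightarrow \mathcal{R}$ (5.38)<br />
$wlp(fail)(\mathcal{R}) \Leftrightarrow wp(fail)(\mathcal{R}) \Leftrightarrow true$ (5.39)<br />
$wlp(loop)(\mathcal{R}) \Leftrightarrow true$, $wp(loop)(\mathcal{R}) \Leftrightarrow false$ (5.40)<br />
$wlp(x := E)(\mathcal{R}) \Leftrightarrow wp(x := E)(\mathcal{R}) \Leftrightarrow \{x/E\}.\mathcal{R}$ (5.41)<br />
where $\{x/E\}.\mathcal{R}$ denotes the substitution of the variable $x$ in $\mathcal{R}$ by the expression $E$,<br />
$wlp(S_1 ; S_2)(\mathcal{R}) \Leftrightarrow wlp(S_1)(wlp(S_2)(\mathcal{R}))$, $wp(S_1 ; S_2)(\mathcal{R}) \Leftrightarrow wp(S_1)(wp(S_2)(\mathcal{R}))$ (5.42)<br />
$wlp(\mathcal{P} \to S)(\mathcal{R}) \Leftrightarrow \mathcal{P} \Rightarrow wlp(S)(\mathcal{R})$, $wp(\mathcal{P} \to S)(\mathcal{R}) \Leftrightarrow \mathcal{P} \Rightarrow wp(S)(\mathcal{R})$ (5.43)<br />
$wlp(S_1 \,\Box\, S_2)(\mathcal{R}) \Leftrightarrow wlp(S_1)(\mathcal{R}) \wedge wlp(S_2)(\mathcal{R})$, $wp(S_1 \,\Box\, S_2)(\mathcal{R}) \Leftrightarrow wp(S_1)(\mathcal{R}) \wedge wp(S_2)(\mathcal{R})$ (5.44)<br />
$wlp(S_1 \boxtimes S_2)(\mathcal{R}) \Leftrightarrow wlp(S_1)(\mathcal{R}) \wedge (wp(S_1)(false) \Rightarrow wlp(S_2)(\mathcal{R}))$,<br />
$wp(S_1 \boxtimes S_2)(\mathcal{R}) \Leftrightarrow wp(S_1)(\mathcal{R}) \wedge (wp(S_1)(false) \Rightarrow wp(S_2)(\mathcal{R}))$ (5.45)<br />
$wlp(@y :: T \bullet S)(\mathcal{R}) \Leftrightarrow \forall y :: T . wlp(S)(\mathcal{R})$, $wp(@y :: T \bullet S)(\mathcal{R}) \Leftrightarrow \forall y :: T . wp(S)(\mathcal{R})$ (5.46)<br />
$wlp(\mu S.f(S))(\mathcal{R}) \Leftrightarrow \bigwedge_{\alpha} wlp(f^{\alpha}(loop))(\mathcal{R})$ and<br />
$wp(\mu S.f(S))(\mathcal{R}) \Leftrightarrow \bigvee_{\alpha} wp(f^{\alpha}(loop))(\mathcal{R})$, (5.47)<br />
where $\alpha$ ranges over the ordinal numbers.<br />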

The recursive guarded command $\mu S.f(S)$ is the least fixpoint of $f$ with respect to the Nelson<br />
order $\preceq$ defined in (5.35). This requires the monotonicity of the constructors in<br />
Definition 5.2 with respect to this order, which can easily be proven [19].<br />
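For the non-recursive constructs, the equations of Definition 5.3 can be compared with a relational reading of guarded commands on a finite state set. The sketch below is not part of the formal development: the relational constructors `seq`, `choice`, `guard` and `rchoice` are one plausible encoding chosen for illustration, predicates are sets of states, `"inf"` stands for non-termination, and `S1`, `S2`, `P` are made-up sample data.<br />

```python
from itertools import combinations

INF = "inf"                       # stands for the non-termination symbol
STATES = frozenset(range(3))      # a tiny, hypothetical state set


def wlp(delta, post):
    return frozenset(s for s in STATES
                     if all(t in post for (s0, t) in delta if s0 == s and t != INF))


def wp(delta, post):
    return frozenset(s for s in STATES
                     if all(t != INF and t in post for (s0, t) in delta if s0 == s))


SKIP = {(s, s) for s in STATES}
FAIL = set()                      # undefined everywhere
LOOP = {(s, INF) for s in STATES}


def seq(d1, d2):
    """S1 ; S2: run d2 from every proper result of d1, keep divergence of d1."""
    finished = {(s, t) for (s, r) in d1 if r != INF for (r0, t) in d2 if r0 == r}
    return finished | {(s, r) for (s, r) in d1 if r == INF}


def choice(d1, d2):
    return d1 | d2


def guard(p, d):
    """P -> S: defined only on states satisfying p."""
    return {(s, t) for (s, t) in d if s in p}


def rchoice(d1, d2):
    """Restricted choice: d1 where it is defined, otherwise d2."""
    defined = {s for (s, _) in d1}
    return d1 | {(s, t) for (s, t) in d2 if s not in defined}


def implies(a, b):
    return (STATES - a) | b


def predicates():
    items = sorted(STATES)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


S1 = {(0, 1), (0, 2), (1, INF)}             # undefined on state 2
S2 = {(0, 0), (1, 2), (2, 1), (2, INF)}
P = frozenset({0, 2})

ok = all(
    wp(seq(S1, S2), r) == wp(S1, wp(S2, r))
    and wlp(seq(S1, S2), r) == wlp(S1, wlp(S2, r))
    and wp(guard(P, S2), r) == implies(P, wp(S2, r))
    and wp(choice(S1, S2), r) == wp(S1, r) & wp(S2, r)
    and wp(rchoice(S1, S2), r) == wp(S1, r) & implies(wp(S1, frozenset()), wp(S2, r))
    for r in predicates()
)
print(ok)
```

In this encoding $wp(S_1)(false)$ is exactly the set of states on which $S_1$ has no outcome at all, which is why it appears as the switching condition in (5.45).<br />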

Note that operations may only affect parts of the state space. For consistency enforcement<br />
it will be necessary to "extend" such operations. Therefore, we need to know for each operation<br />
$S$ the subspace $Y \subseteq X$ such that $S$ does not change the values in $X - Y$. In this case we call<br />
$S$ a $Y$-operation on $X$. A formal definition is the following.<br />

Definition 5.4. Let $X$ be a state space and $S$ an operation on $X$. $S$ is a $Y$-operation for<br />
$Y \subseteq X$ iff $wlp(S)(\mathcal{R}) \Leftrightarrow \mathcal{R}$ and $wp(S)(\mathcal{R}) \Leftrightarrow \mathcal{R}$ hold for each $(X - Y)$-invariant $\mathcal{R}$ and $Y$ is<br />
minimal with this property.<br />

Note that for each operation $S$ on $X$ there is always a $Y \subseteq X$ such that $S$ is a $Y$-operation.<br />

Let us now give a characterization for deterministic operations. For this we need the notion<br />
of the dual or conjugate predicate transformers $wlp(S)^*$ and $wp(S)^*$, which are defined by<br />
$wlp(S)^*(\mathcal{R}) \Leftrightarrow \neg wlp(S)(\neg\mathcal{R})$ and $wp(S)^*(\mathcal{R}) \Leftrightarrow \neg wp(S)(\neg\mathcal{R})$. (5.48)<br />

Definition 5.5. An operation $S$ on the state space $X$ is called deterministic iff $wlp(S)^*(\mathcal{R}) \Rightarrow wp(S)(\mathcal{R})$<br />
holds for all $X$-invariants $\mathcal{R}$.<br />
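On a finite state set the criterion of Definition 5.5 can be tested by enumerating all predicates. The following sketch makes the same illustrative assumptions as before (integer states, predicates as state sets, `"inf"` for non-termination); the function names and the sample relations are hypothetical.<br />

```python
from itertools import combinations

INF = "inf"
STATES = frozenset({0, 1, 2})


def wlp(delta, post):
    return frozenset(s for s in STATES
                     if all(t in post for (s0, t) in delta if s0 == s and t != INF))


def wp(delta, post):
    return frozenset(s for s in STATES
                     if all(t != INF and t in post for (s0, t) in delta if s0 == s))


def conj_wlp(delta, post):
    """Conjugate transformer (5.48): not wlp(S)(not R)."""
    return STATES - wlp(delta, STATES - post)


def predicates():
    items = sorted(STATES)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def is_deterministic(delta):
    """Definition 5.5: the conjugate of wlp implies wp for every predicate."""
    return all(conj_wlp(delta, r) <= wp(delta, r) for r in predicates())


DET = {(0, 1), (1, 1)}             # partial, but at most one outcome per state
NONDET = {(0, 1), (0, 2)}          # two possible final states from state 0
MAYLOOP = {(0, 1), (0, INF)}       # may terminate in 1 or diverge

print(is_deterministic(DET), is_deterministic(NONDET), is_deterministic(MAYLOOP))
```

The third relation shows that an operation which may either terminate or diverge from the same state is also rejected by the criterion.<br />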

5.3.3 Consistency and Specialization<br />

General invariants and arbitrary operations on a state space $X$ raise the problem of whether<br />
consistency as defined by the invariants is always satisfied by the operations. One approach<br />
to address this problem is to use general verification techniques, i.e. to derive (and prove)<br />
general proof obligations in the predicate transformer calculus. Let us first express these<br />
proof obligations.<br />



Definition 5.6. Let $X = \{x_1, \dots, x_n\}$ be a state space, $Z \subseteq Y \subseteq X$ subspaces, $\mathcal{I}$ an $X$-invariant,<br />
$\mathcal{J}$ a transition invariant, $S$ a $Z$-operation and $T$ a $Y$-operation. Then<br />
(i) $S$ is consistent with respect to $\mathcal{I}$ iff $\mathcal{I} \Rightarrow wlp(S)(\mathcal{I})$ holds,<br />
(ii) $T$ specializes² $S$ iff $wp(S)(true) \Rightarrow wp(T)(true)$ and $wlp(S)(\mathcal{R}) \Rightarrow wlp(T)(\mathcal{R})$ hold for all<br />
$Z$-invariants $\mathcal{R}$ (denoted $T \sqsubseteq S$), and<br />
(iii) $S$ is consistent with respect to $\mathcal{J}$ iff $S \sqsubseteq S_{\mathcal{J}}$ holds, where $S_{\mathcal{J}}$ is defined as<br />
$loop \,\Box\, @x'_1, \dots, x'_n \bullet \mathcal{J} \to x_1 := x'_1 ; \dots ; x_n := x'_n$.<br />

Recall the intention behind these definitions. An $X$-invariant partitions the set $\Sigma$ of states into two disjoint<br />

subsets. If we consider the invariant $I$ we have $\Sigma = \Sigma_I \cup \Sigma_{\neg I}$. States not satisfying the<br />

invariant should never be reached, i.e. if $S$ is started in a state satisfying $I$, it should only<br />

reach states also satisfying $I$, i.e., if we have $\sigma \in \Sigma_I$, then for all $\tau \in \Sigma$ with $(\sigma, \tau) \in \Delta(S)$<br />

we should always have $\tau \in \Sigma_I$. Recall from the introduction to this section that the set of all<br />

initial states such that each terminating execution of $S$ reaches $\Sigma_I$ is $\Sigma_{wlp(S)(I)}$, i.e.,<br />

\[ \{ \sigma \in \Sigma \mid (\sigma, \tau) \in \Delta(S) \text{ implies } \models_\tau I \text{ for all } \tau \in \Sigma \} = \Sigma_{wlp(S)(I)} . \]<br />

Hence we have the requirement $\Sigma_I \subseteq \Sigma_{wlp(S)(I)}$, which is equivalent to (i) [8].<br />
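The proof obligation (i) can be tried out on a small executable model. The following Python sketch is not part of the paper: it assumes that a state is a single integer and that an operation maps an initial state to a pair consisting of the set of final states of its terminating runs and a flag saying whether it may fail to terminate. The names `wlp`, `wp`, `is_consistent`, `decrement` and `non_negative` are illustrative, and consistency is only checked on a finite sample of initial states.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

State = int
Pred = Callable[[State], bool]
# Assumed model: an operation maps an initial state to
# (final states of terminating runs, flag "may fail to terminate").
Op = Callable[[State], Tuple[FrozenSet[State], bool]]


def wlp(op: Op, post: Pred) -> Pred:
    """Weakest liberal precondition: every terminating run ends in `post`."""
    return lambda s: all(post(t) for t in op(s)[0])


def wp(op: Op, post: Pred) -> Pred:
    """Weakest precondition: as wlp, and additionally no run diverges."""
    return lambda s: (not op(s)[1]) and wlp(op, post)(s)


def is_consistent(op: Op, inv: Pred, sample: Iterable[State]) -> bool:
    """Check I => wlp(S)(I) on a finite sample of initial states."""
    return all(wlp(op, inv)(s) for s in sample if inv(s))


def decrement(a: int) -> Op:
    """Illustrative assignment x := x - a (terminates with one final state)."""
    return lambda s: (frozenset({s - a}), False)


def loop(s: State) -> Tuple[FrozenSet[State], bool]:
    """The never-terminating operation: no final state, always diverges."""
    return (frozenset(), True)


def non_negative(x: State) -> bool:
    """Illustrative invariant x >= 0."""
    return x >= 0


SAMPLE = range(-10, 11)
print(is_consistent(decrement(3), non_negative, SAMPLE))
print(is_consistent(loop, non_negative, SAMPLE))
```

In this model `loop` is consistent with every invariant because it has no terminating run, whereas a decrement by a positive constant violates `x >= 0` for small initial values.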

The intuition behind the definition of specialization is that whenever an execution of the<br />

specialized operation $T$ establishes some post-predicate $R$, then this execution should already<br />

be one of the general operation $S$. Clearly, $\sqsubseteq$ defines a partial order on operations.<br />
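As an illustration only, specialization can be checked in a simple finite model. The sketch below assumes (this is not the paper's formal semantics) that an operation maps an integer state to its set of final states plus a divergence flag. In that model, the condition on all post-predicates $R$ amounts to set inclusion of final states, since $R$ may be chosen as membership in the final states of $S$. All operation names are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

State = int
# Assumed model: initial state -> (final states, may diverge).
Op = Callable[[State], Tuple[FrozenSet[State], bool]]


def specializes(t: Op, s: Op, sample: Iterable[State]) -> bool:
    """Check that T specializes S on a finite sample of initial states."""
    for x in sample:
        t_final, t_diverges = t(x)
        s_final, s_diverges = s(x)
        # wp(S)(true) => wp(T)(true): T may diverge only where S may diverge.
        if t_diverges and not s_diverges:
            return False
        # wlp(S)(R) => wlp(T)(R) for all R: each final state of T is one of S.
        if not t_final <= s_final:
            return False
    return True


def step_either_way(x: State):
    """Non-deterministic choice between x := x - 1 and x := x + 1."""
    return (frozenset({x - 1, x + 1}), False)


def step_up(x: State):
    """Deterministic assignment x := x + 1."""
    return (frozenset({x + 1}), False)


def fail(x: State):
    """No execution at all."""
    return (frozenset(), False)


def loop(x: State):
    """Never terminates."""
    return (frozenset(), True)


SAMPLE = range(-5, 6)
print(specializes(step_up, step_either_way, SAMPLE))
print(specializes(step_either_way, step_up, SAMPLE))
```

Note that `fail` specializes every operation in this model, while `loop` specializes only operations that may themselves diverge everywhere.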

Each transition invariant $J$ may be regarded as a very general operation $\hat{J}$ that allows any<br />

state pair $(\sigma, \tau)$ satisfying $J$. Hence transition consistency for an operation $S$ is equivalent to<br />

$S \sqsubseteq \hat{J}$. The loop-part in the definition of $\hat{J}$ gives $wp(\hat{J})(true) \Leftrightarrow false$, which allows us to<br />

consider only $wlp(\hat{J})(R) \Rightarrow wlp(S)(R)$ for all $Z$-invariants $R$.<br />

There exists an equivalent characterization of specialization, and hence also of transition<br />

consistency, that avoids the quantification over all $X$-invariants, but uses conjugate predicate<br />

transformers as defined in (5.48). The result in Proposition 5.7 is assumed to be commonly<br />

known. E.g., [7] mentions a similar result in the wp-calculus without proof. The proof is rather<br />

technical and can be done by simple direct calculations. Since we do not know of any reference<br />

for such a proof, we have added it in Appendix 5.6.<br />

Proposition 5.7. Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$. Let $Z =<br />

\{z_1, \dots, z_n\}$ be disjoint to $X$ with $T_{x_i} = T_{z_i}$. Then $wlp(S)(R) \Rightarrow wlp(T)(R)$ holds for all<br />

$X$-invariants $R$ iff<br />

\[ \{z_1/x_1, \dots, z_n/x_n\}.wlp(T')(wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n)) \qquad (5.49) \]<br />

holds, where $T'$ results from $T$ by renaming all the variables $x_i$ by $z_i$.<br />

□<br />

2 Some other authors would prefer to call $\sqsubseteq$ refinement. This is justified as long as refinement does not comprise<br />

the extension of specifications. This view of refinement as a more general methodological means underlies<br />

our overall work on formal methods, whereas we prefer the notion of specialization in this more restrictive<br />

context. From a technical point of view, we simply consider a partial order (with some nice properties) on<br />

operations.<br />



Note that the order of the substitution is irrelevant. Then $T \sqsubseteq S$ holds iff we have (5.49)<br />

and $wp(S)(true) \Rightarrow wp(T)(true)$. This is a result in its own right, which enables mechanical<br />

or even automatic verification. In the context of consistency enforcement, (5.49) defines an<br />

$X$-invariant, say $P$. If we restrict the operations $S$ and $T$ to the states satisfying $P$, then $T$ would become a<br />

specialization of $S$. If we have $wp(S)(true) \Leftrightarrow wp(T)(true)$, this implies that $P \to T$ with $P$<br />

defined by (5.49) is the greatest common specialization of $S$ and $T$. This will later be used to<br />

find the precondition in the GCS.<br />
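To make the role of (5.49) as a state predicate more tangible, here is a hedged Python sketch in a simplified model (integer states, an operation returns its final states and a divergence flag). The function `condition_5_49` is an illustrative reading of (5.49), not the paper's definition: the renamed copy $T'$ acts on $z$ while $x$ keeps the initial value, and the conjugate transformer then asks whether $S$ can reach that value of $z$ from $x$. The operations are hypothetical.

```python
from typing import Callable, FrozenSet, Tuple

State = int
Pred = Callable[[State], bool]
# Assumed model: initial state -> (final states, may diverge).
Op = Callable[[State], Tuple[FrozenSet[State], bool]]


def wlp(op: Op, post: Pred) -> Pred:
    """Every terminating run of op ends in a state satisfying post."""
    return lambda s: all(post(t) for t in op(s)[0])


def wlp_conj(op: Op, post: Pred) -> Pred:
    """Conjugate transformer: wlp(S)*(R) = not wlp(S)(not R)."""
    return lambda s: not wlp(op, lambda t: not post(t))(s)


def condition_5_49(t: Op, s: Op) -> Pred:
    """State predicate in the spirit of (5.49) for this simplified model."""
    def holds(x: State) -> bool:
        # For every final value z of T started in x, S can reach z from x.
        return all(wlp_conj(s, lambda y, z=z: y == z)(x) for z in t(x)[0])
    return holds


def step_either_way(x: State):
    return (frozenset({x - 1, x + 1}), False)


def step_up(x: State):
    return (frozenset({x + 1}), False)


def up_below_three(x: State):
    """Hypothetical operation: increments below 3, otherwise leaves x alone."""
    return (frozenset({x + 1}) if x < 3 else frozenset({x}), False)


P = condition_5_49(up_below_three, step_either_way)
print([x for x in range(0, 6) if P(x)])
```

Guarding `up_below_three` with the computed predicate `P` leaves exactly those runs that are also runs of `step_either_way`, which is how the precondition is meant to be used later on.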

Corollary 5.8. Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$ with $wp(S)(true) \Leftrightarrow<br />

wp(T)(true)$. Define an $X$-invariant $P_{ST}$ by (5.49). Then $P_{ST} \to T$ is the greatest common<br />

specialization of $S$ and $T$.<br />

□<br />

In general the considered operations will be non-deterministic. Informally, non-determinism<br />

may be considered as gluing together infinitely many deterministic operations by a choice<br />

operator. Sometimes, however, we are interested only in these deterministic branches. We give<br />

a formal definition for this.<br />

Definition 5.9. Let $S$ and $T$ be operations on $X$ with $T \sqsubseteq S$ and $wp(T)^*(true) \Leftrightarrow wp(S)^*(true)$.<br />

If $T$ is deterministic, it is called a deterministic branch of $S$.<br />

If $T$ and $S$ are semantically equivalent to some $@\, y_1 :: T_1, \dots, y_n :: T_n \bullet T'$ and<br />

$@\, y_1 :: T_1, \dots, y_n :: T_n \bullet S'$ respectively such that $T'$ is a deterministic branch of $S'$, then we<br />

call $T$ a quasi-deterministic branch of $S$.<br />

5.3.4 Greatest Consistent Specializations<br />

Suppose now that we are given an operation $S$ and a static invariant $I$. Assume that $S$ is a $Y$-operation,<br />

whereas $I$ is defined on $X$ with $Y \subseteq X$. The idea is to construct a "new" operation<br />

$S_I$ that is consistent with respect to $I$ and can be used to replace $S$. Roughly speaking, this<br />

means that the effect of $S_I$ on the state variables in $Y$ should not differ from the effect of<br />

$S$. Formally this is expressed by consistent specialization. Since there will in general be more than one<br />

such specialization, we choose the "greatest" one, i.e. all others should specialize it.<br />

Definition 5.10. Let $Y \subseteq X$ be state spaces, $S$ a $Y$-operation and $I$ an $X$-invariant. Then<br />

an operation $S_I$ on $X$ is called Greatest Consistent Specialization (GCS) of $S$ with respect to<br />

$I$ iff<br />

(i) $S_I \sqsubseteq S$ holds,<br />

(ii) $S_I$ is consistent with respect to $I$ and<br />

(iii) for each operation $T$ on $X$ satisfying properties (i) and (ii) (instead of $S_I$) we have $T \sqsubseteq S_I$.<br />
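On a very small state space the three properties can be checked by brute force. The following Python sketch is an illustration under simplifying assumptions that are not in the paper: $Y = X$ (so the operation and its GCS act on the same variables), three states, and operations modelled as tables from initial states to a set of final states plus a divergence flag. The closed form in `candidate_gcs` is a guess that is then verified against properties (i) to (iii) by enumerating all operations of the model.

```python
from itertools import combinations, product

STATES = (0, 1, 2)


def subsets(xs):
    """All subsets of xs as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


# Behaviour at one initial state: (set of final states, may diverge).
BEHAVIOURS = [(final, div) for final in subsets(STATES) for div in (False, True)]


def all_operations():
    """Enumerate every operation of the assumed model on STATES (16**3 tables)."""
    for choice in product(BEHAVIOURS, repeat=len(STATES)):
        yield dict(zip(STATES, choice))


def specializes(t, s):
    """T specializes S: fewer final states, divergence only where S may diverge."""
    return all(t[x][0] <= s[x][0] and (s[x][1] or not t[x][1]) for x in STATES)


def consistent(op, inv):
    """I => wlp(op)(I): from states in inv only states in inv are reached."""
    return all(op[x][0] <= inv for x in inv)


def candidate_gcs(s, inv):
    """Drop final states violating inv, but only for initial states inside inv."""
    return {x: ((s[x][0] & inv) if x in inv else s[x][0], s[x][1]) for x in STATES}


def is_gcs(g, s, inv):
    """Properties (i) to (iii) of Definition 5.10, checked by enumeration."""
    if not (specializes(g, s) and consistent(g, inv)):
        return False
    return all(specializes(t, g) for t in all_operations()
               if specializes(t, s) and consistent(t, inv))


INV = frozenset({0, 1})
S = {0: (frozenset({1, 2}), False),
     1: (frozenset({2}), False),
     2: (frozenset({0, 2}), True)}
G = candidate_gcs(S, INV)
print(is_gcs(G, S, INV))
```

In this example the candidate has no execution at all from state 1, which corresponds to an added precondition. The enumeration only covers the tiny model and says nothing about GCSs that extend the state space.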

Example 5.5. Consider $S = loop$, which is already consistent with respect to any invariant<br />

$I$. Hence $S_I$ must also be $loop$.<br />

Similarly, $S = fail$ is consistent with respect to any invariant $I$, which gives $S_I = fail$.<br />

□<br />

Example 5.6. Let $\mathbb{Z}$ denote the set of integers. Take the state space $X = \{x\}$ with $x :: \mathbb{Z}$<br />

and suppose the $X$-constraint $I \equiv x \geq 0$ and the $X$-operation $S = x := x - a$ for some<br />

constant $a \geq 0$. Then we have<br />

$S_I = (x \geq a \lor x$ …


(i) holds, since<br />

$wlp(S)(R) \Leftrightarrow \{x/x - a\}.R$<br />

) (x a _ x


Moreover, due to the construction of $\mathcal{T}$ and the definition of specialization, $S'$ is the<br />

least upper bound of $\mathcal{T}$ with respect to $\sqsubseteq$, hence it must also be a specialization of $S$. On<br />

the other hand, from the consistency proof obligation and the construction of $\mathcal{T}$ we obtain<br />

immediately that $S'$ is consistent with respect to $I$. Hence $S' \in \mathcal{T}$ must hold, which proves<br />

that $S'$ is a GCS $S_I$ of $S$ with respect to $I$. This completes the existence proof.<br />

The uniqueness follows immediately, since each GCS $S'$ of $S$ with respect to $I$ must be<br />

the least upper bound of $\mathcal{T}$.<br />

□<br />

We observe that the GCS with respect to a conjunction of invariants can be built successively.<br />

Similarly, we obtain a trivial compatibility result with respect to further specialization.<br />

Both results, given in the next proposition, have already been proven in [21].<br />

Proposition 5.12. If $I_1$ and $I_2$ are static invariants on $X$, then for any operation $S$ on<br />

$Y \subseteq X$ the GCSs $(S_{I_1})_{I_2}$ and $S_{(I_1 \wedge I_2)}$ coincide on initial states satisfying $I_1 \wedge I_2$, i.e.,<br />

$I_1 \wedge I_2 \to (S_{I_1})_{I_2}$ and $I_1 \wedge I_2 \to S_{(I_1 \wedge I_2)}$ are semantically equivalent.<br />

For an $X$-invariant $I$ and a $Z$-operation $T \sqsubseteq S$ the GCS $T_I$ of $T$ with respect to $I$ is a<br />

specialization of $S_I$.<br />

□<br />
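The first statement can be checked exhaustively in a tiny model. The sketch below is illustrative only: it assumes $Y = X$, three states, divergence-free operations given as tables, and a closed form `gcs` for the same-state-space case (keep only final states inside the invariant when starting inside it). These are assumptions of the sketch, not constructions from [21].

```python
from itertools import combinations, product

STATES = (0, 1, 2)
SUBSETS = [frozenset(c) for r in range(len(STATES) + 1)
           for c in combinations(STATES, r)]


def gcs(op, inv):
    """Assumed same-state-space GCS: inside inv keep only final states in inv."""
    return {x: ((final & inv) if x in inv else final, div)
            for x, (final, div) in op.items()}


def guard(pred, op):
    """pred -> op: no execution at all (fail) outside pred."""
    return {x: (beh if x in pred else (frozenset(), False))
            for x, beh in op.items()}


def successive_construction_agrees():
    """Compare (S_I1)_I2 and S_(I1 and I2) on initial states satisfying both."""
    for finals in product(SUBSETS, repeat=len(STATES)):
        op = {x: (final, False) for x, final in zip(STATES, finals)}
        for i1 in SUBSETS:
            for i2 in SUBSETS:
                both = i1 & i2
                if guard(both, gcs(gcs(op, i1), i2)) != guard(both, gcs(op, both)):
                    return False
    return True


print(successive_construction_agrees())
```

The guard matters: without restricting to initial states satisfying both invariants, the two constructions can differ, for instance when the initial state satisfies $I_1$ but not $I_2$.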

5.4 The Construction of GCSs<br />

The proof of the existence result in Proposition 5.11 is not constructive. Therefore, we have<br />

to find a way to construct the GCS of an operation with respect to some given invariant.<br />

For the basic operations $loop$ and $fail$ we have already computed their GCSs in Example<br />

5.5. For $skip$, which is a $\emptyset$-operation, we notice that each operation $T$ with $wp(T)(true) \Leftrightarrow true$<br />

is already a specialization. Hence we have to find the greatest consistent operation with this<br />

property. Informally, when starting in a state $\sigma \in \Sigma_I$ the resulting state must also lie in $\Sigma_I$.<br />

When starting in a state $\sigma \in \Sigma_{\neg I}$ we may reach any final state $\tau \in \Sigma$. For $X = \{x_1, \dots, x_n\}$<br />

this operation is given by<br />

\[ (I \to (@\, x'_1, \dots, x'_n \bullet (\{x_1/x'_1, \dots, x_n/x'_n\}.I \to x_1 := x'_1 ; \dots ; x_n := x'_n))) \]<br />

\[ \boxtimes\; (@\, x'_1, \dots, x'_n \bullet x_1 := x'_1 ; \dots ; x_n := x'_n) . \]<br />

The required properties are easily checked $^3$.<br />
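The informal description of this operation translates directly into a small Python sketch. It is an illustration under assumptions that are not in the paper: a single integer variable, a finite domain standing in for the value set, and an operation represented as a function returning its final states together with a divergence flag.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

State = int
Pred = Callable[[State], bool]


def gcs_of_skip(inv: Pred, domain: Iterable[State]):
    """GCS of skip (an operation on no variables) over an assumed finite domain."""
    all_states = frozenset(domain)
    good_states = frozenset(t for t in all_states if inv(t))

    def run(s: State) -> Tuple[FrozenSet[State], bool]:
        # Inside the invariant: choose any x' satisfying I.
        # Outside the invariant: choose any x' at all.
        return (good_states if inv(s) else all_states, False)

    return run


DOMAIN = range(-3, 4)
op = gcs_of_skip(lambda x: x >= 0, DOMAIN)
print(sorted(op(2)[0]))
print(sorted(op(-1)[0]))
```

Because skip constrains no variable, every always-terminating operation specializes it, so the GCS only has to guarantee that the invariant is preserved. Leaving the state unchanged is one of its deterministic branches.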

For assignments, we assume a case-by-case analysis for selected classes of invariants. In a<br />

data-intensive context such work has been done in [24, 21], but the rule-based approach<br />

in [10] can also be exploited for this task.<br />

In general, however, an operation is complex, built up from the basic operations and the<br />

constructors in Definition 5.2. It would be convenient if the GCS could be built just by replacing<br />

the involved basic operations by their GCSs, but in general this is wrong.<br />
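A small experiment shows how componentwise replacement can go wrong for a sequence. The Python sketch below is not the paper's construction: it assumes a single integer variable, the invariant $x \geq 0$, and a simplified GCS for the same-state-space case (`gcs_same_space`, an assumption of this sketch) that discards invariant-violating final states when starting inside the invariant.

```python
from typing import Callable, FrozenSet, Tuple

State = int
# Assumed model: initial state -> (final states, may diverge).
Op = Callable[[State], Tuple[FrozenSet[State], bool]]


def non_negative(x: State) -> bool:
    return x >= 0


def assign(f: Callable[[State], State]) -> Op:
    """Assignment x := f(x)."""
    return lambda s: (frozenset({f(s)}), False)


def seq(first: Op, second: Op) -> Op:
    """Sequential composition: run second from every final state of first."""
    def run(s: State):
        mids, div = first(s)
        finals = frozenset()
        for m in mids:
            out, d = second(m)
            finals |= out
            div = div or d
        return (finals, div)
    return run


def gcs_same_space(op: Op, inv) -> Op:
    """Assumed simplified GCS when operation and invariant share all variables."""
    def run(s: State):
        finals, div = op(s)
        if inv(s):
            finals = frozenset(t for t in finals if inv(t))
        return (finals, div)
    return run


A = 3
s1 = assign(lambda x: x - A)
s2 = assign(lambda x: x + A)
direct = gcs_same_space(seq(s1, s2), non_negative)
componentwise = seq(gcs_same_space(s1, non_negative), gcs_same_space(s2, non_negative))
print(direct(1), componentwise(1))
```

Started in the state 1, the sequence as a whole returns to 1 and is harmless, but the componentwise version has no execution because the intermediate value would be negative. Outside the range from 0 to `A - 1` both versions agree.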

Example 5.7. We have seen in Example 5.6 that GCSs may sometimes just arise from adding<br />

preconditions. Now let $X$ and $I$ be the same, but take<br />

\[ S = S_1 ; S_2 = x := x - a \; ; \; x := x + a \]<br />

for some integer $a \geq 0$. Clearly, $S$ is semantically equivalent to $skip$. As we have seen above,<br />

we obtain $wp(S_I)(true) \Leftrightarrow true$. However, if we replace $S_1$ and $S_2$ by their GCSs, i.e.<br />

3 Formally, this also follows from Lemma 5.27 in Appendix 5.7.<br />



(S 1 ) I = (x


(ii) For all states $\sigma$ with $\models_\sigma I$ we have, if<br />

\[ P_\sigma \Rightarrow \{x_1/x'_1, \dots, x_l/x'_l\}.(\forall \eta_i (i = 1, \dots, m).\{y_1/\eta_1, \dots, y_m/\eta_m\}.\neg I) \]<br />

is a $\delta$-constraint for $S_1^+$, then it is also a $\delta$-constraint for $S_1^+ ; S_2$.<br />

In both cases the evaluation order in the substitutions is not important.<br />

Example 5.8. Let us continue Example 5.7 with $X = \{x :: \mathbb{Z}\}$, $I \equiv x \geq 0$, $S_1 = x := x - a$,<br />

$S_2 = x := x + a$ and $S = S_1 ; S_2$ for some integer $a \geq 0$. In this example $S_1$ is deterministic<br />

and hence its only deterministic branch.<br />

(i) Take a characterizing predicate $P_\sigma \equiv x = b$ for some $b :: \mathbb{Z}$. Then $\models_\sigma \neg I$ holds iff<br />

b


Example 5.9. Now take $X = \{x, y\}$ with data types $T_x = T_y$ being the set of finite subsets<br />

of some set $T$. Let $I \equiv x \subseteq y$ and $S(a, b :: T) = S_1 ; S_2$ with $S_1 = y := y - \{a\}$ and<br />

$S_2 = y := y \cup \{b\}$. Then $S_1$ and $S_2$ are $\{y\}$-operations, and $S_1$ is deterministic.<br />

(i) Regard the $\delta$-constraints of $S_1$ of the form $P_\sigma \Rightarrow (\forall x_1 :: T_x . x_1 \subseteq y')$ with $P_\sigma \equiv y = y'$<br />

as required in Definition 5.14. We have<br />

\[ \{y'/y\}.wlp(\{y/y'\}.S_1)(P_\sigma \Rightarrow (\forall x_1 :: T_x . x_1 \subseteq y')) \Leftrightarrow \]<br />

\[ \forall x_1 :: T_x . x_1 \subseteq y' - \{a\} \Leftrightarrow \]<br />

\[ false . \]<br />

Hence there is no such constraint and consequently the conjunction of these constraints<br />

is equivalent to $true$, which is trivially a $\delta$-constraint for $S$.<br />

(ii) Then regard the $\delta$-constraints of $S_1$ of the form $P_\sigma \Rightarrow (\forall x_1 :: T_x . x_1 \not\subseteq y')$ with $P_\sigma \equiv y =<br />

y'$ as required in Definition 5.14. We have<br />

\[ \{y'/y\}.wlp(\{y/y'\}.S_1)(P_\sigma \Rightarrow (\forall x_1 :: T_x . x_1 \not\subseteq y')) \Leftrightarrow \]<br />

\[ \neg \exists x_1 :: T_x . x_1 \subseteq y' - \{a\} \Leftrightarrow \]<br />

\[ false . \]<br />

Again the conjunction of these constraints is equivalent to $true$, which is trivially a<br />

$\delta$-constraint for $S$.<br />

Hence, in this case $S$ is $\delta$-$I$-reduced.<br />

□<br />

Note the fundamental difference between these two examples. In both cases the free variables<br />

in the invariant $I$ comprise all variables of $X$. In Example 5.8 the operation $S_1$ is an $X$-operation<br />

and $S$ was not $\delta$-$I$-reduced, whereas in Example 5.9 $S_1$ is a $Y_1$-operation for a<br />

proper subset $Y_1$ of $X$ and $S$ is $\delta$-$I$-reduced. The following lemma shows that this observation<br />

can be generalized.<br />

Lemma 5.15. Let the notations be as in Definition 5.14. Suppose that $S = S_1 ; S_2$ is not<br />

$\delta$-$I$-reduced. Then we have $Y_1 = X$.<br />

Proof. Without loss of generality we may assume that $S_1$ is deterministic.<br />

Let $P_\sigma \Rightarrow \{x_1/x'_1, \dots, x_l/x'_l\}.(\forall \eta_i (i = 1, \dots, m).\{y_1/\eta_1, \dots, y_m/\eta_m\}.K)$ be a $\delta$-constraint<br />

for $S_1$, where $K$ is either $I$ or $\neg I$. Furthermore, let $P_\sigma \equiv x_1 = d_1 \wedge \dots \wedge x_n = d_n$ with $n = l + m$<br />

and $y_1 = x_{l+1}, \dots, y_m = x_n$ and assume $\models_\sigma \neg K$. Then we get<br />

\[ \{x'_1/x_1, \dots, x'_l/x_l\}.wlp(S'_1)(P_\sigma \Rightarrow \{x_1/x'_1, \dots, x_l/x'_l\}.(\forall \eta_i (i = 1, \dots, m).\{y_1/\eta_1, \dots, y_m/\eta_m\}.K)) \Leftrightarrow \]<br />

\[ \forall \eta_i (i = 1, \dots, m).(\{x'_1/x_1, \dots, x'_l/x_l\}.wlp(S'_1)(P_\sigma \Rightarrow \{x_1/x'_1, \dots, x_l/x'_l\}.\{y_1/\eta_1, \dots, y_m/\eta_m\}.K)) . \]<br />

For $Y_1 \neq X$ we have $m \neq 0$ and at least one $\eta_i$ will be bound in this formula. Since $\models_\sigma \neg K$<br />

holds, this formula can never be satisfied. Hence the conjunction of $\delta$-constraints for $S_1$ of the<br />

given form is $true$, which is trivially a $\delta$-constraint for $S$. Hence $S$ is $\delta$-$I$-reduced. □<br />



Finally, we may extend Definition 5.14 to arbitrary operations by requiring all occurring sequences<br />

to be $\delta$-$I$-reduced.<br />

Definition 5.16. Let $S$ be an $X$-operation and $I$ some $Y$-invariant with $X \subseteq Y$. $S$ is called<br />

$I$-reduced iff the following holds:<br />

(i) If $S$ is one of $fail$, $skip$, $loop$ or an assignment, then $S$ is always $I$-reduced.<br />

(ii) If $S = S_1 ; S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ are $I$-reduced and $S$ is $\delta$-$I$-reduced.<br />

(iii) If $S$ is one of $P \to T$, $@\, y :: T_y \bullet T$, $S_1 \Box S_2$ or $S_1 \boxtimes S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$<br />

or $T$ respectively are $I$-reduced.<br />

(iv) If $S = \mu T.f(T)$, then $S$ is $I$-reduced iff $f^\alpha(loop)$ is $I$-reduced for each ordinal number $\alpha$.<br />

5.4.2 An Upper Bound for GCSs<br />

Now we are prepared for our first goal. We prove that the GCS $S_I$ of an $I$-reduced operation<br />

$S$ specializes $S'_I$, which is built by replacing each primitive operation in $S$ by its GCS. As<br />

announced, the proof will be done by structural induction on $S$ using the constructors in<br />

Definition 5.2.<br />

Proposition 5.17. Let $S' = P \to S$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If<br />

$T \sqsubseteq S'$ is consistent with respect to $I$, then we have $T \sqsubseteq P \to S_I$.<br />

Proof. Since $T \sqsubseteq S' \sqsubseteq S$ holds and $T$ is consistent with respect to $I$, we conclude $T \sqsubseteq S_I$<br />

from Definition 5.10. In addition we have $\neg P \Rightarrow wp(S')(false) \Rightarrow wp(T)(false)$, hence<br />

$T \sqsubseteq P \to S_I$ follows.<br />

□<br />

Proposition 5.18. Let $S = S_1 \Box S_2$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If<br />

$T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I \Box (S_2)_I$.<br />

Proof. $T$ is semantically equivalent to $T' \Box Q \to loop$ with $wp(T')(true) \Leftrightarrow true$, $wlp(T')(R) \Leftrightarrow<br />

wlp(T)(R)$ for all $R$ and $Q \Leftrightarrow wp(T)^*(false)$. Then $Q \to loop \sqsubseteq S$ implies<br />

\[ Q \to loop = (Q_1 \to loop) \;\Box\; (Q_2 \to loop) \]<br />

with $Q_i \to loop \sqsubseteq S_i$ for $i = 1, 2$. If we show $T' \sqsubseteq (S_1)_I \Box (S_2)_I$, then also<br />

\[ T \sqsubseteq \underbrace{(S_1)_I \Box (Q_1 \to loop)}_{(S_1)'_I} \;\Box\; \underbrace{(S_2)_I \Box (Q_2 \to loop)}_{(S_2)'_I} \]<br />

holds, but it is easy to see that $(S_i)'_I \sqsubseteq (S_i)_I$ holds for $i = 1, 2$. Hence the result.<br />

From now on we may therefore assume without loss of generality that $wp(T)(true) \Leftrightarrow true$<br />

holds. Then for any state $\sigma$ define $T^\sigma = T ; (P_\sigma \to skip)$ with a characterizing predicate $P_\sigma$<br />

of the state $\sigma$. Clearly, $T^\sigma$ is a deterministic specialization of $T$, since<br />

\[ wlp(T^\sigma)^*(P_\tau) \Leftrightarrow wlp(T)^*(P_\sigma \wedge P_\tau) \Leftrightarrow \begin{cases} wlp(T)^*(P_\sigma) & \text{for } \tau = \sigma \\ false & \text{else} \end{cases} \;\Rightarrow\; wlp(T)^*(P_\tau) . \]<br />

Since $wp(T^\sigma)(P_\sigma) \Leftrightarrow wp(T^\sigma)(true)$, we may also derive the stated determinism from this<br />

computation. Analogously we have<br />

\[ wp(T^\sigma)^*(P_\tau) \Leftrightarrow wp(T)^*(P_\sigma \wedge P_\tau) \Leftrightarrow \begin{cases} wp(T)^*(P_\sigma) & \text{for } \tau = \sigma \\ wp(T)^*(false) & \text{else} \end{cases} \;\Rightarrow\; wp(T)^*(P_\tau) , \]<br />

since predicate transformers are monotonic. Consequently $T^\sigma$ is also consistent with respect<br />

to $I$.<br />

In addition the determinism implies that $T^\sigma$ is semantically equivalent to $T^\sigma_1 \Box T^\sigma_2$ with<br />

$T^\sigma_i \sqsubseteq S_i$ for $i = 1, 2$. More precisely we have $T^\sigma_i = P^\sigma_i \to T^\sigma$ with<br />

\[ P^\sigma_i \equiv \{z/y\}.wlp(\{y/z\}.T^\sigma)(\{y/z\}.P_\sigma \Rightarrow wlp(S_i)^*(z = y)) . \]<br />

From Proposition 5.7 we derive<br />

\[ P^\sigma_1 \vee P^\sigma_2 \Leftrightarrow \{z/y\}.wlp(\{y/z\}.T^\sigma)(P_\sigma \Rightarrow wlp(S)^*(z = y)) \overset{(5.49)}{\Leftrightarrow} true , \]<br />

since $T^\sigma \sqsubseteq S$ holds, hence $T^\sigma = (P^\sigma_1 \vee P^\sigma_2) \to T^\sigma = P^\sigma_1 \to T^\sigma \;\Box\; P^\sigma_2 \to T^\sigma$. Then it follows<br />

from Definition 5.10 that $T^\sigma_i \sqsubseteq (S_i)_I$ holds, hence also $T^\sigma \sqsubseteq (S_1)_I \Box (S_2)_I$.<br />

Finally, the least upper bound of all $T^\sigma$ with respect to $\sqsubseteq$ must specialize $(S_1)_I \Box (S_2)_I$,<br />

but this least upper bound is semantically equivalent to $T$, which completes the proof. □<br />

Unbounded choice can be handled analogously to choice.<br />

Proposition 5.19. Let $S' = @\, y :: T_y \bullet S$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$.<br />

If $T \sqsubseteq S'$ is consistent with respect to $I$, then we have $T \sqsubseteq @\, y :: T_y \bullet S_I$.<br />

□<br />

Proposition 5.20. Let $S = S_1 \boxtimes S_2$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If<br />

$T \sqsubseteq S$ is consistent with respect to $I$, then we have<br />

\[ T \sqsubseteq (S_1)_I \;\Box\; wp(S_1)(false) \to (S_2)_I \sqsubseteq (S_1)_I \boxtimes (S_2)_I . \]<br />

Moreover, we have $T \sqsubseteq (S_1)_I \;\Box\; wlp(S_1)(false) \to (S_2)_I$.<br />

Proof. Define $T_1 = wp(S_1)^*(true) \to T$ and $T_2 = wp(S_1)(false) \to T$. Since $wp(S_1)^*(true) \vee<br />

wp(S_1)(false) \Leftrightarrow true$, we certainly have $T = T_1 \Box T_2$. Moreover, $T_1 \sqsubseteq S_1$ and $T_2 \sqsubseteq<br />

wp(S_1)(false) \to S_2$ obviously hold.<br />

Since $T_1$ and $T_2$ are consistent with respect to $I$, it follows that $T_1 \sqsubseteq (S_1)_I$ and $T_2 \sqsubseteq<br />

wp(S_1)(false) \to (S_2)_I \sqsubseteq wp((S_1)_I)(false) \to (S_2)_I$, hence also<br />

\[ T_1 \Box T_2 \sqsubseteq (S_1)_I \;\Box\; wp(S_1)(false) \to (S_2)_I \]<br />

\[ \sqsubseteq (S_1)_I \;\Box\; wp((S_1)_I)(false) \to (S_2)_I = (S_1)_I \boxtimes (S_2)_I , \]<br />

which proves the first result. Since $wp(S_1)(false) \Rightarrow wlp(S_1)(false)$ holds, the second result<br />

is obvious.<br />

□<br />

The most difficult part concerns sequences. In this case the proof is rather lengthy and requires<br />

several lemmata concerning a specific form of GCSs and a very technical result on $\delta$-$I$-reducedness.<br />

Therefore the proof is shifted to Appendix 5.7.<br />

Proposition 5.21. Let $S = S_1 ; S_2$ be an $I$-reduced $Y$-operation and $I$ an $X$-invariant with<br />

$Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I ; (S_2)_I$. □<br />



The remaining case is given by an $I$-reduced recursive operation $\mu S.f(S)$. For this we use<br />

induction on ordinal numbers. The main difficulty is to bring together two different partial<br />

orders, the specialization order $\sqsubseteq$ of Definition 5.6 and the Nelson-order in (5.35). The specialization<br />

order is fundamental for GCSs, whereas the Nelson-order is required for recursion.<br />

For recursive guarded commands the monotonicity of all operation constructors with respect<br />

to the Nelson-order is fundamental [19]. Unfortunately, a similar result does not hold<br />

for the specialization order. More precisely, the result is false for the $\boxtimes$-constructor in its first<br />

component.<br />

Lemma 5.22. Let $f(S)$ be a guarded command expression built from the constructors in<br />

Definition 5.2 except restricted choice $\boxtimes$. Then $f$ is monotonic with respect to the specialization<br />

order $\sqsubseteq$.<br />

Proof. The proof is done by structural induction. For each constructor it is completely<br />

analogous to the corresponding proof for the Nelson-order in [19]. We omit the details. □<br />

In Proposition 5.20 we have seen that $S_I$ may contain the choice-constructor $\Box$ instead of the<br />

restricted choice, provided we include some guard. Replacing within a recursive operation<br />

some $S_1 \boxtimes S_2$ by $(S_1)_I \boxtimes (S_2)_I$ would destroy the required result.<br />

The next lemma follows from taking together Propositions 5.17 to 5.21.<br />

Lemma 5.23. Let $T$ be a consistent specialization of some $I$-reduced $f(S')$ with respect to $I$,<br />

where $f(S)$ is an expression built from the constructors in Definition 5.2. Construct $f_I(S)$<br />

from $f(S)$ as follows:<br />

(i) Each restricted choice $S_1 \boxtimes S_2$ occurring within $f(S)$ will be replaced by<br />

$S_1 \;\Box\; wlp(S_1)(false) \to S_2$.<br />

(ii) Then each basic operation, i.e. $skip$ and assignments $x := E(x)$, will be replaced by its<br />

GCS with respect to $I$.<br />

Then we have $T \sqsubseteq f_I(S'_I)$.<br />

□<br />

Proposition 5.24. Let $S' = \mu S.f(S)$ be an $I$-reduced $Y$-operation and $T \sqsubseteq S'$ be a consistent<br />

specialization with respect to some $X$-invariant $I$ with $Y \subseteq X$. Then we also have $T \sqsubseteq<br />

\mu S.f_I(S)$, where $f_I(S)$ is built as in Lemma 5.23.<br />

□<br />

Again, the proof is rather lengthy and requires additional lemmata. Hence it is shifted to<br />

Appendix 5.8. We may now summarize the results achieved so far, which gives the announced<br />

upper bound theorem.<br />

Theorem 5.25. Let $S$ be some $I$-reduced $Y$-operation and $I$ some $X$-invariant with $Y \subseteq X$.<br />

Let $S'_I$ result from $S$ as follows:<br />

(i) Each restricted choice $S_1 \boxtimes S_2$ occurring within $S$ will be replaced by $S_1 \;\Box\; wlp(S_1)(false) \to S_2$.<br />

(ii) Then each basic operation, i.e. $loop$, $fail$, $skip$ and assignments $x := E(x)$, will be replaced<br />

by its GCS with respect to $I$.<br />

Then $T \sqsubseteq S'_I$ holds for each consistent specialization $T \sqsubseteq S$ with respect to $I$.<br />

□<br />



5.4.3 The General Form of GCSs<br />

Now we are prepared to state the main result on the general form of GCSs. Informally, the<br />

GCS of an $I$-reduced operation $S$ results in two steps. First we have to remove all restricted<br />

choices and to replace basic update operations by their GCSs. Then we have seen that the<br />

GCS $S_I$ specializes the resulting $S'_I$. Now add a precondition $P(S'_I)$ that "filters" only those<br />

computations of $S'_I$ that specialize the original $S$. This precondition corresponds to the normal<br />

form of the specialization proof obligation in (5.49).<br />

Theorem 5.26. Let $I$ be an $X$-invariant and $S$ some $I$-reduced $Y$-operation with $Y \subseteq X =<br />

\{x_1, \dots, x_n\}$. Let $S'_I$ result from $S$ by first replacing each restricted choice $S_1 \boxtimes S_2$ by<br />

$S_1 \;\Box\; (wlp(S_1)(false) \to S_2)$ and then each basic assignment operation by its GCS with respect<br />

to $I$. For a disjoint copy $\{z_1, \dots, z_n\}$ of $X$ define<br />

\[ P(S'_I) \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n)) , \]<br />

where $T$ results from $S'_I$ by renaming all $x_i$ to $z_i$. Then the GCS of $S$ with respect to $I$ can<br />

be written in the form $S_I = P(S'_I) \to S'_I$.<br />

Proof.<br />

Let $R$ be an arbitrary $X$-invariant. Then we have<br />

\[ wlp(S_I)^*(R) \Leftrightarrow P(S'_I) \wedge wlp(S'_I)^*(R) \overset{(5.49)}{\Rightarrow} wlp(S)^*(R) \]<br />

and the wp-condition can be proven analogously, which implies that $S_I$ as given in the theorem<br />

is a specialization of $S$. Since $S'_I$ is consistent with respect to $I$, the same holds for $S_I$. Hence<br />

the given operation $S_I$ is indeed a consistent specialization.<br />

It remains to show that it is already the GCS. Let $T \sqsubseteq S$ be some arbitrary consistent<br />

specialization and assume without loss of generality that $wp(T)(true) \Leftrightarrow true$ holds. From<br />

Theorem 5.25 we know that $T \sqsubseteq S'_I$ holds. Hence the result follows from $wp(T)^*(true) \Rightarrow<br />

P(S'_I)$.<br />

If $\models_\sigma \neg P(S'_I)$ holds, we conclude from Proposition 5.7 that there exists some state $\sigma'$<br />

with<br />

\[ \models_\sigma wlp(S)^*(P_{\sigma'}) \wedge \neg wlp(S'_I)(P_{\sigma'}) , \]<br />

hence also $\models_\sigma wlp(S)^*(P_{\sigma'}) \wedge \neg wlp(T)(P_{\sigma'})$, since $T \sqsubseteq S'_I$ holds. But since $T \sqsubseteq S$ is<br />

assumed, we must have $\models_\sigma \neg wp(T)^*(true)$, which completes the proof.<br />

□<br />
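The two steps of the theorem, and the reason why restricted choice has to be removed first, can be tried out in a simplified executable model. The Python sketch below rests on assumptions that are not in the paper: a single integer variable, the invariant $x \geq 0$, operations returning their final states plus a divergence flag, and a same-state-space closed form `gcs_basic` for the GCS of an assignment. The names `S1`, `S2`, `naive` and `precondition` are illustrative.

```python
from typing import Callable, FrozenSet, Tuple

State = int
# Assumed model: initial state -> (final states, may diverge).
Op = Callable[[State], Tuple[FrozenSet[State], bool]]
FAIL = (frozenset(), False)


def non_negative(x: State) -> bool:
    return x >= 0


def assign(f) -> Op:
    return lambda s: (frozenset({f(s)}), False)


def choice(a: Op, b: Op) -> Op:
    """Choice: all runs of a and all runs of b."""
    return lambda s: (a(s)[0] | b(s)[0], a(s)[1] or b(s)[1])


def guard(pred, op: Op) -> Op:
    """pred -> op: no run at all where pred is false."""
    return lambda s: op(s) if pred(s) else FAIL


def restricted_choice(a: Op, b: Op) -> Op:
    """Restricted choice: b is only used where a has no run at all."""
    return choice(a, guard(lambda s: a(s) == FAIL, b))


def gcs_basic(op: Op, inv) -> Op:
    """Assumed same-state-space GCS of a basic operation."""
    def run(s: State):
        finals, div = op(s)
        return (frozenset(t for t in finals if inv(t)) if inv(s) else finals, div)
    return run


def precondition(candidate: Op, original: Op):
    """Reading of P(S'_I) in this model: every run of the candidate is a run of S."""
    return lambda s: candidate(s)[0] <= original(s)[0]


S1 = assign(lambda x: x - 1)
S2 = assign(lambda x: x + 5)
S = restricted_choice(S1, S2)

# Step 1: remove the restricted choice (guard wlp(S1)(false)), then replace basics.
S_prime = choice(gcs_basic(S1, non_negative),
                 guard(lambda s: not S1(s)[0], gcs_basic(S2, non_negative)))
# Step 2: add the filtering precondition.
S_I = guard(precondition(S_prime, S), S_prime)

# Naive variant that keeps the restricted choice around the component GCSs.
naive = restricted_choice(gcs_basic(S1, non_negative), gcs_basic(S2, non_negative))
print(S(0), S_I(0), naive(0))
```

Started in 0, the original operation can only produce -1, so a consistent specialization must have no run there. The naive variant instead falls through to the second branch and produces 5, which is not a run of the original operation, whereas the guarded form stays within it.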

Let us finally come back to our starting point and look at practical applications. In general,<br />

GCSs are non-deterministic, which reflects various strategies for consistency enforcement.<br />

In most practical cases, however, we are only interested in one of these strategies, i.e., we<br />

usually select a deterministic or quasi-deterministic branch of the GCS. The selection of<br />

quasi-deterministic branches is related to interactive support for the values to be selected.<br />

Consider the special case where we deal with finite sets as in Examples 5.1 and 5.3. One<br />

strategy would be to change values as little as possible, i.e. the symmetric difference between<br />

the old and the new values should be minimized. According to our general result on GCS<br />

construction a reasonable result can be achieved if we already choose such quasi-deterministic<br />

branches for the GCSs of the involved basic operations. In particular, we only have to take<br />

care of assignments. We demonstrate this approach by a final example.<br />



Example 5.10. Let us continue Example 5.3, which comprises the essentials of the application<br />

example 5.1. The following calculations will justify the informal approach in Example<br />

5.2. Let the notations be as in Example 5.3. Then we consider the $\{x_1\}$-operation<br />

$S(a, b :: \mathbb{Z}) = x_1 := x_1 \cup \{(a, b)\}$. Proposition 5.12 allows us to build the GCS successively. Let<br />

us take the invariants in the order given in Example 5.3.<br />

Step 1. First consider the inclusion invariant $I_1$. Since $S$ is just an assignment, it is $I_1$-reduced.<br />

We then replace $S$ by a branch of its GCS with respect to $I_1$ and obtain<br />

\[ S'_I(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\} ; ( a \notin map(\pi_1)(x_2) \to @\, c :: INT \bullet x_2 := x_2 \cup \{(a, c)\} \;\boxtimes\; skip ) , \qquad (5.54) \]<br />

which is an $X$-operation with $P(S'_I) \Leftrightarrow true$. Then we redefine $S$ by (5.54).<br />

Step 2. Now consider the invariant $I_2$. Since $S$ is a sequence $S_1 ; S_2$ with an $\{x_1\}$-operation<br />

$S_1$, the $I_2$-reducedness follows from Lemma 5.15.<br />

We have to remove the restricted choice and then replace the basic assignment to $x_2$ by<br />

the following GCS branch with respect to $I_2$:<br />

\[ ( a \notin map(\pi_1)(x_2) \to c \notin map(\pi_2)(x_2) \to x_2 := x_2 \cup \{(a, c)\} ) \;\Box\; ( a \in map(\pi_1)(x_2) \to skip ) \]<br />

For the resulting operation $S'_I$ we compute $P(S'_I) \Leftrightarrow true$. After some rearrangements we<br />

obtain the following GCS branch with respect to $I_1 \wedge I_2$:<br />

\[ x_1 := x_1 \cup \{(a, b)\} ; (( a \notin map(\pi_1)(x_2) \to @\, c :: INT \bullet c \notin map(\pi_2)(x_2) \to x_2 := x_2 \cup \{(a, c)\} ) \;\Box\; a \in map(\pi_1)(x_2) \to skip ) . \qquad (5.55) \]<br />

Then we redefine the body of $S$ by (5.55).<br />

Step 3. Now regard the exclusion invariant $I_3$. Again $I_3$-reducedness follows from Lemma<br />

5.15. Replace $S_1 = x_1 := x_1 \cup \{(a, b)\}$ in $S$ by the GCS branch<br />

\[ x_1 := x_1 \cup \{(a, b)\} ; x_2 := x_2 - \{x \in x_2 \mid \pi_2(x) = b\} \]<br />

and analogously replace $S_2 = x_2 := x_2 \cup \{(a, c)\}$ by the GCS branch<br />

\[ x_2 := x_2 \cup \{(a, c)\} ; x_1 := x_1 - \{x \in x_1 \mid \pi_2(x) = c\} . \]<br />

Then we compute<br />

\[ P(S'_I) \Leftrightarrow b \notin map(\pi_2)(x_2) \wedge (a \notin map(\pi_1)(x_2) \Rightarrow \forall c :: INT. (c \notin map(\pi_2)(x_2) \Rightarrow c \notin map(\pi_2)(x_1) \cup \{b\}) ) , \]<br />

hence after some rearrangements the final result is<br />

\[ S_I(a, b :: INT) = b \notin map(\pi_2)(x_2) \to x_1 := x_1 \cup \{(a, b)\} ; (( a \notin map(\pi_1)(x_2) \to @\, c :: INT \bullet c \notin map(\pi_2)(x_2) \wedge c \notin map(\pi_2)(x_1) \to x_2 := x_2 \cup \{(a, c)\} ) \;\Box\; a \in map(\pi_1)(x_2) \to skip ) . \]<br />

Note that this result reflects exactly the informal considerations in Example 5.2.<br />

□<br />
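The final operation can be read as a small non-deterministic program. The Python sketch below is an illustration with several assumptions: the invariants of Example 5.3 are not restated in this section, so the sketch assumes that the inclusion invariant $I_1$ means that every first component occurring in $x_1$ also occurs in $x_2$, and that the exclusion invariant $I_3$ means that $x_1$ and $x_2$ share no second component. The functional invariant $I_2$ is not modelled, and a finite range of integers stands in for INT. The function name `insert_consistently` is hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Pair = Tuple[int, int]
Rel = FrozenSet[Pair]


def firsts(rel: Iterable[Pair]) -> Set[int]:
    """map(pi_1): the set of first components."""
    return {p[0] for p in rel}


def seconds(rel: Iterable[Pair]) -> Set[int]:
    """map(pi_2): the set of second components."""
    return {p[1] for p in rel}


def insert_consistently(x1: Rel, x2: Rel, a: int, b: int,
                        ints: Iterable[int]) -> List[Tuple[Rel, Rel]]:
    """All final states (x1, x2) of the final S_I(a, b); [] if the guard blocks."""
    if b in seconds(x2):                 # guard: b not in map(pi_2)(x2)
        return []
    x1 = x1 | {(a, b)}                   # x1 := x1 united with {(a, b)}
    if a in firsts(x2):                  # a in map(pi_1)(x2) -> skip
        return [(x1, x2)]
    # Otherwise pick any c that is fresh as a second component in x2 and x1.
    return [(x1, x2 | {(a, c)})
            for c in ints
            if c not in seconds(x2) and c not in seconds(x1)]


def inclusion(x1: Rel, x2: Rel) -> bool:
    """Assumed reading of I_1."""
    return firsts(x1) <= firsts(x2)


def exclusion(x1: Rel, x2: Rel) -> bool:
    """Assumed reading of I_3."""
    return not (seconds(x1) & seconds(x2))


x1 = frozenset({(1, 10)})
x2 = frozenset({(1, 20)})
outcomes = insert_consistently(x1, x2, 2, 11, range(10, 25))
print(len(outcomes), all(inclusion(p, q) and exclusion(p, q) for p, q in outcomes))
```

Each element of the result corresponds to one quasi-deterministic branch, i.e. one choice of the value `c`. An interactive system would let the user pick it.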



5.5 Conclusion

The work reported in this paper deals with consistency enforcement in formal specifications. The approach formalizes the problem by greatest consistent specializations (GCSs). Under the technical prerequisite of reducedness, the computation of such a GCS can be traced back to the definition of GCSs for basic update operations. It is possible to replace basic operations within a complex operation specification by their GCSs and to compute a specialization precondition. This result is a step towards a general and theoretically founded solution of consistency enforcement. However, a series of open problems is left for future research.

– The notion of a GCS can also be defined for transition constraints. As for static constraints, existence, uniqueness and compatibility results are already known [21]. The problem is to extend the results on GCS construction as well.

– The result on the construction of GCSs allows the problem of consistency enforcement to be reduced to basic operations, i.e. assignments, and simple constraints that are combined by conjunction. The remaining problem is to find GCSs for basic operations, which has to be done case by case for selected classes of constraints (cf. [24, 21]).

– GCS construction depends on checking for I-reducedness. This is equivalent to showing that certain first-order formulae derived from I are tautologies. Whilst this is undecidable in general, the problem is to characterize those invariants I for which I-reducedness is decidable.

– Even if we are able to decide I-reducedness, the problem is how to proceed in the case of non-reduced operations. It is not very satisfactory to break off with no result. The question is to find a reduction algorithm or at least to find conditions under which such an algorithm could exist. For inclusion, exclusion, functional and cardinality constraints in data-intensive applications such reductions have been worked out in [24].

– By using axiomatic semantics, GCSs are only determined up to semantic equivalence. The construction of a GCS, however, will result in a concrete syntactic form using typed guarded commands. An operational interpretation of this form may involve backtracking [19] and may be totally inefficient. Hence optimization may be required.

All these problems are rather technical in nature. The hardest problem, however, is concerned with the selection of the specialization order. As the use of the main result in practice demonstrates, this order might still be too coarse for enforcement purposes. On the other hand, multi-valued dependencies, which concern only one set-valued state variable, lead to preconditions, although we might expect additional changes instead [21].

One possible solution might be to choose an order based on δ-constraints, but then the problem leads back to GCSs because of the relation between transition consistency and specialization. Closing the remaining gaps in this piece of work is a matter of current research.



Appendix

5.6 A Normal Form for the Specialization Proof Obligation

In this appendix we only give a proof of Proposition 5.7, which is first repeated here.

Proposition 5.7 Let $S$ and $T$ be operations on a state space $X = \{x_1, \ldots, x_n\}$. Let $Z = \{z_1, \ldots, z_n\}$ be disjoint from $X$ with $T_{x_i} = T_{z_i}$. Then $\mathrm{wlp}(S)(R) \Rightarrow \mathrm{wlp}(T)(R)$ holds for all $X$-invariants $R$ iff

$$\{z_1/x_1, \ldots, z_n/x_n\}.\mathrm{wlp}(T')(\mathrm{wlp}(S)^*(x_1 = z_1 \wedge \ldots \wedge x_n = z_n)) \qquad (5.56)$$

holds, where $T'$ results from $T$ by renaming all the variables $x_i$ to $z_i$.

Proof. In [19] it has been shown that we may always write $\mathrm{wlp}(T')(R)$ in the form

$$\forall z'_1, \ldots, z'_n .\, \big( \mathrm{wlp}(T')^*(z_1 = z'_1 \wedge \ldots \wedge z_n = z'_n) \Rightarrow \{z_1/z'_1, \ldots, z_n/z'_n\}.R \big) .$$

In particular, (5.56) is equivalent to

$$\forall z'_1, \ldots, z'_n .\, \big( \mathrm{wlp}(T')^*(z_1 = z'_1 \wedge \ldots \wedge z_n = z'_n) \Rightarrow \{z_1/z'_1, \ldots, z_n/z'_n\}.\mathrm{wlp}(S)^*(x_1 = z_1 \wedge \ldots \wedge x_n = z_n) \big) .$$

Since $S$ is an $X$-operation, we conclude

$$\{z_1/z'_1, \ldots, z_n/z'_n\}.\mathrm{wlp}(S)^*(x_1 = z_1 \wedge \ldots \wedge x_n = z_n) \Leftrightarrow \mathrm{wlp}(S)^*(x_1 = z'_1 \wedge \ldots \wedge x_n = z'_n) .$$

Now assume $\mathrm{wlp}(S)(R) \Rightarrow \mathrm{wlp}(T)(R)$ for all $X$-predicates $R$. Then also $\mathrm{wlp}(S')(R) \Rightarrow \mathrm{wlp}(T')(R)$ holds for all $Z$-predicates $R$, where $S'$ results from $S$ by renaming all $x_i$ to $z_i$.

In particular, we maytake R as z 1 = (d 1 )^:::^z n = (d n ) for arbitrary constants d i 2 D<br />

and a selector function , which assigns closed terms (d) tosemantic constants d 2 D such<br />

that ! T = id D holds.<br />

But then wlp(S 0 )(R) can be rewritten as<br />

Hence<br />

fx 1 =z 1 ::: x n =z n g:wlp(S) (x 1 = (d 1 ) ^ :::^ x n = (d n ))<br />

$$\forall z'_1, \ldots, z'_n .\, \big( \mathrm{wlp}(T')^*(z_1 = z'_1 \wedge \ldots \wedge z_n = z'_n) \Rightarrow \{x_1/z_1, \ldots, x_n/z_n\}.\mathrm{wlp}(S)^*(x_1 = z'_1 \wedge \ldots \wedge x_n = z'_n) \big) \qquad (5.57)$$

holds, which implies (5.56).

Conversely, (5.56) implies (5.57). For an arbitrary $X$-predicate $R$ we may then write $\mathrm{wlp}(T)^*(R)$ as

$$\exists z'_1, \ldots, z'_n .\, \big( \{z_1/x_1, \ldots, z_n/x_n\}.\mathrm{wlp}(T')^*(z_1 = z'_1 \wedge \ldots \wedge z_n = z'_n) \wedge \{x_1/z'_1, \ldots, x_n/z'_n\}.R \big)$$

and by (5.57) we conclude

$$\{z_1/x_1, \ldots, z_n/x_n\}.\big( \exists z'_1, \ldots, z'_n .\, ( \mathrm{wlp}(S)^*(x_1 = z'_1 \wedge \ldots \wedge x_n = z'_n) \wedge \{x_1/z'_1, \ldots, x_n/z'_n\}.R ) \big) ,$$

which is $\mathrm{wlp}(S)^*(R)$ by the normal form representation for $\mathrm{wlp}(S)$. Hence $\mathrm{wlp}(S)(R) \Rightarrow \mathrm{wlp}(T)(R)$ as required.

□
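For finite state spaces the statement of Proposition 5.7 can be checked by brute force. The sketch below is an illustration under assumptions, not part of the proof: an operation is modelled as a hypothetical dictionary from states to sets of successor states, a predicate as a set of states, and `wlp`, `wlp_conj`, `implies_for_all_predicates` and `normal_form_holds` are names introduced here. In this model, (5.56) says that every outcome of $T$ from a state is also a possible outcome of $S$ from that state.

```python
from itertools import chain, combinations

STATES = (0, 1, 2)


def wlp(op, post):
    """States all of whose op-successors satisfy post."""
    return {s for s in STATES if all(t in post for t in op.get(s, ()))}


def wlp_conj(op, post):
    """Conjugate wlp*: states with at least one op-successor in post."""
    return {s for s in STATES if any(t in post for t in op.get(s, ()))}


def all_predicates():
    """Every subset of STATES, i.e. every predicate of the finite model."""
    subsets = chain.from_iterable(
        combinations(STATES, r) for r in range(len(STATES) + 1))
    return [set(c) for c in subsets]


def implies_for_all_predicates(s_op, t_op):
    """wlp(S)(R) => wlp(T)(R) for every predicate R."""
    return all(wlp(s_op, r) <= wlp(t_op, r) for r in all_predicates())


def normal_form_holds(s_op, t_op):
    """Finite reading of (5.56): run T from x, call the result z,
    and require wlp*(S)(x = z) in the start state x."""
    return all(x in wlp_conj(s_op, {z})
               for x in STATES for z in t_op.get(x, ()))


s_op = {0: {1, 2}, 1: {1}, 2: set()}
t_sub = {0: {1}, 1: {1}}   # every T-outcome is an S-outcome
t_other = {0: {0}}         # T reaches a state that S cannot reach from 0
```

The single formula tested by `normal_form_holds` replaces the quantification over all predicates in `implies_for_all_predicates`, which is the point of the normal form.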



5.7 Proof of the Upper Bound Theorem for Sequences

In this appendix we give the proof of Proposition 5.21. For this we need two lemmata. The first of these shows a general syntactic form of GCSs based on unbounded choices. A similar result occurs if we exploit the equivalence to predicative specifications.

Lemma 5.27. Let $S$ be a $Y$-operation, $Y \subseteq X$ and $I$ an invariant on $X$. Then the greatest consistent specialization $S_I$ of $S$ with respect to $I$ is semantically equivalent to

$$(I \rightarrow S \;;\; @\,\zeta \bullet z := \zeta \;;\; I \rightarrow \mathit{skip}) \;\Box\; (\neg I \rightarrow S \;;\; @\,\zeta \bullet z := \zeta) \qquad (5.58)$$

where $z$ has been used as an abbreviation of the collection of state variables in $X - Y$ and $\zeta$ ranges over values of the corresponding types. Moreover, $S_I$ is uniquely determined (up to semantic equivalence) by $S$ and $I$.

Proof. We have to verify the conditions in Definition 5.10 for $S_I$ defined by (5.58). For an arbitrary $Y$-invariant $R$ we have

$$\begin{aligned}
\mathrm{wlp}(S_I)^*(R) &\Leftrightarrow (I \wedge \mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.(I \wedge R))) \vee (\neg I \wedge \mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.R)) \\
&\Leftrightarrow (I \wedge \mathrm{wlp}(S)^*((\exists \zeta.\{z/\zeta\}.I) \wedge R)) \vee (\neg I \wedge \mathrm{wlp}(S)^*(R)) \\
&\Rightarrow (I \wedge \mathrm{wlp}(S)^*(R)) \vee (\neg I \wedge \mathrm{wlp}(S)^*(R)) \\
&\Leftrightarrow \mathrm{wlp}(S)^*(R) .
\end{aligned}$$

Here we exploited the monotonicity of conjugate predicate transformers with respect to implication and the fact that the variables $z$ do not occur in $R$. Then the same calculation can be used with wlp replaced everywhere by wp, which shows (i). For the proof of (ii) we compute

$$\begin{aligned}
\mathrm{wlp}(S_I)(I) &\Leftrightarrow (I \Rightarrow \mathrm{wlp}(S)(\forall \zeta.\{z/\zeta\}.(I \Rightarrow I))) \wedge (\neg I \Rightarrow \mathrm{wlp}(S)(\forall \zeta.\{z/\zeta\}.I)) \\
&\Leftrightarrow (\neg I \Rightarrow \mathrm{wlp}(S)(\forall \zeta.\{z/\zeta\}.I)) ,
\end{aligned}$$

which implies $I \Rightarrow \mathrm{wlp}(S_I)(I)$ as required.

For the proof of (iii) let $P$ be a characterizing predicate and let $T \sqsubseteq S$ be an arbitrary consistent specialization of $S$. We distinguish two cases.

Case 1. Assume $P \Rightarrow \neg I$. Then we also have $\mathrm{wlp}(T)^*(P) \Rightarrow \mathrm{wlp}(T)^*(\neg I) \Rightarrow \neg I$, since $T$ is consistent with respect to $I$ and $\mathrm{wlp}(S)$ is monotonic. It follows

$$\mathrm{wlp}(T)^*(P) \Rightarrow \neg I \wedge \mathrm{wlp}(S)^*(P) \Rightarrow \neg I \wedge \mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.P) \Rightarrow \mathrm{wlp}(S_I)^*(P) .$$

The last implication follows from the calculation of $\mathrm{wlp}(S_I)$ in the proof of (i).

Case 2. Assume $P \Rightarrow I$. Then we have $\mathrm{wlp}(T)^*(P) \Leftrightarrow \mathrm{wlp}(T)^*(I \wedge P)$. Since $T \sqsubseteq S$ holds, the monotonicity of $\mathrm{wlp}(S)$ implies

$$\mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.(I \wedge P)) \wedge \mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.P) . \qquad (5.59)$$

In any case $(\mathrm{wlp}(T)^*(P) \wedge I) \vee (\mathrm{wlp}(T)^*(P) \Rightarrow \neg I)$ holds. Together with (5.59) we get

$$\begin{aligned}
\mathrm{wlp}(T)^*(P) &\Rightarrow (I \wedge \mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.(I \wedge P))) \vee (\neg I \wedge \mathrm{wlp}(S)^*(\exists \zeta.\{z/\zeta\}.P)) \\
&\Leftrightarrow \mathrm{wlp}(S_I)^*(P) .
\end{aligned}$$

The universal conjunctivity property then implies $\mathrm{wlp}(T)^*(R) \Rightarrow \mathrm{wlp}(S_I)^*(R)$ for all $R$. In addition, it is easy to see that $\mathrm{wp}(T)^*(\mathit{false}) \Rightarrow \mathrm{wp}(S_I)^*(\mathit{false})$ holds. Hence $T$ is a specialization of $S_I$.

□
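The syntactic form in Lemma 5.27 has a direct operational reading: run $S$, re-choose the state variables outside $Y$ arbitrarily, and, when started in a consistent state, keep only results that satisfy $I$. The following minimal Python sketch illustrates this under the assumption of finite value domains; states are dictionaries, `s_op` is a hypothetical function listing the successor states of $S$, and `gcs_outcomes`, `increment_y` and `y_equals_z` are names introduced only for this example.

```python
from itertools import product


def gcs_outcomes(state, s_op, invariant, z_vars, z_domain):
    """Enumerate the outcomes of S_I from `state`, following the two branches."""
    started_consistent = invariant(state)
    results = []
    for mid in s_op(state):                      # run S on the Y-variables
        for values in product(z_domain, repeat=len(z_vars)):
            out = dict(mid)
            out.update(zip(z_vars, values))      # z := arbitrarily chosen values
            # first branch only: the trailing guard I -> skip filters outcomes
            if started_consistent and not invariant(out):
                continue
            if out not in results:
                results.append(out)
    return results


# Toy instance (assumed for illustration): Y = {y}, X = {y, z}, I is y == z.
def increment_y(state):
    return [{**state, "y": state["y"] + 1}]


def y_equals_z(state):
    return state["y"] == state["z"]


from_consistent = gcs_outcomes({"y": 0, "z": 0}, increment_y, y_equals_z,
                               ["z"], range(3))
from_inconsistent = gcs_outcomes({"y": 0, "z": 2}, increment_y, y_equals_z,
                                 ["z"], range(3))
```

Started in a consistent state, the toy operation is forced to change `z` along with `y`; started in an inconsistent state, nothing is enforced and every value of `z` remains possible.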

The second technical lemma reformulates the properties of δ-$I$-reducedness for deterministic $S_1$.

Lemma 5.28. Let the notations be as <strong>in</strong> Denition 5.14. Assume that S is -I-reduced and<br />

S 1 is determ<strong>in</strong>istic. Then follow<strong>in</strong>g two conditions hold:<br />

(i) For all states and with j= :I , j= :I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ) we<br />

have, if<br />

P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)<br />

is a -constra<strong>in</strong>t for S 1 ,thenP )fx=x 0 g:8 Y ;X 1:fy=g:I) is a -constra<strong>in</strong>t for S.<br />

(ii) For all states and with j= I , j= I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ) we<br />

have, if<br />

P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) ):I)<br />

is a -constra<strong>in</strong>t for S 1 ,thenP )fx=x 0 g:8 Y ;X 1:fy=g::I) is a -constra<strong>in</strong>t for S.<br />

Pro<strong>of</strong>. Let and be states with j= :I , j= :I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ).<br />

Assume that<br />

P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) (5.60)<br />

is a -constra<strong>in</strong>t forS 1 . Then we have<br />

j= wlp(S 1 ) (wlp(S 2 ) (9 Y ;X 1;X2 :fy=g:P )) ) (s<strong>in</strong>ce S 1 is determ<strong>in</strong>istic)<br />

j= wlp(S 1 )(wlp(S 2 ) (9 Y ;X 1;X2 :fy=g:P )) )<br />

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P ) wlp(fx=x 0 g:S 2 ) (9 Y ;X 1;X2 :fy=g:fx=x0 g:P )) :(5.61)<br />

From (5.60) we have<br />

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) :<br />

Together with (5.61) this implies<br />

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:8 Y ;X 1 :fy=g:I) :<br />

Hence P ) fx=x 0 g:8 Y ;X 1 :fy=g:I is a -constra<strong>in</strong>t for S 1. S<strong>in</strong>ce S is assumed to be -<br />

I-reduced, it follows that P ) fx=x 0 g:8 Y ;X 1:fy=g:I is also a -constra<strong>in</strong>t for S, hence<br />

condition (i). The pro<strong>of</strong> <strong>of</strong> condition (ii) is completely analogous.<br />

ut<br />

With the help of these two technical lemmata we can now approach the proof of the upper bound theorem for sequences. We use Lemma 5.27 to compute $S_I$ and $(S_1)_I ; (S_2)_I$. This allows us to compute their predicate transformers. Then we verify the required specialization, for which Lemma 5.28 must be exploited.



Proposition 5.21 Let $S = S_1 ; S_2$ be an $I$-reduced $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I ; (S_2)_I$.

Proof. We may assume without loss of generality that $\mathrm{wp}(T)(\mathit{true}) \Leftrightarrow \mathit{true}$ holds. Then it suffices to show $\mathrm{wlp}(S_I)^*(P) \Rightarrow \mathrm{wlp}((S_1)_I ; (S_2)_I)^*(P)$ for all characterizing predicates $P$.

Moreover, since $S_1$ is the least upper bound of its deterministic branches with respect to $\sqsubseteq$, we may assume without loss of generality that $S_1$ is deterministic. Hence the stronger properties in Lemma 5.28 can be used.

First we compute both sides of such an implication using (5.58). We have

wlp(S I ) (P ) , (I^9:wlp(S) (fy=g:I ^fy=g:P )) _<br />

(:I ^ 9:wlp(S) (fy=g:P ))<br />

(5.62)<br />

and<br />

wlp((S 1 ) I (S 2 ) I ) (P ) ,<br />

(I ^wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp((S 2) I ) (P )))) _<br />

(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:wlp((S 2) I ) (P ))) ,<br />

(I ^wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:(I ^P ))))) _<br />

(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:(I ^P ))))) _<br />

(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:(:I ^ wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P )))) ,<br />

9 Y ;X 1 :9 Y ;X2 0 :(wlp(S 1 ) (fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:(I ^P )))) _<br />

9 Y ;X 1 :9 Y ;X2 0 ::I ^ (wlp(S 1 ) (:fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:P ))) :<br />

(5.63)<br />

Case 1. Assume $P \Rightarrow \neg I$ holds. Then we have $\mathrm{wlp}(S_I)^*(P) \Rightarrow \mathrm{wlp}(S_I)^*(\neg I) \Rightarrow \neg I$, since $S_I$ is consistent with respect to $I$. Hence, since we assume $\mathrm{wlp}(S_I)^*(P)$, we have to consider only the second line of (5.62). We want to show that this implies the second line of (5.63), i.e. (due to consistency $\neg I$ can be omitted)

9 Y ;X 1 :9 Y ;X2 0 : (:I ^ (wlp(S 1 ) (:fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:P )))) :<br />

(5.64)<br />

Assume that (5.64) does not hold, i.e. there exists some state with<br />

j= wlp(S 1 )(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) : (5.65)<br />

We then calculate that (5.65) is equivalent to<br />

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(fx=x 0 g:<br />

8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) ,<br />

| {z }<br />

R<br />

j= fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) )fx 0 =x 00 g:fx=x 0 g:R) ,<br />

j= P )fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) )fx 0 =x 00 g:fx=x 0 g:R) ,<br />

j= fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) ) (P )fx 0 =x 00 g:fx=x 0 g:R)) ,<br />

j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:R) :<br />



From this we conclude that<br />

P )fx=x 0 g:(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) (5.66)<br />

is a -constra<strong>in</strong>t forS 1 . Due to Lemma 5.28(i), s<strong>in</strong>ce j= :I and j= :I hold, this implies<br />

to be a -constra<strong>in</strong>t forS, hence we get<br />

P )fx=x 0 g:(8 Y ;X 1:fy=g:I) (5.67)<br />

j= fx 0 =xg:wlp(fx=x 0 g:S)(P )fx=x 0 g:(8 Y ;X 1 :fy=g:I)) ,<br />

(do the same calculation as above)<br />

j= fx 0 =xg:wlp(fx=x 0 g:S)(fx=x 0 g:8 Y ;X 1 :fy=g:I) ,<br />

j= wlp(S)(8 Y ;X 1:fy=g:I) : (5.68)<br />

which leads to a contradiction, s<strong>in</strong>ce P ):Iimplies<br />

wlp(S) (9 Y ;X 1 :fy=g::I) , :wlp(S)(8 Y ;X1 :fy=g:I)<br />

and on the other hand due to consistency we have<br />

wlp(S I ) (P ) ) wlp(S I ) (9 Y ;X 1;X2 :fy=g:P ) ) wlp(S) (9 Y ;X 1;X2 :fy=g:P ) :<br />

This proves the assertion <strong>in</strong> Case 1.<br />

Case 2. Now assume that P )Iand j= <br />

subcases.<br />

wlp(S I ) (P ) hold. From (5.62) we derive two<br />

Case 2.1 Assume j= :I ^ 9:wlp(S) (fy=g:P ). Then we conclude (always j= :::)<br />

9:wlp(S 1 ) (wlp(S 2 ) (fy=g:P )) ,<br />

9:wlp(S 1 ) ((I _:I) ^ wlp(S 2 ) (fy=g:P )) ,<br />

9:wlp(S 1 ) (I^wlp(S 2 ) (fy g:(P ^I))) _9:wlp(S 1 ) (:I ^ wlp(S 2 ) (fy=g:P )) <br />

hence (5.63) follows. This proves Case 2.1.<br />

Case 2.2. Now assume j= I^9:wlp(S) (fy=g:(I ^P )). We show that this implies<br />

j= 9 Y ;X 1 :9 Y ;X2 0 :(wlp(S 1 ) (fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:(I ^P )))) <br />

(5.69)<br />

which implies the rst l<strong>in</strong>e <strong>of</strong> (5.63).<br />

As <strong>in</strong> Case 1 assume that (5.69) does not hold. We use analogous computations to derive<br />

that<br />

P )fx=x 0 g:(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) ):I))<br />

is a -constra<strong>in</strong>t forS 1 . Accord<strong>in</strong>g to Lemma 5.28, s<strong>in</strong>ce j= I, j= I, this implies<br />

P )fx=x 0 g:(8 Y ;X 1 :fy=g::I)<br />



to be a -constra<strong>in</strong>t forS. Thus, we get<br />

j= fx 0 =xg:wlp(fx=x 0 g:S)(P )fx=x 0 g:(8 Y ;X 1 :fy=g::I)) ,<br />

j= fx 0 =xg:wlp(fx=x 0 g:S)(fx=x 0 g:8 Y ;X 1 :fy=g::I) ,<br />

j= wlp(S)(8 Y ;X 1:fy=g::I) : (5.70)<br />

However, from our assumption and P )Iwe conclude<br />

wlp(S) (9 Y ;X 1 :fy=g:I) ,:wlp(S)(8 Y ;X1 :fy=g::I)<br />

contradict<strong>in</strong>g (5.70). This proves the assertion <strong>in</strong> Case 2.2.<br />

ut<br />

5.8 Proof of the Upper Bound Theorem in the Recursive Case

In this appendix we prove the upper bound theorem for recursive operations. For this we need an additional lemma that deals with the compatibility of GCSs with the Nelson-order. Let $\bigsqcup^{\preceq}$ denote the least upper bound with respect to the Nelson-order $\preceq$ and $\bigsqcup$ the least upper bound with respect to the specialization order.

Lemma 5.29. Let $T$, $S$ and $S_\alpha$ for each ordinal number $\alpha$ be $Y$-operations such that $S_{\alpha'} \preceq S_\alpha$ holds for $\alpha' \le \alpha$ and let $I$ be some $X$-invariant for $Y \subseteq X$. Then we have:

(i) If $T \preceq S$ holds, then $T_I \preceq S_I$ follows.

(ii) The least upper bound $\bigsqcup^{\preceq}_{\alpha < \kappa} (S_\alpha)_I$ exists and we have

$$\Big( \bigsqcup^{\preceq}_{\alpha < \kappa} S_\alpha \Big)_I \sqsubseteq \bigsqcup^{\preceq}_{\alpha < \kappa} (S_\alpha)_I .$$

Proof. (i) follows, because all constructors of Definition 5.2 are monotonic in the Nelson order $\preceq$, hence the first result follows from (5.58).

(ii) Since $S_{\alpha'} \preceq S_\alpha$ holds for $\alpha' \le \alpha$, the family $(S_\alpha)_{\alpha < \kappa}$ forms an ascending chain. From [19] we know that in this case the least upper bound with respect to the Nelson order exists. It remains to show the required specialization assertion.

Let $T_1$ and $T_2$ denote the left hand side and the right hand side of this assertion respectively. Since for all $\alpha < \kappa$ we have $S_\alpha \preceq \bigsqcup^{\preceq}_{\alpha' < \kappa} S_{\alpha'}$, we conclude that $(S_\alpha)_I \preceq T_1$ and hence also $T_2 \preceq T_1$. This proves $\mathrm{wp}(T_2)(R) \Rightarrow \mathrm{wp}(T_1)(R)$ for all $X$-constraints $R$.

It remains to show $\mathrm{wlp}(T_2)(R) \Rightarrow \mathrm{wlp}(T_1)(R)$, which follows directly from Proposition 5.7.

□

Now we can give the main proof.

Proposition 5.24 Let $S' = \mu S.f(S)$ be an $I$-reduced $Y$-operation and $T \sqsubseteq S'$ be a consistent specialization with respect to some $X$-invariant $I$ with $Y \subseteq X$. Then we also have $T \sqsubseteq \mu S.f_I(S)$, where $f_I(S)$ is built as in Lemma 5.23.



Pro<strong>of</strong>.<br />

Recall from [19] that we have<br />

S 0<br />

= f (loop) = f<br />

0<br />

@ G<br />

<<br />

<br />

f (loop)<br />

1<br />

A<br />

for some ord<strong>in</strong>al number , hence from Lemmata 5.23 and 5.29 we derive<br />

0<br />

T v f I<br />

@<br />

0<br />

@ G<br />

<<br />

<br />

f (loop)<br />

0 1<br />

1 1<br />

T1<br />

A A v fI G z }| {<br />

<br />

f<br />

B<br />

(loop) I<br />

I @<br />

C<br />

<<br />

| {z }<br />

The last <strong>in</strong>equality follows from Lemma 5.29(ii) applied to the operand and from the monotonicity<br />

<strong>of</strong>f I with respect to the specialization order as stated <strong>in</strong> Lemma 5.22.<br />

Now dene T2 = f I (loop) and apply transnite <strong>in</strong>duction to show T 1 v T 2 for all .<br />

For = 0 the result is obvious, s<strong>in</strong>ce loop I is semantically equivalent toloop.<br />

Now assume T1 0<br />

v T2 0<br />

holds for all 0 < . For 0 < we have f<br />

F<br />

0<br />

I (loop) f I (loop).<br />

Hence the least upper bound <strong>in</strong> the Nelson order<br />

f 0<br />

(loop) exists and we have<br />

0<br />

f I<br />

B<br />

1<br />

T2<br />

z }| 0<br />

{<br />

<br />

f 0<br />

I (loop)<br />

C<br />

0 <<br />

| {z }<br />

T2<br />

G<br />

B<br />

@<br />

0 <<br />

I<br />

T1<br />

A = f I (loop) :<br />

Then by apply<strong>in</strong>g the <strong>in</strong>duction hypothesis (change <strong>in</strong> T 1 to 0 and to ) for an arbitrary<br />

X-constra<strong>in</strong>t R we get<br />

wlp(T 2 )(R) ,<br />

and<br />

wp(T 2 )(R) ,<br />

^<br />

0 <<br />

_<br />

0 <<br />

<strong>in</strong>duction hypothesis<br />

wlp(T2 0<br />

)(R) )<br />

<strong>in</strong>duction hypothesis<br />

wp(T2 0<br />

)(R) )<br />

^<br />

0 <<br />

_<br />

0 <<br />

A :<br />

wlp(T 0<br />

1 )(R) , wlp(T 1)(R)<br />

wp(T 0<br />

1 )(R) , wp(T 1)(R) :<br />

Consequently T 1 v T 2 holds. F<strong>in</strong>ally, from Lemma 5.22 we conclude T v f I (T 1 ) v f I (T 2 )=<br />

(loop) as required.<br />

ut<br />

f I<br />

References for Chapter 5

1. J. R. Abrial: "A Formal Approach to Large Software Construction", in J. L. A. van de Snepscheut (Ed.): Mathematics of Program Construction, Springer LNCS 375, 1989, 1-20
2. J. R. Abrial: The B Method, Prentice Hall International (to appear)
3. J. Bicarregui, B. Ritchie: "Invariants, Frames and Postconditions: a Comparison of the VDM and B Notations", in J. C. P. Woodcock, P. G. Larsen (Eds.): Formal Methods Europe (FME'93), Springer LNCS 670, 1993, 162-182



4. D. Bjørner, C. B. Jones: Formal Specification and Software Development, Prentice Hall, 1982
5. M. Broy, G. Nelson: "Adding Fair Choice to Dijkstra's Calculus", ACM TOPLAS, vol. 16 (3), 1994, 924-938
6. S. Ceri, J. Widom: "Deriving Production Rules for Constraint Maintenance", in Proceedings 16th Conference on VLDB, 1990, 566-577
7. W. Chen, J. T. Udding: "Towards a Calculus of Data Refinement", in J. L. A. van de Snepscheut (Ed.): Mathematics of Program Construction, Springer LNCS 375, 1989, 197-218
8. P. Cousot: "Methods and Logics for Proving Programs", in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B, Elsevier, 1990, 841-993
9. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer, Texts and Monographs in Computer Science, 1989
10. P. Fraternali, S. Paraboschi, L. Tanca: "Automatic Rule Generation for Constraint Enforcement in Active Databases", in U. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics, Springer WICS, 1993, 153-173
11. M. Gertz, U. W. Lipeck: "Deriving Integrity Maintaining Triggers from Transition Graphs", in Proceedings 9th ICDE, IEEE Computer Society Press, 1993, 22-29
12. D. Gries: The Science of Programming, Springer Texts and Monographs in Computer Science, 1981
13. T. Günther, K.-D. Schewe, I. Wetzel: "On the Derivation of Executable Database Programs from Formal Specifications", in J. C. P. Woodcock, P. G. Larsen (Eds.): Formal Methods Europe (FME'93), Springer LNCS 670, 1993, 351-366
14. C. B. Jones: Systematic Software Development using VDM, Prentice-Hall International, 1986
15. A. P. Karadimce, S. D. Urban: "Diagnosing Anomalous Rule Behaviour in Databases with Integrity Maintenance Production Rules", in Proceedings 3rd Int. Workshop on Foundations of Models and Languages for Data and Objects, 1991, 77-102
16. U. W. Lipeck: Dynamische Integrität von Datenbanken, Springer IFB 209, 1987
17. J.-J. Meyer, H. Weigand, R. Wieringa: "A Specification Language for Static, Dynamic and Deontic Integrity Constraints", in J. Demetrovics, B. Thalheim (Eds.): MFDBS 89, Springer LNCS 364, 347-366
18. C. Morgan: Programming from Specifications, Prentice Hall, 1988
19. G. Nelson: "A Generalization of Dijkstra's Calculus", ACM TOPLAS, vol. 11 (4), 1989, 517-561
20. K.-D. Schewe, I. Wetzel, J. W. Schmidt: "Towards a Structured Specification Language for Database Applications", in D. Harper, M. Norrie: Specification of Database Systems, Springer Workshops in Computing Science, 1992, 255-274
21. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: "Integrity Enforcement in Object-Oriented Databases", in U. W. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics, Springer WICS, 1993, 174-195
22. K.-D. Schewe, B. Thalheim: "Consistency Enforcement in Active Databases", in S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering – Active Databases, 1994
23. K.-D. Schewe, D. Stemple, B. Thalheim: "Higher Level Genericity in Object Oriented Databases", in S. Chakravarty (Ed.): Conference on the Management of Data, 1994
24. K.-D. Schewe: Specification and Development of Correct Relational Database Programs, book manuscript
25. T. Sheard, D. Stemple: "Automatic Verification of Database Transaction Safety", ACM ToDS, vol. 14 (3), 1989, 322-368
26. J. M. Spivey: Understanding Z, A Specification Language and its Formal Semantics, Cambridge University Press, 1988
27. J. M. Spivey: The Z Notation, A Reference Manual, Prentice Hall, 1989
28. D. Stemple, S. Mazumdar, T. Sheard: "On the Modes and Meaning of Feedback to Transaction Designer", in Proceedings SIGMOD 1987, 1987, 375-386
29. S. D. Urban, L. Delcambre: "Constraint Analysis: A Design Process for Specifying Operations on Objects", IEEE Transactions on Knowledge and Data Engineering, vol. 2 (4), 1990



30. J. Widom, S. J. Finkelstein: "Set-oriented Production Rules in Relational Database Systems", in Proceedings SIGMOD, 1990, 259-270



Chapter 6

Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement

Contents

6.1 Introduction
6.2 A Review on Transaction Transformation by Specialization
6.2.1 Greatest Consistent Specialization
6.2.2 The Construction of GCSs
6.2.3 Two Major Problems
6.3 Weaker Notions of Effect Preservation
6.3.1 Maximal Consistent Effect Preservers
6.3.2 Effective MCE Construction
6.4 Application Example
6.5 Conclusion
6.6 The Predicate Transformer Calculus
6.7 I-reducedness

This chapter contains a reprint of

Klaus-Dieter Schewe. Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Available at http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html.



Abstract. Consistency enforcement may be regarded as a process of transaction transformation, where the modified transaction will be consistent with respect to a given set of constraints. The computational approach by Schewe and Thalheim requires the modified transaction to be the greatest consistent one below the original transaction with respect to some order. The order should express the preservation of the "effect" of the original transaction. Thus, the major problem is to find the right order.

The first choice, specialization, turns out to provide good computational properties, but on the one hand the order is too weak, because arbitrary changes to state variables not touched by the original transaction are allowed, and on the other hand it is too strong, as effect preservation by specialization means that further changes to the other state variables are forbidden.

In this paper, modifications of greatest consistent specializations are studied to avoid these problems. Weakening the notion of effect preservation leads to the definition of maximal consistent effect preservers (MCEs). This turns out to be a reasonable choice, since they preserve the computational strength achieved for consistent specializations. Moreover, for basic operations they are compatible with different consistency enforcement strategies chosen by users.

6.1 Introduction

Consistency enforcement is considered to be one of the major application fields of active database systems. It is expected that arbitrary sets of static integrity constraints allow repairing ECA-rules to be defined or even generated. The analysis of the resulting rule triggering system concentrates on the termination of the rule system, the independence of the final database state from the chosen selection order of the rules and on consistency. The work in [1] can be taken as a representative of this approach.

The mentioned requirements are not sufficient for reasonable rule behaviour, because they do not take the interaction of the rules into account. In general, given a complex database transition, rule systems may always invalidate the effect of that transition, e.g. an insertion may be turned into a deletion and vice versa. The work in [4] presents critical examples with respect to undesirable rule behaviour. In [2] the rule analysis is characterized as purely syntactical.

The basic problem seems to be that an accepted theory of consistency enforcement is still missing. Since it is easy to define a rule triggering system (RTS) that empties the database in case of any constraint violation, it is not sufficient to ensure consistency of the result. Therefore, the notion of greatest consistent specialization (GCS) was introduced in [6] as a theoretical means for a definition of consistency enforcement. The basic considerations of this approach are quite simple:

- Instead of a constraint set we may study a single constraint: just take the conjunction.

- Since consistency is basically a property of transactions, we may consider an arbitrary complex database transition.

- Then there should be a partial order, called specialization, on transactions such that it expresses the preservation of the effects of the original transition. With respect to this order a solution of the consistency enforcement problem should be the GCS.

Thus, the fundamental idea of the GCS approach is the transformation of arbitrary database transitions into GCSs which should then be handled as transactions. Both consistency and specialization can be defined in terms of the extended predicate transformer calculus [3].



The first results on GCSs demonstrated their existence, uniqueness and a commutativity property with respect to several constraints [6]. Thus, the restriction to a single constraint is only necessary for definitional purposes, since the GCS with respect to a conjunction of constraints can be built successively. Furthermore, the order of the constraints is not important for such a construction. Such a property does not hold for any rule-based approach. The price for this flexibility is the inherent non-deterministic nature of GCSs.

The interesting problem of how GCSs are to be constructed was investigated in detail in [5]. It could be shown that under mild technical restrictions the problem can be reduced to finding GCSs for basic operations. The GCS of a complex database transition results if, firstly, the involved basic operations are replaced by their GCSs and, secondly, a precondition is added. Hence, GCSs in general are partial, i.e. in certain cases there is no other choice than a rollback. This partiality cannot be achieved by rule systems. In particular, the computed precondition heavily depends on the original database transition, whilst the rule-based approach aims at a solution that is independent of user-defined database transitions.

Nevertheless, two major drawbacks exist for the GCS approach. The first one concerns the rigidity of the specialization order with respect to the part of the database affected by a transition. State changes related to the original transition can only be discarded, but not changed. E.g., with respect to a functional dependency in the relational data model (RDM) an insertion is only allowed if it is consistent. The same holds for multi-valued dependencies. This is only one reasonable enforcement strategy, but from an intuitive point of view alternatives are imaginable.

The second drawback concerns the arbitrariness with respect to the part of the database not affected by the original transition. Here any changes are allowed as long as consistency is achieved. In [5] this problem has been circumvented by allowing branches of GCSs to be computed. This pragmatic approach leads to reasonable consistent specializations and restricts the non-determinism of GCSs. In Section 6.2 we present a formal review of the GCS approach.

These two problems are taken up in this paper. In fact, only the first problem is investigated directly, but the final solution addresses both problems at the same time. The first idea with respect to the rigidity problem is to weaken the order and to preserve not all the effects of the original transition. Indeed, this was also the case for specialization, since only effects on the part of the database affected by the more general transition were considered. In this paper, we consider effects, formalized by specific transition constraints, that are compatible with the given static constraint. These transition constraints can be ordered by implication and we may consider minimal constraints with this property. This leads to the notion of maximal consistent effect preservers (MCEs).

Indeed, MCEs fit well with our intuition. To see this, we briefly discuss different enforcement strategies with respect to basic operations and selected classes of constraints in the RDM and convince ourselves that the MCE approach is natural.

After the formal definition of MCEs in Section 6.3 we analyze the computational properties of MCEs. We shall see that existence, uniqueness and commutativity hold as they did for GCSs. Then, again under mild technical restrictions, we show that the problem can be reduced to finding MCEs for basic operations. As for GCSs, the MCE of a complex database transition results if the involved basic operations are replaced by some MCEs and a precondition is computed. Hence, MCEs preserve the computational properties of GCSs. Formal definitions and results for MCEs are presented in Section 6.3. Unfortunately, the proofs are more complicated than they were for GCSs. Therefore, proofs are omitted, but they follow the same approach as the proofs for GCSs in [5]. We conclude with a short summary.

Throughout the paper we assume some familiarity with guarded command notations and their axiomatic semantics by predicate transformers [3] (see footnote 1). Furthermore, proofs are omitted, because they tend to become rather lengthy.

6.2 A Review on Transaction Transformation by Specialization

The starting point of the computational approach to consistency enforcement was the use of axiomatic semantics in the extended style of Dijkstra [3] (see Appendix 6.6). We consider a state space $X$ as a set of (typed) variables. In the relational model [5] each state variable corresponds to a relation schema; the possible relations define the associated type. In object oriented models [6] state variables correspond to classes with the associated class type.

A state over $X$ is given by a map $\sigma$ which associates with each state variable a value of its type. We use $\Sigma$ to denote the set of states over $X$.

Then a (static) constraint $I$ is defined as a formula (in a naturally defined many-sorted logic) with free variables in the state space $X$, i.e. $fr(I) \subseteq X$. It is clear that states are sufficient for the interpretation of constraints.

A database transition can then be defined by a guarded command over the state space $X$ capturing non-determinism, partiality and general recursion. Note that such a formalization is much more general than usual definitions of transactions, but the involved orthogonality of constructors such as sequence, guard, choice etc. is significant for proofs to be kept simple. Furthermore, guarded commands are just one way to describe the syntax of transitions.

A transition constraint on a state space $X$ is a formula $J$ with free variables in $X \cup X'$ using a disjoint copy $X'$ of $X$, i.e. $fr(J) \subseteq X \cup X'$. If $x' \in X'$ corresponds to the state variable $x \in X$, then values associated with $x$ correspond to before-states, whereas values associated with $x'$ correspond to after-states. In particular, state pairs $(\sigma, \tau) \in \Sigma \times \Sigma$ suffice to interpret transition constraints.

Each transition constraint $J$ gives rise to a database transition $S(J)$ in a simple way. All state pairs satisfying $J$ are used to define $S(J)$. In addition, since we have to decide how to handle termination, we choose also to take all pairs $(\sigma, \infty)$ into $\Delta(S(J))$. Then it can be shown that $S(J)$ can be written as

$$S(J) = (@\, x_1', \dots, x_n' \bullet J \to x_1 := x_1' ; \dots ; x_n := x_n') \mathbin{\Box} loop .$$

The computational approach, however, abstracts from syntactic means. All definitions are given in terms of predicate transformers. This applies to the notions of operational specialization and consistency with respect to a static constraint. These are the necessary ingredients for the definition and investigation of greatest consistent specializations (GCSs).

6.2.1 Greatest Consistent Specialization

As already said, the operational approach to consistency enforcement starts with a formal definition of the goal. The idea is quite simple. We choose an order on transitions which should model the preservation of "effects". This order is called specialization and denoted $\sqsubseteq$. Then a consistent specialization expresses both consistency and the preservation of effects. Finally we take the greatest consistent specialization, if it exists.

1 We present a short summary of the version of Dijkstra's calculus used here in Appendix 6.6.



The intention behind specialization is quite easy. If we are given an execution of a database transition $T$ working on a larger state space $X$ than the database transition $S$ working on $Y \subseteq X$, then we may restrict this computation to $Y$. Since states have been defined as mappings, this is just a restriction of mappings. Then specialization means that any execution of $T$ (restricted to $Y$) should also be an execution of $S$. It is straightforward to show that this is exactly captured by the predicate transformer formulation in Definition 6.1 (i).

This also allows transition consistency of a database transition $S$ with respect to a transition constraint $J$ to be formalized by $S$ specializing $S(J)$.

As to static consistency with respect to some static constraint $I$ we require that any terminating execution of a transition $T$ starting in a state satisfying $I$ should also reach a state satisfying $I$, which is formalized by Definition 6.1 (ii).

Since specialization captures our intuitive notion of "preservation of effects", the definition of greatest consistent specializations in Definition 6.1 (iii) is now obvious.

Definition 6.1. Let $S$, $T$ be database transitions on $Y$ and $X$ respectively with $Y \subseteq X$. Let $I$ denote a static constraint on $X$.

(i) $T$ specializes $S$ ($T \sqsubseteq S$) iff for all static constraints $R$ on $Y$ the implications $wlp(S)(R) \Rightarrow wlp(T)(R)$ and $wp(S)(R) \Rightarrow wp(T)(R)$ hold.

(ii) $T$ is consistent with respect to $I$ iff $I \Rightarrow wlp(T)(I)$ holds.

(iii) $S_I$ is a greatest consistent specialization of $S$ with respect to $I$ iff $S_I \sqsubseteq S$ holds, $S_I$ is consistent with respect to $I$ and $S_I$ is the greatest database transition with respect to $\sqsubseteq$ with these properties. □
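As an informal illustration of Definition 6.1 (i) and (ii), the following Python sketch models terminating transitions over a tiny state space as finite sets of (before, after) state pairs. This relational encoding, the function names and the two-variable example are assumptions made here for illustration only; non-termination is not modelled, so the wp-condition is not checked separately. Under these assumptions, specialization reduces to the informal reading given above: every execution of T, restricted to Y, is an execution of S.

```python
from itertools import product

def state(**values):
    """A state is a mapping from variables to values, stored hashably."""
    return frozenset(values.items())

def restrict(s, variables):
    """Restrict a state (a mapping) to the given set of variables."""
    return frozenset((k, v) for k, v in s if k in variables)

def specializes(t, s, y_vars):
    """T specializes S: each execution of T, restricted to Y, is one of S."""
    return all((restrict(a, y_vars), restrict(b, y_vars)) in s for a, b in t)

def consistent(t, invariant):
    """Executions of T that start in a state satisfying I also end in one."""
    return all(invariant(dict(b)) for a, b in t if invariant(dict(a)))

# Hypothetical example: p and q are 0/1 flags, the static constraint is p => q.
inv = lambda st: st["p"] <= st["q"]
bits = (0, 1)

# S works on Y = {p} and sets p to 1.
S = {(state(p=a), state(p=1)) for a in bits}
# T works on X = {p, q}, sets p to 1 and repairs q.
T = {(state(p=a, q=b), state(p=1, q=1)) for a, b in product(bits, repeat=2)}
# U sets p to 1 and leaves q untouched.
U = {(state(p=a, q=b), state(p=1, q=b)) for a, b in product(bits, repeat=2)}
# V sets p to 0 instead, which is not an execution of S.
V = {(state(p=a, q=b), state(p=0, q=b)) for a, b in product(bits, repeat=2)}
```

In this toy setting T is a consistent specialization of S, U is a specialization that is not consistent, and V is not a specialization at all.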

The first properties that were derived for GCSs concerned their existence, uniqueness and their relation to conjunctions (or equivalently sets) of constraints. Due to the very general approach to database transitions including non-determinism and partiality, their existence can be easily verified. Due to the abstract semantic nature of the definition the uniqueness (up to semantic equivalence) is obvious. Additionally, the first steps towards GCS theory detected a commutativity property with respect to conjunctions, at least if we restrict initial states to those satisfying the constraints [6]. We summarize:

Proposition 6.2. Let $S$ be a database transition on $Y$ and let $I$ and $J$ be static constraints on $X$ with $Y \subseteq X$. Then the GCS $S_I$ exists and is uniquely determined up to semantic equivalence by $S$ and $I$. Furthermore, $I \wedge J \to S_{I \wedge J}$ and $I \wedge J \to (S_I)_J$ are semantically equivalent. □

This proposition is important for the practical computation of GCSs. If there exists an effective way to compute GCSs, then the proposition allows the computation to be restricted to simple constraints that cannot be decomposed as a conjunction. Then we can use the conjuncts of a more complex constraint in any order to build the GCS. Thus, the commutativity property allows consistency to be enforced stepwise taking any order of the constraints.

6.2.2 The Construction of GCSs

In order to become practically interesting, GCSs must be constructible. How to achieve such a construction seemed to be a hopeless problem at the beginning, at least for complex database transitions. It is clear that a naive approach, replacing just basic operations such as insertions and deletions by their GCSs, leads to wrong results. More precisely, we obtain a consistent specialization, but not the greatest, or even worse, we obtain no specialization at all.

The major breakthrough was achieved by requiring database transitions to be $I$-reduced. This is only a technical condition (see Appendix 6.7), in fact only a condition on sequences, which informally states that there is no self-repairing. We omit the technical definition here, since it is only understandable in connection with the proof of the main results [5]. Then it was shown that for $I$-reduced database transitions the GCS is itself a specialization of the database transition resulting from the replacement of basic operations by their GCSs (upper bound theorem), and it turns out that adding a certain precondition gives the complete GCS (main theorem). We summarize:

Theorem 6.3. Let $I$ be a static constraint on $X$ and $S$ some $I$-reduced database transition on $Y$ with $Y \subseteq X = \{x_1, \dots, x_n\}$. Let $S_I'$ result from $S$ by first replacing each restricted choice $S_1 \boxtimes S_2$ by $S_1 \mathbin{\Box} (wlp(S_1)(false) \to S_2)$ and then each basic database transition by its GCS with respect to $I$. For a disjoint copy $\{z_1, \dots, z_n\}$ of $X$ define

$$P(S_I') = \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n)),$$

where $T$ results from $S_I'$ by renaming all $x_i$ to $z_i$. Then the GCS of $S$ with respect to $I$ can be written in the form $S_I = P(S_I') \to S_I'$. □

The theorem needs some explanation concerning both its terminology and its impact. By a disjoint copy of $X = \{x_1, \dots, x_n\}$ we mean a state space $Z = \{z_1, \dots, z_n\}$ such that the types of $z_i$ and $x_i$ coincide and $X \cap Z = \emptyset$ holds. Then the notation $\{x/t\}.R$ with a variable $x$, a term $t$ of the same type as $x$ and a formula $R$ denotes the result of the substitution of each free occurrence of $x$ in $R$ by $t$.

The formula $P(S_I')$ results from a first order reformulation of the specialization condition in Definition 6.1 (i), which was basically second order. This is possible, since $S$ works on $X$, $T$ works on the disjoint copy $Z$, i.e. their parallel execution has the same effect as any sequence, and the formula $x_1 = z_1 \wedge \dots \wedge x_n = z_n$ on $X \cup Z$ expresses a "glue" between $X$ and $Z$. If the given formula were always true, $T$ would be a specialization of $S$. Taking it as a precondition restricts $T$ to those executions that may occur in a specialization of $S$. Since the $T$ chosen in the theorem is already a consistent generalization of $S_I$, we really obtain the GCS. The lengthy proof is contained in [5].

Note that the theorem together with the commutativity result mentioned beforehand gives effective means for GCS construction and hence for consistency enforcement in the basic computational approach. The hard part of the proof is to show that $S_I \sqsubseteq S_I'$ holds (upper bound theorem), which requires lengthy structural induction [5].

6.2.3 Two Major Problems

Let us look at consistency enforcement from a more practical point of view and ask whether GCSs really coincide with our intuition. In general, GCSs are non-deterministic, which reflects various strategies for consistency enforcement. The approach in [5] selects a branch of the GCS which is related to an interactive support for the values to be selected.

E.g., take an inclusion constraint $x \in p \Rightarrow x \in q$ and an insertion into $p$. Then GCS branches offer the freedom to choose any new value for $q$ provided it is a superset of $p \cup \{x\}$. Intuitively we prefer this value to be $q \cup \{x\}$, i.e. to keep change propagation as simple as possible. For GCS branches, however, there is no such "preference", or in other words:

For a database transition on $Y \subseteq X$ the GCS approach is too liberal on $X - Y$.
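The freedom described above can be made concrete with a small Python sketch. It enumerates, over a finite universe, every new value of q that contains p ∪ {x}, and compares this with the intuitively preferred value q ∪ {x}. The finite universe, the function names and the sample values are assumptions chosen for illustration; they are not part of the GCS formalism.

```python
from itertools import combinations

def admissible_q_values(universe, p, x):
    """All new values q' with p ∪ {x} ⊆ q' ⊆ universe."""
    required = set(p) | {x}
    optional = sorted(set(universe) - required)
    return [frozenset(required | set(extra))
            for k in range(len(optional) + 1)
            for extra in combinations(optional, k)]

def minimal_change(q, x):
    """The intuitively preferred repair: keep q and just add x."""
    return frozenset(q) | {x}

# Hypothetical sample data: insert x = 2 into p = {1} while q = {1, 3}.
universe = {1, 2, 3, 4}
p, q, x = {1}, {1, 3}, 2

branches = admissible_q_values(universe, p, x)
preferred = minimal_change(q, x)
```

Here `branches` contains four candidate values, among them `{1, 2}` (which silently drops 3 from q) and `{1, 2, 3, 4}` (which adds an unrelated element), whereas `preferred` is the single value `{1, 2, 3}`.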

On the other hand, multi-valued dependencies, which concern only one set-valued state variable, lead to preconditions, although we might expect additional changes instead. In other words:

For a database transition on $Y \subseteq X$ the GCS approach is too restrictive on $Y$.

This demonstrates that the specialization order might still be too coarse for enforcement purposes. One possible solution originating from the work in [5] is to choose an order based on $\delta$-constraints. We shall follow this idea in Section 6.3.

6.3 Weaker Notions of Effect Preservation

Intuitively there exist various enforcement strategies with respect to basic operations that overcome the rigidity of GCSs. E.g., consider an insertion of a new tuple into a relation:

- For a functional dependency we may enforce consistency either by adding a precondition (the choice made in the practical example in [5]) or by propagating the deletion of other tuples.

- For a multivalued dependency we either propagate further insertions or add a precondition.

- For an inclusion dependency we may propagate an insertion in the other relation or add a precondition.

- For an exclusion dependency we may propagate the deletion of tuples in the other relation or add a precondition.

For the case of a deletion of a tuple the alternatives are similar.
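The two alternatives listed for a functional dependency can be sketched in Python as follows. The sketch assumes a binary relation of pairs (a, b) with the dependency A → B; the function names and the use of `None` to model an undefined (rolled back) insertion are illustrative assumptions, not notation from the text.

```python
def violates_fd(r, t):
    """True if adding t = (a, b) to r breaks the functional dependency A -> B."""
    a, b = t
    return any(a2 == a and b2 != b for a2, b2 in r)

def insert_with_precondition(r, t):
    """Strategy 1: the insertion is only defined if it keeps r consistent."""
    if violates_fd(r, t):
        return None  # undefined here; models a rollback
    return r | {t}

def insert_with_deletion(r, t):
    """Strategy 2: delete the tuples conflicting with t, then insert t."""
    a, _ = t
    kept = {u for u in r if u[0] != a or u == t}
    return kept | {t}

# Hypothetical sample relation satisfying A -> B.
r = {(1, "x"), (2, "y")}
```

With this sample relation, inserting `(1, "z")` is rejected by the first strategy, while the second strategy removes `(1, "x")` and yields `{(2, "y"), (1, "z")}`. Both results satisfy the dependency; they differ only in which effects of the original insertion are preserved.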

Let us now try to characterize the relationship between the original database transition and those resulting from such rewriting efforts in order to find a weaker notion of effect preservation that allows us to circumvent the problems we have with GCSs.

In general the effects of a database transition $T$ may be expressed by transition constraints. Just take the characterizing predicate of a subset of $\Delta(T)$. Then $T$ is certainly consistent with a constraint chosen in that way. Therefore, we introduce the notion of a $\delta$-constraint, i.e. a transition constraint that is satisfied by a database transition $S$ [5]:

Definition 6.4. Let $S$ be a database transition on $X = \{x_1, \dots, x_n\}$. A $\delta$-constraint for $S$ is a transition constraint $J$ on $X$ such that $\{x_1'/x_1, \dots, x_n'/x_n\}.wlp(S')(J)$ holds, where $S'$ results from $S$ by renaming all $x_i$ to $x_i'$. □

Example 6.1. Look at the insertion $S$ of a new tuple $t$ into a relation $r$. Then the following formulae are $\delta$-constraints for $S$:

- $t \in r'$

- $\forall u.\ u \in r \Rightarrow u \in r'$

- $\forall u.\ u \in q \Leftrightarrow u \in q'$ for all relation schemata $q \neq r$

- $\forall u.\ u \neq t \wedge u \in r' \Rightarrow u \in r$ □
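The four formulae of Example 6.1 can be checked mechanically on concrete before- and after-states. The following Python sketch does this for a plain insertion; representing a database as a dictionary from relation names to sets of tuples, as well as all names and sample values, are assumptions made for illustration.

```python
def insert(db, name, t):
    """Insert tuple t into relation `name`; all other relations stay unchanged."""
    after = dict(db)
    after[name] = db[name] | {t}
    return after

def delta_constraints(before, after, name, t):
    """Evaluate the four transition constraints on a (before, after) state pair."""
    r, r_new = before[name], after[name]
    return {
        "t in r'": t in r_new,
        "r subset of r'": r <= r_new,
        "other relations unchanged": all(
            before[q] == after[q] for q in before if q != name),
        "nothing but t added to r": all(u in r for u in r_new if u != t),
    }

# Hypothetical sample database with two relations.
db = {"r": frozenset({(1, 2)}), "q": frozenset({(5,)})}
result = delta_constraints(db, insert(db, "r", (3, 4)), "r", (3, 4))
```

For the plain insertion every entry of `result` is `True`, in line with the claim that all four formulae are constraints satisfied by S.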

If $S$ is defined on $Y \subseteq X$ and we require all $\delta$-constraints $J$ of $S$ with $fr(J) \subseteq Y \cup Y'$ to be also $\delta$-constraints for $T$, then $T$ will be a specialization of $S$. The converse also holds. Thus, we may replace the specialization condition by the preservation of certain $\delta$-constraints.

6.3.1 Maximal Consistent Effect Preservers

We have already seen that we may always associate a database transition $S(J)$ with each transition constraint $J$. In order to preserve $J$, a database transition must specialize $S(J)$. The basic idea of the tailored operational approach is now to consider not all $\delta$-constraints, but only some of them. Thus, we no longer build the GCS of $S$ with respect to $I$, but the GCS of some $S(J)$.

If some $\delta$-constraints of $S$ are omitted in $J$, then $S(J)$ will allow executions that do not occur in any specialization of $S$. In this way, we can capture the reasonable changes that were listed at the beginning of this section. However, taking any such $\delta$-constraint is much too weak. $S(J)$ should only add executions that are consistent with $I$. This justifies defining $\delta$-constraints that are compatible with a given static constraint $I$ on $X$ in the sense that building the GCS $S(J)_I$ does not increase partiality.

Definition 6.5. A $\delta$-constraint $J$ for a database transition $S$ is compatible with a static constraint $I$ iff $wp(S(J)_I)(false) \Rightarrow wp(S(J))(false)$ holds. □

Example 6.2. It is easy to see that each of the $\delta$-constraints in the previous example is compatible with $I$ chosen to be a multivalued dependency. Furthermore, the conjunction of three of these constraints is also compatible with $I$, but the conjunction of all four $\delta$-constraints is not. □
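The following Python sketch gives an informal intuition for this example; it does not compute compatibility in the sense of Definition 6.5. It assumes a ternary relation r(A, B, C) with the multivalued dependency A ↠ B and assumes that the three retained constraints are the first three of Example 6.1. Repairing an insertion by propagating further insertions then keeps those three constraints but necessarily breaks the fourth one, which forbids adding anything other than t. All names and sample values are hypothetical.

```python
def mvd_closure(r):
    """Smallest superset of r(A, B, C) satisfying the dependency A ->> B."""
    closed = set(r)
    while True:
        # For two tuples agreeing on A, the mixed tuple must also be present.
        missing = {(a1, b1, c2)
                   for (a1, b1, c1) in closed
                   for (a2, b2, c2) in closed
                   if a1 == a2} - closed
        if not missing:
            return closed
        closed |= missing

def satisfies_mvd(r):
    """r satisfies A ->> B iff closing it adds nothing."""
    return mvd_closure(r) == set(r)

# Hypothetical sample data: r satisfies the dependency, r with t added does not.
r = {(1, "b1", "c1")}
t = (1, "b2", "c2")
r_new = mvd_closure(r | {t})

keeps_t = t in r_new                                          # first constraint
keeps_old = r <= r_new                                        # second constraint
only_t_added = all(u in r for u in r_new if u != t)           # fourth constraint
```

Here `r_new` contains the two additional tuples `(1, "b1", "c2")` and `(1, "b2", "c1")`, so `keeps_t` and `keeps_old` hold while `only_t_added` does not. If all four constraints are insisted upon, the only consistent option left for this insertion is to reject it.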

The last example suggests considering the implication order on $\delta$-constraints. We say that $J_1$ is stronger than $J_2$ iff $J_1 \Rightarrow J_2$ holds. Unfortunately there is no smallest $\delta$-constraint compatible with $I$ and we cannot consider the "strongest" $I$-compatible $\delta$-constraint for $S$, but we may consider minimal elements in this order.

Definition 6.6. A $\delta$-constraint $J$ for $S$ is low with respect to $I$ iff it is $I$-compatible and there is no strictly stronger $I$-compatible $\delta$-constraint. □

Now we are prepared to define maximal consistent effect preservers for a database transition $S$. For these we choose a low $\delta$-constraint $J$ which formalizes an effect of $S$ to be preserved. Then we take a consistent database transition $S_I$ that preserves this effect, but remains undefined wherever $S$ is undefined. Finally, we require $S_I$ to be a greatest database transition with these properties with respect to the specialization order.

Definition 6.7. Let $S$ be a database transition and $I$ a static constraint on $X$. Let $J$ be a low $\delta$-constraint of $S$ with respect to $I$. A database transition $S_I$ on $X$ is called a maximal consistent effect preserver (MCE) of $S$ with respect to $I$ iff

(i) $J$ is a $\delta$-constraint for $S_I$,

(ii) $wp(S)(false) \Rightarrow wp(S_I)(false)$ holds,

(iii) $S_I$ is consistent with respect to $I$ and

(iv) any other database transition $T$ with these properties specializes $S_I$. □

Note that in this definition the state space on which $S$ is defined is no longer important. It "vanishes" inside the chosen $J$. Then it is easy to see that the informal enforcement strategies at the beginning of this section are captured by MCEs for basic database transitions.

Furthermore, property (iv) employs the specialization order $\sqsubseteq$ again. This may seem surprising at first, but it turns out to be a natural definition, as shown in the following lemma, which follows directly from the definitions.

Lemma 6.8. Let $S$ be a database transition and $I$ a static constraint on $X$. Let $J$ be a low $\delta$-constraint of $S$ with respect to $I$. Then $wp(S)^*(true) \to S(J)_I$ is the MCE with respect to $I$. □

From the lemma we may draw first conclusions:

- For a chosen low $\delta$-constraint with respect to $I$ the MCE $S_I$ always exists and is uniquely determined (up to semantic equivalence) by $S$, $I$ and $J$.

- MCEs are closely related to GCSs. Apart from the precondition $wp(S)^*(true)$ the MCE is the GCS of a slightly extended database transition, i.e. possible changes have been incorporated into $S(J)$.

The lemma suggests applying the theory of GCS construction from Section 6.2 to the construction of MCEs. This idea, however, is misleading, since there is no effective way to construct $S(J)$. Instead, we shall investigate effective MCE construction below. On the other hand, we can show that commutativity also holds for MCEs.

Proposition 6.9. For static constraints $I_1$, $I_2$ each preconditioned MCE $I_1 \wedge I_2 \to S_{I_1 \wedge I_2}$<br />

is semantically equivalent to $I_1 \wedge I_2 \to (S_{I_1})_{I_2}$ and vice versa.<br />

□<br />

6.3.2 Effective MCE Construction<br />

Let us now ask for the effective construction of MCEs for complex database transitions. Again<br />

a naive approach – replacing just basic operations such as insertions and deletions by some<br />

of their MCEs – leads to wrong results, but we observe that an MCE $S_I$ for a chosen low<br />

$\delta$-constraint $J$ is a specialization of $S(J)'_I$, a database transition that is basically built by<br />

replacing basic database transitions by their GCSs. Hence, it seems promising not to consider<br />

the replacement by GCSs, but by selected MCEs.<br />

As in the case of GCS construction we have to require $I$-reducedness, the purely technical<br />

condition which excludes self-repairing within sequences [5]. Then it can be shown that for<br />

$I$-reduced database transitions each MCE is itself a specialization of the database transition<br />

$(S_I)'$, a database transition that is basically built by replacing basic database transitions by<br />

MCEs (upper bound theorem). It then turns out that adding a precondition gives an MCE<br />

(main theorem). Thus, we obtain the following result:<br />

Theorem 6.10. Let $I$ be a static constraint on $X$ and $S$ some $I$-reduced database transition.<br />

Assume $X = \{x_1, \dots, x_n\}$. Let $(S_I)'$ result from $S$ by first replacing each restricted choice<br />

$S_1 \boxtimes S_2$ by $S_1 \,\Box\, (wlp(S_1)(false) \to S_2)$ and then each basic transition in $S$ by one of its MCEs<br />

with respect to $I$. For a disjoint copy $\{z_1, \dots, z_n\}$ of $X$ define<br />

$P((S_I)') \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n))$,<br />

where $T$ results from $(S_I)'$ by renaming all $x_i$ to $z_i$. Then<br />

$S_I = wp(S)^*(true) \to (P((S_I)') \to (S_I)')$<br />

is an MCE for $S$ with respect to $I$.<br />

□<br />

This theorem again requires some informal explanation. Its basic impact is the reduction<br />

of the MCE construction problem to basic operations. Practically, this means choosing an<br />

enforcement strategy for basic operations by means of an MCE. Then the theorem shows<br />

how to construct a corresponding MCE for any complex operation, i.e. for any "intended<br />

transaction".<br />

This also works if alternatives for MCEs of basic operations, i.e. insertions, deletions etc.,<br />

are permitted. In this case each MCE combination can be used in the theorem to define MCEs<br />

of complex transitions.<br />

If we take the theorem together with the commutativity result mentioned beforehand,<br />

this gives effective means for MCE construction for arbitrary sets of constraints and hence<br />

for consistency enforcement in the tailored computational approach.<br />

6.4 Application Example<br />

Let us now look at a simple application example for the (tailored) computational approach.<br />

We consider a simple MCE computation that is similar to the computation of a GCS branch<br />

in [5]. Consider the state space $X = \{x_1, x_2 :: FSET(INT \times INT)\}$, where $FSET(\cdot)$ is the<br />

finite set type constructor. Moreover, consider the static constraints:<br />

$I_1 \equiv map(\pi_1)(x_1) \subseteq map(\pi_1)(x_2)$<br />

$I_2 \equiv \forall x, y :: INT \times INT.\ x \in x_2 \wedge y \in x_2 \wedge \pi_2(x) = \pi_2(y) \Rightarrow \pi_1(x) = \pi_1(y)$<br />

$I_3 \equiv \pi_2(x_1) \cap \pi_2(x_2) = \emptyset$<br />

These are examples of an inclusion dependency, a functional dependency and an exclusion<br />

dependency.<br />

Example 6.3. Let the state space and constraints be as above. Now consider the $\{x_1\}$-operation<br />

$S(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\}$. Let us take the constraints in the given<br />

order.<br />

Step 1. First consider the inclusion constraint $I_1$. We dispense with the proof of $I_1$-reducedness.<br />

$S$ is a deterministic basic assignment that can be replaced by its MCE with respect to $I_1$ and<br />

the low $\delta$-constraint<br />

$J \equiv x_1' = x_1 \cup \{(a, b)\} \wedge \exists c.\ x_2' = x_2 \cup \{(a, c)\}$.<br />

Then we compute $(S_I)'(a, b :: INT) =$<br />

$x_1 := x_1 \cup \{(a, b)\} ;\ (a \notin \pi_1(x_2) \to @ c :: INT \bullet x_2 := x_2 \cup \{(a, c)\} \boxtimes skip)$,<br />

which is an $X$-operation with $P((S_I)') \Leftrightarrow true$. Define this as the new $S$.<br />



Step 2. Now consider the invariant $I_2$. Again the reducedness proof is omitted. We have<br />

to remove the restricted choice and to replace the basic assignment to $x_2$ by the MCE with<br />

respect to $I_2$ and the low $\delta$-constraint<br />

$J \equiv x_1' = x_1 \wedge (x_2' = x_2 \cup \{(a, c)\} \vee x_2' = x_2)$:<br />

$(a \notin \pi_1(x_2) \to c \notin \pi_2(x_2) \to x_2 := x_2 \cup \{(a, c)\}) \,\Box\, (a \in \pi_1(x_2) \to skip)$<br />

Then we compute $P((S_I)') \Leftrightarrow true$. Hence the new $S$ is (after some rearrangements)<br />

$S(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\} ;$<br />

$((a \notin \pi_1(x_2) \to @ c :: INT \bullet c \notin \pi_2(x_2) \to x_2 := x_2 \cup \{(a, c)\})$<br />

$\Box\ a \in \pi_1(x_2) \to skip)$.<br />

Step 3. Now regard the exclusion invariant $I_3$. Reducedness holds, but we omit the proof.<br />

Replace $S_1 = x_1 := x_1 \cup \{(a, b)\}$ in $S$ by the MCE<br />

$x_1 := x_1 \cup \{(a, b)\} ;\ x_2 := x_2 - \{x \in x_2 \mid \pi_2(x) = b\}$<br />

and analogously replace $S_2 = x_2 := x_2 \cup \{(a, c)\}$ by the MCE<br />

$x_2 := x_2 \cup \{(a, c)\} ;\ x_1 := x_1 - \{x \in x_1 \mid \pi_2(x) = c\}$.<br />

Then we compute<br />

$P((S_I)') \Leftrightarrow b \notin \pi_2(x_2) \wedge$<br />

$(a \notin \pi_1(x_2) \Rightarrow \forall c :: INT.\ (c \notin \pi_2(x_2) \Rightarrow c \notin \pi_2(x_1) \cup \{b\}))$,<br />

hence the final result is (after some rearrangements) semantically equivalent to<br />

$S_I(a, b :: INT) = b \notin \pi_2(x_2) \to x_1 := x_1 \cup \{(a, b)\} ;$<br />

$((a \notin \pi_1(x_2) \to @ c :: INT \bullet$<br />

$c \notin \pi_2(x_2) \wedge c \notin \pi_2(x_1) \to x_2 := x_2 \cup \{(a, c)\})$<br />

$\Box\ a \in \pi_1(x_2) \to skip)$.<br />

□<br />
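The final operation of Example 6.3 can be tried out on small finite states. The following is only a sketch under stated assumptions: relations are modelled as Python sets of integer pairs, the unbounded choice over INT is restricted to a finite `candidates` range, a failing guard is modelled by an empty list of successor states, and all function names are hypothetical.

```python
def pi1(rel):
    """Projection on the first component."""
    return {a for a, _ in rel}

def pi2(rel):
    """Projection on the second component."""
    return {b for _, b in rel}

def i1(x1, x2):
    # inclusion dependency: first components of x1 occur in x2
    return pi1(x1) <= pi1(x2)

def i2(x2):
    # functional dependency on x2: the second component determines the first
    seen = {}
    return all(seen.setdefault(b, a) == a for a, b in x2)

def i3(x1, x2):
    # exclusion dependency: second components of x1 and x2 are disjoint
    return not (pi2(x1) & pi2(x2))

def consistent(x1, x2):
    return i1(x1, x2) and i2(x2) and i3(x1, x2)

def s_i(x1, x2, a, b, candidates):
    """All successor states of the final operation for inputs (a, b).

    Assumption: the choice of c ranges over the finite iterable
    `candidates` instead of all of INT.
    """
    if b in pi2(x2):            # outer guard fails: no execution
        return []
    x1_new = x1 | {(a, b)}
    if a in pi1(x2):            # second branch: skip
        return [(x1_new, x2)]
    return [(x1_new, x2 | {(a, c)})
            for c in candidates
            if c not in pi2(x2) and c not in pi2(x1_new)]
```

Starting from a state that satisfies all three constraints, every successor state returned by `s_i` should again satisfy them and contain the intended tuple `(a, b)` in `x1`.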

6.5 Conclusion<br />

We investigated major problems of the computational approach to consistency enforcement.<br />

These problems are related to the specialization order chosen by this approach. We examined<br />

intuitive strategies for consistency enforcement with respect to basic operations and selected<br />

classes of constraints. In these cases the strict effect preservation property of the specialization<br />

approach can be restricted to so-called $I$-compatible $\delta$-constraints. In fact, choosing such<br />

a constraint that is minimal with respect to the implication order gives rise to the definition<br />

of maximal consistent effect preservers (MCEs) as a natural approach to consistency<br />

enforcement.<br />



Fortunately, MCEs are closely related to greatest consistent specializations (GCSs) that<br />

were studied before. Each MCE is given by the GCS of a slightly extended transition and<br />

a precondition. This does not help directly in constructing MCEs, but it turns out that<br />

MCE construction can be done in the same way as GCS construction, i.e., the consistency<br />

enforcement problem can be reduced to finding MCEs for basic operations, which is only a<br />

problem of practical calculation.<br />

Thus, the tailored approach presented in this paper may be considered as a general solution<br />

to (static) consistency enforcement. Moreover, as indicated in [8], an efficient and flexible<br />

implementation can be achieved by the use of linguistic reflection. The only problem that<br />

might be critical concerns the technical prerequisite of $I$-reducedness, which excludes badly<br />

written database transitions. As shown in [7] for selected classes of constraints, it is even<br />

possible to rewrite transitions in such a way that $I$-reducedness always holds.<br />

References for Chapter 6<br />

1. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for Integrity<br />

Maintenance. ACM TODS 19(3), 1994, 367-422.<br />

2. P. Fraternali, S. Paraboschi: Ordering and Selecting Production Rules for Constraint Maintenance:<br />

Complexity and Heuristic Solution. To appear in IEEE TKDE.<br />

3. G. Nelson: A Generalization of Dijkstra's Calculus. ACM TOPLAS 11(4), 1989, 517-561.<br />

4. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases. In S. Chakravarty,<br />

J. Widom (Eds.): Research Issues in Data Engineering – Active Databases. Workshop Proceedings.<br />

Houston, February 1994.<br />

5. K.-D. Schewe, B. Thalheim: A Computational Approach to Consistency Enforcement. Submitted<br />

for publication.<br />

6. K.-D. Schewe, B. Thalheim, J. Schmidt, I. Wetzel: Integrity Enforcement in Object Oriented<br />

Databases. In U. W. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics. Springer Workshops<br />

in Computing. Volkse 1992, 174-195.<br />

7. K.-D. Schewe: Specification and Development of Correct Relational Database Programs. Technical<br />

Report. Submitted for publication.<br />

8. K.-D. Schewe, D. Stemple, B. Thalheim: Higher Level Genericity in Object Oriented Databases. In:<br />

Proc. Conference on the Management of Data (COMAD '94). Bangalore (India), December 1994.<br />



Appendix<br />

6.6 The Predicate Transformer Calculus<br />

This section gives a brief review of Dijkstra's classical calculus [3]. Assume that $S$ is a program<br />

specification and that $X$ is the finite set of variables occurring in $S$. We usually call $X$ a state<br />

space. If $D$ is a set of values (or more generally a domain), then a state is simply a variable<br />

assignment $\sigma : X \to D$. Let $\Sigma$ be the set of all such states (more generally: a power domain).<br />

Then the overall meaning of $S$ can be given by a subset $\Delta(S) \subseteq \Sigma \times (\Sigma \cup \{\infty\})$, where<br />

$(\sigma, \tau) \in \Delta(S)$ means that starting $S$ in the initial state $\sigma$ may lead to the final state $\tau$, and $\infty$<br />

represents non-termination. This description does not depend on the style of the specification<br />

$S$. Of course, this trivial semantics description comprises non-determinism and partiality.<br />

Consider an infinitary logic $\mathcal{L}$ and assume that there is an equality predicate. Regard formulae<br />

$R$ with free variables in $X$. These are called $X$-predicates. Let $F(X)$ be the set of all<br />

$X$-predicates. Let $St = (D, \omega)$ be a fixed structure for the interpretation of $\mathcal{L}$ with semantic<br />

domain $D$ and assume that $St$ satisfies the domain closure property, i.e. for each $d \in D$ there<br />

is some closed term $t \in T(\mathcal{L})$ with $\omega(t) = d$. Obviously, a state $\sigma$ is sufficient to interpret an<br />

$X$-predicate. Write $\models_\sigma R$ if interpreting $R$ in state $\sigma$ yields true. Now define two mappings<br />

$wlp(S)$ and $wp(S)$ on equivalence classes of $X$-predicates, where $\tau$ ranges over $\Sigma \cup \{\infty\}$:<br />

$\models_\sigma wlp(S)(R)$ iff $((\sigma, \tau) \in \Delta(S) \wedge \tau \neq \infty) \Rightarrow \models_\tau R$ for all $\tau$, and (6.71)<br />

$\models_\sigma wp(S)(R)$ iff $(\sigma, \tau) \in \Delta(S) \Rightarrow (\tau \neq \infty \wedge \models_\tau R)$ for all $\tau$. (6.72)<br />

We call $w(l)p(S)(R)$ the weakest (liberal) precondition of $S$ for the postcondition $R$. Note that<br />

this definition precisely formalizes the informal meaning of $wlp(S)$ and $wp(S)$. Moreover, the<br />

predicate transformers are uniquely determined by $\Delta(S)$ up to equivalence.<br />

Theorem 6.11. For a given program specification $S$ the predicate transformers $wlp(S)$ and<br />

$wp(S)$ exist. Moreover, they satisfy<br />

$wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(true)$ (pairing condition) and (6.73)<br />

$wlp(S)(\bigwedge_{i \in I} R_i) \Leftrightarrow \bigwedge_{i \in I} wlp(S)(R_i)$ (universal conjunctivity). (6.74)<br />

□<br />
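For a finite state space, definitions (6.71) and (6.72) and the pairing condition (6.73) can be checked by brute force. The sketch below is illustrative only: states are plain integers, the relational meaning of a specification is passed in as a set `delta` of (initial, final) pairs, the sentinel `INF` stands for non-termination, and all names are assumptions made for this example.

```python
from itertools import combinations

INF = object()   # sentinel for non-termination

def wlp(delta, states, post):
    """States from which every terminating outcome lies in post (6.71)."""
    return {s for s in states
            if all(t in post for (s0, t) in delta if s0 == s and t is not INF)}

def wp(delta, states, post):
    """States from which every outcome terminates and lies in post (6.72)."""
    return {s for s in states
            if all(t is not INF and t in post for (s0, t) in delta if s0 == s)}

def subsets(states):
    items = list(states)
    return (set(c) for r in range(len(items) + 1)
            for c in combinations(items, r))

def pairing_holds(delta, states):
    """Check wp(R) = wlp(R) and wp(true) for every predicate R (6.73)."""
    total = wp(delta, states, set(states))
    return all(wp(delta, states, r) == wlp(delta, states, r) & total
               for r in subsets(states))

# Toy specification: state 0 may end in 1 or 2, state 1 may end in 0 or
# fail to terminate, state 2 has no computation at all (partiality).
STATES = {0, 1, 2}
DELTA = {(0, 1), (0, 2), (1, INF), (1, 0)}
```

Note that a state without any computation, such as state 2 in the toy data, satisfies both preconditions vacuously, which is how partiality shows up in this semantics.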

The following inversion theorem shows that universal conjunctivity and the pairing condition<br />

already suffice to find a specification $S$ with corresponding predicate transformers. For this<br />

recall that the dual $f^*$ of a predicate transformer $f$ is defined as $f^*(R) = \neg f(\neg R)$.<br />

Theorem 6.12. Let $flp$ and $fp$ be predicate transformers satisfying (6.73) and (6.74) in<br />

place of $wlp(S)$ and $wp(S)$. Then for a program specification $S$ with<br />

$\Delta(S) = \{(\sigma, \tau) \mid \models_\sigma flp^*(P_\tau)\} \cup \{(\sigma, \infty) \mid \models_\sigma fp^*(false)\}$<br />

we have $wlp(S)(R) \Leftrightarrow flp(R)$ and $wp(S)(R) \Leftrightarrow fp(R)$.<br />

□<br />



In [3] recursion has been investigated with respect to the order $\sqsubseteq$ defined by $S \sqsubseteq T$ iff<br />

$wlp(T)(R) \Rightarrow wlp(S)(R)$ and $wp(S)(R) \Rightarrow wp(T)(R)$ hold for all $X$-predicates $R$. Therefore,<br />

for monotonic $f$ with respect to $\sqsubseteq$ the program specification $T = \mu S.f(S)$ can be defined as<br />

a least fixpoint and $wlp(T)$ (resp. $wp(T)$) is defined by conjunction (disjunction).<br />

Finally, regard the following language of guarded commands built recursively from the<br />

following constructs:<br />

(i) assignments $x := E$ for a variable $x$ and a term $E$,<br />

(ii) skip, fail, loop,<br />

(iii) sequential composition $S_1; S_2$, choice $S_1 \,\Box\, S_2$, projection $@ x \bullet S$, guard $P \to S$ and<br />

restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable.<br />

The informal meaning of an assignment is the usual one. skip is an operation that "does<br />

nothing", loop is always defined, but never terminates, and fail is always undefined. The<br />

latter two commands are only justified as least elements with respect to the Nelson order –<br />

used for recursion – and the specialization order.<br />

The intended meaning of a sequence is also the standard one: first execute $S_1$, then $S_2$. A<br />

guard defines a precondition. If $P$ is satisfied, $S$ is executed, otherwise there is no execution.<br />

Choice means demonic choice, i.e., choose any of $S_1$ or $S_2$ as long as it is defined, even if<br />

this leads to non-termination. Restricted choice on the other hand prefers the execution of<br />

$S_1$ unless it is undefined, in which case $S_2$ is taken. Finally, the unbounded choice operator<br />

introduces a new variable $x$ and executes $S$ on the state space extended by $x$.<br />

For this language the axiomatic semantics can be defined by<br />

$w(l)p(x := E)(R) \Leftrightarrow \{x/E\}.R$,<br />

$w(l)p(skip)(R) \Leftrightarrow R$,<br />

$w(l)p(fail)(R) \Leftrightarrow true$,<br />

$wlp(loop)(R) \Leftrightarrow true$ and $wp(loop)(R) \Leftrightarrow false$,<br />

$w(l)p(S_1; S_2)(R) \Leftrightarrow w(l)p(S_1)(w(l)p(S_2)(R))$,<br />

$w(l)p(P \to S)(R) \Leftrightarrow P \Rightarrow w(l)p(S)(R)$,<br />

$w(l)p(S_1 \,\Box\, S_2)(R) \Leftrightarrow w(l)p(S_1)(R) \wedge w(l)p(S_2)(R)$,<br />

$w(l)p(S_1 \boxtimes S_2)(R) \Leftrightarrow w(l)p(S_1)(R) \wedge (wp(S_1)(false) \Rightarrow w(l)p(S_2)(R))$ and<br />

$w(l)p(@ x \bullet S)(R) \Leftrightarrow \forall x.\ w(l)p(S)(R)$.<br />
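The rules above translate almost literally into executable predicate transformers over a finite state space. The following sketch rests on several assumptions: a state is a single integer variable ranging over 0..3, a predicate is the frozenset of states satisfying it, unbounded choice and recursion are left out, and all names (`Cmd`, `assign`, `rchoice`, ...) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

Pred = FrozenSet[int]
STATES: Pred = frozenset(range(4))   # toy state space: one variable x in 0..3

@dataclass(frozen=True)
class Cmd:
    wlp: Callable[[Pred], Pred]
    wp: Callable[[Pred], Pred]

def implies(p: Pred, q: Pred) -> Pred:
    """States satisfying p => q."""
    return (STATES - p) | q

skip = Cmd(lambda r: r, lambda r: r)
fail = Cmd(lambda r: STATES, lambda r: STATES)        # always undefined
loop = Cmd(lambda r: STATES, lambda r: frozenset())   # never terminates

def assign(f: Callable[[int], int]) -> Cmd:
    """x := f(x); the substitution rule becomes a preimage computation."""
    t = lambda r: frozenset(s for s in STATES if f(s) in r)
    return Cmd(t, t)

def seq(s1: Cmd, s2: Cmd) -> Cmd:
    return Cmd(lambda r: s1.wlp(s2.wlp(r)), lambda r: s1.wp(s2.wp(r)))

def guard(p: Pred, s: Cmd) -> Cmd:
    return Cmd(lambda r: implies(p, s.wlp(r)), lambda r: implies(p, s.wp(r)))

def choice(s1: Cmd, s2: Cmd) -> Cmd:
    return Cmd(lambda r: s1.wlp(r) & s2.wlp(r), lambda r: s1.wp(r) & s2.wp(r))

def rchoice(s1: Cmd, s2: Cmd) -> Cmd:
    """Restricted choice: s2 only counts where s1 is undefined."""
    undefined = s1.wp(frozenset())    # wp(S1)(false)
    return Cmd(lambda r: s1.wlp(r) & implies(undefined, s2.wlp(r)),
               lambda r: s1.wp(r) & implies(undefined, s2.wp(r)))

# Example: increment x (modulo 4) while x is small, otherwise do nothing.
inc = assign(lambda x: (x + 1) % 4)
small = frozenset({0, 1})
prog = rchoice(guard(small, inc), skip)
```

For `prog`, the states 0 and 1 take the guarded increment, while 2 and 3 fall through to `skip` because the guard leaves the first component undefined there.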

Then for all expressions $f(S)$ built from the constructors above $f$ will be monotonic with<br />

respect to $\sqsubseteq$ and we get<br />

$wlp(\mu S.f(S))(R) \Leftrightarrow \bigwedge_\alpha wlp(f^\alpha(loop))(R)$ and $wp(\mu S.f(S))(R) \Leftrightarrow \bigvee_\alpha wp(f^\alpha(loop))(R)$,<br />

where $\alpha$ ranges in both cases over the ordinal numbers.<br />

For any guarded command $S$ we may also consider the conjugate predicate transformers<br />

$wp(S)^*$ and $wlp(S)^*$, which are defined by<br />

$w(l)p(S)^*(R) = \neg w(l)p(S)(\neg R)$.<br />



6.7 I-reducedness<br />

For all the constructors for a guarded command $S$ except the sequence, each computation<br />

of the resulting complex operation already occurs as a computation of one of the involved<br />

components. Therefore we may expect that GCS construction can be done componentwise.<br />

For sequences, however, this is not the case.<br />

Let us now define the technical $I$-reducedness condition for sequences. Assume that the<br />

types of values are understood from the context.<br />

Definition 6.13. Let $S = S_1; S_2$ be a $Y$-operation such that $S_i$ is a $Y_i$-operation for $Y_i \subseteq Y$<br />

($i = 1, 2$). Let $I$ be some $X$-invariant with $Y \subseteq X$. Let $X - Y_1 = \{y_1, \dots, y_m\}$, $Y_1 =$<br />

$\{x_1, \dots, x_l\}$ and assume that $\{x_1', \dots, x_l'\}$ is a disjoint copy of $Y_1$, disjoint also from $X$. Then<br />

$S$ is called $\delta$-$I$-reduced iff the following two conditions hold:<br />

(i) For all states $\sigma$ with $\models_\sigma \neg I$ we have: if<br />

$P_\sigma \Rightarrow \{x_1/x_1', \dots, x_l/x_l'\}.(\forall v_i\,(i = 1, \dots, m).\ \{y_1/v_1, \dots, y_m/v_m\}.I)$<br />

is a $\delta$-constraint for $S_1$, then it is also a $\delta$-constraint for $S$.<br />

(ii) For all states $\sigma$ we have: if<br />

$P_\sigma \Rightarrow \{x_1/x_1', \dots, x_l/x_l'\}.(\forall v_i\,(i = 1, \dots, m).\ \{y_1/v_1, \dots, y_m/v_m\}.\neg I)$<br />

is a $\delta$-constraint for $S_1$, then it is also a $\delta$-constraint for $S$.<br />

Example 6.4. Take $X = \{x_1 :: FSET(T), x_2 :: FSET(T)\}$, $I \equiv x_1 \subseteq x_2$ and $S(x, y :: T) =$<br />

$S_1; S_2$ with $S_1 = x_2 := x_2 - \{x\}$ and $S_2 = x_2 := x_2 \cup \{y\}$.<br />

(i) A $\delta$-constraint for $S_1$ in the form $P_\sigma \Rightarrow D' \subseteq C'$ with $P_\sigma \equiv C = C' \wedge D = D'$ is<br />

$C = C' \wedge D = D' \wedge D' \subseteq C' \wedge x \notin D' \Rightarrow D' \subseteq C'$.<br />

Since Definition 6.13 additionally requires $\models_\sigma \neg I$, i.e. $D' \not\subseteq C'$, the conjunction of such<br />

constraints is true, which is also a $\delta$-constraint of $S$.<br />

(ii) Now take a $\delta$-constraint of $S_1$ in the form $P_\sigma \Rightarrow D' \not\subseteq C'$. Then we have<br />

$\{C'/C, D'/D\}.wlp(\{C/C', D/D'\}.S_1)(C = C' \wedge D = D' \Rightarrow D' \not\subseteq C') \Leftrightarrow$<br />

$C = C' \wedge D = D' \Rightarrow D \not\subseteq C - \{x\} \Leftrightarrow$<br />

$D' \not\subseteq C' - \{x\} \Leftrightarrow$<br />

$x \in D' \vee D' \not\subseteq C'$.<br />

Since Definition 6.13 additionally requires $\models_\sigma I$, i.e. $D' \subseteq C'$, the conjunction $J$ of such<br />

constraints is<br />

$x \in D \wedge D \subseteq C \Rightarrow D' \not\subseteq C'$.<br />

Then we compute<br />

$\{C'/C, D'/D\}.wlp(\{C/C', D/D'\}.S)(J) \Leftrightarrow$<br />

$x \in D \wedge D \subseteq C \Rightarrow D \not\subseteq (C - \{x\}) \cup \{y\} \Leftrightarrow$<br />

true for $x \neq y$,<br />

false for $x = y$.<br />


Hence $S$ is $\delta$-$I$-reduced only if $x \neq y$, but the operation<br />

$S'(x, y :: T) = (x \neq y \to S_1; S_2) \boxtimes S_2$<br />

is always $\delta$-$I$-reduced and semantically equivalent to $S$.<br />

□<br />
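The claimed equivalence of the original and the rewritten operation in Example 6.4 can be checked exhaustively on a small universe. This sketch assumes that the second step of the sequence inserts its argument into the second set, which is what the computation with $(C - \{x\}) \cup \{y\}$ in part (ii) indicates; the sets are modelled as frozensets and the function names are hypothetical.

```python
from itertools import combinations

def s1(x1, x2, x):
    """First step: remove x from x2."""
    return x1, x2 - {x}

def s2(x1, x2, y):
    """Second step (assumed): insert y into x2."""
    return x1, x2 | {y}

def s(x1, x2, x, y):
    """The sequence S1; S2."""
    return s2(*s1(x1, x2, x), y)

def s_prime(x1, x2, x, y):
    """(x != y -> S1; S2) restricted-choice S2."""
    if x != y:
        return s(x1, x2, x, y)
    return s2(x1, x2, y)        # guard fails, so the second branch is taken

def holds_i(x1, x2):
    """The invariant: x1 is a subset of x2."""
    return x1 <= x2

def powerset(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def equivalent_on(universe):
    """Compare both operations on all states and inputs over the universe."""
    return all(s(x1, x2, x, y) == s_prime(x1, x2, x, y)
               for x1 in powerset(universe) for x2 in powerset(universe)
               for x in universe for y in universe)
```

For `x == y` the sequence first violates the invariant (when `x` is in `x1`) and then repairs it again, which is the self-repairing behaviour that reducedness excludes; the rewritten operation reaches the same final state without the intermediate violation.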

We may extend this definition to arbitrary operations by requiring all occurring sequences to be<br />

$\delta$-$I$-reduced.<br />

Definition 6.14. Let $S$ be an $X$-operation and $I$ some $Y$-invariant with $X \subseteq Y$. $S$ is called<br />

$I$-reduced iff the following holds:<br />

(i) If $S$ is one of fail, skip, loop or an assignment, then $S$ is always $I$-reduced.<br />

(ii) If $S = S_1; S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ are $I$-reduced and $S$ is $\delta$-$I$-reduced.<br />

(iii) If $S$ is one of $P \to T$, $@ y :: T_y \bullet T$, $S_1 \,\Box\, S_2$ or $S_1 \boxtimes S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$<br />

or $T$ respectively are $I$-reduced.<br />

(iv) If $S = \mu T.f(T)$, then $S$ is $I$-reduced iff $f^\alpha(loop)$ is $I$-reduced for each ordinal number $\alpha$.<br />



Chapter 7<br />

Limits of Rule Triggering Systems<br />

for Integrity Maintenance in the<br />

Context of Transaction<br />

Specifications<br />

Contents<br />

7.1 Introduction<br />

7.2 Unrepairable Transitions<br />

7.3 Critical Paths<br />

7.4 Stratified Constraint Sets<br />

7.5 An Algorithm for Checking Stratification<br />

7.6 Locally Stratified Constraint Sets<br />

7.7 Complexity of Local Stratification<br />

7.8 Conclusion<br />

This chapter contains a reprint of<br />

K.-D. Schewe, B. Thalheim. Limits of Rule Triggering Systems for Integrity Maintenance<br />

in the Context of Transaction Specifications. Acta Cybernetica 1998 (to appear).<br />



Abstract. Integrity maintenance is considered one of the major application fields of rule<br />

triggering systems (RTSs). In the case of a given integrity constraint being violated by a<br />

database transition these systems trigger repairing actions. However, it will be shown that<br />

for any set of constraints there exist unrepairable transitions, which depend on the closure<br />

of the constraint set. This implies that integrity maintenance by RTSs is only possible if the<br />

constraint implication problem is decidable.<br />

Even if unrepairable transitions are excluded, this does not prevent the RTS from producing<br />

undesired behaviour. Writing constraints as sets (conjunctions) of simple ones in implicative<br />

normal form, this behaves well if there is only one such constraint. In general, however, the<br />

rule triggering approach fails to solve the problem.<br />

Analyzing the behaviour of RTSs leads to the definition of critical paths in associated<br />

rule hypergraphs and the requirement of such paths being absent. It will be shown that this<br />

requirement can be satisfied if the underlying set of constraints is stratified, but this notion<br />

turns out to be too strong to be also necessary. A sufficient and necessary condition for the<br />

absence of critical paths is obtained if sets of constraints are required to be locally stratified.<br />

Keywords: active databases, integrity maintenance<br />

7.1 Introduction<br />

Active databases (ADBs) aim at extending relational (or object oriented) DBMSs by rule<br />

triggering systems (RTSs), i.e. by sets of rules which on a given event and in the case of a<br />

condition being satisfied trigger actions on the database (ECA-rules). Events can be external<br />

events, time conditions or internal events resulting from operations on the database. Conditions<br />

are usually given by boolean queries that have to be evaluated against the database.<br />

The action part consists of a sequence of basic operations to insert, delete or update tuples<br />

(or objects respectively) in the database.<br />

The current research on ADBs (see e.g. [3]) is dominated by implementational aspects,<br />

whilst foundations of RTSs are seldom approached. The work in [1, 2, 4, 9, 10] and partly<br />

in [3] considers the problem of enforcing database integrity by the use of RTSs. The results<br />

concern the generation of repairing ECA-rules and partly the analysis of the resulting RTS.<br />

This analysis concentrates on the termination of the rule system, the independence of the final<br />

database state from the chosen selection order of the rules (confluence) and on consistency.<br />

These requirements are not sufficient for a reasonable rule behaviour, because it is easy<br />

to define an RTS that empties the database in case of any constraint violation. Therefore,<br />

we claim an additional requirement, which informally means that the intended effect of a<br />

transition may not be turned into its opposite by the RTS.<br />

In this short paper we analyze the limits of the rule triggering approach. For a given set<br />

of constraints in implicational normal form we first investigate the existence of unrepairable<br />

transitions. These are determined by the closure of the constraint set. It turns out that the<br />

decidability of the constraint implication problem is necessary for integrity maintenance by<br />

RTSs.<br />

Next we analyze how to obtain RTSs that definitely repair constraint violations by a<br />

(repairable) transition without invalidating its intended effect. Given an RTS we first associate<br />

with it a rule hypergraph which corresponds to the possible sequences of triggered rules. Next<br />

we define critical trigger paths in these hypergraphs that correspond to the propagation of<br />

conditions. Indeed it can be shown that the existence of a single critical trigger path makes<br />

the RTS work incorrectly for at least one transition.<br />

Finally, we analyze constraint sets in order to detect whether it is possible to define an<br />

RTS of repairing actions such that the critical trigger paths in its associated hypergraph can<br />

only invalidate unrepairable transitions. For this we first introduce stratified constraint sets<br />

that satisfy this condition. Since the converse is not true, we finally weaken the concept to<br />

locally stratified constraint sets, which gives a necessary and sufficient condition for the RTS<br />

to work correctly.<br />

7.2 Unrepairable Transitions<br />

In the following we consider the relational data model with integrity constraints given by<br />

formulae in implicative normal form<br />

$I \equiv p_1(x_1) \wedge \dots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \dots \vee q_m(y_m)$ (7.75)<br />

with predicate symbols $p_i$, $q_j$, which correspond either to a relation of the schema or are<br />

comparison predicates ($=$, $\neq$


Let us first demonstrate the insufficiency of a naive RTS approach by a simple example.<br />

In "real" applications the situation of Example 7.1 will not occur in such an obvious way,<br />

but there are always implied and in general not detectable constraints leading to analogous<br />

problems as shown in [6].<br />

Example 7.1. Take two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and<br />

$I_2 \equiv p(x) \wedge q(x) \Rightarrow false$. This implies $p$ to be always empty, hence insertions into $p$ should<br />

be disallowed. Then we obtain the following repairing rules:<br />

$R_1$: ON $insert_p(x)$ IF $\neg I_1$ DO $insert_q(x)$<br />

$R_2$: ON $delete_q(x)$ IF $\neg I_1$ DO $delete_p(x)$<br />

$R_3$: ON $insert_p(x)$ IF $\neg I_2$ DO $delete_q(x)$<br />

$R_4$: ON $insert_q(x)$ IF $\neg I_2$ DO $delete_p(x)$<br />

If we try to execute a transition $insert_p(a)$ on a database state satisfying $q(a)$, then we<br />

successively trigger the rules $R_3$ and $R_2$ with the effect of only deleting $a$ in $q$. This contradicts<br />

the original intention of the transition.<br />

□<br />
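The behaviour described in Example 7.1 can be reproduced with a very small rule engine. The sketch below makes several assumptions that are not part of the example: the database is a dictionary of two Python sets, rules are tried in the order R1 to R4 for each event, each constraint is evaluated only for the value carried by the event, and all names are hypothetical.

```python
from collections import deque

def i1(db, x):
    """I1 instantiated at x: p(x) implies q(x)."""
    return x not in db["p"] or x in db["q"]

def i2(db, x):
    """I2 instantiated at x: p(x) and q(x) do not hold together."""
    return not (x in db["p"] and x in db["q"])

def apply_op(db, op):
    kind, rel, x = op
    if kind == "insert":
        db[rel].add(x)
    else:
        db[rel].discard(x)

# (name, triggering event, constraint whose violation is the condition, action)
RULES = [
    ("R1", ("insert", "p"), i1, ("insert", "q")),
    ("R2", ("delete", "q"), i1, ("delete", "p")),
    ("R3", ("insert", "p"), i2, ("delete", "q")),
    ("R4", ("insert", "q"), i2, ("delete", "p")),
]

def run(db, op, max_steps=100):
    """Execute op, then fire repairing rules until no event is pending."""
    fired = []
    apply_op(db, op)
    pending = deque([op])
    while pending and len(fired) < max_steps:
        kind, rel, x = pending.popleft()
        for name, event, constraint, (akind, arel) in RULES:
            if event == (kind, rel) and not constraint(db, x):
                action = (akind, arel, x)
                apply_op(db, action)
                fired.append(name)
                pending.append(action)
    return fired

db = {"p": set(), "q": {"a"}}
trace = run(db, ("insert", "p", "a"))
```

In this run the insertion into `p` is undone and `a` is additionally removed from `q`, so the only lasting change is a deletion that the transition never asked for.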

In order to analyze the unintended behaviour in Example 7.1 consider a set $\Sigma$ of constraints in<br />

implicational normal form. Let $\Sigma^*$ denote the (semantic) closure, i.e. $\Sigma^* = \{I \mid \Sigma \models I\}$. Now<br />

let $I \in \Sigma^*$ be non-trivial, i.e. it does not hold in all database states. Write $I$ in implicational<br />

normal form<br />

$I \equiv p_1(x_1) \wedge \dots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \dots \vee q_m(y_m)$<br />

and let $p_{i_1}, \dots, p_{i_k}$ and $q_{j_1}, \dots, q_{j_\ell}$ denote the relation symbols on the left and right hand<br />

sides of $I$ respectively. We may define a transition $T$ by<br />

$delete_{q_{j_1}}(y_{j_1}); \dots; delete_{q_{j_\ell}}(y_{j_\ell}); insert_{p_{i_1}}(x_{i_1}); \dots; insert_{p_{i_k}}(x_{i_k})$.<br />

If we start $T$ with values for the $x_i$ and $y_j$ such that the additional conditions on the left<br />

hand side of $I$ are satisfied, whilst the additional conditions on the right hand side are not, $T$<br />

will always reach a database state satisfying $\neg I$. This effect of $T$ is intentional and hence the<br />

only reasonable approach to integrity maintenance in this case is to disallow such transitions.<br />

More formally, the effect of a transition $T$ in a state $\sigma$ is given by the strongest (with respect to $\Rightarrow$) formula $E_\sigma(T) = \psi$ such that $\models_\sigma wp(T)(\psi)$ holds. Here $wp(T)(\psi)$ denotes the weakest precondition of $\psi$ under the transition $T$, i.e. starting $T$ in an initial state satisfying $wp(T)(\psi)$ will reach a final state satisfying $\psi$.

Since we only consider sequences of insertions and deletions, $E_\sigma(T)$ can always be written as a conjunction of literals, i.e. in negated implicational normal form, with the positive literals corresponding to insertions and the negative ones to deletions. In addition, we may consider the effect of a sequence $T ; RTS$, where $T$ is a transition and $RTS$ a system of rules. We say that $RTS$ invalidates the effect of $T$ iff $\not\models E_\sigma(T) \wedge E_\sigma(T ; RTS)$ holds for some state $\sigma$.
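Since effects are conjunctions of literals, both the effect of an update sequence and the invalidation test can be sketched in a few lines. The code below is a simplified illustration, not the formal definition: it represents an effect as a mapping from ground atoms to truth values (the last operation on an atom wins), and it reads "invalidates" as "the two effects cannot hold together", which is the reading used later in the proof of Proposition 7.4. The function names `effect` and `invalidates` are hypothetical.

```python
def effect(operations):
    """Effect of a sequence of (op, relation, value) updates as {atom: truth value}.

    An insertion yields a positive literal, a deletion a negative one;
    a later operation on the same atom overrides an earlier one.
    """
    literals = {}
    for op, relation, value in operations:
        literals[(relation, value)] = (op == "insert")
    return literals

def invalidates(effect_t, effect_t_rts):
    """True if some literal of E(T) occurs negated in E(T; RTS)."""
    return any(
        atom in effect_t_rts and effect_t_rts[atom] != positive
        for atom, positive in effect_t.items()
    )

# Example 7.1: T = insert_p(a); the triggered rules then delete q(a) and p(a).
e_t = effect([("insert", "p", "a")])
e_t_rts = effect([("insert", "p", "a"), ("delete", "q", "a"), ("delete", "p", "a")])
```

Here `invalidates(e_t, e_t_rts)` holds, because $p(a)$ is asserted by the transition and negated after the rules have run.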

Then it is justified to call a transition $T$ repairable with respect to the constraint set $\Sigma$ iff $\neg E_\sigma(T) \notin \Sigma^*$ holds for at least one state $\sigma$. Then a complete terminating system $RTS$ of ECA-rules always invalidates the effect of a non-repairable transition $T$. Hence the problem is to detect (and exclude) non-repairable transitions. In order to decide whether a given transition $T$ is repairable or not, we must be able to decide whether $\neg E_\sigma(T)$ is in the closure $\Sigma^*$. Hence the implication problem for constraints must be decidable.



Proposition 7.1. Let $\Sigma$ be a set of constraints. The problem to decide whether a transition $T$ is repairable with respect to $\Sigma$ is equivalent to the constraint implication problem for $\Sigma$, i.e. the problem to decide whether a given constraint $I$ is a member of $\Sigma^*$ or not. $\Box$

Proposition 7.1 defines the first limit on integrity maintenance by rule triggering systems. In the following sections we shall concentrate on repairable transitions.

Note that our treatment ignores the termination problem. Non-terminating transitions have to be excluded as well, but this problem is independent of the repairability problem, since non-termination of RTSs occurs as an orthogonal problem.

7.3 Critical Paths

Let us ask whether we can always find a complete set of repair rules for all repairable transitions. For this we introduce the notions of associated hypergraphs and critical trigger paths.

Definition 7.2. Let $S = \{p_1, \dots, p_n\}$ be a relational database schema and $RTS = \{R_1, \dots, R_m\}$ a system of ECA-rules on $S$. Then the associated rule hypergraph $(V, E)$ is constructed as follows:

- $V$ is the disjoint union of $S$ and $RTS$. We then talk of $S$-vertices and $RTS$-vertices respectively.

- If $R \in RTS$ has event-part $Ev$ on $p \in S$ and actions on $p_1, \dots, p_k$, then we have a hyperedge from $p$ to $\{R\}$ labelled by $+$ or $-$ depending on $Ev$ being an insert or delete, and a hyperedge from $\{R\}$ to $\{p_1, \dots, p_k\}$ analogously labelled by $k$ values $+$ or $-$. $\Box$
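The construction in Definition 7.2 is mechanical, so a short sketch may help. The encoding below is an assumption made for illustration: a rule is a named tuple with an event `(sign, relation)` and a tuple of actions of the same shape, and a hyperedge is stored as `(source, targets, labels)`. The names `Rule`, `rule_hypergraph` and `EXAMPLE_7_1` are hypothetical.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    name: str
    event: tuple     # (sign, relation); "+" for insert, "-" for delete
    actions: tuple   # tuple of (sign, relation) pairs

def rule_hypergraph(schema, rules):
    """Return (vertices, edges) of the associated rule hypergraph.

    Vertices are tagged so that S and RTS stay disjoint.
    Each edge is a triple (source, targets, labels).
    """
    vertices = {("S", p) for p in schema} | {("RTS", r.name) for r in rules}
    edges = []
    for r in rules:
        sign, relation = r.event
        # hyperedge from the event relation p to {R}, labelled by the event sign
        edges.append((relation, (r.name,), (sign,)))
        # hyperedge from {R} to the relations touched by the actions, one label each
        targets = tuple(rel for _, rel in r.actions)
        labels = tuple(s for s, _ in r.actions)
        edges.append((r.name, targets, labels))
    return vertices, edges

# The four rules of Example 7.1 (conditions are ignored by Definition 7.2).
EXAMPLE_7_1 = [
    Rule("R1", ("+", "p"), (("+", "q"),)),
    Rule("R2", ("-", "q"), (("-", "p"),)),
    Rule("R3", ("+", "p"), (("-", "q"),)),
    Rule("R4", ("+", "q"), (("-", "p"),)),
]

vertices, edges = rule_hypergraph(["p", "q"], EXAMPLE_7_1)
```

For Example 7.1 this yields six vertices and eight edges, each with a single target, which is why the hypergraph degenerates to a simple graph in that case.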

Figure 7.1 shows the associated rule hypergraph of Example 7.1, in which case we have a simple graph.

Fig. 7.1. Associated Rule Hypergraph (drawing not reproduced; $S$-vertices $p$, $q$ and $RTS$-vertices $R_1, \dots, R_4$, edges labelled $+$ or $-$)

Definition 7.2 ignores the condition part of the rules. These come into play if we consider critical trigger paths in associated hypergraphs. These are defined in several steps, starting from paths in the associated hypergraph which correspond to possible sequences of ECA-rules with respect only to their event- and action-parts. Secondly, we attach formulae to the $S$-vertices in the path in such a way that pre- and postconditions of the involved rules are expressed. Then we talk of trigger paths.

A maximal trigger path with contradicting initial and final condition will then be called critical. Then imagine a transition with an effect implied by the initial formula, i.e. that there is an initial state such that running the transition in this state results in a state which satisfies the initial condition of the trigger path. If we execute this transition followed by the rule triggering system, then running the rules along the critical trigger path will turn the effect of the transition into its opposite. This means that the $RTS$ invalidates the effect of at least one transition.

Definition 7.3. Let $G = (V, E)$ be the rule hypergraph associated with a system $RTS$ of rules. A trigger path in $G$ is a sequence $v_0 \, e_1 \, v'_1 \, e'_1 \dots e'_\ell \, v_\ell$ of vertices and hyperedges with the following conditions:

- $v_i \in S$ holds for all $i = 0, \dots, \ell$,

- $v'_i \in RTS$ holds for all $i = 1, \dots, \ell$,

- $e_i$ is a hyperedge from $v_{i-1}$ to $v'_i$ and

- $e'_i$ is a hyperedge from $v'_i$ to $V_i$ with $v_i \in V_i$ and the same label as $e_{i+1}$.

We call $\ell$ the length of the trigger path.

In addition we associate with each vertex $v_i \in S$ ($i = 0, \dots, \ell$) a formula $\varphi_i$ in negated implication normal form such that $\models \varphi_i \Rightarrow cond(v'_{i+1})$ holds for the condition part $cond(v'_{i+1})$ of rule $v'_{i+1} \in RTS$ and $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ holds for the action-part $A_{i+1}$ of rule $v'_{i+1}$ ($i = 0, \dots, \ell - 1$). Furthermore, there is no $e_{\ell+1} \in E$ from $v_\ell$ to $v'_{\ell+1}$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.

Then a trigger path is critical iff $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds. Such a critical trigger path is called admissible iff there is a consistent state $\sigma$ and a repairable transition $T$ such that $E_\sigma(T) \Leftrightarrow \varphi_0$ holds.

$\Box$

Critical trigger paths for the associated rule hypergraph in Figure 7.1 are sketched in Figure 7.2. Note that in this case both critical trigger paths are not admissible.

Fig. 7.2. Critical Trigger Paths (drawing not reproduced). Both paths have the form $v_0 \, e_1 \, v'_1 \, e'_1 \, v_1 \, e_2 \, v'_2 \, e'_2 \, v_2$:

- $p \xrightarrow{+} R_1 \xrightarrow{+} q \xrightarrow{+} R_4 \xrightarrow{-} p$ with formulae $p(x) \wedge \neg q(x)$, $p(x) \wedge q(x)$, $\neg p(x) \wedge q(x)$ at $v_0$, $v_1$, $v_2$;

- $p \xrightarrow{+} R_3 \xrightarrow{-} q \xrightarrow{-} R_2 \xrightarrow{-} p$ with formulae $p(x) \wedge q(x)$, $p(x) \wedge \neg q(x)$, $\neg p(x) \wedge \neg q(x)$ at $v_0$, $v_1$, $v_2$.

If a critical trigger path is not admissible, then only a non-repairable transition can be invalidated by running the rules in the trigger path. Since we exclude non-repairable transitions, we only have to consider admissible trigger paths. After these remarks we are able to prove our first result.

Proposition 7.4. Let $RTS$ be a complete set of rules associated with a set $\Sigma$ of constraints and let $G = (V, E)$ be the associated rule hypergraph. Then $G$ contains an admissible critical trigger path iff there exists a consistent database state $\sigma$ and a repairable transition $T$ such that executing $T$ in $\sigma$ and consecutively running $RTS$ invalidates the effect of $T$ without leaving the database unchanged.



Proof. Let us first assume that $G$ contains an admissible critical trigger path. Let $\varphi_0, \dots, \varphi_\ell$ denote the formulae associated with the $S$-vertices in this trigger path.

Case 1. Assume that $e_1$ is labelled by $+$. Then $\varphi_0$ contains at least one positive literal $p(x)$. Let $\sigma$ be a consistent state and $T$ a repairable transition such that $E_\sigma(T)$ is given by $\varphi_0$. We may assume that $\models_\sigma \neg p(x)$ holds and that the final action in $T$ is an insertion into $p$. If we start $T$ in the initial state $\sigma$, then the resulting state satisfies $\varphi_0$.

$T$ followed by the $RTS$ may then result in a state $\bar\sigma$ satisfying $\varphi_\ell$. Hence the effect of $T ; RTS$ in $\sigma$ is given by $\varphi_\ell$. Since $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds by the definition of critical trigger paths, this implies that $RTS$ invalidates the effect of $T$. Furthermore, $\bar\sigma$ is consistent with respect to all constraints in $\Sigma$, since $RTS$ is complete and there is no hyperedge $e_{\ell+1}$ from $v_\ell$ to some $v'_{\ell+1} \in RTS$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.

It remains to show $\sigma \neq \bar\sigma$. If this does not hold, we get $\models_\sigma \varphi_\ell$ and consequently there exists some $\psi$ such that $\varphi_\ell \Leftrightarrow \neg p(x) \wedge \psi$ and $\varphi_0 \Leftrightarrow p(x) \wedge \psi$ hold. This implies $\ell > 1$, because otherwise the rule $v'_1$ would have the form ON $\mathit{insert}_p(x)$ IF $\neg I$ DO $\mathit{delete}_p(x)$, which we excluded.

If $\ell > 1$ holds, there is at least one other literal $q(y)$ (or $\neg q(y)$) in $\varphi_0$ such that $\mathit{delete}_q(y)$ (or $\mathit{insert}_q(y)$ respectively) occurs in the action-part of $v'_1$. Then we may consider the admissible critical trigger path $v_1 \, e_2 \dots v_\ell$ of length $\ell - 1$ instead. Following the argumentation above, we may choose $\sigma$ and $T$ in such a way that $\models_\sigma \neg q(y)$ (or $\models_\sigma q(y)$ respectively) holds. This implies $\sigma \neq \bar\sigma$ as required.

Case 2. If $e_1$ is labelled by $-$, then $\varphi_0$ contains a literal $\neg p(x)$. Thus, we have to consider a transition $T$ containing $\mathit{delete}_p(x)$ as its final action and a consistent state $\sigma$ with $\models_\sigma p(x)$ and $E_\sigma(T) \Leftrightarrow \varphi_0$. Then we may apply the same arguments as for Case 1.

Conversely, assume that there is no admissible critical trigger path. Let $T$ be a repairable transition and $\sigma$ a database state which is consistent with respect to $\Sigma$. Now start $T$ in $\sigma$ and assume that the resulting state $\sigma'$ is not consistent. Then consider a trigger path of finite length such that $\models_{\sigma'} \varphi_0$ holds. The consecutive execution of the rules in this trigger path will result in a state $\bar\sigma$ satisfying $\varphi_\ell$. Thus, we have $E_\sigma(T) \Leftrightarrow \varphi_0$ and $E_\sigma(T ; RTS) \Leftrightarrow \varphi_\ell$.

According to our assumption, the trigger path used cannot be critical, i.e. $\varphi_\ell \wedge \varphi_0$ is satisfiable. Hence $RTS$ does not invalidate the effect of $T$.

$\Box$

7.4 Stratified Constraint Sets

In view of Proposition 7.4 we may ask for constraint sets that allow us to define complete RTSs which exclude admissible critical trigger paths in their associated hypergraphs. Let us start with a simple example.

Example 7.2. Take again two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv q(x) \Rightarrow p(x)$, which imply that $p$ is always equal to $q$. Then we obtain the following repairing rules:

$R_1$: ON $\mathit{insert}_p(x)$ IF $\neg I_1$ DO $\mathit{insert}_q(x)$

$R_2$: ON $\mathit{delete}_q(x)$ IF $\neg I_1$ DO $\mathit{delete}_p(x)$

$R_3$: ON $\mathit{insert}_q(x)$ IF $\neg I_2$ DO $\mathit{insert}_p(x)$

$R_4$: ON $\mathit{delete}_p(x)$ IF $\neg I_2$ DO $\mathit{delete}_q(x)$

In this case there are no admissible critical paths in the associated rule hypergraph. We omit further details.

$\Box$

Let us now investigate the reason for the absence of admissible critical trigger paths in Example 7.2. This leads us to the notion of a stratified set of constraints.

The motivation behind this is as follows: In Example 7.2 insertions (deletions) on a relation $p$ only trigger insertions (deletions) on $q$ and vice versa. This should be sufficient for not invalidating a once established effect. The corresponding constraints can therefore be grouped together.

Definition 7.5. Let $\Sigma$ be a set of constraints in implicative normal form (7.75) on a schema $S$. Then $\Sigma$ is called stratified iff we have a partition $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ with pairwise disjoint constraint sets $\Sigma_i$ called strata such that the following conditions are satisfied:

(i) If $L$ is a literal on the left hand side (right hand side) of some constraint $I \in \Sigma_i$, then all constraints $J \in \Sigma$ containing a literal $L'$ on the right hand side (left hand side) such that $L$ and $L'$ are unifiable also lie in stratum $\Sigma_i$.

(ii) All constraints $I$, $J$ containing unifiable literals $L$ and $L'$ either both on the left or both on the right hand side must lie in different strata $\Sigma_i$ and $\Sigma_j$.

$\Box$
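The two conditions of Definition 7.5 can be checked directly for a proposed assignment of constraints to strata. The sketch below makes a simplifying assumption that is not in the text: every literal is reduced to its relation symbol, so two literals count as unifiable exactly when their relation symbols agree (adequate for the unary examples with a single variable $x$). The name `is_stratification` and the pair encoding `(lhs, rhs)` are hypothetical.

```python
def is_stratification(constraints, stratum):
    """Check conditions (i) and (ii) of Definition 7.5 for a stratum assignment.

    constraints: list of (lhs, rhs) pairs, each a set of relation symbols
    stratum:     list of stratum numbers, one per constraint
    """
    for i, (lhs_i, rhs_i) in enumerate(constraints):
        for j, (lhs_j, rhs_j) in enumerate(constraints):
            if i == j:
                continue
            # (i) a symbol on opposite sides forces the same stratum
            opposite = (lhs_i & rhs_j) or (rhs_i & lhs_j)
            # (ii) a symbol on the same side forces different strata
            same_side = (lhs_i & lhs_j) or (rhs_i & rhs_j)
            if opposite and stratum[i] != stratum[j]:
                return False
            if same_side and stratum[i] == stratum[j]:
                return False
    return True

# Example 7.2: I_1 = p => q and I_2 = q => p.
example_7_2 = [({"p"}, {"q"}), ({"q"}, {"p"})]
```

For Example 7.2 the assignment `[1, 1]` passes the check while `[1, 2]` fails, matching the observation that the two constraints can be grouped together.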

Now we can prove in general that stratified constraint sets always give rise to RTSs without admissible critical trigger paths in the associated rule hypergraph.

Proposition 7.6. Let $\Sigma$ be a stratified constraint set on a schema $S$. Then there exists a complete RTS such that for any repairable transition $T$ on $S$ the RTS does not invalidate the effect of $T$.

Proof. Given a constraint $I$ in implicative normal form (7.75), each relation symbol $p_i$ on the left hand side gives rise to rules

ON $\mathit{insert}_{p_i}(x_i)$ IF $\neg I$ DO $\mathit{insert}_{q_j}(y_j)$ ,

ON $\mathit{insert}_{p_i}(x_i)$ IF $\neg I$ DO $\mathit{delete}_{p_j}(x_j)$

with relation symbols $q_j$ occurring on the right hand side and $p_j$ ($j \neq i$) on the left hand side of $I$. Similarly, each predicate symbol $q_j$ on the right hand side gives rise to rules

ON $\mathit{delete}_{q_j}(y_j)$ IF $\neg I$ DO $\mathit{insert}_{q_i}(y_i)$ ($i \neq j$) ,

ON $\mathit{delete}_{q_j}(y_j)$ IF $\neg I$ DO $\mathit{delete}_{p_i}(x_i)$

This defines a complete set $RTS$ of rules. Now assume there exists a critical trigger path $v_0 \, e_1 \, v'_1 \, e'_1 \dots e'_\ell \, v_\ell$ in the associated rule hypergraph. Each $RTS$-vertex $v'_i$ corresponds to a constraint $I_i \in \Sigma$. Since $e'_i$ and $e_{i+1}$ are equally labelled, corresponding to the action- or event-part respectively, the construction of the rules above implies that $I_i$ and $I_{i+1}$ lie in the same stratum ($i = 1, \dots, \ell - 1$).

However, the condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ implies that $\varphi_0$ contains a literal $L$ and $\varphi_\ell$ its negation, hence the construction of rules implies that $I_1$ and $I_\ell$ lie in different strata. Hence, there are only critical trigger paths of length $\ell = 1$.

According to our construction of $RTS$ this implies that $\models \varphi_0 \Rightarrow \neg I$ holds for some $I \in \Sigma$. Thus, $\neg \varphi_0 \in \Sigma^*$ holds. Due to the definition of admissible critical trigger paths and the definition of repairable transitions, we conclude that the trigger paths of length $\ell = 1$ cannot be admissible. Then the proposition follows from Proposition 7.4.

$\Box$
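The rule construction at the start of this proof can be sketched as a small generator. This is an illustration under simplifying assumptions that are not in the text: arguments are dropped, so a constraint is given only by the lists of relation symbols on its two sides, and a rule is encoded as a triple `(event, violated constraint, action)`. The function name `repair_rules` is hypothetical.

```python
def repair_rules(name, lhs, rhs):
    """Generate the repairing rules for one constraint `name`.

    lhs, rhs: relation symbols on the left and right hand side.
    Each rule is (event, name, action) and reads
    "ON event IF not name DO action".
    """
    rules = []
    for p in lhs:
        # inserting into a left-hand relation: repair by inserting on the right ...
        for q in rhs:
            rules.append((("insert", p), name, ("insert", q)))
        # ... or by deleting from another left-hand relation
        for other in lhs:
            if other != p:
                rules.append((("insert", p), name, ("delete", other)))
    for q in rhs:
        # deleting from a right-hand relation: repair by inserting into another one ...
        for other in rhs:
            if other != q:
                rules.append((("delete", q), name, ("insert", other)))
        # ... or by deleting on the left
        for p in lhs:
            rules.append((("delete", q), name, ("delete", p)))
    return rules

rules_i1 = repair_rules("I1", ["p"], ["q"])   # I_1 = p(x) => q(x) of Example 7.2
```

For $I_1 \equiv p(x) \Rightarrow q(x)$ the sketch returns exactly the two rules $R_1$ and $R_2$ of Example 7.2. For a constraint with two relation symbols on the left and one on the right it returns six rules, including the cross-deletions between the two left-hand relations.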



Finally, we may ask for cases where stratified constraint sets occur. Recall from [5] that a relational database schema $S$ with constraint set $\Sigma$ is in Entity-Relationship normal form (ERNF), and hence is equivalent to an ER-schema, iff

- all inclusion constraints in $\Sigma$ are key-based and non-redundant,

- there is no cycle of inclusion constraints in $\Sigma$,

- each relation schema $R \in S$ is in BCNF with respect to the functional dependencies in $\Sigma$ and

- there are only inclusion and functional dependencies in $\Sigma$.

If a relational database schema $S$ with constraint set $\Sigma$ is in ERNF, then it is easy to see that $\Sigma$ is stratified.

Corollary 7.7. Let $S$ be a database schema in ERNF with respect to the constraint set $\Sigma$. Then $\Sigma$ is stratified.

$\Box$

Hence, following the design approach of Mannila and Räihä in [5], provided this is sufficient for the application, leads to schemata without any problems concerning consistency enforcement by RTSs.

Example 7.3. Let us look at the following constraints

$I_1$: $p(x, y) \Rightarrow q(x, z)$ and

$I_2$: $q(x, z) \wedge q(y, z) \Rightarrow x = y$ .

Then this set of constraints corresponds to the Entity-Relationship diagram [8] in Figure 7.3. Obviously, the constraint set is stratified.

$\Box$

Fig. 7.3. Entity-Relationship constraints (drawing not reproduced; the diagram shows the types $A$, $B$, $C$, $D$, $p$ and $q$ with two cardinality labels $(0, 1)$)

7.5 An Algorithm for Checking Stratification

Before we analyze the converse of Proposition 7.6 and present the weaker notion of locally stratified constraint sets, let us first concentrate on an algorithm for checking stratification and its complexity. For this we consider the set

$BW = \{\top, \bot\} \cup (\mathbb{N} - \{0\}) \cup \{ \{j_1, \dots, j_n\} \mid n \ge 1, \; j_k \in \mathbb{N} - \{0\} \}$ .

In the algorithm we successively add labels from $BW$ to constraints. A label $i \in \mathbb{N}$ for a constraint $I$ is used to indicate that $I$ must lie in the stratum $\Sigma_i$. A label $\{j_1, \dots, j_n\}$ indicates that $I$ must not lie in $\Sigma_{j_k}$ for $k = 1, \dots, n$. $\bot$ represents no information and $\top$ an inconsistent assignment of stratum numbers.

For a more convenient terminology we call an element of $BW$ black if it is in $(\mathbb{N} - \{0\}) \cup \{\top\}$, otherwise white. Furthermore, we use a commutative, associative binary operation $\oplus$ on $BW$ defined by

$x \oplus \bot = x$ ,

$x \oplus \top = \top$ ,

$i \oplus j = \begin{cases} i & \text{if } i = j \\ \top & \text{otherwise} \end{cases}$ ,

$\{j_1, \dots, j_n\} \oplus \{k_1, \dots, k_m\} = \{j_1, \dots, j_n\} \cup \{k_1, \dots, k_m\}$ and

$i \oplus \{j_1, \dots, j_n\} = \begin{cases} \top & \text{if } i = j_k \text{ for some } k \in \{1, \dots, n\} \\ i & \text{otherwise} \end{cases}$ .
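The label domain $BW$ and its binary operation translate almost literally into code. The encoding below is an assumption chosen for illustration: stratum numbers are Python integers, exclusion sets are frozensets of integers, and the two special elements are the strings `TOP` and `BOT`. The names `combine` and `is_black` are hypothetical.

```python
TOP = "top"   # inconsistent assignment of stratum numbers
BOT = "bot"   # no information

def is_black(x):
    """Black labels are stratum numbers and TOP; everything else is white."""
    return x == TOP or isinstance(x, int)

def combine(x, y):
    """Binary operation on BW, written symmetrically so that it is commutative."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP:
        return TOP
    if isinstance(x, int) and isinstance(y, int):
        return x if x == y else TOP
    if isinstance(x, frozenset) and isinstance(y, frozenset):
        return x | y
    # one stratum number and one set of forbidden strata
    number, forbidden = (x, y) if isinstance(x, int) else (y, x)
    return TOP if number in forbidden else number
```

For instance, `combine(1, frozenset({1}))` yields `TOP`, because stratum 1 is both required and forbidden, whereas `combine(2, frozenset({1}))` yields `2`.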

Algorithm 7.8 (Stratification Check).

Input: a set $\Sigma = \{I_1, \dots, I_n\}$ of constraints in clausal form $I_i = L_{i1} \vee \dots \vee L_{i n_i}$

Output: a boolean value $b$

Method:

VAR gather : ARRAY $1 \dots n$ OF $BW$ ,
    mb, mb' : $BW$ ;
BEGIN
  FOR $i = 1$ TO $n$ DO
    gather($i$) := $\bot$
  ENDFOR ;
  $b$ := true ;
  mb := 1 ;
  WHILE $\Sigma \neq \emptyset$ DO
    CHOOSE $i_0 \in \{1, \dots, n\}$ WITH $I_{i_0} \in \Sigma$ AND gather($i_0$) is maximal ;
    $\Sigma$ := $\Sigma - \{I_{i_0}\}$ ;
    IF gather($i_0$) is white
      THEN gather($i_0$) := mb ;
           mb := mb + 1
    ENDIF ;
    mb' := gather($i_0$)
    FOR $j = 1$ TO $n_{i_0}$ DO
      FOR ALL $I_k \in \Sigma$ DO
        FOR $\ell = 1$ TO $n_k$ DO
          IF $L_{i_0 j}$ and $L_{k \ell}$ are unifiable AND gather($i_0$) $\neq \top$
            THEN gather($k$) := gather($k$) $\oplus$ $\{$gather($i_0$)$\}$
          ELSIF $L_{i_0 j}$ and $\neg L_{k \ell}$ are unifiable
            THEN gather($k$) := gather($k$) $\oplus$ gather($i_0$)
          ENDIF
        ENDFOR
      ENDFOR
    ENDFOR
  ENDDO ;
  FOR $i = 1$ TO $n$ DO
    IF gather($i$) $= \top$
      THEN $b$ := false
    ENDIF
  ENDFOR ;
  RETURN($b$)
END

$\Box$
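A direct transcription of Algorithm 7.8 into Python may make the control flow easier to follow. The sketch rests on several assumptions that go beyond the text: a literal is a pair `(positive, predicate)` and two literals are treated as unifiable when their predicate symbols agree (enough for unary relations over one variable); "maximal" is read as the ordering $\bot$ below sets, sets below stratum numbers, and stratum numbers below $\top$; ties are broken by the smallest index. Literals of equal sign add an exclusion set and complementary literals add the stratum number itself, as in Definition 7.5. All names (`combine`, `rank`, `is_stratifiable`) are hypothetical.

```python
TOP, BOT = "top", "bot"

def combine(x, y):
    """The binary operation on BW (same encoding as assumed above the algorithm)."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP:
        return TOP
    if isinstance(x, int) and isinstance(y, int):
        return x if x == y else TOP
    if isinstance(x, frozenset) and isinstance(y, frozenset):
        return x | y
    number, forbidden = (x, y) if isinstance(x, int) else (y, x)
    return TOP if number in forbidden else number

def rank(x):
    """Assumed ordering used by CHOOSE: BOT < set < stratum number < TOP."""
    if x == BOT:
        return 0
    if isinstance(x, frozenset):
        return 1
    if isinstance(x, int):
        return 2
    return 3

def is_stratifiable(constraints):
    """constraints: list of clauses, each a list of (positive, predicate) literals."""
    gather = [BOT] * len(constraints)
    remaining = set(range(len(constraints)))
    mb = 1
    while remaining:
        # max() keeps the first maximal element, i.e. the smallest index on ties
        i0 = max(sorted(remaining), key=lambda i: rank(gather[i]))
        remaining.remove(i0)
        if gather[i0] == BOT or isinstance(gather[i0], frozenset):   # white label
            gather[i0] = mb
            mb += 1
        for sign, pred in constraints[i0]:
            for k in remaining:
                for sign_k, pred_k in constraints[k]:
                    if pred != pred_k:
                        continue
                    if sign == sign_k:
                        # same side: I_k must not share the stratum of I_i0
                        if gather[i0] != TOP:
                            gather[k] = combine(gather[k], frozenset({gather[i0]}))
                    else:
                        # opposite sides: I_k must share the stratum of I_i0
                        gather[k] = combine(gather[k], gather[i0])
    return all(g != TOP for g in gather), gather

# I_1 = p(x) => q(x) and I_2 = q(x) => p(x) in clausal form
two_way = [[(False, "p"), (True, "q")], [(False, "q"), (True, "p")]]
ok, labels = is_stratifiable(two_way)
```

For the two mutually implied constraints both clauses receive the label 1, so the check succeeds with a single stratum.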

We have to check that the algorithm is correct. Then we analyze its time complexity. Before we do this let us first look at a simple example.

Example 7.4. Consider the following constraints:

$I_1 = \neg p(x) \vee \neg q(x) \vee r(x) \vee s(x)$ ,

$I_2 = \neg q(x) \vee r(x) \vee \neg t(x)$ ,

$I_3 = p(x) \vee \neg r(x)$ ,

$I_4 = \neg s(x) \vee t(x)$ and

$I_5 = q(x) \vee \neg t(x)$ .

Then consider Table 1. Each row corresponds to a constraint $I_i$ and lists the values added to gather($i$) during the execution of the algorithm. The chosen order of the constraints in the algorithm is $I_1$, $I_3$, $I_2$, $I_4$, $I_5$. Then $b$ will become false and hence $\Sigma$ is not stratifiable.

Table 1. Stratification Check (a dot marks an empty cell)

row  | L_11   L_12   L_13   L_14   | L_32   L_31   | L_21   L_22   L_23   | L_41   L_42   | L_52   L_51   | gather
1    | 1      1      1      1      | .      .      | .      .      .      | .      .      | .      .      | 1
3    | 1      .      1      .      | 1      1      | .      .      .      | .      .      | .      .      | 1
2    | .      {1}    {1}    .      | 1      .      | $\top$ $\top$ $\top$ | .      .      | .      .      | $\top$
4    | .      .      .      1      | .      .      | .      .      $\top$ | $\top$ $\top$ | .      .      | $\top$
5    | .      1      .      .      | .      .      | $\top$ .      .      | .      $\top$ | $\top$ $\top$ | $\top$
     | I_1                         | I_3           | I_2                  | I_4           | I_5           | b = false

$\Box$

Let us now address the correctness of Algorithm 7.8.

Proposition 7.9. Let $\Sigma$ be a set of constraints. Then $\Sigma$ is stratifiable iff Algorithm 7.8 applied to the input $\Sigma$ computes the output $b = $ true.



Proof. Let us first assume that $\Sigma$ is stratified. Let $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ be a decomposition into strata and assume that the $\Sigma_i$ are taken minimal with the required properties. We use induction on $n$.

For $n = 1$ there are no unifiable literals $L$ and $L'$ in different constraints $I, J \in \Sigma$. Hence gather($i$) will become 1 for all $i$ and we obtain $b = $ true.

For $n > 1$ we may assume without loss of generality that some constraint in $\Sigma_1$ will be chosen first. Then, due to our minimality assumption, we get gather($i$) $= 1$ for all $I_i \in \Sigma_1$, whereas gather($j$) will be white for all $I_j \notin \Sigma_1$. Thus, all constraints in $\Sigma_1$ will be chosen first.

Since gather($j$) was white for $I_j \notin \Sigma_1$ and gather($i$) $= 1$ for $I_i \in \Sigma_1$ before choosing the first constraint in $\Sigma_2 \cup \dots \cup \Sigma_n$, we may apply the induction hypothesis to $\Sigma_2 \cup \dots \cup \Sigma_n$, which gives gather($j$) $\neq \top$ for all $I_j \notin \Sigma_1$. This implies $b = $ true as claimed in the proposition.

Conversely, assume that the algorithm produces the result $b = $ true. Then we must have gather($i$) $\in \mathbb{N} - \{0\}$ for all $i$. Define $\Sigma_k = \{ I_i \in \Sigma \mid \text{gather}(i) = k \}$. Assume that the partition $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ does not satisfy the conditions for strata in Definition 7.5. Then there are two possible cases:

(i) There are literals $L$ and $L'$ in constraints $I_i \in \Sigma_k$ and $I_j \in \Sigma_\ell$ with $k \neq \ell$ such that $L$ and $\neg L'$ are unifiable. Suppose that $I_i$ is chosen first by the algorithm. Then $k$ will be added to gather($j$), which gives gather($j$) $= \top$, contradicting our assumption.

(ii) There are unifiable literals $L$ and $L'$ in constraints $I_i, I_j \in \Sigma_k$. If $I_i$ is chosen first by the algorithm, $\{k\}$ will be added to gather($j$), which also gives gather($j$) $= \top$, contradicting our assumption.

Thus $\Sigma_1 \cup \dots \cup \Sigma_n$ is a partition into strata, which completes the proof.

$\Box$

Proposition 7.10. Let $\Sigma$ be a set of constraints in clausal form, $n = \#\Sigma$, $k$ the maximal arity of predicate symbols occurring in constraints $I \in \Sigma$ and let $\ell$ be the maximum number of literals in these constraints. Then the time complexity of Algorithm 7.8 for checking whether $\Sigma$ is stratified is in $O(k \cdot \ell^2 \cdot n^2)$.

Proof. The initialization and the final computation of $b$ can both be done in $O(n)$ steps. In the inner FOR-loop the test for unifiability can be done in $O(k)$ steps, since there are no function symbols. All other operations have a complexity in $O(1)$. Hence the body of the inner FOR-loop has a total complexity in $O(k)$. This body is executed $\ell' \cdot \ell''$ times, where $\ell'$ is the number of literals in the chosen constraint $I_{i_0}$ and $\ell''$ is the total number of literals in the remaining constraints. If $I_{i_0}$ is the $i$-th constraint chosen by the algorithm, this can be estimated by $\ell^2 \cdot (n - i)$.

Since each $I \in \Sigma$ will be chosen by the algorithm, the outer WHILE-loop will be executed $n$ times. This gives the total complexity in

$O(n) + O\big(\ell^2 \cdot \sum_{i=1}^{n} (n - i)\big) \cdot O(k) + O(n) = O(k \cdot \ell^2 \cdot n^2)$

as claimed in the proposition.

$\Box$
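The only step of the estimate that is not spelled out is the evaluation of the sum over the iterations of the WHILE-loop. Written out with the symbols of Proposition 7.10, it is an ordinary arithmetic series:

```latex
\sum_{i=1}^{n} (n - i) \;=\; \sum_{j=0}^{n-1} j \;=\; \frac{n(n-1)}{2} \;\le\; \frac{n^2}{2},
\qquad\text{hence}\qquad
O(n) + O\!\Big(\ell^2 \cdot \tfrac{n(n-1)}{2}\Big) \cdot O(k) + O(n) \;=\; O(k \cdot \ell^2 \cdot n^2).
```

The two linear terms for initialization and for the final computation of $b$ are absorbed by the quadratic term.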

It is easy to see that $n \cdot \ell$ can be replaced by the total number $u = \sum_{i=1}^{n} n_i$ of literals in $\Sigma$ with $u \le n \cdot \ell$. Thus, the time complexity of the stratification checking Algorithm 7.8 is in $O(k \cdot u^2)$.
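The complexity bound relies on a cheap unifiability test for atoms without function symbols. One way to realise such a test is sketched below; it is an illustration, not the authors' procedure. The assumptions are: an atom is a pair `(predicate, args)`, variables are Python strings, constants are any non-string values, and variables of the two atoms are renamed apart. The sketch uses a small union-find structure, so its running time is close to linear in the arity rather than a strict $O(k)$. The name `unifiable` is hypothetical.

```python
def unifiable(atom1, atom2):
    """Unifiability of two function-free atoms (predicate, args).

    Variables are strings and are local to their atom; constants are
    non-string values and are shared between both atoms.
    """
    (pred1, args1), (pred2, args2) = atom1, atom2
    if pred1 != pred2 or len(args1) != len(args2):
        return False
    parent = {}

    def node(side, term):
        # tag variables with the atom they come from, keep constants global
        return (side, term) if isinstance(term, str) else ("const", term)

    def find(n):
        parent.setdefault(n, n)
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    for a, b in zip(args1, args2):
        ra, rb = find(node(1, a)), find(node(2, b))
        if ra == rb:
            continue
        if ra[0] == "const" and rb[0] == "const":
            return False                    # two distinct constants clash
        if ra[0] == "const":
            parent[rb] = ra                 # a constant stays the representative
        else:
            parent[ra] = rb
    return True
```

With this encoding `unifiable(("p", ("x", "y")), ("p", ("z", "z")))` holds, while `unifiable(("p", ("x", "x")), ("p", (1, 2)))` fails because the repeated variable would have to equal two different constants.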



From Proposition 7.6 we know that active mechanisms can be effectively applied if the constraint set is stratified. In particular, this holds for schemata in ERNF [5], which are equivalent to Entity-Relationship schemata. From Proposition 7.10 we know that a stratification check can be done efficiently.

7.6 Locally Stratified Constraint Sets

Unfortunately, the converse of Proposition 7.6 does not hold, as seen in the next example. The reason for this is that in the proof of Proposition 7.6 we considered all repairing rules for a given constraint, whereas the constraint set in Example 7.5 allows us to select only a subset, thus gaining the required result without losing the completeness of the RTS.

Example 7.5. Take three unary relations $p$, $q$ and $r$ and the constraints $I_1 \equiv p(x) \wedge r(x) \Rightarrow q(x)$, $I_2 \equiv q(x) \Rightarrow p(x)$ and $I_3 \equiv p(x) \Rightarrow r(x)$. It is easy to see that this constraint set is not stratified.

However, we may consider the following system of ECA-rules:

$R_1$: ON $\mathit{insert}_p(x)$ IF $\neg I_1$ DO $\mathit{insert}_q(x)$

$R_2$: ON $\mathit{delete}_q(x)$ IF $\neg I_1$ DO $\mathit{delete}_p(x)$

$R_3$: ON $\mathit{insert}_r(x)$ IF $\neg I_1$ DO $\mathit{insert}_q(x)$

$R_4$: ON $\mathit{delete}_q(x)$ IF $\neg I_1$ DO $\mathit{delete}_r(x)$

$R_5$: ON $\mathit{insert}_q(x)$ IF $\neg I_2$ DO $\mathit{insert}_p(x)$

$R_6$: ON $\mathit{delete}_p(x)$ IF $\neg I_2$ DO $\mathit{delete}_q(x)$

$R_7$: ON $\mathit{insert}_p(x)$ IF $\neg I_3$ DO $\mathit{insert}_r(x)$

$R_8$: ON $\mathit{delete}_r(x)$ IF $\neg I_3$ DO $\mathit{delete}_p(x)$

We dispense with showing that there are no admissible critical trigger paths in the associated rule hypergraph.

Note that the construction in the proof of Proposition 7.6 would result in two more rules corresponding to insertions:

$R_9$: ON $\mathit{insert}_p(x)$ IF $\neg I_1$ DO $\mathit{delete}_r(x)$

$R_{10}$: ON $\mathit{insert}_r(x)$ IF $\neg I_1$ DO $\mathit{delete}_p(x)$

These give rise to admissible critical trigger paths. The one shown in Figure 7.4 invalidates the effect of the repairable transition $\mathit{insert}_p(x)$.

$\Box$

Fig. 7.4. An Admissible Critical Trigger Path (drawing not reproduced). The path $v_0 \, e_1 \, v'_1 \, e'_1 \, v_1 \, e_2 \, v'_2 \, e'_2 \, v_2$ is $p \xrightarrow{+} R_9 \xrightarrow{-} r \xrightarrow{-} R_8 \xrightarrow{-} p$ with formulae $p(x) \wedge \neg q(x) \wedge r(x)$, $p(x) \wedge \neg q(x) \wedge \neg r(x)$, $\neg p(x) \wedge \neg q(x) \wedge \neg r(x)$ at $v_0$, $v_1$, $v_2$.



The constraint set in Example 7.5 is not stratified, but nevertheless the associated RTS does not invalidate the effect of repairable transitions. This shows that a constraint set need not be stratified to allow a reasonable rule behaviour. Indeed, replacing $I_1$ in the example by $I'_1 \equiv p(x) \Rightarrow q(x)$ gives an equivalent constraint set, which is stratified. However, equivalence of constraint sets is undecidable in general. Therefore, we introduce the weaker notion of being locally stratified. In this case we shall construct RTSs which only contain a subset of the set of rules constructed in the proof of Proposition 7.6.

Definition 7.11. Let $\Sigma$ be a set of constraints in implicative normal form on a schema $S$. A labelled subsystem consists of a subset $\Sigma' = \{ I \in \Sigma \mid \pi_L(I) \text{ is defined} \}$ together with a set of clauses $\Sigma'' = \{ \pi_L(I) \mid I \in \Sigma' \}$ and a literal $L$ (the label) such that each constraint $I \in \Sigma'$ can be written as the disjunction $\pi_L(I) \vee I'$ with $\models I' \Rightarrow L$.

Here $\pi_L(I)$ is defined iff the negation ${\sim}L$ does not occur in $I$ (written as a clause). Then $\pi_L(I)$ results from $I$ by omission of the literal $L$ if the result contains at least two literals. Otherwise $\pi_L(I)$ is simply $I$. We call $I'$ the label part and $\pi_L(I)$ the label-free part of the constraint $I$. If $L$ is understood from the context, we drop the subscript and write $\pi$ instead of $\pi_L$.

A labelled subsystem $(\Sigma', \Sigma'', L)$ is called stratified iff the set $\Sigma''$ is stratified in the sense of Definition 7.5 or locally stratified as defined below.

The constraint set $\Sigma$ is called locally stratified iff $\Sigma = \Sigma_1' \cup \dots \cup \Sigma_n'$ with stratified labelled subsystems $(\Sigma_i', \Sigma_i'', L_i)$ $(i = 1, \dots, n)$ such that for each constraint $I \in \Sigma_i'$ and each literal $L$ occurring in its label part with respect to $\Sigma_i'$ there exists another $j$ with $I \in \Sigma_j'$ and $L$ occurring in the label-free part of $I$ with respect to $\Sigma_j'$. $\Box$

Example 7.6. For the constraint set in Example 7.5 we obtain the partition into $\Sigma_1' = \{ I_1, I_3 \}$ and $\Sigma_2' = \{ I_1, I_2 \}$.

For the first of these we have the label $L_1 \equiv \neg p(x)$ and the label-free parts defined by $\pi_{L_1}(I_1) \equiv q(x) \vee \neg r(x)$ and $\pi_{L_1}(I_3) \equiv I_3$.

For $\Sigma_2'$ we get the label $L_2 \equiv \neg r(x)$ and the label-free parts $\pi_{L_2}(I_1) \equiv \neg p(x) \vee q(x)$ and $\pi_{L_2}(I_2) \equiv I_2$.

This shows that the constraint set in Example 7.5 is indeed locally stratified. $\Box$
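The label-free part used in Definition 7.11 and Example 7.6 can be illustrated by a minimal propositional sketch. It assumes a purely hypothetical encoding that is not part of the original text: a clause is a frozenset of strings, a leading `~` marks negation, and the variable $x$ is dropped. The clause `I1` is $I_1$ as it can be read off from the two label-free parts given in Example 7.6.

```python
from typing import FrozenSet, List, Optional, Tuple

Clause = FrozenSet[str]  # hypothetical encoding: "~a" is the negation of atom "a"


def negate(literal: str) -> str:
    """Return the complementary literal."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def label_free(clause: Clause, label: str) -> Optional[Clause]:
    """Label-free part of `clause` for `label`; None where it is undefined."""
    if negate(label) in clause:
        return None  # undefined: the negated label occurs in the clause
    reduced = clause - {label}
    # The label is omitted only if at least two literals remain.
    return reduced if len(reduced) >= 2 else clause


def labelled_subsystem(sigma: List[Clause], label: str) -> Tuple[List[Clause], List[Clause]]:
    """Return the pair (Sigma', Sigma'') belonging to `label`."""
    prime = [c for c in sigma if label_free(c, label) is not None]
    double_prime = [label_free(c, label) for c in prime]
    return prime, double_prime


# I_1 of Example 7.5 written as a clause: not p, or q, or not r.
I1 = frozenset({"~p", "q", "~r"})
print(sorted(label_free(I1, "~p")))  # label-free part for the label "not p"
print(sorted(label_free(I1, "~r")))  # label-free part for the label "not r"
```

The sketch only computes the two sets of a labelled subsystem; whether the resulting clause set is stratified in the sense of Definition 7.5 is a separate check that is not attempted here.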

Note that each stratified constraint set is also locally stratified. In this case we define $\operatorname{depth}(\Sigma) = 0$. If $\Sigma$ is locally stratified by a partition $\Sigma = \Sigma_1' \cup \dots \cup \Sigma_n'$, we define $\operatorname{depth}(\Sigma) = \max_{i=1}^{n} \operatorname{depth}(\Sigma_i'') + 1$. We call $\operatorname{depth}(\Sigma)$ the depth of the locally stratified constraint set $\Sigma$.

Finally, we can strengthen Proposition 7.6, now dealing with locally stratified constraint sets. This condition turns out to be sufficient and also necessary for the absence of admissible critical trigger paths.

Theorem 7.12. Let $\Sigma$ be a constraint set on a schema $S$. Then $\Sigma$ is locally stratified iff there exists a complete RTS such that for any repairable transition $T$ the RTS does not invalidate the effect of $T$.

Proof. First assume that $\Sigma$ is locally stratified. Let the labelled subsystems in the partition be $(\Sigma_i', \Sigma_i'', L_i)$ for $i = 1, \dots, n$. We shall use induction on the depth of $\Sigma$. For $\operatorname{depth}(\Sigma) = 0$ we are done by Proposition 7.6.

Let us now consider the case $\operatorname{depth}(\Sigma) = 1$. As in the proof of Proposition 7.6 we construct an RTS for $\Sigma$. Since each $\Sigma_i''$ is stratified in the sense of Definition 7.5, we first construct a rule system $RTS_i'$ with respect to $\Sigma_i''$ as in the proof of Proposition 7.6. The condition parts in these rules have the form $\neg \pi_{L_i}(I)$ for $I \in \Sigma_i'$, where $\pi_{L_i}(I)$ is the label-free part of $I$. Then let $RTS_i$ result from $RTS_i'$ by changing all condition parts, replacing $\neg \pi_{L_i}(I)$ by $\neg I$. Finally, take $RTS = \bigcup_{i=1}^{n} RTS_i$.

Due to the last property in the definition of locally stratified constraint sets in Definition 7.11 we conclude that $RTS$ is complete.

Now consider a critical trigger path $v_0\, e_1\, v_1'\, e_1'\, v_1 \dots e_\ell'\, v_\ell$ in the rule hypergraph associated with $RTS$. Without loss of generality assume $v_1' \in RTS_1$. According to Proposition 7.4 we have to show that this trigger path is not admissible.

We use <strong>in</strong>duction on the length ` <strong>of</strong> this critical trigger path. For ` =1we may use the<br />

same argument as<strong>in</strong> the pro<strong>of</strong> <strong>of</strong> Proposition 7.4. Therefore, assume `>1 and take a state<br />

with j= and a transition T with j= E (T ) , ' 0 .Thenwehave to show that T is not<br />

repairable.<br />

Assume that T is repairable. Then there exists a state with j= such that :E (T ) =2<br />

.We shall derive acontradiction from this.<br />

For this regard the critical trigger path v 1 e 2 v2 0 e0 2 v 2::: e 0`v` <strong>of</strong> length ` ; 1. By <strong>in</strong>duction<br />

it is not admissible. If A 1 is the action <strong>in</strong> the rule v1 0 , we get j= E (T A 1 ) , ' 1<br />

and T A 1 cannot be repairable. In particular, this implies :E (T A 1 ) 2 .<br />

S<strong>in</strong>ce A 1 is a simple <strong>in</strong>sertion or deletion, we getj= :E (T ) , '^L and j= :E (T A 1 )<br />

, '^ L for some literal L and its negation L.From this we conclude ' 2 and L 2 .<br />

Then there must exist a resolution refutation for $L$ from input $\Sigma$. Any literal $L'$ (except $L$) in this refutation must be selected at least once for building the resolvent. Therefore, due to our construction of the label-free part $\pi_{L_1}(I)$ we may cancel all clauses $I \in \Sigma$ containing the literal ${\sim}L_1$ and simultaneously the literal $L_1$ in all clauses. Thus, there must also exist a resolution refutation for $L$ from input $\Sigma_1''$.

On the other hand, each clause in $\Sigma_1''$ contains at least two literals. Therefore, any resolvent will also contain at least two literals unless we have some $I_1 \in \Sigma_1''$ with literals $L_1$ and $L_2$ and another $I_2 \in \Sigma_1''$ with literals $L_1'$ and $L_2'$ such that $L_1$, $L_1'$ (and $L_2$, $L_2'$ respectively) are unifiable.

This property, however, means that $\Sigma_1''$ is not stratified, contradicting our assumptions. Hence $T$ cannot be repairable and we are done.

Next let $\operatorname{depth}(\Sigma) > 1$. We proceed analogously. By induction, since $\Sigma_i''$ is (locally) stratified, there exists a rule system $RTS_i'$ for $\Sigma_i''$ with the required property. The condition parts in these rules have the form $\neg \pi_{L_i}(I)$ for $I \in \Sigma_i'$, where $\pi_{L_i}(I)$ is the label-free part of $I$. Then let $RTS_i$ result from $RTS_i'$ by changing all condition parts from $\neg \pi_{L_i}(I)$ to $\neg I$. Finally, take $RTS = \bigcup_{i=1}^{n} RTS_i$.

Again due to the last property in the definition of locally stratified constraint sets (cf. Definition 7.11) $RTS$ must be complete.

Now consider a critical trigger path $v_0\, e_1\, v_1'\, e_1'\, v_1 \dots e_\ell'\, v_\ell$ in the rule hypergraph associated with $RTS$. According to Proposition 7.4 we have to show that this trigger path is not admissible. Without loss of generality assume $v_1' \in RTS_1$. Then take a maximal $k$ such that $v_1', \dots, v_k' \in RTS_1$ holds. Then for $i = 0, \dots, k$ we may write $\varphi_i$ as a conjunction $\psi_i \wedge \mathcal{J}$ with $\models \psi_i \Rightarrow \neg \pi_{L_1}(I_i)$ for some $I_i \in \Sigma_1'$. Hence, if we replace $v_i'$ by the corresponding rule in $RTS_1'$, we obtain a critical trigger path for $RTS_1'$.

Now take a state with j= and a transition T with j= E (T ) , ' 0 . We have to<br />

show that T is not repairable. Assume the contrary. Then there exists a state with j= <br />

and :E =2 .<br />

Assume j= :L 1 . S<strong>in</strong>ce j= holds and each constra<strong>in</strong>t I 2 1 0 can be written as a<br />

disjunction I 0 _ L 1 (I) with j= I0 ) L 1 ,we conclude j= 1 00.<br />



S<strong>in</strong>ce v 0 e 1 v1 0 e0 1 v 1::: e 0 k v k is a critical trigger path for RT S1 0 and j= E , ' 0<br />

holds, we may apply the <strong>in</strong>duction hypothesis to 1 00 with depth(00 1 ) < depth(). Therefore,<br />

T cannot be repairable, i.e. for any state with j= 1 00 weget:E (T ) 2 (1 00)<br />

.<br />

In particular, take = . Then :E (T ) 2 (1 00)<br />

implies j= : L 1 (I) for some I 2 0 1<br />

and further 6j= contradict<strong>in</strong>g our assumption on . Thus, we must have j= L 1 .<br />

Assume j= :L 1 . Then we must have j= 1 00 and consequently :E (T ) 2 (1 00)<br />

. As<br />

above this implies j= : L 1 (I) for some I20 1 and hence 6j= contradict<strong>in</strong>g our assumption<br />

on . Hence,wemust have j= L 1 .<br />

Now let I 1 2 correspond to the rule v1 0 . Without loss <strong>of</strong> generality we may assume<br />

j= ' 0 ):L 1 . Otherwise, we must have j= :I1 0 and L1 (I 1)must not conta<strong>in</strong> L 1 . This implies<br />

L 1 to occur <strong>in</strong> J , <strong>in</strong> which case we may change it to :L 1 without aect<strong>in</strong>g the trigger path<br />

be<strong>in</strong>g critical.<br />

S<strong>in</strong>ce j= L 1 holds, T must <strong>in</strong>volve an <strong>in</strong>sertion (deletion) correspond<strong>in</strong>g to a negative<br />

(positive) literal L 1 . Hence, j= E (T ) ,:L 1 ^: holds. Due to the <strong>in</strong>dependence <strong>of</strong> J<br />

from 1 00 wemaychoose <strong>in</strong> such away that 2 (1 00)<br />

holds.<br />

However, this implies j= :E (T ) , L 1 _ 2 contradict<strong>in</strong>g the non-repairability <strong>of</strong><br />

T with respect to RT S1 0 . This completes the suciency pro<strong>of</strong>.<br />

Conversely, assume that we are given a complete RTS which does not invalidate the effect of any repairable transition $T$. According to Proposition 7.4 this implies that all critical trigger paths in the associated rule hypergraph are not admissible. From this we have to construct a partition of $\Sigma$ into stratified labelled subsystems.

First consider a single rule $R$ corresponding to a constraint $I \in \Sigma$. In particular, $I$ is the condition part of this rule. Since $RTS$ is complete, the event part of $R$ gives rise to a negative (positive) literal $L_{ev}$ in $I$ for the case of an insertion (deletion). Similarly, an insertion (deletion) in the action part of $R$ gives rise to a positive (negative) literal $L_a$ in $I$.

Let (I) = L ev _ L a . If I conta<strong>in</strong>s n 1 more literals L 1 ::: L n , let i (I) = (I) _<br />

L 1 _ :::_ L i _ :::_ L n . Then dene i 0(R) = fJ 2 j L i<br />

(J ) is dened g and i 00(R)<br />

=<br />

|{z}<br />

omit<br />

f Li (J ) jJ 2 i 0(R)g. (For I,(I) letL 1 = L ev and L 2 = L a and dene i 0 (R) and00 i (R)<br />

analogously.)<br />

Dene (R) =f(i 0(R)00<br />

i (R)L i) j i 00 (R) is locally stratied g, if this satises the last<br />

condition <strong>of</strong> Denition 7.11. Otherwise let (R) = . Then the elements <strong>of</strong> (R) dene<br />

stratied labelled subsystems <strong>of</strong> .<br />

In order to check the local stratication for i 00 (R) rst check, whether it is stratied. If<br />

not, dene for each literal L <strong>in</strong> i (I) thesetsiL 0 (R) =fJ 2 00 i (R) j L(J ) is dened g and<br />

iL 00 (R) =f L(J ) jJ 2 iL 0 (R)g. Consider f(0 iL (R)00 iL<br />

(R)L) j 00 iL<br />

(R) is locally stratied g<br />

and check the last condition S <strong>of</strong> Denition 7.11.<br />

Now take LSS =<br />

R2RT S<br />

(R). If (R) 6= holds for all R 2 RT S, this satises the last<br />

condition <strong>of</strong> Denition 7.11 and we obta<strong>in</strong> a partition <strong>of</strong> <strong>in</strong>to stratied labelled subsystems.<br />

Then LSS is the required partition.<br />

It rema<strong>in</strong>s to show (R) 6= <strong>in</strong> the construction above. Assume (R) =. Then there<br />

exists a sequence L 1 L 2 ::: L k <strong>of</strong> literals <strong>in</strong> I and a sequence (1 0 00 1 L 1)::: (k 0 00 k L k)<br />

<strong>of</strong> non-stratied labelled subsystems such that i+1 0 = fJ 2 00 i j Li+1 (J ) is dened g and<br />

k 00 conta<strong>in</strong>s two clauses Ik 1 and Ik 2 with literals L1 , L 10 and L 2 , L 20 respectively such that L 1 ,<br />

L 2 and L 10 , L 20 are uniable.<br />

I1 k and Ik 2 correspond to rules with respect to 00 k<br />

that dene an admissible trigger path<br />

<strong>in</strong> the associated rule hypergraph. S<strong>in</strong>ce for i =1 2 I1 k is : L k<br />

(I1 k;1 ), we may successively<br />

replace these rules by rules corresponding to $\Sigma_{k-1}'', \dots, \Sigma_1''$ and simultaneously replace the formulae $\varphi_i^k$ by $\varphi_i^{k-1} = \varphi_i^k \wedge \neg L_k, \dots, \varphi_i^0 = \varphi_i^1 \wedge \neg L_1$. The resulting trigger path is still critical and due to our construction it is also admissible with respect to $\Sigma$, contradicting our assumption. This completes the necessity proof. $\Box$

Example 7.7. Let us extend Example 7.3 and add a third constraint

$$I_3 \equiv p(x, z) \wedge q(y, z) \Rightarrow \mathit{false} .$$

In terms of the Entity-Relationship diagram in Figure 7.3, $I_3$ corresponds to an exclusion constraint $B \parallel D$. It is easy to see that the new set $\{ I_1, I_2, I_3 \}$ of constraints is not stratified. In particular, any local stratification must contain a labelled subsystem with label $\neg q(x, z)$ with the reduced constraints $I_2' \equiv \neg q(y, z) \vee x = y$ and $I_3' \equiv I_3$. However, $\neg q(x, z)$ cannot occur in the label-free part of some $I_2'$, since this always defines the same labelled subsystem. Hence, the given constraint set is also not locally stratified. This shows that adding a single exclusion constraint to an Entity-Relationship schema may already destroy reasonable rule behaviour. $\Box$

7.7 Complexity of Local Stratification

Let us now look at the check whether a given set of constraints is locally stratified. In the second part of the proof of Theorem 7.12 we have seen that this check can be done by direct construction of the desired partition into maximal stratified labelled subsystems. The first part of that proof then indicates how to construct the corresponding RTS. In [7] we gave an explicit algorithm which also produces for each constraint the set of "reduced" constraints used in the RTS construction. However, the time complexity of that algorithm was beyond any practicality, since we could prove the following result.

Proposition 7.13. Let $\Sigma$ be a set of constraints in clausal form, $n = \#\Sigma$, $\ell$ the maximum number of literals in constraints $I \in \Sigma$ and $k$ the maximal arity of predicate symbols occurring in these constraints. Then checking $\Sigma$ to be locally stratified can be done with a time complexity

<strong>in</strong> O(k `2 n 2n2 `).<br />

ut<br />

We now want to show that this complexity result is not accidental. For this we first show a technical lemma.

Lemma 7.14. Let $\Sigma$ be a set of clauses containing only propositional atoms. Let $L$ be a literal such that ${\sim}L$ does not occur in any of the clauses in $\Sigma$. Assume $\Sigma = \Sigma_1 \cup \Sigma_2$ such that $L$ does not occur in any of the clauses in $\Sigma_1$, but in all clauses of $\Sigma_2$. Moreover, $\Sigma_2$ contains only clauses with exactly two literals. If $\Sigma_1$ is locally stratified and $\Sigma_2$ is stratified, then $\Sigma$ is locally stratified.

Proof. First assume that $\Sigma_2$ contains a single clause $C = L \vee L'$. If $\Sigma_1$ is not stratified, there is a partition $\Sigma_1 = \Sigma_{11}' \cup \dots \cup \Sigma_{1n}'$ $(n > 2)$ with stratified labelled subsystems $(\Sigma_{1i}', \Sigma_{1i}'', L_i)$. Then at most one $L_k$ can be ${\sim}L'$ and we may define

$$\Sigma_i' = \begin{cases} \Sigma_{1i}' & \text{if } L_i = {\sim}L' \\ \Sigma_{1i}' \cup \{ C \} & \text{otherwise.} \end{cases}$$

By induction $(\Sigma_i', \Sigma_i'', L_i)$ is a stratified labelled subsystem. Thus, $\Sigma = \Sigma_1' \cup \dots \cup \Sigma_n'$ defines the required partition.

Now assume that 1 is stratied. Let 1 = 11 [[ 1n be a partition <strong>in</strong>to pairwise<br />

disjo<strong>in</strong>t strata. If 1 conta<strong>in</strong>s just one clause C 0 with L 0 and no clause with L 0 ,we are done,<br />

s<strong>in</strong>ce C may be added to the stratum <strong>of</strong> C 0 . Analogously, C may dene its own stratum, if<br />

such aC 0 does not exist at all. Therefore, we are reduced to the follow<strong>in</strong>g two cases:<br />

{ There is more than one clause <strong>in</strong> 1 conta<strong>in</strong><strong>in</strong>g L 0 (and hence none conta<strong>in</strong><strong>in</strong>g L 0 ) and<br />

these clauses belong to dierent strata.<br />

{ There are exactly two clauses C 1 and C 2 conta<strong>in</strong><strong>in</strong>g L 0 or L 0 respectively. Inparticular,<br />

C 1 and C 2 belong to the same stratum 1i .<br />

In both cases we choose the literals L 1 = L and L 2 = L 0 to dene labelled subsystems<br />

( 1 1 L 1 ) and (fCg[ 1 ;fC 00 j C 00 conta<strong>in</strong>s L 0 g<br />

| {z }<br />

2<br />

0<br />

00<br />

2 L 2 ) <br />

where 2 0 (and hence also 00 2 ) are stratied by the previous remarks.<br />

In the rst case choose C 0 conta<strong>in</strong><strong>in</strong>g L 0 and another literal L 00 to dene a labelled<br />

subsystem<br />

( 0 1 [fCg<br />

| {z }<br />

3<br />

0<br />

00<br />

3 L 3)<br />

with L 3 = L 00 ,where1 0 is a proper subset <strong>of</strong> 1 not conta<strong>in</strong><strong>in</strong>g C 0 . By <strong>in</strong>duction 3 00 must<br />

be locally stratied.<br />

In the second case choose C 2 = L 0 _ C2 0 , a literal L00 <strong>in</strong> C2 0 and L 3 = L 00 ,which denes a<br />

labelled subsystem (3 0 00 3 L 3) as before with 3 0 = 0 1 [fCg with a proper subset 0 1 ( 1<br />

conta<strong>in</strong><strong>in</strong>g C 1 , but not C 2 .Thus, 3 0 and 00 3 are stratied.<br />

In both cases we have obta<strong>in</strong>ed a partition = 1 [ 2 0 [ 0 3 with stratied labelled<br />

subsystems ( 1 1 L 1 ), (2 0 00 2 L 2) and (3 0 00 3 L 3). S<strong>in</strong>ce the additional condition for<br />

local stratication is easily veried, we conclude that is locally stratied.<br />

For the general case we may assume that 0 = 1 [ ( 2 ;fCg) is locally stratied by<br />

successive application <strong>of</strong> the constructions <strong>in</strong> the rst part <strong>of</strong> this pro<strong>of</strong>. Then we observe<br />

that <strong>in</strong> the case <strong>of</strong> non-stratied 0 we do not change labels, when we add C. However, it<br />

may happen that one <strong>of</strong> these labels now is L. This label results (as label L 1 ) from add<strong>in</strong>g<br />

C 0 to some stratied constra<strong>in</strong>t set.From the construction <strong>of</strong> this local stratication and the<br />

fact that 2 is stratied we conclude that the other labels L 2 and L 3 are dierent from L,<br />

which guarantees the local stratication condition to hold also <strong>in</strong> the general case.<br />

For the case <strong>of</strong> 0 be<strong>in</strong>g stratied the arguments are the same as before except for the case<br />

that 0 conta<strong>in</strong>s exactly one clause C 0 with L 0 and none with L 0 . Then the correspond<strong>in</strong>g<br />

stratum may also conta<strong>in</strong> clauses C i with literals L i and L i+1 (i = 1:::m), where L 1<br />

occurs <strong>in</strong> C 0 and L m+1 = L.<br />

In particular, we have C m 2 2 and add<strong>in</strong>g C to this stratum is no longer possible. S<strong>in</strong>ce<br />

2 is stratied, we must have m > 0, but then the literals L 0 , L 1 and L dene a local<br />

stratication with associated constra<strong>in</strong>t sets 0 ;fC 0 g[fCg, 0 ;fC m g[fCg and 0<br />

respectively.<br />

ut<br />



We shall use Lemma 7.14 in the proof of NP-hardness to shrink propositional constraint sets. Another way to reduce the technical complexity of that proof is to drop the restriction on $\Sigma$ to contain only clauses with at least one negative literal. If $\Sigma$ is a set of propositional clauses containing neither the atom $q$ nor its negation, we add $\neg q$ to each clause to form the set $\Sigma_{ext}$ of clauses.
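The extension step just described is small enough to sketch directly. The encoding is an assumption made only for illustration: clauses are frozensets of strings and a leading `~` marks negation; the function name is hypothetical.

```python
from typing import FrozenSet, List

Clause = FrozenSet[str]  # hypothetical encoding: "~a" is the negation of atom "a"


def extend_with_fresh_atom(sigma: List[Clause], fresh: str = "q") -> List[Clause]:
    """Add the negated fresh atom to every clause of sigma."""
    negated = "~" + fresh
    for clause in sigma:
        # The construction requires that neither q nor its negation occurs in sigma.
        if fresh in clause or negated in clause:
            raise ValueError(f"atom {fresh!r} already occurs in {sorted(clause)}")
    return [clause | {negated} for clause in sigma]


sigma = [frozenset({"a", "~b"}), frozenset({"b", "c"})]
sigma_ext = extend_with_fresh_atom(sigma)
```

Every clause of the extended set then contains a negative literal, which is the restriction the construction is meant to restore.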

Lemma 7.15. Let $\Sigma$ be a set of propositional clauses each with at least two literals. Then $\Sigma_{ext}$ is locally stratified iff $\Sigma$ is satisfiable and locally stratified.

Proof. First let $\Sigma$ be locally stratified and satisfiable. If $\Sigma$ is not stratified, we may choose the same labels to obtain a local stratification for $\Sigma_{ext}$.

Thus, assume $\Sigma$ to be stratified. Then $(\Sigma_{ext}, \Sigma, \neg q)$ is a stratified labelled subsystem. Since all clauses in all other labelled subsystems contain the literal $\neg q$, we have to isolate these clauses. Therefore, take a model for $\Sigma$ which is given by a set $\{ L_1, \dots, L_n \}$ of literals occurring in $\Sigma$ which must be interpreted as true. Taking $L_i$ as a label and the corresponding labelled subsystem $(\Sigma_i', \Sigma_i'', L_i)$, we obtain a proper subset $\Sigma_i' \subsetneq \Sigma_{ext}$. For $\#\Sigma_i'' > 1$ we may proceed with the other literals $L_j$. The last step results in unary sets $\{ \neg q \vee L_k \}$ which are obviously stratified.

Conversely, given a local stratification for $\Sigma_{ext}$ we can remove $\neg q$ to obtain a local stratification for $\Sigma$. It remains to show that $\Sigma$ is satisfiable. If $\Sigma_{ext}$ is stratified, this is obvious, because a literal $L$ with ${\sim}L$ occurring in some clause in $\Sigma$ cannot occur in any clause of $\Sigma$.

If ext is not stratied, there is at least one stratied labelled subsystem ( 0 00 L) such<br />

that :q occurs <strong>in</strong> all clauses <strong>in</strong> 00 , i.e. 00 = 0 ext and 0 is satisable. This still holds if we<br />

put back the literal L and extend our <strong>in</strong>terprete L as false to satisfy clauses <strong>in</strong> ; 0 . ut<br />

Theorem 7.16. Let $\Sigma$ be a set of constraints. Then checking that $\Sigma$ is locally stratified is NP-hard.

Proof. We show that the disjoint cover problem (DCP), which is known to be NP-complete, can be reduced in polynomial time to the local stratification problem. For this, let $(X, \mathcal{S})$ be an instance of DCP, i.e. $X$ is a finite set, say $X = \{ x_1, \dots, x_n \}$, and $\mathcal{S} = \{ S_1, \dots, S_m \}$ is a subset of the power set $\mathcal{P}(X)$. The problem is to decide whether a subset $\mathcal{S}' \subseteq \mathcal{S}$ exists such that $X$ is the disjoint union of the sets in $\mathcal{S}'$. Such an $\mathcal{S}'$ is called a solution for $(X, \mathcal{S})$.

Without loss of generality we may always assume that $X = \bigcup_{S_i \in \mathcal{S}} S_i$ holds. Moreover, we may allow $\mathcal{S}$ to be a multiset.
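To make the DCP definition concrete, here is a minimal brute-force sketch. It is not part of the reduction and not an efficient method: it simply tries every subfamily, which takes exponential time. The function name and the list-of-sets encoding are assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Optional, Set


def disjoint_cover(X: Set[int], S: List[Set[int]]) -> Optional[List[int]]:
    """Return indices of a subfamily of S that partitions X, or None."""
    for r in range(len(S) + 1):
        for idx in combinations(range(len(S)), r):
            chosen = [S[i] for i in idx]
            # Equal total size plus equal union forces pairwise disjointness.
            if sum(len(s) for s in chosen) == len(X) and set().union(*chosen) == X:
                return list(idx)
    return None


solution = disjoint_cover({1, 2, 3}, [{1, 2}, {2, 3}, {3}])
no_solution = disjoint_cover({1, 2, 3}, [{1, 2}, {2, 3}])
```

In the first call the sets with indices 0 and 2 partition $X$; in the second call the two sets overlap in the element 2, so no solution exists.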

We now associate with $(X, \mathcal{S})$ a set $\Sigma$ of constraints. For this let $p_{ij}$ be a propositional atom for all $x_i \in S_j$. For $S_i = \{ x_{j_1}, \dots, x_{j_i} \} \in \mathcal{S}$ we define clauses $\neg p_{j_k i} \vee p_{j_\ell i}$ and $\neg p_{j_\ell i} \vee p_{j_k i}$ for $k, \ell \in \{ 1, \dots, i \}$, $k \neq \ell$. We refer to these clauses as connection clauses with respect to $S_i$. For $x_i \in S_j \cap S_k$ $(j \neq k)$ we define an exclusion clause $\neg p_{ij} \vee \neg p_{ik}$. Finally, for each $x_i$ we define a cover clause $p_{i j_1} \vee \dots \vee p_{i j_m}$ for the sets $S_{j_1}, \dots, S_{j_m} \in \mathcal{S}$ containing $x_i$, provided $m \geq 2$. $\Sigma$ contains all these connection, exclusion and cover clauses.
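The clause construction can be sketched as follows. The encoding is an assumption chosen only for illustration: a literal is a tuple `(positive, x, j)` standing for the atom $p_{xj}$ or its negation, a clause is a frozenset of literals, and the family is given as a list so that multisets are allowed.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Literal = Tuple[bool, int, int]  # (positive?, element x, index j of the set S_j)
Clause = FrozenSet[Literal]


def dcp_to_clauses(X: Set[int], S: List[Set[int]]) -> Set[Clause]:
    """Build the connection, exclusion and cover clauses for (X, S)."""
    clauses: Set[Clause] = set()
    # Connection clauses: the atoms of all elements of one set S_j imply each other.
    for j, Sj in enumerate(S):
        for a, b in combinations(sorted(Sj), 2):
            clauses.add(frozenset({(False, a, j), (True, b, j)}))
            clauses.add(frozenset({(False, b, j), (True, a, j)}))
    for x in X:
        holders = [j for j, Sj in enumerate(S) if x in Sj]
        # Exclusion clauses: x may be covered by at most one of the sets containing it.
        for j, k in combinations(holders, 2):
            clauses.add(frozenset({(False, x, j), (False, x, k)}))
        # Cover clause, only generated if at least two sets contain x.
        if len(holders) >= 2:
            clauses.add(frozenset((True, x, j) for j in holders))
    return clauses


clauses = dcp_to_clauses({1, 2}, [{1, 2}, {1}])
```

For this small instance the result consists of two connection clauses for the first set, one exclusion clause and one cover clause for the element 1, and no cover clause for the element 2, which lies in only one set.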

Then we have to show that $(X, \mathcal{S})$ has a solution iff $\Sigma$ is locally stratified and satisfiable. For this we introduce a partial order on DCP-instances, letting $(X_1, \mathcal{S}_1) < (X_2, \mathcal{S}_2)$ iff

$$\sum_{S \in \mathcal{S}_1} |S| < \sum_{S \in \mathcal{S}_2} |S| \quad \text{or} \quad \Bigl( \sum_{S \in \mathcal{S}_1} |S| = \sum_{S \in \mathcal{S}_2} |S| \ \text{and} \ |\mathcal{S}_1| > |\mathcal{S}_2| \Bigr)$$

holds.

First let $\mathcal{S}' = \{ S_{i_1}, \dots, S_{i_k} \}$ be a solution for $(X, \mathcal{S})$. Then $\Sigma$ is obviously satisfiable. In order to use induction with respect to this order we consider the following two operations:

– Replace $S_j \in \mathcal{S}'$ by $S_j - \{ x_\ell \}$ and add $S_{m+1} = \{ x_\ell \}$ for some $x_\ell \in S_j$.

– Replace $S_j \notin \mathcal{S}'$ by $S_j - \{ x_\ell \}$ for some $x_\ell \in S_j$.

In both cases we obtain a smaller DCP-instance which has a solution. By induction the corresponding constraint set $\Sigma_1'$ is locally stratified.
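That both operations yield a smaller instance can be checked mechanically. The sketch below assumes the order compares the total size of all sets first and, for equal totals, prefers the family with more sets; the function names and the list-of-sets encoding are hypothetical.

```python
from typing import List, Set


def total_size(S: List[Set[int]]) -> int:
    """Sum of the cardinalities of all sets in the family."""
    return sum(len(s) for s in S)


def is_smaller(S1: List[Set[int]], S2: List[Set[int]]) -> bool:
    """True iff the instance with family S1 lies below the one with S2."""
    w1, w2 = total_size(S1), total_size(S2)
    return w1 < w2 or (w1 == w2 and len(S1) > len(S2))


def split_off(S: List[Set[int]], j: int, x: int) -> List[Set[int]]:
    """First operation: remove x from S[j] and add the singleton {x}."""
    return [s - {x} if i == j else s for i, s in enumerate(S)] + [{x}]


def shrink(S: List[Set[int]], j: int, x: int) -> List[Set[int]]:
    """Second operation: remove x from S[j]."""
    return [s - {x} if i == j else s for i, s in enumerate(S)]


S = [{1, 2}, {2, 3}, {3}]
after_split = split_off(S, 0, 2)   # same total size, one more set
after_shrink = shrink(S, 1, 2)     # total size drops by one
```

The first operation keeps the total size and increases the number of sets; the second lowers the total size, so both results lie strictly below the original instance.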

In the first case we remove all clauses with literals $p_{i,m+1}$ from $\Sigma_1'$. The resulting subset $\Sigma_1''$ is still locally stratified. Now build the labelled subsystem $(\Sigma', \Sigma'', L)$ with the label $L = \neg p_{\ell j}$. The clauses in $\Sigma'$ (and hence in $\Sigma''$) do not contain $p_{\ell j}$, i.e. we omit the cover clause with respect to $x_\ell$ and connection clauses containing $p_{\ell j}$ with respect to $x_\ell \in S_j$. Clauses in $\Sigma''$ containing $\neg p_{\ell j}$ arise from the restriction to keep at least two literals, hence must also lie in $\Sigma'$. Therefore, we obtain $\Sigma'' = \Sigma_1 \cup \Sigma_2$, where $\Sigma_2$ is stratified and contains only clauses with two literals, one of which is $\neg p_{\ell j}$, whereas clauses in $\Sigma_1$ do not contain $\neg p_{\ell j}$.

Thus, the remaining connection clauses with respect to $x_\ell \in S_j$ and the exclusion clauses with respect to $x_\ell \in S_j$ occur in $\Sigma_2$. This implies $\Sigma_1 = \Sigma_1''$. From Lemma 7.14 we conclude that $\Sigma''$ is locally stratified.

In the second case we build the labelled subsystem $(\Sigma', \Sigma'', L)$ with the label $L = p_{\ell j}$. The clauses in $\Sigma'$ (and hence in $\Sigma''$) do not contain $\neg p_{\ell j}$, i.e. we omit exclusion clauses and connection clauses containing $\neg p_{\ell j}$ with respect to $x_\ell \in S_j$. Again, the clauses in $\Sigma''$ containing $p_{\ell j}$ only arise from the restriction to keep at least two literals. Hence, these clauses define a stratified subset $\Sigma_2$ of $\Sigma''$ (and of $\Sigma'$) containing only clauses with two literals.

The remaining clauses form a subset $\Sigma_1$ and clauses in $\Sigma_1$ do not contain $p_{\ell j}$, i.e. the remaining connection clauses with respect to $x_\ell \in S_j$ and the cover clause with respect to $x_\ell$ (if it contains just two literals) occur in $\Sigma_2$, which implies $\Sigma_1 = \Sigma_1'$. From Lemma 7.14 we conclude that $\Sigma''$ is locally stratified.

Since in the first case ($x_\ell \in S_j \in \mathcal{S}'$) only the cover clause with respect to $x_\ell$ and connection clauses containing $p_{\ell j}$, and in the second case ($x_\ell \in S_j \notin \mathcal{S}'$) only exclusion clauses with respect to $x_\ell \in S_j$ and connection clauses containing $\neg p_{\ell j}$, are omitted in $\Sigma'$, the additional condition for local stratification is easily verified if all such choices are taken, provided there are at least three such possibilities. The only critical case arises if there are only three choices of the second kind, all with the same $x_\ell$. In this case we must have another $S_j = \{ x_\ell \} \in \mathcal{S}'$ and we simply add the labelled subsystem $(\Sigma', \Sigma'', \neg p_{\ell j})$ to satisfy the additional local stratification condition.

If there are at most two choices, then either

– $\mathcal{S} = \mathcal{S}'$ and there is exactly one $S_j = \{ x_k, x_\ell \}$ or

– $\mathcal{S}'$ contains only unary sets and these are exactly $S_j = \{ x_j \} \notin \mathcal{S}'$ and $S_k = \{ x_k \} \notin \mathcal{S}'$ or

– $\mathcal{S}'$ contains only unary sets and there is exactly one $S_j = \{ x_k, x_\ell \} \notin \mathcal{S}'$.

In the first case $\Sigma$ contains only two connection clauses with respect to $S_j$ and hence is obviously stratified. In the second case $\Sigma$ contains only four clauses

$$\neg p_{kk} \vee \neg p_{kk'}, \quad p_{kk} \vee p_{kk'}, \quad \neg p_{jj} \vee \neg p_{jj'} \quad \text{and} \quad p_{jj} \vee p_{jj'}$$

for $S_{j'} = \{ x_j \} \in \mathcal{S}'$ and $S_{k'} = \{ x_k \} \in \mathcal{S}'$, hence is stratified.

In the third case we obtain six clauses

$$\neg p_{kj} \vee p_{\ell j}, \quad \neg p_{\ell j} \vee p_{kj}, \quad \neg p_{kj} \vee p_{kk'}, \quad \neg p_{\ell j} \vee p_{\ell\ell'}, \quad p_{kj} \vee p_{kk'} \quad \text{and} \quad p_{\ell j} \vee p_{\ell\ell'}$$

for $S_{k'} = \{ x_k \} \in \mathcal{S}'$ and $S_{\ell'} = \{ x_\ell \} \in \mathcal{S}'$. Using Lemma 7.14 it is easily verified that the labels $p_{kj}$, $p_{\ell j}$, $\neg p_{kj}$ and $\neg p_{\ell j}$ define a partition into stratified labelled subsystems.

For the converse let us rst assume that is stratied, i.e. there cannot exist three<br />

clauses with literals L, L and L respectively. In connection clauses we may have L = p`j<br />

(or $L = \neg p_{\ell j}$) and it follows that $\Sigma$ does not contain exclusion or cover clauses for $x_\ell \in S_j$. This implies $x_\ell \notin S_k$ for all $k \neq j$. If we have an exclusion clause for $x_\ell \in S_j$, say $\neg p_{\ell j} \vee \neg p_{\ell k}$, then we also have a cover clause $p_{\ell j} \vee p_{\ell k} \vee C'$ and vice versa, but there cannot be further exclusion clauses nor connection clauses for $x_\ell \in S_j$, i.e. $C' \equiv \mathit{false}$ and $S_j = \{ x_\ell \}$.

To summarize, if $x_\ell$ occurs in more than one $S_j$, then $\#S_j = 1$ and there are just two such sets. Therefore, for a solution $\mathcal{S}'$ we take all $S_j$ with $\#S_j \geq 2$ and select a singleton set $\{ x_\ell \}$ for the remaining elements.

Next assume that $\Sigma$ is locally stratified, i.e. there is a local stratification with labels $L_1, \ldots, L_n$ ($n \geq 3$). Again, we proceed by induction on DCP-instances.

For $L_1 = \neg p_{\ell j}$ and the stratified labelled subsystem $(\Sigma_1', \Sigma_1'', L_1)$ the cover clause for $x_\ell$ and connection clauses for $x_\ell \in S_j$ containing $p_{\ell j}$ have been removed from $\Sigma$ to give $\Sigma_1'$, hence must occur in two other labelled subsystems such that for a label $\neg p_{ki}$ we must have $k \neq \ell$ and for a label $p_{ki}$ we must have $i \neq j$.

Analogously, for $L_1 = p_{\ell j}$ exclusion and connection clauses for $x_\ell \in S_j$, the latter ones containing $\neg p_{\ell j}$, have been omitted in $\Sigma_1'$ and must occur in two other labelled subsystems such that for another positive label $p_{ki}$ we must have $k \neq \ell$ and for a negative label $\neg p_{ki}$ we must have $i \neq j$. Hence, for the minimum number of three labels $L_1$, $L_2$ and $L_3$ we obtain the following four cases:

$L_1 = \neg p_{\ell j}$, $L_2 = \neg p_{k_1 i_1}$, $L_3 = \neg p_{k_2 i_2}$ with pairwise different $\ell, k_1, k_2$,

$L_1 = \neg p_{\ell j}$, $L_2 = \neg p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with $\ell \neq k_1$ and $j \neq i_2 \neq i_1$,

$L_1 = \neg p_{\ell j}$, $L_2 = p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with $k_1 \neq k_2$ and $i_1 \neq j \neq i_2$, or

$L_1 = p_{\ell j}$, $L_2 = p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with pairwise different $\ell, k_1, k_2$.

For a negative literal $L_i = \neg p_{\ell j}$ or a positive literal $L_i = p_{\ell j}$ it follows from Lemma 7.14 that replacing $S_j$ by $S_j - \{x_\ell\}$ and $\{x_\ell\}$ defines a locally stratified constraint set. Therefore, by induction in all four cases (with the restrictions for indices) we obtain solutions for smaller DCP-instances with

$$\mathcal{S}_1 = \{S_1, \ldots, S_j - \{x_\ell\}, \ldots, S_m, \{x_\ell\}\},$$

$$\mathcal{S}_2 = \{S_1, \ldots, S_{i_1} - \{x_{k_1}\}, \ldots, S_m, \{x_{k_1}\}\} \quad\text{and}$$

$$\mathcal{S}_3 = \{S_1, \ldots, S_{i_2} - \{x_{k_2}\}, \ldots, S_m, \{x_{k_2}\}\}$$

respectively. If any of these solutions contains both (or none) of the split components, e.g. $S_j - \{x_\ell\}$ and $\{x_\ell\}$, we also have a solution for the original problem.

Therefore, assume that all solutions for $(X, \mathcal{S}_i)$ must contain exactly one of the split components, denoted as $S^1$, $S^2$ and $S^3$. Let $\mathcal{S}'_i = \{S^i_1, \ldots, S^i_{n_i}\} \subseteq \mathcal{S}_i$ be a solution for $(X, \mathcal{S}_i)$.

For $i \neq j$ we proceed in the following way:

Start with $T_i = \mathcal{S}'_i - \mathcal{S}'_j$, $T_j = \mathcal{S}'_j - \mathcal{S}'_i$ and $T = \{S^j\}$ and execute the following steps until there are no more changes:



- Remove all sets from $T_i$ intersecting some set in $T$ and let these define a new $T$.

- Remove all sets from $T_j$ intersecting some set in $T$ and let these define a new $T$.
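One possible reading of the two alternating steps can be sketched in Python. This is only an illustration under stated assumptions: the function name `alternate_removal` and the representation of sets as `frozenset` values are hypothetical, and a step that removes nothing is assumed to leave $T$ unchanged, with the loop ending once neither pool loses a set.

```python
def alternate_removal(t_i, t_j, start):
    """Alternately strip from T_i and T_j all sets that intersect the current T.

    t_i, t_j: iterables of frozensets (the pools T_i and T_j).
    start:    the frozenset that forms the initial T.
    Returns the remaining pools and the removed layers in order of removal.
    """
    pools = [set(t_i), set(t_j)]
    frontier = {start}          # plays the role of T
    layers = []                 # (pool index, removed sets) per successful step
    turn, misses = 0, 0
    while misses < 2:           # stop when neither pool changed in a full round
        hit = {s for s in pools[turn] if any(s & f for f in frontier)}
        if hit:
            pools[turn] -= hit
            frontier = hit      # the removed sets define the new T
            layers.append((turn, hit))
            misses = 0
        else:
            misses += 1         # assumption: T stays as it is
        turn = 1 - turn
    return pools[0], pools[1], layers


rest_i, rest_j, layers = alternate_removal(
    [frozenset({2, 3}), frozenset({7, 8})],
    [frozenset({3, 4})],
    frozenset({1, 2}),
)
```

In this toy run the removed layers alternate between the two pools, and consecutive layers share an element, which is the property the chain argument below relies on.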

Finally, if $T_i$ (and then also $T_j$) are non-empty, this means that we may replace $T_j \subseteq \mathcal{S}'_j$ by $T_i$ or $\mathcal{S}'_j - T_j$ by $\mathcal{S}'_i - T_i$. According to our assumption on solutions we always keep either $S^i$ or $S^j$. Consequently, the procedure above defines a chain

$$S^i - S^j_{i_1} - S^i_{i_1} - S^j_{i_2} - S^i_{i_2} - \cdots - S^j_{i_k} - S^i_{i_k} - S^j$$

where neighbouring sets have a common element. This is still true if we replace $S^i$ by the original $S_j$. Taking together all three choices for $(i, j)$ we obtain an odd-length cycle

$$S_{i_1} - S_{i_2} - S_{i_3} - \cdots - S_{i_m} - S_{i_1}$$

with intersecting neighbouring sets $S_{i_j} \in \mathcal{S}$. Let $\Sigma_0$ be the set of constraints corresponding to $\{S_{i_1}, \ldots, S_{i_m}\}$. Then $\Sigma_0$ differs from a subset $\Sigma'$ only by the fact that cover clauses may have been shortened. Since omitted (positive) literals in these cover clauses do not occur in any other clauses in $\Sigma_0$, this must be locally stratified iff $\Sigma'$ is locally stratified. Therefore, the proof is completed if we can show that cycles as above always define constraint sets that are not locally stratified or not satisfiable.

With each neighbouring pair $(S_{i_j}, S_{i_{j+1}})$ we may associate a witness $x \in S_{i_j} \cap S_{i_{j+1}}$. Then without loss of generality (just rename indices) we can always assume a cycle

$$S_1 \stackrel{x_1}{-} S_2 \stackrel{x_2}{-} S_3 - \cdots - S_m \stackrel{x_m}{-} S_{m+1} = S_1$$

and show that the following conditions can be achieved:

- $m$ is odd,

- the $x_i$ are pairwise different,

- the $S_i$ are pairwise different and

- the cover clause in $\Sigma_0$ for $x_\ell$ has the form $p_{\ell\ell} \vee p_{\ell,\ell+1} \vee C'_\ell$, where literals in $C'_\ell$ do not occur in any other clause in $\Sigma_0$.

The last condition will allow us to assume without loss of generality that cover clauses in $\Sigma_0$ only contain two literals.

In order to achieve such a cycle recall that our original cycle is composed of three subpaths (called flanks) corresponding to a solution of a smaller DCP-instance and each pair of flanks has a common set (called corner). If $S_j$, with neighbours $S_i$ and $S_k$, is such a corner, then the following cases may arise:

- The two neighbours $S_i$ and $S_k$ coincide, which allows us to remove the corner $S_j$ and to identify $S_i$ with $S_k$.

- If $S_i$, $S_j$ and $S_k$ are pairwise different, we either obtain a simple cycle of length 3 or leave the cycle unchanged.

- If one of the neighbours equals $S_j$, say $S_k = S_j$, then $S_k$ is not common in the solutions for the flank with $S_j$ and $S_k$, i.e. there must be some $S_{j'}$ in the same solution as $S_i$ with $S_j \cap S_{j'} \neq \emptyset$. In this case we may replace the even number of edges between $S_j$ and $S_{j'}$ by a single edge. By the same argument we may replace the even number of edges between the opposite edge $S_\ell$ (in the same flank) and some $S_{\ell'}$ by a single edge.



In all these cases the cycle length remains odd.

If $x_i$ occurs twice, say between $S_{i_1}$ and $S_{i_2}$ and between $S_{i_3}$ and $S_{i_4}$ respectively, we may assume paths from $S_{i_1}$ to $S_{i_4}$ and from $S_{i_2}$ to $S_{i_3}$ of length $n_1$ and $n_2$ respectively. Then there are cycles with $S_{i_2}$, $S_{i_3}$ and $S_{i_1}$, $S_{i_4}$ connected by $x_i$ respectively and one of the corresponding lengths $n_1 + 1$ or $n_2 + 1$ must be odd. The only critical cases occur for $S_{i_2} = S_{i_4}$ or $S_{i_1} = S_{i_3}$, but these correspond to corners that have already been removed.

Finally, in order to achieve the condition on cover clauses consider $S_i \cap S_j \neq \emptyset$.

- If $S_i$ and $S_j$ belong to different flanks, but to the same solution, then we have $S_i = S_j$ and we may identify them and remove the even number of edges between them.

- If $S_i$ and $S_j$ belong to different flanks and different solutions, then for $S_i \neq S_j$ we may replace the odd number of edges between them by a single new edge, whereas for $S_i = S_j$ we may consider the odd number of edges between them as our new cycle.

- If $S_i$ and $S_j$ belong to the same flank, then the number of edges between them is even iff $S_i = S_j$, thus they may be removed or replaced by a single new edge.

The conditions on our cycle now allow clauses to be arranged in such a way that we have

0 = f L 1 _ L 2 L 2 _ L 3 ::: L p;1 _ L p L p _ L 1 g<br />

for an even number p with L p=2+i = L i for i = 1:::p=2. Such a 0 , however, is not<br />

satisfiable.

□

7.8 Conclusion<br />

In this article we investigated the limits of rule triggering systems (RTSs) for maintaining database integrity. The first result assures the existence of non-repairable transitions. In order to disallow such transitions the constraint implication problem must be decidable.

Secondly, we analyzed critical trigger paths in rule hypergraphs associated with RTSs. We could show that the existence of critical trigger paths leads to RTSs which may invalidate the effect of some transitions, even if these are repairable. Such behaviour can only be excluded for locally stratified constraint sets. In this case the needed RTS can be computed effectively, but checking local stratification is NP-hard.

To summarize, both results limit the applicability of RTSs for integrity maintenance under the assumption that the intended effects of user-defined transitions should be preserved.

Fortunately, there is a stronger condition on a constraint set, namely to be stratified, which is only sufficient for reasonable rule behaviour, but not necessary. Stratified constraint sets occur if we have a relational database schema in Entity-Relationship normal form, which means that it is equivalent to an ER-schema without exclusion constraints. Checking stratification is not only effective, but also efficient.

On the other hand, the RTS approach to integrity maintenance completely ignores user-defined transitions. Thus, a second conclusion from our studies is that these should be taken into consideration.

References for Chapter 7<br />

1. S. Ceri, J. Widom: Deriving Production Rules for Constraint Maintenance, Proc. 16th Conf. on VLDB, Brisbane (Australia), August 1990, 566-577

2. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for Integrity Maintenance, ACM ToDS, vol. 19 (3), 1994, 367-422

3. S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering: Active Databases, Proc., Houston, February 1994

4. M. Gertz, U. W. Lipeck: Deriving Integrity Maintaining Triggers from Transition Graphs, in Proc. 9th ICDE, IEEE Computer Society Press, 1993, 22-29

5. H. Mannila, K.-J. Räihä: The Design of Relational Databases, Addison-Wesley 1992

6. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases, in S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering: Active Databases, Proc., Houston, February 1994

7. K.-D. Schewe, B. Thalheim: Active Consistency Enforcement for Repairable Database Transitions, in S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in Databases, Proc. 6th Int. Workshop on Foundations of Models and Languages for Data and Objects, Schloß Dagstuhl, 1996, 87-102, available via http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html

8. B. Thalheim: Foundations of entity-relationship modeling, Annals of Mathematics and Artificial Intelligence, vol. 7, 1993, 197-256

9. S. D. Urban, L. Delcambre: Constraint Analysis: a Design Process for Specifying Operations on Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990

10. J. Widom, S. J. Finkelstein: Set-oriented Production Rules in Relational Database Systems, in Proc. SIGMOD 1990, 259-270



Chapter 8

Consistency Enforcement in Entity-Relationship and Object-Oriented Models

Contents

8.1 Introduction
8.2 Rule Systems for Consistency Maintenance
    8.2.1 Motivation
    8.2.2 ECA-Rules
8.3 Problems with Rule-Based Integrity Enforcement
    8.3.1 Non-Repairable Transactions
    8.3.2 Critical Paths
    8.3.3 Extensions
8.4 Well-behaving Rule Systems
    8.4.1 Stratified Rule Systems
    8.4.2 Constraints Arising from Entity-Relationship Schemata
    8.4.3 Constraints Arising from Simple Object-Oriented Schemata
8.5 Conflict Resolution
    8.5.1 Problem of Ambiguity
    8.5.2 Decidability
8.6 Conclusion

This chapter contains a reprint of

K.-D. Schewe. Consistency Enforcement in Entity-Relationship and Object-Oriented Models. Data & Knowledge Engineering. 1998 (to appear).



Abstract. Integrity maintenance is considered one of the major application fields of rule triggering systems (RTSs). If a given integrity constraint is violated by a database transaction, these systems trigger repairing actions. However, it has been shown that for any set of constraints there exist non-repairable transactions, which depend on the closure of the constraint set. Even if non-repairable transactions are excluded, this does not restrain the RTS from producing undesired behaviour.

Analyzing the behaviour of RTSs leads to the definition of critical paths in associated rule hypergraphs and the requirement that such paths be absent. It is shown that this requirement can be satisfied if the underlying set of constraints is stratified and that this is always the case for the structural constraints in Entity-Relationship and simple object-oriented models. Moreover, in both cases there is no ambiguity for the selection of rules.

Keywords. integrity constraints, consistency enforcement, active databases, Entity-Relationship, object-orientation, analysis of rule systems

8.1 Introduction<br />

Active databases (ADBs) aim at extending relational (or object-oriented) DBMS by rule triggering systems (RTSs), i.e. by sets of rules which, on a given event and if a condition is satisfied, trigger actions on the database (ECA-rules). Events can be external events, time conditions or internal events resulting from operations on the database. Conditions are usually given by boolean queries that have to be evaluated against the database. The action part consists of a sequence of basic operations to insert, delete or update tuples (or objects respectively) in the database.

The work in [3, 4, 8, 16, 17] and partly in [5] considers the problem of enforcing database integrity by the use of RTSs. The results concern the generation of repairing ECA-rules and partly the analysis of the resulting RTS. This analysis concentrates on the termination of the rule system, the independence of the final database state from the chosen selection order of the rules (confluence) and on consistency.

These requirements are not sufficient for a reasonable rule behaviour, because it is easy to define an RTS that empties the database in case of any constraint violation. Therefore, we impose an additional requirement, which informally means that the intended effect of a transaction may not be turned into its opposite by the RTS.

In this paper we investigate general problems with RTSs and show that these cannot occur in simple Entity-Relationship and object-oriented schemata. The first problem concerns the existence of non-repairable transactions that are determined by the closure of the constraint set. The second problem arises from the analysis of how to obtain RTSs that definitely repair constraint violations by a (repairable) transaction without invalidating its intended effect. Given an RTS we associate with it a rule hypergraph which corresponds to the possible sequences of triggered rules. We define critical trigger paths in these hypergraphs that correspond to the propagation of conditions. Then it can be shown that the existence of a single critical trigger path makes the RTS work incorrectly for at least one transaction.

Next we analyze constraint sets in order to detect whether it is possible to define an RTS of repairing actions such that the critical trigger paths in its associated hypergraph can only invalidate non-repairable transactions. For this we introduce stratified constraint sets that satisfy this condition. We apply our results to the case of specific Entity-Relationship



and simple object-oriented models and demonstrate that structurally determined constraint sets in these cases are always stratified. Furthermore, it will be shown that in these cases ambiguities arising from different execution orders can also be detected.

The work presented in this paper extends previous research in [12, 14] in that theoretical investigations about the strengths and weaknesses of the rule triggering approach for integrity maintenance have been directly tied in with consistency in Entity-Relationship and simple object-oriented models. A preliminary version was presented at the 1997 conference on Conceptual Modelling (ER '97) [13].

8.2 Rule Systems for Consistency Maintenance

Let us first consider the relational data model with integrity constraints given by closed formulae $I$ in implicative normal form

$$\forall x_1, \ldots, x_k .\; \exists y_1, \ldots, y_\ell .\; p_1(\bar{x}_1) \wedge \cdots \wedge p_n(\bar{x}_n) \Rightarrow q_1(\bar{y}_1) \vee \cdots \vee q_m(\bar{y}_m) . \qquad (8.76)$$

The vectors $\bar{x}_i$ consist only of universally quantified variables $x_j$ and the vectors $\bar{y}_i$ consist of both universally quantified variables $x_j$ and existentially quantified variables $y_j$. The predicate symbols $p_i$, $q_j$ correspond either to a relation of the schema or are comparison predicates

(= 6=


8.2.1 Motivation<br />

Let us first illustrate consistency enforcement using a small fragment of the example used in [4, 10].

Example 8.1. Let us define a schema with some simple functional and inclusion constraints. For simplicity we omit all types. The relation schemata are

WIRE = { wire_id, connection, wire_type, voltage, power },
TUBE = { tube_id, connection, tube_type } and
CONNECTION = { connection, from, to }.

These are used to express that there are tubes between two locations and wires in these tubes. In addition consider the following constraints:

FD_1  WIRE : wire_id $\to$ connection, wire_type, voltage, power
FD_2  TUBE : tube_id $\to$ connection, tube_type
FD_3  CONNECTION : connection $\to$ from, to
ID_1  WIRE[connection] $\subseteq$ TUBE[connection]
ID_2  TUBE[connection] $\subseteq$ CONNECTION[connection]

The three functional dependencies express that the values of wire_id, tube_id and connection are unique in relations over WIRE, TUBE and CONNECTION respectively. The two inclusion constraints express that there is neither a wire nor a tube without a corresponding tuple in a relation over CONNECTION.

Then the following relations define an instance of the schema:

WIRE
wire_id   connection   wire_type   voltage   power
4711      HH-HB        Koax        12        600
4814      HH-H         Tel         12        600

TUBE
tube_id   connection   tube_type
8314      HH-H         GX44
8511      HH-HB        GX44
023       HB-H         T33

CONNECTION
connection   from      to
HH-H         Hamburg   Hannover
HH-HB        Hamburg   Bremen
HB-H         Bremen    Hannover

It is easy to see that this instance satisfies the constraints above.
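The claim can be checked mechanically. The following Python sketch is only an illustration: the helper names `satisfies_fd` and `satisfies_inclusion` and the encoding of tuples as dictionaries are assumptions made for this example and are not part of the original text.

```python
# Instance of Example 8.1; each tuple is a dict keyed by attribute name.
WIRE = [
    {"wire_id": "4711", "connection": "HH-HB", "wire_type": "Koax", "voltage": 12, "power": 600},
    {"wire_id": "4814", "connection": "HH-H", "wire_type": "Tel", "voltage": 12, "power": 600},
]
TUBE = [
    {"tube_id": "8314", "connection": "HH-H", "tube_type": "GX44"},
    {"tube_id": "8511", "connection": "HH-HB", "tube_type": "GX44"},
    {"tube_id": "023", "connection": "HB-H", "tube_type": "T33"},
]
CONNECTION = [
    {"connection": "HH-H", "from": "Hamburg", "to": "Hannover"},
    {"connection": "HH-HB", "from": "Hamburg", "to": "Bremen"},
    {"connection": "HB-H", "from": "Bremen", "to": "Hannover"},
]


def satisfies_fd(relation, lhs, rhs):
    """Functional dependency lhs -> rhs: equal lhs values force equal rhs values."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def satisfies_inclusion(sub, sup, attribute):
    """Inclusion dependency sub[attribute] is a subset of sup[attribute]."""
    return {row[attribute] for row in sub} <= {row[attribute] for row in sup}


checks = {
    "FD_1": satisfies_fd(WIRE, ["wire_id"], ["connection", "wire_type", "voltage", "power"]),
    "FD_2": satisfies_fd(TUBE, ["tube_id"], ["connection", "tube_type"]),
    "FD_3": satisfies_fd(CONNECTION, ["connection"], ["from", "to"]),
    "ID_1": satisfies_inclusion(WIRE, TUBE, "connection"),
    "ID_2": satisfies_inclusion(TUBE, CONNECTION, "connection"),
}
print(checks)
```

Removing the tube 8511 from `TUBE` makes the `ID_1` check fail, which is the kind of violation the repairing operations discussed next have to deal with.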

Now consider the operation insert_WIRE(t). This may lead to a violation of constraint ID_1, in which case we must add a tuple to TUBE. Hence it can be replaced by

insert_WIRE(t) ;
IF connection(t) $\notin$ TUBE[connection]
THEN insert_TUBE(?, connection(t), ?)
ENDIF

Here the question marks stand for arbitrarily chosen values of the corresponding data type. Similarly, the operation delete_TUBE(t) may also violate ID_1. Therefore, we may replace delete_TUBE(t) by

delete_TUBE(t) ;
IF connection(t) $\in$ WIRE[connection] $-$ TUBE[connection]
THEN FOR ALL t' WITH connection(t') = connection(t) DO
        delete_WIRE(t')
     ENDFOR
ENDIF

In order to enforce FD_2 we may then replace insert_TUBE(t) by

IF $\forall t' \in$ TUBE . tube_id(t) $\neq$ tube_id(t')
THEN insert_TUBE(t)
ENDIF

Let us now add the exclusion constraint ED: WIRE[wire_id] $\parallel$ TUBE[tube_id]. In order to enforce this constraint, insertions into one of WIRE or TUBE should be followed by deletions in the other. The resulting transactions are

insert_WIRE(t) ;
FOR ALL t' $\in$ TUBE WITH tube_id(t') = wire_id(t) DO
    delete_TUBE(t')
ENDFOR

and

insert_TUBE(t) ;
FOR ALL t' $\in$ WIRE WITH wire_id(t') = tube_id(t) DO
    delete_WIRE(t')
ENDFOR

If we now take FD_2, ID_1 and ED together we must be very careful. E.g., if we execute insert_WIRE(8511, HH-HB, Koax, 12, 600) on the instance above, we may first delete the tuple (8511, HH-HB, GX44) in TUBE in order to enforce ED and then the two tuples (4711, HH-HB, Koax, 12, 600) and (8511, HH-HB, Koax, 12, 600) in WIRE in order to enforce ID_1. The resulting instance would be (omitting CONNECTION):

WIRE
wire_id   connection   wire_type   voltage   power
4814      HH-H         Tel         12        600

TUBE
tube_id   connection   tube_type
8314      HH-H         GX44
023       HB-H         T33

Thus, the "effect" of the original operation, i.e. insertion of a tuple into WIRE, is completely destroyed. The new effect is a deletion in WIRE and TUBE.

□
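The repair sequence of this example can be replayed in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, tuples are positional (identifier first, connection second), and the inclusion constraint between WIRE[connection] and TUBE[connection] is repaired by deletion as in the text, although repairing it by inserting a new tube would equally be possible.

```python
WIRE = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
TUBE = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}


def repair_ed_after_insert_wire(wire, tube, t):
    """ED repair: delete every tube whose tube_id equals the inserted wire_id."""
    return wire, {u for u in tube if u[0] != t[0]}


def repair_inclusion_by_deletion(wire, tube):
    """Inclusion repair by deletion: drop wires whose connection has no tube."""
    connections = {u[1] for u in tube}
    return {w for w in wire if w[1] in connections}, tube


new_wire = ("8511", "HH-HB", "Koax", 12, 600)
wire, tube = WIRE | {new_wire}, set(TUBE)                        # the user's insertion
wire, tube = repair_ed_after_insert_wire(wire, tube, new_wire)   # removes tube 8511
wire, tube = repair_inclusion_by_deletion(wire, tube)            # removes both HH-HB wires

print(sorted(w[0] for w in wire), sorted(u[0] for u in tube))
```

After both repairs the inserted tuple is gone again, together with the wire 4711 and the tube 8511 that were present before the transaction started.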

8.2.2 ECA-Rules<br />

Active databases approach integrity enforcement by using ECA-rules. The general form of these rules is

ON $\langle$event$\rangle$ IF $\langle$condition$\rangle$ DO $\langle$action$\rangle$ .   (8.80)

$\langle$event$\rangle$ corresponds to an internal event, i.e. an insert- or delete-operation. $\langle$condition$\rangle$ is a formula to be evaluated against the actual database state, e.g. it could be the negation $\neg I$ of a constraint $I$ in implicative normal form (8.76). $\langle$action$\rangle$ is a sequence of basic insert- or delete-operations to be triggered, i.e. to be executed if the event occurred and the condition is satisfied.



In the sequel the assumed execution model for ECA-rules relies on a deferred mode, i.e. the system RTS of rules is started after finishing a transaction. Furthermore, we do not assume any order of the rules. Instead, the execution model relies on demonic non-determinism, i.e. if the events of several rules $r_1, \ldots, r_n$ occur and their conditions evaluate to true, any of these $r_i$ may be executed unless it is undefined.
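A minimal Python sketch of such an execution model is given below. It is an illustration only: the names `Rule`, `run_deferred` and `max_steps` are hypothetical, the non-deterministic choice is simulated by a random pick, events are assumed to stay pending once they have occurred, and the proviso about undefined rules is ignored.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    event: tuple                        # e.g. ("insert", "WIRE")
    condition: Callable[[dict], bool]   # evaluated against the database state
    action: Callable[[dict], list]      # updates the database, returns raised events


def run_deferred(db, transaction_events, rules, rng=random, max_steps=1000):
    """Start the rule system after the transaction; pick any selectable rule."""
    pending = set(transaction_events)
    for _ in range(max_steps):
        selectable = [r for r in rules if r.event in pending and r.condition(db)]
        if not selectable:
            return db
        rule = rng.choice(selectable)       # no order on the rules is assumed
        pending |= set(rule.action(db))
    raise RuntimeError("rule system did not terminate within max_steps")


def missing_connections(db):
    return {w[1] for w in db["WIRE"]} - {u[1] for u in db["TUBE"]}


def add_tube(db):
    c = min(missing_connections(db))
    db["TUBE"].add((None, c, None))         # None stands for an arbitrary value
    return [("insert", "TUBE")]


db = {"WIRE": {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)},
      "TUBE": set()}
rule = Rule(("insert", "WIRE"), lambda d: bool(missing_connections(d)), add_tube)
run_deferred(db, [("insert", "WIRE")], [rule])
```

Because a rule stays selectable as long as its event is pending and its condition holds, the single rule fires once per missing connection, so its IF-part effectively behaves like a loop condition.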

Example 8.2. Let us look again at the schema used in Example 8.1. For the sake of simplicity we only consider the constraints ID_1 and ED. Then the changed operations can be expressed by rules. First consider insert_WIRE(t). The corresponding rule would be

ON insert_WIRE(w, c, t, v, p) IF $c \notin$ TUBE[connection] DO insert_TUBE(?, c, ?)   (8.81)

with ? standing for any value to be selected. This form is not yet exactly the one in (8.80), but writing relations as predicates we obtain the following:

ON insert_WIRE IF $\exists w, c, t, v, p.\ \forall x, t'.\ \mathrm{WIRE}(w, c, t, v, p) \wedge \neg \mathrm{TUBE}(x, c, t')$
DO insert_TUBE(?, c, ?) .   (8.82)

Note that the condition part is exactly the negation of ID_1. Analogously, the other changes to operations discussed in Example 8.1 give rise to the following rules:

ON delete_TUBE IF $\exists w, c, t, v, p.\ \forall x, t'.\ \mathrm{WIRE}(w, c, t, v, p) \wedge \neg \mathrm{TUBE}(x, c, t')$
DO delete_WIRE(w, c, t, v, p)   (8.83)

ON insert_WIRE IF $\exists w, c, t, v, p, c', t'.\ \mathrm{WIRE}(w, c, t, v, p) \wedge \mathrm{TUBE}(w, c', t')$
DO delete_TUBE(w, c', t')   (8.84)

ON insert_TUBE IF $\exists x, c, t, v, p, c', t'.\ \mathrm{WIRE}(x, c', t', v, p) \wedge \mathrm{TUBE}(x, c, t)$
DO delete_WIRE(x, c', t', v, p)   (8.85)

In order to fit with the intended behaviour described in Example 8.1 it may occur that the same rule has to be executed several times. This can be achieved if the semantics of the IF-part is considered as a WHILE-condition.

□

Given a single constraint $I$ in implicative normal form (8.76) we already get minimum requirements for repairing rules. If a relation symbol $p$ occurs on the left hand side (right hand side) of (8.76), then each insert- (delete-)operation on $p$ may violate (8.76), hence gives rise to event-parts. The corresponding condition-part is simply $\neg I$. However, for the action-part there are still several alternatives.
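The derivation of event-parts from a constraint can be sketched in Python as follows. This is a hedged illustration: the class `Constraint` and the function `violating_events` are hypothetical names, and a constraint is reduced to the relation symbols on its left and right hand side, with comparison predicates left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    name: str
    lhs: tuple   # relation symbols occurring in the premise
    rhs: tuple   # relation symbols occurring in the conclusion


def violating_events(constraint):
    """Inserts into premise relations and deletes from conclusion relations may violate I."""
    events = {("insert", p) for p in constraint.lhs}
    events |= {("delete", q) for q in constraint.rhs}
    return events


# ID_1: WIRE(w, c, t, v, p) implies that some TUBE(x, c, t') exists.
ID_1 = Constraint("ID_1", lhs=("WIRE",), rhs=("TUBE",))
# ED: WIRE(w, ...) and TUBE(w, ...) together imply false, so the conclusion is empty.
ED = Constraint("ED", lhs=("WIRE", "TUBE"), rhs=())

required = {(c.name, e) for c in (ID_1, ED) for e in violating_events(c)}
print(sorted(required))
```

For ID_1 and ED this yields four pairs of constraint and event, which correspond to the events of the rules (8.82) to (8.85) of Example 8.2.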

We call a system of ECA-rules complete iff for all these cases of events and conditions there exists at least one repairing rule, i.e. whenever the rule is selectable in some database state, the execution of the action part will establish $I$ as a postcondition. However, we exclude those rules which simply invalidate the event. For transactions we simply consider sequences of insert- and delete-operations.

Example 8.3. The four rules in the previous Example 8.2 form a complete system of ECA-rules, if we consider only the constraints ID$_1$ and ED from Example 8.1. However, if we also consider the other constraints in that example, we have to define at least five more rules to obtain a complete rule set: one rule for each of the three key constraints, corresponding to the events $\mathit{insert}_{\mathrm{WIRE}}$, $\mathit{insert}_{\mathrm{TUBE}}$ and $\mathit{insert}_{\mathrm{CONNECTION}}$, respectively, and two rules for the inclusion constraint ID$_2$, corresponding to $\mathit{insert}_{\mathrm{TUBE}}$ and $\mathit{delete}_{\mathrm{CONNECTION}}$.<br />

$\Box$<br />



8.3 Problems with Rule-Based Integrity Enforcement<br />

If we were given only a single constraint $I$, then any of the rule constructions discussed in the previous section would be sufficient to enforce consistency. However, real systems (like the tiny one in Example 8.1) contain many constraints and the interference of the rules may lead to problems.<br />

8.3.1 Non-Repairable Transactions<br />

Let us first demonstrate the insufficiency of a naive RTS approach using a second, trivial example. In "real" applications such as the one in the previous subsection, the situation of Example 8.4 will not occur in such an obvious way, but there are always implied constraints, in general not detectable, that lead to analogous problems.<br />

Example 8.4. Take two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv p(x) \wedge q(x) \Rightarrow \mathit{false}$. This implies $p$ to be always empty, hence insertions into $p$ should be abolished. Then we obtain the following repairing rules:<br />

$R_1$ : ON $\mathit{insert}_p$ IF $\exists x .\ p(x) \wedge \neg q(x)$ DO $\mathit{insert}_q(x)$<br />

$R_2$ : ON $\mathit{delete}_q$ IF $\exists x .\ p(x) \wedge \neg q(x)$ DO $\mathit{delete}_p(x)$<br />

$R_3$ : ON $\mathit{insert}_p$ IF $\exists x .\ p(x) \wedge q(x)$ DO $\mathit{delete}_q(x)$<br />

$R_4$ : ON $\mathit{insert}_q$ IF $\exists x .\ p(x) \wedge q(x)$ DO $\mathit{delete}_p(x)$<br />

Here again the condition part in $R_1$ and $R_2$ is simply $\neg I_1$ and the condition part in $R_3$ and $R_4$ is $\neg I_2$.<br />

If we try to execute a transaction $\mathit{insert}_p(a)$ on a database state satisfying $q(a)$, then we successively trigger the rules $R_3$ and $R_2$ with the effect of only deleting $a$ in $q$. This contradicts the original intention of the transaction.<br />

$\Box$<br />
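The behaviour described in Example 8.4 can be replayed with a small simulation. The following is only a sketch under simplifying assumptions: the relations are Python sets, each rule repairs one violating element per firing, and the order in which selectable rules are tried (as well as the name `run_rules`) is our own choice, not part of the rule system above.<br />

```python
def run_rules(state, event, max_steps=10):
    """Fire R1..R4 of Example 8.4 until no rule is selectable.

    state: {"p": set, "q": set}, taken AFTER the transaction has run.
    event: (operation, relation) raised by the last operation.
    Returns the final state and the list of fired rule names.
    """
    p, q = set(state["p"]), set(state["q"])
    trace = []
    for _ in range(max_steps):
        viol1 = sorted(x for x in p if x not in q)  # witnesses of "not I1"
        viol2 = sorted(x for x in p if x in q)      # witnesses of "not I2"
        if event == ("insert", "p") and viol2:      # R3: delete from q
            q.discard(viol2[0]); trace.append("R3"); event = ("delete", "q")
        elif event == ("insert", "p") and viol1:    # R1: insert into q
            q.add(viol1[0]); trace.append("R1"); event = ("insert", "q")
        elif event == ("delete", "q") and viol1:    # R2: delete from p
            p.discard(viol1[0]); trace.append("R2"); event = ("delete", "p")
        elif event == ("insert", "q") and viol2:    # R4: delete from p
            p.discard(viol2[0]); trace.append("R4"); event = ("delete", "p")
        else:
            break                                   # no rule selectable
    return {"p": p, "q": q}, trace


# insert_p(a) executed on a state satisfying q(a):
final, trace = run_rules({"p": {"a"}, "q": {"a"}}, ("insert", "p"))
print(trace, final)
```

In this run the inserted element vanishes from $p$ and, in addition, $a$ is removed from $q$, which is the unintended effect noted in the example.<br />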

In order to analyze the unintended behaviour in Example 8.4, consider a set $\Sigma$ of constraints in implicational normal form. Let $\Sigma^*$ denote the (semantic) closure, i.e. $\Sigma^* = \{ I \mid \Sigma \models I \}$. Now let $I \in \Sigma^*$ be non-trivial, i.e. it does not hold in all database states. Write $I$ in implicational normal form<br />

$I \equiv p_1(x_1) \wedge \dots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \dots \vee q_m(y_m)$<br />

and let $p_{i_1}, \dots, p_{i_k}$ and $q_{j_1}, \dots, q_{j_\ell}$ denote the relation symbols on the left and right hand sides of $I$ respectively. We may define a transaction $T$ by<br />

$\mathit{delete}_{q_{j_1}}(y_{j_1}) ; \dots ; \mathit{delete}_{q_{j_\ell}}(y_{j_\ell}) ; \mathit{insert}_{p_{i_1}}(x_{i_1}) ; \dots ; \mathit{insert}_{p_{i_k}}(x_{i_k})$ .<br />

If we start $T$ with values for the $x_i$ and $y_j$ such that the additional conditions on the left hand side of $I$ are satisfied, whilst the additional conditions on the right hand side are not, $T$ will always reach a database state satisfying $\neg I$. This effect of $T$ is intentional and hence the only reasonable approach to integrity maintenance in this case is to disallow such transactions.<br />

More formally, the effect of a transaction $T$ in a state $\sigma$ is given by the strongest (with respect to $\Rightarrow$) formula $E_\sigma(T) = \psi$ such that $\models_\sigma wp(T)(\psi)$ holds. Here $wp(T)(\psi)$ denotes the weakest precondition of $\psi$ under the transaction $T$, i.e. starting $T$ in initial state $\sigma$ will reach a final state satisfying $\psi$.<br />



Since we only consider sequences of insertions and deletions, $E_\sigma(T)$ can always be written as a conjunction of literals, i.e. in negated implicational normal form, with the positive literals corresponding to insertions and the negative ones to deletions. In addition, we may consider the effect of a sequence $T ; \mathit{RTS}$, where $T$ is a transaction and $\mathit{RTS}$ a system of rules. We say that $\mathit{RTS}$ invalidates the effect of $T$ iff $\not\models E_\sigma(T) \wedge E_\sigma(T ; \mathit{RTS})$ holds for some state $\sigma$.<br />

Then it is justified to call a transaction $T$ repairable with respect to the constraint set $\Sigma$ iff $\neg E_\sigma(T) \notin \Sigma^*$ holds for at least one state $\sigma$. A complete terminating system $\mathit{RTS}$ of ECA-rules then always invalidates the effect of a non-repairable transaction $T$. Hence the first problem is to detect (and exclude) non-repairable transactions. In order to decide whether a given transaction $T$ is repairable or not, we must be able to decide whether $\neg E_\sigma(T)$ is in the closure $\Sigma^*$. Hence the implication problem for constraints must be decidable.<br />
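For a ground, finite toy case the repairability test can be carried out by brute force. The sketch below assumes a single constant $a$, treats the effect of $T$ as a state-independent conjunction of ground literals, and decides membership in the closure by enumerating all truth assignments; the names `ATOMS`, `entails` and `repairable` are illustrative only and this is not a general decision procedure.<br />

```python
from itertools import product

ATOMS = ["p(a)", "q(a)"]  # ground atoms over one assumed constant a


def models(constraints):
    """Yield every truth assignment over ATOMS satisfying all constraints."""
    for values in product([False, True], repeat=len(ATOMS)):
        state = dict(zip(ATOMS, values))
        if all(c(state) for c in constraints):
            yield state


def entails(constraints, formula):
    """True iff the formula holds in every model of the constraints."""
    return all(formula(s) for s in models(constraints))


def repairable(constraints, effect):
    """effect: dict atom -> bool, the literals established by T.

    T counts as repairable iff the negated effect is NOT implied.
    """
    def negated_effect(state):
        return not all(state[a] == v for a, v in effect.items())
    return not entails(constraints, negated_effect)


# Constraints of Example 8.4, instantiated with the constant a.
I1 = lambda s: (not s["p(a)"]) or s["q(a)"]
I2 = lambda s: not (s["p(a)"] and s["q(a)"])

print(repairable([I1, I2], {"p(a)": True}))  # effect of insert_p(a)
print(repairable([I1, I2], {"q(a)": True}))  # effect of insert_q(a)
```

Under these assumptions the insertion into $p$ is reported as non-repairable, because the two constraints together imply $\neg p(a)$, whereas the insertion into $q$ is repairable.<br />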

Note that our treatment ignores the termination problem. Non-terminating transactions have to be excluded as well, but this problem is independent of the repairability problem, since non-termination of RTSs occurs as an orthogonal problem.<br />

8.3.2 Critical Paths<br />

Let us ask whether we can always find a complete set of repair rules for all repairable transactions. For this we introduce the notions of associated hypergraphs and critical trigger paths.<br />

Let $S = \{p_1, \dots, p_n\}$ be a relational database schema and $\mathit{RTS} = \{R_1, \dots, R_m\}$ a system of ECA-rules on $S$. Then the associated rule hypergraph $(V, E)$ is constructed as follows:<br />

- $V$ is the disjoint union of $S$ and $\mathit{RTS}$. We then talk of $S$-vertices and $\mathit{RTS}$-vertices respectively.<br />

- If $R \in \mathit{RTS}$ has event-part $Ev$ on $p \in S$ and actions on $p_1, \dots, p_k$, then we have a hyperedge from $p$ to $\{R\}$ labelled by $+$ or $-$ depending on $Ev$ being an insert or delete, and a hyperedge from $\{R\}$ to $\{p_1, \dots, p_k\}$ analogously labelled by $k$ values $+$ or $-$.<br />
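The construction can be sketched in a few lines. The representation chosen here (a `Rule` tuple, and hyperedges as triples of source, target tuple and label tuple) is an assumption made for illustration; rule names are assumed to be distinct from relation names so that the union is disjoint.<br />

```python
from collections import namedtuple

# event: (label, relation); actions: list of (label, relation); label is "+" or "-"
Rule = namedtuple("Rule", ["name", "event", "actions"])


def rule_hypergraph(schema, rules):
    """Build (V, E) for a schema (relation names) and a list of rules."""
    vertices = set(schema) | {r.name for r in rules}
    edges = []
    for r in rules:
        label, relation = r.event
        # hyperedge from the S-vertex of the event to the rule vertex
        edges.append((relation, (r.name,), (label,)))
        # hyperedge from the rule vertex to all relations it acts on
        targets = tuple(rel for _, rel in r.actions)
        labels = tuple(lab for lab, _ in r.actions)
        edges.append((r.name, targets, labels))
    return vertices, edges


# Rules of Example 8.4, conditions omitted (they play no role here).
rules_84 = [
    Rule("R1", ("+", "p"), [("+", "q")]),
    Rule("R2", ("-", "q"), [("-", "p")]),
    Rule("R3", ("+", "p"), [("-", "q")]),
    Rule("R4", ("+", "q"), [("-", "p")]),
]
V, E = rule_hypergraph(["p", "q"], rules_84)
for edge in E:
    print(edge)
```

Since every action-part above consists of a single operation, all target tuples have length one and the hypergraph is an ordinary labelled graph.<br />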

Example 8.5. Figure 8.1 shows the associated rule hypergraph of Example 8.4, in which case we have a simple graph. Note that whenever action-parts consist only of a single operation, the rule hypergraph degenerates to a graph.<br />

As a more practical example Figure 8.2 contains the associated hypergraph for Example 8.1 with the rules discussed in Example 8.2. In particular, rules $R_1$, $R_2$ and $R_3$ correspond to the functional dependencies FD$_1$, FD$_2$ and FD$_3$, rules $R_4$ and $R_5$ to the inclusion dependency ID$_1$, rules $R_6$ and $R_7$ to the inclusion dependency ID$_2$ and rules $R_8$ and $R_9$ to the exclusion dependency ED. Furthermore, we used the abbreviations $W$, $T$ and $C$ for WIRE, TUBE and CONNECTION, respectively.<br />

$\Box$<br />

So far we have ignored the condition parts of the rules. These come into play if we consider critical trigger paths in associated hypergraphs. Such paths are defined in several steps. First we start from paths in the associated hypergraph which correspond to possible sequences of ECA-rules with respect only to their event- and action-parts. Secondly we attach formulae to the $S$-vertices in the path in such a way that pre- and postconditions of the involved rules are expressed. Then we talk of trigger paths.<br />

A maximal trigger path with contradicting initial and final condition will then be called critical. Now imagine a transaction with an effect implied by the initial formula, i.e. there is an initial state such that running the transaction in this state results in a state which satisfies the initial condition of the trigger path. Executing this transaction followed by the rule triggering system along the critical trigger path will then turn the effect of the transaction into its opposite. This means that the RTS invalidates the effect of at least one transaction.<br />

[Figure omitted] Fig. 8.1. Associated Rule Hypergraph for $\mathit{RTS} = \{R_1, R_2, R_3, R_4\}$ in Example 8.4<br />

[Figure omitted] Fig. 8.2. Associated Rule Hypergraph for Example 8.1<br />

Let $G = (V, E)$ be the rule hypergraph associated with a system $\mathit{RTS}$ of rules. A trigger path in $G$ is a sequence $v_0, e_1, v'_1, e'_1, \dots, e'_\ell, v_\ell$ of vertices and hyperedges with the following conditions:<br />

- $v_i \in S$ holds for all $i = 0, \dots, \ell$,<br />

- $v'_i \in \mathit{RTS}$ holds for all $i = 1, \dots, \ell$,<br />

- $e_i$ is a hyperedge from $v_{i-1}$ to $v'_i$ and<br />

- $e'_i$ is a hyperedge from $v'_i$ to $V_i$ with $v_i \in V_i$ and the same label as $e_{i+1}$.<br />

We call $\ell$ the length of the trigger path.<br />

In addition we associate with each vertex $v_i \in S$ ($i = 0, \dots, \ell$) a formula $\varphi_i$ in negated implication normal form such that $\models \varphi_i \Rightarrow cond(v'_{i+1})$ holds for the condition part $cond(v'_{i+1})$ of rule $v'_{i+1} \in \mathit{RTS}$ and $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ holds for the action-part $A_{i+1}$ of rule $v'_{i+1}$ ($i = 0, \dots, \ell - 1$). Furthermore, there is no $e_{\ell+1} \in E$ from $v_\ell$ to $v'_{\ell+1}$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.<br />

Then a trigger path is critical iff $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds. Such a critical trigger path is called non-admissible iff there is a consistent state $\sigma$ and a repairable transaction $T$ such that $E_\sigma(T) \Leftrightarrow \varphi_0$ holds.<br />
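For formulae that are conjunctions of ground or uniformly quantified literals, the criticality condition reduces to finding a complementary pair of literals in $\varphi_0$ and $\varphi_\ell$. The sketch below assumes exactly this restricted setting, represents a conjunction as a dictionary from atoms to truth values, and uses the formulae of the two trigger paths for Example 8.4; the helper names are hypothetical and the wp-condition of the definition is not checked.<br />

```python
def implies(phi, cond):
    """phi => cond for conjunctions of literals: cond's literals occur in phi."""
    return all(phi.get(atom) == value for atom, value in cond.items())


def contradictory(phi_a, phi_b):
    """True iff the conjunction of phi_a and phi_b is unsatisfiable."""
    return any(atom in phi_b and phi_b[atom] != value
               for atom, value in phi_a.items())


def is_critical(formulas):
    """A trigger path is critical iff |= not(phi_0 and phi_l)."""
    return contradictory(formulas[0], formulas[-1])


# Formulae attached to the S-vertices of the two paths for Example 8.4.
path_r1_r4 = [{"p": True, "q": False}, {"p": True, "q": True},
              {"p": False, "q": True}]
path_r3_r2 = [{"p": True, "q": True}, {"p": True, "q": False},
              {"p": False, "q": False}]

# Condition parts of the rules: not I1 for R1/R2, not I2 for R3/R4.
cond = {"R1": {"p": True, "q": False}, "R2": {"p": True, "q": False},
        "R3": {"p": True, "q": True}, "R4": {"p": True, "q": True}}

print(implies(path_r1_r4[0], cond["R1"]), implies(path_r1_r4[1], cond["R4"]))
print(is_critical(path_r1_r4), is_critical(path_r3_r2))
```

Both paths start with $p(x)$ holding and end with $\neg p(x)$, so both are critical in the sense of the definition.<br />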



Critical trigger paths for the associated rule hypergraph in Figure 8.1 are sketched in Figure 8.3. Note that in this case both critical trigger paths are not non-admissible.<br />

[Figure omitted] Fig. 8.3. Critical Trigger Paths. Both paths have the form $v_0, e_1, v'_1, e'_1, v_1, e_2, v'_2, e'_2, v_2$:<br />

- $p \xrightarrow{+} R_1 \xrightarrow{+} q \xrightarrow{+} R_4 \xrightarrow{-} p$ with formulae $p(x) \wedge \neg q(x)$, $p(x) \wedge q(x)$ and $\neg p(x) \wedge q(x)$,<br />

- $p \xrightarrow{+} R_3 \xrightarrow{-} q \xrightarrow{-} R_2 \xrightarrow{-} p$ with formulae $p(x) \wedge q(x)$, $p(x) \wedge \neg q(x)$ and $\neg p(x) \wedge \neg q(x)$.<br />

If a critical trigger path is not non-admissible, then only a non-repairable transaction can be invalidated by running the rules in the trigger path. Since we exclude non-repairable transactions, we only have to consider non-admissible trigger paths. After these remarks we are able to state our next result:<br />

If $\mathit{RTS}$ is a complete set of rules associated with a set $\Sigma$ of constraints and $G = (V, E)$ is the associated rule hypergraph, then $G$ contains a non-admissible critical trigger path iff there exists a consistent database state $\sigma$ and a repairable transaction $T$ such that executing $T$ in $\sigma$ and consecutively running $\mathit{RTS}$ invalidates the effect of $T$ without leaving the database unchanged.<br />

To sketch a proof, consider the sequence $\varphi_0, \dots, \varphi_\ell$ of formulae associated with a critical trigger path. According to the label of $e_1$ being $+$ or $-$, $\varphi_0$ either contains a literal $p(x)$ or $\neg p(x)$. Choose a consistent state $\sigma$ with $\models_\sigma \neg p(x)$ or $\models_\sigma p(x)$, respectively, and a repairable transaction $T$ with $E_\sigma(T) \Leftrightarrow \varphi_0$. By the definition of critical trigger paths $\mathit{RTS}$ invalidates the effect of $T$. Finally, use induction on the length $\ell$ to show that the state resulting from $T$ followed by $\mathit{RTS}$ is different from $\sigma$.<br />

Conversely, if there is no non-admissible critical trigger path, let $T$ be a repairable transaction and $\sigma$ a database state which is consistent with respect to $\Sigma$. Now start $T$ in $\sigma$ and assume that the resulting state $\sigma'$ is not consistent. Then consider a trigger path of finite length such that $\models_{\sigma'} \varphi_0$ holds. The consecutive execution of the rules in this trigger path will result in a state satisfying $\varphi_\ell$. Thus, we have $E_\sigma(T) \Leftrightarrow \varphi_0$ and $E_\sigma(T ; \mathit{RTS}) \Leftrightarrow \varphi_\ell$. According to our assumption, the trigger path used cannot be critical. Hence $\mathit{RTS}$ does not invalidate the effect of $T$.<br />

The full proof is contained in [12] and [14, p.82f.].<br />

8.3.3 Extensions<br />

In our model the execution of a rule with condition-part $\neg I$ does not completely repair violations to the constraint $I$, since there may be more than just one violating tuple. There are two possible solutions to this problem:<br />

- The first of these solutions considers a WHILE-semantics for the rules. In this case the second condition for critical trigger paths has to be replaced by $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ with $A_{i+1}$ now representing the iteration of the action-part as long as the condition is satisfied. Example 8.6 shows a critical trigger path using WHILE-semantics.<br />

- The second solution extends the rule hypergraph, as if the action-part of each rule repeated the event. Of course, this is not necessary for rules that definitely repair all violations to $I$. Figure 8.4 extends the one in Figure 8.2 with respect to the rules $R_5$, $R_7$, $R_8$ and $R_9$. In Example 8.6 we discuss critical trigger paths with respect to this extension.<br />

[Figure omitted] Fig. 8.4. Extended Rule Hypergraph for Example 8.1<br />

Example 8.6. The first picture in Figure 8.5 shows a critical trigger path corresponding to the rule hypergraph in Figure 8.2 using WHILE-semantics. The formulae used are<br />

$\varphi_0 \equiv W(8511, \text{HH-HB}, \dots) \wedge W(4711, \text{HH-HB}, \dots) \wedge T(8511, \text{HH-HB}, \dots)$<br />

$\varphi_1 \equiv W(8511, \text{HH-HB}, \dots) \wedge W(4711, \text{HH-HB}, \dots) \wedge \neg T(8511, \text{HH-HB}, \dots)$<br />

$\varphi_2 \equiv \neg W(8511, \text{HH-HB}, \dots) \wedge \neg W(4711, \text{HH-HB}, \dots) \wedge \neg T(8511, \text{HH-HB}, \dots)$<br />

Using extensions to the hypergraph instead, as shown in Figure 8.4, gives rise to the critical trigger path in the second picture in Figure 8.5 using<br />

$\varphi'_2 \equiv \neg W(8511, \text{HH-HB}, \dots) \wedge W(4711, \text{HH-HB}, \dots) \wedge \neg T(8511, \text{HH-HB}, \dots)$<br />

and the same formulae $\varphi_0$, $\varphi_1$ and $\varphi_2$ as above.<br />

$\Box$<br />
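The WHILE-semantics of the first path can be replayed on a reduced version of the schema. The following sketch makes several assumptions that are not in the text: tuples are cut down to (identifier, connection) pairs because the remaining attributes are elided in the formulae, $R_8$ is taken to be rule (8.84) and $R_5$ to be rule (8.83), and the function names are invented for illustration.<br />

```python
def r8_while(wires, tubes):
    """Assumed rule (8.84): while a tube shares its id with a wire, delete it."""
    wires, tubes = set(wires), set(tubes)
    while True:
        wire_ids = {w for w, _ in wires}
        clash = sorted(t for t in tubes if t[0] in wire_ids)
        if not clash:                      # condition no longer holds
            return wires, tubes
        tubes.discard(clash[0])


def r5_while(wires, tubes):
    """Assumed rule (8.83): while a wire uses a connection without any tube,
    delete that wire."""
    wires, tubes = set(wires), set(tubes)
    while True:
        connections = {c for _, c in tubes}
        dangling = sorted(w for w in wires if w[1] not in connections)
        if not dangling:                   # condition no longer holds
            return wires, tubes
        wires.discard(dangling[0])


# State described by phi_0: two wires and one tube on connection HH-HB.
phi0 = ({(8511, "HH-HB"), (4711, "HH-HB")}, {(8511, "HH-HB")})
phi1 = r8_while(*phi0)   # tube 8511 is deleted
phi2 = r5_while(*phi1)   # both wires are deleted
print(phi1)
print(phi2)
```

With a plain IF-semantics the second step would remove only one wire, which corresponds to the intermediate formula $\varphi'_2$ of the extended hypergraph.<br />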

Neither extension affects the result stated above. To sketch a proof, the second solution is the same as adding "dummy" actions, i.e. those repeating the event, to the action part. Therefore, it corresponds to a slightly changed rule system with the same behaviour. Then the iteration in the first solution corresponds to rule iteration in the second solution.<br />

Analogously, if action-parts contain more than one operation, the critical trigger paths considered so far do not completely reflect the sequences of rule executions. However, extending hyperedges from $\mathit{RTS}$-nodes to $S$-nodes according to previously triggered rules captures this situation. Just as before, this does not affect our main result on critical trigger paths. Since the practical rule systems we are interested in only comprise simple action parts, we do not discuss further examples for this extension.<br />



[Figure omitted] Fig. 8.5. Critical Trigger Paths for Example 8.1. The first picture shows the path $W \xrightarrow{+} R_8 \xrightarrow{-} T \xrightarrow{-} R_5 \xrightarrow{-} W$ with formulae $\varphi_0$, $\varphi_1$, $\varphi_2$; the second picture shows a path through the rules $R_8$, $R_5$, $R_5$ with formulae $\varphi_0$, $\varphi_1$, $\varphi'_2$, $\varphi_2$.<br />

8.4 Well-behaving Rule Systems<br />

Let us now ask for constraint sets that allow us to define complete RTSs which exclude non-admissible critical trigger paths in their associated hypergraphs. Let us start with a simple example.<br />

Example 8.7. Take again two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv q(x) \Rightarrow p(x)$, which imply $p$ to be always equal to $q$. Then we obtain the following repairing rules:<br />

$R_1$ : ON $\mathit{insert}_p$ IF $\exists x .\ p(x) \wedge \neg q(x)$ DO $\mathit{insert}_q(x)$ (8.86)<br />

$R_2$ : ON $\mathit{delete}_q$ IF $\exists x .\ p(x) \wedge \neg q(x)$ DO $\mathit{delete}_p(x)$ (8.87)<br />

$R_3$ : ON $\mathit{insert}_q$ IF $\exists x .\ \neg p(x) \wedge q(x)$ DO $\mathit{insert}_p(x)$ (8.88)<br />

$R_4$ : ON $\mathit{delete}_p$ IF $\exists x .\ \neg p(x) \wedge q(x)$ DO $\mathit{delete}_q(x)$ (8.89)<br />

First observe that all edges in critical trigger paths are equally labelled with either $+$ or $-$. For the case of $+$ and $v_0 = p$ consider all constants $a$ such that $\models \varphi_0 \Rightarrow p(a) \wedge \neg q(a)$ holds, but from the definition of effects such a pair can only result from a non-repairable transaction $T$ or an inconsistent starting state $\sigma$. The same argument applies to the other cases. Hence there are no non-admissible critical trigger paths in the associated rule hypergraph.<br />

$\Box$<br />

8.4.1 Stratified Rule Systems<br />

Let us now investigate the reason for the absence of non-admissible critical trigger paths in Example 8.7. This leads us to the notion of a stratified set of constraints.<br />

The motivation behind this is as follows: In Example 8.7 insertions (deletions) on a relation $p$ only trigger insertions (deletions) on $q$ and vice versa. This should be sufficient for not invalidating an effect once it has been established. The corresponding constraints can therefore be grouped together.<br />

A set $\Sigma$ of constraints in implicative normal form (8.76) on a schema $S$ is called stratified iff we have a partition $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ with pairwise disjoint constraint sets $\Sigma_i$ called strata such that the following conditions are satisfied:<br />

(i) If $L_1, \dots, L_k$ is a sequence of literals on the left hand side (right hand side) of $I \in \Sigma_i$ and $J \in \Sigma$ contains a sequence $L'_1, \dots, L'_\ell$ of literals on the right hand side (left hand side) such that $\{L_1, \dots, L_k, L'_1, \dots, L'_\ell\}$ is unifiable, then $J$ must also lie in stratum $\Sigma_i$.<br />

(ii) If $I \neq J$ contain sequences of literals $L_1, \dots, L_k$ and $L'_1, \dots, L'_\ell$ both on the left (right) hand side such that $\{L_1, \dots, L_k, L'_1, \dots, L'_\ell\}$ is unifiable with most general unifier $\theta$ and $\neg\theta I$, $\neg\theta J$ contain unifiable literals on the right (left) hand side, then $I$ and $J$ must lie in different strata $\Sigma_i$ and $\Sigma_j$, unless one of the instances $\neg\theta I$ or $\neg\theta J$ is always satisfied.<br />

Example 8.8. Consider the constraints in Example 8.1 except the exclusion constraint ED. Then the first condition above requires ID$_1$ and ID$_2$ to lie in the same stratum. The same applies to FD$_3$ and ID$_2$ (or FD$_2$ and ID$_1$, respectively).<br />

We may also unify the left hand sides of FD$_2$ and ID$_2$ (or FD$_1$ and ID$_1$, respectively), but then the resulting instance of the functional constraint degenerates to true.<br />

$\Box$<br />

Our next result states that stratified constraint sets always give rise to RTSs without non-admissible critical trigger paths in the associated rule hypergraph.<br />

If $\Sigma$ is a stratified constraint set on a schema $S$, then there exists a complete RTS such that for any repairable transaction $T$ on $S$ the RTS does not invalidate the effect of $T$.<br />

To sketch a proof, consider $I$ in implicative normal form (8.76). For each relation symbol $p_i$ on the left hand side define rules<br />

ON $\mathit{insert}_{p_i}$ IF $\neg I$ DO $\mathit{insert}_{q_j}(y_j)$ and<br />

ON $\mathit{insert}_{p_i}$ IF $\neg I$ DO $\mathit{delete}_{p_j}(y_j)$<br />

with relation symbols $q_j$ occurring on the right hand side and $p_j$ ($j \neq i$) on the left hand side of $I$. Similarly, each predicate symbol $q_j$ on the right hand side gives rise to rules<br />

ON $\mathit{delete}_{q_j}$ IF $\neg I$ DO $\mathit{insert}_{q_i}(y_i)$ and<br />

ON $\mathit{delete}_{q_j}$ IF $\neg I$ DO $\mathit{delete}_{p_i}(y_i)$ .<br />

This defines a complete set $\mathit{RTS}$ of rules. Due to this rule construction the constraints corresponding to the rules in a critical trigger path all belong to the same stratum. However, the condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ implies that $\varphi_0$ contains a literal $L$ and $\varphi_\ell$ its negation, hence the construction of rules implies that $I_1$ and $I_\ell$, the constraints underlying the first and the last rule of the path, lie in different strata. This shows that there are no non-admissible critical trigger paths. The full proof can be found in [12, 14].<br />
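The rule construction used in the proof sketch is mechanical and can be written down directly. The sketch below works on relation symbols only, keeps the condition as the opaque string "not I", and assumes that in the second group the inserted symbol $q_i$ differs from the deleted $q_j$; the function name `repairing_rules` is hypothetical.<br />

```python
def repairing_rules(lhs, rhs):
    """Generate (event, condition, action) triples for a constraint
    lhs[0] & ... & lhs[n-1] => rhs[0] | ... | rhs[m-1]."""
    rules = []
    for i, p_i in enumerate(lhs):
        for q_j in rhs:                    # repair by inserting on the right
            rules.append((("insert", p_i), "not I", ("insert", q_j)))
        for j, p_j in enumerate(lhs):      # repair by deleting on the left
            if j != i:
                rules.append((("insert", p_i), "not I", ("delete", p_j)))
    for j, q_j in enumerate(rhs):
        for i, q_i in enumerate(rhs):      # repair by inserting on the right
            if i != j:
                rules.append((("delete", q_j), "not I", ("insert", q_i)))
        for p_i in lhs:                    # repair by deleting on the left
            rules.append((("delete", q_j), "not I", ("delete", p_i)))
    return rules


# I1 of Example 8.4: p(x) => q(x)
for rule in repairing_rules(["p"], ["q"]):
    print(rule)
# I2 of Example 8.4: p(x) & q(x) => false (empty right hand side)
for rule in repairing_rules(["p", "q"], []):
    print(rule)
```

Applied to the two constraints of Example 8.4, the construction yields one triple per rule $R_1$ to $R_4$ of that example.<br />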

8.4.2 Constraints Arising from Entity-Relationship Schemata<br />

Finally, we may ask for cases where stratified constraint sets occur. Recall from [9] that a relational database schema $S$ with constraint set $\Sigma$ is in Entity-Relationship normal form (ERNF), and hence is equivalent to an ER-schema, iff<br />

- all inclusion constraints in $\Sigma$ are key-based and non-redundant,<br />

- there is no cycle of inclusion constraints in $\Sigma$,<br />

- each relation schema $R \in S$ is in BCNF with respect to the functional dependencies in $\Sigma$ and<br />

- there are only inclusion and functional dependencies in $\Sigma$.<br />

If a relational database schema $S$ with constraint set $\Sigma$ is in ERNF, then a slight generalization of the argument given in Example 8.8 shows that $\Sigma$ is stratified. Indeed, property (i) forces inclusion dependencies $R[X_1] \subseteq S[Y_1]$ and $S[X_2] \subseteq T[Y_2]$ to belong to the same stratum. By the same property the key constraints defined by the $Y_i$ also belong to this stratum. Finally, property (ii) does not constrain pairs of constraints in $\Sigma$.<br />

Furthermore, we only obtain an acyclic set of functional and inclusion constraints, for which the implication problem is decidable [2]. Hence we are also able to detect unrepairable transactions. Following the design approach of Mannila and Räihä in [9] leads to schemata without any problems concerning consistency enforcement by RTSs.<br />

[Figure omitted] Fig. 8.6. Entity-Relationship constraints (diagram over $A$, $B$, $C$, $D$, $p$ and $q$ with cardinality constraints $(0, 1)$)<br />

Example 8.9. Let us look at the higher order Entity-Relationship diagram [15] in Figure 8.6, which leads to the constraints<br />

$I_1 : p(x, y) \Rightarrow q(x, z)$ and<br />

$I_2 : q(x, z) \wedge q(y, z) \Rightarrow x = y$ .<br />

Stratification property (i) applied to $q$ on the right hand side of $I_1$ and on the left hand side of $I_2$ forces $I_1$, $I_2$ to lie in the same stratum. Property (ii) is not applicable. Hence the constraint set is stratified. However, if we add a third constraint<br />

$I_3 \equiv p(x, z) \wedge q(y, z) \Rightarrow \mathit{false}$<br />

which in terms of the Entity-Relationship diagram in Figure 8.6 corresponds to an exclusion constraint $B \parallel D$, the new set $\{I_1, I_2, I_3\}$ of constraints is no longer stratified. This is due to the fact that stratification property (i) forces $I_1$ and $I_3$ to lie in the same stratum, whereas now stratification property (ii) forces $I_1$ (or analogously $I_2$) to lie in a stratum different from the one of $I_3$. Thus, there is no stratification satisfying both properties.<br />

$\Box$<br />
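Once the pairs forced together by property (i) and the pairs forced apart by property (ii) are known, the existence of a stratification is a simple union-find check. The sketch below takes those pairs as given input, exactly as stated in the example, and does not perform the unification that produces them; `stratify` is a hypothetical helper.<br />

```python
def stratify(constraints, same, different):
    """Return the coarsest forced grouping into strata, or None if some pair
    must lie both in the same and in different strata."""
    parent = {c: c for c in constraints}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]   # path halving
            c = parent[c]
        return c

    for a, b in same:                        # property (i): merge strata
        parent[find(a)] = find(b)
    for a, b in different:                   # property (ii): must be separated
        if find(a) == find(b):
            return None
    strata = {}
    for c in constraints:
        strata.setdefault(find(c), []).append(c)
    return sorted(strata.values())


# I1 and I2 alone: property (i) groups them, property (ii) is not applicable.
print(stratify(["I1", "I2"], [("I1", "I2")], []))

# Adding I3: (i) groups I1 with I3, (ii) separates I3 from I1 and I2.
print(stratify(["I1", "I2", "I3"],
               [("I1", "I2"), ("I1", "I3")],
               [("I1", "I3"), ("I2", "I3")]))
```

The second call returns `None`, reflecting that no partition can satisfy both properties at once.<br />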

[Figure omitted] Fig. 8.7. Entity-Relationship constraints corresponding to Example 8.1 (diagram over CONNECTION, TUBE and WIRE with the attributes connection, to, from, tube id, tube type, wire id, wire type, voltage and power and cardinality constraints $(1, 1)$)<br />



Example 8.10. Let us take another look at Example 8.1. We have already seen in Example 8.8 that the set of functional and inclusion constraints in this example is stratified. Again, adding the exclusion constraint ED destroys this property, since the stratification property (i) forces ED to belong to the same stratum as ID$_1$, whereas property (ii) implies that it lies in a different stratum. This is practically the same argument as in the previous Example 8.9.<br />

Again the schema corresponds to the Entity-Relationship diagram in Figure 8.7 with ED corresponding to the exclusion constraint $W[\text{wire id}] \parallel T[\text{tube id}]$ and ID$_1$ to the path inclusion constraint $W{:}C[\text{connection}] \subseteq T{:}C[\text{connection}]$.<br />

$\Box$<br />

8.4.3 Constra<strong>in</strong>ts Aris<strong>in</strong>g from Simple <strong>Object</strong>-<strong>Oriented</strong> Schemata<br />

A similar situation arises for simple schemata in object-oriented data models. The OODM<br />
investigated in [11] distinguishes between objects and values. Types are used to describe<br />
immutable sets of values with (type-)operations predefined on them. Type systems are prescriptions<br />
for the syntax and semantics of permitted type definitions. We may always consider<br />
type systems that consist of some base types, type constructors and a subtyping relation.<br />
E.g., base types could be BOOL, NAT, INT, STRING, ID or OK, where ID is an abstract<br />
identifier type without any non-trivial supertype and OK is the trivial type (which has exactly<br />
one value ok). Type constructors could be record types $(a_1 : \alpha_1, \ldots, a_n : \alpha_n)$ and finite set<br />
types $\{\alpha\}$.<br />

We may use base types and constructors to define new types by nesting. In addition, we<br />
may build parameterized types by leaving type variables in constructors uninstantiated. Then<br />
a type $T$ is called proper iff the number of its parameters is 0. $T$ is called a value type iff there<br />
is no occurrence of ID in $T$. If $T'$ is a proper type occurring in a type $T$, then there exists a<br />
corresponding occurrence relation $o : T \times T' \to BOOL$ with $o(v_1, v_2) = true$ iff $v_2$ occurs<br />
in $v_1$ at the position indicated by the position of $T'$ in $T$. Each subtype relation $T_1 \leq T_2$ as<br />
above defines a subtype function $T_1 \to T_2$ on the corresponding sets of values.<br />
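The notions of parameterized, proper and value types can be illustrated with a small sketch.<br />
The following Python fragment is only an illustration under our own assumptions: the class names<br />
`Base`, `Var`, `Record` and `SetOf` and all helper functions are hypothetical and are not taken from [11].<br />

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    """A base type such as BOOL, NAT, STRING, ID or OK."""
    name: str


@dataclass(frozen=True)
class Var:
    """An uninstantiated type variable (a parameter)."""
    name: str


@dataclass(frozen=True)
class Record:
    """Record constructor (a_1 : t_1, ..., a_n : t_n)."""
    fields: Tuple[Tuple[str, "Type"], ...]


@dataclass(frozen=True)
class SetOf:
    """Finite set constructor {t}."""
    element: "Type"


Type = Union[Base, Var, Record, SetOf]


def components(t):
    """Yield t and all types nested inside it."""
    yield t
    if isinstance(t, Record):
        for _, field_type in t.fields:
            yield from components(field_type)
    elif isinstance(t, SetOf):
        yield from components(t.element)


def parameters(t):
    """Names of the type variables occurring in t."""
    return {c.name for c in components(t) if isinstance(c, Var)}


def is_proper(t):
    """A type is proper iff it has no parameters."""
    return not parameters(t)


def is_value_type(t):
    """A type is a value type iff ID does not occur in it."""
    return all(c != Base("ID") for c in components(t))
```

For instance, a record with a field of type `SetOf(Var("alpha"))` is a value type that is not proper,<br />
whereas replacing the variable by `Base("ID")` yields a proper type that is no longer a value type.<br />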

The class concept provides the grouping of objects having the same structure and behaviour.<br />
Structurally this uniformly combines aspects of object values and references. Behaviourally,<br />
this abstracts from operations on single objects including their creation and<br />
deletion.<br />

Since identifiers can be represented using ID, values and references can be combined into<br />
a representation type, where each occurrence of ID denotes references to some other class.<br />
Therefore, we may define the structure of a class using parameterized types.<br />

If $T$ is a value type with parameters $\alpha_1, \ldots, \alpha_n$ and if the parameters are replaced by<br />
pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression is called<br />
a structure expression. A class consists of a class name $C$, a structure expression $S$, a set of<br />
class names $D_1, \ldots, D_m$ (called superclasses) and a set of operations. We call $r_i$ the reference<br />
named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$<br />
by the type ID is called the representation type $T_C$ of the class $C$.<br />

A database schema $\mathcal{S}$ is given by a finite collection of type and class definitions such that<br />
all types, classes and operations occurring within type definitions, structure definitions and<br />
operations are defined in $\mathcal{S}$.<br />

Then an instance $\mathcal{D}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{(ident : ID, value : T_C)\}$<br />
such that the following conditions are satisfied:<br />

- For each class $C$ identifiers must be unique.<br />

- The set of identifiers in a subclass $C$ is a subset of the one in the superclass $C'$. Moreover,<br />
if $T_C \leq T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then $(i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C')$<br />
holds.<br />

- For each reference $r$ from $C$ to $D$, identifiers $j$ occurring in a value $v$ of an object in $C$<br />
with respect to the occurrence relation $o_r$, i.e. $(i, v) \in \mathcal{D}(C)$ and $o_r(v, j)$ hold, must occur<br />
in $\mathcal{D}(D)$.<br />
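The three conditions can be checked mechanically. The sketch below is a simplified illustration, not<br />
the authors' formalization: it assumes flat record values stored as Python dicts, it checks only the<br />
identifier part of the subclass condition (the subtype function $f$ is omitted), and the function name<br />
`check_instance` and the dictionary layout are hypothetical.<br />

```python
def check_instance(schema, instance):
    """Return a list of violated conditions (empty if the instance is valid).

    schema:   class name -> {"isa": [superclass, ...], "refs": {attribute: class}}
    instance: class name -> list of (identifier, value) pairs, value being a dict
    """
    errors = []
    idents = {}
    # Condition 1: identifiers are unique within each class.
    for cls in schema:
        ids = [i for i, _ in instance.get(cls, [])]
        if len(ids) != len(set(ids)):
            errors.append(f"{cls}: identifiers are not unique")
        idents[cls] = set(ids)
    for cls, spec in schema.items():
        # Condition 2 (identifier part only): subclass identifiers occur in the superclass.
        for sup in spec["isa"]:
            if not idents[cls] <= idents[sup]:
                errors.append(f"{cls}: identifiers not contained in {sup}")
        # Condition 3: every referenced identifier occurs in the target class.
        for attr, target in spec["refs"].items():
            for i, value in instance.get(cls, []):
                if value[attr] not in idents[target]:
                    errors.append(f"{cls}.{attr}: dangling reference in object {i}")
    return errors
```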

Let us consider only simple schemata as they occur in most practical object-oriented systems.<br />
In such a schema structure expressions always have the form $(a_1 : T_1, \ldots, a_n : T_n)$, where $T_i$<br />
is either a value type or a class name. In the latter case $a_i$ is a reference. In accordance with<br />
many practical systems we may then call $a_i$ an attribute.<br />

Example 8.11. Let us consider a simple university schema adapted from [11]:<br />

Class PersonC<br />
  Structure (PersonIdentityNo : NAT, Address : STRING)<br />
Class MarriedPersonC<br />
  IsA PersonC<br />
  Structure (Spouse : MarriedPersonC)<br />
Class StudentC<br />
  IsA PersonC<br />
  Structure (StudNo : NAT, Name : STRING, Supervisor : ProfessorC,<br />
    Major : DepartmentC, Minor : DepartmentC)<br />
Class ProfessorC<br />
  IsA PersonC<br />
  Structure (Age : NAT, Salary : NAT, Faculty : DepartmentC)<br />
Class DepartmentC<br />
  Structure (DeptName : STRING, Head : ProfessorC)<br />

This schema can be translated, using some self-explanatory abbreviations for the attribute<br />
names, into a relational one with the following relation schemata:<br />

Person = (id, pino, address),<br />
MPerson = (id, spouse),<br />
Student = (id, sno, name, sup, major, minor),<br />
Prof = (id, age, salary, faculty),<br />
Dept = (id, dname, head).<br />

In addition, we get the following functional and inclusion dependencies:<br />

$Person : id \to pino, address$, $MPerson : id \to spouse$,<br />
$Student : id \to sno, name, sup, major, minor$,<br />
$Prof : id \to age, salary, faculty$, $Dept : id \to dname, head$,<br />
$MPerson[id] \subseteq Person[id]$, $Student[id] \subseteq Person[id]$, $Prof[id] \subseteq Person[id]$,<br />
$MPerson[spouse] \subseteq MPerson[id]$, $Student[sup] \subseteq Prof[id]$,<br />
$Student[major] \subseteq Dept[id]$, $Student[minor] \subseteq Dept[id]$,<br />
$Prof[faculty] \subseteq Dept[id]$, $Dept[head] \subseteq Prof[id]$.<br />



Note that all these relations have an attribute id with the type ID as its domain. Furthermore,<br />
all inclusion constraints defined by the schema are key-based with just this attribute occurring<br />
on the right-hand side. The inclusion constraints that stem from subclassing also have id on<br />
their left-hand sides. In particular, all inclusion constraints are unary.<br />
□<br />
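The translation used in Example 8.11 follows a simple pattern, which the following sketch mimics.<br />
It is an illustration under our own assumptions: the function name `translate` and the input layout<br />
are hypothetical, and class and attribute names are kept unabbreviated, unlike in the example.<br />

```python
def translate(classes):
    """Translate simple class definitions into relation schemata and dependencies.

    classes: class name -> (list of superclasses, list of (attribute, type) pairs)
    Returns (schemata, functional dependencies, inclusion dependencies).
    """
    schemata, fds, inds = {}, [], []
    for name, (supers, attrs) in classes.items():
        columns = ["id"] + [attr for attr, _ in attrs]
        schemata[name] = columns
        # The identifier determines all other attributes of the relation.
        fds.append((name, ["id"], columns[1:]))
        # Subclassing yields an inclusion dependency with id on both sides.
        for sup in supers:
            inds.append(((name, "id"), (sup, "id")))
        # An attribute typed by a class name is a reference to that class.
        for attr, typ in attrs:
            if typ in classes:
                inds.append(((name, attr), (typ, "id")))
    return schemata, fds, inds
```

By construction every inclusion dependency produced this way is unary and has id on its right-hand side.<br />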

The observations made in Example 8.11 can be generalized. From the definition of the conditions<br />
to be satisfied by instances, each inclusion constraint defined by a simple object-oriented<br />
schema is key-based with the identifier attribute id occurring on its right-hand side. Furthermore,<br />
id defines a key for each relation. However, due to the use of the set type constructor,<br />
relations may fail to be in first normal form.<br />

With these observations concerning the nature of the constraint sets $\Sigma$ arising from transforming<br />
simple object-oriented schemata, we may repeat our arguments used for Entity-Relationship<br />
constraints to see that $\Sigma$ is stratified. Indeed, property (i) forces inclusion dependencies<br />
$R[a] \subseteq S[id]$ and $S[b] \subseteq T[id]$ to belong to the same stratum. By the same property<br />
the key constraints defined by the attributes id on $S$ or $T$ also belong to this stratum. Finally,<br />
property (ii) does not constrain pairs of constraints in $\Sigma$.<br />

Since all inclusion constraints in $\Sigma$ are unary, the implication problem is decidable [7].<br />
Therefore, we are also able to detect non-repairable transactions.<br />
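Whether a given set of inclusion dependencies is unary (the object-oriented case) or acyclic (the<br />
property used for the Entity-Relationship case) is easy to test. The sketch below is an illustration<br />
only; the representation of a dependency as a pair `((R, X), (S, Y))` and both function names are<br />
assumptions, and acyclicity is tested on the relation level.<br />

```python
def all_unary(inds):
    """True iff every inclusion dependency has one attribute on each side."""
    return all(len(lhs) == 1 and len(rhs) == 1 for (_, lhs), (_, rhs) in inds)


def is_acyclic(inds):
    """True iff the relation-level graph R -> S (for R[X] <= S[Y]) has no cycle."""
    graph = {}
    for (r, _), (s, _) in inds:
        graph.setdefault(r, set()).add(s)
        graph.setdefault(s, set())
    state = {}  # relation -> "open" while on the DFS stack, "done" afterwards

    def visit(rel):
        if state.get(rel) == "done":
            return True
        if state.get(rel) == "open":
            return False  # back edge: a cycle was found
        state[rel] = "open"
        ok = all(visit(nxt) for nxt in graph[rel])
        state[rel] = "done"
        return ok

    return all(visit(rel) for rel in graph)
```

Applied to Example 8.11, the dependency from spouse to id within MPerson makes the set cyclic, yet all<br />
dependencies are unary, which is why the unary result [7] is the one that applies here.<br />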

8.5 Conflict Resolution<br />

Referential actions are special rules to cope with violations of a foreign key constraint $R_1[X] \subseteq R_2[Y]$.<br />
Note that all inclusion constraints in Entity-Relationship and object-oriented models<br />
considered so far have this form. As in SQL we only consider the case of delete and update operations<br />
on $R_2$, i.e. we consider the deletion (or update) of a tuple $t_2 \in \mathcal{I}(R_2)$. If this leads<br />
to a constraint violation, there must be at least one tuple $t_1 \in \mathcal{I}(R_1)$ with $t_1[X] = t_2[Y]$. The<br />
following actions have been suggested:<br />

cascade: Also delete $t_1$ (or update the values for the attributes in $X$ such that the constraint<br />
violation disappears). If there is more than one such tuple, the action is applied to all of<br />
them.<br />

set null: Set the values for the attributes in $X$ to a null value.<br />

restrict: Reject the deletion or update on $R_2$ and roll back.<br />

In the first two cases we have a reaction by propagation, since referencing tuples also disappear<br />
from the instance.<br />
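The three actions can be sketched for the deletion case as follows. This is a minimal illustration<br />
under our own assumptions: relations are lists of Python dicts, `None` plays the role of the null<br />
value, and the function name `delete_referenced` is hypothetical.<br />

```python
ACTIONS = ("cascade", "set null", "restrict")


def delete_referenced(r1, r2, x, y, t2, action):
    """Delete t2 from r2 and react to violations of R1[X] <= R2[Y].

    Relations are lists of dicts; x and y are lists of attribute names.
    Returns the new pair (r1, r2); on rollback the inputs are returned unchanged.
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown referential action: {action}")
    key = tuple(t2[a] for a in y)
    referencing = [t for t in r1 if tuple(t[a] for a in x) == key]
    if action == "restrict" and referencing:
        return r1, r2  # reject the deletion and roll back
    new_r2 = [t for t in r2 if t != t2]
    if action == "cascade":
        # Delete all referencing tuples as well.
        new_r1 = [t for t in r1 if t not in referencing]
    elif action == "set null":
        # Overwrite the attributes in X of all referencing tuples with None.
        new_r1 = [{**t, **dict.fromkeys(x)} if t in referencing else t for t in r1]
    else:
        new_r1 = list(r1)  # restrict with no referencing tuples
    return new_r1, new_r2
```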

8.5.1 Problem of Ambiguity<br />

Assume that we have associated a referential action with all constraints in $I$. Then the problem<br />
occurs that the final result of an operation depends on the order of applying referential actions.<br />

A propagation path (for short: p-path) is a sequence $R_n[X_n], \ldots, R_1[X_1]$ such that there are<br />
constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$ in $I$ with $X_i \subseteq Y'_i \cap Y_i$ for $i = 2, \ldots, n$, all these constraints<br />
are associated with a referential action of kind cascade or set null, and $R_{i-1}[X_{i-1}] \subseteq R_i[X_i]$<br />
is in $I^*$.<br />

A restriction path (for short: r-path) is a sequence $R_n[X_n], \ldots, R_1[X_1]$ such that there are<br />
constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$ in $I$ with $X_i \subseteq Y'_i \cap Y_i$ for $i = 2, \ldots, n$, where $R_1[Y'_1] \subseteq R_2[Y_2]$<br />
is associated with a referential action of kind restrict and all other constraints are associated<br />
with a referential action of kind cascade or set null, and $R_{i-1}[X_{i-1}] \subseteq R_i[X_i]$ is in $I^*$.<br />

A p-path $R_n[X_n], \ldots, R_1[X_1]$ is called a phantom iff there is an r-path $R'_m[X'_m], \ldots, R'_1[X'_1]$<br />
with $R'_m = R_n$, $X'_m = X_n$ and an inclusion constraint $R_1[X_1] \subseteq R'_1[X'_1]$ in $\mathit{Dep}^*$.<br />

A schema $\mathcal{S}$ has a conflict iff there is a p-path $R_n[X_n], \ldots, R_1[X_1]$ corresponding to<br />
constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$, an r-path $R'_m[X'_m], \ldots, R'_1[X'_1]$ corresponding to constraints<br />
$R'_{i-1}[Z'_{i-1}] \subseteq R'_i[Z_i]$ with $R_n[X_n] = R'_m[X'_m]$ and $R_1[X_1] = R'_1[X'_1]$, an instance $\mathcal{I}$ and tuples<br />
$t_n, \ldots, t_1$ in $\mathcal{I}(R_n), \ldots, \mathcal{I}(R_1)$ with $t_i[Y_i] = t_{i-1}[Y'_{i-1}]$ and tuples $t'_m, \ldots, t'_1$ in $\mathcal{I}(R'_m), \ldots, \mathcal{I}(R'_1)$<br />
with $t'_i[Z_i] = t'_{i-1}[Z'_{i-1}]$ such that $t_n = t'_m$ and $t_1 = t'_1$ hold. A conflict is called a phantom iff<br />
the involved p-path is a phantom.<br />

The condition $t_1 = t'_1$ could be omitted, since the existence of tuple sequences satisfying<br />
all other conditions implies the existence of tuples as claimed in the definition.<br />

Fig. 8.8. Entity-Relationship schema leading to a conflict<br />
[Diagram not reproduced: entity types Person (pino, name, address), Car (registration) and Clinic (cl_name); relationship types Driver (licence_no), Patient (date) and Accident (costs).]<br />

Example 8.12. Consider the Entity-Relationship diagram in Figure 8.8. Transforming it to<br />
a relational schema gives rise to the relation schemata<br />

Person = (pino, name, address),<br />
Car = (registration),<br />
Driver = (pino, registration, licence_no),<br />
Clinic = (cl_name),<br />
Patient = (pino, cl_name, date) and<br />
Accident = (pino, registration, cl_name, costs)<br />

together with the inclusion dependencies<br />

$Driver[pino] \subseteq Person[pino]$,<br />
$Accident[pino, cl\_name] \subseteq Patient[pino, cl\_name]$,<br />
$Accident[pino, registration] \subseteq Driver[pino, registration]$,<br />
$Patient[pino] \subseteq Person[pino]$.<br />

Now suppose that the last of these dependencies has been equipped with the referential action<br />
restrict, whilst all others are equipped with cascade. Then Person[pino], Driver[pino],<br />
Accident[pino, registration] is a p-path and Person[pino], Patient[pino], Accident[pino, cl_name]<br />
is an r-path. These two paths show that the schema has a conflict.<br />

Now extend the schema as shown in Figure 8.9. We obtain the additional relation schema<br />
Bad_Driver = (pino) with the inclusion dependency $Bad\_Driver[pino] \subseteq Person[pino]$. Assume<br />
this dependency to be equipped with the referential action restrict. Then Person[pino],<br />
Bad_Driver[pino] constitutes another r-path.<br />

If (for reasons beyond the small section of constraints considered) we derive the inclusion<br />
dependency $Accident[pino] \subseteq Bad\_Driver[pino] \in \mathit{Dep}^*$, then the above conflict will be a<br />
phantom.<br />
□<br />

Fig. 8.9. Entity-Relationship schema leading to a phantom conflict<br />
[Diagram not reproduced: the schema of Figure 8.8 extended by Bad_Driver, attached to Person with cardinality (0,1).]<br />

If there is a conflict, then a deletion or update for the tuple $t_n = t'_m$ violates the constraints<br />
$R_{n-1}[Y'_{n-1}] \subseteq R_n[Y_n]$ and $R'_{m-1}[Z'_{m-1}] \subseteq R'_m[Z_m]$. Executing the corresponding referential<br />
actions violates the "next" foreign key constraints along the p-path or r-path respectively. Depending<br />
on the order of the referential actions, the tuple $t_1 = t'_1$ is either deleted according<br />
to the actions along the p-path, so that no constraint violation for $R'_1[Z'_1] \subseteq R'_2[Z_2]$<br />
can occur, or it leads to a rollback according to the actions along the r-path. This is the core<br />
of the ambiguity problem.<br />

However, if it is a phantom conflict, we also have an r-path $R''_k[X''_k], \ldots, R''_1[X''_1]$ with $R''_k = R_n$<br />
and $X''_k = X_n$, with foreign key constraints $R''_{i-1}[U'_i] \subseteq R''_i[U_i]$, and an inclusion constraint<br />
$R_1[X_1] \subseteq R''_1[X''_1]$. Hence there are also tuples $t''_k, \ldots, t''_1$ with $t''_i[U_i] = t''_{i-1}[U'_i]$. Hence the<br />
tuple $t''_1$ enforces a rollback and there is no ambiguity.<br />

Thus, the ambiguity problem is to decide whether a given schema $\mathcal{S}$ together with a set $\mathit{Dep} = K \cup I$<br />
of minimal key and referential key constraints has a non-phantom conflict or not.<br />
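The order dependence can be reproduced on a toy instance. The sketch below uses a hypothetical<br />
diamond-shaped schema that is not taken from the text: relations Top, Left, Right and Bottom, where<br />
Left and Right reference Top with cascade, Bottom references Left with cascade, and Bottom references<br />
Right with restrict, in line with the definition of an r-path. The function name `delete_top` and the<br />
set-based encoding are assumptions made for illustration.<br />

```python
from copy import deepcopy


def delete_top(db, key, order):
    """Delete `key` from Top, applying the referential actions in the given order.

    db maps relation names to sets; Bottom holds pairs (left_key, right_key).
    Returns (database, status); on rollback the original database is returned.
    """
    work = deepcopy(db)
    work["Top"].discard(key)
    for step in order:
        if step == "left":
            # Left references Top with cascade; Bottom references Left with cascade.
            work["Left"].discard(key)
            work["Bottom"] = {(l, r) for (l, r) in work["Bottom"] if l != key}
        elif step == "right":
            # Right references Top with cascade; Bottom references Right with restrict.
            work["Right"].discard(key)
            if any(r == key for (_, r) in work["Bottom"]):
                return db, "rollback"
    return work, "commit"
```

With one tuple per relation, processing the left branch first removes the Bottom tuple before the<br />
restrict action is examined, so the deletion commits; processing the right branch first triggers a rollback.<br />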

8.5.2 Decidability<br />

In order to show that ambiguity as defined above is decidable, we first recall that implication<br />
for inclusion dependencies alone is decidable [2]. Thus, we can compute all p-paths and r-paths.<br />
Since a conflict corresponds to a "diamond" with a p-path and an r-path, the existence<br />
of conflicts is obviously decidable and we only have to discard phantom p-paths. For this we<br />
have to decide whether an arbitrary inclusion constraint ($R_1[X_1] \subseteq R'_1[X'_1]$ in the definition<br />
of phantom p-paths) is in $\mathit{Dep}^*$. Thus, the existence of non-phantom conflicts is decidable iff<br />
for any inclusion constraint $\varphi$ it is decidable whether $\mathit{Dep} \models \varphi$ holds.<br />

We have seen in the previous section that constraint sets defined by Entity-Relationship<br />
or simple object-oriented schemata only contain functional and inclusion dependencies. We<br />
know that for arbitrary sets of functional and inclusion constraints the implication problem<br />
$\mathit{Dep} \models \varphi$ is undecidable [6], but for the Entity-Relationship case the inclusion dependencies<br />
are acyclic. Then it is well known [1] that the implication problem $\mathit{Dep} \models \varphi$ is decidable.<br />

For the object-oriented case all inclusion dependencies are unary. For this case it is also<br />
well known [7] that the implication problem $\mathit{Dep} \models \varphi$ is decidable.<br />

Therefore, for both cases of constraint sets considered in this paper, those resulting from<br />
Entity-Relationship schemata and those arising from simple object-oriented schemata, the<br />
ambiguity problem for referential actions is decidable.<br />
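The search for "diamonds" can be sketched as a graph traversal. The fragment below is a deliberately<br />
coarse approximation and not the decision procedure itself: it works on the relation level only, ignores<br />
the attribute sets $X_i$, and does not discard phantoms, which would require an implication test for<br />
inclusion constraints. The function name `conflict_candidates` and the triple encoding<br />
`(referencing, referenced, action)` are assumptions.<br />

```python
from collections import defaultdict

PROPAGATING = {"cascade", "set null"}


def conflict_candidates(constraints):
    """Return pairs (start, end) joined by both a p-path and an r-path.

    constraints: iterable of (referencing, referenced, action) triples.
    """
    children = defaultdict(list)
    for child, parent, action in constraints:
        children[parent].append((child, action))

    def reach(start):
        """End points of p-paths and of r-paths starting at `start`."""
        p_ends, r_ends = set(), set()
        stack, seen = [start], {start}
        while stack:
            rel = stack.pop()
            for child, action in children[rel]:
                if action == "restrict":
                    # A restrict constraint terminates an r-path.
                    r_ends.add(child)
                elif action in PROPAGATING:
                    p_ends.add(child)
                    if child not in seen:
                        seen.add(child)
                        stack.append(child)
        return p_ends, r_ends

    relations = {rel for child, parent, _ in constraints for rel in (child, parent)}
    result = set()
    for start in relations:
        p_ends, r_ends = reach(start)
        result.update((start, end) for end in p_ends & r_ends)
    return result
```

Each reported pair still has to be examined on the attribute level and tested for being a phantom.<br />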

8.6 Conclusion<br />

In this paper we investigated rule triggering systems (RTSs) for maintaining consistency<br />
arising from implicit constraints in Entity-Relationship and object-oriented models. Unfortunately,<br />
there always exist non-repairable transactions. In order to disallow such transactions<br />
the constraint implication problem must be decidable, which is the case for both models. In<br />
the first case we are in the situation of acyclic inclusion constraints, whereas in the second<br />
case we only obtain unary inclusion constraints.<br />

Secondly, we analyzed critical trigger paths in rule hypergraphs associated with RTSs. We<br />
could show that the existence of critical trigger paths leads to RTSs which may invalidate the<br />
effect of some transactions, even if these are repairable. Such behaviour can be excluded for<br />
stratified constraint sets, a property that holds for the constraint sets arising from Entity-Relationship<br />
and object-oriented models.<br />

Thirdly, we investigated the ambiguity problem for rules for the case that rollback is<br />
allowed in the action part. This again can be reduced to the decidability problem for constraint<br />
implication, hence decidability holds for the chosen models.<br />

To summarize, the general applicability of RTSs for integrity maintenance is limited, if we<br />
assume that the intended effects of user-defined transactions should be preserved. Fortunately,<br />
conflicts do not occur or can be detected efficiently if we only consider constraints arising from<br />
conceptual design with Entity-Relationship and certain object-oriented models.<br />

References for Chapter 8<br />

1. S. Abiteboul, R. Hull, V. Vianu: Foundations of Databases, Addison-Wesley, 1995.<br />
2. M. A. Casanova, R. Fagin, C. H. Papadimitriou: Inclusion dependencies and their interaction with functional dependencies, Journal of Computer and System Sciences 28 (1), 1984, 29-59.<br />
3. S. Ceri, J. Widom: Deriving Production Rules for Constraint Maintenance, Proc. 16th Conf. on VLDB, Brisbane (Australia), August 1990, 566-577.<br />
4. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for Integrity Maintenance, ACM ToDS, vol. 19 (3), 1994, 367-422.<br />
5. S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering - Active Databases, Proc., Houston, February 1994.<br />
6. A. K. Chandra, M. Y. Vardi: The implication problem for functional and inclusion dependencies is undecidable, SIAM Journal on Computing 14, 1985, 671-677.<br />
7. S. S. Cosmadakis, P. Kanellakis, M. Y. Vardi: Polynomial-time implication problems for unary inclusion dependencies, Journal of the ACM 37, 1990, 15-46.<br />
8. M. Gertz, U. W. Lipeck: Deriving Integrity Maintaining Triggers from Transaction Graphs, in Proc. 9th ICDE, IEEE Computer Society Press, 1993, 22-29.<br />
9. H. Mannila, K.-J. Räihä: The Design of Relational Databases, Addison-Wesley, 1992.<br />
10. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases, in S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering - Active Databases, Proc., Houston, February 1994.<br />
11. K.-D. Schewe, B. Thalheim: Fundamental concepts of object oriented databases, Acta Cybernetica, vol. 11 (1/2), Szeged 1993, 49-84.<br />
12. K.-D. Schewe, B. Thalheim: Active Consistency Enforcement for Repairable Database Transitions, in S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in Databases, Proc. 6th Int. Workshop on Foundations of Models and Languages for Data and Objects, Schloss Dagstuhl, 1996, 87-102, available via http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html.<br />
13. K.-D. Schewe: Well-Behaving Rule Systems for Entity-Relationship and Object Oriented Models, in D. W. Embley, R. C. Goldstein (Eds.): Conceptual Modeling - ER '97, Springer LNCS 1331, 1997, 141-154.<br />
14. K.-D. Schewe, B. Thalheim: On the Strength of Rule Triggering Systems for Integrity Maintenance, in C. McDonald (Ed.): Database Systems, Proc. 9th Australasian Database Conference, Perth 1998, published as Australian Computer Science Communications, vol. 20 (2), 77-88.<br />
15. B. Thalheim: Foundations of entity-relationship modeling, Annals of Mathematics and Artificial Intelligence, vol. 7, 1993, 197-256.<br />
16. S. D. Urban, L. Delcambre: Constraint Analysis: a Design Process for Specifying Operations on Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990.<br />
17. J. Widom, S. J. Finkelstein: Set-oriented Production Rules in Relational Database Systems, in Proc. SIGMOD 1990, 259-270.<br />


Chapter 9<br />

Principles of Object Oriented Database Design<br />

Contents<br />

9.1 Philosophy of OODB Design<br />
9.2 The Object Oriented Datamodel: Basic Features<br />
9.2.1 Type Definitions<br />
9.2.2 Class Definitions<br />
9.2.3 Method Definitions<br />
9.2.4 Schema Definitions<br />
9.3 Class Abstraction<br />
9.4 Stepwise Refinement<br />
9.4.1 Instantiation<br />
9.4.2 Splitting<br />
9.4.3 Specialization<br />
9.4.4 Extension<br />
9.5 Declarativity by Constraint Centered Design<br />
9.6 Variation Based Reuse: A Research Issue<br />
9.7 Inferences in OODB Design<br />

This chapter contains a reprint of<br />

Klaus-Dieter Schewe, Bernhard Thalheim. Principles of Object Oriented Database<br />
Design. In H. Jaakkola, H. Kangassalo, T. Kitahashi, A. Markus (Eds.). Information<br />
Modelling and Knowledge Bases V, 227-242. IOS Press, Amsterdam, 1994.<br />


Abstract. The design of complex information systems requires a transparent model-based<br />
methodology. It has been claimed that object orientation will have a significant impact on<br />
the development of such a methodology, especially as far as reusability and naturality of conceptual<br />
modelling are concerned.<br />

The methodology presented in this paper concentrates on four significant principles of<br />
object oriented database (OODB) design. The basic constituent is stepwise refinement, i.e.<br />
to begin the design process with a partial model that is completed and concretized further on<br />
depending on the growth of application knowledge. Class abstraction, i.e. to support libraries<br />
of incomplete parameterized designs that are instantiated and specialized later, is a natural<br />
consequence hereof. Declarativity is achieved by constraint centered design with (up to some<br />
degree) automatic transformation into consistent transactions. Variations enable the design<br />
of information systems with heavy reuse of existing design components.<br />

The methodology is based on a theoretically founded object oriented datamodel (OODM).<br />
Hence it supports inferences such as deciding the identifiability of objects, detecting the<br />
relation of an intended design to components in existing design libraries, and checking operations<br />
for reducedness as a prerequisite for the automatic transformation of constraints into<br />
consistent transactions.<br />

9.1 Philosophy of OODB Design<br />

The design <strong>of</strong> data and knowledge <strong>in</strong>tensive <strong>in</strong>formation systems requires a transparent modelbased<br />

methodology. Classically there exist seperate methods for the database and transaction<br />

design without a satisfactory <strong>in</strong>tegration [7, 9]. Therefore, it is a natural hope that the use <strong>of</strong><br />

object oriented design methods will improve the situation.<br />

<strong>Object</strong> orientation <strong>in</strong>volves the isolation <strong>of</strong> data <strong>in</strong> semi-<strong>in</strong>dependent modules <strong>in</strong> order<br />

to promote high s<strong>of</strong>tware development productivity. This idea stems from programm<strong>in</strong>g languages<br />

and most methods proposed so far [3, 6, 11, 20] are <strong>in</strong>tended to support object oriented<br />

program development. The ma<strong>in</strong> dierence <strong>in</strong> object oriented database (OODB) design is due<br />

to the notion <strong>of</strong> object that is now <strong>in</strong>tended to serve as a basic unit <strong>of</strong> persistent data, a view<br />

that is <strong>in</strong>fluenced by semantic datamodels [9]. S<strong>in</strong>ce classes then serve not only as behaviour<br />

abstractions but also as (persistent) data collections, we have to cope with object identication,<br />

whereas <strong>in</strong> object oriented programm<strong>in</strong>g a simple identication mechanism via object<br />

names is sucient. This makes OODB design a signicantly dierent task to object oriented<br />

program development, although some ideas <strong>of</strong> the approaches to the latter eld can be taken<br />

over.<br />

Still most object oriented datamodels are very close to the language level [1, 10] no matter<br />

whether their development started from a semantic datamodel or an object oriented programm<strong>in</strong>g<br />

language. For object oriented database design, however, it is necessary to shift<br />

the approach to the conceptual level as also claimed <strong>in</strong> work <strong>of</strong> the IS-Core group [13, 21].<br />

Therefore, the primary goal <strong>of</strong> our methodology is to provide a conceptual object oriented<br />

model with greater naturality <strong>in</strong> application modell<strong>in</strong>g. At the same time we want to improve<br />

the design quality and to raise the rate <strong>of</strong> s<strong>of</strong>tware reuse.<br />

The work presented in this paper is centered around the theoretically founded object<br />

oriented datamodel (OODM) introduced in [16] and partly based on the work in [2]. This<br />

model supports the uniform representation of designs at each level of concretion. In particular,<br />

there is no need to use different models for the conceptual and logical design respectively.<br />



We regard requirements analysis and conceptual modelling as two activities running in<br />

parallel. We start with an initial design that is a one-to-one representation of first knowledge<br />

about the intended application. The analysis task is to grasp and describe such knowledge<br />

with the formal representation tools. The following design process is monotonic, as the amount<br />

of application knowledge increases. Each knowledge increment then corresponds to some<br />

refinement, i.e. a change (not only an extension) of the design. However, this does not prejudice a<br />

particular, e.g. "top-down", design procedure. In contrast, the OODM favours incomplete partial<br />

designs with the specification of details left for refinement. Keeping even such intermediate<br />

designs increases the spread of possible reuse. This is close to the Design-by-Units strategy<br />

[23].<br />

Classical design methods are centered around data, processes or constraints respectively.<br />

Within the unified model in our approach we may regard all these aspects at the same<br />

time and only keep track of the dependencies among them, since constraints depend on the<br />

data, and processes depend on both other components. This implies the relative independence of<br />

refinement steps on data, processes or constraints as long as these dependencies are taken<br />

into consideration.<br />

Since processes in data and knowledge intensive application systems change much faster<br />

than constraints, it is desirable to minimize the process design task and to achieve a maximum<br />

of declarativity. As shown in [17, 18, 19] it is possible (up to some degree) to compute maximal<br />

specializations of specified processes in order to enforce consistency.<br />

The use of a uniform OODM during the whole design process makes it possible to build design<br />

libraries. Due to the support of abstract partial designs the components of such libraries can<br />

be more generic than usually assumed, but it is a truism that reusability does not imply<br />

reuse. We have to support mechanisms to retrieve a maximum of existing reusable library<br />

components for a given partial design. This leads to the concept of variation-based reuse,<br />

extending results on variant construction in semantic networks [14].<br />

Such a methodology involves a high level of inferences. Some of these inferences are intrinsic<br />

to the datamodel used. Among them are the recognition of object identifiability, specialization<br />

and type correctness or the verification of refinement correctness. Others are extrinsic, such as<br />

the proof of reducedness as a prerequisite for consistency enforcement or the ascertainment<br />

of the relationship to existing library components.<br />

In the remainder of this paper we shall first describe the fundamental issues of the OODM<br />

in Section 9.2, then in Sections 9.3-9.6 we briefly concretize the basic principles of our design<br />

methodology. Section 9.7 presents a short outline of the required inferences and a discussion<br />

of open research problems.<br />

9.2 The Object Oriented Datamodel: Basic Features<br />

In the object-oriented approach we distinguish between objects and values. Whereas values<br />

are encoded by themselves, objects have to be encoded by object identifiers. In our approach<br />

each object consists of a unique, immutable identifier, a set of values of possibly different<br />

types, references to other objects and methods associated with the object.<br />

Values can be grouped into types. In general, a type may be regarded as an immutable set<br />

of values of a uniform structure together with operations defined on such values. Subtyping<br />

is used to relate values in different types. The class concept provides the grouping of objects<br />

having the same structure, which uniformly combines aspects of object values and references.<br />

Objects can belong to different classes, which guarantees each object of our abstract object<br />

model to be captured by the collection of possible classes. Just as values are only defined<br />

via types, objects can only be defined via classes. Thus, a design consists of type and class<br />

definitions.<br />

9.2.1 Type Definitions<br />

We follow the classical view of types in [4] using a type system that consists of some basic<br />

types, type constructors and a subtyping relation. Moreover, recursive types, i.e. types defined<br />

by domain equations, and predicative types, i.e. types defined by restrictions, can be defined.<br />

Definition 9.1.<br />

- The base types are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$,<br />

where ID is an abstract identifier type without any non-trivial supertype and $\bot$ is the<br />

trivial type that is a supertype for every type.<br />

- The type constructors are $e_1 \mid \dots \mid e_n$ (enumeration), $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$<br />

(finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a : \alpha) \cup (b : \beta)$ (union).<br />

We may use base types and constructors to define new types by nesting. If there is no confusion,<br />

the field selectors in record or union types may be omitted.<br />

The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />

standard operators on base types and on records, sets, bags, etc. We omit the details here. A<br />

type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff there<br />

is no occurrence of ID in $t$. If $t'$ is a proper type occurring in a type $t$, then there exists a<br />

corresponding occurrence relation $o : t \times t' \to BOOL$.<br />

A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by<br />

the usual subtype relation [4].<br />

Example 9.1.<br />

Let us define a type VZ and a simple subtype VZ' hereof.<br />

Type VZ =<br />

( begin : DATE ,<br />

end : DATE ∪ ⊥ ,<br />

kind-of-insurance : "Main" | "Family" | "Interruption" )<br />

End VZ<br />

Type VZ' =<br />

( begin : DATE ,<br />

end : DATE ,<br />

kind-of-insurance : "Main" | "Family" | "Interruption" )<br />

End VZ'<br />

□<br />
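As a rough illustration of Example 9.1 and of the notion of a subtype function, the two record types can be sketched in Python.<br />

This is only a sketch under assumptions that are not part of the OODM: DATE is modelled by `datetime.date`, the trivial value of type ⊥ by `None`, and the names `VZPrime` and `vz_prime_to_vz` are hypothetical.<br />

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# The enumeration type "Main" | "Family" | "Interruption" from Example 9.1.
KINDS = ("Main", "Family", "Interruption")


def _check_kind(kind: str) -> None:
    if kind not in KINDS:
        raise ValueError(f"unknown kind of insurance: {kind!r}")


@dataclass(frozen=True)
class VZ:
    begin: date
    end: Optional[date]      # None stands for the trivial value (union with ⊥)
    kind_of_insurance: str

    def __post_init__(self) -> None:
        _check_kind(self.kind_of_insurance)


@dataclass(frozen=True)
class VZPrime:               # hypothetical name for the subtype VZ'
    begin: date
    end: date                # the end date is always present in VZ'
    kind_of_insurance: str

    def __post_init__(self) -> None:
        _check_kind(self.kind_of_insurance)


def vz_prime_to_vz(v: VZPrime) -> VZ:
    """Subtype function VZ' -> VZ: every VZ' value is embedded into VZ."""
    return VZ(v.begin, v.end, v.kind_of_insurance)
```

The embedding is total and loses no information, which is what one would expect of a subtype function induced by an inclusion.<br />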

Predicative types are used to restrict the set of values given by some type definition to a<br />

subset. For this purpose a formula with exactly one free variable self is used. Clearly, the<br />

inclusion then gives a subtype function. In order to avoid inflationary use of quantifiers, other<br />

variables are also allowed to occur freely in such a formula. They are assumed to be universally<br />

quantified.<br />

Definition 9.2. A predicative type $T$ consists of an underlying type $T'$ and a formula $P$ with<br />

exactly one free variable self of type $T'$.<br />



Example 9.2. Let us define a predicative subtype of [ VZ ].<br />

Type VZ-list = [ VZ ] Where<br />

( self = concat(L_1, [V_1, V_2 | L_2]) ⇒<br />

V_2 :: VZ' ∧ V_2.end ≤ V_1.begin ) ∧<br />

( self = concat(L_1, [V | L_2]) ⇒<br />

V.end ≠ ⊥ ⇒ V.begin ≤ V.end )<br />

End VZ-list<br />

□<br />
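The restricting formula of Example 9.2 can be read as a membership test for the predicative type. The following Python sketch checks it for a concrete list.<br />

It rests on assumptions: the comparison symbols are not legible in the source and are taken here to be "less than or equal", the value ⊥ is modelled by `None`, and the name `is_vz_list` is hypothetical.<br />

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class VZ(NamedTuple):
    begin: date
    end: Optional[date]   # None stands for the trivial value of type ⊥
    kind: str


def is_vz_list(periods: Sequence[VZ]) -> bool:
    """Check the formula of VZ-list for a concrete list of VZ values."""
    # Second conjunct: a period with a known end must not end before it begins.
    for v in periods:
        if v.end is not None and v.begin > v.end:
            return False
    # First conjunct: for adjacent elements V1, V2 the later list element V2
    # must have an end date (V2 :: VZ') and must end no later than V1 begins.
    for v1, v2 in zip(periods, periods[1:]):
        if v2.end is None or v2.end > v1.begin:
            return False
    return True
```

Under this reading only the first list element may still be open-ended, so the list is kept with the most recent insurance period first.<br />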

9.2.2 Class Definitions<br />

Each object in a class consists of an identifier, a collection of values, references to other objects<br />

and methods. Let us postpone methods for a while. Identifiers can be represented using the<br />

unique identifier type ID. Values and references can be combined into a representation type,<br />

where each occurrence of ID denotes a reference to some other class. Therefore, we may<br />

define the structure of a class using parameterized types. Moreover, classes are arranged in<br />

IsA-hierarchies.<br />

Definition 9.3.<br />

- If $t$ is a value type with parameters $\alpha_1, \dots, \alpha_n$ such that ID does not<br />

occur in $t$ and if some of the parameters are replaced by pairs $r_i : C_i$ with a reference<br />

name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression.<br />

Note that a structure expression may still contain parameters.<br />

- A class consists of a class name $C$, a structure expression $S$, a set of class names<br />

$D_1, \dots, D_m$ (called superclasses) and a set of methods. We call $r_i$ the reference named $r_i$<br />

from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by<br />

the type ID is called the representation type $T_C$ of the class $C$.<br />

Example 9.3.<br />

Let us consider a class Insurant for an insurance application.<br />

Class Insurant =<br />

Structure ( contract-no : NAT ,<br />

name : NAME ,<br />

address : ADDRESS ,<br />

sex : SEX ,<br />

insurance-times : VZ-list ,<br />

agency : AGENCY )<br />

Method ...<br />

End Insurant<br />

□<br />

In this example there are no references, hence the structure expression is simply a type. We<br />

could have defined this type, say INSURANT-DATA, separately from the class definition as<br />

in Section 9.2.1. Then the structure would simply be Structure INSURANT-DATA.<br />

9.2.3 Method Definitions<br />

Let us now turn to adding dynamics to the OODM. As required in the object oriented<br />

approach, operations will be associated with classes. This gives us the notion of a method.<br />

We shall distinguish between visible and hidden methods to separate those methods that<br />

can be invoked by the user from the others. However, all methods of a class including the hidden<br />

ones can be accessed by other methods. There are two reasons for such a weak hiding<br />

concept.<br />



- Visible methods serve as a means to specify (nested) transactions. In order to build<br />

sequences of database instances we only regard these transactions assuming a linear invocation<br />

order on them.<br />

- Hidden methods can be used to handle identifiers. Since these identifiers do not have any<br />

meaning for the user, they must not occur within the input or output of a transaction.<br />

Each method on a class C consists of a signature and a body. The signature consists of a<br />

method name and sets of parameter/type pairs for input and output. The body is defined by<br />

the usual constructs of a procedural programming language.<br />

Definition 9.4.<br />

- A method signature consists of a method name $M$, a set of input-parameter/type<br />

pairs $\iota_i :: T_i$ and a set of output-parameter/type pairs $o_j :: T'_j$.<br />

- A method on a class $C$ consists of a method signature and a body that is recursively built<br />

from the following constructs:<br />

assignment $x := E$, where $x$ is either the class variable $C$ of type $\{U_C\}$ or a local<br />

variable within $S$ (including the output-parameters), and $E$ is an expression of the<br />

same type as $x$,<br />

local variable declaration Let $x :: T$,<br />

skip and fail,<br />

sequencing $S_1 ; S_2$ and branching IF $P$ THEN $S_1$ ELSE $S_2$ ENDIF,<br />

method call $C' \text{ :- } M'(in : E'_1, \dots, E'_j,\ out : x'_1, \dots, x'_i)$, where $M'$ is a method on class<br />

$C'$ with compatible signature and<br />

non-deterministic selection of values New.$f(x)$, where $f$ is a selector on the representation<br />

type of $C$.<br />

If the class name is omitted in a method call, then we refer to the class $C$ itself or to the<br />

global method New_Id to denote the selection of a new identifier. Clearly, we may regard this<br />

method as belonging to an abstract class Any that is a superclass of all classes with structure<br />

$\bot$.<br />

A method M on a class C is called value-defined iff all types occurring in its signature are<br />

proper value types. As already mentioned we distinguish between methods visible to the user<br />

and hidden methods. We require each visible method to be value-defined. Subclasses inherit<br />

the methods of their superclasses, but overriding is allowed as long as the new method is a<br />

specialization of all its corresponding methods in its superclasses.<br />

Example 9.4. Let us add the method add-insurant to the class Insurant of Example 9.3.<br />


Method<br />

add-insurant ( in : request-data :: REQUEST-DATA ,<br />

out : contract-no :: NAT ) =<br />

Insurant :- check-data ( in : request-data ,<br />

out : acceptable :: BOOL ) ;<br />

IF acceptable<br />

THEN Let I :: ID , C :: NAT ;<br />

New.contract-no (C) ;<br />

New_Id ( out : I ) ;<br />

Insurant :- compute-insurant-data<br />

( in : request-data, C , out : V ) ;<br />

Insurant := Insurant ∪ {(I,V)}<br />

ELSE fail<br />

ENDIF<br />

□<br />
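To make the control flow of add-insurant more tangible, the method can be paraphrased in Python.<br />

This is a loose sketch and not the OODM semantics: the class extension is modelled as a dictionary from identifiers to values, fail as an exception, and the non-deterministic selections New.contract-no and New_Id as deterministic counters. The undefined methods check-data and compute-insurant-data are passed in as callables. All Python names are hypothetical, and returning the contract number as output is an assumption.<br />

```python
import itertools
from typing import Any, Callable, Dict


class MethodFailed(Exception):
    """Stands in for the `fail` construct of the method language."""


class InsurantClass:
    def __init__(self,
                 check_data: Callable[[Any], bool],
                 compute_insurant_data: Callable[[Any, int], Dict[str, Any]]):
        self.extension: Dict[int, Dict[str, Any]] = {}   # identifier -> value
        self._ids = itertools.count(1)
        self._check_data = check_data                    # undefined in the schema
        self._compute = compute_insurant_data            # undefined in the schema

    def _new_id(self) -> int:
        # Hidden method: identifiers never reach the user.
        return next(self._ids)

    def _new_contract_no(self) -> int:
        # Deterministic stand-in for New.contract-no: smallest unused number.
        used = {v["contract_no"] for v in self.extension.values()}
        return next(n for n in itertools.count(1) if n not in used)

    def add_insurant(self, request_data: Any) -> int:
        # Visible method: input and output are plain values, no identifiers.
        if not self._check_data(request_data):
            raise MethodFailed("request data not acceptable")
        c = self._new_contract_no()
        i = self._new_id()
        v = self._compute(request_data, c)
        self.extension[i] = v        # Insurant := Insurant ∪ {(I, V)}
        return c
```

Note that the identifier `i` stays inside the object, in line with the requirement that visible methods be value-defined.<br />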

Let us briefly discuss what it means that a method $N$ on a class $D$ specializes the method $M$<br />

on a superclass $C$. First, we may assume (taking records) that there is exactly one input- and<br />

one output-type, say $I_N$ (resp. $I_M$) and $O_N$ (resp. $O_M$). The input-type is used for two<br />

purposes: object identification in $D$ (resp. $C$) and providing necessary parameters, hence $I_N$<br />

(resp. $I_M$) is a subtype of some $I_D \times I'_N$ (resp. $I_C \times I'_M$).<br />

In order to "inherit" the behaviour of $M$ to $N$ we must be able to transform $N$ in such a<br />

way that it becomes applicable to the input of $M$. Hence we have to project the parameter<br />

parts, whereas identification may exploit object identifiers (see Definition 9.6). Hence $I'_M$ must<br />

be a subtype of $I'_N$.<br />

Note that this gives some kind of partial contravariance, whereas [11] requires covariance<br />

and [1] requires contravariance only. The differences are due to the mismatches between<br />

program and database design as already mentioned in Section 9.1.<br />

For the output-types the situation is much simpler, requiring $O_N$ to be a subtype of $O_M$.<br />

We may then transform $N$ in a canonical way to some $N'$ with the same signature as $M$.<br />

Both may be regarded as methods on $C$. Then, if $N'$ applied to some input-value yields some<br />

result, this should also result from applying $M$ (but not vice versa). A more formal discussion<br />

of this topic can be found in [17].<br />
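The signature part of these conditions can be illustrated by a small Python check.<br />

It is a simplified sketch under stated assumptions: record types are modelled as dictionaries from field names to type names, subtyping is plain width subtyping with field types compared by equality, and the behavioural condition on N' and M is not checked at all. The function names are hypothetical.<br />

```python
from typing import Dict

RecordType = Dict[str, str]   # field name -> type name


def is_record_subtype(sub: RecordType, sup: RecordType) -> bool:
    """Width subtyping: `sub` has at least the fields of `sup`, same types."""
    return all(name in sub and sub[name] == t for name, t in sup.items())


def may_specialize(params_n: RecordType, out_n: RecordType,
                   params_m: RecordType, out_m: RecordType) -> bool:
    """Signature conditions for N (on a subclass) to specialize M.

    The parameter part of M must be a subtype of the parameter part of N
    (so that the input of M can be projected to the input of N), and the
    output type of N must be a subtype of the output type of M.
    """
    return (is_record_subtype(params_m, params_n)
            and is_record_subtype(out_n, out_m))
```

The identification part of the input is deliberately left out here, since it is handled through object identifiers rather than through parameter projection.<br />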

9.2.4 Schema Definitions<br />

Now we are prepared for the definition of a database schema that is simply given by a finite<br />

collection of type and class definitions. Later we shall add constraint definitions. Thus, taking<br />

together Examples 9.1-9.4, we get a schema with only one class Insurant and only one<br />

method add-insurant.<br />

However, some of the types in this schema such as NAME, ADDRESS and REQUEST-DATA<br />

are undefined. The same applies to the methods check-data and compute-insurant-data<br />

called by add-insurant. This style of allowing partiality in OODM schemata makes it possible to<br />

capture incomplete knowledge about an application area as well and will be essential for our<br />

methodology. In the next two chapters we shall explain this feature in more detail and show<br />

how to exploit it for a standard refinement process.<br />

First let us have a closer look at schemata that are "complete", i.e. correspond to a final<br />

design of an application. This leads to the notion of closed schemata.<br />



Definition 9.5. A schema $\mathcal{S}$ is a finite collection of type, class and constraint definitions. It is<br />

closed iff all types, classes and methods occurring within type definitions, structure definitions<br />

and methods are defined in $\mathcal{S}$.<br />

Let us postpone constraints for a while. At each time, a class is given by a finite set of objects.<br />

More precisely, we need the notion of a database instance.<br />

Definition 9.6. An instance $\mathcal{D}$ of a closed schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$<br />

of type $\{(ident : ID,\ value : T_C)\}$ such that the following conditions are satisfied:<br />

uniqueness of identifiers: For every class $C$ we have<br />

$$\forall i :: ID.\ \forall v, w :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w . \tag{9.90}$$<br />

inclusion integrity: For a subclass $C$ of $C'$ we have<br />

$$\forall i :: ID.\ i \in dom(\mathcal{D}(C)) \Rightarrow i \in dom(\mathcal{D}(C')) . \tag{9.91}$$<br />

Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />

$$\forall i :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') . \tag{9.92}$$<br />

referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence<br />

relation $o_r$ we have<br />

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in dom(\mathcal{D}(C')) . \tag{9.93}$$<br />
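The three implicit conditions on an instance can be turned into executable checks. The Python sketch below does this for class extensions given as sets of (identifier, value) pairs.<br />

It is an illustration under assumptions: values must be hashable, the subtype function $f$ and the occurrence relation $o_r$ are supplied by the caller (the latter as a function `refs` that lists the identifiers occurring in a value), and all function names are hypothetical.<br />

```python
from typing import Any, Callable, Iterable, Optional, Set, Tuple

Extension = Set[Tuple[int, Any]]   # the value D(C): a set of (ident, value) pairs


def dom(pairs: Extension) -> Set[int]:
    """Identifiers occurring in a class extension."""
    return {i for i, _ in pairs}


def unique_identifiers(pairs: Extension) -> bool:
    """Condition (9.90): no identifier is paired with two different values."""
    return len(dom(pairs)) == len(pairs)


def inclusion_integrity(sub: Extension, sup: Extension,
                        f: Optional[Callable[[Any], Any]] = None) -> bool:
    """Conditions (9.91) and, if a subtype function f is given, (9.92)."""
    if not dom(sub) <= dom(sup):
        return False
    if f is not None:
        return all((i, f(v)) in sup for i, v in sub)
    return True


def referential_integrity(src: Extension, target: Extension,
                          refs: Callable[[Any], Iterable[int]]) -> bool:
    """Condition (9.93): every referenced identifier exists in the target class."""
    targets = dom(target)
    return all(j in targets for _, v in src for j in refs(v))
```

For instance, with values of the form (name, agency identifier), `refs` would simply return the second component as a one-element list.<br />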

Basic update methods, i.e. insertion, deletion and update of a single object in a class $C$,<br />

cannot always be derived in the object-oriented case, because the abstract identifiers have<br />

to be hidden from the user. However, in [16] it has been shown that for value-representable<br />

classes these operations are uniquely determined by the schema and consistent with respect<br />

to the implicit referential and inclusion constraints.<br />

Value-representability of all classes in a closed schema is implied, if we can derive a (trivial)<br />

uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the<br />

class extension $C$ to be unique:<br />

$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (j, v) \in \mathcal{D}(C) \Rightarrow i = j . \tag{9.94}$$<br />
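The trivial uniqueness constraint is easy to test on a concrete class extension. A minimal Python sketch, assuming hashable values and using the hypothetical name `value_unique`:<br />

```python
from typing import Any, Set, Tuple


def value_unique(pairs: Set[Tuple[int, Any]]) -> bool:
    """Condition (9.94): no value is attached to two different identifiers."""
    values = [v for _, v in pairs]
    return len(values) == len(set(values))
```

If the check holds, each object can be addressed by its value alone, which is the intuition behind value-representability in this simple case.<br />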

Finally, the semantics of a closed schema is given by database histories, where a database<br />

history on a schema $\mathcal{S}$ is a sequence $\mathcal{D}_0, \mathcal{D}_1, \dots$ of instances such that $\mathcal{D}_0$ is the empty<br />

database and each transition from $\mathcal{D}_{i-1}$ to $\mathcal{D}_i$ is due to some visible method on some class<br />

$C \in \mathcal{S}$.<br />

9.3 Class Abstraction<br />

As we have seen in Section 9.2 the structure expression of a class in an OODM schema may<br />

contain parameters. These arise from parameterized types. Parameterized classes allow us to<br />

abstract from concrete structures. Indeed, an instance of a parameterized class may not be<br />

regarded as a single set of pairs, but as a family hereof indexed by the possible instantiations.<br />

Let us now extend and concretize this view to arbitrary schemata.<br />

If we know that objects will have some attributes, but we still do not know the type of the<br />

corresponding values, we may leave the corresponding parameter uninstantiated. However, if<br />

we already know that we shall instantiate this parameter by some type, we may mark this<br />

parameter as a type parameter. If we know that there will be some reference $r_i : C_i$, but $C_i$ is<br />

undefined, then we have a class parameter.<br />

For parameterized classes the possibilities to define methods and constraints are restricted.<br />

If $\alpha$ is a type parameter and we do not know anything about the type, there is no non-trivial<br />

way to express a term of that type, but terms are required in assignments as well as in<br />

constraints. However, we may have partial knowledge of that type, e.g. that it is a subtype of<br />

some other type, in which case we may use terms of that supertype.<br />

If $C$ is a class parameter, then each call of a method $m$ on $C$ is indeed undefined. Therefore,<br />

for the proof of properties of the calling method such as consistency we only have the<br />

possibility to assume an arbitrary input-output-relation for $m$ unless we completely defer the<br />

proof.<br />

Definition 9.7. Let $\mathcal{S}$ be a schema, $T$ a type parameter, $C$ a class parameter and $M$ an<br />

undefined method. A parameter restriction is either $T \le T'$ with some value type expression<br />

$T'$, $C$ isa $C'$ with some class name $C'$, $C.structure \le S$ with some structure expression $S$<br />

or a restriction on the types of the signature of $M$.<br />

Here $\le$ denotes the subtype relation and its canonical extension to structure expressions. Note<br />

that some parameter restrictions may be inferred from context in the schema $\mathcal{S}$. If a parameter<br />

is unrestricted, we may add the implicit parameter restrictions $T \le \bot$, $C.structure \le \bot$<br />

and $T_i \le \bot$ for type parameters, class parameters and types in method signatures. However,<br />

if there is more than one restriction on a parameter, these may be inconsistent. In the case<br />

of a consistent set of parameter restrictions, the set of restrictions on one parameter may be<br />

unified to give only one restriction in the form of Definition 9.7. We then talk of the normalized<br />

set of parameter restrictions.<br />

In order to define the semantics of open (i.e. not closed) schemata, we need the notion of<br />

instantiations.<br />

Definition 9.8. Let $\mathcal{S}$ be a schema with a consistent set of parameter restrictions. An instantiation<br />

$I$ is given by a closed schema $\mathcal{S}'$ that results from $\mathcal{S}$ by replacing each type parameter<br />

$T$ by a value type, each class parameter by a class and each undefined method by "Let ...<br />

$o_i :: O_i$ ..." such that all parameter restrictions are satisfied. $\mathcal{S}'$ is called minimal iff it is obtained by<br />

taking the types and classes occurring in the normalized set of parameter restrictions.<br />

Example 9.5. Let us look again at Examples 9.1-9.4. The minimal instantiation of the type<br />

VZ (and VZ') gives<br />

Type VZ =<br />

( begin : ⊥ ,<br />

end : ⊥ ,<br />

kind-of-insurance : "Main" | "Family" | "Interruption" )<br />

End VZ<br />

The minimal instantiation of the class Insurant leads to the structure expression<br />


Structure ( contract-no : NAT ,<br />

name : ⊥ ,<br />

address : ⊥ ,<br />

sex : ⊥ ,<br />

insurance-times : VZ-list ,<br />

agency : ⊥ )<br />

The method add-insurant involves the call of check-data on the same class, but this method is<br />

undefined, hence could only be treated as the non-deterministic value selection "Let acceptable<br />

:: BOOL". □<br />
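For unrestricted parameters the minimal instantiation amounts to replacing every undefined type name by the trivial type. A small Python sketch of this step for a single record structure follows.<br />

It assumes that no explicit parameter restrictions are present, so that the normalized restriction of each parameter is the implicit one with bound ⊥. Structures are modelled as dictionaries from field names to type names, and the function name is hypothetical.<br />

```python
from typing import Dict, Set

BOTTOM = "⊥"
BASE_TYPES = {"BOOL", "NAT", "INT", "FLOAT", "STRING", "ID", BOTTOM}


def minimal_instantiation(structure: Dict[str, str],
                          defined: Set[str]) -> Dict[str, str]:
    """Replace each undefined (parameter) type of a record structure by ⊥.

    `defined` holds the names of the types defined in the schema; base types
    are always considered defined.
    """
    known = BASE_TYPES | defined
    return {field: (t if t in known else BOTTOM)
            for field, t in structure.items()}


insurant = {
    "contract-no": "NAT",
    "name": "NAME",
    "address": "ADDRESS",
    "sex": "SEX",
    "insurance-times": "VZ-list",
    "agency": "AGENCY",
}
minimal = minimal_instantiation(insurant, defined={"VZ", "VZ-list"})
```

With only VZ and VZ-list defined, `minimal` has the same shape as the structure expression shown in Example 9.5.<br />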

Finally, the full semantics of an open schema $\mathcal{S}$ is given by families of history sets indexed<br />

by the possible instantiations of $\mathcal{S}$, whereas the minimal semantics is the semantics of the<br />

minimal instantiation.<br />

Note that each instantiation can be projected naturally to the minimal one. The principle<br />

of class abstraction is necessary for stepwise refinement as indicated in Section 9.3, since<br />

otherwise we would not be able to support partial designs. On the other hand, it increases the<br />

bandwidth of possible concrete designs that occur as instantiations. Therefore, it is desirable<br />

to provide libraries of abstract (partial) designs to achieve a higher rate of reusability.<br />

9.4 Stepwise Refinement<br />

Once an initial OODM schema is given, the following design process is based on stepwise<br />

refinement. Roughly speaking, refinement means the reorganization of classes and methods<br />

such that the semantics of the old schema is "preserved" within the new one. This is captured<br />

by the next definition.<br />

Let $\mathcal{S}$ and $\mathcal{T}$ be closed schemata and suppose there are (partial) functions<br />

- $f_{inst}$ that is total, taking instances of $\mathcal{T}$ to instances of $\mathcal{S}$,<br />

- $f_{class}$ that is partial, taking a class in $\mathcal{T}$ to a class in $\mathcal{S}$ and<br />

- $f_{meth}$ that is total, taking a method in $\mathcal{T}$ to a (possibly empty) set of methods in $\mathcal{S}$,<br />

such that for each method $M$ associated with a class $C$ in $\mathcal{T}$ each method $M' \in f_{meth}(M)$ is<br />

associated with $f_{class}(C)$. If $\mathcal{S}$ and $\mathcal{T}$ are arbitrary schemata, assume these functions to be<br />

defined on the minimal instantiations.<br />

Definition 9.9. $\mathcal{T}$ is a refinement of $\mathcal{S}$ iff for each pair $(\mathcal{D}_{i-1}, \mathcal{D}_i)$ in a database history of<br />

$\mathcal{T}$ that corresponds to a method $M$ and each $M' \in f_{meth}(M)$ that is defined and terminating<br />

in $f_{inst}(\mathcal{D}_{i-1})$ the pair $(f_{inst}(\mathcal{D}_{i-1}), f_{inst}(\mathcal{D}_i))$ corresponds to $M'$.<br />

There exists a more elegant (but also strongly theoretical) characterization of refinement. We<br />

omit the details here. In [15] the following standard refinement steps in the OODM have been<br />

discussed on the basis of an application example.<br />

9.4.1 Instantiation<br />

In Section 9.3 we discussed the possibility of parameterized (open) schemata and defined their<br />

semantics. Refinement by instantiation provides definitions for such parameters, but may also<br />

introduce new parameters.<br />



Example 9.6. Let us instantiate the type parameters ADDRESS and AGENCY occurring<br />

in Example 9.3.<br />

Type ADDRESS =<br />

( zip : NAT Where self < 100 000 ,<br />

city : STRING ,<br />

street : STRING )<br />

End ADDRESS<br />

Type AGENCY =<br />

( number : NAT Where self < 1 000 ,<br />

address : ADDRESS ,<br />

phones : { TELECOM_NO } ,<br />

fax : TELECOM_NO ,<br />

cares_for : { ( zip : NAT Where self < 100 000 ,<br />

city : STRING ) } )<br />

End AGENCY<br />

□<br />

Refinement by instantiation may also introduce bodies for methods that were undefined so<br />

far.<br />

9.4.2 Splitting<br />

Renement by splitt<strong>in</strong>g leads to new classes with structure expressions that correspond to<br />

parts <strong>of</strong> an exist<strong>in</strong>g structure expression which <strong>in</strong> turn are replaced by references. It is ma<strong>in</strong>ly<br />

used <strong>in</strong> the case <strong>of</strong> shared data.<br />

Example 9.7. The class Agency stems from splitt<strong>in</strong>g Insurant <strong>in</strong> Example 9.3 assum<strong>in</strong>g<br />

the <strong>in</strong>stantiation <strong>of</strong> Example 9.6 to be already done. The new reference is agency : Agency.<br />

Class Agency =<br />

Structure ( agency : AGENCY )<br />

End Agency<br />

Class Insurant =<br />

Structure (contract-no : NAT ,<br />

... ... ,<br />

agency : Agency )<br />

Methods ...<br />

End Insurant<br />

Clearly, the existing methods on the split class also have to be changed.<br />

□<br />
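The data transformation behind such a splitting step can be sketched as follows. This Python fragment is a hedged illustration only: objects are plain dictionaries keyed by integer identifiers, the function name `split_off` is hypothetical, and the decision to give equal values one shared identifier is an assumption made here to show the "shared data" case.<br />

```python
from itertools import count
from typing import Any, Dict, Hashable, Tuple


def split_off(
    objects: Dict[int, Dict[str, Any]], field: str
) -> Tuple[Dict[int, Hashable], Dict[int, Dict[str, Any]]]:
    """Move the values stored under `field` into a class of their own.

    Returns the new class (identifier -> value) and the rewritten objects,
    in which `field` now holds an identifier instead of the value itself.
    """
    fresh = count(1)
    id_of: Dict[Hashable, int] = {}
    new_class: Dict[int, Hashable] = {}
    rewritten: Dict[int, Dict[str, Any]] = {}
    for oid, record in objects.items():
        value = record[field]
        if value not in id_of:          # equal values are shared (assumption)
            id_of[value] = next(fresh)
            new_class[id_of[value]] = value
        rewritten[oid] = {**record, field: id_of[value]}
    return new_class, rewritten
```

The input is left untouched, so the old and the new instance can be compared, for example when checking a refinement condition.<br />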

9.4.3 Specialization<br />

Refinement by specialization introduces subclasses and subtypes. Moreover, it may involve<br />

replacing a structure expression such that the new representation type is a subtype of the<br />

old one and the new implicit constraints imply the old ones.<br />

Example 9.8. Let us introduce a new class Main-Insurant as a subclass of Insurant.<br />

Objects in this subclass have an additional reference to Company that need not exist for all<br />

insurants.<br />



Class Main-Insurant =<br />

IsA Insurant Structure ( account-no : NAT ,<br />

employed-by : Company )<br />

Methods ...<br />

End Main-Insurant<br />

The new class Insurant results from specializing the old class with this name. We simply<br />

add a reference to the class Main-Insurant for the case of insurants of kind "Family". The<br />

corresponding subtype function is a simple projection.<br />

Class Insurant =<br />

Structure ( ... ,<br />

insurance-times : [ ( begin : DATE ,<br />

end : DATE ∪ ⊥ ,<br />

( kind : "Main" | "Interruption" ) ∪<br />

( kind : "Family" ,<br />

associated-with : Main-Insurant ) ) ]<br />

Where ... ,<br />

agency : Agency )<br />

Methods ...<br />

End Insurant<br />

□<br />

9.4.4 Extension<br />

Refinement by extension is very simple, since it means the definition of new types, classes,<br />

constraints or methods that do not yet exist in the schema.<br />

Example 9.9. A new class New_Insurant to capture persons who apply to become an<br />

insurant is introduced as follows.<br />

Class New_Insurant =<br />

Structure ( name : NAME ,<br />

address : ADDRESS ,<br />

sex : SEX ,<br />

when_to_start : DATE ,<br />

initial-agency : Agency ,<br />

vocational-group : VOCATION-KEY ,<br />

income : NAT Where self < 1 000 000 )<br />

Methods ...<br />

End New_Insurant<br />

Objects may at the same time belong to both class Insurant and New_Insurant with<br />

different names, addresses and so on. Object identifiers are used to relate different aspects of<br />

the same object.<br />

□<br />

9.5 Declarativity by Constraint Centered Design<br />

As announced in Definition 9.5 we now make the constraints associated with a schema precise.<br />

Particular attention is paid to constraints that arise as generalizations of constraints<br />

known from the relational model, e.g. functional, inclusion and exclusion constraints [17, 18].<br />



Definition 9.10. - An integrity constraint on a schema $S$ is a formula $I$ over the underlying<br />

type system with free variables $fr(I) \subseteq \{C_1, \dots, C_n\}$, where each class name $C_i$ is used<br />

as a variable of type $\{(ident : ID, value : T_{C_i})\}$.<br />

- An instance $D$ of a schema is said to be consistent iff substituting $D(C)$ for each class<br />

variable $C$ in each integrity constraint $I$ evaluates to true, when interpreted in the usual<br />

way.<br />

Note that the conditions for an instance in Definition 9.6 correspond to model-inherent integrity<br />

constraints. We refer to these constraints as implicit identifier, IsA and referential<br />

constraints on the schema $S$. Other constraints that are already given implicitly by the structure<br />

of the schema arise from Where-clauses in predicative types. Indeed, we may replace such<br />

types by the underlying ground type (just omit the Where-clause) and add the clause as a<br />

constraint. From the designer's point of view this is not necessary, but it becomes necessary as soon as<br />

constraint maintenance comes into play (see below).<br />

Example 9.10. Return to Example 9.8, where we introduced the class Main-Insurant as<br />

a subclass of Insurant. We would like to express that each object currently in Insurant<br />

with kind = "Main" must also belong to Main-Insurant. This gives the formula<br />

$$\forall i, v, b, \ell .\; (i, v) \in \textit{Insurant} \wedge \textit{insurance-times}(v) = [(b, \bot, \text{``Main''}) \mid \ell] \Rightarrow \exists w .\; (i, w) \in \textit{Main-Insurant}$$<br />

with free variables Insurant and Main-Insurant.<br />

□<br />
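Read operationally, the constraint of Example 9.10 is a check over the current instance. The Python sketch below assumes a simplified encoding that is not part of the OODM: each class is a dictionary from identifiers to values, an insurance period is a triple (begin, end, kind) with `None` standing for the undefined end ⊥, and the function name `main_insurants_consistent` is hypothetical.<br />

```python
from typing import Any, Dict, List, Optional, Tuple

# (begin, end or None for the undefined value, kind); an assumed encoding.
Period = Tuple[str, Optional[str], str]


def main_insurants_consistent(
    insurant: Dict[int, Dict[str, Any]],
    main_insurant: Dict[int, Dict[str, Any]],
) -> bool:
    """True iff every insurant whose first period is an open "Main" period
    also occurs, with the same identifier, in the class Main-Insurant."""
    for ident, value in insurant.items():
        times: List[Period] = value["insurance_times"]
        if times and times[0][1] is None and times[0][2] == "Main":
            if ident not in main_insurant:
                return False
    return True
```

Substituting the two dictionaries for the free class variables corresponds to the notion of consistency in Definition 9.10.<br />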

In particular, we allow distinguished classes of constraints to be specified in OODM schemata.<br />

These comprise inclusion, exclusion, functional, uniqueness, object generating and path constraints<br />

and generalize relevant classes known in the relational field [22].<br />

Definition 9.11. Let $C$, $C_1$, $C_2$ be classes in a schema $S$ and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$)<br />

and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.<br />

- A functional constraint on $C$ is a constraint of the form<br />

$$\forall i, i' :: ID .\; \forall v, v' :: T_C .\; c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \tag{9.95}$$<br />

- An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form<br />

$$\forall t :: T .\; \bigl( \exists i_1 :: ID, v_1 :: T_{C_1} .\; (i_1, v_1) \in x_{C_1} \wedge c_1(v_1) = t \bigr) \Rightarrow \bigl( \exists i_2 :: ID, v_2 :: T_{C_2} .\; (i_2, v_2) \in x_{C_2} \wedge c_2(v_2) = t \bigr) . \tag{9.96}$$<br />

- An exclusion constraint on $C_1$, $C_2$ is a constraint of the form<br />

$$\forall i_1, i_2 :: ID .\; \forall v_1 :: T_{C_1} .\; \forall v_2 :: T_{C_2} .\; (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \neq c_2(v_2) . \tag{9.97}$$<br />
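On finite class instances the three constraint forms of Definition 9.11 reduce to simple set computations. The following Python sketch is an illustration under stated assumptions: a class instance is a dictionary from identifiers to values, the subtype functions are passed in as ordinary functions with hashable results, and the names `functional`, `inclusion` and `exclusion` are chosen here for readability.<br />

```python
from typing import Callable, Dict, Hashable, TypeVar

V = TypeVar("V")
W = TypeVar("W")
Proj = Callable[[V], Hashable]     # stands in for a subtype function


def functional(x_c: Dict[int, V], c1: Proj, c2: Proj) -> bool:
    """(9.95): equal c1-images imply equal c2-images."""
    seen: Dict[Hashable, Hashable] = {}
    for v in x_c.values():
        key, image = c1(v), c2(v)
        if key in seen and seen[key] != image:
            return False
        seen.setdefault(key, image)
    return True


def inclusion(x_c1: Dict[int, V], x_c2: Dict[int, W], c1: Proj, c2: Proj) -> bool:
    """(9.96): every c1-image over C1 also occurs as a c2-image over C2."""
    return {c1(v) for v in x_c1.values()} <= {c2(v) for v in x_c2.values()}


def exclusion(x_c1: Dict[int, V], x_c2: Dict[int, W], c1: Proj, c2: Proj) -> bool:
    """(9.97): no c1-image over C1 equals a c2-image over C2."""
    return {c1(v) for v in x_c1.values()}.isdisjoint(c2(v) for v in x_c2.values())
```

Comparing each value only against the first image recorded for its key is enough for the functional check, because any later disagreement contradicts that first image.<br />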



Constraints increase the declarativity of designs. This is important, because in data and<br />

knowledge intensive application systems the data and the constraints on them usually live longer<br />

than the operations, i.e. the methods.<br />

The problem is then to guarantee the consistency of the methods with respect to the<br />

specified constraints. Sometimes this requires hard verification work, but for a wide spectrum<br />

of schemata an automatic transformation of constraints into methods is provided.<br />

In [16] consistent generic update operations with respect to implicit constraints have been<br />

presented. In [18] this has been extended to the classes of constraints mentioned above. In<br />

[19] an algorithm for the transformation of constraints into transactions has been proven to<br />

be correct. This algorithm reduces the consistency enforcement task to basic updates. It can<br />

be shown that this operational approach to consistency enforcement is more powerful than<br />

the rule triggering approach [5, 8]. However, the verification of a very technical condition,<br />

called I-reducedness, is required, which limits the applicability of consistency enforcement in<br />

general. We omit the details of the algorithm here, since they are hidden from the designer.<br />

The only thing a designer has to know is that constraint specifications will be made explicit<br />

in methods in a canonical way. If this leads to unexpected results, s/he may change the original<br />

design. It is an open research problem how to support the amelioration of a schema in case<br />

constraint enforcement leads to inefficient methods.<br />
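To give a feeling for what "making a constraint explicit in a method" can mean, the sketch below shows the simplest conceivable form: an update that is applied only if the resulting state satisfies the constraint. This is explicitly not the algorithm of [19], whose details are omitted in the text; it is a stand-in under the assumptions that a database state is a nested dictionary and that the constraint is a Boolean function on states. The name `guarded` is hypothetical.<br />

```python
import copy
from typing import Any, Callable, Dict

State = Dict[str, Dict[int, Any]]      # class name -> (identifier -> value)


def guarded(
    update: Callable[[State], None],
    constraint: Callable[[State], bool],
) -> Callable[[State], bool]:
    """Return a method that applies `update` only if the result is consistent."""

    def method(state: State) -> bool:
        trial = copy.deepcopy(state)   # work on a copy of the current state
        update(trial)
        if not constraint(trial):
            return False               # reject: the state is left unchanged
        state.clear()
        state.update(trial)
        return True

    return method
```

A guard of this kind never leaves a consistent state, but it rejects updates instead of repairing them, which is one reason why more refined enforcement strategies are of interest.<br />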

9.6 Variation Based Reuse: A Research Issue<br />

The design process presented in Sections 9.3-9.5 implicitly assumes that we want to build a<br />

new application system from scratch. One promise of the object oriented approach, however, is<br />

an enormous increase in software reuse. This can be achieved if we keep the design components,<br />

i.e. type and class definitions, in libraries. The benefits are apparent, especially if we<br />

consider the degree of reusability of parameterized class definitions.<br />

Unfortunately, reusability does not automatically imply reuse. Indeed, we have to provide<br />

mechanisms to relate the intended (new) designs to existing components in such a library.<br />

Existing type and class definitions are not independent of one another. The idea is now to<br />

exploit the hierarchies in OODM schemata due to instantiation, specialization and refinement.<br />

This extends the work in [14], where the specialization taxonomy in a KL-ONE like knowledge<br />

representation system has been exploited for a similar task.<br />

An intended design is given just as before by a first (partial) OODM schema. Then the<br />

following cases may occur.<br />

- A class/type of the intended design is an instantiation, specialization or refinement of an<br />

existing design component. Then we may ask whether a rearrangement of requirements<br />

would enable the reuse of further instantiations, specializations or refinements that exist<br />

in the library.<br />

- A class/type of the intended design is a variant of an existing library component, i.e.<br />

the first alternative is true for a reparameterization of this library component. Of course,<br />

this is always possible, since a pure parameter would satisfy this requirement. Hence<br />

we have to judge whether it is helpful to take the reuse of the reparameterization into<br />

consideration. This approach is similar to the use of a similarity measure in case-based<br />

reasoning.<br />



Once we have discovered a reusable variant in the library, we may keep track of the difference<br />

to the intended design and propagate these changes along the existing hierarchies. Then we<br />

may ask whether the resulting components can be directly reused.<br />

This suggests a modification of the refinement-based design methodology. Before starting<br />

a refinement process, existing domain-specific libraries are examined and variants are built.<br />

Then the refinement process is based on selected variants. Moreover, variant construction is<br />

also required after standard refinement steps that introduce new types or classes, since for<br />

these there may also exist variants in some library.<br />

Example 9.11. The class Insurant in Example 9.3 corresponds to current legal requirements.<br />

Some years ago an initial schema for an insurance application would have looked<br />

slightly different, since only main insurants existed at that time. This could have been modelled<br />

by some class Insurant_old.<br />

Class Insurant_old =<br />

Structure ( contract-no : NAT ,<br />

name : NAME ,<br />

address : ADDRESS ,<br />

sex : SEX ,<br />

insurance-times : [ ( begin : DATE , end : DATE ∪ ⊥ ) ] Where ... ,<br />

account-no : NAT ,<br />

employed-by : Company ,<br />

family : { NAME } ,<br />

agency : AGENCY )<br />

Methods ...<br />

End Insurant_old<br />

Assume such an initial design and all refinements to be kept in some library. Omitting account-no,<br />

employed-by and agency in the structure expression above would give a common supertype<br />

of the representation types for Insurant_old and Insurant in Example 9.3.<br />

Then build variants of all the existing refinements just omitting this information and check<br />

whether these are compatible with the new requirements. This avoids repeating refinement<br />

steps that occurred (in modified form) already in the past.<br />

Finally, specialize Insurant as indicated in Example 9.8 and build variants of the refined<br />

classes Insurant and Main-Insurant with respect to the hierarchy developed so far. Again<br />

this should avoid repeating earlier refinement steps.<br />

□<br />
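For flat record types, the step of finding a common supertype by omitting components can be sketched in a few lines. The Python fragment below is a simplification under explicit assumptions: a record type is a dictionary from field names to type names, nested constructors are ignored, and the function name `common_supertype` is hypothetical. The second record in the usage note is an invented stand-in, since the structure of Insurant from Example 9.3 is not repeated here.<br />

```python
from typing import Dict

RecordType = Dict[str, str]   # field name -> type name (flat records only)


def common_supertype(a: RecordType, b: RecordType) -> RecordType:
    """Keep exactly the fields that both record types declare with the same type.

    Dropping fields corresponds to the projection used as subtype function.
    """
    return {field: typ for field, typ in a.items() if b.get(field) == typ}
```

For instance, with `old = {"contract-no": "NAT", "name": "NAME", "account-no": "NAT", "agency": "AGENCY"}` and a hypothetical `new = {"contract-no": "NAT", "name": "NAME", "agency": "Agency"}`, the call `common_supertype(old, new)` keeps only `contract-no` and `name`, because `account-no` is missing in `new` and `agency` changed from a value type to a class reference.<br />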

The concretization and theoretical treatment of these ideas for the outlined methodology is<br />

a research issue under current investigation.<br />

9.7 Inferences in OODB Design<br />

The work reported in the preceding sections presents first principles of object oriented database<br />

design. The main scenario is centered around stepwise refinement on the basis of an object<br />

oriented datamodel supporting class abstraction, generic update operations and declarative<br />

constraint specification. The datamodel as well as the design process involve a lot of supporting<br />

inferences. These fall into two classes. Let us first describe those inferences that are<br />

intrinsic to the datamodel.<br />



- The datamodel supports type and class hierarchies. Since methods on subclasses may<br />

override inherited methods, we have to check that these are indeed specializations in<br />

order to limit undesired arbitrariness.<br />

- The datamodel supports strongly typed methods, hence type correctness has to be checked.<br />

A more general problem is the verification of consistency for constraints that<br />

evade enforcement.<br />

- The datamodel supports generic updates, but these only exist in the case of value-representability.<br />

This leads to the problem whether a uniqueness constraint is implied.<br />

The second class of inferences is required by the design methodology and is extrinsic to the<br />

datamodel.<br />

- The main scenario is based on stepwise refinement. Hence formal refinement conditions<br />

have to be verified. However, for the standard refinement steps in Section 9.4 this is<br />

redundant, since they have already been proven to be correct.<br />

- In order to enforce consistency the formal requirement of I-reducedness [19] has to be<br />

satisfied. Hence it has to be checked.<br />

- Finally, we have to recognize the relation of an intended design to existing library components,<br />

i.e. whether it is an instantiation, specialization, refinement or variant. This may<br />

involve data restructuring as shown in [12]. Moreover, once a useful variant has been detected,<br />

we may want to propagate the changes along the different hierarchies. This kind<br />

of variation-based reuse is still a research issue that we are working on.<br />

However, there are still open research problems. So far, we do not know the exact boundary<br />

of the inferences. Another problem is the integration of user interfaces and graphical support<br />

that make it easier to check whether the design fits the knowledge resulting<br />

from the current stage of requirements analysis.<br />

Currently, there is a research project CODE (Computer-aided Object oriented Design<br />

Environment) that aims at solving these open problems. The main research topics of CODE<br />

will be the extension of the design method toward variation-based reuse and the support of<br />

the outlined methodology by a CASE tool.<br />

References for Chapter 9<br />

1. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The object-oriented<br />

database system manifesto, Proc. 1st DOOD, Kyoto 1989<br />

2. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol.<br />

5 (4), 1990, pp. 353-382<br />

3. G. Booch: Object-oriented design with applications, Benjamin Cummings, 1991<br />

4. L. Cardelli, P. Wegner: On understanding types, data abstraction and polymorphism, ACM Computing<br />

Surveys, vol. 17 (4), pp. 471-522<br />

5. S. Ceri, J. Widom: Deriving production rules for constraint maintenance, Proc. 16th Conf. on<br />

VLDB, Brisbane (Australia), August 1990, pp. 566-577<br />

6. P. Coad, E. Yourdon: Object-oriented analysis, Prentice Hall, 1991<br />

7. C. Floyd: A comparative evaluation of system development methods, in T. W. Olle, H. G. Sol,<br />

A. A. Verrijn-Stuart (Eds.): Information Systems Design Methodologies - Improving the Practice,<br />

Elsevier 1986<br />

8. P. Fraternali, S. Paraboschi, L. Tanca: Automatic rule generation for constraint enforcement in<br />

active databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of<br />

Models and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS<br />



9. R. Hull, R. King: Semantic database modeling: survey, applications and research issues, ACM<br />

Computing Surveys, vol. 19 (3), September 1987<br />

10. W. Kim: Object-oriented databases: definition and research directions, IEEE Trans. on Knowledge<br />

and Data Engineering, vol. 2 (3), 1990, pp. 327-341<br />

11. B. Meyer: Object-oriented software construction, Prentice-Hall, 1988<br />

12. B. Piza, K.-D. Schewe, J. W. Schmidt: Term subsumption with type constructors, in Y. Yesha<br />

(Ed.): Proc. 1st Int. Conf. on Information and Knowledge Management, Baltimore, November<br />

1992<br />

13. G. Saake, R. Jungclaus: Specification of database applications in the TROLL language, in<br />

D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems,<br />

Glasgow, July 1991, Springer WICS, pp. 228-245<br />

14. K.-D. Schewe: Variant construction using constraint propagation techniques over semantic networks,<br />

in J. Retti, K. Leidlmaier (Eds.): Proc. of 5th Austrian AI Conference, Igls (Austria) 1989,<br />

Springer IFB 208, pp. 188-197<br />

15. B. Schewe, K.-D. Schewe, B. Thalheim: Verfeinerungsschritte für eine objektorientierte Entwurfsmethodik,<br />

in Proc. 23rd GI-Jahrestagung, Dresden (Germany), October 1993<br />

16. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, genericity and consistency in object-oriented<br />

databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Berlin (Germany), October<br />

1992, Springer LNCS 646, pp. 341-356<br />

17. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity enforcement in object-oriented<br />

databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models<br />

and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS<br />

18. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity preserving updates in object oriented databases, in<br />

M. Orlowska, M. Papazoglou (Eds.): Proc. Australian Database Conference, Brisbane (Australia),<br />

February 1993, World Scientific, pp. 171-185<br />

19. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint<br />

CS-08-92, December 1992, submitted for publication<br />

20. S. Shlaer, S. J. Mellor: An object-oriented approach to domain analysis, ACM Software Engineering<br />

Notes, vol. 14 (3), 1989<br />

21. C. Sernadas, P. Gouveia, J. Gouveia, A. Sernadas, P. Resende: The reification dimension in object-oriented<br />

database design, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification<br />

of Database Systems, Glasgow, July 1991, Springer WICS, pp. 275-299<br />

22. B. Thalheim: Dependencies in relational databases, Teubner, Leipzig 1991<br />

23. B. Thalheim: Intelligent database design using an extended entity-relationship model, University<br />

of Rostock, Preprint CS-11-91, December 1991<br />



Chapter 10<br />

View-Centered Conceptual Modelling<br />

Contents<br />

10.1 Introduction<br />

10.2 The data layer<br />

10.2.1 Application-independent abstraction: types<br />

10.2.2 Combined structure and behaviour: classes<br />

10.2.3 Operations<br />

10.2.4 OODM schemata and instances<br />

10.3 The dialogue layer<br />

10.3.1 Views in the datamodel<br />

10.3.2 Dialogue classes<br />

10.3.3 Operations on d-classes<br />

10.3.4 The dialogue management level<br />

10.3.5 The impact of genericity: selection, invocation, navigation, deletion<br />

10.4 The presentation layer<br />

10.4.1 Presentation of dialogue classes<br />

10.4.2 Presentation of actions<br />

10.5 Development Methods<br />

10.5.1 Designing a New Application<br />

10.5.2 Changing an Existing Application<br />

10.6 Conclusion<br />

The following is a reprint of<br />

Klaus-Dieter Schewe, Bettina Schewe. View-Centered Conceptual Modelling - An<br />

Object Oriented Approach. In B. Thalheim (Ed.). Conceptual Modeling - Proc. ER<br />

'96. Springer LNCS.<br />



Abstract. Information systems for highly skilled clerical workers present themselves as a<br />

collection of window-based processes with underlying procedures accessing databases. It is left<br />

to the users to continue or interrupt a certain piece of work or to switch from one application<br />

to another. Such systems can be supported by three layers: a database layer, a dialogue layer<br />

and a presentation layer.<br />

In this paper an integrated object oriented model with a distinction between types and<br />

classes is outlined. In this model views on the datamodel can be extended to dialogue classes,<br />

which enable a smooth integration of dialogue objects with the underlying datamodel. The<br />

only remaining task for the presentation layer consists of suitable ergonomic presentations of<br />

dialogue objects on the screen by means of a general UIMS.<br />

10.1 Introduction<br />

Conceptual modell<strong>in</strong>g for <strong>in</strong>formation systems depends on the <strong>in</strong>tended application. In case<br />

<strong>of</strong> the work <strong>of</strong> highly skilled clerical workers to be supported we must be aware that they do<br />

not follow a monotone work<strong>in</strong>g scheme. E.g., consider agencies <strong>of</strong> a health <strong>in</strong>surance company<br />

with emphasis on the service for clients, who behave dierent from one another, demand for<br />

optimal service and <strong>in</strong>formation without delay, address their demands to the agents either<br />

personally, by phone or by fax, appreciate not to be burdened with complicated term<strong>in</strong>ology<br />

and forms etc. Therefore, a support<strong>in</strong>g <strong>in</strong>formation system must support workow beyond<br />

strict regularity permitt<strong>in</strong>g its users to exam<strong>in</strong>e additional circumstances, write specialized<br />

letters <strong>in</strong>stead <strong>of</strong> us<strong>in</strong>g forms, escape or <strong>in</strong>terrupt processes etc.<br />

As a consequence such <strong>in</strong>formation systems have to be composed <strong>of</strong> several <strong>in</strong>dependently<br />

usable dialogues leav<strong>in</strong>g to the user the decision which one to use <strong>in</strong> a concrete situation. The<br />

dialogue system has to oer many quickly reachable dialogue objects without forc<strong>in</strong>g its users<br />

to reach them <strong>in</strong> a specic way. Furthermore, it must oer a good overview about a client's<br />

situation as context to the special data to be actually processed. On the other hand, such<br />

systems must handle large amounts <strong>of</strong> data, hence should be supported by a well-designed<br />

database system without bother<strong>in</strong>g the users with database details.<br />

From a conceptual modell<strong>in</strong>g po<strong>in</strong>t <strong>of</strong> view the description <strong>of</strong> dialogues can be divided <strong>in</strong>to<br />

two major components. The rst one comprises the pure representational aspects concern<strong>in</strong>g<br />

w<strong>in</strong>dows, eld, menues, shortcuts etc., and its design is basically concerned with a UIMS and<br />

ergonomic criteria [4]. The second one deals with the abstract data conta<strong>in</strong>ed <strong>in</strong> the dialogue<br />

objects.<br />

The data-intensive nature of the intended applications makes (conceptual) database<br />

design a central task in their development. This task is governed by general requirements concerning<br />

the quality of databases, which must be free of redundancies, flexible with respect<br />

to future extensions, not limited to specific applications, and must achieve high performance.<br />

However, the data processed in the dialogues is far from satisfying these criteria; instead it<br />

gives rise to views.<br />

The development process has to be understood as a learning process, where not all requirements<br />

are known at the beginning. This requires the participation of the users, because they<br />

are the only ones who can judge the usefulness of proposed solutions. As a consequence,<br />

the dialogue objects, and hence the views on the data defined by them, become the driving<br />

force in conceptual modelling. This should not be taken as an accident, but as a challenge.<br />

In this paper we present an integrated model on the basis of the object oriented datamodel<br />

(OODM) in [10]. This datamodel has been defined in the spirit of Beeri's fundamental idea<br />

concerning the conceptual separation of values and objects [2]. Values are provided by the<br />

means of type systems consisting of base types and constructors [6]. Objects are provided<br />

by the means of classes which combine complex value and reference structures, operations<br />

and inheritance. This approach to object orientation is quite different from the work in [3, 7],<br />

which focusses on methods for object oriented programming. In particular, it is easy to see that<br />

certain classes of OODM schemata with only flat acyclic reference structures are equivalent<br />

to schemata in the higher-order entity relationship model [13]. We give an outline of the<br />

datamodel in Section 10.2.<br />

The OODM has been extended by dialogue classes in [8, 12] in order to support the development<br />

of information systems as characterized above. These dialogue classes are defined<br />

analogously to classes in the datamodel, i.e. they provide structural and behavioural abstractions<br />

of dialogue objects as well as inheritance. The dialogue objects can then be handled in<br />

the same way as objects in the database, which turns the management of the dialogues into<br />

a database task. The relationship between the dialogue model and the datamodel is given by<br />

the means of views. We present the dialogue model in Section 10.3.<br />

The development of user interfaces then reduces to the task of finding suitable representations<br />

of dialogue objects on the screen. For this purpose we propose the use of a general<br />

UIMS. In Section 10.4 we present a brief outline of representational means with respect to<br />

our integrated model.<br />

To that end, the work reported in this paper continues our previous work in [12]. With<br />

respect to that paper we now achieve some simplification concerning the definition of dialogue<br />

classes. This definition was first given independently from the datamodel and led to several<br />

additional notions such as selection classes, actions and different kinds of operations (selection,<br />

navigation, invocation, processing), and we already observed the relationship to views on the<br />

datamodel. Now this relationship is directly incorporated in the definition of dialogue classes.<br />

Furthermore, selection is enabled by exploiting uniqueness constraints in the datamodel that<br />

were introduced in handling the identification problem in OODBs [10], and navigation can be<br />

supported by references between dialogue classes. Finally, the variety of different operations<br />

can be simplified using the distinction between hidden and visible operations which is already<br />

present in the OODM. Actions then correspond to the head of visible operations, while some<br />

of their characteristics are shifted to presentations.<br />

With respect to the modelling method we think of a refinement-based approach as presented<br />

in [9, 11] for the OODM and extended to the dialogue model in [8], i.e. the data<br />

schema and the dialogue schema have to be developed in parallel, taking care of their<br />

interrelationships. This is in contrast to the work in [1, 5], where the starting point for user<br />

interface design is a complete entity-relationship schema. This topic will be briefly sketched<br />

in Section 10.5.<br />

10.2 The data layer<br />

In the object-oriented datamodel (OODM) [10] we distinguish between objects and values.<br />

Whereas values are common abstractions identified by themselves, objects depend on the<br />

particular application context and have to be encoded by object identifiers. In the OODM<br />

each object consists of a unique, immutable identifier, a set of describing values of possibly<br />

different types, references to other objects and operations associated with the object.<br />

10.2.1 Application-independent abstraction: types<br />

Types are used to describe immutable sets of values with (type-)operations predefined on<br />

them. Type systems are prescriptions for the syntax and semantics of permitted type definitions.<br />

Consider a type system that consists of some basic types, type constructors and a<br />

subtyping relation. Moreover, recursive types, i.e. types defined by equations, and predicative<br />

types, i.e. types defined by restricting formulae, are included.<br />

Base types used here are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$, where ID<br />

is an abstract identifier type without any non-trivial supertype and $\bot$ is the trivial type that<br />

is a supertype of every type.<br />

The type constructors used here are $e_1 \mid \dots \mid e_n$ (enumeration), $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$<br />

(record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a : \alpha) \cup (b : \beta)$ (union), where<br />

$\alpha_1, \dots, \alpha_n, \alpha, \beta$ are already defined types, $e_1, \dots, e_n$ are constant values and<br />

$a_1, \dots, a_n, a, b$ are field selectors.<br />

We may use base types and constructors to define new types by nesting. If there is no<br />

confusion, the field selectors in record or union types may be omitted.<br />

The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />

standard operations on base types and on records, sets, bags, etc. We omit the details here. A<br />

type $T$ is called proper iff the number of its parameters is 0. $T$ is called a value type iff there<br />

is no occurrence of ID in $T$. If $T'$ is a proper type occurring in a type $T$, then there exists a<br />

corresponding occurrence relation $o : T \times T' \to BOOL$ with $o(v_1, v_2) = true$ iff $v_2$ occurs<br />

in $v_1$ at the position indicated by the position of $T'$ in $T$.<br />

A subtype function is a function $T' \to T$ from a subtype to its supertype ($T' \le T$) defined<br />

by the usual subtyping rules [6].<br />

Predicative types are used to restrict the set of values given by some type definition to<br />

a subset. Formally, a predicative type $T$ consists of an underlying type $T'$ and a formula $P$<br />

with exactly one free variable self of type $T'$. Clearly, the inclusion then gives a subtype<br />

function. In order to avoid inflationary use of quantifiers, other variables are also allowed to<br />

occur freely in such a formula. They are assumed to be universally quantified.<br />

Example 10.1. We define a type PERIOD and a predicative subtype COURSE of [PERIOD]:<br />

Type PERIOD = (begin : DATE, end : DATE ∪ ⊥)<br />

Where self.end ≠ ⊥ ⇒ self.begin ≤ self.end<br />

End PERIOD<br />

Type COURSE = [ PERIOD ]<br />

Where self = concat(L_1, [P_1, P_2 | L_2]) ⇒ P_2.end ≠ ⊥ ∧ P_2.end ≤ P_1.begin<br />

End COURSE<br />

$L_1$ and $L_2$ are lists with elements of type PERIOD, $P_1$ and $P_2$ are values of type PERIOD<br />

and 'concat' is the concatenation of two lists. Informally, the formula requires for any two<br />

successive periods the begin date of the first one to be later than the end date of the second<br />

one.<br />

□<br />
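The restricting formulae of Example 10.1 can be read as run-time checks. The following is a minimal Python sketch of that reading, not part of the OODM itself: the class `Period` and the function `is_course` are illustrative names, `None` stands in for the trivial value, and the comparison in both formulae is assumed to be "less than or equal".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Period:
    begin: date
    end: Optional[date]  # None plays the role of the trivial value

    def __post_init__(self):
        # Restricting formula of PERIOD: if end is present, begin <= end.
        if self.end is not None and self.begin > self.end:
            raise ValueError("PERIOD violated: begin must not be after end")


def is_course(periods):
    """Restricting formula of COURSE, checked on every adjacent pair (p1, p2).

    The free variables L_1, P_1, P_2, L_2 are universally quantified, so the
    condition has to hold for each pair of successive list elements.
    """
    return all(
        p2.end is not None and p2.end <= p1.begin
        for p1, p2 in zip(periods, periods[1:])
    )
```

Under this reading a value of type COURSE lists the most recent period first, and only the first period may be open-ended.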

10.2.2 Combined structure and behaviour: classes<br />

The class concept provides the grouping of objects having the same structure and behaviour.<br />

Structurally this uniformly combines aspects of object values and references. Behaviourally,<br />

this abstracts from operations on single objects including their creation and deletion. In the<br />

OODM objects usually belong to more than one class.<br />

References between classes give rise to implicit referential constraints. In addition, subclasses<br />

(IsA-relationships) require each database instance to satisfy inclusion constraints on<br />

object identifiers. As usual in object oriented approaches, class operations are used to model<br />

the database dynamics. In the OODM these are associated with classes.<br />

Since identifiers can be represented using ID, values and references can be combined into<br />

a representation type, where each occurrence of ID denotes references to some other class.<br />

Therefore, we may define the structure of a class using parameterized types. Moreover, classes<br />

are arranged in IsA-hierarchies.<br />

More formally, if $T$ is a value type with parameters $\alpha_1, \dots, \alpha_n$ and if the parameters are<br />

replaced by pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression<br />

is called a structure expression.<br />

Then a class consists of a class name $C$, a structure expression $S$, a set of class names<br />

$D_1, \dots, D_m$ (called superclasses) and a set of operations. We call $r_i$ the reference named $r_i$<br />

from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type<br />

ID is called the representation type $T_C$ of the class $C$. The type $U_C = (ident : ID, value : T_C)$<br />

is called the class type of class $C$.<br />

Example 10.2.<br />

Let us consider a class Insurant for an insurance application.<br />

Class Insurant =<br />

Structure (insurance_number : NAT, name : NAME, address : ADDRESS,<br />

course_of_insurance : [ (kind : "self", begin : DATE,<br />

end : (date : DATE, reason : STRING) ∪ ⊥) ∪<br />

(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />

self : Self_Insurant, relation : "child" | "spouse") ])<br />

Operation ...<br />

End Insurant<br />

Class Self_Insurant =<br />

IsA Insurant<br />

Structure (employed_by : Company, account_no : NAT)<br />

Operation ...<br />

End Self_Insurant<br />

A period of insurance in this example is of one of two possible kinds: either the insurant is<br />

employed by a company and therefore pays his/her own fee, or (s)he is a family member of<br />

a self-insurant and has no income of his/her own.<br />

□<br />
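The step from a structure expression to the representation type and the class type is purely syntactic. As a rough Python sketch, assume structure expressions are encoded as nested dictionaries and lists, with a hypothetical marker class `Ref` for a reference r : C; none of these names belong to the OODM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A reference r : C inside a structure expression (illustrative encoding)."""
    name: str
    target: str


def representation_type(expr):
    """Replace every reference r : C by the type ID, leaving the rest intact."""
    if isinstance(expr, Ref):
        return "ID"
    if isinstance(expr, dict):           # record: field selector -> component
        return {a: representation_type(t) for a, t in expr.items()}
    if isinstance(expr, (list, tuple)):  # constructor applied to components
        return type(expr)(representation_type(t) for t in expr)
    return expr                          # base type name such as "NAT"


def class_type(expr):
    """U_C = (ident : ID, value : T_C)."""
    return {"ident": "ID", "value": representation_type(expr)}


# Structure of Self_Insurant: employed_by references the class Company.
self_insurant = {"employed_by": Ref("employed_by", "Company"), "account_no": "NAT"}
```

Applied to `self_insurant`, the reference to Company becomes the type ID while `account_no` keeps its base type.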

10.2.3 Operations<br />

The OODM distinguishes between visible and hidden operations on classes to emphasize those<br />

that can be invoked by the user. However, all operations on a class, including the hidden ones,<br />

can be accessed by other operations. The justification for such a weak hiding concept rests<br />

on two reasons:<br />

- Visible operations serve as a means to specify (nested) transactions. In order to build<br />

sequences of database instances we only regard these transactions, assuming a linear invocation<br />

order on them.<br />

- Hidden operations can be used to handle identifiers. Since these identifiers do not have<br />

any meaning to the user, they must not occur within the input or output of a transaction.<br />

Each operation on a class $C$ consists of a signature and a body. The signature consists of<br />

an operation name $O$, a set of input-parameter/type pairs $\iota_i :: T_i$ and a set of<br />

output-parameter/type pairs $o_j :: T'_j$. The body is recursively built of the following constructs:<br />

- assignment $x := E$, where $x$ is the class variable $C$ of type $\{U_C\}$ or a local variable<br />

(including the output-parameters), and $E$ is an expression of the same type as $x$,<br />

- local variable declaration Let $x :: T$,<br />

- skip and fail,<br />

- sequencing $S_1 ; S_2$ and branching IF $P$ THEN $S_1$ ELSE $S_2$ ENDIF,<br />

- operation call $C' :- O'(in : E'_1, \dots, E'_j, out : x'_1, \dots, x'_i)$, where $O'$ is an operation on class<br />

$C'$ with compatible signature, and<br />

- non-deterministic selection of values New.f(x), where $f$ is a selector on the representation<br />

type of $C$; New.Id selects a new identifier.<br />

An operation $O$ on a class $C$ is called value-defined iff all types occurring in its signature are<br />

proper value types. As already mentioned, we require each visible operation to be value-defined.<br />

Subclasses inherit the operations of their superclasses; overriding is allowed as long as the<br />

new operation is a specialization of all its corresponding operations in its superclasses. We<br />

dispense with a formal discussion of operational specialization.<br />
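Whether an operation is value-defined can be decided from its signature alone. The sketch below assumes a hypothetical encoding of types as nested Python structures with base types written as strings; `is_value_type` and `is_value_defined` are illustrative names.

```python
def is_value_type(t):
    """True iff the base type ID does not occur anywhere in t."""
    if t == "ID":
        return False
    if isinstance(t, dict):              # record or union: selector -> component
        return all(is_value_type(c) for c in t.values())
    if isinstance(t, (list, tuple)):     # constructor with its components
        return all(is_value_type(c) for c in t)
    return True                          # any other base type or constant


def is_value_defined(inputs, outputs):
    """inputs / outputs map parameter names to (parameter-free) types."""
    signature_types = list(inputs.values()) + list(outputs.values())
    return all(is_value_type(t) for t in signature_types)
```

A visible operation would have to pass `is_value_defined`, which is what keeps identifiers out of the input and output of transactions.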

10.2.4 OODM schemata and instances<br />

A database schema $S$ is given by a finite collection of type and class definitions such that<br />

all types, classes and operations occurring within type definitions, structure definitions and<br />

operations are defined in $S$.<br />

At any time, a class represents a finite set of objects. More precisely, this is captured by the<br />

notion of an instance (or database state). For a closed schema $S$ an instance $\mathcal{D}$ assigns to each<br />

class $C$ a value $\mathcal{D}(C)$ of type $\{(ident : ID, value : T_C)\}$ such that the following conditions<br />

are satisfied:<br />

- For each class $C$ identifiers must be unique.<br />

- The set of identifiers in a subclass $C$ is a subset of the one in the superclass $C'$. Moreover,<br />

if $T_C \le T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then $(i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C')$<br />

holds.<br />

- For each reference $r$ from $C$ to $D$, identifiers $j$ occurring in a value $v$ of an object in $C$<br />

with respect to the occurrence relation $o_r$, i.e. $(i, v) \in \mathcal{D}(C)$ and $o_r(v, j)$ hold, must occur<br />

in $\mathcal{D}(D)$.<br />
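The three conditions on an instance can be checked mechanically. The following Python sketch does so under simplifying assumptions: an instance is a dictionary from class names to lists of (identifier, value) pairs, the occurrence relation of each reference is supplied as a function that lists the identifiers occurring in a value, and the subtype-function part of the inclusion condition is left out. All names are illustrative.

```python
def is_instance(db, superclasses, references):
    """Check uniqueness, inclusion and referential integrity.

    db:           class name -> list of (ident, value) pairs
    superclasses: class name -> list of superclass names
    references:   (C, D) -> function returning the identifiers that occur
                  in a value of C at the positions of the reference to D
    """
    ids = {c: [i for i, _ in objs] for c, objs in db.items()}
    # 1. Identifiers are unique within each class.
    if any(len(v) != len(set(v)) for v in ids.values()):
        return False
    # 2. Identifiers of a subclass also occur in each of its superclasses.
    for c, supers in superclasses.items():
        if any(not set(ids[c]) <= set(ids[s]) for s in supers):
            return False
    # 3. Referenced identifiers occur in the target class.
    for (c, d), occurring in references.items():
        for _, v in db[c]:
            if any(j not in ids[d] for j in occurring(v)):
                return False
    return True
```

A database state that removes a company still referenced by a self-insurant, for instance, fails the third check.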

Basic update operations, i.e. insertion, deletion and update of a single object in a class $C$,<br />

cannot always be derived in the object-oriented case, because the abstract identifiers have<br />

to be hidden from the user. However, in [10] it has been shown that for value-representable<br />

classes these operations are uniquely determined by the schema and consistent with respect<br />

to the implicit referential and inclusion constraints.<br />

Value-representability of all classes in a closed schema is implied if we have a (trivial)<br />

uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the<br />

class extension $C$ to be unique.<br />
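A trivial uniqueness constraint means that a value determines its object, so the hidden identifier can be recovered from user-visible data. A small Python sketch of that idea follows; `freeze` and `value_index` are illustrative helpers that assume values are built from dictionaries, lists and hashable base values.

```python
def freeze(value):
    """Turn nested dicts and lists into hashable tuples so values can be compared."""
    if isinstance(value, dict):
        return tuple(sorted((k, freeze(v)) for k, v in value.items()))
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    return value


def value_index(extension):
    """Map each value of type T_C to its identifier.

    Raises ValueError if two different identifiers carry the same value,
    i.e. if the trivial uniqueness constraint is violated.
    """
    index = {}
    for ident, value in extension:
        key = freeze(value)
        if index.setdefault(key, ident) != ident:
            raise ValueError("two objects share the same value")
    return index
```

With such an index, a value entered by the user suffices to locate the object, and the identifier itself never has to be shown.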

10.3 The dialogue layer<br />

Object orientation within dialogue systems means to enter or select values on the screen and<br />

to invoke actions on them. The dialogue system reacts by offering other data or by activating<br />

and deactivating entries in selection lists or possible actions in the action bar [4]. We call<br />

such a collection of data and possible actions a dialogue object (d-object). In graphical user<br />

interfaces d-objects are normally presented in a window.<br />

Users invoke actions to change data in the database, to navigate to another, possibly<br />

new, dialogue object or to a modified presentation of the same dialogue object. Depending<br />

on selections or entries made in a d-object, only a part of the possible actions is allowed.<br />

The processing of an action may require further preconditions depending on the state of the<br />

dialogue system, especially on other users' d-objects.<br />

A dialogue object consists of a unique abstract identifier, a set of values $v_i$ in associated<br />

fields $F_1, \dots, F_n$ which correspond to describing values of objects, a set of references to other<br />

dialogue objects in order to allow quick navigational access, a set of actions to change the<br />

data and to control the dialogue, and a state with the values 'active' and 'inactive'. This<br />

means that a dialogue object only exists as long as it is visible on the screen.<br />

If a window is closed, the corresponding dialogue object is deleted.<br />

The identifier serves to administrate the dialogue objects. It is not known to the user,<br />

cannot be used by him and is not visible. Only the active d-object allows manipulations of<br />

the represented data and only its actions can be invoked.<br />
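The life cycle just described can be illustrated by a toy bookkeeping class in Python. It is only a sketch: the class `DialogueManager` and its methods are hypothetical, and it assumes that a newly opened window becomes the active one.

```python
class DialogueManager:
    """Toy bookkeeping of d-objects; identifiers stay internal to the manager."""

    def __init__(self):
        self._objects = {}    # hidden identifier -> field values
        self._active = None   # identifier of the active d-object, if any
        self._counter = 0

    def open(self, fields):
        self._counter += 1
        self._objects[self._counter] = dict(fields)
        self._active = self._counter      # assumption: new window is active
        return self._counter

    def close(self, ident):
        del self._objects[ident]          # closing the window deletes the d-object
        if self._active == ident:
            self._active = None

    def activate(self, ident):
        if ident not in self._objects:
            raise KeyError(ident)
        self._active = ident

    def edit(self, ident, field, value):
        if ident != self._active:
            raise PermissionError("only the active d-object can be manipulated")
        self._objects[ident][field] = value

    def value(self, ident, field):
        return self._objects[ident][field]

    def is_open(self, ident):
        return ident in self._objects
```

Editing an inactive d-object is rejected, and a closed d-object no longer exists in the manager.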

10.3.1 Views in the datamodel<br />

Roughly speaking, a view may be regarded as a stored query. In the relational datamodel queries<br />

can be expressed by terms in relational algebra. This can be generalized to the OODM using<br />

its type system. Then a query turns out to be represented by a term $t$ over some type $T$ such<br />

that the free variables of $t$ represent the classes.<br />

Since objects employ identifiers, we have to distinguish between queries that result in<br />

values and those that result in (collections of) objects. Therefore we distinguish in the OODM<br />

between value queries and general access expressions. For a value query the type $T$ of the<br />

defining term $t$ must be a value type.<br />

This allows terms $t$ to be built which involve only identifiers already existing in the<br />

database. Thus, such queries are called object preserving. If we want the result of a query to<br />

represent 'new' objects, i.e. if we want to have object generating queries, we have to apply a<br />

mechanism to create new object identifiers. This can be achieved by object creating functions<br />

on the type ID with arity $ID \times \dots \times ID \to ID$ [10].<br />

The idea that a view is a stored query then carries over easily. Thus, a view on the<br />

schema $S$ consists of a view name $v \in N_C$ such that there is no class $C$ with this name, a<br />

structure expression $S(v)$ containing references to classes in $S$ or to views on $S$, and a defining<br />

access expression¹ $t(v)$ of type $\{(ident : ID, value : T_v)\}$, where $T_v$ is the representation type<br />

corresponding to $S(v)$.<br />

¹ Assume for the moment that view definitions do not contain recursive definitions.<br />

Example 10.3. Let us give a sample view on the schema of Example 10.2:<br />

View Course_of_Insurant =<br />

Structure<br />

[ (kind : "self", begin : DATE,<br />

end : (date : DATE, reason : STRING) ∪ ⊥,<br />

fams : { (id : Insurant, name : NAME, relation : "child" | "spouse",<br />

begin : DATE, end : DATE ∪ ⊥) } ) ∪<br />

(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />

self : (id : Insurant, name : NAME,<br />

begin : DATE, end : DATE ∪ ⊥)) ]<br />

Definition<br />

{ (i, course) | ∃ cou . (i, cou) ∈ Insurant ∧<br />

course = [ p ‖ ∃ c ∈ cou.course_of_insurance .<br />

p.kind = c.kind ∧ p.begin = c.begin ∧ p.end = c.end ∧<br />

( c.kind = "self" ⇒ p.fams = { (j, fam) | ∃ cou' .<br />

(j, cou') ∈ Insurant ∧ fam.name = cou'.name ∧<br />

("fam", fam.begin, fam.end, i, fam.relation) =<br />

cou'.course_of_insurance.first ∧<br />

p.begin ≤ fam.begin ∧<br />

(p.end ≠ ⊥ ∧ fam.end ≠ ⊥ ⇒ fam.end ≤ p.end) } ) ∧<br />

( c.kind = "fam" ⇒ ∃ (k, cou'') ∈ Insurant . c.self = k ∧<br />

p.self.name = cou''.name ∧ p.self.begin ≤ p.begin ∧<br />

("self", p.self.begin, p.self.end) ∈ cou''.course_of_insurance ∧<br />

(p.self.end ≠ ⊥ ∧ p.end ≠ ⊥ ⇒ p.end ≤ p.self.end) ) ] }<br />

End Course_of_Insurant<br />

This view contains the course of insurance of one concrete insurant. Together with one period<br />

of kind 'self' in that course there are also the latest insurance periods of the family members.<br />

Together with one period of kind 'fam' there is also the period of the insurant to whose family<br />

the related insurant belongs.<br />

□<br />
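The reading of a view as a stored query can be sketched in Python. The example is deliberately much simpler than Example 10.3: the hypothetical query `self_periods` only collects the begin dates of the 'self' periods of each insurant. It is object preserving because it reuses the identifiers of the class Insurant. The names `View` and `self_periods` and the dictionary encoding of instances are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class View:
    name: str
    query: Callable  # stands in for the defining access expression t(v)

    def evaluate(self, instance):
        """Evaluate the stored query; the result maps identifiers to view values."""
        return dict(self.query(instance))


def self_periods(instance):
    """Object-preserving query: keeps the identifiers found in Insurant."""
    return [
        (ident, [p["begin"] for p in value["course_of_insurance"]
                 if p["kind"] == "self"])
        for ident, value in instance["Insurant"]
    ]
```

Because only the query is stored, each evaluation reflects the current database instance.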

10.3.2 Dialogue classes<br />

Dialogue classes serve to group dialogue objects with the same structure and behaviour. As<br />

with objects, we may use the type ID to represent identifiers of d-objects and combine values<br />

and references in a structure expression now containing also references to other d-classes. This<br />

can be described by a view as defined in the previous subsection. In addition, there should be<br />

a visual type which describes the data shown on the screen. This should be a supertype of the<br />

representation type corresponding to the structure expression of the defining view. Then the<br />

content of a d-object may be split over more than one d-class, which leads to the introduction<br />

of super-d-classes. Actions can be expressed by d-operations.<br />

Thus, a dialogue class (d-class) consists of a unique name $DC$, a set of names $DC_1, \dots,$<br />

$DC_n$ of d-classes (called the super-d-classes of $DC$), a defining view with a structure expression<br />

$DT'_{DC}$ and a content definition $def_{DC}$, a value type $DT_{DC}$ which is a supertype of the<br />

representation type $T'_{DC}$ corresponding to $DT'_{DC}$, and a set of d-operations. We call $DT'_{DC}$ the<br />

content structure, $T'_{DC}$ the content type and $DT_{DC}$ the visual type of the d-class $DC$.<br />

Example 10.4. We give a part of the formal definition of a d-class corresponding to the view<br />

in Example 10.3:<br />

Dialogue class Course_of_insurance<br />

IsA IIP<br />

Visual<br />

[(kind : "self", begin : DATE, end : (date : DATE, reason : STRING) ∪ ⊥,<br />

fams : {(name : NAME, relation : "child" | "spouse", begin : DATE,<br />

end : DATE ∪ ⊥)} ) ∪<br />

(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />

self : (name : NAME, begin : DATE, end : DATE ∪ ⊥))]<br />

Content<br />

[(kind : "self", begin : DATE, end : (date : DATE, reason : STRING) ∪ ⊥,<br />

fams : {(id : Insurant, name : NAME, relation : "child" | "spouse",<br />

begin : DATE, end : DATE ∪ ⊥)} ) ∪<br />

(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />

self : (id : Insurant, name : NAME, begin : DATE, end : DATE ∪ ⊥))]<br />

Definition ...<br />

Operations ...<br />

End Course_of_insurance<br />

For the definition we refer to the view presented in Example 10.3.<br />

□<br />
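In Example 10.4 the visual type differs from the content type only in that the id components are missing, so the subtype function from content to visual is a projection. A minimal Python sketch, assuming content values are encoded as nested dictionaries and lists and that the hidden components are exactly those named `id`:

```python
def to_visual(content):
    """Subtype function from the content type to the visual type:
    forget every 'id' component, recursively."""
    if isinstance(content, dict):
        return {k: to_visual(v) for k, v in content.items() if k != "id"}
    if isinstance(content, list):
        return [to_visual(v) for v in content]
    return content
```

The identifiers remain available in the content of the d-object for navigation, while the screen only ever receives the projected value.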

10.3.3 Operations on d-classes<br />

If a user selects an action associated with an active d-object, (s)he initiates changes to that<br />

d-object including its deletion, the creation of a new d-object, modifications to the underlying<br />

database or switches to other d-objects. This is modelled by the d-operations on d-classes.<br />

As with the datamodel, we distinguish between visible and hidden d-operations. Only<br />

visible d-operations are accessible by user actions, whereas hidden d-operations can only<br />

be called from other d-operations. In contrast to the datamodel, the access to (visible)<br />

d-operations may be restricted by preconditions that express the status of a d-object by means<br />

of selected or non-selected parts. Such preconditions are given by supertypes of the visual type<br />

$DT_{DC}$ of the d-class $DC$.<br />

Thus, a d-operation consists of a signature, a selection type and a body. The signature is the<br />

same as for classes in the datamodel. This also applies to the body, with the differences that<br />

operations to be called can be d-operations on d-classes and operations on classes, whereas<br />

assignments are not allowed. In this way we circumvent the update problem for views.<br />

Then, by analogy to the datamodel, we require visible d-operations to be value-defined.<br />

Sub-d-classes inherit d-operations from their super-d-classes, and overriding is restricted to<br />

specialization.<br />

Example 10.5. The follow<strong>in</strong>g denes a d-operation on the d-class Course <strong>of</strong> <strong>in</strong>surance<br />

<strong>of</strong> Example 10.5:<br />

New Insurant [(fams:(name:NAME)) [ (self:(name:NAME)) [?]<br />

(<strong>in</strong> : ?, out:?)<br />

System :- save (<strong>in</strong>: cont) <br />

Let <strong>in</strong>s :: ID <br />

Course <strong>of</strong> <strong>in</strong>surance :- Select (<strong>in</strong>: sel, out: <strong>in</strong>s) <br />

Course <strong>of</strong> <strong>in</strong>surance :- Delete (<strong>in</strong>: cont.ident) <br />

204


Course <strong>of</strong> <strong>in</strong>surance :- Invoke (<strong>in</strong>: <strong>in</strong>s)<br />

End New Insurant<br />

Here the selection supertype has been put <strong>in</strong>to brackets. Furthermore, we used two standard<br />

variables cont <strong>of</strong> type (ident :ID,value : TDC 0 ) for the identier/content pair <strong>of</strong> the current<br />

d-object and sel for the selected values with respect to the selection type.<br />

This d-operation New Insurant stores the actual data <strong>in</strong> the database, retrieves the identier<br />

<strong>of</strong> the selected <strong>in</strong>surant, deletes the current d-object and creates a new one associated<br />

with the course <strong>of</strong> <strong>in</strong>surance <strong>of</strong> the selected <strong>in</strong>surant.<br />

For that purposes we used calls to a d-operation save dened on the d-class System (assume<br />

this has been dened as a super-d-class <strong>of</strong> IIP) and to generic d-operations on Course<br />

<strong>of</strong> <strong>in</strong>surance for the deletion and creation <strong>of</strong> d-objects. We shall <strong>in</strong>vestigate genericity below.<br />

□
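The four calls in the body of New Insurant can be traced in a small Python sketch. It is only an illustration under simplifying assumptions: the names `DObject`, `DialogueState`, `select` and `new_insurant` are hypothetical, the selection value is reduced to a dictionary with a `name` entry, and saving is modelled as appending to a list rather than writing to a database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DObject:
    ident: int   # identifier of the d-object
    value: dict  # content value (simplified to a dictionary)


@dataclass
class DialogueState:
    saved: List[dict] = field(default_factory=list)           # stands in for System.save
    active: Dict[int, DObject] = field(default_factory=dict)  # actual d-objects
    current: Optional[int] = None                             # d-object shown to the user


def select(possible: Dict[int, DObject], sel: dict) -> int:
    """Return the identifier of the possible d-object matching the selected name."""
    for ident, obj in possible.items():
        if obj.value["name"] == sel["name"]:
            return ident
    raise LookupError("no matching d-object")


def new_insurant(state: DialogueState, possible: Dict[int, DObject],
                 cont: DObject, sel: dict) -> int:
    state.saved.append(dict(cont.value))  # System :- save (in: cont)
    ins = select(possible, sel)           # Select (in: sel, out: ins)
    state.active.pop(cont.ident, None)    # Delete (in: cont.ident)
    state.active[ins] = possible[ins]     # Invoke (in: ins)
    state.current = ins
    return ins
```

The sketch keeps the order of the calls in the d-operation: the current content is saved before the current d-object is deleted, and the selected identifier is obtained before it is invoked.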

10.3.4 The dialogue management level<br />

The notions of d-schema and d-instance generalize the corresponding notions for the datamodel. A d-schema is a finite collection of type, class and d-class definitions that do not contain undefined types, classes, operations, d-classes or d-operations occurring in structure expressions, references, superclasses, signatures or calls. In particular, each d-schema DS has an underlying OODM schema S.<br />
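The closure condition on a d-schema (no definition may use an undefined name) can be checked mechanically. The following Python sketch assumes a deliberately flat encoding in which every definition is reduced to the set of names it uses; the function names and the sample dependencies are hypothetical and only illustrate the condition.

```python
from typing import Dict, Set


def undefined_names(definitions: Dict[str, Set[str]]) -> Set[str]:
    """Return every name used by some definition but defined nowhere."""
    used: Set[str] = set().union(*definitions.values())
    return used - set(definitions)


def is_d_schema(definitions: Dict[str, Set[str]]) -> bool:
    """A collection of definitions is closed if nothing undefined is used."""
    return not undefined_names(definitions)


# Illustrative dependencies only; the real definitions carry far more structure.
schema = {
    "NAME": set(),
    "ID": set(),
    "Insurant": {"NAME"},
    "System": set(),
    "IIP": {"Insurant", "System"},
    "Course of insurance": {"IIP", "Insurant"},
}
```

For this sample, `is_d_schema(schema)` holds, while removing the entry for `System` would leave `IIP` with an undefined super-d-class.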

Then an instance D of S already defines the contents of the views underlying d-classes in DS and hence also the contents with respect to the visual types. For a d-class DC we write D(DC) for the value of type {(ident : ID, value : T'_DC)} defined by the content definition part def_DC on the instance D. We call D(DC) the set of possible d-objects in d-class DC with respect to the instance D.<br />

However, we want to associate with a d-instance the set of actual (active) d-objects in d-class DC. This should lead to subsets of D(DC) satisfying conditions analogous to those required for instances D.<br />

Thus, a d-instance DD for a d-schema DS consists of an instance D of the underlying OODM schema S and a mapping D_act which assigns to each d-class DC ∈ DS a set D_act(DC) such that the uniqueness of identifiers, the inclusion integrity and the referential integrity (as defined for instances) hold.<br />
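A partial check of these conditions might look as follows in Python. This is a sketch under stated assumptions: d-objects are reduced to (identifier, value) pairs, inclusion integrity is read as "every identifier in a sub-d-class also occurs in its super-d-class" (an assumption, since the definition for instances is not restated here), and referential integrity is left out because references are not modelled. All names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

DObj = Tuple[int, str]  # (ident, value); the value is simplified to a string


def is_d_instance(possible: Dict[str, Set[DObj]],
                  actual: Dict[str, Set[DObj]],
                  sub_super: List[Tuple[str, str]]) -> bool:
    for dc, act in actual.items():
        # Actual d-objects must be possible d-objects: D_act(DC) is a subset of D(DC).
        if not act <= possible[dc]:
            return False
        # Uniqueness of identifiers within a d-class.
        idents = [ident for ident, _ in act]
        if len(idents) != len(set(idents)):
            return False
    # Inclusion integrity (assumed reading): sub-d-class identifiers occur in the super-d-class.
    for sub, sup in sub_super:
        sub_ids = {ident for ident, _ in actual[sub]}
        sup_ids = {ident for ident, _ in actual[sup]}
        if not sub_ids <= sup_ids:
            return False
    return True
```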

10.3.5 The impact of genericity: selection, invocation, navigation, deletion<br />

As for classes in the datamodel, we may ask for generic operations on d-classes. Since possible d-objects are already determined by instances, generic operations for d-classes can only provide the deletion of actual d-objects or the invocation of another d-object. Note that the latter case comprises the navigation to an existing (active) d-object as well as the creation of a new one (in D_act(DC)) combined with a switch to it.<br />

Since sets of actual d-objects behave like sets of ordinary OODM objects, we may exploit the identification theory of the OODM in [10] to generate these generic operations. Even simpler, due to the definition via views we only need a value identification for d-objects.<br />

Recall that such a value identification is given by uniqueness constraints for all classes. Since subtyping can be easily extended to structure expressions, such a uniqueness constraint on a class C with structure expression S_C is simply given by a super-structure-expression S_C.<br />



If the representation type T_C of S_C is a value type, then this means to determine a unique object in D(C), i.e. its identifier, from a given value of type T_C (if such an object exists at all). If S_C contains a reference, we first have to identify a referenced object, i.e. to determine its identifier from a given value of some value type.<br />

Thus value identification gives rise to a generic select-operation which may call select-operations on other classes. If there are several uniqueness constraints, we may combine the required input types using the union constructor.<br />

Example 10.6. For the class Insurant in Example 10.2 we may obtain a select-operation with the signature<br />

select (in: sel :: (Isn: NAT) ∪ (name: NAME, date of birth: DATE, address: ADDRESS), out: i :: ID)<br />

Since the view in Example 10.3 is object preserving and the d-class in Example 10.4 is defined on top of this view, the operation carries over to a select operation for a d-object (used in Example 10.5).<br />

□
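The select signature of Example 10.6 can be mirrored in Python by a union of two input types, one per uniqueness constraint. This is a minimal sketch: the class layout, the alias names `ByIsn` and `ByPerson`, and the sample address are assumptions made for illustration; only the attribute names come from the signature.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Insurant:
    isn: int
    name: str
    date_of_birth: str
    address: str


ByIsn = int                       # first uniqueness constraint: (Isn: NAT)
ByPerson = Tuple[str, str, str]   # second: (name, date of birth, address)


def select(objects: Dict[int, Insurant],
           sel: Union[ByIsn, ByPerson]) -> Optional[int]:
    """Return the identifier of the unique matching object, or None if there is none."""
    for ident, obj in objects.items():
        if isinstance(sel, int):
            if obj.isn == sel:
                return ident
        elif (obj.name, obj.date_of_birth, obj.address) == sel:
            return ident
    return None


# The address is made up; the remaining values follow Figure 10.1.
objects = {7: Insurant(1133557, "Neumann, Luise", "10.11.1948", "Example Street 1")}
```

Either branch of the union yields the same identifier, which is what allows the two uniqueness constraints to share one select-operation.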

As seen in Example 10.6, value identification gives rise to a generic select-operation for d-classes defined by object preserving views. In this case the delete- and invoke-operations can be split into a selection part, i.e. a call to the select-operation, and a simpler delete- or invoke-operation with an input of type ID (the identifier of the d-object).<br />
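The split into a selection part and an identifier-based part can be sketched in Python as plain function composition. The function names are hypothetical, the set `active` stands for the actual d-objects of one d-class, and `select` is any function that maps a value to an identifier or `None`.

```python
from typing import Callable, Optional, Set, Tuple

Select = Callable[[str], Optional[int]]


def delete_by_value(select: Select, active: Set[int], value: str) -> Optional[int]:
    """Selection part followed by the simpler delete on an identifier."""
    ident = select(value)
    if ident is not None:
        active.discard(ident)
    return ident


def invoke_by_value(select: Select, active: Set[int], value: str) -> Tuple[int, bool]:
    """Selection part followed by invoke; reports whether a new d-object was created."""
    ident = select(value)
    if ident is None:
        raise LookupError("no possible d-object matches the given value")
    created = ident not in active  # False means navigation to an existing d-object
    active.add(ident)
    return ident, created
```

The boolean returned by `invoke_by_value` distinguishes the two cases of invocation mentioned earlier: navigation to an already active d-object versus creation of a new one followed by a switch to it.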

In the case of an object generating view the new objects depend on others and we may obtain a generic select-operation by first selecting these other objects. E.g., in our insurance application this applies to a d-class in which each period of an insurant is turned into a separate object.<br />

10.4 The presentation layer<br />

The handling of a dialogue system is best performed using a User Interface Management System (UIMS). Such a system provides (among other features)<br />
- windows and operations to open and close them, to move them on the screen, to scroll, to change their size etc.<br />
- several representations of data, such as selection lists or buttons, text entry fields etc.<br />
- a main menu where all dialogues start, often called the operation desk.<br />

10.4.1 Presentation of dialogue classes<br />

For each d-class there is at least one representation on the screen. Normally there are actions with which the representation of the d-class on the screen can be modified without changing the state of the d-class. The representation of the d-class is given by the UIMS. The concrete description therefore depends on its functionality.<br />

Visual values are associated with fields consisting of a relation to a component of the content type of a d-class, field attributes like 'protected' / 'unprotected', 'normal' / 'emphasized', ..., the type of the field (text entry field, selection field, ...), a selection state with the values 'selected' and 'unselected', the information whether data have been entered in a field or not, the information where the cursor is placed and an optional name of the field.<br />
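The ingredients of a field listed above map naturally onto a record type. The Python sketch below is one possible encoding, not a prescribed one: the class and attribute names are hypothetical, only two field types are enumerated, and the `enter` method is an added convenience showing how the 'protected' attribute might be respected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FieldKind(Enum):
    TEXT_ENTRY = "text entry field"
    SELECTION = "selection field"


@dataclass
class Field:
    component: str               # relation to a component of the content type
    kind: FieldKind              # type of the field
    protected: bool = False      # 'protected' / 'unprotected'
    emphasized: bool = False     # 'normal' / 'emphasized'
    selected: bool = False       # selection state
    data_entered: bool = False   # whether data have been entered
    has_cursor: bool = False     # whether the cursor is placed in this field
    name: Optional[str] = None   # optional name of the field

    def enter(self, text: str) -> bool:
        """Record that data were entered; refuse input for protected fields."""
        if self.protected:
            return False
        self.data_entered = True
        return True
```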



System   History   Options   Windows<br />
Course of the Insurance<br />
1133557   Neumann, Luise   10.11.1948   273<br />
+ more information about the insurant +<br />
Kind   Begin   End   Reason of End<br />
    Name   Relation   Begin   End<br />
self   01.04.1979<br />
    Neumann, Marga   child   13.02.1984<br />
    Neumann, Horst   child   27.04.1986<br />
fam   10.11.1976   31.03.1979<br />
    Meier-Neumann, Fritz   01.01.1975<br />
self   01.10.1967   09.11.1976   Too old as student<br />
fam   10.11.1948   30.09.1967<br />
    Neumann, Wilhelm   01.01.1919   16.08.1990<br />

Fig. 10.1. The presentation of a d-object<br />

Fields may be grouped together. Further properties of fields depend on the features of the UIMS. For each field there is at least one representation on the screen comprising a declaration of its length, its style of emphasis and its style of representation of protection. For each representation there is also a representation of the selection state of the field.<br />

Example 10.7. The presentation of a d-object in the class Course of insurance consists of three parts corresponding to the d-classes Course of insurance, IIP and System (see Figure 10.1):<br />
- The 'insurant information part (IIP)' is part of most d-objects and gives an overview of the insurant.<br />
- Besides the IIP the d-object contains a list of insurance periods. Each period is represented by a group of lines, of which the first line contains the kind (self or as family member of another insurant), the begin and the end of the period. For periods of kind 'self' several lines (possibly 0) follow with the names of family members, the relation of the family member to the insurant and the begin and end of the latest insurance period of the family member. For periods of kind 'fam' one line follows with the name and the insurance period of the insurant whose family the member belongs to.<br />
- The last line is used for messages and originates from the d-class System. □<br />

Besides the d-classes which are invoked by the user there are dialogue boxes, in which data can be entered and processed [4]. Dialogue boxes are called by operations of d-classes if further data are needed to finish an operation.<br />



10.4.2 Presentation of actions<br />

The user uses actions to change the data on the screen and to control the dialogue. These actions correspond to the d-operations in the d-classes of the d-object and therefore consist of a name used in the action bar, a shortcut symbol with which the action can be invoked alternatively and a selection criterion.<br />

The name of the action is the name of the corresponding d-operation, but names of menus may be added if necessary. The selection criterion is given by fields that may or must be selected before invoking the action. It corresponds to the selection type of the corresponding d-operation. Invoking an action means to execute the body of the corresponding d-operation.<br />
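An action as described here bundles a name, an optional shortcut, a selection criterion and the body of a d-operation. The Python sketch below is a hypothetical encoding: the criterion is split into two sets of field names (`may_select`, `must_select`), and the body is an ordinary function that receives the selected fields and returns a short description of the effect.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Set


@dataclass(frozen=True)
class Action:
    name: str                         # name shown in the action bar
    shortcut: Optional[str]           # alternative way to invoke the action
    may_select: FrozenSet[str]        # fields that may be selected beforehand
    must_select: FrozenSet[str]       # fields that must be selected beforehand
    body: Callable[[Set[str]], str]   # body of the corresponding d-operation

    def enabled(self, selected: Set[str]) -> bool:
        """The selection must contain all required and only permitted fields."""
        return self.must_select <= selected <= (self.may_select | self.must_select)

    def invoke(self, selected: Set[str]) -> str:
        if not self.enabled(selected):
            raise ValueError(f"selection not admissible for action {self.name!r}")
        return self.body(selected)


cancel = Action("Cancel", "Esc", frozenset(), frozenset(), lambda s: "cancelled")
new_insurant = Action(
    "New Insurant", None, frozenset({"period-1", "period-2"}), frozenset(),
    lambda s: "open search dialogue box" if not s else "switch to selected insurant",
)
```

The second sample action reflects the behaviour described for 'New Insurant': with no field selected, a dialogue box for searching is opened; otherwise the dialogue switches to the selected insurant.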

Example 10.8. Let us explain some actions associated with the d-object in Figure 10.1:<br />
- 'History' shows earlier states of the course of the insurance.<br />
- 'System' and 'Options' are pull-down menus (omitted in the example). E.g., 'System' contains the following actions: New Insurant, Save, Cancel (Esc), Save and Quit (F3), Scroll Forward (Bild↓), Scroll Back (Bild↑), Desk (Strg + F4).<br />
- 'New Insurant' saves the data on the screen and shows the course of insurance of another insurant, which can be selected in the list of periods. If no insurant is selected, a dialogue box with entry fields for the search for a new insurant is activated.<br />
- 'Save' saves the changes of the data on the screen and shows the same dialogue object again.<br />
- 'Cancel' deletes the dialogue object and returns to the one which was active before or, if there is none, to the desk. Changes made to the data are discarded.<br />
- 'Windows' is a pull-down menu containing the list of all existing dialogue objects. It is offered by the UIMS and not described here.<br />

□

10.5 Development Methods<br />

In the previous sections we presented an integrated data- and dialogue-model for conceptual modelling, but we did not investigate how to use this model in practice. Due to space limitations, the following presentation of development methods will be rather sketchy. We indicate the power of the chosen model on the basis of two scenarios. The first one captures the case of a new system to be designed, the second one handles the case of changing or extending a working application. In both cases, we concentrate on the conceptual level.<br />

10.5.1 Designing a New Application<br />

In designing a new application we have to define types, classes and d-classes from scratch. We assume that purely presentational aspects are captured by the use of a UIMS. The method we propose assumes an almost monotonic growth of application knowledge by means of interviews with the intended users and analysis of their working processes to be supported. The first goal will be to gather the basic activities of the users and to outline the corresponding dialogues. At this level, representational aspects naturally come into play by means of restrictions on screens, facilities of the UIMS and basic hard- and software.<br />

From a more abstract point of view this means to start with dialogue objects and to abstract to dialogue classes without knowing the underlying datamodel schema. E.g., we could decide to have a presentation of dialogue objects and actions (grouped into menus) as in Figure 10.1. The simplest way to obtain a first conceptual data schema is to take the view definition as trivial, i.e., the content type of the view coincides with the representation type of a class. Note that references only occur if we decided to split the data in the presentation among several dialogue objects.<br />

As to the methods, we may either postpone them for later specification or define them on the basis of this first schema. The first alternative is generally recommended as long as the database schema is not stable. Thus, the first development step results in a schema with certain undefined types, classes and references and with redundancies concerning the data schema.<br />

Usually there is not only one such dialogue class; otherwise we are done. Hence there will be several dependencies among the classes of the schema. The second step will be to make these dependencies explicit by the definition of constraints. Then the third step is to refine the schema in order to shift as many of these dependencies as possible into structures. Refinement rules for this purpose have been extensively discussed in [8, 9, 11]. These comprise<br />
- the splitting of classes, thereby introducing references or IsA-relations,<br />
- the introduction of new classes by specialization, omitting the old class or introducing an IsA-relation,<br />
- the extension of the schema by new types, classes, d-classes etc. and<br />
- the completion of the schema, adding definitions to undefined components.<br />

All these refinement steps can be reversed. Applying one of them requires consequent changes to the constraints and the methods defined so far. Furthermore, each refinement guarantees that classes of the old schema become views on the new schema, which in turn shows how to achieve complete d-classes.<br />
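The claim that a class of the old schema becomes a view on the refined schema can be illustrated for the splitting rule. The Python sketch below is a toy model: classes are lists of dictionaries, the functions `split_class` and `old_class_view` are hypothetical, and the split moves redundant attributes into a new class that is referenced through a key attribute.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_class(old_rows: List[Row], key_attr: str,
                moved_attrs: List[str]) -> Tuple[List[Row], Dict[object, Row]]:
    """Split moved_attrs off into a new class referenced via key_attr."""
    referenced: Dict[object, Row] = {}
    remaining: List[Row] = []
    for row in old_rows:
        referenced[row[key_attr]] = {a: row[a] for a in moved_attrs}
        remaining.append({a: v for a, v in row.items() if a not in moved_attrs})
    return remaining, referenced


def old_class_view(remaining: List[Row], referenced: Dict[object, Row],
                   key_attr: str) -> List[Row]:
    """Recover the old class as a view by following the introduced reference."""
    return [{**row, **referenced[row[key_attr]]} for row in remaining]


# A trivial first schema stores the insurant's name redundantly with every period.
old = [
    {"period": "self 01.04.1979", "isn": 1133557, "name": "Neumann, Luise"},
    {"period": "fam 10.11.1976", "isn": 1133557, "name": "Neumann, Luise"},
]
remaining, referenced = split_class(old, "isn", ["name"])
```

After the split the name is stored once, and `old_class_view(remaining, referenced, "isn")` reproduces the old rows, which also shows why the step can be reversed.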

10.5.2 Changing an Existing Application<br />

When there already exists a running application and we want to change or extend it, the method of proceeding is quite similar. We analyse the new processes, detect the dialogue objects and add d-classes to the schema. In addition, we let the underlying views be trivial, thereby introducing redundancies on the data layer. Thus, the first schema update is simply additive. In the following steps redundancies have to be made explicit using constraints and these are shifted into structure definitions. As a result we again obtain the required view definition, which can finally be flattened. We omit further details and refer to [8] for an extensive discussion of an application example.<br />

10.6 Conclusion<br />

In this paper we presented an object oriented model which integrates a datamodel and a dialogue model by means of views. Objects are used as units of data in the database with describing values, references to other objects and operations. They are managed by an object oriented database management system.<br />

In the same way d-objects define the basic units of dialogues. A d-object abstracts from presentational issues at the user interface and hence provides a description of data and actions presented in dialogues. The data in the database and the dialogues are related by means of views. This allows d-objects to be managed in the same manner as objects. Only screen presentations are left to a supporting user interface management system.<br />

The conceptual building blocks for objects and d-objects then follow the same principles. We use classes and d-classes to describe the abstract structural and behavioural aspects of both. Then a view on the database describes the possible contents of d-objects. Selection and creation correspond to uniqueness constraints that were introduced in connection with value-representability. In contrast to previous work the paper emphasizes just these relationships between the datamodel and the dialogue model.<br />

References for Chapter 10<br />

1. H. Balzert. Der JANUS-Dialogexperte: Vom Fachkonzept zur Dialogstruktur. Softwaretechnik-Trends, 13(3), August 1993.<br />
2. C. Beeri. A formal approach to object-oriented databases. In Data and Knowledge Engineering, Vol. 5, 353-382. North Holland, 1990.<br />
3. P. Coad and E. Yourdon. Object-oriented analysis. Prentice Hall, Englewood Cliffs, N.J., 1991.<br />
4. IBM (International Business Machines Corp.). Systems Application Architecture Common User Access / Advanced Interface Design Guide, 1991. Nr. SC34-4290.<br />
5. C. Janssen, A. Weisbecker, and J. Ziegler. Generating user interfaces from data models and dialogue net specifications. In Human Factors in Computing Systems (INTERCHI), 418-423, Amsterdam, 1993. ACM.<br />
6. J. C. Mitchell. Type systems for programming languages. In J. van Leeuwen, editor, The Handbook of Theoretical Computer Science, Vol. B, 365-458. Elsevier, 1990.<br />
7. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall, Englewood Cliffs, New Jersey, 1991.<br />
8. B. Schewe. Kooperative Softwareentwicklung: Ein objektorientierter Ansatz. Deutscher Universitätsverlag, Leverkusen, 1996.<br />
9. B. Schewe, K.-D. Schewe, and B. Thalheim. Objektorientierter Datenbankentwurf in der Entwicklung datenintensiver Informationssysteme. Informatik - Forschung und Entwicklung, 10(3), 1995, 115-127.<br />
10. K.-D. Schewe and B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica, Szeged, 11(1/2), 1993, 49-84.<br />
11. K.-D. Schewe and B. Thalheim. Principles of object oriented database design. In H. Jaakkola, H. Kangassalo, T. Kitahashi, and A. Markus, editors, Information Modelling and Knowledge Bases V, 227-242. IOS Press, Amsterdam, 1994.<br />
12. B. Schewe and K.-D. Schewe. A user-centered method for the development of data-intensive dialogue systems: an object oriented approach. In E. D. Falkenberg, W. Hesse, A. Olivé, editors, Information System Concepts, 88-103. Chapman & Hall, 1995.<br />
13. B. Thalheim. Foundations of entity-relationship modeling. Annals of Mathematics and Artificial Intelligence, 7, 1993, 197-256.<br />
