Readings in Fundamentals of Object Oriented Databases
- Selected Papers -
Klaus-Dieter Schewe (1), Bernhard Thalheim (2)
(1) Technische Universität Clausthal
Institut für Informatik, Erzstr. 1
38678 Clausthal-Zellerfeld
(2) Technische Universität Cottbus
Institut für Informatik, Karl-Marx-Str. 17
03044 Cottbus
Table of Contents
1 Fundamental Concepts of Object Oriented Databases
2 Identification as a Primitive of Database Models
3 Fundamentals of Object Oriented Database Modelling
4 Higher-Level Genericity in Object Oriented Databases
5 Towards a Theory of Consistency Enforcement
6 Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement
7 Limits of Rule Triggering Systems for Integrity Maintenance in the Context of Transaction Specifications
8 Consistency Enforcement in Entity-Relationship and Object-Oriented Models
9 Principles of Object Oriented Database Design
10 View-Centered Conceptual Modelling
Preface
This report contains a collection of selected papers on "Fundamentals of Object Oriented
Databases". This work started with a small working group meeting monthly at Hamburg
University. Original participants were Ingrid Wetzel, Bernhard Thalheim and Klaus-Dieter
Schewe. The primary intention was to bring together conceptual modelling and formal
specification approaches to database design and to set up solid mathematical foundations. In
these meetings we detected rather early that the major points to focus on were identification,
genericity and consistency.
After first tentative papers addressing these problems (unpublished or documented in
technical reports, e.g. [2, 4, 19]), the first paper naming exactly these problem areas in
its title was presented at ICDT '92 in Berlin [3]. In parallel, the GCS approach to consistency
enforcement was set up [9]. In collaboration with David Stemple we even discovered linguistic
reflection as the fundamental implementation issue for these tasks [20]. Chapter 1 contains
a reprint of a polished journal version of this work published in Acta Cybernetica [6], also
summarized in [5]. Chapter 2 contains a deeper investigation of the identification problem by
Catriel Beeri and Bernhard Thalheim [1]. Chapter 3 contains a follow-on paper [7], in which
the original idea from formal specifications to consider arbitrary underlying type systems was
taken up and the connection to higher-order intuitionistic logic was established. Chapter 4 is a
reprint of the paper presented at the 1994 Indian conference on "Management of Data" [20].
Subsequently the GCS approach was developed carefully, which after some preliminary
work [11, 10] led to the fundamental journal article [8] in Acta Informatica and a
follow-on article [14]. These are reprinted in Chapter 5 and Chapter 6.
From the beginning there was a struggle with the "Active Database" community. Researchers
believing in the unlimited power of the rule-based approach were not very enthusiastic
about our fundamental approach to consistency enforcement, which relies on specification
language semantics. In particular, our emphasis on the need for a formal definition of the
goal of consistency enforcement in databases that encompasses just termination, confluence
and consistency was (and still is) not generally accepted. Therefore, we started a side activity
on the limits of rule-based systems [12, 13, 15, 16, 17, 18] in the context of transaction
specifications. Chapter 7 contains a reprint of the polished article [13] published in Acta
Cybernetica, which contains the major theoretical issues. The Data & Knowledge Engineering
article [18], reprinted in Chapter 8, contains an application of part of that theory to simple
object oriented schemata.
Finally, we worked on the design of object oriented databases. The tie-in with formal
specifications initiated a refinement-based approach in [21], reprinted in Chapter 9. In the
meantime Bettina Schewe brought up the idea of an integrated user interface design through
the use of dialogue objects. This led to the work in [22, 23] and additional articles, reports
and books written in German. We reprint [23] in Chapter 10.
Acknowledgement
We would like to thank our coauthors and all others who stimulated ideas presented in our
articles.
References
1. C. Beeri, B. Thalheim. Identification as a Primitive of Database Models. In T. Polle, T. Ripke,
K.-D. Schewe (Eds.). Fundamentals of Information Systems. Kluwer 1998.
2. K.-D. Schewe, B. Thalheim, I. Wetzel, J.W. Schmidt. Extensible, safe object oriented design of
database applications. Preprint CS-09-91. Rostock University. 1991.
3. K.-D. Schewe, J.W. Schmidt, I. Wetzel. Identification, genericity and consistency in object oriented
databases. In J. Biskup, R. Hull (Eds.). Proc. Int. Conf. on Database Theory - ICDT '92. Springer
LNCS 646. Berlin 1992.
4. K.-D. Schewe, B. Thalheim, I. Wetzel. Foundations of object oriented database concepts. Technical
Report FBI-HH-B-157/92. Hamburg University. 1992.
5. K.-D. Schewe, B. Thalheim. Towards a formal foundations of object oriented databases. SIGMOD
workshop Combining declarative and object oriented databases. Washington 1993.
6. K.-D. Schewe, B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica,
vol. 11 (4), 49-84. Szeged 1993.
7. K.-D. Schewe. Fundamentals of object oriented database modelling. Intelligent Systems. Moscow
1997.
8. K.-D. Schewe, B. Thalheim. Towards a theory of consistency enforcement. Acta Informatica (to
appear).
9. K.-D. Schewe, B. Thalheim, J.W. Schmidt, I. Wetzel. Enforcing integrity in object oriented
databases. In U. Lipeck, B. Thalheim (Eds.). Modelling Database Dynamics. Springer Workshops
in Computer Science. London 1993.
10. K.-D. Schewe, B. Thalheim, I. Wetzel. Integrity preserving updates in object oriented databases. In
M. Orlowska, M. Papazoglou (Eds.). Advances in Databases - ADC '93. World Scientific. Singapore
1993.
11. K.-D. Schewe, B. Thalheim. Computing Consistent Transactions. Preprint CS-08-92. Rostock
University 1992.
12. K.-D. Schewe, B. Thalheim. Achieving Consistency in Active Databases. In S. Chakravarty,
J. Widom (Eds.). Research Issues in Data Engineering - Active Database Systems. Houston 1994.
13. K.-D. Schewe, B. Thalheim. Limits of Rule Triggering Systems for Integrity Maintenance in the
Context of Transaction Specifications. Acta Cybernetica (to appear).
14. K.-D. Schewe. Tailoring Consistent Specializations as a Natural Approach to Consistency
Enforcement. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Dagstuhl 1996.
15. K.-D. Schewe, B. Thalheim. Active Consistency Enforcement for Repairable Database Transitions.
In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.). Integrity in Databases. Dagstuhl 1996.
16. K.-D. Schewe, B. Thalheim. On the strength of rule triggering systems for integrity maintenance.
In C. McDonald (Ed.). Database Systems '98. Australian Computer Science Communications, vol.
20 (2), 77-88. Springer 1998.
17. K.-D. Schewe. Well-behaving rule systems for Entity-Relationship and object oriented models. In
D. Embley, R. Goldstein (Eds.). Conceptual Modeling - ER '97. Springer LNCS 1331, 141-154.
New York 1997.
18. K.-D. Schewe. Consistency Enforcement in Entity-Relationship and Object Oriented Models. Data
& Knowledge Engineering 1998 (to appear).
19. K.-D. Schewe, J.W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel. A reflective approach to method
generation in object oriented databases. Rostocker Informatik Berichte, vol. 14. Rostock University.
1992.
20. K.-D. Schewe, D. Stemple, B. Thalheim. Higher level genericity in object oriented databases. Proc.
Int. Conf. Management of Data. Bangalore 1994.
21. K.-D. Schewe, B. Thalheim. Principles of object oriented database design. In H. Jaakkola et al.
(Eds.). Information Modelling and Knowledge Bases V, 227-242. IOS Press. Amsterdam 1994.
22. B. Schewe, K.-D. Schewe. A user-centered method for the development of data-intensive dialogue
systems. In E. Falkenberg, W. Hesse (Eds.). Information System Concepts, 88-103. Chapman &
Hall 1995.
23. K.-D. Schewe, B. Schewe. View-centered conceptual modelling - an object oriented approach. In
B. Thalheim (Ed.). Conceptual Modeling - ER '96. Springer LNCS. Berlin 1996.
Chapter 1
Fundamental Concepts of Object Oriented Databases
Contents
1.1 Introduction
1.2 A Motivating Example
1.3 A Core Object Oriented Datamodel
  1.3.1 A Simple Type System
  1.3.2 The Class Concept as a Structural Primitive
  1.3.3 User Defined Integrity Constraints
  1.3.4 Methods as a Basis for Behaviour Modelling
  1.3.5 Queries and Views
1.4 The Object Identification Problem
  1.4.1 The Notion of Value-Representability
  1.4.2 Value-Representability in the Case of Acyclic Reference Graphs
  1.4.3 Computation of Value Representation Types
  1.4.4 The Finiteness Property
  1.4.5 Weak Value-Representability
1.5 The Genericity Problem
  1.5.1 Generic Update Methods
  1.5.2 Generic Updates in the Case of Value-Representability
1.6 The Consistency Problem
  1.6.1 Greatest Consistent Specializations
  1.6.2 Enforcing Integrity in the OODM
1.7 Conclusion
This chapter contains a reprint of
K.-D. Schewe, B. Thalheim. Fundamental Concepts of Object Oriented Databases.
Acta Cybernetica, vol. 11, no. 1-2, 49-84. Szeged 1993.
Abstract. It is claimed that object oriented databases (OODBs) overcome many of the
limitations of the relational model. However, the formal foundation of OODB concepts is
still an open problem. Even worse, for relational databases a commonly accepted datamodel
existed very early on, whereas for OODBs the unification of concepts is missing. The work
reported in this paper contains the results of our first investigations on a formally founded
object oriented datamodel (OODM) and is intended to contribute to the development of a
uniform mathematical theory of OODBs.
A clear distinction between objects and values turns out to be essential in the OODM.
Types and classes are used to structure values and objects respectively. Then the problem
of unique object identification occurs. We show that this problem can be solved
for classes with extents that are completely representable by values. Such classes are called
value-representable.
Another advantage of the relational approach is the existence of structurally determined
generic update operations. We show that this property can be carried over to object-oriented
datamodels if classes are value-representable. Moreover, in this case database consistency
with respect to implicitly specified referential and inclusion constraints will be automatically
preserved.
This result can be generalized with respect to distinguished classes of explicitly stated
static constraints. Given some arbitrary method and some integrity constraint, there exists a
greatest consistent specialization (GCS) that behaves nicely in that it is compatible with the
conjunction of constraints. We present an algorithm for the GCS construction of user-defined
methods and describe the GCSs of generic update operations that are required herein.
1.1 Introduction<br />
The shortcomings of the relational database approach encouraged much research aimed at
achieving more appropriate data models. It has been claimed that the object-oriented approach
will be the key technology for future database systems and languages [8]. Several systems
[4, 6, 7, 9, 15, 16, 17, 19, 26, 36, 37, 38] arose from these efforts. However, in contrast to
research in the relational area there is no common formal agreement on what constitutes an
object-oriented database [10, 11, 13].
The basic question "What is an object?" seems to be trivial, but already here the variety
of answers is large. In object oriented programming the notion of an object was intended as
a generalization of the abstract data type concept with the additional feature of inheritance.
In this sense object orientation involves the isolation of data in semi-independent modules in
order to promote high software development productivity. The development of object oriented
databases also regarded an object as a basic unit of persistent data, a view that is heavily
influenced by existing semantic datamodels (SDMs) [2, 29, 31, 39, 40, 60]. Thus, object oriented
databases are composed of independent objects but must also provide for the maintenance of
inter-object consistency, a demand that is to some degree in dissonance with the basic style
of object orientation.
A view that is common in OODB research is that objects are abstractions of real world
entities and should have an identity [8]. This leads to a distinction between values and objects
[10, 11]. A value is identified by itself, whereas an object has an identity independent of its
value. This object identity is usually encoded by object identifiers [1, 3, 34]. Abstracting from
the pure physical level, the identifier of an object can be regarded as being immutable during
the object's lifetime. Identifiers ease the sharing and update of data. However, such abstract
identifiers do not relieve us of the task of providing unique identification mechanisms for
objects. In object oriented programming object names are sufficient, but retrieving mass data
by name is senseless.
In most approaches to OODBs an object is coupled with a value of some fixed structure.
From our point of view this already contradicts the goal of objects being abstractions of reality.
In real situations an object has several, and also changing, aspects that should be captured by
the object model. Therefore, in our object model each object o consists of a unique identifier
id, a set of (type, value) pairs (T_i, v_i), a set of (reference, object) pairs (ref_j, o_j) and a set
of methods meth_k.
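The object structure just described can be illustrated with a minimal Python sketch. All names here (`DBObject`, `oid`, `values`, `refs`, `methods`) are assumptions made for illustration and are not notation from the OODM. As a simplification, the sets of pairs are modelled as dictionaries, so each type or reference name occurs at most once per object.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass(eq=False)  # eq=False: objects are compared by identity, not by value
class DBObject:
    oid: int                                                   # abstract identifier
    values: Dict[str, Any] = field(default_factory=dict)       # type name -> value
    refs: Dict[str, "DBObject"] = field(default_factory=dict)  # reference name -> object
    methods: Dict[str, Callable[..., Any]] = field(default_factory=dict)


# One object carrying two aspects at the same time (hypothetical data).
anna = DBObject(oid=1)
anna.values["PERSON"] = {"PersonIdentityNo": 4711}
anna.values["STUDENT"] = {"PersonIdentityNo": 4711, "StudNo": 42}

# A second object with exactly the same values is still a different object.
twin = DBObject(oid=2, values=dict(anna.values))
print(anna.values == twin.values, anna == twin)
```

The point of the sketch is only that equal values do not imply equal objects, which is why a separate identification mechanism is needed.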
Types are used to structure values. Classes serve as a structuring primitive for objects
having the same structure and behaviour. It is obvious that the multiple-aspects view of an
object allows it to be a member of more than one class simultaneously and to change class
memberships. This setting also makes every discussion on "object migration" unnecessary, as
migration is only a specific form of value change.
In our model a class structure uniformly combines aspects of object values and references.
The extent of classes varies over time, whereas types are immutable. Relationships between
classes are represented by references together with referential constraints on the object
identifiers involved. Moreover, each class is accompanied by a collection of methods. A schema is
given by a collection of class definitions together with explicit integrity constraints.
The Identification Problem. One important concept of object-oriented databases is object
identity. Following [1, 12] the immutable identity of an object can be encoded by the concept
of abstract object identifiers. The advantages of this approach are that sharing, mutability
of values and cyclic structures can be represented easily [42]. On the other hand, object
identifiers do not have a meaning for the user and should therefore be hidden.
We study whether equality of identifiers can be derived from the equality of values. In the
literature the notion of "deep" equality has been introduced for objects with equal values and
references to objects that are also "deeply" equal. This recursive definition becomes interesting
in the case of cyclic references.
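Why cyclic references make the recursive definition interesting can be seen in a small Python sketch. It is one possible reading of deep equality and not a definition taken from the paper. A pair of objects that is already under comparison is assumed to be equal when it is met again, which makes the recursion terminate on cycles. The names `Obj` and `deep_equal` are hypothetical.

```python
class Obj:
    def __init__(self, oid, value):
        self.oid = oid      # abstract identifier
        self.value = value  # the object's value
        self.refs = {}      # reference name -> Obj


def deep_equal(a, b, assumed=None):
    """Equal values, and equally named references to deeply equal objects."""
    if assumed is None:
        assumed = set()
    if a is b or (a.oid, b.oid) in assumed:
        return True  # identical, or already being compared further up the call chain
    if a.value != b.value or a.refs.keys() != b.refs.keys():
        return False
    assumed.add((a.oid, b.oid))
    return all(deep_equal(a.refs[r], b.refs[r], assumed) for r in a.refs)


# Two objects that reference each other: a cycle of length two.
p1, p2 = Obj(1, "x"), Obj(2, "x")
p1.refs["partner"] = p2
p2.refs["partner"] = p1

# A third object with the same value but a differently valued partner.
q = Obj(3, "y")
p3 = Obj(4, "x")
p3.refs["partner"] = q

print(deep_equal(p1, p2), deep_equal(p1, p3))
```

In this reading `p1` and `p2` are deeply equal although they are distinct objects, so deep equality alone cannot identify either of them.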
Therefore, we introduce uniqueness constraints, which express equality on identifiers as a
consequence of the equality of some values or references. On this basis we can address the
problem of how to characterize those classes that are completely representable (and hence also
identifiable) by values.
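A uniqueness constraint of this kind can be checked over a class extent as in the following Python sketch. The representation is an assumption for illustration only: an extent is a dictionary from identifiers to values, and `key` is a hypothetical function extracting the values that are supposed to determine the identifier.

```python
def satisfies_uniqueness(extent, key):
    """True if equal key values always belong to the same identifier."""
    seen = {}  # key value -> identifier that first carried it
    for oid, value in extent.items():
        if seen.setdefault(key(value), oid) != oid:
            return False  # two different identifiers share one key value
    return True


# Hypothetical extent of a class of persons.
persons = {
    1: {"PersonIdentityNo": 10, "Name": "A"},
    2: {"PersonIdentityNo": 11, "Name": "B"},
}
ok_before = satisfies_uniqueness(persons, lambda v: v["PersonIdentityNo"])

persons[3] = {"PersonIdentityNo": 10, "Name": "C"}
ok_after = satisfies_uniqueness(persons, lambda v: v["PersonIdentityNo"])
print(ok_before, ok_after)
```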
Generic Update Operations. The success of the relational data model is certainly due to the
existence of simple query and update languages. Preserving the advantages of the relational
approach in OODBs is a serious goal.
The generic querying of objects has been approached in [1, 12]. Querying is per se
a set-oriented operation, i.e. it is not necessary to select just one single object, and hence
does not raise any specific problems with object identifiers. Things change completely in the
case of updates. If an object with a given value is to be updated (or deleted), this is only defined
unambiguously if there does not exist another object with the same value. If more than one
object exists with the same value, or more generally with the same value and the same references
to other objects, then the user has to decide whether an update or delete operation is
applied to all these objects, to only one of these objects selected non-deterministically, or to
none of them, i.e. the operation is rejected. However, it is not possible to specify a priori such
an operation that works in the same way for all objects in all situations. The same applies
to insert operations. Hence the problem arises in which cases operations for the insertion,
deletion and update of objects can be defined generically.
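The ambiguity can be made concrete with a short Python sketch. It assumes, purely for illustration, that an extent is a dictionary from identifiers to values, and it arbitrarily picks the "reject" option when the value does not single out exactly one object. The function name `update_by_value` is hypothetical.

```python
def update_by_value(extent, old_value, new_value):
    """Update the single object whose value equals old_value, else reject."""
    matches = [oid for oid, value in extent.items() if value == old_value]
    if len(matches) != 1:
        # Zero or several candidates: the value does not identify an object.
        raise LookupError(f"{len(matches)} objects carry the value {old_value!r}")
    extent[matches[0]] = new_value
    return matches[0]


extent = {1: "a", 2: "b", 3: "b"}
updated = update_by_value(extent, "a", "c")  # unambiguous: only object 1 matches

try:
    update_by_value(extent, "b", "d")        # ambiguous: objects 2 and 3 match
    rejected = False
except LookupError:
    rejected = True
print(updated, rejected, extent)
```

Choosing "all" or "one of them" instead of "reject" would be equally defensible here, which is exactly why no single choice works generically.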
Some authors [43] have chosen to abandon generic operations. Others [6, 7, 9]
use identifying values to represent object identity, and thus embody a strict concept of surrogate
keys to avoid the problem. Our approach differs from both solutions in that we use the
concept of hidden abstract identifiers, but at the same time formally characterize those classes
for which unique generic operations for the insertion, deletion and update of single objects
can be derived automatically. It turns out that these are exactly the value-representable ones.
The Consistency Problem. One of the primary benefits that database systems offer is automatic
enforcement of database integrity. One type of integrity is maintained through automatic
concurrency control and recovery mechanisms; another is the automatic enforcement
of user-specified integrity constraints. Most commercial database systems, especially
relational database management systems, enforce only a bare minimum of constraints, largely
because of the performance overhead associated with updates.
The maintenance problem is the problem of how to ensure that the database satisfies its
constraints after certain actions. There are at present two approaches to this maintenance
problem. The first, more classical one is the modification of methods in accordance with the
specified integrity constraints. The second approach uses generation mechanisms for the specified
events. Upon occurrence of certain database events, such as update operations, the management
component is activated for integrity maintenance. The first research direction did not succeed
because of some limitations within the approach. The second is at present one of the most
active database research areas. One of our objectives is to show that the first approach can
be extended to object-oriented databases using stronger mathematical fundamentals.
Accuracy is an obviously important and desirable feature of any database. To this end,
integrity constraints, i.e. conditions that data must satisfy before a database is updated, are
commonly employed as a means of helping to maintain consistency. In relational databases
the specification and enforcement of integrity constraints has a long tradition [61], whereas
in OODBs the integrity problem has only recently drawn attention [48].
In object oriented databases, integrity maintenance can be based on two different approaches.
The first one uses blind update operations. In this case, any update is allowed and
the system organizes the maintenance. The second approach is based on method rewriting.
This approach is more effective: assuming a consistent database state, the modified method
cannot lead to an inconsistent state.
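The idea of method rewriting can be illustrated by a deliberately crude Python sketch. It wraps a method so that the old state is kept whenever the result would violate a constraint. This is only one simple consistent variant of a method and is not claimed to be the construction used in the paper. All names (`enforce`, `insert_person`, `identity_numbers_unique`) and the data are hypothetical.

```python
def enforce(method, constraint):
    """Return a variant of `method` that never leaves a consistent state."""
    def guarded(state, *args):
        candidate = method(state, *args)
        # Keep the old state when the constraint would be violated.
        return candidate if constraint(candidate) else state
    return guarded


def insert_person(state, identity_no, name):
    # States are immutable tuples of (identity_no, name) pairs.
    return state + ((identity_no, name),)


def identity_numbers_unique(state):
    numbers = [identity_no for identity_no, _ in state]
    return len(numbers) == len(set(numbers))


safe_insert = enforce(insert_person, identity_numbers_unique)
db = safe_insert((), 1, "Ada")
db = safe_insert(db, 1, "Bob")   # would duplicate identity number 1: no effect
db = safe_insert(db, 2, "Cleo")
print(db)
```

Starting from a consistent state, every call of `safe_insert` yields a consistent state again, which is the property the rewritten method is meant to guarantee.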
In relational databases distinguished classes of static integrity constraints have been
discussed, such as inclusion, exclusion, functional, key and multi-valued dependencies. All these
constraints can be generalized to the object oriented case. Then the result on the existence
of integrity preserving methods can be generalized to capture these constraints as well. We shall
also describe the resulting methods.
The Organization of the Paper. We start with a motivating example in Section 1.2, then
introduce in Section 1.3 a core OODM to formalize the concepts used intuitively in the
example. In Section 1.4 the notions of (weak) value-representability are introduced in order
to handle the identification problem. The genericity problem is approached in Section
1.5. We show the relationship between value-representability and the unique existence of
generic update operations. The consistency problem is dealt with in Section 1.6. We outline an
operational approach based on the computation of greatest consistent specializations (GCSs).
Since the algorithm used allows the problem to be reduced to basic update operations, we
describe the GCSs of these. We summarize our results and describe some open problems in
Section 1.7.
1.2 A Motivating Example
In this section we give a completely informal introduction to the OODM on the basis
of a simple university example. We first introduce types and classes, then show an example
of a database instance, i.e. the content of the database at a given point in time. The
representation of an instance requires object identifiers. Then we extend the example by introducing
user-defined constraints. We shall see that this enables alternative representations without
using identifiers, and hence leads to the notion of value-representability. Finally, we indicate the
definition of methods as a means to model database dynamics. For the sake of simplicity we
only describe a generic update method that can be generated by the system.
As already said in the introduction, we distinguish between values and objects. The<br />
main difference is that values identify themselves, whereas objects require an additional<br />
external identification mechanism. Types are used to structure values. Thus, let us first give<br />
some examples of types.<br />
Example. Basically, every type can be built from a few predefined basic types such as<br />
BOOL, NAT, STRING, etc. and also predefined type constructors for records, finite sets,<br />
lists, unions, etc.<br />
The type definition for PERSONNAME uses both a set constructor {·} and a (tagged)<br />
record constructor (·):<br />
Type PERSONNAME<br />
= ( FirstName : STRING ,<br />
SecondName : STRING ,<br />
Titles : { STRING } )<br />
End PERSONNAME<br />
The definition of a type PERSON uses the type PERSONNAME.<br />
Type PERSON<br />
= ( PersonIdentityNo : NAT ,<br />
Name : PERSONNAME )<br />
End PERSON<br />
The following defines STUDENT as a subtype of PERSON, i.e. we can naturally project<br />
each value of type STUDENT onto a value of type PERSON.<br />
Type STUDENT<br />
= ( PersonIdentityNo : NAT ,<br />
StudNo : NAT ,<br />
Name : PERSONNAME )<br />
End STUDENT<br />
Besides these definitions of types as sets of values we may also define new type constructors<br />
as follows, where α is a parameter for this new constructor:<br />
Type MPERSON (α)<br />
= ( PersonIdentityNo : NAT ,<br />
Spouse : α )<br />
End MPERSON<br />
□<br />
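As a rough illustration, the type definitions above can be mirrored by immutable Python dataclasses. This is only a sketch: the Python names (PersonName, Person, Student, to_person) are assumptions introduced here and are not part of the OODM. The function to_person plays the role of the natural projection from STUDENT onto PERSON.<br />

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class PersonName:
    """Counterpart of PERSONNAME (hypothetical Python rendering)."""
    first_name: str
    second_name: str
    titles: FrozenSet[str]  # set constructor applied to STRING


@dataclass(frozen=True)
class Person:
    """Counterpart of PERSON."""
    person_identity_no: int
    name: PersonName


@dataclass(frozen=True)
class Student:
    """Counterpart of STUDENT: PERSON extended by a StudNo component."""
    person_identity_no: int
    stud_no: int
    name: PersonName


def to_person(s: Student) -> Person:
    """Natural projection STUDENT -> PERSON: drop the StudNo component."""
    return Person(s.person_identity_no, s.name)


laura = Student(567, 2134, PersonName("Laura", "James", frozenset()))
print(to_person(laura))
```

Two values with equal components compare equal in this encoding, which reflects that values identify themselves.<br />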
Next we use these types to build the structural part of an OODM schema. We define a schema<br />
as a collection of classes and a class as a variable collection of objects.<br />
Example. Each object in a class has a structure, which combines aspects of values associated<br />
with the object and references to other objects. This structure can be based on a type<br />
definition as above or itself involve a (nameless) type definition. Moreover, class definitions<br />
involve IsA relations in order to model objects in more than one class. We use ∘ to indicate<br />
concatenation for record types.<br />
Schema University<br />
Class PersonC<br />
Structure PERSON<br />
End PersonC<br />
Class MarriedPersonC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT ,<br />
Spouse : MarriedPersonC )<br />
End MarriedPersonC<br />
Class StudentC<br />
IsA PersonC<br />
Structure STUDENT ∘<br />
( Supervisor : ProfessorC ,<br />
Major : DepartmentC ,<br />
Minor : DepartmentC )<br />
End StudentC<br />
Class ProfessorC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT ,<br />
Age : NAT ,<br />
Salary : NAT ,<br />
Faculty : DepartmentC )<br />
End ProfessorC<br />
Class DepartmentC<br />
Structure ( DeptName : STRING )<br />
End DepartmentC<br />
□<br />
In principle, we are now able to describe the content of the database at a given point in time. For<br />
such database instances we need a type ID of object identifiers that is used for two purposes:<br />
first as a unique and efficient internal identification mechanism for objects, and second for<br />
modelling objects in different classes and references to other objects. In this case each class<br />
will be associated with a representation type that can be used directly for storing objects.<br />
Example.<br />
We use D as a name for the instance.<br />
D(PersonC) = { ( i_1 , ( 123 , ( "John" , "Denver" , { "Professor" , "Dr" } ))),<br />
( i_2 , ( 124 , ( "Mary" , "Stuart" , { "Dr" } ))),<br />
( i_3 , ( 456 , ( "John" , "Stuart" , {} ))),<br />
( i_4 , ( 567 , ( "Laura" , "James" , {} ))),<br />
( i_5 , ( 987 , ( "Dave" , "Ford" , {} ))) }<br />
D(MarriedPersonC) = { ( i_1 , ( 123 , i_2 )),<br />
( i_2 , ( 124 , i_1 )) }<br />
D(ProfessorC) = { ( i_1 , ( 123 , 48 , 8000 , i_6 )) }<br />
D(StudentC) = { ( i_3 , ( 456 , 1023 , ( "John" , "Stuart" , {} ) , i_1 , i_6 , i_7 )),<br />
( i_4 , ( 567 , 2134 , ( "Laura" , "James" , {} ) , i_1 , i_6 , i_7 )) }<br />
D(DepartmentC) = { ( i_6 , ( "Computer Science" ) ) ,<br />
( i_7 , ( "Philosophy" ) ) ,<br />
( i_8 , ( "Music" ) ) }<br />
Note that the following three conditions are satisfied by the instance:<br />
– the object identifiers are unique within a class,<br />
– the IsA relations in the schema give rise to set inclusion relationships for the underlying<br />
sets of identifiers (inclusion integrity), and<br />
– the identifiers occurring within an object's value at a place corresponding to a reference<br />
always occur as an object identifier in the referenced class (referential integrity).<br />
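The three conditions can be checked mechanically. The following sketch encodes the example instance with identifiers as plain strings and each class as a set of (identifier, value) pairs; the encoding, the tables ISA and REFS, and all function names are assumptions made for illustration only.<br />

```python
# Example instance: class name -> set of (identifier, value) pairs.
D = {
    "PersonC": {
        ("i1", (123, ("John", "Denver", frozenset({"Professor", "Dr"})))),
        ("i2", (124, ("Mary", "Stuart", frozenset({"Dr"})))),
        ("i3", (456, ("John", "Stuart", frozenset()))),
        ("i4", (567, ("Laura", "James", frozenset()))),
        ("i5", (987, ("Dave", "Ford", frozenset()))),
    },
    "MarriedPersonC": {("i1", (123, "i2")), ("i2", (124, "i1"))},
    "ProfessorC": {("i1", (123, 48, 8000, "i6"))},
    "StudentC": {
        ("i3", (456, 1023, ("John", "Stuart", frozenset()), "i1", "i6", "i7")),
        ("i4", (567, 2134, ("Laura", "James", frozenset()), "i1", "i6", "i7")),
    },
    "DepartmentC": {
        ("i6", ("Computer Science",)),
        ("i7", ("Philosophy",)),
        ("i8", ("Music",)),
    },
}

# IsA relations as (subclass, superclass) pairs.
ISA = [("MarriedPersonC", "PersonC"), ("StudentC", "PersonC"),
       ("ProfessorC", "PersonC")]

# References as (class, position inside the value tuple, referenced class).
REFS = [
    ("MarriedPersonC", 1, "MarriedPersonC"),
    ("ProfessorC", 3, "DepartmentC"),
    ("StudentC", 3, "ProfessorC"),
    ("StudentC", 4, "DepartmentC"),
    ("StudentC", 5, "DepartmentC"),
]


def ids(db, c):
    """Set of identifiers occurring in class c."""
    return {i for i, _ in db[c]}


def unique_identifiers(db):
    """No identifier occurs with two different values in one class."""
    return all(len(ids(db, c)) == len(db[c]) for c in db)


def inclusion_integrity(db, isa):
    """Identifiers of a subclass also occur in its superclass."""
    return all(ids(db, sub) <= ids(db, sup) for sub, sup in isa)


def referential_integrity(db, refs):
    """Every referenced identifier exists in the referenced class."""
    return all(v[pos] in ids(db, target)
               for c, pos, target in refs for _, v in db[c])


print(unique_identifiers(D), inclusion_integrity(D, ISA),
      referential_integrity(D, REFS))
```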
We shall always refer to these conditions as model inherent constraints that must be satisfied<br />
by each instance. Other integrity constraints can be defined by the user and added to the<br />
schema in order to capture more application semantics as shown in the next example.<br />
Example. First let us express that there are no two persons with the same PersonIdentityNo,<br />
no two students with the same StudNo and no two departments with the same name. In<br />
order to formulate this, we use $x_P$, $x_S$ and $x_D$ to refer to the content of the classes PersonC,<br />
StudentC and DepartmentC, and let $c_P : \mathit{PERSON} \to (\mathit{PersonIdentityNo} : \mathit{NAT})$<br />
and $c_S : \mathit{STUDENT} \times \mathit{ID}^3 \to (\mathit{StudNo} : \mathit{NAT})$ be functions that arise from the natural<br />
projection to the components PersonIdentityNo and StudNo in PERSON and STUDENT<br />
respectively. This gives the following uniqueness constraints.<br />
\[ \forall i, j :: \mathit{ID} .\ \forall v, w :: \mathit{PERSON} .\ (i, v) \in x_P \wedge (j, w) \in x_P \wedge c_P(v) = c_P(w) \Rightarrow i = j \]<br />
\[ \forall i, j :: \mathit{ID} .\ \forall v, w :: \mathit{STUDENT} \times \mathit{ID}^3 .\ (i, v) \in x_S \wedge (j, w) \in x_S \wedge c_S(v) = c_S(w) \Rightarrow i = j \]<br />
\[ \forall i, j :: \mathit{ID} .\ \forall v, w :: (\mathit{DeptName} : \mathit{STRING}) .\ (i, v) \in x_D \wedge (j, w) \in x_D \wedge v = w \Rightarrow i = j \tag{1.1} \]<br />
Let us further assume that the salary of a professor is determined by his/her age. For this<br />
purpose, let $\mathit{Age}, \mathit{Salary} : T_{\mathit{Prof}} \to \mathit{NAT}$ be the natural projections to the Age- and<br />
Salary-values respectively. Then we have the following functional constraint on the class<br />
ProfessorC:<br />
\[ \forall i, j :: \mathit{ID} .\ \forall v, w :: T_{\mathit{Prof}} .\ (i, v) \in x_{\mathit{Prof}} \wedge (j, w) \in x_{\mathit{Prof}} \wedge \mathit{Age}(v) = \mathit{Age}(w) \Rightarrow \mathit{Salary}(v) = \mathit{Salary}(w) \tag{1.2} \]<br />
Next assume that we want to guarantee that the spouse of a person's spouse is the person<br />
itself, which gives (with the abbreviations understood) the formula<br />
\[ \forall i, j :: \mathit{ID} .\ \forall v, w :: T_{\mathit{MP}} .\ (i, v) \in x_{\mathit{MP}} \wedge (j, w) \in x_{\mathit{MP}} \wedge \mathit{Spouse}(v) = j \Rightarrow \mathit{Spouse}(w) = i \tag{1.3} \]<br />
Note that all these constraints are also satisfied by the instance above.<br />
□<br />
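Constraints (1.2) and (1.3) translate almost literally into executable checks. The sketch below evaluates them on the contents of ProfessorC and MarriedPersonC from the example instance; the tuple encoding of values and the helper names are assumptions of this illustration.<br />

```python
# Contents of ProfessorC and MarriedPersonC in the example instance,
# as sets of (identifier, value) pairs.
x_prof = {("i1", (123, 48, 8000, "i6"))}
x_mp = {("i1", (123, "i2")), ("i2", (124, "i1"))}


def age(v):
    return v[1]


def salary(v):
    return v[2]


def spouse(v):
    return v[1]


def salary_determined_by_age(x):
    """Constraint (1.2): equal Age implies equal Salary."""
    return all(salary(v) == salary(w)
               for _, v in x for _, w in x if age(v) == age(w))


def spouse_symmetric(x):
    """Constraint (1.3): Spouse(v) = j implies Spouse(w) = i."""
    return all(spouse(w) == i
               for i, v in x for j, w in x if spouse(v) == j)


print(salary_determined_by_age(x_prof), spouse_symmetric(x_mp))
```

Adding a second professor of age 48 with a different salary, or redirecting one Spouse reference to a third object, makes the respective check fail.<br />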
Now that we have added uniqueness constraints, the object identifiers used in instances correspond<br />
one-to-one to values of some types associated with the classes. These are the so-called value<br />
identification types $V_C$. Hence we could remove identifiers and represent the same information<br />
in a purely value-based fashion. In our example the value representation type for the class<br />
PersonC is simply PERSON, but for the class MarriedPersonC we need the recursive<br />
type<br />
\[ V_{\mathit{MP}} = \mathit{PERSON} \circ (\mathit{Spouse} : V_{\mathit{MP}}) \]<br />
with values that are rational trees [45, 47].<br />
So far only structural aspects (types, classes, constraints) have been considered. Let us<br />
now add methods to classes in order to model the dynamics of the database. In the OODM<br />
methods will be modelled in a simple procedural style.<br />
Example.<br />
Let us describe an insert-method for the class PersonC.<br />
insert_PersonC (in: P :: PERSON, out: I :: ID) =<br />
IF ∃ O ∈ PersonC . value(O) = P<br />
THEN I := ident(O)<br />
ELSE I := NewId ;<br />
PersonC := PersonC ∪ {( I, P )}<br />
ENDIF<br />
For an insertion into the class MarriedPersonC we need a more complex input type V<br />
recursively defined as<br />
V = PERSON × (V ∪ ID)<br />
For each P :: V let f(P) :: PERSON be the projection onto PERSON corresponding to<br />
the subtype relation between V and PERSON. Then we have<br />
insert_MarriedPersonC (in: P :: V , out: I :: ID) =<br />
I := insert_PersonC (f(P)) ;<br />
IF ∀ O ∈ MarriedPersonC . ident(O) ≠ I<br />
THEN P′ := substitute(I, P, Spouse(P)) ;<br />
IF P′ :: ID<br />
THEN J := P′<br />
ELSE J := insert_MarriedPersonC (P′)<br />
ENDIF ;<br />
MarriedPersonC := MarriedPersonC ∪ {(I, f(P) × (J))}<br />
ENDIF<br />
We used the global method NewId to denote the selection of a new identifier. The expression<br />
substitute(I, P, T) denotes the result of substituting the value I for P in the expression T. Later<br />
we shall use a more abstract syntax oriented toward guarded commands [20, 41, 46]. □<br />
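The behaviour of the insert-method for PersonC can be sketched in Python as follows. The class content is passed in and returned explicitly instead of being updated in place, and new_id is a hypothetical stand-in for NewId that only avoids identifiers already used in the given class; both choices are simplifying assumptions of this sketch, and the recursive method for MarriedPersonC is not covered.<br />

```python
def new_id(used):
    """Stand-in for NewId: smallest identifier 'i<n>' not yet in use."""
    n = 1
    while f"i{n}" in used:
        n += 1
    return f"i{n}"


def insert_person(person_c, p):
    """Sketch of the insert-method for PersonC.

    person_c is a set of (identifier, value) pairs, p a PERSON value.
    Returns the new class content together with the identifier of the
    object whose value is p.
    """
    for ident, value in person_c:
        if value == p:  # an object with this value already exists
            return person_c, ident
    i = new_id({ident for ident, _ in person_c})
    return person_c | {(i, p)}, i


john = (123, ("John", "Denver", frozenset({"Professor", "Dr"})))
mary = (124, ("Mary", "Stuart", frozenset({"Dr"})))

persons, i_john = insert_person(set(), john)
persons, i_mary = insert_person(persons, mary)
persons, again = insert_person(persons, john)  # no new object is created
print(i_john, i_mary, again, len(persons))
```

Inserting the same value twice returns the identifier chosen at the first insertion and leaves the class unchanged.<br />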
Later we shall see that methods as described in this example are canonical and can be automatically<br />
derived from the schema. Corresponding generic update methods look quite similar,<br />
with the only difference that there is no output. Such generic update methods exist only for<br />
value-representable classes; in that case they enforce integrity with respect to the<br />
model inherent constraints. However, generic update methods need not be consistent with<br />
respect to the user-defined constraints. To achieve this, we have to apply the GCS algorithm<br />
to user-defined methods.<br />
In the following sections we formally define the concepts above and prove the main results<br />
on value representation, generic updates and integrity enforcement.<br />
1.3 A Core Object Oriented Datamodel<br />
In this section we present a slightly modified version of the object oriented datamodel (OODM)<br />
of [45, 47, 49]. We observe that an object in the real world always has an identity. Therefore,<br />
abstract (i.e. system-provided) object identifiers are introduced to capture identity. However,<br />
neither the real world object that was the basis of the abstraction nor the abstract identifier<br />
can be used for the identification of an object.<br />
In contrast to existing object oriented datamodels [1, 3, 4, 6, 7, 8, 9, 16, 17, 26, 36, 37, 42,<br />
43, 54] an object is not coupled with a unique type. Instead, we observe that real world<br />
objects can have different aspects that may change over time. Therefore, a primary decision<br />
was taken to let an object be associated with more than one type and to let these types even<br />
change during the object's lifetime. The same applies to references to other objects.<br />
In the following let $N_P$, $N_T$, $N_C$, $N_R$, $N_F$, $N_M$ and $V$ denote arbitrary pairwise disjoint,<br />
denumerable sets representing parameter-, type-, class-, reference-, function-, method- and<br />
variable-names respectively.<br />
1.3.1 A Simple Type System<br />
Relational approaches to data modelling are called value-oriented since in these models real<br />
world entities are completely represented by their values. In the object-oriented approach we<br />
distinguish between objects and values. Values can be grouped into types. In general, a type<br />
may be regarded as an immutable set of values of a uniform structure together with operations<br />
defined on such values. Subtyping is used to relate values in different types.<br />
In [12, 47, 49] algebraic type specifications as in [21, 23] have been used to allow open type<br />
systems. For the sake of simplicity we deviate here from this approach and follow the more<br />
classical view of [14, 15, 45] using a type system that consists of some basic types such as<br />
BOOL, NATURAL, INTEGER, STRING, etc., and type constructors for records, finite<br />
sets, bags, lists, etc. and a subtyping relation. Moreover, assume the existence of recursive<br />
types, i.e. types defined by (a system of) domain equations. In principle we could use one of<br />
the type systems defined in [4, 5, 14, 15, 19, 24, 38]. In addition we suppose the existence of<br />
an abstract identifier type ID in T without any non-trivial supertype. Arbitrary types can<br />
then be defined by nesting. A type T without occurrence of ID will be called a value-type.<br />
We proceed by giving a more formal definition of types.<br />
Definition 1.1. (i) A base type is either BOOL, NAT, INT, FLOAT, STRING, ID or<br />
$\bot$.<br />
(ii) Let $a_i \in N_F$ and $\alpha_i \in N_P$ ($i = 1, \dots, n$). A type constructor is either<br />
$(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $\cup$ (union).<br />
(iii) A type $t$ is either a base type, a type constructor, a generalized constructor that results<br />
from replacing some parameters in a type constructor by types or a recursive type defined<br />
by an equation $t = \{\alpha/t\}.t'$, where $t'$ is a generalized constructor and one of its parameters $\alpha$<br />
is replaced by $t \in N_T$.<br />
In the latter two cases the remaining parameters of the type constructor together with<br />
the parameters of the replacing types yield the parameters $\alpha_1, \dots, \alpha_n$ of $t$.<br />
(iv) A type $t$ is called proper iff the number of its parameters is 0. $t$ is called a value type iff<br />
there is no occurrence of ID in $t$.<br />
(v) A type form consists of a type name $t \in N_T$ and a type $t'$ with possibly some of its<br />
parameters replaced by type names.<br />
(vi) A type specification T is a finite collection of type forms $t_1, \dots, t_n$ such that the only type<br />
names occurring herein are the names of $t_1, \dots, t_n$.<br />
The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />
standard operators on base types and on records, sets, bags, etc. We omit the details here.<br />
If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence<br />
relation<br />
\[ o : t \times t' \to \mathit{BOOL} . \]<br />
Finally, we introduce subtypes. For a more detailed introduction to types see either [14] or<br />
[49].<br />
Definition 1.2. (i) A subtype relation $\le$ on types is given by the following rules:<br />
(a) Every type $t$ is its own subtype and a subtype of $\bot$.<br />
(b) NAT $\le$ INT $\le$ FLOAT.<br />
(c) $(\dots, a_{i-1} : \alpha_{i-1}, a_i : \alpha_i, a_{i+1} : \alpha_{i+1}, \dots) \le (\dots, a_{i-1} : \alpha'_{i-1}, a_{i+1} : \alpha'_{i+1}, \dots)$<br />
whenever $\alpha_j \le \alpha'_j$.<br />
(d) $\{\alpha\} \le \{\alpha'\}$, $[\alpha] \le [\alpha']$ and $\langle\alpha\rangle \le \langle\alpha'\rangle$ whenever $\alpha \le \alpha'$.<br />
(e) $\{\alpha\} \le \langle\alpha\rangle$ and $[\alpha] \le \langle\alpha\rangle$.<br />
(f) [ .<br />
i .<br />
(ii) A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined<br />
by (a)-(f) above.<br />
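Rules (a) to (d) of the subtype relation can be turned into a small recursive check. The sketch below assumes an ad hoc encoding that is not part of the paper: base types are strings, the trivial supertype of rule (a) is written "BOTTOM", records are ("record", ((label, type), ...)), and set, list and bag types are ("set", t), ("list", t) and ("bag", t). Rules (e) and (f) are deliberately left out.<br />

```python
NUMERIC = ["NAT", "INT", "FLOAT"]  # rule (b): NAT <= INT <= FLOAT


def is_subtype(s, t):
    """Decide s <= t for rules (a)-(d) under the encoding described above."""
    # Rule (a): reflexivity, and every type is a subtype of the trivial type.
    if s == t or t == "BOTTOM":
        return True
    # Rule (b), closed under transitivity.
    if s in NUMERIC and t in NUMERIC:
        return NUMERIC.index(s) <= NUMERIC.index(t)
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0]:
        if s[0] == "record":
            # Rule (c): the subtype may have additional components; shared
            # components must be in the subtype relation.
            sf, tf = dict(s[1]), dict(t[1])
            return all(a in sf and is_subtype(sf[a], tf[a]) for a in tf)
        if s[0] in ("set", "list", "bag"):
            # Rule (d): constructors are monotone in their parameter.
            return is_subtype(s[1], t[1])
    return False


PERSONNAME = ("record", (("FirstName", "STRING"), ("SecondName", "STRING"),
                         ("Titles", ("set", "STRING"))))
PERSON = ("record", (("PersonIdentityNo", "NAT"), ("Name", PERSONNAME)))
STUDENT = ("record", (("PersonIdentityNo", "NAT"), ("StudNo", "NAT"),
                      ("Name", PERSONNAME)))

print(is_subtype(STUDENT, PERSON), is_subtype(PERSON, STUDENT))
```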
1.3.2 The Class Concept as a Structural Primitive<br />
The class concept provides the grouping of objects having the same structure which uniformly<br />
combines aspects of object values and references. Moreover, generic operations on objects such<br />
as object creation, deletion and update of its values and references are associated with classes<br />
provided these operations can be defined unambiguously. Objects can belong to different classes,<br />
which guarantees each object of our abstract object model to be captured by the collection<br />
of possible classes. As for values that are only defined via types, objects can only be defined<br />
via classes.<br />
Each object in a class consists of an identifier, a collection of values and references to<br />
objects in other classes. Identifiers can be represented using the unique identifier type ID.<br />
Values and references can be combined into a representation type, where each occurrence of<br />
ID denotes references to some other classes. Therefore, we may define the structure of a class<br />
using parameterized types.<br />
Definition 1.3. (i) Let $t$ be a value type with parameters $\alpha_1, \dots, \alpha_n$. For distinct reference<br />
names $r_1, \dots, r_n \in N_R$ and class names $C_1, \dots, C_n \in N_C$ the expression derived from $t$<br />
by replacing each $\alpha_i$ in $t$ by $r_i : C_i$ for $i = 1, \dots, n$ is called a structure expression.<br />
(ii) A structural class consists of a class name $C \in N_C$, a structure expression $S$ and a set of<br />
class names $D_1, \dots, D_m \in N_C$ (in the following called the set of superclasses). We call $r_i$<br />
the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing<br />
each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$,<br />
the type $U_C = (\mathit{ident} : \mathit{ID}, \mathit{value} : T_C)$ is called the class type of $C$.<br />
(iii) A (structural) schema $\mathcal{S}$ is a finite collection of structural classes $C_1, \dots, C_n$ closed under<br />
references and superclasses.<br />
(iv) An instance $\mathcal{D}$ of a structural schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $U_C$<br />
such that the following conditions are satisfied:<br />
uniqueness of identifiers: For every class $C$ we have<br />
\[ \forall i :: \mathit{ID} .\ \forall v, w :: T_C .\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w . \tag{1.4} \]<br />
inclusion integrity: For a subclass $C$ of $C'$ we have<br />
\[ \forall i :: \mathit{ID} .\ i \in dom(\mathcal{D}(C)) \Rightarrow i \in dom(\mathcal{D}(C')) . \tag{1.5} \]<br />
Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />
\[ \forall i :: \mathit{ID} .\ \forall v :: T_C .\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') . \tag{1.6} \]<br />
referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence<br />
relation $o_r$ we have<br />
\[ \forall i, j :: \mathit{ID} .\ \forall v :: T_C .\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in dom(\mathcal{D}(C')) . \tag{1.7} \]<br />
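The passage from a structure expression to the representation type and the class type in part (ii) is a simple syntactic transformation. The sketch below uses an assumed tuple encoding: base types are strings, records are ("record", ((label, type), ...)), set, list and bag types are (kind, t), and a reference r : C is written ("ref", r, C). None of these names stem from the paper.<br />

```python
def representation_type(s):
    """Replace every reference r : C in the structure expression s by ID."""
    if isinstance(s, tuple):
        if s[0] == "ref":
            return "ID"
        if s[0] == "record":
            return ("record",
                    tuple((a, representation_type(t)) for a, t in s[1]))
        return (s[0], representation_type(s[1]))  # set, list or bag
    return s  # base type


def references(s):
    """Collect the (reference name, class name) pairs occurring in s."""
    if isinstance(s, tuple):
        if s[0] == "ref":
            return [(s[1], s[2])]
        if s[0] == "record":
            return [r for _, t in s[1] for r in references(t)]
        return references(s[1])
    return []


def class_type(s):
    """Class type U_C = (ident : ID, value : T_C)."""
    return ("record", (("ident", "ID"), ("value", representation_type(s))))


# Structure expression of ProfessorC from the motivating example.
PROFESSOR_C = ("record", (
    ("PersonIdentityNo", "NAT"),
    ("Age", "NAT"),
    ("Salary", "NAT"),
    ("Faculty", ("ref", "Faculty", "DepartmentC")),
))

print(representation_type(PROFESSOR_C))
print(references(PROFESSOR_C))
```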
1.3.3 User Defined Integrity Constraints<br />
Let us now extend the notion of schema by the introduction of explicit user-defined integrity<br />
constraints. First we define the notion of constraint schema in general, then we restrict ourselves<br />
to distinguished classes of constraints that arise as generalizations of constraints known<br />
from the relational model, e.g. functional and key constraints, inclusion and exclusion constraints<br />
[48, 52].<br />
Definition 1.4.<br />
Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a structural schema.<br />
(i) An integrity constraint on $\mathcal{S}$ is a formula $I$ over the underlying type system with free<br />
variables $fr(I) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$,<br />
where each $x_{C_i}$ is a variable of type $\{U_{C_i}\}$. We call $x_{C_i}$<br />
the class variable of $C_i$.<br />
(ii) A constrained schema consists of a structural schema $\mathcal{S}$ and a finite set of integrity<br />
constraints on $\mathcal{S}$.<br />
(iii) An instance of a constrained schema is an instance of the underlying structural schema.<br />
An instance $\mathcal{D}$ is said to be consistent with respect to the integrity constraint $I$ iff<br />
substituting $\mathcal{D}(C)$ for each class variable $x_C$ in $I$ evaluates to true, when interpreted in<br />
the usual way.<br />
Note that the conditions for an instance in Definition 1.3 correspond to model inherent integrity<br />
constraints. We refer to these constraints as implicit identifier, IsA and referential constraints<br />
on the schema $\mathcal{S}$. Let us now define some distinguished classes of user-defined constraints.<br />
Definition 1.5. Let $C, C_1, C_2$ be classes in a schema $\mathcal{S}$ and let $c_i : T_C \to T_i$ ($i = 1, 2, 3$) and<br />
$c^i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.<br />
(i) A functional constraint on $C$ is a constraint of the form<br />
\[ \forall i, i' :: \mathit{ID} .\ \forall v, v' :: T_C .\ c_1(v) = c_1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c_2(v) = c_2(v') . \tag{1.8} \]<br />
(ii) A uniqueness constraint on $C$ is a constraint of the form<br />
\[ \forall i, i' :: \mathit{ID} .\ \forall v, v' :: T_C .\ c_1(v) = c_1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow i = i' . \tag{1.9} \]<br />
A uniqueness constraint on $C$ is called trivial iff $T_C = T_1$ and $c_1 = id$ hold.<br />
(iii) An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form<br />
\[ \forall t :: T .\ \bigl( \exists i_1 :: \mathit{ID}, v_1 :: T_{C_1} .\ (i_1, v_1) \in x_{C_1} \wedge c^1(v_1) = t \bigr) \Rightarrow \bigl( \exists i_2 :: \mathit{ID}, v_2 :: T_{C_2} .\ (i_2, v_2) \in x_{C_2} \wedge c^2(v_2) = t \bigr) . \tag{1.10} \]<br />
(iv) An exclusion constraint on $C_1$, $C_2$ is a constraint of the form<br />
\[ \forall i_1, i_2 :: \mathit{ID} .\ \forall v_1 :: T_{C_1} .\ \forall v_2 :: T_{C_2} .\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c^1(v_1) \ne c^2(v_2) . \tag{1.11} \]<br />
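The four constraint classes can be read as generic predicates over class contents, parameterized by the subtype functions involved. In the following sketch class contents are sets of (identifier, value) pairs and subtype functions are ordinary Python functions; the sample data use simplified values and, like all names, are assumptions made for illustration.<br />

```python
def functional(x_c, c1, c2):
    """Form (1.8): equal c1-images imply equal c2-images."""
    return all(c2(v) == c2(w) for _, v in x_c for _, w in x_c
               if c1(v) == c1(w))


def uniqueness(x_c, c1):
    """Form (1.9): equal c1-images imply equal identifiers."""
    return all(i == j for i, v in x_c for j, w in x_c if c1(v) == c1(w))


def inclusion(x_c1, x_c2, c1, c2):
    """Form (1.10): every c1-image of C1 is a c2-image of C2."""
    return {c1(v) for _, v in x_c1} <= {c2(v) for _, v in x_c2}


def exclusion(x_c1, x_c2, c1, c2):
    """Form (1.11): the images of C1 and C2 are disjoint."""
    return {c1(v) for _, v in x_c1}.isdisjoint({c2(v) for _, v in x_c2})


# Simplified class contents; the first component is PersonIdentityNo.
x_person = {("i1", (123, "Denver")), ("i3", (456, "Stuart")),
            ("i4", (567, "James"))}
x_student = {("i3", (456, 1023)), ("i4", (567, 2134))}
x_prof = {("i1", (123, 48, 8000))}


def identity_no(v):
    return v[0]


print(uniqueness(x_person, identity_no),
      inclusion(x_student, x_person, identity_no, identity_no),
      exclusion(x_student, x_prof, identity_no, identity_no))
```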
1.3.4 Methods as a Basis for Behaviour Modelling<br />
So far, only static aspects have been considered. A structural schema is simply a collection of<br />
data structures called classes. Let us now turn to adding dynamics to this picture. As required<br />
in the object oriented approach operations will be associated with classes. This gives us the<br />
notion of a method.<br />
We shall distinguish between visible and hidden methods to emphasize those methods<br />
that can be invoked by the user and others. This is not intended to define an interface of a<br />
class, since for the moment all methods of a class including the hidden ones can be accessed<br />
by other methods. There are two reasons for such a weak hiding concept.<br />
– Visible methods serve as a means to specify (nested) transactions. In order to build<br />
sequences of database instances we only regard these transactions assuming a linear invocation<br />
order on them.<br />
– Hidden methods can be used to handle identifiers. Since these identifiers do not have any<br />
meaning for the user, they must not occur within the input or output of a transaction.<br />
Definition 1.6. Let $\mathcal{S}$ be a structural schema. Let $T_1, \dots, T_n, T'_1, \dots, T'_m$ be types, $M \in N_M$<br />
and $\iota_1, \dots, \iota_n, o_1, \dots, o_m \in V$.<br />
(i) A method signature consists of a method name $M$, a set of input-parameter / input-type<br />
pairs $\iota_i :: T_i$ and a set of output-parameter / output-type pairs $o_j :: T'_j$. We write<br />
\[ o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n) . \]<br />
(ii) Let $C$ be some structural class in $\mathcal{S}$. A method $M$ on $C$ consists of a method signature<br />
with name $M$ and a body that is recursively built from the following constructs:<br />
(a) assignment $x := E$, where $x$ is either the class variable $x_C$ or a local variable within<br />
$S$, and $E$ is a term of the same type as $x$,<br />
(b) skip, fail, loop,<br />
(c) sequential composition $S_1 ; S_2$, choice $S_1 \Box S_2$, projection $x :: T \mid S$, guard $P \to S$,<br />
restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type<br />
$T$, and<br />
(d) instantiation $x'_1, \dots, x'_i \leftarrow C' : S'(E'_1, \dots, E'_j)$, where $S'$ is a method on class $C'$ with<br />
input-parameters $\iota'_1, \dots, \iota'_j$ and output-parameters $o'_1, \dots, o'_i$, such that the variables<br />
$o'_f$, $x'_f$ have the same type and the term $E'_g$ has the same type as the variable $\iota'_g$.<br />
(iii) A method $M$ on a class $C$ with signature $o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n)$<br />
is called value-defined iff all $T_i$ ($i = 1, \dots, n$) and $T'_j$ ($j = 1, \dots, m$) are proper value<br />
types.<br />
As already mentioned the OODM distinguishes between transactions, i.e. methods visible to<br />
the user, and hidden methods. We require each transaction to be value-defined.<br />
Subclasses inherit the methods of their superclasses, but overriding is allowed as long<br />
as the new method is a specialization of all its corresponding methods in its superclasses.<br />
Overriding becomes mandatory in the case of multiple inheritance with name conflicts. A<br />
method that overrides a hidden method on some superclass must also be hidden.<br />
Definition 1.7. Let $\mathcal{S}$ be a structural schema and $C \in \mathcal{S}$ be a structural class as in Definition<br />
1.3 with superclasses $D_1, \dots, D_k$. A method specification on $C$ consists of two sets of methods<br />
$S = \{M_1, \dots, M_n\}$ (called transactions) and $H = \{M'_1, \dots, M'_m\}$ (called hidden methods)<br />
such that the following properties hold:<br />
(i) Each $M_i$ ($i = 1, \dots, n$) is value-defined.<br />
(ii) For each transaction $M_l$ on some superclass $D_l$ there exists some $i \in \{1, \dots, n\}$ such that<br />
$M_i$ specializes $M_l$.<br />
(iii) For each hidden method $M_l$ on some superclass $D_l$ there exists some $j \in \{1, \dots, m\}$ such<br />
that $M'_j$ specializes $M_l$.<br />
Let us briefly discuss what specialization means for the input- and output-types. Sometimes<br />
it is required that the input-type for an overriding method should be a subtype of the original<br />
one (covariance rule), sometimes the opposite (contravariance rule) is required. The first rule<br />
applies e.g. if we want to override an insert method. In this case the inherited method has no<br />
effect on the subclass, but simply calls the "old" method. The second rule applies if input-types<br />
required on the superclass can be omitted on the subclass. Both rules are captured<br />
by the formal notion of specialization. We omit the details [44]. Now we are prepared to<br />
generalize the definition of classes and schemata.<br />
Definition 1.8. (i) A class consists of a class name $C \in N_C$, a structure expression $S$, a set of<br />
class names $D_1, \dots, D_m \in N_C$ (called the set of superclasses) and a method specification<br />
$(S = \{M_1, \dots, M_n\}, H = \{M'_1, \dots, M'_{n'}\})$ on $C$.<br />
(ii) A (behavioural) schema $\mathcal{S}$ is a finite collection of classes $\{C_1, \dots, C_n\}$ closed under references,<br />
superclasses and method call together with a collection of integrity constraints<br />
$I_1, \dots, I_n$ on $\mathcal{S}$.<br />
(iii) An instance $\mathcal{D}$ of a behavioural schema $\mathcal{S}$ is an instance of the underlying structural<br />
schema. A database history on $\mathcal{S}$ is a sequence $\mathcal{D}_0, \mathcal{D}_1, \dots$ of instances such that each<br />
transition from $\mathcal{D}_{i-1}$ to $\mathcal{D}_i$ is due to some transaction on some class $C \in \mathcal{S}$.<br />
Note the relation between database histories used here and the work on the semantics of<br />
object bases in [22, 28].<br />
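A database history can be pictured as the sequence of instances obtained by applying transactions one after another. In the sketch below an instance is a dictionary from class names to sets of (identifier, value) pairs and a transaction is a function from instances to instances; insert_department is a hypothetical value-defined transaction whose only input is a department name, with identifiers handled internally.<br />

```python
from itertools import accumulate


def insert_department(name):
    """Build a transaction that inserts a department with the given name."""
    def transaction(db):
        depts = db["DepartmentC"]
        if any(v == (name,) for _, v in depts):
            return db  # an object with this value already exists
        # Pick an identifier that is unused in the whole instance.
        highest = max((int(i[1:]) for c in db.values() for i, _ in c),
                      default=0)
        fresh = f"i{highest + 1}"
        # Return a new instance; the old one stays untouched.
        return {**db, "DepartmentC": depts | {(fresh, (name,))}}
    return transaction


def history(d0, transactions):
    """Instances D_0, D_1, ... where D_i results from D_(i-1) by the
    i-th transaction."""
    return list(accumulate(transactions, lambda db, t: t(db), initial=d0))


d0 = {"DepartmentC": set()}
h = history(d0, [insert_department("Music"),
                 insert_department("Music"),
                 insert_department("Philosophy")])
for instance in h:
    print(sorted(instance["DepartmentC"]))
```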
1.3.5 Queries and Views<br />
Roughly speaking the querying of a database is an operation on the database without changing<br />
its state. The emphasis of a query is on the output. While such a general view of queries can be<br />
subsumed by transactions, hence by methods in the OODM, query languages are in particular<br />
intended to be declarative in order to support an ad-hoc querying of a database without the<br />
need to write new transactions [8].<br />
Querying a relational database can be expressed by terms in relational algebra. This view<br />
can be easily generalized to the OODM using its type system. Therefore, terms over such<br />
types occur naturally. Moreover, type specifications are based on other type specifications via<br />
constructors, selectors and functions. Hence, $\mathcal{T}$ allows arbitrary terms involving more than one<br />
class variable $x_C$ to be built. Then a query turns out to be represented by a term $t$ over some<br />
type $T$ such that the free variables of $t$ are all class variables. This approach is in accordance<br />
with the algebraic approach in [12] and with so-called universal traversal combinators [25].<br />
In relational algebra a view may be regarded simply as a stored query (or derived relation).<br />
We shall try to generalize this view to the OODM as well.<br />
However, things change dramatically when object identifiers come into play [13], since<br />
now we have to distinguish between queries that result in values and those that result in<br />
(collections of) objects. Therefore we distinguish in the OODM between value queries and<br />
general access expressions.<br />
A value query on a schema $\mathcal{S}$ can then be represented by a term $t$ of some value type $T$<br />
with $fr(t) \subseteq \{x_C \mid C \in \mathcal{S}\}$. Ad-hoc querying of a database should then be restricted to value<br />
queries. This is no loss of generality, because for any type $T$ in $\mathcal{T}$ involving identifiers there<br />
exists a corresponding type $T'$ allowing multiple occurrences. Take e.g. a class $C$. If we want<br />
to get all the objects in that class no matter whether they have the same values or not, the<br />
corresponding term would be $x_C$. This is not a value query, but if $T_C$ is a value type, we may<br />
take $T' = \langle T_C \rangle$ and the natural projection given by the subtype functions<br />
$\{(ident : ID, value : \alpha)\} \to \langle (ident : ID, value : \alpha) \rangle \to \langle \alpha \rangle$.<br />
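As a small illustration of this projection, the following Python sketch models a class extent as a set of (ident, value) pairs and a bag as a `collections.Counter`.<br />
The function name `to_value_bag` and the sample data are our own assumptions and not part of the OODM.<br />

```python
from collections import Counter


def to_value_bag(extent):
    """Project a set of (ident, value) pairs onto the bag of its values.

    Assumption: values are hashable; identifiers are simply dropped.
    """
    return Counter(value for _ident, value in extent)


# Hypothetical class extent: two distinct objects share the value "Ann".
persons = {(1, "Ann"), (2, "Bob"), (3, "Ann")}
bag = to_value_bag(persons)
```

Here `bag["Ann"]` is 2, so the multiplicity of equal values survives although the identifiers are gone.<br />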
In the case of arbitrary access expressions another problem occurs [13]. So far, we can only<br />
build terms $t$ that involve identifiers already existing in the database. Thus, such queries<br />
are called object preserving. If we want the result of a query to represent "new" objects, i.e.<br />
if we want to have object generating queries, we have to apply a mechanism to create new<br />
object identifiers. This can be achieved by object creating functions on the type $ID$ with arity<br />
$ID \times \dots \times ID \to ID$ [32, 35].<br />
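One possible reading of such an object creating function can be sketched in Python as follows.<br />
We assume, purely for illustration, that identifiers are integers and that the function is tabulated on demand; the class name `ObjectCreatingFunction` is hypothetical and the mechanisms of [32, 35] may differ.<br />

```python
import itertools


class ObjectCreatingFunction:
    """Hypothetical model of an object creating function ID x ... x ID -> ID."""

    def __init__(self, used_ids):
        used = frozenset(used_ids)
        # Lazily enumerate integers that are not yet used as identifiers.
        self._fresh = (n for n in itertools.count() if n not in used)
        self._table = {}

    def __call__(self, *ids):
        # Equal argument tuples yield the same identifier; a new argument
        # tuple consumes the next fresh identifier.
        if ids not in self._table:
            self._table[ids] = next(self._fresh)
        return self._table[ids]


new_id = ObjectCreatingFunction(used_ids={0, 1, 2})
pair_a = new_id(0, 1)
pair_b = new_id(1, 2)
```

Calling `new_id(0, 1)` again returns `pair_a`, so the result of an object generating query built on it is reproducible.<br />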
The idea that a view is a stored query then carries over easily. However, the structure of a<br />
view should be compatible with the structure of the schema, i.e. each view may be regarded<br />
as a derived class. Summarizing, we get the following formal definition.<br />
Definition 1.9. Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be some schema.<br />
(i) A value query on $\mathcal{S}$ is a term $t$ over some proper value type $T$ with $fr(t) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$.<br />
(ii) An access expression on $\mathcal{S}$ is a term $t$ over some proper type $T$ with $fr(t) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$.<br />
(iii) A view on $\mathcal{S}$ consists of a view name $v \in N_C$ such that there is no class $C \in \mathcal{S}$ with this<br />
name, a structure expression $S(v)$ containing references to classes in $\mathcal{S}$ or to views on $\mathcal{S}$<br />
and a defining access expression $t(v)$ of type $\{U_v\}$, where $T_v$ is the representation type<br />
corresponding to $S(v)$.<br />
(iv) A (complete) schema is a behavioural schema together with a finite set of views. An<br />
instance of a complete schema is an instance of the underlying structural schema such<br />
that for every view $v$ replacing each class variable $x_C$ in the access expressions of $v$ yields<br />
a value of type $\{U_v\}$ satisfying the uniqueness property for identifiers.<br />
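Part (iv) can be made concrete with a minimal Python sketch.<br />
We assume that an instance is a dictionary from class names to sets of (ident, value) pairs and that a defining access expression is an ordinary function on such an instance; the names `evaluate_view` and `has_unique_identifiers` are our own.<br />

```python
def evaluate_view(defining_expression, instance):
    """Evaluate a view's defining access expression on an instance.

    Passing `instance` plays the role of replacing the class variables x_C.
    """
    return set(defining_expression(instance))


def has_unique_identifiers(extent):
    """Check that no identifier occurs in two different pairs of the extent."""
    idents = [ident for ident, _value in extent]
    return len(idents) == len(set(idents))


instance = {"Person": {(1, ("Ann", 31)), (2, ("Bob", 17))}}


def adults(db):
    """Hypothetical object preserving view: adults keep their identifiers."""
    return {(i, v) for (i, v) in db["Person"] if v[1] >= 18}


adult_extent = evaluate_view(adults, instance)
```

The view extent is again a set of (ident, value) pairs, which is what allows a view to be treated as a derived class.<br />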
1.4 The Object Identification Problem<br />
From an object oriented point of view a database may be considered as a huge collection of<br />
objects of arbitrarily complex structure. Hence the problem arises of uniquely identifying and<br />
retrieving objects in such collections.<br />
Each object in a database is an abstraction of a real world object that has a unique identity.<br />
The representation of such objects in the OODM uses an abstract identifier $I$ of type $ID$ to<br />
encode this identity. Such an identifier may be considered as being immutable. However, from<br />
a systems oriented view permutations or collapses of identifiers without changing anything<br />
else should not affect the behaviour of the database.<br />
For the user the abstract identifier of an object has no meaning. Therefore, a different<br />
approach to the identification problem is required. We show that the unique identification of<br />
an object in a class leads to the notion of (weak) value-identifiability, where weak value-representability<br />
can be used to capture also objects that do not exist on their own, but<br />
depend on other objects. This is related to weak entities in entity-relationship models [62].<br />
The stronger notion of value-representability is required for the unique definition of generic<br />
update operations.<br />
1.4.1 The Notion of Value-Representability<br />
According to our definitions two objects in a class $C$ are identical iff they have the same<br />
identifier. By the use of constraints, especially uniqueness constraints, we could restrict this<br />
notion of equality.<br />
Let us address the characterization of those classes whose objects are completely<br />
representable by values, i.e. we could drop the object identifiers and replace references by values<br />
of the referred object. We shall see in Section 1.5 that in case of value-representable classes<br />
we are able to preserve an important advantage of relational databases, i.e. the existence of<br />
structurally determined update operations.<br />
Definition 1.10. Let $C$ be a class in a schema $\mathcal{S}$ with representation type $T_C$.<br />
(i) $C$ is called value-identifiable iff there exists a proper value type $I_C$ such that for all<br />
instances $\mathcal{D}$ of $\mathcal{S}$ there is a function $c : T_C \to I_C$ such that the uniqueness constraint on<br />
$C$ defined by $c$ holds for $\mathcal{D}$.<br />
(ii) $C$ is called value-representable iff there exists a proper value type $V_C$ such that for all<br />
instances $\mathcal{D}$ of $\mathcal{S}$ there is a function $c : T_C \to V_C$ such that for $\mathcal{D}$<br />
(a) the uniqueness constraint on $C$ defined by $c$ holds and<br />
(b) for each uniqueness constraint on $C$ defined by some function $c' : T_C \to V'_C$ with proper<br />
value type $V'_C$ there exists a function $c'' : V_C \to V'_C$ that is unique on $c(codom(\mathcal{D}(C)))$<br />
with $c' = c'' \circ c$.<br />
It is easy to see that each value-representable class $C$ is also value-identifiable. Moreover, the<br />
value-representation type $V_C$ in Definition 1.10 is unique up to isomorphism.<br />
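To make the role of the function $c$ tangible, the following Python sketch checks a uniqueness constraint on one finite instance.<br />
We assume here that the constraint defined by $c$ holds iff $c(v) = c(w)$ implies $i = j$ for all pairs $(i, v)$, $(j, w)$ in the class; the function names and the sample data are hypothetical.<br />

```python
def uniqueness_constraint_holds(extent, c):
    """True iff c(v) == c(w) implies i == j for all (i, v), (j, w) in extent."""
    owner = {}
    for ident, value in extent:
        key = c(value)
        # setdefault records the first identifier seen for this key.
        if owner.setdefault(key, ident) != ident:
            return False
    return True


persons = {(1, ("Ann", 1990)), (2, ("Bob", 1985)), (3, ("Ann", 1972))}


def by_name(value):
    return value[0]


def by_name_and_year(value):
    return (value[0], value[1])
```

On this instance `by_name` does not define a uniqueness constraint, since two objects share the name "Ann", whereas `by_name_and_year` does.<br />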
1.4.2 Value-Representability in the Case of Acyclic Reference Graphs<br />
Since value-representability is defined by the existence of a certain proper value type, it is hard<br />
to decide whether an arbitrary class is value-representable or not. In case of simple classes<br />
the problem is easier, since we only have to deal with uniqueness and value constraints. In<br />
this case it is helpful to analyse the reference structure of the class. Hence the following<br />
graph-theoretic definitions.<br />
Definition 1.11. The reference graph of a class $C$ in a schema $\mathcal{S}$ is the smallest labelled<br />
graph $G_{ref} = (V, E, l)$ satisfying:<br />
(i) There exists a vertex $v_C \in V$ with $l(v_C) = \{t, C\}$, where $t$ is the top-level type in the<br />
structure expression $S$ of $C$.<br />
(ii) For each proper occurrence of a type $t \neq ID$ in $T_C$ there exists a unique vertex $v_t \in V$<br />
with $l(v_t) = \{t\}$.<br />
(iii) For each reference $r_i : C_i$ in the structure expression $S$ of $C$ the reference graph $G^i_{ref}$ is<br />
a subgraph of $G_{ref}$.<br />
(iv) For each vertex $v_t$ or $v_C$ corresponding to $t(x_1, \dots, x_n)$ in $S$ there exist unique edges $e_t^{(i)}$<br />
from $v_t$ or $v_C$ respectively to $v_{t_i}$ in case $x_i$ is the type $t_i$ or to $v_{C_i}$ in case $x_i$ is the reference<br />
$r_i : C_i$. In the first case $l(e_t^{(i)}) = \{S_i\}$, where $S_i$ is the corresponding selector name; in the<br />
latter case the label is $\{S_i, r_i\}$.<br />
Definition 1.12. (i) Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a schema. Let $\mathcal{S}' = \{C'_1, \dots, C'_n\}$ be another<br />
schema such that for all $i$ there exists a uniqueness constraint on $C_i$ defined by some<br />
$c_i : T_{C_i} \to T_{C'_i}$. Then an identification graph $G_{id}$ of the class $C_i$ is obtained from the<br />
reference graph of $C'_i$ by changing each label $C'_j$ to $C_j$.<br />
(ii) The identification graph $G_{id}$ resulting from the use of trivial uniqueness constraints is<br />
called the standard identification graph.<br />
Clearly, there need not exist any identification graph, nor does the existence of one identification<br />
graph imply the existence of the standard one. However, if the standard identification<br />
graph exists, then it is equal to the reference graph.<br />
Proposition 1.13. Let $C$ be a class in a schema $\mathcal{S}$ with acyclic reference graph $G_{ref}$ such<br />
that there exist uniqueness constraints for $C$ and each $C_i$ such that $C_i$ occurs as a label in<br />
$G_{ref}$. Then $C$ is value-representable.<br />
Proof. We use induction on the maximum length of a path in $G_{ref}$. If there are no references<br />
in the structure expression $S$ of $C$, the type $T_C$ is a proper value type. Since there exists a<br />
uniqueness constraint on $C$, the identity function $id$ on $T_C$ also defines a uniqueness constraint.<br />
Hence $V_C = T_C$ satisfies the requirements of Definition 1.10.<br />
If there are references $r_i : C_i$ in the structure expression $S$ of $C$, then the induction<br />
hypothesis holds for each such $C_i$, because $G_{ref}$ is acyclic. Let $V_C$ result from $S$ by replacing<br />
each $r_i : C_i$ by $V_{C_i}$. Then $V_C$ satisfies the requirements of Definition 1.10.<br />
□<br />
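The construction of $V_C$ used in this proof can be sketched in Python.<br />
We assume a toy encoding of structure expressions that is not the paper's notation: a base type is a string, a reference is `("ref", class_name)` and a record is `("record", {selector: component})`. The recursion terminates only because the reference graph is assumed to be acyclic.<br />

```python
def value_representation_type(class_name, schema):
    """Replace every reference by the value type of the referenced class.

    Assumption: `schema` maps class names to structure expressions in the
    toy encoding described above, and references are acyclic.
    """
    def expand(expr):
        if isinstance(expr, str):
            return expr  # base type, nothing to replace
        if expr[0] == "ref":
            return value_representation_type(expr[1], schema)
        return ("record", {sel: expand(sub) for sel, sub in expr[1].items()})

    return expand(schema[class_name])


# Hypothetical schema: employees reference departments, not vice versa.
schema = {
    "Dept": ("record", {"name": "STRING"}),
    "Emp": ("record", {"name": "STRING", "dept": ("ref", "Dept")}),
}
emp_type = value_representation_type("Emp", schema)
```

For a class without references, such as `Dept`, the structure expression is returned unchanged, which corresponds to the base case of the induction.<br />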
Corollary 1.14. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist an acyclic identification<br />
graph $G_{id}$ and uniqueness constraints for $C$ and each $C_i$ occurring as a label in $G_{id}$. Then $C$<br />
is value-identifiable.<br />
1.4.3 Computation of Value Representation Types<br />
We want to address the more general case where cyclic references may occur in the schema<br />
$\mathcal{S} = \{C_1, \dots, C_n\}$. In this case a simple induction argument as in the proof of Proposition 1.13<br />
is not applicable. So we take another approach. We define algorithms to compute types $V_C$<br />
and $I_C$ that turn out to be proper value types under certain conditions. In the next subsection<br />
we then show that these types are the value representation type and the value identification<br />
type required by Definition 1.10.<br />
Algorithm 1.15. Let $F(C_i) = T_i$ provided there exists a uniqueness constraint on $C_i$ defined<br />
by $c_i : T_{C_i} \to T_i$, otherwise let $F(C_i)$ be undefined. If $ID$ occurs in some $F(C_i)$ corresponding<br />
to $r_j : C_j$ ($j \neq i$), we write $ID_j$.<br />
Then iterate as long as possible using the following rules:<br />
(i) If $F(C_j)$ is a proper value type and $ID_j$ occurs in some $F(C_i)$ ($j \neq i$), then replace this<br />
corresponding $ID_j$ in $F(C_i)$ by $F(C_j)$.<br />
(ii) If $ID_i$ occurs in some $F(C_i)$, then let $F(C_i)$ be recursively defined by $F(C_i) == S_i$, where<br />
$S_i$ is the result of replacing $ID_i$ in $F(C_i)$ by the type name $F(C_i)$.<br />
This iteration terminates, since there exists only a finite collection of classes. If these rules are<br />
no longer applicable, replace each remaining occurrence of $ID_j$ in $F(C_i)$ by the type name<br />
$F(C_j)$ provided $F(C_j)$ is defined.<br />
□<br />
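A simplified Python sketch of Algorithm 1.15 may help to follow the two rules.<br />
The encoding is our own assumption: a type expression is a base type name (a string), a leaf `("ID", j)` standing for $ID_j$, a leaf `("name", j)` standing for the type name $F(C_j)$, or a tuple `(constructor, component, ...)`. Selectors are omitted and classes without uniqueness constraint are simply missing from the input.<br />

```python
def occurs(expr, target):
    """True iff `target` occurs somewhere in the type expression `expr`."""
    if expr == target:
        return True
    if isinstance(expr, tuple) and expr[0] not in ("ID", "name"):
        return any(occurs(sub, target) for sub in expr[1:])
    return False


def substitute(expr, target, replacement):
    """Replace every occurrence of `target` in `expr` by `replacement`."""
    if expr == target:
        return replacement
    if isinstance(expr, tuple) and expr[0] not in ("ID", "name"):
        return (expr[0],) + tuple(substitute(s, target, replacement) for s in expr[1:])
    return expr


def is_proper_value_type(expr):
    """True iff no ("ID", j) leaf is left in `expr`."""
    if isinstance(expr, tuple):
        if expr[0] == "ID":
            return False
        if expr[0] == "name":
            return True
        return all(is_proper_value_type(sub) for sub in expr[1:])
    return True


def compute_types(initial):
    """Iterate rules (i) and (ii), then apply the final replacement step."""
    F = dict(initial)
    changed = True
    while changed:
        changed = False
        for i in F:
            if occurs(F[i], ("ID", i)):  # rule (ii): self-reference
                F[i] = substitute(F[i], ("ID", i), ("name", i))
                changed = True
            for j in F:  # rule (i): inline proper value types
                if j != i and is_proper_value_type(F[j]) and occurs(F[i], ("ID", j)):
                    F[i] = substitute(F[i], ("ID", j), F[j])
                    changed = True
    for i in F:  # final step: remaining ID_j become type names
        for j in F:
            F[i] = substitute(F[i], ("ID", j), ("name", j))
    return F


# Hypothetical input: Emp refers to Dept and to itself, A and B to each other.
types = compute_types({
    "Dept": ("record", "STRING"),
    "Emp": ("record", "STRING", ("ID", "Dept"), ("ID", "Emp")),
    "A": ("record", ("ID", "B")),
    "B": ("record", ("ID", "A")),
})
```

In this example `Emp` becomes a recursive type with the department inlined, while `A` and `B` end up as mutually recursive types via their type names.<br />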
Note that the algorithm computes (mutually) recursive types. Now we give a sufficient<br />
condition for the result of Algorithm 1.15 to be a proper value type.<br />
Lemma 1.16. Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a uniqueness constraint<br />
for all classes $C_i$ occurring as a label in some identification graph $G_{id}$ of $C$. Let $I_C$ be the<br />
type $F(C)$ computed by Algorithm 1.15 with respect to the uniqueness constraints used in the<br />
definition of $G_{id}$. Then $I_C$ is a proper value type.<br />
Proof. Suppose $I_C$ were not a proper value type. Then there exists at least one occurrence of<br />
$ID$ in $I_C$. This corresponds to a class $C_i$ without uniqueness constraint occurring as a label<br />
in $G_{id}$, hence contradicts the assumption of the lemma.<br />
□<br />
1.4.4 The Finiteness Property<br />
Let us now address the general case. The basic idea is that there is always only a finite number<br />
of objects in a database. Assuming the database to be consistent with respect to inclusion<br />
and referential constraints yields that there cannot exist infinite cyclic references. This will<br />
be expressed by the finiteness property. We show that this property allows the computation<br />
of value representation types.<br />
Definition 1.17. Let $C$ be a class in a schema $\mathcal{S}$ and let $g_{kl}$ denote a path in $G_{ref}$ from $v_{C_k}$<br />
to $v_{C_l}$ provided there is a reference $r_l : C_l$ in the structure expression of $C_k$. Then a cycle in<br />
$G_{ref}$ is a sequence $g_{01} \cdots g_{n-1,n}$ with $C_0 = C_n$ and $C_k \neq C_l$ otherwise.<br />
Note that we use paths instead of edges, because the edges in $G_{ref}$ do not always correspond<br />
to references. According to our definition of a class there exists a referential constraint on<br />
$C_k$, $C_l$ defined by $o_{kl} : T_{C_k} \times ID \to BOOL$ corresponding to $g_{kl}$. Therefore, to each cycle<br />
there exists a corresponding sequence of functions $o_{01} \cdots o_{n-1,n}$. This can be used as follows<br />
to define a function $cyc : ID \times ID \to BOOL$ corresponding to a cycle in $G_{ref}$.<br />
Definition 1.18. Let $C$ be a class in a schema $\mathcal{S}$ and let $g_{01} \cdots g_{n-1,n}$ be a cycle in $G_{ref}$. The<br />
corresponding cycle relation $cyc : ID \times ID \to BOOL$ is defined by $cyc(i, j) = true$ iff there<br />
exists a sequence $i = i_0, i_1, \dots, i_n = j$ ($n \neq 0$) such that $(i_l, v_l) \in C_l$ and $o_{l,l+1}(i_{l+1}, v_l) = true$<br />
for all $l = 0, \dots, n-1$.<br />
Given a cycle relation $cyc$, let $cyc^m$ be the $m$-th power of $cyc$.<br />
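The powers of a cycle relation can be computed directly on a finite instance.<br />
In the following Python sketch we represent $cyc$ by the set of pairs $(i, j)$ with $cyc(i, j) = true$ and read the $m$-th power as $m$-fold relational composition for $m \geq 1$; both the representation and the function names are our assumptions.<br />

```python
def compose(r, s):
    """Relational composition: (i, k) whenever (i, j) in r and (j, k) in s."""
    return {(i, k) for (i, j) in r for (j2, k) in s if j == j2}


def power(cyc, m):
    """m-th power of a relation for m >= 1."""
    result = set(cyc)
    for _ in range(m - 1):
        result = compose(result, cyc)
    return result


# Hypothetical cycle relation on a finite instance: 1 -> 2 -> 3, no way back.
cyc = {(1, 2), (2, 3)}
```

For this relation the second power relates only 1 to 3 and the third power is empty, since the chain of references cannot be followed any further.<br />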
Lemma 1.19. Let $C$ be a class in a schema $\mathcal{S}$. Then $C$ satisfies the finiteness property,<br />
i.e. for each instance $\mathcal{D}$ of $\mathcal{S}$ and for each cycle in $G_{ref}$ the corresponding cycle relation $cyc$<br />
satisfies<br />
$\forall i \in dom(C).\ \exists n.\ \forall j \in dom(C).\ \exists m$<br />
Lemma 1.20. Let $\mathcal{D}$ be an instance of schema $\mathcal{S} = \{C_1, \dots, C_n\}$. Then $\mathcal{D}$ satisfies at<br />
each stage of Algorithm 1.15 uniqueness constraints for all $i = 1, \dots, n$ defined by some<br />
$c'_i : T_{C_i} \to F(C_i)$.<br />
Proof. It is sufficient to show that whenever a rule is applied replacing $F(C_i)$ by $F(C_i)'$, then<br />
$F(C_i)'$ also defines a uniqueness constraint on $C_i$.<br />
Suppose that $(i, v) \in C_i$ holds in $\mathcal{D}$. Since it is possible to apply a rule to $F(C_i)$, there exists<br />
at least one value $j :: ID$ occurring in $c_i(v)$. Replacing $ID_j$ in $F(C_i)$ corresponds to replacing<br />
$j$ by some value $v_j :: F(C_j)$. Because of the finiteness property such a value must exist.<br />
Moreover, due to the uniqueness constraint defined by $c_j$ the function $f : F(C_i) \to F(C_i)'$<br />
representing this replacement must be injective on $c_i(codom(\mathcal{D}(C_i)))$. Hence, $c'_i = f \circ c_i$ defines<br />
a uniqueness constraint on $C_i$.<br />
□<br />
Now assume that we use only trivial uniqueness constraints in Algorithm 1.15. In order to<br />
distinguish this situation from the general case we write $G(C_i)$ instead of $F(C_i)$ to refer to<br />
this special case.<br />
Lemma 1.21. Let $\mathcal{D}$ be an instance of schema $\mathcal{S} = \{C_1, \dots, C_n\}$. Then at each stage of<br />
Algorithm 1.15 (applied with arbitrary uniqueness constraints and in parallel with trivial<br />
ones) there exists for all $i = 1, \dots, n$ a function $\bar{c}_i : G(C_i) \to F(C_i)$ that is unique on<br />
$c_i(codom(\mathcal{D}(C_i)))$ with $c'_i = \bar{c}_i \circ c_i$.<br />
Proof. As in the proof of Lemma 1.20 it is sufficient to show that the required property<br />
is preserved by the application of a rule from any of the two versions of Algorithm 1.15.<br />
Therefore, let $\bar{c}_i$ satisfy the required property and let $g : G(C_i) \to G(C_i)'$ and<br />
$f : F(C_i) \to F(C_i)'$ be functions corresponding to the application of a rule to $G(C_i)$ and $F(C_i)$ respectively.<br />
Such functions were constructed in the proof of Lemma 1.20.<br />
Then $f \circ \bar{c}_i$ satisfies the required property with respect to the application of $f$. In the case<br />
of applying $g$ we know that $g$ is injective on $c_i(codom(\mathcal{D}(C_i)))$. Let $h : G(C_i)' \to G(C_i)$ be<br />
any continuation of $g^{-1} : g(c_i(codom(\mathcal{D}(C_i)))) \to G(C_i)$. Then $\bar{c}_i \circ h$ satisfies the required<br />
property.<br />
□<br />
Theorem 1.22. Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a uniqueness constraint<br />
for all classes $C_i$ occurring as a label in the reference graph $G_{ref}$ of $C$. Let $V_C$ be the type<br />
$G(C)$ computed by Algorithm 1.15 with respect to trivial uniqueness constraints and let $I_C$ be<br />
the type $F(C)$ computed by Algorithm 1.15 with respect to arbitrary uniqueness constraints.<br />
Then $C$ is value-representable with value representation type $V_C$ and each such $I_C$ is a value<br />
identification type.<br />
Proof. $V_C$ is a proper value type by Lemma 1.16. From Lemma 1.20 it follows that if $\mathcal{D}$ is an<br />
instance of $\mathcal{S}$, then there exists a function $c : T_C \to V_C$ such that the uniqueness constraint<br />
defined by $c$ holds for $\mathcal{D}$. The same applies to $I_C$.<br />
If $V'_C$ is another proper value type and $\mathcal{D}$ satisfies a uniqueness constraint defined by<br />
$c' : T_C \to V'_C$, then $V'_C$ is some value-identification type $I_C$. Hence by Lemma 1.21 there exists<br />
a function $c'' : V_C \to V'_C$ that is unique on $c(codom(\mathcal{D}(C)))$ with $c' = c'' \circ c$. This proves the<br />
Theorem.<br />
□<br />
Corollary 1.23. Let $\mathcal{S}$ be a schema such that all classes $C$ in $\mathcal{S}$ are value-identifiable. Then<br />
all classes $C$ in $\mathcal{S}$ are also value-representable.<br />
□<br />
1.4.5 Weak Value-Representability<br />
Let us now ask whether there also exist weaker identification mechanisms other than value-representability.<br />
In several papers, e.g. [42], a navigational approach on the basis of the reference<br />
structure has been favoured. This leads to dependent classes similar to "weak entities"<br />
in the entity-relationship model [62]. We shall show that such an approach requires at least<br />
a value-identifiable "entrance" of some path and the hard restriction on references to be<br />
representable by surjective functions.<br />
Definition 1.24. Let $\mathcal{S}$ be some schema.<br />
(i) If $r$ is a reference from class $C$ to $D$ in $\mathcal{S}$ and $o : T_C \times ID \to BOOL$ is the function<br />
of Definition 4 expressing the corresponding referential constraint, then $r$ satisfies the<br />
(SF)-condition iff<br />
(a) $o(v, i) \wedge o(v, j) \Rightarrow i = j$ and<br />
(b) $j \in dom(x_D) \Rightarrow \exists v :: T_C .\ v \in codom(x_C) \wedge o(v, j)$<br />
hold for all $i, j :: ID$, $v :: T_C$.<br />
(ii) An (SF)-chain from class $D$ to $C$ in $\mathcal{S}$ is a sequence of classes $D = C_0, \dots, C_n = C$ such<br />
that for all $i$ ($i = 1, \dots, n$) either $C_i$ is a subclass of $C_{i-1}$ or there exists a reference $r_i$<br />
from $C_{i-1}$ to $C_i$ satisfying the (SF)-condition.<br />
(iii) A class $C$ in $\mathcal{S}$ is called weakly value-identifiable iff there exists a value-identifiable class<br />
$D$ and an (SF)-chain from $D$ to $C$.<br />
The notation (SF)-condition has been chosen to emphasize that such a reference represents<br />
a surjective function. It is easy to see, taking $n = 0$, that each value-identifiable class is also<br />
weakly value-identifiable.<br />
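The two parts of the (SF)-condition can be checked on a single finite instance as in the following Python sketch.<br />
We assume that extents are sets of (ident, value) pairs, that the referential predicate `o(v, i)` is given as a Python function, and that the quantifiers range only over values and identifiers present in the instance; all names and data are hypothetical.<br />

```python
def satisfies_sf_condition(source_extent, target_extent, o):
    """Check parts (a) and (b) of the (SF)-condition on one finite instance."""
    source_values = [v for _i, v in source_extent]
    target_ids = {i for i, _v in target_extent}
    # (a) each source value refers to at most one identifier (function).
    functional = all(
        sum(1 for i in target_ids if o(v, i)) <= 1 for v in source_values
    )
    # (b) each target identifier is referred to by some source value (surjective).
    surjective = all(any(o(v, j) for v in source_values) for j in target_ids)
    return functional and surjective


# Hypothetical instance: every passport is referenced by exactly one person.
persons = {(1, ("Ann", 10)), (2, ("Bob", 11))}
passports = {(10, "P-1"), (11, "P-2")}


def refers_to(value, ident):
    return value[1] == ident
```

Adding a passport that no person refers to violates part (b), so the passports would then no longer be reachable from the value-identifiable "entrance".<br />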
Lemma 1.25. If $C$ is a weakly value-identifiable class in a schema $\mathcal{S}$, then there exists a<br />
proper value type $I_C$ such that for each instance $\mathcal{D}$ of $\mathcal{S}$ there exists a function $c : ID \to I_C$<br />
such that $c$ is injective on $dom(\mathcal{D}(C))$.<br />
Call $I_C$ a weak value-identification type of the class $C$.<br />
Proof. Let $D = C_0, \dots, C_n = C$ be an (SF)-chain from the value-identifiable class $D$ to $C$<br />
with corresponding references $r_i$ ($i = 1, \dots, n$). If $r_i$ satisfies the (SF)-condition, there exists<br />
a function $c_i : ID \to ID$ such that $j \in dom(\mathcal{D}(C_i)) \Rightarrow (c_i(j), v) \in x_{C_{i-1}}$ for some $v$ with<br />
$o_i(v, j)$ (just take some inverse image of $j$ under the surjective reference function). Since $r_i$<br />
defines a function, $c_i$ is clearly injective. If $C_i$ is a subclass of $C_{i-1}$, then take $c_i = id$.<br />
If $c' : ID \to I_D$ is the function defined by the uniqueness constraint on $D$ and $c'' : ID \to ID$<br />
is the concatenation $c_1 \circ \dots \circ c_n$, then $c = c' \circ c''$ satisfies the required property. □<br />
Definition 1.26. A class $C$ in a schema $\mathcal{S}$ is called weakly value-representable iff there exists<br />
a proper value type $V_C$ such that for each instance $\mathcal{D}$ of $\mathcal{S}$ the following properties hold.<br />
(i) There is a function $c : ID \to V_C$ that is injective on $dom(\mathcal{D}(C))$.<br />
(ii) For each proper value type $V'_C$ and each function $c' : ID \to V'_C$ that is injective on<br />
$dom(\mathcal{D}(C))$ there exists a function $c'' : V_C \to V'_C$ that is unique on $c(dom(\mathcal{D}(C)))$ with<br />
$c' = c'' \circ c$.<br />
We call $V_C$ the weak value-representation type of the class $C$.<br />
Note that the weak value-representation type is unique provided it exists. Again it is easy to<br />
see that value-representability implies weak value-representability. Moreover, due to Lemma<br />
1.25 each weakly value-representable class is also weakly value-identifiable. We shall see that<br />
the converse of this fact is also true.<br />
We want to compute weak value representation types. This can be done using a slight<br />
modification of Algorithm 1.15 that completely ignores uniqueness constraints. We refer to<br />
this algorithm as the blind version of Algorithm 1.15 and to emphasize this, we write $H(C_i)$<br />
instead of $F(C_i)$. Analogous to Lemmata 1.16 and 1.20 the following results hold.<br />
Lemma 1.27. Let $C$ be a class in a schema $\mathcal{S}$ and let $I_C$ be the type $H(C)$ computed by the<br />
blind version of Algorithm 1.15. Then $I_C$ is a proper value type.<br />
Lemma 1.28. Let $\mathcal{D}$ be an instance of the schema $\mathcal{S} = \{C_1, \dots, C_n\}$. Let $C$, $D$ be classes<br />
such that $C$ is weakly value-identifiable, $D$ is value-identifiable and there exists some (SF)-chain<br />
from $D$ to $C$. Let $c : ID \to I_C$ be the function of Lemma 1.25 corresponding to this<br />
chain. Let $c' : ID \to H(D)$ be a function corresponding to the uniqueness constraint on $D$<br />
and the instance $\mathcal{D}$. Then at each stage of the blind version of Algorithm 1.15 there exists a<br />
function $\bar{c} : H(D) \to I_C$ that is unique on $c'(dom(\mathcal{D}(C)))$ with $c = \bar{c} \circ c'$.<br />
Based on these two lemmata we can now state the main result on weak value representability.<br />
Theorem 1.29. Let $C$ be a weakly value-identifiable class in a schema $\mathcal{S}$ and let $V_C$ be the<br />
product of all types $H(D)$, where $D$ is the leading value-identifiable class in some maximal<br />
(SF)-chain corresponding to $C$ and $H(D)$ is the result of the blind version of Algorithm 1.15.<br />
Then $C$ is weakly value-representable with weak value-representation type $V_C$.<br />
Proof. $V_C$ is a proper value type by Lemma 1.27. From Lemmata 1.20 and 1.25 it follows<br />
that there exists a function $c' : ID \to V_C$ that is injective on $dom(\mathcal{D}(C))$.<br />
From Lemma 1.28 it follows that there exists a function $\bar{c} : V_C \to I_C$ that is unique on<br />
$c'(dom(\mathcal{D}(C)))$ with $c = \bar{c} \circ c'$. This proves the Theorem.<br />
□<br />
1.5 The Genericity Problem<br />
The preservation of advantages of relational databases requires generic operations for querying<br />
and for the insertion, deletion and update of single objects. While querying [1, 12, 30, 55] is<br />
per se a set-oriented operation, i.e. it is not necessary to select just one single object, and<br />
hence does not raise any specific problems with object identifiers, things change completely<br />
in case of updates. If an object with a given value is to be updated (or deleted), this is only<br />
defined unambiguously if there does not exist another object with the same value. If more than<br />
one object exists with the same value or more generally with the same value and the same<br />
references to other objects, then the user has to decide whether an update or delete operation<br />
is applied to all these objects, to only one of these objects selected non-deterministically or to<br />
none of them, i.e. to reject the operation. However, it is not possible to specify a priori such<br />
an operation that works in the same way for all objects in all situations. The same applies<br />
to insert operations. Hence the problem of determining in which cases operations for the insertion,<br />
deletion and update of objects can be defined generically.<br />
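The ambiguity can be illustrated by a small Python sketch of a value-based delete.<br />
Extents are again assumed to be sets of (ident, value) pairs, and rejecting an ambiguous request is just one of the policies mentioned above, chosen here only for illustration; the function names are hypothetical.<br />

```python
def objects_with_value(extent, value):
    """Identifiers of all objects in the extent that carry the given value."""
    return {ident for ident, v in extent if v == value}


def delete_by_value(extent, value):
    """Delete the object with the given value, rejecting ambiguous requests."""
    matches = objects_with_value(extent, value)
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} objects carry this value")
    return {(i, v) for i, v in extent if i not in matches}


# Hypothetical extent: two distinct objects share the value "Ann".
persons = {(1, "Ann"), (2, "Bob"), (3, "Ann")}
```

Deleting "Bob" is well defined on this extent, whereas deleting "Ann" is rejected because the value alone does not single out one object.<br />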
Some authors [43] have chosen the solution to abandon generic operations. Others [6, 7, 9]<br />
use identifying values to represent object identity, thus embodying a strict concept of surrogate<br />
keys to avoid the problem. Our approach is different from both solutions in that we use the<br />
concept of hidden abstract identifiers, but at the same time formally characterize those classes<br />
for which unique generic methods for the insertion, deletion and update of single objects exist.<br />
At the same time inclusion and referential integrity have to be enforced. We show that these<br />
classes are the value-representable ones.<br />
1.5.1 Generic Update Methods<br />
The requirement that object identifiers have to be hidden from the user imposes the restriction
on canonical update operations to be value-defined in the sense that the identifier of a new
object has to be chosen by the system, whereas all input and output data have to be values
of proper value types.
We now formally define what we mean by generic update methods. For this purpose regard
an instance $\mathcal{D}$ of a schema $\mathcal{S}$ as a set of objects. For each recursively defined type $T$ let $\bar{T}$
denote the type obtained by replacing each occurrence of a recursive type $T'$ in $T$ by $\mathrm{UNION}(T', ID)$.
Definition 1.30. Let $C$ be a class in a schema $\mathcal{S}$. Generic update methods on $C$ are $\mathit{insert}_C$,
$\mathit{delete}_C$ and $\mathit{update}_C$ satisfying the following properties:
(i) Their input types are proper value types; their output type is the trivial type $\bot$.
(ii) In the case of insert applied to an instance $\mathcal{D}$ there exists some $o :: U_C$ such that
(a) the result is an instance $\mathcal{D}'$ with $o \in \mathcal{D}'$ and $\mathcal{D} \subseteq \mathcal{D}'$, and
(b) if $\bar{\mathcal{D}}$ is any instance with $\mathcal{D} \subseteq \bar{\mathcal{D}}$ and $o \in \bar{\mathcal{D}}$, then $\mathcal{D}' \subseteq \bar{\mathcal{D}}$.
(iii) In the case of delete applied to an instance $\mathcal{D}$ there exists some $o :: U_C$ such that
(a) the result is an instance $\mathcal{D}'$ with $o \notin \mathcal{D}'$ and $\mathcal{D}' \subseteq \mathcal{D}$, and
(b) if $\bar{\mathcal{D}}$ is any instance with $\bar{\mathcal{D}} \subseteq \mathcal{D}$ and $o \notin \bar{\mathcal{D}}$, then $\bar{\mathcal{D}} \subseteq \mathcal{D}'$.
(iv) In the case of update applied to an instance $\mathcal{D} = \mathcal{D}_1 \cup \mathcal{D}_2$, where $\mathcal{D}_2 = \{o\}$ if $o \neq o'$ and
$\mathcal{D}_2 = \emptyset$ otherwise, there exist $o, o' :: U_C$ with $o = (i, v)$ and $o' = (i, v')$ such that
(a) the result is an instance $\mathcal{D}' = \mathcal{D}_1 \cup \mathcal{D}'_2$ with $\mathcal{D}_2 \cap \mathcal{D}'_2 = \emptyset$,
(b) $o \in \mathcal{D}$, $o' \in \mathcal{D}'$, and
(c) if $\bar{\mathcal{D}}$ is any instance with $\mathcal{D}_1 \subseteq \bar{\mathcal{D}}$ and $o' \in \bar{\mathcal{D}}$, then $\mathcal{D}' \subseteq \bar{\mathcal{D}}$.
Canonical update methods on $C$ are $\mathit{insert}'_C$, $\mathit{delete}'_C$ and $\mathit{update}'_C$, defined analogously with
the only difference that their output type is $ID$ and their input type is $\bar{T}$ for some
value type $T$.
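Condition (ii) of Definition 1.30 says that the result of an insert is the least instance that contains the old instance and the new object. The following Python sketch checks this by brute force on a tiny example. It assumes, purely for illustration, that an object is a triple (identifier, value, reference) and that an instance is valid iff every referenced identifier exists; `is_instance`, `least_insert_result` and the two object pools are hypothetical names.

```python
from itertools import combinations

# Sketch only: an object is (identifier, value, reference), where reference is
# either None or the identifier of another object.

def is_instance(db):
    """Assumed validity check: every referenced identifier must exist."""
    ids = {i for (i, _, _) in db}
    return all(ref is None or ref in ids for (_, _, ref) in db)


def least_insert_result(db, o, universe):
    """Return the least valid instance containing db and o, or None."""
    extra = list(universe - db - {o})
    candidates = []
    for k in range(len(extra) + 1):
        for more in combinations(extra, k):
            cand = db | {o} | set(more)
            if is_instance(cand):
                candidates.append(cand)
    # A least element is contained in every other candidate.
    least = [c for c in candidates if all(c <= d for d in candidates)]
    return least[0] if least else None


db = {(1, "a", None)}
o = (2, "b", 3)                                    # o references identifier 3
unique_pool = {(3, "c", None), (4, "d", None)}
ambiguous_pool = {(3, "c", None), (3, "x", None)}  # value of object 3 unknown
```

With `unique_pool` a least result exists and pulls in the referenced object. With `ambiguous_pool` no least instance exists, which mirrors the observation that the values of referenced objects have to be supplied for the insert to be well defined.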
Note that this definition of genericity includes consistency with respect to the implicit constraints
on $\mathcal{S}$. We show that value-representability is necessary and sufficient for the existence
and uniqueness of such operations.
Lemma 1.31. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist canonical update methods
on $C$. Then generic update methods also exist on $C$.
Proof. In the case of insert define $\mathit{insert}_C(V :: V_C) \;==\; I \leftarrow \mathit{insert}'_C(V)$, i.e. call the
corresponding canonical operation and ignore its output. The same argument applies to delete
and update. $\Box$
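The construction in the proof amounts to wrapping the canonical operation and discarding the identifier it returns. A minimal Python sketch, assuming a class is a mutable set of (identifier, value) pairs and that values identify objects; `canonical_insert`, `generic_insert` and the counter `_fresh` are illustrative names only:

```python
import itertools

_fresh = itertools.count(1)  # assumed source of fresh identifiers


def canonical_insert(cls, value):
    """Insert by value and return the (normally hidden) identifier."""
    for (i, v) in cls:
        if v == value:
            return i          # object already present, reuse its identifier
    i = next(_fresh)
    cls.add((i, value))
    return i


def generic_insert(cls, value):
    """Generic variant: same effect on the class, identifier discarded."""
    canonical_insert(cls, value)
    return None
```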
Theorem 1.32. Let $C$ be a class in a schema $\mathcal{S}$ such that there exist generic update methods
on $C$. Then $C$ is value-representable. Moreover, all super- and subclasses of $C$ are also value-representable.
Proof. First consider the delete method with input type $I_C$, which is by definition a proper
value type. We show that it is already a value identification type.
If not, then for all instances $\mathcal{D}$ and all functions $c : T_C \to I_C$ there exist $i, j :: ID$ and
$v, w :: T_C$ with
$$ i \neq j \;\wedge\; (i, v) \in \mathcal{D}(C) \;\wedge\; (j, w) \in \mathcal{D}(C) \;\wedge\; c(v) = c(w) . \tag{1.12} $$
Now take $o = (i, v)$ and $o' = (j, w)$. Then there exist two distinct instances $\mathcal{D}'$ and $\mathcal{D}''$
satisfying the conditions of Definition 1.30(iii) with respect to $o$ and $o'$ respectively, which
contradicts the assumption of a unique generic delete method on $C$.
The same argument applies to the input type $V_C$. Moreover, since insertion requires all
values of referenced objects to be provided, we derive from Algorithm 1.15 and Theorem 1.22
that $V_C$ is a value representation type. Therefore, $C$ is value-representable.
The value-representability of superclasses is implied, since insert (and update) on $C$
involve the corresponding method on each superclass. The value-representability of subclasses
follows from the propagation of update through them. We omit the technical details. $\Box$
1.5.2 Generic Updates in the Case of Value-Representability
Our next goal is to reduce the existence problem of canonical update operations to schemata
without IsA relations.
Lemma 1.33. Let $C$, $D$ be value-representable classes in a schema $\mathcal{S}$ such that $C$ is a subclass
of $D$ with subtype function $g : T_C \to T_D$. Then there exists a function $h : V_C \to V_D$ such that
for each instance $\mathcal{D}$ of $\mathcal{S}$ with corresponding functions $c : T_C \to V_C$ and $d : T_D \to V_D$ we have
$h(c(v)) = d(g(v))$ for all $v \in \mathit{codom}(\mathcal{D}(C))$.
Proof. By Definition 1.10 $c$ is injective on $\mathit{codom}(\mathcal{D}(C))$, hence any continuation $h$ of $d \circ g \circ c^{-1}$
satisfies the required property.
It remains to show that $h$ does not depend on $\mathcal{D}$. Suppose $\mathcal{D}_1$, $\mathcal{D}_2$ are two instances such
that $w = c_1(v_1) = c_2(v_2) \in V_C$, where $c_1, d_1, h_1$ correspond to $\mathcal{D}_1$ and $c_2, d_2, h_2$ correspond to
$\mathcal{D}_2$. Then there exists a permutation $\pi$ on $ID$ such that $v_2 = \pi(v_1)$. We may extend $\pi$ to a
permutation on any type. Since $ID$ has no non-trivial supertype, $g$ permutes with $\pi$, hence
$g(v_2) = \pi(g(v_1))$. From Definition 1.10 it follows that $d_2(g(v_2)) = d_1(g(v_1))$, i.e. $h_2(w) = h_1(w)$. $\Box$
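For a fixed instance, the function $h = d \circ g \circ c^{-1}$ of Lemma 1.33 can be tabulated directly. The Python sketch below does this for a hypothetical subclass of students, whose values are (name, advisor identifier) pairs, and a superclass of persons with values (name,). The functions `c`, `d`, `g` and the data are assumptions made for the example; independence from the instance is not checked here.

```python
def build_h(values, c, d, g):
    """Tabulate h = d . g . c^{-1} on the values occurring in class C."""
    h = {}
    for v in values:
        key = c(v)
        if key in h and h[key] != d(g(v)):
            raise ValueError("h would not be well defined on this instance")
        h[key] = d(g(v))
    return h


advisors = {10: "Codd", 11: "Chen"}        # identifier -> value of the referee
students = [("Ann", 10), ("Bob", 11)]      # values of type T_C


def c(v):
    """T_C -> V_C: replace the advisor identifier by the advisor's value."""
    return (v[0], advisors[v[1]])


def g(v):
    """Subtype function T_C -> T_D: keep only the name."""
    return (v[0],)


def d(v):
    """T_D -> V_D: persons carry no references, so this is the identity."""
    return v


h = build_h(students, c, d, g)
```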
In the following let $\mathcal{S}_0$ be a schema derived from a schema $\mathcal{S}$ by omitting all IsA relations.
Lemma 1.34. Let $C$ be a value-representable class in $\mathcal{S}$ such that all its superclasses and
subclasses $D_1, \ldots, D_n$ are also value-representable. Then canonical update operations exist on
$C$ in $\mathcal{S}$ iff they exist on $C$ and all $D_i$ in $\mathcal{S}_0$.
Proof. By Theorem 1.22 the value-representation type $V_C$ is the result of Algorithm 1.15,
hence $V_C$ does not depend on the inclusion constraints of $\mathcal{S}$. Then we have
$$ I :: ID \leftarrow \mathit{insert}'_C(V :: V_C) \;==\; I \leftarrow \mathit{insert}'_{D_1}(h_1(V)) \,;\, \ldots \,;\, I \leftarrow \mathit{insert}'_{D_n}(h_n(V)) \,;\, I \leftarrow \mathit{insert}'_C(V) , $$
where $h_i : V_C \to V_{D_i}$ is the function of Lemma 1.33 and $\mathit{insert}'_C$ on the right-hand side denotes a canonical insert
on $C$ in $\mathcal{S}_0$. Hence in this case the result for the insert follows by structural induction on the
IsA-hierarchy.
If the subtype function $g$ required in Lemma 1.33 does not exist for some superclass $D$,
then simply add $V_D$ to the input type. We omit the details for this case.
The arguments for delete and update are analogous. The value-representability of subclasses
is required for the update case. $\Box$
From now on we use a global operation $\mathit{NewId}$ that produces a fresh identifier $I :: ID$. This
can be represented as a method using projection.
Lemma 1.35. Let $C$ be a value-representable class in $\mathcal{S}_0$. Then there exist unique quasi-canonical
update operations on $C$.
Proof. Let $r_i : C_i$ ($i = 1, \ldots, n$) denote the references in the structure expression of $C$. If $V$ is
a value of type $V_C$, then there exist values $V_{ij} :: V_{C_i}$ ($i = 1, \ldots, n$, $j = 1, \ldots, k_i$) occurring in $V$.
Let $\bar{V} = \{V_{ij} / J_{ij} \mid i = 1, \ldots, n,\ j = 1, \ldots, k_i\}.V$ denote the value of type $T_C$ that results from
replacing each $V_{ij}$ by some $J_{ij} :: ID$. Moreover, for $I :: ID$ let
$$ V_{ij}^{(I)} = \begin{cases} \{V / I\}.V_{ij} & \text{if } V \text{ occurs in } V_{ij} \\ V_{ij} & \text{else} \end{cases} $$
Then the canonical insert operation can be defined as follows:
$$
\begin{array}{l}
I :: ID \leftarrow \mathit{insert}'_C(V :: V_C) \;== \\
\quad \exists I' :: ID,\, V' :: T_C .\ (\mathit{Pair}(I', V') \in C \wedge c(V') = V) \rightarrow I := I' \\
\quad \boxtimes\ \exists V' :: T_C .\ V = V' \rightarrow I \leftarrow \mathit{NewId} \,;\, x_C := x_C \cup \{(I, \bar{V})\} \\
\quad \boxtimes\ I \leftarrow \mathit{NewId} \,;\, J_{11} \leftarrow \mathit{insert}'_{C_1}(V_{11}^{(I)}) \,;\, \ldots \,;\, J_{n k_n} \leftarrow \mathit{insert}'_{C_n}(V_{n k_n}^{(I)}) \,;\, x_C := x_C \cup \{(I, \bar{V})\}
\end{array}
$$
It remains to show that this operation is indeed canonical. Apply the method to some instance
$\mathcal{D}$. If there already exists some $o = (I', V')$ in $C$ with $c(V') = V$, the result is $\mathcal{D}' = \mathcal{D}$ and
the requirements of Definition 1.30 are trivially satisfied. Otherwise let $o = (I, \bar{V})$. If $\bar{\mathcal{D}}$
is an instance with $\mathcal{D} \subseteq \bar{\mathcal{D}}$ and $o \in \bar{\mathcal{D}}$, we have $J_{ij} \in \mathit{dom}(C_i)$ for all $i = 1, \ldots, n$, $j = 1, \ldots, k_i$,
since $\bar{\mathcal{D}}$ satisfies the referential constraints. Hence $\bar{\mathcal{D}}$ contains the distinguished objects
corresponding to the involved quasi-canonical operations $\mathit{insert}'_{C_i}$. By induction on the length
of call-sequences, $\mathcal{D}_{ij} \subseteq \bar{\mathcal{D}}$ for all $i = 1, \ldots, n$, $j = 1, \ldots, k_i$, where $\mathcal{D}_{ij}$ is the result of
$J_{ij} \leftarrow \mathit{insert}'_{C_i}(V_{ij}^{(I)})$. Hence $\mathcal{D}' = \bigcup_{ij} \mathcal{D}_{ij} \cup \{o\} \subseteq \bar{\mathcal{D}}$. The uniqueness follows from the uniqueness of
$V_C$.
The definitions and proofs for delete and update are analogous. $\Box$
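The recursive structure of the canonical insert can be sketched in Python under simplifying assumptions: references are acyclic, a value representation is a pair (data, tuple of (class name, value representation of the referee)), and the database is a dictionary from class names to dictionaries of objects. The names `new_id`, `rep` and `canonical_insert` are illustrative; the substitution $V_{ij}^{(I)}$ needed for cyclic references is deliberately left out.

```python
import itertools

_ids = itertools.count(1)


def new_id():
    """Stand-in for the global NewId operation."""
    return next(_ids)


def rep(db, cls, ident):
    """Value representation of an object: data plus representations of referees."""
    data, refs = db[cls][ident]
    return (data, tuple((rc, rep(db, rc, j)) for (rc, j) in refs))


def canonical_insert(db, cls, value):
    """Insert by value representation and return the object's identifier."""
    for ident in db.setdefault(cls, {}):
        if rep(db, cls, ident) == value:
            return ident                      # first branch: object exists
    data, ref_values = value
    ident = new_id()
    # Recursively insert (or find) every referenced object, collecting the J_ij.
    refs = tuple((rc, canonical_insert(db, rc, rv)) for (rc, rv) in ref_values)
    db[cls][ident] = (data, refs)
    return ident
```

Inserting the same value representation twice returns the same identifier, and a referenced object that already exists is shared rather than duplicated.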
Theorem 1.36. Let $C$ be a value-representable class in a schema $\mathcal{S}$ such that all its super- and
subclasses are also value-representable. Then there exist unique generic update operations
on $C$.
Proof. By Lemma 1.31 and Lemma 1.34 it is sufficient to show the existence of canonical
update operations on $C$ and all its super- and subclasses in the schema $\mathcal{S}_0$. This follows from
Lemma 1.35. $\Box$
In [50] it has been shown how linguistic reflection [56] can be exploited to generate the generic
update operations for value-representable classes in an OODM schema.
1.6 The Consistency Problem<br />
In general a database may be considered as a triplet $(\mathcal{S}, \mathcal{O}, \mathcal{C})$, where $\mathcal{S}$ defines a structure,
$\mathcal{O}$ denotes a collection of state-changing operations and $\mathcal{C}$ is a set of constraints. Then the
consistency problem is to guarantee that each specified operation $o \in \mathcal{O}$ will never violate any
constraint $\mathcal{I} \in \mathcal{C}$. Integrity enforcement aims at the derivation of a new set $\mathcal{O}'$ of operations
with $|\mathcal{O}'| = |\mathcal{O}|$ such that $(\mathcal{S}, \mathcal{O}', \mathcal{C})$ satisfies this property.
Suppose we are given a database schema $\mathcal{S}$ and a static integrity constraint $\mathcal{I}$ on that
schema. Regard $\mathcal{I}$ as a logical formula defined on $\mathcal{S}$. Consistency requires that only those
instances $\mathcal{D}$ of $\mathcal{S}$ are allowed that satisfy $\mathcal{I}$. Call the set of such instances $\mathit{sat}(\mathcal{S}, \mathcal{I})$. Each
transaction is a database transformation. Such a database transformation $T$ takes an arbitrary
instance $\mathcal{D}$ and possibly some input values $v_1, \ldots, v_n$ and produces a new instance $\mathcal{D}'$ and
possibly some output values $v'_1, \ldots, v'_m$. $T$ is consistent with respect to $\mathcal{I}$ iff for each
$\mathcal{D} \in \mathit{sat}(\mathcal{S}, \mathcal{I})$ we also have $\mathcal{D}' \in \mathit{sat}(\mathcal{S}, \mathcal{I})$.
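On a finite set of instances this notion of consistency can be checked by enumeration. The sketch below assumes an instance is a pair (employees, departments) and uses a referential constraint as $\mathcal{I}$; `sat`, `is_consistent`, `hire` and `hire_repairing` are hypothetical names chosen for the example.

```python
def sat(instances, constraint):
    """All instances that satisfy the constraint."""
    return [d for d in instances if constraint(d)]


def is_consistent(transaction, instances, constraint):
    """T is consistent iff it maps every instance in sat into sat."""
    return all(constraint(transaction(d)) for d in sat(instances, constraint))


def referential(d):
    """Every employee must belong to an existing department."""
    emps, depts = d
    return all(dept in depts for (_, dept) in emps)


def hire(d):
    """Adds an employee without looking at the departments."""
    emps, depts = d
    return (emps | {("Ann", "CS")}, depts)


def hire_repairing(d):
    """Adds the employee and, if necessary, the department as well."""
    emps, depts = d
    return (emps | {("Ann", "CS")}, depts | {"CS"})


instances = [
    (frozenset(), frozenset()),
    (frozenset(), frozenset({"CS"})),
    (frozenset({("Bob", "CS")}), frozenset({"CS"})),
    (frozenset({("Bob", "CS")}), frozenset()),   # violates the constraint
]
```

Here `hire` is not consistent, because it turns the empty instance into one that violates the constraint, while `hire_repairing` is.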
Classically consistency is maintained at run-time by transaction monitors. Whenever an
inconsistent instance is produced, the transaction that caused the inconsistency is rolled
back. This "everything or nothing" approach has been criticized, since it causes enormous run-time
overhead for consistency checking and rollback. Moreover, it leaves the burden of writing
consistent transactions to the user. In principle the first problem vanishes if verification
techniques are used at design time [44, 57, 58], whereas the second one still remains.
As an alternative a lot of attention has been paid to integrity enforcement. In most cases
the envisioned solution is an active database [18, 27, 59, 64, 65], where production rules are
used to repair inconsistencies instead of rolling back. Although this is sometimes coupled
with design-time (or even run-time) analysis of the rules [18, 27, 33, 63], the approach is not
always successful. Moreover, a satisfactory theory for rule triggering systems with respect to the
integrity enforcement problem is still missing. Therefore, we favour an operational approach
[51, 48, 52, 53], which aims at replacing inconsistent database transactions by consistent
specializations.
1.6.1 Greatest Consistent Specializations<br />
In general non-deterministic partial state transitions $S$ as used in our method language can
be described by a subset of $\mathcal{D} \times \mathcal{D}_\bot$, where $\mathcal{D}$ denotes the set of possible states and
$\mathcal{D}_\bot = \mathcal{D} \cup \{\bot\}$, where $\bot$ is a special symbol used to indicate non-termination. It can be shown
[20, 41, 46, 44] that this is equivalent to defining two predicate transformers $wp(S)$ and $wlp(S)$
associated with $S$ satisfying the pairing condition $wp(S)(\mathcal{R}) \Leftrightarrow wlp(S)(\mathcal{R}) \wedge wp(S)(true)$ and
the universal conjunctivity of $wlp(S)$, i.e.
$$ wlp(S)(\forall i \in I .\ \mathcal{R}_i) \Leftrightarrow \forall i \in I .\ wlp(S)(\mathcal{R}_i) . $$
The predicate transformers assign to some postcondition $\mathcal{R}$ the weakest (liberal) precondition
of $S$ to establish $\mathcal{R}$. Clearly, pre- and postconditions are $X$-constraints. Informally these
conditions can be characterized as follows:
- $wlp(S)(\mathcal{R})$ characterizes those initial states such that all terminating executions of $S$ will
reach a final state characterized by $\mathcal{R}$, provided $S$ is defined in that initial state, and
- $wp(S)(\mathcal{R})$ characterizes those initial states such that all executions of $S$ terminate and
will reach a final state characterized by $\mathcal{R}$, provided $S$ is defined.
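For a finite state space both predicate transformers can be computed directly from the relational description. The Python sketch below assumes that a transition is a set of (state, result) pairs, that `None` stands for $\bot$, and that predicates are sets of states; states without any transition satisfy both transformers vacuously, which is one reading of the proviso "provided $S$ is defined". The example relation is made up.

```python
BOTTOM = None  # stands for the non-termination symbol


def wlp(S, R, states):
    """Initial states from which every terminating execution ends in R."""
    return {s for s in states
            if all(t in R for (u, t) in S if u == s and t is not BOTTOM)}


def wp(S, R, states):
    """Initial states from which every execution terminates and ends in R."""
    return {s for s in states
            if all(t is not BOTTOM and t in R for (u, t) in S if u == s)}


states = {0, 1, 2, 3}
# State 0 has two terminating results, state 1 may fail to terminate,
# and S is undefined in states 2 and 3.
S = {(0, 1), (0, 2), (1, BOTTOM), (1, 2)}
```

In this model the pairing condition reads `wp(S, R, states) == wlp(S, R, states) & wp(S, states, states)`, where the full state set plays the role of the predicate true.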
The use of these predicate transformers for the definition of language semantics is usually
called "axiomatic semantics". Based on this, consistency and specialization can be formally
defined and used for the formal description of the consistency problem. For this purpose we
define "extended operations" and therefore need to know for each operation $S$ the set of
classes $\mathcal{S}'$ such that $S$ neither reads nor changes the class variables $x_C$ with $C \notin \mathcal{S}'$. In
this case we call $S$ an $\mathcal{S}'$-operation. We omit the formal definition [41, 51].
Definition 1.37. Let $\mathcal{S}$ be a schema, $\mathcal{I}$ a constraint and $S$, $T$ methods defined on $\mathcal{S}_1 \subseteq \mathcal{S}$
and $\mathcal{S}_2 \subseteq \mathcal{S}$ respectively with $\mathcal{S}_1 \subseteq \mathcal{S}_2$.
(i) $S$ is consistent with respect to $\mathcal{I}$ iff $\mathcal{I} \Rightarrow wlp(S)(\mathcal{I})$ holds.
(ii) $T$ specializes $S$ (denoted $T \sqsubseteq S$) iff $wp(S)(true) \Rightarrow wp(T)(true)$ and $wlp(S)(\mathcal{R}) \Rightarrow wlp(T)(\mathcal{R})$ hold for
all constraints $\mathcal{R}$ with free variables $x_C$ such that $C \in \mathcal{S}_1$.
Hence the following definition of a greatest consistent specialization:
Definition 1.38. Let $\mathcal{S}$ be a schema, $\mathcal{I}$ a constraint and $S$ a method defined on $\mathcal{S}_1 \subseteq \mathcal{S}$. A
method $S_\mathcal{I}$ is a Greatest Consistent Specialization (GCS) of $S$ with respect to $\mathcal{I}$ iff
(i) $S_\mathcal{I} \sqsubseteq S$,
(ii) $S_\mathcal{I}$ is consistent with respect to $\mathcal{I}$ and
(iii) for each method $T$ satisfying properties (i) and (ii) (instead of $S_\mathcal{I}$) we have $T \sqsubseteq S_\mathcal{I}$.
If only properties (i) and (ii) are satisfied, we simply talk of a consistent specialization.
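In a small relational model the two definitions can be explored by brute force. The sketch below is not the GCS construction of [48, 51]; it simply enumerates all sub-relations of a transition and keeps those that are consistent specializations and greatest among them. It assumes the same finite model as before (sets of (state, result) pairs, `None` for $\bot$, predicates as sets of states) and ignores the distinction between an operation and its extensions to larger schemata.

```python
from itertools import chain, combinations

BOTTOM = None  # non-termination marker


def powerset(xs):
    xs = list(xs)
    return [set(c) for c in chain.from_iterable(
        combinations(xs, k) for k in range(len(xs) + 1))]


def wlp(S, R, states):
    return {s for s in states
            if all(t in R for (u, t) in S if u == s and t is not BOTTOM)}


def wp(S, R, states):
    return {s for s in states
            if all(t is not BOTTOM and t in R for (u, t) in S if u == s)}


def consistent(S, I, states):
    """Definition 1.37(i): I implies wlp(S)(I)."""
    return I <= wlp(S, I, states)


def specializes(T, S, states):
    """Definition 1.37(ii), checked for every postcondition R."""
    return (wp(S, states, states) <= wp(T, states, states) and
            all(wlp(S, R, states) <= wlp(T, R, states) for R in powerset(states)))


def greatest_consistent_specializations(S, I, states):
    """Brute-force search among the sub-relations of S (Definition 1.38)."""
    cands = [T for T in powerset(S)
             if specializes(T, S, states) and consistent(T, I, states)]
    return [G for G in cands if all(specializes(T, G, states) for T in cands)]


states = {0, 1, 2}
I = {0, 1}                                  # the constraint as a set of states
S = {(0, 1), (0, 2), (1, 2), (2, 0)}        # may leave I from states 0 and 1
```

For this example exactly one greatest consistent specialization is found: it drops the two transitions that lead from a consistent state into the inconsistent state 2 and keeps everything else.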
Let us first state the main results from [48].
Theorem 1.39. Let $\mathcal{S}$ be a schema, $\mathcal{I}$, $\mathcal{J}$ constraints and $S$ a method defined on $\mathcal{S}_1 \subseteq \mathcal{S}$.
(i) There exists a greatest consistent specialization $S_\mathcal{I}$ of $S$ with respect to $\mathcal{I}$. Moreover, $S_\mathcal{I}$
is uniquely determined (up to semantic equivalence) by $S$ and $\mathcal{I}$.
(ii) The GCSs $(S_\mathcal{I})_\mathcal{J}$ and $S_{(\mathcal{I} \wedge \mathcal{J})}$ coincide on initial states satisfying $\mathcal{I} \wedge \mathcal{J}$.
The proof of these results heavily uses predicate transformers and is therefore omitted here.
In [51] it has been shown that a GCS, which is in general non-deterministic, can be written
as a finite choice of maximal quasi-deterministic specializations (MQCSs), where quasi-determinism
means determinism up to the selection of some values. In most cases this value
selection can be shifted to the input, but the selection of object identifiers should be left to
the system.
Next, we formally define quasi-determinism and then present the main result from [51],
an algorithm for the computation of MQCSs.
Definition 1.40. A method $S$ is called quasi-deterministic iff there exist types $T_1, \ldots, T_n$
such that $S$ is semantically equivalent to
$$ y_1 :: T_1 \mid \ldots \mid y_n :: T_n \mid S' , $$
where $S'$ is a deterministic method.
Algorithm 1.41.
In: An $X$-operation $S$ and constraints $\mathcal{I}_1, \ldots, \mathcal{I}_n$ defined on extensions $Y_1, \ldots, Y_n$ of $X$.
Let $\ell$ be the list of the constraints. As long as $\ell \neq nil$ proceed as follows:
1. Set $S'_\mathcal{I} = S$.
2. Choose and remove one constraint $\mathcal{I}_i$ from $\ell$.
3. Check whether $S'_\mathcal{I}$ is $\mathcal{I}_i$-reduced. If not, stop with no result, otherwise continue.
4. Make $S'_\mathcal{I}$ $\boxtimes$-free by replacing each occurring $S_1 \boxtimes S_2$ by $S_1 \,\Box\, wlp(S_1)(false) \rightarrow S_2$.
5. Replace each basic assignment in $S'_\mathcal{I}$ by some (subsumption-free) MQCS with respect to
$\mathcal{I}_i$.
6. Compute $P(S_\mathcal{I})$ as
$$ P(S_\mathcal{I}) \equiv \{z_1/x_1, \ldots, z_n/x_n\}.wlp(\{x_1/z_1, \ldots, x_n/z_n\}.S_\mathcal{I})(\neg wlp(S)(z_1 \neq x_1 \vee \ldots \vee z_n \neq x_n)) , $$
where the $x_i$ are the class variables occurring in $\mathcal{I}$ or in $S$ and the $z_i$ are used as a disjoint
copy of these.
7. Set $S = P(S_\mathcal{I}) \rightarrow S'_\mathcal{I}$.
Set $S'_\mathcal{I} = S$.
Out: An operation $\mathcal{I} \rightarrow S'_\mathcal{I}$, where $S'_\mathcal{I}$ is a (subsumption-free) MQCS of the original $S$ with
respect to the conjunction $\mathcal{I}$ of the constraints. $\Box$
An extension of the GCS algorithm to compute all (subsumption-free) MQCSs is easy.
It has been shown in [51] that Algorithm 1.41 is correct. However, it depends on checking
a very technical condition, $\mathcal{I}$-reducedness. We omit this condition here.
1.6.2 Enforcing Integrity in the OODM
Since Algorithm 1.41 allows integrity enforcement to be reduced to the case of assignments,
we may restrict ourselves to the case of a single explicit constraint in addition to the trivial
uniqueness constraints that are required to assure value-representability and that are used to
construct generic update operations. In the following we describe MQCSs with respect to the
constraints introduced in Definition 1.5.
Inclusion Constraints. Let $\mathcal{I}$ be an inclusion constraint on $C_1$, $C_2$ defined via $c_i : T_{C_i} \to T$
($i = 1, 2$). Then each insertion into $C_1$ requires an additional insertion into $C_2$, whereas a
deletion on $C_2$ requires a deletion on $C_1$. An update on one of the $C_i$ requires an additional
update on the other class.
Let us first concentrate on the insert operation on $C_1$ (for an insert on $C_2$ there is nothing
to do). Insertion into $C_1$ requires an input value of type $V_{C_1}$; an additional insert on $C_2$ then
requires an input value of type $V_{C_2}$. However, these input values are not independent, because
the corresponding values of type $T_{C_1}$ and $T_{C_2}$ must satisfy the general inclusion constraint.
Therefore we first show that the constraint can be "lifted" to a constraint on the value-representation
types. Note that this is similar to the handling of IsA-constraints in Lemma
1.33.
Lemma 1.42. Let $C_1$, $C_2$ be classes, $c_i : T_{C_i} \to T$ functions and let $V_{C_i}$ be the value-representation
type of $C_i$ ($i = 1, 2$). Then there exist functions $f_i : V_{C_i} \to T$ such that for all database
instances $\mathcal{D}$
$$ f_1(d_1^{\mathcal{D}}(v_1)) = f_2(d_2^{\mathcal{D}}(v_2)) \;\Leftrightarrow\; c_1(v_1) = c_2(v_2) \tag{1.13} $$
holds for all $v_i \in \mathit{codom}(\mathcal{D}(x_{C_i}))$ ($i = 1, 2$). Here $d_i^{\mathcal{D}} : T_{C_i} \to V_{C_i}$ denotes the function used
in the uniqueness constraint on $C_i$ with respect to $\mathcal{D}$.
Proof. Due to Definition 1.10 we may define $f_i = c_i \circ (d_i^{\mathcal{D}})^{-1}$ on $d_i^{\mathcal{D}}(\mathit{codom}(\mathcal{D}(x_{C_i})))$ ($i = 1, 2$).
Then we have to show that this definition is independent of the instance $\mathcal{D}$. Suppose $\mathcal{D}_1$, $\mathcal{D}_2$
are two different instances. Then there exists a permutation $\pi$ on $ID$ such that $d_i^{\mathcal{D}_2} = d_i^{\mathcal{D}_1} \circ \pi$,
where $\pi$ is extended to $T_{C_i}$. Then
$$ c_i \circ (d_i^{\mathcal{D}_2})^{-1} = c_i \circ \pi^{-1} \circ (d_i^{\mathcal{D}_1})^{-1} = \pi^{-1} \circ c_i \circ (d_i^{\mathcal{D}_1})^{-1} , $$
since $c_i$ permutes with $\pi^{-1}$. Then the stated equality follows. $\Box$
Now let $V_{C_1,C_2} = V_{C_1} \times V_{C_2}$ and define the new insert operation on $C_1$ by
$$ (\mathit{insert}_{C_1})_{\mathcal{I}}((v_1, v_2) :: V_{C_1,C_2}) \;==\; f_1(v_1) = f_2(v_2) \rightarrow \mathit{insert}_{C_1}(v_1) \,;\, \mathit{insert}_{C_2}(v_2) , \tag{1.14} $$
where the $f_i$ are the functions of Lemma 1.42. Note that there is no need to require $C_1 \neq C_2$.
Delete and update operations can be defined analogously.
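The guarded insert (1.14) can be mimicked in Python as follows. The sketch assumes that classes are sets of value representations and that a failed guard is reported as an exception; the employee and department data and the functions `f1`, `f2` are made up for the example.

```python
def insert_with_inclusion(c1, c2, v1, v2, f1, f2):
    """Insert v1 into C1 and v2 into C2, guarded by f1(v1) == f2(v2)."""
    if f1(v1) != f2(v2):
        raise ValueError("guard failed: inputs violate the inclusion constraint")
    return c1 | {v1}, c2 | {v2}


def f1(employee):
    """Department name an employee refers to."""
    return employee[1]


def f2(department):
    """Name of a department."""
    return department[0]


employees = {("Ann", "CS")}
departments = {("CS", "Cottbus")}
new_emps, new_depts = insert_with_inclusion(
    employees, departments, ("Bob", "Math"), ("Math", "Clausthal"), f1, f2)
```

Because both inputs are supplied together, the inclusion constraint holds again after the combined insert.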
Functional and Uniqueness Constraints. Now let $\mathcal{I}$ be a functional constraint on $C$
defined via $c_1 : T_C \to T_1$ and $c_2 : T_C \to T_2$. In this case nothing is required for the delete
operation, whereas for inserts (and updates) we have to add a postcondition. Moreover, let
$c^{\mathcal{D}} : T_C \to V_C$ denote the function associated with the value-representability of $C$ and the
database instance $\mathcal{D}$ and let all other notations be as before. Let us again concentrate on the
insert operation. Let $\mathit{insert}'_C$ denote the canonical insert on $C$. Then we define
$$
\begin{array}{l}
(\mathit{insert}_C)_{\mathcal{I}}(V :: V_C) \;== \\
\quad I :: ID \mid I \leftarrow \mathit{insert}'_C(V) \,; \\
\quad V' :: T_C \mid (I, V') \in x_C \rightarrow \\
\quad (\forall J :: ID,\, W :: T_C .\ ((J, W) \in x_C \wedge c_1(W) = c_1(V') \Rightarrow c_2(W) = c_2(V')) \rightarrow \mathit{skip}) .
\end{array} \tag{1.15}
$$
Note that in this case there is no change of input type. For delete and update operations we
have analogous definitions.
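Operationally, (1.15) performs the insert and then admits the result only if the functional constraint holds for the new object. A minimal Python sketch, assuming a class is a set of values and that an unsatisfied guard is reported as an exception; the staff data and the functions `dept` and `building` are illustrative:

```python
def insert_with_fd(cls, value, c1, c2):
    """Insert value, then require c1 -> c2 to hold; otherwise reject."""
    candidate = cls | {value}
    for w in candidate:
        if c1(w) == c1(value) and c2(w) != c2(value):
            raise ValueError("rejected: functional constraint would be violated")
    return candidate


def dept(v):
    return v[1]


def building(v):
    return v[2]


staff = {("Ann", "CS", "B1")}   # constraint: the department determines the building
```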
A uniqueness constraint defined via $c_1 : T_C \to T_1$ is equivalent to a functional constraint
defined via $c_1$ and $c_2 = id : T_C \to T_C$ plus the trivial uniqueness constraint. Since trivial
uniqueness constraints are already enforced by the canonical update operations, there is no
need to handle arbitrary uniqueness constraints separately.
Exclusion Constraints. The handling of exclusion constraints is analogous to the handling
of inclusion constraints. This means that an insert (update) on one class may cause a delete
on the other, whereas delete operations remain unchanged.
We concentrate again on the insert operation. Let $\mathcal{I}$ be an exclusion constraint on $C_1$ and
$C_2$ defined via $c_i : T_{C_i} \to T$ ($i = 1, 2$). Let $f_i : V_{C_i} \to T$ denote the functions from Lemma
1.42. Then we define a new insert operation on $C_1$ by
$$
\begin{array}{l}
(\mathit{insert}_{C_1})_{\mathcal{I}}(V :: V_{C_1}) \;== \\
\quad \mathit{insert}_{C_1}(V) \,; \\
\quad \mu S .\ ((I :: ID \mid V' :: T_{C_2} \mid (I, V') \in x_{C_2} \wedge c_2(V') = f_1(V) \rightarrow \mathit{delete}_{C_2}(V') \,;\, S) \boxtimes \mathit{skip}) .
\end{array} \tag{1.16}
$$
For delete and update operations an analogous result holds.
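The recursion in (1.16) deletes conflicting objects from $C_2$ one at a time until none is left and then falls through to skip. In Python this can be sketched as a loop, assuming classes are sets of values; the student and staff data and the function `staff_name` are made up for the example.

```python
def insert_with_exclusion(c1, c2, value, f1, key2):
    """Insert value into C1, then delete every clashing object from C2."""
    c1 = c1 | {value}
    c2 = set(c2)
    while True:
        clash = next((w for w in c2 if key2(w) == f1(value)), None)
        if clash is None:
            break                 # no clash left: the choice falls through to skip
        c2.remove(clash)          # delete one clashing object and repeat
    return c1, c2


def identity(v):
    return v


def staff_name(w):
    return w[0]


students = {"Ann"}
staff = {("Bob", "R1"), ("Eve", "R2")}
new_students, new_staff = insert_with_exclusion(
    students, staff, "Bob", identity, staff_name)
```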
Theorem 1.43. The methods $S_\mathcal{I}$ in (1.14), (1.15) and (1.16) are MQCSs of generic insert
methods with respect to inclusion, functional and exclusion constraints respectively.
The proof involves detailed use of predicate transformers and is therefore omitted here [48, 49].
Analogous results hold for delete and update.
1.7 Conclusion<br />
In this paper we describe first results concerning the formal foundations of object oriented
database concepts. For this purpose we introduced a formal object oriented datamodel (OODM)
with the following characteristics.
- Objects are considered to be abstractions of real world entities, hence they have an immutable
identity. This identity is encoded by abstract identifiers that are assumed to form
some type $ID$. This identifier concept eases the modelling of shared data and cyclic references;
however, it does not relieve us of the problem of providing unique identification
mechanisms for objects in a database.
- In our approach there is not only one value of a given type that is associated with an
object. Instead we allow several values of possibly different types to belong to an
object, and even this collection of types may change.
- Classes are used to structure objects. At each time a class corresponds to a collection of
objects with values of the same type and references to objects in a fixed set of classes.
Inheritance is based on IsA relations that express an inclusion at each time of the sets of
objects. Moreover, referential integrity is supported.
- We associate with each class a collection of methods. Methods are specified by guarded
commands, hence the method language is computationally complete. In order to allow
the handling of identifiers that are always hidden from the user as well as user-accessible
transactions, a hiding operator on methods is introduced. Generic update operations, i.e.
insert, delete and update on a class, are assumed to be automatically derived whenever
this is possible.
- We associate integrity constraints with schemata. Certain kinds of such constraints can be
obtained by generalizing corresponding constraints in the relational model. We assume
that methods are automatically changed in order to enforce integrity.
On the basis of this formal OODM we study the problems of identification, genericity and<br />
integrity. We show that the unique identification of objects in a class requires the class to be<br />
value-representable.<br />
An advantage of database systems is that they provide generic update operations. We show that<br />
the unique existence of such generic methods also requires value-representability. However, in<br />
this case referential and inclusion integrity can be enforced automatically. This result can be<br />
generalized with respect to distinguished classes of user-defined integrity constraints. Given<br />
some arbitrary method S and some constraint I there exists a greatest consistent specialization<br />
(GCS) $S_I$ of S with respect to I. Such a GCS behaves nicely in that it is compatible with the<br />
conjunction of constraints. For the GCS construction of a user-defined transaction we apply<br />
the GCS algorithm developed in [48, 51, 52, 53].<br />
This work on mathematical foundations of OODB concepts is not yet completed. Many<br />
problems are still open and are the subject of current investigations and future research.<br />
– In our approach classes are sets. What are other bulk types? Does it make sense to abstract<br />
from classes in this way?<br />
– The problem of updatable views is still open.<br />
– Our approach to genericity only handles the worst case expressed by the value representation<br />
type. We assume that polymorphism will help to generalize our results to the general<br />
case. Moreover, we must integrate communication aspects at least with respect to the<br />
user.<br />
– The usual axiomatic semantics for guarded commands abstracts from an execution model.<br />
All results are true for semantic equivalence classes. However, we also need optimization,<br />
especially with respect to the derived GCSs.<br />
– We only presented a formal OODM without looking into methodological aspects such as<br />
the characterization of good designs.<br />
We express the hope that others will also contribute to solving open problems in OODB foundations<br />
or to the implementation of more sophisticated object oriented database languages on<br />
a sound mathematical basis.<br />
References for Chapter 1<br />
1. S. Abiteboul: Towards a deductive object-oriented database language, Data & Knowledge Engineering, vol. 5, 1990, pp. 263–287<br />
2. S. Abiteboul, R. Hull: IFO: A Formal Semantic Database Model, ACM ToDS, vol. 12 (4), December 1987, pp. 525–565<br />
3. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD, Portland Oregon, 1989, pp. 159–173<br />
4. A. Albano, G. Ghelli, R. Orsini: Types for Databases: The Galileo Experience, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 27–37<br />
5. A. Albano, A. Dearle, G. Ghelli, C. Marlin, R. Morrison, R. Orsini, D. Stemple: A Framework for Comparing Type Systems for Database Programming Languages, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, 1990<br />
6. A. Albano, G. Ghelli, R. Orsini: Objects and Classes for a Database Programming Language, FIDE technical report 91/16, 1991<br />
7. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented Database Programming Language, in A. Sernadas (Ed.): Proc. VLDB 91, Barcelona 1991<br />
8. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented Database System Manifesto, Proc. 1st DOOD, Kyoto 1989<br />
9. F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel, S. Gamerman, C. Lecluse, P. Pfeffer, P. Richard, F. Velez: The Design and Implementation of O2, an Object-Oriented Database System, Proc. of the ooDBS II workshop, Bad Münster, FRG, September 1988<br />
10. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, pp. 370–395<br />
11. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol. 5 (4), 1990, pp. 353–382<br />
12. C. Beeri, Y. Kornatzky: Algebraic Optimization of Object-Oriented Query Languages, in S. Abiteboul, P. C. Kanellakis (Eds.): Proc. ICDT '90, Springer LNCS 470, pp. 72–88<br />
13. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92<br />
14. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM Computing Surveys, vol. 17 (4), pp. 471–522<br />
15. L. Cardelli: Typeful Programming, Digital Systems Research Center Reports 45, DEC SRC Palo Alto, May 1989<br />
16. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. ACM SIGMOD 88<br />
17. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the Workshop on Database Programming Languages, Roscoff, France, September 1987<br />
18. S. Ceri, J. Widom: Deriving Production Rules for Constraint Maintenance, Proc. 16th Conf. on VLDB, Brisbane (Australia), August 1990, pp. 566–577<br />
19. A. Dearle, R. Connor, F. Brown, R. Morrison: Napier88 - A Database Programming Language?, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 10–26<br />
20. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer-Verlag, 1989<br />
21. H.-D. Ehrich, M. Gogolla, U. Lipeck: Algebraische Spezifikation abstrakter Datentypen, Teubner-Verlag, 1989<br />
22. H.-D. Ehrich, A. Sernadas: Fundamental Object Concepts and Constructors, in G. Saake, A. Sernadas (Eds.): Information Systems – Correctness and Reusability, TU Braunschweig, Informatik Berichte 91-03, 1991<br />
23. H. Ehrig, B. Mahr: Fundamentals of Algebraic Specification, vol. 1, Springer 1985<br />
24. L. Fegaras, T. Sheard, D. Stemple: The ADABTPL Type System, in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and Computational Sciences, Research Report CS/90/3, pp. 45–56<br />
25. L. Fegaras, T. Sheard, D. Stemple: Uniform Traversal Combinators: Definition, Use and Properties, University of Massachusetts, 1992<br />
26. D. Fishman, D. Beech, H. Cate, E. Chow et al.: IRIS: An Object-Oriented Database Management System, ACM ToIS, vol. 5 (1), January 1987<br />
27. P. Fraternali, S. Paraboschi, L. Tanca: Automatic Rule Generation for Constraint Enforcement in Active Databases, in U. Lipeck (Ed.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects "MODELLING DATABASE DYNAMICS", Volkse (Germany), October 19-22, 1992<br />
28. G. Gottlob, G. Kappel, M. Schrefl: Semantics of Object-Oriented Data Models – The Evolving Algebra Approach, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991<br />
29. M. Hammer, D. McLeod: Database Description with SDM: A Semantic Database Model, J. ACM, vol. 31 (3), 1984, pp. 351–386<br />
30. A. Heuer, P. Sander: Classifying Object-Oriented Results in a Class/Type Lattice, in B. Thalheim et al. (Ed.): Proceedings MFDBS 91, Springer LNCS 495, pp. 14–28<br />
31. R. Hull, R. King: Semantic Database Modeling: Survey, Applications and Research Issues, ACM Computing Surveys, vol. 19 (3), September 1987<br />
32. R. Hull, M. Yoshikawa: ILOG: Declarative Creation and Manipulation of Object Identifiers, in Proc. 16th VLDB, Brisbane (Australia), 1990, pp. 455–467<br />
33. A. P. Karadimce, S. D. Urban: Diagnosing Anomalous Rule Behaviour in Databases with Integrity Maintenance Production Rules, in Proc. 3rd Int. Workshop on Foundations of Models and Languages for Data and Objects, Aigen (Austria), September 1991, pp. 77–102<br />
34. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon, 1986<br />
35. M. Kifer, J. Wu: A Logic for Object-Oriented Logic Programming (Maier's O-Logic Revisited), in PODS '89, pp. 379–393<br />
36. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented Programming System with a Database System, in Proc. OOPSLA 1988<br />
37. D. Maier, J. Stein, A. Otis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA, September 1986<br />
38. F. Matthes, J. W. Schmidt: Bulk Types – Add-On or Built-In?, in Proc. DBPL III, Nafplion 1991<br />
39. J. Mylopoulos, P. A. Bernstein, H. K. T. Wong: A Language Facility for Designing Interactive Database-Intensive Applications, ACM ToDS, vol. 5 (2), April 1980, pp. 185–207<br />
40. J. Mylopoulos, A. Borgida, M. Jarke, M. Koubarakis: Telos: Representing Knowledge About Information Systems, ACM ToIS, vol. 8 (4), October 1990, pp. 325–362<br />
41. G. Nelson: A Generalization of Dijkstra's Calculus, ACM TOPLAS, vol. 11 (4), October 1989, pp. 517–561<br />
42. A. Ohori: Representing Object Identity in a Pure Functional Language, Proc. ICDT 90, Springer LNCS, pp. 41–55<br />
43. G. Saake, R. Jungclaus: Specification of Database Applications in the TROLL Language, in Proc. Int. Workshop on the Specification of Database Systems, Glasgow, 1991<br />
44. K.-D. Schewe, I. Wetzel, J. W. Schmidt: Towards a Structured Specification Language for Database Applications, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems, Springer WICS, 1991, pp. 255–274 (an extended version appeared as FIDE technical report 1991/30, October 1991)<br />
45. K.-D. Schewe, B. Thalheim, I. Wetzel, J. W. Schmidt: Extensible Safe Object-Oriented Design of Database Applications, University of Rostock, Preprint CS-09-91, September 1991<br />
46. K.-D. Schewe: Spezifikation datenintensiver Anwendungssysteme (in German), lecture manuscript, University of Hamburg, Winter 1991/92<br />
47. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in Object-Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 341–356<br />
48. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity Enforcement in Object-Oriented Databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models and Languages for Data and Objects "MODELLING DATABASE DYNAMICS", Volkse (Germany), October 19-22, 1992<br />
49. K.-D. Schewe, B. Thalheim, I. Wetzel: Foundations of Object Oriented Database Concepts, University of Hamburg, Report FBI-HH-B-157/92, October 1992<br />
50. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992<br />
51. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint CS-08-92, December 1992<br />
52. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity Preserving Updates in Object Oriented Databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. 4th Australian Database Conference, Brisbane, February 1993, World Scientific, pp. 171–185<br />
53. K.-D. Schewe, B. Thalheim: Exceeding the Limits of Rule Triggering Systems to Achieve Consistent Transactions, submitted for publication<br />
54. M. H. Scholl, H.-J. Schek: A Relational Object Model, in Proc. ICDT 90, Springer LNCS, pp. 89–105<br />
55. G. M. Shaw, S. B. Zdonik: An Object-Oriented Query-Algebra, IEEE Data Engineering, vol. 12 (3), 1989, pp. 29–36<br />
56. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages, in Proc. HICSS '92<br />
57. D. Stemple, S. Mazumdar, T. Sheard: On the Modes and Meaning of Feedback to Transaction Designer, in Proc. SIGMOD 1987, pp. 375–386<br />
58. D. Stemple, T. Sheard: Automatic Verification of Database Transaction Safety, ACM ToDS, vol. 14 (3), September 1989<br />
59. M. Stonebraker, A. Jhingran, J. Goh, S. Potamianos: On Rules, Procedures, Caching and Views in Database Systems, in Proc. SIGMOD 1990, pp. 281–290<br />
60. S. Y. W. Su: SAM*: A Semantic Association Model for Corporate and Scientific-Statistical Databases, Inf. Sci., vol. 29, 1983, pp. 151–199<br />
61. B. Thalheim: Dependencies in Relational Databases, Teubner Leipzig, 1991<br />
62. B. Thalheim: The Higher-Order Entity-Relationship Model, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS, vol. 504, 1991<br />
63. S. D. Urban, L. Delcambre: Constraint Analysis: a Design Process for Specifying Operations on Objects, IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990<br />
64. J. Widom, S. J. Finkelstein: Set-oriented Production Rules in Relational Database Systems, in Proc. SIGMOD 1990, pp. 259–270<br />
65. Y. Zhou, M. Hsu: A Theory for Rule Triggering Systems, in Proc. EDBT '90, Springer LNCS 416, pp. 407–421<br />
Chapter 2<br />
Identification as a Primitive of<br />
Database Models<br />
Contents<br />
2.1 Introduction<br />
2.2 The Identification Problem<br />
2.3 Identification Concepts in Databases<br />
2.4 Comparison of Identification Concepts<br />
2.5 Conclusion<br />
This chapter contains a reprint of<br />
Catriel Beeri, Bernhard Thalheim. Identification as a Primitive of Database Models.<br />
In T. Polle, T. Ripke, K.-D. Schewe. Fundamentals of Information Systems. Kluwer<br />
1998.<br />
Abstract. Identification is one of the main primitives of database technology. Whereas identification<br />
of real world entities by humans is an extremely flexible mechanism, identification in<br />
a database system is severely restricted, since the identification mechanism used in it depends<br />
on the data model and the type system on which it is based. To understand the modelling<br />
power of a data model, it is necessary to understand the identification mechanism it supports.<br />
Thus, this paper surveys and analyses the identification mechanisms of database models.<br />
2.1 Introduction<br />
Databases are used to represent entities¹ of the real world. On a sufficiently high level of<br />
abstraction, everything we deal with in our life, whether concrete or abstract, is an entity.<br />
However, to facilitate the construction of a world model, and certainly if one wants to use such<br />
a model as a basis of a database representation, it is useful to distinguish between entities,<br />
properties of entities, associations between entities, etc. In a computerized system, some entities<br />
are represented by atomic values (numbers, for example), whereas others are represented<br />
by structured, non-atomic values, such as tuples in the relational model, or by objects in<br />
object-oriented systems. Properties and associations are represented as part of the structures<br />
representing the entities, or as additional structures. For example, an employee tuple in a<br />
relational database contains the values for its properties of interest, and may also contain<br />
values that represent relationships, for example a department number. If the relationship between<br />
employees and projects is many-to-many, then a separate relation may store the tuples<br />
describing it. In an object-oriented database, properties of an entity represented by an object<br />
are stored with it as associated values or objects. In either case, we may say that properties<br />
and associations are described by structures.<br />
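The relational representation described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Employee`, `employees`, `works_on`, `projects_of` and `members_of`, as well as all sample values, are hypothetical and do not come from the paper.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    emp_no: int   # key attribute (assumed for the example)
    name: str     # a property of interest
    dept_no: int  # a value representing the relationship to a department


# Relation of employee tuples: properties and the department relationship
# are stored inside the tuple that represents the entity.
employees = {
    Employee(1, "Ada", 10),
    Employee(2, "Bob", 10),
}

# The many-to-many relationship between employees and projects is stored
# as a separate relation of (emp_no, project) tuples.
works_on = {
    (1, "P1"),
    (1, "P2"),
    (2, "P1"),
}


def projects_of(emp_no: int) -> set:
    """Return the projects associated with one employee."""
    return {project for (e, project) in works_on if e == emp_no}


def members_of(project: str) -> set:
    """Return the employee numbers associated with one project."""
    return {e for (e, p) in works_on if p == project}
```

In both directions the association is recovered from a structure (a tuple component or a separate relation), which is the point made in the text.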
Entities in the real world have the following properties:<br />
– An entity is uniquely identified by its history, and by its properties and associations.<br />
– Its set of properties and associations can be arbitrary.<br />
– It has a life cycle: it is created, it exists, then it ceases to exist.<br />
– An entity can exist independently of other entities.<br />
Note that this holds even for entities that on first thought we may believe not to satisfy some of<br />
the above. For example, nails in a box exist, and each is a unique physical entity. Furthermore,<br />
each is uniquely identified at every point of time by its location in the box. Time-independent<br />
identification, for example by minute differences in lengths or weights, may also exist. The<br />
last property may not hold for abstract entities, i.e., entities that are conceptual, rather than<br />
physical.<br />
The fact that the set of properties is arbitrary is important in real life. We recognize other<br />
people by many different properties. We may believe we know somebody by his hair colour.<br />
Meeting him after twenty years, the colour has changed, or the hair is gone, yet we do know<br />
him.<br />
The flexibility that exists in the real world cannot be directly supported in computerized<br />
representations. When we choose to represent a universe of discourse in a database, we restrict<br />
the properties and associations we care to represent to a finite, pre-specified set. Although<br />
this set is arbitrary, in the sense that we can choose it as we like, it is fixed by the choice.<br />
¹ We use 'entity' here in the normal natural language sense. It should not be confused with the (closely related<br />
but technical) use in the entity-relationship model.<br />
Furthermore, our choices regarding what to represent are guided by feasibility and cost. While<br />
we can, if we wish, record the location of each nail at each point of time, in practice<br />
we are ready to pay the price of such a system for locating cars, but not nails. A primary<br />
goal of the representations we choose is to allow us to uniquely identify entities, as this is<br />
the basis for proper use and manipulation. The restrictions on representations impose severe<br />
limits on how we can uniquely identify the (representations of the) entities in the database.<br />
This applies not only to the cases where we have given up the option of unique representation,<br />
such as the nail box, but also to many of the cases where our representation is 'full'.<br />
While identification in relational databases has been solved by the key concept, the issue is<br />
still vague in OODBs. By way of motivation, one of the authors has performed an experiment<br />
on a commercial OODB. Three objects with the name 'John' and cyclic references 'friends'<br />
between them were created. The query 'How many Johns are in the database' was run several<br />
times. The results were 3, 6, 9 for the first three runs, respectively, and increased similarly for<br />
subsequent runs. It seems very plausible that the failure has to do with unique identifiability<br />
of the three objects.<br />
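The following Python sketch builds a structure like the one in the experiment: three objects named 'John' with cyclic 'friends' references. It does not reproduce the behaviour of the commercial system, whose internals are not described here. It only illustrates, under the assumption that an object's visible value is its name plus the names of its friends, why such objects cannot be told apart by value although they are three distinct objects. The class `Person` and the helper `shallow_value` are hypothetical.

```python
class Person:
    """An object with a name and references to friends (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        self.friends = []


# Three objects with the same name and cyclic 'friends' references.
a, b, c = Person("John"), Person("John"), Person("John")
a.friends = [b, c]
b.friends = [c, a]
c.friends = [a, b]


def shallow_value(person):
    """Visible value of an object: its name and the sorted names of its friends."""
    return (person.name, tuple(sorted(f.name for f in person.friends)))


people = [a, b, c]

# Counting by object identity separates the three objects.
count_by_identity = len({id(p) for p in people})

# Counting by visible value collapses them into a single value.
count_by_value = len({shallow_value(p) for p in people})
```

Here `count_by_identity` is 3 while `count_by_value` is 1, so any mechanism that relies only on the visible values has no way to separate the three Johns.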
Overview of the paper<br />
In this paper we discuss the representation of entities in information systems, and the identification<br />
mechanisms of different database models. We distinguish several notions of identification,<br />
and in particular between identification and separability, and we show that currently<br />
implemented mechanisms are limited.<br />
Section 2 discusses the identification problem in general and for object-oriented databases.<br />
Section 3 introduces different identification concepts. These concepts are compared in Section<br />
4. Section 5 demonstrates that there are further concepts which can be used for identification<br />
as well.<br />
2.2 The Identification Problem<br />
Identification is intimately related to equality. In the real world, to say that entities $t_1$ and<br />
$t_2$ are equal means that they are the same: they are identical. To uniquely identify an<br />
entity is to be able to separate it from any entity that is not identical to it. As mentioned<br />
above, in the real world, we may identify entities by arbitrary combinations of properties<br />
and associations, which may change over time. The situation is further complicated by the<br />
fact that entities are often related to roles and abstractions, and it is not always clear in a<br />
statement to which of those one relates.<br />
Consider the following equalities: Clinton = Clinton, Cicero = Ford, and Clinton = The<br />
President of the USA. Clinton and Ford refer to physical entities that existed (each at a certain<br />
time). Thus, the first equality is trivially true, it is an identity, and the second is false. Is the<br />
third equality true or false? If we take it to mean that these two different names, denoting<br />
two conceptual personalities Clinton and The President of the USA, refer to one and the same physical<br />
person, then it is true. If the intention is that the two conceptual entities are identical, it is<br />
false. And note that if we say it is true, then Ford = The President of the USA was also true<br />
at some time.<br />
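The two readings of the third equality can be made explicit in a small Python sketch. It is an illustration only: the classes `Person` and `Role` and the table `office_holder_by_year` are hypothetical modelling choices, not constructs from the paper.

```python
class Person:
    """A physical entity (hypothetical class)."""

    def __init__(self, name):
        self.name = name


class Role:
    """A conceptual entity that is held by some person (hypothetical class)."""

    def __init__(self, title, holder):
        self.title = title
        self.holder = holder


clinton = Person("Clinton")
ford = Person("Ford")
president = Role("The President of the USA", holder=clinton)

# Reading 1: both names refer to the same physical person.
same_physical_person = president.holder is clinton

# Reading 2: the two conceptual entities themselves are identical.
same_conceptual_entity = president is clinton

# Under reading 1 the truth of the equality depends on time.
office_holder_by_year = {1975: ford, 1995: clinton}
ford_was_president_in_1975 = office_holder_by_year[1975] is ford
```

Under reading 1 the equality holds, under reading 2 it does not, and the time-indexed table shows why Ford = The President of the USA was true at another point of time.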
In the real world, such differences are resolved in a variety of ways: by context, by<br />
asking for clarification, by misunderstanding, and so on. In a database system using current<br />
technology the meaning has to be clear, or, at most, resolution should be obvious given schema<br />
information. Note that deducing that some objects in a database are identical can have nontrivial<br />
consequences. Consider the situation that in a database we have:<br />
{books} --- The President, Clinton --- {business friends}.<br />
By the equality Clinton = The President the object Clinton can inherit the properties of the<br />
President, e.g. the books, and the President inherits the business friends of Clinton.<br />
Logical Fundamentals of Identification<br />
Computerized systems are one instance of formal systems for world representation. It is of<br />
interest to consider how equality and identification were treated in other domains that deal<br />
with such systems.<br />
In philosophy and the study of logic various principles have been considered together with<br />
the equality concept (for a theory of equality see [5], [13], [17], [8]).<br />
The indistinguishability principle [7] formulated by Leibniz [16] states that entities which<br />
cannot be distinguished by the unary formulas or predicates of the given language are equal,<br />
i.e.<br />
$x = y \iff \forall P\,(P(x) \leftrightarrow P(y))$.<br />
The characterization of entities is related to the abstraction principle in the sense it is used in<br />
database modelling, i.e. abstracting from most properties and concentrating on some properties.<br />
The presented indistinguishability depends on the chosen language and on the applicability<br />
in the case of partial predicates. Thus, denoting by $P(x)!$ the applicability of $P$ to $x$,<br />
we have two versions of Leibniz's principle:<br />
1. $\forall P\,\big((P(x)! \land P(y)!) \rightarrow (P(x) \leftrightarrow P(y))\big)$;<br />
2. $\forall P\,\big((P(x)! \leftrightarrow P(y)!) \land ((P(x)! \land P(y)!) \rightarrow (P(x) \leftrightarrow P(y)))\big)$.<br />
The first version restricts distinction to those predicates which are defined on x and y at<br />
the same time. If P is not defined on one of the entities and defined on the other then this<br />
difference is not used for distinction. The second version permits this possibility, i.e. a difference in applicability alone distinguishes the entities.<br />
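The two versions can be compared on a finite set of partial predicates. The Python sketch below is a minimal illustration under stated assumptions: a partial predicate is modelled as a function that returns `None` where it is not applicable, and the sample predicates `is_even` and `is_short` are invented for the example.

```python
from typing import Callable, Optional, Sequence

# A partial predicate returns True or False where it applies
# and None where it is undefined (assumed encoding of applicability).
Predicate = Callable[[object], Optional[bool]]


def indistinguishable_v1(x, y, predicates: Sequence[Predicate]) -> bool:
    """Version 1: only predicates defined on both x and y can separate them."""
    for p in predicates:
        px, py = p(x), p(y)
        if px is not None and py is not None and px != py:
            return False
    return True


def indistinguishable_v2(x, y, predicates: Sequence[Predicate]) -> bool:
    """Version 2: a predicate defined on exactly one of x and y also separates them."""
    for p in predicates:
        px, py = p(x), p(y)
        if (px is None) != (py is None):
            return False  # applicability differs
        if px is not None and px != py:
            return False  # both defined, truth values differ
    return True


def is_even(v) -> Optional[bool]:
    """Defined on integers only."""
    if isinstance(v, int):
        return v % 2 == 0
    return None


def is_short(v) -> Optional[bool]:
    """Defined on strings only."""
    if isinstance(v, str):
        return len(v) < 3
    return None


predicates = [is_even, is_short]
```

With these predicates, `2` and `"ab"` are indistinguishable under version 1, because no predicate is defined on both, but they are separated under version 2, because `is_even` applies to only one of them.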
The principle is also related to the observation property. If for a given entity its identity<br />
can be observed on the basis of a calculus then this observability can be used for identification<br />
as well. Observation is closely related to and crucially depends on scope. A scope defines what<br />
is visible from a given viewpoint, and only what is visible can be used for identification. For<br />
example, a view over a database often presents less information than the complete database,<br />
and that may prevent entity representations from being distinguishable from each other in the view.<br />
(This is closely related to the view update problem.)<br />
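The effect of scope can be illustrated with a projection view. In the Python sketch below, which assumes a hypothetical employee relation with invented attribute names and values, two tuples are distinguishable in the full relation but coincide in a view that hides the key attribute.

```python
# Base relation: two distinct entity representations (sample data is assumed).
employee_rows = [
    {"emp_no": 1, "name": "John", "dept": "Sales"},
    {"emp_no": 2, "name": "John", "dept": "Sales"},
]


def project(rows, attributes):
    """Return a view: every row restricted to the given attributes."""
    return [tuple(row[a] for a in attributes) for row in rows]


full_scope = project(employee_rows, ["emp_no", "name", "dept"])
view_scope = project(employee_rows, ["name", "dept"])

# Number of representations that can be told apart in each scope.
distinct_in_full = len(set(full_scope))
distinct_in_view = len(set(view_scope))
```

In the full scope the two rows remain distinct, whereas in the view only one distinguishable row is left, so an update issued against the view cannot say which of the two underlying tuples it refers to.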
Summarizing, (in)distinguishability depends on the languages which are used for representation<br />
of entities, and for querying them. These characteristics hold in information systems<br />
as well, as shown in the sequel.<br />
Values and Objects<br />
Values are the basic building blocks of data. Atomic values represent universally known abstractions.<br />
For example, numbers are atomic types. By 'universally known abstraction' we<br />
mean that it has a standard meaning that is known to a large community; further, in this<br />
community, there are accepted denotation(s) for it. Certainly, numbers satisfy these requirements.<br />
Values can also be combined in various ways to form structures, such as tuples, lists<br />
or sets. These are non-atomic or structured values. Most often values are partitioned into<br />
sets, called domains, such as the set of integers, the set of characters, and so on; a value is an<br />
element of a domain of values. Each value has a fixed, user-visible denotation/representation,<br />
bound to an element of the domain, and these have the same form for all values in a domain.<br />
A system normally supports several domains of atomic values, and several kinds of nonatomic<br />
values that can be constructed from them, and depending on the intended semantics,<br />
a collection of functions, also called operations, on these domains. Values do not change; the<br />
operations only map values to other values. They are not created, nor do they cease to exist.<br />
The other kind of data in a computerized system are objects. They normally represent<br />
entities or abstractions that are not necessarily universally known, and whose existence is not<br />
pre-wired into the system. Their properties are influenced by those of such entities and by<br />
the representation method, and are the following [6]:<br />
- An object has an internal structure and has a state according to this internal structure.<br />
- It has a life cycle: it is created, it can be modified, and it is finally removed.<br />
- Its identity cannot be changed during its lifetime. Identity is that property of an object<br />
which distinguishes each object from all others [12].<br />
- An object can exist independently from other objects. 2<br />
The internal structure of an object serves as a representation of its properties, and possibly<br />
also of (some of) its associations. 3 At any point of time, this structure provides the values<br />
of the properties at that time. Just as is the case for real world entities, properties can be<br />
changed; this is modification. However, the identity of the object never changes throughout<br />
its lifetime. A value can be seen as a special kind of object that has no properties except itself<br />
(for an atomic value), or its components (for a structured value). Hence, a value is never<br />
modified. Whereas numbers are immutable and exist forever, an employee object is created<br />
when an employee is hired, its properties are subject to change (e.g., a salary raise), and when<br />
the employee quits the company, the object is removed.<br />
While objects differ from values in several ways as described above, the difference that is<br />
often taken as the primary concept distinguishing OODB's from previous models is<br />
the existence of object identity. Simply stated, an object has an identity that is independent<br />
of its properties and associations, and is immutable. The object's properties, or the associations<br />
in which it participates, can change, but the identity never changes throughout its lifetime.<br />
This idea is considered a cornerstone for proper representation of real world entities. In<br />
particular, each object represents a unique entity, and only one object represents each entity.<br />
But how is this requirement accomplished in a database system? A common approach<br />
is to implement identification by means of object identifiers (o-id's). O-id's are system-supplied<br />
(and implementation-dependent) atomic items, used solely for the purpose of identifying<br />
objects in the system. 4 An identifier is assigned to an object upon creation, and it<br />
never changes. The uniqueness and immutability of identifiers guarantee that the system can<br />
uniquely identify each object throughout its lifetime. That means that they can be used for<br />
access structures, or to allow an object to be an attribute value of another object (or of several other<br />
objects), by using the o-id as a surrogate. However, o-id's are considered internal, their values<br />
being meaningless to users; hence the only operation on them that is permitted to users is to<br />
ask whether two o-id's (equivalently, two objects) are identical. This is commensurate with<br />
the OODB philosophy of encapsulation: the internal state of an object can be observed<br />
and manipulated exclusively through its interface. Indeed, if o-id's were made available for<br />
users to view, they would simply be just another attribute value, like employee numbers.<br />
2 In some models that incorporate a notion of composite object, the existence of an object may depend on<br />
that of others.<br />
3 In the currently accepted models, relationships as a separate concept are not supported. See [18] for an<br />
influential paper that suggests an extension of object models with relationships.<br />
4 The o-id concept is similar to that of surrogate or tuple identifier in relational databases.<br />
But now we observe that if the user cannot see the values of o-id's, they cannot serve<br />
him/her for identifying objects! That is, while o-id's may serve a useful role at the implementation<br />
level, they serve no such role at the conceptual level. Thus, as noted in a previous work<br />
[3], identifiers are an implementational, not a conceptual, concept. We note that some systems<br />
actually use a physical address as the o-id, and this may change with physical reorganization.<br />
In such systems, the o-id certainly cannot serve for conceptual identification, although from<br />
the system's point of view, since such changes guarantee integrity of references, the o-id's can<br />
be considered immutable.<br />
The identification of objects at the conceptual and external levels must ultimately rely on<br />
values, just as in the value-based models. The difference (if any) is that an OODB has a<br />
rich structure, hence there are many more ways in which values can be associated with objects for identification.<br />
Further, the rich structure possibly allows the structure itself to serve as an identification or<br />
equality mechanism. For example, consider the two well-known notions of equality for objects,<br />
based on the values in this representation. In shallow equality, two objects are (shallow) equal<br />
if they have the same structure, and the values in the structures are pairwise equal. Note<br />
that a component of a structure may be an object, and then equality as identity is used. In<br />
deep equality, two objects are equal if their structures match, and for each pair of matching<br />
components, either they are equal values, or deep equal objects. Note that in the real world,<br />
entities are identified by value properties (such as hair colour, timbre of voice, height), or by<br />
associations with other entities that have value-based identification. Although identification<br />
in the real world is flexible and potentially complex, it is eventually value-based.<br />
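The two notions of equality can be illustrated by a minimal sketch. It assumes that an object's state is a flat tuple given as a dictionary from attribute names to atomic values or other objects; the class `Obj`, the functions `shallow_equal` and `deep_equal`, and the employee/department data are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: `==` on Obj is identity, playing the role of o-id comparison
class Obj:
    state: dict = field(default_factory=dict)  # attribute name -> atomic value or Obj


def shallow_equal(a: Obj, b: Obj) -> bool:
    """Same attribute names; values pairwise equal, object components identical."""
    if a.state.keys() != b.state.keys():
        return False
    for name in a.state:
        x, y = a.state[name], b.state[name]
        if isinstance(x, Obj) or isinstance(y, Obj):
            if x is not y:  # equality as identity for object components
                return False
        elif x != y:
            return False
    return True


def deep_equal(a: Obj, b: Obj, assumed=None) -> bool:
    """Structures match; components are equal values or deep-equal objects."""
    assumed = set() if assumed is None else assumed
    if a is b or (id(a), id(b)) in assumed:
        return True  # pair already under comparison (guards against cycles)
    assumed.add((id(a), id(b)))
    if a.state.keys() != b.state.keys():
        return False
    for name in a.state:
        x, y = a.state[name], b.state[name]
        if isinstance(x, Obj) and isinstance(y, Obj):
            if not deep_equal(x, y, assumed):
                return False
        elif isinstance(x, Obj) or isinstance(y, Obj) or x != y:
            return False
    return True


d1, d2 = Obj({"name": "Sales"}), Obj({"name": "Sales"})
e1 = Obj({"name": "Ann", "dept": d1})
e2 = Obj({"name": "Ann", "dept": d2})  # same values, but a different dept object
e3 = Obj({"name": "Ann", "dept": d1})  # shares the dept object with e1
```

Here `e1` and `e3` are shallow equal because they share the same department object, while `e1` and `e2` are only deep equal; all three remain distinct objects under identity.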
Among implemented or proposed OODB models we can distinguish three different kinds:<br />
Value-based databases: All objects are value-identifiable, i.e., can be identified by values of<br />
their (public) attributes or by an unfolding (unnesting) of the values. This means that a<br />
subset of the public attributes serves as a key for a class.<br />
Value-representable databases: All objects are reference-identifiable, where reference-identifiability<br />
can be recursively defined as follows:<br />
- Each value-identifiable object is also reference-identifiable.<br />
- If an object is identified by a combination of attribute values and by references to or<br />
from a set of objects such that each object in this set is reference-identifiable, then<br />
the object is itself reference-identifiable.<br />
Identifier-based databases: There are objects which are not reference-identifiable.<br />
Figure 2.1 depicts the relationships among these classes. Note that we do not claim that no other<br />
methods for identification exist. Finding methods that are efficient yet expressive is a subject<br />
for research.<br />
The Identification Problem in Object-Oriented Databases<br />
In summary of the discussion above, the issue of identification of objects in OODB's is not<br />
solved by the use of o-id's and is far from being well understood. In particular, in the last class,<br />
it is possible for a database to contain objects that cannot be distinguished from each other.<br />
[Figure 2.1: tree diagram with the nodes: database, value-oriented database, object-oriented database, value-representable database, value-based database, non-value-based database, identifier-based database]<br />
Fig. 2.1. Classification of databases<br />
We now illustrate the problem. For simplicity, we use a simple graph model (similar models<br />
have been used in, e.g., GOOD [11]). Object graphs are defined on a set $O \cup V$ of nodes,<br />
where $O$ is a set of (abstract) objects and $V$ is a set of (atomic) values, and a set $L$ of edge<br />
labels. Labels can be $\in$, state, or names (used as attribute names). Type constructor names<br />
and class names are assumed to be elements of $V$. Thus, $V$ may contain values such as tuple,<br />
set, emp-class. Now an object graph $G$ is given by a finite set $N$ of nodes and a finite set $E$ of<br />
labeled edges, i.e., $E \subseteq N \times L \times N$. The label $\in$ appears on an edge that connects an object to<br />
its class, or an element to a set that contains it. The label state connects an object to a tuple<br />
value, which represents its state. A name label connects a tuple to a component. Thus, this<br />
simple model can describe complex types constructed by tuple and set constructors, object<br />
classes, and object states (without encapsulation).<br />
Let us consider the graphs shown in figure 2.2. In (a), the objects $o_1, o_2, o_3$ cannot be<br />
distinguished from one another. They have the same outgoing and incoming edges, i.e.,<br />
the graph is completely symmetric. However, if somehow $o_1$ could be distinguished from $o_2$,<br />
then all three objects could be distinguished from each other. In (b), since there are value nodes<br />
and $a \neq b$, objects $o_4, o_5, o_6$ can be distinguished either by their outgoing or incoming edges.<br />
[Figure 2.2: (a) a symmetric object graph on the objects $o_1, o_2, o_3$; (b) an object graph on the objects $o_4, o_5, o_6$ with the value nodes $a$ and $b$]<br />
Fig. 2.2. Identification in Object-Oriented Databases<br />
We can use identification trees for presenting the local structure of the graph around each<br />
object. The trees in figure 2.3 show the similarity of the neighborhoods for the three objects.<br />
If $(o_i, s, o_j) \in E$, the edge $(o_j, s{\rightarrow}, o_i)$ is used for inverting the order.<br />
[Figure 2.3: identification trees of depth 1 rooted at $o_1$, $o_2$ and $o_3$]<br />
Fig. 2.3. Trees of depth 1 for $o_1, o_2, o_3$<br />
The graph in figure 2.4 also has non-trivial symmetries, and its objects cannot be uniquely<br />
identified. However, the objects are divided into two sets that can be distinguished from each<br />
other. Objects $o_2, o_3$ can be distinguished, that is, separated from each other, iff objects $o_1, o_4$<br />
can be distinguished.<br />
[Figure 2.4: object graph on the objects $o_1, o_2, o_3, o_4$ with edges labeled $s_1$, $s_2$, $s_3$]<br />
Fig. 2.4. Objects which cannot be distinguished<br />
The examples demonstrate that the fact that objects have o-id's cannot serve for identification.<br />
They illustrate that objects may be identifiable conditional on the identifiability of<br />
others, and the close relationship between identifiability and distinguishability. Several general<br />
approaches to identification and distinguishability are introduced and discussed next.<br />
2.3 Identification Concepts in Databases<br />
We have already mentioned the option of identifying objects by values of their attributes, or<br />
additionally by associations to other objects. Generally, objects can be identified by their<br />
position in the graph as well. We now consider concrete formalizations of these ideas.<br />
The first two ideas concern homomorphic mappings on graphs. Let $G = (N, E)$ and<br />
$G' = (N', E')$ be two object graphs. A mapping $h : N \to N'$ preserves node labels if, for<br />
each $u \in N \cap V$, $h(u) = u$. It preserves adjacency if for all nodes $u, v$ in $G$, and for each<br />
label $s$, if there exists an edge $(u, s, v)$ in $G$ then there exists an edge $(h(u), s, h(v))$ in $G'$. If,<br />
additionally, whenever $(h(u), s, h(v))$ is in $G'$ there is an edge $(u, s, v)$ in $G$, then we say it<br />
strongly preserves adjacency.<br />
The mapping $h$ is called a g-homomorphism if it maps $N$ onto $N'$ and both preserves node<br />
labels and strongly preserves adjacency. It is a g-isomorphism if it is, in addition, a bijective map. We<br />
denote a g-homomorphism by $h : G \to G'$.<br />
The requirement that homomorphisms preserve node labels embodies our assumption that<br />
a value is a uniquely identifiable entity that can be mentioned by users. Hence, a node labeled<br />
by a value cannot be mapped to a node labeled by another. Recall that class names and type<br />
constructor names are also considered values, so they must be mapped to themselves. Thus,<br />
since a g-homomorphism strongly preserves adjacency, objects in a class can only be mapped<br />
to objects of the same class, because they are related to the node representing their class by an<br />
edge.<br />
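The conditions on a g-homomorphism can be checked mechanically on small graphs. The following is a minimal sketch under our own encoding assumptions: a graph is a pair of a node set and a set of `(source, label, target)` triples, the label `"in"` stands for the membership label, and all function and node names are hypothetical.

```python
def preserves_node_labels(h, nodes, values):
    # Every value node must be mapped to itself.
    return all(h[u] == u for u in nodes if u in values)


def strongly_preserves_adjacency(h, nodes, edges, edges2):
    # Forward: every edge of G has an image edge in G'.
    forward = all((h[u], s, h[v]) in edges2 for (u, s, v) in edges)
    # Backward: whenever (h(u), s, h(v)) is in G', (u, s, v) is in G.
    labels = {s for (_, s, _) in edges2}
    backward = all(
        (u, s, v) in edges
        for u in nodes for v in nodes for s in labels
        if (h[u], s, h[v]) in edges2
    )
    return forward and backward


def is_g_homomorphism(h, graph, graph2, values):
    (nodes, edges), (nodes2, edges2) = graph, graph2
    onto = {h[u] for u in nodes} == set(nodes2)
    return (onto and preserves_node_labels(h, nodes, values)
            and strongly_preserves_adjacency(h, nodes, edges, edges2))


V = {"emp", "ann"}
G = ({"o1", "o2", "emp"}, {("o1", "in", "emp"), ("o2", "in", "emp")})
G_merged = ({"o", "emp"}, {("o", "in", "emp")})
h = {"o1": "o", "o2": "o", "emp": "emp"}
merges = is_g_homomorphism(h, G, G_merged, V)  # o1 and o2 are collapsed by h

# Give o1 a name attribute: the same collapse no longer strongly preserves adjacency.
G2 = ({"o1", "o2", "emp", "ann"}, G[1] | {("o1", "name", "ann")})
G2_merged = ({"o", "emp", "ann"}, {("o", "in", "emp"), ("o", "name", "ann")})
h2 = dict(h, ann="ann")
still_merges = is_g_homomorphism(h2, G2, G2_merged, V)
```

In the first graph the two objects of class `emp` can be merged; once one of them carries a value that the other lacks, the same collapsing map is rejected.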
Identifiability by homomorphisms:<br />
We say that two nodes $o_1, o_2$ of $G$ are indistinguishable by a g-homomorphism if there<br />
exists a graph $G'$ and a g-homomorphism $h : G \to G'$ such that $h(o_1) = h(o_2)$. An object<br />
$o$ is H-uniquely identifiable if there is no other object $o'$ different from $o$ such that $o, o'$ are<br />
indistinguishable by some g-homomorphism. The graph $G$ is H-identifiable if each of its objects<br />
is H-uniquely identifiable.<br />
<br />
Identifiability by automorphisms:<br />
Given the object graph $G = (N, E)$, a mapping $h$ is called a g-automorphism if it is a<br />
g-isomorphism from $G$ to itself. Denote the automorphism group of $G$ by $\Gamma(G)$. Two nodes<br />
$u, v$ from $G$ are called A-equivalent (denoted by $u \cong v$) if there exists a g-automorphism $h$ in<br />
$\Gamma(G)$ with $v = h(u)$. (It is easily seen that this is indeed an equivalence relation.) For each node<br />
$u$, the set of all nodes A-equivalent to $u$ is called the orbit of $u$ (denoted by $Or(u)$). A node $u$ is<br />
called A-identifiable if $u = h(u)$ for each g-automorphism $h$, that is, if it is the only element in<br />
$Or(u)$. The node is called A-unidentifiable otherwise. The graph is A-identifiable if its nodes<br />
are A-identifiable.<br />
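For very small graphs, orbits can be computed by brute force. The sketch below enumerates all permutations of the object nodes, so it is only meant as an illustration and not as an efficient procedure; the triple encoding of edges and the names `automorphisms` and `orbits` are our own assumptions.

```python
from itertools import permutations


def automorphisms(nodes, edges, values):
    """Yield every g-automorphism of the graph as a dict node -> node."""
    objects = sorted(n for n in nodes if n not in values)
    for image in permutations(objects):
        h = {v: v for v in nodes if v in values}  # values are mapped to themselves
        h.update(zip(objects, image))
        # For a bijection, strong preservation of adjacency means that
        # the image of the edge set is the edge set itself.
        if {(h[u], s, h[v]) for (u, s, v) in edges} == set(edges):
            yield h


def orbits(nodes, edges, values):
    """Map every node u to its orbit, the set of nodes A-equivalent to u."""
    orbit = {u: {u} for u in nodes}
    for h in automorphisms(nodes, edges, values):
        for u in nodes:
            orbit[u].add(h[u])
    return orbit


cycle = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
sym = orbits({"o1", "o2", "o3"}, cycle, values=set())

# Attaching a value to o1 breaks the symmetry.
named = orbits({"o1", "o2", "o3", "a"}, cycle | {("o1", "name", "a")}, values={"a"})
```

On the symmetric cycle all three objects fall into one orbit, so none of them is A-identifiable; after a single value node is attached, every orbit is a singleton.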
<br />
Identification by bisimulation:<br />
A related idea is obtained by generalizing from mappings to relations. Given two graphs as<br />
above, a bisimulation between them is a binary relation $R$ that preserves labels and adjacency,<br />
i.e., if $R(u, v)$ and $u \in N \cap V$, then $u = v$, and if $R(u, v)$, then there exists an edge $(u, s, u')$<br />
iff there exists an edge $(v, s, v')$ and $R(u', v')$. Bisimulations are closed under union, hence<br />
there always exists a maximal bisimulation between two graphs. Two nodes $u, u'$ in $G$ are<br />
B-indistinguishable if they are related by the maximal bisimulation between the graph and itself.<br />
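The pairs related by the maximal bisimulation of a graph with itself can be computed as a greatest fixpoint: start from all label-compatible pairs and discard pairs that violate the adjacency condition until nothing changes. The sketch below makes two assumptions of our own: the label condition is applied symmetrically (a value is related only to itself), and only outgoing edges are matched, as in the definition above. The function name is hypothetical.

```python
def maximal_bisimulation(nodes, edges, values):
    """Return the largest relation R on nodes satisfying the bisimulation conditions."""
    succ = {u: set() for u in nodes}
    for (u, s, v) in edges:
        succ[u].add((s, v))
    # Start with every pair that respects node labels.
    R = {(u, v) for u in nodes for v in nodes
         if u == v or (u not in values and v not in values)}
    changed = True
    while changed:
        changed = False
        for (u, v) in list(R):
            # Every edge of u must be matched by an equally labeled edge of v ...
            fwd = all(any(t == s and (u2, v2) in R for (t, v2) in succ[v])
                      for (s, u2) in succ[u])
            # ... and every edge of v by an equally labeled edge of u.
            bwd = all(any(t == s and (u2, v2) in R for (t, u2) in succ[u])
                      for (s, v2) in succ[v])
            if not (fwd and bwd):
                R.discard((u, v))
                changed = True
    return R


objs = {"o1", "o2", "o3"}
cycle = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
chain = {("o1", "s", "o2"), ("o2", "s", "o3")}
on_cycle = maximal_bisimulation(objs, cycle, values=set())
on_chain = maximal_bisimulation(objs, chain, values=set())
```

On the cycle every pair of objects is related, whereas on the chain only the identity pairs survive, because the nodes differ in the length of the path that leaves them.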
<br />
The three previous definitions use the idea that values are the basis for identifiability,<br />
but they rely on a global mechanism, namely the existence or non-existence of mappings with<br />
certain properties. The next two definitions introduce ideas that are essentially local. Let us<br />
define the local neighborhood of a node $u$ to consist of $u$, all the nodes $v$ such that $(u, s, v)$<br />
or $(v, s, u)$ is in the graph, and all the edges connecting them. That is, it is the subgraph<br />
of $G$ induced by the nodes whose distance from $u$ is at most one (where edge directions are<br />
ignored). (This is essentially a neighborhood of radius one, in the terminology of [9].) We<br />
denote the local neighborhood of $u$ by $ln(u)$. A g-isomorphism from $ln(u)$ onto $ln(v)$ is a<br />
regular g-isomorphism between these two graphs that maps $u$ to $v$. Thus, we assume that<br />
$u$ is a distinguished node in $ln(u)$. Given a set $P$ of pairs of nodes in $N$, a g-isomorphism<br />
$f$ from $ln(u)$ to $ln(v)$ is a $P$-mapping if for any node $u'$ in $ln(u)$, the pair $(u', f(u'))$ is not in<br />
$P$. The set $P$ should be thought of as a set of excluded pairs that cannot be related by the<br />
mapping. A typical case is when $P$ is the set of pairs of distinct values: these are all different from each<br />
other. Another case is a set of objects in a view, where the information that they are pairwise<br />
different cannot be deduced from the data in the view, but can be given as a summary of<br />
information in the underlying database.<br />
Identifiability by values:<br />
The idea here is that values are distinguishable from each other, and from objects. Also,<br />
nodes can be distinguished if they are connected to distinguishable nodes, or have a different<br />
pattern of connectivity. In the following algorithm, this idea is repeatedly applied until a<br />
fixpoint is reached. For generality, the algorithm is given in terms of an arbitrary initial set<br />
$IE$ of pairs of nodes that are assumed to be known to be unequal. 5<br />
1. Input: $G = (N, E)$<br />
2. Initialization: $NotId := IE$<br />
3. Repeat until no further change:<br />
$NotId := NotId \cup \{ (u, v), (v, u) \mid \text{there is no } NotId\text{-mapping of } ln(u) \text{ onto } ln(v) \}$<br />
4. Output: $NotId$<br />
For a graph $G$, let the canonical inequality set be<br />
$IE_{can}(G) = \{ (u, v) \mid u, v \text{ are different nodes}, \; u, v \in N \cap V \lor (u \in N \setminus V \land v \in N \cap V) \}$.<br />
A node $u \in N$ is called V-identifiable if for each other node $v \in N$ the property $(u, v) \in NotId$ holds<br />
when the algorithm is started with $IE_{can}(G)$. Otherwise the node is called V-unidentifiable. The<br />
graph $G$ is V-identifiable if each of its nodes is V-identifiable.<br />
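The fixpoint computation can be sketched directly from the four steps of the algorithm. The search for a $NotId$-mapping below is a brute-force enumeration of bijections between two local neighborhoods, which is adequate only for tiny examples; the triple encoding of edges and all function names are our own assumptions.

```python
from itertools import permutations


def local_neighborhood(u, edges):
    """ln(u): u, its neighbors (edge direction ignored), and the induced edges."""
    nodes = {u} | {b for (a, _, b) in edges if a == u} | {a for (a, _, b) in edges if b == u}
    return nodes, {(a, s, b) for (a, s, b) in edges if a in nodes and b in nodes}


def has_p_mapping(u, v, edges, values, P):
    """Is there a g-isomorphism f from ln(u) onto ln(v), f(u) = v, with no (x, f(x)) in P?"""
    (Nu, Eu), (Nv, Ev) = local_neighborhood(u, edges), local_neighborhood(v, edges)
    if len(Nu) != len(Nv) or len(Eu) != len(Ev):
        return False
    rest_u = sorted(Nu - {u})
    for image in permutations(sorted(Nv - {v})):
        f = dict(zip(rest_u, image))
        f[u] = v
        if any(x in values and f[x] != x for x in Nu):
            continue  # node labels must be preserved
        if any((x, f[x]) in P for x in Nu):
            continue  # excluded pairs must not be related
        if {(f[a], s, f[b]) for (a, s, b) in Eu} == Ev:
            return True
    return False


def canonical_ie(nodes, values):
    return {(u, v) for u in nodes for v in nodes
            if u != v and ((u in values and v in values)
                           or (u not in values and v in values))}


def not_id(nodes, edges, values, IE):
    """Steps 2-4 of the algorithm: grow NotId until a fixpoint is reached."""
    NotId = set(IE)
    changed = True
    while changed:
        changed = False
        for u in nodes:
            for v in nodes:
                pair = {(u, v), (v, u)}
                if u != v and not pair <= NotId \
                        and not has_p_mapping(u, v, edges, values, NotId):
                    NotId |= pair
                    changed = True
    return NotId


nodes = {"o1", "o2", "o3"}
cycle = {("o1", "s", "o2"), ("o2", "s", "o3"), ("o3", "s", "o1")}
symmetric = not_id(nodes, cycle, set(), canonical_ie(nodes, set()))
broken = not_id(nodes, cycle, set(), {("o1", "o2")})
```

On the symmetric three-object cycle nothing can be inferred, so `symmetric` stays empty. Once the single pair `("o1", "o2")` is supplied as known to be unequal, all six ordered pairs of distinct objects end up in `broken`, in line with the observation that distinguishing $o_1$ from $o_2$ suffices to distinguish all three objects.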
<br />
The close relationship between computation and logical inference suggests that the previous<br />
definition can be brought into a logical form.<br />
Identifiability by (dis)equational logics:<br />
This logic is an analog of inequality systems used for ADT logics. Now we define a Hilbert<br />
type deductive system for this logic. In addition to the predicate above, the system uses<br />
another binary predicate, that we denote $\neq$. Also, rather than writing $(u, v) \in \neq$, we write<br />
$u \neq v$. The set of axioms is assumed to be a given set $IE$ of pairs on $B = (O, V, L)$ (which may<br />
be empty). The deductive system is denoted $D^{IE}_V$.<br />
Axioms: $u \neq v$ if $(u, v) \in IE$<br />
Rules: $\dfrac{\text{there is no } \neq\text{-mapping of } ln(u) \text{ onto } ln(v)}{u \neq v}$<br />
Now we define the derivation relationship $\vdash_{IE}$ on the basis of $D_B^{IE}$.<br />
The node $n$ is E-identifiable if for every other node $n' \in N$ we can derive $\vdash_{IE_{can}(G)} n \neq n'$.<br />
The graph $G$ is E-identifiable if each of its nodes is E-identifiable.<br />
<br />
Identifiability by queries:<br />
We define a simple query language on $B = (O, V, L)$. The set of queries $Q(B)$ is the<br />
smallest set generated by the following formation rules.<br />
(i) If $M$ is a subset of $V$ then $M$ is a query.<br />
(ii) If $q$ is a query, and $J$ is a subset of $L$, then $\rightarrow_J(q)$ and $\leftarrow_J(q)$ are queries.<br />
(iii) If $q, q'$ are queries, then $q \cup q'$, $q \cap q'$ and $q \setminus q'$ are queries.<br />
The semantics of queries, i.e., their meaning on a graph $G = (N, E)$, is defined as follows.<br />
(i) $M(G) = \{ u \in N \mid u \in M \}$<br />
(ii) $\leftarrow_J(q)(G) = \{ u \in N \mid (u, l, v) \in E, \; l \in J, \; v \in q(G) \}$<br />
(iii) $\rightarrow_J(q)(G) = \{ u \in N \mid (v, l, u) \in E, \; l \in J, \; v \in q(G) \}$<br />
(iv) $(q \cup q')(G) = q(G) \cup q'(G)$, $(q \cap q')(G) = q(G) \cap q'(G)$, $(q \setminus q')(G) = q(G) \setminus q'(G)$<br />
Using the first type of query, we can select any subset of the value nodes of a graph. Using the<br />
next two kinds, we can express complex reachability patterns. Note that the language does<br />
not contain iteration idioms. However, queries can essentially be composed, so we can<br />
write large queries, which compensate for the lack of such idioms.<br />
Given a graph $G = (N, E)$ on $B$, let $Q_G$ be the subset of $Q(B)$ that mentions only values<br />
and labels in $G$. Two nodes $u, v$ are Q-indistinguishable (denoted $u \cong_Q v$) if for all $q$ in $Q_G$,<br />
$u \in q(G)$ iff $v \in q(G)$. A node $u$ in $N$ is Q-identifiable if there is a query $q$ in $Q_G$ such that<br />
$q(G) = \{u\}$. $G$ is Q-identifiable if each node in $N$ is Q-identifiable.<br />
5 We discuss below a scenario where this is of interest.<br />
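The semantics of the query language translates almost literally into a recursive evaluator. In the sketch below queries are encoded as nested tuples; the tags `"vals"`, `"pred"`, `"succ"`, `"union"`, `"inter"` and `"diff"` are hypothetical names of our own, where `"pred"` selects the nodes with a $J$-labeled edge into $q(G)$ and `"succ"` selects the nodes reached by a $J$-labeled edge from $q(G)$.

```python
def evaluate(q, nodes, edges):
    """Evaluate a query, given as a nested tuple, on the graph (nodes, edges)."""
    op = q[0]
    if op == "vals":                      # rule (i): a set M of values
        return {u for u in nodes if u in q[1]}
    if op == "pred":                      # sources of J-edges that end in q(G)
        _, J, sub = q
        targets = evaluate(sub, nodes, edges)
        return {u for (u, l, v) in edges if l in J and v in targets}
    if op == "succ":                      # targets of J-edges that start in q(G)
        _, J, sub = q
        sources = evaluate(sub, nodes, edges)
        return {u for (v, l, u) in edges if l in J and v in sources}
    left = evaluate(q[1], nodes, edges)
    right = evaluate(q[2], nodes, edges)
    if op == "union":
        return left | right
    if op == "inter":
        return left & right
    if op == "diff":
        return left - right
    raise ValueError(f"unknown query operator: {op!r}")


nodes = {"o1", "o2", "ann", "bob"}
edges = {("o1", "name", "ann"), ("o2", "name", "bob")}

named_ann = ("pred", {"name"}, ("vals", {"ann"}))
anyone = ("pred", {"name"}, ("vals", {"ann", "bob"}))
only_o1 = evaluate(named_ann, nodes, edges)
only_o2 = evaluate(("diff", anyone, named_ann), nodes, edges)
```

Each of the two objects is selected by a query whose answer is a singleton, so both are Q-identifiable in this small graph.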
2.4 Comparison of Identification Concepts<br />
We now proceed to compare the expressive power of the mechanisms introduced above. We<br />
start with V- and E-identifiability.<br />
Proposition 2.1. For each graph $G$, for any set of pairs $IE$, and for any nodes $u, v$ in $N$,<br />
the algorithm terminates with $(u, v)$ in $NotId$ if and only if $\vdash_{IE} u \neq v$.<br />
Proof. Easy inductions on computations and deductions, respectively.<br />
Corollary 2.2. A node $u$ is V-identifiable if and only if it is E-identifiable.<br />
<br />
From now on, we use computation to refer to either computation or derivation. We note that<br />
computations are non-deterministic in the sense that in each step we can consider an arbitrary<br />
pair. However, when we consider the set of pairs of nodes that are inferred to be unequal, we<br />
have:<br />
Lemma 2.3. Computations are confluent: All computations on a given graph from an initial<br />
set $IE$ terminate with the same set of pairs of nodes for $\neq$.<br />
Proof. If at a given point in a computation, it is possible to derive that $u \neq v$, then the<br />
execution of other steps does not invalidate any of the prerequisites, since $\neq$ can only grow.<br />
Hence this fact remains derivable.<br />
Given a set of pairs, if we want to interpret it as a set of inequalities, then it is desirable that<br />
its complement has the properties of equality, in particular that it is an equivalence relation<br />
on the nodes of the graph. Let us call a set of pairs whose complement is an equivalence<br />
relation well-behaved.<br />
Proposition 2.4. Assume that the given initial set is well-behaved. Then the computation<br />
produces a well-behaved set $NotId$.<br />
Proof. By the lemma, the order of steps in a computation is irrelevant, so we consider steps<br />
done in a certain order, and organized into stages: In a given stage, we take the complement<br />
of the set $NotId$ computed so far, take a connected component of that complement, and test<br />
each pair in this component for inclusion in $NotId$. We add all pairs that qualify at the end<br />
of the stage to the set of pairs, then proceed to the next stage. This computation is still<br />
non-deterministic in the choice of a component for a stage.<br />
The proof now proceeds by induction on stages. By assumption, the given set is well-behaved,<br />
so after initialization, $NotId$ is well-behaved. Assuming that it is well-behaved after<br />
$k$ stages, we show the property holds after stage $k + 1$. Note that since it is well-behaved,<br />
the complement is an equivalence relation, so a connected component of the complement is<br />
an equivalence class. Now, a pair $u, v$ in the class is noted for inclusion in $NotId$ if and only<br />
if there is no $NotId$-isomorphism of $ln(u)$ onto $ln(v)$. Thus, two nodes $u, v$ will remain in<br />
the complement of $NotId$ iff there is such an isomorphism between $ln(u)$ and $ln(v)$. It is easy to<br />
see that since $NotId$ is well-behaved, the set of pairs that are not included in it in this stage<br />
is a (disjoint) partition of the class.<br />
We now proceed to compare V- and A-identifiability:<br />
Proposition 2.5. If nodes $u, v$ are V-distinguishable, then they are A-distinguishable.<br />
Proof. We claim that if $u, v$ are A-indistinguishable, that is, there is a g-automorphism $h$<br />
that maps $u$ to $v$, then the pair $(u, v)$ will not be in $NotId$. The proof is by induction on<br />
the stages of a computation. Clearly, the pairs in $IE_{can}(G)$ are A-distinguishable: each value is a<br />
separate class in the complement of $NotId$ and is mapped by $h$ to itself. So $(u, v)$ is not in<br />
$IE_{can}(G)$, for any such pair. For the induction, we assume that for all $u, v$, if $h(u) = v$ then<br />
$u, v$ are in the complement of $NotId$ after $k$ stages, and we prove this holds after the next<br />
stage. Indeed, let $u, v$ be a pair such that $h(u) = v$. Then the restriction of $h$ to $ln(u)$ is a<br />
$NotId$-isomorphism onto $ln(v)$, so the pair $(u, v)$ is not put into $NotId$.<br />
The converse to the proposition does not hold. As an example, consider a database with n<br />
classes, $C_1, \ldots, C_n$. Let class $C_i$ contain objects $x_i, y_i, z_i$, and let each of these have an l-labeled<br />
outgoing edge, so the graph contains 3n l-edges: the edges $(a_i, l, a_{i+1})$, for $i = 1, \ldots, n-1$ and $a = x, y, z$,<br />
together with the following three edges that close the chain: $(x_n, l, x_1)$, $(y_n, l, z_1)$, $(z_n, l, y_1)$.<br />
Finally, assume the following 3n h-edges: $(x_i, h, y_i)$, $(y_i, h, z_i)$, $(z_i, h, x_i)$. Now, since class nodes<br />
are V-distinguishable, the algorithm for V-identification will partition the objects so that<br />
$x_i, y_i, z_i$ will be in one equivalence class in the complement of NotId. Since the structure of each<br />
class is symmetric, no additional partition can occur. However, no non-trivial automorphism<br />
exists. Indeed, assume one exists, and call it h. Without loss of generality, assume $h(x_1) = y_1$.<br />
Then necessarily, from the edge structure, $h(y_1) = z_1$, $h(z_1) = x_1$, and the same, namely<br />
$h(x_i) = y_i$, ..., must hold for the objects in the classes $C_2, \ldots, C_n$. But now we reach a<br />
contradiction, since $(y_n, l, z_1)$ is in E, but $(h(y_n), l, h(z_1)) = (z_n, l, x_1)$ is not in E.<br />
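For small n the argument can be checked mechanically. The following Python sketch builds the example graph and enumerates every permutation of the objects that keeps each object inside its own class, retaining only those that map the edge set onto itself. The function names, the encoding of objects as (letter, index) pairs and the `twisted` flag are illustrative assumptions and do not appear in the paper. With `twisted=False` the three closing edges are replaced by the untwisted edges $(a_n, l, a_1)$.

```python
from itertools import permutations, product

LETTERS = "xyz"

def build_edges(n, twisted=True):
    """Return the labelled edges (source, label, target) of the example with n classes."""
    edges = set()
    for i in range(1, n):                      # chain edges (a_i, l, a_{i+1})
        for a in LETTERS:
            edges.add(((a, i), "l", (a, i + 1)))
    # Closing edges from class n back to class 1; the twist swaps y and z.
    targets = {"x": "x", "y": "z", "z": "y"} if twisted else {a: a for a in LETTERS}
    for a in LETTERS:
        edges.add(((a, n), "l", (targets[a], 1)))
    for i in range(1, n + 1):                  # h-edges x_i -> y_i -> z_i -> x_i
        for a, b in zip("xyz", "yzx"):
            edges.add(((a, i), "h", (b, i)))
    return edges

def class_preserving_automorphisms(n, twisted=True):
    """Brute force: all edge-preserving bijections that permute objects within each class."""
    edges = build_edges(n, twisted)
    found = []
    for perms in product(permutations(LETTERS), repeat=n):
        mapping = {(a, i + 1): (perms[i][k], i + 1)
                   for i in range(n) for k, a in enumerate(LETTERS)}
        image = {(mapping[u], lab, mapping[v]) for (u, lab, v) in edges}
        if image == edges:
            found.append(mapping)
    return found

twisted_autos = class_preserving_automorphisms(3, twisted=True)
plain_autos = class_preserving_automorphisms(3, twisted=False)
print(len(twisted_autos), len(plain_autos))
```

For n = 3 the twisted graph admits only the identity, whereas the untwisted variant keeps the three rotations within the classes.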
We now consider properties of H. Every g-homomorphism on a graph G partitions the<br />
nodes into equivalence classes. We can compare g-homomorphisms by comparing the partitions<br />
they induce. We say h dominates h' if each equivalence class of h' is contained in an<br />
equivalence class of h. Clearly, dominance induces a partial order. A g-homomorphism of<br />
G that dominates all other g-homomorphisms of G is called maximum. Any two maximum<br />
mappings dominate each other, hence induce the same partitions. Thus, their images are<br />
g-isomorphic.<br />
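Dominance depends only on the induced partitions, so it can be illustrated without reference to graphs. The sketch below represents a partition as a list of frozensets; the function name `dominates` and the sample partitions are assumptions made for illustration.

```python
def dominates(h, h_prime):
    """True if every block of h_prime is contained in some block of h."""
    return all(any(block <= big for big in h) for block in h_prime)

coarse = [frozenset("abc"), frozenset("d")]
fine = [frozenset("ab"), frozenset("c"), frozenset("d")]

print(dominates(coarse, fine), dominates(fine, coarse))
```

Two partitions that dominate each other have the same blocks, which is the observation used above for maximum mappings.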
Proposition 2.6. There exists a g-homomorphism which is maximum.<br />
Proof. Let HN be the transitive closure of the binary relation `u and v are H-indistinguishable'.<br />
It is an equivalence relation on the nodes N of the graph G. We can create a graph $\hat{G}$ whose<br />
nodes are the equivalence classes of HN, and such that ([u], l, [v]) is an edge if and only if there is an<br />
edge (u, l, v) in G, where u is any member of the equivalence class [u], and similarly for v.<br />
We claim that the mapping $\hat{h}$ that maps all elements of [u] in N to [u] is a g-homomorphism<br />
from G onto $\hat{G}$.<br />
Assume that for some g-homomorphism h, h(u) = h(v). Further assume that (u, l, u') is<br />
an edge in G. Then (h(u), l, h(u')) = (h(v), l, h(u')) is an edge in the image under h. Since h<br />
strongly preserves adjacency, (v, l, u') must be an edge in G. Thus, u and v are connected by<br />
l-edges to precisely the same nodes. The claim obviously holds also for other edge labels, and<br />
also for back edges of the form (u', l, u). In short, u and v have the same connections.<br />
Now, if v and w are identified by some h', then v and w have the same connections. It<br />
follows that u and w have the same connections. By induction, we now have that if we have a<br />
sequence $u_1, \ldots, u_n$ such that each pair $u_i, u_{i+1}$ is identified by some $h_i$, then all elements in<br />
the sequence have the same connections. In other words, all elements of an equivalence class<br />
in HN have the same connections. It follows that $\hat{h}$ is indeed a g-homomorphism. It is clear<br />
that it is a maximum g-homomorphism.<br />
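The central observation of the proof, that identified nodes have precisely the same connections, suggests a simple way to experiment with the quotient construction. The sketch below groups nodes whose labelled outgoing and incoming edges coincide and then builds the quotient graph on these groups. It illustrates the "same connections" criterion only; it is an assumption that this one-pass grouping is enough for the small example shown, and the function names are hypothetical.

```python
from collections import defaultdict

def connections(node, edges):
    """Labelled outgoing and incoming edges of a node, as a hashable pair."""
    out_edges = frozenset((lab, v) for (u, lab, v) in edges if u == node)
    in_edges = frozenset((lab, u) for (u, lab, v) in edges if v == node)
    return out_edges, in_edges

def same_connection_classes(nodes, edges):
    """Group the nodes that have exactly the same connections."""
    groups = defaultdict(set)
    for n in nodes:
        groups[connections(n, edges)].add(n)
    return [frozenset(g) for g in groups.values()]

def quotient(nodes, edges):
    """Quotient graph: one node per class, ([u], l, [v]) for each edge (u, l, v)."""
    classes = same_connection_classes(nodes, edges)
    cls = {n: c for c in classes for n in c}
    q_edges = {(cls[u], lab, cls[v]) for (u, lab, v) in edges}
    return classes, q_edges

nodes = ["a", "b", "c"]
edges = {("a", "l", "c"), ("b", "l", "c")}
classes, q_edges = quotient(nodes, edges)
print(sorted(sorted(c) for c in classes))
```

Here a and b both have a single l-edge to c and no incoming edges, so they fall into one class and the quotient graph has a single edge.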
Proposition 2.7. H-identifiability is the same as B-identifiability.<br />
Proof. A bisimulation on a graph induces an equivalence relation on its nodes, and it is<br />
easy to see that there exists a g-homomorphism from the graph to its image modulo this<br />
equivalence relation. In particular, taking the maximal bisimulation, we have that if objects<br />
are H-identifiable, then they are B-identifiable. For the opposite direction, we observe that<br />
the maximum g-homomorphism induces a bisimulation on the graph.<br />
We now compare V and H. Let us call the equivalence classes into which $\hat{h}$ partitions the<br />
nodes of G the $\hat{h}$-classes. As noted above, the notion of V-identifiability also defines<br />
equivalence classes on G, each consisting of objects that are pairwise V-indistinguishable.<br />
Denote these as V-classes.<br />
Proposition 2.8. Let G be a graph. Then each $\hat{h}$-class is contained in some V-class.<br />
Proof. We prove the claim by induction on the stages of the computation of the V-classes.<br />
Initially, each $v \in N \cap V$ is an equivalence class by itself, and all other elements of N are in<br />
one class. Clearly, each $v \in N \cap V$ is also a singleton $\hat{h}$-class, so the claim holds. Assume now<br />
it holds after stage k. If the local neighborhoods of u, v of the same $\hat{h}$-class are compared,<br />
they will be found to be isomorphic. To see this, observe that, as we showed above, they have<br />
precisely the same connections. Thus, the mapping that takes u to v and leaves all other nodes<br />
in place is an isomorphism of ln(u) and ln(v). It follows that they will not be separated.<br />
Corollary 2.9. V-identifiability implies H-identifiability.<br />
The converse fails: In the example above, no two nodes have precisely the same connections,<br />
hence $\hat{h}$ is the identity, so each node is H-identifiable, but as we saw, nodes are not<br />
V-identifiable.<br />
If we change the example by connecting $x_n$ to $x_1$, ..., then there is a non-trivial<br />
g-automorphism, so nodes in the same class are not A-identifiable, but they are still<br />
H-identifiable.<br />
Proposition 2.10. A-identifiability implies H-identifiability.<br />
Proof. Assume nodes u, v are H-indistinguishable. We claim that the map that interchanges<br />
u with v and is the identity elsewhere is a g-automorphism.<br />
We now consider Q. As mentioned above, when one starts from $IE_{can}(G)$, the complement<br />
of the $\neq$ relation computed by the algorithm for V-identification is a partition of the nodes<br />
into equivalence classes, the V-classes.<br />
Proposition 2.11. Let G be a graph, and q be a query. Then q(G) is a union of V-classes.<br />
Proof. The proof uses induction on the structure of queries. If $M \subseteq V$, then $M(G) = M \cap N$,<br />
which obviously is a union of V-classes, as each value is in a class by itself. Now, assume the<br />
claim is true for a query q, let $J \subseteq L$, and consider the query $\rightarrow q$. Assume $u \in {\rightarrow} q(G)$, and<br />
also assume that u' is in the same V-class as u. Then there is a $\neq$-isomorphism from ln(u)<br />
onto ln(u'). In particular, if there is an l-edge from u to a V-class, there is an l-edge from u'<br />
to the same class. It follows that u' is also in ${\rightarrow} q(G)$. The case for $\leftarrow q$ is similar. Since sets of<br />
V-classes are closed under boolean operations, the proof is complete.<br />
Corollary 2.12. Q-identifiability implies V-identifiability.<br />
The converse does not hold, since the V-algorithm can count, while queries do not count.<br />
For example, assume $o_1, o_2$ have l-edges to a node labeled 3, $o_3$ has a k-edge to $o_1$, while $o_4$<br />
has k-edges to both $o_1$ and $o_2$. The V-algorithm will separate $o_3$ from $o_4$, since their local<br />
neighborhoods are not isomorphic. Queries cannot separate $o_1$ from $o_2$, hence they also cannot<br />
separate nodes related to them by the same kind of edges, like $o_3, o_4$.<br />
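The difference between counting and not counting can be made concrete with a generic partition-refinement routine. The sketch below is ordinary colour refinement, used here only as a stand-in: with `counting=True` a node's signature records how many neighbours of each colour it has, with `counting=False` it records only which colours occur. It is an assumption that these two modes roughly mirror the behaviour of the V-algorithm and of queries on this example; neither is the paper's definition.

```python
from collections import Counter

def refine(nodes, edges, initial, counting):
    """Refine an initial colouring until the number of colour classes stops growing."""
    colour = dict(initial)
    while True:
        signature = {}
        for n in nodes:
            out = [(lab, colour[v]) for (u, lab, v) in edges if u == n]
            inc = [(lab, colour[u]) for (u, lab, v) in edges if v == n]
            if counting:   # multiset of neighbour colours
                key = (frozenset(Counter(out).items()), frozenset(Counter(inc).items()))
            else:          # set of neighbour colours only
                key = (frozenset(out), frozenset(inc))
            signature[n] = (colour[n], key)
        ids = {}           # renumber signatures to small integers
        new = {n: ids.setdefault(signature[n], len(ids)) for n in nodes}
        if len(set(new.values())) == len(set(colour.values())):
            return new
        colour = new

nodes = ["3", "o1", "o2", "o3", "o4"]
edges = {("o1", "l", "3"), ("o2", "l", "3"),
         ("o3", "k", "o1"), ("o4", "k", "o1"), ("o4", "k", "o2")}
# The value node is distinguishable from the start; all objects share one colour.
initial = {"3": "value 3", "o1": "obj", "o2": "obj", "o3": "obj", "o4": "obj"}

with_counting = refine(nodes, edges, initial, counting=True)
without_counting = refine(nodes, edges, initial, counting=False)
print(with_counting["o3"] != with_counting["o4"],
      without_counting["o3"] != without_counting["o4"])
```

In the counting mode $o_3$ and $o_4$ end up in different classes because $o_4$ has two outgoing k-edges; in the non-counting mode they stay together, as do $o_1$ and $o_2$.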
Notice that for non-canonical sets of inequalities the generalized V-identifiability and the equivalent<br />
E-identifiability do not imply H-identifiability or A-identifiability. Thus, identifiability<br />
based on inequality sets can be a very powerful method.<br />
Integrity constraints can be used for identifiability as well, since they can impose distinguishability<br />
or indistinguishability of objects. As we have seen, sometimes two objects can be<br />
distinguished from each other if a pair of others can. Thus, distinguishability that is deduced<br />
from the integrity constraints in the system can propagate.<br />
Using a generalized relational representation, equality generating dependencies are constraints<br />
of the following form:<br />
\[ \forall x \, \big( (P_R(x_1) \wedge \ldots \wedge P_R(x_m) \wedge F(x_{m+1})) \rightarrow G(x) \big) \]<br />
where F, G are conjunctions of equalities of the form $x_{ij} = x_{i'j'}$, $P_R$ is the predicate symbol<br />
associated with relation R, and $x_i \subseteq x$. Based on the transformation of the constraint to the<br />
equivalent formula<br />
\[ \forall x \, \big( (P_R(x_1) \wedge \ldots \wedge P_R(x_m) \wedge \neg G(x_{m+1})) \rightarrow \neg F(x) \big) \]<br />
we can use the inequality set IE in order to extend the deductive system $D_V^{IE}$. Thus, we<br />
can express the identification properties on the basis of value-distinguishability or equational<br />
logic in the case of equality generating dependencies. Notice that functional dependencies,<br />
key constraints, and generalized functional dependencies are special equality generating dependencies.<br />
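For the special case of a functional dependency A → B with single attributes, the contrapositive reading says that two tuples whose B-components are known to be different must also have different A-components. The sketch below propagates known inequalities between object names in this way until nothing new can be derived. The relation, the attribute names and the function `propagate` are illustrative assumptions; dependencies with several attributes on the left-hand side would only yield a disjunction of inequalities and are not handled.

```python
from itertools import combinations

def propagate(rows, fds, unequal):
    """Derive inequalities from functional dependencies read contrapositively.

    rows:    list of dicts mapping attribute -> object name
    fds:     list of pairs (A, B) standing for the dependency A -> B
    unequal: set of frozenset pairs of names known to denote different objects
    """
    unequal = set(unequal)
    changed = True
    while changed:
        changed = False
        for r1, r2 in combinations(rows, 2):
            for a, b in fds:
                if frozenset((r1[b], r2[b])) in unequal:
                    pair = frozenset((r1[a], r2[a]))
                    # len(pair) == 1 would mean the dependency is violated; skip it here
                    if len(pair) == 2 and pair not in unequal:
                        unequal.add(pair)
                        changed = True
    return unequal

rows = [{"emp": "e1", "dept": "d1", "boss": "m1"},
        {"emp": "e2", "dept": "d2", "boss": "m2"}]
fds = [("emp", "dept"), ("dept", "boss")]
derived = propagate(rows, fds, {frozenset(("m1", "m2"))})
print(sorted(sorted(p) for p in derived))
```

Starting from the single fact that m1 and m2 differ, the sketch derives first that d1 and d2 differ and then that e1 and e2 differ.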
Corollary 2.13. Identification extended by equality-generating dependencies can be expressed<br />
by V-identifiability.<br />
An exclusion dependency is an expression of the form<br />
\[ R[R.A_1, \ldots, R.A_n] \parallel S[S.B_1, \ldots, S.B_n]. \]<br />
The property specified by the exclusion dependency can be directly translated to inequalities<br />
among objects.<br />
A generalized inclusion dependency is an expression of the form<br />
\[ R_1[X_1] \cap \ldots \cap R_n[X_n] \subseteq S_1[Y_1] \cup \ldots \cup S_m[Y_m] \]<br />
for compatible sequences $X_i$, $Y_j$. Similarly to equality-generating dependencies, generalized<br />
inclusion dependencies can be transformed to negated formulas. These formulas are the basis<br />
for the extension of the deductive system $D_V^{IE}$.<br />
Corollary 2.14. Identification extended by generalized inclusion dependencies and exclusion<br />
dependencies can be expressed by V-identifiability.<br />
Disjunctive existence constraints $X \Rightarrow Y_1, Y_2, \ldots, Y_n$ specify that if a tuple is completely defined<br />
on X then it is completely defined on $Y_i$ for some i. There is an axiomatization for disjunctive<br />
existence constraints. They can be represented by monotone Boolean functions.<br />
Since the existence has been treated explicitly in the definition of value-identifiability we<br />
conclude directly:<br />
Corollary 2.15. Identification extended by existence dependencies can be expressed by<br />
V-identifiability.<br />
Summarizing the comparisons above we obtain<br />
Corollary 2.16. V-identifiability and E-identifiability are equivalent for generalized inclusion,<br />
exclusion, existence and equality-generating dependencies.<br />
2.5 Conclusion<br />
This paper has reconsidered notions of identity and identifiability in OODB's. We have proposed<br />
and justified the thesis that object identifiers, as proposed and used in most OODB<br />
implementations, are system-related but do not address problems of users. In particular, although<br />
the o-id mechanism guarantees that objects do have a unique identity, as required<br />
in the foundational postulates for OODB's, that by itself does not provide for identifiability,<br />
namely the ability of a program or user to uniquely identify an object. For the latter,<br />
as far as users of an OODB are concerned, a value-based mechanism must be used. We have<br />
shown a close relationship between identifiability and separability of objects from each other.<br />
In order to better understand identifiability, we have studied various notions that can be<br />
used as a specification of this notion, and have classified their relative strengths. Our results<br />
and discussion complement and augment previous discussion of object identification and its<br />
complexity in the literature, e.g., [2, 10, 14, 15].<br />
We have not considered the practical issues of identifiability. Since the mechanism must<br />
be value-based, a notion of keys, as used in relational databases, certainly suffices. However,<br />
OODB's offer a much richer structure, and it seems reasonable to expect that more of this<br />
structure be used for identification. E.g., one can use not only the value of a key attribute,<br />
but also its membership in a given class, and in particular distinguish between membership in<br />
subclasses of a given class. Initial work in this direction has been reported in [19, 20]. We see<br />
in this an important research direction.<br />
A related problem concerns the representation of real-world entities by database objects.<br />
A posteriori it is possible for two or more objects to represent the same real-world entity.<br />
Thus, in addition to the primary notion of object equality, where objects are equal if they are<br />
identical in the database, we have referential equality, where two database objects refer to the<br />
same real-world entity. Although not discussed much in the literature, it is probably undesirable<br />
to allow different database objects to represent the same real-world entity. Whether<br />
such phenomena can be avoided depends on the identification mechanisms supported by the<br />
system. A mechanism that is easy to use, simple to understand, and can be directly related<br />
to properties of real-world entities can help avoid problems of multiple representations.<br />
A final point concerns views. One can say that the reason o-id's cannot help users to<br />
identify objects is the existence of an abstraction barrier between the system and its users.<br />
The system knows the values of o-id's and can use them freely for all its needs. However, since<br />
only an equality test is exported, these same o-id's are much less useful for the users. One<br />
can say that the OODB as seen by the users is a view of the internal OODB, as seen by the<br />
system. It follows that one should expect similar problems when dealing with views. Namely,<br />
it is possible that the identification mechanism in an OODB uniquely identifies each object<br />
in the current state. Yet, since views present a restricted viewpoint, it is possible that this<br />
property does not hold for some views. This may be problematic for views that allow updates,<br />
and possibly also for queries. Note that view definitions may form abstraction barriers in ways<br />
that are much more sophisticated than the simple interface between the implementation and<br />
conceptual levels of an OODB. The analysis of identifiability issues may therefore be more<br />
difficult. The problem will become both more difficult and more important as distributed<br />
access to distinct OODB's through the Web becomes common.<br />
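The loss of identification in a view can be illustrated with plain key-style identification over attribute values. This is a deliberate simplification of the notions studied above, and the attribute names, the sample data and the function `identifies` are assumptions made for the example.

```python
def identifies(objects, attributes):
    """True if no two objects agree on all of the listed attributes."""
    seen = set()
    for obj in objects:
        key = tuple(obj[a] for a in attributes)
        if key in seen:
            return False
        seen.add(key)
    return True

persons = [
    {"name": "Smith", "born": 1960, "city": "Cottbus"},
    {"name": "Smith", "born": 1971, "city": "Cottbus"},
    {"name": "Jones", "born": 1960, "city": "Clausthal"},
]

full_state = identifies(persons, ["name", "born", "city"])
restricted_view = identifies(persons, ["name", "city"])
print(full_state, restricted_view)
```

In the full state every object is uniquely identified, but a view that hides the year of birth can no longer tell the two Smith objects apart.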
References for Chapter 2<br />
1. S. Abiteboul, P.C. Kanellakis, Object identity as a query language primitive. Proc. SIGMOD,<br />
1989, 159-173.<br />
2. S. Abiteboul, J. Van den Bussche, Deep equality revisited. Proc. DOOD'95 (eds. T.W. Ling, A.O.<br />
Mendelzon, L. Vielle), LNCS 1013, 213-228.<br />
3. C. Beeri, A formal approach to object-oriented databases. Data and Knowledge Engineering, 5,<br />
1990, 4, 353-382.<br />
4. C. Beeri, Some thoughts on the future evolution of object-oriented database concepts. Proc. BTW<br />
93 (ed. W. Stucky), Springer, 1993, 18-32.<br />
5. K. Berka, L. Kreiser, Texts on logics. Akademie-Verlag, Berlin, 1973.<br />
6. J. Biskup, H.H. Brüggemann, An object-surrogate-value approach for database languages.<br />
Technical report 16-3-89, University Hildesheim, Dept. Computer Science.<br />
7. H.B. Curry, Foundations of mathematical logic. McGraw-Hill, New York, 1963.<br />
8. G. Frege, Funktion und Begriff. Jena 1891.<br />
9. H. Gaifman, On local and non-local properties. Proc. of the Herbrand Symposium, Logic Colloq.<br />
'81, North-Holland, Amsterdam, 1982.<br />
10. M. Gogolla, A declarative query approach to object identification. Proc. OO-ER95 (ed. M. Papazoglou),<br />
LNCS 1021, 65-76.<br />
11. M. Gyssens, J. Paredaens, D. van Gucht, A graph-oriented object database model. Proc. PODS,<br />
1990, 417-424.<br />
12. S.N. Khoshafian, G. Copeland, Object identity. Proc. OOPSLA-86, special issue of SIGPLAN<br />
Notices (ed. N. Meyrowitz), 21 (12), Dec. 1986, 406-416.<br />
13. S.C. Kleene, Mathematical logic. John Wiley, New York, 1967.<br />
14. H.-J. Klein, J. Rasch, Value based identification and functional dependencies for object databases.<br />
Proc. 3rd Basque Int. Workshop on Information Technology, IEEE Comp. Sci. Press, 1997, 22-34.<br />
15. A. Kosky, Observational distinguishability. Proc. 5th DBPL, Electronic Report of Conferences in<br />
Computing, Springer, 1995.<br />
16. G.W. Leibniz, Fragmente zur Logik. Edited by Fr. Schmidt, Berlin, 1960.<br />
17. P.S. Poreckij, Theorie conjointe des egalites des non-egalites logiques. News of Physics Society of<br />
Kazan University, XVI, No. 1-2, 1908.<br />
18. J. Rumbaugh, Controlling propagation of operations using attributes on relations. Proc.<br />
OOPSLA88, ACM Sigplan Notices (23,11), Nov. 1988, 285-296.<br />
19. K.-D. Schewe, J.W. Schmidt, I. Wetzel, Identification, Genericity and Consistency in Object-Oriented<br />
Databases. In J. Biskup, R. Hull (eds.), Proc. 3rd International Conference on Database<br />
Theory, ICDT '92, Berlin (Germany), Lecture Notes in Computer Science, 341-356, 1992, Springer.<br />
20. K.-D. Schewe, B. Thalheim, Fundamental Concepts of Object Oriented Databases. Acta Cybernetica,<br />
11, No. 4, 1993, 49-81.<br />
21. B. Thalheim, Reconsidering key and identification concepts in different database models. Technical<br />
Report CS-08-91, University of Rostock, 1991.<br />
22. J. Van den Bussche, J. Paredaens, The expressive power of complex values in object-based data<br />
models. Inf. Comput. 120, 220-236.<br />
23. J. Van den Bussche, D. van Gucht, M. Andries, M. Gyssens, On the completeness of object-creating<br />
database transformation languages. JACM 44:2, March 1997, 272-319.<br />
Chapter 3<br />
Fundamentals of Object Oriented<br />
Database Modelling<br />
Contents<br />
3.1 Introduction<br />
3.2 Type Systems<br />
3.3 OODM Schemata<br />
3.4 Value Representability<br />
3.5 Logical Reconstruction<br />
3.6 Conclusion<br />
This chapter contains a reprint of<br />
Klaus-Dieter Schewe. Fundamentals of Object Oriented Database Modelling. Intelligent<br />
Systems. Moscow 1997.<br />
Abstract. Solid theoretical foundations of object oriented databases (OODBs) are still missing.<br />
The work reported in this paper contains results on a formally founded object oriented<br />
datamodel (OODM) and is intended to contribute to the development of a uniform mathematical<br />
theory of OODBs.<br />
A clear distinction between objects and values turns out to be essential in the OODM.<br />
Types and classes are used to structure values and objects respectively. This can be founded<br />
on top of any underlying type system. We outline different approaches to type systems and<br />
their semantics and claim that OODB theory on top of arbitrary type systems leads to type<br />
theory with topos-theoretically defined semantics.<br />
On this basis the known solutions to the problems of unique object identification and<br />
genericity can be generalized. It turns out that extents of classes must be completely representable<br />
by values. Such classes are called value-representable. As a consequence object<br />
identifiers degenerate to a pure implementation concept. This stimulates considerations that<br />
do not depend on such identifiers.<br />
In order to approach this problem object oriented schemata and instances are reorganized<br />
by means of general category-theoretical arguments to let them occur as theories in the<br />
higher-order intuitionistic logic associated with a topos defined by the type system. Moreover, in the<br />
case of value-representability it can be seen that object identifiers can be dispensed with at<br />
the logical level. This allows queries to be approached algebraically as well as logically and sets up<br />
a starting point for deduction within OODBs.<br />
3.1 Introduction<br />
The shortcomings of the relational database approach encouraged much research aimed at<br />
achieving more appropriate data models. It has been claimed that the object-oriented approach<br />
will be the key technology for future database systems and languages [8]. Several systems<br />
[5, 6, 7, 9, 19, 20, 21, 22, 24, 27, 38, 40, 41, 70] arose from these efforts. However, in contrast<br />
to research in the relational area there is no common formal agreement on what constitutes<br />
an object-oriented database [11, 12, 14].<br />
The basic question "What is an object?" seems to be trivial, but already here the variety<br />
of answers is large. In object oriented programming the notion of an object was intended as<br />
a generalization of the abstract data type concept with the additional feature of inheritance.<br />
In this sense object orientation involves the isolation of data in semi-independent modules in<br />
order to promote high software development productivity. The development of object oriented<br />
databases regarded an object also as a basic unit of persistent data, a view that is heavily<br />
influenced by existing semantic datamodels (SDMs) [2, 30, 31, 43, 44, 63]. Thus, object oriented<br />
databases are composed of independent objects but must also provide for the maintenance of<br />
inter-object consistency, a demand that is to some degree in dissonance with the basic style<br />
of object orientation.<br />
Theoretical investigations in the field of OODBs are rare. The few existing results in OODB<br />
theory can be classified in three groups. The first one [25, 65, 66, 67] studies expressiveness and<br />
complexity of query languages with object creation and duplicate elimination. This follows<br />
more or less the ideas of the IQL framework [3]. The second one [12, 14, 15, 16, 54, 55] asks<br />
for the fundamental features of object oriented datamodels and their semantical foundations.<br />
The third group [4, 37] continues the line of research in which databases occur as theories<br />
defined by logic programs.<br />
A view that is common in OODB research is that objects are abstractions of real world<br />
entities and should have an identity [8]. This leads to a distinction between values and objects<br />
[11, 12]. A value is identified by itself whereas an object has an identity independent of its<br />
value. This object identity is usually encoded by object identifiers [1, 3, 36]. Abstracting from<br />
the pure physical level the identifier of an object can be regarded as being immutable during<br />
the object's lifetime. Identifiers ease the sharing and update of data. However, such abstract<br />
identifiers do not relieve us of the task of providing unique identification mechanisms for<br />
objects. In object oriented programming object names are sufficient, but retrieving mass data<br />
by name is senseless.<br />
In most approaches to OODBs an object is coupled with a value of some fixed structure.<br />
In our view this already contradicts the goal of objects being abstractions of reality.<br />
In real situations an object has several and also changing aspects that should be captured by<br />
the object model. Therefore, in our object model each object o consists of a unique identifier<br />
id, a set of (type, value) pairs $(T_i, v_i)$, a set of (reference, object) pairs $(ref_j, o_j)$ and a set<br />
of operations $m_k$.<br />
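The four components of an object can be pictured with a small Python sketch. The class name `OODMObject`, the use of dictionaries keyed by type name, reference name and operation name, and the sample data are assumptions made for illustration; in particular, keying values by type name allows at most one value per type, which the text does not require.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass(eq=False)  # two objects are the same only if they are the same object
class OODMObject:
    ident: int                                                    # abstract identifier id
    values: Dict[str, Any] = field(default_factory=dict)          # type name T_i -> value v_i
    references: Dict[str, "OODMObject"] = field(default_factory=dict)   # ref_j -> object o_j
    operations: Dict[str, Callable[..., Any]] = field(default_factory=dict)  # name -> m_k

anna = OODMObject(ident=1)
anna.values["PERSON"] = {"name": "Anna", "born": 1970}
anna.values["STUDENT"] = {"matriculation": 4711}    # a second aspect of the same object
boss = OODMObject(ident=2, values={"PERSON": {"name": "Bert", "born": 1950}})
anna.references["supervisor"] = boss
anna.operations["rename"] = lambda obj, new: obj.values["PERSON"].update(name=new)

anna.operations["rename"](anna, "Anne")   # the value changes, the identity does not
del anna.values["STUDENT"]                # aspects may also be dropped over time
print(anna.ident, sorted(anna.values))
```

The identifier stays fixed while values, aspects and references change, which is the separation of identity from value described above.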
Types are used to structure values. Then the first problem concerns the semantics of the<br />
type system, i.e. the variety of types that can be defined and used in schema definitions.<br />
We consider three different approaches based on a simple type system with set semantics, the<br />
typed $\lambda$-calculus and a slightly extended version of Girard-Reynolds polymorphism [17, 42, 48].<br />
For the third case it is well-known that there is no set-theoretic model. In this case, however,<br />
suitable models can be obtained in the effective topos [34, 32, 50] or even in Grothendieck<br />
topoi [47]. Moreover, we may always ask how good a model is with respect to computational<br />
aspects. Here again it may be argued that having an intuitionist's mind, i.e. taking a<br />
topos-theoretic point of view, may help to have effective computations [49].<br />
Classes serve as structuring primitives for objects having the same structure and behaviour.<br />
It is obvious that the multiple aspects view of objects allows them to be simultaneously<br />
members of more than one class and to change class memberships. In the OODM a class<br />
structure uniformly combines aspects of object values and references. The extent of classes<br />
varies over time, whereas types are immutable. Relationships between classes are represented<br />
by references together with referential constraints on the object identifiers involved. Moreover,<br />
each class is accompanied by a collection of operations. A schema is given by a collection of<br />
class definitions together with explicit integrity constraints. It will be shown that the semantics<br />
of OODM schemata can be defined in a uniform way independently from the underlying type<br />
system.<br />
Important OODB problems concern the unique identification of objects and the existence<br />
of generic update operations [55]. Following [1, 13] the immutable identity of an object can<br />
be encoded by the concept of abstract object-identifiers. The advantages of this approach are<br />
that sharing, mutability of values and cyclic structures can be represented easily [46]. On the<br />
other hand, object identifiers do not have a meaning for the user and should therefore be<br />
hidden. The notion of value-representability is known to guarantee unique identification in<br />
the case of set semantics. This can be generalized to the general case. The same applies to<br />
the genericity problem.<br />
We then show how classes, schemata and instances can be captured in categorical<br />
terms. Using the internal logic of a topos, we may define schemata and instances<br />
by theories and even get rid of object identifiers using the existence, identity and description<br />
predicates in intuitionistic logic instead. On this basis algebraic and logical queries can be<br />
defined. However, this last step depends on value-representability, a necessary property for<br />
genericity [55], whereas for the unique identification of objects weak value-identifiability would<br />
be sufficient [16, 55]. However, some slight extensions, which are omitted here, also allow<br />
this case to be captured.<br />
Throughout the paper we assume some basic knowledge of category theory [10], elementary<br />
topos theory [35, 39] and their relation to higher-order intuitionistic logic [28, 39, 60].<br />
3.2 Type Systems<br />
We start with a brief look at three different type systems and their semantics. The three<br />
approaches comprise a very simple type system with set semantics, the typed $\lambda$-calculus with<br />
semantics in cartesian closed categories and a version of the polymorphic or second-order<br />
typed $\lambda$-calculus.<br />
Common to all these cases is the view that types are basically given by base types and<br />
constructors. The latter will occur as types with free (type) variables. A type without free<br />
variables will be called proper. Among the base types we assume an abstract identifier type<br />
ID. A type T without occurrence of ID will be called a value type.<br />
A Simple Type System. In set-based modelling a type may be regarded as an immutable<br />
set of values of a uniform structure. Subtyping is used to relate values in different<br />
types. We use a type system that consists of some base types such as BOOL, NAT, INT,<br />
STRING, etc., and type constructors for records, finite sets, lists, etc. Arbitrary types can<br />
then be defined by nesting. Moreover, we assume recursive types with a semantics defined<br />
by rational trees. More formally, the type<br />
system can be defined as<br />
$$t ::= b \mid x \mid (a_1 : t_1, \dots, a_n : t_n) \mid \{t\} \mid [t] \mid \mu x.t \,.$$<br />
The semantics of such types as sets of values is defined as usual. Then the type system may be<br />
extended by a subtype relation $t' \le t$ [42], which semantically gives rise to subtype functions<br />
$t' \to t$. We omit the details here.<br />
If $t'$ is a proper type occurring in a type $t$, then there exists a corresponding occurrence<br />
relation<br />
$$o : t \times t' \to \Omega$$<br />
where $\Omega$ is the truth object in Set, i.e. $\Omega$ = BOOL.<br />
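The grammar of the simple type system and the notions of proper type and value type can be made concrete in a small sketch. The Python encoding below is only an illustration under stated assumptions: the constructor names (`Base`, `Var`, `Record`, `SetOf`, `ListOf`, `Mu`) and all helper functions are hypothetical and not part of the OODM, and `occurs` only tests syntactic occurrence of one type in another, which is the precondition for the value-level occurrence relation rather than the relation itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:            # base type b, e.g. NAT, STRING, ID
    name: str


@dataclass(frozen=True)
class Var:             # type variable x
    name: str


@dataclass(frozen=True)
class Record:          # (a_1 : t_1, ..., a_n : t_n)
    fields: tuple      # tuple of (label, type) pairs


@dataclass(frozen=True)
class SetOf:           # {t}
    elem: object


@dataclass(frozen=True)
class ListOf:          # [t]
    elem: object


@dataclass(frozen=True)
class Mu:              # mu x . t (recursive type)
    var: str
    body: object


ID, NAT, STRING = Base("ID"), Base("NAT"), Base("STRING")


def children(t):
    """Immediate component types of t."""
    if isinstance(t, Record):
        return [u for _, u in t.fields]
    if isinstance(t, (SetOf, ListOf)):
        return [t.elem]
    if isinstance(t, Mu):
        return [t.body]
    return []


def free_vars(t):
    """Free type variables of t; mu binds its variable."""
    if isinstance(t, Var):
        return frozenset({t.name})
    fv = frozenset().union(*(free_vars(c) for c in children(t)))
    return fv - {t.var} if isinstance(t, Mu) else fv


def occurs(t, sub):
    """Syntactic occurrence of the type sub inside t."""
    return t == sub or any(occurs(c, sub) for c in children(t))


def is_proper(t):
    return not free_vars(t)


def is_value_type(t):
    return not occurs(t, ID)


PERSONNAME = Record((("FirstName", STRING), ("SecondName", STRING),
                     ("Titles", SetOf(STRING))))
# A type with one free variable (the name "alpha" is an assumption).
WITH_VAR = Record((("PersonIdentityNo", NAT), ("Spouse", Var("alpha"))))
```

Here `PERSONNAME` is a proper value type, whereas `WITH_VAR` is a value type that is not proper, since its free variable still has to be replaced.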
Typed $\lambda$-Calculus. In the typed $\lambda$-calculus the main emphasis is on function types, i.e. we<br />
can define the type system by<br />
t := b j x j (a 1 : t 1 :::a n : t n ) j t 1 ! t 2 j :<br />
The semantics of the typed $\lambda$-calculus can be described by cartesian closed categories.<br />
Polymorphism. As the third approach we choose a slightly enriched version of Girard-Reynolds<br />
polymorphism (GRP), i.e. types are given by the language<br />
$$t ::= b \mid x \mid t_1 \times \dots \times t_n \mid t_1 \to t_2 \mid \Pi x.t$$<br />
where $b$ denotes some collection of base types including for our purposes a type ID of object<br />
identifiers, $x$ represents some type variable, $\times$ represents product types, $\to$ represents function<br />
types and $\Pi$ impredicative polymorphic abstraction with $x$ running over all types [42, 48].<br />
First recall the notion of a topos. A topos E is a finitely complete cartesian closed category<br />
with a subobject classifier, i.e. there is an object $\Omega$ and a global element $true : \mathbf{1} \to \Omega$ such<br />
that for each monomorphism $f : A \hookrightarrow B$ there is a unique classifying morphism $cl(f) : B \to \Omega$<br />
such that $f$ and $triv$ define the pullback of $cl(f)$ and $true$. Here $\mathbf{1}$ denotes a terminal object<br />
and $triv : A \to \mathbf{1}$ is the unique morphism into it.<br />
Then we may define what we mean by a model of our type theory in a topos E. A model<br />
of GRP in a topos E consists of an essentially small internal category $\mathbb{E}$ that is closed under<br />
finite products, exponents and $Ob(\mathbb{E})$-indexed products, together with an embedding into E<br />
which preserves these properties. The best-known such model is given by the category<br />
Per of partial equivalence relations in the effective topos.<br />
For an exposition of various approaches to constructing such models we refer to [34, 47].<br />
3.3 OODM Schemata<br />
In this section we present a slightly modified version of the object oriented datamodel (OODM)<br />
of [52, 54, 58]. We observe that an object in the real world always has an identity. Therefore,<br />
abstract (i.e. system-provided) object identifiers are introduced to capture identity. However,<br />
neither the real world object that was the basis of the abstraction nor the abstract identifier<br />
can be used for the identification of an object.<br />
In contrast to existing object oriented datamodels [1, 3, 5, 6, 7, 8, 9, 20, 21, 27, 38, 40,<br />
46, 61], an object is not coupled with a unique type. Instead, we observe that real world<br />
objects can have different aspects that may change over time. Therefore, a primary decision<br />
was taken to let an object be associated with more than one type and to let these types even<br />
change during the object's lifetime. The same applies to references to other objects.<br />
The Class Concept. The class concept provides the grouping of objects having the same<br />
structure, which uniformly combines aspects of object values and references. Moreover, generic<br />
operations on objects such as object creation, deletion and update of its values and references<br />
are associated with classes, provided these operations can be defined unambiguously. Objects<br />
can belong to different classes, which guarantees that each object of our abstract object model<br />
is captured by the collection of possible classes. Just as values are only defined via types,<br />
objects can only be defined via classes.<br />
Each object in a class consists of an identifier, a collection of values and references to<br />
objects in other classes. Identifiers can be represented using the unique identifier type ID.<br />
Values and references can be combined into a representation type, where each occurrence of<br />
ID denotes a reference to some other class. Therefore, we may define the structure of a class<br />
using types with free variables.<br />
As to dynamics, we distinguish between visible and hidden operations to separate those<br />
operations that can be invoked by the user from the others. All operations on a class including the<br />
hidden ones can be accessed by other operations, but only hidden operations can be used to<br />
handle identifiers.<br />
In the following $N$ denotes some (large enough) collection of names.<br />
(i) Let $t$ be a value type with free variables $\alpha_1, \dots, \alpha_n$. For pairwise distinct reference names<br />
$r_1, \dots, r_n \in N$ and class names $C_1, \dots, C_n \in N$ the expression derived from $t$ by replacing<br />
each $\alpha_i$ in $t$ by $r_i : C_i$ for $i = 1, \dots, n$ is called a structure expression.<br />
(ii) A class consists of a class name $C \in N$, a structure expression $S$, a set of superclass names<br />
$\{D_1, \dots, D_m\} \subseteq N$ and a set $\{m_1, \dots, m_k\}$ of operations. We call $r_i$ a reference from class<br />
$C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID<br />
is called the representation type $T_C$ of the class $C$; the type $U_C = (ident : ID, value :: T_C)$<br />
is called the class type of $C$.<br />
(iii) An operation signature consists of an operation name $M \in N$, a set of input-parameter<br />
/ input-type pairs $\iota_i :: T_i$ ($\iota_i \in N$) and a set of output-parameter / output-type pairs<br />
$o_j :: T'_j$ ($o_j \in N$). We write<br />
$$o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n :: T_n) \,.$$<br />
(iv) An operation $M$ on a class $C$ consists of an operation signature with name $M$ and a body<br />
that is recursively built from the following constructs:<br />
(a) assignment $x := E$, where $x$ is either the class variable $x_C$ or a local variable within<br />
$S$, and $E$ is a term of the same type as $x$,<br />
(b) skip, fail, loop,<br />
(c) sequential composition $S_1 ; S_2$, choice $S_1 \,\Box\, S_2$, projection $x :: T \mid S$, guard $P \to S$,<br />
restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type<br />
$T$, and<br />
(d) instantiation $x'_1, \dots, x'_i \leftarrow C' : S'(E'_1, \dots, E'_j)$, where $S'$ is an operation on class $C'$ with<br />
input-parameters $\iota'_1, \dots, \iota'_j$ and output-parameters $o'_1, \dots, o'_i$, such that the variables<br />
$o'_f$, $x'_f$ have the same type and the term $E'_g$ has the same type as the variable $\iota'_g$.<br />
(v) An operation $M$ on a class $C$ with signature $o_1 :: T'_1, \dots, o_m :: T'_m \leftarrow M(\iota_1 :: T_1, \dots, \iota_n ::<br />
T_n)$ is called value-defined iff all $T_i$ ($i = 1, \dots, n$) and $T'_j$ ($j = 1, \dots, m$) are proper value<br />
types.<br />
(vi) A schema $\mathcal{S}$ is a finite collection of classes $C_1, \dots, C_n$ closed under references, superclasses<br />
and occurrences of class names in operations.<br />
Semantics. First assume that the underlying type system has a set semantics. Then we can<br />
define instances of OODM schemata.<br />
An instance $\mathcal{D}$ of a schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $U_C$ such that<br />
the following conditions are satisfied:<br />
uniqueness of identifiers: For every class $C$ we have<br />
$$\forall i :: ID.\ \forall v, w :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w \,. \qquad (3.17)$$<br />
inclusion integrity: For a subclass $C$ of $C'$ we have<br />
$$\forall i :: ID.\ i \in dom(\mathcal{D}(C)) \Rightarrow i \in dom(\mathcal{D}(C')) \,. \qquad (3.18)$$<br />
Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />
$$\forall i :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') \,. \qquad (3.19)$$<br />
referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence relation<br />
$o_r$ we have<br />
$$\forall i, j :: ID.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in dom(\mathcal{D}(C')) \,. \qquad (3.20)$$<br />
On the basis of topos theory we can rephrase the definition of database instances. Instead of<br />
the set $\mathcal{D}(C)$ we have to consider a subobject $\jmath : D_C \hookrightarrow ID \times T_C$, i.e. a monomorphism in $\mathbb{E}$. If<br />
$\pi_1 : ID \times T_C \to ID$ is the canonical projection, then the uniqueness of identifiers means that<br />
$\pi_1 \circ \jmath$ is monic. If $j_C \circ i$ is the image factorization of $\pi_1 \circ \jmath$ with $j_C : im(D_C) \to ID$, then this<br />
must factor through $j_D$ if $C$ is a subclass of $D$. Thirdly, let $\bar{D}_C$ be the subobject of $D_C \times ID$<br />
classified by<br />
$$D_C \times ID \overset{\jmath \times id}{\hookrightarrow} ID \times T_C \times ID \xrightarrow{\pi_2 \times id} T_C \times ID \xrightarrow{o_r} \Omega$$<br />
where $o_r$ corresponds to the reference from $C$ to $D$. Let $j_r : im(\bar{D}_C) \hookrightarrow ID$ result from the<br />
image factorization of $\pi_2 \circ \imath$ for $\imath : \bar{D}_C \hookrightarrow D_C \times ID$. Then $j_r$ must factor through $j_D$.<br />
The semantics of operations can be defined via predicate transformers as shown in [26, 45]<br />
for the classical case and in [57] for the topos-based semantics.<br />
Example. Let us look at a simple university example based on the simple type system with<br />
set semantics. We first introduce types and classes, then show an example of an instance.<br />
Type PERSONNAME = ( FirstName : STRING , SecondName : STRING , Titles : { STRING } )<br />
Type PERSON = ( PersonIdentityNo : NAT , Name : PERSONNAME )<br />
Type MPERSON = ( PersonIdentityNo : NAT , Spouse : )<br />
Then let the schema consist of the following classes:<br />
Class PersonC<br />
Structure PERSON<br />
End PersonC<br />
Class MarriedPersonC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT , Spouse : MarriedPersonC )<br />
End MarriedPersonC<br />
Class StudentC<br />
IsA PersonC<br />
Structure ( StudentNumber : NAT , Supervisor : ProfessorC ,<br />
Major : DepartmentC , Minor : DepartmentC )<br />
End StudentC<br />
Class ProfessorC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT , Age : NAT ,<br />
Salary : NAT , Faculty : DepartmentC )<br />
End ProfessorC<br />
Class DepartmentC<br />
Structure ( DeptName : STRING )<br />
End DepartmentC<br />
Next use D as a name for the instance.<br />
D(PersonC) = { ( i_1 , ( 123 , ( "John" , "Denver" , { "Professor" , "Dr" } ))),<br />
( i_2 , ( 124 , ( "Mary" , "Stuart" , { "Dr" } ))),<br />
( i_3 , ( 456 , ( "John" , "Stuart" , {} ))),<br />
( i_4 , ( 567 , ( "Laura" , "James" , {} ))),<br />
( i_5 , ( 987 , ( "Dave" , "Ford" , {} ))) }<br />
D(MarriedPersonC) = { ( i_1 , ( 123 , i_2 )),<br />
( i_2 , ( 124 , i_1 )) }<br />
D(ProfessorC) = { ( i_1 , ( 123 , 48 , 8000 , i_6 )) }<br />
D(StudentC) = { ( i_3 , ( 456 , 1023 , ( "John" , "Stuart" , {} ) , i_1 , i_6 , i_7 )),<br />
( i_4 , ( 567 , 2134 , ( "Laura" , "James" , {} ) , i_1 , i_6 , i_7 )) }<br />
D(DepartmentC) = { ( i_6 , ( "Computer Science" ) ) ,<br />
( i_7 , ( "Philosophy" ) ) ,<br />
( i_8 , ( "Music" ) ) }<br />
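Under set semantics, the instance conditions (uniqueness of identifiers, inclusion integrity and referential integrity) can be checked mechanically on this example. The following Python sketch is an illustration only: the dictionary layout, the helper names and the tuple positions of the references in `REFS` are assumptions read off from the instance as printed, and condition (3.19) on subtype functions is not checked.

```python
# Instance of the university example: class name -> set of (identifier, value) pairs.
D = {
    "PersonC": {
        ("i1", (123, ("John", "Denver", frozenset({"Professor", "Dr"})))),
        ("i2", (124, ("Mary", "Stuart", frozenset({"Dr"})))),
        ("i3", (456, ("John", "Stuart", frozenset()))),
        ("i4", (567, ("Laura", "James", frozenset()))),
        ("i5", (987, ("Dave", "Ford", frozenset()))),
    },
    "MarriedPersonC": {("i1", (123, "i2")), ("i2", (124, "i1"))},
    "ProfessorC": {("i1", (123, 48, 8000, "i6"))},
    "StudentC": {
        ("i3", (456, 1023, ("John", "Stuart", frozenset()), "i1", "i6", "i7")),
        ("i4", (567, 2134, ("Laura", "James", frozenset()), "i1", "i6", "i7")),
    },
    "DepartmentC": {
        ("i6", ("Computer Science",)),
        ("i7", ("Philosophy",)),
        ("i8", ("Music",)),
    },
}

# IsA relation: subclass -> list of superclasses.
SUPER = {"MarriedPersonC": ["PersonC"], "StudentC": ["PersonC"],
         "ProfessorC": ["PersonC"]}

# References as (source class, position in the value tuple, target class).
# The positions are an assumption based on the printed instance.
REFS = [
    ("MarriedPersonC", 1, "MarriedPersonC"),
    ("ProfessorC", 3, "DepartmentC"),
    ("StudentC", 3, "ProfessorC"),
    ("StudentC", 4, "DepartmentC"),
    ("StudentC", 5, "DepartmentC"),
]


def dom(pairs):
    """Identifiers occurring in a set of (identifier, value) pairs."""
    return {i for i, _ in pairs}


def unique_identifiers(inst):
    # (i, v) and (i, w) in D(C) must imply v = w.
    return all(len(dom(pairs)) == len(pairs) for pairs in inst.values())


def inclusion_integrity(inst, superclasses):
    # Every identifier of a subclass also occurs in each superclass.
    return all(dom(inst[c]) <= dom(inst[d])
               for c, ds in superclasses.items() for d in ds)


def referential_integrity(inst, references):
    # Every referenced identifier occurs in the target class.
    return all(v[pos] in dom(inst[target])
               for source, pos, target in references
               for _, v in inst[source])


def is_instance(inst, superclasses=SUPER, references=REFS):
    return (unique_identifiers(inst)
            and inclusion_integrity(inst, superclasses)
            and referential_integrity(inst, references))
```

Calling `is_instance(D)` checks all three conditions at once; changing, say, the faculty of the professor to an identifier that does not occur in `D["DepartmentC"]` makes `referential_integrity` fail.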
3.4 Value Representability<br />
From an object oriented point of view a database may be considered as a huge collection of<br />
objects of arbitrarily complex structure. Hence the problem arises of uniquely identifying and<br />
retrieving objects in such collections.<br />
Each object in a database is an abstraction of a real world object that has a unique identity.<br />
The representation of such objects in the OODM uses an abstract identifier I of type ID to<br />
encode this identity. Such an identifier may be considered as being immutable. However, from<br />
a systems oriented view permutations or collapses of identifiers without changing anything<br />
else should not affect the behaviour of the database.<br />
For the user the abstract identifier of an object has no meaning. Therefore, a different<br />
approach to the identification problem is required. We show that the unique identification of<br />
an object in a class leads to the notion of value-identifiability. The stronger notion of<br />
value-representability is required for the unique definition of generic update operations. The<br />
set-based case has been handled in [54, 55].<br />
(i) A class $C$ is called value-identifiable iff there exists a proper value type $I_C$, called a<br />
value-identification type, such that for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a morphism $c_I : T_C \to I_C$<br />
such that the composition<br />
$$D_C \hookrightarrow ID \times T_C \xrightarrow{\pi_2} T_C \xrightarrow{c_I} I_C$$<br />
is monic.<br />
(ii) $C$ is called value-representable iff there exists a value-identification type $V_C$ such that<br />
for all instances $\mathcal{D}$ of $\mathcal{S}$ there is a morphism $c_V : T_C \to V_C$ such that for all<br />
value-identification types $I_C$ and the image factorization $T_C \xrightarrow{c_V} D_{V_C} \hookrightarrow V_C$ there exists a<br />
morphism $c' : D_{V_C} \to I_C$ with $c_I = c' \circ c_V$.<br />
It is easy to see that each value-representable class $C$ is also value-identifiable. Moreover, the<br />
value-representation type $V_C$ is unique up to isomorphism.<br />
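In the set-based case, the monicity condition for value-identifiability says that the map sending a pair (i, v) in the class to c(v) is injective. The sketch below tests this for a single instance; it is only an illustration, since the definition quantifies over all instances, and the function name `identifies` and the sample data layout are assumptions.

```python
def identifies(pairs, c):
    """True iff (i, v) |-> c(v) is injective on the given set of
    (identifier, value) pairs, i.e. c separates the objects of this instance."""
    images = [c(v) for _, v in pairs]
    return len(images) == len(set(images))


# Persons of the university example as (identifier, value) pairs.
persons = {
    ("i1", (123, ("John", "Denver"))),
    ("i2", (124, ("Mary", "Stuart"))),
    ("i3", (456, ("John", "Stuart"))),
}


def by_number(v):
    # Project a person value onto PersonIdentityNo.
    return v[0]


def by_first_name(v):
    # Project a person value onto FirstName.
    return v[1][0]
```

On this data `by_number` separates all objects, whereas `by_first_name` does not, because two persons share the first name "John".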
We want to define algorithms to compute types $V_C$ and $I_C$ that turn out to be proper<br />
value types under certain conditions. For this we extend subtyping to structure expressions<br />
in a natural way, taking care of IsA-relations. Then each super structure expression $S'$ and<br />
each instance define a morphism $I_{S'} : D_C \to D_{C'} \hookrightarrow ID \times T_{S'}$ using the representation type<br />
$T_{S'}$ of $S'$.<br />
Algorithm. Let $F(C_i) = T_i$ provided there exists a super structure expression on $C_i$ defined<br />
by $c_i : T_{C_i} \to T_i$, otherwise let $F(C_i)$ be undefined. If ID occurs in some $F(C_i)$ corresponding<br />
to $r_j : C_j$ ($j \ne i$), we write $ID_j$.<br />
Then iterate as long as possible using the following rules:<br />
(i) If $F(C_j)$ is a proper value type and $ID_j$ occurs in some $F(C_i)$ ($j \ne i$), then replace this<br />
corresponding $ID_j$ in $F(C_i)$ by $F(C_j)$.<br />
(ii) If $ID_i$ occurs in some $F(C_i)$, then let $F(C_i)$ be recursively defined by $F(C_i) == S_i$, where<br />
$S_i$ is the result of replacing $ID_i$ in $F(C_i)$ by the type name $F(C_i)$.<br />
The iteration terminates, since there exists only a finite collection of classes. If these rules are<br />
no longer applicable, replace each remaining occurrence of $ID_j$ in $F(C_i)$ by the type name<br />
$F(C_j)$ provided $F(C_j)$ is defined.<br />
$\Box$<br />
Note that the algorithm computes (mutually) recursive types.<br />
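The two rewriting rules and the final replacement step of the Algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: types are encoded as nested tuples, `("ref", j)` stands for an occurrence $ID_j$, `("name", j)` stands for the type name $F(C_j)$, and a type counts as a proper value type as soon as it contains no `("ref", ...)`. The example uses trivial super structure expressions for the university schema.

```python
def refs(t):
    """Class names j such that ID_j, encoded as ("ref", j), occurs in t."""
    if isinstance(t, tuple):
        if t[0] == "ref":
            return {t[1]}
        if t[0] == "rec":
            return set().union(*(refs(u) for _, u in t[1]))
        if t[0] == "set":
            return refs(t[1])
    return set()


def subst(t, j, repl):
    """Replace every occurrence of ID_j in t by repl."""
    if t == ("ref", j):
        return repl
    if isinstance(t, tuple) and t[0] == "rec":
        return ("rec", tuple((a, subst(u, j, repl)) for a, u in t[1]))
    if isinstance(t, tuple) and t[0] == "set":
        return ("set", subst(t[1], j, repl))
    return t


def value_types(F):
    """F maps class names to types; classes missing from F are 'undefined'."""
    F = dict(F)
    changed = True
    while changed:
        changed = False
        for i in F:
            for j in sorted(refs(F[i])):
                if j == i:
                    # Rule (ii): ID_i inside F(C_i) becomes the type name F(C_i).
                    F[i] = subst(F[i], i, ("name", i))
                    changed = True
                elif j in F and not refs(F[j]):
                    # Rule (i): F(C_j) is a proper value type, so inline it.
                    F[i] = subst(F[i], j, F[j])
                    changed = True
    # Final step: remaining ID_j become type names where F(C_j) is defined.
    for i in F:
        for j in refs(F[i]):
            if j in F:
                F[i] = subst(F[i], j, ("name", j))
    return F


DEPT = ("rec", (("DeptName", "STRING"),))
SCHEMA = {
    "MarriedPersonC": ("rec", (("PersonIdentityNo", "NAT"),
                               ("Spouse", ("ref", "MarriedPersonC")))),
    "StudentC": ("rec", (("StudentNumber", "NAT"),
                         ("Supervisor", ("ref", "ProfessorC")),
                         ("Major", ("ref", "DepartmentC")),
                         ("Minor", ("ref", "DepartmentC")))),
    "ProfessorC": ("rec", (("PersonIdentityNo", "NAT"), ("Age", "NAT"),
                           ("Salary", "NAT"),
                           ("Faculty", ("ref", "DepartmentC")))),
    "DepartmentC": DEPT,
}
RESULT = value_types(SCHEMA)
```

In `RESULT` the department type is inlined into the professor type, the resolved professor type is inlined into the student type, and the spouse reference becomes a recursive type name. Each rule application removes at least one kind of `("ref", ...)` from a type and never introduces a new one, which is why the loop stops.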
The reference graph of a class $C$ in a schema $\mathcal{S}$ is the smallest labelled graph $G_{ref} =<br />
(V, E, l)$ satisfying:<br />
(i) There exists a vertex $v_C \in V$ with $l(v_C) = \{t, C\}$, where $t$ is the top-level type in the<br />
structure expression $S$ of $C$.<br />
(ii) For each proper occurrence of a type $t \ne ID$ in $T_C$ there exists a unique vertex $v_t \in V$<br />
with $l(v_t) = \{t\}$.<br />
(iii) For each reference $r_i : C_i$ in the structure expression $S$ of $C$ the reference graph $G^i_{ref}$ is<br />
a subgraph of $G_{ref}$.<br />
(iv) For each vertex $v_t$ or $v_C$ corresponding to $t(x_1, \dots, x_n)$ in $S$ there exist unique edges $e^{(i)}_t$<br />
from $v_t$ or $v_C$ respectively to $v_{t_i}$ in case $x_i$ is the type $t_i$ or to $v_{C_i}$ in case $x_i$ is the reference<br />
$r_i : C_i$. In the first case $l(e^{(i)}_t) = \{S_i\}$, where $S_i$ is the corresponding selector name; in the<br />
latter case the label is $\{S_i, r_i\}$.<br />
Let $\mathcal{S} = \{C_1, \dots, C_n\}$ be a schema. Let $\mathcal{S}' = \{C'_1, \dots, C'_n\}$ be another schema such that for<br />
all $i$ there exists a super structure expression on $C_i$ defined by some $c_i : T_{C_i} \to T_{C'_i}$. Then an<br />
identification graph $G_{id}$ of the class $C_i$ is obtained from the reference graph of $C'_i$ by changing<br />
each label $C'_j$ to $C_j$.<br />
With these notations it is easy to see the following: if for a class $C$ there exists a super<br />
structure expression for all classes $C_i$ occurring as a label in some identification graph $G_{id}$ of<br />
$C$, then the type $I_C$ computed by the Algorithm with respect to the super structure expressions<br />
used in the definition of $G_{id}$ is a proper value type.<br />
Theorem. (i) Let $C$ be a class in a schema $\mathcal{S}$ such that there exists a super structure expression<br />
for all classes $C_i$ occurring as a label in the reference graph $G_{ref}$ of $C$. Let $V_C$ be the<br />
type $G(C)$ computed by the Algorithm with respect to trivial super structure expressions<br />
and let $I_C$ be the type $F(C)$ computed by the Algorithm with respect to arbitrary super<br />
structure expressions. Then $C$ is value-representable with value representation type $V_C$<br />
and each such $I_C$ is a value-identification type.<br />
(ii) Let $C$ be a class in a schema $\mathcal{S}$ such that there exist generic update methods on $C$.<br />
Then $C$ is value-representable. Moreover, all super- and subclasses of $C$ are also value-representable.<br />
(iii) Let $C$ be a value-representable class in a schema $\mathcal{S}$ such that all its super- and subclasses<br />
are also value-representable. Then there exist unique generic update operations on $C$.<br />
The proof mimics the set-based arguments in [55].<br />
3.5 Logical Reconstruction<br />
So far, we have seen the decisive role of type semantics for OODBs. Given a topos of types,<br />
we may describe instances of a schema on top of it. The only assumption is the existence of a<br />
type ID of object identifiers. Moreover, it is known from [28, 35, 39] that topoi are inherently<br />
connected with higher-order intuitionistic logics.<br />
In principle, there are two (equivalent) ways to approach the logic of a topos. The first<br />
one is given by the Mitchell-Bénabou language and Kripke-Joyal semantics [39]; the second<br />
one, based on Fourman-Scott languages [28], follows more the general line of logics, defining its<br />
syntax and interpretation in an (arbitrary) topos.<br />
In our presentation we take the second approach, because it directly comes with equality,<br />
existence and description [60]. Recall that a Fourman-Scott language $L$ consists of<br />
- two sets Sort and Const of sorts and constants,<br />
- a power sort map $[\,] : \bigcup_{n \in \mathbb{N}} Sort^n \to Sort$ written $(A_1, \dots, A_n) \mapsto [A_1, \dots, A_n]$,<br />
- a family of countable sets $\{Var_s\}_{s \in Sort}$ indexed by the sorts and<br />
- a map $\# : Const \to Sort$ assigning to each constant its sort.<br />
We also use $Var = \bigcup_{s \in Sort} Var_s$ to refer to the set of all variables. Then for a given variable<br />
$x \in Var$ we write $\#x$ to refer to the sort of $x$. Moreover, we use $f = [\,]$ as an abbreviation for<br />
the empty power sort, which will be regarded as consisting of truth values.<br />
The terms $\mathcal{T}_s(L)$ of sort $s \in Sort$ for a language $L$ are constructed from $L$ as the smallest<br />
set such that each variable $x$ of sort $s$, each constant $c$ with $\#c = s$ and $Ix.\varphi$ for each variable<br />
$x$ with $\#x = s$ and each formula $\varphi$ belong to $\mathcal{T}_s(L)$.<br />
The formulae of $L$ form the smallest set $\mathcal{F}(L)$ such that the following formulae are in<br />
$\mathcal{F}(L)$:<br />
- $E\tau$ for each term $\tau \in \mathcal{T}(L)$,<br />
- $\tau \equiv \sigma$ for terms $\tau$, $\sigma$ of the same sort $s$,<br />
- $\tau(\tau_1, \dots, \tau_n)$ for terms $\tau_i \in \mathcal{T}_{s_i}(L)$ and $\tau \in \mathcal{T}_{[s_1, \dots, s_n]}(L)$,<br />
- $\varphi \wedge \psi$ for formulae $\varphi$ and $\psi$,<br />
- $\varphi \Rightarrow \psi$ for formulae $\varphi$ and $\psi$ and<br />
- $\forall x.\varphi$ for variables $x \in Var$ and formulae $\varphi$.<br />
We may then introduce the other junctors $\neg$, $\vee$, $\Leftrightarrow$, the predicate $=$ and the quantifier $\exists$ as<br />
abbreviations.<br />
The intention behind the description symbol $I$ needs some explanation. Informally $Ix.\varphi$<br />
means the unique $x$ that satisfies $\varphi$. However, such an $x$ may not exist.<br />
The logic deals with this problem by introducing a formal existence predicate $E$, where<br />
$E\tau$ means that $\tau$ exists. This is formalized by distinguishing domains $\tilde{A}$ of possible elements<br />
and letting $E$ pick out the subdomains of actual elements. Then bound variables will range<br />
only over actual elements. When interpreting the logic in a topos this construction is related<br />
to partial morphism classification.<br />
The introduction of an existence predicate also influences the equality predicate $=$, which<br />
is considered as a property of actual elements. In order to compare also possible elements<br />
the equivalence predicate $\equiv$ is introduced. Non-existing elements are all considered to be<br />
equivalent. Since equality can then be defined in terms of the equivalence and the existence<br />
predicates, only $\equiv$ is taken as a primitive in the logic.<br />
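The interplay of existence, equivalence, equality and description can be illustrated in a classical, two-valued toy model over finite domains. This is only a sketch under simplifying assumptions and not the intuitionistic topos semantics: a non-existing (merely possible) element is modelled by Python's `None`, and the function names `exists`, `equiv`, `equal` and `describe` are hypothetical.

```python
def exists(a):
    """Existence predicate E: None models a non-existing element."""
    return a is not None


def equiv(a, b):
    """Equivalence: all non-existing elements are equivalent to each other,
    existing elements are equivalent iff they coincide."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def equal(a, b):
    """Equality, defined from equivalence and existence: it only holds
    between actual elements."""
    return exists(a) and exists(b) and equiv(a, b)


def describe(domain, phi):
    """Description Ix.phi over a finite domain of actual elements:
    the unique x satisfying phi, or None if there is no unique such x."""
    witnesses = [x for x in domain if phi(x)]
    return witnesses[0] if len(witnesses) == 1 else None


three = describe(range(10), lambda x: x * x == 9)    # unique witness
none_a = describe(range(10), lambda x: x > 7)        # two witnesses
none_b = describe(range(10), lambda x: x < 0)        # no witness
```

Here `none_a` and `none_b` are equivalent but not equal, which mirrors the remark that non-existing elements are all equivalent while equality is a property of actual elements.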
We have mentioned above that the sort $f$ will be considered as truth values. Then the<br />
formula $\tau()$ with $\tau \in \mathcal{T}_f(L)$ is a formula that asserts $\tau$.<br />
We dispense with a description of the axioms and rules that define the derivation operator<br />
$\vdash$ as well as with the interpretation of $L$ in an arbitrary topos. We only mention that each<br />
theory $T$ of $L$ canonically defines a topos $\mathbb{E}(T)$, called the topos of definable types and<br />
definable total functions, and that each topos E can be written in this form. In particular,<br />
there is a canonical interpretation of $L$ in $\mathbb{E}(T)$, which is sound and complete.<br />
In order to define $\mathbb{E}(T)$ we introduce types and relations as terms of specific syntactic<br />
forms. Such types reflect the many possible subdomains of domains associated with power<br />
sorts.<br />
A type $A$ is a term of the form $Iy :: [s].\ \forall x :: s.\ (\varphi \Leftrightarrow y(x))$. A relation $f$ from $s$ to $t$ is a<br />
term of the form $Iz :: [s, t].\ \forall x :: s, y :: t.\ (\varphi \Leftrightarrow z(x, y))$. A type $A$ or a relation $f$ is said to be<br />
definable iff the defining formula is closed.<br />
A more convenient notation for a type $A$ defined by the formula $\varphi$ is $A = \{x :: s \mid \varphi\}$. For<br />
a term $\tau$ of sort $s$ we then get the formula $\tau \in A$. For a variable $x$ with $\#x = s$ we may use<br />
the quantifiers $\forall x \in A$ and $\exists x \in A$.<br />
For a relation $f$ we may use the notation $f^{\#}(\tau)$ for $Iy :: t.\ f(\tau, y)$ for $\tau \in \mathcal{T}_s(L)$ even if we do<br />
not know whether $f$ is the graph of a function. Furthermore, we use functional abstraction,<br />
writing $\lambda x :: s.\ \tau$ as an abbreviation for $Iz :: [s, t].\ \forall x :: s, y :: t.\ (y = \tau \Leftrightarrow z(x, y))$.<br />
Finally, two relations $f$, $g$ from type $A$ to type $B$ are equivalent with respect to $T$ iff<br />
$T \vdash \forall x \in A.\ (f^{\#}(x) \equiv g^{\#}(x))$ holds.<br />
Let $T$ be a theory over $L$. The topos $\mathbb{E}(T)$ of definable types and definable total functions<br />
has as objects the definable types of $L$ and as morphisms from $A$ to $B$ equivalence classes of<br />
definable relations from $A$ to $B$ such that $T \vdash \forall x \in A.\ f^{\#}(x) \in B$ holds. For $f \in Hom(A, B)$<br />
and $g \in Hom(B, C)$ the composition $g \circ f \in Hom(A, C)$ is defined by $\lambda x \in A.\ (g^{\#}(f^{\#}(x)))$.<br />
Schemata and Instances as Theories. Given a topos E, let us now try to shift the categorical<br />
characterization of instances into the associated logic. Recall that the sorts of this logic are<br />
the objects of E, the constants of sort $A$ are the morphisms $c : \mathbf{1} \to \tilde{A}$, where $\eta_A : A \to \tilde{A}$ is<br />
the partial morphism classifier for $A$ [35, 39, 53], and the power sort map takes $A_1, \dots, A_n$<br />
to $\Omega^{A_1 \times \dots \times A_n}$.<br />
Now consider the monomorphism $\jmath : D_C \hookrightarrow ID \times T_C$ and the canonical projection $\pi_1 :<br />
ID \times T_C \to ID$. As above let $j_C : im(D_C) \hookrightarrow ID$ result from the image factorization of $\pi_1 \circ \jmath$.<br />
Since we assume $\pi_1 \circ \jmath$ to be monic, the universal property of images gives rise to a unique<br />
monomorphism $\imath : im(D_C) \to D_C$.<br />
Since we assume value-representability, $\pi_2 \circ \jmath$ is also monic, hence $\pi_2 \circ \jmath \circ \imath$ gives a monomorphism<br />
from $im(D_C)$, a subobject of ID, to $T_C$. Then the universal property of the partial<br />
morphism classifier $\eta_{T_C}$ gives rise to a unique monomorphism $I(C) : ID \to \tilde{T}_C$.<br />
Similarly, consider the morphism $o_r : T_C \times ID \to \Omega$ corresponding to a reference $r$ from<br />
class $C$ to class $D$. Since $\eta_{T_C} \times I(D) : T_C \times ID \to \tilde{T}_C \times \tilde{T}_D$ defines a monomorphism, we<br />
may again consider the partial morphism classifier for $\Omega$, which is $\eta_\Omega = id_\Omega$. This gives us<br />
a unique morphism $\tilde{o}_r : \tilde{T}_C \times \tilde{T}_D \to \Omega$. Then let $I(r) = \hat{\tilde{o}}_r : \tilde{T}_C \to \Omega^{\tilde{T}_D}$ be its exponential<br />
adjoint.<br />
Then the morphisms $I(C)$ for all classes $C \in \mathcal{S}$ and $I(r)$ for all references in $\mathcal{S}$ (assuming<br />
for the moment unique reference names) are sufficient to describe objects. In fact, we may<br />
think of these morphisms as semantically associated with an instance, whereas syntactically<br />
we may use the class names $C$ and the reference names $r$ instead.<br />
This gives rise to formulae of the form $E\,Io.\,\varphi$ as "ground facts" given by some instance.<br />
Moreover, the following formulae define the axioms of the schema $\mathcal{S}$:<br />
$$\forall o.\ EC(o) \Rightarrow ED(o) \quad \text{if } C \text{ is a subclass of } D \qquad (3.21)$$<br />
$$\forall o.\ EC(o) \Rightarrow \forall o'.\ (r(C(o))(D(o')) \Rightarrow ED(o')) \quad \text{for a reference } r \text{ from } C \text{ to } D \qquad (3.22)$$<br />
$$\forall o, o'.\ C(o) = C(o') \Rightarrow o = o' \qquad (3.23)$$<br />
If Ax(S) is the set <strong>of</strong> formulae (3.21), (3.22) and (3.23) dened for schema S, then this<br />
corresponds to the theory T 0 = f' j Ax(S) ` 'g. If <strong>in</strong> addition Ax(I) is a set <strong>of</strong> formulae<br />
given by some <strong>in</strong>stance, maybe only \ground facts" as above, then the correspond<strong>in</strong>g theory<br />
is T 0 = f' j Ax(S) [ Ax(I) ` 'g.<br />
Note that each model <strong>of</strong> such a theory T 0 <strong>in</strong> the underly<strong>in</strong>g topos IE(T ) gives rise to a<br />
logical morphism IE(T 0 ) ! IE(T ) [28].<br />
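To make the axioms (3.21), (3.23) and a simplified form of (3.22) concrete, the following is a minimal sketch that checks them on a finite, set-based instance. It is only an illustration under strong assumptions: classes are modelled as Python dictionaries from identifiers to values, a reference is modelled as a set of identifier pairs rather than as a morphism on values, and the names `violations`, `classes`, `subclasses` and `references` are hypothetical and do not come from the paper.<br />

```python
def violations(classes, subclasses, references):
    """Return the axiom violations of a finite instance.

    classes:    class name -> {identifier: value}   (values must be hashable)
    subclasses: list of (subclass, superclass) pairs
    references: reference name -> (source class, target class, set of (o, o2) pairs)
    All three structures are illustrative assumptions, not the paper's notation.
    """
    found = []
    # (3.21): every object existing in a subclass also exists in the superclass.
    for sub, sup in subclasses:
        for oid in classes[sub]:
            if oid not in classes[sup]:
                found.append(("3.21", sub, sup, oid))
    # (3.22), simplified: every referenced object exists in the target class.
    for name, (src, dst, links) in references.items():
        for o, o2 in sorted(links):
            if o in classes[src] and o2 not in classes[dst]:
                found.append(("3.22", name, o, o2))
    # (3.23): equal values within a class imply equal objects.
    for cname, objects in classes.items():
        seen = {}
        for oid, value in objects.items():
            if value in seen:
                found.append(("3.23", cname, seen[value], oid))
            else:
                seen[value] = oid
    return found


classes = {
    "Person": {1: ("Ann",), 2: ("Bob",)},
    "Student": {1: ("Ann", 101)},
}
subclasses = [("Student", "Person")]
references = {"supervisor": ("Student", "Person", {(1, 2)})}
print(violations(classes, subclasses, references))
```

An empty result means that the instance satisfies all three checks. An instance in which two objects of one class carry the same value is reported under (3.23), which is the finite counterpart of the requirement that values identify objects.<br />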
Let us finally remark that the construction of $I(C)$ is also possible if value-representability<br />
is not assumed, but in this case we shall not get a monomorphism. Then in general a fact<br />
such as $\mathrm{I}o.\,C(o) = t$ may not exist, i.e. $E\,\mathrm{I}o.\,C(o) = t$ may not factor through true. Then the<br />
only model would be the inconsistent topos. Nevertheless, a smooth extension to the case of<br />
weak value-identifiability is still possible.<br />
Getting Rid of Identifiers. In [16] object identifiers have been identified as a<br />
pure implementation concept. This leads to the requirement of weak value-identifiability. In<br />
our construction above, where we assumed the stronger value-representability, this is already<br />
reflected by the fact that $I(C)$ is a morphism into $\tilde{T}_C$, whereas in the first categorical reformulation<br />
we had monomorphisms into $ID \times T_C$.<br />
Nevertheless, the type $\tilde{T}_C$ still involves identifiers corresponding to references, but as shown<br />
in [53, 54, 55] the value types that can be used to identify objects can be effectively computed.<br />
Let us sketch a corresponding construction in E.<br />
Thus, consider the pullback of $I(r) : \tilde{T}_C \to \Omega^{\tilde{T}_D}$ and $\mathrm{id}_{\Omega^{\tilde{T}_D}}$. This defines an object $T_{CrD}$<br />
and morphisms $\exp'(r) : T_{CrD} \to \Omega^{\tilde{T}_D}$ and $\exp(r) : T_{CrD} \to \tilde{T}_C$, the latter being monic.<br />
Since $I(r) \circ I(C) = \mathrm{id}_{\Omega^{\tilde{T}_D}} \circ (I(r) \circ I(C))$, the universal property of pullbacks defines a unique<br />
monomorphism $I(CrD) : ID \to T_{CrD}$ with $\exp(r) \circ I(CrD) = I(C)$.<br />
We may repeat this construction with respect to all morphisms corresponding to references<br />
including the $\exp'(r)$ constructed above. This defines a diagram $D$ in E. Let $O$ denote<br />
the limit of $D$. Then there is also a unique monomorphism $I : ID \to O$ such that all the<br />
morphisms $I(C)$ are given by $I$ and $D$.<br />
Note that we may also assume all objects in the diagram $D$ to be bounded, i.e. there exists a<br />
monomorphism into some fixed object $R$. Then $O$ will also turn out to be a subobject of $R$.<br />
The construction of $O$ glues together types and references, but still does not introduce<br />
object description without any identifiers. For this, let a morphism $T_C \to T'_C$ result from the elimination of<br />
all identifiers; formally, this morphism occurs as the pushout of $U \to \mathbb{1}$ and $U \to T_C$, where these two<br />
morphisms define the pullback of the exponential adjoint $\hat{o}_r$ and the exponential adjoint of<br />
$\mathit{true} \circ \mathit{triv}_{ID}$.<br />
If $f, g$ define the pushout of this morphism and $I(r)$, then we also get an object $T'_{CrD}$ by the pullback<br />
of $f$ and $g$, hence also morphisms $T_{CrD} \to T'_{CrD}$. If $D'$ is a diagram that extends $D$<br />
by these morphisms, then we obtain the required types without occurrences of $ID$ that can<br />
be used to extend the logic.<br />
Queries. In the relational model there are two basic approaches to queries, based on the<br />
relational algebra and the relational calculus. We are now able to introduce analogous constructions<br />
in the OODM.<br />
In the algebraic perspective we may use all operations supplied by the type system. Syntactically<br />
this means considering all closed value terms as queries, with a semantics defined<br />
by morphisms $t : \mathbb{1} \to \tilde{T}$. In addition, each class $C$ defines a query with semantics given by<br />
$I(C) : ID \to \tilde{T}_C$ in an instance $I$. Combining these two basic queries using all operators of<br />
the type system gives a simple query language [55]. Note that in the relational subcase we<br />
obtain the operators of relational algebra without the join.<br />
Furthermore, we need polymorphic operators to combine queries. For queries defined by<br />
morphisms $I_1 \to A$ and $I_2 \to B$ and functions $A \to C$ and $B \to C$ we may consider the<br />
"inner" pullback $A \times_C B \to A$, $A \times_C B \to B$ and in the same way the "outer" pullback<br />
$I = I_1 \times_C I_2$. Then by universality we obtain a unique morphism $I \to A \times_C B$, defining the<br />
semantics of the pullback query. In the relational algebra the join corresponds to such pullbacks.<br />
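The correspondence between joins and pullbacks can be illustrated in the category of finite sets, where the pullback of $f : A \to C$ and $g : B \to C$ is the set of pairs $(a, b)$ with $f(a) = g(b)$. The following is a minimal sketch under that assumption; it does not model the topos setting of the text, and the function name `pullback` and the sample relations are hypothetical.<br />

```python
def pullback(a_items, b_items, f, g):
    """Pullback of f: A -> C and g: B -> C in finite sets.

    Returns all pairs (a, b) with f(a) == g(b). With A and B being relations
    and f, g projections onto a common attribute, this is an equi-join.
    """
    # Index B by its image under g so that each a is matched by one lookup.
    index = {}
    for b in b_items:
        index.setdefault(g(b), []).append(b)
    return [(a, b) for a in a_items for b in index.get(f(a), [])]


# Two sample relations: employees (name, dept) and departments (dept, city).
employees = [("Ann", "db"), ("Bob", "os")]
depts = [("db", "Clausthal"), ("ai", "Cottbus")]

joined = pullback(employees, depts, lambda e: e[1], lambda d: d[0])
print(joined)
```

Only the pair that agrees on the department attribute survives, which is what a join on that attribute returns.<br />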
For classes containing references we may also consider queries $\{r/D\}.C$ defined by the<br />
substitution of class $D$ for reference $r : D$. Semantically we consider again a pullback $\tilde{T}_C \times_r \Omega^{\tilde{T}_D}$<br />
over $\mathrm{id} : \Omega^{\tilde{T}_D} \to \Omega^{\tilde{T}_D}$ and $I(r) : \tilde{T}_C \to \Omega^{\tilde{T}_D}$. Then the morphisms $I(C) : ID \to \tilde{T}_C$ and<br />
$I(r) \circ I(C) : ID \to \Omega^{\tilde{T}_D}$ give rise to a unique monomorphism $ID \hookrightarrow \tilde{T}_C \times_r \Omega^{\tilde{T}_D}$, which defines<br />
the semantics of reference substitution queries.<br />
For the calculus things are much easier, since we may exploit the associated logic. Since<br />
classes and references have been incorporated into the logic, a query is simply given by a term<br />
$\mathrm{I}x.\,\varphi$ with a defining formula $\varphi$. This generalizes the relational approach.<br />
3.6 Conclusion<br />
In this paper we indicated some fundamentals and logical semantics for object oriented<br />
databases. The starting point was the consideration of building blocks in OODB schemata,<br />
i.e. types and classes. First we observed the decisive importance of type semantics. Objects are<br />
considered to be abstractions of real world entities, hence they have an immutable identity.<br />
This identity is first encoded by abstract identifiers that are assumed to form some type ID.<br />
There is not only one value of a given type that is associated with an object. In contrast we<br />
allow several values of possibly different types to belong to an object, and even this collection<br />
of types may change. Classes are used to structure objects. At each time a class corresponds<br />
to a collection of objects with values of the same type and references to objects in a fixed set<br />
of classes.<br />
In general, it is reasonable to assume a semantics based on topos theory. Then all these<br />
considerations can be generalized using notions from category theory. On this basis the problems<br />
of identification and genericity have been solved in general. The unique identification<br />
of objects and the existence of generic update operations in a class require the class to be<br />
value-representable.<br />
Since topos theory is inherently connected with higher-order intuitionistic logic, we were<br />
able first to rephrase the notions of object oriented databases in category theory and then to<br />
transform them into logic. This allows the definition of a query algebra and calculus. Taking<br />
value-representability as a desirable property into account, we could even show how to get<br />
rid of object identifiers, which have already been identified as a pure implementation concept.<br />
The results achieved so far seem to offer a reasonable logical foundation for object oriented<br />
databases. They even allow this field to be related to recent investigations in the foundations of<br />
computer science with respect to type theory and effective computation.<br />
Nevertheless, it is just the beginning of a story concerning deductive capabilities in object<br />
oriented databases. To proceed, it will be interesting to investigate (higher-order) geometric<br />
theories [39, 68, 69]. Further research is planned in this direction.<br />
As to the dynamics of object oriented databases concerning the formalization of operation<br />
semantics, we may wish to exploit e.g. axiomatic semantics in the sense of Dijkstra's predicate<br />
transformers [23, 26, 45]. The problem with this theory is that it depends on the use of<br />
a suitable logic that guarantees the existence of predicate transformers with the intended<br />
semantics. Whilst the classical theory uses an infinitary first-order logic $L^{\omega}_{\omega_1}$, the required<br />
generalization to topos logic has been shown in [53, 57].<br />
Finally, types can be handled in a much more flexible way if we extend algebraic data type<br />
specifications by higher-order functional and truth-value sorts and define topoi as models of<br />
such constructor theories. This approach is described in [53, 56].<br />
It is then an open problem how this kind of type theory relates to synthetic domain theory,<br />
which is roughly "domain theory within a topos" [29, 33, 51, 64]. The basic assumption of<br />
this theory is that "domains" are specific objects in a topos such that all morphisms between<br />
them are continuous and all constructions are solely based on categorical properties without<br />
recourse to order-theoretic properties. Again the effective topos turns out to be a reasonable<br />
source of examples of that kind of theory.<br />
References for Chapter 3<br />
1. S. Abiteboul: Towards a deductive object-oriented database language, Data & Knowledge Engineering,<br />
vol. 5, 1990, pp. 263–287<br />
2. S. Abiteboul, R. Hull: IFO: A Formal Semantic Database Model, ACM ToDS, vol. 12 (4), December<br />
1987, pp. 525–565<br />
3. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD,<br />
Portland Oregon, 1989, pp. 159–173<br />
4. H. Aït-Kaci: An Overview of LIFE, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation<br />
Information Systems Technology, Springer LNCS, vol. 504, 1991, pp. 42–58<br />
5. A. Albano, G. Ghelli, R. Orsini: Types for Databases: The Galileo Experience, in Type Systems<br />
and Database Programming Languages, University of St. Andrews, Dept. of Mathematical and<br />
Computational Sciences, Research Report CS/90/3, pp. 27–37<br />
6. A. Albano, G. Ghelli, R. Orsini: Objects and Classes for a Database Programming Language, FIDE<br />
technical report 91/16, 1991<br />
7. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented<br />
Database Programming Language, in A. Sernadas (Ed.): Proc. VLDB 91, Barcelona 1991<br />
8. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented<br />
Database System Manifesto, Proc. 1st DOOD, Kyoto 1989<br />
9. F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel, S. Gamerman, C. Lecluse, P. Pfeffer,<br />
P. Richard, F. Velez: The Design and Implementation of O2, an Object-Oriented Database System,<br />
Proc. of the ooDBS II workshop, Bad Münster, FRG, September 1988<br />
10. M. Barr, C. Wells: Category Theory for Computing Science, Prentice-Hall 1990<br />
11. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, pp. 370–395<br />
12. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol.<br />
5 (4), 1990, pp. 353–382<br />
13. C. Beeri, Y. Kornatzky: Algebraic Optimization of Object-Oriented Query Languages, in S. Abiteboul,<br />
P. C. Kanellakis (Eds.): Proc. ICDT '90, Springer LNCS 470, pp. 72–88<br />
14. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92<br />
15. C. Beeri, T. Milo: Subtyping in OODBs, in Proc. PODS '91<br />
16. C. Beeri, B. Thalheim: Can I see your Identification, please?, Proc. of the Workshop on Database<br />
Semantics, Rez, January 1995 (to appear)<br />
17. K. B. Bruce, A. R. Meyer: The Semantics of Second Order Polymorphic Lambda Calculus, in<br />
G. Kahn, D. B. MacQueen, G. Plotkin (Eds.): Semantics of Data Types, Springer LNCS 173,<br />
1984, pp. 131–144<br />
18. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM<br />
Computing Surveys 17 (4), pp. 471–522<br />
19. L. Cardelli: Typeful Programming, Digital Systems Research Center Reports 45, DEC SRC Palo<br />
Alto, May 1989<br />
20. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc.<br />
ACM SIGMOD 88<br />
21. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the<br />
Workshop on Database Programming Languages, Roscoff, France, September 1987<br />
22. R.G.G. Cattell: Object Data Management: Object Oriented and Extended Relational Database<br />
Systems, Addison-Wesley, 1991<br />
23. P. Cousot: Methods and Logics for Proving Programs, in J. van Leeuwen (Ed.): The Handbook of<br />
Theoretical Computer Science, vol. B: "Formal Models and Semantics", Elsevier, 1990, pp. 841–993<br />
24. A. Dearle, R. Connor, F. Brown, R. Morrison: Napier88 - A Database Programming Language?,<br />
in Type Systems and Database Programming Languages, University of St. Andrews, Dept. of<br />
Mathematical and Computational Sciences, Research Report CS/90/3, pp. 10–26<br />
25. K. Denninghoff, V. Vianu: Database Method Schemas and Object Creation, in Proc. PODS '93,<br />
pp. 265–275<br />
26. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer-Verlag, 1989<br />
27. D. Fishman, D. Beech, H. Cate, E. Chow et al.: IRIS: An Object-Oriented Database Management<br />
System, ACM ToIS, vol. 5 (1), January 1987<br />
28. M. P. Fourman: The Logic of Topoi, in J. Barwise (Ed.): Handbook of Mathematical Logic, North-<br />
Holland Studies in Logic, vol. 90, 1977, pp. 1053–1090<br />
29. P. Freyd: Recursive Types reduced to Inductive Types, in J. Mitchell (Ed.): 5th Symposium on<br />
Logic in Computer Science, Philadelphia, 1990<br />
30. M. Hammer, D. McLeod: Database Description with SDM: A Semantic Database Model, J. ACM,<br />
vol. 31 (3), 1984, pp. 351–386<br />
31. R. Hull, R. King: Semantic Database Modeling: Survey, Applications and Research Issues, ACM<br />
Computing Surveys, vol. 19 (3), September 1987<br />
32. J. Hyland: The Effective Topos, in A. Troelstra, D. van Dalen (Eds.): The L.E.J. Brouwer Centenary<br />
Symposium, North Holland, 1982, pp. 165–216<br />
33. J. Hyland: First Steps in Synthetic Domain Theory, in A. Carboni, M. Pedicchio, G. Rosolini<br />
(Eds.): Category Theory '90, Springer LNM, vol. 1488, 1992<br />
34. J. Hyland, E. Robinson, G. Rosolini: The Discrete Objects in the Effective Topos, Proc. LMS 60<br />
(1990), pp. 1–60<br />
35. P. Johnstone: Topos Theory, LMS Monographs vol. 10, Academic Press, 1977<br />
36. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon,<br />
1986<br />
37. M. Kifer, G. Lausen: F-Logic: A Higher-order Language for Reasoning about Objects, Inheritance<br />
and Schema, in Proc. SIGMOD 1989, pp. 134–146<br />
38. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented<br />
Programming System with a Database System, in Proc. OOPSLA 1988<br />
39. S. Mac Lane, I. Moerdijk: Sheaves in Geometry and Logic – A First Introduction to Topos Theory,<br />
Springer Universitext, 1992<br />
40. D. Maier, J. Stein, A. Otis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA,<br />
September 1986<br />
41. F. Matthes, J. W. Schmidt: Bulk Types – Add-On or Built-In?, in Proc. DBPL III, Nafplion 1991<br />
42. J. C. Mitchell: Type Systems for Programming Languages, in J. van Leeuwen (Ed.): The Handbook<br />
of Theoretical Computer Science, vol. B: "Formal Models and Semantics", Elsevier, 1990, pp. 365–458<br />
43. J. Mylopoulos, P. A. Bernstein, H. K. T. Wong: A Language Facility for Designing Interactive<br />
Database-Intensive Applications, ACM ToDS, vol. 5 (2), April 1980, pp. 185–207<br />
44. J. Mylopoulos, A. Borgida, M. Jarke, M. Koubarakis: Telos: Representing Knowledge About Information<br />
Systems, ACM ToIS, vol. 8 (4), October 1990, pp. 325–362<br />
45. G. Nelson: A Generalization of Dijkstra's Calculus, ACM TOPLAS, vol. 11 (4), October 1989, pp.<br />
517–561<br />
46. A. Ohori: Representing Object Identity in a Pure Functional Language, Proc. ICDT 90, Springer<br />
LNCS, pp. 41–55<br />
47. A. M. Pitts: Polymorphism is Set Theoretic, Constructively, in D.H. Pitt, A. Poigné, D.E. Rydeheard<br />
(Eds.): Category Theory and Computer Science, Springer LNCS 283, pp. 12–39<br />
48. J. C. Reynolds: Polymorphism is not Set-Theoretic, in G. Kahn, D. B. MacQueen, G. Plotkin<br />
(Eds.): Semantics of Data Types, Springer LNCS 173, 1984, pp. 145–156<br />
49. G. Rosolini: Categories and Effective Computations, in D.H. Pitt, A. Poigné, D.E. Rydeheard<br />
(Eds.): Category Theory and Computer Science, Springer LNCS 283, pp. 1–11<br />
50. G. Rosolini, E. Robinson: Colimit Completions and the Effective Topos, Journal of Symbolic Logic<br />
55 (1990), pp. 678–699<br />
51. G. Rosolini: Notes on Synthetic Domain Theory, University of Genova, February 1995<br />
52. K.-D. Schewe, B. Thalheim, I. Wetzel, J. W. Schmidt: Extensible Safe Object-Oriented Design of<br />
Database Applications, University of Rostock, Preprint CS-09-91, September 1991<br />
53. K.-D. Schewe: Specification of Data-Intensive Application Systems, Habilitation Thesis, TU Cottbus,<br />
1994<br />
54. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in Object-<br />
Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 341–356<br />
55. K.-D. Schewe, B. Thalheim: Fundamental Concepts of Object Oriented Databases, Acta Cybernetica,<br />
vol. 11 (4), 1993, pp. 49–84<br />
56. K.-D. Schewe: A Semantics for Type Specifications Based on Topos Theory, TU Cottbus, Technical<br />
Report I-5 / 1994<br />
57. K.-D. Schewe: A Non-Classical Generalization of Dijkstra's Calculus – Axiomatic Semantics for<br />
Typed Program Specifications, TU Cottbus, Technical Report I-6 / 1994<br />
58. K.-D. Schewe, B. Thalheim, I. Wetzel: Foundations of Object Oriented Database Concepts, University<br />
of Hamburg, Report FBI-HH-B-157/92, October 1992<br />
59. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method<br />
Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte,<br />
no. 14, 1992<br />
60. D. S. Scott: Identity and Existence in Intuitionistic Logic, in M. P. Fourman, C. J. Mulvey,<br />
D. S. Scott (Eds.): Applications of Sheaves, Springer LNM 753, pp. 660–696<br />
61. M. H. Scholl, H.-J. Schek: A Relational Object Model, in Proc. ICDT 90, Springer LNCS, pp.<br />
89–105<br />
62. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages,<br />
in Proc. HICSS '92<br />
63. S. Y. W. Su: SAM*: A Semantic Association Model for Corporate and Scientific-Statistical<br />
Databases, Inf. Sci., vol. 29, 1983, pp. 151–199<br />
64. P. Taylor: The Fixed Point Property in Synthetic Domain Theory, in G. Kahn (Ed.): 6th Symposium on<br />
Logic in Computer Science, Amsterdam 1991, pp. 152–160<br />
65. J. Van den Bussche, D. Van Gucht: A Hierarchy of Faithful Set Creation in Pure OODBs, in<br />
J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, pp. 326–340<br />
66. J. Van den Bussche, D. Van Gucht: Semi-determinism, in Proc. PODS '92, ACM Press, pp. 191–201<br />
67. J. Van den Bussche: Formal Aspects of Object Identity in Database Manipulation, Ph.D. Thesis,<br />
University of Antwerp, 1993<br />
68. S. Vickers: Geometric Theories and Databases, in M.P. Fourman, P.T. Johnstone, A.M. Pitts<br />
(Eds.): Applications of Category Theory in Computer Science, London Mathematical Society<br />
Lecture Notes Series, Cambridge University Press, 1992, pp. 288–314<br />
69. S. Vickers: Geometric Logic in Computer Science, in G.L. Burn, S.J. Gray, M.D. Ryan (Eds.):<br />
Theory and Formal Methods 1993, Springer WiCS, 1993, pp. 37–54<br />
70. S.B. Zdonik, D. Maier: Readings in Object Oriented Database Systems, Morgan Kaufmann Publishers,<br />
1990<br />
Chapter 4<br />
Higher-Level Genericity in Object Oriented Databases<br />
Contents<br />
4.1 Introduction<br />
4.2 A Core Object Oriented Database Language<br />
4.2.1 A Simple Type System<br />
4.2.2 Specification of Structure<br />
4.2.3 Database Instances<br />
4.2.4 Specification of Behaviour<br />
4.3 Genericity Beyond Polymorphism<br />
4.3.1 Implicit Schema Extensions<br />
4.3.2 Linguistic Reflection<br />
4.3.3 Reflection Types<br />
4.3.4 Generators for Generic Updates<br />
4.4 Integrity Enforcement<br />
4.4.1 User-Defined Integrity Constraints<br />
4.4.2 Greatest Consistent Specializations<br />
4.5 Conclusion<br />
This chapter contains a reprint of<br />
Klaus-Dieter Schewe, David Stemple, Bernhard Thalheim. Higher-Level Genericity in<br />
Object Oriented Databases. Proc. COMAD 1994.
Abstract. Object oriented databases (OODBs) are composed of semi-independent objects<br />
but must also provide for the maintenance of inter-object consistency, especially with respect<br />
to constraints arising from class hierarchies and inter-object references. Hence the problem arises of<br />
providing consistent generic update methods.<br />
We address the problem of how to derive such methods from the structure of an OODB<br />
schema by the specification of generator macros for them. These generators are based on a<br />
strict mathematical formalization of OODB concepts, including the possibility to represent<br />
syntactic components of the language as values within the language itself, which is known to<br />
form the basis of linguistic reflection.<br />
Moreover, the approach can be extended to the enforcement of user-defined integrity constraints<br />
that give rise to context-sensitive macros turning each user-defined method into<br />
branches of its greatest consistent specialization.<br />
Keywords: object oriented databases, genericity, integrity constraints, consistency, linguistic<br />
reflection<br />
4.1 Introduction<br />
The relational datamodel (RDM) was the first to support the complete abstraction from<br />
physical data organization. This was certainly one of its advantages in comparison to former<br />
hierarchical and network models and one of the reasons for its success. Another reason is<br />
the simplicity and elegance of its query and update languages. In particular, each RDM<br />
schema is accompanied by operations to insert, delete or update a tuple. These operations are<br />
generic in the sense that they are applicable to each relation in the schema. To be even more<br />
precise, the kind of genericity required here can be obtained by parametric polymorphism,<br />
since it is sufficient to know the underlying RECORD-type of a relation.<br />
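The remark on parametric polymorphism can be illustrated by a minimal sketch in which one generic container provides insert, delete and update for every relation, parametrized only by the record type. The class name `Relation`, the type variable `R` and the sample record types `Employee` and `Dept` are hypothetical and chosen for illustration; the sketch ignores keys and all other constraints.<br />

```python
from dataclasses import dataclass
from typing import FrozenSet, Generic, Iterable, TypeVar

R = TypeVar("R")  # stands for the RECORD type underlying a relation


class Relation(Generic[R]):
    """A relation as a finite set of tuples of one record type R."""

    def __init__(self, tuples: Iterable[R] = ()) -> None:
        self._tuples = set(tuples)

    def insert(self, t: R) -> None:
        self._tuples.add(t)

    def delete(self, t: R) -> None:
        self._tuples.discard(t)

    def update(self, old: R, new: R) -> None:
        # Replace old by new only if old is actually present.
        if old in self._tuples:
            self._tuples.remove(old)
            self._tuples.add(new)

    def tuples(self) -> FrozenSet[R]:
        return frozenset(self._tuples)


@dataclass(frozen=True)
class Employee:
    name: str
    dept: str


@dataclass(frozen=True)
class Dept:
    name: str
    city: str


# The same three operations serve both relations; only R differs.
employees: Relation[Employee] = Relation()
depts: Relation[Dept] = Relation()
employees.insert(Employee("Ann", "db"))
employees.update(Employee("Ann", "db"), Employee("Ann", "os"))
depts.insert(Dept("db", "Cottbus"))
depts.delete(Dept("db", "Cottbus"))
```

Nothing in `Relation` inspects the fields of a record, which is why knowing the record type alone suffices in the relational case.<br />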
However, some shortcom<strong>in</strong>gs <strong>of</strong> the RDM encouraged much research aimed at achiev<strong>in</strong>g<br />
more exible and ecient datamodels. It has been claimed <strong>in</strong> [1] that object orientation<br />
provides the key technology for future database systems and languages. In order to provide<br />
object oriented databases (OODBs) with the same grade <strong>of</strong> maturity as exist<strong>in</strong>g relational<br />
systems it is a reasonable goal try<strong>in</strong>g to preserve the advantages <strong>of</strong> the RDM here<strong>in</strong>.<br />
In object oriented programm<strong>in</strong>g the notion <strong>of</strong> an object was <strong>in</strong>tended as a generalization <strong>of</strong><br />
the abstract data type concept with the additional feature <strong>of</strong> <strong>in</strong>heritance. In this sense object<br />
orientation <strong>in</strong>volves the isolation <strong>of</strong> data <strong>in</strong> semi-<strong>in</strong>dependent modules <strong>in</strong> order to promote<br />
high s<strong>of</strong>tware development productivity. <strong>Object</strong> oriented databases must regard an object<br />
also as a basic unit <strong>of</strong> persistent data, and therefore are composed <strong>of</strong> <strong>in</strong>dependent objectsbut<br />
must also provide for the ma<strong>in</strong>tenance <strong>of</strong> <strong>in</strong>ter-object consistency, a demand that is to some<br />
degree <strong>in</strong> dissonance with the basic style <strong>of</strong> object orientation.<br />
Therefore, it is not too surprising that many object oriented database systems do not<br />
provide generic update methods [12], or that these fail to enforce model-inherent inclusion and<br />
referential constraints. Another source of confusion is due to object identifiers [4, 14], a concept<br />
used for encoding the identity of objects. Making such identifiers visible to the user, as done<br />
in programming languages, does not make much sense in databases. However, regarding them<br />
as a pure implementation concept as in [6] raises the question whether generic updates<br />
actually exist.<br />
In fact, generic updates in OODBs are much more complicated than in the RDM, because<br />
identifiers may not occur within input and output values and because at least the<br />
model-inherent constraints have to be maintained, which requires context information in<br />
order to provide generic update methods. It has been shown in [21, 17] that parametric<br />
polymorphism [11] as used in most object oriented languages is insufficient for the genericity<br />
problem. Nevertheless, we shall address in this paper the problem of how to derive generic<br />
update methods in OODBs automatically and efficiently.<br />
Our solution is based on the formally defined object oriented datamodel (OODM) introduced<br />
in [24] with a clear distinction between values and objects as required in [7, 8]. Types<br />
correspond to immutable sets of values, classes correspond to mutable collections of objects.<br />
In Section 4.2 we briefly describe the basic features of this model.<br />
The main advantage of the OODM used here is its theoretical basis [24]. Some competing<br />
models [13, 16] are closely oriented to a particular object oriented programming language<br />
and ignore certain mismatches in coupling these with a database. Others [9, 10] are basically<br />
behaviourally extended semantic datamodels. The models in [3, 5] share the OODM's datamodel<br />
orientation, but still ignore the problems of object identification, genericity and consistency<br />
with respect to model-inherent constraints.<br />
As shown in [22, 24], generic consistent update methods exist for value-representable classes<br />
and only for them. Hence, the construction of such methods depends on additional integrity<br />
constraints that are required for value-representability. Moreover, in order to capture cyclic<br />
references between objects, an extension to types is required that allows rational tree structures<br />
(see footnote 1) to be defined by type equations. This corresponds to the -terms introduced in [2]. Such<br />
generic consistent update methods as well as finitely representable, but infinite structures are<br />
missing in almost all competing OODB languages.<br />
The efficient construction of generic update methods is based on linguistic reflection as<br />
described in [19, 20]. Type-safe linguistic reflection came up with the development of the<br />
ADABTPL language, which placed a primary interest on the development of correct database<br />
transactions [18]. It turned out that synthesizing common operations in the RDM such asor<br />
natural join would be helpful, but these are not polymorphically expressible.<br />
The main idea is to provide macro facilities that allow computing with syntactic representations<br />
of language constructs in a type-safe fashion within the database language itself.<br />
In Section 4.3 we describe this approach to generator macros for generic update methods.<br />
The approach includes the computation of the value-representation types for all classes<br />
in a given schema; hence genericity in this case exceeds the capability of simple parametric<br />
polymorphism. The implementation of the acyclic case without propagation is described in<br />
[27]. The implementation of the general case is currently under investigation.<br />
The approach suggests an immediate extension to integrity enforcement with respect to<br />
explicit user-defined constraints, especially those that arise from generalizing corresponding<br />
constraints in the RDM such as functional, inclusion and exclusion constraints [26]. Enforcing<br />
constraints can be formalized by the computation of greatest consistent specializations (GCSs)<br />
of user-defined methods, an approach that occurs naturally in OODBs, since operational<br />
specialization is already present when overriding methods.<br />
In [23] an algorithm has been presented that allows GCS construction (under certain<br />
technical prerequisites) to be reduced to basic operations. Greatest consistent specializations<br />
of generic update methods for the OODM were presented in [25]. We briefly describe GCS<br />
construction in Section 4.4.<br />
1 A rational tree is a finite or infinite tree with only a finite number of different subtrees.<br />
4.2 A Core Object Oriented Database Language<br />
In the object-oriented approach we distinguish between objects and values. Values can be<br />
grouped into types that may be regarded as an immutable set of values of a uniform structure<br />
together with operations defined on them. Subtyping is used to relate values in different types.<br />
The class concept provides the grouping of objects having the same structure, which uniformly<br />
combines aspects of object values, references and subreferences, but objects can belong<br />
to different classes. Just as values are only defined via types, objects can only be defined<br />
via classes.<br />
References and subreferences between classes give rise to implicit referential constraints.<br />
In addition, subreferences (part-of) define local referential constraints, and subclasses (IsA-relationships)<br />
require each database instance to satisfy inclusion constraints on object identifiers.<br />
We shall later extend this picture, allowing additional constraints to be defined by the<br />
user.<br />
As usual in object oriented approaches, methods are used to model the database dynamics.<br />
In the OODM these are associated with classes. In addition we shall later add macros, which<br />
differ from methods in that a macro produces new language expressions from language expressions.<br />
4.2.1 A Simple Type System<br />
Here we follow the classical view of types in [11], using a type system that consists of some<br />
basic types, type constructors and a subtyping relation. Moreover, we assume the existence of<br />
recursive types, i.e. types defined by domain equations.<br />
The base types are BOOL, NAT, INT, FLOAT, STRING, ID and $\bot$, where ID is an<br />
abstract identifier type without any non-trivial supertype and $\bot$ is the trivial type that is a<br />
supertype of every type. The type constructors are $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$ (record), $\{\alpha\}$ (finite<br />
set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) and $(a_1 : \alpha_1) \cup \dots \cup (a_n : \alpha_n)$ (union). We may use base types and<br />
constructors to define new types by nesting.<br />
It is easy to extend such a type system by adding e.g. a function type constructor $\to$.<br />
We abstained from this extension here because of the object identification and genericity<br />
problems. We shall discuss this problem at the end of the next subsection.<br />
Example 4.1. The type definition for PERSONNAME uses both the set constructor { }<br />
and the record constructor ( ):<br />
Type PERSONNAME =<br />
( FirstName : STRING , SecondName : STRING , Titles : { STRING } )<br />
End PERSONNAME<br />
The definition of a type PERSON uses the type PERSONNAME.<br />
Type PERSON =<br />
( PersonIdentityNo : NAT , Name : PERSONNAME )<br />
End PERSON<br />
The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />
standard operators on base types and on records, sets, bags, etc.; we omit the details here. A<br />
type t is called proper iff the number of its parameters is 0. t is called a value type iff there<br />
is no occurrence of ID in t. If $t'$ is a proper type occurring in a type t, then there exists a<br />
corresponding occurrence relation $o : t \times t' \to BOOL$.<br />
A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by<br />
the usual subtype relation [11].<br />
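The notions of proper type and value type can be made concrete with a small sketch. The Python encoding below is only an illustration under stated assumptions: the class names Base, Param, Record and SetOf and the helper functions are hypothetical and not part of the OODM, and only the record and set constructors are modelled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Base:
    """A base type such as BOOL, NAT, STRING or ID (hypothetical encoding)."""
    name: str


@dataclass(frozen=True)
class Param:
    """A type parameter, as it occurs in a parameterized type."""
    name: str


@dataclass(frozen=True)
class Record:
    """Record constructor: a tuple of (label, component type) pairs."""
    fields: Tuple[Tuple[str, object], ...]


@dataclass(frozen=True)
class SetOf:
    """Finite set constructor."""
    element: object


def components(t):
    """Return the types nested directly inside t."""
    if isinstance(t, Record):
        return [component for _, component in t.fields]
    if isinstance(t, SetOf):
        return [t.element]
    return []


def is_proper(t):
    """A type is proper iff it contains no parameters."""
    if isinstance(t, Param):
        return False
    return all(is_proper(c) for c in components(t))


def is_value_type(t):
    """A type is a value type iff ID does not occur in it."""
    if isinstance(t, Base) and t.name == "ID":
        return False
    return all(is_value_type(c) for c in components(t))


STRING, NAT, ID = Base("STRING"), Base("NAT"), Base("ID")

# The types of Example 4.1.
PERSONNAME = Record((("FirstName", STRING),
                     ("SecondName", STRING),
                     ("Titles", SetOf(STRING))))
PERSON = Record((("PersonIdentityNo", NAT), ("Name", PERSONNAME)))

# A record that mentions ID, as the representation type of a class would.
WITH_REFERENCE = Record((("DeptName", STRING), ("Head", ID)))
```

In this encoding PERSON is both proper and a value type, whereas WITH_REFERENCE is proper but not a value type because ID occurs in it.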
4.2.2 Specification of Structure<br />
Each object in a class consists of an identifier, a collection of values and references or subreferences<br />
to other objects. Identifiers can be represented using the unique identifier type ID.<br />
Values and (sub)references can be combined in a representation type, where each occurrence<br />
of ID denotes a reference to some other class. Therefore, we may define the structure of a<br />
class using parameterized types. Moreover, classes are arranged in IsA-hierarchies.<br />
A structural class consists of a class name C, a set of class names $D_1, \dots, D_m$ (in the<br />
following called superclasses) and a value type expression S with all parameters replaced<br />
either by a reference ref $r_i : C_i$ or by a subreference part $r_i : C_i$ with (sub)reference names<br />
$r_i$ and class names $C_i$.<br />
If $r_i$ occurs within ref $r_i : C_i$ in the structure expression of a class C, we call $r_i$ the<br />
reference named $r_i$ from class C to class $C_i$. If $r_i$ occurs within part $r_i : C_i$ in the structure<br />
expression of a class C, we call $r_i$ the subreference named $r_i$ from class C to class $C_i$.<br />
The type derived from the structure expression S of a class by replacing each reference<br />
ref $r_i : C_i$ and each subreference part $r_i : C_i$ by the type ID is called the representation type<br />
$T_C$ of the class C; the type $U_C = (\mathit{ident} : ID, \mathit{value} : T_C)$ is called the class type of C.<br />
Example 4.2.<br />
Let us now describe some structural classes in a simple university example.<br />
Class PersonC<br />
Structure PERSON<br />
End PersonC<br />
Class ProfessorC<br />
IsA PersonC<br />
Structure ( PersonIdentityNo : NAT , Age : NAT ,<br />
Salary : NAT , ref Faculty : DepartmentC )<br />
End ProfessorC<br />
Class DepartmentC<br />
Structure ( DeptName : STRING ,<br />
ref Head : ProfessorC ,<br />
Phones : { NAT } )<br />
End DepartmentC<br />
□<br />
4.2.3 Database Instances<br />
A (structural) schema $\mathcal{S}$ is a finite collection of structural classes $C_1, \dots, C_n$ closed under<br />
references and superclasses. In order to define the semantics of structural schemata, we need<br />
the notion of a database instance.<br />
An instance $\mathcal{D}$ of a structural schema $\mathcal{S}$ assigns to each class C a value $\mathcal{D}(C)$ of type $\{U_C\}$<br />
such that the following conditions are satisfied:<br />
uniqueness of identifiers: For every class C we have<br />
$$\forall i :: ID .\ \forall v, w :: T_C .\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w \tag{4.24}$$<br />
inclusion integrity: For a subclass C of $C'$ we have<br />
$$\forall i :: ID .\ i \in \mathrm{dom}(\mathcal{D}(C)) \Rightarrow i \in \mathrm{dom}(\mathcal{D}(C')) \tag{4.25}$$<br />
Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />
$$\forall i :: ID .\ \forall v :: T_C .\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') \tag{4.26}$$<br />
referential integrity: For each reference or subreference r from C to $C'$ with corresponding<br />
occurrence relation $o_r$ we have<br />
$$\forall i, j :: ID .\ \forall v :: T_C .\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in \mathrm{dom}(\mathcal{D}(C')) \tag{4.27}$$<br />
local referential integrity: For each subreference r from C to a class $C'$ with corresponding<br />
occurrence relation $o_r$ we have<br />
$$\forall i_1, i_2, j :: ID .\ \forall v_1, v_2 :: T_C .\ (i_1, v_1) \in \mathcal{D}(C) \wedge (i_2, v_2) \in \mathcal{D}(C) \wedge j \in \mathrm{dom}(\mathcal{D}(C')) \wedge o_r(v_1, j) \wedge o_r(v_2, j) \Rightarrow i_1 = i_2 \wedge v_1 = v_2 \tag{4.28}$$<br />
We know from [22] that schema-defined generic update operations only exist for value-representable<br />
classes. In turn, value-representability is implied by imposing a trivial uniqueness<br />
constraint on each class. Therefore, in order to guarantee the existence of generic update<br />
methods we also assume for each class C the following condition:<br />
value-identifiability:<br />
$$\forall i, j :: ID .\ \forall v :: T_C .\ (i, v) \in \mathcal{D}(C) \wedge (j, v) \in \mathcal{D}(C) \Rightarrow i = j \tag{4.29}$$<br />
If we do not have function types, then for each database instance it is decidable whether the<br />
value-identifiability condition holds. If functions come into play, this is no longer true, since<br />
we then have to check the equality of functions. Introducing function types therefore requires<br />
a more sophisticated treatment of value-identifiability in the sense that we have to require a<br />
decidable uniqueness constraint. For the reflective generation of generic update operations,<br />
however, we need to know that they exist, not the reason why they exist. In order not to<br />
overload the presentation, we therefore decided to keep the type system as simple as possible.<br />
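Since no function types are involved, the instance conditions can be checked mechanically. The following Python sketch illustrates this for conditions (4.24), (4.25), (4.27) and (4.29). It assumes that the extension of a class is a set of (identifier, value) pairs with hashable values and that the occurrence relation of a reference is supplied as a function returning the identifiers occurring in a value; all function names and the toy data are hypothetical.

```python
def dom(extension):
    """The identifiers occurring in the extension of a class."""
    return {ident for ident, _ in extension}


def unique_identifiers(extension):
    """Condition (4.24): an identifier determines its value."""
    seen = {}
    for ident, value in extension:
        if seen.setdefault(ident, value) != value:
            return False
    return True


def value_identifiable(extension):
    """Condition (4.29): a value determines its identifier."""
    seen = {}
    for ident, value in extension:
        if seen.setdefault(value, ident) != ident:
            return False
    return True


def inclusion_integrity(sub_extension, super_extension):
    """Condition (4.25): identifiers of a subclass occur in the superclass."""
    return dom(sub_extension) <= dom(super_extension)


def referential_integrity(extension, target_extension, referenced_ids):
    """Condition (4.27): every referenced identifier exists in the target class.

    referenced_ids(v) plays the role of the occurrence relation o_r: it
    returns the identifiers j for which o_r(v, j) holds.
    """
    targets = dom(target_extension)
    return all(j in targets
               for _, value in extension
               for j in referenced_ids(value))


# Toy instance: values are tuples whose last component is a reference.
persons = {(1, "Smith"), (2, "Jones")}
professors = {(1, ("Smith", 10))}      # (ident, (name, Faculty))
departments = {(10, ("CS", 1))}        # (ident, (name, Head))


def last_component(value):
    """Occurrence relation of the single reference in the toy values."""
    return [value[-1]]
```

Each check is a single pass over finite sets, which is the decidability the paragraph above relies on.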
4.2.4 Specification of Behaviour<br />
So far, only static aspects have been considered. A structural schema is simply a collection of<br />
data structures called classes. Let us now turn to adding dynamics to this picture. As required<br />
in the object oriented approach, operations will be associated with classes. This gives us the<br />
notion of a method.<br />
We shall distinguish between visible and hidden methods, i.e. between those methods<br />
that can be invoked by the user and all others. Each method on a structural class C consists of<br />
a signature and a body. The signature consists of a method name and sets of parameter/type<br />
pairs for input and output. The body is defined by the usual constructs of a procedural<br />
programming language. A method M on a class C is called value-defined iff all types occurring<br />
in its signature are proper value types.<br />
Example 4.3.<br />
Let us describe an insert-method for the class PersonC.<br />
Method insert'_PersonC<br />
( in : P :: PERSON , out : I :: ID ) =<br />
IF ∃ O ∈ PersonC . value(O) = P<br />
THEN I := ident(O)<br />
ELSE I := NewId ;<br />
PersonC := PersonC ∪ { ( I , P ) }<br />
ENDIF<br />
We used the global method NewId to denote the selection of a new identifier. Note that this<br />
method is not value-defined, but we could simply drop the output to obtain a value-defined<br />
method.<br />
□<br />
As already mentioned, we distinguish between methods visible to the user and hidden methods.<br />
We require each visible method to be value-defined. In particular, we use the value-defined<br />
generic update methods insert_C, delete_C and update_C for each class C, which exist since<br />
we require value-representability [22]. Moreover, we use the quasi-generic update methods<br />
insert'_C, delete'_C and update'_C for each class C, which are used to define the generic updates. The<br />
only difference is that the generic updates suppress the output of type ID. The method in<br />
Example 4.3 is quasi-generic.<br />
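The relationship between a quasi-generic and a generic update can be sketched in a few lines of Python. The sketch assumes that the extension of a class is a dictionary from identifiers to values and that NewId is modelled by a global counter; the function names are hypothetical, and the sketch ignores all references and IsA-relations.

```python
import itertools

# Stand-in for the global method NewId: yields identifiers never used before.
_identifiers = itertools.count(1)


def new_id():
    return next(_identifiers)


def quasi_generic_insert(extension, value):
    """Insert value and return the identifier of the object carrying it.

    If an object with this value already exists, its identifier is returned
    and the extension is left unchanged, as in Example 4.3.
    """
    for ident, stored in extension.items():
        if stored == value:
            return ident
    ident = new_id()
    extension[ident] = value
    return ident


def generic_insert(extension, value):
    """The value-defined variant: same effect, but the identifier is dropped."""
    quasi_generic_insert(extension, value)


person_c = {}
person = (4711, ("Anna", "Meier", frozenset({"Dr"})))
first = quasi_generic_insert(person_c, person)
second = quasi_generic_insert(person_c, person)
```

Here the second call finds the object created by the first one, so both calls return the same identifier and the extension contains a single object.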
Subclasses inherit the methods of their superclasses, but overriding is allowed as long as<br />
the new method is a specialization of all corresponding methods in its superclasses.<br />
A (behavioural) schema $\mathcal{S}$ is a finite collection of behavioural classes $\{C_1, \dots, C_n\}$ closed<br />
under references, superclasses and method calls, where a behavioural class is simply a structural<br />
class together with methods on it.<br />
4.3 Genericity Beyond Polymorphism<br />
Our goal is to provide generic update methods insert_C, delete_C and update_C for each class<br />
C of a database schema. These update methods are "generic" in the sense that they are<br />
applicable to each class of a schema. These methods demand the identification of objects<br />
without accessing the object identifier, since oids are an internal concept and do not have a<br />
meaning for the user of a database. Hence the need for value-representability.<br />
Besides this identification problem we also have to cope with the enforcement of implicit<br />
integrity constraints. In [22] it has been shown that value-representability is a necessary and<br />
sufficient condition for the existence of consistent generic update methods.<br />
4.3.1 Implicit Schema Extensions<br />
Since we assume the existence of a trivial uniqueness constraint for each class, generic and<br />
quasi-generic update methods always exist. Let us first illustrate this by an example.<br />
Example 4.4. Consider again the schema of Example 4.2. The value-representation types for<br />
the classes ProfessorC and DepartmentC are<br />
Type V_Prof =<br />
( PersonIdentityNo : NAT , Age : NAT , Salary : NAT ,<br />
Faculty : ( DeptName : STRING , Head : V_Prof , Phones : { NAT } ) )<br />
End V_Prof<br />
Type V_Dept =<br />
( DeptName : STRING , Head : V_Prof , Phones : { NAT } )<br />
End V_Dept<br />
Selecting the component Faculty(V) for a value V :: V_Prof gives the required value of type<br />
V_Dept. However, we have to choose a new identifier for a new object in ProfessorC and,<br />
due to the cycle in this schema, this identifier also occurs in the value of some new object in<br />
DepartmentC; hence we need the more complex type<br />
Type V̄_Prof =<br />
( PersonIdentityNo : NAT , Age : NAT , Salary : NAT ,<br />
Faculty : ( DeptName : STRING ,<br />
Head : ( value : V̄_Prof ) ∪ ( ident : ID ) , Phones : { NAT } ) )<br />
End V̄_Prof<br />
Note that this is a supertype of V_Prof. Let the corresponding subtype function be f_Prof.<br />
Neglecting for the moment IsA-relations, the quasi-generic insert on the class ProfessorC<br />
is given by<br />
Method insert'_Prof<br />
( in : V :: V̄_Prof , out : I :: ID ) =<br />
IF ∃ O ∈ ProfessorC . f_Prof(value(O)) = V<br />
THEN I := ident(O)<br />
ELSE I := NewId ;<br />
IF ∃ J :: ID . Head(Faculty(V)) = ( ident : J )<br />
THEN ProfessorC := ProfessorC ∪ { ( I , V ) }<br />
ELSE Let V' :: V̄_Dept . V' := Faculty(V) ;<br />
Let K :: ID . K := insert'_Dept(V') ;<br />
Let V'' :: V̄_Prof . V'' := ( PersonIdentityNo(V) , Age(V) , Salary(V) , K ) ;<br />
ProfessorC := ProfessorC ∪ { ( I , V'' ) }<br />
ENDIF<br />
ENDIF<br />
Our aim now is to generate (quasi-)generic update methods from the structural schema and<br />
to add them to the corresponding classes, i.e. to implicitly change the behavioural schema.<br />
A natural first idea is to exploit polymorphism as in [11] for this task. However, generic<br />
consistent updates on a class C have to be value-defined, and hence require an input type V_C<br />
without any occurrence of ID. Such an input type has to be computed from the schema, and<br />
hence the generation requires meta-information. It has been shown in [17] that the need for<br />
meta-information exceeds the capability of polymorphism. The alternative is to use linguistic<br />
reflection as proposed in [20].<br />
□<br />
4.3.2 Linguistic Reflection<br />
The basic idea of linguistic reflection is to use reflection types such as SCHEMA_rep, CLASS_rep,<br />
TYPE_rep, METHOD_rep, COMMAND_rep, etc. for the representation of abstract syntax expressions<br />
representing schemata, classes, types, methods, commands (method bodies), etc.,<br />
respectively. For each of these, there exists a function raise associating with such a syntactic<br />
expression a true schema, class, type, etc., respectively.<br />
The types used for the representation of language constructs such as types, classes, constraints,<br />
methods, commands and schemata form the basis of linguistic reflection and will<br />
therefore be called reflection types (see footnote 2).<br />
Moreover, we need a macro value-rep with signature<br />
SCHEMA_rep × CLASS_rep → TYPE_rep ,<br />
where value-rep(S, C) represents a value type needed for the unique identification (and representation)<br />
of some object, and hence is needed for the generic insert-, delete- and update-methods<br />
on raise(C).<br />
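To give an impression of what such a macro has to compute, the following Python sketch derives a value representation from a schema representation. It is an illustration under several assumptions, not the macro of the OODM: schemata are encoded as dictionaries, a reference is the hypothetical pair ("ref", class name), only references occurring directly as record components are treated, and a class that has already been visited is replaced by a named back-reference ("rec", "V_" + class name), which mirrors the recursive occurrence of V_Prof in Example 4.4.

```python
def value_rep(schema, class_name):
    """Value representation of a class: references are expanded to values."""
    return _vrep(schema[class_name], schema, [class_name])


def _vrep(structure, schema, visited):
    """Expand the references in a record structure.

    visited lists the classes on the current expansion path, so that a cycle
    in the schema yields a back-reference instead of an endless expansion.
    """
    result = {}
    for tag, field in structure.items():
        if isinstance(field, tuple) and field[0] == "ref":
            target = field[1]
            if target not in visited:
                result[tag] = _vrep(schema[target], schema, [target] + visited)
            else:
                # Assumption of this sketch: name the recursive type.
                result[tag] = ("rec", "V_" + target)
        else:
            result[tag] = field
    return result


# The cyclic part of the schema of Example 4.2.
UNIVERSITY = {
    "ProfessorC": {"PersonIdentityNo": "NAT", "Age": "NAT", "Salary": "NAT",
                   "Faculty": ("ref", "DepartmentC")},
    "DepartmentC": {"DeptName": "STRING",
                    "Head": ("ref", "ProfessorC"),
                    "Phones": ("set", "NAT")},
}

v_prof = value_rep(UNIVERSITY, "ProfessorC")
v_dept = value_rep(UNIVERSITY, "DepartmentC")
```

For ProfessorC the result contains no identifier type: the Faculty component is expanded into the department record, whose Head component refers back to the type being defined.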
Such macros provide a more general way to specify database behaviour. They can be<br />
understood as transformations of language expressions. The main difference from macros in<br />
traditional programming languages, e.g. LISP, is that the expressions are abstract syntax<br />
represented in additional predefined types. Hence macros are also strongly typed.<br />
The core of the problem then is to define three macros with signatures<br />
insert : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep ,<br />
delete : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep and<br />
update : S :: SCHEMA_rep × C :: CLASS_rep → METHOD_rep .<br />
Clearly, there are also other macros used by these main macros, and there is also one macro<br />
generic with signature SCHEMA_rep → SCHEMA_rep that transforms a whole user-defined<br />
schema into an internal schema with generic update methods added to all classes.<br />
4.3.3 Reflection Types<br />
Let us now briefly indicate some of the reflection types that are needed to construct generic<br />
update methods. We follow the presentation in Section 4.2, starting with TYPE_rep. In general,<br />
each type is given by a type name and a defining type expression, hence<br />
Type TYPE_rep =<br />
( name : NAME_rep , type-exp : TYPE_EXP_rep )<br />
End TYPE_rep<br />
Type expressions are given by the base types and type constructors defined in Section 4.2.1,<br />
which leads to the following recursive definition<br />
Type TYPE_EXP_rep =<br />
( BoolT : ⊥ ) ∪ … ∪ ( SetT : ( element-type : TYPE_FORM_rep ) ) ∪<br />
( RecordT : [ ( tag : NAME_rep , field : TYPE_FORM_rep ) ] )<br />
End TYPE_EXP_rep<br />
with values of (reflection) type TYPE_FORM_rep being either type expressions (of type<br />
TYPE_EXP_rep) or simply type names.<br />
Type TYPE_FORM_rep =<br />
( type-name : NAME_rep ) ∪ ( type-exp : TYPE_EXP_rep )<br />
End TYPE_FORM_rep<br />
2 In the original work on linguistic reflection [20] the notion representation type was used instead of reflection<br />
type. Here we changed this notation in order to avoid confusion with representation types of classes.<br />
Next, let us describe CLASS_rep, which can be built analogously. In particular there is a close<br />
resemblance between TYPE_EXP_rep and STRUCTURE_rep.<br />
Type CLASS_rep =<br />
( name : NAME_rep , isa : { NAME_rep } ,<br />
structure : STRUCTURE_rep , methods : { METHOD_rep } )<br />
End CLASS_rep<br />
The difference between the representation of type expressions and the one for structure expressions<br />
is that the latter may contain references and subreferences, indicated by the use of<br />
TYPE_REF_FORM_rep instead of TYPE_FORM_rep.<br />
Type STRUCTURE_rep =<br />
( BoolT : ⊥ ) ∪ … ∪ ( SetS : ( element-type : TYPE_REF_FORM_rep ) ) ∪<br />
( RecordS : [ ( tag : NAME_rep , field : TYPE_REF_FORM_rep ) ] )<br />
End STRUCTURE_rep<br />
Then the extension of TYPE_REF_FORM_rep with respect to TYPE_FORM_rep simply consists<br />
in adding reference expressions.<br />
Type TYPE_REF_FORM_rep =<br />
( type-name : NAME_rep ) ∪ ( type-exp : TYPE_EXP_rep ) ∪<br />
( ref-exp : ( ref-kind : ( REF : ⊥ ) ∪ ( PART : ⊥ ) ,<br />
reference : NAME_rep , class : NAME_rep ) )<br />
End TYPE_REF_FORM_rep<br />
Finally, let us indicate the definition of the reflection type METHOD_rep, but without going<br />
into too much detail. We omit the definitions for COMMAND_rep and EXPR_rep.<br />
Type METHOD_rep =<br />
( name : NAME_rep ,<br />
in-list : [ ( parameter : NAME_rep , type : TYPE_FORM_rep ) ] ,<br />
out-list : [ ( parameter : NAME_rep , type : TYPE_FORM_rep ) ] ,<br />
body : COMMAND_rep )<br />
End METHOD_rep<br />
Further details on the reflection types of the OODM will be omitted here.<br />
4.3.4 Generators for Generic Updates<br />
In order to build generator macros for generic update methods we follow the constructive<br />
proof of their existence in [22]. We then have to cope with the following problems.<br />
(i) We have to provide value types for the input. This will be achieved by the macro value-rep<br />
already mentioned above.<br />
(ii) Generic update methods are value-defined, but nevertheless have to cope with identifiers.<br />
This seeming contradiction can be resolved by constructing canonical update methods<br />
with ID as output type. Clearly, these give hidden methods in the internal schema. The<br />
corresponding generic update method then just consists of a call to the canonical one and<br />
simply neglects its output.<br />
(iii) Inclusion integrity has to be enforced. Therefore, each insert propagates through superclasses,<br />
whereas deletions propagate through subclasses and updates do both. For the<br />
macros insert, delete and update this means that we have to build methods ignoring all<br />
IsA-relations and then arrange these in a sequence.<br />
(iv) The most difficult task is the enforcement of referential integrity, especially in the case<br />
of cycles as e.g. in Example 4.2. We have to propagate in both directions along these<br />
references, but for cycles we also have to choose new identifiers before starting this propagation.<br />
At first glance it seems that our approach starts with the most complicated case, where all<br />
operations are propagated along references, whereas it would be much simpler for an insert-operation<br />
to require referenced objects to exist already and to disallow delete-operations as long as<br />
referencing objects still exist. There are two reasons not to follow this simpler approach:<br />
(i) Our approach only takes care of implicit, structurally defined constraints, whereas<br />
the simpler scenario arises when certain additional user-defined transition constraints are<br />
added. We briefly discuss the general handling of any kind of user-defined constraints on<br />
a solid theoretical ground in Section 4.4 (see also [23, 25]).<br />
(ii) These additional constraints may discard the generic operations, in particular in the case<br />
of cycles in the schema. In these cases it is desirable to make the database designer<br />
aware of the consequences of adding certain constraints, whereas s/he only has the<br />
chance to change the schema completely (omitting all cycles) in the case where additional<br />
constraints are tacitly assumed.<br />
An implementation of the simplified scenario on the basis of a strongly typed persistent<br />
programming language is reported in the Ph.D. thesis [27].<br />
Let us now partly indicate the solution to these problems by generator macros for the<br />
canonical update methods.<br />
The macro rep-type computes for a given class its representation type:<br />
Macro rep-type ( in : C :: CLASS_rep , out : T_C :: TYPE_EXP_rep ) =<br />
call rep-type-struct ( in : structure(C) , out : T_C )<br />
The called macro rep-type-struct involves several cases depending on the value E which contains the representation of a structure expression. This stems from a type constructor, hence leads to the case distinctions.<br />
Macro rep-type-struct ( in : E :: STRUCTURE_rep ,<br />
out : E′ :: TYPE_EXP_rep ) =<br />
CASE E ...<br />
E = (RecordS : [ (tag : N , field : S) | L ]) →<br />
Let S′ :: TYPE_REF_FORM_rep<br />
IF S = (ref-exp : R)<br />
THEN S′ := ( type-name : ID )<br />
ELSE S′ := S<br />
ENDIF<br />
Let L′ :: TYPE_EXP_rep .<br />
call rep-type-struct ( in : L , out : L′ )<br />
E′ := (RecordT : [ (tag : N , field : S′) | L′ ])<br />
ENDCASE<br />
The macro value-rep is somewhat more complicated, since it propagates through the whole schema.<br />
Macro value-rep ( in : C :: CLASS_rep , S :: SCHEMA_rep ,<br />
out : V_C :: TYPE_EXP_rep ) =<br />
call vrep-type-struct ( in : structure(C), S, [name(C)] , out : V_C )<br />
Macro vrep-type-struct<br />
( in : E :: STRUCTURE_rep , S :: SCHEMA_rep , K :: [ NAME_rep ] ,<br />
out : E′ :: TYPE_EXP_rep ) =<br />
CASE E ...<br />
E = (RecordS : [ (tag : N , field : S) | L ]) →<br />
Let S′ :: TYPE_REF_FORM_rep<br />
IF S = ( ref-exp : ( ref-kind : x , reference : R , class : N′ ))<br />
THEN<br />
IF N′ ∉ K<br />
THEN Let C′ ∈ S . name(C′) = N′<br />
Let V_C′ :: TYPE_EXP_rep .<br />
call vrep-type-struct ( in : structure(C′), S, [N′ | K] , out : V_C′ )<br />
S′ := ( type-exp : V_C′ )<br />
ELSE S′ := ( type-name : ( vrep : N′ ))<br />
ENDIF<br />
ELSE S′ := S<br />
ENDIF<br />
Let L′ :: TYPE_EXP_rep .<br />
call vrep-type-struct ( in : L, S, K , out : L′ )<br />
E′ := (RecordT : [ (tag : N , field : S′) | L′ ])<br />
ENDCASE<br />
This solves the first problem above. The trivial solution to the second problem has already been given. Now concentrate on the enforcement of IsA-constraints. The following macro insert-seq computes the body of the required canonical insert. It uses the macro class-description to get the definition of a class in a schema from the class name and the macro insert-ref to compute the core of the method body and to enforce referential integrity.<br />
Macro insert-seq<br />
( in : C :: CLASS_rep , S :: SCHEMA_rep , out : P :: COMMAND_rep ) =<br />
Let P1, P2 :: COMMAND_rep .<br />
call command-list ( in : isa(C), S , out : P1 )<br />
call insert-ref ( in : C, S , out : P2 )<br />
P := sequence(P1,P2)<br />
Macro command-list<br />
( in : NL :: { NAME_rep } , S :: SCHEMA_rep ,<br />
out : P :: COMMAND_rep ) =<br />
CASE NL<br />
NL = ∅ → P := `skip'<br />
NL = { N } ∪ L ∧ N ∉ L →<br />
Let D :: CLASS_rep .<br />
call class-description ( in : N, S , out : D )<br />
Let P1, P2 :: COMMAND_rep .<br />
call insert-ref ( in : D, S , out : P1 )<br />
call command-list ( in : L, S , out : P2 )<br />
P := sequence(P1,P2)<br />
ENDCASE<br />
To solve the third problem we also have to construct the input- and output-lists of the methods. This is straightforward.<br />
Finally, to solve the fourth problem let us briefly consider how to enforce referential integrity within insertions. Again, we have to build an input-type that extends the value-representation type by adding ID. The corresponding macros are completely analogous to value-rep and vrep-type-struct. Then the definition of insert-ref follows the spirit of Example 4.4 and the macros shown above. We omit further details.<br />
4.4 Integrity Enforcement<br />
The reflective approach in the preceding section allows us to cope with the implicit integrity constraints and hence suggests an immediate generalization to arbitrary user-defined constraints. Then the problem is to guarantee the consistency of a specified method with respect to such constraints.<br />
4.4.1 User-Defined Integrity Constraints<br />
Let us first extend the notion of schema S by the introduction of explicit user-defined integrity constraints, which are formulae $I$ over the underlying type system with free variables $fr(I) \subseteq \{x_{C_1}, \dots, x_{C_n}\}$, where each $x_{C_i}$ is a variable of type $\{U_{C_i}\}$. We call $x_{C_i}$ the class variable of $C_i$.<br />
A constrained schema consists of a behavioural schema S and a finite set of integrity constraints on S. An instance D is said to be consistent with respect to the integrity constraint $I$ iff substituting $D(C)$ for each class variable $x_C$ in $I$ evaluates to true, when interpreted in the usual way.<br />
We use abbreviations for distinguished classes of constraints. For this let $C, C_1, C_2$ be classes and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$) and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.<br />
A functional constraint on $C$ is a constraint $C.c^1 \to C.c^2$ which abbreviates<br />
\[ \forall i, i' :: ID.\ \forall v, v' :: T_C.\ c^1(v) = c^1(v') \wedge (i, v), (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \qquad (4.30) \]<br />
A uniqueness constraint on $C$ is a constraint UNIQUE($c^1$) or $C.c^1 \to C.ident$ which abbreviates<br />
\[ \forall i, i' :: ID.\ \forall v, v' :: T_C.\ c^1(v) = c^1(v') \wedge (i, v), (i', v') \in x_C \Rightarrow i = i' . \qquad (4.31) \]<br />
An inclusion constraint on $C_1$ and $C_2$ is a constraint $C_1.c_1 \subseteq C_2.c_2$ which abbreviates<br />
\[ \forall t :: T.\ \bigl( \exists (i_1, v_1) \in x_{C_1}.\ c_1(v_1) = t \bigr) \Rightarrow \exists (i_2, v_2) \in x_{C_2}.\ c_2(v_2) = t . \qquad (4.32) \]<br />
An exclusion constraint on $C_1$, $C_2$ is a constraint $C_1.c_1 \parallel C_2.c_2$ which abbreviates<br />
\[ \forall i_1, i_2 :: ID.\ \forall v_1 :: T_{C_1}.\ \forall v_2 :: T_{C_2}.\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \neq c_2(v_2) . \qquad (4.33) \]<br />
Assume $c^1 \times c^2 \times c^3$ defines a uniqueness constraint on $C$. Then an object generating constraint on $C$ is a constraint $C.c^1 \twoheadrightarrow C.c^2$ which abbreviates<br />
\[ \forall i_1, i_2 :: ID.\ \forall v_1, v_2 :: T_C.\ (i_1, v_1) \in x_C \wedge (i_2, v_2) \in x_C \wedge c^1(v_1) = c^1(v_2) \Rightarrow \exists (i, v) \in x_C.\ c^1(v) = c^1(v_1) \wedge c^2(v) = c^2(v_1) \wedge c^3(v) = c^3(v_2) . \qquad (4.34) \]<br />
These constraint notations can be easily extended to path constraints using the usual dot-notation. In addition we may use the notation -!$r_i$ for a (sub)reference $r_i$. The difference to -.$r_i$ is that the latter refers to a value of type ID, whereas the former corresponds to the referenced object in class $C_i$ or equivalently to a value of type $U_{C_i}$. Paths can be abbreviated if this does not lead to confusion, in particular the selector value is usually omitted.<br />
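To make the abbreviations (4.30) to (4.33) concrete, the following Python sketch checks them on small finite instances. It rests on assumptions that are not part of the formal model: a class instance is a Python set of (identifier, value) pairs, subtype functions are ordinary callables, and the professor values are invented for the illustration.

```python
# A class instance x_C is modelled as a set of (identifier, value) pairs.
# Subtype functions are modelled as plain Python callables (an assumption).

def functional(x_c, c1, c2):
    """C.c1 -> C.c2, cf. (4.30): equal c1-values force equal c2-values."""
    return all(c2(v) == c2(w)
               for _, v in x_c for _, w in x_c if c1(v) == c1(w))


def unique(x_c, c1):
    """UNIQUE(c1), cf. (4.31): equal c1-values force equal identifiers."""
    return all(i == j
               for i, v in x_c for j, w in x_c if c1(v) == c1(w))


def inclusion(x_c1, c1, x_c2, c2):
    """cf. (4.32): every c1-value in C1 also occurs as a c2-value in C2."""
    return {c1(v) for _, v in x_c1} <= {c2(v) for _, v in x_c2}


def exclusion(x_c1, c1, x_c2, c2):
    """cf. (4.33): no c1-value in C1 coincides with a c2-value in C2."""
    return {c1(v) for _, v in x_c1}.isdisjoint({c2(v) for _, v in x_c2})


# Hypothetical professor values: (person number, age, salary).
professors = {(1, (100, 40, 5000)), (2, (101, 40, 5000)), (3, (102, 50, 6000))}
person_no = lambda v: v[0]
age = lambda v: v[1]
salary = lambda v: v[2]

print(functional(professors, age, salary))   # age determines salary in this instance
print(unique(professors, person_no))         # person numbers identify objects
print(unique(professors, age))               # two professors share the same age
```

Inclusion and exclusion reduce to subset and disjointness tests on the images of the subtype functions, which is exactly what the quantified formulae express for finite instances.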
Example 4.5. Let us assume that the salary of a professor is determined by his/her age. For this purpose, let $Age, Salary : T_{Prof} \to NAT$ be the natural projections to the Age- and Salary-values respectively. Then we have the following functional constraint on the class ProfessorC:<br />
Constraint ProfessorC.Age → ProfessorC.Salary which abbreviates<br />
\[ \forall i, j :: ID.\ \forall v, w :: T_{Prof}.\ (i, v) \in x_{Prof} \wedge (j, w) \in x_{Prof} \wedge Age(v) = Age(w) \Rightarrow Salary(v) = Salary(w) . \]<br />
As a second example take DepartmentC!Head.Faculty = DepartmentC.ident, which states that the head of a department also works in that department.<br />
□<br />
4.4.2 Greatest Consistent Specializations<br />
The problem of integrity enforcement can be formalized by greatest consistent specializations (GCSs). Given a method M and an integrity constraint I, the GCS $M_I$ satisfies<br />
- $M_I$ is consistent with respect to I,<br />
- $M_I$ specializes M and<br />
- each consistent specialization of M also specializes $M_I$.<br />
It has been shown in [23] that GCSs always exist. Moreover, if we consider more than one constraint, i.e. a conjunction of constraints, GCSs can be built successively and do not depend on the order of the constraints. It has also been shown how to compute GCS branches under some technical prerequisites. The restriction to GCS branches is due to practicality. We omit the algorithm, since it involves calculations with predicate transformers that can hardly be explained in a few lines. Instead, let us look at a simple example.<br />
Example 4.6. Let us consider the insert-method on the class ProfessorC from Example 4.2 and the functional constraint in Example 4.5. The method insert′_Prof (see Example 4.4) has to be replaced by the following one [25].<br />
Method insert′_Prof<br />
( in : V :: V_Prof , out : I :: ID ) =<br />
IF ∃ O ∈ ProfessorC . f_Prof(value(O)) = V<br />
THEN I := ident(O)<br />
ELSE<br />
IF ∃ O ∈ ProfessorC . Age(value(O)) = Age(V) ∧ Salary(value(O)) ≠ Salary(V)<br />
THEN skip<br />
ELSE I := NewId<br />
IF ∃ J :: ID . Head(Faculty(V)) = (ident : J)<br />
THEN ProfessorC := ProfessorC ∪ {(I,V)}<br />
ELSE Let V′ :: V_Dept . V′ := Faculty(V)<br />
Let K :: ID . K := insert′_Dept(V′)<br />
Let V″ :: V_Prof .<br />
V″ := ( PersonIdentityNo(V), Age(V), Salary(V), K )<br />
ProfessorC := ProfessorC ∪ {(I,V″)}<br />
ENDIF<br />
ENDIF<br />
ENDIF<br />
□<br />
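The effect of the additional guard in Example 4.6 can be illustrated by a much simplified Python sketch. It assumes that a professor value is just a triple (person number, age, salary), that the class is a dictionary from identifiers to values, and that identifiers come from a counter; the propagation to the department class is left out entirely, so this is not the method of the example but only its constraint-related branch.

```python
import itertools

_new_id = itertools.count(1)           # hypothetical source of fresh identifiers


def insert_prof(professors, value):
    """Insert value unless this would violate Age -> Salary.

    professors maps identifiers to (person number, age, salary) triples.
    Returns the identifier of the stored object, or None if the insert
    degenerates to skip.
    """
    for ident, existing in professors.items():
        if existing == value:          # object already present: reuse its identifier
            return ident
    _, age, salary = value
    if any(a == age and s != salary for _, a, s in professors.values()):
        return None                    # skip: same age with a different salary exists
    ident = next(_new_id)
    professors[ident] = value
    return ident


db = {}
first = insert_prof(db, (100, 40, 5000))
print(insert_prof(db, (100, 40, 5000)) == first)   # same object, same identifier
print(insert_prof(db, (101, 40, 6000)))            # conflicting salary: nothing inserted
print(len(db))
```

The guard turns an insert that would break the functional constraint into skip, while every insert that is already consistent behaves as before.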
From this example it can be seen how to provide the generators for GCS construction. Indeed, we must have for each constraint a generator for the GCSs of generic update methods, and in addition, for each constraint a generator for the precondition required in the GCS algorithm. We omit further details [23].<br />
4.5 Conclusion<br />
Object oriented databases differ from relational ones in that richer structures and implicit constraints, especially inclusion constraints (IsA) and referential constraints, are provided. This forbids a simple approach to genericity. Indeed, each generic method must enforce at least these implicit constraints. Consequently we must also be able to derive the necessary input types for these operations from a given schema.<br />
Here we have solved this genericity problem using a reflective approach, i.e. the generators themselves can be represented in an object oriented database language using strongly typed macros. This form of linguistic reflection exceeds the limits of polymorphism. Reflection is based on the possibility to represent syntactic components of the language such as types, classes, methods, etc. as values within the language itself and to compute new schemata from these representations. This gives a practical solution to the genericity problem, whilst its theoretical justification was proven in [22, 24]. A partial implementation of the approach has been described in [27].<br />
We also sketched how to extend the approach to integrity enforcement. Based on the theoretical results in [23] each constraint gives rise to a macro that transforms a user-defined method into its greatest consistent specialization with respect to the given constraint.<br />
To summarize, genericity and integrity enforcement are not only theoretically well-founded, but can also be efficiently built into object oriented database languages. This allows a tremendous increase in declarativity in object oriented databases.<br />
However, we have to ensure the value-representability of a schema, a demand that was granted for free in the RDM, and we have to provide type-safe reflective database languages.<br />
Of course, the second demand is only sufficient and in general not necessary, since we could build in algorithms for method generation and GCS construction as long as we have access to the schema definitions. The main advantage of the reflective approach is that the work of these algorithms is made explicit and type-safe in the schema by the use of the macro language. This allows e.g. schema changes or changes to integrity constraints to be easily maintained, since they affect only a few macros.<br />
Another advantage of the outlined object oriented approach is that it allows us to cope with constraints that are either structurally determined or explicitly defined by the user. The traditional approach in the RDM usually buries such constraints in database programs.<br />
References for Chapter 4<br />
1. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The Object-Oriented Database System Manifesto, Proc. 1st DOOD, Kyoto 1989<br />
2. H. Aït-Kaci: An Overview of LIFE, in J. W. Schmidt, A. A. Stognij (Eds.): Proc. Next Generation Information Systems Technology, Springer LNCS 504, 1991, 42-58<br />
3. A. Albano, G. Ghelli, R. Orsini: A Relationship Mechanism for a Strongly Typed Object-Oriented Database Programming Language, in Proc. VLDB 1991<br />
4. S. Abiteboul, P. Kanellakis: Object Identity as a Query Language Primitive, in Proc. SIGMOD, Portland Oregon, 1989, 159-173<br />
5. F. Bancilhon, C. Delobel, P. Kanellakis: Building an Object-Oriented Database System: The Story of O2, Morgan Kaufmann, 1992<br />
6. C. Beeri: Formal Models for Object-Oriented Databases, Proc. 1st DOOD 1989, 370-395<br />
7. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol. 5(4), 1990, 353-382<br />
8. C. Beeri: New Data Models and Languages - the Challenge, in Proc. PODS '92<br />
9. M. Carey, D. DeWitt, S. Vandenberg: A Data Model and Query Language for EXODUS, Proc. ACM SIGMOD 88<br />
10. M. Caruso, E. Sciore: The VISION Object-Oriented Database Management System, Proc. of the Workshop on Database Programming Languages, Roscoff, France, September 1987<br />
11. L. Cardelli, P. Wegner: On Understanding Types, Data Abstraction and Polymorphism, ACM Computing Surveys, vol. 17(4), 471-522<br />
12. A. Heuer: Objektorientierte Datenbanken (in German), Addison Wesley, 1992<br />
13. W. Kim, N. Ballou, J. Banerjee, H. T. Chou, J. Garza, D. Woelk: Integrating an Object-Oriented Programming System with a Database System, in Proc. OOPSLA 1988<br />
14. S. Khoshafian, G. Copeland: Object Identity, Proc. 1st Int. Conf. on OOPSLA, Portland, Oregon, 1986<br />
15. B. Meyer: Object-Oriented Software Construction, Prentice-Hall, 1988<br />
16. D. Maier, J. Stein, A. Otis, A. Purdy: Development of an Object-Oriented DBMS, OOPSLA, September 1986<br />
17. D. Stemple, L. Fegaras, T. Sheard, A. Socorro: Exceeding the Limits of Polymorphism in Database Programming Languages, in Proc. EDBT90, Springer LNCS 416, 1990<br />
18. T. Sheard, D. Stemple: Automatic Verification of Database Transaction Safety, ACM ToDS vol. 14 (3), September 1989<br />
19. D. Stemple, T. Sheard: A Recursive Base for Database Programming Primitives, in Proceedings of the First International East/West Database Workshop, Kiev, October 1990<br />
20. D. Stemple, T. Sheard, L. Fegaras: Reflection: A Bridge from Programming to Database Languages, in Proc. HICSS '92<br />
21. K.-D. Schewe, J. W. Schmidt, D. Stemple, B. Thalheim, I. Wetzel: A Reflective Approach to Method Generation in Object Oriented Databases, University of Rostock, Rostocker Informatik Berichte, no. 14, 1992<br />
22. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, Genericity and Consistency in Object-Oriented Databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Springer LNCS 646, 341-356<br />
23. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint CS-08-92, December 1992<br />
24. K.-D. Schewe, B. Thalheim: Fundamental concepts of object oriented databases, Acta Cybernetica, vol. 11 (4), 1993, 49-85<br />
25. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity Preserving Updates in Object Oriented Databases, in M. Orlowska, M. Papazoglou (Eds.): Proc. 4th Australian Database Conference, Brisbane, February 1993, World Scientific, 171-185<br />
26. B. Thalheim: Dependencies in Relational Databases, Teubner Leipzig, 1991<br />
27. I. Wetzel: Programmieren mit STYLE: Über die systematische Entwicklung von Programmierumgebungen (in German), Ph.D. Thesis, Hamburg University, 1994<br />
Chapter 5<br />
Towards a Theory of Consistency Enforcement<br />
Contents<br />
5.1 Introduction<br />
5.1.1 The Consistency Enforcement Problem<br />
5.1.2 The Problem of GCS Construction<br />
5.1.3 The Practicality of GCS Construction<br />
5.2 A Motivating Example<br />
5.2.1 Constraints in the Relational Model<br />
5.2.2 Stepwise Consistency Enforcement<br />
5.3 Fundamental Features of State-Based Specifications<br />
5.3.1 Formal Specifications with Guarded Commands<br />
5.3.2 Axiomatic Semantics<br />
5.3.3 Consistency and Specialization<br />
5.3.4 Greatest Consistent Specializations<br />
5.4 The Construction of GCSs<br />
5.4.1 I-reduced Guarded Commands<br />
5.4.2 An Upper Bound for GCSs<br />
5.4.3 The General Form of GCSs<br />
5.5 Conclusion<br />
5.6 A Normal Form for the Specialization Proof Obligation<br />
5.7 Proof of the Upper Bound Theorem for Sequences<br />
5.8 Proof of the Upper Bound Theorem in the Recursive Case<br />
This chapter contains a reprint of<br />
K.-D. Schewe, B. Thalheim. Towards a Theory of Consistency Enforcement. Acta Informatica 1998 (to appear).<br />
Abstract. Specifications with invariants occur in almost all formal specification languages. Hence the problem is to prove the consistency of the specified operations with respect to the invariants. Whilst the problem seems to be easily solvable in predicative specifications, it usually requires sophisticated verification efforts when specifications in the style of Dijkstra's guarded commands, as e.g. in the specification language B, are used.<br />
As an alternative a computational approach to consistency enforcement will be discussed in this paper. The basic idea is to replace inconsistent operations by new consistent ones, preserving at the same time the intention of the old ones. More precisely, this can be formalized by consistent specializations, where specialization is a specific partial order on operations defined via predicate transformers.<br />
It can be shown that greatest consistent specializations (GCSs) always exist and are compatible with conjunctions of invariants. Then under certain prerequisites the general construction of such GCSs is possible. In general, GCS construction can be embedded in refinement calculi and therefore strengthens the systematic development of correct programs.<br />
5.1 Introduction<br />
Invariants provide an excellent way to achieve declarativity in formal specifications. Therefore, almost all commonly used specification languages such as VDM [4, 14], Z [26, 27] and B [1, 2] as well as research prototypes [20] allow at least static invariants to be defined, i.e. conditions that have to be satisfied by all states. Then consistency of an operation¹ S with respect to the specified (static) invariant means that S transforms consistent states only into consistent ones. More generally, transition invariants restrict the allowed pairs of initial and final states for operations S, and dynamic constraints restrict the allowed sequences of states [16, 17].<br />
Especially in the context of data-intensive application systems, where the final implementation will make use of persistent data stored in databases, most of the application semantics is expressible by static and dynamic integrity constraints, which is just another word for invariants [16, 17, 20, 24, 25, 28].<br />
Consistency proofs are therefore an inherent and important task within the development of correct programs. However, as pointed out in [3] there is a fundamental difference in the way invariants are handled in VDM specifications (and similarly Z specifications) and specifications in B. The predicative style in the former languages allows invariants to be considered as part of the specification, hence finding a correct program that satisfies the specification is left to refinement. On the other hand, the axiomatic semantics associated with B operations in the style of Dijkstra [9, 12, 19] enables the definition of consistency proof obligations [1, 8, 20] in a suitable logic. At first glance the VDM and Z approach seems to be advantageous, because it avoids significant verification efforts.<br />
In the authors' view, industrial applicability and acceptance of formal methods can only be expected if the whole refinement process is taken into consideration. Starting from a high-level specification the application of provably correct refinement steps should not stop before a formal specification is reached that is equivalent to an executable program. Then the automatic derivation of such a program should be possible as demonstrated in [13, 24]. As a consequence, proving consistency is an unavoidable problem, since at least once in the refinement process we shall leave the ground of purely predicative specifications [18, 24].<br />
¹ To be precise, we should write operation specification to emphasize the independence from the implementation, but throughout the paper we drop this distinction.<br />
The B approach allows static invariants to be specified and proof obligations to be derived. The consistency verification problem can be approached by using theorem provers or proof assistants, but the burden of writing consistent specifications is left to the user. In the context of database transactions the same applies to the extended Boyer-Moore approach in [25]. Hence the problem is to assist the user in this task and to provide solid and theoretically founded techniques for consistency enforcement as an alternative to verification. This problem is investigated in this paper.<br />
Of course, we cannot expect to obtain a panacea for the development of correct programs, since any approach to consistency enforcement must rely on certain assumptions. We must take the specified operations and invariants as fixed, i.e., the specified operations and invariants reflect exactly the intention of the user. Otherwise enforcement may produce an undesired new operation. In any case, just as for the results of verification, specifications resulting from consistency enforcement may be used to give some feedback to the specifying user and may encourage changes to a specification.<br />
5.1.1 The Consistency Enforcement Problem<br />
Given an operation $S$ and an invariant $I$ the basic idea is to replace $S$ by a new operation $S_I$ which is consistent with respect to $I$. Since this alone is not a sufficient property because of its independence from $S$, we claim that $S_I$ should be as close to $S$ as possible. The first problem is to find a suitable notion for "close". The intuition behind our work is that each operation has an "effect", i.e. performs certain state changes, and $S_I$ should "preserve the effect" of $S$.<br />
Whatever the definition of effect preservation will be, it should lead to a partial order $\sqsubseteq$ on operations. With respect to this partial order we have $S_I \sqsubseteq S$ and $S_I$ should be the greatest (consistent) operation with this property.<br />
In Section 5.2 we first look at a practical example in the relational model taken from [22] to motivate the construction of $S_I$ or more precisely of one of its deterministic branches. In Section 5.3 we recall fundamental features of state-based specifications with emphasis on predicate transformer semantics.<br />
This is used to characterize consistency by a formula in infinitary first-order logic. In the same spirit we may formalize operational specialization which defines a partial order on operations. Then it is natural to take this order for a formal definition of consistency enforcement. This leads directly to the notion of a greatest consistent specialization (GCS) which first appeared in [21] in an object oriented database context. The first results show the existence of GCSs and their compatibility with conjunctions, which allows consistency to be enforced step-by-step for any order of the invariants.<br />
5.1.2 The Problem of GCS Construction<br />
These results are not at all surprising, because we know that for each specification using<br />
guarded commands we always find an equivalent predicative specification and vice versa.<br />
Hence conjunction would be sufficient to find at least one solution. In fact, we may translate<br />
a specification into a predicative form, join it with the invariant and then translate back to<br />
obtain the GCS. For the case of specifications in the style of B this introduces unbounded<br />
choices and therefore destroys the "operational flavour" [3] of specifications with guarded<br />
commands. Therefore, we are looking for an approach to GCS construction which preserves<br />
this style.<br />
An insufficient alternative would be to consider just the basic operations within the specification<br />
$S$, i.e. assignments or skip, and to replace them by their GCSs. In some cases this<br />
leads to over-specialization; in other cases we do not even get a specialization at all. The main<br />
result of this paper shows that under some technical prerequisites it is nevertheless possible to<br />
concentrate on basic operations. Replacing them by their GCSs in a given complex operation<br />
defines a new operation $S'_I$ which is specialized by $S_I$. We then get the GCS of the complex<br />
operation by adding a precondition. This fundamental result will be shown in Section 5.4, but<br />
parts of the rather lengthy proof of the "upper bound theorem" are shifted to the appendix.<br />
5.1.3 The Practicality of GCS Construction<br />
GCSs are in general non-deterministic. Their deterministic branches reflect several alternative<br />
strategies for consistency enforcement. We may therefore ask whether it is possible to<br />
construct such deterministic branches directly. In Section 5.4 we show that if we build $S'_I$ by<br />
replacing the basic operations in $S$ by specializations of GCSs, we still achieve specializations<br />
of GCSs. This gives a second compatibility result which is of particular interest for practical<br />
applications. Especially in data-intensive applications, where we deal with sets, we may want<br />
to choose deterministic GCS branches with minimized symmetric difference for set values in<br />
the initial and final state. More liberally, this can be formalized by subsumption-free branches<br />
of the GCS.<br />
This paper is a continuation of the work in [21], where the notion of GCS was introduced<br />
to give a solid theoretical basis for consistency enforcement, but there was no idea how<br />
to construct them. Especially in the context of data-intensive applications there are many<br />
competing approaches based on active rule management [6, 10, 11, 15, 29, 30], but in none of<br />
these has a complete definition of the problem been given.<br />
The main new result of this paper concerns the construction of GCSs. It is shown how this<br />
can be reduced to basic operations, for which GCSs must still be detected case by case. On<br />
this basis, the practical paper [23] indicates an efficient implementation based on linguistic<br />
reflection. In addition, the technical prerequisite of $I$-reducedness, which only occurs as a<br />
means for the proof of the main result, shows the limits of consistency enforcement. The main<br />
technical difficulty in the proof was to abstain from looking at specializations of $S$, but first<br />
to achieve a consistent generalization of $S_I$, which is in general not a specialization of $S$.<br />
5.2 A Motivating Example<br />
Let us first illustrate consistency enforcement in the relational model. For this we consider a<br />
small fragment of the example used in [10, 22].<br />
Recall that a relation schema is simply a set of attributes. Moreover, with each attribute<br />
$A$ in a relation schema $R$ we associate a data type $dom(A)$, but for our purposes here the data<br />
types are not important. A relational database schema $\mathcal{S}$ is a finite set of relation schemata.<br />
A tuple $t$ over a relation schema $R$ is a map $R \to \bigcup_{A \in R} dom(A)$ with $t(A) \in dom(A)$. We<br />
usually denote tuples as ordinary tuples with components named by the attributes. Sometimes,<br />
we even omit the attributes, assuming a fixed order on them. Then a relation over $R$<br />
is a finite set of such tuples. An instance of $\mathcal{S}$ associates with each relation schema $R \in \mathcal{S}$ a<br />
relation $r$ over $R$.<br />
5.2.1 Constraints in the Relational Model<br />
An integrity constraint over a database schema $\mathcal{S}$ is a formula<br />
$I \equiv P_1(x_1) \wedge \dots \wedge P_n(x_n) \Rightarrow Q_1(y_1) \vee \dots \vee Q_m(y_m)$,<br />
where the predicate symbols $P_i$, $Q_j$ either correspond to relation schemata $R \in \mathcal{S}$ or are<br />
comparison predicates (= 6=
WIRE<br />
wire_id | connection | wire_type | voltage | power<br />
4711 | HH-HB | Koax | 12 | 600<br />
4814 | HH-H | Tel | 12 | 600<br />
TUBE<br />
tube_id | connection | tube_type<br />
8314 | HH-H | GX44<br />
8511 | HH-HB | GX44<br />
023 | HB-H | T33<br />
CONNECTION<br />
connection | from | to<br />
HH-H | Hamburg | Hannover<br />
HH-HB | Hamburg | Bremen<br />
HB-H | Bremen | Hannover<br />
It is easy to see that this instance satisfies the constraints above.<br />
$\Box$<br />
5.2.2 Stepwise Consistency Enforcement<br />
With each relation schema $R$ we also associate basic update operations $insert_R(t)$ and<br />
$delete_R(t)$. If a tuple $t$ to be inserted already exists in the relation, the insert operation<br />
does nothing. If a tuple $t$ to be deleted does not exist, then a deletion also does nothing.<br />
Thus, these operations could also be written as assignments $R := R \cup \{t\}$ and $R := R - \{t\}$<br />
by slightly abusing the relation schemata as variables which take relations as values.<br />
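The idempotent behaviour of these two assignments can be sketched in Python. The sketch assumes, purely for illustration, that a relation is a frozenset of tuples; the helper names insert and delete are ours and do not come from the paper.<br />

```python
# Sketch: a relation is modelled as an immutable set of tuples
# (an assumption of this example, not a definition from the text).
def insert(relation, t):
    """R := R ∪ {t}; leaves R unchanged if t is already present."""
    return relation | {t}


def delete(relation, t):
    """R := R − {t}; leaves R unchanged if t is absent."""
    return relation - {t}


# The TUBE relation of the example instance, ids kept as strings.
TUBE = frozenset({
    ("8314", "HH-H", "GX44"),
    ("8511", "HH-HB", "GX44"),
    ("023", "HB-H", "T33"),
})
```

Inserting a tuple that is already in TUBE, or deleting one that is not, returns a relation equal to TUBE.<br />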
Example 5.2. Consider the schema $\mathcal{S}$ from Example 5.1 and the operation $insert_{WIRE}(t)$.<br />
This may lead to a violation of constraint $ID_1$, in which case we must add a tuple to TUBE.<br />
Hence it can be replaced by<br />
WIRE := WIRE ∪ {t} ;<br />
IF connection(t) ∉ TUBE[connection]<br />
  THEN TUBE := TUBE ∪ {(?, connection(t), ?)}<br />
ENDIF<br />
Here the question marks stand for arbitrarily chosen values of the corresponding data type.<br />
Similarly, we may replace $delete_{TUBE}(t)$ by<br />
TUBE := TUBE − {t} ;<br />
IF connection(t) ∈ WIRE[connection] − TUBE[connection]<br />
  THEN WIRE := WIRE − {t′ | connection(t′) = connection(t)}<br />
ENDIF<br />
In order to enforce $FD_2$ we may then replace $insert_{TUBE}(t)$ by<br />
IF ∀t′ ∈ TUBE . tube_id(t) ≠ tube_id(t′)<br />
  THEN TUBE := TUBE ∪ {t}<br />
ENDIF<br />
Let us now add the exclusion constraint $ED \equiv WIRE[wire\_id] \parallel TUBE[tube\_id]$. In order to<br />
enforce this constraint, insertions into one of WIRE or TUBE should be followed by deletions<br />
in the other. The resulting operations are<br />
WIRE := WIRE ∪ {t} ;<br />
TUBE := TUBE − {t′ | tube_id(t′) = wire_id(t)}<br />
and<br />
TUBE := TUBE ∪ {t} ;<br />
WIRE := WIRE − {t′ | wire_id(t′) = tube_id(t)} .<br />
If we now take together $FD_2$, $ID_1$ and $ED$ we must be very careful. E.g., if we execute<br />
$insert_{WIRE}$(8511,HH-HB,Koax,12,600) on the instance above, we may first delete the tuple<br />
(8511,HH-HB,GX44) in TUBE in order to enforce $ED$ and then the two tuples<br />
(4711,HH-HB,Koax,12,600) and (8511,HH-HB,Koax,12,600) in WIRE in order to enforce $ID_2$. The resulting<br />
instance would be (omitting CONNECTION):<br />
WIRE<br />
wire_id | connection | wire_type | voltage | power<br />
4814 | HH-H | Tel | 12 | 600<br />
TUBE<br />
tube_id | connection | tube_type<br />
8314 | HH-H | GX44<br />
023 | HB-H | T33<br />
Thus, the "effect" of the original operation, i.e. insertion of a tuple into WIRE, is completely<br />
destroyed. The new effect is a deletion in WIRE and TUBE.<br />
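The destructive cascade just described can be replayed in a few lines of Python. This is only an illustrative sketch: relations are plain sets of tuples with fixed attribute positions, CONNECTION is omitted, and the function name naive_insert_wire is hypothetical.<br />

```python
# Illustrative only: relations are sets of tuples, attribute positions are fixed
# (wire: id, connection, type, voltage, power; tube: id, connection, type).
WIRE = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
TUBE = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}


def naive_insert_wire(wire, tube, t):
    """Insert t into WIRE, then repair the constraints purely by deletions."""
    wire = wire | {t}
    # Exclusion constraint: wire ids and tube ids must be disjoint,
    # so every tube whose id clashes with the new wire is dropped.
    tube = {u for u in tube if u[0] != t[0]}
    # Inclusion constraint: every wire connection needs a tube,
    # so wires whose connection has lost its tube are dropped as well.
    tube_connections = {u[1] for u in tube}
    wire = {w for w in wire if w[1] in tube_connections}
    return wire, tube


new_wire, new_tube = naive_insert_wire(
    WIRE, TUBE, ("8511", "HH-HB", "Koax", 12, 600)
)
```

After the call, new_wire contains only the wire 4814 and new_tube only the tubes 8314 and 023, so the inserted tuple is gone again.<br />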
The alternative works as follows: We start with the $insert_{WIRE}$ operation and replace<br />
it by the one above used to enforce $ID_1$. The resulting operation involves an insertion into<br />
TUBE. Next "enforce" $FD_2$ by replacing $insert_{TUBE}$. The resulting complex operation now<br />
involves $insert_{WIRE}$ and $insert_{TUBE}$. These are both replaced in order to "enforce" $ED$. We<br />
obtain the following operation:<br />
WIRE := WIRE ∪ {t} ;<br />
TUBE := TUBE − {t′ | tube_id(t′) = wire_id(t)} ;<br />
IF connection(t) ∉ TUBE[connection]<br />
  THEN SELECT i ∉ TUBE[tube_id] ∪ WIRE[wire_id] ;<br />
    TUBE := TUBE ∪ {(i, connection(t), ?)}<br />
ENDIF<br />
However, this operation is not consistent with respect to $ID_1$, which we enforced before. We<br />
therefore add a precondition which holds exactly in those cases where previous enforcement<br />
steps are preserved. This condition is wire_id(t) ∉ TUBE[tube_id]. Then the final operation<br />
will be<br />
IF wire_id(t) ∉ TUBE[tube_id]<br />
  THEN WIRE := WIRE ∪ {t} ;<br />
    IF connection(t) ∉ TUBE[connection]<br />
      THEN SELECT i ∉ TUBE[tube_id] ∪ WIRE[wire_id] ;<br />
        TUBE := TUBE ∪ {(i, connection(t), ?)}<br />
    ENDIF<br />
  ELSE fail<br />
ENDIF<br />
Here fail is used to express undefinedness. If the condition in the IF-clause is not satisfied,<br />
the whole operation will be rejected.<br />
$\Box$<br />
Example 5.2 reflects exactly the construction of a GCS branch. The presentation so far is<br />
completely informal. In the following sections we shall justify this approach. We shall see that<br />
the chosen order of the constraints is not important. We shall also see that the precondition<br />
arises naturally from the specialization order.<br />
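As a rough illustration, one deterministic branch of the final operation of Example 5.2 might be written in Python as follows. The sketch is an assumption-laden reading of the text, not the authors' implementation: the SELECT and the "?" are resolved by the caller through the hypothetical parameters fresh_id and tube_type, fail is modelled by a hypothetical exception class Fail, and CONNECTION is ignored.<br />

```python
class Fail(Exception):
    """Models `fail`: the operation is undefined and the update is rejected."""


def insert_wire(wire, tube, t, fresh_id, tube_type=None):
    """One deterministic branch of the final operation (sketch).

    fresh_id stands for the value chosen by SELECT, tube_type for the '?'.
    """
    # Precondition: wire_id(t) must not occur in TUBE[tube_id].
    if t[0] in {u[0] for u in tube}:
        raise Fail("wire_id(t) already occurs as a tube_id")
    wire = wire | {t}
    # connection(t) not in TUBE[connection]: a tube must be added.
    if t[1] not in {u[1] for u in tube}:
        used_ids = {u[0] for u in tube} | {w[0] for w in wire}
        if fresh_id in used_ids:
            raise ValueError("fresh_id must not occur in TUBE[tube_id] or WIRE[wire_id]")
        tube = tube | {(fresh_id, t[1], tube_type)}
    return wire, tube


WIRE = {("4711", "HH-HB", "Koax", 12, 600), ("4814", "HH-H", "Tel", 12, 600)}
TUBE = {("8314", "HH-H", "GX44"), ("8511", "HH-HB", "GX44"), ("023", "HB-H", "T33")}
```

With these relations, inserting a wire with id 8511 raises Fail, whereas inserting a wire with an unused id keeps the inserted tuple and adds a tube only if its connection has none yet.<br />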
5.3 Fundamental Features of State-Based Specifications<br />
In the following consider a specification to consist of a state space, invariants and operations.<br />
A state space is simply a collection of typed state variables, where the types are assumed to be<br />
sets. Operations on these sets are defined by functions. Invariants are defined by formulae in<br />
an underlying logic $\mathcal{L}$. Finally, operations will be specified by generalized guarded commands.<br />
5.3.1 Formal Specifications with Guarded Commands<br />
In the following assume a fixed many-sorted, infinitary, first-order logic $\mathcal{L}$ and a fixed interpretation<br />
structure $(D, \omega)$, where $D = \bigcup_{T\ \mathrm{type}} T$ is a set (semantic domain) and $\omega$ assigns<br />
type-compatible functions $\omega(f) : T_1 \times \dots \times T_n \to T$ and $\omega(p) : T_1 \times \dots \times T_n \to \{true, false\}$<br />
to $n$-ary function symbols $f$ and $n$-ary predicate symbols $p$ respectively. Then $\omega$ can be extended<br />
in the usual way to the terms and formulae of $\mathcal{L}$, and we may assume the domain<br />
closure property, i.e. for each $d \in D$ there exists some closed term $t$ in $\mathcal{L}$ with $\omega(t) = d$.<br />
Definition 5.1. (i) A state space $X$ is a finite set of variables of $\mathcal{L}$ such that for each $x \in X$<br />
there is an associated type $T_x$. We write $x :: T_x$.<br />
(ii) A (static) invariant on a state space $X$ (for short: $X$-invariant) is a formula $I$ of $\mathcal{L}$ with<br />
free variables in $X$ ($fr(I) \subseteq X$).<br />
(iii) A transition invariant $J$ on a state space $X$ is a formula of $\mathcal{L}$ with $fr(J) \subseteq X \cup X'$,<br />
where $X'$ is a disjoint copy of $X$.<br />
(iv) Given a state space $X$, a state on $X$ is a type-compatible variable assignment $x \mapsto<br />
\sigma(x) \in T_x$ for each $x \in X$.<br />
Let $\Sigma$ denote the set of all states. Clearly, states $\sigma \in \Sigma$ are sufficient to interpret $X$-invariants,<br />
whereas state pairs $(\sigma, \tau) \in \Sigma \times \Sigma$ suffice to interpret transition invariants. We use the<br />
notations $\models_\sigma$ and $\models_{(\sigma,\tau)}$ in these cases. The disjoint copy $X'$ of the state space $X$ for transition<br />
invariants is used to distinguish between the values in initial and final states respectively.<br />
Example 5.3. Let $\mathbb{Z}$ denote the set of integers. Consider the state space $X = \{x_1, x_2\}$ where<br />
$T_{x_1} = T_{x_2}$ is the set of finite subsets of the cartesian product $\mathbb{Z} \times \mathbb{Z}$. In addition, for $i = 1, 2$<br />
we have projection functions $\pi_i : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$. By abuse of notation we also use $\pi_i$ to denote<br />
the elementwise lifting to set arguments, resulting in a finite set of integers. Moreover, consider<br />
the static invariants:<br />
$I_1 \equiv \pi_1(x_1) \subseteq \pi_1(x_2)$<br />
$I_2 \equiv \forall x, y :: \mathbb{Z} \times \mathbb{Z} .\; x \in x_2 \wedge y \in x_2 \wedge \pi_2(x) = \pi_2(y) \Rightarrow \pi_1(x) = \pi_1(y)$<br />
$I_3 \equiv \pi_2(x_1) \cap \pi_2(x_2) = \emptyset$<br />
$\Box$<br />
Note that Example 5.3 captures the essentials of Examples 5.1 and 5.2.<br />
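The three invariants of Example 5.3 can be read as executable predicates. The following Python sketch assumes that $I_1$ is an inclusion, $I_2$ a functional dependency on $x_2$ and $I_3$ a disjointness condition, and that the values of $x_1$ and $x_2$ are ordinary Python sets of integer pairs; the function names are ours.<br />

```python
def pi(i, pairs):
    """Elementwise projection pi_i (i = 1 or 2) of a set of integer pairs."""
    return {p[i - 1] for p in pairs}


def I1(x1, x2):
    """Inclusion: every first component in x1 also occurs in x2."""
    return pi(1, x1) <= pi(1, x2)


def I2(x1, x2):
    """Within x2 the second component determines the first."""
    return all(x[0] == y[0] for x in x2 for y in x2 if x[1] == y[1])


def I3(x1, x2):
    """Exclusion: second components of x1 and x2 are disjoint."""
    return not (pi(2, x1) & pi(2, x2))
```

For instance, the state with $x_1 = \{(1, 10)\}$ and $x_2 = \{(1, 20), (2, 21)\}$ satisfies all three predicates.<br />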
For operations we use guarded commands in the style of Dijkstra and Nelson [9, 12, 19, 24]<br />
including partiality and recursion. We dispense with more sophisticated constructs such as<br />
the dovetail operator $\nabla$ [5], since fairness is beyond the scope of this piece of work.<br />
Definition 5.2. Let $X$ be some state space. An operation $S$ on $X$ consists of a set of input parameters<br />
$\{\iota_1, \dots, \iota_k\}$, a set of output parameters $\{o_1, \dots, o_l\}$ and a body. To each input parameter<br />
$\iota_i$ corresponds a type $I_i$ and to each output parameter $o_j$ corresponds a type $O_j$.<br />
The body of $S$ is a guarded command, i.e. it is recursively built from the following constructs:<br />
(i) assignment $x := E$, where $x$ is a state variable in $X$, an output parameter or a local<br />
variable within $S$ and $E$ is a term of the same type as $x$,<br />
(ii) skip, fail, loop,<br />
(iii) sequential composition $S_1 ; S_2$, choice $S_1 \Box S_2$, unbounded choice $@ x :: T \bullet S$, guard $P \to S$<br />
and restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable of type<br />
$T$, and<br />
(iv) the least fixpoint operator $\mu S . f(S)$, where $f(S)$ is an expression built as above using in<br />
addition the operation variable $S$.<br />
We usually write $o_1, \dots, o_l \leftarrow S(\iota_1, \dots, \iota_k)$.<br />
Let us first explain the informal (and rather procedural) meaning of guarded commands.<br />
Each operation may be partial, i.e. it is undefined on a subset of $\Sigma$, and it is in general<br />
non-deterministic, i.e. starting in an initial state $\sigma$ may result in more than one final state $\tau$,<br />
where $\tau$ may also be $\infty$ to denote non-termination.<br />
Then the informal meaning of assignments, sequences and skip is the obvious one. Choice<br />
means to arbitrarily select one of the operations, if it is defined. The intention behind the<br />
unbounded choice is to introduce a new variable $x$ of the given type $T$, not occurring in $X$,<br />
and to execute $S$ on the extended state space $X \cup \{x\}$. A guard $P \to S$ gives a precondition<br />
$P$ for $S$. If $P$ is not satisfied, the whole operation is undefined. Restricted choice $S \boxtimes T$ means<br />
to execute $S$ unless it is undefined, in which case $T$ is taken.<br />
The basic operations fail and loop are only introduced for theoretical completeness: fail<br />
is always undefined, and loop never terminates, but loop is the least element in the Nelson<br />
order $\preceq$, hence is fundamental for recursion, whereas fail will occur as the least element in<br />
the specialization order $\sqsubseteq$. Recall that the Nelson order is defined by $T \preceq S$ iff<br />
$wp(T)(R) \Rightarrow wp(S)(R)$ and $wlp(S)(R) \Rightarrow wlp(T)(R)$ (5.35)<br />
hold for all $X$-invariants $R$ [19]. The definition of the specialization order $\sqsubseteq$ will occur in<br />
Definition 5.6.<br />
Example 5.4. Let us extend Example 5.3 by some operations. Define $S(a, b :: \mathbb{Z})$ by $x_1 :=<br />
x_1 \cup \{(a, b)\}$ and $S'(a, b :: \mathbb{Z})$ by<br />
$x_1 := x_1 \cup \{(a, b)\} \; ; \; \bigl( (a \notin map(\pi_1)(x_2) \to @ c :: \mathbb{Z} \bullet x_2 := x_2 \cup \{(a, c)\}) \boxtimes skip \bigr)$ .<br />
$S$ inserts a new pair $(a, b)$ into the set value of $x_1$, and $S'$ contains an additional insertion<br />
into $x_2$. Thus, $S'$ is a sequence $S ; T$, where $T$ compensates a violation of the invariant $I_1$.<br />
$T$ itself has the form $U \boxtimes skip$. The reason for the skip is that $U$ is a guard and thus<br />
is partial. Omitting the skip would lead to $S'$ being undefined in case of no violation of $I_1$,<br />
whilst now $S'$ coincides with $S$ in that case.<br />
$\Box$<br />
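To make the informal reading of the constructs concrete, here is a small Python sketch of Example 5.4. It models an operation as a function from a state to its set of possible outcomes, with a marker INF for non-termination and the empty set for undefinedness. All names are ours, and the unbounded choice over all integers is cut down to a finite list of candidates, which is a simplifying assumption.<br />

```python
INF = "∞"  # marker for non-termination; the empty outcome set means "undefined"

# A state is a pair (x1, x2) of frozensets of integer pairs.

def skip(s):
    return {s}

def seq(p, q):
    """S1 ; S2: run q on every terminating outcome of p."""
    def run(s):
        out = set()
        for t in p(s):
            out |= {INF} if t == INF else q(t)
        return out
    return run

def guard(cond, p):
    """P -> S: undefined (no outcome) wherever cond fails."""
    return lambda s: p(s) if cond(s) else set()

def restricted(p, q):
    """Restricted choice: behave like p unless p is undefined, then like q."""
    return lambda s: p(s) or q(s)

def unbounded(values, body):
    """@ c . body(c), restricted here to a finite range of candidate values."""
    return lambda s: set().union(*(body(v)(s) for v in values))

def S(a, b):
    """x1 := x1 ∪ {(a, b)}"""
    return lambda s: {(s[0] | {(a, b)}, s[1])}

def S_prime(a, b, candidates):
    add_x2 = lambda c: (lambda s: {(s[0], s[1] | {(a, c)})})
    U = guard(lambda s: a not in {p[0] for p in s[1]}, unbounded(candidates, add_x2))
    return seq(S(a, b), restricted(U, skip))
```

If a already occurs as a first component in $x_2$, S_prime yields exactly the outcome of S; otherwise it yields one outcome per candidate value c.<br />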
We allow types to be omitted if they are known from the context or if they are not necessary<br />
for the understanding. Moreover, we allow cascaded unbounded choices $@ x_1 \bullet @ \dots @ x_n \bullet S$<br />
to be abbreviated by $@ x_1, \dots, x_n \bullet S$.<br />
5.3.2 Axiomatic Semantics<br />
In general, we may describe the semantics of an operation $S$ simply by a set $\Delta(S) \subseteq \Sigma \times<br />
(\Sigma \cup \{\infty\})$, where $\infty$ is the special symbol used to indicate non-termination. Since nobody<br />
wants to define the semantics of operations by explicit enumeration of all state pairs, we are<br />
looking for an equivalent logical characterization.<br />
Let $R$ be an $X$-invariant and consider the set of states $\Sigma_R = \{\sigma \in \Sigma \mid \models_\sigma R\}$ satisfying $R$.<br />
If we take $R$ as a postcondition for an operation $S$, we want to associate with it the weakest<br />
(liberal) precondition of $S$ to establish $R$. Informally these conditions can be characterized as<br />
follows:<br />
- $wlp(S)(R)$ characterizes those initial states $\sigma$ such that all terminating executions of $S$ will<br />
reach a final state $\tau$ characterized by $R$, i.e. $\models_\sigma wlp(S)(R)$ holds iff for all $\tau \in \Sigma$ with<br />
$(\sigma, \tau) \in \Delta(S)$ we have $\models_\tau R$, and<br />
- $wp(S)(R)$ characterizes those initial states $\sigma$ such that all executions of $S$ terminate and<br />
will reach a final state $\tau$ characterized by $R$, i.e. $\models_\sigma wp(S)(R)$ holds iff for all $(\sigma, \tau) \in \Delta(S)$<br />
we have $\tau \neq \infty$ and $\models_\tau R$.<br />
Thus, $wlp(S)$ and $wp(S)$ are functions from $X$-invariants to $X$-invariants, which are usually<br />
called predicate transformers. It can be shown that these predicate transformers always exist<br />
and satisfy<br />
$wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(true)$ and (5.36)<br />
$wlp(S)\bigl(\bigwedge_{i \in I} R_i\bigr) \Leftrightarrow \bigwedge_{i \in I} wlp(S)(R_i)$ . (5.37)<br />
Call (5.36) the pairing condition and (5.37) the universal conjunctivity property.<br />
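On a finite state space both transformers can be computed directly from the relational semantics, and the two properties can be checked by enumeration. The Python sketch below does this for one toy operation; the three-element state space, the table DELTA and the representation of a predicate by the set of states satisfying it are assumptions made for the illustration.<br />

```python
from itertools import chain, combinations

INF = "∞"            # marker for non-termination
STATES = (0, 1, 2)   # a toy state space

# Relational semantics of a toy operation: state -> set of possible outcomes.
# State 0 is non-deterministic, state 1 may diverge, state 2 is undefined.
DELTA = {0: {1, 2}, 1: {INF, 0}, 2: set()}


def wlp(delta, R):
    """States from which every terminating execution ends in R."""
    return {s for s in STATES if all(t in R for t in delta[s] if t != INF)}


def wp(delta, R):
    """States from which every execution terminates and ends in R."""
    return {s for s in STATES if all(t != INF and t in R for t in delta[s])}


def subsets(xs):
    """All predicates over xs, each given by its set of satisfying states."""
    return [set(c) for c in chain.from_iterable(
        combinations(xs, k) for k in range(len(xs) + 1))]


TRUE = set(STATES)
pairing_holds = all(
    wp(DELTA, R) == wlp(DELTA, R) & wp(DELTA, TRUE) for R in subsets(STATES))
conjunctivity_holds = all(
    wlp(DELTA, R1 & R2) == wlp(DELTA, R1) & wlp(DELTA, R2)
    for R1 in subsets(STATES) for R2 in subsets(STATES))
```

Only binary conjunctions are checked here, which on a finite space is enough to illustrate (5.37) for non-empty index sets.<br />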
The existence proof is based on the assumption that $\mathcal{L}$ is an infinitary logic and the domain<br />
closure assumption. The latter allows us, for a given state $\sigma$, to find a characterizing predicate $P_\sigma$,<br />
i.e. we have $\models_\tau P_\sigma$ iff $\tau = \sigma$.<br />
Furthermore, the former property then allows us to write $R \Leftrightarrow \bigvee_{\sigma \in \Sigma_R} P_\sigma$, which is used<br />
to prove that the pairing condition and the universal conjunctivity are sufficient to define<br />
operations, i.e., the definition of predicate transformers $wlp(S)$ and $wp(S)$ satisfying (5.36)<br />
and (5.37) is equivalent to the definition of $\Delta(S)$ [9, 12, 19, 24].<br />
Let us define the semantics of operations axiomatically via predicate transformers. The<br />
proofs of universal conjunctivity and the pairing condition are straightforward [19].<br />
Definition 5.3. Let $S$, $S_1$, $S_2$ be guarded commands on some state space $X$, $T$ some type,<br />
$E(x)$ some term of type $T_x$ and let $x \in X$ and $y \notin X$. Then we have for any formula $R$ of $\mathcal{L}$:<br />
$wlp(skip)(R) \Leftrightarrow wp(skip)(R) \Leftrightarrow R$ (5.38)<br />
$wlp(fail)(R) \Leftrightarrow wp(fail)(R) \Leftrightarrow true$ (5.39)<br />
$wlp(loop)(R) \Leftrightarrow true$ ,<br />
$wp(loop)(R) \Leftrightarrow false$ (5.40)<br />
$wlp(x := E)(R) \Leftrightarrow wp(x := E)(R) \Leftrightarrow \{x/E\}.R$ (5.41)<br />
where $\{x/E\}.R$ denotes the substitution of the variable $x$ in $R$ by the expression $E$,<br />
$wlp(S_1 ; S_2)(R) \Leftrightarrow wlp(S_1)(wlp(S_2)(R))$ ,<br />
$wp(S_1 ; S_2)(R) \Leftrightarrow wp(S_1)(wp(S_2)(R))$ (5.42)<br />
$wlp(P \to S)(R) \Leftrightarrow P \Rightarrow wlp(S)(R)$ ,<br />
$wp(P \to S)(R) \Leftrightarrow P \Rightarrow wp(S)(R)$ (5.43)<br />
$wlp(S_1 \Box S_2)(R) \Leftrightarrow wlp(S_1)(R) \wedge wlp(S_2)(R)$ ,<br />
$wp(S_1 \Box S_2)(R) \Leftrightarrow wp(S_1)(R) \wedge wp(S_2)(R)$ (5.44)<br />
$wlp(S_1 \boxtimes S_2)(R) \Leftrightarrow wlp(S_1)(R) \wedge (wp(S_1)(false) \Rightarrow wlp(S_2)(R))$ ,<br />
$wp(S_1 \boxtimes S_2)(R) \Leftrightarrow wp(S_1)(R) \wedge (wp(S_1)(false) \Rightarrow wp(S_2)(R))$ (5.45)<br />
$wlp(@ y :: T \bullet S)(R) \Leftrightarrow \forall y :: T . wlp(S)(R)$ ,<br />
$wp(@ y :: T \bullet S)(R) \Leftrightarrow \forall y :: T . wp(S)(R)$ (5.46)<br />
$wlp(\mu S . f(S))(R) \Leftrightarrow \bigwedge_{\alpha} wlp(f^{\alpha}(loop))(R)$ and<br />
$wp(\mu S . f(S))(R) \Leftrightarrow \bigvee_{\alpha} wp(f^{\alpha}(loop))(R)$ (5.47)<br />
where $\alpha$ ranges over the ordinal numbers.<br />
The recursive guarded command $\mu S . f(S)$ is the least fixpoint of $f$ with respect to the Nelson<br />
order $\preceq$ defined in (5.35). This requires the constructors in<br />
Definition 5.2 to be monotonic with respect to this order, which can be easily proven [19].<br />
Note that operations may only affect parts of the state space. For consistency enforcement<br />
it will be necessary to "extend" such operations. Therefore, we need to know for each operation<br />
$S$ the subspace $Y \subseteq X$ such that $S$ does not change the values in $X - Y$. In this case we call<br />
$S$ a $Y$-operation on $X$. A formal definition is the following.<br />
Definition 5.4. Let $X$ be a state space and $S$ an operation on $X$. $S$ is a $Y$-operation for<br />
$Y \subseteq X$ iff $wlp(S)(R) \Leftrightarrow R$ and $wp(S)(R) \Leftrightarrow R$ hold for each $(X - Y)$-invariant $R$ and $Y$ is<br />
minimal with this property.<br />
Note that for each operation $S$ on $X$ there is always a $Y \subseteq X$ such that $S$ is a $Y$-operation.<br />
Let us now give a characterization for deterministic operations. For this we need the notion<br />
of the dual or conjugate predicate transformers $wlp(S)^*$ and $wp(S)^*$ which are defined by<br />
$wlp(S)^*(R) \equiv \neg wlp(S)(\neg R)$ and $wp(S)^*(R) \equiv \neg wp(S)(\neg R)$ . (5.48)<br />
Definition 5.5. An operation $S$ on the state space $X$ is called deterministic iff $wlp(S)^*(R) \Rightarrow<br />
wp(S)(R)$ holds for all $X$-invariants $R$.<br />
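Definition 5.5 can be tested mechanically on a finite state space. The following Python sketch assumes the relational reading of operations (a table from states to outcome sets, with INF for non-termination) and represents predicates by sets of states; the names and toy operations are ours.<br />

```python
from itertools import chain, combinations

INF = "∞"
STATES = {0, 1, 2}


def subsets(xs):
    xs = list(xs)
    return [set(c) for c in chain.from_iterable(
        combinations(xs, k) for k in range(len(xs) + 1))]


def wlp(delta, R):
    return {s for s in STATES if all(t in R for t in delta[s] if t != INF)}


def wp(delta, R):
    return {s for s in STATES if all(t != INF and t in R for t in delta[s])}


def wlp_conj(delta, R):
    """Conjugate transformer of (5.48): not wlp(S)(not R)."""
    return STATES - wlp(delta, STATES - R)


def is_deterministic(delta):
    """Definition 5.5: wlp(S)*(R) implies wp(S)(R) for every predicate R."""
    return all(wlp_conj(delta, R) <= wp(delta, R) for R in subsets(STATES))


DET = {0: {1}, 1: {INF}, 2: set()}           # at most one outcome per state
NONDET = {0: {1, 2}, 1: set(), 2: set()}     # two final states from state 0
MAY_LOOP = {0: {1, INF}, 1: set(), 2: set()} # may terminate or diverge from 0
```

In this reading an operation passes the test exactly when no state has two different outcomes, where $\infty$ counts as an outcome.<br />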
5.3.3 Consistency and Specialization<br />
General invariants and arbitrary operations on a state space $X$ raise the problem whether<br />
consistency as defined by the invariants is always satisfied by the operations. One approach<br />
to address this problem is to use general verification techniques, i.e. to derive (and prove)<br />
general proof obligations in the predicate transformer calculus. Let us first express these<br />
proof obligations.<br />
Definition 5.6. Let $X = \{x_1, \dots, x_n\}$ be a state space, $Z \subseteq Y \subseteq X$ subspaces, $I$ an $X$-invariant,<br />
$J$ a transition invariant, $S$ a $Z$-operation and $T$ a $Y$-operation. Then<br />
(i) $S$ is consistent with respect to $I$ iff $I \Rightarrow wlp(S)(I)$ holds,<br />
(ii) $T$ specializes² $S$ iff $wp(S)(true) \Rightarrow wp(T)(true)$ and $wlp(S)(R) \Rightarrow wlp(T)(R)$ hold for all<br />
$Z$-invariants $R$ (denoted $T \sqsubseteq S$), and<br />
(iii) $S$ is consistent with respect to $J$ iff $S \sqsubseteq J$ holds, where the operation $J$ is defined as<br />
$loop \;\Box\; @ x'_1, \dots, x'_n \bullet J \to x_1 := x'_1 ; \dots ; x_n := x'_n$ .<br />
Recall the intention behind these definitions. An $X$-invariant partitions $\Sigma$ into two disjoint<br />
subsets. If we consider the invariant $I$ we have $\Sigma = \Sigma_I \mathbin{\dot\cup} \Sigma_{\neg I}$. States not satisfying the<br />
invariant should never be reached, i.e. if $S$ is started in a state $\sigma$ satisfying $I$, it should only<br />
reach states $\tau$ also satisfying $I$, i.e., if we have $\sigma \in \Sigma_I$, then for all $\tau \in \Sigma$ with $(\sigma, \tau) \in \Delta(S)$<br />
we should always have $\tau \in \Sigma_I$. Recall from the introduction to this section that the set of all<br />
initial states $\sigma$ such that each terminating execution of $S$ reaches $\Sigma_I$ is $\Sigma_{wlp(S)(I)}$, i.e.,<br />
$\{\sigma \in \Sigma \mid (\sigma, \tau) \in \Delta(S) \text{ implies } \models_\tau I \text{ for all } \tau \in \Sigma\} = \Sigma_{wlp(S)(I)}$ .<br />
Hence we have the requirement $\Sigma_I \subseteq \Sigma_{wlp(S)(I)}$, which is equivalent to (i) [8].<br />
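As a small worked instance of condition (i), take the operation $S$ of Example 5.4, i.e. the assignment of $x_1 \cup \{(a,b)\}$ to $x_1$, and the invariant $I_1$ of Example 5.3, read here as the inclusion $\pi_1(x_1) \subseteq \pi_1(x_2)$ (this reading is our assumption). Applying the assignment rule (5.41) gives:<br />

```latex
\begin{align*}
wlp(S)(I_1)
  &\Leftrightarrow \{x_1 / x_1 \cup \{(a,b)\}\}.\bigl(\pi_1(x_1) \subseteq \pi_1(x_2)\bigr)
     && \text{by (5.41)} \\
  &\Leftrightarrow \pi_1\bigl(x_1 \cup \{(a,b)\}\bigr) \subseteq \pi_1(x_2) \\
  &\Leftrightarrow \pi_1(x_1) \subseteq \pi_1(x_2) \;\wedge\; a \in \pi_1(x_2) \\
  &\Leftrightarrow I_1 \;\wedge\; a \in \pi_1(x_2)
\end{align*}
```

Hence $I_1 \Rightarrow wlp(S)(I_1)$ fails in every state that satisfies $I_1$ but has $a \notin \pi_1(x_2)$, so under this reading $S$ is not consistent with respect to $I_1$.<br />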
The intuition behind the definition of specialization is that whenever an execution of the<br />
specialized operation $T$ establishes some post-predicate $R$, then this execution should already<br />
be one of the general operation $S$. Clearly, $\sqsubseteq$ defines a partial order on operations.<br />
Each transition invariant may be regarded as a very general operation $J$ that allows any<br />
state pair $(\sigma, \tau)$ satisfying $J$. Hence transition consistency for an operation $S$ is equivalent to<br />
$S \sqsubseteq J$. The loop part in the definition of the operation $J$ gives $wp(J)(true) \Leftrightarrow false$, which allows us to<br />
consider only $wlp(J)(R) \Rightarrow wlp(S)(R)$ for all $Z$-invariants $R$.<br />
There exists an equivalent characterization of specialization, and hence also of transition<br />
consistency, that avoids the quantification over all $X$-invariants, but uses conjugate predicate<br />
transformers as defined in (5.48). The result in Proposition 5.7 is assumed to be commonly<br />
known. E.g., [7] mentions a similar result in the wp-calculus without proof. The proof is rather<br />
technical and can be done by simple direct calculations. Since we do not know of any reference<br />
for such a proof, we have added it in Appendix 5.6.<br />
Proposition 5.7. Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$. Let $Z =<br />
\{z_1, \dots, z_n\}$ be disjoint from $X$ with $T_{x_i} = T_{z_i}$. Then $wlp(S)(R) \Rightarrow wlp(T)(R)$ holds for all<br />
$X$-invariants $R$ iff<br />
$\{z_1/x_1, \dots, z_n/x_n\}.wlp(T')(wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n))$ (5.49)<br />
holds, where $T'$ results from $T$ by renaming all the variables $x_i$ to $z_i$.<br />
$\Box$<br />
² Some other authors would prefer to call $\sqsubseteq$ refinement. This is justified as long as refinement does not comprise<br />
the extension of specifications. This view of refinement as a more general methodological means underlies<br />
our overall work on formal methods, whereas we prefer the term specialization in this more restrictive<br />
context. From a technical point of view, we simply consider a partial order (with some nice properties) on<br />
operations.<br />
Note that the order of the substitution is irrelevant. Then $T \sqsubseteq S$ holds iff we have (5.49)<br />
and $wp(S)(true) \Rightarrow wp(T)(true)$. This is a result in its own right, which enables mechanical<br />
or even automatic verification. In the context of consistency enforcement (5.49) defines an<br />
$X$-invariant, say $P$. If we restrict the operations $S$ and $T$ to $\Sigma_P$, then $T$ would become a<br />
specialization of $S$. If we have $wp(S)(true) \Leftrightarrow wp(T)(true)$, this implies that $P \to T$ with $P$<br />
defined by (5.49) is the greatest common specialization of $S$ and $T$. This will later be used to<br />
find the precondition in the GCS.<br />
Corollary 5.8. Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$ with $wp(S)(true) \Leftrightarrow<br />
wp(T)(true)$. Define an $X$-invariant $P_{S,T}$ by (5.49). Then $P_{S,T} \to T$ is the greatest common<br />
specialization of $S$ and $T$.<br />
$\Box$<br />
In general the considered operations will be non-deterministic. Informally, non-determinism<br />
may be considered as gluing together infinitely many deterministic operations by a choice<br />
operator. Sometimes, however, we are interested only in these deterministic branches. We give<br />
a formal definition for this.<br />
Definition 5.9. Let $S$ and $T$ be operations on $X$ with $T \sqsubseteq S$ and $wp(T)^*(true) \Leftrightarrow wp(S)^*(true)$.<br />
If $T$ is deterministic, it is called a deterministic branch of $S$.<br />
If $T$ and $S$ are semantically equivalent to some $@ y_1 :: T_1, \dots, y_n :: T_n \bullet T'$ and<br />
$@ y_1 :: T_1, \dots, y_n :: T_n \bullet S'$ respectively such that $T'$ is a deterministic branch of $S'$, then we<br />
call $T$ a quasi-deterministic branch of $S$.<br />
5.3.4 Greatest Consistent Specializations<br />
Suppose now that we are given an operation $S$ and a static invariant $I$. Assume that $S$ is a $Y$-operation,<br />
whereas $I$ is defined on $X$ with $Y \subseteq X$. The idea is to construct a "new" operation<br />
$S_I$ that is consistent with respect to $I$ and can be used to replace $S$. Roughly speaking, this<br />
means that the effect of $S_I$ on the state variables in $X$ should not differ from the effect of<br />
$S$. Formally this is expressed by consistent specialization. Since there will be more than one<br />
such specialization, we choose the "greatest", i.e. all others should specialize it.<br />
Definition 5.10. Let $Y \subseteq X$ be state spaces, $S$ a $Y$-operation and $I$ an $X$-invariant. Then<br />
an operation $S_I$ on $X$ is called Greatest Consistent Specialization (GCS) of $S$ with respect to<br />
$I$ iff<br />
(i) $S_I \sqsubseteq S$ holds,<br />
(ii) $S_I$ is consistent with respect to $I$ and<br />
(iii) for each operation $T$ on $X$ satisfying properties (i) and (ii) (instead of $S_I$) we have $T \sqsubseteq S_I$.<br />
Example 5.5. Consider $S = \mathit{loop}$, which is already consistent with respect to any invariant<br />
$I$. Hence $S_I$ must also be loop.<br />
Similarly, $S = \mathit{fail}$ is consistent with respect to any invariant $I$, which gives $S_I = \mathit{fail}$.<br />
□<br />
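Definition 5.10 can be made concrete on a small model. The sketch below rests on assumptions that go beyond the text: $X = Y$ consists of one integer variable, an operation is a function from a state to (set of outcomes, may-diverge flag), and the GCS is computed pointwise by discarding inconsistent outcomes of consistent initial states. The guard `x >= a or x < 0` used for comparison is a reading of the partially lost formula in the following example, not a quotation.

```python
from typing import Callable, FrozenSet, Tuple

Op = Callable[[int], Tuple[FrozenSet[int], bool]]  # state -> (outcomes, may_diverge)


def gcs(op: Op, inv: Callable[[int], bool]) -> Op:
    """Pointwise GCS in the finite model with X = Y (an assumption of this sketch)."""
    def result(x: int):
        outs, div = op(x)
        if inv(x):
            # From a consistent state only consistent outcomes may remain.
            outs = frozenset(t for t in outs if inv(t))
        return outs, div              # divergence is left untouched
    return result


a = 3
inv = lambda x: x >= 0                                   # I: x >= 0
assign: Op = lambda x: (frozenset({x - a}), False)       # S = x := x - a
loop: Op = lambda x: (frozenset(), True)                 # never terminates
fail: Op = lambda x: (frozenset(), False)                # no outcome at all

# Assumed guarded form for comparison: (x >= a or x < 0) -> S
guarded: Op = lambda x: assign(x) if (x >= a or x < 0) else (frozenset(), False)

SAMPLE = range(-10, 11)
same_as_guarded = all(gcs(assign, inv)(x) == guarded(x) for x in SAMPLE)
loop_fixed = all(gcs(loop, inv)(x) == loop(x) for x in SAMPLE)
fail_fixed = all(gcs(fail, inv)(x) == fail(x) for x in SAMPLE)
print(same_as_guarded, loop_fixed, fail_fixed)
```

In this model loop and fail are left unchanged, as in Example 5.5, and the assignment only acquires a precondition.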
Example 5.6. Let $Z$ denote the set of integers. Take the state space $X = \{x\}$ with $x :: Z$<br />
and suppose the $X$-constraint $I \equiv x \ge 0$ and the $X$-operation $S = x := x - a$ for some<br />
constant $a \ge 0$. Then we have<br />
S I = (x a _ x
(i) holds, since<br />
$wlp(S)(R) \Leftrightarrow \{x/x - a\}.R$<br />
) (x a _ x
Moreover, due to the construction of $\mathcal{T}$ and the definition of specialization, $S'$ is the<br />
least upper bound of $\mathcal{T}$ with respect to $\sqsubseteq$, hence it must also be a specialization of $S$. On<br />
the other hand, from the consistency proof obligation and the construction of $\mathcal{T}$ we obtain<br />
immediately that $S'$ is consistent with respect to $I$. Hence $S' \in \mathcal{T}$ must hold, which proves<br />
that $S'$ is a GCS $S_I$ of $S$ with respect to $I$. This completes the existence proof.<br />
The uniqueness follows immediately, since each GCS $S'$ of $S$ with respect to $I$ must be<br />
the least upper bound of $\mathcal{T}$.<br />
□<br />
We observe that the GCS with respect to a conjunction of invariants can be built<br />
successively. Similarly, we obtain a trivial compatibility result with respect to further specialization.<br />
Both results, given in the next proposition, have already been proven in [21].<br />
Proposition 5.12. If $I_1$ and $I_2$ are static invariants on $X$, then for any operation $S$ on<br />
$Y \subseteq X$ the GCSs $(S_{I_1})_{I_2}$ and $S_{(I_1 \wedge I_2)}$ coincide on initial states satisfying $I_1 \wedge I_2$, i.e.,<br />
$I_1 \wedge I_2 \to (S_{I_1})_{I_2}$ and $I_1 \wedge I_2 \to S_{(I_1 \wedge I_2)}$ are semantically equivalent.<br />
For an $X$-invariant $I$ and a $Z$-operation $T \sqsubseteq S$ the GCS $T_I$ of $T$ with respect to $I$ is a<br />
specialization of $S_I$.<br />
□<br />
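The first part of Proposition 5.12 can be observed in a small model. The sketch assumes $X = Y$ with one integer variable, operations modelled as functions from a state to (set of outcomes, may-diverge flag), and a pointwise GCS; the operation and the two invariants are made up for illustration.

```python
def gcs(op, inv):
    """Pointwise GCS in a finite model with X = Y (an assumption of this sketch)."""
    def result(x):
        outs, div = op(x)
        if inv(x):
            outs = frozenset(t for t in outs if inv(t))
        return outs, div
    return result


# Hypothetical non-deterministic operation and invariants.
S = lambda x: (frozenset({x - 2, x + 1, x + 2}), False)
i1 = lambda x: x >= 0            # I_1
i2 = lambda x: x % 2 == 0        # I_2
both = lambda x: i1(x) and i2(x)

nested = gcs(gcs(S, i1), i2)     # (S_{I_1})_{I_2}
joint = gcs(S, both)             # S_{(I_1 and I_2)}

SAMPLE = range(-6, 7)
agree_on_consistent = all(nested(x) == joint(x) for x in SAMPLE if both(x))
differ_elsewhere = [x for x in SAMPLE if nested(x) != joint(x)]
print(agree_on_consistent, differ_elsewhere)
```

The two constructions agree on every sampled state satisfying both invariants and may differ elsewhere, which is why the proposition restricts the claim to initial states satisfying $I_1 \wedge I_2$.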
5.4 The Construction of GCSs<br />
The proof of the existence result in Proposition 5.11 is not constructive. Therefore, we have<br />
to find a way to construct the GCS of an operation with respect to some given invariant.<br />
For the basic operations loop and fail we have already computed their GCSs in Example<br />
5.5. For skip, which is a $\emptyset$-operation, we notice that each operation $T$ with $wp(T)(\mathit{true}) \Leftrightarrow \mathit{true}$<br />
is already a specialization. Hence we have to find the greatest consistent operation with this<br />
property. Informally, when starting in a state that satisfies $I$, the resulting state must also satisfy $I$.<br />
When starting in a state that violates $I$, we may reach any final state. For $X = \{x_1, \dots, x_n\}$<br />
this operation is given by<br />
$$(I \to (@x'_1, \dots, x'_n \bullet (\{x_1/x'_1, \dots, x_n/x'_n\}.I \to x_1 := x'_1 ; \dots ; x_n := x'_n)))$$<br />
$$\boxtimes\ (@x'_1, \dots, x'_n \bullet x_1 := x'_1 ; \dots ; x_n := x'_n).$$<br />
The required properties are easily checked 3 .<br />
For assignments, we assume a case-by-case analysis for selected classes of invariants. In a<br />
data-intensive context such work has been done in [24, 21], but the rule-based approach<br />
in [10] can also be exploited for this task.<br />
In general, however, an operation is complex, built up from the basic operations and the<br />
constructors in Definition 5.2. It would be fine if the GCS could be built just by replacing<br />
the involved basic operations by their GCSs, but in general this is wrong.<br />
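A small model shows why the naive replacement fails for sequences. The sketch assumes one integer variable with $X = Y$, operations modelled as functions from a state to (set of outcomes, may-diverge flag), a pointwise GCS, and the invariant $x \ge 0$ with a subtraction followed by an addition of the same constant; the helper names are hypothetical.

```python
def gcs(op, inv):
    """Pointwise GCS in a finite model with X = Y (an assumption of this sketch)."""
    def result(x):
        outs, div = op(x)
        if inv(x):
            outs = frozenset(t for t in outs if inv(t))
        return outs, div
    return result


def seq(first, second):
    """Sequential composition: run `second` from every outcome of `first`."""
    def result(x):
        outs, div = first(x)
        finals = set()
        for mid in outs:
            outs2, div2 = second(mid)
            finals |= outs2
            div = div or div2
        return frozenset(finals), div
    return result


a = 3
inv = lambda x: x >= 0                          # I: x >= 0
s1 = lambda x: (frozenset({x - a}), False)      # x := x - a
s2 = lambda x: (frozenset({x + a}), False)      # x := x + a

whole = gcs(seq(s1, s2), inv)                   # GCS of the sequence as a whole
naive = seq(gcs(s1, inv), gcs(s2, inv))         # sequence of the component GCSs

SAMPLE = range(-5, 8)
differing = [x for x in SAMPLE if whole(x) != naive(x)]
print(differing)
```

On the states where the intermediate value would violate the invariant, the naive composition has no outcome at all, although the sequence as a whole behaves like skip and is already consistent.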
Example 5.7. We have seen in Example 5.6 that GCSs may sometimes just arise from adding<br />
preconditions. Now let $X$ and $I$ be the same, but take<br />
$$S = S_1 ; S_2$$<br />
$$= x := x - a ; x := x + a$$<br />
for some integer $a \ge 0$. Clearly, $S$ is semantically equivalent to skip. As we have seen above,<br />
we obtain $wp(S_I)(\mathit{true}) \Leftrightarrow \mathit{true}$. However, if we replace $S_1$ and $S_2$ by their GCSs, i.e.<br />
3 Formally, this also follows from Lemma 5.27 in Appendix 5.7.<br />
(S 1 ) I = (x
(ii) For all states with j= <br />
I we have, if<br />
P )fx 1 =x 0 1::: x l =x 0 l g:(8 i(i =1:::m):fy 1 = 1 ::: y m = m g::I)<br />
is a -constra<strong>in</strong>t for S + 1 ,thenitisalsoa-constra<strong>in</strong>t forS+ 1 S 2.<br />
In both cases the evaluation order in the substitutions is not important.<br />
Example 5.8. Let us continue Example 5.7 with $X = \{x :: Z\}$, $I \equiv x \ge 0$, $S_1 = x := x - a$,<br />
$S_2 = x := x + a$ and $S = S_1 ; S_2$ for some integer $a \ge 0$. In this example $S_1$ is deterministic<br />
and hence its only deterministic branch.<br />
(i) Take a characterizing predicate $P_\sigma \equiv x = b$ for some $b :: Z$. Then $\models_\sigma \neg I$ holds iff<br />
b
Example 5.9. Now take $X = \{x, y\}$ with data types $T_x = T_y$ being the set of finite subsets<br />
of some set $T$. Let $I \equiv x \subseteq y$ and $S(a, b :: T) = S_1 ; S_2$ with $S_1 = y := y - \{a\}$ and<br />
$S_2 = y := y \cup \{b\}$. Then $S_1$ and $S_2$ are $\{y\}$-operations, and $S_1$ is deterministic.<br />
(i) Regard the $\delta$-constraints of $S_1$ of the form $P \Rightarrow (\forall x_1 :: T_x . x_1 \subseteq y')$ with $P \equiv y = y'$<br />
as required in Definition 5.14. We have<br />
$$\{y'/y\}.wlp(\{y/y'\}.S_1)(P \Rightarrow (\forall x_1 :: T_x . x_1 \subseteq y')) \Leftrightarrow$$<br />
$$\forall x_1 :: T_x . x_1 \subseteq y' - \{a\} \Leftrightarrow$$<br />
$$\mathit{false}.$$<br />
Hence there is no such constraint and consequently the conjunction of these constraints<br />
is equivalent to true, which is trivially a $\delta$-constraint for $S$.<br />
(ii) Then regard the $\delta$-constraints of $S_1$ of the form $P \Rightarrow (\forall x_1 :: T_x . x_1 \not\subseteq y')$ with $P \equiv y = y'$<br />
as required in Definition 5.14. We have<br />
$$\{y'/y\}.wlp(\{y/y'\}.S_1)(P \Rightarrow (\forall x_1 :: T_x . x_1 \not\subseteq y')) \Leftrightarrow$$<br />
$$\neg\exists x_1 :: T_x . x_1 \subseteq y' - \{a\} \Leftrightarrow$$<br />
$$\mathit{false}.$$<br />
Again the conjunction of these constraints is equivalent to true, which is trivially a<br />
$\delta$-constraint for $S$.<br />
Hence, in this case $S$ is $\delta$-$I$-reduced.<br />
□<br />
Note the fundamental difference between these two examples. In both cases the free variables<br />
in the invariant $I$ comprise all variables of $X$. In Example 5.8 the operation $S_1$ is an $X$-operation<br />
and $S$ was not $\delta$-$I$-reduced, whereas in Example 5.9 $S_1$ is a $Y_1$-operation for a<br />
proper subset $Y_1$ of $X$ and $S$ is $\delta$-$I$-reduced. The following lemma shows that this observation<br />
can be generalized.<br />
Lemma 5.15. Let the notations be as in Definition 5.14. Suppose that $S = S_1 ; S_2$ is not<br />
$\delta$-$I$-reduced. Then we have $Y_1 = X$.<br />
Proof. Without loss of generality we may assume that $S_1$ is deterministic.<br />
Let P ) fx 1 =x 0 1 ::: x l=x 0 l g:(8 i(i = 1:::m):fy 1 = 1 ::: y m = m g:K) be a -constra<strong>in</strong>t<br />
for S 1 , where K is either I or :I. Furthermore, let P x 1 = d 1^:::^x n = d n with n = l+m<br />
and y 1 = x l+1 ::: y m = x n and assume j= :K. Thenweget<br />
fx 0 1 =x 1::: x 0 l =x lg:wlp(S 1 0 )<br />
(P )fx 1 =x 0 1 ::: x l=x 0 l g:(8 i(i =1:::m):fy 1 = 1 ::: y m = m g:K)) ,<br />
8 i (i =1:::m):(fx 0 1=x 1 ::: x 0 l =x lg:wlp(S 1 0 )<br />
(P )fx 1 =x 0 1::: x l =x 0 l g:fy 1= 1 ::: y m = m g:K)) :<br />
For $Y_1 \ne X$ we have $m \ne 0$ and at least one quantified variable will be bound in this formula. Since $\models_\sigma \neg K$<br />
holds, this formula can never be satisfied. Hence the conjunction of $\delta$-constraints for $S_1$ of the<br />
given form is true, which is trivially a $\delta$-constraint for $S$. Hence $S$ is $\delta$-$I$-reduced. □<br />
Finally, we may extend Definition 5.14 to arbitrary operations by requiring all occurring sequences<br />
to be $\delta$-$I$-reduced.<br />
Definition 5.16. Let $S$ be an $X$-operation and $I$ some $Y$-invariant with $X \subseteq Y$. $S$ is called<br />
$I$-reduced iff the following holds:<br />
(i) If $S$ is one of fail, skip, loop or an assignment, then $S$ is always $I$-reduced.<br />
(ii) If $S = S_1 ; S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ are $I$-reduced and $S$ is $\delta$-$I$-reduced.<br />
(iii) If $S$ is one of $P \to T$, $@y :: T_y \bullet T$, $S_1 \Box S_2$ or $S_1 \boxtimes S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$<br />
or $T$ respectively are $I$-reduced.<br />
(iv) If $S = \mu T.f(T)$, then $S$ is $I$-reduced iff $f^\alpha(\mathit{loop})$ is $I$-reduced for each ordinal number $\alpha$.<br />
5.4.2 An Upper Bound for GCSs<br />
Now we are prepared for our first goal. We prove that the GCS $S_I$ of an $I$-reduced operation<br />
$S$ specializes $S'_I$, which is built by replacing each primitive operation in $S$ by its GCS. As<br />
announced, the proof will be done by structural induction on $S$ using the constructors in<br />
Definition 5.2.<br />
Proposition 5.17. Let $S' = P \to S$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If<br />
$T \sqsubseteq S'$ is consistent with respect to $I$, then we have $T \sqsubseteq P \to S_I$.<br />
Proof. Since $T \sqsubseteq S' \sqsubseteq S$ holds and $T$ is consistent with respect to $I$, we conclude $T \sqsubseteq S_I$<br />
from Definition 5.10. In addition we have $\neg P \Rightarrow wp(S')(\mathit{false}) \Rightarrow wp(T)(\mathit{false})$, hence<br />
$T \sqsubseteq P \to S_I$ follows.<br />
□<br />
Proposition 5.18. Let $S = S_1 \Box S_2$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If<br />
$T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I \Box (S_2)_I$.<br />
Proof. $T$ is semantically equivalent to $T' \Box Q \to \mathit{loop}$ with $wp(T')(\mathit{true}) \Leftrightarrow \mathit{true}$, $wlp(T')(R) \Leftrightarrow wlp(T)(R)$<br />
for all $R$ and $Q \Leftrightarrow wp(T)^*(\mathit{false})$. Then $Q \to \mathit{loop} \sqsubseteq S$ implies<br />
$$Q \to \mathit{loop} = (Q_1 \to \mathit{loop}) \Box (Q_2 \to \mathit{loop})$$<br />
with $Q_i \to \mathit{loop} \sqsubseteq S_i$ for $i = 1, 2$. If we show $T' \sqsubseteq (S_1)_I \Box (S_2)_I$, then also<br />
$$T \sqsubseteq \underbrace{(S_1)_I \Box (Q_1 \to \mathit{loop})}_{(S_1)'_I} \Box \underbrace{(S_2)_I \Box (Q_2 \to \mathit{loop})}_{(S_2)'_I}$$<br />
holds, but it is easy to see that $(S_i)'_I \sqsubseteq (S_i)_I$ holds for $i = 1, 2$. Hence the result.<br />
From now on we may therefore assume without loss of generality that $wp(T)(\mathit{true}) \Leftrightarrow \mathit{true}$<br />
holds. Then for any state $\sigma$ define $T_\sigma = T ; (P_\sigma \to \mathit{skip})$ with a characterizing predicate $P_\sigma$<br />
of the state $\sigma$. Clearly, $T_\sigma$ is a deterministic specialization of $T$, since<br />
$$wlp(T_\sigma)^*(P_\tau) \Leftrightarrow wlp(T)^*(P_\sigma \wedge P_\tau) \Leftrightarrow \begin{cases} wlp(T)^*(P_\sigma) & \text{for } \sigma = \tau \\ \mathit{false} & \text{else} \end{cases} \Rightarrow wlp(T)^*(P_\tau).$$<br />
Since $wp(T_\sigma)(P_\sigma) \Leftrightarrow wp(T_\sigma)(\mathit{true})$, we may also derive the stated determinism from this<br />
computation. Analogously we have<br />
$$wp(T_\sigma)^*(P_\tau) \Leftrightarrow wp(T)^*(P_\sigma \wedge P_\tau) \Leftrightarrow \begin{cases} wp(T)^*(P_\sigma) & \text{for } \sigma = \tau \\ wp(T)^*(\mathit{false}) & \text{else} \end{cases} \Rightarrow wp(T)^*(P_\tau),$$<br />
since predicate transformers are monotonic. Consequently $T_\sigma$ is also consistent with respect<br />
to $I$.<br />
In addition the determinism implies that $T_\sigma$ is semantically equivalent to $T^\sigma_1 \Box T^\sigma_2$ with<br />
$T^\sigma_i \sqsubseteq S_i$ for $i = 1, 2$. More precisely we have $T^\sigma_i = P^\sigma_i \to T_\sigma$ with<br />
$$P^\sigma_i \equiv \{z/y\}.wlp(\{y/z\}.T_\sigma)(\{y/z\}.P_\sigma \Rightarrow wlp(S_i)^*(z = y)).$$<br />
From Proposition 5.7 we derive<br />
$$P^\sigma_1 \vee P^\sigma_2 \Leftrightarrow \{z/y\}.wlp(\{y/z\}.T_\sigma)(P_\sigma \Rightarrow wlp(S)^*(z = y)) \overset{(5.49)}{\Leftrightarrow} \mathit{true},$$<br />
since $T_\sigma \sqsubseteq S$ holds, hence $T_\sigma = (P^\sigma_1 \vee P^\sigma_2) \to T_\sigma = P^\sigma_1 \to T_\sigma \Box P^\sigma_2 \to T_\sigma$. Then it follows<br />
from Definition 5.10 that $T^\sigma_i \sqsubseteq (S_i)_I$ holds, hence also $T_\sigma \sqsubseteq (S_1)_I \Box (S_2)_I$.<br />
Finally, the least upper bound of all $T_\sigma$ with respect to $\sqsubseteq$ must specialize $(S_1)_I \Box (S_2)_I$,<br />
but this least upper bound is semantically equivalent to $T$, which completes the proof. □<br />
Unbounded choice can be handled analogously to choice.<br />
Proposition 5.19. Let $S' = @y :: T_y \bullet S$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$.<br />
If $T \sqsubseteq S'$ is consistent with respect to $I$, then we have $T \sqsubseteq @y :: T_y \bullet S_I$.<br />
□<br />
Proposition 5.20. Let $S = S_1 \boxtimes S_2$ be a $Y$-operation and $I$ an $X$-invariant with $Y \subseteq X$. If<br />
$T \sqsubseteq S$ is consistent with respect to $I$, then we have<br />
$$T \sqsubseteq (S_1)_I \Box wp(S_1)(\mathit{false}) \to (S_2)_I \sqsubseteq (S_1)_I \boxtimes (S_2)_I.$$<br />
Moreover, we have $T \sqsubseteq (S_1)_I \Box wlp(S_1)(\mathit{false}) \to (S_2)_I$.<br />
Proof. Define $T_1 = wp(S_1)^*(\mathit{true}) \to T$ and $T_2 = wp(S_1)(\mathit{false}) \to T$. Since $wp(S_1)^*(\mathit{true}) \vee wp(S_1)(\mathit{false}) \Leftrightarrow \mathit{true}$,<br />
we certainly have $T = T_1 \Box T_2$. Moreover, $T_1 \sqsubseteq S_1$ and<br />
$T_2 \sqsubseteq wp(S_1)(\mathit{false}) \to S_2$ obviously hold.<br />
Since $T_1$ and $T_2$ are consistent with respect to $I$, it follows that $T_1 \sqsubseteq (S_1)_I$ and<br />
$T_2 \sqsubseteq wp(S_1)(\mathit{false}) \to (S_2)_I \sqsubseteq wp((S_1)_I)(\mathit{false}) \to (S_2)_I$, hence also<br />
$$T_1 \Box T_2 \sqsubseteq (S_1)_I \Box wp(S_1)(\mathit{false}) \to (S_2)_I$$<br />
$$\sqsubseteq (S_1)_I \Box wp((S_1)_I)(\mathit{false}) \to (S_2)_I = (S_1)_I \boxtimes (S_2)_I,$$<br />
which proves the first result. Since $wp(S_1)(\mathit{false}) \Rightarrow wlp(S_1)(\mathit{false})$ holds, the second result<br />
is obvious.<br />
□<br />
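The last step of the proof uses the identity between restricted choice and a guarded ordinary choice. A minimal sketch of that identity follows, assuming a model in which an operation maps an integer state to (set of outcomes, may-diverge flag); the operational reading of restricted choice as "try the second operand only where the first has no outcome and cannot diverge" and all helper names are assumptions of this sketch.

```python
def choice(s1, s2):
    """Ordinary choice: any outcome of either operand is possible."""
    def result(x):
        o1, d1 = s1(x)
        o2, d2 = s2(x)
        return o1 | o2, d1 or d2
    return result


def guard(p, op):
    """P -> S: behaves like S where P holds and has no outcome elsewhere."""
    return lambda x: op(x) if p(x) else (frozenset(), False)


def wp_false(op):
    """wp(S)(false): S neither diverges nor produces any outcome."""
    return lambda x: (not op(x)[1]) and not op(x)[0]


def restricted_choice(s1, s2):
    """Assumed operational reading: use S2 only where S1 is infeasible."""
    return lambda x: s2(x) if wp_false(s1)(x) else s1(x)


def via_identity(s1, s2):
    """The guarded-choice form of restricted choice used in the proof."""
    return choice(s1, guard(wp_false(s1), s2))


# Hypothetical operands: a guarded subtraction and a reset to zero.
s1 = guard(lambda x: x >= 3, lambda x: (frozenset({x - 3}), False))
s2 = lambda x: (frozenset({0}), False)

SAMPLE = range(-5, 9)
same = all(restricted_choice(s1, s2)(x) == via_identity(s1, s2)(x) for x in SAMPLE)
print(same, restricted_choice(s1, s2)(1), restricted_choice(s1, s2)(5), choice(s1, s2)(5))
```

Ordinary choice would additionally allow the reset where the subtraction is feasible, which is why the guard on the second operand matters.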
The most difficult part concerns sequences. In this case the proof is rather lengthy and requires<br />
several lemmata concerning a specific form of GCSs and a very technical result on $\delta$-$I$-reducedness.<br />
Therefore the proof is shifted to Appendix 5.7.<br />
Proposition 5.21. Let $S = S_1 ; S_2$ be an $I$-reduced $Y$-operation and $I$ an $X$-invariant with<br />
$Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I ; (S_2)_I$. □<br />
The remaining case is given by an $I$-reduced recursive operation $\mu S.f(S)$. For this we use<br />
induction on ordinal numbers. The main difficulty is to bring together two different partial<br />
orders, the specialization order $\sqsubseteq$ of Definition 5.6 and the Nelson-order in (5.35). The specialization<br />
order is fundamental for GCSs, whereas the Nelson-order is required for recursion.<br />
For recursive guarded commands the monotonicity of all operation constructors with respect<br />
to the Nelson-order is fundamental [19]. Unfortunately, a similar result does not hold<br />
for the specialization order. More precisely, the result is false for the $\boxtimes$-constructor in its first<br />
component.<br />
Lemma 5.22. Let $f(S)$ be a guarded command expression built from the constructors in<br />
Definition 5.2 except restricted choice $\boxtimes$. Then $f$ is monotonic with respect to the specialization<br />
order $\sqsubseteq$.<br />
Proof. The proof is done by structural induction. For each constructor it is completely<br />
analogous to the corresponding proof for the Nelson-order in [19]. We omit the details. □<br />
In Proposition 5.20 we have seen that $S_I$ may contain the choice-constructor instead of the<br />
restricted choice, provided we include some guard. Replacing within a recursive operation<br />
some $S_1 \boxtimes S_2$ by $(S_1)_I \boxtimes (S_2)_I$ would destroy the required result.<br />
The next lemma follows from taking together Propositions 5.17 to 5.21.<br />
Lemma 5.23. Let $T$ be a consistent specialization of some $I$-reduced $f(S')$ with respect to $I$,<br />
where $f(S)$ is an expression built from the constructors in Definition 5.2. Construct $f_I(S)$<br />
from $f(S)$ as follows:<br />
(i) Each restricted choice $S_1 \boxtimes S_2$ occurring within $f(S)$ will be replaced by<br />
$S_1 \Box wlp(S_1)(\mathit{false}) \to S_2$.<br />
(ii) Then each basic operation, i.e. skip and assignments $x := E(x)$, will be replaced by its<br />
GCS with respect to $I$.<br />
Then we have $T \sqsubseteq f_I(S'_I)$.<br />
□<br />
Proposition 5.24. Let $S' = \mu S.f(S)$ be an $I$-reduced $Y$-operation and $T \sqsubseteq S'$ be a consistent<br />
specialization with respect to some $X$-invariant $I$ with $Y \subseteq X$. Then we also have<br />
$T \sqsubseteq \mu S.f_I(S)$, where $f_I(S)$ is built as in Lemma 5.23.<br />
□<br />
Again, the proof is rather lengthy and requires additional lemmata. Hence it is shifted to<br />
Appendix 5.8. We may now summarize the results achieved so far, which gives the announced<br />
upper bound theorem.<br />
Theorem 5.25. Let $S$ be some $I$-reduced $Y$-operation and $I$ some $X$-invariant with $Y \subseteq X$.<br />
Let $S'_I$ result from $S$ as follows:<br />
(i) Each restricted choice $S_1 \boxtimes S_2$ occurring within $S$ will be replaced by $S_1 \Box wlp(S_1)(\mathit{false}) \to S_2$.<br />
(ii) Then each basic operation, i.e. loop, fail, skip and assignments $x := E(x)$, will be replaced<br />
by its GCS with respect to $I$.<br />
Then $T \sqsubseteq S'_I$ holds for each consistent specialization $T \sqsubseteq S$ with respect to $I$.<br />
□<br />
5.4.3 The General Form of GCSs<br />
Now we are prepared to state the main result on the general form of GCSs. Informally, the<br />
GCS of an $I$-reduced operation $S$ results in two steps. First we have to remove all restricted<br />
choices and to replace basic update operations by their GCSs. Then we have seen that the<br />
GCS $S_I$ specializes the resulting $S'_I$. Now add a precondition $P(S'_I)$ that "filters" only those<br />
computations of $S'_I$ that specialize the original $S$. This precondition corresponds to the normal<br />
form of the specialization proof obligation in (5.49).<br />
Theorem 5.26. Let $I$ be an $X$-invariant and $S$ some $I$-reduced $Y$-operation with $Y \subseteq X = \{x_1, \dots, x_n\}$.<br />
Let $S'_I$ result from $S$ by first replacing each restricted choice $S_1 \boxtimes S_2$ by<br />
$S_1 \Box (wlp(S_1)(\mathit{false}) \to S_2)$ and then each basic assignment operation by its GCS with respect<br />
to $I$. For a disjoint copy $\{z_1, \dots, z_n\}$ of $X$ define<br />
$$P(S'_I) \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n)),$$<br />
where $T$ results from $S'_I$ by renaming all $x_i$ to $z_i$. Then the GCS of $S$ with respect to $I$ can<br />
be written in the form $S_I = P(S'_I) \to S'_I$.<br />
Proof.<br />
Let $R$ be an arbitrary $X$-invariant. Then we have<br />
$$wlp(S_I)^*(R) \Leftrightarrow P(S'_I) \wedge wlp(S'_I)^*(R) \overset{(5.49)}{\Rightarrow} wlp(S)^*(R)$$<br />
and the wp-condition can be proven analogously, which implies that $S_I$ as given in the theorem<br />
is a specialization of $S$. Since $S'_I$ is consistent with respect to $I$, the same holds for $S_I$. Hence<br />
the given operation $S_I$ is indeed a consistent specialization.<br />
It remains to show that it is already the GCS. Let $T \sqsubseteq S$ be some arbitrary consistent<br />
specialization and assume without loss of generality that $wp(T)(\mathit{true}) \Leftrightarrow \mathit{true}$ holds. From<br />
Theorem 5.25 we know that $T \sqsubseteq S'_I$ holds. Hence the result follows from $wp(T)^*(\mathit{true}) \Rightarrow P(S'_I)$.<br />
If $\models_\sigma \neg P(S'_I)$ holds, we conclude from Proposition 5.7 that there exists some state $\sigma'$<br />
with<br />
$$\models_\sigma wlp(S)^*(P_{\sigma'}) \wedge \neg wlp(S'_I)(P_{\sigma'}),$$<br />
hence also $\models_\sigma wlp(S)^*(P_{\sigma'}) \wedge \neg wlp(T)(P_{\sigma'})$, since $T \sqsubseteq S'_I$ holds. But since $T \sqsubseteq S$ is<br />
assumed, $\models_\sigma \neg wp(T)^*(\mathit{true})$ follows, which completes the proof.<br />
□<br />
Let us finally come back to our starting point and look at practical applications. In general,<br />
GCSs are non-deterministic, which reflects various strategies for consistency enforcement.<br />
In most practical cases, however, we are only interested in one of these strategies, i.e., we<br />
usually select a deterministic or quasi-deterministic branch of the GCS. The selection of<br />
quasi-deterministic branches is related to interactive support for the values to be selected.<br />
Consider the special case where we deal with finite sets as in Examples 5.1 and 5.3. One<br />
strategy would be to change values as little as possible, i.e. the symmetric difference between<br />
the old and the new values should be minimized. According to our general result on GCS<br />
construction a reasonable result can be achieved if we already choose such quasi-deterministic<br />
branches for the GCSs of the involved basic operations. In particular, we only have to take<br />
care of assignments. We demonstrate this approach by a final example.<br />
Example 5.10. Let us continue Example 5.3, which comprises the essentials of the application<br />
example 5.1. The following calculations will justify the informal approach in Example<br />
5.2. Let the notations be as in Example 5.3. Then we consider the $\{x_1\}$-operation<br />
$S(a, b :: Z) = x_1 := x_1 \cup \{(a, b)\}$. Proposition 5.12 allows us to build the GCS successively. Let<br />
us take the invariants in the order given in Example 5.3.<br />
Step 1. First consider the inclusion invariant $I_1$. Since $S$ is just an assignment, it is $I_1$-reduced.<br />
We then replace $S$ by a branch of its GCS with respect to $I_1$ and obtain<br />
$$S'_I(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\} ; (a \notin map(\pi_1)(x_2) \to @c :: INT \bullet x_2 := x_2 \cup \{(a, c)\} \boxtimes \mathit{skip}), \qquad (5.54)$$<br />
which is an $X$-operation with $P(S'_I) \Leftrightarrow \mathit{true}$. Then we redefine $S$ by (5.54).<br />
Step 2. Now consider the invariant $I_2$. Since $S$ is a sequence $S_1 ; S_2$ with an $\{x_1\}$-operation<br />
$S_1$, the $I_2$-reducedness follows from Lemma 5.15.<br />
We have to remove the restricted choice and then replace the basic assignment to $x_2$ by<br />
the following GCS branch with respect to $I_2$:<br />
$$(a \notin map(\pi_1)(x_2) \to c \notin map(\pi_2)(x_2) \to x_2 := x_2 \cup \{(a, c)\}) \Box (a \in map(\pi_1)(x_2) \to \mathit{skip})$$<br />
For the resulting operation $S'_I$ we compute $P(S'_I) \Leftrightarrow \mathit{true}$. After some rearrangements we<br />
obtain the following GCS branch with respect to $I_1 \wedge I_2$:<br />
$$x_1 := x_1 \cup \{(a, b)\} ; ((a \notin map(\pi_1)(x_2) \to @c :: INT \bullet c \notin map(\pi_2)(x_2) \to x_2 := x_2 \cup \{(a, c)\}) \Box a \in map(\pi_1)(x_2) \to \mathit{skip}). \qquad (5.55)$$<br />
Then we redefine the body of $S$ by (5.55).<br />
Step 3. Now regard the exclusion invariant $I_3$. Again $I_3$-reducedness follows from Lemma<br />
5.15. Replace $S_1 = x_1 := x_1 \cup \{(a, b)\}$ in $S$ by the GCS branch<br />
$$x_1 := x_1 \cup \{(a, b)\} ; x_2 := x_2 - \{x \in x_2 \mid \pi_2(x) = b\}$$<br />
and analogously replace $S_2 = x_2 := x_2 \cup \{(a, c)\}$ by the GCS branch<br />
$$x_2 := x_2 \cup \{(a, c)\} ; x_1 := x_1 - \{x \in x_1 \mid \pi_2(x) = c\}.$$<br />
Then we compute<br />
$$P(S'_I) \Leftrightarrow b \notin map(\pi_2)(x_2) \wedge (a \notin map(\pi_1)(x_2) \Rightarrow \forall c :: INT . (c \notin map(\pi_2)(x_2) \Rightarrow c \notin map(\pi_2)(x_1) \cup \{b\})),$$<br />
hence after some rearrangements the final result is<br />
$$S_I(a, b :: INT) = b \notin map(\pi_2)(x_2) \to x_1 := x_1 \cup \{(a, b)\} ;$$<br />
$$((a \notin map(\pi_1)(x_2) \to @c :: INT \bullet c \notin map(\pi_2)(x_2) \wedge c \notin map(\pi_2)(x_1) \to x_2 := x_2 \cup \{(a, c)\})$$<br />
$$\Box\ a \in map(\pi_1)(x_2) \to \mathit{skip}).$$<br />
Note that this result reflects exactly the informal considerations in Example 5.2.<br />
□<br />
5.5 Conclusion<br />
The work reported in this paper deals with consistency enforcement in formal specifications.<br />
This approach formalizes the problem by greatest consistent specializations (GCSs). Under<br />
the technical prerequisite of reducedness the computation of such a GCS can be traced back to<br />
the definition of GCSs for basic update operations. It is possible to replace basic operations<br />
within a complex operation specification by their GCSs and to compute a specialization<br />
precondition. This result is a step towards a general and theoretically founded solution of<br />
consistency enforcement. However, a series of open problems is left for future research.<br />
- The notion of a GCS can also be defined for transition constraints. As for static<br />
constraints, existence, uniqueness and compatibility results are already known [21]. The<br />
problem is to extend the results on GCS construction as well.<br />
- The result on the construction of GCSs allows the problem of consistency enforcement to<br />
be reduced to basic operations, i.e. assignments, and simple constraints that are combined<br />
by conjunction. The remaining problem is to find GCSs for basic operations, which has<br />
to be done case by case for selected classes of constraints (cf. [24, 21]).<br />
- GCS construction depends on checking for $I$-reducedness. This is equivalent to showing that<br />
certain first-order formulae derived from $I$ are tautologies. Whilst this is undecidable<br />
in general, the problem is to characterize those invariants $I$ for which $I$-reducedness is<br />
decidable.<br />
- Even if we are able to decide $I$-reducedness, the problem is how to proceed in the case<br />
of non-reduced operations. It is not very satisfactory to break off with no result. The<br />
question is to find a reduction algorithm or at least to find conditions under which such<br />
an algorithm could exist. For inclusion, exclusion, functional and cardinality constraints<br />
in data-intensive applications such reductions have been worked out in [24].<br />
- By using axiomatic semantics, GCSs are only determined up to semantic equivalence.<br />
The construction of a GCS, however, will result in a concrete syntactic form using typed<br />
guarded commands. An operational interpretation of this form may involve backtracking<br />
[19] and may be totally inefficient. Hence optimization may be required.<br />
All these problems are a bit technical in nature. The hardest problem, however, is concerned<br />
with the selection of the specialization order. As the use of the main result in practice demonstrates,<br />
this order might still be too coarse for enforcement purposes. On the other hand,<br />
multi-valued dependencies, which concern only one set-valued state variable, lead to preconditions,<br />
although we might expect additional changes instead [21].<br />
One possible solution might be to choose an order based on $\delta$-constraints, but then the<br />
problem leads back to GCSs because of the relation between transition consistency and specialization.<br />
Closing the remaining gaps in this piece of work is a matter of current research.<br />
Appendix<br />
5.6 A Normal Form for the Specialization Proof Obligation<br />
In this appendix we only give a proof of Proposition 5.7, which is first repeated here.<br />
Proposition 5.7 Let $S$ and $T$ be operations on a state space $X = \{x_1, \dots, x_n\}$. Let $Z = \{z_1, \dots, z_n\}$<br />
be disjoint from $X$ with $T_{x_i} = T_{z_i}$. Then $wlp(S)(R) \Rightarrow wlp(T)(R)$ holds for all<br />
$X$-invariants $R$ iff<br />
$$\{z_1/x_1, \dots, z_n/x_n\}.wlp(T')(wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n)) \qquad (5.56)$$<br />
holds, where $T'$ results from $T$ by renaming all the variables $x_i$ to $z_i$.<br />
Proof.<br />
In [19] it has been shown that we may always write $wlp(T')(R)$ in the form<br />
$$\forall z'_1, \dots, z'_n . (wlp(T')^*(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \Rightarrow \{z_1/z'_1, \dots, z_n/z'_n\}.R).$$<br />
In particular, (5.56) is equivalent to<br />
$$\forall z'_1, \dots, z'_n . (wlp(T')^*(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \Rightarrow$$<br />
$$\{z_1/z'_1, \dots, z_n/z'_n\}.wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n)).$$<br />
Since $S$ is an $X$-operation, we conclude<br />
$$\{z_1/z'_1, \dots, z_n/z'_n\}.wlp(S)^*(x_1 = z_1 \wedge \dots \wedge x_n = z_n) \Leftrightarrow wlp(S)^*(x_1 = z'_1 \wedge \dots \wedge x_n = z'_n).$$<br />
Now assume $wlp(S)(R) \Rightarrow wlp(T)(R)$ for all $X$-predicates $R$. Then also $wlp(S')(R) \Rightarrow wlp(T')(R)$<br />
holds for all $Z$-predicates $R$, where $S'$ results from $S$ by renaming all $x_i$ to<br />
$z_i$.<br />
In particular, we may take $R$ as $z_1 = \nu(d_1) \wedge \dots \wedge z_n = \nu(d_n)$ for arbitrary constants $d_i \in D$<br />
and a selector function $\nu$, which assigns closed terms $\nu(d)$ to semantic constants $d \in D$ such<br />
that $\omega_T \circ \nu = id_D$ holds.<br />
But then $wlp^*(S')(R)$ can be rewritten as<br />
\[ \{x_1/z_1, \dots, x_n/z_n\}.wlp^*(S)(x_1 = \nu(d_1) \wedge \dots \wedge x_n = \nu(d_n)) . \]<br />
Hence<br />
\[ \forall z'_1, \dots, z'_n . (wlp^*(T')(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \Rightarrow \{x_1/z_1, \dots, x_n/z_n\}.wlp^*(S)(x_1 = z'_1 \wedge \dots \wedge x_n = z'_n)) \tag{5.57} \]<br />
holds, which implies (5.56).<br />
Conversely, (5.56) implies (5.57). For an arbitrary $X$-predicate $R$ we may then write<br />
$wlp^*(T)(R)$ as<br />
\[ \exists z'_1, \dots, z'_n . (\{z_1/x_1, \dots, z_n/x_n\}.wlp^*(T')(z_1 = z'_1 \wedge \dots \wedge z_n = z'_n) \wedge \{x_1/z'_1, \dots, x_n/z'_n\}.R) \]<br />
and by (5.57) we conclude<br />
\[ \{z_1/x_1, \dots, z_n/x_n\}.(\exists z'_1, \dots, z'_n . (wlp^*(S)(x_1 = z'_1 \wedge \dots \wedge x_n = z'_n) \wedge \{x_1/z'_1, \dots, x_n/z'_n\}.R)) , \]<br />
which is $wlp^*(S)(R)$ by the normal form representation for $wlp(S)$. Hence $wlp(S)(R) \Rightarrow wlp(T)(R)$<br />
as required.<br />
$\Box$<br />
5.7 Proof of the Upper Bound Theorem for Sequences<br />
In this appendix we give the proof of Proposition 5.21. For this we need two lemmata. The<br />
first of these shows a general syntactic form of GCSs based on unbounded choices. A similar<br />
result occurs if we exploit the equivalence to predicative specifications.<br />
Lemma 5.27. Let $S$ be a $Y$-operation, $Y \subseteq X$ and $I$ an invariant on $X$. Then the greatest<br />
consistent specialization $S_I$ of $S$ with respect to $I$ is semantically equivalent to<br />
\[ (I \rightarrow S ; @\zeta \bullet z := \zeta ; I \rightarrow skip) \;\Box\; (\neg I \rightarrow S ; @\zeta \bullet z := \zeta) \tag{5.58} \]<br />
where $z$ has been used as an abbreviation of the collection of state variables in $X - Y$ and<br />
$\zeta$ ranges over values of the corresponding types. Moreover, $S_I$ is uniquely determined (up to<br />
semantic equivalence) by $S$ and $I$.<br />
Proof. We have to verify the conditions in Definition 5.10 for $S_I$ defined by (5.58). For an<br />
arbitrary $Y$-invariant $R$ we have<br />
\[ \begin{array}{rcl}
wlp^*(S_I)(R) & \Leftrightarrow & (I \wedge wlp^*(S)(\exists \zeta . \{z/\zeta\}.(I \wedge R))) \vee (\neg I \wedge wlp^*(S)(\exists \zeta . \{z/\zeta\}.R)) \\
 & \Leftrightarrow & (I \wedge wlp^*(S)((\exists \zeta . \{z/\zeta\}.I) \wedge R)) \vee (\neg I \wedge wlp^*(S)(R)) \\
 & \Rightarrow & (I \wedge wlp^*(S)(R)) \vee (\neg I \wedge wlp^*(S)(R)) \\
 & \Leftrightarrow & wlp^*(S)(R) .
\end{array} \]<br />
Here we exploited the monotonicity of conjugate predicate transformers with respect to implication<br />
and the fact that the variables $z$ do not occur in $R$. Then the same calculation can be<br />
used with $wlp$ replaced everywhere by $wp$, which shows (i). For the proof of (ii) we compute<br />
\[ \begin{array}{rcl}
wlp(S_I)(I) & \Leftrightarrow & (I \Rightarrow wlp(S)(\forall \zeta . \{z/\zeta\}.(I \Rightarrow I))) \wedge (\neg I \Rightarrow wlp(S)(\forall \zeta . \{z/\zeta\}.I)) \\
 & \Leftrightarrow & (\neg I \Rightarrow wlp(S)(\forall \zeta . \{z/\zeta\}.I)) ,
\end{array} \]<br />
which implies $I \Rightarrow wlp(S_I)(I)$ as required.<br />
For the proof of (iii) let $P$ be a characterizing predicate and let $T \sqsubseteq S$ be an arbitrary<br />
consistent specialization of $S$. We distinguish two cases.<br />
Case 1. Assume $P \Rightarrow \neg I$. Then we also have $wlp^*(T)(P) \Rightarrow wlp^*(T)(\neg I) \Rightarrow \neg I$, since $T$<br />
is consistent with respect to $I$ and $wlp(S)$ is monotonic. It follows<br />
\[ wlp^*(T)(P) \Rightarrow \neg I \wedge wlp^*(S)(P) \Rightarrow \neg I \wedge wlp^*(S)(\exists \zeta . \{z/\zeta\}.P) \Rightarrow wlp^*(S_I)(P) . \]<br />
The last implication follows from the calculation of $wlp^*(S_I)$ in the proof of (i).<br />
Case 2. Assume $P \Rightarrow I$. Then we have $wlp^*(T)(P) \Leftrightarrow wlp^*(T)(I \wedge P)$. Since $T \sqsubseteq S$<br />
holds, the monotonicity of $wlp(S)$ implies<br />
\[ wlp^*(S)(\exists \zeta . \{z/\zeta\}.(I \wedge P)) \wedge wlp^*(S)(\exists \zeta . \{z/\zeta\}.P) . \tag{5.59} \]<br />
In any case $(wlp^*(T)(P) \wedge I) \vee (wlp^*(T)(P) \Rightarrow \neg I)$ holds. Together with (5.59) we get<br />
\[ \begin{array}{rcl}
wlp^*(T)(P) & \Rightarrow & (I \wedge wlp^*(S)(\exists \zeta . \{z/\zeta\}.(I \wedge P))) \vee (\neg I \wedge wlp^*(S)(\exists \zeta . \{z/\zeta\}.P)) \\
 & \Leftrightarrow & wlp^*(S_I)(P) .
\end{array} \]<br />
The universal conjunctivity property then implies $wlp^*(T)(R) \Rightarrow wlp^*(S_I)(R)$ for all $R$.<br />
In addition, it is easy to see that $wp^*(T)(false) \Rightarrow wp^*(S_I)(false)$ holds. Hence $T$ is a<br />
specialization of $S_I$.<br />
$\Box$<br />
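The syntactic form (5.58) can be tried out on a small finite model. The following Python sketch is only an illustration under assumptions that are not part of the text: operations are modelled as relations (sets of pre/post pairs) over a finite domain, termination is ignored, and the names `all_states`, `project`, `gcs`, `increment` and the invariant $a \le b$ are hypothetical. Read relationally, (5.58) lets the $Y$-part of the state follow $S$, chooses the remaining variables freely, and keeps only those results that preserve $I$ whenever the initial state satisfies $I$.

```python
from itertools import product


def all_states(variables, domain):
    """Enumerate every assignment of domain values to the given variables."""
    return [dict(zip(variables, values))
            for values in product(domain, repeat=len(variables))]


def project(state, variables):
    """Restrict a state to some variables, as a hashable tuple."""
    return tuple(state[v] for v in variables)


def gcs(op, ys, xs, domain, invariant):
    """Relational reading of (5.58); this reading is an assumption of the sketch.

    op        -- set of (pre, post) pairs over the variables ys
    invariant -- predicate on full states over xs
    Returns the (pre, post) pairs over xs whose ys-part follows op, whose
    other variables are chosen freely, and which lead from a consistent
    initial state only to consistent final states.
    """
    result = set()
    for pre in all_states(xs, domain):
        for post in all_states(xs, domain):
            follows_op = (project(pre, ys), project(post, ys)) in op
            keeps_invariant = (not invariant(pre)) or invariant(post)
            if follows_op and keeps_invariant:
                result.add((project(pre, xs), project(post, xs)))
    return result


# Hypothetical example: X = {a, b}, Y = {a}, invariant a <= b.
DOMAIN = (0, 1, 2)
XS, YS = ("a", "b"), ("a",)
increment = {((a,), (min(a + 1, 2),)) for a in DOMAIN}  # a := min(a + 1, 2)


def invariant(state):
    return state["a"] <= state["b"]


s_i = gcs(increment, YS, XS, DOMAIN, invariant)


def successors(pre):
    """All final states (a, b) reachable from the initial state pre."""
    return {post for (p, post) in s_i if p == pre}
```

In this toy model the state $(a, b) = (0, 1)$ has two successors, which illustrates the non-determinism introduced by the free choice of $b$, while $(1, 1)$ is forced to $(2, 2)$ because $b$ must be raised to restore the invariant.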
The second technical lemma reformulates the properties <strong>of</strong> -I-reducedness for determ<strong>in</strong>istic<br />
S 1 .<br />
Lemma 5.28. Let the notations be as <strong>in</strong> Denition 5.14. Assume that S is -I-reduced and<br />
S 1 is determ<strong>in</strong>istic. Then follow<strong>in</strong>g two conditions hold:<br />
(i) For all states and with j= :I , j= :I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ) we<br />
have, if<br />
P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)<br />
is a -constra<strong>in</strong>t for S 1 ,thenP )fx=x 0 g:8 Y ;X 1:fy=g:I) is a -constra<strong>in</strong>t for S.<br />
(ii) For all states and with j= I , j= I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ) we<br />
have, if<br />
P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) ):I)<br />
is a -constra<strong>in</strong>t for S 1 ,thenP )fx=x 0 g:8 Y ;X 1:fy=g::I) is a -constra<strong>in</strong>t for S.<br />
Pro<strong>of</strong>. Let and be states with j= :I , j= :I and j= wlp(S) (9 Y ;X 1;X2 :fy=g:P ).<br />
Assume that<br />
P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) (5.60)<br />
is a -constra<strong>in</strong>t forS 1 . Then we have<br />
j= wlp(S 1 ) (wlp(S 2 ) (9 Y ;X 1;X2 :fy=g:P )) ) (s<strong>in</strong>ce S 1 is determ<strong>in</strong>istic)<br />
j= wlp(S 1 )(wlp(S 2 ) (9 Y ;X 1;X2 :fy=g:P )) )<br />
j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P ) wlp(fx=x 0 g:S 2 ) (9 Y ;X 1;X2 :fy=g:fx=x0 g:P )) :(5.61)<br />
From (5.60) we have<br />
j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:(8 Y ;X 1 :fy=g:wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I) :<br />
Together with (5.61) this implies<br />
j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:8 Y ;X 1 :fy=g:I) :<br />
Hence P ) fx=x 0 g:8 Y ;X 1 :fy=g:I is a -constra<strong>in</strong>t for S 1. S<strong>in</strong>ce S is assumed to be -<br />
I-reduced, it follows that P ) fx=x 0 g:8 Y ;X 1:fy=g:I is also a -constra<strong>in</strong>t for S, hence<br />
condition (i). The pro<strong>of</strong> <strong>of</strong> condition (ii) is completely analogous.<br />
ut<br />
With the help of these two technical lemmata we can now approach the proof of the upper<br />
bound theorem for sequences. We use Lemma 5.27 to compute $S_I$ and $(S_1)_I ; (S_2)_I$. This<br />
allows us to compute their predicate transformers. Then we verify the required specialization,<br />
for which Lemma 5.28 must be exploited.<br />
Proposition 5.21 Let $S = S_1 ; S_2$ be an $I$-reduced $Y$-operation and $I$ an $X$-invariant with<br />
$Y \subseteq X$. If $T \sqsubseteq S$ is consistent with respect to $I$, then we have $T \sqsubseteq (S_1)_I ; (S_2)_I$.<br />
Proof. We may assume without loss of generality that $wp(T)(true) \Leftrightarrow true$ holds. Then it<br />
suffices to show $wlp^*(S_I)(P) \Rightarrow wlp^*((S_1)_I ; (S_2)_I)(P)$ for all characterizing predicates<br />
$P$.<br />
Moreover, since $S_1$ is the least upper bound of its deterministic branches with respect<br />
to $\sqsubseteq$, we may assume without loss of generality that $S_1$ is deterministic. Hence the stronger<br />
properties in Lemma 5.28 can be used.<br />
First we compute both sides <strong>of</strong> such an implication us<strong>in</strong>g (5.58). We have<br />
wlp(S I ) (P ) , (I^9:wlp(S) (fy=g:I ^fy=g:P )) _<br />
(:I ^ 9:wlp(S) (fy=g:P ))<br />
(5.62)<br />
and<br />
wlp((S 1 ) I (S 2 ) I ) (P ) ,<br />
(I ^wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp((S 2) I ) (P )))) _<br />
(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:wlp((S 2) I ) (P ))) ,<br />
(I ^wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:(I ^P ))))) _<br />
(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:(I ^wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:(I ^P ))))) _<br />
(:I ^ wlp(S 1 ) (9 Y ;X 1 :fy=g:(:I ^ wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P )))) ,<br />
9 Y ;X 1 :9 Y ;X2 0 :(wlp(S 1 ) (fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:(I ^P )))) _<br />
9 Y ;X 1 :9 Y ;X2 0 ::I ^ (wlp(S 1 ) (:fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:P ))) :<br />
(5.63)<br />
Case 1. Assume P ):Iholds. Then we have wlp(S I ) (P ) ) wlp(S I ) (:I) ):I, s<strong>in</strong>ce<br />
S I is consistent with respect to I. Hence, s<strong>in</strong>ce we assume wlp(S I ) (P ), we have to consider<br />
only the second l<strong>in</strong>e <strong>of</strong> (5.62). We want to show that this implies the second l<strong>in</strong>e <strong>of</strong> (5.63),<br />
i.e. (due to consistency :I can be omitted)<br />
9 Y ;X 1 :9 Y ;X2 0 : (:I ^ (wlp(S 1 ) (:fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:P )))) :<br />
(5.64)<br />
Assume that (5.64) does not hold, i.e. there exists some state with<br />
j= wlp(S 1 )(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) : (5.65)<br />
We then calculate that (5.65) is equivalent to<br />
j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(fx=x 0 g:<br />
8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) ,<br />
| {z }<br />
R<br />
j= fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) )fx 0 =x 00 g:fx=x 0 g:R) ,<br />
j= P )fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) )fx 0 =x 00 g:fx=x 0 g:R) ,<br />
j= fx 0 =xg:(8x 00 :wlp(fx=x 0 g:S 1 ) (x 0 = x 00 ) ) (P )fx 0 =x 00 g:fx=x 0 g:R)) ,<br />
j= fx 0 =xg:wlp(fx=x 0 g:S 1 )(P )fx=x 0 g:R) :<br />
From this we conclude that<br />
P )fx=x 0 g:(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) )I)) (5.66)<br />
is a -constra<strong>in</strong>t forS 1 . Due to Lemma 5.28(i), s<strong>in</strong>ce j= :I and j= :I hold, this implies<br />
to be a -constra<strong>in</strong>t forS, hence we get<br />
P )fx=x 0 g:(8 Y ;X 1:fy=g:I) (5.67)<br />
j= fx 0 =xg:wlp(fx=x 0 g:S)(P )fx=x 0 g:(8 Y ;X 1 :fy=g:I)) ,<br />
(do the same calculation as above)<br />
j= fx 0 =xg:wlp(fx=x 0 g:S)(fx=x 0 g:8 Y ;X 1 :fy=g:I) ,<br />
j= wlp(S)(8 Y ;X 1:fy=g:I) : (5.68)<br />
which leads to a contradiction, s<strong>in</strong>ce P ):Iimplies<br />
wlp(S) (9 Y ;X 1 :fy=g::I) , :wlp(S)(8 Y ;X1 :fy=g:I)<br />
and on the other hand due to consistency we have<br />
wlp(S I ) (P ) ) wlp(S I ) (9 Y ;X 1;X2 :fy=g:P ) ) wlp(S) (9 Y ;X 1;X2 :fy=g:P ) :<br />
This proves the assertion <strong>in</strong> Case 1.<br />
Case 2. Now assume that P )Iand j= <br />
subcases.<br />
wlp(S I ) (P ) hold. From (5.62) we derive two<br />
Case 2.1 Assume j= :I ^ 9:wlp(S) (fy=g:P ). Then we conclude (always j= :::)<br />
9:wlp(S 1 ) (wlp(S 2 ) (fy=g:P )) ,<br />
9:wlp(S 1 ) ((I _:I) ^ wlp(S 2 ) (fy=g:P )) ,<br />
9:wlp(S 1 ) (I^wlp(S 2 ) (fy g:(P ^I))) _9:wlp(S 1 ) (:I ^ wlp(S 2 ) (fy=g:P )) <br />
hence (5.63) follows. This proves Case 2.1.<br />
Case 2.2. Now assume j= I^9:wlp(S) (fy=g:(I ^P )). We show that this implies<br />
j= 9 Y ;X 1 :9 Y ;X2 0 :(wlp(S 1 ) (fy=g:I ^fy=g:wlp(S 2 ) (fy= 0 g:(I ^P )))) <br />
(5.69)<br />
which implies the rst l<strong>in</strong>e <strong>of</strong> (5.63).<br />
As <strong>in</strong> Case 1 assume that (5.69) does not hold. We use analogous computations to derive<br />
that<br />
P )fx=x 0 g:(8 Y ;X 1 :fy=g:(wlp(S 2) (9 Y ;X 2 0 :fy= 0 g:P ) ):I))<br />
is a -constra<strong>in</strong>t forS 1 . Accord<strong>in</strong>g to Lemma 5.28, s<strong>in</strong>ce j= I, j= I, this implies<br />
P )fx=x 0 g:(8 Y ;X 1 :fy=g::I)<br />
to be a -constra<strong>in</strong>t forS. Thus, we get<br />
j= fx 0 =xg:wlp(fx=x 0 g:S)(P )fx=x 0 g:(8 Y ;X 1 :fy=g::I)) ,<br />
j= fx 0 =xg:wlp(fx=x 0 g:S)(fx=x 0 g:8 Y ;X 1 :fy=g::I) ,<br />
j= wlp(S)(8 Y ;X 1:fy=g::I) : (5.70)<br />
However, from our assumption and P )Iwe conclude<br />
wlp(S) (9 Y ;X 1 :fy=g:I) ,:wlp(S)(8 Y ;X1 :fy=g::I)<br />
contradict<strong>in</strong>g (5.70). This proves the assertion <strong>in</strong> Case 2.2.<br />
ut<br />
5.8 Proof of the Upper Bound Theorem in the Recursive Case<br />
In this appendix we prove the upper bound theorem for recursive operations. For this we need<br />
an additional lemma that deals with the compatibility of GCSs with the Nelson-order. Let<br />
F <br />
denote the least upper bound with respect to the Nelson-order and F the least upper<br />
bound with respect to the specialization order.<br />
Lemma 5.29. Let T , S and S for each ord<strong>in</strong>al number be Y -operations such that S 0 S <br />
holds for 0 and let I be some X-<strong>in</strong>variant for Y X. Then we have:<br />
(i) If T S holds, then T I F S I follows.<br />
(ii) The least upper bound exists and we have<br />
<<br />
S I<br />
0<br />
@ G<br />
<<br />
<br />
S<br />
<br />
1<br />
A<br />
I<br />
v<br />
G<br />
<br />
S<br />
<br />
I :<br />
<<br />
Pro<strong>of</strong>. (i) follows, because all constructors <strong>of</strong> Denition 5.2 are monotonic <strong>in</strong> the Nelson<br />
order , hence the rst result follows from (5.58).<br />
(ii) S<strong>in</strong>ce S 0 S holds for 0 , the family (S ) < forms an ascend<strong>in</strong>g cha<strong>in</strong>. From<br />
[19] we know that <strong>in</strong> this case the least upper bound with respect to the Nelson order exists.<br />
It rema<strong>in</strong>s to show the required specialization assertion.<br />
Let T 1 and T 2 denote the left hand side F and the right hand side <strong>of</strong> this assertion respectively.<br />
S<strong>in</strong>ce for all < we have S S we conclude that S I T 1 and hence also<br />
<<br />
T 2 T 1 .Thisproves the wp(T 2 )(R) ) wp(T 1 )(R) for all X-constra<strong>in</strong>ts R.<br />
It rema<strong>in</strong>s to show wlp(T 2 )(R) ) wlp(T 1 )(R), which follows directly from Proposition<br />
5.7. ut<br />
Now we can give the main proof.<br />
Proposition 5.24 Let $S' = \mu S.f(S)$ be an $I$-reduced $Y$-operation and $T \sqsubseteq S'$ be a consistent<br />
specialization with respect to some $X$-invariant $I$ with $Y \subseteq X$. Then we also have<br />
$T \sqsubseteq \mu S.f_I(S)$, where $f_I(S)$ is built as in Lemma 5.23.<br />
Pro<strong>of</strong>.<br />
Recall from [19] that we have<br />
S 0<br />
= f (loop) = f<br />
0<br />
@ G<br />
<<br />
<br />
f (loop)<br />
1<br />
A<br />
for some ord<strong>in</strong>al number , hence from Lemmata 5.23 and 5.29 we derive<br />
0<br />
T v f I<br />
@<br />
0<br />
@ G<br />
<<br />
<br />
f (loop)<br />
0 1<br />
1 1<br />
T1<br />
A A v fI G z }| {<br />
<br />
f<br />
B<br />
(loop) I<br />
I @<br />
C<br />
<<br />
| {z }<br />
The last inequality follows from Lemma 5.29(ii) applied to the operand and from the monotonicity<br />
of $f_I$ with respect to the specialization order as stated in Lemma 5.22.<br />
Now dene T2 = f I (loop) and apply transnite <strong>in</strong>duction to show T 1 v T 2 for all .<br />
For = 0 the result is obvious, s<strong>in</strong>ce loop I is semantically equivalent toloop.<br />
Now assume T1 0<br />
v T2 0<br />
holds for all 0 < . For 0 < we have f<br />
F<br />
0<br />
I (loop) f I (loop).<br />
Hence the least upper bound <strong>in</strong> the Nelson order<br />
f 0<br />
(loop) exists and we have<br />
0<br />
f I<br />
B<br />
1<br />
T2<br />
z }| 0<br />
{<br />
<br />
f 0<br />
I (loop)<br />
C<br />
0 <<br />
| {z }<br />
T2<br />
G<br />
B<br />
@<br />
0 <<br />
I<br />
T1<br />
A = f I (loop) :<br />
Then by apply<strong>in</strong>g the <strong>in</strong>duction hypothesis (change <strong>in</strong> T 1 to 0 and to ) for an arbitrary<br />
X-constra<strong>in</strong>t R we get<br />
wlp(T 2 )(R) ,<br />
and<br />
wp(T 2 )(R) ,<br />
^<br />
0 <<br />
_<br />
0 <<br />
<strong>in</strong>duction hypothesis<br />
wlp(T2 0<br />
)(R) )<br />
<strong>in</strong>duction hypothesis<br />
wp(T2 0<br />
)(R) )<br />
^<br />
0 <<br />
_<br />
0 <<br />
A :<br />
wlp(T 0<br />
1 )(R) , wlp(T 1)(R)<br />
wp(T 0<br />
1 )(R) , wp(T 1)(R) :<br />
Consequently T 1 v T 2 holds. F<strong>in</strong>ally, from Lemma 5.22 we conclude T v f I (T 1 ) v f I (T 2 )=<br />
(loop) as required.<br />
ut<br />
f I<br />
References for Chapter 5<br />
1. J. R. Abrial: "A Formal Approach to Large Software Construction", in J. L. A. van de Snepscheut (Ed.): Mathematics of Program Construction, Springer LNCS 375, 1989, 1-20<br />
2. J. R. Abrial: The B Method, Prentice Hall International (to appear)<br />
3. J. Bicarregui, B. Ritchie: "Invariants, Frames and Postconditions: a Comparison of the VDM and B Notations", in J. C. P. Woodcock, P. G. Larsen (Eds.): Formal Methods Europe (FME'93), Springer LNCS 670, 1993, 162-182<br />
4. D. Bjørner, C. B. Jones: Formal Specification and Software Development, Prentice Hall, 1982<br />
5. M. Broy, G. Nelson: "Adding Fair Choice to Dijkstra's Calculus", ACM TOPLAS, vol. 16 (3), 1994, 924-938<br />
6. S. Ceri, J. Widom: "Deriving Production Rules for Constraint Maintenance", in Proceedings 16th Conference on VLDB, 1990, 566-577<br />
7. W. Chen, J. T. Udding: "Towards a Calculus of Data Refinement", in J. L. A. van de Snepscheut (Ed.): Mathematics of Program Construction, Springer LNCS 375, 1989, 197-218<br />
8. P. Cousot: "Methods and Logics for Proving Programs", in J. van Leeuwen (Ed.): The Handbook of Theoretical Computer Science, vol. B, Elsevier, 1990, 841-993<br />
9. E. W. Dijkstra, C. S. Scholten: Predicate Calculus and Program Semantics, Springer, Texts and Monographs in Computer Science, 1989<br />
10. P. Fraternali, S. Paraboschi, L. Tanca: "Automatic Rule Generation for Constraint Enforcement in Active Databases", in U. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics, Springer WICS, 1993, 153-173<br />
11. M. Gertz, U. W. Lipeck: "Deriving Integrity Maintaining Triggers from Transition Graphs", in Proceedings 9th ICDE, IEEE Computer Society Press, 1993, 22-29<br />
12. D. Gries: The Science of Programming, Springer Texts and Monographs in Computer Science, 1981<br />
13. T. Günther, K.-D. Schewe, I. Wetzel: "On the Derivation of Executable Database Programs from Formal Specifications", in J. C. P. Woodcock, P. G. Larsen (Eds.): Formal Methods Europe (FME'93), Springer LNCS 670, 1993, 351-366<br />
14. C. B. Jones: Systematic Software Development using VDM, Prentice-Hall International, 1986<br />
15. A. P. Karadimce, S. D. Urban: "Diagnosing Anomalous Rule Behaviour in Databases with Integrity Maintenance Production Rules", in Proceedings 3rd Int. Workshop on Foundations of Models and Languages for Data and Objects, 1991, 77-102<br />
16. U. W. Lipeck: Dynamische Integrität von Datenbanken, Springer IFB 209, 1987<br />
17. J.-J. Meyer, H. Weigand, R. Wieringa: "A Specification Language for Static, Dynamic and Deontic Integrity Constraints", in J. Demetrovics, B. Thalheim (Eds.): MFDBS 89, Springer LNCS 364, 347-366<br />
18. C. Morgan: Programming from Specifications, Prentice Hall, 1988<br />
19. G. Nelson: "A Generalization of Dijkstra's Calculus", ACM TOPLAS, vol. 11 (4), 1989, 517-561<br />
20. K.-D. Schewe, I. Wetzel, J. W. Schmidt: "Towards a Structured Specification Language for Database Applications", in D. Harper, M. Norrie (Eds.): Specification of Database Systems, Springer Workshops in Computing Science, 1992, 255-274<br />
21. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: "Integrity Enforcement in Object-Oriented Databases", in U. W. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics, Springer WICS, 1993, 174-195<br />
22. K.-D. Schewe, B. Thalheim: "Consistency Enforcement in Active Databases", in S. Chakravarty, J. Widom (Eds.): Research Issues in Data Engineering – Active Databases, 1994<br />
23. K.-D. Schewe, D. Stemple, B. Thalheim: "Higher Level Genericity in Object Oriented Databases", in S. Chakravarty (Ed.): Conference on the Management of Data, 1994<br />
24. K.-D. Schewe: Specification and Development of Correct Relational Database Programs, book manuscript<br />
25. T. Sheard, D. Stemple: "Automatic Verification of Database Transaction Safety", ACM ToDS, vol. 14 (3), 1989, 322-368<br />
26. J. M. Spivey: Understanding Z, A Specification Language and its Formal Semantics, Cambridge University Press, 1988<br />
27. J. M. Spivey: The Z Notation, A Reference Manual, Prentice Hall, 1989<br />
28. D. Stemple, S. Mazumdar, T. Sheard: "On the Modes and Meaning of Feedback to Transaction Designer", in Proceedings SIGMOD 1987, 1987, 375-386<br />
29. S. D. Urban, L. Delcambre: "Constraint Analysis: A Design Process for Specifying Operations on Objects", IEEE Transactions on Knowledge and Data Engineering, vol. 2 (4), 1990<br />
30. J. Widom, S. J. Finkelstein: "Set-oriented Production Rules in Relational Database Systems", in Proceedings SIGMOD, 1990, 259-270<br />
Chapter 6<br />
Tailoring Consistent Specializations as a Natural Approach to Consistency Enforcement<br />
Contents<br />
6.1 Introduction<br />
6.2 A Review on Transaction Transformation by Specialization<br />
6.2.1 Greatest Consistent Specialization<br />
6.2.2 The Construction of GCSs<br />
6.2.3 Two Major Problems<br />
6.3 Weaker Notions of Effect Preservation<br />
6.3.1 Maximal Consistent Effect Preservers<br />
6.3.2 Effective MCE Construction<br />
6.4 Application Example<br />
6.5 Conclusion<br />
6.6 The Predicate Transformer Calculus<br />
6.7 I-reducedness<br />
This chapter contains a reprint of<br />
Klaus-Dieter Schewe. Tailoring Consistent Specializations as a Natural Approach to<br />
Consistency Enforcement. In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in<br />
Databases. Available at<br />
http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html.<br />
Abstract. Consistency enforcement may be regarded as a process of transaction transformation,<br />
where the modified transaction will be consistent with respect to a given set of<br />
constraints. The computational approach by Schewe and Thalheim requires the modified<br />
transaction to be the greatest consistent one below the original transaction with respect to<br />
some order. The order should express the preservation of the "effect" of the original transaction.<br />
Thus, the major problem is to find the right order.<br />
The first choice, specialization, turns out to provide good computational properties, but<br />
on the one hand the order is too weak, because arbitrary changes to state variables not<br />
touched by the original transaction are allowed, and on the other hand it is too strong, as<br />
effect preservation by specialization means that further changes to the other state variables<br />
are forbidden.<br />
In this paper, modifications of greatest consistent specializations are studied to avoid these<br />
problems. Weakening the notion of effect preservation leads to the definition of maximal<br />
consistent effect preservers (MCEs). This turns out to be a reasonable choice, since they<br />
preserve the computational strength achieved for consistent specializations. Moreover, for<br />
basic operations they are compatible with different consistency enforcement strategies chosen<br />
by users.<br />
6.1 Introduction<br />
Consistency enforcement is considered to be one of the major application fields of active<br />
database systems. It is expected that arbitrary sets of static integrity constraints allow repairing<br />
ECA-rules to be defined or even generated. The analysis of the resulting rule triggering<br />
system (RTS) concentrates on the termination of the rule system, the independence of the final<br />
database state from the chosen selection order of the rules, and on consistency. The work in<br />
[1] can be taken as a representative of this approach.<br />
The mentioned requirements are not sufficient for a reasonable rule behaviour, because<br />
they do not take the interaction of the rules into account. In general, given a complex database<br />
transition, rule systems may always invalidate the effect of that transition, e.g. an insertion<br />
may be turned into a deletion and vice versa. The work in [4] presents critical examples<br />
with respect to undesirable rule behaviour. In [2] the rule analysis is characterized as purely<br />
syntactical.<br />
The basic problem seems to be that an accepted theory of consistency enforcement is still<br />
missing. Since it is easy to define an RTS that empties the database in case of any constraint<br />
violation, it is not sufficient to ensure consistency of the result. Therefore, the notion of greatest<br />
consistent specialization (GCS) was introduced in [6] as a theoretical means for a definition<br />
of consistency enforcement. The basic considerations of this approach are quite simple:<br />
– Instead of a constraint set we may study a single constraint: just take the conjunction.<br />
– Since consistency is basically a property of transactions, we may consider an arbitrary<br />
complex database transition.<br />
– Then there should be a partial order, called specialization, on transactions such that it<br />
expresses the preservation of the effects of the original transition. With respect to this<br />
order a solution of the consistency enforcement problem should be the GCS.<br />
Thus, the fundamental idea of the GCS approach is the transformation of arbitrary database<br />
transitions into GCSs which should then be handled as transactions. Both consistency and<br />
specialization can be defined in terms of the extended predicate transformer calculus [3].<br />
The first results on GCSs demonstrated their existence, uniqueness and a commutativity<br />
property with respect to several constraints [6]. Thus, the restriction to a single constraint<br />
is only necessary for definitional purposes, since the GCS with respect to a conjunction of<br />
constraints can be built successively. Furthermore, the order of the constraints is not important<br />
for such a construction. Such a property does not hold for any rule-based approach. The price<br />
for this flexibility is the inherent non-deterministic nature of GCSs.<br />
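The order independence of successive enforcement can be observed in a small relational model. The sketch below is an assumption-laden illustration, not the construction of [6]: states are pairs `(a, b)` over a three-element domain, an operation is a set of pre/post pairs, and `enforce`, `op`, `inv_i`, `inv_j` are hypothetical names. Enforcing a constraint simply keeps the transitions that carry consistent states to consistent states.

```python
from itertools import product

DOMAIN = (0, 1, 2)
STATES = list(product(DOMAIN, repeat=2))  # states are pairs (a, b)


def enforce(op, invariant):
    """Keep the transitions of op that lead from consistent to consistent states."""
    return {(pre, post) for (pre, post) in op
            if not invariant(pre) or invariant(post)}


# Hypothetical operation: a := min(a + 1, 2), with b free to change.
op = {((a, b), (min(a + 1, 2), b2)) for (a, b) in STATES for b2 in DOMAIN}


def inv_i(state):
    """Constraint I: a <= b."""
    return state[0] <= state[1]


def inv_j(state):
    """Constraint J: b != 1."""
    return state[1] != 1


def inv_both(state):
    """The conjunction of I and J."""
    return inv_i(state) and inv_j(state)


i_then_j = enforce(enforce(op, inv_i), inv_j)
j_then_i = enforce(enforce(op, inv_j), inv_i)
both = enforce(op, inv_both)


def from_consistent(relation):
    """Restrict a relation to initial states satisfying both constraints."""
    return {(pre, post) for (pre, post) in relation if inv_both(pre)}
```

In this toy model `i_then_j` and `j_then_i` coincide, and both agree with `both` on initial states that satisfy the two constraints. On initial states that violate one of the constraints the successive construction is stricter than the direct one, so the agreement should be read as holding for transactions started in a consistent state.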
The interesting problem of how GCSs are to be constructed was investigated in detail in<br />
[5]. It could be shown that under mild technical restrictions the problem can be reduced to<br />
finding GCSs for basic operations. The GCS of a complex database transition results if, first,<br />
the basic operations involved are replaced by their GCSs and, second, a precondition is added.<br />
Hence, GCSs in general are partial, i.e. in certain cases there is no other choice than a rollback.<br />
This partiality cannot be achieved by rule systems. In particular, the computed precondition<br />
depends heavily on the original database transition, whilst the rule-based approach aims at<br />
a solution that is independent of user-defined database transitions.<br />
Nevertheless, two major drawbacks exist for the GCS approach. The first one concerns the<br />
rigidity of the specialization order with respect to the part of the database affected by a transition.<br />
State changes related to the original transition can only be discarded, but not changed.<br />
For example, with respect to a functional dependency in the relational data model (RDM) an insertion is only allowed if<br />
it is consistent. The same holds for multi-valued dependencies. This is only one reasonable<br />
enforcement strategy, but from an intuitive point of view alternatives are imaginable.<br />
The second drawback concerns the arbitrar<strong>in</strong>ess with respect to the part <strong>of</strong> the database<br />
not aected by the orig<strong>in</strong>al transition. Here any changes are allowed as long as consistency<br />
is achieved. In [5] this problem has been circumvented allow<strong>in</strong>g branches <strong>of</strong> GCSs to be computed.<br />
This pragmatic approach leads to reasonable consistent specializations and restricts<br />
the non-determ<strong>in</strong>ism <strong>of</strong> GCSs. In Section 6.2 we present a formal review <strong>of</strong> the GCS approach.<br />
These two problems are taken up in this paper. In fact, the first problem is investigated,<br />
but the final solution addresses both problems at once. The first idea with respect to the<br />
rigidity problem is to weaken the order and to preserve not all the effects of the original<br />
transition. Indeed, this was also the case for specialization, since only effects on the part<br />
of the database affected by the more general transition were considered. In this paper, we<br />
consider effects, formalized by specific transition constraints, that are compatible with the<br />
given static constraint. These transition constraints can be ordered by implication and we<br />
may consider minimal constraints with this property. This leads to the notion of maximal<br />
consistent effect preservers (MCEs).<br />
Indeed, MCEs fit well with our intuition. To see this, we briefly discuss different enforcement<br />
strategies with respect to basic operations and selected classes of constraints in the RDM and<br />
convince ourselves of the naturalness of the MCE approach.<br />
After the formal definition of MCEs in Section 6.3 we analyze the computational properties<br />
of MCEs. We shall see that existence, uniqueness and commutativity hold as they did<br />
for GCSs. Then, again under mild technical restrictions, we show that the problem can be<br />
reduced to finding MCEs for basic operations. As for GCSs, the MCE of a complex database<br />
transition results if the basic operations involved are replaced by some MCEs and a precondition<br />
is computed. Hence, MCEs preserve the computational properties of GCSs. Formal definitions<br />
and results for MCEs are presented in Section 6.3. Unfortunately, the proofs are more<br />
complicated than they were for GCSs. Therefore, proofs are omitted, but they follow the same<br />
approach as the proofs for GCSs in [5]. We conclude with a short summary.<br />
Throughout the paper we assume some familiarity with guarded command notations and<br />
their axiomatic semantics by predicate transformers [3] (see footnote 1). Furthermore, proofs are omitted,<br />
because they tend to become rather lengthy.<br />
6.2 A Review on Transaction Transformation by<br />
Specialization<br />
The starting point of the computational approach to consistency enforcement was the use<br />
of axiomatic semantics in the extended style of Dijkstra [3] (see Appendix 6.6). We consider<br />
a state space X as a set of (typed) variables. In the relational model [5] each state variable<br />
corresponds to a relation schema, and the possible relations define the associated type. In object<br />
oriented models [6] state variables correspond to classes with the associated class type.<br />
A state over X is given by a map $\sigma$ which associates with each state variable a value of<br />
its type. We use $\Sigma$ to denote the set of states over X.<br />
Then a (static) constraint I is defined as a formula (in a naturally defined many-sorted<br />
logic) with free variables in the state space X, i.e. $fr(I) \subseteq X$. It is clear that states are<br />
sufficient for the interpretation of constraints.<br />
A database transition can then be defined by a guarded command over the state space X<br />
capturing non-determinism, partiality and general recursion. Note that such a formalization<br />
is much more general than usual definitions of transactions, but the orthogonality of<br />
constructors such as sequence, guard, choice etc. is significant for keeping proofs simple.<br />
Furthermore, guarded commands are just one way to describe the syntax of transitions.<br />
A transition constraint on a state space X is a formula J with free variables in $X \cup X'$<br />
using a disjoint copy $X'$ of X, i.e. $fr(J) \subseteq X \cup X'$. If $x' \in X'$ corresponds to the state variable<br />
$x \in X$, then values associated to x correspond to before-states, whereas values associated to<br />
$x'$ correspond to after-states. In particular, state pairs $(\sigma, \tau) \in \Sigma \times \Sigma$ suffice to interpret<br />
transition constraints.<br />
Each transition constraint J gives rise to a database transition S(J) in a simple way. All<br />
state pairs satisfying J are used to define S(J). In addition, since we have to decide how<br />
to handle termination, we choose also to take all pairs $(\sigma, \infty)$ into $\Delta(S(J))$. Then it can be<br />
shown that S(J) can be written as<br />
$S(J) = (@\, x'_1, \dots, x'_n \bullet J \to x_1 := x'_1 ; \dots ; x_n := x'_n) \;\Box\; loop$.<br />
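The step from a transition constraint J to the transition S(J) can be illustrated on a small finite state space. The following is only a sketch under simplifying assumptions: states are Python dicts, the value domain is finite, non-termination (the pairs ending in $\infty$) is ignored, and the names `all_states`, `transition_from_constraint` and `strictly_increasing` are hypothetical helpers that do not appear in the paper.<br />

```python
from itertools import product


def all_states(variables, values):
    """Enumerate every state: a dict mapping each state variable to a value."""
    return [dict(zip(variables, combo))
            for combo in product(values, repeat=len(variables))]


def transition_from_constraint(constraint, variables, values):
    """Collect all (before, after) state pairs that satisfy the transition constraint."""
    states = all_states(variables, values)
    return [(before, after)
            for before in states
            for after in states
            if constraint(before, after)]


# Hypothetical constraint J: the after-value of x is strictly larger than the before-value.
def strictly_increasing(before, after):
    return after["x"] > before["x"]


pairs = transition_from_constraint(strictly_increasing, ["x"], [0, 1, 2])
```

In this toy setting the state with x = 2 has no after-state at all, which gives a small picture of the partiality mentioned in the text.<br />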
The computational approach, however, abstracts from syntactic means. All definitions are<br />
given in terms of predicate transformers. This applies to the notions of operational specialization<br />
and consistency with respect to a static constraint. These are the necessary ingredients<br />
for the definition and investigation of greatest consistent specializations (GCSs).<br />
6.2.1 Greatest Consistent Specialization<br />
As already said, the operational approach to consistency enforcement starts with a formal<br />
definition of the goal. The idea is quite simple. We choose an order on transitions which<br />
should model the preservation of "effects". This order is called specialization and denoted $\sqsubseteq$.<br />
Then a consistent specialization expresses both consistency and the preservation of effects.<br />
Finally we take the greatest consistent specialization, if it exists.<br />
1 We present a short summary of the version of Dijkstra's calculus used here in Appendix 6.6.<br />
The intention behind specialization is quite simple. If we are given an execution of a database<br />
transition T working on a larger state space X than the database transition S working on<br />
$Y \subseteq X$, then we may restrict this computation to Y. Since states have been defined as<br />
mappings, this is just a restriction of mappings. Then specialization means that any execution<br />
of T (restricted to Y) should also be an execution of S. It is straightforward to show that this<br />
is exactly captured by the predicate transformer formulation in Definition 6.1 (i).<br />
This also allows transition consistency of a database transition S with respect to a transition<br />
constraint J to be formalized by S specializing S(J).<br />
As to static consistency with respect to some static constraint I, we require that any<br />
terminating execution of a transition T starting in a state satisfying I should also reach a<br />
state satisfying I, which is formalized by Definition 6.1 (ii).<br />
Since specialization captures our intuitive notion of "preservation of effects", the definition<br />
of greatest consistent specializations in Definition 6.1 (iii) is now obvious.<br />
Definition 6.1. Let S, T be database transitions on Y and X respectively with $Y \subseteq X$. Let<br />
I denote a static constraint on X.<br />
(i) T specializes S ($T \sqsubseteq S$) iff for all static constraints R on Y the implications $wlp(S)(R) \Rightarrow wlp(T)(R)$<br />
and $wp(S)(R) \Rightarrow wp(T)(R)$ hold.<br />
(ii) T is consistent with respect to I iff $I \Rightarrow wlp(T)(I)$ holds.<br />
(iii) $S_I$ is a greatest consistent specialization of S with respect to I iff $S_I \sqsubseteq S$ holds, $S_I$ is<br />
consistent with respect to I and $S_I$ is the greatest database transition with respect to $\sqsubseteq$<br />
with these properties.<br />
□<br />
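The informal reading of parts (i) and (ii) can be tried out on a finite toy model. The sketch below assumes that a transition is simply a set of (before, after) state pairs and that every execution terminates, so the distinction between wp and wlp disappears. The two boolean variables p and q stand for "x is in p" and "x is in q", and all function names are hypothetical.<br />

```python
from itertools import product


def make_state(**values):
    """A state as a hashable set of (variable, value) pairs."""
    return frozenset(values.items())


def restrict(state, variables):
    """Restrict a state to the given state variables."""
    return frozenset((name, value) for name, value in state if name in variables)


def specializes(t_pairs, s_pairs, y_vars):
    """Every execution of T, restricted to Y, must be an execution of S."""
    return all((restrict(before, y_vars), restrict(after, y_vars)) in s_pairs
               for before, after in t_pairs)


def consistent(t_pairs, invariant):
    """Executions that start in a state satisfying I must end in one."""
    return all(invariant(dict(after))
               for before, after in t_pairs
               if invariant(dict(before)))


booleans = (False, True)
# S on Y = {p}: set p to True.
s_pairs = {(make_state(p=p), make_state(p=True)) for p in booleans}
# T1 on X = {p, q}: set p and q to True.
t1_pairs = {(make_state(p=p, q=q), make_state(p=True, q=True))
            for p, q in product(booleans, repeat=2)}
# T2 on X = {p, q}: set p to True and leave q unchanged.
t2_pairs = {(make_state(p=p, q=q), make_state(p=True, q=q))
            for p, q in product(booleans, repeat=2)}


def inclusion(state):
    """Static constraint I: p implies q."""
    return (not state["p"]) or state["q"]
```

Both T1 and T2 specialize S in this model, but only T1 is consistent with respect to the inclusion constraint.<br />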
The first properties that were derived for GCSs concerned their existence, uniqueness and<br />
their relation to conjunctions (or equivalently sets) of constraints. Due to the very general<br />
approach to database transitions including non-determinism and partiality, their existence<br />
can be easily verified. Due to the abstract semantic nature of the definition the uniqueness<br />
(up to semantic equivalence) is obvious. Additionally, the first steps towards GCS theory<br />
detected a commutativity property with respect to conjunctions, at least if we restrict initial<br />
states to those satisfying the constraints [6]. We summarize:<br />
Proposition 6.2. Let S be a database transition on Y and let I and J be static constraints<br />
on X with $Y \subseteq X$. Then the GCS $S_I$ exists and is uniquely determined up to semantic equivalence<br />
by S and I. Furthermore, $I \wedge J \to S_{I \wedge J}$ and $I \wedge J \to (S_I)_J$ are semantically equivalent.<br />
□<br />
This proposition is important for the practical computation of GCSs. If there exists an effective<br />
way to compute GCSs, then the proposition allows the computation to be restricted to simple<br />
constraints that cannot be decomposed as a conjunction. Then we can use the conjuncts of a<br />
more complex constraint in any order to build the GCS. Thus, the commutativity property<br />
allows consistency to be enforced stepwise, taking the constraints in any order.<br />
6.2.2 The Construction of GCSs<br />
In order to become practically interesting, GCSs must be constructible. How to achieve<br />
such a construction seemed to be a hopeless problem at the beginning, at least for complex<br />
database transitions. It is clear that a naive approach, replacing just basic operations such<br />
as insertions and deletions by their GCSs, leads to wrong results. More precisely, we obtain a<br />
consistent specialization, but not the greatest one, or even worse, we do not obtain a specialization<br />
at all.<br />
The major breakthrough was achieved by requiring database transitions to be I-reduced.<br />
This is only a technical condition (see Appendix 6.7), in fact only a condition on sequences,<br />
which informally states that there is no self-repairing. We omit the technical definition here,<br />
since it is only understandable in connection with the proof of the main results [5]. Then<br />
it was shown that for I-reduced database transitions the GCS is itself a specialization of the<br />
database transition resulting from the replacement of basic operations by their GCSs (upper<br />
bound theorem), and that adding a certain precondition gives the complete GCS<br />
(main theorem). We summarize:<br />
Theorem 6.3. Let I be a static constraint on X and S some I-reduced database transition<br />
on Y with $Y \subseteq X = \{x_1, \dots, x_n\}$. Let $S'_I$ result from S by first replacing each restricted<br />
choice $S_1 \boxtimes S_2$ by $S_1 \,\Box\, (wlp(S_1)(false) \to S_2)$ and then each basic database transition by its<br />
GCS with respect to I. For a disjoint copy $\{z_1, \dots, z_n\}$ of X define<br />
$P(S'_I) \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n))$,<br />
where T results from $S'_I$ by renaming all $x_i$ to $z_i$. Then the GCS of S with respect to I can<br />
be written in the form $S_I = P(S'_I) \to S'_I$. □<br />
The theorem needs some explanation concerning both its terminology and its impact. By a<br />
disjoint copy of $X = \{x_1, \dots, x_n\}$ we mean a state space $Z = \{z_1, \dots, z_n\}$ such that the types<br />
of $z_i$ and $x_i$ coincide and $X \cap Z = \emptyset$ holds. Then the notation $\{x/t\}.R$ with a variable x, a<br />
term t of the same type as x and a formula R denotes the result of the substitution of each<br />
free occurrence of x in R by t.<br />
The formula $P(S'_I)$ results from a first order reformulation of the specialization condition<br />
in Definition 6.1 (i), which was basically second order. This is possible, since S works on X, T<br />
works on the disjoint copy Z, i.e. their parallel execution has the same effect as any sequence,<br />
and the formula $x_1 = z_1 \wedge \dots \wedge x_n = z_n$ on $X \cup Z$ expresses a "glue" between X and Z. If the<br />
given formula were always true, T would be a specialization of S. Taking it as a precondition<br />
restricts T to those executions that may occur in a specialization of S. Since the T chosen<br />
in the theorem is already a consistent generalization of $S_I$, we really obtain the GCS. The<br />
lengthy proof is contained in [5].<br />
Note that the theorem together with the commutativity result mentioned beforehand<br />
gives effective means for GCS construction and hence for consistency enforcement in the<br />
basic computational approach. The hard part of the proof is to show that $S_I \sqsubseteq S'_I$ holds<br />
(upper bound theorem), which requires a lengthy structural induction [5].<br />
6.2.3 Two Major Problems<br />
Let us look at consistency enforcement from a more practical point of view and ask whether<br />
GCSs really coincide with our intuition. In general, GCSs are non-deterministic, which reflects<br />
various strategies for consistency enforcement. The approach in [5] selects a branch of the GCS<br />
which is related to an interactive support for the values to be selected.<br />
E.g., take an inclusion constraint $x \in p \Rightarrow x \in q$ and an insertion into p. Then GCS<br />
branches offer the freedom to choose any new value for q provided it is a superset of $p \cup \{x\}$.<br />
Intuitively we prefer this value to be $q \cup \{x\}$, i.e. to keep change propagation as simple as<br />
possible. For GCS branches, however, there is no such "preference", or in other words:<br />
For a database transition on $Y \subseteq X$ the GCS approach is too liberal on $X - Y$.<br />
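The liberality on $X - Y$ can be made concrete by enumerating the admissible new values of q over a small universe. This is only an illustrative sketch: the finite universe, the concrete sets and the helper names `powerset` and `admissible_q_values` are assumptions made for the example.<br />

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


def admissible_q_values(p, x, universe):
    """All new values of q that contain p together with the inserted element x."""
    new_p = frozenset(p) | {x}
    return [q_new for q_new in powerset(universe) if new_p <= q_new]


universe = {1, 2, 3, 4}
p, q, x = {1}, {1, 2}, 3
candidates = admissible_q_values(p, x, universe)
# The intuitively preferred value: just add x to the old q.
preferred = frozenset(q) | {x}
```

Among the candidates is the preferred value, but also values such as {1, 3} that silently drop the element 2 from q, or {1, 2, 3, 4} that add an unrelated element.<br />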
On the other hand, multi-valued dependencies, which concern only one set-valued state variable,<br />
lead to preconditions, although we might expect additional changes instead. In other<br />
words:<br />
For a database transition on $Y \subseteq X$ the GCS approach is too restrictive on Y.<br />
This demonstrates that the specialization order might still be too coarse for enforcement<br />
purposes. One possible solution originating from the work in [5] is to choose an order based<br />
on $\delta$-constraints. We shall follow this idea in Section 6.3.<br />
6.3 Weaker Notions of Effect Preservation<br />
Intuitively there exist various enforcement strategies with respect to basic operations that<br />
could overcome the rigidity of GCSs. E.g., consider an insertion of a new tuple into a relation:<br />
- For a functional dependency we may enforce consistency either by adding a precondition<br />
(the choice made in the practical example in [5]) or by propagating the deletion of other<br />
tuples.<br />
- For a multivalued dependency we either propagate further insertions or add a precondition.<br />
- For an inclusion dependency we may propagate an insertion into the other relation or add<br />
a precondition.<br />
- For an exclusion dependency we may propagate the deletion of tuples in the other relation<br />
or add a precondition.<br />
For the case of a deletion of a tuple the alternatives are similar.<br />
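The two alternatives named for a functional dependency can be sketched as follows. The sketch assumes a binary relation given as a set of (key, value) pairs with the dependency key -> value; the function names are hypothetical, and returning None stands in for a rollback.<br />

```python
def insert_with_precondition(relation, new_tuple):
    """Insert only if the dependency key -> value stays valid; otherwise return None."""
    key, value = new_tuple
    if any(k == key and v != value for k, v in relation):
        return None  # the precondition fails, so the insertion is rejected
    return relation | {new_tuple}


def insert_with_deletion(relation, new_tuple):
    """Insert and propagate the deletion of all tuples that conflict with the new one."""
    key, value = new_tuple
    kept = {(k, v) for k, v in relation if k != key or v == value}
    return kept | {new_tuple}


relation = {(1, 10), (2, 20)}
rejected = insert_with_precondition(relation, (1, 30))
repaired = insert_with_deletion(relation, (1, 30))
```

Both functions agree whenever the new tuple causes no conflict; they differ only in how a conflict is resolved.<br />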
Let us now try to characterize the relationship between the original database transition<br />
and those resulting from such rewriting efforts in order to find a weaker notion of effect<br />
preservation that allows us to overcome the problems we have with GCSs.<br />
In general the effects of a database transition T may be expressed by transition constraints.<br />
Just take the characterizing predicate of a subset of $\Delta(T)$. Then T is certainly consistent<br />
with a constraint chosen in that way. Therefore, we introduce the notion of a $\delta$-constraint,<br />
i.e. a transition constraint that is satisfied by a database transition S [5]:<br />
Definition 6.4. Let S be a database transition on $X = \{x_1, \dots, x_n\}$. A $\delta$-constraint for S<br />
is a transition constraint J on X such that $\{x'_1/x_1, \dots, x'_n/x_n\}.wlp(S')(J)$ holds, where $S'$<br />
results from S by renaming all $x_i$ to $x'_i$.<br />
□<br />
Example 6.1. Look at the insertion S of a new tuple t into a relation r. Then the<br />
following formulae are $\delta$-constraints for S:<br />
- $t \in r'$<br />
- $\forall u.\; u \in r \Rightarrow u \in r'$<br />
- $\forall u.\; u \in q \Leftrightarrow u \in q'$ for all relation schemata $q \neq r$<br />
- $\forall u.\; u \neq t \wedge u \in r' \Rightarrow u \in r$ □<br />
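The four formulae can be checked mechanically for one concrete insertion. The following sketch assumes states given as dicts from relation names to frozensets of tuples, with a single additional relation q; the labels of the constraints and all function names are made up for the illustration.<br />

```python
def insert_tuple(state, t):
    """The insertion of tuple t into relation r; all other relations stay unchanged."""
    after = dict(state)
    after["r"] = state["r"] | {t}
    return after


def delta_constraints(t):
    """The four transition constraints of the example as predicates on (before, after)."""
    return {
        "t is in r'": lambda before, after: t in after["r"],
        "r is kept": lambda before, after: before["r"] <= after["r"],
        "others unchanged": lambda before, after: all(
            before[name] == after[name] for name in before if name != "r"),
        "only t is new": lambda before, after: all(
            u in before["r"] for u in after["r"] if u != t),
    }


before = {"r": frozenset({(1, 2)}), "q": frozenset({(5, 6)})}
t = (3, 4)
checks = delta_constraints(t)

after = insert_tuple(before, t)
insert_results = {name: check(before, after) for name, check in checks.items()}

# A different transition that overwrites r with {t} violates the second constraint.
overwritten = dict(before, r=frozenset({t}))
overwrite_results = {name: check(before, overwritten) for name, check in checks.items()}
```

A single run over one state is of course no proof that the formulae hold for every execution; it only illustrates how such constraints relate before-states and after-states.<br />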
If S is defined on $Y \subseteq X$ and we require all $\delta$-constraints J of S with $fr(J) \subseteq Y \cup Y'$ to be<br />
also $\delta$-constraints for T, then T will be a specialization of S. The converse also holds. Thus,<br />
we may replace the specialization condition by the preservation of certain $\delta$-constraints.<br />
6.3.1 Maximal Consistent Effect Preservers<br />
We have already seen that we may always associate a database transition S(J) with each<br />
transition constraint J. In order to preserve J we must require a transition to specialize S(J). The basic<br />
idea of the tailored operational approach is now to consider not all $\delta$-constraints, but only<br />
some of them. Thus, we no longer build the GCS of S with respect to I, but the GCS of<br />
some S(J).<br />
If some $\delta$-constraints of S are omitted in J, then S(J) will allow executions that do not<br />
occur in any specialization of S. In this way, we can capture the reasonable changes that<br />
were listed at the beginning of this section. However, taking any such $\delta$-constraint is much<br />
too weak. S(J) should only add executions that are consistent with I. This justifies defining<br />
$\delta$-constraints that are compatible with a given static constraint I on X in the sense that<br />
building the GCS $S(J)_I$ does not increase partiality.<br />
Definition 6.5. A $\delta$-constraint J for a database transition S is compatible with a static<br />
constraint I iff $wp(S(J)_I)(false) \Rightarrow wp(S(J))(false)$ holds.<br />
□<br />
Example 6.2. It is easy to see that each of the $\delta$-constraints in the previous example is<br />
compatible with I chosen to be a multivalued dependency. Furthermore, the conjunction<br />
of three of these constraints is also compatible with I, but the conjunction of all four $\delta$-constraints<br />
is not.<br />
□<br />
The last example suggests considering the implication order on $\delta$-constraints. We say that<br />
$J_1$ is stronger than $J_2$ iff $J_1 \Rightarrow J_2$ holds. Unfortunately there is no smallest $\delta$-constraint<br />
compatible with I, so we cannot consider the "strongest" I-compatible $\delta$-constraint for S,<br />
but we may consider minimal elements in this order.<br />
Definition 6.6. A $\delta$-constraint J for S is low with respect to I iff it is I-compatible and<br />
there is no strictly stronger I-compatible $\delta$-constraint.<br />
□<br />
Now we are prepared to define maximal consistent effect preservers for a database transition S.<br />
For these we choose a low $\delta$-constraint J which formalizes an effect of S to be preserved. Then<br />
we take a consistent database transition $S_I$ that preserves this effect, but remains undefined<br />
wherever S is undefined. Finally, we require $S_I$ to be a greatest database transition with<br />
these properties with respect to the specialization order.<br />
Definition 6.7. Let S be a database transition and I a static constraint on X. Let J be a<br />
low $\delta$-constraint of S with respect to I. A database transition $S_I$ on X is called a maximal<br />
consistent effect preserver (MCE) of S with respect to I iff<br />
(i) J is a $\delta$-constraint for $S_I$,<br />
(ii) $wp(S)(false) \Rightarrow wp(S_I)(false)$ holds,<br />
(iii) $S_I$ is consistent with respect to I and<br />
(iv) any other database transition T with these properties specializes $S_I$.<br />
□<br />
Note that in this definition the state space on which S is defined is no longer important. It<br />
"vanishes" inside the chosen J. Then it is easy to see that the informal enforcement strategies<br />
at the beginning of this section are captured by MCEs for basic database transitions.<br />
Furthermore, property (iv) employs the specialization order $\sqsubseteq$ again. This seems<br />
surprising at first, but it turns out to be a natural definition, as shown in the<br />
following lemma, which follows directly from the definitions.<br />
Lemma 6.8. Let S be a database transition and I a static constraint on X. Let J be a low<br />
$\delta$-constraint of S with respect to I. Then $wp(S)^*(true) \to S(J)_I$ is the MCE with respect to<br />
I. □<br />
From the lemma we may draw first conclusions:<br />
- For a chosen low $\delta$-constraint with respect to I the MCE $S_I$ always exists and is uniquely<br />
determined (up to semantic equivalence) by S, I and J.<br />
- MCEs are closely related to GCSs. Apart from the precondition $wp(S)^*(true)$ the MCE<br />
is the GCS of a slightly extended database transition, i.e. possible changes have been<br />
incorporated into S(J).<br />
The lemma suggests applying the theory of GCS construction from Section 6.2 to the construction<br />
of MCEs. This idea, however, is misleading, since there is no effective way to construct<br />
S(J). Instead, we shall investigate effective MCE construction below. On the other hand, we<br />
can show that commutativity also holds for MCEs.<br />
Proposition 6.9. For static constraints $I_1$, $I_2$ each preconditioned MCE $I_1 \wedge I_2 \to S_{I_1 \wedge I_2}$<br />
is semantically equivalent to $I_1 \wedge I_2 \to (S_{I_1})_{I_2}$ and vice versa.<br />
□<br />
6.3.2 Effective MCE Construction<br />
Let us now ask for the effective construction of MCEs for complex database transitions. Again<br />
a naive approach, replacing just basic operations such as insertions and deletions by some<br />
of their MCEs, leads to wrong results, but we observe that an MCE $S_I$ for a chosen low<br />
$\delta$-constraint J is a specialization of $S(J)'_I$, a database transition that is basically built by<br />
replacing basic database transitions by their GCSs. Hence, it seems promising not to consider<br />
the replacement by GCSs, but by selected MCEs.<br />
As in the case of GCS construction we have to require I-reducedness, the purely technical<br />
condition which excludes self-repairing within sequences [5]. Then it can be shown that for<br />
I-reduced database transitions each MCE is itself a specialization of the database transition<br />
$(S_I)'$, a database transition that is basically built by replacing basic database transitions by<br />
MCEs (upper bound theorem). Then it turns out that adding a precondition gives an MCE<br />
(main theorem). Thus, we obtain the following result:<br />
Theorem 6.10. Let I be a static constraint on X and S some I-reduced database transition.<br />
Assume $X = \{x_1, \dots, x_n\}$. Let $(S_I)'$ result from S by first replacing each restricted choice<br />
$S_1 \boxtimes S_2$ by $S_1 \,\Box\, (wlp(S_1)(false) \to S_2)$ and then each basic transition in S by one of its MCEs<br />
with respect to I. For a disjoint copy $\{z_1, \dots, z_n\}$ of X define<br />
$P((S_I)') \equiv \{z_1/x_1, \dots, z_n/x_n\}.wlp(T)(wlp(S)^*(z_1 = x_1 \wedge \dots \wedge z_n = x_n))$,<br />
where T results from $(S_I)'$ by renaming all $x_i$ to $z_i$. Then<br />
$S_I = wp(S)^*(true) \to (P((S_I)') \to (S_I)')$<br />
is an MCE for S with respect to I.<br />
□<br />
This theorem again requires some informal explanation. Its basic impact is the reduction<br />
of the MCE construction problem to basic operations. Practically this means choosing an<br />
enforcement strategy for basic operations by means of an MCE. Then the theorem shows<br />
how to construct a corresponding MCE for any complex operation, i.e. for any "intended<br />
transaction".<br />
This also works if alternatives for MCEs of basic operations, i.e. insertions, deletions etc.,<br />
are permitted. In this case each MCE combination can be used in the theorem to define MCEs<br />
of complex transitions.<br />
If we take the theorem together with the commutativity result mentioned beforehand,<br />
this gives effective means for MCE construction for arbitrary sets of constraints and hence<br />
for consistency enforcement in the tailored computational approach.<br />
6.4 Application Example<br />
Let us now look at a simple application example for the (tailored) computational approach.<br />
We consider a simple MCE computation that is similar to the computation of a GCS branch<br />
in [5]. Consider the state space $X = \{x_1, x_2 :: FSET(INT \times INT)\}$ where $FSET(\cdot)$ is the<br />
finite set type constructor. Moreover, consider the static constraints:<br />
$I_1 \equiv map(\pi_1)(x_1) \subseteq map(\pi_1)(x_2)$<br />
$I_2 \equiv \forall x, y :: INT \times INT.\; x \in x_2 \wedge y \in x_2 \wedge \pi_2(x) = \pi_2(y) \Rightarrow \pi_1(x) = \pi_1(y)$<br />
$I_3 \equiv \pi_2(x_1) \cap \pi_2(x_2) = \emptyset$<br />
These are examples of an inclusion dependency, a functional dependency and an exclusion<br />
dependency.<br />
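For experimentation the three constraints can be written as executable predicates. The sketch below assumes that x1 and x2 are finite Python sets of integer pairs and reads a projection applied to a set as the set of the corresponding components; the names `proj`, `i1`, `i2` and `i3` are hypothetical.<br />

```python
def proj(i, relation):
    """Set of i-th components (1-based) of a set of pairs."""
    return {t[i - 1] for t in relation}


def i1(x1, x2):
    """Inclusion dependency: first components of x1 also occur as first components in x2."""
    return proj(1, x1) <= proj(1, x2)


def i2(x1, x2):
    """Functional dependency on x2: the second component determines the first."""
    return all(x[0] == y[0] for x in x2 for y in x2 if x[1] == y[1])


def i3(x1, x2):
    """Exclusion dependency: x1 and x2 share no second component."""
    return proj(2, x1).isdisjoint(proj(2, x2))


x1 = {(1, 10)}
x2 = {(1, 20), (2, 30)}
all_hold = i1(x1, x2) and i2(x1, x2) and i3(x1, x2)
```

Changing a single tuple is enough to violate each constraint in turn, e.g. x1 = {(3, 10)} for the first, x2 = {(1, 20), (2, 20)} for the second and x1 = {(1, 20)} for the third.<br />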
Example 6.3. Let the state space and constraints be as above. Now consider the $\{x_1\}$-operation<br />
$S(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\}$. Let us take the constraints in the given<br />
order.<br />
Step 1. First consider the inclusion constraint $I_1$. We dispense with the proof of $I_1$-reducedness.<br />
S is a deterministic basic assignment that can be replaced by its MCE with respect to $I_1$ and<br />
the low $\delta$-constraint<br />
$J \equiv x'_1 = x_1 \cup \{(a, b)\} \wedge \exists c.\; x'_2 = x_2 \cup \{(a, c)\}$.<br />
Then we compute $(S_I)'(a, b :: INT) =$<br />
$x_1 := x_1 \cup \{(a, b)\} ; (a \notin \pi_1(x_2) \to @\, c :: INT \bullet x_2 := x_2 \cup \{(a, c)\} \boxtimes skip)$,<br />
which is an X-operation with $P((S_I)') \Leftrightarrow true$. Define this as the new S.<br />
Step 2. Now consider the invariant $I_2$. Again the reducedness proof is omitted. We have<br />
to remove the restricted choice and to replace the basic assignment to $x_2$ by the MCE with<br />
respect to $I_2$ and the low $\delta$-constraint<br />
$J \equiv x'_1 = x_1 \wedge (x'_2 = x_2 \cup \{(a, c)\} \vee x'_2 = x_2)$.<br />
This gives<br />
$(a \notin \pi_1(x_2) \to c \notin \pi_2(x_2) \to x_2 := x_2 \cup \{(a, c)\}) \,\Box\, (a \in \pi_1(x_2) \to skip)$.<br />
Then we compute $P(S'_I) \Leftrightarrow true$. Hence the new S is (after some rearrangements)<br />
$S(a, b :: INT) = x_1 := x_1 \cup \{(a, b)\} ;$<br />
$((a \notin \pi_1(x_2) \to @\, c :: INT \bullet c \notin \pi_2(x_2) \to x_2 := x_2 \cup \{(a, c)\})$<br />
$\Box\; a \in \pi_1(x_2) \to skip)$.<br />
Step 3. Now regard the exclusion invariant $I_3$. Reducedness holds, but we omit the proof.<br />
Replace $S_1 = x_1 := x_1 \cup \{(a, b)\}$ in $S$ by the MCE<br />
$x_1 := x_1 \cup \{(a, b)\} \,;\, x_2 := x_2 - \{x \in x_2 \mid \pi_2(x) = b\}$<br />
and analogously replace $S_2 = x_2 := x_2 \cup \{(a, c)\}$ by the MCE<br />
$x_2 := x_2 \cup \{(a, c)\} \,;\, x_1 := x_1 - \{x \in x_1 \mid \pi_2(x) = c\}$.<br />
Then we compute<br />
$P(S_I') \Leftrightarrow b \notin \pi_2(x_2) \wedge (a \notin \pi_1(x_2) \Rightarrow \forall c :: \mathrm{INT}.\ (c \notin \pi_2(x_2) \Rightarrow c \notin \pi_2(x_1) \cup \{b\}))$,<br />
hence the final result is (after some rearrangements) semantically equivalent to<br />
$S_I(a, b :: \mathrm{INT}) = b \notin \pi_2(x_2) \to x_1 := x_1 \cup \{(a, b)\} \,;$<br />
$((a \notin \pi_1(x_2) \to @ c :: \mathrm{INT} \bullet c \notin \pi_2(x_2) \wedge c \notin \pi_2(x_1) \to x_2 := x_2 \cup \{(a, c)\})$<br />
$\Box\; a \in \pi_1(x_2) \to \mathrm{skip})$.<br />
$\sqcap\!\!\!\!\sqcup$<br />
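The resulting operation $S_I$ of Example 6.3 can be simulated on finite relations. The sketch below is only an illustration under stated assumptions: relations are Python sets of integer pairs, the unbounded choice of $c$ is resolved deterministically by taking the smallest admissible non-negative integer (the text allows any admissible $c$), and an undefined operation is signalled by returning `None`. The names `s_i`, `pi1` and `pi2` are hypothetical.

```python
from itertools import count


def pi1(rel):
    return {a for (a, _) in rel}


def pi2(rel):
    return {b for (_, b) in rel}


def s_i(x1, x2, a, b):
    """Simulate S_I(a, b); return the new (x1, x2) or None if undefined."""
    if b in pi2(x2):
        return None  # guard "b not in pi2(x2)" fails: no execution
    x1 = x1 | {(a, b)}
    if a not in pi1(x2):
        # Assumption of this sketch: pick the smallest admissible c >= 0.
        # The search terminates because both relations are finite.
        c = next(c for c in count() if c not in pi2(x2) and c not in pi2(x1))
        x2 = x2 | {(a, c)}
    # otherwise a already occurs in pi1(x2) and the second part is skip
    return x1, x2
```

Starting from two empty relations, `s_i(set(), set(), 1, 0)` adds `(1, 0)` to the first relation and, since 0 is already used as a second component there, `(1, 1)` to the second one.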
6.5 Conclusion<br />
We investigated major problems of the computational approach to consistency enforcement.<br />
These problems are related to the specialization order chosen by this approach. We examined<br />
intuitive strategies for consistency enforcement with respect to basic operations and selected<br />
classes of constraints. In these cases the strict effect preservation property of the specialization<br />
approach can be restricted to so-called $I$-compatible $\delta$-constraints. In fact, choosing such<br />
a constraint that is minimal with respect to the implication order gives rise to the definition<br />
of maximal consistent effect preservers (MCEs) as a natural approach to consistency<br />
enforcement.<br />
Fortunately, MCEs are closely related to greatest consistent specializations (GCSs) that<br />
were studied before. Each MCE is given by the GCS of a slightly extended transition and<br />
a precondition. This does not help directly in constructing MCEs, but it turns out that<br />
MCE construction can be done in the same way as GCS construction, i.e., the consistency<br />
enforcement problem can be reduced to finding MCEs for basic operations, which is only a<br />
problem of practical calculation.<br />
Thus, the tailored approach presented in this paper may be considered as a general solution<br />
to (static) consistency enforcement. Moreover, as indicated in [8] an efficient and flexible<br />
implementation can be achieved by the use of linguistic reflection. The only problem that<br />
might be critical concerns the technical prerequisite of $I$-reducedness, which excludes badly<br />
written database transitions. As shown in [7] for selected classes of constraints it is even<br />
possible to rewrite transitions in such a way that $I$-reducedness always holds.<br />
References for Chapter 6<br />
1. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation of Production Rules for Integrity<br />
Maintenance. ACM TODS 19(3), 1994, 367-422.<br />
2. P. Fraternali, S. Paraboschi: Ordering and Selecting Production Rules for Constraint Maintenance:<br />
Complexity and Heuristic Solution. To appear in IEEE TKDE.<br />
3. G. Nelson: A Generalization of Dijkstra's Calculus. ACM TOPLAS 11(4), 1989, 517-561.<br />
4. K.-D. Schewe, B. Thalheim: Consistency Enforcement in Active Databases. In S. Chakravarty,<br />
J. Widom (Eds.): Research Issues in Data Engineering - Active Databases. Workshop Proceedings.<br />
Houston, February 1994.<br />
5. K.-D. Schewe, B. Thalheim: A Computational Approach to Consistency Enforcement. Submitted<br />
for publication.<br />
6. K.-D. Schewe, B. Thalheim, J. Schmidt, I. Wetzel: Integrity Enforcement in Object Oriented<br />
Databases. In U. W. Lipeck, B. Thalheim (Eds.): Modelling Database Dynamics. Springer Workshops<br />
in Computing. Volkse 1992, 174-195.<br />
7. K.-D. Schewe: Specification and Development of Correct Relational Database Programs. Technical<br />
Report. Submitted for publication.<br />
8. K.-D. Schewe, D. Stemple, B. Thalheim: Higher Level Genericity in Object Oriented Databases. In:<br />
Proc. Conference on the Management of Data (COMAD '94). Bangalore (India), December 1994.<br />
Appendix<br />
6.6 The Predicate Transformer Calculus<br />
This section gives a brief review of Dijkstra's classical calculus [3]. Assume that $S$ is a program<br />
specification and that $X$ is the finite set of variables occurring in $S$. We usually call $X$ a state<br />
space. If $D$ is a set of values (or more generally a domain), then a state is simply a variable<br />
assignment $\sigma : X \to D$. Let $\Sigma$ be the set of all such states (more generally: a power domain).<br />
Then the overall meaning of $S$ can be given by a subset $\Delta(S) \subseteq \Sigma \times (\Sigma \cup \{\infty\})$, where<br />
$(\sigma, \tau) \in \Delta(S)$ means that starting $S$ in the initial state $\sigma$ may lead to the final state $\tau$, and $\infty$<br />
represents non-termination. This description does not depend on the style of the specification<br />
$S$. Of course, this trivial semantics description comprises non-determinism and partiality.<br />
Consider an infinitary logic $L$ and assume that there is an equality predicate. Regard formulae<br />
$R$ with free variables in $X$. These are called $X$-predicates. Let $F(X)$ be the set of all<br />
$X$-predicates. Let $St = (D, \omega)$ be a fixed structure for the interpretation of $L$ with semantic<br />
domain $D$ and assume that $St$ satisfies the domain closure property, i.e. for each $d \in D$ there<br />
is some closed term $t \in T(L)$ with $\omega(t) = d$. Obviously, a state $\sigma$ is sufficient to interpret an<br />
$X$-predicate. Write $\models_\sigma R$ if interpreting $R$ in state $\sigma$ yields true. Now define two mappings<br />
$wlp(S)$ and $wp(S)$ on equivalence classes of $X$-predicates:<br />
$\models_\sigma wlp(S)(R)$ iff $(\sigma, \tau) \in \Delta(S) \wedge \tau \neq \infty \Rightarrow \models_\tau R$ and (6.71)<br />
$\models_\sigma wp(S)(R)$ iff $(\sigma, \tau) \in \Delta(S) \Rightarrow \tau \neq \infty \wedge \models_\tau R$. (6.72)<br />
We call $w(l)p(S)(R)$ the weakest (liberal) precondition of $S$ for the postcondition $R$. Note that<br />
this definition precisely formalizes the informal meaning of $wlp(S)$ and $wp(S)$. Moreover, the<br />
predicate transformers are uniquely determined by $\Delta(S)$ up to equivalence.<br />
Theorem 6.11. For a given program specification $S$ the predicate transformers $wlp(S)$ and<br />
$wp(S)$ exist. Moreover, they satisfy<br />
$wp(S)(R) \Leftrightarrow wlp(S)(R) \wedge wp(S)(\mathrm{true})$ (pairing condition) and (6.73)<br />
$wlp(S)(\bigwedge_{i \in I} R_i) \Leftrightarrow \bigwedge_{i \in I} wlp(S)(R_i)$ (universal conjunctivity). (6.74)<br />
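For a finite state space, the definitions (6.71) and (6.72) and the two properties of Theorem 6.11 can be checked directly. The following Python sketch assumes that the semantics of a specification is given explicitly as a finite set `DELTA` of pairs, with the string `INF` standing for non-termination; the relation `DELTA`, the three states and the helper names are hypothetical and chosen only for illustration. A finite check of this kind illustrates the theorem, it does not prove it.

```python
INF = "inf"  # marker for the non-terminating outcome


def outcomes(delta, s):
    """All outcomes that delta relates to the initial state s."""
    return [t for (s0, t) in delta if s0 == s]


def wlp(delta, r):
    """Weakest liberal precondition: every terminating outcome satisfies r."""
    return lambda s: all(r(t) for t in outcomes(delta, s) if t != INF)


def wp(delta, r):
    """Weakest precondition: every outcome terminates and satisfies r."""
    return lambda s: all(t != INF and r(t) for t in outcomes(delta, s))


STATES = [0, 1, 2]
# Hypothetical semantics: from 0 we may reach 1 or diverge, from 1 we reach 2,
# and state 2 has no outcome at all.
DELTA = {(0, 1), (0, INF), (1, 2)}


def pairing_holds(r):
    true = lambda t: True
    return all(
        wp(DELTA, r)(s) == (wlp(DELTA, r)(s) and wp(DELTA, true)(s))
        for s in STATES
    )


def conjunctivity_holds(preds):
    conj = lambda t: all(p(t) for p in preds)
    return all(
        wlp(DELTA, conj)(s) == all(wlp(DELTA, p)(s) for p in preds)
        for s in STATES
    )
```

In state 0 the liberal precondition of "t >= 1" holds while the weakest precondition does not, because one outcome diverges; this is exactly the difference captured by the conjunct $wp(S)(\mathrm{true})$ in the pairing condition.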
The following inversion theorem shows that universal conjunctivity and the pairing condition<br />
already suffice to find a specification $S$ with corresponding predicate transformers. For this<br />
recall that the dual $f^*$ of a predicate transformer $f$ is defined as $f^*(R) = \neg f(\neg R)$.<br />
Theorem 6.12. Let $flp$ and $fp$ be predicate transformers satisfying (6.73) and (6.74) in<br />
place of $wlp(S)$ and $wp(S)$. Then for a program specification $S$ with<br />
$\Delta(S) = \{(\sigma, \tau) \mid \models_\sigma flp^*(P_\tau)\} \cup \{(\sigma, \infty) \mid \models_\sigma fp^*(\mathrm{false})\}$<br />
we have $wlp(S)(R) \Leftrightarrow flp(R)$ and $wp(S)(R) \Leftrightarrow fp(R)$.<br />
$\sqcap\!\!\!\!\sqcup$<br />
In [3] recursion has been investigated with respect to the order $\sqsubseteq$ defined by $S \sqsubseteq T$ iff<br />
$wlp(T)(R) \Rightarrow wlp(S)(R)$ and $wp(S)(R) \Rightarrow wp(T)(R)$ hold for all $X$-predicates $R$. Therefore,<br />
for monotonic $f$ with respect to $\sqsubseteq$ the program specification $T = \mu S.f(S)$ can be defined as<br />
a least fixpoint and $wlp(T)$ (resp. $wp(T)$) is defined by conjunction (disjunction).<br />
Finally, regard the following language of guarded commands built recursively from the<br />
following constructs.<br />
(i) assignments $x := E$ for a variable $x$ and a term $E$,<br />
(ii) skip, fail, loop,<br />
(iii) sequential composition $S_1 ; S_2$, choice $S_1 \,\Box\, S_2$, projection $@x \bullet S$, guard $P \to S$ and<br />
restricted choice $S_1 \boxtimes S_2$, where $P$ is a well-formed formula and $x$ is a variable.<br />
The informal meaning of an assignment is the usual one. skip is an operation that "does<br />
nothing", loop is always defined, but never terminates, and fail is always undefined. The<br />
latter two commands are only justified as least elements with respect to the Nelson order<br />
(used for recursion) and the specialization order.<br />
The intended meaning of a sequence is also the standard one: first execute $S_1$, then $S_2$. A<br />
guard defines a precondition. If $P$ is satisfied, $S$ is executed, otherwise there is no execution.<br />
Choice means demonic choice, i.e., choose any of $S_1$ or $S_2$ as long as it is defined, even if<br />
this leads to non-termination. Restricted choice on the other hand prefers the execution of<br />
$S_1$ unless it is undefined, in which case $S_2$ is taken. Finally, the unbounded choice operator<br />
introduces a new variable $x$ and executes $S$ on the state space extended by $x$.<br />
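For this language the axiomatic semantics can be defined by<br />
$w(l)p(x := E)(R) \Leftrightarrow \{x/E\}.R$,<br />
$w(l)p(\mathrm{skip})(R) \Leftrightarrow R$,<br />
$w(l)p(\mathrm{fail})(R) \Leftrightarrow \mathrm{true}$,<br />
$wlp(\mathrm{loop})(R) \Leftrightarrow \mathrm{true}$ and $wp(\mathrm{loop})(R) \Leftrightarrow \mathrm{false}$,<br />
$w(l)p(S_1 ; S_2)(R) \Leftrightarrow w(l)p(S_1)(w(l)p(S_2)(R))$,<br />
$w(l)p(P \to S)(R) \Leftrightarrow P \Rightarrow w(l)p(S)(R)$,<br />
$w(l)p(S_1 \,\Box\, S_2)(R) \Leftrightarrow w(l)p(S_1)(R) \wedge w(l)p(S_2)(R)$,<br />
$w(l)p(S_1 \boxtimes S_2)(R) \Leftrightarrow w(l)p(S_1)(R) \wedge (wp(S_1)(\mathrm{false}) \Rightarrow w(l)p(S_2)(R))$ and<br />
$w(l)p(@x \bullet S)(R) \Leftrightarrow \forall x.\ w(l)p(S)(R)$.<br />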
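The axioms for sequence, guard, choice and restricted choice can be cross-checked against a small relational model. The Python sketch below is an illustration, not part of the original calculus: a command is modelled as a function from a state to the set of its possible outcomes, `BOTTOM` stands for non-termination, "undefined" is modelled by the empty outcome set, and states are plain integers. All names are assumptions of this sketch; recursion and unbounded choice are left out.

```python
BOTTOM = "non-termination"  # the outcome written as infinity in the text


def skip(s):
    return {s}


def fail(s):
    return set()  # always undefined: no outcome at all


def loop(s):
    return {BOTTOM}  # always defined, never terminates


def assign(f):
    """Assignment, given as a function computing the successor state."""
    return lambda s: {f(s)}


def seq(c1, c2):
    def run(s):
        out = set()
        for t in c1(s):
            out |= {BOTTOM} if t == BOTTOM else c2(t)
        return out
    return run


def guard(p, c):
    return lambda s: c(s) if p(s) else set()


def choice(c1, c2):
    return lambda s: c1(s) | c2(s)


def rchoice(c1, c2):
    # restricted choice: take c1 unless it is undefined in s
    return lambda s: c1(s) or c2(s)


def wlp(c, r):
    return lambda s: all(r(t) for t in c(s) if t != BOTTOM)


def wp(c, r):
    return lambda s: all(t != BOTTOM and r(t) for t in c(s))
```

With these definitions one can, for example, compare `wp(seq(c1, c2), r)(s)` with `wp(c1, wp(c2, r))(s)` for all states of a small state space. Modelling restricted choice by `c1(s) or c2(s)` relies on the empty set being falsy in Python, which matches the reading that $wp(S_1)(\mathrm{false})$ holds exactly where $S_1$ has no outcome.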
Then for all expressions $f(S)$ built from the constructors above $f$ will be monotonic with<br />
respect to $\sqsubseteq$ and we get<br />
$wlp(\mu S.f(S))(R) \Leftrightarrow \bigwedge_{\alpha} wlp(f^{\alpha}(\mathrm{loop}))(R)$ and $wp(\mu S.f(S))(R) \Leftrightarrow \bigvee_{\alpha} wp(f^{\alpha}(\mathrm{loop}))(R)$,<br />
where $\alpha$ ranges in both cases over the ordinal numbers.<br />
For any guarded command $S$ we may also consider the conjugate predicate transformers<br />
$wp(S)^*$ and $wlp(S)^*$, which are defined by<br />
$w(l)p(S)^*(R) = \neg w(l)p(S)(\neg R)$.<br />
6.7 I-reducedness<br />
For all the constructors for a guarded command $S$ except the sequence, each computation<br />
of the resulting complex operation already occurs as a computation of one of the involved<br />
components. Therefore we may expect that GCS construction can be done componentwise.<br />
For sequences, however, this is not the case.<br />
Let us now define the technical $I$-reducedness condition for sequences. Assume that the<br />
types of values are understood from the context.<br />
Definition 6.13. Let $S = S_1 ; S_2$ be a $Y$-operation such that $S_i$ is a $Y_i$-operation for $Y_i \subseteq Y$<br />
($i = 1, 2$). Let $I$ be some $X$-invariant with $Y \subseteq X$. Let $X - Y_1 = \{y_1, \ldots, y_m\}$, $Y_1 =$<br />
$\{x_1, \ldots, x_l\}$ and assume that $\{x_1', \ldots, x_l'\}$ is a disjoint copy of $Y_1$ disjoint also from $X$. Then<br />
$S$ is called $\delta$-$I$-reduced iff the following two conditions hold:<br />
(i) For all states $\sigma$ with $\models_\sigma \neg I$ we have, if<br />
$P_\sigma \Rightarrow \{x_1/x_1', \ldots, x_l/x_l'\}.(\forall \beta_i\,(i = 1, \ldots, m).\ \{y_1/\beta_1, \ldots, y_m/\beta_m\}.I)$<br />
is a $\delta$-constraint for $S_1$, then it is also a $\delta$-constraint for $S$.<br />
(ii) For all states $\sigma$ we have, if<br />
$P_\sigma \Rightarrow \{x_1/x_1', \ldots, x_l/x_l'\}.(\forall \beta_i\,(i = 1, \ldots, m).\ \{y_1/\beta_1, \ldots, y_m/\beta_m\}.\neg I)$<br />
is a $\delta$-constraint for $S_1$, then it is also a $\delta$-constraint for $S$.<br />
Example 6.4. Take $X = \{x_1 :: \mathrm{FSET}(T), x_2 :: \mathrm{FSET}(T)\}$, $I \equiv x_1 \subseteq x_2$ and $S(x, y :: T) =$<br />
$S_1 ; S_2$ with $S_1 = x_2 := x_2 - \{x\}$ and $S_2 = x_1 := x_1 \cup \{y\}$.<br />
(i) A $\delta$-constraint for $S_1$ in the form $P_\sigma \Rightarrow D' \subseteq C'$ with $P_\sigma \equiv C = C' \wedge D = D'$ is<br />
$C = C' \wedge D = D' \wedge D' \subseteq C' \wedge x \notin D' \Rightarrow D' \subseteq C'$.<br />
Since Definition 6.13 additionally requires $\models_\sigma \neg I$, i.e. $D' \not\subseteq C'$, the conjunction of such<br />
constraints is true, which is also a $\delta$-constraint of $S$.<br />
(ii) Now take a $\delta$-constraint of $S_1$ in the form $P_\sigma \Rightarrow D' \not\subseteq C'$. Then we have<br />
$\{C'/C, D'/D\}.wlp(\{C/C', D/D'\}.S_1)(C = C' \wedge D = D' \Rightarrow D' \not\subseteq C') \Leftrightarrow$<br />
$C = C' \wedge D = D' \Rightarrow D \not\subseteq C - \{x\} \Leftrightarrow$<br />
$D' \not\subseteq C' - \{x\} \Leftrightarrow$<br />
$x \in D' \vee D' \not\subseteq C'$.<br />
Since Definition 6.13 additionally requires $\models_\sigma I$, i.e. $D' \subseteq C'$, the conjunction $J$ of such<br />
constraints is<br />
$x \in D \wedge D \subseteq C \Rightarrow D' \not\subseteq C'$.<br />
Then we compute<br />
$\{C'/C, D'/D\}.wlp(\{C/C', D/D'\}.S)(J) \Leftrightarrow$<br />
$x \in D \wedge D \subseteq C \Rightarrow D \not\subseteq (C - \{x\}) \cup \{y\} \Leftrightarrow$<br />
true for $x \neq y$,<br />
false for $x = y$.<br />
Hence $S$ is $\delta$-$I$-reduced only if $x \neq y$, but the operation<br />
$S'(x, y :: T) = (x \neq y \to S_1 ; S_2) \boxtimes S_2$<br />
is always $\delta$-$I$-reduced and semantically equivalent to $S$.<br />
$\sqcap\!\!\!\!\sqcup$<br />
We may extend this definition to arbitrary operations requiring all occurring sequences to be<br />
$\delta$-$I$-reduced.<br />
Definition 6.14. Let $S$ be an $X$-operation and $I$ some $Y$-invariant with $X \subseteq Y$. $S$ is called<br />
$I$-reduced iff the following holds:<br />
(i) If $S$ is one of fail, skip, loop or an assignment, then $S$ is always $I$-reduced.<br />
(ii) If $S = S_1 ; S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$ are $I$-reduced and $S$ is $\delta$-$I$-reduced.<br />
(iii) If $S$ is one of $P \to T$, $@y :: T_y \bullet T$, $S_1 \,\Box\, S_2$ or $S_1 \boxtimes S_2$, then $S$ is $I$-reduced iff $S_1$ and $S_2$<br />
or $T$ respectively are $I$-reduced.<br />
(iv) If $S = \mu T.f(T)$, then $S$ is $I$-reduced iff $f^{\alpha}(\mathrm{loop})$ is $I$-reduced for each ordinal number $\alpha$.<br />
Chapter 7<br />
Limits of Rule Triggering Systems<br />
for Integrity Maintenance in the<br />
Context of Transaction<br />
Specifications<br />
Contents<br />
7.1 Introduction<br />
7.2 Unrepairable Transitions<br />
7.3 Critical Paths<br />
7.4 Stratified Constraint Sets<br />
7.5 An Algorithm for Checking Stratification<br />
7.6 Locally Stratified Constraint Sets<br />
7.7 Complexity of Local Stratification<br />
7.8 Conclusion<br />
This chapter contains a reprint of<br />
K.-D. Schewe, B. Thalheim. Limits of Rule Triggering Systems for Integrity Maintenance<br />
in the Context of Transaction Specifications. Acta Cybernetica 1998 (to appear).<br />
Abstract. Integrity maintenance is considered one of the major application fields of rule<br />
triggering systems (RTSs). In the case of a given integrity constraint being violated by a<br />
database transition these systems trigger repairing actions. However, it will be shown that<br />
for any set of constraints there exist unrepairable transitions, which depend on the closure<br />
of the constraint set. This implies that integrity maintenance by RTSs is only possible if the<br />
constraint implication problem is decidable.<br />
Even if unrepairable transitions are excluded, this does not prevent the RTS from producing<br />
undesired behaviour. If constraints are written as sets (conjunctions) of simple ones in implicative<br />
normal form, the RTS behaves well if there is only one such constraint. In general, however, the<br />
rule triggering approach fails to solve the problem.<br />
Analyzing the behaviour of RTSs leads to the definition of critical paths in associated<br />
rule hypergraphs and the requirement of such paths being absent. It will be shown that this<br />
requirement can be satisfied if the underlying set of constraints is stratified, but this notion<br />
turns out to be too strong to be also necessary. A sufficient and necessary condition for the<br />
absence of critical paths is obtained if sets of constraints are required to be locally stratified.<br />
Keywords: active databases, integrity maintenance<br />
7.1 Introduction<br />
Active databases (ADBs) aim at extending relational (or object oriented) DBMS by rule<br />
triggering systems (RTSs), i.e. by sets of rules which on a given event and in the case of a<br />
condition being satisfied trigger actions on the database (ECA-rules). Events can be external<br />
events, time conditions or internal events resulting from operations on the database. Conditions<br />
are usually given by boolean queries that have to be evaluated against the database.<br />
The action part consists of a sequence of basic operations to insert, delete or update tuples<br />
(or objects respectively) in the database.<br />
The current research on ADBs (see e.g. [3]) is dominated by implementational aspects,<br />
whilst foundations of RTSs are seldom approached. The work in [1, 2, 4, 9, 10] and partly<br />
in [3] considers the problem of enforcing database integrity by the use of RTSs. The results<br />
concern the generation of repairing ECA-rules and partly the analysis of the resulting RTS.<br />
This analysis concentrates on the termination of the rule system, the independence of the final<br />
database state from the chosen selection order of the rules (confluence) and on consistency.<br />
These requirements are not sufficient for a reasonable rule behaviour, because it is easy<br />
to define an RTS that empties the database in case of any constraint violation. Therefore,<br />
we impose an additional requirement, which informally means that the intended effect of a<br />
transition may not be turned into its opposite by the RTS.<br />
In this short paper we analyze the limits of the rule triggering approach. For a given set<br />
of constraints in implicational normal form we first investigate the existence of unrepairable<br />
transitions. These are determined by the closure of the constraint set. It turns out that the<br />
decidability of the constraint implication problem is necessary for integrity maintenance by<br />
RTSs.<br />
Next we analyze how to obtain RTSs that definitely repair constraint violations by a<br />
(repairable) transition without invalidating its intended effect. Given an RTS we first associate<br />
with it a rule hypergraph which corresponds to the possible sequences of triggered rules. Next<br />
we define critical trigger paths in these hypergraphs that correspond to the propagation of<br />
conditions. Indeed it can be shown that the existence of a single critical trigger path makes<br />
the RTS work incorrectly for at least one transition.<br />
Finally, we analyze constraint sets in order to detect whether it is possible to define an<br />
RTS of repairing actions such that the critical trigger paths in its associated hypergraph can<br />
only invalidate unrepairable transitions. For this we first introduce stratified constraint sets<br />
that satisfy this condition. Since the converse is not true, we finally weaken the concept to<br />
locally stratified constraint sets, which gives a necessary and sufficient condition for the RTS<br />
to work correctly.<br />
7.2 Unrepairable Transitions<br />
In the following we consider the relational datamodel with integrity constraints given by<br />
formulae in implicative normal form<br />
$I \equiv p_1(x_1) \wedge \ldots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \ldots \vee q_m(y_m)$ (7.75)<br />
with predicate symbols $p_i$, $q_j$, which correspond either to a relation of the schema or are<br />
comparison predicates ($=$, $\neq$
Let us first demonstrate the insufficiency of a naive RTS approach by a simple example.<br />
In "real" applications the situation of Example 7.1 will not occur in such an obvious way,<br />
but there are always implied and in general not detectable constraints leading to analogous<br />
problems as shown in [6].<br />
Example 7.1. Take two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and<br />
$I_2 \equiv p(x) \wedge q(x) \Rightarrow \mathrm{false}$. This implies $p$ to be always empty, hence insertions into $p$ should<br />
be abolished. Then we obtain the following repairing rules:<br />
$R_1$: ON $\mathrm{insert}_p(x)$ IF $\neg I_1$ DO $\mathrm{insert}_q(x)$<br />
$R_2$: ON $\mathrm{delete}_q(x)$ IF $\neg I_1$ DO $\mathrm{delete}_p(x)$<br />
$R_3$: ON $\mathrm{insert}_p(x)$ IF $\neg I_2$ DO $\mathrm{delete}_q(x)$<br />
$R_4$: ON $\mathrm{insert}_q(x)$ IF $\neg I_2$ DO $\mathrm{delete}_p(x)$<br />
If we try to execute a transition $\mathrm{insert}_p(x)$ on a database state satisfying $q(x)$, then we<br />
successively trigger the rules $R_3$ and $R_2$ with the effect of only deleting $x$ from $q$. This contradicts<br />
the original intention of the transition.<br />
$\sqcap\!\!\!\!\sqcup$<br />
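The behaviour described in Example 7.1 can be reproduced with a small simulation. The following Python sketch is a simplification under stated assumptions: the database is a dictionary of two sets, every rule is checked only for the value that caused the event, rules are tried in the order $R_1, \ldots, R_4$, and each performed action is queued as a new event. The function names and the event queue are hypothetical and not taken from the paper. The loop is not guaranteed to terminate for arbitrary rule sets.

```python
from collections import deque


def violates_i1(db, x):
    """I_1 (p(x) implies q(x)) is violated for x."""
    return x in db["p"] and x not in db["q"]


def violates_i2(db, x):
    """I_2 (p(x) and q(x) imply false) is violated for x."""
    return x in db["p"] and x in db["q"]


# (name, event, condition, action); events and actions are (kind, relation)
RULES = [
    ("R1", ("insert", "p"), violates_i1, ("insert", "q")),
    ("R2", ("delete", "q"), violates_i1, ("delete", "p")),
    ("R3", ("insert", "p"), violates_i2, ("delete", "q")),
    ("R4", ("insert", "q"), violates_i2, ("delete", "p")),
]


def apply(db, op, x):
    kind, rel = op
    if kind == "insert":
        db[rel].add(x)
    else:
        db[rel].discard(x)


def run(db, op, x):
    """Execute op for value x, then fire rules until no event is pending."""
    fired = []
    apply(db, op, x)
    pending = deque([op])
    while pending:
        event = pending.popleft()
        for name, ev, cond, action in RULES:
            if ev == event and cond(db, x):
                apply(db, action, x)
                fired.append(name)
                pending.append(action)
    return fired
```

Running `run(db, ("insert", "p"), "a")` on `db = {"p": set(), "q": {"a"}}` fires $R_3$ and then $R_2$ and leaves both relations empty, so the inserted tuple is gone and only the deletion from q remains.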
In order to analyze the unintended behaviour in Example 7.1 consider a set $\Sigma$ of constraints in<br />
implicational normal form. Let $\Sigma^*$ denote the (semantic) closure, i.e. $\Sigma^* = \{I \mid \Sigma \models I\}$. Now<br />
let $I \in \Sigma^*$ be non-trivial, i.e. it does not hold in all database states. Write $I$ in implicational<br />
normal form<br />
$I \equiv p_1(x_1) \wedge \ldots \wedge p_n(x_n) \Rightarrow q_1(y_1) \vee \ldots \vee q_m(y_m)$<br />
and let $p_{i_1}, \ldots, p_{i_k}$ and $q_{j_1}, \ldots, q_{j_\ell}$ denote the relation symbols on the left and right hand<br />
sides of $I$ respectively. We may define a transition $T$ by<br />
$\mathrm{delete}_{q_{j_1}}(y_{j_1}) ; \ldots ; \mathrm{delete}_{q_{j_\ell}}(y_{j_\ell}) ; \mathrm{insert}_{p_{i_1}}(x_{i_1}) ; \ldots ; \mathrm{insert}_{p_{i_k}}(x_{i_k})$.<br />
If we start $T$ with values for the $x_i$ and $y_j$ such that the additional conditions on the left<br />
hand side of $I$ are satisfied, whilst the additional conditions on the right hand side are not, $T$<br />
will always reach a database state satisfying $\neg I$. This effect of $T$ is intentional and hence the<br />
only reasonable approach to integrity maintenance in this case is to disallow such transitions.<br />
More formally, the effect of a transition $T$ in a state $\sigma$ is given by the strongest (with<br />
respect to $\Rightarrow$) formula $E_\sigma(T) = \psi$ such that $\models_\sigma wp(T)(\psi)$ holds. Here $wp(T)(\psi)$ denotes<br />
the weakest precondition of $\psi$ under the transition $T$, i.e. starting $T$ in initial state $\sigma$ will<br />
reach a final state satisfying $\psi$.<br />
Since we only consider sequences of insertions and deletions, $E_\sigma(T)$ can always be written<br />
as a conjunction of literals, i.e. in negated implicational normal form, with the positive literals<br />
corresponding to insertions and the negative ones to deletions. In addition, we may consider<br />
the effect of a sequence $T ; RTS$, where $T$ is a transition and $RTS$ a system of rules. We say<br />
that $RTS$ invalidates the effect of $T$ iff $\not\models E_\sigma(T) \wedge E_\sigma(T ; RTS)$ holds for some state $\sigma$.<br />
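For sequences of insertions and deletions the effect can be represented as a finite map from ground atoms to truth values. The Python sketch below follows one possible reading of the definition, namely that the effect of $T$ is invalidated as soon as the rules flip at least one literal established by $T$; this reading, the state-independent treatment of the effect, and the names `effect` and `invalidates` are assumptions made for illustration only.

```python
def effect(ops):
    """Literals established by a sequence of (kind, relation, value) steps.

    The result maps each touched atom to True (inserted) or False (deleted);
    the last operation on an atom determines its literal.
    """
    literals = {}
    for kind, rel, val in ops:
        literals[(rel, val)] = (kind == "insert")
    return literals


def invalidates(transition, repairs):
    """True iff the repair steps flip a literal established by the transition."""
    before = effect(transition)
    after = effect(transition + repairs)
    return any(after[atom] != sign for atom, sign in before.items())
```

For the transition of Example 7.1, `invalidates([("insert", "p", "a")], [("delete", "q", "a"), ("delete", "p", "a")])` is true, because the repair steps of $R_3$ and $R_2$ turn the literal p(a) into its negation.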
Then it is justified to call a transition $T$ repairable with respect to the constraint set $\Sigma$<br />
iff $\neg E_\sigma(T) \notin \Sigma^*$ holds for at least one state $\sigma$. Then a complete terminating system<br />
$RTS$ of ECA-rules always invalidates the effect of a non-repairable transition $T$. Hence the<br />
problem is to detect (and exclude) non-repairable transitions. In order to decide whether a<br />
given transition $T$ is repairable or not, we must be able to decide whether $\neg E_\sigma(T)$ is in the<br />
closure $\Sigma^*$. Hence the implication problem for constraints must be decidable.<br />
Proposition 7.1. Let $\Sigma$ be a set of constraints. The problem of deciding whether a transition<br />
$T$ is repairable with respect to $\Sigma$ is equivalent to the constraint implication problem for $\Sigma$,<br />
i.e. the problem of deciding whether a given constraint $I$ is a member of $\Sigma^*$ or not. $\sqcap\!\!\!\!\sqcup$<br />
Proposition 7.1 defines the first limit on integrity maintenance by rule triggering systems. In<br />
the following sections we shall concentrate on repairable transitions.<br />
Note that our treatment ignores the termination problem. Non-terminating transitions<br />
have to be excluded as well, but this problem is independent of the repairability problem,<br />
since non-termination of RTSs occurs as an orthogonal problem.<br />
7.3 Critical Paths<br />
Let us ask whether we can always find a complete set of repair rules for all repairable<br />
transitions. For this we introduce the notions of associated hypergraphs and critical trigger<br />
paths.<br />
Definition 7.2. Let $S = \{p_1, \ldots, p_n\}$ be a relational database schema and $RTS = \{R_1, \ldots, R_m\}$<br />
a system of ECA-rules on $S$. Then the associated rule hypergraph $(V, E)$ is constructed as follows:<br />
- $V$ is the disjoint union of $S$ and $RTS$. We then talk of $S$-vertices and $RTS$-vertices<br />
respectively.<br />
- If $R \in RTS$ has event-part $Ev$ on $p \in S$ and actions on $p_1, \ldots, p_k$, then we have a<br />
hyperedge from $p$ to $\{R\}$ labelled by $+$ or $-$ depending on $Ev$ being an insert or delete,<br />
and a hyperedge from $\{R\}$ to $\{p_1, \ldots, p_k\}$ analogously labelled by $k$ values $+$ or $-$. $\sqcap\!\!\!\!\sqcup$<br />
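The construction of Definition 7.2 can be written down directly. The following Python sketch assumes a simple encoding that is not part of the paper: a rule is a record with a name, an event `(relation, sign)` and a tuple of actions `(relation, sign)`, where `"+"` stands for insert and `"-"` for delete, and a hyperedge is a triple `(source, targets, labels)`. Relation names and rule names are assumed to be distinct, so that the union of the two vertex sets is disjoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    event: tuple    # (relation, sign)
    actions: tuple  # tuple of (relation, sign) pairs


def rule_hypergraph(schema, rules):
    """Return (vertices, edges) of the associated rule hypergraph."""
    vertices = set(schema) | {r.name for r in rules}
    edges = []
    for r in rules:
        rel, sign = r.event
        # hyperedge from the event relation to {R}, labelled by the event sign
        edges.append((rel, frozenset({r.name}), (sign,)))
        # hyperedge from R to the relations touched by its actions
        targets = frozenset(rel for rel, _ in r.actions)
        labels = tuple(s for _, s in r.actions)
        edges.append((r.name, targets, labels))
    return vertices, edges


# The four rules of Example 7.1 in this encoding.
EXAMPLE_RULES = [
    Rule("R1", ("p", "+"), (("q", "+"),)),
    Rule("R2", ("q", "-"), (("p", "-"),)),
    Rule("R3", ("p", "+"), (("q", "-"),)),
    Rule("R4", ("q", "+"), (("p", "-"),)),
]
```

Calling `rule_hypergraph({"p", "q"}, EXAMPLE_RULES)` yields six vertices and eight edges; since every rule has exactly one action, each hyperedge has a single target and the hypergraph is a simple graph.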
Figure 7.1 shows the associated rule hypergraph of Example 7.1, in which case we have a<br />
simple graph.<br />
<br />
[Figure not reproduced: vertices $p$, $q$, $R_1$, $R_2$, $R_3$, $R_4$ with edges labelled $+$ or $-$.]<br />
Fig. 7.1. Associated Rule Hypergraph<br />
Definition 7.2 ignores the condition part of the rules. The conditions come into play if we consider<br />
critical trigger paths in associated hypergraphs. These are defined in several steps starting<br />
from paths in the associated hypergraph which correspond to possible sequences of ECA-rules<br />
with respect only to their event- and action-parts. Secondly we attach formulae to the<br />
$S$-vertices in the path in such a way that pre- and postconditions of the involved rules are<br />
expressed. Then we talk of trigger paths.<br />
A maximal trigger path with contradicting initial and final condition will then be called<br />
critical. Then imagine a transition with an effect implied by the initial formula, i.e. that<br />
there is an initial state such that running the transition in this state results in a state which<br />
satisfies the initial condition of the trigger path. If we execute this transition, then the<br />
rule triggering system running along the critical trigger path will turn the effect of the transition<br />
into its opposite. This means that the $RTS$ invalidates the effect of at least one transition.<br />
Definition 7.3. Let $G = (V, E)$ be the rule hypergraph associated with a system $RTS$ of rules. A trigger path in $G$ is a sequence $v_0\, e_1\, v'_1\, e'_1 \ldots e'_\ell\, v_\ell$ of vertices and hyperedges with the following conditions:
– $v_i \in S$ holds for all $i = 0, \ldots, \ell$,
– $v'_i \in RTS$ holds for all $i = 1, \ldots, \ell$,
– $e_i$ is a hyperedge from $v_{i-1}$ to $v'_i$ and
– $e'_i$ is a hyperedge from $v'_i$ to $V_i$ with $v_i \in V_i$ and the same label as $e_{i+1}$.
We call $\ell$ the length of the trigger path.
In addition we associate with each vertex $v_i \in S$ ($i = 0, \ldots, \ell$) a formula $\varphi_i$ in negated implication normal form such that $\models \varphi_i \Rightarrow cond(v'_{i+1})$ holds for the condition part $cond(v'_{i+1})$ of rule $v'_{i+1} \in RTS$ and $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ holds for the action-part $A_{i+1}$ of rule $v'_{i+1}$ ($i = 0, \ldots, \ell - 1$). Furthermore, there is no $e_{\ell+1} \in E$ from $v_\ell$ to $v'_{\ell+1}$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.
Then a trigger path is critical iff $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds. Such a critical trigger path is called admissible iff there is a consistent state $\sigma$ and a repairable transition $T$ such that $\mathcal{E}_\sigma(T) \Leftrightarrow \varphi_0$ holds. □
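The criticality condition can be illustrated with a minimal sketch. It assumes a strong simplification that is not part of the definition: every formula is a conjunction of literals over one shared variable $x$, so that a conjunction is unsatisfiable exactly when it contains a complementary pair of literals. The names `is_critical`, `phi_0` and `phi_l` are hypothetical.

```python
# Sketch only: a formula is a set of literals (predicate, is_positive),
# read as their conjunction over one shared variable x.

def is_critical(phi_0, phi_l):
    """True iff phi_0 AND phi_l is unsatisfiable, i.e. |= not(phi_0 and phi_l)."""
    combined = set(phi_0) | set(phi_l)
    # A conjunction of literals is unsatisfiable iff it contains
    # some literal together with its complement.
    return any((pred, not pos) in combined for pred, pos in combined)

# Initial formula p(x) AND NOT q(x), final formula NOT p(x) AND q(x).
phi_0 = {("p", True), ("q", False)}
phi_l = {("p", False), ("q", True)}
critical = is_critical(phi_0, phi_l)

# Identical initial and final formulae never contradict each other.
harmless = is_critical({("p", True), ("q", True)}, {("p", True), ("q", True)})
```

Admissibility is not covered by this sketch, since it additionally asks for a consistent state and a repairable transition.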
Critical trigger paths for the associated rule hypergraph in Figure 7.1 are sketched in Figure 7.2. Note that in this case neither of the two critical trigger paths is admissible.
[Figure 7.2: two trigger paths of the form $v_0\, e_1\, v'_1\, e'_1\, v_1\, e_2\, v'_2\, e'_2\, v_2$ with $v_0 = p$, $v_1 = q$ and $v_2 = p$. The first runs through $R_1$ and $R_4$ with edge labels $+, +, +, -$ and formulae $p(x) \wedge \neg q(x)$, $p(x) \wedge q(x)$, $\neg p(x) \wedge q(x)$. The second runs through $R_3$ and $R_2$ with edge labels $+, -, -, -$ and formulae $p(x) \wedge q(x)$, $p(x) \wedge \neg q(x)$, $\neg p(x) \wedge \neg q(x)$.]
Fig. 7.2. Critical Trigger Paths
If a critical trigger path is not admissible, then only a non-repairable transition can be invalidated by running the rules in the trigger path. Since we exclude non-repairable transitions, we only have to consider admissible trigger paths. After these remarks we are able to prove our first result.
Proposition 7.4. Let $RTS$ be a complete set of rules associated with a set $\Sigma$ of constraints and let $G = (V, E)$ be the associated rule hypergraph. Then $G$ contains an admissible critical trigger path iff there exists a consistent database state $\sigma$ and a repairable transition $T$ such that executing $T$ in $\sigma$ and consecutively running $RTS$ invalidates the effect of $T$ without leaving the database unchanged.
Proof. Let us first assume that $G$ contains an admissible critical trigger path. Let $\varphi_0, \ldots, \varphi_\ell$ denote the formulae associated with the $S$-vertices in this trigger path.
Case 1. Assume that $e_1$ is labelled by $+$. Then $\varphi_0$ contains at least one positive literal $p(x)$. Let $\sigma$ be a consistent state and $T$ a repairable transition such that $\mathcal{E}_\sigma(T)$ is given by $\varphi_0$. We may assume that $\models_\sigma \neg p(x)$ holds and that the final action in $T$ is an insertion into $p$. If we start $T$ in the initial state $\sigma$, then the resulting state satisfies $\varphi_0$.
$T$ followed by the $RTS$ may then result in a state $\tau$ satisfying $\varphi_\ell$. Hence the effect of $T; RTS$ in $\sigma$ is given by $\varphi_\ell$. Since $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds by the definition of critical trigger paths, this implies that $RTS$ invalidates the effect of $T$. Furthermore, $\tau$ is consistent with respect to all constraints in $\Sigma$, since $RTS$ is complete and there is no hyperedge $e_{\ell+1}$ from $v_\ell$ to some $v'_{\ell+1} \in RTS$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.
It remains to show $\tau \neq \sigma$. If this does not hold, we get $\models_\sigma \varphi_\ell$ and consequently there exists some $\psi$ such that $\varphi_\ell \Leftrightarrow \neg p(x) \wedge \psi$ and $\varphi_0 \Leftrightarrow p(x) \wedge \psi$ hold. This implies $\ell > 1$, because otherwise the rule $v'_1$ would have the form ON $insert_p(x)$ IF $\neg I$ DO $delete_p(x)$, which we excluded.
If $\ell > 1$ holds, there is at least one other literal $q(y)$ (or $\neg q(y)$) in $\varphi_0$ such that $delete_q(y)$ (or $insert_q(y)$ respectively) occurs in the action-part of $v'_1$. Then we may consider the admissible critical trigger path $v_1\, e_2 \ldots v_\ell$ of length $\ell - 1$ instead. Following the argumentation above, we may choose $\sigma$ and $T$ in such a way that $\models_\sigma \neg q(y)$ (or $\models_\sigma q(y)$ respectively) holds. This implies $\tau \neq \sigma$ as required.
Case 2. If $e_1$ is labelled by $-$, then $\varphi_0$ contains a literal $\neg p(x)$. Thus, we have to consider a transition $T$ containing $delete_p(x)$ as its final action and a consistent state $\sigma$ with $\models_\sigma p(x)$ and $\mathcal{E}_\sigma(T) \Leftrightarrow \varphi_0$. Then we may apply the same arguments as for Case 1.
Conversely, assume that there is no admissible critical trigger path. Let $T$ be a repairable transition and $\sigma$ a database state which is consistent with respect to $\Sigma$. Now start $T$ in $\sigma$ and assume that the resulting state $\sigma'$ is not consistent. Then consider a trigger path of finite length such that $\models_{\sigma'} \varphi_0$ holds. The consecutive execution of the rules in this trigger path will result in a state satisfying $\varphi_\ell$. Thus, we have $\mathcal{E}_\sigma(T) \Leftrightarrow \varphi_0$ and $\mathcal{E}_\sigma(T; RTS) \Leftrightarrow \varphi_\ell$.
According to our assumption, the trigger path used cannot be critical, i.e. $\varphi_\ell \wedge \varphi_0$ is satisfiable. Hence $RTS$ does not invalidate the effect of $T$. □
7.4 Stratified Constraint Sets
In view of Proposition 7.4 we may ask for constraint sets that allow us to define complete RTSs which exclude admissible critical trigger paths in their associated hypergraphs. Let us start with a simple example.
Example 7.2. Take again two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv q(x) \Rightarrow p(x)$, which imply $p$ to be always equal to $q$. Then we obtain the following repairing rules:
$R_1$: ON $insert_p(x)$ IF $\neg I_1$ DO $insert_q(x)$
$R_2$: ON $delete_q(x)$ IF $\neg I_1$ DO $delete_p(x)$
$R_3$: ON $insert_q(x)$ IF $\neg I_2$ DO $insert_p(x)$
$R_4$: ON $delete_p(x)$ IF $\neg I_2$ DO $delete_q(x)$
In this case there are no admissible critical trigger paths in the associated rule hypergraph. We omit further details. □
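The behaviour of the four rules of Example 7.2 can be traced with a small simulation. This is a sketch under assumptions: the relations are plain Python sets, rules are processed from a first-in-first-out queue, and the helper name `run_rules` is hypothetical. It only illustrates that insertions trigger insertions and deletions trigger deletions.

```python
def run_rules(p, q, event):
    """Apply an update and then the rules R1..R4 until no rule fires.

    event is a triple (op, relation, value) with op in {"insert", "delete"}.
    Returns the final relations and the list of executed updates.
    """
    p, q = set(p), set(q)
    pending = [event]
    trace = []
    while pending:
        op, rel, x = pending.pop(0)
        target = p if rel == "p" else q
        if op == "insert":
            target.add(x)
        else:
            target.discard(x)
        trace.append((op, rel, x))
        viol_i1 = x in p and x not in q   # I1: p(x) => q(x) is violated
        viol_i2 = x in q and x not in p   # I2: q(x) => p(x) is violated
        if op == "insert" and rel == "p" and viol_i1:
            pending.append(("insert", "q", x))   # R1
        elif op == "delete" and rel == "q" and viol_i1:
            pending.append(("delete", "p", x))   # R2
        elif op == "insert" and rel == "q" and viol_i2:
            pending.append(("insert", "p", x))   # R3
        elif op == "delete" and rel == "p" and viol_i2:
            pending.append(("delete", "q", x))   # R4
    return p, q, trace

p1, q1, trace1 = run_rules(set(), set(), ("insert", "p", "a"))
p2, q2, trace2 = run_rules({"a"}, {"a"}, ("delete", "q", "a"))
```

In both runs every update in the trace has the same kind as the triggering update, so the effect of the original insertion or deletion survives.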
Let us now investigate the reason for the absence of admissible critical trigger paths in Example 7.2. This leads us to the notion of a stratified set of constraints.
The motivation behind this is as follows: In Example 7.2 insertions (deletions) on a relation $p$ only trigger insertions (deletions) on $q$ and vice versa. This should be sufficient for not invalidating a once established effect. The corresponding constraints can therefore be grouped together.
Definition 7.5. Let $\Sigma$ be a set of constraints in implicative normal form (7.75) on a schema $S$. Then $\Sigma$ is called stratified iff we have a partition $\Sigma = \Sigma_1 \cup \ldots \cup \Sigma_n$ with pairwise disjoint constraint sets $\Sigma_i$ called strata such that the following conditions are satisfied:
(i) If $L$ is a literal on the left hand side (right hand side) of some constraint $I \in \Sigma_i$, then all constraints $J \in \Sigma$ containing a literal $L'$ on the right hand side (left hand side) such that $L$ and $L'$ are unifiable also lie in stratum $\Sigma_i$.
(ii) All constraints $I$, $J$ containing unifiable literals $L$ and $L'$ either on the left or the right hand side must lie in different strata $\Sigma_i$ and $\Sigma_j$. □
Now we can prove in general that stratified constraint sets always give rise to RTSs without admissible critical trigger paths in the associated rule hypergraph.
Proposition 7.6. Let $\Sigma$ be a stratified constraint set on a schema $S$. Then there exists a complete RTS such that for any repairable transition $T$ on $S$ the RTS does not invalidate the effect of $T$.
Proof. Given a constraint $I$ in implicative normal form (7.75), each relation symbol $p_i$ on the left hand side gives rise to rules
ON $insert_{p_i}(x_i)$ IF $\neg I$ DO $insert_{q_j}(y_j)$,
ON $insert_{p_i}(x_i)$ IF $\neg I$ DO $delete_{p_j}(y_j)$
with relation symbols $q_j$ occurring on the right hand side and $p_j$ ($j \neq i$) on the left hand side of $I$. Similarly, each predicate symbol $q_j$ on the right hand side gives rise to rules
ON $delete_{q_j}(y_j)$ IF $\neg I$ DO $insert_{q_i}(y_j)$ ($i \neq j$),
ON $delete_{q_j}(y_j)$ IF $\neg I$ DO $delete_{p_i}(y_j)$.
This defines a complete set $RTS$ of rules. Now assume there exists a critical trigger path $v_0\, e_1\, v'_1\, e'_1 \ldots e'_\ell\, v_\ell$ in the associated rule hypergraph. Each $RTS$-vertex $v'_i$ corresponds to a constraint $I_i \in \Sigma$. Since $e'_i$ and $e_{i+1}$ are equally labelled corresponding to the action- or event-part respectively, the construction of the rules above implies $I_i$ and $I_{i+1}$ to lie in the same stratum ($i = 0, \ldots, \ell - 1$).
However, the condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ implies that $\varphi_0$ contains a literal $L$ and $\varphi_\ell$ its negation, hence the construction of rules implies $I_1$ and $I_\ell$ to lie in different strata. Hence, there are only critical trigger paths of length $\ell = 1$.
According to our construction of $RTS$ this implies $\models \varphi_0 \Rightarrow \neg I$ to hold for some $I \in \Sigma$. Thus, $\neg\varphi_0 \in \Sigma$ holds. Due to the definition of admissible critical trigger paths and the definition of repairable transitions, we conclude that the trigger paths of length $\ell = 1$ cannot be admissible. Then the proposition follows from Proposition 7.4. □
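The rule construction at the start of this proof can be sketched as follows. The sketch abstracts from the argument tuples and records a rule only as a tuple (event operation, event relation, constraint name, action operation, action relation); this encoding and the function name `repairing_rules` are assumptions made for illustration.

```python
def repairing_rules(name, lhs, rhs):
    """Enumerate the rules generated for a constraint lhs_1 & ... => rhs_1 | ...

    lhs and rhs are lists of relation symbols; name identifies the constraint
    whose negation forms the condition part of every generated rule.
    """
    rules = []
    for p_i in lhs:
        # Inserting into a left-hand relation may violate the constraint:
        # repair by inserting into a right-hand relation ...
        rules += [("insert", p_i, name, "insert", q_j) for q_j in rhs]
        # ... or by deleting from another left-hand relation.
        rules += [("insert", p_i, name, "delete", p_j) for p_j in lhs if p_j != p_i]
    for q_j in rhs:
        # Deleting from a right-hand relation may violate the constraint:
        # repair by inserting into another right-hand relation ...
        rules += [("delete", q_j, name, "insert", q_i) for q_i in rhs if q_i != q_j]
        # ... or by deleting from a left-hand relation.
        rules += [("delete", q_j, name, "delete", p_i) for p_i in lhs]
    return rules

# The constraint p(x) & r(x) => q(x), used later as I1 in Example 7.5.
rules_i1 = repairing_rules("I1", ["p", "r"], ["q"])
```

For this constraint the construction yields six rules, among them the two rules with a deleting action on an inserting event that Example 7.5 discusses.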
Finally, we may ask for cases where stratified constraint sets occur. Recall from [5] that a relational database schema $S$ with constraint set $\Sigma$ is in Entity-Relationship normal form (ERNF), and hence is equivalent to an ER-schema, iff
– all inclusion constraints in $\Sigma$ are key-based and non-redundant,
– there is no cycle of inclusion constraints in $\Sigma$,
– each relation schema $R \in S$ is in BCNF with respect to the functional dependencies in $\Sigma$ and
– there are only inclusion and functional dependencies in $\Sigma$.
If a relational database schema $S$ with constraint set $\Sigma$ is in ERNF, then it is easy to see that $\Sigma$ is stratified.
Corollary 7.7. Let $S$ be a database schema in ERNF with respect to the constraint set $\Sigma$. Then $\Sigma$ is stratified. □
Hence, following the design approach of Mannila and Räihä in [5], if this is sufficient for the application, leads to schemata without any problems concerning consistency enforcement by RTSs.
Example 7.3. Let us look at the following constraints
$I_1$: $p(x, y) \Rightarrow q(x, z)$ and
$I_2$: $q(x, z) \wedge q(y, z) \Rightarrow x = y$.
Then this set of constraints corresponds to the Entity-Relationship diagram [8] in Figure 7.3. Obviously, the constraint set is stratified. □
[Figure 7.3: Entity-Relationship diagram with the components $A$, $p$, $B$, $C$, $q$ and $D$; two edges carry the cardinality $(0, 1)$.]
Fig. 7.3. Entity-Relationship constraints
7.5 An Algorithm for Checking Stratification
Before we analyze the converse of Proposition 7.6 and present the weaker notion of locally stratified constraint sets, let us first concentrate on an algorithm for checking stratification and its complexity. For this we consider the set
$$BW = \{\top, \bot\} \cup (\mathbb{N} - \{0\}) \cup \{\{j_1, \ldots, j_n\} \mid n \geq 1,\ j_k \in \mathbb{N} - \{0\}\}.$$
In the algorithm we successively add labels from $BW$ to constraints. A label $i \in \mathbb{N}$ for a constraint $I$ is used to indicate that $I$ must lie in the stratum $\Sigma_i$. A label $\{j_1, \ldots, j_n\}$ indicates that $I$ must not lie in $\Sigma_{j_k}$ for $k = 1, \ldots, n$. $\bot$ represents no information and $\top$ an inconsistent assignment of stratum numbers.
For a more convenient terminology we call an element of $BW$ black if it is in $(\mathbb{N} - \{0\}) \cup \{\top\}$, otherwise white. Furthermore, we use a commutative, associative binary operation $\oplus$ on $BW$ defined by
$$x \oplus \bot = x,$$
$$x \oplus \top = \top,$$
$$i \oplus j = \begin{cases} i & \text{if } i = j \\ \top & \text{otherwise,} \end{cases}$$
$$\{j_1, \ldots, j_n\} \oplus \{k_1, \ldots, k_m\} = \{j_1, \ldots, j_n\} \cup \{k_1, \ldots, k_m\} \quad \text{and}$$
$$i \oplus \{j_1, \ldots, j_n\} = \begin{cases} \top & \text{if } i = j_k \text{ for some } k \in \{1, \ldots, n\} \\ i & \text{otherwise.} \end{cases}$$
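The binary operation on $BW$ can be written down directly. The encoding below is an assumption made for this sketch: the no-information element is the string `"bot"`, the inconsistent element is the string `"top"`, stratum numbers are Python integers and exclusion sets are frozensets. The names `combine` and `is_black` are hypothetical.

```python
BOT, TOP = "bot", "top"   # no information / inconsistent assignment

def combine(x, y):
    """Commutative, associative combination of two labels from BW."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP:
        return TOP
    if isinstance(x, int) and isinstance(y, int):
        return x if x == y else TOP       # two stratum numbers must agree
    if isinstance(x, frozenset) and isinstance(y, frozenset):
        return x | y                      # collect all excluded strata
    # One stratum number and one exclusion set remain.
    i, excluded = (x, y) if isinstance(x, int) else (y, x)
    return TOP if i in excluded else i

def is_black(x):
    """Black labels are the stratum numbers and the inconsistent element."""
    return x == TOP or isinstance(x, int)

same = combine(2, 2)
clash = combine(2, 3)
merged = combine(frozenset({1}), frozenset({2}))
forbidden = combine(1, frozenset({1}))
allowed = combine(frozenset({1}), 2)
```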
Algorithm 7.8 (Stratification Check).
Input: a set $\Sigma = \{I_1, \ldots, I_n\}$ of constraints in clausal form $I_i = L_{i1} \vee \ldots \vee L_{i n_i}$
Output: a boolean value $b$
Method:
VAR gather : ARRAY $1 \ldots n$ OF $BW$,
    mb, mb' : $BW$;
BEGIN
  FOR $i = 1$ TO $n$ DO
    gather($i$) := $\bot$
  ENDFOR;
  $b$ := true;
  mb := 1;
  WHILE $\Sigma \neq \emptyset$ DO
    CHOOSE $i_0 \in \{1, \ldots, n\}$ WITH $I_{i_0} \in \Sigma$ AND gather($i_0$) is maximal;
    $\Sigma$ := $\Sigma - \{I_{i_0}\}$;
    IF gather($i_0$) is white
      THEN gather($i_0$) := mb;
           mb := mb + 1
    ENDIF;
    mb' := gather($i_0$)
    FOR $j = 1$ TO $n_{i_0}$ DO
      FOR ALL $I_k \in \Sigma$ DO
        FOR $\ell = 1$ TO $n_k$ DO
          IF $L_{i_0 j}$ and $L_{k\ell}$ are unifiable AND gather($i_0$) $\neq \top$
            THEN gather($k$) := gather($k$) $\oplus$ $\{$gather($i_0$)$\}$
          ELSIF $L_{i_0 j}$ and $\overline{L_{k\ell}}$ are unifiable
            THEN gather($k$) := gather($k$) $\oplus$ gather($i_0$)
          ENDIF
        ENDFOR
      ENDFOR
    ENDFOR
  ENDDO;
  FOR $i = 1$ TO $n$ DO
    IF gather($i$) = $\top$
      THEN $b$ := false
    ENDIF
  ENDFOR;
  RETURN($b$)
END □
We have to check that the algorithm is correct. Then we analyze its time complexity. Before we do this, let us first look at a simple example.
Example 7.4. Consider the following constraints:
$I_1 = \neg p(x) \vee \neg q(x) \vee r(x) \vee s(x)$,
$I_2 = \neg q(x) \vee r(x) \vee \neg t(x)$,
$I_3 = p(x) \vee \neg r(x)$,
$I_4 = \neg s(x) \vee t(x)$ and
$I_5 = q(x) \vee \neg t(x)$.
Then consider Table 1. Each row corresponds to a constraint $I_i$ and lists the values added to gather($i$) during the execution of the algorithm. The chosen order of the constraints in the algorithm is $I_1$, $I_3$, $I_2$, $I_4$, $I_5$. Then $b$ will become false and hence $\Sigma$ is not stratifiable.
Table 1. Stratification Check
| $i$ | $L_{11}$ | $L_{12}$ | $L_{13}$ | $L_{14}$ | $L_{32}$ | $L_{31}$ | $L_{21}$ | $L_{22}$ | $L_{23}$ | $L_{41}$ | $L_{42}$ | $L_{52}$ | $L_{51}$ | gather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | | | | | | | | | | 1 |
| 3 | 1 | | 1 | | 1 | 1 | | | | | | | | 1 |
| 2 | | $\{1\}$ | $\{1\}$ | | 1 | | $\top$ | $\top$ | $\top$ | | | | | $\top$ |
| 4 | | | | 1 | | | | | $\top$ | $\top$ | $\top$ | | | $\top$ |
| 5 | | 1 | | | | | $\top$ | | | | $\top$ | $\top$ | $\top$ | $\top$ |
| chosen | $I_1$ | $I_1$ | $I_1$ | $I_1$ | $I_3$ | $I_3$ | $I_2$ | $I_2$ | $I_2$ | $I_4$ | $I_4$ | $I_5$ | $I_5$ | $b$ = false |
□
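The run described in Example 7.4 can be reproduced with the following sketch of the stratification check. Several points are assumptions rather than part of the text: literals are pairs (predicate, is_positive) over one variable, so two literals unify exactly when their predicates agree; literals of equal sign are handled by the IF branch and complementary literals by the ELSIF branch; "maximal" is read as inconsistent above stratum numbers above exclusion sets above no information, with ties broken by the smallest index. The names `stratification_check`, `combine` and `rank` are hypothetical.

```python
BOT, TOP = "bot", "top"

def combine(x, y):
    """Label combination on BW (integers, frozensets, BOT, TOP)."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP:
        return TOP
    if isinstance(x, int) and isinstance(y, int):
        return x if x == y else TOP
    if isinstance(x, frozenset) and isinstance(y, frozenset):
        return x | y
    i, excluded = (x, y) if isinstance(x, int) else (y, x)
    return TOP if i in excluded else i

def rank(x):
    """Assumed order used for the CHOOSE step (black above white)."""
    if x == TOP:
        return 3
    if isinstance(x, int):
        return 2
    return 1 if isinstance(x, frozenset) else 0

def stratification_check(constraints):
    """constraints: list of clauses, each a list of (predicate, is_positive)."""
    gather = [BOT] * len(constraints)
    remaining = set(range(len(constraints)))
    order, mb = [], 1
    while remaining:
        # max returns the first maximal element, i.e. the smallest index.
        i0 = max(sorted(remaining), key=lambda i: rank(gather[i]))
        remaining.remove(i0)
        order.append(i0 + 1)
        if rank(gather[i0]) < 2:          # white label: open a new stratum
            gather[i0] = mb
            mb += 1
        for pred, pos in constraints[i0]:
            for k in sorted(remaining):
                for pred_k, pos_k in constraints[k]:
                    if pred != pred_k:
                        continue          # not unifiable
                    if pos == pos_k and gather[i0] != TOP:
                        gather[k] = combine(gather[k], frozenset({gather[i0]}))
                    elif pos != pos_k:
                        gather[k] = combine(gather[k], gather[i0])
    return all(g != TOP for g in gather), order, gather

example_7_4 = [
    [("p", False), ("q", False), ("r", True), ("s", True)],   # I1
    [("q", False), ("r", True), ("t", False)],                # I2
    [("p", True), ("r", False)],                              # I3
    [("s", False), ("t", True)],                              # I4
    [("q", True), ("t", False)],                              # I5
]
b, order, gather = stratification_check(example_7_4)
```

Under these assumptions the sketch chooses the constraints in the order given in the example and ends with an inconsistent label for $I_2$, $I_4$ and $I_5$.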
Let us now address the correctness of Algorithm 7.8.
Proposition 7.9. Let $\Sigma$ be a set of constraints. Then $\Sigma$ is stratifiable iff Algorithm 7.8 applied to the input $\Sigma$ computes the output $b$ = true.
Proof. Let us first assume that $\Sigma$ is stratified. Let $\Sigma = \Sigma_1 \cup \ldots \cup \Sigma_n$ be a decomposition into strata and assume that the $\Sigma_i$ are taken minimal with the required properties. We use induction on $n$.
For $n = 1$ there are no unifiable literals $L$ and $L'$ in different constraints $I, J \in \Sigma$. Hence gather($i$) will become 1 for all $i$ and we obtain $b$ = true.
For $n > 1$ we may assume without loss of generality that some constraint in $\Sigma_1$ will be chosen first. Then, due to our minimality assumption, we get gather($i$) = 1 for all $I_i \in \Sigma_1$, whereas gather($j$) will be white for all $I_j \notin \Sigma_1$. Thus, all constraints in $\Sigma_1$ will be chosen first.
Since gather($j$) was white for $I_j \notin \Sigma_1$ and gather($i$) = 1 for $I_i \in \Sigma_1$ before choosing the first constraint in $\Sigma_2 \cup \ldots \cup \Sigma_n$, we may apply the induction hypothesis to $\Sigma_2 \cup \ldots \cup \Sigma_n$, which gives gather($j$) $\neq \top$ for all $I_j \notin \Sigma_1$. This implies $b$ = true as claimed in the proposition.
Conversely, assume that the algorithm produces the result $b$ = true. Then we must have gather($i$) $\in \mathbb{N} - \{0\}$. Define $\Sigma_k = \{I_i \in \Sigma \mid$ gather($i$) $= k\}$. Assume that the partition $\Sigma = \Sigma_1 \cup \ldots \cup \Sigma_n$ does not satisfy the conditions for strata in Definition 7.5. Then there are two possible cases:
(i) There are literals $L$ and $L'$ in constraints $I_i \in \Sigma_k$ and $I_j \in \Sigma_\ell$ with $k \neq \ell$ such that $L$ and $\overline{L'}$ are unifiable. Suppose that $I_i$ is chosen first by the algorithm. Then $k$ will be added to gather($j$), which gives gather($j$) = $\top$, contradicting our assumption.
(ii) There are unifiable literals $L$ and $L'$ in constraints $I_i, I_j \in \Sigma_k$. If $I_i$ is chosen first by the algorithm, $\{k\}$ will be added to gather($j$), which also gives gather($j$) = $\top$, contradicting our assumption.
Thus $\Sigma_1 \cup \ldots \cup \Sigma_n$ is a partition into strata, which completes the proof. □
Proposition 7.10. Let $\Sigma$ be a set of constraints in clausal form, $n = \#\Sigma$, $k$ the maximal arity of predicate symbols occurring in constraints $I \in \Sigma$ and let $\ell$ be the maximum number of literals in these constraints. Then the time complexity of Algorithm 7.8 for checking whether $\Sigma$ is stratified is in $O(k \cdot \ell^2 \cdot n^2)$.
Proof. The initialization and the final computation of $b$ can both be done in $O(n)$ steps. In the inner FOR-loop the test for unifiability can be done in $O(k)$ steps, since there are no function symbols. All other operations have a complexity in $O(1)$. Hence the inner FOR-loop has a total complexity in $O(k)$. This loop is executed $\ell' \cdot \ell''$ times, where $\ell'$ is the number of literals in the chosen constraint $I_{i_0}$ and $\ell''$ is the total number of literals in the remaining constraints. If $I_{i_0}$ is the $i$-th constraint chosen by the algorithm, this can be estimated by $\ell^2 \cdot (n - i)$.
Since each $I \in \Sigma$ will be chosen by the algorithm, the outer WHILE-loop will be executed $n$ times. This gives the total complexity in
$$O(n) + O\Big(\ell^2 \cdot \sum_{i=1}^{n} (n - i)\Big) \cdot O(k) + O(n) = O(k \cdot \ell^2 \cdot n^2)$$
as claimed in the proposition. □
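The last step of the estimate rests on a standard arithmetic sum. Spelled out (the index substitution $j = n - i$ is introduced here only for illustration):

```latex
\sum_{i=1}^{n} (n - i) \;=\; \sum_{j=0}^{n-1} j \;=\; \frac{n(n-1)}{2} \;\in\; O(n^2),
\qquad\text{hence}\qquad
O\Big(\ell^2 \cdot \frac{n(n-1)}{2}\Big) \cdot O(k) \;=\; O(k \cdot \ell^2 \cdot n^2).
```

The two additive $O(n)$ terms for initialization and for the final computation of $b$ are dominated by this product.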
It is easy to see that $n \cdot \ell$ can be replaced by the total number $u = \sum_{i=1}^{n} n_i$ of literals in $\Sigma$ with $u \leq n \cdot \ell$. Thus, the time complexity of the stratification checking algorithm 7.8 is in $O(k \cdot u^2)$.
From Proposition 7.6 we know that active mechanisms can be effectively applied if the constraint set is stratified. In particular, this holds for schemata in ERNF [5], which are equivalent to Entity-Relationship schemata. From Proposition 7.10 we know that a stratification check can be done efficiently.
7.6 Locally Stratified Constraint Sets
Unfortunately, the converse of Proposition 7.6 does not hold, as seen in the next example. The reason for this is that in the proof of Proposition 7.6 we considered all repairing rules for a given constraint, whereas the constraint set in Example 7.5 allows us to select only a subset, thus gaining the required result without losing the completeness of the RTS.
Example 7.5. Take three unary relations $p$, $q$ and $r$ and the constraints $I_1 \equiv p(x) \wedge r(x) \Rightarrow q(x)$, $I_2 \equiv q(x) \Rightarrow p(x)$ and $I_3 \equiv p(x) \Rightarrow r(x)$. It is easy to see that this constraint set is not stratified.
However, we may consider the following system of ECA-rules:
$R_1$: ON $insert_p(x)$ IF $\neg I_1$ DO $insert_q(x)$
$R_2$: ON $delete_q(x)$ IF $\neg I_1$ DO $delete_p(x)$
$R_3$: ON $insert_r(x)$ IF $\neg I_1$ DO $insert_q(x)$
$R_4$: ON $delete_q(x)$ IF $\neg I_1$ DO $delete_r(x)$
$R_5$: ON $insert_q(x)$ IF $\neg I_2$ DO $insert_p(x)$
$R_6$: ON $delete_p(x)$ IF $\neg I_2$ DO $delete_q(x)$
$R_7$: ON $insert_p(x)$ IF $\neg I_3$ DO $insert_r(x)$
$R_8$: ON $delete_r(x)$ IF $\neg I_3$ DO $delete_p(x)$
We dispense with showing that there are no admissible critical trigger paths in the associated rule hypergraph.
Note that the construction in the proof of Proposition 7.6 would result in two more rules corresponding to insertions:
$R_9$: ON $insert_p(x)$ IF $\neg I_1$ DO $delete_r(x)$
$R_{10}$: ON $insert_r(x)$ IF $\neg I_1$ DO $delete_p(x)$
These give rise to admissible critical trigger paths. The one shown in Figure 7.4 allows the effect of the repairable transition $insert_p(x)$ to be invalidated. □
[Figure 7.4: trigger path $v_0\, e_1\, v'_1\, e'_1\, v_1\, e_2\, v'_2\, e'_2\, v_2$ with $v_0 = p$, $v'_1 = R_9$, $v_1 = r$, $v'_2 = R_8$ and $v_2 = p$, edge labels $+, -, -, -$ and formulae $p(x) \wedge \neg q(x) \wedge r(x)$, $p(x) \wedge \neg q(x) \wedge \neg r(x)$, $\neg p(x) \wedge \neg q(x) \wedge \neg r(x)$.]
Fig. 7.4. An Admissible Critical Trigger Path
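The path through $R_9$ and $R_8$ can be traced on a single element. The sketch below is an illustration under assumptions: the state records only the truth values of $p(a)$, $q(a)$ and $r(a)$ for one fixed element $a$, and the names `violated`, `RULES` and `run_path` are hypothetical.

```python
def violated(state):
    """Which of the constraints I1, I2, I3 of Example 7.5 fail in the state."""
    p, q, r = state["p"], state["q"], state["r"]
    return {"I1": p and r and not q,    # p(x) & r(x) => q(x)
            "I2": q and not p,          # q(x) => p(x)
            "I3": p and not r}          # p(x) => r(x)

# (event operation, event relation, violated constraint, action operation, action relation)
RULES = {"R9": ("insert", "p", "I1", "delete", "r"),
         "R8": ("delete", "r", "I3", "delete", "p")}

def run_path(state, event, path):
    """Execute the update `event`, then fire the named rules in the given order."""
    state = dict(state)
    op, rel = event
    state[rel] = (op == "insert")
    for name in path:
        ev_op, ev_rel, constraint, act_op, act_rel = RULES[name]
        if (op, rel) != (ev_op, ev_rel) or not violated(state)[constraint]:
            raise ValueError("rule " + name + " is not triggered here")
        state[act_rel] = (act_op == "insert")
        op, rel = act_op, act_rel         # the action is the next event
    return state

initial = {"p": False, "q": False, "r": True}      # consistent: nothing violated
final = run_path(initial, ("insert", "p"), ["R9", "R8"])
```

The final state is consistent, but $p(a)$ is false again, so the insertion has been undone by the rules.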
The constraint set in Example 7.5 is not stratified, but nevertheless the associated RTS does not invalidate the effect of repairable transitions. This shows that a constraint set need not be stratified to allow a reasonable rule behaviour. Indeed, replacing $I_1$ in the example by $I'_1 \equiv p(x) \Rightarrow q(x)$ gives an equivalent constraint set, which is stratified. However, equivalence of constraint sets is undecidable in general. Therefore, we introduce the weaker notion of being locally stratified. In this case we shall construct RTSs which only contain a subset of the set of rules constructed in the proof of Proposition 7.6.
Definition 7.11. Let $\Sigma$ be a set of constraints in implicative normal form on a schema $S$. A labelled subsystem consists of a subset $\Sigma' = \{I \in \Sigma \mid \pi_L(I) \text{ is defined}\}$ together with a set of clauses $\Sigma'' = \{\pi_L(I) \mid I \in \Sigma'\}$ and a literal $L$ (the label) such that each constraint $I \in \Sigma'$ can be written as the disjunction $\pi_L(I) \vee I'$ with $\models I' \Rightarrow L$.
Here $\pi_L(I)$ is defined iff the negation $\overline{L}$ does not occur in $I$ (written as a clause). Then $\pi_L(I)$ results from $I$ by omission of the literal $L$ if the result contains at least two literals. Otherwise $\pi_L(I)$ is simply $I$. We call $I'$ the label part and $\pi_L(I)$ the label-free part of the constraint $I$. If $L$ is understood from the context, we drop the subscript and write $\pi$ instead of $\pi_L$.
A labelled subsystem $(\Sigma', \Sigma'', L)$ is called stratified iff the set $\Sigma''$ is stratified in the sense of Definition 7.5 or locally stratified as defined below.
The constraint set $\Sigma$ is called locally stratified iff $\Sigma = \Sigma'_1 \cup \ldots \cup \Sigma'_n$ with stratified labelled subsystems $(\Sigma'_i, \Sigma''_i, L_i)$ ($i = 1, \ldots, n$) such that for each constraint $I \in \Sigma'_i$ and each literal $L$ occurring in its label part with respect to $\pi_{L_i}$ there exists another $j$ with $I \in \Sigma'_j$ and $L$ occurring in the label-free part of $I$ with respect to $\pi_{L_j}$. □
Example 7.6. For the constraint set in Example 7.5 we obtain the partition into $\Sigma'_1 = \{I_1, I_3\}$ and $\Sigma'_2 = \{I_1, I_2\}$.
For the first of these we have the label $L_1 \equiv \neg p(x)$ and the label-free parts defined by $\pi_{L_1}(I_1) \equiv q(x) \vee \neg r(x)$ and $\pi_{L_1}(I_3) \equiv I_3$.
For $\Sigma'_2$ we get the label $L_2 \equiv \neg r(x)$ and the label-free parts $\pi_{L_2}(I_1) \equiv \neg p(x) \vee q(x)$ and $\pi_{L_2}(I_2) \equiv I_2$.
This shows that the constraint set in Example 7.5 is indeed locally stratified. □
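The label-free parts in this example can be recomputed mechanically. The sketch assumes that clauses are frozensets of literals (predicate, is_positive) over one variable and that an undefined label-free part is represented by `None`; the function name `label_free_part` is hypothetical.

```python
def label_free_part(clause, label):
    """Label-free part of a clause for a label literal, or None if undefined."""
    pred, pos = label
    if (pred, not pos) in clause:
        return None                      # the negated label occurs: undefined
    reduced = clause - {label}
    # Omit the label only if at least two literals remain.
    return reduced if len(reduced) >= 2 else clause

NOT_P, NOT_Q, NOT_R = ("p", False), ("q", False), ("r", False)
P, Q, R = ("p", True), ("q", True), ("r", True)

I1 = frozenset({NOT_P, NOT_R, Q})        # p(x) & r(x) => q(x)
I2 = frozenset({NOT_Q, P})               # q(x) => p(x)
I3 = frozenset({NOT_P, R})               # p(x) => r(x)

parts_l1 = {name: label_free_part(c, NOT_P) for name, c in [("I1", I1), ("I2", I2), ("I3", I3)]}
parts_l2 = {name: label_free_part(c, NOT_R) for name, c in [("I1", I1), ("I2", I2), ("I3", I3)]}
```

For the label $\neg p(x)$ the part is undefined exactly for $I_2$, and for the label $\neg r(x)$ exactly for $I_3$, which matches the two subsets given in the example.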
Note that each stratified constraint set is also locally stratified. In this case we define $depth(\Sigma) = 0$. If $\Sigma$ is locally stratified by a partition $\Sigma = \Sigma'_1 \cup \ldots \cup \Sigma'_n$, we define $depth(\Sigma) = \max_{i=1}^{n} depth(\Sigma''_i) + 1$. We call $depth(\Sigma)$ the depth of the locally stratified constraint set $\Sigma$.
Finally, we can strengthen Proposition 7.6, now dealing with locally stratified constraint sets. This condition turns out to be sufficient and also necessary for the absence of admissible critical trigger paths.
Theorem 7.12. Let $\Sigma$ be a constraint set on a schema $S$. Then $\Sigma$ is locally stratified iff there exists a complete RTS such that for any repairable transition $T$ the RTS does not invalidate the effect of $T$.
Proof. First assume that $\Sigma$ is locally stratified. Let the labelled subsystems in the partition be $(\Sigma'_i, \Sigma''_i, L_i)$ for $i = 1, \ldots, n$. We shall use induction on the depth of $\Sigma$. For $depth(\Sigma) = 0$ we are done by Proposition 7.6.
Let us now consider the case $depth(\Sigma) = 1$. As in the proof of Proposition 7.6 we construct an RTS for $\Sigma$. Since each $\Sigma''_i$ is stratified in the sense of Definition 7.5, we first construct a rule system $RTS'_i$ with respect to $\Sigma''_i$ as in the proof of Proposition 7.6. The condition parts in these rules have the form $\neg\pi_{L_i}(I)$ for $I \in \Sigma'_i$. Then let $RTS_i$ result from $RTS'_i$ by replacing all condition parts $\neg\pi_{L_i}(I)$ by $\neg I$. Finally, take $RTS = \bigcup_{i=1}^{n} RTS_i$.
Due to the last property in the definition of locally stratified constraint sets in Definition 7.11 we conclude that $RTS$ is complete.
Now consider a critical trigger path v 0 e 1 v1 0 e0 1 v 1::: e 0`v` <strong>in</strong> the rule hypergraph associated<br />
with RT S. Without loss <strong>of</strong> generality assume v1 0 2 RT S 1. Accord<strong>in</strong>g to Proposition<br />
7.4 we have to show that this trigger path is not admissible.<br />
We use <strong>in</strong>duction on the length ` <strong>of</strong> this critical trigger path. For ` =1we may use the<br />
same argument as<strong>in</strong> the pro<strong>of</strong> <strong>of</strong> Proposition 7.4. Therefore, assume `>1 and take a state<br />
with j= and a transition T with j= E (T ) , ' 0 .Thenwehave to show that T is not<br />
repairable.<br />
Assume that T is repairable. Then there exists a state with j= such that :E (T ) =2<br />
.We shall derive acontradiction from this.<br />
For this regard the critical trigger path v 1 e 2 v2 0 e0 2 v 2::: e 0`v` <strong>of</strong> length ` ; 1. By <strong>in</strong>duction<br />
it is not admissible. If A 1 is the action <strong>in</strong> the rule v1 0 , we get j= E (T A 1 ) , ' 1<br />
and T A 1 cannot be repairable. In particular, this implies :E (T A 1 ) 2 .<br />
S<strong>in</strong>ce A 1 is a simple <strong>in</strong>sertion or deletion, we getj= :E (T ) , '^L and j= :E (T A 1 )<br />
, '^ L for some literal L and its negation L.From this we conclude ' 2 and L 2 .<br />
Then there must exist a resolution refutation for L from <strong>in</strong>put . Any literal L 0 (except<br />
L) <strong>in</strong> this refutation must be selected at least once for build<strong>in</strong>g the resolvent. Therefore, due<br />
to our construction <strong>of</strong> L 1 (I) wemay cancel all clauses I2 conta<strong>in</strong><strong>in</strong>g the literal L 1 and<br />
simultaneously the literal L 1 <strong>in</strong> all clauses. Thus, there must also exist a resolution refutation<br />
for L from <strong>in</strong>put 1 00.<br />
On the other hand, each clause <strong>in</strong> 1 00 conta<strong>in</strong>s at least two literals. Therefore, any resolvent<br />
will also conta<strong>in</strong> at least two literals unless we have some I 1 2 1 00 with literals L 1 and L 2<br />
and another I 2 2 1 00 with literals L0 1 and L0 2 such that L 1, L 0 1 (and L 2, L 0 2 respectively) are<br />
uniable.<br />
This property, however, means that $\Sigma_1''$ is not stratified, contradicting our assumptions.<br />
Hence $T$ cannot be repairable and we are done.<br />
Next let depth() > 1. We proceed analogously. By <strong>in</strong>duction, s<strong>in</strong>ce i<br />
00 is (locally) stratied,<br />
there exists a rule system RT Si 0 for 00 i with the required property. The condition parts<br />
<strong>in</strong> these rules have theform: Li (I) forI2i 0. Then let RT SS i result from RT Si 0 bychang<strong>in</strong>g<br />
n<br />
all condition parts from : Li (I) to:I. F<strong>in</strong>ally, take RT S = i=1 RT S i.<br />
Again, due to the last property in the definition of locally stratified constraint sets (cf.<br />
Definition 7.11), $RTS$ must be complete.<br />
Now consider a critical trigger path v 0 e 1 v1 0 e0 1 v 1::: e 0`v` <strong>in</strong> the rule hypergraph associated<br />
with RT S. Accord<strong>in</strong>g to Proposition 7.4 we have to show that this trigger path is<br />
not admissible. Without loss <strong>of</strong> generality assume v1 0 2 RT S 1. Then take a maximal k such<br />
that v1 0 ::: v0 k 2 RT S 1 holds. Then for i =0::: k we may write ' i as a conjunction i ^J<br />
with j= i ): L 1 (I i) for some I i 2 1 0 .Hence,ifwe replace v0 i by the correspond<strong>in</strong>g rule <strong>in</strong><br />
RT S1 0 ,we obta<strong>in</strong> a critical trigger path for RT S0 1 .<br />
Now take a state with j= and a transition T with j= E (T ) , ' 0 . We have to<br />
show that T is not repairable. Assume the contrary. Then there exists a state with j= <br />
and :E =2 .<br />
Assume j= :L 1 . S<strong>in</strong>ce j= holds and each constra<strong>in</strong>t I 2 1 0 can be written as a<br />
disjunction I 0 _ L 1 (I) with j= I0 ) L 1 ,we conclude j= 1 00.<br />
S<strong>in</strong>ce v 0 e 1 v1 0 e0 1 v 1::: e 0 k v k is a critical trigger path for RT S1 0 and j= E , ' 0<br />
holds, we may apply the <strong>in</strong>duction hypothesis to 1 00 with depth(00 1 ) < depth(). Therefore,<br />
T cannot be repairable, i.e. for any state with j= 1 00 weget:E (T ) 2 (1 00)<br />
.<br />
In particular, take = . Then :E (T ) 2 (1 00)<br />
implies j= : L 1 (I) for some I 2 0 1<br />
and further 6j= contradict<strong>in</strong>g our assumption on . Thus, we must have j= L 1 .<br />
Assume j= :L 1 . Then we must have j= 1 00 and consequently :E (T ) 2 (1 00)<br />
. As<br />
above this implies j= : L 1 (I) for some I20 1 and hence 6j= contradict<strong>in</strong>g our assumption<br />
on . Hence,wemust have j= L 1 .<br />
Now let I 1 2 correspond to the rule v1 0 . Without loss <strong>of</strong> generality we may assume<br />
j= ' 0 ):L 1 . Otherwise, we must have j= :I1 0 and L1 (I 1)must not conta<strong>in</strong> L 1 . This implies<br />
L 1 to occur <strong>in</strong> J , <strong>in</strong> which case we may change it to :L 1 without aect<strong>in</strong>g the trigger path<br />
be<strong>in</strong>g critical.<br />
S<strong>in</strong>ce j= L 1 holds, T must <strong>in</strong>volve an <strong>in</strong>sertion (deletion) correspond<strong>in</strong>g to a negative<br />
(positive) literal L 1 . Hence, j= E (T ) ,:L 1 ^: holds. Due to the <strong>in</strong>dependence <strong>of</strong> J<br />
from 1 00 wemaychoose <strong>in</strong> such away that 2 (1 00)<br />
holds.<br />
However, this implies j= :E (T ) , L 1 _ 2 contradict<strong>in</strong>g the non-repairability <strong>of</strong><br />
T with respect to RT S1 0 . This completes the suciency pro<strong>of</strong>.<br />
Conversely, assume that we are given a complete RTS for $\Sigma$ which for any repairable<br />
transition $T$ does not invalidate its effect. According to Proposition 7.4 this implies that all<br />
critical trigger paths in the associated rule hypergraph are not admissible. From this we have<br />
to construct a partition of $\Sigma$ into stratified labelled subsystems.<br />
First consider a single rule $R$ corresponding to a constraint $I \in \Sigma$. In particular, $I$ is<br />
the condition part of this rule. Since $RTS$ is complete, the event part of $R$ gives rise to a<br />
negative (positive) literal $L_{ev}$ in $I$ for the case of an insertion (deletion). Similarly, an insertion<br />
(deletion) in the action part of $R$ gives rise to a positive (negative) literal $L_a$ in $I$.<br />
Let (I) = L ev _ L a . If I conta<strong>in</strong>s n 1 more literals L 1 ::: L n , let i (I) = (I) _<br />
L 1 _ :::_ L i _ :::_ L n . Then dene i 0(R) = fJ 2 j L i<br />
(J ) is dened g and i 00(R)<br />
=<br />
|{z}<br />
omit<br />
f Li (J ) jJ 2 i 0(R)g. (For I,(I) letL 1 = L ev and L 2 = L a and dene i 0 (R) and00 i (R)<br />
analogously.)<br />
Dene (R) =f(i 0(R)00<br />
i (R)L i) j i 00 (R) is locally stratied g, if this satises the last<br />
condition <strong>of</strong> Denition 7.11. Otherwise let (R) = . Then the elements <strong>of</strong> (R) dene<br />
stratied labelled subsystems <strong>of</strong> .<br />
In order to check the local stratication for i 00 (R) rst check, whether it is stratied. If<br />
not, dene for each literal L <strong>in</strong> i (I) thesetsiL 0 (R) =fJ 2 00 i (R) j L(J ) is dened g and<br />
iL 00 (R) =f L(J ) jJ 2 iL 0 (R)g. Consider f(0 iL (R)00 iL<br />
(R)L) j 00 iL<br />
(R) is locally stratied g<br />
and check the last condition S <strong>of</strong> Denition 7.11.<br />
Now take LSS =<br />
R2RT S<br />
(R). If (R) 6= holds for all R 2 RT S, this satises the last<br />
condition <strong>of</strong> Denition 7.11 and we obta<strong>in</strong> a partition <strong>of</strong> <strong>in</strong>to stratied labelled subsystems.<br />
Then LSS is the required partition.<br />
It rema<strong>in</strong>s to show (R) 6= <strong>in</strong> the construction above. Assume (R) =. Then there<br />
exists a sequence L 1 L 2 ::: L k <strong>of</strong> literals <strong>in</strong> I and a sequence (1 0 00 1 L 1)::: (k 0 00 k L k)<br />
<strong>of</strong> non-stratied labelled subsystems such that i+1 0 = fJ 2 00 i j Li+1 (J ) is dened g and<br />
k 00 conta<strong>in</strong>s two clauses Ik 1 and Ik 2 with literals L1 , L 10 and L 2 , L 20 respectively such that L 1 ,<br />
L 2 and L 10 , L 20 are uniable.<br />
I1 k and Ik 2 correspond to rules with respect to 00 k<br />
that dene an admissible trigger path<br />
<strong>in</strong> the associated rule hypergraph. S<strong>in</strong>ce for i =1 2 I1 k is : L k<br />
(I1 k;1 ), we may successively<br />
replace these rules by rules corresponding to $\Sigma_{k-1}'', \dots, \Sigma_1''$ and simultaneously replace the<br />
formulae $\varphi_i^k$ by $\varphi_i^{k-1} = \varphi_i^k \wedge \neg L_k, \dots, \varphi_i^0 = \varphi_i^1 \wedge \neg L_1$. The resulting trigger path is still<br />
critical and due to our construction it is also admissible with respect to $\Sigma$, contradicting our<br />
assumption. This completes the necessity proof.<br />
□<br />
Example 7.7.<br />
Let us extend Example 7.3 and add a third constraint<br />
$I_3 \equiv p(x,z) \wedge q(y,z) \Rightarrow \mathit{false}$.<br />
In terms of the Entity-Relationship diagram in Figure 7.3, $I_3$ corresponds to an exclusion<br />
constraint $B \| D$. It is easy to see that the new set $\{I_1, I_2, I_3\}$ of constraints is not stratified.<br />
In particular, any local stratification must contain a labelled subsystem with label $\neg q(x,z)$<br />
with the reduced constraints $I_2' \equiv \neg q(y,z) \vee x = y$ and $I_3' \equiv I_3$. However, $\neg q(x,z)$ cannot<br />
occur in the label-free part of some $I_2'$, since this always defines the same labelled subsystem.<br />
Hence, the given constraint set is also not locally stratified. This shows that adding a single<br />
exclusion constraint to an Entity-Relationship schema may already destroy a reasonable rule<br />
behaviour.<br />
□<br />
7.7 Complexity of Local Stratification<br />
Let us now look at the check whether a given set of constraints is locally stratified. In<br />
the second part of the proof of Theorem 7.12 we have seen that this check can be done by<br />
direct construction of the desired partition into maximal stratified labelled subsystems. The<br />
first part of that proof then indicates how to construct the corresponding RTS. In [7] we gave<br />
an explicit algorithm which also produces for each constraint the set of "reduced" constraints<br />
used in the RTS construction. However, the time complexity of that algorithm was beyond<br />
any practicality, since we could prove the following result.<br />
Proposition 7.13. Let $\Sigma$ be a set of constraints in clausal form, $n = \#\Sigma$, $\ell$ the maximum<br />
number of literals in constraints $I \in \Sigma$ and $k$ the maximal arity of predicate symbols occurring<br />
in these constraints. Then checking $\Sigma$ to be locally stratified can be done with a time complexity<br />
<strong>in</strong> O(k `2 n 2n2 `).<br />
□<br />
We now want to show that this complexity result is not accidental. For this we first show a<br />
technical lemma.<br />
Lemma 7.14. Let $\Sigma$ be a set of clauses containing only propositional atoms. Let $L$ be a<br />
literal such that its negation $\sim L$ does not occur in any of the clauses in $\Sigma$. Assume $\Sigma = \Sigma_1 \cup \Sigma_2$ such<br />
that $L$ does not occur in any of the clauses in $\Sigma_1$, but in all clauses of $\Sigma_2$. Moreover, $\Sigma_2$<br />
contains only clauses with exactly two literals. If $\Sigma_1$ is locally stratified and $\Sigma_2$ is stratified,<br />
then $\Sigma$ is locally stratified.<br />
Pro<strong>of</strong>. First assume that 2 conta<strong>in</strong>s a s<strong>in</strong>gle clause C = L _ L 0 .If 1 is not stratied, there<br />
is a partition 1 = 11 0 [[0 1n (n>2) with stratied labelled subsystems (0 1i 00 1i L i).<br />
Then at most one L k can be L 0 and we may dene<br />
(<br />
i 0 1i 0 if L i = L 0<br />
=<br />
1i 0 [fCg otherwise :<br />
By <strong>in</strong>duction ( 0 i 00 i L i) is a stratied labelled subsystem. Thus, = 0 1 [[0 n denes<br />
the required partition.<br />
Now assume that 1 is stratied. Let 1 = 11 [[ 1n be a partition <strong>in</strong>to pairwise<br />
disjo<strong>in</strong>t strata. If 1 conta<strong>in</strong>s just one clause C 0 with L 0 and no clause with L 0 ,we are done,<br />
s<strong>in</strong>ce C may be added to the stratum <strong>of</strong> C 0 . Analogously, C may dene its own stratum, if<br />
such aC 0 does not exist at all. Therefore, we are reduced to the follow<strong>in</strong>g two cases:<br />
{ There is more than one clause <strong>in</strong> 1 conta<strong>in</strong><strong>in</strong>g L 0 (and hence none conta<strong>in</strong><strong>in</strong>g L 0 ) and<br />
these clauses belong to dierent strata.<br />
{ There are exactly two clauses C 1 and C 2 conta<strong>in</strong><strong>in</strong>g L 0 or L 0 respectively. Inparticular,<br />
C 1 and C 2 belong to the same stratum 1i .<br />
In both cases we choose the literals L 1 = L and L 2 = L 0 to dene labelled subsystems<br />
( 1 1 L 1 ) and (fCg[ 1 ;fC 00 j C 00 conta<strong>in</strong>s L 0 g<br />
| {z }<br />
2<br />
0<br />
00<br />
2 L 2 ) <br />
where 2 0 (and hence also 00 2 ) are stratied by the previous remarks.<br />
In the rst case choose C 0 conta<strong>in</strong><strong>in</strong>g L 0 and another literal L 00 to dene a labelled<br />
subsystem<br />
( 0 1 [fCg<br />
| {z }<br />
3<br />
0<br />
00<br />
3 L 3)<br />
with L 3 = L 00 ,where1 0 is a proper subset <strong>of</strong> 1 not conta<strong>in</strong><strong>in</strong>g C 0 . By <strong>in</strong>duction 3 00 must<br />
be locally stratied.<br />
In the second case choose C 2 = L 0 _ C2 0 , a literal L00 <strong>in</strong> C2 0 and L 3 = L 00 ,which denes a<br />
labelled subsystem (3 0 00 3 L 3) as before with 3 0 = 0 1 [fCg with a proper subset 0 1 ( 1<br />
conta<strong>in</strong><strong>in</strong>g C 1 , but not C 2 .Thus, 3 0 and 00 3 are stratied.<br />
In both cases we have obta<strong>in</strong>ed a partition = 1 [ 2 0 [ 0 3 with stratied labelled<br />
subsystems ( 1 1 L 1 ), (2 0 00 2 L 2) and (3 0 00 3 L 3). S<strong>in</strong>ce the additional condition for<br />
local stratication is easily veried, we conclude that is locally stratied.<br />
For the general case we may assume that 0 = 1 [ ( 2 ;fCg) is locally stratied by<br />
successive application <strong>of</strong> the constructions <strong>in</strong> the rst part <strong>of</strong> this pro<strong>of</strong>. Then we observe<br />
that <strong>in</strong> the case <strong>of</strong> non-stratied 0 we do not change labels, when we add C. However, it<br />
may happen that one <strong>of</strong> these labels now is L. This label results (as label L 1 ) from add<strong>in</strong>g<br />
C 0 to some stratied constra<strong>in</strong>t set.From the construction <strong>of</strong> this local stratication and the<br />
fact that 2 is stratied we conclude that the other labels L 2 and L 3 are dierent from L,<br />
which guarantees the local stratication condition to hold also <strong>in</strong> the general case.<br />
For the case <strong>of</strong> 0 be<strong>in</strong>g stratied the arguments are the same as before except for the case<br />
that 0 conta<strong>in</strong>s exactly one clause C 0 with L 0 and none with L 0 . Then the correspond<strong>in</strong>g<br />
stratum may also conta<strong>in</strong> clauses C i with literals L i and L i+1 (i = 1:::m), where L 1<br />
occurs <strong>in</strong> C 0 and L m+1 = L.<br />
In particular, we have C m 2 2 and add<strong>in</strong>g C to this stratum is no longer possible. S<strong>in</strong>ce<br />
2 is stratied, we must have m > 0, but then the literals L 0 , L 1 and L dene a local<br />
stratication with associated constra<strong>in</strong>t sets 0 ;fC 0 g[fCg, 0 ;fC m g[fCg and 0<br />
respectively.<br />
□<br />
We shall use Lemma 7.14 in the proof of NP-hardness to shrink propositional constraint sets.<br />
Another way to reduce the technical complexity of that proof is to drop the restriction on $\Sigma$<br />
to contain only clauses with at least one negative literal. If $\Sigma$ is a set of propositional clauses<br />
containing neither the atom $q$ nor its negation, we add $\neg q$ to each clause to form the set $\Sigma_{ext}$<br />
of clauses.<br />
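The extension step just described can be illustrated by a minimal sketch. The clause representation (a frozenset of `(atom, sign)` pairs) and the function name `extend_with_negated_atom` are assumptions made for this illustration only and do not come from the text.<br />

```python
from typing import FrozenSet, Set, Tuple

# Hypothetical representation: a literal is (atom name, True if positive).
Literal = Tuple[str, bool]
Clause = FrozenSet[Literal]


def extend_with_negated_atom(clauses: Set[Clause], q: str = "q") -> Set[Clause]:
    """Add the negative literal "not q" to every clause.

    The atom q must not occur (positively or negatively) in any clause,
    matching the precondition stated in the text.
    """
    atoms = {atom for clause in clauses for (atom, _sign) in clause}
    if q in atoms:
        raise ValueError(f"atom {q!r} already occurs in the clause set")
    return {clause | {(q, False)} for clause in clauses}


if __name__ == "__main__":
    sigma = {
        frozenset({("a", True), ("b", False)}),
        frozenset({("b", True), ("c", True)}),
    }
    for clause in extend_with_negated_atom(sigma):
        print(sorted(clause))
```

Every clause of the result contains the same negative literal, which is what removes the need for the "at least one negative literal" restriction on the original set.<br />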
Lemma 7.15. Let $\Sigma$ be a set of propositional clauses each with at least two literals. Then<br />
$\Sigma_{ext}$ is locally stratified iff $\Sigma$ is satisfiable and locally stratified.<br />
Proof. First let $\Sigma$ be locally stratified and satisfiable. If $\Sigma$ is not stratified, we may choose<br />
the same labels to obtain a local stratification for $\Sigma_{ext}$.<br />
Thus, assume $\Sigma$ to be stratified. Then $(\Sigma_{ext}, \Sigma, \neg q)$ is a stratified labelled subsystem.<br />
S<strong>in</strong>ce all clauses <strong>in</strong> all other labelled subsystems conta<strong>in</strong> the literal :q, wehave to isolate these<br />
clauses. Therefore, take a model for whichisgiven by a set fL 1 :::L n g <strong>of</strong> literals occurr<strong>in</strong>g<br />
<strong>in</strong> which must be <strong>in</strong>terpreted as true. Tak<strong>in</strong>g L i as a label and the correspond<strong>in</strong>g labelled<br />
subsystem (i 000<br />
i L i), we obta<strong>in</strong> a proper subset i 0 ( ext .For #i 00 > 1wemay proceed<br />
with the other literals L j . The last step results <strong>in</strong> unary sets f:q _ L k g which are obviously<br />
stratied.<br />
Conversely, given a local stratication for ext we can remove :q to obta<strong>in</strong> a local stratication<br />
for . It rema<strong>in</strong>s to show that is satisable. If ext is stratied, this is obvious,<br />
because a literal L with L occurr<strong>in</strong>g <strong>in</strong> some clause <strong>in</strong> cannot occur <strong>in</strong> any clause<strong>of</strong>.<br />
If ext is not stratied, there is at least one stratied labelled subsystem ( 0 00 L) such<br />
that :q occurs <strong>in</strong> all clauses <strong>in</strong> 00 , i.e. 00 = 0 ext and 0 is satisable. This still holds if we<br />
put back the literal L and extend our <strong>in</strong>terprete L as false to satisfy clauses <strong>in</strong> ; 0 . ut<br />
Theorem 7.16. Let $\Sigma$ be a set of constraints. Then checking that $\Sigma$ is locally stratified is<br />
NP-hard.<br />
Proof. We show that the disjoint cover problem (DCP), which is known to be NP-complete,<br />
can be reduced in polynomial time to the local stratification problem. For this, let $(X, \mathcal{S})$<br />
be an instance of DCP, i.e. $X$ is a finite set, say $X = \{x_1, \dots, x_n\}$, and $\mathcal{S} = \{S_1, \dots, S_m\}$ is a<br />
subset of the power set $\mathcal{P}(X)$. The problem is to decide whether a subset $\mathcal{S}' \subseteq \mathcal{S}$ exists such<br />
that $X$ is the disjoint union of the sets in $\mathcal{S}'$. Such an $\mathcal{S}'$ is called a solution for $(X, \mathcal{S})$.<br />
Without loss of generality we may always assume that $X = \bigcup_{S_i \in \mathcal{S}} S_i$ holds. Moreover, we<br />
may allow $\mathcal{S}$ to be a multiset.<br />
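To make the decision problem concrete, the following is a minimal brute-force sketch of DCP. It enumerates all subfamilies and is therefore exponential; it is meant only as an illustration of the problem statement, not as the authors' procedure. The function name `find_disjoint_cover` and the list-of-sets encoding are assumptions of this sketch.<br />

```python
from itertools import combinations
from typing import Hashable, Iterable, List, Optional, Sequence, Set


def find_disjoint_cover(
    universe: Iterable[Hashable], family: Sequence[Set[Hashable]]
) -> Optional[List[int]]:
    """Return indices of sets in `family` whose disjoint union is `universe`.

    Returns None if no such subfamily exists. Indices are used so that
    `family` may be a multiset (the same set may appear more than once).
    """
    target = frozenset(universe)
    for size in range(len(family) + 1):
        for indices in combinations(range(len(family)), size):
            chosen = [family[i] for i in indices]
            # The union equals the universe and the sizes add up exactly
            # if and only if the chosen sets are pairwise disjoint and cover it.
            if (
                sum(len(s) for s in chosen) == len(target)
                and frozenset().union(*chosen) == target
            ):
                return list(indices)
    return None


if __name__ == "__main__":
    print(find_disjoint_cover({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}, {4}]))
    print(find_disjoint_cover({1, 2, 3}, [{1, 2}, {2, 3}]))
```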
We now associate with $(X, \mathcal{S})$ a set $\Sigma$ of constraints. For this let $p_{ij}$ be a propositional<br />
atom for all $x_i \in S_j$. For $S_i = \{x_{j_1}, \dots, x_{j_i}\} \in \mathcal{S}$ we define clauses $\neg p_{j_k i} \vee p_{j_\ell i}$ and $\neg p_{j_\ell i} \vee p_{j_k i}$<br />
for $k, \ell \in \{1, \dots, i\}$, $k \neq \ell$. We refer to these clauses as connection clauses with respect to<br />
$S_i$. For $x_i \in S_j \cap S_k$ ($j \neq k$) we define an exclusion clause $\neg p_{ij} \vee \neg p_{ik}$. Finally, for each $x_i$<br />
we define a cover clause $p_{ij_1} \vee \dots \vee p_{ij_m}$ for the sets $S_{j_1}, \dots, S_{j_m} \in \mathcal{S}$ containing $x_i$ provided<br />
$m \geq 2$. $\Sigma$ contains all these connection, exclusion and cover clauses.<br />
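The construction of connection, exclusion and cover clauses from a DCP-instance can be sketched as follows. The encoding is an assumption of this sketch: an atom is a pair `(x, j)` standing for the atom written $p_{ij}$ in the text (element $x_i$ in set $S_j$), and a clause is a frozenset of signed atoms. The function name `dcp_clauses` is hypothetical.<br />

```python
from itertools import combinations, permutations
from typing import Dict, FrozenSet, Hashable, List, Sequence, Set, Tuple

Atom = Tuple[Hashable, int]   # (element x, index j of a set containing x)
Literal = Tuple[Atom, bool]   # (atom, True if positive)
Clause = FrozenSet[Literal]


def dcp_clauses(family: Sequence[Set[Hashable]]) -> Set[Clause]:
    """Build the propositional clause set associated with a family of sets."""
    clauses: Set[Clause] = set()

    # Connection clauses: within one set S_j, the atoms of all its
    # elements imply each other (not p_aj or p_bj, for a != b).
    for j, members in enumerate(family):
        for a, b in permutations(members, 2):
            clauses.add(frozenset({((a, j), False), ((b, j), True)}))

    # For every element, collect the indices of the sets containing it.
    containing: Dict[Hashable, List[int]] = {}
    for j, members in enumerate(family):
        for x in members:
            containing.setdefault(x, []).append(j)

    for x, indices in containing.items():
        # Exclusion clauses: x may not be covered by two different sets.
        for j, k in combinations(indices, 2):
            clauses.add(frozenset({((x, j), False), ((x, k), False)}))
        # Cover clause: only generated if x lies in at least two sets.
        if len(indices) >= 2:
            clauses.add(frozenset(((x, j), True) for j in indices))

    return clauses


if __name__ == "__main__":
    for clause in dcp_clauses([{1, 2}, {2}]):
        print(sorted(clause))
```

For the family `[{1, 2}, {2}]` this yields two connection clauses for the first set, one exclusion clause and one cover clause for the element 2, and no cover clause for the element 1, since it lies in only one set.<br />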
Then we have to show that $(X, \mathcal{S})$ has a solution iff $\Sigma$ is locally stratified and satisfiable.<br />
For this we introduce a partial order $\le$ on DCP-instances, letting $(X_1, \mathcal{S}_1) < (X_2, \mathcal{S}_2)$ iff<br />
$$\sum_{S \in \mathcal{S}_1} |S| < \sum_{S \in \mathcal{S}_2} |S| \quad\text{or}\quad \Bigl( \sum_{S \in \mathcal{S}_1} |S| = \sum_{S \in \mathcal{S}_2} |S| \ \text{and}\ |\mathcal{S}_1| > |\mathcal{S}_2| \Bigr)$$<br />
holds.<br />
First let $\mathcal{S}' = \{S_{i_1}, \dots, S_{i_k}\}$ be a solution for $(X, \mathcal{S})$. Then $\Sigma$ is obviously satisfiable. In<br />
order to use induction with respect to $\le$ we consider the following two operations:<br />
- Replace $S_j \in \mathcal{S}'$ by $S_j - \{x_\ell\}$ and add $S_{m+1} = \{x_\ell\}$ for some $x_\ell \in S_j$.<br />
- Replace $S_j \notin \mathcal{S}'$ by $S_j - \{x_\ell\}$ for some $x_\ell \in S_j$.<br />
In both cases we obtain a smaller DCP-instance which has a solution. By induction the<br />
corresponding constraint set $\Sigma_1'$ is locally stratified.<br />
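A small sketch may help to see why both operations produce a smaller instance. It reads the order on DCP-instances as: first compare the total size of all member sets, and on a tie the instance with more sets counts as smaller. The function names and the restriction to sets with at least two elements (which keeps empty sets out of the family) are assumptions of this sketch, not statements from the text.<br />

```python
from typing import FrozenSet, Hashable, List, Tuple

Family = List[FrozenSet[Hashable]]


def measure(family: Family) -> Tuple[int, int]:
    """Key for the order: total size first, then the negated number of sets."""
    return (sum(len(s) for s in family), -len(family))


def is_smaller(first: Family, second: Family) -> bool:
    """True if `first` is strictly smaller than `second` in the order."""
    return measure(first) < measure(second)  # lexicographic comparison


def split_off(family: Family, j: int, x: Hashable) -> Family:
    """First operation: replace S_j by S_j - {x} and add the new set {x}."""
    if x not in family[j] or len(family[j]) < 2:
        raise ValueError("x must lie in S_j and S_j must have two or more elements")
    result = list(family)
    result[j] = family[j] - {x}
    result.append(frozenset({x}))
    return result


def remove_element(family: Family, j: int, x: Hashable) -> Family:
    """Second operation: replace S_j by S_j - {x}."""
    if x not in family[j] or len(family[j]) < 2:
        raise ValueError("x must lie in S_j and S_j must have two or more elements")
    result = list(family)
    result[j] = family[j] - {x}
    return result


if __name__ == "__main__":
    family = [frozenset({1, 2}), frozenset({2, 3})]
    print(is_smaller(split_off(family, 0, 2), family))
    print(is_smaller(remove_element(family, 1, 2), family))
```

The first operation keeps the total size and increases the number of sets; the second decreases the total size. Either way the result is strictly smaller, which is what the induction needs.<br />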
In the first case we remove all clauses with literals $p_{i,m+1}$ from $\Sigma_1'$. The resulting subset $\Sigma_1''$<br />
is still locally stratified. Now build the labelled subsystem $(\Sigma', \Sigma'', L)$ with the label $L = \neg p_{\ell j}$.<br />
The clauses in $\Sigma'$ (and hence in $\Sigma''$) do not contain $p_{\ell j}$, i.e. we omit the cover clause with<br />
respect to $x_\ell$ and connection clauses containing $p_{\ell j}$ with respect to $x_\ell \in S_j$. Clauses in $\Sigma''$<br />
containing $\neg p_{\ell j}$ arise from the restriction to keep at least two literals, hence must also lie in<br />
$\Sigma'$. Therefore, we obtain $\Sigma'' = \Sigma_1 \cup \Sigma_2$, where $\Sigma_2$ is stratified and contains only clauses with<br />
two literals, one of them being $\neg p_{\ell j}$, whereas clauses in $\Sigma_1$ do not contain $\neg p_{\ell j}$.<br />
Thus, the remaining connection clauses with respect to $x_\ell \in S_j$ and the exclusion clauses<br />
with respect to $x_\ell \in S_j$ occur in $\Sigma_2$. This implies $\Sigma_1 = \Sigma_1''$. From Lemma 7.14 we conclude<br />
that $\Sigma''$ is locally stratified.<br />
In the second case we build the labelled subsystem $(\Sigma', \Sigma'', L)$ with the label $L = p_{\ell j}$.<br />
The clauses in $\Sigma'$ (and hence in $\Sigma''$) do not contain $\neg p_{\ell j}$, i.e. we omit exclusion clauses<br />
and connection clauses containing $\neg p_{\ell j}$ with respect to $x_\ell \in S_j$. Again, the clauses in $\Sigma''$<br />
containing $p_{\ell j}$ only arise from the restriction to keep at least two literals. Hence, these clauses<br />
define a stratified subset $\Sigma_2$ of $\Sigma''$ (and of $\Sigma'$) containing only clauses with two literals.<br />
The remaining clauses form a subset $\Sigma_1$ and clauses in $\Sigma_1$ do not contain $p_{\ell j}$, i.e. the<br />
remaining connection clauses with respect to $x_\ell \in S_j$ and the cover clause with respect to $x_\ell$<br />
(if it contains just two literals) occur in $\Sigma_2$, which implies $\Sigma_1 = \Sigma_1'$. From Lemma 7.14 we<br />
conclude that $\Sigma''$ is locally stratified.<br />
Since in the first case ($x_\ell \in S_j \in \mathcal{S}'$) only the cover clause with respect to $x_\ell$ and connection<br />
clauses containing $p_{\ell j}$ and in the second case ($x_\ell \in S_j \notin \mathcal{S}'$) only exclusion clauses with respect<br />
to $x_\ell \in S_j$ and connection clauses containing $\neg p_{\ell j}$ are omitted in $\Sigma'$, the additional condition<br />
for local stratification is easily verified if all such choices are taken, provided there are at least<br />
three such possibilities. The only critical case arises if there are only three choices of the<br />
second kind, all with the same $x_\ell$. In this case we must have another $S_j = \{x_\ell\} \in \mathcal{S}'$ and we<br />
simply add the labelled subsystem $(\Sigma', \Sigma'', \neg p_{\ell j})$ to satisfy the additional local stratification<br />
condition.<br />
If there are at most two choices, then either<br />
- $\mathcal{S} = \mathcal{S}'$ and there is exactly one $S_j = \{x_k, x_\ell\}$ or<br />
- $\mathcal{S}'$ contains only unary sets and these are exactly $S_j = \{x_j\} \notin \mathcal{S}'$ and $S_k = \{x_k\} \notin \mathcal{S}'$ or<br />
- $\mathcal{S}'$ contains only unary sets and there is exactly one $S_j = \{x_k, x_\ell\} \notin \mathcal{S}'$.<br />
In the first case $\Sigma$ contains only two connection clauses with respect to $S_j$ and hence is<br />
obviously stratified. In the second case $\Sigma$ contains only four clauses<br />
$\neg p_{kk} \vee \neg p_{kk'}$, $p_{kk} \vee p_{kk'}$, $\neg p_{jj} \vee \neg p_{jj'}$ and $p_{jj} \vee p_{jj'}$<br />
for $S_{j'} = \{x_j\} \in \mathcal{S}'$ and $S_{k'} = \{x_k\} \in \mathcal{S}'$, hence $\Sigma$ is stratified.<br />
In the third case we obtain six clauses<br />
$\neg p_{kj} \vee p_{\ell j}$, $\neg p_{\ell j} \vee p_{kj}$, $\neg p_{kj} \vee \neg p_{kk'}$, $\neg p_{\ell j} \vee \neg p_{\ell\ell'}$, $p_{kj} \vee p_{kk'}$ and $p_{\ell j} \vee p_{\ell\ell'}$<br />
for $S_{k'} = \{x_k\} \in \mathcal{S}'$ and $S_{\ell'} = \{x_\ell\} \in \mathcal{S}'$. Using Lemma 7.14 it is easily verified that the labels<br />
$p_{kj}$, $p_{\ell j}$, $\neg p_{kj}$ and $\neg p_{\ell j}$ define a partition into stratified labelled subsystems.<br />
For the converse let us rst assume that is stratied, i.e. there cannot exist three<br />
clauses with literals L, L and L respectively. In connection clauses we may have L = p`j<br />
(or L = :p`j ) and it follows that does not conta<strong>in</strong> exclusion or cover clauses for x` 2 S j .<br />
This implies x` =2 S k for all k 6= j. Ifwehave an exclusion clause for x` 2 S j ,say :p`j _:p`k ,<br />
then we also have a cover clause p`j _ p`k _ C 0 and vice versa, but there cannot be further<br />
exclusion clauses nor connection clauses for x` 2 S j , i.e. C 0 false and S j = fx`g.<br />
To summarize, if x` occurs <strong>in</strong> more than one S j ,then#S j = 1 and there are just two such<br />
sets. Therefore, for a solution S 0 we take all S j with #S j 2 and select a s<strong>in</strong>gleton set fx`g<br />
for the rema<strong>in</strong><strong>in</strong>g elements.<br />
Next assume that is locally stratied, i.e. there is a local stratication with labels<br />
L 1 :::L n (n 3). Aga<strong>in</strong>, we proceed by <strong>in</strong>duction on DCP-<strong>in</strong>stances.<br />
For L 1 = :p`j and the stratied labelled subsystem (1 0 00 1 L 1) the cover clause for x`<br />
and connection clauses for x` 2 S j conta<strong>in</strong><strong>in</strong>g p`j have been removed from to give 1 0 ,<br />
hence must occur <strong>in</strong> two other labelled subsystems such that for a label :p ki we must have<br />
k:` and for a label p ki wemust have i 6= j.<br />
Analogously, for L 1 = p`j exclusion and connection clauses for x` 2 S j , the latter ones<br />
conta<strong>in</strong><strong>in</strong>g :p`j have been removed omitted <strong>in</strong> 1 0 and must occur <strong>in</strong> two other labelled<br />
subsystems such that for another positive labelp ki wemust have k:` and for a negative label<br />
:p ki wemust have i 6= j. Hence, for the m<strong>in</strong>imum number <strong>of</strong> three labels L 1 , L 2 and L 3 we<br />
obta<strong>in</strong> the follow<strong>in</strong>g four cases:<br />
$L_1 = \neg p_{\ell j}$, $L_2 = \neg p_{k_1 i_1}$, $L_3 = \neg p_{k_2 i_2}$ with pairwise different $\ell, k_1, k_2$,<br />
$L_1 = \neg p_{\ell j}$, $L_2 = \neg p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with $\ell \neq k_1$ and $j \neq i_2 \neq i_1$,<br />
$L_1 = \neg p_{\ell j}$, $L_2 = p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with $k_1 \neq k_2$ and $i_1 \neq j \neq i_2$, or<br />
$L_1 = p_{\ell j}$, $L_2 = p_{k_1 i_1}$, $L_3 = p_{k_2 i_2}$ with pairwise different $\ell, k_1, k_2$.<br />
For a negative literal $L_i = \neg p_{\ell j}$ or a positive literal $L_i = p_{\ell j}$ it follows from Lemma 7.14 that<br />
replacing $S_j$ by $S_j - \{x_\ell\}$ and $\{x_\ell\}$ defines a locally stratified constraint set. Therefore, by<br />
induction in all four cases (with the restrictions for indices) we obtain solutions for smaller<br />
DCP-instances with<br />
$\mathcal{S}_1 = \{S_1, \dots, S_j - \{x_\ell\}, \dots, S_m, \{x_\ell\}\}$,<br />
$\mathcal{S}_2 = \{S_1, \dots, S_{i_1} - \{x_{k_1}\}, \dots, S_m, \{x_{k_1}\}\}$ and<br />
$\mathcal{S}_3 = \{S_1, \dots, S_{i_2} - \{x_{k_2}\}, \dots, S_m, \{x_{k_2}\}\}$<br />
respectively. If any of these solutions contains both (or none) of the split components, e.g.<br />
$S_j - \{x_\ell\}$ and $\{x_\ell\}$, we also have a solution for the original problem.<br />
Therefore, assume that all solutions for (X S i ) must conta<strong>in</strong> exactly one <strong>of</strong> the splitted<br />
components denoted as S 1 , S 2 and S 3 . Let S 0 i = fSi 1 :::Si n i<br />
S i g be a solution for (X S i ).<br />
For i 6= j we proceed <strong>in</strong> the follow<strong>in</strong>g way:<br />
Start with T i = S 0 i ;S0 j , T j = S 0 j ;S0 i and T = fS jg and execute the follow<strong>in</strong>g steps until<br />
there are no more changes:<br />
{ Remove all sets from T i <strong>in</strong>tersect<strong>in</strong>g some set <strong>in</strong> T and let these dene a new T .<br />
{ Remove all sets from T j <strong>in</strong>tersect<strong>in</strong>g some set <strong>in</strong> T and let these dene a new T .<br />
F<strong>in</strong>ally, ifT i (and then also T j ) are non-empty, this means that we may replace T j S 0 j by<br />
T i or S 0 j ;T j by S 0 i ;T i. Accord<strong>in</strong>g to our assumption on solutions we always keep either S i<br />
or S j . Consequently, the procedure above denesacha<strong>in</strong><br />
S i ; S j i1 ; Si i1 ; Sj i2 ; Si i2 ;;Sj i k<br />
; S i i k<br />
; S j<br />
<br />
where neighbour<strong>in</strong>g sets have a common element. This is still true, if we replace S i by the<br />
orig<strong>in</strong>al S j .Tak<strong>in</strong>g together all three choices for (i j) we obta<strong>in</strong> an odd-length cycle<br />
S i 1 ; S i2 ; S i3 ;;S i m<br />
; S i 1<br />
with <strong>in</strong>tersect<strong>in</strong>g neighbour<strong>in</strong>g sets S ij 2S.Let 0 be the set <strong>of</strong> constra<strong>in</strong>ts correspond<strong>in</strong>g to<br />
fS i 1 :::S i m<br />
g. Then 0 diers from a subset 0 only by the fact that cover clauses may<br />
have been shortened. S<strong>in</strong>ce omitted (positive) literals <strong>in</strong> these cover clauses do not occur <strong>in</strong><br />
any other clauses <strong>in</strong> 0 , this must be locally stratied i 0 is locally stratied. Therefore,<br />
the pro<strong>of</strong> is completed, if we can show that cycles as above always dene constra<strong>in</strong>t sets that<br />
are not locally stratied or not satisable.<br />
With each neighbour<strong>in</strong>g pair (S ij S ij+1 )wemay associate a witness x 2 S i j<br />
\ S ij+1 . Then<br />
without loss <strong>of</strong> generality (just rename <strong>in</strong>dices) we canalways assume a cycle<br />
S 1<br />
x1<br />
; S 2<br />
x2<br />
; S 3 ;;S m<br />
x m;<br />
Sm+1 = S 1<br />
and show that the follow<strong>in</strong>g conditions can be achieved:<br />
- $m$ is odd,<br />
- the $x_i$ are pairwise different,<br />
- the $S_i$ are pairwise different and<br />
{ the cover clause <strong>in</strong> 0 for x` has the form p`` _ p``+1 _ C0`, where literals <strong>in</strong> C0` do not<br />
occur <strong>in</strong> any other clause <strong>in</strong> 0 .<br />
The last condition will allow us to assume without loss <strong>of</strong> generality thatcover clauses <strong>in</strong> 0<br />
only conta<strong>in</strong> two literals.<br />
In order to achieve such a cycle recall that our original cycle is composed of three subpaths<br />
(called flanks) corresponding to a solution of a smaller DCP-instance and each pair of flanks<br />
has a common set (called corner). If S i ( S j is such acorner, then the follow<strong>in</strong>g cases may<br />
arise:<br />
{ The two nieghbours S i and S k co<strong>in</strong>cide which allows to remove the corner S j and to<br />
identify S i with S k .<br />
{ If S i , S j and S k are pairwise dierent, we either obta<strong>in</strong> a simple cycle <strong>of</strong> length 3 or let<br />
the caycle unchanged.<br />
{ If one <strong>of</strong> the neighbours equals S j ,say S k = S j , then S k is not common <strong>in</strong> the solutions<br />
for the ank with S j and S k , i.e. there must be some S j 0 <strong>in</strong> the same solution as S i with<br />
S j \ S j 0 6= . In this case we may replace the even number <strong>of</strong> edges between S j and S j 0<br />
by a s<strong>in</strong>gle edge. By the same argument theeven number <strong>of</strong> edges between the opposite<br />
edge S` (<strong>in</strong> the same ank) and some S`0 by a s<strong>in</strong>gle edge.<br />
In all these cases the cycle length remains odd.<br />
If $x_i$ occurs twice, say between $S_{i_1}$ and $S_{i_2}$ and between $S_{i_3}$ and $S_{i_4}$ respectively, we may<br />
assume paths from $S_{i_1}$ to $S_{i_4}$ and from $S_{i_2}$ to $S_{i_3}$ of length $n_1$ and $n_2$ respectively. Then there<br />
are cycles with $S_{i_2}$, $S_{i_3}$ and $S_{i_1}$, $S_{i_4}$ connected by $x_i$ respectively and one of the corresponding<br />
lengths $n_1 + 1$ or $n_2 + 1$ must be odd. The only critical cases occur for $S_{i_2} = S_{i_4}$ or $S_{i_1} = S_{i_3}$,<br />
but these correspond to corners that have already been removed.<br />
Finally, in order to achieve the condition on cover clauses consider $S_i \cap S_j \neq \emptyset$.<br />
- If $S_i$ and $S_j$ belong to different flanks, but to the same solution, then we have $S_i = S_j$<br />
and we may identify them and remove the even number of edges between them.<br />
- If $S_i$ and $S_j$ belong to different flanks and different solutions, then for $S_i \neq S_j$ we may<br />
replace the odd number of edges between them by a single new edge, whereas for $S_i = S_j$<br />
we may consider the odd number of edges between them as our new cycle.<br />
- If $S_i$ and $S_j$ belong to the same flank, then the number of edges between them is even iff<br />
$S_i = S_j$, thus these edges may be removed or replaced by a single new edge.<br />
The conditions on our cycle now allow clauses to be arranged in such a way that we have<br />
$$\Sigma' = \{ L_1 \vee L_2 ,\ L_2 \vee L_3 ,\ \dots ,\ L_{p-1} \vee L_p ,\ L_p \vee L_1 \}$$<br />
for an even number $p$ with $L_{p/2+i} = L_i$ for $i = 1, \dots, p/2$. Such a $\Sigma'$, however, is not<br />
satisfiable.<br />
$\Box$<br />
7.8 Conclusion<br />
In this article we investigated the limits of rule triggering systems (RTSs) for maintaining<br />
database integrity. The first result assures the existence of non-repairable transitions. In order<br />
to disallow such transitions the constraint implication problem must be decidable.<br />
Secondly, we analyzed critical trigger paths in rule hypergraphs associated with RTSs. We<br />
could show that the existence of critical trigger paths leads to RTSs which may invalidate the<br />
effect of some transitions, even if these are repairable. Such behaviour can only be excluded<br />
for locally stratified constraint sets. In this case the needed RTS can be computed effectively,<br />
but checking local stratification is NP-hard.<br />
To summarize, both results limit the applicability of RTSs for integrity maintenance under<br />
the assumption that the intended effects of user-defined transitions should be preserved.<br />
Fortunately, there is a stronger condition on a constraint set, namely to be stratified, which is<br />
sufficient for reasonable rule behaviour, but not necessary. Stratified constraint sets occur, if<br />
we have a relational database schema in Entity-Relationship normal form, which means that<br />
it is equivalent to an ER-schema without exclusion constraints. Checking stratification is not<br />
only effective, but also efficient.<br />
On the other hand, the RTS approach to integrity maintenance completely ignores user-defined<br />
transitions. Thus, a second conclusion from our studies is that these should be taken<br />
into consideration.<br />
References for Chapter 7<br />
1. S. Ceri, J. Widom: Deriv<strong>in</strong>g Production Rules for Constra<strong>in</strong>t Ma<strong>in</strong>tenance, Proc. 16th Conf. on<br />
VLDB, Brisbane (Australia), August 1990, 566-577<br />
156
2. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca: Automatic Generation <strong>of</strong> Production Rules for<br />
Integrity Ma<strong>in</strong>tenance. ACM ToDS, vol. 19(3), 1994, 367-422.<br />
3. S. Chakravarty, J. Widom (Eds.): Research Issues <strong>in</strong> Data Eng<strong>in</strong>eer<strong>in</strong>g | Active <strong>Databases</strong>, Proc.,<br />
Houston, Februar 1994<br />
4. M. Gertz, U. W. Lipeck: Deriv<strong>in</strong>g Integrity Ma<strong>in</strong>ta<strong>in</strong><strong>in</strong>g Triggers from Transition Graphs, <strong>in</strong> Proc.<br />
9th ICDE, IEEE Computer Society Press, 1993, 22-29<br />
5. H. Mannila, K.-J. Raiha: The Design <strong>of</strong> Relational <strong>Databases</strong>, Addison-Wesley 1992<br />
6. K.-D. Schewe, B. Thalheim: Consistency Enforcement <strong>in</strong> Active <strong>Databases</strong>, <strong>in</strong> S. Chakravarty, J.<br />
Widom (Eds.): Research Issues <strong>in</strong> Data Eng<strong>in</strong>eer<strong>in</strong>g | Active <strong>Databases</strong>, Proc., Houston, Februar<br />
1994<br />
7. K.-D. Schewe, B. Thalheim: Active Consistency Enforcement for Repairable Database Transitions,<br />
<strong>in</strong> S.Conrad, H.-J. Kle<strong>in</strong>, K.-D. Schewe (Eds.): Integrity <strong>in</strong> <strong>Databases</strong>, Proc. 6th Int. Workskop<br />
on Foundations <strong>of</strong> Models and Languages for Data and <strong>Object</strong>s, Schlo Dagstuhl, 1996, 87-102,<br />
available via http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceed<strong>in</strong>gs.html<br />
8. B. Thalheim: Foundations <strong>of</strong> entity-relationship model<strong>in</strong>g, Annals <strong>of</strong> Mathematics and Articial<br />
Intelligence, vol. 7, 1993, 197-256<br />
9. S. D. Urban, L. Delcambre: Constra<strong>in</strong>t Analysis: a Design Process for Specify<strong>in</strong>g Operations on<br />
<strong>Object</strong>s, IEEETrans. on Knowledge and Data Eng<strong>in</strong>eer<strong>in</strong>g, vol. 2 (4), December 1990<br />
10. J. Widom, S. J. F<strong>in</strong>kelste<strong>in</strong>: Set-oriented Production Rules <strong>in</strong> Relational Database Systems, <strong>in</strong><br />
Proc. SIGMOD 1990, 259-270<br />
157
Chapter 8<br />
Consistency Enforcement in<br />
Entity-Relationship and<br />
Object-Oriented Models<br />
Contents<br />
8.1 Introduction<br />
8.2 Rule Systems for Consistency Maintenance<br />
8.2.1 Motivation<br />
8.2.2 ECA-Rules<br />
8.3 Problems with Rule-Based Integrity Enforcement<br />
8.3.1 Non-Repairable Transactions<br />
8.3.2 Critical Paths<br />
8.3.3 Extensions<br />
8.4 Well-behaving Rule Systems<br />
8.4.1 Stratified Rule Systems<br />
8.4.2 Constraints Arising from Entity-Relationship Schemata<br />
8.4.3 Constraints Arising from Simple Object-Oriented Schemata<br />
8.5 Conflict Resolution<br />
8.5.1 Problem of Ambiguity<br />
8.5.2 Decidability<br />
8.6 Conclusion<br />
This chapter contains a reprint of<br />
K.-D. Schewe. Consistency Enforcement in Entity-Relationship and Object-Oriented<br />
Models. Data & Knowledge Engineering. 1998 (to appear).<br />
Abstract. Integrity maintenance is considered one of the major application fields of rule<br />
triggering systems (RTSs). In the case of a given integrity constraint being violated by a<br />
database transaction these systems trigger repairing actions. However, it has been shown<br />
that for any set of constraints there exist non-repairable transactions, which depend on the<br />
closure of the constraint set. Even if non-repairable transactions are excluded, this does not<br />
restrain the RTS from producing undesired behaviour.<br />
Analyzing the behaviour of RTSs leads to the definition of critical paths in associated rule<br />
hypergraphs and the requirement of such paths being absent. It is shown that this requirement<br />
can be satisfied if the underlying set of constraints is stratified and that this is always the<br />
case for the structural constraints in Entity-Relationship and simple object-oriented models.<br />
Moreover, in both cases there is no ambiguity for the selection of rules.<br />
Keywords. integrity constraints, consistency enforcement, active databases, Entity-Relationship,<br />
object-orientation, analysis of rule systems<br />
8.1 Introduction<br />
Active databases (ADBs) aim at extending relational (or object-oriented) DBMSs by rule<br />
triggering systems (RTSs), i.e. by sets of rules which on a given event and in the case of a<br />
condition being satisfied trigger actions on the database (ECA-rules). Events can be external<br />
events, time conditions or internal events resulting from operations on the database. Conditions<br />
are usually given by boolean queries that have to be evaluated against the database.<br />
The action part consists of a sequence of basic operations to insert, delete or update tuples<br />
(or objects respectively) in the database.<br />
The work in [3, 4, 8, 16, 17] and partly in [5] considers the problem of enforcing database<br />
integrity by the use of RTSs. The results concern the generation of repairing ECA-rules and<br />
partly the analysis of the resulting RTS. This analysis concentrates on the termination of the<br />
rule system, the independence of the final database state from the chosen selection order of<br />
the rules (confluence) and on consistency.<br />
These requirements are not sufficient for a reasonable rule behaviour, because it is easy<br />
to define an RTS that empties the database in case of any constraint violation. Therefore,<br />
we impose an additional requirement, which informally means that the intended effect of a<br />
transaction may not be turned into its opposite by the RTS.<br />
In this paper we investigate general problems with RTSs and show that these cannot occur<br />
in simple Entity-Relationship and object-oriented schemata. The first problem concerns<br />
the existence of non-repairable transactions that are determined by the closure of the constraint<br />
set. The second problem arises from the analysis of how to obtain RTSs that definitely<br />
repair constraint violations by a (repairable) transaction without invalidating its intended<br />
effect. Given an RTS we associate with it a rule hypergraph which corresponds to the possible<br />
sequences of triggered rules. We define critical trigger paths in these hypergraphs that correspond<br />
to the propagation of conditions. Then it can be shown that the existence of a single<br />
critical trigger path makes the RTS work incorrectly for at least one transaction.<br />
Next we analyze constraint sets in order to detect whether it is possible to define an<br />
RTS of repairing actions such that the critical trigger paths in its associated hypergraph can<br />
only invalidate non-repairable transactions. For this we introduce stratified constraint sets<br />
that satisfy this condition. We apply our results to the case of specific Entity-Relationship<br />
and simple object-oriented models and demonstrate that structurally determined constraint<br />
sets in these cases are always stratified. Furthermore, it will be shown that in these cases<br />
ambiguities arising from different execution orders can also be detected.<br />
The work presented in this paper extends previous research in [12, 14] in that theoretical<br />
investigations about the strengths and weaknesses of the rule triggering approach for integrity<br />
maintenance have been directly tied in with consistency in Entity-Relationship and simple<br />
object-oriented models. A preliminary version was presented at the 1997 conference on Conceptual<br />
Modelling (ER '97) [13].<br />
8.2 Rule Systems for Consistency Maintenance<br />
Let us first consider the relational data model with integrity constraints given by closed<br />
formulae $I$ in implicative normal form<br />
$$\forall x_1, \dots, x_k .\ \exists y_1, \dots, y_\ell .\ p_1(\vec{x}_1) \wedge \dots \wedge p_n(\vec{x}_n) \Rightarrow q_1(\vec{y}_1) \vee \dots \vee q_m(\vec{y}_m) . \tag{8.76}$$<br />
The vectors $\vec{x}_i$ consist only of universally quantified variables $x_j$ and the vectors $\vec{y}_i$ consist of<br />
both universally quantified variables $x_j$ and existentially quantified variables $y_j$. The predicate<br />
symbols $p_i$, $q_j$ correspond either to a relation of the schema or are comparison predicates<br />
($=$, $\neq$,
8.2.1 Motivation<br />
Let us first illustrate consistency enforcement using a small fragment of the example used in<br />
[4, 10].<br />
Example 8.1. Let us define a schema with some simple functional and inclusion constraints.<br />
For simplicity we omit all types. The relation schemata are<br />
WIRE = { wire_id, connection, wire_type, voltage, power } ,<br />
TUBE = { tube_id, connection, tube_type } and<br />
CONNECTION = { connection, from, to }<br />
These are used to express that there are tubes between two locations and wires in these tubes.<br />
In addition consider the following constraints:<br />
FD_1 $\equiv$ WIRE : wire_id $\to$ connection, wire_type, voltage, power<br />
FD_2 $\equiv$ TUBE : tube_id $\to$ connection, tube_type<br />
FD_3 $\equiv$ CONNECTION : connection $\to$ from, to<br />
ID_1 $\equiv$ WIRE[connection] $\subseteq$ TUBE[connection]<br />
ID_2 $\equiv$ TUBE[connection] $\subseteq$ CONNECTION[connection]<br />
The three functional dependencies express that the values of wire_id, tube_id and connection<br />
are unique in relations over WIRE, TUBE and CONNECTION respectively. The two<br />
inclusion constraints express that there is neither a wire nor a tube without a corresponding tuple in<br />
a relation over CONNECTION.<br />
Then the following relations define an instance of the schema:<br />
WIRE<br />
wire_id | connection | wire_type | voltage | power<br />
4711 | HH-HB | Koax | 12 | 600<br />
4814 | HH-H | Tel | 12 | 600<br />
TUBE<br />
tube_id | connection | tube_type<br />
8314 | HH-H | GX44<br />
8511 | HH-HB | GX44<br />
023 | HB-H | T33<br />
CONNECTION<br />
connection | from | to<br />
HH-H | Hamburg | Hannover<br />
HH-HB | Hamburg | Bremen<br />
HB-H | Bremen | Hannover<br />
It is easy to see that this instance satisfies the constraints above.<br />
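The claim can also be checked mechanically. The following Python sketch is an illustration only: the encoding of relations as sets of tuples, the use of strings for identifiers and the helper names `key_holds`, `inclusion_holds` and `instance_is_consistent` are assumptions of this sketch and not part of the original example.<br />

```python
# Instance of Example 8.1. Identifiers are kept as strings (an assumption of
# this sketch) so that the tube id "023" is preserved verbatim.
WIRE = {
    ("4711", "HH-HB", "Koax", 12, 600),
    ("4814", "HH-H", "Tel", 12, 600),
}
TUBE = {
    ("8314", "HH-H", "GX44"),
    ("8511", "HH-HB", "GX44"),
    ("023", "HB-H", "T33"),
}
CONNECTION = {
    ("HH-H", "Hamburg", "Hannover"),
    ("HH-HB", "Hamburg", "Bremen"),
    ("HB-H", "Bremen", "Hannover"),
}


def key_holds(relation, key_index=0):
    """Key dependency: no two distinct tuples agree on the key attribute."""
    keys = [row[key_index] for row in relation]
    return len(keys) == len(set(keys))


def inclusion_holds(sub, sub_index, sup, sup_index):
    """Inclusion dependency: sub[sub_index] is a subset of sup[sup_index]."""
    return {row[sub_index] for row in sub} <= {row[sup_index] for row in sup}


def instance_is_consistent(wire, tube, connection):
    """Check FD_1, FD_2, FD_3, ID_1 and ID_2 on the given relations."""
    return (
        key_holds(wire)                            # FD_1
        and key_holds(tube)                        # FD_2
        and key_holds(connection)                  # FD_3
        and inclusion_holds(wire, 1, tube, 1)      # ID_1
        and inclusion_holds(tube, 1, connection, 0)  # ID_2
    )


print(instance_is_consistent(WIRE, TUBE, CONNECTION))
```

Adding a wire whose connection has no tube makes the check fail on ID_1, which is the situation the repairing operations below are meant to handle.<br />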
Now consider the operation insert_WIRE(t). This may lead to a violation of constraint ID_1,<br />
in which case we must add a tuple to TUBE. Hence it can be replaced by<br />
insert_WIRE(t) ;<br />
IF connection(t) $\notin$ TUBE[connection]<br />
THEN insert_TUBE(?, connection(t), ?)<br />
ENDIF<br />
Here the question marks stand for arbitrarily chosen values of the corresponding data type.<br />
Similarly, the operation delete_TUBE(t) may also violate ID_1. Therefore, we may replace<br />
delete_TUBE(t) by<br />
delete_TUBE(t) ;<br />
IF connection(t) $\in$ WIRE[connection] $-$ TUBE[connection]<br />
THEN FOR ALL t' WITH connection(t') = connection(t) DO<br />
delete_WIRE(t')<br />
ENDFOR<br />
ENDIF<br />
In order to enforce FD_2 we may then replace insert_TUBE(t) by<br />
IF $\forall t' \in$ TUBE . tube_id(t) $\neq$ tube_id(t')<br />
THEN insert_TUBE(t)<br />
ENDIF<br />
Let us now add the exclusion constraint ED $\equiv$ WIRE[wire_id] $\parallel$ TUBE[tube_id]. In order to<br />
enforce this constraint insertions into one of WIRE or TUBE should be followed by deletions<br />
in the other. The resulting transactions are<br />
insert_WIRE(t) ;<br />
FOR ALL t' $\in$ TUBE WITH tube_id(t') = wire_id(t) DO<br />
delete_TUBE(t')<br />
ENDFOR<br />
and<br />
insert_TUBE(t) ;<br />
FOR ALL t' $\in$ WIRE WITH wire_id(t') = tube_id(t) DO<br />
delete_WIRE(t')<br />
ENDFOR<br />
If we now take together FD_2, ID_1 and ED we must be very careful. E.g., if we execute<br />
insert_WIRE(8511, HH-HB, Koax, 12, 600) on the instance above, we may first delete the tuple<br />
(8511, HH-HB, GX44) in TUBE in order to enforce ED and then the two tuples<br />
(4711, HH-HB, Koax, 12, 600) and (8511, HH-HB, Koax, 12, 600) in WIRE in order to enforce ID_1. The resulting<br />
instance would be (omitting CONNECTION):<br />
WIRE<br />
wire_id | connection | wire_type | voltage | power<br />
4814 | HH-H | Tel | 12 | 600<br />
TUBE<br />
tube_id | connection | tube_type<br />
8314 | HH-H | GX44<br />
023 | HB-H | T33<br />
Thus, the "effect" of the original operation, i.e. insertion of a tuple into WIRE, is completely<br />
destroyed. The new effect is a deletion in WIRE and TUBE.<br />
$\Box$<br />
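The destructive interplay at the end of Example 8.1 can be replayed with a few lines of Python. This is a hedged sketch: the function name `insert_wire_with_repairs`, the set-of-tuples encoding and the fixed order "first ED, then ID_1" are assumptions chosen to mirror the narrative, not a definition taken from the paper.<br />

```python
# WIRE and TUBE from the instance of Example 8.1 (CONNECTION is omitted, as in
# the text). Identifiers are strings so that "023" is kept verbatim.
WIRE = {
    ("4711", "HH-HB", "Koax", 12, 600),
    ("4814", "HH-H", "Tel", 12, 600),
}
TUBE = {
    ("8314", "HH-H", "GX44"),
    ("8511", "HH-HB", "GX44"),
    ("023", "HB-H", "T33"),
}


def insert_wire_with_repairs(wire, tube, t):
    """Insert t into WIRE, then repair ED and afterwards ID_1 by deletions."""
    wire = set(wire) | {t}
    # ED: wire ids and tube ids must be disjoint, so delete clashing tubes.
    tube = {u for u in tube if u[0] != t[0]}
    # ID_1: every wire needs a tube on the same connection, so delete the
    # wires whose connection no longer occurs in TUBE.
    tube_connections = {u[1] for u in tube}
    wire = {w for w in wire if w[1] in tube_connections}
    return wire, tube


new_wire, new_tube = insert_wire_with_repairs(
    WIRE, TUBE, ("8511", "HH-HB", "Koax", 12, 600)
)
print(sorted(new_wire))
print(sorted(new_tube))
```

The inserted tuple is absent from the returned WIRE relation, which is exactly the loss of the intended effect described above.<br />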
8.2.2 ECA-Rules<br />
Active databases approach integrity enforcement by using ECA-rules. The general form of<br />
these rules is<br />
ON $\langle$event$\rangle$ IF $\langle$condition$\rangle$ DO $\langle$action$\rangle$ . (8.80)<br />
$\langle$event$\rangle$ corresponds to an internal event, i.e. an insert- or delete-operation. $\langle$condition$\rangle$ is a<br />
formula to be evaluated against the actual database state, e.g. it could be the negation $\neg I$<br />
of a constraint $I$ in implicative normal form (8.76). $\langle$action$\rangle$ is a sequence of basic insert- or<br />
delete-operations to be triggered, i.e. to be executed if the event occurred and the condition<br />
is satisfied.<br />
In the sequel the assumed execution model for ECA-rules relies on a deferred mode, i.e.<br />
the system RTS of rules is started after finishing a transaction. Furthermore, we do not assume<br />
any order of the rules. Instead, the execution model relies on demonic non-determinism,<br />
i.e. if the events of several rules $r_1, \dots, r_n$ occur and their conditions evaluate to true, any of<br />
these $r_i$ may be executed unless it is undefined.<br />
Example 8.2. Let us look again at the schema used in Example 8.1. For the sake of simplicity<br />
we only consider the constraints ID_1 and ED. Then the changed operations can be expressed<br />
by rules. First consider insert_WIRE(t). The corresponding rule would be<br />
ON insert_WIRE(w, c, t, v, p) IF $c \notin$ TUBE[connection] DO insert_TUBE(?, c, ?) (8.81)<br />
with ? standing for any value to be selected. This form is not yet exactly the one in (8.80),<br />
but writing relations as predicates we obtain the following:<br />
ON insert_WIRE IF $\exists w, c, t, v, p .\ \forall x, t' .$ WIRE$(w, c, t, v, p) \wedge \neg$TUBE$(x, c, t')$<br />
DO insert_TUBE(?, c, ?) . (8.82)<br />
Note that the condition part is exactly the negation of ID_1. Analogously, the other changes<br />
to operations discussed in Example 8.1 give rise to the following rules:<br />
ON delete_TUBE IF $\exists w, c, t, v, p .\ \forall x, t' .$ WIRE$(w, c, t, v, p) \wedge \neg$TUBE$(x, c, t')$<br />
DO delete_WIRE(w, c, t, v, p) (8.83)<br />
ON insert_WIRE IF $\exists w, c, t, v, p, c', t' .$ WIRE$(w, c, t, v, p) \wedge$ TUBE$(w, c', t')$<br />
DO delete_TUBE(w, c', t') (8.84)<br />
ON insert_TUBE IF $\exists x, c, t, v, p, c', t' .$ WIRE$(x, c', t', v, p) \wedge$ TUBE$(x, c, t)$<br />
DO delete_WIRE(x, c', t', v, p) (8.85)<br />
In order to fit with the intended behaviour described in Example 8.1 it may occur that the<br />
same rule has to be executed several times. This can be achieved, if the semantics of the<br />
IF-part is considered as a WHILE-condition.<br />
$\Box$<br />
Given a single constraint $I$ in implicative normal form (8.76) we already get minimum requirements<br />
for repairing rules. If a relation symbol $p$ occurs on the left hand side (right hand<br />
side) of (8.76), then each insert- (delete-)operation on $p$ may violate (8.76), hence give rise<br />
to event-parts. The corresponding condition-part is simply $\neg I$. However, for the action-part<br />
there are still several alternatives.<br />
We call a system of ECA-rules complete iff for all these cases of events and conditions<br />
there exists at least one repairing rule, i.e. whenever the rule is selectable in some database<br />
state, the execution of the action part will establish $I$ as a postcondition. However, we exclude<br />
those rules which simply invalidate the event. For transactions we simply consider sequences<br />
of insert- and delete-operations.<br />
Example 8.3. The four rules in the previous Example 8.2 form a complete system of ECA-rules,<br />
if we consider only the constraints ID_1 and ED from Example 8.1. However, if we also<br />
consider the other constraints in that example, we have to define at least five more rules to<br />
obtain a complete rule set, one rule for each of the three key constraints corresponding to the<br />
events insert_WIRE, insert_TUBE and insert_CONNECTION, respectively, and two rules for the<br />
inclusion constraint ID_2 corresponding to insert_TUBE and delete_CONNECTION.<br />
$\Box$<br />
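The counting in Example 8.3 follows directly from the minimum requirements stated before it: one event per relation symbol on the left hand side (insert) and on the right hand side (delete) of each constraint. The sketch below makes this explicit. The dictionary encoding of constraints by their left and right relation symbols and the function name `required_events` are assumptions of this illustration; comparison predicates are simply left out because they do not give rise to events.<br />

```python
def required_events(constraints):
    """Return the (constraint, operation, relation) triples needing a rule.

    constraints maps a constraint name to a pair (lhs, rhs) listing the
    relation symbols on the left and right hand side of its implicative
    normal form.
    """
    triples = []
    for name, (lhs, rhs) in constraints.items():
        for relation in sorted(set(lhs)):
            triples.append((name, "insert", relation))
        for relation in sorted(set(rhs)):
            triples.append((name, "delete", relation))
    return triples


# Constraints of Example 8.1, reduced to the relation symbols they mention.
CONSTRAINTS = {
    "FD_1": (["WIRE", "WIRE"], []),
    "FD_2": (["TUBE", "TUBE"], []),
    "FD_3": (["CONNECTION", "CONNECTION"], []),
    "ID_1": (["WIRE"], ["TUBE"]),
    "ID_2": (["TUBE"], ["CONNECTION"]),
    "ED": (["WIRE", "TUBE"], []),
}

for triple in required_events(CONSTRAINTS):
    print(triple)
```

Restricted to ID_1 and ED this yields four event and condition pairs, and the remaining constraints contribute five more, in line with the example.<br />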
8.3 Problems with Rule-Based Integrity Enforcement<br />
If we were given only a single constraint $I$, then any of the rule constructions discussed in<br />
the previous section would be sufficient to enforce consistency. However, real systems, like<br />
the tiny one in Example 8.1, contain many constraints and the interference of the rules may<br />
lead to problems.<br />
8.3.1 Non-Repairable Transactions<br />
Let us first demonstrate the insufficiency of a naive RTS approach using a second trivial<br />
example. In "real" applications as in the previous subsection the situation of Example 8.4 will<br />
not occur in such an obvious way, but there are always implied and in general not detectable<br />
constraints leading to analogous problems.<br />
Example 8.4. Take two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and<br />
$I_2 \equiv p(x) \wedge q(x) \Rightarrow \mathit{false}$. This implies that $p$ is always empty, hence insertions into $p$ should<br />
be disallowed. Then we obtain the following repairing rules:<br />
$R_1$ : ON insert_p IF $\exists x .\ p(x) \wedge \neg q(x)$ DO insert_q(x)<br />
$R_2$ : ON delete_q IF $\exists x .\ p(x) \wedge \neg q(x)$ DO delete_p(x)<br />
$R_3$ : ON insert_p IF $\exists x .\ p(x) \wedge q(x)$ DO delete_q(x)<br />
$R_4$ : ON insert_q IF $\exists x .\ p(x) \wedge q(x)$ DO delete_p(x)<br />
Here again the condition part in $R_1$ and $R_2$ is simply $\neg I_1$ and the condition part in $R_3$ and<br />
$R_4$ is $\neg I_2$.<br />
If we try to execute a transaction insert_p(a) on a database state satisfying $q(a)$, then we<br />
successively trigger the rules $R_3$ and $R_2$ with the effect of only deleting $a$ in $q$. This contradicts<br />
the original intention of the transaction.<br />
$\Box$<br />
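The run described in Example 8.4 can be traced with a small simulation. This is a sketch under explicit assumptions: the table `RULES`, the function `run` and the idea of passing a fixed schedule of rule names are illustrative devices, since the execution model of the text is non-deterministic and does not prescribe an order. Each rule is applied to all witnesses of its condition at once, which corresponds to reading the IF-part as a WHILE-condition.<br />

```python
# Each rule: (triggering event, witnesses of the condition, action).
# The witness function returns all x satisfying the condition in state (p, q).
RULES = {
    "R1": ("insert_p", lambda p, q: p - q, "insert_q"),
    "R2": ("delete_q", lambda p, q: p - q, "delete_p"),
    "R3": ("insert_p", lambda p, q: p & q, "delete_q"),
    "R4": ("insert_q", lambda p, q: p & q, "delete_p"),
}


def run(p, q, first_event, schedule):
    """Fire the rules named in schedule, in order, whenever they are enabled.

    p and q are the states of the two relations after the transaction,
    first_event is the event raised by the transaction itself.
    """
    p, q = set(p), set(q)
    events = {first_event}
    fired = []
    for name in schedule:
        event, witnesses, action = RULES[name]
        xs = witnesses(p, q)
        if event not in events or not xs:
            continue  # rule is not selectable in the current state
        if action == "insert_q":
            q |= xs
        elif action == "delete_q":
            q -= xs
        elif action == "delete_p":
            p -= xs
        events.add(action)  # the action raises a new internal event
        fired.append(name)
    return p, q, fired


# State with q(a); the transaction insert_p(a) has just been executed.
final_p, final_q, fired = run({"a"}, {"a"}, "insert_p", ["R3", "R2"])
print(final_p, final_q, fired)
```

Both relations end up empty: the insertion into $p$ has been undone and the only lasting change is the deletion of $a$ in $q$.<br />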
In order to analyze the unintended behaviour in Example 8.4 consider a set $\Sigma$ of constraints in<br />
implicational normal form. Let $\Sigma^*$ denote the (semantic) closure, i.e. $\Sigma^* = \{ I \mid \Sigma \models I \}$. Now<br />
let $I \in \Sigma^*$ be non-trivial, i.e. it does not hold in all database states. Write $I$ in implicational<br />
normal form<br />
$$I \equiv p_1(\vec{x}_1) \wedge \dots \wedge p_n(\vec{x}_n) \Rightarrow q_1(\vec{y}_1) \vee \dots \vee q_m(\vec{y}_m)$$<br />
and let $p_{i_1}, \dots, p_{i_k}$ and $q_{j_1}, \dots, q_{j_\ell}$ denote the relation symbols on the left and right hand<br />
sides of $I$ respectively. We may define a transaction $T$ by<br />
$$\mathit{delete}_{q_{j_1}}(\vec{y}_{j_1}) ; \dots ; \mathit{delete}_{q_{j_\ell}}(\vec{y}_{j_\ell}) ; \mathit{insert}_{p_{i_1}}(\vec{x}_{i_1}) ; \dots ; \mathit{insert}_{p_{i_k}}(\vec{x}_{i_k}) .$$<br />
If we start $T$ with values for the $x_i$ and $y_j$ such that the additional conditions on the left hand<br />
side of $I$ are satisfied, whilst the additional conditions on the right hand side are not, $T$ will<br />
always reach a database state satisfying $\neg I$. This effect of $T$ is intentional and hence the only<br />
reasonable approach to integrity maintenance in this case is to disallow such transactions.<br />
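The construction of $T$ can be illustrated on ground atoms. The following Python sketch is a simplification under stated assumptions: constraints are given as ground instances without comparison predicates (the "additional conditions" of the text), and the names `violating_transaction`, `execute` and `holds` are hypothetical helpers introduced only for this illustration.<br />

```python
def violating_transaction(lhs, rhs):
    """Build T: delete all right hand side atoms, then insert all left hand side atoms.

    lhs and rhs are lists of (relation_name, tuple) pairs forming a ground
    instance of a constraint lhs_1 and ... and lhs_n => rhs_1 or ... or rhs_m.
    """
    deletes = [("delete", rel, tup) for rel, tup in rhs]
    inserts = [("insert", rel, tup) for rel, tup in lhs]
    return deletes + inserts


def execute(state, transaction):
    """Apply a sequence of insert and delete operations to a copy of state."""
    state = {rel: set(tuples) for rel, tuples in state.items()}
    for op, rel, tup in transaction:
        state.setdefault(rel, set())
        if op == "insert":
            state[rel].add(tup)
        else:
            state[rel].discard(tup)
    return state


def holds(state, lhs, rhs):
    """Evaluate the ground implication in the given state."""
    lhs_true = all(tup in state.get(rel, set()) for rel, tup in lhs)
    rhs_true = any(tup in state.get(rel, set()) for rel, tup in rhs)
    return (not lhs_true) or rhs_true


# Ground instance p(a) => q(a) of the constraint I_1 from Example 8.4.
LHS = [("p", ("a",))]
RHS = [("q", ("a",))]
start = {"p": set(), "q": {("a",)}}
final = execute(start, violating_transaction(LHS, RHS))
print(holds(start, LHS, RHS), holds(final, LHS, RHS))
```

Whatever the initial state is, the final state contains every left hand side atom and none of the right hand side atoms, so the ground instance of $I$ is violated by construction.<br />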
More formally, the effect of a transaction $T$ in a state $\sigma$ is given by the strongest (with<br />
respect to $\Rightarrow$) formula $E_\sigma(T) = \varphi$ such that $\sigma \models wp(T)(\varphi)$ holds. Here $wp(T)(\varphi)$ denotes<br />
the weakest precondition of $\varphi$ under the transaction $T$, i.e. starting $T$ in initial state $\sigma$ will<br />
reach a final state satisfying $\varphi$.<br />
Since we only consider sequences of insertions and deletions, $E_\sigma(T)$ can always be written<br />
as a conjunction of literals, i.e. in negated implicational normal form, with the positive literals<br />
corresponding to insertions and the negative ones to deletions. In addition, we may consider<br />
the effect of a sequence $T ; RTS$, where $T$ is a transaction and $RTS$ a system of rules. We say<br />
that $RTS$ invalidates the effect of $T$ iff $\not\models E_\sigma(T) \wedge E_\sigma(T ; RTS)$ holds for some state $\sigma$.<br />
Then it is justified to call a transaction $T$ repairable with respect to the constraint set $\Sigma$<br />
iff $\neg E_\sigma(T) \notin \Sigma^*$ holds for at least one state $\sigma$. Then a complete terminating system $RTS$<br />
of ECA-rules always invalidates the effect of a non-repairable transaction $T$. Hence the first<br />
problem is to detect (and exclude) non-repairable transactions. In order to decide whether a<br />
given transaction $T$ is repairable or not, we must be able to decide, whether $\neg E_\sigma(T)$ is in<br />
the closure $\Sigma^*$. Hence the implication problem for constraints must be decidable.<br />
Note that our treatment ignores the termination problem. Non-terminating transactions<br />
have to be excluded as well, but this problem is independent from the repairability problem,<br />
since non-termination of RTSs occurs as an orthogonal problem.<br />
8.3.2 Critical Paths<br />
Let us ask, whether we can always nd a complete set <strong>of</strong> repair rules for all repairable<br />
transactions. For this we <strong>in</strong>troduce the notions <strong>of</strong> associated hypergraphs and critical trigger<br />
paths.<br />
Let $S = \{p_1, \ldots, p_n\}$ be a relational database schema and $RTS = \{R_1, \ldots, R_m\}$ a system of ECA-rules on $S$. Then the associated rule hypergraph $(V, E)$ is constructed as follows:
- $V$ is the disjoint union of $S$ and $RTS$. We then talk of $S$-vertices and $RTS$-vertices respectively.
- If $R \in RTS$ has event-part $Ev$ on $p \in S$ and actions on $p_1, \ldots, p_k$, then we have a hyperedge from $p$ to $\{R\}$ labelled by $+$ or $-$ depending on $Ev$ being an insert or delete, and a hyperedge from $\{R\}$ to $\{p_1, \ldots, p_k\}$ analogously labelled by $k$ values $+$ or $-$.
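The construction can be sketched in Python as follows. The `Rule` record, the tagging of vertices with "S" or "RTS" to model the disjoint union, and the triple encoding of hyperedges are assumptions of this sketch; the sample rules only use the event- and action-parts of four insert/delete propagation rules on two relations $p$ and $q$.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    event: Tuple[str, str]                 # (label, relation); label is "+" or "-"
    actions: Tuple[Tuple[str, str], ...]   # one (label, relation) per operation


def rule_hypergraph(schema: Set[str], rules: List[Rule]):
    """Return (vertices, hyperedges) of the associated rule hypergraph.

    A hyperedge is encoded as a triple (source, targets, labels)."""
    # Disjoint union of S and RTS, modelled by tagging each vertex.
    vertices = {("S", p) for p in schema} | {("RTS", r.name) for r in rules}
    edges = []
    for r in rules:
        label, rel = r.event
        # Hyperedge from the relation of the event-part to {R}.
        edges.append((("S", rel), (("RTS", r.name),), (label,)))
        # Hyperedge from {R} to the relations of the action-part, k labels.
        targets = tuple(("S", p) for _, p in r.actions)
        labels = tuple(lab for lab, _ in r.actions)
        edges.append((("RTS", r.name), targets, labels))
    return vertices, edges


rules = [
    Rule("R1", ("+", "p"), (("+", "q"),)),
    Rule("R2", ("-", "q"), (("-", "p"),)),
    Rule("R3", ("+", "q"), (("+", "p"),)),
    Rule("R4", ("-", "p"), (("-", "q"),)),
]
vertices, edges = rule_hypergraph({"p", "q"}, rules)
print(len(vertices), len(edges))
```

Since every action-part here consists of a single operation, each hyperedge has exactly one target and the result is an ordinary labelled graph.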
Example 8.5. Figure 8.1 shows the associated rule hypergraph of Example 8.4, in which case we have a simple graph. Note that whenever action-parts consist only of a single operation, the rule hypergraph degenerates to a graph.
As a more practical example, Figure 8.2 contains the associated hypergraph for Example 8.1 with the rules discussed in Example 8.2. In particular, rules $R_1$, $R_2$ and $R_3$ correspond to the functional dependencies $FD_1$, $FD_2$ and $FD_3$, rules $R_4$ and $R_5$ to the inclusion dependency $ID_1$, rules $R_6$ and $R_7$ to the inclusion dependency $ID_2$, and rules $R_8$ and $R_9$ to the exclusion dependency $ED$. Furthermore, we used the abbreviations W, T and C for WIRE, TUBE and CONNECTION, respectively.
$\Box$
So far we have ignored the condition parts of the rules. These come into play if we consider critical trigger paths in associated hypergraphs. These are defined in several steps, starting from paths in the associated hypergraph which correspond to possible sequences of ECA-rules with respect only to their event- and action-parts. Secondly, we attach formulae to the $S$-vertices in the path in such a way that pre- and postconditions of the involved rules are expressed. Then we talk of trigger paths.
A maximal trigger path with contradicting initial and final condition will then be called critical. Then imagine a transaction with an effect implied by the initial formula, i.e. there is an initial state such that running the transaction in this state results in a state which satisfies the initial condition of the trigger path. Executing this transaction followed by the rule triggering system along the critical trigger path will then turn the effect of the transaction into its opposite. This means that the $RTS$ invalidates the effect of at least one transaction.
[Figure 8.1: graph with the $S$-vertices $p$ and $q$ and the $RTS$-vertices $R_1$, $R_2$, $R_3$ and $R_4$, edges labelled $+$ or $-$]
Fig. 8.1. Associated Rule Hypergraph for $RTS = \{R_1, R_2, R_3, R_4\}$ in Example 8.4
[Figure 8.2: hypergraph with the $S$-vertices W, T and C and the $RTS$-vertices $R_1, \ldots, R_9$, edges labelled $+$ or $-$]
Fig. 8.2. Associated Rule Hypergraph for Example 8.1
Let $G = (V, E)$ be the rule hypergraph associated with a system $RTS$ of rules. A trigger path in $G$ is a sequence $v_0, e_1, v'_1, e'_1, \ldots, e'_\ell, v_\ell$ of vertices and hyperedges with the following conditions:
- $v_i \in S$ holds for all $i = 0, \ldots, \ell$,
- $v'_i \in RTS$ holds for all $i = 1, \ldots, \ell$,
- $e_i$ is a hyperedge from $v_{i-1}$ to $v'_i$ and
- $e'_i$ is a hyperedge from $v'_i$ to $V_i$ with $v_i \in V_i$ and the same label as $e_{i+1}$.
We call $\ell$ the length of the trigger path.
In addition we associate with each vertex $v_i \in S$ ($i = 0, \ldots, \ell$) a formula $\varphi_i$ in negated implication normal form such that $\models \varphi_i \Rightarrow cond(v'_{i+1})$ holds for the condition part $cond(v'_{i+1})$ of rule $v'_{i+1} \in RTS$ and $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ holds for the action-part $A_{i+1}$ of rule $v'_{i+1}$ ($i = 0, \ldots, \ell - 1$). Furthermore, there is no $e_{\ell+1} \in E$ from $v_\ell$ to $v'_{\ell+1}$ with the same label as $e'_\ell$ such that $\models \varphi_\ell \Rightarrow cond(v'_{\ell+1})$ holds.
Then a trigger path is critical iff $\models \neg(\varphi_0 \wedge \varphi_\ell)$ holds. Such a critical trigger path is called non-admissible iff there is a consistent state $\sigma$ and a repairable transaction $T$ such that $E_\sigma(T) \Leftrightarrow \varphi_0$ holds.
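For formulae that are conjunctions of literals, the criticality condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ reduces to finding an atom that occurs with opposite signs in the initial and the final formula. The following Python sketch checks only this last condition; it assumes that the attached formulae are given as dictionaries from atoms to signs, and it does not verify the conditions on $cond$, $wp$ or maximality. The names `contradictory` and `is_critical` are hypothetical.

```python
from typing import Dict, List

# A formula is a conjunction of literals: atom -> sign (True = positive).
Formula = Dict[str, bool]


def contradictory(phi: Formula, psi: Formula) -> bool:
    """True if phi and psi contain some atom with opposite signs."""
    return any(atom in psi and psi[atom] != sign for atom, sign in phi.items())


def is_critical(formulas: List[Formula]) -> bool:
    """Check |= not(phi_0 and phi_l) for the formulae attached to a trigger path."""
    return contradictory(formulas[0], formulas[-1])


# Formulae of a path of length 2: phi_0, phi_1, phi_2.
path = [
    {"p(x)": True, "q(x)": False},
    {"p(x)": True, "q(x)": True},
    {"p(x)": False, "q(x)": True},
]
print(is_critical(path))
```

A path whose final formula merely adds literals to the initial one, such as $p(x)$ followed by $p(x) \wedge q(x)$, is not critical under this check.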
Critical trigger paths for the associated rule hypergraph in Figure 8.1 are sketched in Figure 8.3. Note that in this case both critical trigger paths are not non-admissible.
[Figure 8.3: two trigger paths of the form $v_0\, e_1\, v'_1\, e'_1\, v_1\, e_2\, v'_2\, e'_2\, v_2$. The first runs $p \xrightarrow{+} R_1 \xrightarrow{+} q \xrightarrow{+} R_4 \xrightarrow{-} p$ with the formulae $p(x) \wedge \neg q(x)$, $p(x) \wedge q(x)$ and $\neg p(x) \wedge q(x)$. The second runs $p \xrightarrow{+} R_3 \xrightarrow{-} q \xrightarrow{-} R_2 \xrightarrow{-} p$ with the formulae $p(x) \wedge q(x)$, $p(x) \wedge \neg q(x)$ and $\neg p(x) \wedge \neg q(x)$.]
Fig. 8.3. Critical Trigger Paths
If a critical trigger path is not non-admissible, then only a non-repairable transaction can be invalidated by running the rules in the trigger path. Since we exclude non-repairable transactions, we only have to consider non-admissible trigger paths. After these remarks we are able to state our next result:
Let $RTS$ be a complete set of rules associated with a set $\Sigma$ of constraints and let $G = (V, E)$ be the associated rule hypergraph. Then $G$ contains a non-admissible critical trigger path iff there exists a consistent database state $\sigma$ and a repairable transaction $T$ such that executing $T$ in $\sigma$ and consecutively running $RTS$ invalidates the effect of $T$ without leaving the database unchanged.
To sketch a proof, consider the sequence $\varphi_0, \ldots, \varphi_\ell$ of formulae associated with a critical trigger path. According to the label of $e_1$ being $+$ or $-$, $\varphi_0$ either contains a literal $p(x)$ or $\neg p(x)$. Choose a consistent state $\sigma$ with $\models_\sigma \neg p(x)$ or $\models_\sigma p(x)$, respectively, and a repairable transaction $T$ with $E_\sigma(T) \Leftrightarrow \varphi_0$. By the definition of critical trigger paths $RTS$ invalidates the effect of $T$. Finally, use induction on the length $\ell$ to show that the state resulting from $T$ followed by $RTS$ is different from $\sigma$.
Conversely, if there is no non-admissible critical trigger path, let $T$ be a repairable transaction and $\sigma$ a database state which is consistent with respect to $\Sigma$. Now start $T$ in $\sigma$ and assume that the resulting state $\sigma'$ is not consistent. Then consider a trigger path of finite length such that $\models_{\sigma'} \varphi_0$ holds. The consecutive execution of the rules in this trigger path will result in a state satisfying $\varphi_\ell$. Thus, we have $E_\sigma(T) \Leftrightarrow \varphi_0$ and $E_\sigma(T; RTS) \Leftrightarrow \varphi_\ell$. According to our assumption, the used trigger path cannot be critical. Hence $RTS$ does not invalidate the effect of $T$.
The full proof is contained in [12] and [14, p.82f.].
8.3.3 Extensions
In our model the execution of a rule with condition-part $\neg I$ does not completely repair violations to the constraint $I$, since there may be more than just one violating tuple. There are two possible solutions to this problem:
- The first of these solutions considers a WHILE-semantics for the rules. In this case the second condition for critical trigger paths has to be replaced by $\models \varphi_i \Rightarrow wp(A_{i+1})(\varphi_{i+1})$ with $A_{i+1}$ representing the iteration of the action-part as long as the condition is satisfied. Example 8.6 shows a critical trigger path using WHILE-semantics.
- The second solution extends the rule hypergraph, as if the action-part of each rule repeated the event. Of course, this is not necessary for rules that definitely repair all violations to $I$. Figure 8.4 extends the one in Figure 8.2 with respect to the rules $R_5$, $R_7$, $R_8$ and $R_9$. In Example 8.6 we discuss critical trigger paths with respect to this extension.
[Figure 8.4: the hypergraph of Figure 8.2 with additional edges for the rules $R_5$, $R_7$, $R_8$ and $R_9$]
Fig. 8.4. Extended Rule Hypergraph for Example 8.1
Example 8.6. The first picture in Figure 8.5 shows a critical trigger path corresponding to the rule hypergraph in Figure 8.2 using WHILE-semantics. The used formulae are
$\varphi_0 \equiv W(8511, \text{HH-HB}, \ldots) \wedge W(4711, \text{HH-HB}, \ldots) \wedge T(8511, \text{HH-HB}, \ldots)$
$\varphi_1 \equiv W(8511, \text{HH-HB}, \ldots) \wedge W(4711, \text{HH-HB}, \ldots) \wedge \neg T(8511, \text{HH-HB}, \ldots)$
$\varphi_2 \equiv \neg W(8511, \text{HH-HB}, \ldots) \wedge \neg W(4711, \text{HH-HB}, \ldots) \wedge \neg T(8511, \text{HH-HB}, \ldots)$
Using extensions to the hypergraph instead, as shown in Figure 8.4, gives rise to the critical trigger path in the second picture in Figure 8.5 using
$\varphi'_2 \equiv \neg W(8511, \text{HH-HB}, \ldots) \wedge W(4711, \text{HH-HB}, \ldots) \wedge \neg T(8511, \text{HH-HB}, \ldots)$
and the same formulae $\varphi_0$, $\varphi_1$ and $\varphi_2$ as above.
$\Box$
Neither extension affects the result stated above. To sketch a proof, the second solution is the same as adding "dummy" actions, i.e. those repeating the event, to the action part. Therefore, it corresponds to a slightly changed rule system with the same behaviour. Then the iteration in the first solution corresponds to rule iteration in the second solution.
Analogously, if action-parts contain more than one operation, the critical trigger paths considered so far do not completely reflect the sequences of rule executions. However, extending hyperedges from $RTS$-nodes to $S$-nodes according to previously triggered rules captures this situation. Just as before, this does not affect our main result on critical trigger paths. Since the practical rule systems we are interested in only comprise simple action parts, we do not discuss further examples for this extension.
[Figure 8.5: two critical trigger paths over the vertices W and T. The first picture shows a path through $R_8$ and $R_5$ with the formulae $\varphi_0$, $\varphi_1$ and $\varphi_2$; the second shows a path through $R_8$, $R_5$ and $R_5$ with the formulae $\varphi_0$, $\varphi_1$, $\varphi'_2$ and $\varphi_2$.]
Fig. 8.5. Critical Trigger Paths for Example 8.1
8.4 Well-behaving Rule Systems
Let us now ask for constraint sets that allow us to define complete RTSs which exclude non-admissible critical trigger paths in their associated hypergraphs. Let us start with a simple example.
Example 8.7. Take again two unary relations $p$ and $q$ and the constraints $I_1 \equiv p(x) \Rightarrow q(x)$ and $I_2 \equiv q(x) \Rightarrow p(x)$, which imply $p$ to be always equal to $q$. Then we obtain the following repairing rules:
$R_1$: ON $insert_p$ IF $\exists x.\ p(x) \wedge \neg q(x)$ DO $insert_q(x)$ (8.86)
$R_2$: ON $delete_q$ IF $\exists x.\ p(x) \wedge \neg q(x)$ DO $delete_p(x)$ (8.87)
$R_3$: ON $insert_q$ IF $\exists x.\ \neg p(x) \wedge q(x)$ DO $insert_p(x)$ (8.88)
$R_4$: ON $delete_p$ IF $\exists x.\ \neg p(x) \wedge q(x)$ DO $delete_q(x)$ (8.89)
First observe that all edges in critical trigger paths are equally labelled with either $+$ or $-$. For the case of $+$ and $v_0 = p$ consider all constants $a$ such that $\models \varphi_0 \Rightarrow p(a) \wedge \neg q(a)$ holds, but from the definition of effects such a pair can only result from a non-repairable transaction $T$ or an inconsistent starting state $\sigma$. The same argument applies to the other cases. Hence there are no non-admissible critical paths in the associated rule hypergraph.
$\Box$
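The behaviour of the rules $R_1$ to $R_4$ can be tried out with a small simulation. The following Python sketch is an illustration under simplifying assumptions, not the execution model of the paper: relations are sets of constants, a triggered rule repairs all violating tuples at once, and triggered events are processed in a queue. The names `RULES`, `apply_op` and `run` are hypothetical.

```python
from collections import deque

# (event, function returning the violating constants, repairing action)
# for the rules R1..R4 of Example 8.7.
RULES = {
    "R1": (("insert", "p"), lambda db: db["p"] - db["q"], ("insert", "q")),
    "R2": (("delete", "q"), lambda db: db["p"] - db["q"], ("delete", "p")),
    "R3": (("insert", "q"), lambda db: db["q"] - db["p"], ("insert", "p")),
    "R4": (("delete", "p"), lambda db: db["q"] - db["p"], ("delete", "q")),
}


def apply_op(db, op, x):
    """Insert x into, or delete x from, the relation named in op."""
    kind, rel = op
    if kind == "insert":
        db[rel].add(x)
    else:
        db[rel].discard(x)


def run(db, transaction):
    """Execute the transaction, then fire triggered rules until no event is left."""
    events = deque()
    for kind, rel, x in transaction:
        apply_op(db, (kind, rel), x)
        events.append((kind, rel))
    while events:
        event = events.popleft()
        for rule_event, violations, action in RULES.values():
            if rule_event != event:
                continue
            for x in sorted(violations(db)):
                apply_op(db, action, x)
                events.append(action)   # the action is itself an event
    return db


print(run({"p": set(), "q": set()}, [("insert", "p", "a")]))
print(run({"p": {"a"}, "q": {"a"}}, [("delete", "p", "a")]))
```

In both runs the effect of the transaction survives: the insertion of $p(a)$ is propagated to $q$, and the deletion of $p(a)$ is propagated likewise. A transaction such as inserting $p(a)$ and deleting $q(a)$ in the same step has an effect that contradicts $I_1$, so it is non-repairable and the rules necessarily undo part of it. Termination of this loop is not guaranteed for arbitrary rule sets.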
8.4.1 Stratified Rule Systems
Let us now investigate the reason for the absence of non-admissible critical trigger paths in Example 8.7. This leads us to the notion of a stratified set of constraints.
The motivation behind this is as follows: In Example 8.7 insertions (deletions) on a relation $p$ only trigger insertions (deletions) on $q$ and vice versa. This should be sufficient for not invalidating an effect once it has been established. The corresponding constraints can therefore be grouped together.
A set $\Sigma$ of constraints in implicative normal form (8.76) on a schema $S$ is called stratified iff we have a partition $\Sigma = \Sigma_1 \cup \ldots \cup \Sigma_n$ with pairwise disjoint constraint sets $\Sigma_i$ called strata such that the following conditions are satisfied:
(i) If $L_1, \ldots, L_k$ is a sequence of literals on the left hand side (right hand side) of $I \in \Sigma_i$ and $J \in \Sigma$ contains a sequence $L'_1, \ldots, L'_\ell$ of literals on the right hand side (left hand side) such that $\{L_1, \ldots, L_k, L'_1, \ldots, L'_\ell\}$ is unifiable, then $J$ must also lie in stratum $\Sigma_i$.
(ii) If $I \neq J$ contain sequences of literals $L_1, \ldots, L_k$ and $L'_1, \ldots, L'_\ell$ both on the left (right) hand side such that $\{L_1, \ldots, L_k, L'_1, \ldots, L'_\ell\}$ is unifiable with most general unifier and $\neg I$, $\neg J$ contain unifiable literals on the right (left) hand side, then $I$ and $J$ must lie in different strata $\Sigma_i$ and $\Sigma_j$, unless one of the instances $\neg I$ or $\neg J$ is always satisfied.
Example 8.8. Consider the constraints in Example 8.1 except the exclusion constraint $ED$. Then the first condition above requires $ID_1$ and $ID_2$ to lie in the same stratum. The same applies to $FD_3$ and $ID_2$ (or $FD_2$ and $ID_1$, respectively).
We may also unify the left hand sides of $FD_2$ and $ID_2$ (or $FD_1$ and $ID_1$, respectively), but then the resulting instance of the functional constraint degenerates to true.
$\Box$
Our next result states that stratified constraint sets always give rise to RTSs without non-admissible critical trigger paths in the associated rule hypergraph.
If $\Sigma$ is a stratified constraint set on a schema $S$, then there exists a complete RTS such that for any repairable transaction $T$ on $S$ the RTS does not invalidate the effect of $T$.
To sketch a proof consider $I$ in implicative normal form (8.76). For each relation symbol $p_i$ on the left hand side define rules
ON $insert_{p_i}$ IF $\neg I$ DO $insert_{q_j}(y_j)$ and
ON $insert_{p_i}$ IF $\neg I$ DO $delete_{p_j}(y_j)$
with relation symbols $q_j$ occurring on the right hand side and $p_j$ ($j \neq i$) on the left hand side of $I$. Similarly, each predicate symbol $q_j$ on the right hand side gives rise to rules
ON $delete_{q_j}$ IF $\neg I$ DO $insert_{q_i}(y_i)$ and
ON $delete_{q_j}$ IF $\neg I$ DO $delete_{p_i}(y_i)$.
This defines a complete set $RTS$ of rules. Due to this rule construction the constraints corresponding to the rules in a critical trigger path all belong to the same stratum. However, the condition $\models \neg(\varphi_0 \wedge \varphi_\ell)$ implies that $\varphi_0$ contains a literal $L$, $\varphi_\ell$ its negation, hence the construction of rules implies $I_1$ and $I_\ell$ to lie in different strata. This shows that there are no non-admissible critical trigger paths. The full proof can be found in [12, 14].
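The rule construction used in this proof sketch can be written down mechanically. The Python sketch below assumes that a constraint is given only by the relation symbols on its left and right hand sides (argument tuples are omitted), and that the rules triggered by deleting from $q_j$ insert into the other right hand side relations $q_i$ with $i \neq j$; this last restriction is our reading of the construction. The names `Rule` and `repair_rules` are hypothetical.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    event: str       # e.g. "insert p"
    condition: str   # e.g. "not I1"
    action: str      # e.g. "insert q"


def repair_rules(name: str, lhs: List[str], rhs: List[str]) -> List[Rule]:
    """Rules for a constraint `name` with relation symbols lhs => rhs."""
    cond = f"not {name}"
    rules: List[Rule] = []
    for i, p in enumerate(lhs):
        # ON insert p_i IF not I DO insert q_j
        rules += [Rule(f"insert {p}", cond, f"insert {q}") for q in rhs]
        # ON insert p_i IF not I DO delete p_j  (j != i)
        rules += [Rule(f"insert {p}", cond, f"delete {pj}")
                  for j, pj in enumerate(lhs) if j != i]
    for j, q in enumerate(rhs):
        # ON delete q_j IF not I DO insert q_i  (i != j, an assumption)
        rules += [Rule(f"delete {q}", cond, f"insert {qi}")
                  for i, qi in enumerate(rhs) if i != j]
        # ON delete q_j IF not I DO delete p_i
        rules += [Rule(f"delete {q}", cond, f"delete {p}") for p in lhs]
    return rules


for rule in repair_rules("I1", ["p"], ["q"]):
    print(rule)
```

For the constraint $I_1 \equiv p(x) \Rightarrow q(x)$ this yields exactly two rules, one inserting into $q$ on insertions into $p$ and one deleting from $p$ on deletions from $q$, which have the shape of $R_1$ and $R_2$ in Example 8.7.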
8.4.2 Constraints Arising from Entity-Relationship Schemata
Finally, we may ask for cases where stratified constraint sets occur. Recall from [9] that a relational database schema $S$ with constraint set $\Sigma$ is in Entity-Relationship normal form (ERNF), and hence is equivalent to an ER-schema, iff
- all inclusion constraints in $\Sigma$ are key-based and non-redundant,
- there is no cycle of inclusion constraints in $\Sigma$,
- each relation schema $R \in S$ is in BCNF with respect to the functional dependencies in $\Sigma$ and
- there are only inclusion and functional dependencies in $\Sigma$.
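Of these conditions, the absence of cycles is the easiest to check mechanically. The following Python sketch treats each inclusion constraint $R[X] \subseteq S[Y]$ as an edge from $R$ to $S$ and looks for a directed cycle by depth-first search. Reading "cycle" at the level of relation names is a simplifying assumption of this sketch, and the function name `has_inclusion_cycle` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def has_inclusion_cycle(inclusions: Iterable[Tuple[str, str]]) -> bool:
    """inclusions contains a pair (R, S) for each constraint R[X] <= S[Y]."""
    graph: Dict[str, List[str]] = {}
    for r, s in inclusions:
        graph.setdefault(r, []).append(s)
        graph.setdefault(s, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the stack, finished
    colour = {v: WHITE for v in graph}

    def visit(v: str) -> bool:
        colour[v] = GREY
        for w in graph[v]:
            # A grey successor closes a cycle; white successors are explored.
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)


print(has_inclusion_cycle([("R", "S"), ("S", "T")]))
print(has_inclusion_cycle([("R", "S"), ("S", "T"), ("T", "R")]))
```

The first call corresponds to a chain of two inclusion constraints and reports no cycle; adding a constraint from $T$ back to $R$ closes a cycle.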
If a relational database schema $S$ with constraint set $\Sigma$ is in ERNF, then a slight generalization of the argument given in Example 8.8 shows that $\Sigma$ is stratified. Indeed, property (i) forces inclusion dependencies $R[X_1] \subseteq S[Y_1]$ and $S[X_2] \subseteq T[Y_2]$ to belong to the same stratum. By the same property the key constraints defined by the $Y_i$ also belong to this stratum. Finally, property (ii) does not constrain pairs of constraints in $\Sigma$.
Furthermore, we only obtain an acyclic set of functional and inclusion constraints, for which the implication problem is decidable [2]. Hence we are also able to detect non-repairable transactions. Following the design approach of Mannila and Räihä in [9] leads to schemata without any problems concerning consistency enforcement by RTSs.
[Figure 8.6: Entity-Relationship diagram with the vertices A, B, C, D, $p$ and $q$ and two cardinality constraints (0,1)]
Fig. 8.6. Entity-Relationship constraints
Example 8.9. Let us look at the higher order Entity-Relationship diagram [15] in Figure 8.6, which leads to the constraints
$I_1$: $p(x, y) \Rightarrow q(x, z)$ and
$I_2$: $q(x, z) \wedge q(y, z) \Rightarrow x = y$.
Stratification property (i) applied to $q$ on the right hand side of $I_1$ and on the left hand side of $I_2$ forces $I_1$, $I_2$ to lie in the same stratum. Property (ii) is not applicable. Hence the constraint set is stratified. However, if we add a third constraint
$I_3 \equiv p(x, z) \wedge q(y, z) \Rightarrow false$,
which in terms of the Entity-Relationship diagram in Figure 8.6 corresponds to an exclusion constraint $B \parallel D$, the new set $\{I_1, I_2, I_3\}$ of constraints is no longer stratified. This is due to the fact that stratification property (i) forces $I_1$ and $I_3$ to lie in the same stratum, whereas now stratification property (ii) forces $I_1$ (or analogously $I_2$) to lie in a stratum different from the one of $I_3$. Thus, there is no stratification satisfying both properties.
$\Box$
[Figure 8.7: Entity-Relationship diagram with CONNECTION (attributes connection, to, from), TUBE (attributes tube id, tube type) and WIRE (attributes wire id, wire type, voltage, power), linked with cardinalities (1,1)]
Fig. 8.7. Entity-Relationship constraints corresponding to Example 8.1
Example 8.10. Let us take another look at Example 8.1. We have already seen in Example 8.8 that the set of functional and inclusion constraints in this example is stratified. Again, adding the exclusion constraint $ED$ destroys this property, since stratification property (i) forces $ED$ to belong to the same stratum as $ID_1$, whereas property (ii) implies that it lies in a different stratum. This is practically the same argument as in the previous Example 8.9.
Again the schema corresponds to the Entity-Relationship diagram in Figure 8.7 with $ED$ corresponding to the exclusion constraint $W[\text{wire id}] \parallel T[\text{tube id}]$ and $ID_1$ to the path inclusion constraint $W.C[\text{connection}] \subseteq T.C[\text{connection}]$.
$\Box$
8.4.3 Constraints Arising from Simple Object-Oriented Schemata
A similar situation arises for simple schemata in object-oriented data models. The OODM investigated in [11] distinguishes between objects and values. Types are used to describe immutable sets of values with (type-)operations predefined on them. Type systems are prescriptions for the syntax and semantics of permitted type definitions. We may always consider type systems that consist of some base types, type constructors and a subtyping relation. E.g., base types could be BOOL, NAT, INT, STRING, ID or OK, where ID is an abstract identifier type without any non-trivial supertype and OK is the trivial type (which has exactly one value ok). Type constructors could be record types $(a_1 : \alpha_1, \ldots, a_n : \alpha_n)$ and finite set types $\{\alpha\}$.
We may use base types and constructors to define new types by nesting. In addition, we may build parameterized types by leaving type variables in constructors uninstantiated. Then a type $T$ is called proper iff the number of its parameters is 0. $T$ is called a value type iff there is no occurrence of ID in $T$. If $T'$ is a proper type occurring in a type $T$, then there exists a corresponding occurrence relation $o : T \times T' \to BOOL$ with $o(v_1, v_2) = true$ iff $v_2$ occurs in $v_1$ at the position indicated by the position of $T'$ in $T$. Each subtype relation $T_1 \leq T_2$ as above defines a subtype function $T_1 \to T_2$ on the corresponding sets of values.
The class concept provides the grouping of objects having the same structure and behaviour. Structurally this uniformly combines aspects of object values and references. Behaviourally, this abstracts from operations on single objects including their creation and deletion.
Since identifiers can be represented using ID, values and references can be combined into a representation type, where each occurrence of ID denotes references to some other class. Therefore, we may define the structure of a class using parameterized types.
If $T$ is a value type with parameters $\alpha_1, \ldots, \alpha_n$ and if the parameters are replaced by pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression. A class consists of a class name $C$, a structure expression $S$, a set of class names $D_1, \ldots, D_m$ (called superclasses) and a set of operations. We call $r_i$ the reference named $r_i$ from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type ID is called the representation type $T_C$ of the class $C$.
A database schema $S$ is given by a finite collection of type and class definitions such that all types, classes and operations occurring within type definitions, structure definitions and operations are defined in $S$.
Then an instance $\mathcal{D}$ assigns to each class $C$ a value $\mathcal{D}(C)$ of type $\{(ident : ID, value : T_C)\}$ such that the following conditions are satisfied:
- For each class $C$ identifiers must be unique.
- The set of identifiers in a subclass $C$ is a subset of the one in the superclass $C'$. Moreover, if $T_C \leq T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then $(i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C')$ holds.
- For each reference $r$ from $C$ to $D$ identifiers $j$ occurring in a value $v$ of an object in $C$ with respect to the occurrence relation $o_r$, i.e. $(i, v) \in \mathcal{D}(C)$ and $o_r(v, j)$ hold, must occur in $\mathcal{D}(D)$.
Let us consider only simple schemata as they occur in most practical object-oriented systems. In such a schema structure expressions always have the form $(a_1 : T_1, \ldots, a_n : T_n)$, where $T_i$ is either a value type or a class name. In the latter case $a_i$ is a reference. In accordance with many practical systems we may then call $a_i$ an attribute.
Example 8.11. Let us consider a simple university schema adapted from [11]:
Class PersonC
  Structure ( PersonIdentityNo : NAT, Address : STRING )
Class MarriedPersonC
  IsA PersonC
  Structure ( Spouse : MarriedPersonC )
Class StudentC
  IsA PersonC
  Structure ( StudNo : NAT, Name : STRING, Supervisor : ProfessorC,
    Major : DepartmentC, Minor : DepartmentC )
Class ProfessorC
  IsA PersonC
  Structure ( Age : NAT, Salary : NAT, Faculty : DepartmentC )
Class DepartmentC
  Structure ( DeptName : STRING, Head : ProfessorC )
This schema can be translated, using some self-explaining abbreviations for the attribute names, into a relational one with the following relation schemata:
Person = (id, pino, address),
MPerson = (id, spouse),
Student = (id, sno, name, sup, major, minor),
Prof = (id, age, salary, faculty),
Dept = (id, dname, head)
In addition, we get the following functional and inclusion dependencies:
Person : id $\to$ pino, address; MPerson : id $\to$ spouse;
Student : id $\to$ sno, name, sup, major, minor;
Prof : id $\to$ age, salary, faculty; Dept : id $\to$ dname, head;
MPerson[id] $\subseteq$ Person[id], Student[id] $\subseteq$ Person[id], Prof[id] $\subseteq$ Person[id],
MPerson[spouse] $\subseteq$ MPerson[id], Student[sup] $\subseteq$ Prof[id],
Student[major] $\subseteq$ Dept[id], Student[minor] $\subseteq$ Dept[id],
Prof[faculty] $\subseteq$ Dept[id], Dept[head] $\subseteq$ Prof[id].
Note that all these relations have an attribute id with the type ID as its domain. Furthermore, all inclusion constraints defined by the schema are key-based with just this attribute occurring on the right hand side. The inclusion constraints that stem from subclassing also have id on their left hand sides. In particular all inclusion constraints are unary.
$\Box$
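The translation used in this example follows a fixed pattern, which the following Python sketch makes explicit. The dictionary encoding of the schema (class name, list of superclasses, list of attribute/type pairs) and the function name `translate` are assumptions of this sketch; the abbreviated names are those of the relational schema above.

```python
from typing import Dict, List, Tuple

# class name -> (superclasses, [(attribute, type)]); a type that is itself a
# class name makes the attribute a reference.
ClassDef = Tuple[List[str], List[Tuple[str, str]]]

SCHEMA: Dict[str, ClassDef] = {
    "Person":  ([], [("pino", "NAT"), ("address", "STRING")]),
    "MPerson": (["Person"], [("spouse", "MPerson")]),
    "Student": (["Person"], [("sno", "NAT"), ("name", "STRING"), ("sup", "Prof"),
                             ("major", "Dept"), ("minor", "Dept")]),
    "Prof":    (["Person"], [("age", "NAT"), ("salary", "NAT"), ("faculty", "Dept")]),
    "Dept":    ([], [("dname", "STRING"), ("head", "Prof")]),
}


def translate(schema: Dict[str, ClassDef]):
    """Return relation schemata, functional and inclusion dependencies."""
    relations, fds, inclusions = {}, [], []
    for cls, (supers, attrs) in schema.items():
        names = [a for a, _ in attrs]
        relations[cls] = ["id"] + names          # every relation gets an id
        fds.append((cls, ["id"], names))         # id -> all other attributes
        # Subclassing: C[id] is included in C'[id].
        inclusions += [((cls, "id"), (sup, "id")) for sup in supers]
        # References: C[a] is included in D[id].
        inclusions += [((cls, a), (t, "id")) for a, t in attrs if t in schema]
    return relations, fds, inclusions


relations, fds, inclusions = translate(SCHEMA)
for left, right in inclusions:
    print(f"{left[0]}[{left[1]}] <= {right[0]}[{right[1]}]")
```

Every generated inclusion dependency has the single attribute id on its right hand side, which is the observation made at the end of the example.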
The observations made <strong>in</strong> Example 8.11 can be generalized. From the denition <strong>of</strong> conditions<br />
to be satised by <strong>in</strong>stances, each <strong>in</strong>clusion constra<strong>in</strong>t dened by a simple object-oriented<br />
schema is key-based with the identier attribute id occurr<strong>in</strong>g on its right hand side. Furthermore,<br />
id denes a key for each relation. However, due to the use <strong>of</strong> the set-type-constructor<br />
relations appear to be not <strong>in</strong> rst normal form.<br />
With these observations concerning the nature of constraint sets $\Sigma$ arising from transforming<br />
simple object-oriented schemata we may repeat our arguments used for Entity-Relationship<br />
constraints to see that $\Sigma$ is stratified. Indeed, property (i) forces inclusion dependencies<br />
$R[a] \subseteq S[id]$ and $S[b] \subseteq T[id]$ to belong to the same stratum. By the same property<br />
the key constraints defined by the attributes id on $S$ or $T$ also belong to this stratum. Finally,<br />
property (ii) does not constrain pairs of constraints in $\Sigma$.<br />
Since all inclusion constraints in $\Sigma$ are unary, the implication problem is decidable [7].<br />
Therefore, we are also able to detect non-repairable transactions.<br />
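The unary, key-based shape of these constraints can be checked mechanically. The following Python sketch is only an illustration: it encodes the inclusion constraints listed in Example 8.11 as plain tuples, and the helper names (`is_unary`, `is_key_based_on_id`, `from_subclassing`) are assumptions introduced here, not part of the paper's formalism.

```python
# Inclusion constraints of Example 8.11, each written as (lhs, rhs) with
# lhs = (relation, attributes) and rhs = (relation, attributes), meaning lhs ⊆ rhs.
INCLUSIONS = [
    (("MPerson", ("id",)), ("Person", ("id",))),
    (("Student", ("id",)), ("Person", ("id",))),
    (("Prof", ("id",)), ("Person", ("id",))),
    (("MPerson", ("spouse",)), ("MPerson", ("id",))),
    (("Student", ("sup",)), ("Prof", ("id",))),
    (("Student", ("major",)), ("Dept", ("id",))),
    (("Student", ("minor",)), ("Dept", ("id",))),
    (("Prof", ("faculty",)), ("Dept", ("id",))),
    (("Dept", ("head",)), ("Prof", ("id",))),
]


def is_unary(constraint):
    """A constraint is unary if both sides mention exactly one attribute."""
    (_, lhs_attrs), (_, rhs_attrs) = constraint
    return len(lhs_attrs) == 1 and len(rhs_attrs) == 1


def is_key_based_on_id(constraint):
    """Key-based here means: the right hand side is exactly the key attribute id."""
    _, (_, rhs_attrs) = constraint
    return rhs_attrs == ("id",)


def from_subclassing(constraint):
    """Constraints stemming from subclassing also have id on the left hand side."""
    (_, lhs_attrs), _ = constraint
    return lhs_attrs == ("id",)
```

Running the three predicates over `INCLUSIONS` confirms the observation in the text for this example: every constraint is unary and key-based, and exactly the three subclass constraints have id on both sides.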
8.5 Conflict Resolution<br />
Referential actions are special rules to cope with violations of a foreign key constraint $R_1[X] \subseteq R_2[Y]$.<br />
Note that all inclusion constraints in Entity-Relationship and object-oriented models<br />
considered so far have this form. As in SQL we only consider the case of delete- and update-operations<br />
on $R_2$, i.e. we consider the deletion (or update) of a tuple $t_2 \in I(R_2)$. If this leads<br />
to constraint violation, there must be at least one tuple $t_1 \in I(R_1)$ with $t_1[X] = t_2[Y]$. The<br />
following actions have been suggested:<br />
cascade: Also delete $t_1$ (or update the values for the attributes in $X$ such that the constraint<br />
violation disappears). If there is more than one such tuple, the action is applied to all of<br />
them.<br />
set null: Set the values for the attributes in $X$ to a null value.<br />
restrict: Reject the deletion or update on $R_2$ and roll back.<br />
In the first two cases we have a reaction by propagation, since referencing tuples also disappear<br />
from the instance.<br />
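The three actions correspond to the `ON DELETE` clauses of SQL foreign keys. The sketch below uses Python's standard `sqlite3` module to show their effect on a single referencing tuple. The table names `r1` and `r2`, the column names and the function `delete_parent` are hypothetical and chosen only to mirror $R_1[X] \subseteq R_2[Y]$; it is an illustration under these assumptions, not the rule system studied in the paper.

```python
import sqlite3


def delete_parent(action: str) -> str:
    """Delete the referenced tuple of r2 under the given ON DELETE action."""
    con = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.execute("CREATE TABLE r2 (y INTEGER PRIMARY KEY)")
    con.execute(
        "CREATE TABLE r1 (k INTEGER PRIMARY KEY, "
        f"x INTEGER REFERENCES r2(y) ON DELETE {action})"
    )
    con.execute("INSERT INTO r2 VALUES (1)")      # tuple t2
    con.execute("INSERT INTO r1 VALUES (10, 1)")  # tuple t1 with t1[x] = t2[y]
    try:
        con.execute("DELETE FROM r2 WHERE y = 1")
        outcome = f"r1 = {con.execute('SELECT k, x FROM r1').fetchall()}"
    except sqlite3.IntegrityError:
        # restrict: the deletion on r2 is rejected
        outcome = "rejected"
    con.close()
    return outcome
```

With `"CASCADE"` the referencing tuple is removed, with `"SET NULL"` it survives with a null reference, and with `"RESTRICT"` the deletion itself is refused.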
8.5.1 Problem of Ambiguity<br />
Assume that we have associated a referential action with all constraints in $I$. Then the problem<br />
occurs that the final result of an operation depends on the order of applying referential actions.<br />
A propagation path (for short: p-path) is a sequence $R_n[X_n], \dots, R_1[X_1]$ such that there are<br />
constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$ in $I$ with $X_i \subseteq Y'_i \cap Y_i$ for $i = 2, \dots, n$, all these constraints<br />
are associated with a referential action of kind cascade or set null and $R_{i-1}[X_{i-1}] \subseteq R_i[X_i]$<br />
is in $I^+$.<br />
A restriction path (for short: r-path) is a sequence $R_n[X_n], \dots, R_1[X_1]$ such that there are<br />
constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$ in $I$ with $X_i \subseteq Y'_i \cap Y_i$ for $i = 2, \dots, n$, where $R_1[Y'_1] \subseteq R_2[Y_2]$<br />
is associated with a referential action of kind restrict and all other constraints are associated<br />
with a referential action of kind cascade or set null, and $R_{i-1}[X_{i-1}] \subseteq R_i[X_i]$ is in $I^+$.<br />
A p-path $R_n[X_n], \dots, R_1[X_1]$ is called a phantom iff there is an r-path $R'_m[X'_m], \dots, R'_1[X'_1]$<br />
with $R'_m = R_n$, $X'_m = X_n$ and an inclusion constraint $R_1[X_1] \subseteq R'_1[X'_1]$ in $Dep^+$.<br />
A schema $S$ has a conflict iff there is a p-path $R_n[X_n], \dots, R_1[X_1]$ corresponding to<br />
constraints $R_{i-1}[Y'_{i-1}] \subseteq R_i[Y_i]$, an r-path $R'_m[X'_m], \dots, R'_1[X'_1]$ corresponding to constraints<br />
$R'_{i-1}[Z'_{i-1}] \subseteq R'_i[Z_i]$ with $R_n[X_n] = R'_m[X'_m]$ and $R_1[X_1] = R'_1[X'_1]$, an instance $I$ and tuples<br />
$t_n, \dots, t_1$ in $I(R_n), \dots, I(R_1)$ with $t_i[Y_i] = t_{i-1}[Y'_{i-1}]$ and tuples $t'_m, \dots, t'_1$ in $I(R'_m), \dots, I(R'_1)$<br />
with $t'_i[Z_i] = t'_{i-1}[Z'_{i-1}]$ such that $t_n = t'_m$ and $t_1 = t'_1$ hold. A conflict is called a phantom iff<br />
the involved p-path is a phantom.<br />
The condition $t_1 = t'_1$ could be omitted, since the existence of tuple sequences satisfying<br />
all other conditions implies the existence of tuples as claimed in the definition.<br />
[Figure: Entity-Relationship diagram with the types Person, Car, Clinic, Driver, Patient and Accident and the attributes pino, name, address, registration, licence no, date, cl name and costs.]<br />
Fig. 8.8. Entity-Relationship schema leading to a conflict<br />
Example 8.12. Consider the Entity-Relationship diagram in Figure 8.8. Transforming it to<br />
a relational schema gives rise to the relation schemata<br />
Person = (pino, name, address)<br />
Driver = (pino, registration, licence no)<br />
Patient = (pino, cl name, date)<br />
Car = (registration)<br />
Clinic = (cl name)<br />
and<br />
Accident = (pino, registration, cl name, costs)<br />
together with the inclusion dependencies<br />
Driver[pino] $\subseteq$ Person[pino]<br />
Accident[pino, cl name] $\subseteq$ Patient[pino, cl name]<br />
Accident[pino, registration] $\subseteq$ Driver[pino, registration]<br />
Patient[pino] $\subseteq$ Person[pino].<br />
Now suppose that the last of these dependencies has been equipped with the referential action<br />
restrict, whilst all others are equipped with cascade. Then Person[pino], Driver[pino],<br />
Accident[pino, registration] is a p-path and Person[pino], Patient[pino], Accident[pino, cl name]<br />
is an r-path. These two paths show that the schema has a conflict.<br />
Now extend the schema as shown in Figure 8.9. We obtain the additional relation schema<br />
Bad Driver = (pino) with the inclusion dependency Bad Driver[pino] $\subseteq$ Person[pino]. Assume<br />
this dependency to be equipped with the referential action restrict. Then Person[pino],<br />
Bad Driver[pino] constitutes another r-path.<br />
If (for reasons beyond the small section of constraints considered) we derive the inclusion<br />
dependency Accident[pino] $\subseteq$ Bad Driver[pino] $\in Dep^+$, then the above conflict will be a<br />
phantom.<br />
$\Box$<br />
[Figure: the Entity-Relationship diagram of Figure 8.8 extended by the type Bad Driver, attached to Person with cardinality (0,1).]<br />
Fig. 8.9. Entity-Relationship schema leading to a phantom conflict<br />
If there is a conflict, then a deletion or update for the tuple $t_n = t'_m$ violates the constraints<br />
$R_{n-1}[Y'_{n-1}] \subseteq R_n[Y_n]$ and $R'_{m-1}[Z'_{m-1}] \subseteq R'_m[Z_m]$. Executing the corresponding referential<br />
actions violates the "next" foreign key constraints along the p-path or r-path respectively. Depending<br />
on the order of the referential actions the tuple $t_1 = t'_1$ is either deleted according<br />
to the actions along the p-path and consequently no constraint violation for $R'_1[Z'_1] \subseteq R'_2[Z_2]$<br />
may occur or it leads to a rollback according to the actions along the r-path. This is the core<br />
of the ambiguity problem.<br />
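The order dependence can be made concrete with a small simulation. The following Python sketch assumes a hypothetical four-relation schema (`Top`, `Left`, `Right`, `Bottom`) holding one tuple per relation, where each tuple references the tuples of its parent relations. The depth-first way in which actions are applied is a modelling assumption made for this illustration only.

```python
# Hypothetical foreign keys: (referencing relation, referenced relation, action).
CONSTRAINTS = [
    ("Left", "Top", "cascade"),
    ("Right", "Top", "cascade"),
    ("Bottom", "Left", "cascade"),     # last step of the p-path Top, Left, Bottom
    ("Bottom", "Right", "restrict"),   # last step of the r-path Top, Right, Bottom
]


class Rollback(Exception):
    """Raised when a restrict action rejects the operation."""


def delete(present, relation, order):
    """Delete the tuple of `relation`; `order` ranks the referencing relations.

    `present` is the set of relations whose single tuple exists. The result is
    the new set, or the unchanged input if a restrict action forces a rollback.
    """
    state = set(present)

    def remove(rel):
        state.discard(rel)
        referencing = [c for c in CONSTRAINTS if c[1] == rel]
        referencing.sort(key=lambda c: order.index(c[0]))
        for child, _, action in referencing:
            if child not in state:
                continue  # already deleted along another path: no violation
            if action == "restrict":
                raise Rollback()
            remove(child)  # cascade

    try:
        remove(relation)
    except Rollback:
        return frozenset(present)
    return frozenset(state)
```

Starting from all four tuples, handling `Left` before `Right` deletes everything, because the `Bottom` tuple is gone before the restrict constraint is examined. Handling `Right` first reaches the restrict constraint while `Bottom` still exists and rolls back.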
However, if it is a phantom conflict, we also have an r-path $R''_k[X''_k], \dots, R''_1[X''_1]$ with $R''_k = R_n$<br />
with foreign key constraints $R''_{i-1}[U'_{i-1}] \subseteq R''_i[U_i]$ and $X''_k = X_n$ and an inclusion constraint<br />
$R_1[X_1] \subseteq R''_1[X''_1]$. Hence there are also tuples $t''_k, \dots, t''_1$ with $t''_i[U_i] = t''_{i-1}[U'_{i-1}]$. Hence the<br />
tuple $t''_1$ enforces a rollback and there is no ambiguity.<br />
Thus, the ambiguity problem is to decide whether a given schema $S$ together with a set $Dep = K \cup I$<br />
of minimal key and referential key constraints has a non-phantom conflict or not.<br />
8.5.2 Decidability<br />
In order to show that ambiguity as defined above is decidable, we first recall that implication<br />
for inclusion dependencies alone is decidable [2]. Thus, we can compute all p-paths and r-paths.<br />
Since a conflict corresponds to a "diamond" with a p-path and an r-path, the existence<br />
of conflicts is obviously decidable and we only have to discard phantom p-paths. For this we<br />
have to decide whether an arbitrary inclusion constraint ($R_1[X_1] \subseteq R'_1[X'_1]$ in the definition<br />
of phantom p-paths) is in $Dep^+$. Thus, the existence of non-phantom conflicts is decidable iff<br />
for any inclusion constraint $\varphi$ it is decidable whether $Dep \models \varphi$ holds.<br />
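The search for "diamonds" can be sketched as a graph traversal. The Python code below is a simplified illustration under stated assumptions: the schema is hypothetical, the constraint graph is assumed to be acyclic, the attribute conditions of the path definitions are ignored, and the phantom test (which needs the implication check for $Dep$) is left out entirely.

```python
PROPAGATING = {"cascade", "set null"}

# Hypothetical foreign keys: (referencing relation, referenced relation, action).
CONSTRAINTS = [
    ("Left", "Top", "cascade"),
    ("Right", "Top", "cascade"),
    ("Bottom", "Left", "cascade"),
    ("Bottom", "Right", "restrict"),
]


def paths_from(start, constraints):
    """Yield (path, kind) for every p-path ("p") and r-path ("r") from `start`.

    A path lists relations from the referenced end to the referencing end.
    Only paths built from propagating steps are extended further, so an
    r-path consists of propagating steps followed by one restrict step.
    """
    stack = [(start,)]
    while stack:
        path = stack.pop()
        for child, parent, action in constraints:
            if parent != path[-1]:
                continue
            extended = path + (child,)
            if action in PROPAGATING:
                yield extended, "p"
                stack.append(extended)
            elif action == "restrict":
                yield extended, "r"


def conflicts(constraints):
    """Return all (start, end) pairs joined by both a p-path and an r-path."""
    relations = {name for c in constraints for name in c[:2]}
    found = set()
    for start in relations:
        ends = {"p": set(), "r": set()}
        for path, kind in paths_from(start, constraints):
            ends[kind].add(path[-1])
        found |= {(start, end) for end in ends["p"] & ends["r"]}
    return found
```

For the hypothetical schema above the only candidate conflict runs from `Top` to `Bottom`. Whether such a candidate is a phantom would still have to be decided separately.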
We have seen in the previous section that constraint sets defined by Entity-Relationship<br />
or simple object-oriented schemata only contain functional and inclusion dependencies. We<br />
know that for arbitrary sets of functional and inclusion constraints the implication problem<br />
$Dep \models \varphi$ is undecidable [6], but for the Entity-Relationship case the inclusion dependencies<br />
are acyclic. Then it is well known [1] that the implication problem $Dep \models \varphi$ is decidable.<br />
For the object-oriented case all inclusion dependencies are unary. For this case it is also<br />
well known [7] that the implication problem $Dep \models \varphi$ is decidable.<br />
Therefore, for both cases of constraint sets considered in this paper, those resulting from<br />
Entity-Relationship schemata and those arising from simple object-oriented schemata, the<br />
ambiguity problem for referential actions is decidable.<br />
8.6 Conclusion<br />
In this paper we investigated rule triggering systems (RTSs) for maintaining consistency<br />
arising from implicit constraints in Entity-Relationship and object-oriented models. Unfortunately,<br />
there always exist non-repairable transactions. In order to disallow such transactions<br />
the constraint implication problem must be decidable, which is the case for both models. In<br />
the first case we are in the situation of acyclic inclusion constraints, whereas in the second<br />
case we only obtain unary inclusion constraints.<br />
Secondly, we analyzed critical trigger paths in rule hypergraphs associated with RTSs. We<br />
could show that the existence of critical trigger paths leads to RTSs which may invalidate the<br />
effect of some transactions, even if these are repairable. Such a behaviour can be excluded for<br />
stratified constraint sets, which holds for the constraint sets arising from Entity-Relationship<br />
and object-oriented models.<br />
Thirdly, we investigated the ambiguity problem for rules for the case that rollback is<br />
allowed in the action part. This again can be reduced to the decidability problem for constraint<br />
implication, hence decidability holds for the chosen models.<br />
To summarize, the general applicability of RTSs for integrity maintenance is limited, if we<br />
assume that the intended effects of user-defined transactions should be preserved. Fortunately,<br />
conflicts do not occur or can be detected efficiently if we only consider constraints arising from<br />
conceptual design with Entity-Relationship and certain object-oriented models.<br />
References for Chapter 8<br />
1. S. Abiteboul, R. Hull, V. Vianu. Foundations of Databases. Addison-Wesley, 1995.<br />
2. M. A. Casanova, R. Fagin, C. H. Papadimitriou. Inclusion dependencies and their interaction with<br />
functional dependencies. Journal of Computer and System Sciences 28 (1), 29-59, 1984.<br />
3. S. Ceri, J. Widom. Deriving Production Rules for Constraint Maintenance. Proc. 16th Conf. on<br />
VLDB, Brisbane (Australia), August 1990, 566-577.<br />
4. S. Ceri, P. Fraternali, S. Paraboschi, L. Tanca. Automatic Generation of Production Rules for<br />
Integrity Maintenance. ACM ToDS, vol. 19 (3), 1994, 367-422.<br />
5. S. Chakravarty, J. Widom (Eds.). Research Issues in Data Engineering - Active Databases, Proc.,<br />
Houston, February 1994.<br />
6. A. K. Chandra, M. Y. Vardi. The implication problem for functional and inclusion dependencies is<br />
undecidable. SIAM Journal on Computing 14, 671-677, 1985.<br />
7. S. S. Cosmadakis, P. Kanellakis, M. Y. Vardi. Polynomial-time implication problems for unary inclusion<br />
dependencies. Journal of the ACM 37, 15-46, 1990.<br />
8. M. Gertz, U. W. Lipeck. Deriving Integrity Maintaining Triggers from Transaction Graphs. In<br />
Proc. 9th ICDE, IEEE Computer Society Press, 1993, 22-29.<br />
9. H. Mannila, K.-J. Räihä. The Design of Relational Databases. Addison-Wesley, 1992.<br />
10. K.-D. Schewe, B. Thalheim. Consistency Enforcement in Active Databases. In S. Chakravarty, J.<br />
Widom (Eds.): Research Issues in Data Engineering - Active Databases, Proc., Houston, February<br />
1994.<br />
11. K.-D. Schewe, B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica,<br />
vol. 11 (1/2), Szeged 1993, 49-84.<br />
12. K.-D. Schewe, B. Thalheim. Active Consistency Enforcement for Repairable Database Transitions.<br />
In S. Conrad, H.-J. Klein, K.-D. Schewe (Eds.): Integrity in Databases, Proc. 6th Int. Workshop<br />
on Foundations of Models and Languages for Data and Objects, Schloss Dagstuhl, 1996, 87-102,<br />
available via http://wwwiti.cs.uni-magdeburg.de/conrad/IDB96/Proceedings.html.<br />
13. K.-D. Schewe. Well-Behaving Rule Systems for Entity-Relationship and Object Oriented Models.<br />
In D. W. Embley, R. C. Goldstein (Eds.): Conceptual Modeling - ER '97, Springer LNCS 1331,<br />
1997, 141-154.<br />
14. K.-D. Schewe, B. Thalheim. On the Strength of Rule Triggering Systems for Integrity Maintenance.<br />
In C. McDonald (Ed.): Database Systems, Proc. 9th Australasian Database Conference, Perth<br />
1998, published as Australian Computer Science Communications, vol. 20 (2), 77-88.<br />
15. B. Thalheim. Foundations of entity-relationship modeling. Annals of Mathematics and Artificial<br />
Intelligence, vol. 7, 1993, 197-256.<br />
16. S. D. Urban, L. Delcambre. Constraint Analysis: a Design Process for Specifying Operations on<br />
Objects. IEEE Trans. on Knowledge and Data Engineering, vol. 2 (4), December 1990.<br />
17. J. Widom, S. J. Finkelstein. Set-oriented Production Rules in Relational Database Systems. In<br />
Proc. SIGMOD 1990, 259-270.<br />
Chapter 9<br />
Principles of Object Oriented<br />
Database Design<br />
Contents<br />
9.1 Philosophy of OODB Design<br />
9.2 The Object Oriented Datamodel: Basic Features<br />
9.2.1 Type Definitions<br />
9.2.2 Class Definitions<br />
9.2.3 Method Definitions<br />
9.2.4 Schema Definitions<br />
9.3 Class Abstraction<br />
9.4 Stepwise Refinement<br />
9.4.1 Instantiation<br />
9.4.2 Splitting<br />
9.4.3 Specialization<br />
9.4.4 Extension<br />
9.5 Declarativity by Constraint Centered Design<br />
9.6 Variation Based Reuse: A Research Issue<br />
9.7 Inferences in OODB Design<br />
This chapter contains a reprint of<br />
Klaus-Dieter Schewe, Bernhard Thalheim. Principles of Object Oriented Database<br />
Design. In H. Jaakkola, H. Kangassalo, T. Kitahashi, A. Markus (Eds.). Information<br />
Modelling and Knowledge Bases V, 227-242. IOS Press, Amsterdam, 1994.<br />
Abstract. The design of complex information systems requires a transparent model-based<br />
methodology. It has been claimed that object orientation will have a significant impact on<br />
the development of such a methodology, especially as far as reusability and naturality of conceptual<br />
modelling are concerned.<br />
The methodology presented in this paper concentrates on four significant principles of<br />
object oriented database (OODB) design. The basic constituent is stepwise refinement, i.e.<br />
to begin the design process with a partial model that is completed and concretized further on<br />
depending on the growth of application knowledge. Class abstraction, i.e. to support libraries<br />
of incomplete parameterized designs that are instantiated and specialized later, is a natural<br />
consequence hereof. Declarativity is achieved by constraint centered design with (up to some<br />
degree) automatic transformation into consistent transactions. Variations enable the design<br />
of information systems with heavy reuse of existing design components.<br />
The methodology is based on a theoretically founded object oriented datamodel (OODM).<br />
Hence it supports inferences such as deciding the identifiability of objects, detecting the<br />
relation of an intended design to components in existing design libraries, and checking operations<br />
for reducedness as a prerequisite for the automatic transformation of constraints into<br />
consistent transactions.<br />
9.1 Philosophy of OODB Design<br />
The design of data and knowledge intensive information systems requires a transparent model-based<br />
methodology. Classically there exist separate methods for the database and transaction<br />
design without a satisfactory integration [7, 9]. Therefore, it is a natural hope that the use of<br />
object oriented design methods will improve the situation.<br />
Object orientation involves the isolation of data in semi-independent modules in order<br />
to promote high software development productivity. This idea stems from programming languages<br />
and most methods proposed so far [3, 6, 11, 20] are intended to support object oriented<br />
program development. The main difference in object oriented database (OODB) design is due<br />
to the notion of object that is now intended to serve as a basic unit of persistent data, a view<br />
that is influenced by semantic datamodels [9]. Since classes then serve not only as behaviour<br />
abstractions but also as (persistent) data collections, we have to cope with object identification,<br />
whereas in object oriented programming a simple identification mechanism via object<br />
names is sufficient. This makes OODB design a significantly different task from object oriented<br />
program development, although some ideas of the approaches to the latter field can be taken<br />
over.<br />
Still most object oriented datamodels are very close to the language level [1, 10] no matter<br />
whether their development started from a semantic datamodel or an object oriented programming<br />
language. For object oriented database design, however, it is necessary to shift<br />
the approach to the conceptual level as also claimed in work of the IS-Core group [13, 21].<br />
Therefore, the primary goal of our methodology is to provide a conceptual object oriented<br />
model with greater naturality in application modelling. At the same time we want to improve<br />
the design quality and to raise the rate of software reuse.<br />
The work presented in this paper is centered around the theoretically founded object<br />
oriented datamodel (OODM) introduced in [16] and partly based on the work in [2]. This<br />
model supports the uniform representation of designs at each level of concretion. In particular<br />
there is no need to use different models for the conceptual and logical design respectively.<br />
We regard requirements analysis and conceptual modelling as two activities running in<br />
parallel. We start with an initial design that is a one-to-one representation of first knowledge<br />
about the intended application. The analysis task is to grasp and describe such knowledge<br />
with the formal representation tools. The following design process is monotonic, as the amount<br />
of application knowledge increases. Each knowledge increment then corresponds to some refinement,<br />
i.e. a change (not only an extension) of the design. However, this does not prejudice a<br />
particular, e.g. "top-down", design procedure. In contrast, the OODM favours incomplete partial<br />
designs with the specification of details left for refinement. Keeping even such intermediate<br />
designs increases the spread of possible reuse. This is close to the Design-by-Units-strategy<br />
[23].<br />
Classical design methods are centered around data, processes or constraints respectively.<br />
Within the unified model in our approach we may regard all these aspects at the same<br />
time and keep only track of the dependencies among them, since constraints depend on the<br />
data and processes on both other components. This implies the relative independence of<br />
refinement steps on data, processes or constraints as long as these dependencies are taken<br />
into consideration.<br />
Since processes in data and knowledge intensive application systems change much faster<br />
than constraints, it is desirable to minimize the process design task and to achieve a maximum<br />
of declarativity. As shown in [17, 18, 19] it is possible (up to some degree) to compute maximal<br />
specializations of specified processes in order to enforce consistency.<br />
The use of a uniform OODM during the whole design process makes it possible to build design<br />
libraries. Due to the support of abstract partial designs the components of such libraries can<br />
be more generic than usually assumed, but it is a truism that reusability does not imply<br />
reuse. We have to support mechanisms to retrieve a maximum of existing reusable library<br />
components for a given partial design. This leads to the concept of variation-based reuse<br />
extending results on variant construction in semantic networks [14].<br />
Such a methodology involves a high level of inferences. Some of these inferences are intrinsic<br />
to the used datamodel. Among them are the recognition of object identifiability, specialization<br />
and type correctness or the verification of refinement correctness. Others are extrinsic such as<br />
the proof of reducedness as a prerequisite for consistency enforcement or the ascertainment<br />
of the relationship to existing library components.<br />
In the remainder of this paper we shall first describe the fundamental issues of the OODM<br />
in Section 9.2, then in Sections 9.3-9.6 we briefly concretize the basic principles of our design<br />
methodology. Section 9.7 presents a short outline of the required inferences and a discussion<br />
of open research problems.<br />
9.2 The Object Oriented Datamodel: Basic Features<br />
In the object-oriented approach we distinguish between objects and values. Whereas values<br />
are encoded by themselves, objects have to be encoded by object identifiers. In our approach<br />
each object consists of a unique, immutable identifier, a set of values of possibly different<br />
types, references to other objects and methods associated with the object.<br />
Values can be grouped into types. In general, a type may be regarded as an immutable set<br />
of values of a uniform structure together with operations defined on such values. Subtyping<br />
is used to relate values in different types. The class concept provides the grouping of objects<br />
having the same structure which uniformly combines aspects of object values and references.<br />
Objects can belong to different classes, which guarantees each object of our abstract object<br />
model to be captured by the collection of possible classes. As for values that are only defined<br />
via types, objects can only be defined via classes. Thus, a design consists of type and class<br />
definitions.<br />
9.2.1 Type Denitions<br />
We follow the classical view <strong>of</strong> types <strong>in</strong> [4] us<strong>in</strong>g a type system that consists <strong>of</strong> some basic<br />
types, type constructors and a subtyp<strong>in</strong>g relation. Moreover, recursive types, i.e. types dened<br />
by doma<strong>in</strong> equations, and predicative types, i.e. types dened by restrictions, can be dened.<br />
Denition 9.1. { The base types are BOOL, NAT, INT, FLOAT, STRING, ID or ?,<br />
where ID is an abstract identier type without any non-trivial supertype and ? is the<br />
trivial type that is a supertype for every type.<br />
{ The type constructors are e 1 j je n (enumeration), (a 1 : 1 ::: a n : n ) (record), fg<br />
(nite set), [] (list), hi (bag) or (a : ) [ (b : ) (union).<br />
We may use base types and constructors to dene new types by nest<strong>in</strong>g. If there is no confusion,<br />
the eld selectors <strong>in</strong> record or union types may be omitted.<br />
The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />
standard operators on base types and on records, sets, bags, etc. We omit the details here. A<br />
type $t$ is called proper iff the number of its parameters is 0; $t$ is called a value type iff there<br />
is no occurrence of ID in $t$. If $t'$ is a proper type occurring in a type $t$, then there exists a<br />
corresponding occurrence relation $o : t \times t' \to \mathrm{BOOL}$.<br />
A subtype function is a function $t' \to t$ from a subtype to its supertype ($t' \le t$) defined by<br />
the usual subtype relation [4].<br />
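The notions of proper type and value type can be illustrated by a small sketch. It is only an illustration under simplifying assumptions: the class names `Base`, `Param`, `Record` and `SetOf` are hypothetical, and only records and finite sets are modelled (lists, bags and unions would be analogous).

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Base:            # BOOL, NAT, INT, FLOAT, STRING, ID or the trivial type
    name: str

@dataclass(frozen=True)
class Param:           # a type parameter
    name: str

@dataclass(frozen=True)
class Record:          # (a_1 : t_1, ..., a_n : t_n)
    fields: Tuple[Tuple[str, "Type"], ...]

@dataclass(frozen=True)
class SetOf:           # {t}
    elem: "Type"

Type = Union[Base, Param, Record, SetOf]

ID = Base("ID")

def subterms(t):
    """Yield t and every type expression nested inside it."""
    yield t
    if isinstance(t, Record):
        for _, field_type in t.fields:
            yield from subterms(field_type)
    elif isinstance(t, SetOf):
        yield from subterms(t.elem)

def is_proper(t):
    """Proper: the type has no parameters."""
    return not any(isinstance(s, Param) for s in subterms(t))

def is_value_type(t):
    """Value type: ID does not occur anywhere in the type."""
    return all(s != ID for s in subterms(t))
```

A record of base types is both proper and a value type, whereas a record containing ID or a parameter fails the respective test.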
Example 9.1.<br />
Let us define a type VZ and a simple subtype VZ' of it.<br />
Type VZ =<br />
( begin : DATE ,<br />
end : DATE ∪ ⊥ ,<br />
kind-of-insurance : "Main" | "Family" | "Interruption" )<br />
End VZ<br />
Type VZ' =<br />
( begin : DATE ,<br />
end : DATE ,<br />
kind-of-insurance : "Main" | "Family" | "Interruption" )<br />
End VZ'<br />
□<br />
Predicative types are used to restrict the set of values given by some type definition to a<br />
subset. For this purpose a formula with exactly one free variable self is used. Clearly, the<br />
inclusion then gives a subtype function. In order to avoid inflationary use of quantifiers, other<br />
variables are also allowed to occur freely in such a formula. They are assumed to be universally<br />
quantified.<br />
Definition 9.2. A predicative type $T$ consists of an underlying type $T'$ and a formula $P$ with<br />
exactly one free variable self of type $T'$.<br />
Example 9.2. Let us define a predicative subtype of [ VZ ].<br />
Type VZ-list = [ VZ ] Where<br />
( self = concat(L_1, [V_1, V_2 | L_2]) ⇒<br />
V_2 :: VZ' ∧ V_2.end ≤ V_1.begin ) ∧<br />
( self = concat(L_1, [V | L_2]) ⇒<br />
( V.end ≠ ⊥ ⇒ V.begin ≤ V.end ) )<br />
End VZ-list<br />
□<br />
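The Where-clause of a predicative type can be read as a membership test on the underlying type. The following sketch checks such a condition for lists of insurance periods. It assumes that the order relation lost in the formula is "less than or equal", that `None` stands in for the undefined end, and that membership in VZ' simply means that the end is defined; the names `VZ` and `is_vz_list` are chosen for illustration only.

```python
from datetime import date
from typing import List, NamedTuple, Optional

class VZ(NamedTuple):
    begin: date
    end: Optional[date]      # None stands in for the undefined end
    kind: str                # "Main", "Family" or "Interruption"

def is_vz_list(periods: List[VZ]) -> bool:
    # Every entry: a defined end must not precede the beginning.
    for v in periods:
        if v.end is not None and v.begin > v.end:
            return False
    # Every adjacent pair V1, V2: V2 is closed and ends no later than V1 begins.
    for v1, v2 in zip(periods, periods[1:]):
        if v2.end is None or v2.end > v1.begin:
            return False
    return True
```

Under this reading only the first (most recent) entry of a list may have an open end.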
9.2.2 Class Definitions<br />
Each object in a class consists of an identifier, a collection of values, references to other objects<br />
and methods. Let us postpone methods for a while. Identifiers can be represented using the<br />
unique identifier type ID. Values and references can be combined into a representation type,<br />
where each occurrence of ID denotes a reference to some other class. Therefore, we may<br />
define the structure of a class using parameterized types. Moreover, classes are arranged in<br />
IsA-hierarchies.<br />
Definition 9.3. – If $t$ is a value type with parameters $\alpha_1, \dots, \alpha_n$ such that ID does not<br />
occur in $t$ and if some of the parameters are replaced by pairs $r_i : C_i$ with a reference<br />
name $r_i$ and a class name $C_i$, the resulting expression is called a structure expression.<br />
Note that a structure expression may still contain parameters.<br />
– A class consists of a class name $C$, a structure expression $S$, a set of class names<br />
$D_1, \dots, D_m$ (called superclasses) and a set of methods. We call $r_i$ the reference named $r_i$<br />
from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by<br />
the type ID is called the representation type $T_C$ of the class $C$.<br />
Example 9.3.<br />
Let us consider a class Insurant for an insurance application.<br />
Class Insurant =<br />
Structure ( contract-no : NAT ,<br />
name : NAME ,<br />
address : ADDRESS ,<br />
sex : SEX ,<br />
insurance-times : VZ-list ,<br />
agency : AGENCY )<br />
Method ...<br />
End Insurant<br />
□<br />
In this example there are no references, hence the structure expression is simply a type. We<br />
could have defined this type, say INSURANT-DATA, separately from the class definition as<br />
in Section 9.2.1. Then the structure would simply be Structure INSURANT-DATA.<br />
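The step from a structure expression to its representation type can be sketched as follows. The sketch assumes a flat record whose fields are either type names or references; the helper names `Ref`, `representation_type` and `references` are hypothetical, and the usage below takes a variant of Insurant in which agency is a reference to a class Agency.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass(frozen=True)
class Ref:
    """A reference r : C inside a structure expression (C is a class name)."""
    target_class: str

# A structure expression is modelled here as a flat record:
# field name -> type name (str) or reference (Ref).
Structure = Dict[str, Union[str, Ref]]

def representation_type(structure: Structure) -> Dict[str, str]:
    """Replace every reference r : C by the identifier type ID."""
    return {field: ("ID" if isinstance(t, Ref) else t)
            for field, t in structure.items()}

def references(structure: Structure) -> Dict[str, str]:
    """Map each reference name to the class it points to."""
    return {field: t.target_class
            for field, t in structure.items() if isinstance(t, Ref)}
```

For a structure without references, as in Example 9.3, `representation_type` returns the structure unchanged.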
9.2.3 Method Definitions<br />
Let us now turn to adding dynamics to the OODM. As required in the object oriented<br />
approach, operations will be associated with classes. This gives us the notion of a method.<br />
We shall distinguish between visible and hidden methods to separate those methods that<br />
can be invoked by the user from the others. However, all methods of a class, including the hidden<br />
ones, can be accessed by other methods. There are two reasons for such a weak hiding<br />
concept.<br />
– Visible methods serve as a means to specify (nested) transactions. In order to build<br />
sequences of database instances we only regard these transactions, assuming a linear invocation<br />
order on them.<br />
– Hidden methods can be used to handle identifiers. Since these identifiers do not have any<br />
meaning for the user, they must not occur within the input or output of a transaction.<br />
Each method on a class $C$ consists of a signature and a body. The signature consists of a<br />
method name and sets of parameter/type pairs for input and output. The body is defined by<br />
the usual constructs of a procedural programming language.<br />
Definition 9.4. – A method signature consists of a method name $M$, a set of input-parameter/type<br />
pairs $\iota_i :: T_i$ and a set of output-parameter/type pairs $o_j :: T'_j$.<br />
– A method on a class $C$ consists of a method signature and a body that is recursively built<br />
from the following constructs:<br />
assignment $x := E$, where $x$ is either the class variable $C$ of type $\{U_C\}$ or a local<br />
variable within $S$ (including the output-parameters), and $E$ is an expression of the<br />
same type as $x$,<br />
local variable declaration Let $x :: T$,<br />
skip and fail,<br />
sequencing $S_1 ; S_2$ and branching IF $P$ THEN $S_1$ ELSE $S_2$ ENDIF,<br />
method call $C' \text{ :- } M'(\mathrm{in} : E'_1, \dots, E'_j,\ \mathrm{out} : x'_1, \dots, x'_i)$, where $M'$ is a method on class<br />
$C'$ with compatible signature, and<br />
non-deterministic selection of values New.$f(x)$, where $f$ is a selector on the representation<br />
type of $C$.<br />
If the class name is omitted in a method call, then we refer to the class $C$ itself or to the<br />
global method New_Id, which denotes the selection of a new identifier. Clearly, we may regard this<br />
method as belonging to an abstract class Any that is a superclass of all classes and has structure<br />
$\bot$.<br />
A method $M$ on a class $C$ is called value-defined iff all types occurring in its signature are<br />
proper value types. As already mentioned, we distinguish between methods visible to the user<br />
and hidden methods. We require each visible method to be value-defined. Subclasses inherit<br />
the methods of their superclasses, but overriding is allowed as long as the new method is a<br />
specialization of all its corresponding methods in its superclasses.<br />
Example 9.4. Let us add the method add-insurant to the class Insurant of Example 9.3.<br />
Method<br />
add-insurant ( in : request-data :: REQUEST-DATA ,<br />
out : contract-no :: NAT ) =<br />
Insurant :- check-data ( in : request-data ,<br />
out : acceptable :: BOOL ) ;<br />
IF acceptable<br />
THEN Let I :: ID , C :: NAT ;<br />
New.contract-no (C) ;<br />
New_Id ( out : I ) ;<br />
Insurant :- compute-insurant-data<br />
( in : request-data, C , out : V ) ;<br />
Insurant := Insurant ∪ {(I,V)}<br />
ELSE fail<br />
ENDIF<br />
□<br />
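The control flow of add-insurant can be mimicked in a conventional language. The following is a rough sketch, not the OODM semantics: the class extension is a dictionary from identifiers to values, the bodies of `check_data` and `compute_insurant_data` are invented placeholders (the text leaves these methods undefined), the non-deterministic selection New.contract-no is replaced by picking the smallest unused number, and a counter stands in for New_Id.

```python
import itertools

_ids = itertools.count(1)          # stand-in for the global method New_Id

class Fail(Exception):
    """Models the construct `fail`: the transaction is aborted."""

def check_data(request):           # hypothetical: undefined in the text
    return bool(request.get("name"))

def compute_insurant_data(request, contract_no):   # hypothetical as well
    return {"contract-no": contract_no, **request}

def add_insurant(insurant, request):
    """insurant: dict mapping identifiers to values (the class extension)."""
    if not check_data(request):
        raise Fail("request data not acceptable")
    used = {v["contract-no"] for v in insurant.values()}
    c = next(n for n in itertools.count(1) if n not in used)  # New.contract-no(C)
    i = next(_ids)                                            # New_Id(out: I)
    v = compute_insurant_data(request, c)
    insurant[i] = v                 # Insurant := Insurant ∪ {(I, V)}
    return c                        # only the value c is returned, never i
```

Note that the identifier `i` never leaves the function, in line with the requirement that visible methods be value-defined.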
Let us briefly discuss what it means that a method $N$ on a class $D$ specializes the method $M$<br />
on a superclass $C$. First, we may assume (taking records) that there is exactly one input-<br />
and one output-type, say $I_N$ (resp. $I_M$) and $O_N$ (resp. $O_M$). The input-type is used for two<br />
purposes: object identification in $D$ (resp. $C$) and providing necessary parameters, hence $I_N$<br />
(resp. $I_M$) is a subtype of some $I_D \times I'_N$ (resp. $I_C \times I'_M$).<br />
In order to "inherit" the behaviour of $M$ to $N$ we must be able to transform $N$ in such a<br />
way that it becomes applicable to the input of $M$. Hence we have to project the parameter<br />
parts, whereas identification may exploit object identifiers (see Definition 9.6). Hence $I'_M$ must<br />
be a subtype of $I'_N$.<br />
Note that this gives some kind of partial contravariance, whereas [11] requires covariance<br />
and [1] requires contravariance only. The differences are due to the mismatches between<br />
program and database design as already mentioned in Section 9.1.<br />
For the output-types the situation is much simpler, requiring $O_N$ to be a subtype of $O_M$.<br />
We may then transform $N$ in a canonical way to some $N'$ with the same signature as $M$.<br />
Both may be regarded as methods on $C$. Then, if $N'$ applied to some input-value yields some<br />
result, this should also result from applying $M$ (but not vice versa). A more formal discussion<br />
of this topic can be found in [17].<br />
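The last condition can be stated compactly if methods are modelled as functions that return the set of their possible results. This modelling, and the name `specializes`, are assumptions made for illustration only; the formal treatment is the one referred to in [17].

```python
def specializes(n_prime, m, inputs):
    """N' specializes M on the given inputs iff every result N' can
    produce for an input is also a possible result of M (not vice versa)."""
    return all(n_prime(x) <= m(x) for x in inputs)
```

A method that is undefined on some input returns the empty set there and therefore satisfies the condition trivially for that input.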
9.2.4 Schema Definitions<br />
Now we are prepared for the definition of a database schema, which is simply given by a finite<br />
collection of type and class definitions. Later we shall add constraint definitions. Thus, taking<br />
together Examples 9.1-9.4, we get a schema with only one class Insurant and only one<br />
method add-insurant.<br />
However, some of the types in this schema, such as NAME, ADDRESS and REQUEST-DATA,<br />
are undefined. The same applies to the methods check-data and compute-insurant-data<br />
called by add-insurant. Allowing this kind of partiality in OODM schemata makes it possible to<br />
capture incomplete knowledge about an application area as well, and it will be essential for our<br />
methodology. In the next two chapters we shall explain this feature in more detail and show<br />
how to exploit it for a standard refinement process.<br />
First let us have a closer look at schemata that are "complete", i.e. correspond to a final<br />
design of an application. This leads to the notion of closed schemata.<br />
Definition 9.5. A schema $\mathcal{S}$ is a finite collection of type, class and constraint definitions. It is<br />
closed iff all types, classes and methods occurring within type definitions, structure definitions<br />
and methods are defined in $\mathcal{S}$.<br />
Let us postpone constraints for a while. At each time, a class is given by a finite set of objects.<br />
More precisely, we need the notion of a database instance.<br />
Definition 9.6. An instance $\mathcal{D}$ of a closed schema $\mathcal{S}$ assigns to each class $C$ a value $\mathcal{D}(C)$<br />
of type $\{(\mathrm{ident} : \mathrm{ID}, \mathrm{value} : T_C)\}$ such that the following conditions are satisfied:<br />
uniqueness of identifiers: For every class $C$ we have<br />
$$\forall i :: \mathrm{ID}.\ \forall v, w :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (i, w) \in \mathcal{D}(C) \Rightarrow v = w . \tag{9.90}$$<br />
inclusion integrity: For a subclass $C$ of $C'$ we have<br />
$$\forall i :: \mathrm{ID}.\ i \in \mathrm{dom}(\mathcal{D}(C)) \Rightarrow i \in \mathrm{dom}(\mathcal{D}(C')) . \tag{9.91}$$<br />
Moreover, if $T_C$ is a subtype of $T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then we have<br />
$$\forall i :: \mathrm{ID}.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C') . \tag{9.92}$$<br />
referential integrity: For each reference from $C$ to $C'$ with corresponding occurrence<br />
relation $o_r$ we have<br />
$$\forall i, j :: \mathrm{ID}.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge o_r(v, j) \Rightarrow j \in \mathrm{dom}(\mathcal{D}(C')) . \tag{9.93}$$<br />
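The three conditions on an instance can be checked directly when class extensions are modelled as finite sets of (ident, value) pairs. The sketch below makes that assumption, with values as hashable tuples; the function names and the parameter `refs_of`, which plays the role of the occurrence relation, are hypothetical.

```python
def unique_identifiers(ext):
    """(9.90): ext is a set of (ident, value) pairs; no ident carries two values."""
    ids = [i for i, _ in ext]
    return len(ids) == len(set(ids))

def inclusion_integrity(sub, sup, f=lambda v: v):
    """(9.91)/(9.92): each (i, v) in the subclass reappears as (i, f(v)) in the superclass."""
    return all((i, f(v)) in sup for i, v in sub)

def referential_integrity(ext, target, refs_of):
    """(9.93): every identifier referenced from a value exists in the target class.
    refs_of(v) lists the identifiers occurring in v."""
    dom = {i for i, _ in target}
    return all(j in dom for _, v in ext for j in refs_of(v))
```

Here the subtype function `f` is passed explicitly, for instance as a projection that drops the additional components of the subclass value.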
Basic update methods, i.e. insertion, deletion and update of a single object in a class $C$,<br />
cannot always be derived in the object-oriented case, because the abstract identifiers have<br />
to be hidden from the user. However, in [16] it has been shown that for value-representable<br />
classes these operations are uniquely determined by the schema and consistent with respect<br />
to the implicit referential and inclusion constraints.<br />
Value-representability of all classes in a closed schema is implied if we can derive a (trivial)<br />
uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the<br />
class extension $C$ to be unique:<br />
$$\forall i, j :: \mathrm{ID}.\ \forall v :: T_C.\ (i, v) \in \mathcal{D}(C) \wedge (j, v) \in \mathcal{D}(C) \Rightarrow i = j . \tag{9.94}$$<br />
Finally, the semantics of a closed schema is given by database histories, where a database<br />
history on a schema $\mathcal{S}$ is a sequence $\mathcal{D}_0, \mathcal{D}_1, \dots$ of instances such that $\mathcal{D}_0$ is the empty<br />
database and each transition from $\mathcal{D}_{i-1}$ to $\mathcal{D}_i$ is due to some visible method on some class<br />
$C \in \mathcal{S}$.<br />
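Both the uniqueness constraint (9.94) and the notion of a database history can be sketched briefly. The sketch simplifies an instance to the extension of a single class, given as a frozen set of (ident, value) pairs, and models a visible method call as a function from one instance to the next; both simplifications and the names used are assumptions.

```python
def values_unique(ext):
    """(9.94): no value is stored under two different identifiers."""
    vals = [v for _, v in ext]
    return len(vals) == len(set(vals))

def history(calls):
    """Build D_0, D_1, ... by applying visible methods to the empty database.
    Each call is a function from an instance to the next instance."""
    instances = [frozenset()]          # D_0 is the empty database
    for call in calls:
        instances.append(call(instances[-1]))
    return instances
```

A history of n method calls thus consists of n + 1 instances, starting with the empty one.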
9.3 Class Abstraction<br />
As we have seen in Section 9.2, the structure expression of a class in an OODM schema may<br />
contain parameters. These arise from parameterized types. Parameterized classes allow us to<br />
abstract from concrete structures. Indeed, an instance of a parameterized class may not be<br />
regarded as a single set of pairs, but as a family of such sets indexed by the possible instantiations.<br />
Let us now extend and concretize this view to arbitrary schemata.<br />
If we know that objects will have some attributes, but we still do not know the type of the<br />
corresponding values, we may leave the corresponding parameter uninstantiated. However, if<br />
we already know that we shall instantiate this parameter by some type, we may mark this<br />
parameter as a type parameter. If we know that there will be some reference $r_i : C_i$, but $C_i$ is<br />
undefined, then we have a class parameter.<br />
For parameterized classes the possibilities to define methods and constraints are restricted.<br />
If $\alpha$ is a type parameter and we do not know anything about the type, there is no non-trivial<br />
way to express a term of that type, but terms are required in assignments as well as in<br />
constraints. However, we may have partial knowledge of that type, e.g. that it is a subtype of<br />
some other type, in which case we may use terms of that supertype.<br />
If $C$ is a class parameter, then each call of a method $m$ on $C$ is indeed undefined. Therefore,<br />
for the proof of properties of the calling method, such as consistency, we can only<br />
assume an arbitrary input-output relation for $m$, unless we completely defer the<br />
proof.<br />
Definition 9.7. Let $\mathcal{S}$ be a schema, $T$ a type parameter, $C$ a class parameter and $M$ an<br />
undefined method. A parameter restriction is either $T \le T'$ with some value type expression<br />
$T'$, $C$ isa $C'$ with some class name $C'$, $C.\mathrm{structure} \le S$ with some structure expression $S$,<br />
or a restriction on the types of the signature of $M$.<br />
Here $\le$ denotes the subtype relation and its canonical extension to structure expressions. Note<br />
that some parameter restrictions may be inferred from context in the schema $\mathcal{S}$. If a parameter<br />
is unrestricted, we may add the implicit parameter restrictions $T \le \bot$, $C.\mathrm{structure} \le \bot$<br />
and $T_i \le \bot$ for type parameters, class parameters and types in method signatures, respectively. However,<br />
if there is more than one restriction on a parameter, these may be inconsistent. In the case<br />
of a consistent set of parameter restrictions, the set of restrictions on one parameter may be<br />
unified to give only one restriction in the form of Definition 9.7. We then talk of the normalized<br />
set of parameter restrictions.<br />
In order to define the semantics of open (i.e. not closed) schemata, we need the notion of<br />
instantiations.<br />
Definition 9.8. Let $\mathcal{S}$ be a schema with a consistent set of parameter restrictions. An instantiation<br />
$I$ is given by a closed schema $\mathcal{S}'$ that results from $\mathcal{S}$ by replacing each type parameter<br />
$T$ by a value type, each class parameter by a class and each undefined method by "Let ...<br />
$o_i :: O_i$ ..." such that all parameter restrictions are satisfied. $\mathcal{S}'$ is called minimal iff we take<br />
the types and classes occurring in the normalized set of parameter restrictions.<br />
Example 9.5. Let us look again at Examples 9.1-9.4. The minimal instantiation of the type<br />
VZ (and VZ') gives<br />
Type VZ =<br />
( begin : ⊥ ,<br />
end : ⊥ ,<br />
kind-of-insurance : "Main" | "Family" | "Interruption" )<br />
End VZ<br />
The minimal instantiation of the class Insurant leads to the structure expression<br />
Structure ( contract-no : NAT ,<br />
name : ⊥ ,<br />
address : ⊥ ,<br />
sex : ⊥ ,<br />
insurance-times : VZ-list ,<br />
agency : ⊥ )<br />
The method add-insurant involves the call of check-data on the same class, but this method is<br />
undefined, hence it can only be treated as the non-deterministic value selection "Let accepted<br />
:: BOOL". □<br />
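The construction of a minimal instantiation for a flat structure expression can be sketched as a simple substitution. The sketch assumes that types are referred to by name, that `defined` lists the names defined in the schema, and that `restrictions` maps a type parameter to the single supertype of its normalized restriction; all three names are illustrative.

```python
TRIVIAL = "⊥"   # the trivial type, supertype of every type

def minimal_instantiation(structure, defined, restrictions=None):
    """Replace each undefined type name by its normalized restriction,
    or by the trivial type if the parameter is unrestricted."""
    restrictions = restrictions or {}
    return {field: t if t in defined else restrictions.get(t, TRIVIAL)
            for field, t in structure.items()}
```

Applied to the structure of Insurant with only NAT and VZ-list defined, this yields the structure expression shown in the example.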
Finally, the full semantics of an open schema $\mathcal{S}$ is given by families of history sets indexed<br />
by the possible instantiations of $\mathcal{S}$, whereas the minimal semantics is the semantics of the<br />
minimal instantiation.<br />
Note that each instantiation can be projected naturally to the minimal one. The principle<br />
of class abstraction is necessary for stepwise refinement as indicated in Section 9.3, since<br />
otherwise we would not be able to support partial designs. On the other hand, it increases the<br />
range of possible concrete designs that occur as instantiations. Therefore, it is desirable<br />
to provide libraries of abstract (partial) designs to achieve a higher rate of reusability.<br />
9.4 Stepwise Refinement<br />
Once an initial OODM schema is given, the subsequent design process is based on stepwise<br />
refinement. Roughly speaking, refinement means the reorganization of classes and methods<br />
such that the semantics of the old schema is "preserved" within the new one. This is captured<br />
by the next definition.<br />
Let $\mathcal{S}$ and $\mathcal{T}$ be closed schemata and suppose there are (partial) functions<br />
– $f_{\mathrm{inst}}$ that is total, taking instances of $\mathcal{T}$ to instances of $\mathcal{S}$,<br />
– $f_{\mathrm{class}}$ that is partial, taking a class in $\mathcal{T}$ to a class in $\mathcal{S}$, and<br />
– $f_{\mathrm{meth}}$ that is total, taking a method in $\mathcal{T}$ to a (possibly empty) set of methods in $\mathcal{S}$,<br />
such that for each method $M$ associated with a class $C$ in $\mathcal{T}$ each method $M' \in f_{\mathrm{meth}}(M)$ is<br />
associated with $f_{\mathrm{class}}(C)$. If $\mathcal{S}$ and $\mathcal{T}$ are arbitrary schemata, assume these functions to be<br />
defined on the minimal instantiations.<br />
Definition 9.9. $\mathcal{T}$ is a refinement of $\mathcal{S}$ iff for each pair $(\mathcal{D}_{i-1}, \mathcal{D}_i)$ in a database history of<br />
$\mathcal{T}$ that corresponds to a method $M$ and each $M' \in f_{\mathrm{meth}}(M)$ that is defined and terminating<br />
in $f_{\mathrm{inst}}(\mathcal{D}_{i-1})$, the pair $(f_{\mathrm{inst}}(\mathcal{D}_{i-1}), f_{\mathrm{inst}}(\mathcal{D}_i))$ corresponds to $M'$.<br />
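The refinement condition can be checked for a single transition of a history as sketched below. The sketch assumes that instances are hashable values and that the semantics of each method of the old schema is available as a function returning the set of possible successor instances, or `None` where the method is undefined or does not terminate; the dictionary `semantics` and the function name are hypothetical.

```python
def is_refinement_step(d_prev, d_next, method, f_inst, f_meth, semantics):
    """Check Definition 9.9 for one transition (d_prev, d_next) caused by `method`.
    semantics[m](d) returns the set of possible successor instances of d under
    the old-schema method m, or None if m is undefined or non-terminating in d."""
    old_prev, old_next = f_inst(d_prev), f_inst(d_next)
    for m_old in f_meth(method):
        successors = semantics[m_old](old_prev)
        if successors is not None and old_next not in successors:
            return False
    return True
```

A schema is then a refinement if this check succeeds for every transition of every history, which in general requires proof rather than enumeration.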
There exists a more elegant (but also strongly theoretical) characterization of refinement. We<br />
omit the details here. In [15] the following standard refinement steps in the OODM have been<br />
discussed on the basis of an application example.<br />
9.4.1 Instantiation<br />
In Section 9.3 we discussed the possibility of parameterized (open) schemata and defined their<br />
semantics. Refinement by instantiation provides definitions for such parameters, but may also<br />
introduce new parameters.<br />
Example 9.6. Let us instantiate the type parameters ADDRESS and AGENCY occurring<br />
in Example 9.3.<br />
Type ADDRESS =<br />
( zip : NAT Where self < 100 000 ,<br />
city : STRING ,<br />
street : STRING )<br />
End ADDRESS<br />
Type AGENCY =<br />
( number : NAT Where self < 1 000 ,<br />
address : ADDRESS ,<br />
phones : { TELECOM_NO } ,<br />
fax : TELECOM_NO ,<br />
cares_for : { ( zip : NAT Where self < 100 000 ,<br />
city : STRING ) } )<br />
End AGENCY<br />
□<br />
Refinement by instantiation may also introduce bodies for methods that were undefined so<br />
far.<br />
9.4.2 Splitting<br />
Refinement by splitting leads to new classes with structure expressions that correspond to<br />
parts of an existing structure expression, which in turn are replaced by references. It is mainly<br />
used in the case of shared data.<br />
Example 9.7. The class Agency stems from splitting Insurant in Example 9.3, assuming<br />
the instantiation of Example 9.6 to be already done. The new reference is agency : Agency.<br />
Class Agency =<br />
Structure ( agency : AGENCY )<br />
End Agency<br />
Class Insurant =<br />
Structure ( contract-no : NAT ,<br />
... ... ,<br />
agency : Agency )<br />
Methods ...<br />
End Insurant<br />
Clearly, the existing methods on the split class also have to be changed.<br />
□<br />
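On the level of instances, splitting moves shared values into a class of their own and leaves identifiers behind. The sketch below illustrates this together with a mapping back to instances of the original schema, in the sense of the function taking instances of the refined schema to instances of the old one. It assumes integer identifiers, hashable agency values and dictionary-based extensions; the function names are hypothetical.

```python
import itertools

def split_agency(insurant):
    """Move the agency values of Insurant into a new class Agency.
    insurant: dict ident -> dict with an 'agency' value.
    Returns (new_insurant, agency) where 'agency' now holds an identifier."""
    fresh = itertools.count(max(insurant, default=0) + 1)
    by_value, agency, new_insurant = {}, {}, {}
    for i, v in insurant.items():
        a = v["agency"]
        if a not in by_value:            # shared data is stored only once
            by_value[a] = next(fresh)
            agency[by_value[a]] = a
        new_insurant[i] = {**v, "agency": by_value[a]}
    return new_insurant, agency

def f_inst(new_insurant, agency):
    """Map an instance of the split schema back to the original schema."""
    return {i: {**v, "agency": agency[v["agency"]]}
            for i, v in new_insurant.items()}
```

Two insurants served by the same agency end up referencing a single Agency object, and `f_inst` recovers the original instance.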
9.4.3 Specialization<br />
Refinement by specialization introduces subclasses and subtypes. Moreover, it may involve<br />
replacing a structure expression such that the new representation type will be a subtype of the<br />
old one and the new implicit constraints will imply the old ones.<br />
Example 9.8. Let us introduce a new class Main-Insurant as a subclass of Insurant.<br />
Objects in this subclass have an additional reference to Company that need not exist for all<br />
insurants.<br />
Class Main-Insurant =<br />
IsA Insurant Structure ( account-no : NAT ,<br />
employed-by : Company )<br />
Methods ...<br />
End Main-Insurant<br />
The new class Insurant results from specializing the old class of this name. We simply<br />
add a reference to the class Main-Insurant for the case of insurants of kind "Family". The<br />
corresponding subtype function is a simple projection.<br />
Class Insurant =<br />
Structure ( ... ,<br />
insurance-times : [ ( begin : DATE ,<br />
end : DATE ∪ ⊥ ,<br />
( kind : "Main" | "Interruption" ) ∪<br />
( kind : "Family" ,<br />
associated-with : Main-Insurant ) ) ]<br />
Where ... ,<br />
agency : Agency )<br />
Methods ...<br />
End Insurant<br />
□<br />
9.4.4 Extension<br />
Refinement by extension is very simple, since it means the definition of new types, classes,<br />
constraints or methods that do not yet exist in the schema.<br />
Example 9.9. A new class New_Insurant to capture persons that apply to become an<br />
insurant is introduced as follows.<br />
Class New_Insurant =<br />
Structure ( name : NAME ,<br />
address : ADDRESS ,<br />
sex : SEX ,<br />
when_to_start : DATE ,<br />
initial-agency : Agency ,<br />
vocational-group : VOCATION-KEY ,<br />
income : NAT Where self < 1 000 000 )<br />
Methods ...<br />
End New_Insurant<br />
Objects may at the same time belong to both classes Insurant and New_Insurant with<br />
different names, addresses and so on. Object identifiers are used to relate different aspects of<br />
the same object.<br />
□<br />
9.5 Declarativity by Constraint Centered Design<br />
As announced in Definition 9.5, we now concretize constraints associated with a schema.<br />
Particular attention will be paid to constraints that arise as generalizations of constraints<br />
known from the relational model, e.g. functional, inclusion and exclusion constraints [17, 18].<br />
Definition 9.10. – An integrity constraint on a schema $\mathcal{S}$ is a formula $I$ over the underlying<br />
type system with free variables $fr(I) \subseteq \{C_1, \dots, C_n\}$, where each class name $C_i$ is used<br />
as a variable of type $\{(\mathrm{ident} : \mathrm{ID}, \mathrm{value} : T_{C_i})\}$.<br />
– An instance $\mathcal{D}$ of a schema is said to be consistent iff substituting $\mathcal{D}(C)$ for each class<br />
variable $C$ in each integrity constraint $I$ evaluates to true, when interpreted in the usual<br />
way.<br />
Note that the conditions for an instance in Definition 9.6 correspond to model-inherent integrity<br />
constraints. We refer to these constraints as implicit identifier, IsA and referential<br />
constraints on the schema $\mathcal{S}$. Other constraints that are already given implicitly by the structure<br />
of the schema arise from Where-clauses in predicative types. Indeed, we may replace such<br />
types by the underlying ground type (just omit the Where-clause) and add the clause as a<br />
constraint. From the designer's point of view this is not necessary, but it will be as soon as<br />
constraint maintenance comes into play (see below).<br />
Example 9.10. Return to Example 9.8, where we introduced the class Main-Insurant as<br />
a subclass of Insurant. We would like to express that each object currently in Insurant<br />
with kind = "Main" must also belong to Main-Insurant. This gives the formula<br />
$$\forall i, v, b, \ell.\ (i, v) \in \text{Insurant} \wedge \text{insurance-times}(v) = [(b, \bot, \text{``Main''}) \mid \ell] \Rightarrow \exists w.\ (i, w) \in \text{Main-Insurant}$$<br />
with free variables Insurant and Main-Insurant.<br />
□<br />
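Evaluating such a constraint against an instance amounts to substituting the class extensions for the class variables. The following sketch does this for the formula of the example, under the assumptions that extensions are dictionaries from identifiers to values, that insurance times are (begin, end, kind) triples with the most recent first, and that `None` represents the undefined end; the function name is hypothetical.

```python
def main_insurants_covered(insurant, main_insurant):
    """Each object in Insurant whose most recent insurance time is an open
    period of kind "Main" must also occur in Main-Insurant.
    Both arguments map identifiers to values; a value holds a list
    'insurance-times' of (begin, end, kind) triples, newest first."""
    for i, v in insurant.items():
        times = v["insurance-times"]
        if times and times[0][1] is None and times[0][2] == "Main":
            if i not in main_insurant:
                return False
    return True
```

An instance is consistent with respect to this constraint exactly when the function returns `True`.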
In particular, we allow distinguished classes of constraints to be specified in OODM schemata.<br />
These comprise inclusion, exclusion, functional, uniqueness, object generating and path constraints<br />
and generalize relevant classes known in the relational field [22].<br />
Definition 9.11. Let $C, C_1, C_2$ be classes in a schema $\mathcal{S}$ and let $c^i : T_C \to T_i$ ($i = 1, 2, 3$)<br />
and $c_i : T_{C_i} \to T$ ($i = 1, 2$) be subtype functions.<br />
– A functional constraint on $C$ is a constraint of the form<br />
$$\forall i, i' :: \mathrm{ID}.\ \forall v, v' :: T_C.\ c^1(v) = c^1(v') \wedge (i, v) \in x_C \wedge (i', v') \in x_C \Rightarrow c^2(v) = c^2(v') . \tag{9.95}$$<br />
– An inclusion constraint on $C_1$ and $C_2$ is a constraint of the form<br />
$$\forall t :: T.\ \bigl(\exists i_1 :: \mathrm{ID}, v_1 :: T_{C_1}.\ (i_1, v_1) \in x_{C_1} \wedge c_1(v_1) = t\bigr) \Rightarrow \bigl(\exists i_2 :: \mathrm{ID}, v_2 :: T_{C_2}.\ (i_2, v_2) \in x_{C_2} \wedge c_2(v_2) = t\bigr) . \tag{9.96}$$<br />
– An exclusion constraint on $C_1$, $C_2$ is a constraint of the form<br />
$$\forall i_1, i_2 :: \mathrm{ID}.\ \forall v_1 :: T_{C_1}.\ \forall v_2 :: T_{C_2}.\ (i_1, v_1) \in x_{C_1} \wedge (i_2, v_2) \in x_{C_2} \Rightarrow c_1(v_1) \ne c_2(v_2) . \tag{9.97}$$<br />
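The three constraint classes of this definition have direct counterparts on finite extensions. The sketch below assumes that extensions are sets of (ident, value) pairs and that the subtype functions are passed as ordinary functions producing hashable results; the function names are chosen for illustration.

```python
def functional(ext, c1, c2):
    """(9.95): objects that agree under c1 also agree under c2."""
    seen = {}
    for _, v in ext:
        if seen.setdefault(c1(v), c2(v)) != c2(v):
            return False
    return True

def inclusion(ext1, ext2, c1, c2):
    """(9.96): every value c1(v1) in C1 also occurs as some c2(v2) in C2."""
    return {c1(v) for _, v in ext1} <= {c2(v) for _, v in ext2}

def exclusion(ext1, ext2, c1, c2):
    """(9.97): no value is shared between c1 over C1 and c2 over C2."""
    return {c1(v) for _, v in ext1}.isdisjoint({c2(v) for _, v in ext2})
```

With projections as subtype functions, `functional` corresponds to a functional dependency and `inclusion` to an inclusion dependency of the relational model.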
Constraints increase the declarativity of designs. This is important, because in data and<br />
knowledge intensive application systems the data and the constraints on them usually live longer<br />
than the operations, i.e. the methods.<br />
The problem then is to guarantee the consistency of the methods with respect to the<br />
specified constraints. Sometimes this requires hard verification work, but for a wide spectrum<br />
of schemata an automatic transformation of constraints into methods is provided.<br />
In [16] consistent generic update operations with respect to implicit constraints have been<br />
presented. In [18] this has been extended to the classes of constraints mentioned above. In<br />
[19] an algorithm for the transformation of constraints into transactions has been proven to<br />
be correct. This algorithm reduces the consistency enforcement task to basic updates. It can<br />
be shown that this operational approach to consistency enforcement is more powerful than<br />
the rule triggering approach [5, 8]. However, the verification of a very technical condition,<br />
called I-reducedness, is required, which limits the applicability of consistency enforcement in<br />
general. We omit the details of the algorithm here, since they are hidden from the designer.<br />
The only thing a designer has to know is that constraint specifications will be made explicit<br />
in methods in a canonical way. If this leads to unexpected results, s/he may change the original<br />
design. It is an open research problem how to support the amelioration of a schema in case<br />
constraint enforcement leads to inefficient methods.<br />
9.6 Variation Based Reuse: A Research Issue<br />
The design process presented in Sections 9.3-9.5 implicitly assumes that we want to build a<br />
new application system from scratch. One promise of the object oriented approach, however, is<br />
an enormous increase in software reuse. This can be achieved if we keep the design components,<br />
i.e. type and class definitions, in libraries. The benefits are apparent, especially if we<br />
regard the scale of reusability of parameterized class definitions.<br />
Unfortunately, reusability does not automatically imply reuse. Indeed, we have to provide<br />
mechanisms to relate the intended (new) designs to existing components in such a library.<br />
Existing type and class definitions are not independent of one another. The idea is now to<br />
exploit the hierarchies in OODM schemata due to instantiation, specialization and refinement.<br />
This extends the work in [14], where the specialization taxonomy in a KL-ONE like knowledge<br />
representation system has been exploited for a similar task.<br />
An intended design is given just as before by a first (partial) OODM schema. Then the<br />
following cases may occur.<br />
- A class/type of the intended design is an instantiation, specialization or refinement of an<br />
existing design component. Then we may ask whether a rearrangement of requirements<br />
would enable the reuse of further instantiations, specializations or refinements that exist<br />
in the library.<br />
- A class/type of the intended design is a variant of an existing library component, i.e.<br />
the first alternative is true for a reparameterization of this library component. Of course,<br />
this is always possible, since a pure parameter would satisfy this requirement. Hence<br />
we have to judge whether it is helpful to take the reuse of the reparameterization into<br />
consideration. This approach is similar to the use of a similarity measure in case-based<br />
reasoning.<br />
Once we have discovered a reusable variant in the library, we may keep track of the difference<br />
to the intended design and propagate these changes along the existing hierarchies. Then we<br />
may ask whether the resulting components can be directly reused.<br />
This suggests a modification of the refinement-based design methodology. Before starting<br />
a refinement process, existing domain-specific libraries are examined and variants are built.<br />
Then the refinement process is based on selected variants. Moreover, variant construction is<br />
also required after standard refinement steps that introduce new types or classes, since for<br />
these there may also exist variants in some library.<br />
Example 9.11. The class Insurant in Example 9.3 corresponds to current legal requirements.<br />
Some years ago an initial schema for an insurance application would have looked<br />
slightly different, since only main insurants existed at that time. This could have been modelled<br />
by some class Insurant_old.<br />
Class Insurant_old =<br />
Structure ( contract-no : NAT ,<br />
name : NAME ,<br />
address : ADDRESS ,<br />
sex : SEX ,<br />
insurance-times : [(begin : DATE , end : DATE ∪ ⊥)] Where ... ,<br />
account-no : NAT ,<br />
employed-by : Company ,<br />
family : { NAME } ,<br />
agency : AGENCY )<br />
Method ...<br />
End Insurant_old<br />
Assume such an initial design and all its refinements to be kept in some library. Omitting account-no,<br />
employed-by and agency in the structure expression above would give a common supertype<br />
of the representation types for Insurant_old and Insurant in Example 9.3.<br />
Then build variants of all the existing refinements just omitting this information and check<br />
whether these are compatible with the new requirements. This avoids repeating refinement<br />
steps that already occurred (in modified form) in the past.<br />
Finally, specialize Insurant as indicated in Example 9.8 and build variants of the refined<br />
classes Insurant and Main-Insurant with respect to the hierarchy developed so far. Again<br />
this should avoid repeating earlier refinement steps.<br />
□<br />
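The step of omitting fields to obtain a common supertype can be sketched as follows. This is a simplified illustration, assuming that record types are modelled as Python dictionaries from field selectors to opaque type names and that only width subtyping (dropping fields) is considered; the elided Where clause and the structure of Insurant from Example 9.3 are not modelled, and the helper names `omit` and `is_width_supertype` are hypothetical.

```python
def omit(record_type, fields):
    """Return the record type obtained by dropping the given field selectors."""
    return {a: t for a, t in record_type.items() if a not in fields}

def is_width_supertype(sup, sub):
    """True if every field of `sup` occurs in `sub` with the same type name."""
    return all(sub.get(a) == t for a, t in sup.items())

# Structure of Insurant_old as given in the example (Where clause left out).
insurant_old = {
    "contract-no": "NAT",
    "name": "NAME",
    "address": "ADDRESS",
    "sex": "SEX",
    "insurance-times": "[(begin : DATE, end : DATE ∪ ⊥)]",
    "account-no": "NAT",
    "employed-by": "Company",
    "family": "{NAME}",
    "agency": "AGENCY",
}

# Candidate common supertype: omit account-no, employed-by and agency.
common = omit(insurant_old, {"account-no", "employed-by", "agency"})

print(sorted(common))
print(is_width_supertype(common, insurant_old))  # True
```

The same check could then be applied to the representation type of any other class in the library to decide whether `common` is a supertype of it as well.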
The concretization and theoretical treatment of these ideas for the outlined methodology is<br />
a research issue under current investigation.<br />
9.7 Inferences <strong>in</strong> OODB Design<br />
The work reported in the preceding sections presents first principles of object oriented database<br />
design. The main scenario is centered around stepwise refinement on the basis of an object<br />
oriented datamodel supporting class abstraction, generic update operations and declarative<br />
constraint specification. The datamodel as well as the design process involve a lot of supporting<br />
inferences. These fall into two classes. Let us first describe those inferences that are<br />
intrinsic to the datamodel.<br />
- The datamodel supports type and class hierarchies. Since methods on subclasses may<br />
override inherited methods, we have to check that these are indeed specializations in<br />
order to limit undesired arbitrariness.<br />
- The datamodel supports strongly typed methods, hence the problem of checking type correctness.<br />
A more general problem is the verification of consistency for constraints that<br />
escape enforcement.<br />
- The datamodel supports generic updates, but these only exist in the case of value-representability.<br />
This leads to the problem of deciding whether a uniqueness constraint is implied.<br />
The second class of inferences is required by the design methodology and is extrinsic to the<br />
datamodel.<br />
- The main scenario is based on stepwise refinement. Hence the task of verifying formal refinement<br />
conditions. However, for the standard refinement steps in Section 9.3 this is<br />
redundant, since they have already been proven to be correct.<br />
- In order to enforce consistency, the formal requirement of I-reducedness [19] has to be<br />
satisfied. Hence the task of checking it.<br />
- Finally, we have to recognize the relation of an intended design to existing library components,<br />
i.e. whether it is an instantiation, specialization, refinement or variant. This may<br />
involve data restructuring as shown in [12]. Moreover, once a useful variant has been detected,<br />
we may want to propagate the changes along the different hierarchies. This kind<br />
of variation-based reuse is still a research issue that we are working on.<br />
However, there are still open research problems. So far, we do not know the exact boundary<br />
of the inferences. Another problem is the integration of user interfaces and graphical support<br />
in order to facilitate checking whether the design fits the amount of knowledge resulting<br />
from the current stage of requirements analysis.<br />
Currently, there is a research project CODE (Computer-aided Object oriented Design<br />
Environment) that aims at solving these open problems. The main research topics of CODE<br />
will be the extension of the design method toward variation-based reuse and the support of<br />
the outlined methodology by a CASE tool.<br />
References for Chapter 9<br />
1. M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, S. Zdonik: The object-oriented<br />
database system manifesto, Proc. 1st DOOD, Kyoto 1989<br />
2. C. Beeri: A formal approach to object-oriented databases, Data and Knowledge Engineering, vol.<br />
5 (4), 1990, pp. 353-382<br />
3. G. Booch: Object-oriented design with applications, Benjamin Cummings, 1991<br />
4. L. Cardelli, P. Wegner: On understanding types, data abstraction and polymorphism, ACM Computing<br />
Surveys, vol. 17(4), pp. 471-522<br />
5. S. Ceri, J. Widom: Deriving production rules for constraint maintenance, Proc. 16th Conf. on<br />
VLDB, Brisbane (Australia), August 1990, pp. 566-577<br />
6. P. Coad, E. Yourdon: Object-oriented analysis, Prentice Hall, 1991<br />
7. C. Floyd: A comparative evaluation of system development methods, in T. W. Olle, H. G. Sol,<br />
A. A. Verrijn-Stuart (Eds.): Information Systems Design Methodologies - Improving the Practice,<br />
Elsevier 1986<br />
8. P. Fraternali, S. Paraboschi, L. Tanca: Automatic rule generation for constraint enforcement in<br />
active databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of<br />
Models and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS<br />
9. R. Hull, R. King: Semantic database modeling: survey, applications and research issues, ACM<br />
Computing Surveys, vol. 19(3), September 1987<br />
10. W. Kim: Object-oriented databases: definition and research directions, IEEE Trans. on Knowledge<br />
and Data Engineering, vol. 2 (3), 1990, pp. 327-341<br />
11. B. Meyer: Object-oriented software construction, Prentice-Hall, 1988<br />
12. B. Piza, K.-D. Schewe, J. W. Schmidt: Term subsumption with type constructors, in Y. Yesha<br />
(Ed.): Proc. 1st Int. Conf. on Information and Knowledge Management, Baltimore, November<br />
1992<br />
13. G. Saake, R. Jungclaus: Specification of database applications in the TROLL language, in<br />
D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification of Database Systems,<br />
Glasgow, July 1991, Springer WICS, pp. 228-245<br />
14. K.-D. Schewe: Variant construction using constraint propagation techniques over semantic networks,<br />
in J. Retti, K. Leidlmaier (Eds.): Proc. of 5th Austrian AI Conference, Igls (Austria) 1989,<br />
Springer IFB 208, pp. 188-197<br />
15. B. Schewe, K.-D. Schewe, B. Thalheim: Verfeinerungsschritte für eine objektorientierte Entwurfsmethodik,<br />
in Proc. 23rd GI-Jahrestagung, Dresden (Germany), October 1993<br />
16. K.-D. Schewe, J. W. Schmidt, I. Wetzel: Identification, genericity and consistency in object-oriented<br />
databases, in J. Biskup, R. Hull (Eds.): Proc. ICDT '92, Berlin (Germany), October<br />
1992, Springer LNCS 646, pp. 341-356<br />
17. K.-D. Schewe, B. Thalheim, J. W. Schmidt, I. Wetzel: Integrity enforcement in object-oriented<br />
databases, in U. Lipeck, B. Thalheim (Eds.): Proc. 4th Int. Workshop on Foundations of Models<br />
and Languages for Data and Objects, Volkse (Germany), October 1992, Springer WICS<br />
18. K.-D. Schewe, B. Thalheim, I. Wetzel: Integrity preserving updates in object oriented databases, in<br />
M. Orlowska, M. Papazoglou (Eds.): Proc. Australian Database Conference, Brisbane (Australia),<br />
February 1993, World Scientific, pp. 171-185<br />
19. K.-D. Schewe, B. Thalheim: Computing Consistent Transactions, University of Rostock, Preprint<br />
CS-08-92, December 1992, submitted for publication<br />
20. S. Shlaer, S. J. Mellor: An object-oriented approach to domain analysis, ACM Software Engineering<br />
Notes, vol. 14 (3), 1989<br />
21. C. Sernadas, P. Gouveia, J. Gouveia, A. Sernadas, P. Resende: The reification dimension in object-oriented<br />
database design, in D. Harper, M. Norrie (Eds.): Proc. Int. Workshop on the Specification<br />
of Database Systems, Glasgow, July 1991, Springer WICS, pp. 275-299<br />
22. B. Thalheim: Dependencies in relational databases, Teubner, Leipzig 1991<br />
23. B. Thalheim: Intelligent database design using an extended entity-relationship model, University<br />
of Rostock, Preprint CS-11-91, December 1991<br />
Chapter 10<br />
View-Centered Conceptual Modelling<br />
Contents<br />
10.1 Introduction<br />
10.2 The data layer<br />
10.2.1 Application-independent abstraction: types<br />
10.2.2 Combined structure and behaviour: classes<br />
10.2.3 Operations<br />
10.2.4 OODM schemata and instances<br />
10.3 The dialogue layer<br />
10.3.1 Views in the datamodel<br />
10.3.2 Dialogue classes<br />
10.3.3 Operations on d-classes<br />
10.3.4 The dialogue management level<br />
10.3.5 The impact of genericity: selection, invocation, navigation, deletion<br />
10.4 The presentation layer<br />
10.4.1 Presentation of dialogue classes<br />
10.4.2 Presentation of actions<br />
10.5 Development Methods<br />
10.5.1 Designing a New Application<br />
10.5.2 Changing an Existing Application<br />
10.6 Conclusion<br />
The following is a reprint of<br />
Klaus-Dieter Schewe, Bettina Schewe. View-Centered Conceptual Modelling - An<br />
Object Oriented Approach. In B. Thalheim (Ed.): Conceptual Modeling - Proc. ER<br />
'96. Springer LNCS.<br />
Abstract. Information systems for highly skilled clerical workers present themselves as a<br />
collection of window-based processes with underlying procedures accessing databases. It is left<br />
to the users to continue or interrupt a certain piece of work or to switch from one application<br />
to another. Such systems can be supported by three layers: a database layer, a dialogue layer<br />
and a presentation layer.<br />
In this paper an integrated object oriented model with a distinction between types and<br />
classes is outlined. In this model views on the datamodel can be extended to dialogue classes<br />
which enable a smooth integration of dialogue objects with the underlying datamodel. The<br />
only remaining task for the presentation layer consists of suitable ergonomic presentations of<br />
dialogue objects on the screen by means of a general UIMS.<br />
10.1 Introduction<br />
Conceptual modelling for information systems depends on the intended application. If<br />
the work of highly skilled clerical workers is to be supported, we must be aware that they do<br />
not follow a monotone working scheme. E.g., consider agencies of a health insurance company<br />
with emphasis on the service for clients, who behave differently from one another, demand<br />
optimal service and information without delay, address their demands to the agents either<br />
personally, by phone or by fax, prefer not to be burdened with complicated terminology<br />
and forms, etc. Therefore, a supporting information system must support workflow beyond<br />
strict regularity, permitting its users to examine additional circumstances, write specialized<br />
letters instead of using forms, escape or interrupt processes etc.<br />
As a consequence such information systems have to be composed of several independently<br />
usable dialogues, leaving to the user the decision which one to use in a concrete situation. The<br />
dialogue system has to offer many quickly reachable dialogue objects without forcing its users<br />
to reach them in a specific way. Furthermore, it must offer a good overview of a client's<br />
situation as context to the special data actually being processed. On the other hand, such<br />
systems must handle large amounts of data, hence should be supported by a well-designed<br />
database system without bothering the users with database details.<br />
From a conceptual modelling point of view the description of dialogues can be divided into<br />
two major components. The first one comprises the purely representational aspects concerning<br />
windows, fields, menus, shortcuts etc., and its design is basically concerned with a UIMS and<br />
ergonomic criteria [4]. The second one deals with the abstract data contained in the dialogue<br />
objects.<br />
The data-intensive nature of the intended applications makes (conceptual) database<br />
design a central task in their development. This task is governed by general requirements concerning<br />
the quality of databases, which must be free of redundancies, flexible with respect<br />
to future extensions, not limited to specific applications and achieve high performance.<br />
However, the data processed in the dialogues is far from satisfying these criteria, but<br />
gives rise to views.<br />
The development process has to be understood as a learning process, where not all requirements<br />
are known at the beginning. This requires the participation of the users, because they<br />
are the only ones who can judge the usefulness of proposed solutions. As a consequence<br />
the dialogue objects and hence the views on the data defined by them become the driving<br />
force in conceptual modelling. This should not be taken as an accident, but as a challenge.<br />
In this paper we present an integrated model on the basis of the object oriented datamodel<br />
(OODM) in [10]. This datamodel has been defined in the spirit of Beeri's fundamental idea<br />
concerning the conceptual separation of values and objects [2]. Values are provided by<br />
means of type systems consisting of base types and constructors [6]. Objects are provided<br />
by means of classes which combine complex value and reference structures, operations<br />
and inheritance. This approach to object orientation is quite different from the work in [3, 7]<br />
which focusses on methods for object oriented programming. In particular, it is easy to see that<br />
certain classes of OODM schemata with only flat acyclic reference structures are equivalent<br />
to schemata in the higher-order entity relationship model [13]. We give an outline of the<br />
datamodel in Section 10.2.<br />
The OODM has been extended by dialogue classes in [8, 12] in order to support the development<br />
of information systems as characterized above. These dialogue classes are defined<br />
analogously to classes in the datamodel, i.e. they provide structural and behavioural abstractions<br />
of dialogue objects as well as inheritance. The dialogue objects can then be handled in<br />
the same way as objects in the database, which turns the management of the dialogues into<br />
a database task. The relationship between the dialogue model and the datamodel is given by<br />
means of views. We present the dialogue model in Section 10.3.<br />
The development of user interfaces then reduces to the task of finding suitable representations<br />
of dialogue objects on the screen. For this purpose we propose the use of a general<br />
UIMS. In Section 10.4 we present a brief outline of representational means with respect to<br />
our integrated model.<br />
To that end, the work reported in this paper continues our previous work in [12]. With<br />
respect to that paper we now achieve some simplification concerning the definition of dialogue<br />
classes. This definition was first given independently from the datamodel and led to several<br />
additional notions such as selection classes, actions and different kinds of operations (selection,<br />
navigation, invocation, processing), and we already observed the relationship to views on the<br />
datamodel. Now this relationship is directly incorporated in the definition of dialogue classes.<br />
Furthermore, selection is enabled by exploiting uniqueness constraints in the datamodel that<br />
were introduced in handling the identification problem in OODBs [10], and navigation can be<br />
supported by references between dialogue classes. Finally, the variety of different operations<br />
can be simplified using the distinction between hidden and visible operations which is already<br />
present in the OODM. Actions then correspond to the head of visible operations, while some<br />
of their characteristics are shifted to presentations.<br />
With respect to the modelling method we think of a refinement-based approach as presented<br />
in [9, 11] for the OODM and extended to the dialogue model in [8], i.e. the data<br />
schema and the dialogue schema have to be developed in parallel, taking care of their<br />
interrelationships. This is in contrast to the work in [1, 5], where the starting point for user<br />
interface design is a complete entity-relationship schema. This topic will be briefly sketched<br />
in Section 10.5.<br />
10.2 The data layer<br />
In the object-oriented datamodel (OODM) [10] we distinguish between objects and values.<br />
Whereas values are common abstractions identified by themselves, objects depend on the<br />
particular application context and have to be encoded by object identifiers. In the OODM<br />
each object consists of a unique, immutable identifier, a set of describing values of possibly<br />
different types, references to other objects and operations associated with the object.<br />
10.2.1 Application-independent abstraction: types<br />
Types are used to describe immutable sets of values with (type-)operations predefined on<br />
them. Type systems are prescriptions for the syntax and semantics of permitted type definitions.<br />
Consider a type system that consists of some basic types, type constructors and a<br />
subtyping relation. Moreover, recursive types, i.e. types defined by equations, and predicative<br />
types, i.e. types defined by restricting formulae, are included.<br />
Base types used here are BOOL, NAT, INT, FLOAT, STRING, ID or $\bot$, where ID<br />
is an abstract identifier type without any non-trivial supertype and $\bot$ is the trivial type that<br />
is a supertype of every type.<br />
The type constructors used here are $e_1 \mid \dots \mid e_n$ (enumeration), $(a_1 : \alpha_1, \dots, a_n : \alpha_n)$<br />
(record), $\{\alpha\}$ (finite set), $[\alpha]$ (list), $\langle\alpha\rangle$ (bag) or $(a : \alpha) \cup (b : \beta)$ (union), where $\alpha_1, \dots, \alpha_n, \alpha, \beta$<br />
are already defined types, $e_1, \dots, e_n$ are constant values and $a_1, \dots, a_n, a, b$ are field selectors.<br />
We may use base types and constructors to define new types by nesting. If there is no<br />
confusion, the field selectors in record or union types may be omitted.<br />
The semantics of such types as sets of values is defined as usual. Moreover, we assume the<br />
standard operations on base types and on records, sets, bags, ... We omit the details here. A<br />
type $T$ is called proper iff the number of its parameters is 0. $T$ is called a value type iff there<br />
is no occurrence of ID in $T$. If $T'$ is a proper type occurring in a type $T$, then there exists a<br />
corresponding occurrence relation $o : T \times T' \to$ BOOL with $o(v_1, v_2) = \mathit{true}$ iff $v_2$ occurs<br />
in $v_1$ at the position indicated by the position of $T'$ in $T$.<br />
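To make the notions of nested type and value type more tangible, the following Python sketch represents type terms as small immutable data structures and tests whether ID occurs in a term. It is an illustration under simplifying assumptions: enumerations, type parameters, recursive and predicative types are left out, field selectors of unions are omitted, and all class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Base:
    name: str        # e.g. "NAT", "STRING", "ID" or "⊥"

@dataclass(frozen=True)
class Record:
    fields: tuple    # tuple of (selector, type) pairs

@dataclass(frozen=True)
class SetOf:
    elem: object

@dataclass(frozen=True)
class ListOf:
    elem: object

@dataclass(frozen=True)
class BagOf:
    elem: object

@dataclass(frozen=True)
class UnionOf:
    left: object     # field selectors omitted for brevity
    right: object

def components(t):
    """Immediate component types of a type term."""
    if isinstance(t, Record):
        return [ty for _, ty in t.fields]
    if isinstance(t, (SetOf, ListOf, BagOf)):
        return [t.elem]
    if isinstance(t, UnionOf):
        return [t.left, t.right]
    return []

def is_value_type(t):
    """A value type contains no occurrence of the identifier type ID."""
    if t == Base("ID"):
        return False
    return all(is_value_type(c) for c in components(t))

period = Record((("begin", Base("DATE")),
                 ("end", UnionOf(Base("DATE"), Base("⊥")))))
employee = Record((("name", Base("STRING")), ("boss", Base("ID"))))

print(is_value_type(ListOf(period)))   # True
print(is_value_type(SetOf(employee)))  # False
```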
A subtype function is a function T 0 ! T from a subtype to its supertype (T 0 T ) dened<br />
by the usual subtyp<strong>in</strong>g rules [6].<br />
Predicative types are used to restrict the set of values given by some type definition to<br />
a subset. Formally, a predicative type $T$ consists of an underlying type $T'$ and a formula $P$<br />
with exactly one free variable self of type $T'$. Clearly, the inclusion then gives a subtype<br />
function. In order to avoid inflationary use of quantifiers, other variables are also allowed to<br />
occur freely in such a formula. They are assumed to be universally quantified.<br />
Example 10.1. We define a type PERIOD and a predicative subtype COURSE of [PERIOD]:<br />
Type PERIOD = (begin : DATE, end : DATE ∪ ⊥)<br />
Where self.end ≠ ⊥ ⇒ self.begin ≤ self.end<br />
End PERIOD<br />
Type COURSE = [ PERIOD ]<br />
Where self = concat(L_1, [P_1, P_2 | L_2]) ⇒ P_2.end ≠ ⊥ ∧ P_2.end ≤ P_1.begin<br />
End COURSE<br />
$L_1$ and $L_2$ are lists with elements of type PERIOD, $P_1$ and $P_2$ are values of type PERIOD<br />
and 'concat' is the concatenation of two lists. Informally, the formula requires for any two<br />
successive periods the begin date of the first one to be later than the end date of the second<br />
one.<br />
□<br />
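As an illustration outside the OODM notation, the two restricting formulae can be sketched as Python predicates.<br />
This is a minimal sketch under stated assumptions: `None` stands for the trivial value ⊥, the comparison<br />
between dates is read as "not later than", and the names `Period`, `is_period` and `is_course` are ours.<br />

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Period:
    """Value of the underlying record type (begin : DATE, end : DATE ∪ ⊥)."""
    begin: date
    end: Optional[date] = None  # assumption: None plays the role of ⊥


def is_period(p: Period) -> bool:
    # Restricting formula of PERIOD: a defined end date must not precede begin.
    return p.end is None or p.begin <= p.end


def is_course(periods: List[Period]) -> bool:
    # Every list element has to be a PERIOD itself.
    if not all(is_period(p) for p in periods):
        return False
    # Restricting formula of COURSE: for successive elements P1, P2 the
    # element P2 is closed and ends no later than P1 begins.
    return all(
        p2.end is not None and p2.end <= p1.begin
        for p1, p2 in zip(periods, periods[1:])
    )
```

A list satisfying `is_course` is therefore ordered from the most recent period to the oldest one.<br />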
10.2.2 Combined structure and behaviour: classes<br />
The class concept provides the grouping of objects having the same structure and behaviour.<br />
Structurally, this uniformly combines aspects of object values and references. Behaviourally,<br />
this abstracts from operations on single objects, including their creation and deletion. In the<br />
OODM objects usually belong to more than one class.<br />
References between classes give rise to implicit referential constraints. In addition, subclasses<br />
(IsA-relationships) require each database instance to satisfy inclusion constraints on<br />
object identifiers. As usual in object oriented approaches, class operations are used to model<br />
the database dynamics. In the OODM these are associated with classes.<br />
Since identifiers can be represented using ID, values and references can be combined into<br />
a representation type, where each occurrence of ID denotes a reference to some other class.<br />
Therefore, we may define the structure of a class using parameterized types. Moreover, classes<br />
are arranged in IsA-hierarchies.<br />
More formally, if $T$ is a value type with parameters $\alpha_1, \dots, \alpha_n$ and if the parameters are<br />
replaced by pairs $r_i : C_i$ with a reference name $r_i$ and a class name $C_i$, the resulting expression<br />
is called a structure expression.<br />
Then a class consists of a class name $C$, a structure expression $S$, a set of class names<br />
$D_1, \dots, D_m$ (called superclasses) and a set of operations. We call $r_i$ the reference named $r_i$<br />
from class $C$ to class $C_i$. The type derived from $S$ by replacing each reference $r_i : C_i$ by the type<br />
ID is called the representation type $T_C$ of the class $C$. The type $U_C = (\mathit{ident} : \mathrm{ID}, \mathit{value} : T_C)$<br />
is called the class type of class $C$.<br />
Example 10.2. Let us consider a class Insurant for an insurance application.<br />
Class Insurant =<br />
Structure (insurance_number : NAT, name : NAME, address : ADDRESS,<br />
course_of_insurance : [ (kind : "self", begin : DATE,<br />
end : (date : DATE, reason : STRING) ∪ ⊥) ∪<br />
(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />
self : Self_Insurant, relation : "child" | "spouse") ])<br />
Operation ...<br />
End Insurant<br />
Class Self_Insurant =<br />
IsA Insurant<br />
Structure (employed_by : Company, account_no : NAT)<br />
Operation ...<br />
End Self_Insurant<br />
A period of insurance in this example is of one of two possible kinds: either the insurant is<br />
employed by a company and therefore pays his/her own fee, or (s)he is a family member of<br />
a self-insurant and has no income of his/her own.<br />
□<br />
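The step from a structure expression to the representation type and the class type can be made concrete<br />
by a small sketch. The encoding of type expressions below (`Base`, `Ref`, `Record`, `ListOf`) is our own<br />
assumption and covers only a few constructors; it is meant as an illustration, not as the OODM definition.<br />

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    name: str                      # base type such as "NAT", "DATE" or "ID"


@dataclass(frozen=True)
class Ref:
    cls: str                       # reference r : C (r is the enclosing field name)


@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, "TypeExpr"], ...]


@dataclass(frozen=True)
class ListOf:
    elem: "TypeExpr"


TypeExpr = Union[Base, Ref, Record, ListOf]


def representation_type(s: TypeExpr) -> TypeExpr:
    """T_C: replace every reference r : C in the structure expression by ID."""
    if isinstance(s, Ref):
        return Base("ID")
    if isinstance(s, Record):
        return Record(tuple((a, representation_type(t)) for a, t in s.fields))
    if isinstance(s, ListOf):
        return ListOf(representation_type(s.elem))
    return s


def class_type(s: TypeExpr) -> Record:
    """U_C = (ident : ID, value : T_C)."""
    return Record((("ident", Base("ID")), ("value", representation_type(s))))
```

For the structure of Self_Insurant, the reference employed_by : Company becomes a field of type ID.<br />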
10.2.3 Operations<br />
The OODM distinguishes between visible and hidden operations on classes to emphasize those<br />
that can be invoked by the user. However, all operations on a class, including the hidden ones,<br />
can be accessed by other operations. Such a weak hiding concept is justified by<br />
two reasons:<br />
- Visible operations serve as a means to specify (nested) transactions. In order to build<br />
sequences of database instances we only regard these transactions, assuming a linear invocation<br />
order on them.<br />
- Hidden operations can be used to handle identifiers. Since these identifiers do not have<br />
any meaning to the user, they must not occur within the input or output of a transaction.<br />
Each operation on a class $C$ consists of a signature and a body. The signature consists of<br />
an operation name $O$, a set of input-parameter/type pairs $\iota_i :: T_i$ and a set of output-parameter/type<br />
pairs $o_j :: T'_j$. The body is recursively built of the following constructs:<br />
- assignment $x := E$, where $x$ is the class variable $C$ of type $\{U_C\}$ or a local variable<br />
(including the output-parameters), and $E$ is an expression of the same type as $x$,<br />
- local variable declaration Let $x :: T$,<br />
- skip and fail,<br />
- sequencing $S_1 ; S_2$ and branching IF $P$ THEN $S_1$ ELSE $S_2$ ENDIF,<br />
- operation call $C'$ :- $O'$(in : $E'_1, \dots, E'_j$, out : $x'_1, \dots, x'_i$), where $O'$ is an operation on class<br />
$C'$ with compatible signature, and<br />
- non-deterministic selection of values New.f(x), where $f$ is a selector on the representation<br />
type of $C$; New Id selects a new identifier.<br />
An operation $O$ on a class $C$ is called value-defined iff all types occurring in its signature are<br />
proper value types. As already mentioned, we require each visible operation to be value-defined.<br />
Subclasses inherit the operations of their superclasses. Overriding is allowed as long as the<br />
new operation is a specialization of all its corresponding operations in its superclasses;<br />
we dispense with a formal discussion of operational specialization.<br />
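The requirement that visible operations be value-defined can be checked mechanically on signatures.<br />
The following sketch assumes a deliberately simple encoding: a type is either a base-type name or a tuple<br />
(constructor, component, ...). Properness (absence of parameters) is not modelled, and the operation<br />
names in the usage are hypothetical.<br />

```python
from dataclasses import dataclass
from typing import Dict


def is_value_type(t) -> bool:
    """True iff ID does not occur in t, e.g. "NAT" or ("set", "NAT")."""
    if isinstance(t, str):
        return t != "ID"
    return all(is_value_type(component) for component in t[1:])


@dataclass
class Operation:
    name: str
    inputs: Dict[str, object]     # input-parameter/type pairs
    outputs: Dict[str, object]    # output-parameter/type pairs
    visible: bool = True

    def is_value_defined(self) -> bool:
        types = list(self.inputs.values()) + list(self.outputs.values())
        return all(is_value_type(t) for t in types)


def respects_hiding(op: Operation) -> bool:
    # Visible operations must be value-defined; hidden ones may use identifiers.
    return op.is_value_defined() or not op.visible
```

A hidden operation that returns an identifier passes the check, whereas the same signature on a visible<br />
operation would be rejected.<br />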
10.2.4 OODM schemata and instances<br />
A database schema $S$ is given by a finite collection of type and class definitions such that<br />
all types, classes and operations occurring within type definitions, structure definitions and<br />
operations are defined in $S$.<br />
At any time, a class represents a finite set of objects. More precisely, this is captured by the<br />
notion of an instance (or database state). For a closed schema $S$ an instance $\mathcal{D}$ assigns to each<br />
class $C$ a value $\mathcal{D}(C)$ of type $\{(\mathit{ident} : \mathrm{ID}, \mathit{value} : T_C)\}$ such that the following conditions<br />
are satisfied:<br />
- For each class $C$ identifiers must be unique.<br />
- The set of identifiers in a subclass $C$ is a subset of the one in the superclass $C'$. Moreover,<br />
if $T_C \leq T_{C'}$ with subtype function $f : T_C \to T_{C'}$, then $(i, v) \in \mathcal{D}(C) \Rightarrow (i, f(v)) \in \mathcal{D}(C')$<br />
holds.<br />
- For each reference $r$ from $C$ to $D$, identifiers $j$ occurring in a value $v$ of an object in $C$<br />
with respect to the occurrence relation $o_r$, i.e. $(i, v) \in \mathcal{D}(C)$ and $o_r(v, j)$ hold, must occur<br />
in $\mathcal{D}(D)$.<br />
Basic update operations, i.e. insertion, deletion and update of a single object in a class $C$,<br />
cannot always be derived in the object-oriented case, because the abstract identifiers have<br />
to be hidden from the user. However, in [10] it has been shown that for value-representable<br />
classes these operations are uniquely determined by the schema and consistent with respect<br />
to the implicit referential and inclusion constraints.<br />
Value-representability of all classes in a closed schema is implied if we have a (trivial)<br />
uniqueness constraint for each class. Such a constraint requires the values of type $T_C$ in the<br />
class extension $C$ to be unique.<br />
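The three conditions on instances stated in this subsection (unique identifiers, inclusion along IsA,<br />
referential integrity) can be sketched as executable checks. We assume here that an instance is a<br />
dictionary from class names to lists of (ident, value) pairs, that the subtype function and the<br />
occurrence relation are passed in as Python functions, and that all sample data are made up.<br />

```python
from typing import Callable, Dict, Iterable, List, Tuple

Instance = Dict[str, List[Tuple[int, dict]]]   # class name -> [(ident, value)]


def unique_identifiers(inst: Instance, c: str) -> bool:
    ids = [i for i, _ in inst[c]]
    return len(ids) == len(set(ids))


def inclusion_holds(inst: Instance, sub: str, sup: str,
                    f: Callable[[dict], dict]) -> bool:
    # (i, v) in D(sub) implies (i, f(v)) in D(sup); f is the subtype function.
    return all((i, f(v)) in inst[sup] for i, v in inst[sub])


def references_resolve(inst: Instance, c: str, d: str,
                       occurs: Callable[[dict], Iterable[int]]) -> bool:
    # occurs(v) yields the identifiers j with o_r(v, j) for a reference r: c -> d.
    targets = {i for i, _ in inst[d]}
    return all(j in targets for _, v in inst[c] for j in occurs(v))
```

An instance is admissible in this simplified reading if all three checks succeed for every class,<br />
every IsA-pair and every reference of the schema.<br />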
10.3 The dialogue layer<br />
Object orientation within dialogue systems means entering or selecting values on the screen and<br />
invoking actions on them. The dialogue system reacts by offering other data or by activating<br />
and deactivating entries in selection lists or possible actions in the action bar [4]. We call<br />
such a collection of data and possible actions a dialogue object (d-object). In graphical user<br />
interfaces d-objects are normally presented in a window.<br />
Users invoke actions to change data in the database, to navigate to another, possibly<br />
new, dialogue object or to a modified presentation of the same dialogue object. Depending<br />
on selections or entries made in a d-object, only a part of the possible actions is allowed.<br />
The processing of an action may require further preconditions depending on the state of the<br />
dialogue system, especially on other users' d-objects.<br />
A dialogue object consists of a unique abstract identifier, a set of values $v_i$ in associated<br />
fields $F_1, \dots, F_n$ which correspond to describing values of objects, a set of references to other<br />
dialogue objects in order to allow quick navigational access, a set of actions to change the<br />
data and to control the dialogue, and a state with the values 'active' and 'inactive'. This<br />
means that dialogue objects only exist as long as they are visible on the screen.<br />
If a window is closed, the corresponding dialogue object is deleted.<br />
The identifier serves to administer the dialogue objects. It is not known to the user,<br />
cannot be used by him or her and is not visible. Only the active d-object allows manipulations of<br />
the represented data and only its actions can be invoked.<br />
10.3.1 Views in the datamodel<br />
Roughly speaking, a view may be regarded as a stored query. In the relational datamodel queries<br />
can be expressed by terms in relational algebra. This can be generalized to the OODM using<br />
its type system. Then a query turns out to be represented by a term $t$ over some type $T$ such<br />
that the free variables of $t$ represent the classes.<br />
Since objects employ identifiers, we have to distinguish between queries that result in<br />
values and those that result in (collections of) objects. Therefore we distinguish in the OODM<br />
between value queries and general access expressions. For a value query the type $T$ of the<br />
defining term $t$ must be a value type.<br />
This allows terms $t$ to be built which involve only identifiers already existing in the<br />
database. Thus, such queries are called object preserving. If we want the result of a query to<br />
represent 'new' objects, i.e. if we want to have object generating queries, we have to apply a<br />
mechanism to create new object identifiers. This can be achieved by object creating functions<br />
on the type ID with arity $\mathrm{ID} \times \dots \times \mathrm{ID} \to \mathrm{ID}$ [10].<br />
The idea that a view is a stored query then carries over easily. Thus, a view on the<br />
schema $S$ consists of a view name $v \in N_C$ such that there is no class $C$ with this name, a<br />
structure expression $S(v)$ containing references to classes in $S$ or to views on $S$ and a defining<br />
access expression¹ $t(v)$ of type $\{(\mathit{ident} : \mathrm{ID}, \mathit{value} : T_v)\}$, where $T_v$ is the representation type<br />
corresponding to $S(v)$.<br />
¹ Assume for the moment that view definitions do not contain recursive definitions.<br />
Example 10.3. Let us give a sample view on the schema of Example 10.2:<br />
View Course_of_Insurant =<br />
Structure<br />
[ (kind : "self", begin : DATE,<br />
end : (date : DATE, reason : STRING) ∪ ⊥,<br />
fams : { (id : Insurant, name : NAME, relation : "child" | "spouse",<br />
begin : DATE, end : DATE ∪ ⊥) } ) ∪<br />
(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />
self : (id : Insurant, name : NAME,<br />
begin : DATE, end : DATE ∪ ⊥)) ]<br />
Definition<br />
{ (i, course) | ∃ cou . (i, cou) ∈ Insurant ∧<br />
course = [ p ‖ ∃ c ∈ cou.course_of_insurance .<br />
p.kind = c.kind ∧ p.begin = c.begin ∧ p.end = c.end ∧<br />
( c.kind = "self" ⇒ p.fams = { (j, fam) | ∃ cou' .<br />
(j, cou') ∈ Insurant ∧ fam.name = cou'.name ∧<br />
("fam", fam.begin, fam.end, i, fam.relation) =<br />
cou'.course_of_insurance.first ∧<br />
p.begin ≤ fam.begin ∧<br />
(p.end ≠ ⊥ ∧ fam.end ≠ ⊥ ⇒ fam.end ≤ p.end) } ) ∧<br />
( c.kind = "fam" ⇒ ∃ (k, cou'') ∈ Insurant . c.self = k ∧<br />
p.self.name = cou''.name ∧ p.self.begin ≤ p.begin ∧<br />
("self", p.self.begin, p.self.end) ∈ cou''.course_of_insurance ∧<br />
(p.self.end ≠ ⊥ ∧ p.end ≠ ⊥ ⇒ p.end ≤ p.self.end) ) ] }<br />
End Course_of_Insurant<br />
This view contains the course of insurance of one concrete insurant. Together with a period<br />
of kind 'self' in that course there are also the latest insurance periods of the family members.<br />
Together with a period of kind 'fam' there is also the period of the insurant to whose family<br />
the related insurant belongs.<br />
□<br />
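To illustrate the reading of a view as a stored query, the following sketch evaluates a strongly simplified<br />
variant of the defining expression over a small, made-up extension of Insurant. The date conditions of the<br />
original definition are omitted, dates are plain strings, `None` stands for ⊥, and the dictionary layout<br />
is our assumption. The point is only that the result reuses the identifiers of Insurant.<br />

```python
from typing import Dict, List, Tuple

# D(Insurant) as identifier -> value; sample data for illustration only.
insurant: Dict[int, dict] = {
    1: {"name": "Neumann, Luise",
        "course": [{"kind": "self", "begin": "1979-04-01", "end": None}]},
    2: {"name": "Neumann, Marga",
        "course": [{"kind": "fam", "begin": "1984-02-13", "end": None, "self": 1}]},
}


def course_of_insurant(db: Dict[int, dict]) -> List[Tuple[int, list]]:
    """Stored query returning (ident, value) pairs over existing identifiers."""
    result = []
    for i, cou in db.items():
        course = []
        for c in cou["course"]:
            p = {"kind": c["kind"], "begin": c["begin"], "end": c["end"]}
            if c["kind"] == "self":
                # family members whose first listed period refers to insurant i
                p["fams"] = [
                    {"id": j, "name": member["name"]}
                    for j, member in db.items()
                    if member["course"] and member["course"][0].get("self") == i
                ]
            else:
                k = c["self"]
                p["self"] = {"id": k, "name": db[k]["name"]}
            course.append(p)
        result.append((i, course))
    return result
```

Since every identifier in the result already occurs in the database, this simplified view is object<br />
preserving in the sense introduced above.<br />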
10.3.2 Dialogue classes<br />
Dialogue classes serve to group dialogue objects with the same structure and behaviour. As<br />
with objects we may use the type ID to represent identifiers of d-objects and combine values<br />
and references in a structure expression, now also containing references to other d-classes. This<br />
can be described by a view as defined in the previous subsection. In addition, there should be<br />
a visual type which describes the data shown on the screen. This should be a supertype of the<br />
representation type corresponding to the structure expression of the defining view. Then the<br />
content of a d-object may be split over more than one d-class, which leads to the introduction<br />
of super-d-classes. Actions can be expressed by d-operations.<br />
Thus, a dialogue class (d-class) consists of a unique name $DC$, a set of names $DC_1, \dots,$<br />
$DC_n$ of d-classes (called the super-d-classes of $DC$), a defining view with a structure expression<br />
$DT'_{DC}$ and a content definition $\mathit{def}_{DC}$, a value type $DT_{DC}$ which is a supertype of the<br />
representation type $T'_{DC}$ corresponding to $DT'_{DC}$ and a set of d-operations. We call $DT'_{DC}$ the<br />
content structure, $T'_{DC}$ the content type and $DT_{DC}$ the visual type of the d-class $DC$.<br />
Example 10.4. We give a part of the formal definition of a d-class corresponding to the view<br />
in Example 10.3:<br />
Dialogue class Course_of_insurance<br />
IsA IIP<br />
Visual<br />
[ (kind : "self", begin : DATE, end : (date : DATE, reason : STRING) ∪ ⊥,<br />
fams : { (name : NAME, relation : "child" | "spouse", begin : DATE,<br />
end : DATE ∪ ⊥) } ) ∪<br />
(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />
self : (name : NAME, begin : DATE, end : DATE ∪ ⊥)) ]<br />
Content<br />
[ (kind : "self", begin : DATE, end : (date : DATE, reason : STRING) ∪ ⊥,<br />
fams : { (id : Insurant, name : NAME, relation : "child" | "spouse",<br />
begin : DATE, end : DATE ∪ ⊥) } ) ∪<br />
(kind : "fam", begin : DATE, end : DATE ∪ ⊥,<br />
self : (id : Insurant, name : NAME, begin : DATE, end : DATE ∪ ⊥)) ]<br />
Definition ...<br />
Operations ...<br />
End Course_of_insurance<br />
For the definition we refer to the view presented in Example 10.3.<br />
□<br />
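In Example 10.4 the visual type differs from the content type only by the missing id components, so the<br />
subtype function from content values to visual values is a projection. A minimal sketch, assuming that<br />
records are represented as Python dictionaries and lists or sets as Python lists (the function name<br />
`to_visual` is ours):<br />

```python
from typing import Any


def to_visual(content: Any, hidden: frozenset = frozenset({"id"})) -> Any:
    """Project a content value onto its visual supertype by dropping the
    identifier-valued record fields and keeping everything else."""
    if isinstance(content, dict):
        return {a: to_visual(v, hidden) for a, v in content.items() if a not in hidden}
    if isinstance(content, list):
        return [to_visual(v, hidden) for v in content]
    return content
```

The identifiers thus stay available inside the d-object, e.g. for navigation, without ever being shown<br />
on the screen.<br />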
10.3.3 Operations on d-classes<br />
If a user selects an action associated with an active d-object, (s)he initiates changes to that<br />
d-object including its deletion, the creation of a new d-object, modifications to the underlying<br />
database or switches to other d-objects. This is modelled by the d-operations on d-classes.<br />
As with the datamodel we distinguish between visible and hidden d-operations. Only<br />
visible d-operations are accessible by user actions, whereas hidden d-operations can only<br />
be called from other d-operations. In contrast to the datamodel, the access to (visible) d-operations<br />
may be restricted by preconditions that express the status of a d-object by means<br />
of selected or non-selected parts. Such preconditions are given by supertypes of the visual type<br />
$DT_{DC}$ of the d-class $DC$.<br />
Thus, a d-operation consists of a signature, a selection type and a body. The signature is the<br />
same as for classes in the datamodel. This also applies to the body, with the differences that<br />
operations to be called can be d-operations on d-classes as well as operations on classes, and that<br />
assignments are not allowed. In this way we circumvent the update problem for views.<br />
Then, by analogy to the datamodel, we require visible d-operations to be value-defined.<br />
Sub-d-classes inherit d-operations from their super-d-classes, and overriding is restricted to<br />
specialization.<br />
Example 10.5. The following defines a d-operation on the d-class Course_of_insurance<br />
of Example 10.4:<br />
New_Insurant [(fams : (name : NAME)) ∪ (self : (name : NAME)) ∪ ⊥]<br />
(in : ⊥, out : ⊥)<br />
System :- save (in : cont) ;<br />
Let ins :: ID ;<br />
Course_of_insurance :- Select (in : sel, out : ins) ;<br />
Course_of_insurance :- Delete (in : cont.ident) ;<br />
Course_of_insurance :- Invoke (in : ins)<br />
End New_Insurant<br />
Here the selection supertype has been put into brackets. Furthermore, we used two standard<br />
variables: cont of type $(\mathit{ident} : \mathrm{ID}, \mathit{value} : T'_{DC})$ for the identifier/content pair of the current<br />
d-object and sel for the selected values with respect to the selection type.<br />
This d-operation New_Insurant stores the current data in the database, retrieves the identifier<br />
of the selected insurant, deletes the current d-object and creates a new one associated<br />
with the course of insurance of the selected insurant.<br />
For that purpose we used calls to a d-operation save defined on the d-class System (assume<br />
this has been defined as a super-d-class of IIP) and to generic d-operations on Course_of_insurance<br />
for the deletion and creation of d-objects. We shall investigate genericity below.<br />
□<br />
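The control flow of New_Insurant can be mimicked by a small sketch. Everything in it is an assumption<br />
made for illustration: the class `Dialogue`, the table `ids_by_name` that stands in for value<br />
identification, and the treatment of the selection type as a precondition on `sel`. The ⊥ alternative<br />
of the selection type is not modelled.<br />

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DObject:
    ident: int
    insurant: int                 # identifier of the insurant being shown
    content: dict = field(default_factory=dict)


class Dialogue:
    def __init__(self, ids_by_name: Dict[str, int]) -> None:
        self.ids_by_name = ids_by_name      # hypothetical value identification
        self.active: Dict[int, DObject] = {}
        self.saved: List[dict] = []
        self._fresh = 0

    def save(self, cont: DObject) -> None:          # System :- save
        self.saved.append(cont.content)

    def select(self, sel: dict) -> int:             # generic Select
        (entry,) = sel.values()
        return self.ids_by_name[entry["name"]]

    def delete(self, ident: int) -> None:           # generic Delete
        del self.active[ident]

    def invoke(self, ins: int) -> DObject:          # generic Invoke
        self._fresh += 1
        d = DObject(self._fresh, ins)
        self.active[d.ident] = d
        return d

    def new_insurant(self, cont: DObject, sel: dict) -> DObject:
        # Selection type as precondition: one 'fams' or 'self' entry with a name.
        if len(sel) != 1 or not set(sel) <= {"fams", "self"}:
            raise ValueError("selection does not match the selection type")
        self.save(cont)
        ins = self.select(sel)
        self.delete(cont.ident)
        return self.invoke(ins)
```

The body of `new_insurant` consists of calls only and contains no assignment to class extensions, in line<br />
with the restriction on d-operation bodies stated above.<br />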
10.3.4 The dialogue management level<br />
The notions of d-schema and d-instance generalize the corresponding notions for the datamodel.<br />
A d-schema is a finite collection of type, class and d-class definitions that do not<br />
contain undefined types, classes, operations, d-classes or d-operations occurring in structure<br />
expressions, references, superclasses, signatures or calls. In particular, each d-schema $DS$ has<br />
an underlying OODM schema $S$.<br />
Then an instance $\mathcal{D}$ of $S$ already defines the contents of the views underlying d-classes in<br />
$DS$ and hence also the contents with respect to the visual types. For a d-class $DC$ we write<br />
$\mathcal{D}(DC)$ for the value of type $\{(\mathit{ident} : \mathrm{ID}, \mathit{value} : T'_{DC})\}$ defined by the content definition part<br />
$\mathit{def}_{DC}$ on the instance $\mathcal{D}$. We call $\mathcal{D}(DC)$ the set of possible d-objects in d-class $DC$ with<br />
respect to the instance $\mathcal{D}$.<br />
However, we want to associate with a d-instance the set of actual (active) d-objects in<br />
d-class $DC$. This should lead to subsets of $\mathcal{D}(DC)$ satisfying conditions analogous to those<br />
required for instances $\mathcal{D}$.<br />
Thus, a d-instance $\mathcal{DD}$ for a d-schema $DS$ consists of an instance $\mathcal{D}$ of the underlying<br />
OODM schema $S$ and a mapping $\mathcal{D}_{act}$ which assigns to each d-class $DC \in DS$ a set $\mathcal{D}_{act}(DC)$<br />
such that the uniqueness of identifiers, the inclusion integrity and the referential integrity (as<br />
defined for instances) hold.<br />
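A d-instance can be checked along the same lines as an ordinary instance. The sketch below works on<br />
identifiers only and is based on our own simplifying assumptions: possible and actual d-objects are given<br />
as sets of identifiers per d-class (which makes uniqueness trivial), `super_d` lists the super-d-classes,<br />
and `refs` maps a d-object to the d-objects it references.<br />

```python
from typing import Dict, Set, Tuple

DRef = Tuple[str, int]   # (d-class name, identifier)


def check_d_instance(possible: Dict[str, Set[int]],
                     active: Dict[str, Set[int]],
                     super_d: Dict[str, Set[str]],
                     refs: Dict[DRef, Set[DRef]]) -> bool:
    for dc, ids in active.items():
        if not ids <= possible[dc]:            # actual d-objects are possible ones
            return False
        for sup in super_d.get(dc, set()):     # inclusion integrity
            if not ids <= active[sup]:
                return False
    # referential integrity: d-objects referenced by active ones are active too
    return all(j in active[d]
               for (c, i), targets in refs.items() if i in active[c]
               for d, j in targets)
```

In this reading, closing a window removes an identifier from the active sets and is admissible only if<br />
no remaining active d-object still refers to it.<br />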
10.3.5 The impact of genericity: selection, invocation, navigation, deletion<br />
As for classes in the datamodel we may ask for generic operations on d-classes. Since possible d-objects<br />
are already determined by instances, generic operations for d-classes can only provide<br />
the deletion of actual d-objects or the invocation of another d-object. Note that the latter<br />
case comprises the navigation to an existing (active) d-object as well as the creation of a new<br />
one (in $\mathcal{D}_{act}(DC)$) combined with a switch to it.<br />
Since sets of actual d-objects behave like sets of ordinary OODM objects, we may exploit<br />
the identification theory of the OODM in [10] to generate these generic operations. Even<br />
simpler, due to the definition via views we only need a value identification for d-objects.<br />
Recall that such a value identification is given by uniqueness constraints for all classes.<br />
Since subtyping can be easily extended to structure expressions, such a uniqueness constraint<br />
on a class $C$ with structure expression $S_C$ is simply given by a super-structure-expression $S^C$.<br />
If the representation type $T^C$ of $S^C$ is a value type, then this means to determine a unique<br />
object in $\mathcal{D}(C)$, i.e. its identifier, from a given value of type $T^C$ (if such an object exists at<br />
all). If $S^C$ contains a reference, we first have to identify a referenced object, i.e. to determine<br />
its identifier from a given value of some value type.<br />
Thus value identification gives rise to a generic select-operation which may call select-operations<br />
on other classes. If there are several uniqueness constraints, we may combine the<br />
required input types using the union constructor.<br />
Example 10.6. For the class Insurant in Example 10.2 we may obtain a select-operation<br />
with the signature<br />
select (in : sel :: (Isn : NAT) ∪ (name : NAME, date_of_birth : DATE, address : ADDRESS),<br />
out : i :: ID)<br />
Since the view in Example 10.3 is object preserving and the d-class in Example 10.4 is defined<br />
on top of this view, the operation carries over to a select operation for a d-object (used in<br />
Example 10.5).<br />
□<br />
As seen in Example 10.6, value identification gives rise to a generic select-operation for d-classes<br />
defined by object preserving views. In this case the delete- and invoke-operations<br />
can be split into a selection part, i.e. a call to the select-operation, and a simpler delete- or<br />
invoke-operation with an input of type ID (the identifier of the d-object).<br />
In the case of an object generating view the new objects depend on others and we may<br />
obtain a generic select-operation by first selecting these other objects. E.g., in our insurance<br />
application this applies to a d-class in which each period of an insurant is turned into a<br />
separate object.<br />
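How uniqueness constraints yield a generic select-operation can be sketched as follows. We assume that<br />
a class extension is a dictionary from identifiers to record values and that each uniqueness constraint<br />
is given as a tuple of field names; the function `make_select` and the sample data are hypothetical.<br />

```python
from typing import Callable, Dict, Optional, Sequence, Tuple


def make_select(extension: Dict[int, dict],
                keys: Sequence[Tuple[str, ...]]) -> Callable[[dict], Optional[int]]:
    """Build a select-operation whose input type is the union of the key types."""
    def select(sel: dict) -> Optional[int]:
        for key in keys:
            if set(sel) == set(key):          # sel belongs to this union alternative
                hits = [i for i, v in extension.items()
                        if all(v[a] == sel[a] for a in key)]
                # The uniqueness constraint guarantees at most one hit.
                return hits[0] if len(hits) == 1 else None
        raise TypeError("sel matches no alternative of the input type")
    return select
```

The returned identifier never has to be shown to the user; it is only passed on to the simpler delete- or<br />
invoke-operation.<br />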
10.4 The presentation layer<br />
The handling of a dialogue system is best performed using a User Interface Management<br />
System (UIMS). Such a system provides (among other features)<br />
- windows and operations to open and close them, to move them on the screen, to scroll,<br />
to change their size etc.,<br />
- several representations of data, such as selection lists or buttons, text entry fields etc.,<br />
- a main menu where all dialogues start, often called the operation desk.<br />
10.4.1 Presentation of dialogue classes<br />
For each d-class there is at least one representation on the screen. Normally there are actions<br />
with which the representation of the d-class on the screen can be modified without changing<br />
the state of the d-class. The representation of the d-class is given by the UIMS. The concrete<br />
description therefore depends on its functionality.<br />
Visual values are associated with fields consisting of a relation to a component of the<br />
content type of a d-class, field attributes like 'protected' / 'unprotected', 'normal' / 'emphasized',<br />
..., the type of the field (text entry field, selection field, ...), a selection state with<br />
the values 'selected' and 'unselected', the information whether data have been entered in a<br />
field or not, the information where the cursor is placed and an optional name of the field.<br />
System History Options Windows<br />
Course of the Insurance<br />
1133557 Neumann, Luise 10.11.1948 273<br />
+ more information about the insurant +<br />
Kind Begin End Reason of End<br />
Name Relation Begin End<br />
self 01.04.1979<br />
Neumann, Marga child 13.02.1984<br />
Neumann, Horst child 27.04.1986<br />
fam 10.11.1976 31.03.1979<br />
Meier-Neumann, Fritz 01.01.1975<br />
self 01.10.1967 09.11.1976 Too old as student<br />
fam 10.11.1948 30.09.1967<br />
Neumann, Wilhelm 01.01.1919 16.08.1990<br />
Fig. 10.1. The presentation of a d-object<br />
Fields may be grouped together. Further properties of fields depend on the features of<br />
the UIMS. For each field there is at least one representation on the screen comprising a<br />
declaration of its length, its style of emphasis and its style of representation of protection.<br />
For each representation there is also a representation of the selection state of the field.<br />
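The description of fields and their screen representations can be read as a small record structure. The following Python fragment is a minimal sketch under stated assumptions: the names `Field`, `FieldKind` and `FieldRepresentation`, all attribute names, the component path `insurant.name` and the style labels are hypothetical and belong neither to the model nor to any particular UIMS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FieldKind(Enum):
    """Type of a field; the text names these two and leaves the list open."""
    TEXT_ENTRY = "text entry field"
    SELECTION = "selection field"


@dataclass
class Field:
    """Visual value attached to one component of a d-class content type."""
    component: str                  # hypothetical path into the content type
    kind: FieldKind
    protected: bool = False         # `protected' / `unprotected'
    emphasized: bool = False        # `normal' / `emphasized'
    selected: bool = False          # selection state
    data_entered: bool = False      # whether data have been entered
    has_cursor: bool = False        # whether the cursor is placed here
    name: Optional[str] = None      # optional name of the field


@dataclass(frozen=True)
class FieldRepresentation:
    """One screen representation of a field; styles are plain labels here."""
    length: int
    emphasis_style: str
    protection_style: str
    selection_style: str


# The content (state) of a d-object is kept apart from its visual values.
content = {"insurant.name": "Neumann, Luise"}
name_field = Field(component="insurant.name", kind=FieldKind.TEXT_ENTRY,
                   protected=True, name="insurant name")
name_repr = FieldRepresentation(length=30, emphasis_style="normal",
                                protection_style="greyed",
                                selection_style="inverse")

# Selecting the field changes only the presentation, not the content.
name_field.selected = True
```

Keeping `content` separate from `Field` mirrors the remark that the representation on the screen can be modified without changing the state of the d-class.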
Example 10.7. The presentation of a d-object in the class Course of insurance consists<br />
of three parts corresponding to the d-classes Course of insurance, IIP and System (see<br />
Figure 10.1):<br />
– The `insurant information part (IIP)' is part of most d-objects and gives an overview<br />
of the insurant.<br />
– Besides the IIP the d-object contains a list of insurance periods. Each period is represented<br />
by a group of lines of which the first line contains the kind (self or as family member of<br />
another insurant), the begin and the end of the period. For periods of kind `self' several<br />
lines (possibly none) follow with names of family members, the relation of the family member<br />
to the insurant and begin and end of the latest insurance period of the family member.<br />
For periods of kind `fam' one line follows with the name and the insurance period of the<br />
insurant to whose family the member belongs.<br />
– The last line is used for messages and originates from the d-class System. □<br />
Besides the d-classes which are invoked by the user there are dialogue boxes, in which data can<br />
be entered and processed [4]. Dialogue boxes are called by operations of d-classes if further<br />
data are needed to finish an operation.<br />
10.4.2 Presentation of actions<br />
The user uses actions to change the data on the screen and to control the dialogue. These<br />
actions correspond to the d-operations in the d-classes of the d-object and therefore consist<br />
of a name used in the action bar, a shortcut symbol with which the action can alternatively<br />
be invoked, and a selection criterion.<br />
The name of the action is the name of the corresponding d-operation, but names of menus<br />
may be added if necessary. The selection criterion is given by the fields that may or must be<br />
selected before invoking the action. It corresponds to the selection type of the corresponding<br />
d-operation. Invoking an action means executing the body of the corresponding d-operation.<br />
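The three parts of an action (name, shortcut, selection criterion) and the rule that invoking an action executes the body of the corresponding d-operation can be sketched as follows. This is only an illustration under stated assumptions: the classes `SelectionCriterion` and `Action`, the split into `must` and `may` field sets, and the strings returned by the bodies are hypothetical. The behaviour of `New Insurant` and the shortcut `Esc` for `Cancel` follow the descriptions given in the text.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class SelectionCriterion:
    """Fields that must or may be selected before an action is invoked."""
    must: FrozenSet[str] = frozenset()
    may: FrozenSet[str] = frozenset()

    def admits(self, selected: FrozenSet[str]) -> bool:
        # Every mandatory field is selected, and nothing outside must/may is.
        return self.must <= selected <= (self.must | self.may)


@dataclass(frozen=True)
class Action:
    name: str                                   # name shown in the action bar
    body: Callable[[FrozenSet[str]], str]       # body of the d-operation
    criterion: SelectionCriterion = field(default_factory=SelectionCriterion)
    shortcut: Optional[str] = None              # alternative way to invoke

    def invoke(self, selected: FrozenSet[str] = frozenset()) -> str:
        if not self.criterion.admits(selected):
            raise ValueError(f"selection not admissible for {self.name!r}")
        return self.body(selected)


def new_insurant_body(selected: FrozenSet[str]) -> str:
    # Without a selection a dialogue box for the search is activated.
    if not selected:
        return "open dialogue box: search for insurant"
    # For this illustration, take the alphabetically first selected name.
    return "show course of insurance of " + sorted(selected)[0]


new_insurant = Action(
    "New Insurant",
    new_insurant_body,
    SelectionCriterion(may=frozenset({"Neumann, Marga", "Neumann, Horst"})),
)
cancel = Action("Cancel", lambda selected: "discard changes", shortcut="Esc")
```

Calling `new_insurant.invoke()` without a selection takes the dialogue-box branch, while `new_insurant.invoke(frozenset({"Neumann, Marga"}))` takes the branch that shows another course of insurance.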
Example 10.8. Let us explain some actions associated with the d-object in Figure 10.1:<br />
– `History' shows earlier states of the course of the insurance.<br />
– `System' and `Options' are pull-down menus (omitted in the example). E.g., `System'<br />
contains the following actions: New Insurant, Save, Cancel (Esc), Save and Quit (F3),<br />
Scroll Forward (Bild↓), Scroll Back (Bild↑), Desk (Strg + F4).<br />
– `New Insurant' saves the data on the screen and shows the course of insurance of another<br />
insurant, who can be selected in the list of periods. If no insurant is selected, a dialogue<br />
box with entry fields for the search for a new insurant is activated.<br />
– `Save' saves the changes to the data on the screen and shows the same dialogue object<br />
again.<br />
– `Cancel' deletes the dialogue object and returns to the one which was active before or<br />
to the desk. Changes made to the data are discarded.<br />
– `Windows' is a pull-down menu containing the list of all existing dialogue objects. It is<br />
offered by the UIMS and not described here. □<br />
10.5 Development Methods<br />
In the previous sections we presented an integrated data and dialogue model for conceptual<br />
modelling, but we did not investigate how to use this model in practice. Due to space limitations,<br />
the following presentation of development methods will be rather sketchy. We indicate<br />
the power of the chosen model on the basis of two scenarios. The first one captures the case<br />
of a new system to be designed, the second one handles the case of changing or extending a<br />
working application. In both cases, we concentrate on the conceptual level.<br />
10.5.1 Designing a New Application<br />
In designing a new application we have to define types, classes and d-classes from scratch. We<br />
assume that purely presentational aspects are captured by the use of a UIMS. The method we<br />
propose assumes an almost monotonic growth of application knowledge by means of interviews<br />
with the intended users and analysis of the working processes to be supported. The first goal<br />
will be to gather the basic activities of the users and to outline the corresponding dialogues.<br />
At this level, representational aspects naturally come into play by means of restrictions on<br />
screens, facilities of the UIMS and basic hardware and software.<br />
From a more abstract point of view this means starting with dialogue objects and<br />
abstracting to dialogue classes without knowing the underlying data model schema. E.g., we<br />
could decide to have a presentation of dialogue objects and actions (grouped into menus) as<br />
in Figure 10.1. The simplest way to obtain a first conceptual data schema is to take the view<br />
definition as trivial, i.e., the content type of the view coincides with the representation type<br />
of a class. Note that references only occur if we decided to split the data in the presentation<br />
among several dialogue objects.<br />
As to the methods, we may either postpone them for later specification or define them<br />
on the basis of this first schema. The first alternative is generally recommended as long as<br />
the database schema is not stable. Thus, the first development step results in a schema with<br />
certain undefined types, classes and references and with redundancies concerning the data<br />
schema.<br />
Usually there is more than one such dialogue class; otherwise we would be done. Hence there<br />
will be several dependencies among the classes of the schema. The second step will be to<br />
make these dependencies explicit by the definition of constraints. Then the third step is to<br />
refine the schema in order to shift as many of these dependencies as possible into structures.<br />
Refinement rules for this purpose have been extensively discussed in [8, 9, 11]. These comprise<br />
– the splitting of classes, thereby introducing references or IsA-relations,<br />
– the introduction of new classes by specialization, either omitting the old class or introducing an<br />
IsA-relation,<br />
– the extension of the schema by new types, classes, d-classes etc. and<br />
– the completion of the schema by adding definitions to undefined components.<br />
All these refinement steps can be reversed. Applying one of them requires corresponding changes<br />
to the constraints and the methods defined so far. Furthermore, each refinement guarantees<br />
that classes of the old schema become views on the new schema, which in turn shows how to<br />
achieve complete d-classes.<br />
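The claim that a class of the old schema becomes a view on the refined schema can be illustrated for the first refinement rule, the splitting of a class with the introduction of a reference. The Python fragment below is a minimal sketch under stated assumptions: the class and attribute names are hypothetical, reading 1133557 in Figure 10.1 as an insurant number is an assumption, and plain Python functions stand in for view definitions.

```python
from dataclasses import dataclass
from typing import List, Tuple


# First schema: the class mirrors the dialogue object (trivial view).
@dataclass(frozen=True)
class CourseOfInsurance:
    insurant_no: int
    insurant_name: str
    periods: Tuple[Tuple[str, str], ...]    # (kind, begin) pairs


# Refined schema: the class is split and a reference is introduced.
@dataclass(frozen=True)
class Insurant:
    insurant_no: int
    name: str


@dataclass(frozen=True)
class Period:
    insurant_no: int    # reference to an Insurant
    kind: str
    begin: str


def course_of_insurance_view(insurants: List[Insurant],
                             periods: List[Period]) -> List[CourseOfInsurance]:
    """Rebuild the objects of the old class as a view on the refined schema."""
    return [
        CourseOfInsurance(
            i.insurant_no,
            i.name,
            tuple((p.kind, p.begin) for p in periods
                  if p.insurant_no == i.insurant_no),
        )
        for i in insurants
    ]


# Sample data taken from Figure 10.1.
insurants = [Insurant(1133557, "Neumann, Luise")]
periods = [Period(1133557, "self", "01.04.1979"),
           Period(1133557, "fam", "10.11.1976")]
old_objects = course_of_insurance_view(insurants, periods)
```

Because the old class is recoverable through `course_of_insurance_view`, a d-class defined over it can keep its content type unchanged while the underlying data schema is refined.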
10.5.2 Changing an Existing Application<br />
When there already exists a running application and we want to change or extend it, the<br />
procedure is quite similar. We analyse the new processes, identify the dialogue objects<br />
and add d-classes to the schema. In addition, we let the underlying views be trivial, thereby<br />
introducing redundancies on the data layer. Thus, the first schema update is simply additive.<br />
In the following steps redundancies have to be made explicit using constraints and these are<br />
shifted into structure definitions. As a result we again obtain the required view definition, which<br />
can finally be flattened. We omit further details and refer to [8] for an extensive discussion of<br />
an application example.<br />
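Making a redundancy explicit by a constraint can be pictured as follows. The fragment is a hedged sketch only: the class `Reminder`, supposedly added for a new dialogue, is purely hypothetical, as are all attribute names and the function `names_agree`. The constraint states that the redundant copy of the insurant name in the added class coincides with the name held by the existing class.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CourseOfInsurance:        # class of the running application
    insurant_no: int
    insurant_name: str


@dataclass(frozen=True)
class Reminder:                 # hypothetical class added with a trivial view
    insurant_no: int
    insurant_name: str          # redundant copy of the name
    text: str


def names_agree(courses: List[CourseOfInsurance],
                reminders: List[Reminder]) -> bool:
    """Constraint: every redundant name equals the name in the existing class."""
    names = {c.insurant_no: c.insurant_name for c in courses}
    return all(names.get(r.insurant_no) == r.insurant_name for r in reminders)


courses = [CourseOfInsurance(1133557, "Neumann, Luise")]
consistent = [Reminder(1133557, "Neumann, Luise", "period ends")]
inconsistent = [Reminder(1133557, "Neumann, L.", "period ends")]
```

Once such a constraint is stated, the later refinement step can replace the redundant attribute by a reference, so that the constraint holds by construction.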
10.6 Conclusion<br />
In this paper we presented an object oriented model which integrates a data model and a<br />
dialogue model by means of views. Objects are used as units of data in the database with<br />
describing values, references to other objects and operations. They are managed by an object<br />
oriented database management system.<br />
In the same way d-objects define the basic units of dialogues. A d-object abstracts from<br />
presentational issues at the user interface and hence provides a description of data and actions<br />
presented in dialogues. The data in the database and the dialogues are related by<br />
means of views. This allows d-objects to be managed in the same manner as objects. Only screen<br />
presentations are left to a supporting user interface management system.<br />
The conceptual building blocks for objects and d-objects then follow the same principles.<br />
We use classes and d-classes to describe the abstract structural and behavioural aspects of<br />
both. Then a view on the database describes the possible contents of d-objects. Selection and<br />
creation correspond to uniqueness constraints that were introduced in connection with value-representability.<br />
In contrast to previous work the paper emphasizes just these relationships<br />
between the data model and the dialogue model.<br />
References for Chapter 10<br />
1. H. Balzert. Der JANUS-Dialogexperte: Vom Fachkonzept zur Dialogstruktur. Softwaretechnik-Trends, 13(3), August 1993.<br />
2. C. Beeri. A formal approach to object-oriented databases. In Data and Knowledge Engineering, Vol. 5, 353–382. North Holland, 1990.<br />
3. P. Coad and E. Yourdon. Object-oriented analysis. Prentice Hall, Englewood Cliffs, N.J., 1991.<br />
4. IBM (International Business Machines Corp.). Systems Application Architecture Common User Access / Advanced Interface Design Guide, 1991. No. SC34-4290.<br />
5. C. Janssen, A. Weisbecker, and J. Ziegler. Generating user interfaces from data models and dialogue net specifications. In Human Factors in Computing Systems (INTERCHI), 418–423, Amsterdam, 1993. ACM.<br />
6. J. C. Mitchell. Type systems for programming languages. In J. van Leeuwen, editor, The Handbook of Theoretical Computer Science, Vol. B, 365–458. Elsevier, 1990.<br />
7. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall, Englewood Cliffs, New Jersey, 1991.<br />
8. B. Schewe. Kooperative Softwareentwicklung – Ein objektorientierter Ansatz. Deutscher Universitätsverlag, Leverkusen, 1996.<br />
9. B. Schewe, K.-D. Schewe, and B. Thalheim. Objektorientierter Datenbankentwurf in der Entwicklung datenintensiver Informationssysteme. Informatik – Forschung und Entwicklung, 10(3), 1995, 115–127.<br />
10. K.-D. Schewe and B. Thalheim. Fundamental concepts of object oriented databases. Acta Cybernetica, Szeged, 11(1/2), 1993, 49–84.<br />
11. K.-D. Schewe and B. Thalheim. Principles of object oriented database design. In H. Jaakkola, H. Kangassalo, T. Kitahashi, and A. Markus, editors, Information Modelling and Knowledge Bases V, 227–242. IOS Press, Amsterdam, 1994.<br />
12. B. Schewe and K.-D. Schewe. A user-centered method for the development of data-intensive dialogue systems – an object oriented approach. In E. D. Falkenberg, W. Hesse, and A. Olivé, editors, Information System Concepts, 88–103. Chapman & Hall, 1995.<br />
13. B. Thalheim. Foundations of entity-relationship modeling. Annals of Mathematics and Artificial Intelligence, 7, 1993, 197–256.<br />