Semantic Annotation for Process Models: - Department of Computer ...


21.01.2014 Views

CHAPTER 2. PROBLEM SETTING

ity of RML and OWL, it is not surprising to find a large overlap in the modeling constructs of the two languages, since both share a similar logical basis (footnote 5). We do not intend to build a one-to-one mapping between RML and OWL, but to take advantage of both as our research tools. We mainly use RML to visualize the concepts and relations of ontologies for human comprehension, and employ OWL to formalize ontology models for machine interpretation and reasoning.

Footnote 5: FOL (First Order Logic) can formalize set theory.

2.4 Semantic Interoperability

Interoperability is the ability of two or more systems or components to exchange information and to use the information that has been exchanged [119]. Interoperability is a broadly used term, encompassing many of the issues that affect how effectively diverse information resources can co-exist. These issues can be framed for different purposes, one of which is semantics. Semantic interoperability is the ability of two or more computer systems to exchange information and have the meaning of that information accurately and automatically interpreted by the receiving system. The main obstacle to semantic interoperability is the semantic heterogeneity of the information to be exchanged. A common understanding of semantics and the standardization of semantic representation are usually regarded as the solutions for tackling semantic heterogeneity and achieving semantic interoperability.

2.4.1 Semantic heterogeneity

Semantic heterogeneity is usually distinguished from syntactic heterogeneity and structural heterogeneity in the database community [37] [39] [86] [88] [89]. Syntactic heterogeneity concerns the heterogeneity of data formats. Standardizing data formats is one approach to solving syntactic heterogeneity problems. For example, XML is used as a standard format for all forms of Web-accessible data. Structural heterogeneity is associated with different data models, data structures or schemas, e.g. relational and object-oriented database models. One example of a solution for structural heterogeneity is RDF, which is based on XML syntax and provides a unified way to structure information sources or object models for Web-based information exchange [100].

Even when two information sources are modeled in the same format using the same modeling methodology, semantic heterogeneity problems may remain. Semantic heterogeneity can be identified according to the different types of conflicts [172]:

1. Semantic conflicts. Different modelers do not perceive exactly the same set of real-world objects; instead they visualize overlapping sets (included or intersecting sets). For example, a "Student" object class may appear in one schema, while a more restrictive "CS-Student" object class (grouping students majoring in computer science) appears in another schema. When the two schemas are integrated, the "CS-Student" class becomes a subclass of the "Student" class.

2. Descriptive conflicts. Descriptive conflicts include naming conflicts due to homonyms and synonyms [7] [111], as well as conflicts in attribute domain, scale, constraints, operations, etc. [84].

3. Structural conflicts. Such structural conflicts are different from structural heterogeneity. Even if two modelers use the same data model, they can choose different constructs to represent common real-world objects. For instance, in object-oriented models, when a modeler describes a component of an object type O, the modeler can choose between creating a new object type and adding an attribute to O.

2.4.2 Semantic annotation

The goal of empowering computer systems with semantic interoperability rests on the desirability of computer systems being able to find information and to use it for purposes that the original creator of the information did not anticipate. This goal of flexible information reuse requires some degree of understanding of the information, which in turn requires that the information be encoded in some standard fashion that is interpreted identically by all systems using that information. As a shared model of what the information represents, ontologies are usually used to achieve this level of understanding. Semantic annotation is an approach to linking ontologies to the original information sources.

Annotation is the extra information associated with a particular point in a document or other piece of information. For semantic annotation, the extra information consists of meaning definitions of the concepts used in a document. The meaning definitions are in most cases represented in ontologies. Annotation can be in the form of comments, or in the form of metadata. Metadata is data about data, and it is used to facilitate the understanding, use and management of data. Machine-manipulable annotations are often organized as metadata, which is also the format of semantic annotations.

There are a number of alternative approaches to the organization, structuring, and preservation of annotations. For instance, all the markup languages (HTML, SGML, XML, etc.) can be considered schemas for embedded or in-line annotation. In contrast, open hypermedia systems use stand-off annotation models where annotations are kept detached, i.e. not embedded in the content. Both annotation approaches can be document-level (annotating the document as a whole) or character-level (referring to just a specific part of the text) [81] (see Figure 2.4). Embedded annotation seems easier to maintain. However, non-embedded annotations allow dynamic, user-specific semantic annotations because they can change according to the interest of the user or the context of usage. Embedded annotations might also have a negative impact on the volume of the content and complicate its maintenance [70].

Figure 2.4: Embedded annotation and stand-off annotation [81]
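The difference between embedded and stand-off annotation at character level can be illustrated with a minimal Python sketch. It is not taken from the thesis: the in-line tag name `c`, its `ref` attribute, the `onto:` concept identifiers and the sample sentence are all hypothetical, chosen only to show how the same annotations can live either inside the content or detached from it as character offsets.

```python
import re
from dataclasses import dataclass

# Embedded (in-line) annotation: the markup is part of the content itself.
# The <c ref="..."> tag and the onto:* identifiers are hypothetical.
EMBEDDED = (
    'The <c ref="onto:Student">student</c> enrolls in a '
    '<c ref="onto:Course">course</c>.'
)


@dataclass(frozen=True)
class StandoffAnnotation:
    start: int    # character offset where the annotated span begins
    end: int      # character offset just past the annotated span
    concept: str  # identifier of the ontology concept the span refers to


def to_standoff(embedded):
    """Strip in-line tags; return the plain text and detached annotations."""
    pattern = re.compile(r'<c ref="([^"]+)">(.*?)</c>')
    parts = []        # pieces of the plain text, in order
    annotations = []  # stand-off annotations pointing into the plain text
    cursor = 0        # read position in the embedded string
    length = 0        # length of the plain text produced so far
    for match in pattern.finditer(embedded):
        before = embedded[cursor:match.start()]
        parts.append(before)
        length += len(before)
        span = match.group(2)
        annotations.append(
            StandoffAnnotation(length, length + len(span), match.group(1))
        )
        parts.append(span)
        length += len(span)
        cursor = match.end()
    parts.append(embedded[cursor:])
    return "".join(parts), annotations


plain_text, standoff = to_standoff(EMBEDDED)
for ann in standoff:
    print(ann.concept, "->", plain_text[ann.start:ann.end])
```

In the stand-off form the content stays untouched, so a different list of annotations could be paired with the same text for another user or context, which is the flexibility attributed to non-embedded annotations above. The cost is that the offsets must be updated whenever the text changes.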

