26.12.2014 Views

Rome Wasn't Digitized in a Day - Council on Library and Information ...



are no established standards for “mixing scripts in a regular searching pattern” (Álvarez et al. 2010). According to these authors, however, one of the largest problems with current databases is that although relational databases allow linking of data in different tables to indicate relationships, key entities in different inscriptions, such as personal and place names, are typically not normalized. As a solution to this problem and others, Álvarez et al. proposed the creation of an ontological schema based on EpiDoc. “Developing an ontological schema that allows for more normalized information structure,” Álvarez et al. argued, “has the added benefit of preparing epigraphic data to be shared on the Web via linked data, which opens new possibilities to relating information currently dispersed in several databases.”

After mapping the EpiDoc schema to an “ontological representation expressed in the W3C OWL [363] language,” Álvarez et al. provided an example of how a sample inscription encoded in EpiDoc XML would appear as represented by their ontology. One important advantage of an ontological representation for inscriptions that Álvarez et al. listed was the possibility of mapping the discrete information units found within inscriptions to other knowledge organization systems (e.g., mapping terms for civilization eras to the Getty Art & Architecture Thesaurus [364]). Noting that EpiDoc also made use of some controlled vocabularies, such as one that describes monuments and objects that bear texts, [365] Álvarez et al. translated a number of these vocabularies into separate ontology modules that can also be reused separately. Another advantage of using an ontology for inscriptions with both a series of properties and inference rules is that far more precise searching of inscriptions can be conducted, such as more complex named-entity searching (e.g., all parts of the “tria nomina” [the praenomen, nomen, and cognomen], as well as filiation, are encoded as separate data types associated with an inscription).
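To make the named-entity point concrete, the following is a minimal sketch in plain Python, not the OWL ontology of Álvarez et al. It assumes that each part of a name is held in its own field instead of in a free-text string, which is what allows a search on one part alone. The class names, the function `find_by_name_part`, and the three sample records are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PersonName:
    """One person named in an inscription, with each name part stored separately."""
    praenomen: Optional[str] = None
    nomen: Optional[str] = None
    cognomen: Optional[str] = None
    filiation: Optional[str] = None


@dataclass(frozen=True)
class Inscription:
    """A hypothetical inscription record: an identifier plus the persons it names."""
    identifier: str
    persons: Tuple[PersonName, ...]


def find_by_name_part(inscriptions, part, value):
    """Return identifiers of inscriptions naming a person whose `part` equals `value`."""
    return [
        ins.identifier
        for ins in inscriptions
        if any(getattr(person, part) == value for person in ins.persons)
    ]


# Invented sample data; the identifiers do not refer to real catalog entries.
corpus = [
    Inscription("example-1", (PersonName("Marcus", "Tullius", "Cicero", "Marci filius"),)),
    Inscription("example-2", (PersonName("Gaius", "Iulius", "Caesar"),)),
    Inscription("example-3", (PersonName("Lucius", "Iulius", "Rufus"),)),
]

matches = find_by_name_part(corpus, "nomen", "Iulius")
print(matches)  # ['example-2', 'example-3']
```

A free-text name field would need string matching to answer the same question, and could not reliably tell a nomen from a cognomen with the same spelling.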

The OWL representation designed by Álvarez et al. avoided the use of free-text strings whenever possible and instead treated all information as either properties or classes. All ontology elements in this representation are identified by the URI of the element (class, property, or data) that is referenced when connecting entries. They designed their implementation in this manner to support the exportation of epigraphic linked data by other applications, or essentially to create machine-actionable data:

This idea brings a new dimension to the sharing of epigraphic information, allowing for software agents to consume RDF information for specific purposes, complementing interfaces oriented to use by humans. Each of their elements described in the previous section will be referenced by a unique address, a URI that enables an unambiguous, consistent, and permanent identification for inscriptions and all their associated information items. In our approach, information is already stored in OWL-RDF so that the main additional requirements are having a consistent URI design for the information items and deploying the providing services (Álvarez et al. 2010).
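What a “consistent URI design” might involve can be sketched as follows. This is an illustration under stated assumptions, not the design used by Álvarez et al.: the namespace `http://example.org/epigraphy/`, the path segments, the property name `hasObjectType`, and the helper functions are all hypothetical. The sketch builds every URI from one fixed base and serializes a single RDF statement as an N-Triples line that a software agent could read.

```python
from urllib.parse import quote

# Hypothetical namespace; a real project would use a domain it controls.
BASE = "http://example.org/epigraphy/"


def mint_uri(kind, local_id):
    """Build a URI from the fixed base, a kind segment, and a percent-encoded local id."""
    return f"{BASE}{kind}/{quote(local_id, safe='')}"


def triple(subject, predicate, obj):
    """Serialize one statement whose three terms are all URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Invented identifiers, used only to show the pattern.
inscription = mint_uri("inscription", "example 1")
has_object_type = mint_uri("property", "hasObjectType")
altar = mint_uri("object-type", "altar")

line = triple(inscription, has_object_type, altar)
print(line)
```

Because the same inputs always produce the same URI, two records that describe the same item end up pointing at the same address, which is the property the quoted passage asks for.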

One important use of URIs, Álvarez et al. noted, is that they could be used to link records for the same inscription found in different databases. While most epigraphical catalogs assign objects various codes depending on the organization system, and often use standard reference identifiers (e.g., CIL #), Álvarez et al. pointed out that by using a linked-data approach, the suffix of a URI could be changed depending on the reference number, and the use of RDF triples and the predicate rdfs:seeAlso could be

[363] The OWL Web Ontology Language was created by the W3C to support further development of the Semantic Web; the current recommendation for OWL 2 can be found at http://www.w3.org/TR/owl2-overview/

[364] http://www.getty.edu/research/tools/vocabularies/aat/index.html

[365] For example, the “Eagle/EpiDoc Object Type Vocabulary” can be found at http://edh-www.adw.uni-heidelberg.de/EDH/inschrift/012116
