26.12.2014 Views

Rome Wasn't Digitized in a Day - Council on Library and Information ...


This example illustrates the primary advantage of encoding the editions in XML. If editors
wish to differ between uncertain characters and broken characters they can encode them with
different tags. They can then transform both tags into under-dots if they still wish to present
both instances as such or they can decide to visualize one instance, underlined and the other
under-dotted to distinguish between them (Roued 2009).

Thus, EpiDoc allows different scholarly opinions to be encoded in the same XML file, because content
markup (EpiDoc XML) is kept separate from presentation (separate XSLT sheets). Roued also supported
the argument of Roueché (2009) that EpiDoc encoding is not a “substantial conceptual leap” from
Leiden encoding.
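The separation of markup from presentation described above can be illustrated with a minimal sketch. It uses Python in place of the XSLT sheets the text mentions. `<unclear>` is the TEI/EpiDoc element for uncertain readings, but the `reason="broken"` attribute value, the sample text, and the function names are assumptions made for this illustration, not the project's actual encoding.

```python
import xml.etree.ElementTree as ET

DOT_BELOW = "\u0323"   # Unicode combining dot below (the Leiden under-dot)
UNDERLINE = "\u0332"   # Unicode combining low line

# Invented sample. reason="broken" is an assumed way to tell broken
# characters apart from uncertain ones.
SAMPLE = '<ab>ab<unclear>cd</unclear>e<unclear reason="broken">f</unclear>g</ab>'

def mark(text, combining):
    """Attach a combining mark to every character of text."""
    return "".join(ch + combining for ch in text)

def render(xml_text, broken_mark=DOT_BELOW):
    """Render one flat element as plain text.

    Uncertain characters always get an under-dot; broken characters get
    broken_mark, so a different presentation needs no change to the XML.
    Only direct children of the root element are handled.
    """
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        text = "".join(child.itertext())
        if child.tag == "unclear":
            is_broken = child.get("reason") == "broken"
            text = mark(text, broken_mark if is_broken else DOT_BELOW)
        parts.append(text)
        parts.append(child.tail or "")
    return "".join(parts)

if __name__ == "__main__":
    print(render(SAMPLE))             # both kinds under-dotted
    print(render(SAMPLE, UNDERLINE))  # broken characters underlined instead
```

The same encoded file yields either presentation; only the rendering argument changes.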

While the first two Vindolanda tablet publications were encoded using EpiDoc, Roued observed that
the level of encoding was not very granular and the website was not well set up to exploit the
encoding. She also noted that the level of encoding a project chooses typically depends both on the
technology chosen and on the anticipated future use of the data. For the next series of Vindolanda
tablets, Roued explained that the project decided to pursue a more granular level of encoding,
including words and terms in the transcription. This has supported interactive search functionality
and added greater value to the encoding as a knowledge base. To begin with, the project encoded the
tablets in greater detail with respect to the Leiden conventions:

Encoding instances of uncertainty, added characters and abbreviations enables us to extract
these instances from their respective texts and analyze them. We can, for example, count how
many characters in the text or texts are deemed to be uncertain. Similarly, we can look at the
type of characters that are most likely to be supplied. These illustrate the many new
possibilities for analyzing the reading of ancient document (Roued 2009).
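The kind of analysis described in this passage can be sketched in a few lines of Python. `<unclear>` and `<supplied>` are TEI/EpiDoc elements for uncertain and editor-supplied text; the sample string and the function names are invented for illustration and do not come from the Vindolanda files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample text, not taken from any tablet.
SAMPLE = "<ab>ab<unclear>cd</unclear>e<supplied>fc</supplied>g<unclear>c</unclear></ab>"

def count_chars(root, tag):
    """Count the non-whitespace characters found inside every <tag> element.

    Assumes elements of the same tag are not nested inside each other;
    nested ones would be counted twice.
    """
    counts = Counter()
    for el in root.iter(tag):
        counts.update(ch for ch in "".join(el.itertext()) if not ch.isspace())
    return counts

def total_chars(root):
    """Count all non-whitespace characters in the text."""
    return sum(1 for ch in "".join(root.itertext()) if not ch.isspace())

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    unclear = count_chars(root, "unclear")
    supplied = count_chars(root, "supplied")
    print("uncertain characters:", sum(unclear.values()), "of", total_chars(root))
    print("most often supplied:", supplied.most_common(3))
```

Run over a whole corpus, the same counters would show how much of each text is uncertain and which characters editors supply most often.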

In addition to more extensive encoding of the texts in EpiDoc, the eSAD project decided to perform a
certain amount of manual “contextual encoding” of words, people, place names, dates, and military
terms, that is, of essentially all the items found in the indexes. For words, the index contained a
list of lemmas with references to the places in the text where corresponding words occurred; encoding
these data allowed the project to extract information such as the number of times a lemma occurred in
the text. During the encoding of the indexes, the project discovered numerous errors that needed to be
corrected. All of this encoding was performed to support new advanced searching features for the
relaunch of the website as Vindolanda Tablets Online 2.0 in 2010. In particular, the project has
developed an interactive search feature using AJAX,[509] LiveSearch, JavaScript, and PHP[510] that
gives the user feedback while typing in a search term. In the case of Vindolanda, it gives users a
list of all words, terms, names, and dates that contain their search pattern.
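The two ideas in this paragraph, counting lemma occurrences and suggesting terms that contain a search pattern, can be sketched as follows. This is Python standing in for the server-side logic, not the project's AJAX/PHP code. The `<w lemma="...">` markup is the TEI way of recording a lemma on a word; whether the project encoded lemmas exactly this way is an assumption, and the sample words are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample; the lemma attribute links each word form to its index entry.
SAMPLE = (
    "<ab>"
    '<w lemma="frater">fratri</w> <w lemma="salus">salutem</w> '
    '<w lemma="frater">frater</w> <w lemma="mitto">misi</w>'
    "</ab>"
)

def lemma_counts(xml_text):
    """Return how many times each lemma occurs in the encoded text."""
    root = ET.fromstring(xml_text)
    return Counter(w.get("lemma") for w in root.iter("w") if w.get("lemma"))

def suggest(pattern, terms):
    """Return, in sorted order, every term that contains the typed pattern.

    Matching is case-insensitive; an empty pattern yields no suggestions.
    """
    needle = pattern.casefold()
    if not needle:
        return []
    return sorted(t for t in terms if needle in t.casefold())

if __name__ == "__main__":
    counts = lemma_counts(SAMPLE)
    print(counts["frater"])
    print(suggest("t", counts))
```

A live-search front end would call something like `suggest` on each keystroke and display the returned list under the search box.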

The XML document created for each inscription text contains all of its relevant bibliographic
information and textual encoding, and Roued explained that this necessitated developing methods that
could extract only the relevant information, depending upon the need. The project thus decided to
build RESTful web services using the Zend Framework[511] and PHP. The Vindolanda web services receive
URLs with certain parameters and return answers as XML. This allows other projects to utilize these
encoded XML files and, in particular, the knowledge base of Latin words. This web service is being
used in their related project that seeks to develop an ISS for readers of ancient documents. The

[509] AJAX is short for “Asynchronous JavaScript and XML” and is a technique “for creating fast and dynamic web pages,” http://www.w3schools.com/ajax/ajax_intro.asp

[510] PHP stands for “PHP: Hypertext Preprocessor” and is a server-side scripting language, http://www.w3schools.com/php/php_intro.asp

[511] http://framework.zend.com/
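The web-service pattern described on this page, a URL with parameters going in and an XML answer coming out, can be sketched as a single function. This is a Python illustration only; the actual services were built with PHP and the Zend Framework, and the URL, the `pattern` parameter, the element names, and the lemma data below are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# Invented knowledge base: lemma -> number of occurrences.
LEMMA_COUNTS = {"frater": 2, "mitto": 1, "salus": 1}

def handle_request(url):
    """Answer a request URL with an XML document listing matching lemmas.

    Reads the hypothetical 'pattern' query parameter and returns every
    lemma that contains it, together with its occurrence count.
    """
    params = parse_qs(urlparse(url).query)
    pattern = params.get("pattern", [""])[0].casefold()
    root = ET.Element("results", {"pattern": pattern})
    for lemma in sorted(LEMMA_COUNTS):
        if pattern and pattern in lemma:
            item = ET.SubElement(root, "lemma", {"count": str(LEMMA_COUNTS[lemma])})
            item.text = lemma
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    # example.org is a placeholder host, not the project's endpoint.
    print(handle_request("http://example.org/words?pattern=fra"))
```

Because the answer is plain XML, another project can consume it with any XML parser without knowing how the underlying files are stored.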
