26.12.2014 Views

Rome Wasn't Digitized in a Day - Council on Library and Information ...



One major project to emerge recently from the CDLI is the Open Richly Annotated Cuneiform Corpus (Oracc). 86 This project has grown out of the CDLI and has utilized technology developed by the Pennsylvania Sumerian Dictionary (PSD). 87 According to its website, Oracc was created by Steve Tinney, Eleanor Robson, and Niek Veldhuis and “comprises a workspace and toolkit for the development of a complete corpus of cuneiform whose rich annotation and open licensing support the next generation of scholarly research.” In addition to CDLI and PSD, a number of other digital cuneiform projects are involved in Oracc, 88 including Assyrian Empire Builders (AEB), 89 the Digital Corpus of Cuneiform Mathematical Texts (DCCMT), 90 and the Geography of Knowledge in Assyria and Babylonia (GKAB). 91 Oracc was designed as a “corpus building cooperative” that will provide both infrastructure and technical support for “the creation of free online editions of cuneiform texts.” Since Oracc wishes to promote both open and reusable data, they recommend that all participating projects make use of Creative Commons (CC) 92 licensing, and all default Oracc projects have been placed under a CC “Attribution-Share Alike” license. Oracc was designed to complement the CDLI and allows scholars to “slice” groups of texts from the larger CDLI corpus and then study those intensively within what they have labeled “projects.” Among its various features, Oracc offers multilingual translation support; enables projects to be turned into Word files, PDFs, or books using the “ISO OpenDocument” standard; and allows data to be exported in the TEI format. Any cuneiform tablet transliterations that are created within Oracc will also be automatically uploaded to the CDLI.

The Oracc Project recognizes six major roles 93 and has developed specific documentation for each: (1) user (a scholar using the Oracc corpora); (2) builder (someone working on texts to help build up Oracc, e.g., lemmatizing or data entry); (3) manager (someone managing or administering an Oracc project); (4) developer (someone contributing code to the Oracc project); (5) system administrator; and (6) steerer (senior Oracc users). Significant documentation is freely available for all but the last two roles.

Oracc is a growing project, and researchers are invited to contribute texts to it through either a donation or curation model. Through the donation model, text editions and any additional information are simply sent to Oracc, and the project team installs, converts, and maintains them (Oracc reserves the right to perform minor editing but promises to provide proper identification and credit for all data as well as to identify all revisers of data). Through the curation model, the Oracc team helps users to set up their cuneiform texts as a separate project on the Oracc server, and the curator is then responsible for lemmatizing and maintaining their texts (this model also gives the user greater control over subsequent edits to materials). 94 Various web services assist those who are contributing corpora to Oracc.

Oracc is an excellent example of a project that supports reuse of its data through the use of CC licenses, commonly adopted technical standards, and extensive documentation as to how the data are

86 http://oracc.museum.upenn.edu/index.html

87 The Pennsylvania Sumerian Dictionary project (http://psd.museum.upenn.edu/epsd/index.html) is based at the Babylonian Section of the University of Pennsylvania Museum of Anthropology and Archaeology. In addition to their work with Oracc and the CDLI, they have collaborated with the Electronic Text Corpus of Sumerian Literature (ETCSL).

88 For the full list, see http://oracc.museum.upenn.edu/project-list.html.

89 http://www.ucl.ac.uk/sargon

90 http://oracc.museum.upenn.edu/dccmt/

91 http://oracc.museum.upenn.edu/gkab

92 Creative Commons is a “nonprofit corporation dedicated to making it easier for people to share and build upon the work of others, consistent with the rules of copyright” (http://creativecommons.org/about/) and provides free licenses and legal tools that can be used by creators of intellectual works who wish to provide various levels of reuse of their work, including attribution-only, share-alike, noncommercial, and no-derivative works.

93 http://oracc.museum.upenn.edu/doc/

94 For technical details on the curation model, see http://oracc.museum.upenn.edu/contributing.html#curation; for their extensive corpus-builder documentation, see http://oracc.museum.upenn.edu/contributing.html#curation; and for the guide to project management, see http://oracc.museum.upenn.edu/doc/manager/
