Rome Wasn't Digitized in a Day - Council on Library and Information ...
One major project to recently emerge from the CDLI is the Open Richly Annotated Cuneiform Corpus
(Oracc). 86 This project has grown out of the CDLI and has utilized technology developed by the
Pennsylvania Sumerian Dictionary (PSD). 87 According to its website, Oracc was created by Steve
Tinney, Eleanor Robson, and Niek Veldhuis and “comprises a workspace and toolkit for the
development of a complete corpus of cuneiform whose rich annotation and open licensing support the
next generation of scholarly research.” In addition to CDLI and PSD, a number of other digital
cuneiform projects are involved in Oracc, 88 including Assyrian Empire Builders (AEB), 89 the Digital
Corpus of Cuneiform Mathematical Texts (DCCMT), 90 and the Geography of Knowledge in Assyria
and Babylonia (GKAB). 91 Oracc was designed as a “corpus building cooperative” that will provide
both infrastructure and technical support for “the creation of free online editions of cuneiform texts.”
Since Oracc wishes to promote both open and reusable data, it recommends that all participating
projects make use of Creative Commons (CC) 92 licensing, and all default Oracc projects have been
placed under a CC “Attribution-Share Alike” license. Oracc was designed to complement the CDLI
and allows scholars to “slice” groups of texts from the larger CDLI corpus and then study those
intensively within what they have labeled “projects.” Among its various features, Oracc offers
multilingual translation support; enables projects to be turned into Word files, PDFs, or books using
the “ISO OpenDocument” standard; and allows data to be exported in the TEI format. Any cuneiform
tablet transliterations that are created within Oracc will also be automatically uploaded to the CDLI.
The Oracc Project recognizes six major roles 93 and has developed specific documentation for each: (1)
user (a scholar using the Oracc corpora); (2) builder (someone working on texts to help build up Oracc,
e.g., lemmatizing or data entry); (3) manager (someone managing or administering an Oracc project);
(4) developer (someone contributing code to the Oracc project); (5) system administrator; and (6)
steerer (senior Oracc users). Significant documentation is freely available for all but the last two roles.
Oracc is a growing project, and researchers are invited to contribute texts to it through either a
donation or curation model. Through the donation model, text editions and any additional information
are simply sent to Oracc, and the project team installs, converts, and maintains them (Oracc reserves
the right to perform minor editing but promises to provide proper identification and credit for all data
as well as to identify all revisers of data). Through the curation model, the Oracc team helps users to
set up their cuneiform texts as a separate project on the Oracc server, and the curator is then
responsible for lemmatizing and maintaining their texts (this model also gives the user greater control
over subsequent edits to materials). 94 Various web services assist those who are contributing corpora to
Oracc.
Oracc is an excellent example of a project that supports reuse of its data through the use of CC
licenses, commonly adopted technical standards, and extensive documentation as to how the data are
86 http://oracc.museum.upenn.edu/index.html
87 The Pennsylvania Sumerian Dictionary project (http://psd.museum.upenn.edu/epsd/index.html) is based at the Babylonian Section of the University of
Pennsylvania Museum of Anthropology and Archaeology. In addition to their work with Oracc and the CDLI, they have collaborated with the Electronic
Text Corpus of Sumerian Literature (ETCSL).
88 For the full list, see http://oracc.museum.upenn.edu/project-list.html.
89 http://www.ucl.ac.uk/sargon
90 http://oracc.museum.upenn.edu/dccmt/
91 http://oracc.museum.upenn.edu/gkab
92 Creative Commons is a “nonprofit corporation dedicated to making it easier for people to share and build upon the work of others, consistent with the
rules of copyright” (http://creativecommons.org/about/) and provides free licenses and legal tools that can be used by creators of intellectual works that
wish to provide various levels of reuse of their work, including attribution-only, share-alike, noncommercial, and no-derivative works.
93 http://oracc.museum.upenn.edu/doc/
94 For technical details on the curation model, see http://oracc.museum.upenn.edu/contributing.html#curation; for their extensive corpus-builder
documentation, see http://oracc.museum.upenn.edu/contributing.html#curation; and for the guide to project management, see
http://oracc.museum.upenn.edu/doc/manager/