Rome Wasn't Digitized in a Day - Council on Library and Information ...

Platonists). John Lee (2007) has conducted other work in textual reuse and explored sentence
alignment in the Synoptic Gospels of the Greek New Testament. He pointed out that exploring ancient text
reuse is a difficult but important task since ancient authors rarely acknowledged their sources and
often quoted from memory or combined multiple sources. “Identifying the sources of ancient texts is
useful in many ways,” Lee stressed: “It helps establish their relative dates. It traces the evolution of
ideas. The material quoted, left out or altered in a composition provides much insight into the agenda
of its author” (Lee 2007).

Authorship attribution, the use of manual or automatic techniques to determine the authorship of
anonymous texts, has been previously explored in classical studies (Rudman 1998) and remains a topic
of interest. Forstall and Scheirer (2009) applied new methods for authorship attribution, based on
sound rather than text, to Greek and Latin poets and prose authors:

We present the functional n-gram as a feature well-suited to the analysis of poetry and other
sound-sensitive material, working toward a stylistics based on sound rather than text. Using
Support Vector Machines (SVM) for text classification, we extend the expression of our results
from a single marginal distance or a binary yes/no decision to a more flexible receiver-operator
characteristic curve. We apply the same feature methodology to Principle Component Analysis
(PCA) in order to validate PCA and to explore its expressive potential (Forstall and Scheirer
2009).

The authors found that sound-based features tested with SVMs performed as well as, if not
better than, function words in every experiment, and thus concluded that “sound can be
captured and used effectively as a feature for attributing authorship to a variety of literary texts.”
Forstall and Scheirer also reported some interesting initial results in exploring the Homeric poems,
including testing the argument that this poetry was composed without the aid of writing, an issue explored
at length by the Homeric Multitext Project. “When the works of Thucydides, a literate prose historian,
were projected using the principal components derived from Homer, Thucydides' work not only
clustered together but had a much smaller radius than either of the Homeric poems,” Forstall and
Scheirer contended, adding that “this result agrees with philological arguments for the Homer's works
having been produced by a wholly different, oral mode of composition.” The work of Forstall and
Scheirer is just one of many examples among digital classics projects of how computer science
methodologies can shed new light on old questions.
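
The kind of feature discussed above can be loosely illustrated with a short sketch. It builds relative-frequency profiles of character n-grams and compares two profiles by cosine similarity. This is only an approximation under stated assumptions: plain character bigrams stand in for sound, the function names `ngram_profile` and `cosine` are hypothetical, and the sketch does not reproduce Forstall and Scheirer's functional n-grams, their SVM classification, or their PCA projection.

```python
from collections import Counter
from math import sqrt


def ngram_profile(text, n=2):
    """Return relative frequencies of character n-grams in text."""
    # Keep letters only, lowercased, so spacing and punctuation are ignored.
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    grams = [letters[i:i + n] for i in range(len(letters) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Illustrative comparison of two short Latin snippets (assumed sample input).
profile_a = ngram_profile("arma virumque cano")
profile_b = ngram_profile("Troiae qui primus ab oris")
similarity = cosine(profile_a, profile_b)
```

In a real attribution experiment such profiles would be computed over much longer passages and passed to a classifier; two short snippets like these are far too small to say anything about authorship.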

The PDL has conducted some of its own experiments in automatic quotation identification.
Ernst-Gerlach and Crane (2008) introduced an algorithm for the automatic analysis of citations but found
that they needed to first manually analyze the structure of quotations in three different reference works
of Latin texts to determine text quotation alternation patterns. Their experience confirmed Lee’s earlier
point that text reuse is rarely word for word, though in this case it was the quotation practices of
nineteenth-century reference works, rather than those of ancient authors, that proved problematic:

Quotations are, in practice, often not exact. In some cases, our quotations are based on different
editions of a text than those to which we have electronic access and we find occasional
variations that reflect different versions of the text. We also found, however, that some
quotations – especially in reference works such as lexica and grammars – deliberately modify
the quoted text – the goal in such cases is not to replicate the original text but to illustrate a
point about lexicography, grammar, or some other topic (Ernst-Gerlach and Crane 2008).
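
Because quotations are often inexact, exact string search will miss many of them. The sketch below shows one simple way to locate an approximate quotation: slide a window the length of the quotation across the source text and score each window with `difflib.SequenceMatcher` from the Python standard library. This is an illustrative assumption, not the algorithm of Ernst-Gerlach and Crane; the function name `best_match` and the sample strings are hypothetical, and punctuation is not normalized.

```python
from difflib import SequenceMatcher


def best_match(quotation, source):
    """Find the source window most similar to an inexact quotation.

    Returns (similarity, matched_text); similarity lies in [0, 1].
    """
    quote_words = quotation.lower().split()
    source_words = source.split()
    width = len(quote_words)
    if width == 0 or not source_words:
        return 0.0, ""
    best_score, best_text = 0.0, ""
    # Slide a window the length of the quotation across the source words.
    for start in range(max(1, len(source_words) - width + 1)):
        window = source_words[start:start + width]
        score = SequenceMatcher(
            None, quote_words, [w.lower() for w in window]
        ).ratio()
        if score > best_score:
            best_score, best_text = score, " ".join(window)
    return best_score, best_text


# A quotation that alters one word of the source (assumed sample input).
source_text = "arma virumque cano Troiae qui primus ab oris"
score, passage = best_match("virumque canit Troiae", source_text)
```

Comparing whole words treats an inflected variant such as "canit" for "cano" as a complete mismatch; matching on lemmas or characters would be more forgiving but also more likely to produce false positives.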