Rome Wasn't Digitized in a Day - Council on Library and Information ...

texts that include references to people, places, and things. While person names are marked up in the XML text of the Old Bailey Online, Bradley and Short remarked that there was no effort “to structure the names into persons themselves.” This is in direct contrast to their relational approach with the PASE, PBEW, and CCEd:

Our three projects, on the other hand, are explicitly prosopographical by nature, and the identification of persons is the central task of the researchers, as it must be in any prosopography. They must have a way to separate the people with the same recorded name into separate categories, and to group together references to a single person regardless of the spelling of his/her name. … It is exactly because prosopographical projects are involved in the creation of a model of their material that is perhaps not explicitly provided in the texts they work with that a purely textual approach is in the end not sufficient in and of itself. Instead, it is exactly this kind of structuring which makes our projects particularly suitable for the relational database model (Bradley and Short 2005).

In addition, the databases for all three of these projects contain not only “structured data in the form of factoids” but also structures that are spread over several tables and represent other important objects in the database, including “persons, geographic locations, and possessions.”
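
To make the idea of factoids spread over several related tables more concrete, the following is a minimal sketch of how such a structure might look. It is not the actual schema of PASE, PBW, or CCEd: the table names, column names, and sample rows are all invented for illustration. It assumes that each factoid records one assertion from one source, keeps the name as spelled in that source, and points to a person by ID, so that variant spellings can be grouped under one person while identical names can be kept apart.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative,
# not taken from PASE, PBW, or CCEd.
SCHEMA = """
CREATE TABLE person   (person_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE factoid (
    factoid_id    INTEGER PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    location_id   INTEGER REFERENCES location(location_id),
    recorded_name TEXT NOT NULL,  -- spelling exactly as found in the source
    assertion     TEXT NOT NULL,  -- what the source states about the person
    source        TEXT NOT NULL   -- where the statement is found
);
"""


def build_demo_db():
    """Create an in-memory database holding a few invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Two distinct persons who share a recorded name (invented data).
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, "Aelfric 1"), (2, "Aelfric 2")],
    )
    conn.execute("INSERT INTO location VALUES (1, 'Winchester')")
    conn.executemany(
        "INSERT INTO factoid "
        "(person_id, location_id, recorded_name, assertion, source) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "Aelfric", "witnessed a charter", "Source A"),
            (1, None, "Alfric", "held the office of abbot", "Source B"),
            (2, None, "Aelfric", "granted land", "Source C"),
        ],
    )
    return conn


def biography(conn, person_id):
    """Gather every factoid linked to one person ID, in insertion order."""
    return conn.execute(
        "SELECT f.recorded_name, f.assertion, l.name, f.source "
        "FROM factoid AS f "
        "LEFT JOIN location AS l ON l.location_id = f.location_id "
        "WHERE f.person_id = ? "
        "ORDER BY f.factoid_id",
        (person_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in biography(conn, 1):
        print(row)
```

In this sketch the decision that “Aelfric” and “Alfric” refer to the same person is made by whoever assigns the person ID to each factoid. The database stores that interpretation; it does not derive it from the text.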

Bradley and Short also addressed a point raised earlier by Mathisen regarding the limitations of historical databases and the interpretation and categorization of data. As Mathisen maintained, they argued that all work with prosopographical sources, whether writing an article or creating a database, involved a fair amount of scholarly interpretation and categorization. Rather than attempting to create an “appropriate” model of their sources, Bradley and Short argued they were trying to create a model of how prosopographers work with those sources:

For, of course, our database is not designed to model the texts upon which prosopography is based with all their subtle and ambiguous meanings. The database, instead, models the task of the prosopographer in interpreting them i.e. it is not a model of an historical text, but a model of prosopography itself (Bradley and Short 2005).

Modeling how scholars within a discipline conduct their work and how they work with their sources is an important component in the design not just of historical databases but also of larger digital infrastructures that will need to support multidisciplinary work.

Mathisen has described the approach of the PBW as a combination of a “multi-file relational model” with a “decentralized biography model” (Mathisen 2007), in which, instead of having an individual record with dedicated fields created for each individual, each person is assigned a unique ID key that is then associated, in various other databases, with the information bites or “factoids” described above. “Biographies” are thus created for individuals by assembling all the relevant factoids for an individual. Mathisen offered a few caveats in terms of the methodology chosen for the PBW, namely, that the complexity of the data structure would make it hard for anyone without expert computer skills to implement such a solution and that the “multiplicity of sub-databases and lack of core biographies” would make it difficult to export this material or integrate it with another PDB without specialized programming (Mathisen 2007). He also feared that the lack of “base-level” person entries might mean that important information for individuals could be omitted when different factoids were combined and could also make it difficult to determine when occurrences of the same name represent the same or different individuals. Despite this assertion, Bradley and Short proposed that by not providing their users with an “easy-to-read” final article about each individual and instead presenting a collection of
