
Integrating Digital Libraries and Electronic Publishing in the DART Project

David Millman
Gordon Dahlquist
Brian Hoffman

Columbia University
April 2005


EPIC Background

Electronic Publishing Initiative at Columbia

• 3-way partnership: Columbia Univ. Press, Academic Information Systems, Columbia Libraries
• Publications
  – Columbia International Affairs Online (ciao)
  – Columbia Earthscape
  – Gutenberg-E
• Evolving editorial and technology roles, workflow

Columbia/DART—Apr 2005—2


DART Background

Digital Anthropology Resources for Teaching

• NSF/JISC funding: “Digital Libraries in the Classroom” program
• Partnership with London School of Economics & Political Science
• Anthropology Departments with Publishing/Educational Technology units
• 2 postdoc Fellows in each Anthropology Dept. to offload teaching and provide links to senior faculty in each institution


DART Educational Mission

• To help undergraduate students gain insight into the way in which anthropologists conduct research and draw conclusions
• Improve information literacy of undergraduate anthropology students through use of structured yet unfiltered digital resources


E-Publishing Mission

• To develop a digital library infrastructure that will store digital resources so that they can be used in flexible ways
• To catalogue digital assets embedded within complex learning tools so that they can be used for broader research and/or teaching goals


Case 1: Intro to South Asian Culture

• Online syllabus that links to catalogued digital assets (primary texts, maps, photos, video)
• Teacher builds class assignments around these assets (response to questions, essays on readings, and full research paper)
• Increasing levels of interaction with library materials throughout the semester


Case 2: The Ethnographic Imagination

• The teaching module contains a digitized selection of the author’s field notes and the published book
• Students read both sets of materials and write about the process of transforming the notes into an ethnography
• Increasing understanding of how knowledge is created from data


DART Publishing Environment

• Traditional Roles and Changing Relationships
• Editors/Authors & Publication Process
• Publications & the Library


Digital Teaching Tools and Research Library Resources

• Focus on the relationship between the “closed” world of the classroom and teaching tools, and the “open” world of the library
• Can students freely explore the vast array of research tools available through the Web, while still having an appropriate level of guidance on how to select and evaluate the sources that they find?


Unlimited Information as Benefit or Obstacle to Learning

• How do we make information meaningful to users with diverse skills and needs?
• Future work will explore how to find the right balance between directed and unfiltered presentation of digital teaching and research materials in electronic publications


Integrating Teaching Tools and Digital Library

Value added from each direction as part of production process

• Non-Hermetic Teaching Tools
• Collection presented within pedagogical context(s)


User Experience


Technology

• Accommodate different styles for teaching
  – fall ’04 (South Asian History & Culture): web browser focus (syllabus navigation)
  – spring ’05 (Ethnographic Imagination): digital resource focus (primary source navigation)
  – fall ’05 (planning): considering mobile device in DL discovery & retrieval; “Virtual Calcutta” object/software
• Web services import/export
• Access management/Shibboleth
• Metadata: “versions” revisited


Acquisition

[Diagram labels: Digital South Asia Library, DSAL @ U Chicago; Cambridge Univ Library institutional repository; DART catalog; (proposed); Tibetan-Himalayan DL, thdl @ U of Virginia; OAI, DSpace, Fedora; Publishers & Archives; mapping; DART faculty local workflow; DART content]


Access

[Diagram labels: METS; OAI; MPEG21/DID; Sakai/OKI; JSR170; IMS/CP; library & repository environments; collaborative & learning environments; browser; html; Z39.50; openURL; DART catalog; DART content]


The View from Production

Building DART’s e-publishing production cycle into open archive infrastructure systems


Building Publications

• Structured presentations of digital objects
• Legal presentation of digital objects (rights)
• Presentation through linking or embedding
• One-to-many relation between locally or remotely stored originals and versions embedded in publications


Examples of Publications

• Slide shows
• Mini-sites for classroom or homework use
• Online syllabi
• Complex page-viewing interfaces (online fieldnotes)
• Interactive games
• Any navigational interface to the digital library (faceted navigation, topic maps, etc.)


Objects within Publications

• Must conform to publication’s specifications (e.g., consistent image size)
• Publication-specific metadata (e.g., caption)
• Embedded in a new format (HTML, Flash, Video)
• Objects appearing in a publication are called “Assets”
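The relation between one original object and the several publication-specific assets made from it can be sketched as a small data model. This is an illustrative sketch only: the class names, field names, and sample values are hypothetical and are not taken from the DART system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    """A publication-specific version of an original object."""
    publication: str   # publication the asset is embedded in
    media_format: str  # e.g. "HTML", "Flash", "Video"
    caption: str       # publication-specific metadata


@dataclass
class Original:
    """An original object, stored locally or held by a remote archive."""
    uri: str
    stored_locally: bool
    assets: List[Asset] = field(default_factory=list)

    def add_asset(self, publication: str, media_format: str, caption: str) -> Asset:
        """Record one more asset derived from this original."""
        asset = Asset(publication, media_format, caption)
        self.assets.append(asset)
        return asset


# Hypothetical sample data: one original reused in two publications.
original = Original(uri="http://example.org/objects/map001", stored_locally=True)
original.add_asset("Online syllabus", "HTML", "Map, full view")
original.add_asset("Slide show", "Flash", "Map, cropped detail")

for asset in original.assets:
    print(f"{asset.publication}: {asset.caption} ({asset.media_format})")
```

Keeping the caption and format on the asset, not on the original, lets the same source object carry different presentation metadata in each publication.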


Harvested Assets

• Harvest candidate (metadata) records from open archives and partner institutions
• Identify objects to import: desired assets
• Import bitstreams
• Draft metadata from candidate record (pre-populate fields)
• Edit metadata (catalog from our perspective)
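The "draft metadata from candidate record" step might look roughly like the sketch below, which parses an OAI-PMH ListRecords response carrying oai_dc metadata and pre-populates a draft for an editor to revise. The sample response, the draft field names, and the `needs_review` flag are assumptions made for illustration; the namespaces are the standard OAI-PMH and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical harvested response, embedded so the sketch runs offline.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:map001</identifier>
        <datestamp>2004-10-08T18:50:13Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample map</dc:title>
          <dc:identifier>http://example.org/objects/map001</dc:identifier>
          <dc:rights>Sample rights statement</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def draft_records(xml_text):
    """Turn each harvested candidate record into a pre-populated draft."""
    root = ET.fromstring(xml_text)
    drafts = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if header is None or dc is None:
            continue  # e.g. a deleted record, which carries no metadata
        drafts.append({
            "source_identifier": header.findtext("oai:identifier", default="", namespaces=NS),
            "source_datestamp": header.findtext("oai:datestamp", default="", namespaces=NS),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "source_uri": dc.findtext("dc:identifier", default="", namespaces=NS),
            "rights": dc.findtext("dc:rights", default="", namespaces=NS),
            "needs_review": True,  # an editor still catalogs from the local perspective
        })
    return drafts


drafts = draft_records(SAMPLE_RESPONSE)
for draft in drafts:
    print(draft["source_identifier"], "->", draft["title"])
```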


Assets Digitized Locally

• Create digital archival copy (scan, photograph, etc.)
• Original Cataloging
• Store
  – part of preservation strategy


Publication Assembly

• File Modification
  – Crop, detail, resize
  – Reduce, snip, clip, extract
  – Interpret, explain, contextualize
• Presentation Context
  – Associate, locate
  – Incorporate, include, attach
  – Interpret, explain, contextualize


Three Asset Scenarios


Asset 1

• Digitized Map from Digital South Asia Library (http://dsal.chicago.edu)


Asset 1

• Bitstream and metadata copied to DART collection
• Metadata edited by DART editors
• DART bitstream copied and deployed into various publications
• Copies are reduced, cropped, given hotspots in Photoshop, etc.


Asset 2

• Digital video interview with von Furer-Haimendorf (http://www.lib.cam.ac.uk)
• 1.3 hours


Asset 2

• Metadata copied to DART collection
• Metadata edited by DART editors
• Short video clips deployed in various publications
• DART keeps no copy of the original object


Asset 3

• Chapter of Sherpas Through Their Rituals by Sherry Ortner


Asset 3

• Bitstream and metadata created by DART
• Re-publication rights secured by DART
• Scanning done by DART
• Archival responsibility assumed by DART


Exposing Items in DART Library to Other Systems

• Complicated relationships between source files and derivations
• Versioning, entropy
• Redundancy and degradation (importing a large file and passing along a small file)
• Even more complicated relationships between source file metadata and derivation file metadata


Expressing Relations Among Versions and Derivations

• DART metadata schema = extension of Dublin Core element set
• derivedFrom tag
• Plan to offer OAI harvesters DART schema in addition to OAI_DC
• Now cataloging and tracking derivation information
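Offering a second schema alongside OAI_DC means a harvester chooses between them with the `metadataPrefix` argument of an OAI-PMH request. The sketch below builds such request URLs; the endpoint URL and the prefix name `dart` are assumptions for illustration, while `verb`, `metadataPrefix`, and `set` are standard OAI-PMH arguments.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.org/oai"  # hypothetical repository endpoint


def list_records_url(metadata_prefix: str, set_spec: str = "") -> str:
    """Build an OAI-PMH ListRecords request for the given metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{BASE_URL}?{urlencode(params)}"


print(list_records_url("oai_dc"))  # the format every OAI-PMH repository must support
print(list_records_url("dart"))    # "dart" is an assumed name for the extended schema
```

A harvester that only understands Dublin Core keeps asking for `oai_dc`; one that wants derivation information would ask for the richer format.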


derivedFrom element

• URI of source file
  – Another DART item
  – An item in an outside system (URI may be download page)
• Date copy was made
• Description of alterations, copy methods, purpose, etc.
• Analogous to OAI provenance tag
  – OAI provenance : metadata :: derivedFrom : bitstreams
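A derivedFrom element carrying the three pieces of information listed above could be assembled as in the following sketch. Only the element name `derivedFrom` comes from the slides; the namespace URI, the child element names (`source`, `date`, `description`), and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

DART_NS = "http://example.org/dart/terms/"  # hypothetical namespace URI
ET.register_namespace("dart", DART_NS)


def build_derived_from(source_uri: str, copy_date: str, description: str) -> ET.Element:
    """Describe where a bitstream came from and how it was altered."""
    element = ET.Element(f"{{{DART_NS}}}derivedFrom")
    fields = (
        ("source", source_uri),        # URI of the source file
        ("date", copy_date),           # date the copy was made
        ("description", description),  # alterations, copy method, purpose
    )
    for name, value in fields:
        child = ET.SubElement(element, f"{{{DART_NS}}}{name}")
        child.text = value
    return element


# Hypothetical sample values.
derived = build_derived_from(
    "http://example.org/objects/map001",
    "2005-04-01",
    "Resized and cropped for use in a slide show",
)
print(ET.tostring(derived, encoding="unicode"))
```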


OAI provenance

• Describes metadata provenance
• Assumes fixed object, mobile metadata
• 0 provenance tags for a copy made for the purpose of alteration and incorporation
• Problem of metadata
  – Source metadata used to “seed” derivation metadata
  – Can’t record this kind of provenance through OAI provenance
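For comparison, the sketch below builds a provenance container of the kind defined in the OAI-PMH implementation guidelines, which records where a harvested metadata record came from and whether it was altered. It says nothing about a copied or modified bitstream, which is the gap described above. The element and attribute names follow the OAI provenance schema; the base URL and harvest date are hypothetical values.

```python
import xml.etree.ElementTree as ET

PROV_NS = "http://www.openarchives.org/OAI/2.0/provenance"
ET.register_namespace("prov", PROV_NS)  # the prefix is an arbitrary choice


def build_provenance(base_url, identifier, datestamp, metadata_namespace,
                     harvest_date, altered):
    """Describe the origin of a harvested metadata record (not of a bitstream)."""
    provenance = ET.Element(f"{{{PROV_NS}}}provenance")
    origin = ET.SubElement(
        provenance,
        f"{{{PROV_NS}}}originDescription",
        {"harvestDate": harvest_date, "altered": "true" if altered else "false"},
    )
    fields = (
        ("baseURL", base_url),
        ("identifier", identifier),
        ("datestamp", datestamp),
        ("metadataNamespace", metadata_namespace),
    )
    for name, value in fields:
        ET.SubElement(origin, f"{{{PROV_NS}}}{name}").text = value
    return provenance


prov = build_provenance(
    base_url="http://example.org/oai",      # hypothetical source repository
    identifier="oai:lib.uchicago.edu:ta013",
    datestamp="2004-10-08T18:50:13Z",
    metadata_namespace="http://www.openarchives.org/OAI/2.0/oai_dc/",
    harvest_date="2005-04-01T00:00:00Z",    # hypothetical harvest date
    altered=True,                           # metadata was edited after harvesting
)
print(ET.tostring(prov, encoding="unicode"))
```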


Exposure of Others’ Metadata

oai:lib.uchicago.edu:ta013
2004-10-08T18:50:13Z
dsal
dsal:hensley
http://pi.lib.uchicago.edu/1001/org/dsal/ima...
Gate into Taj grounds
...
The University of Chicago Library
No rights to the use of these...


Exposure of DART’s Metadata

oai:dart.columbia.edu:dart0023
...
https://dart.columbia.edu/main/DART-0023.html
Photograph of Gate Into Taj Grounds
...
This image was resized to 700 by 800 pixels, and cropped around a sketch at the corner of a notebook...
http://pi.lib.uchicago.edu/1001/org/dsal/images/hensley/ta013
2004-10-07T06:05:04Z


Open Publications

• Potential for publication-based harvesting
• “Dissolve” a publication into a set of decontextualized digital objects
• Many points of alignment between publication and archival processes
• Publications can supply as well as re-purpose archived material


dart.columbia.edu
