11.07.2015 Views

Digital Repository Best Practices for Cultural Heritage Organizations

Digital Repository Best Practices for Cultural Heritage Organizations

Digital Repository Best Practices for Cultural Heritage Organizations

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Introduction: Assumptions and DefinitionsThis white paper is based on knowledge and experience I have gained in digitalrepository work and digital repository community involvement over the pastdecade. The contents of this report could be backed up by citations to journalarticles and research papers. However, the intent is to offer practical advice <strong>for</strong> theComputer History Museum (CHM) to use in digital repository planning. Rather thaninterrupt the flow of the narrative with references, I have appended an annotatedbibliography. Thanks to Heather Yager <strong>for</strong> surveying the literature, collecting copiesof the resources, and extracting salient points from them.The term digital repository is used to mean a variety of things. In academicenvironments, digital repositories are often systems where researchers depositarticles and papers. In archives, digital repositories are most often systems tomanage digital assets‐‐some born digital such as email correspondence, and somedigital surrogates of other archival material such as photographs or documents.In museums, digital surrogates from the museums’ collections are commonly held indigital repositories. Some museums also collect born digital artifacts such as timebasedart or software. Sometimes these repositories include preservationcomponents, and sometimes they are focused on processing workflows. In certaincases, the repositories include “end user” access to the digital content, while in othersituations access might be provided by another separate system.For the purposes of this paper, and to meet CHM needs, we will define the digitalrepository as the systems and workflows that support digital asset management anddigital preservation. Although access to CHM digital collections is important, enduser access is not the focus of the digital repository planning project. <strong>Digital</strong> contentproduction such as creating digital surrogates of objects in CHM collections, orproduction of digital video content is also excluded from this report.I will begin the paper with background in<strong>for</strong>mation about the history of digitalrepositories in cultural heritage organizations. This section will include lessonslearned over the past two decades, and current thinking about digital repositories.Then I will suggest practices that should serve CHM well as the Museum plans <strong>for</strong>the next level of digital asset management and preservation <strong>for</strong> digital collections. Iwill present these recommendations within the context of emerging best practicesin the cultural heritage community.BackgroundOver the past two decades, leaders in museums, libraries, and archives haveemphasized the role cultural heritage organizations must play in the managementand preservation of digital collections, continuing the role they have long played in<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 2 of 19Computer History Museum February 2012


designed to be more extensible, with application programming interfaces (APIs)that enable separate but related systems to share data. In addition, many opensource tools are available to fill specific requirements such as <strong>for</strong>mat verification orpackaging <strong>for</strong> ingest.Current best practice when building or selecting a digital repository system is toemphasize extensibility and flexibility. That way, as digital repository systemscontinue to mature, it is possible to swap out parts, without having to replace thewhole. More and more, digital repository developers are being asked to integrateexisting tools and services into workflows, and become active in the open sourcecommunities that enhance and maintain them.Establish risk assessment policies and procedures:Risk assessment policies and procedures must strike a balance to preserve what isinherently valuable while recognizing resource constraints.Risk assessment includes physical evaluation of digital material when it first arrivesat the CHM, and analysis of file <strong>for</strong>mats. Physical evaluation is essential to determineif media is stable. It is also necessary <strong>for</strong> determining whether the data can be read.For example, it may be difficult to migrate data from an obsolete hand‐held device toa CHM storage system.Policies and procedures could set time limits <strong>for</strong> attempting to preserve data fromdonated devices, or could suggest seeking support <strong>for</strong> recovering the data throughoutsourced services when such a donation is accepted. In the mean time, bestpractice would be to store the obsolete device. Although the data it contains may bevaluable, migrating the data to the digital repository would become a lower priorityuntil resources could be obtained.File <strong>for</strong>mat risk assessment can in<strong>for</strong>m the level of service the CHM defines <strong>for</strong>certain kinds of digital content. The CHM may wish to establish a list of wellunderstoodfile <strong>for</strong>mats such as those that are generated in‐house during mediaproduction or digitization. These file types would be considered low risk <strong>for</strong>successful preservation. While CHM will almost certainly ingest other less wellknownfile <strong>for</strong>mats (interactive games, obsolete software) the donor agreementsand associated service level agreements <strong>for</strong> these file types could state the riskassociated with preserving them. A basic preservation system is likely to be able tostore and retrieve files in a wide range of <strong>for</strong>mats. Enhanced systems and serviceswould be needed to create emulation environments, or to <strong>for</strong>ward‐migrate files.<strong>Digital</strong> preservation tools and frameworks offer criteria <strong>for</strong> digital collection riskassessment, pre‐and post‐ingest. Using these frameworks as a guide, it is importantto create local policies and procedure to assess risk levels <strong>for</strong> incoming objects and<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 6 of 19Computer History Museum February 2012


collections. Risk assessment is important in prioritizing preservation treatment <strong>for</strong>digital material.Prefer throughput over exhaustive up­front preparation (just in time, not justin case):Use collection policies to in<strong>for</strong>m decisions about levels of treatment <strong>for</strong> digitalcollections.Early digital preservation system implementers learned that requiring extensivepreparation <strong>for</strong> digital objects be<strong>for</strong>e they can be deposited often translates intobacklogs of digital objects that are not under any kind of control or protection.Rather than expecting all objects to be re<strong>for</strong>matted, or described in detail,particularly when future use <strong>for</strong> the object may not be well understood, it is betterto ingest the objects using streamlined low‐barrier‐to‐entry processes. Often, thisinvolves little more than providing minimal level metadata to insure that objects canbe identified and retrieved, generating a unique identifier <strong>for</strong> each object, andvalidating, and virus‐checking files.Metadata StandardsConsider Dublin Core <strong>for</strong> descriptive metadata and PREMIS <strong>for</strong> preservationmetadata. Technical and structural metadata <strong>for</strong>mat choices can be based on what issupported by the digital repository systems that are selected.Metadata <strong>for</strong> digital repositories may vie with storage as the areas with the leastbest practices consensus. For descriptive and technical metadata, choose <strong>for</strong>matsthat meet management and access needs and that workflow tools support.It is important that metadata content standards be flexible enough to allow digitalcollections to be ingested and processed quickly, especially if the material isvulnerable (e.g. on unstable media). In these cases, minimal descriptive metadatasuch as title, creator, donor, and date, and a system generated unique identifier isenough to insure the material can be retrieved <strong>for</strong> further processing in the future.Thorough descriptive metadata <strong>for</strong> certain categories of digital assets, such asobjects that will be used in educational programs, or included in exhibits may berequired. However, expecting that all collections receive the same treatment couldbe an obstacle to timely ingest of at risk material.Descriptive metadata standards include Dublin Core, Metadata Object DescriptiveSchema (MODS), and Encoded Archival Description (EAD) <strong>for</strong> archival materials.Although a range of descriptive metadata <strong>for</strong>mats have been developed <strong>for</strong>specialized material, because of the diversity of CHM collections, the CHM would bebest served by choosing a flexible descriptive metadata <strong>for</strong>mat. Dublin Core would<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 7 of 19Computer History Museum February 2012


e a good choice. In addition to being flexible, it is already in use at the CHM. If EADrecords are needed <strong>for</strong> shared access to archival collections, these records may needto be created in a system that is separate from the core digital repository system asnot all digital repository software packages support the EAD <strong>for</strong>mat.For technical metadata such as capture in<strong>for</strong>mation, the Museum should implementsimilarly flexible policies. When digital surrogates are generated in‐house orthrough contracted agreements, capture in<strong>for</strong>mation should be retained as technicalmetadata. In the case of donated objects or collections, this kind of in<strong>for</strong>mationshould be recorded if available. However, it may be more important to ingestwithout full technical metadata than to wait to ingest until this in<strong>for</strong>mation could beobtained.The array of choices <strong>for</strong> storing technical metadata is even more confusing thanthose <strong>for</strong> descriptive metadata. Technical metadata may be stored in an xmldocument, or wrapped in a METS wrapper. In Fedora‐based systems, technicalmetadata can be stored in a “FOXML” datastream. Either option is perfectlyreasonable. The systems and software that are best suited to CHM requirements willlikely determine the technical metadata <strong>for</strong>mat.PREMIS is the de facto standard <strong>for</strong> preservation metadata and the Museum shouldplan to implement some version of PREMIS <strong>for</strong> the preservation system. Irecommend a lightweight PREMIS implementation such as the Stan<strong>for</strong>d model.PREMIS supports a level of complexity in preservation metadata that would bedifficult to sustain. <strong>Best</strong> practice is to create local preservation policies andprocedures and document them within the PREMIS framework.<strong>Best</strong> practices <strong>for</strong> structural metadata may also vary, from none required <strong>for</strong> singleimages, to a range of solutions <strong>for</strong> representing complex relationships.Requirements are often based on what is needed <strong>for</strong> proper display and use of theobjects. Complex relationships can be recorded in different ways, depending on thesystem or workflow used <strong>for</strong> processing. For digital artifacts such as interactivegames, best practices have yet to be established.Many systems support METS <strong>for</strong> structural metadata, although Fedora‐basedsystems require METS packages to be broken apart into separate data streams <strong>for</strong>ingest. Here again, best practice is to choose a system that will support the rangeand level of structural metadata needed <strong>for</strong> the digital repository; the metadata<strong>for</strong>mat is less important than the in<strong>for</strong>mation that can be stored within it. Figure 2:Metadata in the Workflow shows where different metadata <strong>for</strong>mats and containerscomeinto play in a digital repository workflow.<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 8 of 19Computer History Museum February 2012


Figure 2: Metadata in the workflowAt this point, the CHM can af<strong>for</strong>d to be flexible in thinking about what metadata<strong>for</strong>mats to implement. EAD may be required to make archival collectionsdiscoverable, but METS might not be needed <strong>for</strong> structural metadata, <strong>for</strong> example.As digital repository software is evaluated, CHM staff should make certain thesystem provides support <strong>for</strong> the level of structural metadata that is required in a<strong>for</strong>mat that will be easy <strong>for</strong> other systems, such as the CHM website, to use.Validation and VerificationCollection policies should determine the level of <strong>for</strong>mat validation the digitalrepository supports. When the CHM controls production of the artifacts, such assurrogates of physical objects, automatic <strong>for</strong>mat validation should be part of theworkflow. Registries and validation tools <strong>for</strong> well‐known <strong>for</strong>mats are available andcan be integrated into the digital repository workflow. For other collections such ashistoric software, <strong>for</strong>mat validation should not be required. Tools <strong>for</strong> checking donot necessarily exist, it would be expensive to produce them, and the software isunlikely to be rejected due to lack of validation. I there<strong>for</strong>e recommend that theMuseum plan to implement different levels of <strong>for</strong>mat validation <strong>for</strong> some collectiontypes than <strong>for</strong> others.Following best practices <strong>for</strong> ingest, it is important to check <strong>for</strong> viruses, per<strong>for</strong>mfixity checks on incoming files, and to generate a unique identifier <strong>for</strong> the object.Preservation <strong>Practices</strong><strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 9 of 19Computer History Museum February 2012


Unless future uses <strong>for</strong> digital objects are well understood, it is best to defer creationof emulation environments or other activities that involve research anddevelopment. While few organizations can af<strong>for</strong>d to create emulation environmentsin operational workflows, as long as the objects are preserved, they can be retrieved<strong>for</strong> enhancement later, when specific use cases arise.Align digital collection policies and procedures with policies and procedures<strong>for</strong> non­digital collections:Reference existing Museum policies and procedures when creating policies andprocedures <strong>for</strong> digital collection management and preservation to insure thatcollections in all areas are handled with the same level of attention.Particularly in smaller organizations, aligning digital collection policies andprocedures with existing policies and procedures can leverage scarce resources.Specialized systems will be needed to store digital assets, and asset managementsystems that are used <strong>for</strong> analog resources may not be extensible to digital assets.As a result, processing workflows <strong>for</strong> digital collections may differ from those <strong>for</strong>analog resources.It may make sense to create and maintain separate documentation <strong>for</strong> digitalrepository systems, processes and procedures. Even so, the more similar workflowsand processes can be, the easier it is to cross‐train staff and provide flexibility <strong>for</strong>handling both analog and digital collections, including providing metadata todiscovery and delivery systems.Make storage basics a high priority:It is imperative that all digital collections be migrated to a robust storage systemthat is backed up.Some CHM collections are currently at risk because they are not being stored onprofessional IT grade media. Be<strong>for</strong>e CHM will be ready to implement a digitalrepository, it is imperative that all digital collections be migrated to a robust storagesystem that is backed up, preferably to tape that is then stored in an off‐site location.Once this fundamental best practice is in place, the system can be enhanced withpreservation management capabilities.When selecting preservation software <strong>for</strong> the CHM, it may be necessary to considerproprietary solutions, even though open source is preferred. Specificrecommendations from the Storage Consultant will be useful in this area.Storage may be the area where best practices remain the least clear. That said, thefirst priority <strong>for</strong> a digital repository system is to insure that files are organized in away that makes retrieval possible at the object as well as the file level. In addition, a<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 10 of 19Computer History Museum February 2012


egular and robust backup procedure using proven mainstream IT technologies andmedia such as data tape is critical to digital repository operation.At the next level, a storage system that includes preservation functions should beimplemented. Despite the growing popularity of cloud storage <strong>for</strong> personalcollections of music, photographs, and other data, preservation professionals havebeen slow to adopt cloud solutions, even <strong>for</strong> redundancy. Thus, current best practiceis to replicate data, preferably using different technology stacks to create each copy.There is no hard and fast rule about how many copies are enough and there is littlein the literature to suggest that three copies are better than two, <strong>for</strong> example.“Lots of copies keeps stuff safe”, the slogan behind the LOCKKSS acronym maydescribe a suitable practice <strong>for</strong> data that is already natively redundant, andrelatively small. Licensed text‐based content that the LOCKSS project has focused onis ubiquitous. A number of organizations can capture and replicate this data in apeer‐to‐peer network without much ef<strong>for</strong>t. The same strategy will not work <strong>for</strong>unique collections, especially collections involving large files.Consequently, best practice in this area must be in<strong>for</strong>med by what is practical andaf<strong>for</strong>dable <strong>for</strong> the organization. A preservation master on spinning disk or tape, andat least one other copy that is stored in a separate location, but is accessible by asystem that allows audit would meet minimum level requirements <strong>for</strong> apreservation system. Beyond that, questions about how many additional copies areneeded, and whether access copies such as lower resolution versions of image filesneed to be preserved must be considered in the context of organizational missionand resources.Establish a clear rights policy:The Computer History Museum must create rights policies consistent with riskassessment levels and resource allocation.Rights issues are complex. Although best practices can provide some guidance, theCHM must decide the level of risk it is willing to assume, what level of resources theorganization can dedicate to resolving rights questions, and how vulnerable theresources are.For example, will digital collections or commercial software that appear to havebeen copied be accepted? If the objects have no provenance, or the provenance isuncertain, will CHM ingest them <strong>for</strong> preservation but not provide access? Will rightsbe sought to copy software <strong>for</strong> preservation purposes, or will the CHM take theposition that copying <strong>for</strong> preservation is allowable?<strong>Digital</strong> preservation professionals argue that best practices from a preservationstandpoint make it important <strong>for</strong> organizations to preserve digital material, even if<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 11 of 19Computer History Museum February 2012


the rights to do so are not entirely clear. However, most people in the field recognizethat the organization’s legal advisors must make the final determination what theorganization’s policies should be.The Museum has already established rights policies and procedures within theGuidelines <strong>for</strong> Documenting <strong>Digital</strong> Exhibit Artifacts. These policies and proceduresare aligned with current best practices and could serve as a model <strong>for</strong> generalizedrights policies and procedures <strong>for</strong> the digital repository.ConclusionThere are not yet, and perhaps never will be a set of unified digital repository bestpractices. Organizational needs are too diverse to support a single group ofrecommendations that would function well in all environments. However, broadframeworks and reference models do provide overall guidance. Creating localpolicies and procedures that fit within general frameworks such as PLATTER andOAIS will insure that a digital repository meets functional requirements that areaccepted by the community. In addition, a modular design will create a digitalrepository that is sustainable and extensible. With preservation as a key service, it isalso imperative as a best practice, to implement a robust storage and backup systemfollowing professional IT practices.Beyond these fundamentals, best practices <strong>for</strong> digital repositories involve selectingsystems to support workflows <strong>for</strong> efficient and effective asset management andpreservation. This is where organizational needs must be translated into policiesand procedures <strong>for</strong> managing and preserving digital collections within the contextof the Museum’s mission.<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 12 of 19Computer History Museum February 2012


Appendix: Annotated Bibliography<strong>Digital</strong> <strong>Repository</strong> best practices are still very much a work in progress. Although itmay appear that there is little consensus, best practices are emerging through theef<strong>for</strong>ts of digital preservation pioneers. Based on this early experience, thecommunity has shifted from a focus on digital preservation to digital curation. Thisshift puts preservation into context with the full digital object lifecycle. Along withthis shift, best practices have become less absolute, and more situational. In thefollowing bibliography, the resources issued after about 2005 are more likely toreflect current thinking in the community than those produced be<strong>for</strong>e that date.Organizational web sitesThere is no one organization in charge of promulgating digital repository bestpractices <strong>for</strong> the cultural heritage community. However, a number of organizationsplay key roles. These organizations frequently post white papers on their web sitesand sometimes offer blogs and webinars on digital repository topics.The Library of Congress <strong>Digital</strong> Preservation web site links to white papers on arange of digital repository topics such as digital <strong>for</strong>mat sustainability and rights. TheLibrary of Congress also lists relevant conference and educational offerings on thissite, links to The Signal blog—a great resource <strong>for</strong> keeping up to date on digitalrepository issues. Library of Congress @ndiipp recent tweets appear on the websiteas well, highlighting timely topics of interest. Although focused on digitalpreservation programs in libraries and archives, the web site contains muchpractical advice of use to the broader cultural heritage community, and even toindividuals interested in preserving personal digital collections such asphotographs.Based in the UK, the <strong>Digital</strong> Curation Centre focuses on managing digital researchdata. However, their white papers and reference models can be helpful inunderstanding management and preservation choices <strong>for</strong> born‐digital content. DCCalso lists training opportunities, although their listings tend to be <strong>for</strong> events held inEurope.Traditionally associated with libraries, OCLC Research has expanded its mission toinclude archives and museums. Their focus is much broader than digital repositorymanagement, so mining the web site <strong>for</strong> relevant in<strong>for</strong>mation can be somewhat timeconsuming. OCLC is a good resource <strong>for</strong> metadata standards in<strong>for</strong>mation, and canalso be a useful launch point <strong>for</strong> in<strong>for</strong>mation on standards <strong>for</strong> unique identifiers.Brian Lavoie has contributed to numerous articles and papers on the economics andsustainability of digital preservation. Reading a few of these recent reports can<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 13 of 19Computer History Museum February 2012


enhance understanding of the ongoing nature of digital repository management andoperation.Be<strong>for</strong>e the Research Libraries Group (RLG) merged with OCLC, the twoorganizations jointly produced Trusted <strong>Digital</strong> Repositories: Attributes andResponsibilities. Although published in 2002, this white paper has in<strong>for</strong>med manycurrent frameworks <strong>for</strong> establishing trust in repository operations and is stillconsidered a classic.The Coalition <strong>for</strong> Networked In<strong>for</strong>mation (CNI) is a membership organization aimedat academic libraries and in<strong>for</strong>mation technology organizations. The ExecutiveDirector, Cliff Lynch, monitors issues of interest to these types of organization,including digital curation. Of late, CNI has focused significant attention onpreserving scientific data. However, earlier task <strong>for</strong>ce reports and papers onbuilding sustainable digital preservation solutions may be of interest. The web sitealso links to videos and podcasts of talks from CNI project briefings from CNImeetings held twice a year.Other notable resourcesEvery year, Charles W. Bailey, Jr. publishes a <strong>Digital</strong> Curation and PreservationBibliography. The 2010 edition cites over 500 selected resources, published from2000‐2010.Members of the <strong>Digital</strong> curation Google group discuss everything from specifictechnical implementations such as debugging the BagIt tool, to how the term“curation” has come into general parlance. Over 500 members strong, this list is theplace to be to learn about trending topics, upcoming training opportunities and toget advice from other members of the digital curation community.Selected material on specific topicsThe articles listed below in<strong>for</strong>med the best practices recommendations in the<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> <strong>for</strong> <strong>Cultural</strong> <strong>Heritage</strong> <strong>Organizations</strong> report.<strong>Repository</strong> BasicsBall, A. (2010). Preservation and curation in institutional repositories (version 1.3).Edinburgh, UK: <strong>Digital</strong> Curation Centre.While this white paper focuses on preservation of scientific data, it provides a usefulreview of the state of digital repository development as of 2010. The paper isdescriptive rather than prescriptive, although it suggests preferences in some areassuch as metadata <strong>for</strong> complex objects by omitting some commonly used standards.<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 14 of 19Computer History Museum February 2012


The report includes in<strong>for</strong>mation about three specific repository solutions, EPrints,DSpace, and Fedora. It goes on to inventory digital preservation models such as OAIS,architectures, and planning tools, including PLATTER.The metadata section of the report includes in<strong>for</strong>mation about <strong>for</strong>mat registries thatmight be of use when deciding which preservation <strong>for</strong>mats to support <strong>for</strong> objects theCHM controls (see <strong>for</strong>mat verification section of <strong>Best</strong> <strong>Practices</strong> document). The articlecovers Dublin Core as a descriptive metadata option in the most detail. Predictably,PREMIS is emphasized <strong>for</strong> preservation metadata. Somewhat surprisingly, there is notmention of METS <strong>for</strong> complex metadata representation, with OAI­ORE the onlystandard that is described.The remainder of the paper reviews tools <strong>for</strong> extracting and capturing a range ofmetadata, from technical to descriptive. It goes on to discuss what might be calledlevels of preservation, from bit preservation to emulation to migration and wraps upwith an inventory of audit tools and resources <strong>for</strong> understanding costs and benefits ofdigital repositories.Consultative Committee <strong>for</strong> Space Data Systems. (2002). Reference model <strong>for</strong> anOpen Archival In<strong>for</strong>mation System (OAIS). Washington, DC: CCSDS.This reference model continues to define the essential functions a digital repositoryshould support.Cothey, V. (2010). <strong>Digital</strong> curation at Gloucestershire Archives: From ingest toproduction by way of trusted storage. Journal of the Society of Archivists. 31(2), p.207‐228.This article is an excellent how­to use case <strong>for</strong> using OAIS and PLATTER to implementa digital repository.The Shift from Preservation to CurationAbrams, S., Cruse, P., & Kunze, J. (2009). Preservation is not a place. Internationaljournal of digital curation, 4(1). Retrieved fromh ttp://www.ijdc.net/index.php/ijdc/article/view/98The authors explain the design changes they are making in the Cali<strong>for</strong>nia <strong>Digital</strong>Library digital repository prompted by community shifts that emphasize curationrather than preservation.Lee, C.A. & Tibbo, H. (2007). <strong>Digital</strong> curation & trusted repositories. Journal of digitalin<strong>for</strong>mation, 8(2). Retrieved fromh ttp://journals.tdl.org/jodi/article/view/229/183<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 15 of 19Computer History Museum February 2012


This article is an introduction to a special issue of JoDI. It puts digital curation incontext and explains the community shift from preservation to curation.Rubin, N. (2009). Preserving digital public television: Not just an archive. Librarytrends, 57(3). Retrieved from https://www.ideals.illinois.edu/handle/2142/13605Specific to the television production industry, this article includes general in<strong>for</strong>mationabout the way moving to digital production causes a shift in thinking about preservingdigital video. The shift involves providing consistent access to what might previouslyhave been thought of as archival material. This rein<strong>for</strong>ces the lifecycle management,or curation approach.The project was a pioneering ef<strong>for</strong>t to create file wrappers (structural metadata) <strong>for</strong>digital video and recommend preservation metadata standards <strong>for</strong> this type ofmaterial. The experimental repository handled both high­resolution masters, and lowresolutiondistribution copies of each digital title. To meet the local needs of theproject, PBCore was used as the descriptive metadata <strong>for</strong>mat, PREMIS was used <strong>for</strong>preservation metadata, METSRights was used <strong>for</strong> rights, and METS was used as thewrapper. These choices were presented in the context of this particular project, notnecessarily as a best practice <strong>for</strong> digital video curation. The article touches on rightsand sustainability issues but does not recommend best practices in either of theseareas.<strong>Best</strong> <strong>Practices</strong> <strong>for</strong> Establishing TrustJantz, R. & Giarlo, M. (2007). <strong>Digital</strong> archiving and preservation: Technologiesand processes <strong>for</strong> a trusted repository. Journal of archival organization. 4(1‐2) p.193‐21. doi:10.1300/J201v04n01_1Jantz and Giarlo put what it means to develop a trusted repository in context throughthe example of the Fedora­based preservation repository developed at Rutgers. Whilethey restrict their examples and discussion to surrogate digital objects, not borndigitalcontent, many of the solutions they suggest would be broadly applicable. Theymake the case that good storage policies need input from archivists and in<strong>for</strong>mationtechnology professionals, and emphasize the importance of persistent identifiers inestablishing authenticity, and reliability.Research Libraries Group. (2002). Trusted digital repositories: Attributes andresponsibilities. RLG: Mountain View, CA. Retrieved fromh ttp://www.oclc.org/research/activities/past/rlg/trustedrepositories.pdfAlthough this paper was written be<strong>for</strong>e the preservation to curation shift, and RLG isnow part of OCLC, this report stands as a useful framework <strong>for</strong> thinking about how anorganization can fulfill its responsibilities to the community it serves in terms ofpreservation services.<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 16 of 19Computer History Museum February 2012


Rosenthal, C., Hutař, J. & Blekinge‐Rasmussen. (2008). Planning a Trusted <strong>Digital</strong><strong>Repository</strong> with PLATTER. <strong>Digital</strong> Preservation Europe. Retrieved fromhttp://www.slideshare.net/<strong>Digital</strong>PreservationEurope/platter‐colin‐rosenthalpresentationThis slide presentation explains how the PLATTER framework can be used to establishorganizational policies and procedures to insure repository services are trustworthy.Trend toward Modular SolutionsCramer, T., & Kott, K. (2010). Designing and implementing second generation digitalpreservation services: A scalable model <strong>for</strong> the Stan<strong>for</strong>d <strong>Digital</strong> <strong>Repository</strong>. D­LibMagazine, 16(9/10). Retrieved fromh ttp://www.dlib.org/dlib/september10/cramer/09cramer.htmlIn addition to presenting arguments <strong>for</strong> building modular digital asset managementsystems, this article makes the case <strong>for</strong> minimal treatment of some collections to insurethey can be ingested <strong>for</strong> preservation in a timely manner.Mikhail, Y., Adly, N., & Nagi, M. (2011). DAR: Institutional repository in action. In S.Gradmann, F. Borri, C. Meghini, & H. Schuldt (Eds.), Research and advancedtechnology <strong>for</strong> digital libraries. (pp. 348‐359). Berlin: Springer‐Verlag. Retrievedfromhttp://www.bibalex.org/isis/UploadedFiles/Publications/DLF36_DAR%20Institutional%20<strong>Repository</strong>%20Integration%20in%20Action.pdfBibliotheca Alexandrina (BA) also developed a modular solution <strong>for</strong> their “<strong>Digital</strong>Asset <strong>Repository</strong>” (DAR). They translate the OAIS model into four somewhat differentlydivided components:• <strong>Digital</strong> Assets Factory (DAF) <strong>for</strong> digitization and ingestion,• <strong>Digital</strong> Assets Metadata (DAM) subsystem <strong>for</strong> metadata management,• <strong>Digital</strong> Assets Publishing (DAP) components to enable discovery and deliverysystems, and• <strong>Digital</strong> Assets Keeper (DAK) <strong>for</strong> object storage and versioning.The article explains how they implemented a repository, including architecture(Fedora­based) and metadata choices.Metadata ResourcesDonaldson, D., Conway, P. (2010). Implementing PREMIS: A case study of the Florida<strong>Digital</strong> Archive. Library Hi Tech 28(2) p. 273‐28. doi 10.1108/0737883101104767While this article focuses on how developers at the Florida Center <strong>for</strong> LibraryAutomation implemented PREMIS when developing a preservation repository, it<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 17 of 19Computer History Museum February 2012


provides context <strong>for</strong> understanding PREMIS as an in<strong>for</strong>mation model. PREMIS itselfdoes not dictate preservation policy, nor does is offer implementation solutions. Itsimply provides a framework <strong>for</strong> expressing preservation and other metadata(including rights metadata) in xml.PREMIS Editorial Committee. (2011). Introduction and supporting material fromPREMIS Data Dictionary <strong>for</strong> Preservation Metadata. Version 2.1. Library of Congress:Washington, DC. Retrieved fromh ttp://www.loc.gov/standards/premis/v2/premis‐report‐2‐1.pdfBy removing nearly 200 pages from the full data dictionary, the Editorial Committeehas created a manageable length document that explains how to use PREMIS <strong>for</strong>preservation metadata. The report goes far beyond the practical application ofPREMIS. It puts PREMIS in the context of digital preservation, defines preservationconcepts, and includes a preservation glossary.Riley, J. (2009). Seeing standards: A visualization of the metadata universe.Bloomington, IN: Indiana University Libraries.Retrieved from http://www.dlib.indiana.edu/~jenlrile/metadatamap/This poster maps all the various metadata standards in use in various culturalheritage sectors. It gives a flavor <strong>for</strong> the range of choices available. The accompanyingglossary defines the standards and provides in<strong>for</strong>mation about what standards bodiesare responsible <strong>for</strong> maintaining them. Helpful <strong>for</strong> identifying metadata options.Smith‐Yoshimura, K. (2007). RLG Program Descriptive Metadata <strong>Practices</strong> Report.Dublin, OH: OCLC Programs and Research. Retrieved fromh ttp://www.oclc.org/research/publications/library/2007/2007‐03.pdfAn inventory of systems, descriptive metadata structures and content standards usedby RLG members describes a field without much cohesion, although Dublin Core doesseem to be emerging as the most prevalent non­MARC metadata structure.Wilson, A. (2010). How much is enough: Metadata <strong>for</strong> preserving digital data.Journal of library metadata. 10, p. 205‐217. doi 10.1080/19386389.2010.506395Wilson writes with a view towards recommending preservation metadata standards<strong>for</strong> preserving digital data based on sound archival record­keeping practices. Hecritiques PREMIS <strong>for</strong> the shortfalls he sees in the standard’s capacity to handleprovenance in<strong>for</strong>mation, which he sees as critical to establishing authenticity <strong>for</strong>digital objects. He is also critical of PREMIS and a particular implementation ofPREMIS in METS <strong>for</strong> falling short of what is needed to prove reliability, usability andintegrity. In his view, the metadata model should make provision <strong>for</strong> access. Noteveryone would agree with this approach, seeing other systems outside thepreservation system as the place where discovery and delivery metadata is maintainedand backed up, not necessarily preserved.<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 18 of 19Computer History Museum February 2012


Ide, M. & Weisse, L. (2006). Recommended appraisal guidelines <strong>for</strong> selecting borndigitalmaster programs <strong>for</strong> preservation and deposit with the Library of Congress.Boston: WGBH. Retrieved fromhttp://cn2.wnet.org/thirteen/ptvdigitalarchive/files/2009/10/appraisal‐guidelines‐final.pdfAssessment, Appraisal, Upstream RequirementsWhile developed <strong>for</strong> a specific appraisal process, this white paper contains backgroundmaterial on the archival appraisal process in general. It also puts appraisal of publictelevision programming in context with digital moving image and audio material. Itprovides a useful model <strong>for</strong> creating appraisal policies and procedures <strong>for</strong> complex,born­digital content.Marmor, J., Van Malssen, K. & Goldman, D. (2011). A preservation­compliant mediaasset management system <strong>for</strong> television production. New York: WNET. Retrievedfrom http://www.thirteen.org/ptvdigitalarchive/uncategorized/designing‐anddeploying‐a‐preservation‐compliant‐media‐asset‐management‐system‐<strong>for</strong>television‐production/127/This white paper makes recommendations <strong>for</strong> combining preparation of media files <strong>for</strong>preservation with the television production process. It is based on the experience ofingesting digital output from television production into the New York Universitypreservation repository. The NYU repository contains diverse content, but requiresthat metadata <strong>for</strong> ingest be presented in a standard <strong>for</strong>mat. The incoming metadatafrom different television production workflows required laborious re­work <strong>for</strong> ingest.The resulting recommendation was to push requirements <strong>for</strong> standardizationupstream, suggesting that production workflows should include creation and captureof standardized technical and descriptive metadata. Creating incentives <strong>for</strong> productionto take on the additional work of creating improved metadata could proveproblematic unless resources <strong>for</strong> production and preservation are viewed holistically,from a resource standpoint.Ross, H. & Thompson, D. (2010). Automating the appraisal of digital materials.Library Hi Tech. 28(2) p. 313‐32. doi 10.1108/0737883101104770On the way to exploring whether or not automated appraisal of digital materials isfeasible, this article makes a number of sensible preservation policy recommendations.The authors note that following sound archival principles is essential to goodpreservation policy and that OAIS should be used as a digital preservation framework.They suggest that both intellectual appraisal, evaluating the intrinsic value of thematerial, and technical appraisal, the question of whether the organization can af<strong>for</strong>dto preserve it, should be considered. While tools such as those developed through thePLANETS project, and JHOVE can be used <strong>for</strong> <strong>for</strong>mat validation, integrating such toolsinto an appraisal workflow involves a complex set of tasks by experienced developers.<strong>Digital</strong> <strong>Repository</strong> <strong>Best</strong> <strong>Practices</strong> Report Page 19 of 19Computer History Museum February 2012

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!