NDSA:Digital Preservation Page -- draft outline

From DLF Wiki
Revision as of 15:18, 11 February 2016 by Dlfadm (talk | contribs) (29 revisions imported: Migrate NDSA content from Library of Congress)



===THIS PAGE IS NOW OBSOLETE. FURTHER ADDITIONS AND CHANGES ARE BEING ADDED TO THE GOOGLE DOCS VERSION===

https://docs.google.com/a/apps.cul.columbia.edu/document/d/1efjrPtREvTdz8TN2KfuZ4GlarqEEfImFC7wGQYm4PNo/edit?pli=1

It will be deleted by August 1, 2012. (S. Davis, 2012-07-05)



Scope of article

This article addresses basic issues relating to digital preservation. Related topics that are not dealt with in this article include: intellectual property issues, privacy, selection for preservation, asset management, content management.

Definition of digital preservation

(generic, high-level)

Digital preservation can be understood as the series of managed activities necessary to ensure continued access to digital materials for as long as necessary. [1] It combines policies, strategies and actions to ensure access to reformatted and born-digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time. [2] The domain of digital preservation encompasses content that has been digitized from pre-existing analog material as well as content that is created in digital form ("born-digital" content). [Arora, J. Digital Preservation: An Overview. From: Open Access to Textual and Multimedia Content: Bridging the Digital Divide, January 29-30, 2009, p. 108.]

Challenges of digital preservation

(generic, high level)

There are significant challenges to the task of digital preservation, both technical and economic.

Unlike traditional analog objects such as books or photographs, where the user has unmediated access to the content, a digital object always needs a software environment to render it. These environments keep evolving and changing at a rapid pace, threatening the continuity of access to the content. [ref= Becker, C. et al. Systematic planning for digital preservation. International Journal on Digital Libraries 10:133–157, December 19, 2009, p. 134. DOI 10.1007/s00799-009-0057-1] Physical storage media, data formats, hardware, and software all become obsolete over time, posing significant threats to the survival of the content. [Evans, Mark; Carter, Laura. The Challenges of Digital Preservation. Presentation at the Library of Parliament, Ottawa, December 2008.]

In the case of born-digital content (e.g., institutional archives, Web sites, electronic audio and video content, born-digital photography and art, research data sets, observational data) the enormous and growing quantity of content presents significant scaling issues.

Digital content can often present challenges to preservation because of its complex and dynamic nature, e.g., interactive Web pages, virtual reality and gaming environments, learning objects, social media sites. [ref= Arora, J. Digital Preservation, an Overview. Presented at: Open Access to Textual and Multimedia Content: Bridging the Digital Divide, January 29-30, 2009, p.111.]

The economic challenges of digital preservation are also great. Preservation programs require significant up-front investment to create, along with ongoing costs for data ingest, data management, data storage, and staffing. One of the key strategic challenges to such programs is the fact that while they require significant current and ongoing funding, their benefits accrue largely to future generations. [ref= Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information. Final Report of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access, February 2010, p. 35.]

Strategies for digital preservation

(generic, high level)

  1. Refreshing, cyclical re-copying
Refreshing is the transfer of data between two types of the same storage medium so there are no bitrate changes or alteration of data. [3] For example, census data might be transferred from an old preservation CD to a new one. This strategy may need to be combined with migration when the software or hardware required to read the data is no longer available or is unable to understand the format of the data. Refreshing will likely always be necessary due to the deterioration of physical media.
  2. Replication
  3. Content preservation versus object preservation
  4. Migration vs. emulation
Content preservation is generally achieved by one of two strategies, migration or emulation. Migration requires the repeated copying or conversion of digital objects from one technology to a more stable or current one, be it hardware or software. Each migration incurs certain risks and preserves only a certain fraction of the characteristics of a digital object. Emulation, the second important strategy, strives to reproduce all essential characteristics of the performance of a system, allowing programs and media designed for a particular environment to operate in a different, newer setting.
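
[Illustrative sketch, not part of the outline: the refreshing strategy described above amounts to copying content to a new medium and confirming that no bits changed. The Python below is one minimal way to express that, assuming the old and new media are both reachable as ordinary file paths and that SHA-256 is an acceptable fixity check. The function names sha256_of and refresh and the file names are hypothetical, chosen only for this example.]

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target and confirm the bits are unchanged.

    Returns the shared checksum; raises ValueError on a mismatch.
    """
    before = sha256_of(source)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise ValueError(f"fixity check failed for {target}")
    return before


if __name__ == "__main__":
    # Stand-ins for the old and new storage media (hypothetical file names).
    with tempfile.TemporaryDirectory() as workdir:
        old_copy = Path(workdir) / "old_medium.dat"
        new_copy = Path(workdir) / "new_medium.dat"
        old_copy.write_bytes(b"census data, 1990\n")
        print(refresh(old_copy, new_copy))
```

[A checksum comparison of this kind only shows that the copy is bit-identical. It says nothing about whether the format can still be rendered, which is where migration or emulation comes in.]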

Identification of digital preservation communities

(e.g., research libraries, national libraries, archives, governments, scientific communities, geospatial and observational data communities, architecture and design industry, video and film industry, broadcast industry)

Research library and “memory institution” digital preservation efforts

  1. History of engagement / involvement; relationship to institutional repository movement
  2. Organizations engaged in digital preservation planning (U.S. only? See Initiatives and Programs below), e.g., NDSA, PASIG, etc.
  3. Use cases (converted analog, born-digital documents, images, audio-visual material, data sets, observational data, electronic records, email, CAD-CAM content, digital games, mixed archival collections; digitization as sole preservation strategy for audio and moving images; computer software, dance performances; Web sites; social media archives; databases)
  4. Issues, assumptions, approaches, best practices [THIS SECTION COULD BE BETTER STRUCTURED]
    1. Data integrity, provenance, versioning
    2. Metadata considerations (types of metadata, objectives of metadata)
    3. “Dark archiving” versus access-oriented strategies
    4. Digital file format preservation issues
    5. Digital forensics
  5. Current and evolving technical standards (discussion)
    1. [branch to listing of individual standards and practices]
  6. “Trusted digital repository” framework
In 2007, CRL/OCLC published Trustworthy Repository Audit & Certification: Criteria & Checklist (TRAC), a document allowing digital repositories to assess their capability to reliably store, migrate, and provide access to digital content. TRAC is based upon existing standards and best practices for trustworthy digital repositories and incorporates a set of 84 audit and certification criteria arranged in three sections: Organizational Infrastructure; Digital Object Management; and Technologies, Technical Infrastructure, and Security [1]. TRAC provides tools for the audit, assessment, and potential certification of digital repositories, establishes the documentation required for audit, delineates a process for certification, and establishes appropriate methodologies for determining the soundness and sustainability of digital repositories [2].
Footnotes:
--OCLC and CRL. (2007). Trustworthy Repository Audit & Certification: Criteria & Checklist. Accessed on April 16, 2012 from http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf
--Philips, Stephen C. (2010). Service level agreements for storage and preservation, p. 13. Accessed on May 1, 2012 from http://www.prestocentre.org/library/resources/service-level-agreements-storage-and-preservation
ALSO INCLUDE: Metrics for assessment, Certification strategies
  7. Digital curation
  8. Preservation of original hardware and software access systems
  9. Storage and OS considerations
  10. Sustainability and economic models for preservation
  11. Open source systems and tools (e.g., Fedora, JHOVE, PRONOM)
  12. Vendor-provided systems and tools (e.g., Rosetta)
  13. Preservation Initiatives and programs
    1. United States
      1. NDSA, LOCKSS, HathiTrust, Portico, MetaArchive, CDL, Internet Archive, CRL, consortia, etc. -- mostly links to other articles
    2. United Kingdom
      1. Digital Preservation Coalition (DPC)
    3. Europe
      1. CASPAR, PLANETS, TIMBUS
    4. [Other countries, regions]
  14. Preservation-oriented conferences and meetings (e.g., iPres)
  15. Granting agencies supporting digital preservation

Other Preservation-Related Domains and Communities

[NOTE: The following are additional communities with somewhat different considerations and approaches in the area of digital preservation. While domain-specific published and de facto standards may in many cases be the same as those used in the research library community (and could be referenced from within the sections below), best practices and use cases will differ. NDSA would not necessarily take any responsibility for these sections, so they are for now notional.]

  1. Digital preservation in the scientific and geospatial community
  2. Digital preservation in the architecture community
  3. Digital preservation in the domain of “personal digital preservation” (e.g., LC’s personal preservation initiative)
  4. Digital preservation in the broadcast media community
  5. Digital preservation in the audio engineering industry
  6. [Others contributed by other communities]