NDSA:Digital Preservation X-Challenges: Difference between revisions

From DLF Wiki
(→‎Digital Preservation X-Challenges: Pulled Micah and Micah's ideas for different challenges from the email list and posted them here)
Line 1: Line 1:
This is where this action team will make its plans.


==Digital Preservation X-Challenges==
==Action Team Members==
===Action Team Members===  
*Micah Beck, University of Tennessee  
*Jane Mandelbaum, Library of Congress  
Line 9: Line 8:
*Micah Altman, Harvard University
*Mike Smorul, University of Maryland
===Project Overview===  
==Project Overview==
This group will plan and launch a set of challenges and/or prizes to spur innovation in digital preservation. Group members will focus on defining the challenges, promoting them, and exploring ways to identify funding to support challenges if it is determined that funding would be appropriate. The action team will communicate over email and report their work on their wiki page. Email Jane Mandelbaum (jman@loc.gov) if you would like to participate.


===Examples of different prize/challenge models===
==Examples of different prize/challenge models==
* Digital Story Telling Challenge: http://forums.techsoup.org/cs/p/tsdigs.aspx
* Knight News Challenge: http://www.newschallenge.org/
Line 18: Line 17:
* Kickstarter: http://www.kickstarter.com/
* High Performance Computing Challenge: http://www.hpcchallenge.org/
==Potential Grand Challenges==
===BitStab===
Develop and promote a specific technical competition, analogous to the Top 500 supercomputer ranking (http://www.top500.org), but more relevant to Digital Preservation. The Top 500 experience suggests that such a competition requires a good metric (in the case of Top 500, the Linpack benchmark) that is widely understood and accepted, and that is neither too difficult or expensive to implement nor too easy to game. It also requires buy-in from a community that is widely acknowledged to include "the best contenders." Proposal for such a metric: bulk bit stability (stable bit-years). For some definition of "without change", we simply ask for evidence of the product of how much data is stored without change (stable bits) and the length of time for which it was stored.
===Real-World Reliability of Long-Term Digital Storage===
How can we assess and quantify the real-world reliability of long-term digital storage when, in the long term, the dominant threats are likely to be economic and organizational?
===Format migration===
Format migration remains a central technical strategy for digital preservation, but creates a risk of loss of information in formatting. A grand challenge might be to identify a process for verifying that the semantic content of digital objects in different forms is (approximately) the same. Note that industry has made much more progress in practical application of semantic fingerprints (primarily for DRM) and similar technologies for this -- but the aims differ somewhat.
===Other Potential Challenge Issues===
Other grand challenge problems seem less technical to me. Rereading the Blue Ribbon Task Force report suggests that selection criteria appropriate for the data deluge; realistic cost models for long-term preservation activities; business models for funding preservation and access to public goods; and legal strategies for enabling long-term access in the face of short-term copyright, IP and confidentiality restrictions are central challenges.

Revision as of 10:45, 9 March 2011

This is where this action team will make its plans.

Action Team Members

  • Micah Beck, University of Tennessee
  • Jane Mandelbaum, Library of Congress
  • John Spencer, BMS/Chace
  • Dean Farrel, State Library of North Carolina
  • Micah Altman, Harvard University
  • Mike Smorul, University of Maryland

Project Overview

This group will plan and launch a set of challenges and/or prizes to spur innovation in digital preservation. Group members will focus on defining the challenges, promoting them, and exploring ways to identify funding to support challenges if it is determined that funding would be appropriate. The action team will communicate over email and report their work on their wiki page. Email Jane Mandelbaum (jman@loc.gov) if you would like to participate.

Examples of different prize/challenge models

Potential Grand Challenges

BitStab

Develop and promote a specific technical competition, analogous to the Top 500 supercomputer ranking (http://www.top500.org), but more relevant to Digital Preservation. The Top 500 experience suggests that such a competition requires a good metric (in the case of Top 500, the Linpack benchmark) that is widely understood and accepted, and that is neither too difficult or expensive to implement nor too easy to game. It also requires buy-in from a community that is widely acknowledged to include "the best contenders." Proposal for such a metric: bulk bit stability (stable bit-years). For some definition of "without change", we simply ask for evidence of the product of how much data is stored without change (stable bits) and the length of time for which it was stored.
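
As a rough illustration of the proposed metric, the sketch below computes stable bit-years for a set of holdings. The names (StableHolding, stable_bit_years, is_unchanged), the choice of a matching SHA-256 digest as the definition of "without change", and the 365.25-day year are all assumptions made here for illustration; none of them is part of the proposal itself.

```python
import hashlib
from dataclasses import dataclass
from datetime import date

BITS_PER_BYTE = 8
DAYS_PER_YEAR = 365.25  # assumed averaging convention for a "year"


@dataclass(frozen=True)
class StableHolding:
    """One body of data claimed to have been stored without change (hypothetical record)."""
    size_bytes: int
    stored_from: date   # date the data was first stored
    verified_on: date   # most recent date on which it was confirmed unchanged


def is_unchanged(original_sha256_hex: str, data: bytes) -> bool:
    """Assumed definition of 'without change': the SHA-256 digest still matches."""
    return hashlib.sha256(data).hexdigest() == original_sha256_hex


def stable_bit_years(holding: StableHolding) -> float:
    """Product of stable bits and the number of years they were held."""
    days = (holding.verified_on - holding.stored_from).days
    if days < 0:
        raise ValueError("verified_on precedes stored_from")
    return holding.size_bytes * BITS_PER_BYTE * (days / DAYS_PER_YEAR)


def total_stable_bit_years(holdings) -> float:
    """Sum the metric over every holding a contender submits."""
    return sum(stable_bit_years(h) for h in holdings)
```

Under these assumptions, a holding of 1,000 bytes verified unchanged across 1,461 days (four years at 365.25 days each) scores 8,000 bits times 4 years, or 32,000 stable bit-years. What counts as acceptable evidence for the dates and the digest is the harder question, and the sketch leaves it open.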

Real-World Reliability of Long-Term Digital Storage

How can we assess and quantify the real-world reliability of long-term digital storage when, in the long term, the dominant threats are likely to be economic and organizational?

Format migration

Format migration remains a central technical strategy for digital preservation, but creates a risk of loss of information in formatting. A grand challenge might be to identify a process for verifying that the semantic content of digital objects in different forms is (approximately) the same. Note that industry has made much more progress in practical application of semantic fingerprints (primarily for DRM) and similar technologies for this -- but the aims differ somewhat.
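
To make the idea of an "approximately the same" check concrete, here is a deliberately crude sketch. It assumes the text of each object has already been extracted from its format, and it uses Jaccard similarity over lowercased word tokens as a stand-in for a real semantic comparison. This is not the semantic-fingerprint technology mentioned above, and the function names and the 0.9 threshold are hypothetical.

```python
import re


def tokens(text: str) -> set:
    """Lowercased word tokens; punctuation and layout are ignored."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(text_a: str, text_b: str) -> float:
    """Shared tokens divided by all tokens seen in either text."""
    a, b = tokens(text_a), tokens(text_b)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def approximately_same(text_a: str, text_b: str, threshold: float = 0.9) -> bool:
    """Assumed acceptance rule: similarity at or above the threshold."""
    return jaccard(text_a, text_b) >= threshold
```

A token-set comparison ignores word order, repetition, images, and structure, so it can only flag gross losses after a migration. A real verification process would need a much richer notion of content.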

Other Potential Challenge Issues

Other grand challenge problems seem less technical to me. Rereading the Blue Ribbon Task Force report suggests that selection criteria appropriate for the data deluge; realistic cost models for long-term preservation activities; business models for funding preservation and access to public goods; and legal strategies for enabling long-term access in the face of short-term copyright, IP and confidentiality restrictions are central challenges.