NDSA:Storage ping


Below are the notes from the Storage Ping/Bit Stability discussion:


The Challenge Workshop Technical Discussion Table Notes, 7/20/2011

Storage Ping Characteristics:

  • Test for average, not maximum, latency
  • Test for bit integrity
  • Test for uptime of files
  • A more sophisticated version of a link checker, it would act as an extra audit, properly handling redirects to URLs.
  • Runs against repositories, not specifically hardware
  • Similar to scrubbing, but occurs externally and uses statistical analysis, sampling the collection to determine availability of files.
  • Limited requests per day per collection to keep from negatively impacting the repository
  • Use fixity provided with the submission of a URL. If no fixity is provided, generate it upon the first read of the file and store it for later checks (see the sketch after this list).
  • A transparent way to view the current status of stored bits
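
To make the characteristics above concrete, here is a minimal sketch of a single ping run. It samples a few registered object URLs, times each request, and checks fixity against a stored checksum, recording a baseline on first read if none was supplied. The registry contents, the SHA-256 checksum format, and the daily sample size are all assumptions for illustration; the notes do not specify an implementation.

  import hashlib
  import random
  import time
  import urllib.request

  # Hypothetical registry: object URL -> known fixity (hex SHA-256), or None if not yet recorded.
  registry = {
      "https://repository.example.org/objects/1234": None,
      "https://repository.example.org/objects/5678": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
  }

  DAILY_SAMPLE_SIZE = 2  # keep request volume low so the repository is not negatively impacted

  def ping(url, expected_fixity):
      """Fetch one object (following redirects) and report its latency and fixity status."""
      start = time.monotonic()
      with urllib.request.urlopen(url, timeout=30) as response:  # urlopen follows redirects
          digest = hashlib.sha256()
          for chunk in iter(lambda: response.read(64 * 1024), b""):
              digest.update(chunk)
      latency = time.monotonic() - start
      observed = digest.hexdigest()
      if expected_fixity is None:
          registry[url] = observed  # first read: record fixity for later checks of the file
          status = "baseline recorded"
      else:
          status = "ok" if observed == expected_fixity else "FIXITY MISMATCH"
      return latency, status

  for url in random.sample(list(registry), k=DAILY_SAMPLE_SIZE):
      try:
          latency, status = ping(url, registry[url])
          print(f"{url}: {status}, latency {latency:.2f}s")
      except OSError as err:
          print(f"{url}: unavailable ({err})")

Repeated runs of something like this, aggregated over time, would yield the average latency, uptime, and bit-integrity figures described above; a real tool would persist the registry and enforce the per-collection request limit.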

Next Steps:

  • Stephen Abrams to send document on related software he's working on to NDSA-Innovation listserv.
    • Here's a (belated) writeup of the idea I introduced during the July NDSA meeting. This is an exploration of what we're calling "repository neighborhood watch," an idea first batted around by my colleagues Trisha Cruse and John Kunze. The point is that what you (as a repository operator) say about your trustworthiness is less important than what your "neighbors" (i.e., repository customers) say about you. (Attached PDF: File:Neighborhood-watch-for-repository-QA.pdf) --Stephen.abrams 21:10, 8 September 2011 (UTC)
  • Mike Smourl to send material/document on related software to NDSA-Innovation listserv.
  • Group to refine the idea, characteristics, and model in advance of the August 2011 Curate Camp. Pass the plan to a team member at Indiana University Libraries or CDL who will be attending, so that team member can get feedback about the model.
  • Suggest what benchmarks would be reported on for participating repositories
  • Innovation Working Group to refine model based on comments at Curate Camp.
  • Present refined model to LC Storage Workshop in Washington, DC in September 2011.
  • Further refinement and development. Develop the tool as well as a benchmark participation agreement.
  • Potentially hold a workshop on the Storage Ping tool at DLF or CNI in fall/winter 2011.

Benchmark Participation Agreement:

  • Guidelines for participating repositories
  • Register collection with URLs for participation in the Storage Ping (see the example record after this list)
  • Provide URLs to fixity of files if available
  • Look at stability of software over time as well as stability of files
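
The agreement itself is still to be drafted. As a sketch of what a registration under it might contain, the record below uses invented field names to capture the collection URLs, the optional fixity URLs, and the per-collection request limit mentioned earlier:

  # Illustrative registration record; the notes do not prescribe a schema.
  participation_record = {
      "repository": "Example Institutional Repository",
      "collection": "Example Collection",
      "object_urls": [
          "https://repository.example.org/objects/1234",
          "https://repository.example.org/objects/5678",
      ],
      # Optional: URLs pointing to fixity values for the objects above, if available.
      "fixity_urls": [
          "https://repository.example.org/fixity/1234.sha256",
          "https://repository.example.org/fixity/5678.sha256",
      ],
      "max_requests_per_day": 10,  # limit so the ping does not burden the repository
  }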

Discussion Notes:

The conversation began with a discussion about how a Bit Stability Challenge might function. While there was interest at the table in exploring the idea further, it was deemed better to start small with a less controversial metric. The table determined that getting the repository population used to reporting was the appropriate initial objective. There was interest in engaging with repositories as well as storage service providers; however, the table decided that starting with repositories was the most productive plan.

Objectives of the challenge and reporting include:

  • Identifying benchmarks that will act as indicators of trust
  • Developing benchmarks that can be reproduced by the consumer
  • Encouraging transparency in metrics
  • Reporting on size of collection and length of time the collection has been successfully preserved.

Possible metrics that were discussed included:

  • Size of storage: number of bits preserved
  • Total bit years, where bit years = bytes × length of time preserved (a worked example follows this list)
  • Effort over time
  • Relating the score of a repository not only to the stability of the storage but also to the overall cost of the storage approach.
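
Taking the definition above literally (bit years = bytes × length of time preserved, in years), a reported value for a hypothetical collection could be computed as follows; the figures are invented for illustration:

  # Hypothetical collection figures, purely for illustration.
  collection_size_bytes = 5 * 10**12   # 5 TB preserved
  years_preserved = 3.5                # length of time successfully preserved

  # "Bit years" as defined in the notes: bytes x length of time preserved.
  bit_years = collection_size_bytes * years_preserved
  print(f"Reported bit years: {bit_years:.2e}")  # 1.75e+13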