NDSA:Tuesday, March 15, 2011

From DLF Wiki

NDSA Infrastructure call, March 15, 2011

On the call:

  • Karen Cariani
  • Dan Dodge
  • Mike Smorul
  • Mike Giarlo
  • Micah Altman
  • Michelle Galliger
  • Cory Snavely
  • Trevor Owens
  • Dean Farrell
  • Kris Carpenter
  • Joseph Pawletko
  • Patricia Cruse
  • David Minor
  • Andrew Woods
  • Dave MacCarn

(apologies to anyone missed)

Agenda

  • Regroup after our first cloud and large storage calls and re-evaluate approach
  • Discussion of potential collaboration with the standards group

Reflecting on the first three calls

We have now completed calls with three different groups working on cloud and large scale storage projects and services (iRODS, DuraSpace, and MetaArchive/GDDP). The group began the call by discussing the format of the calls and how to move forward with this project. The group responded well to the talks and was grateful for the participation of our first three presenters. However, there seemed to be consensus that future calls should focus very directly on a common set of core questions. It was suggested that we refine the set of questions on the wiki and have future speakers respond to the questions over email before a call. Then, after a short twenty-minute introduction to a particular project or service, the calls could move on to discussing the presenters’ responses to the common set of questions.

Developing common questions focused toward implementers of large scale storage and cloud services

In reflecting on the calls, several participants wanted to hear more about how institutions are implementing some of these large scale and cloud services to meet specific digital preservation challenges and objectives. It was suggested that it would be valuable to get partners and others to respond to a related common set of questions focused on implementation of cloud and large scale data projects. The following were discussed as potential questions. Trevor volunteered to distribute this initial list to the infrastructure email list to kick off discussion there. It would be ideal if we could winnow this list down to 5-8 questions.

  • What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)
  • What large scale storage or cloud technologies are you using to meet that challenge?
  • Specifically, what kind of materials are you preserving? (text, data sets, images, moving images, web pages, etc.)
  • How big is your collection? (In terms of number of objects and storage space required)
  • What are your performance requirements?
  • What storage media have you elected to use? (Disk, Tape, etc.)
  • What do you think are the key advantages of the system you use?
  • What do you think are the key problems or disadvantages your system presents?
  • What important principles informed your decision about the particular tool or service you chose to use?
  • Which service providers or tools did you consider and how did you make your choice?
  • How frequently do you migrate from one system to another?

Assembling information about the first three calls?

The group briefly discussed how to extract the knowledge from the first calls and get that information up on the wiki for us to use as we move forward. There was discussion of filling out the grid started on the Cloud Presentations wiki page. Call participants encouraged each other to add their thoughts and reflections on the calls to the Cloud Presentations wiki page. For future calls, getting presenters to respond over email to the preset questions can ensure that we create a record of the focal points of these discussions to return to when creating this project's final product.

There was a strong feeling that we should not just focus on cloud solutions, but that cloud solutions were one specific area to drill into. The group reiterated its goals of developing best practices and potential solutions for large scale digital preservation to help institutions make informed decisions about preservation. We want to create a product that gives context for people deciding upon digital preservation practices and makes suggestions for solutions.

One suggestion was to build a grid based on what each of our own institutions is using for digital preservation as a starting point, and then add use cases to those scenarios.

Potential partnership with the standards group

The final piece of business we considered was a request from the Standards group to consider working together on a project they are undertaking to document various standards. The particular intersection between the two groups' work seems to be the point at which particular file format standards intersect with tools for writing, validating and editing those kinds of files.

Infrastructure group members were interested in the project. However, several members voiced concerns about the value of a general tools directory. Several suggested that it would be particularly valuable to put together exemplar use cases in which particular standards and particular tools were implemented by a particular preservation partner to accomplish a specific preservation goal. To some extent, this approach could be structurally similar to how we discussed reframing the large scale and cloud storage calls. Call participants seemed to think that this kind of use-case-driven knowledge base of tools and standards in practice could be a valuable resource.


The action items left for the group for the next call were to

  • review the questions listed above and send feedback/edits on the questions. We will then ask WG members to answer the questions themselves.
  • look at the wiki page again, particularly the grid on the cloud presentation page, and think about/add what information we might want to comment on or share in order to help us reach the goal of sharing useful information with others outside the WG
  • vote on the forthcoming Doodle poll for a potential new time for our monthly calls

The group is generally interested in working with the standards group, but needs concrete examples to respond to, perhaps when their work is further along.