NDSA:Cloud Presentations

From DLF Wiki

In each case we want to identify who will present, who will contact them, and when they will present.

From there we can include specific questions we would like them to respond to.

Presentation Schedule

Once we start scheduling presenters we will keep a list of the talks here.

  1. Feb 1, Tues, 1:00 EST call with iRODS: Reagan Moore (presentation)
  2. Feb 14, Monday, 11:00 EST call with DuraCloud
  3. Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP: Katherine Skinner, Matt Schultz and Martin Halbert

People/Projects to Contact

  • DuraCloud/Duraspace (Leslie to contact)
  • Chronopolis (Mike Smorul will contact)
  • Open questions from the Educopia Guide to Distributed Digital Preservation http://www.metaarchive.org/GDDP (Martin will contact)
  • iRODS: Reagan Moore, 2/1/2011 (see slides: NIAID.ppt)
  • Commercial providers? (Who specifically would we want here? Please add them.)
    • Azure (Leslie to contact)
    • Amazon (Who will contact?)

General Guiding Questions for Presenters

Here we are working on a set of general questions for presenters to develop talks around.

  1. What sort of use cases is your system designed to support? What does it not support?
  2. What preservation strategies would your system support?
  3. What preservation standards would your system support?
  4. What resources are required to support a solution implemented in your environment?
  5. What infrastructure do you rely on?
  6. How can the cloud environment impact digital preservation activities?
  7. If we put data in your system today what systems and processes are in place so that we can get it back 50 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)

Responses to questions

iRODS

  1. ...

Other general notes:

  • [Snavely] The elegant abstraction that iRODS provides comes with a risk: each storage target must support a specific set of operations, and must do so consistently with the other storage targets. Clear specifications help mitigate this risk.

DuraCloud

  1. ...

Other general notes:

  • [Snavely] The cloud provider is generally treated as a black box, without a strong sense of the actual reliability of the underlying storage systems. Cloud providers tend to promise checksum validation of contents, but the recourse if validation fails was unknown (right?). DuraCloud adds its own checksum validation on top of the provider's service.
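
To make the idea of an independent checksum check concrete, here is a minimal sketch in Python. It is not DuraCloud's implementation or API. It assumes the content can be read back as ordinary files under a local directory, and the names `sha256_of`, `audit`, and the manifest layout are hypothetical, chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict, root: Path) -> dict:
    """Compare retrieved files against a manifest of expected digests.

    `manifest` maps a relative path to the digest recorded at ingest
    (hypothetical layout). Returns relative path -> "ok", "mismatch",
    or "missing".
    """
    report = {}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            report[rel_path] = "missing"
        elif sha256_of(target) == expected:
            report[rel_path] = "ok"
        else:
            report[rel_path] = "mismatch"
    return report
```

The point of keeping the manifest outside the provider is that the depositor can detect a failure without relying on the provider's own validation. What recourse exists after a "mismatch" is still the open question noted above.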

MetaArchive/GDDP

  1. ...

Other general notes:

  • [Snavely] Built on LOCKSS, so data integrity assurances are provided by a robust networked software model layered on commodity hardware and storage. The federated nature provides integrity assurance but also means a lack of central control: the accidental loss of multiple caches is unlikely, but e.g. scheduled maintenance or upgrades at several sites could coincidentally collide.

Chronopolis

  1. ...

Microsoft Azure

  1. ...

Amazon S3/EC2

  1. ...


General Concerns

  1. confidential data
  2. encrypted data
  3. auditing
  4. preservation risks
  5. legal compliance
  6. ...

Solution Models and Environments

Name | Offered as Service | Deployed Locally | Open Source | Authentication Scheme | Ingest Mechanism | Export Mechanism | Integrity/Validation Mechanism | Replication Mechanism | Administration Model (Federated, etc.) | Tiering Support
iRODS
DuraCloud
MetaArchive/GDDP
Chronopolis
Microsoft Azure
Amazon S3/EC2