NDSA:Cloud Presentations
Latest revision as of 16:59, 29 November 2016
In each case we want to identify who will present, who will contact them, and when they will present.
From there we can include specific questions we would like them to respond to.
Presentation Schedule and Slides
- Feb 1, Tues, 1:00 EST call with iRODS, Reagan Moore (presentation)
- Feb 14, Monday, 11:00 EST call with DuraCloud (presentation)
- Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert MetaArchive NDSA (presentation)
People/Projects to Contact
- Chronopolis (Mike Smorul will contact)
- Open questions from the Educopia Guide to Distributed Digital Preservation
- Commercial providers? (Who specifically would we want here? Please add them.)
- Azure (Leslie to contact)
- Amazon (Who will contact?)
General Questions for Cloud Service Presenters
Here we are working on a set of general questions for presenters to develop talks around.
- What sort of use cases is your system designed to support? What does it not support?
- What preservation standards would your system support?
- What resources are required to support a solution implemented in your environment?
- What infrastructure do you rely on?
- How can your system impact digital preservation activities?
- If we put data in your system today, what systems and processes are in place so that we can get it back 10 years from now? (Take for granted a sophisticated audience that knows about multiple copies, etc.)
- What types of materials does your system handle (documents, audio files, video files, stills, data sets, etc.)? Please give examples of those types in practice.
Responses to questions
NDSA:iRODS direct responses
Other general notes:
- [Snavely] Each storage target must support a specific set of operations, and must do so consistently with the other storage targets. This seems like a risk that comes along with the elegant abstraction that iRODS provides. Clear specifications help mitigate this risk.
NDSA:DuraCloud direct responses
Other general notes:
- [Snavely] The cloud provider is generally treated as a black box, without a strong sense of the actual reliability of the underlying storage systems. Cloud providers tend to promise checksum validation of contents, but the recourse if validation fails was unknown (right?). DuraCloud adds its own checksum validation on top of the cloud storage service.
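The idea of layering an independent checksum audit on top of a black-box store can be sketched as follows. This is a minimal illustration only, not DuraCloud's implementation: the manifest format, the choice of SHA-256, and the names `sha256_of` and `audit` are all assumptions made for the example, and the "store" is simply a local directory standing in for retrieved content.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict[str, str], root: Path) -> list[str]:
    """Return names that are missing or whose checksum differs from the manifest.

    `manifest` maps a relative file name to the checksum recorded at ingest
    (a hypothetical format chosen for this sketch).
    """
    failures = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(name)
    return failures
```

Recording the manifest at ingest and re-running `audit` on a schedule gives the depositor evidence of integrity that does not depend on the provider's own validation claims. What to do when an audit fails (for example, restoring from another copy) remains a separate question.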
NDSA:MetaArchive/GDDP direct responses
Other general notes:
- [Snavely] Built on LOCKSS, so data integrity assurances come from a robust networked software model layered on commodity hardware and storage. The federated nature provides integrity assurance but also means a lack of central control: the accidental loss of multiple caches is unlikely, but scheduled maintenance or upgrades, for example, could coincidentally collide.
Solution Models and Environments
Name | Offered as Service | Deployed Locally | Opensource | Authentication Scheme | Ingest Mechanism | Export Mechanism | Integrity/Validation Mechanism | Replication Mechanism | Administration Model (Federated, etc.) | Tiering Support | Certifications |
---|---|---|---|---|---|---|---|---|---|---|---|
iRODS | Offered as Service | Deployed Locally | Opensource | Authentication Scheme | Ingest Mechanism | Export Mechanism | Integrity/Validation Mechanism | Replication Mechanism | Content Administration Model (Federated, etc.) | Tiering Support | Certifications |
DuraCloud | Yes | Yes | Yes (Apache2) | Basic Auth | 1: web UI, 2: client-side utility, 3: REST API | 1: web UI, 2: client-side utility, 3: REST API | Checksum verified on ingest. On-demand checksum verification service. | Built-in support for cross-cloud replication. | Local | No | |
MetaArchive/GDDP | Mixed - PLN service layer on top of local LOCKSS nodes | Mixed - PLN service layer on top of local LOCKSS nodes | No | IP-based | LOCKSS harvesting plugins | LOCKSS web proxy | LOCKSS distributed integrity checking | LOCKSS P2P | Single superuser across all nodes | No | |
Chronopolis | Yes | No | No | SRB/iRODS based | SRB/iRODS based | SRB/iRODS based | Local checksums | SRB/iRODS | Single superuser | No | |
Microsoft Azure | Yes | No | No | Multiple | .Net/WIF | Multiple APIs, .Net | Not known/proprietary | Not known/proprietary | Single superuser | Not known/proprietary | |
Amazon S3/EC2 | Yes | No | Opensource | Multiple, including certs; proprietary / limited delegation model | RESTful APIs | RESTful APIs | Proprietary | Proprietary | Single superuser | Yes | |
DVN/Safearchive | Yes | Yes | Opensource | Basic Auth/IP | Proprietary UI/Batch UI/LOCKSS harvesting plugins | OAI/LOCKSS harvesting/proprietary | LOCKSS distributed integrity checks with additional TRAC auditing layer | LOCKSS with additional TRAC-based provisioning layer | Federated & distributed | No | |