<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.diglib.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Csnavely</id>
	<title>DLF Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.diglib.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Csnavely"/>
	<link rel="alternate" type="text/html" href="https://wiki.diglib.org/Special:Contributions/Csnavely"/>
	<updated>2026-05-10T18:55:12Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.44.0</generator>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2522</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2522"/>
		<updated>2011-06-06T20:52:06Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* HathiTrust Response to Implementations of Large Scale Storage Architectures */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
#* HathiTrust&#039;s mission is (jointly) multi-institutional long-term preservation and access of digitized library materials.&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice?&lt;br /&gt;
#* HathiTrust does not use cloud storage providers due to potential legal issues with copyrighted content. At least within the scope of digital preservation of library materials, we would consider ourselves to be a cloud storage provider, or similar to one. The large scale storage system we use is from Isilon which was chosen for its ability to scale in capacity and performance while keeping maintenance requirements constant and low. All well-known storage vendors (about 15) were considered.&lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)&lt;br /&gt;
#* Currently, print volumes comprise the bulk of preserved content. Pilot projects for continuous tone images and audio are well underway.&lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
#* As of May 2011, HathiTrust is preserving 8.7 million print volumes which consume approximately 400TB of storage.&lt;br /&gt;
# What are your performance requirements? Further, why are these your particular requirements?&lt;br /&gt;
#* Fixity checking (checksum validation) requires all data to be read over an approximate 90-day periodicity, which, at the current repository size, translates to 54MB/s continuous read activity over 90 days (or higher over fewer days). This has, so far, been easily attainable through parallelization. The most recent peak in the ingest rate was ~500,000 volumes per month, which translates to an average of ~7MB/s write activity, also easily attainable.&lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc) Further, why did you choose these particular media?&lt;br /&gt;
#* HathiTrust uses two instances of disk in different cities (several hundred miles apart) for primary storage and two instances of tape in one city (several miles apart) providing 6 months of previous-version retention (backups). Disk was chosen for primary storage because the archive is light, with materials continuously accessible, and because repository activities such as full-text search indexing and fixity checking require frequent low-latency access to content. Tape was chosen for backups because, as is the conventional wisdom, previous-version retention inflates storage requirements (in our case, roughly 25%) but does not demand low-latency access, and so tape is the most cost-effective medium.&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
#* Being NAS-based, the storage environment is a simple filesystem which offers easy integration with applications and easy movement between vendor platforms. The Isilon system has a number of advanced data integrity features that are well-suited to digital preservation including internal checksums and inline data correction to protect against bit rot, misdirected/torn writes, and similar data storage risks.&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
#* The simplicity of filesystems can be seen as a risk for the same reason it is beneficial; without imposed structure, filesystems can accommodate informal and unrigorous access to data. When using a filesystem, the price paid for simplicity is that greater attention is required to permissions and formalized processes for data management.&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use?&lt;br /&gt;
#* The need to use an in-house storage system was heavily influenced by the sensitive nature of copyrighted materials that are the subject of an ongoing lawsuit.&lt;br /&gt;
# How frequently do you migrate from one system to another? Further, what is it that prompts you to make these migrations?&lt;br /&gt;
#* HathiTrust migrated to Isilon storage in 2007 from temporary DAS and has not since needed to do another whole migration, although the first cycle of hardware replacement (100TB at both locations) was completed using the built-in capability for removing and adding storage nodes; there was no direct handling of data required. The initial move was simply the initial hardware acquisition, and the hardware replacement is an annual cycle to maintain hardware currency; the equipment lifetime is 3-4 years.&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
#* The Isilon system enables us to have greater parity redundancy than conventional storage systems: we use N+3, as opposed to the conventional RAID 5 (N+1) or slightly novel RAID 6 (N+2). The latest release of Isilon system software computes checksums on write and validates checksums on read, correcting data as needed, and the cluster architecture scales the compute capacity required to do this work, as opposed to conventional head/tray design. The system provides features for built-in data migration for hardware replacement that eliminate the need for manual data handling or even downtime during hardware replacement.&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
#* We have developed fixity checking and other repository auditing tools as a check on our use of the storage system, ensuring that our processes have stored data correctly. This check doubles as bit rot detection in addition to that performed by the system. We have developed a small diagnostic script to isolate problems in the data synchronization from one site to another if the storage system reports an error.&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;br /&gt;
#* There are no issues or shortcomings of the system with respect to TRAC requirements.&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2521</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2521"/>
		<updated>2011-06-06T19:35:57Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* HathiTrust Response to Implementations of Large Scale Storage Architectures */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
#* HathiTrust&#039;s mission is (jointly) multi-institutional long-term preservation and access of digitized library materials.&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice?&lt;br /&gt;
#* HathiTrust does not use cloud storage providers due to potential legal issues with copyrighted content. At least within the scope of digital preservation of library materials, we would consider ourselves to be a cloud storage provider, or similar to one. The large scale storage system we use is from Isilon which was chosen for its ability to scale in capacity and performance while keeping maintenance requirements constant and low. All well-known storage vendors (about 15) were considered.&lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)&lt;br /&gt;
#* Currently, print volumes comprise the bulk of preserved content. Pilot projects for continuous tone images and audio are well underway.&lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
#* As of May 2011, HathiTrust is preserving 8.7 million print volumes which consume approximately 400TB of storage.&lt;br /&gt;
# What are your performance requirements? Further, why are these your particular requirements?&lt;br /&gt;
#* Fixity checking (checksum validation) requires all data to be read over an approximate 90-day periodicity, which, at the current repository size, translates to 54MB/s continuous read activity over 90 days (or higher over fewer days). This has, so far, been easily attainable through parallelization. The most recent peak in the ingest rate was ~500,000 volumes per month, which translates to an average of ~7MB/s write activity, also easily attainable.&lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc) Further, why did you choose these particular media?&lt;br /&gt;
#* HathiTrust uses two instances of disk in different cities (several hundred miles apart) for primary storage and two instances of tape in one city (several miles apart) providing 6 months of previous-version retention (backups). Disk was chosen for primary storage because the archive is light, with materials continuously accessible, and because repository activities such as full-text search indexing and fixity checking require frequent low-latency access to content. Tape was chosen for backups because, as is the conventional wisdom, previous-version retention inflates storage requirements (in our case, roughly 25%) but does not demand low-latency access, and so tape is the most cost-effective medium.&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
#* Being NAS-based, the storage environment is a simple filesystem which offers easy integration with applications and easy movement between vendor platforms. The Isilon system has a number of advanced data integrity features that are well-suited to digital preservation including internal checksums and inline data correction to protect against bit rot, misdirected/torn writes, and similar data storage risks.&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
#* The simplicity of filesystems can be seen as a risk for the same reason it is beneficial; without imposed structure, filesystems can accommodate informal and unrigorous access to data. When using a filesystem, the price paid for simplicity is that greater attention is required to permissions and formalized processes for data management.&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use?&lt;br /&gt;
#* The need to use an in-house storage system was heavily influenced by the sensitive nature of copyrighted materials that are the subject of an ongoing lawsuit.&lt;br /&gt;
# How frequently do you migrate from one system to another? Further, what is it that prompts you to make these migrations?&lt;br /&gt;
#* HathiTrust migrated to Isilon storage in 2007 from temporary DAS and has not since needed to do another whole migration, although the first cycle of hardware replacement (100TB at both locations) was completed using the built-in capability for removing and adding storage nodes; there was no direct handling of data required. The initial move was simply the initial hardware acquisition, and the hardware replacement is an annual cycle to maintain hardware currency; the equipment lifetime is 3-4 years.&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
#* The Isilon system enables us to have greater parity redundancy than conventional storage systems: we use N+3, as opposed to the conventional RAID 5 (N+1) or slightly novel RAID 6 (N+2). The latest release of Isilon system software computes checksums on write and validates checksums on read, correcting data as needed, and the cluster architecture scales the compute capacity required to do this work, as opposed to conventional head/tray design. The system provides features for built-in data migration for hardware replacement that eliminate the need for manual data handling or even downtime during hardware replacement.&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
#* We have developed fixity checking and other repository auditing tools as a check on our use of the storage system, ensuring that our processes have stored data correctly. This check doubles as bit rot detection in addition to that performed by the system. We have developed a small diagnostic script to isolate problems in the data synchronization from one site to another if the storage system reports an error.&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;br /&gt;
#* There are no issues or shortcomings of the system with respect to TRAC requirements.&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2520</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2520"/>
		<updated>2011-05-16T16:38:27Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* HathiTrust Response to Implementations of Large Scale Storage Architectures */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
#* HathiTrust&#039;s mission is (jointly) multi-institutional long-term preservation and access of digitized library materials.&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice?&lt;br /&gt;
#* HathiTrust does not use cloud storage providers due to potential legal issues with copyrighted content. At least within the scope of digital preservation of library materials, we would consider ourselves to be a cloud storage provider, or similar to one. The large scale storage system we use is from Isilon which was chosen for its ability to scale in capacity and performance while keeping maintenance requirements constant and low. All well-known storage vendors (about 15) were considered.&lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)&lt;br /&gt;
#* Currently, print volumes comprise the bulk of preserved content. Pilot projects for continuous tone images and audio are well underway.&lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
#* As of May 2011, HathiTrust is preserving 8.7 million print volumes which consume approximately 400TB of storage.&lt;br /&gt;
# What are your performance requirements? Further, why are these your particular requirements?&lt;br /&gt;
#* Fixity checking (checksum validation) requires all data to be read over an approximate 90-day periodicity, which, at the current repository size, translates to 54MB/s continuous read activity over 90 days (or higher over fewer days). This has, so far, been easily attainable through parallelization. The most recent peak in the ingest rate was ~500,000 volumes per month, which translates to an average of ~7MB/s write activity, also easily attainable.&lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc) Further, why did you choose these particular media?&lt;br /&gt;
#* HathiTrust uses two instances of disk in different cities (several hundred miles apart) for primary storage and two instances of tape in one city (several miles apart) providing 6 months of previous-version retention (backups). Disk was chosen for primary storage because the archive is light, with materials continuously accessible, and because repository activities such as full-text search indexing and fixity checking require frequent low-latency access to content. Tape was chosen for backups because, as is the conventional wisdom, previous-version retention inflates storage requirements (in our case, roughly 25%) but does not demand low-latency access, and so tape is the most cost-effective medium.&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
#* Being NAS-based, the storage environment is a simple filesystem which offers easy integration with applications and easy movement between vendor platforms. The Isilon system has a number of advanced data integrity features that are well-suited to digital preservation including internal checksums and inline data correction to protect against bit rot, misdirected/torn writes, and similar data storage risks.&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
#* The simplicity of filesystems can be seen as a risk for the same reason it is beneficial; without imposed structure, filesystems can accommodate informal and unrigorous access to data. When using a filesystem, the price paid for simplicity is that greater attention is required to permissions and formalized processes for data management.&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use?&lt;br /&gt;
#* The need to use an in-house storage system was heavily influenced by the sensitive nature of copyrighted materials that are the subject of an ongoing lawsuit.&lt;br /&gt;
# How frequently do you migrate from one system to another? Further, what is it that prompts you to make these migrations?&lt;br /&gt;
#* HathiTrust migrated to Isilon storage in 2007 from temporary DAS and has not since needed to do another migration, although the first cycle of hardware replacement (100TB at both locations) was completed using the built-in capability for removing and adding storage nodes; there was no direct handling of data required. The initial move was simply the initial hardware acquisition, and the hardware replacement is an annual cycle to maintain hardware currency.&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
#* The Isilon system enables us to have greater parity redundancy than conventional storage systems: we use N+3, as opposed to the conventional RAID 5 (N+1) or slightly novel RAID 6 (N+2). The latest release of Isilon system software computes checksums on write and validates checksums on read, correcting data as needed, and the cluster architecture scales the compute capacity required to do this work, as opposed to conventional head/tray design. The system provides features for built-in data migration for hardware replacement that eliminate the need for manual data handling or even downtime during hardware replacement.&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
#* We have developed fixity checking and other repository auditing tools as a check on our use of the storage system, ensuring that our processes have stored data correctly. This check doubles as bit rot detection in addition to that performed by the system. We have developed a small diagnostic script to isolate problems in the data synchronization from one site to another if the storage system reports an error.&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;br /&gt;
#* There are no issues or shortcomings of the system with respect to TRAC requirements.&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2519</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2519"/>
		<updated>2011-05-16T16:35:06Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* HathiTrust Response to Implementations of Large Scale Storage Architectures */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
#* HathiTrust&#039;s mission is (jointly) multi-institutional long-term preservation and access of digitized library materials.&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice?&lt;br /&gt;
#* HathiTrust does not use cloud storage providers due to potential legal issues with copyrighted content. At least within the scope of digital preservation of library materials, we would consider ourselves to be a cloud storage provider, or similar to one. The large scale storage system we use is from Isilon which was chosen for its ability to scale in capacity and performance while keeping maintenance requirements constant and low. All well-known storage vendors (about 15) were considered.&lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)&lt;br /&gt;
#* Currently, print volumes comprise the bulk of preserved content. Pilot projects for continuous tone images and audio are well underway.&lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
#* As of May 2011, HathiTrust is preserving 8.7 million print volumes which consume approximately 400TB of storage.&lt;br /&gt;
# What are your performance requirements? Further, why are these your particular requirements?&lt;br /&gt;
#* Fixity checking (checksum validation) requires all data to be read over an approximate 90-day periodicity, which, at the current repository size, translates to 54MB/s continuous read activity over 90 days (or higher over fewer days). This has, so far, been easily attainable through parallelization. The most recent peak in the ingest rate was ~500,000 volumes per month, which translates to an average of ~7MB/s write activity, also easily attainable.&lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc) Further, why did you choose these particular media?&lt;br /&gt;
#* HathiTrust uses two instances of disk in different cities (several hundred miles apart) for primary storage and two instances of tape in one city (several miles apart) providing 6 months of previous-version retention (backups). Disk was chosen for primary storage because the archive is light, with materials continuously accessible, and because repository activities such as full-text search indexing and fixity checking require frequent low-latency access to content. Tape was chosen for backups because, as is the conventional wisdom, previous-version retention inflates storage requirements (in our case, roughly 25%) but does not demand low-latency access, and so tape is the most cost-effective medium.&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
#* Being NAS-based, the storage environment is a simple filesystem which offers easy integration and movement between vendor platforms. The Isilon system has a number of advanced data integrity features that are well-suited to digital preservation including internal checksums and inline data correction to protect against bit rot, misdirected/torn writes, and similar data storage risks.&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
#* The simplicity of filesystems can be seen as a risk for the same reason it is beneficial; without imposed structure, filesystems can accommodate informal and unrigorous access to data. When using a filesystem, the price paid for simplicity is that greater attention is required to permissions and formalized processes for data management.&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use?&lt;br /&gt;
#* &lt;br /&gt;
# How frequently do you migrate from one system to another? Further, what is it that prompts you to make these migrations?&lt;br /&gt;
#* HathiTrust migrated to Isilon storage in 2007 from temporary DAS and has not since needed to do another migration, although the first cycle of hardware replacement (100TB at both locations) was completed using the built-in capability for removing and adding storage nodes; there was no direct handling of data required. The initial move was simply the initial hardware acquisition, and the hardware replacement is an annual cycle to maintain hardware currency.&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
#* The Isilon system enables us to have greater parity redundancy than conventional storage systems: we use N+3, as opposed to the conventional RAID 5 (N+1) or slightly novel RAID 6 (N+2). The latest release of Isilon system software computes checksums on write and validates checksums on read, correcting data as needed, and the cluster architecture scales the compute capacity required to do this work, as opposed to conventional head/tray design. The system provides features for built-in data migration for hardware replacement that eliminate the need for manual data handling or even downtime during hardware replacement.&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
#* We have developed fixity checking and other repository auditing tools as a check on our use of the storage system, ensuring that our processes have stored data correctly. This check doubles as bit rot detection in addition to that performed by the system. We have developed a small diagnostic script to isolate problems in the data synchronization from one site to another if the storage system reports an error.&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;br /&gt;
#* There are no issues or shortcomings of the system with respect to TRAC requirements.&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2518</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2518"/>
		<updated>2011-05-16T16:16:12Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
#* HathiTrust&#039;s mission is (jointly) multi-institutional long-term preservation and access of digitized library materials.&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice?&lt;br /&gt;
#* HathiTrust does not use a cloud storage provider. At least within the scope of digital preservation of library materials, we would consider ourselves to be similar to a cloud storage provider. The large scale storage system we use is from Isilon.&lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)&lt;br /&gt;
#* Currently, print volumes comprise the bulk of preserved content. Pilot projects for continuous tone images and audio are well underway.&lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
#* As of May 2011, HathiTrust is preserving 8.7 million print volumes which consume approximately 400TB of storage.&lt;br /&gt;
# What are your performance requirements?&lt;br /&gt;
#* Checksum validation requires all data to be read over an approximate 90-day periodicity, which, at the current repository size, translates to 54MB/s read activity which is attainable through parallelization. The most recent peak in the ingest rate was ~500,000 volumes per month which translates to an average of ~7MB/s writing. &lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc)&lt;br /&gt;
#* HathiTrust uses two instances of disk in different cities (several hundred miles apart) for primary storage and two instances of tape in one city (several miles apart) for backup and 6 months of previous-version retention.&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
#* Being NAS-based, the storage environment is a simple filesystem which offers easy integration and movement between vendor platforms. The Isilon system has a number of advanced data integrity features that are well-suited to digital preservation including internal checksums and inline data correction to protect against bit rot, misdirected/torn writes, and similar data storage risks.&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
#* The simplicity of filesystems can be seen as a risk for the same reason it is beneficial: without imposed structure, filesystems can accommodate informal and unrigorous access to data. When using a filesystem, the price paid for simplicity is that greater attention must be given to permissions and to formalized processes for data management.&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use?&lt;br /&gt;
#* The storage solution was chosen for its ability to scale in capacity and performance while keeping maintenance requirements constant and low.&lt;br /&gt;
# How frequently do you migrate from one system to another?&lt;br /&gt;
#* HathiTrust migrated to Isilon storage in 2007 from temporary DAS and has not since needed another migration. The first cycle of hardware replacement (100TB at both locations) was completed using the built-in capability for removing and adding storage nodes; no direct handling of data was required.&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
#* The Isilon system enables us to have greater parity redundancy than conventional storage systems: we use N+3, as opposed to the conventional RAID 5 (N+1) or the somewhat newer RAID 6 (N+2). The latest release of the Isilon system software computes checksums on write and validates checksums on read, correcting data as needed, and the cluster architecture scales the compute capacity required to do this work, in contrast to a conventional head/tray design. The system provides built-in data migration features that eliminate the need for manual data handling, or even downtime, during hardware replacement.&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;br /&gt;
#* There are no issues or shortcomings with respect to TRAC requirements.&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2517</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2517"/>
		<updated>2011-05-16T15:34:34Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* HathiTrust Response to Implementations of Large Scale Storage Architectures */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice? &lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.) &lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
# What are your performance requirements?&lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc)&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use? &lt;br /&gt;
# How frequently do you migrate from one system to another?&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2516</id>
		<title>NDSA:HathiTrust</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:HathiTrust&amp;diff=2516"/>
		<updated>2011-05-16T15:34:16Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: Created page with &amp;#039;==HathiTrust Response to Implementations of Large Scale Storage Architectures==  # What is the particular preservation goal or challenge you need to accomplish? (for example, re-…&amp;#039;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==HathiTrust Response to Implementations of Large Scale Storage Architectures==&lt;br /&gt;
&lt;br /&gt;
# What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
&lt;br /&gt;
# What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice? &lt;br /&gt;
&lt;br /&gt;
# Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.) &lt;br /&gt;
&lt;br /&gt;
# How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
&lt;br /&gt;
# What are your performance requirements?&lt;br /&gt;
&lt;br /&gt;
# What storage media have you elected to use? (Disk, Tape, etc)&lt;br /&gt;
&lt;br /&gt;
# What do you think are the key advantages of the system you use?&lt;br /&gt;
&lt;br /&gt;
# What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
&lt;br /&gt;
# What important principles informed your decision about the particular tool or service you chose to use? &lt;br /&gt;
&lt;br /&gt;
# How frequently do you migrate from one system to another?&lt;br /&gt;
&lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2033</id>
		<title>NDSA:Cloud Presentations</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2033"/>
		<updated>2011-05-16T15:31:51Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In each case we want to identify who would present, who will contact them, and when they will present. &lt;br /&gt;
&lt;br /&gt;
From there we can include specific questions we would like them to respond to. &lt;br /&gt;
&lt;br /&gt;
==Presentation Schedule and Slides==&lt;br /&gt;
# Feb 1, Tues, 1:00 EST call with iRods Reagan Moore ([[NDSA:Media:NIAID.ppt|presentation]])&lt;br /&gt;
# Feb 14, Monday, 11:00 EST call with Duracloud ([[NDSA:Media:DuracloudNDSA.ppt|presentation]])&lt;br /&gt;
# Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert MetaArchive NDSA ([[NDSA:Media:MetaArchive NDSA Infrastructure.ppt|presentation]])&lt;br /&gt;
&lt;br /&gt;
==People/Projects to Contact==&lt;br /&gt;
*Chronopolis (Mike Smorul will contact)&lt;br /&gt;
*Open questions from the Educopia Guide to Distributed Digital Preservation &lt;br /&gt;
*Commercial providers? (Who specifically would we want here? Please add them.)&lt;br /&gt;
**Azure (Leslie to contact)&lt;br /&gt;
**Amazon (Who will contact?)&lt;br /&gt;
&lt;br /&gt;
==General Questions for Cloud Service Presenters==&lt;br /&gt;
Here we are working on a set of general questions for presenters to develop talks around. &lt;br /&gt;
&lt;br /&gt;
# What sort of use cases is your system designed to support? What doesn&#039;t this support?&lt;br /&gt;
# What preservation standards would your system support? &lt;br /&gt;
# What resources are required to support a solution implemented in your environment? &lt;br /&gt;
# What infrastructure do you rely on?&lt;br /&gt;
# How can your system impact digital preservation activities?&lt;br /&gt;
# If we put data in your system today what systems and processes are in place so that we can get it back 10 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)&lt;br /&gt;
# What types of materials does your system handle? (documents, audio files, video file, stills, data sets, etc) And give examples of those types in practice&lt;br /&gt;
&lt;br /&gt;
===Responses to questions===&lt;br /&gt;
====[[NDSA:iRODS]] direct responses====&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The need for each storage target to support a specific set of operations, and consistently with other storage targets, seems like a risk that comes along with the elegant abstraction that iRODS provides. Clear specifications help mitigate this risk.&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:DuraCloud]] direct responses====&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The cloud provider is generally treated as a black box, without a strong sense of the actual reliability of the underlying storage systems. Cloud providers tend to promise checksum validation of contents, but the recourse if validation fails was unknown (right?). DuraCloud adds its own checksum validation on top of the cloud storage service.&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:MetaArchive/GDDP]] direct responses====&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] Built on LOCKSS, so data integrity assurances are provided by a robust networked software model layered on commodity hardware and storage. The federated nature provides integrity assurance but also entails a lack of central control: the accidental loss of multiple caches is unlikely, but, for example, scheduled maintenance or upgrades could coincidentally collide.&lt;br /&gt;
&lt;br /&gt;
====Chronopolis====&lt;br /&gt;
# ...&lt;br /&gt;
====Microsoft Azure====&lt;br /&gt;
# ...&lt;br /&gt;
====Amazon S3/EC2====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Questions for Member Institution Implementations of Large Scale Storage Architectures==&lt;br /&gt;
#What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)&lt;br /&gt;
#What large scale storage or cloud technologies are you using to meet that challenge? Further, why did you choose these particular technologies?&lt;br /&gt;
#Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.) &lt;br /&gt;
#How big is your collection? (In terms of number of objects and storage space required)&lt;br /&gt;
#What are your performance requirements? Further, why are these your particular requirements?&lt;br /&gt;
#What storage media have you elected to use? (Disk, Tape, etc) Further, why did you choose these particular media?&lt;br /&gt;
#What do you think are the key advantages of the system you use?&lt;br /&gt;
#What do you think are the key problems or disadvantages your system presents?&lt;br /&gt;
#What important principles informed your decision about the particular tool or service you chose to use? &lt;br /&gt;
#How frequently do you migrate from one system to another? Further, what is it that prompts you to make these migrations? &lt;br /&gt;
# What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)&lt;br /&gt;
# What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)&lt;br /&gt;
# Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?&lt;br /&gt;
 &lt;br /&gt;
===Responses to questions===&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:Florida Center for Library Automation]]====&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:HathiTrust]]====&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:National Library of Medicine Responses]]====&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:Penn State]]====&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:WGBH Responses]]====&lt;br /&gt;
&lt;br /&gt;
====[[NDSA:Your Institution Here]]====&lt;br /&gt;
&lt;br /&gt;
==General Concerns==&lt;br /&gt;
# confidential data&lt;br /&gt;
# encrypted data&lt;br /&gt;
# auditing&lt;br /&gt;
# preservation risks&lt;br /&gt;
# legal compliance&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Solution Models and Environments==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot;&lt;br /&gt;
!Name&lt;br /&gt;
!Offered as Service&lt;br /&gt;
!Deployed Locally&lt;br /&gt;
!Opensource&lt;br /&gt;
!Authentication Scheme&lt;br /&gt;
!Ingest Mechanism&lt;br /&gt;
!Export Mechanism&lt;br /&gt;
!Integrity/Validation Mechanism&lt;br /&gt;
!Replication Mechanism&lt;br /&gt;
!Administration Model (Federated, etc.)&lt;br /&gt;
!Tiering Support&lt;br /&gt;
|-&lt;br /&gt;
|iRODS&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|DuraCloud&lt;br /&gt;
|yes&lt;br /&gt;
|yes&lt;br /&gt;
|yes (Apache2)&lt;br /&gt;
|Basic Auth&lt;br /&gt;
|1:web-ui, 2:client-side utility, 3:REST-API&lt;br /&gt;
|1:web-ui, 2:client-side utility, 3:REST-API&lt;br /&gt;
|Checksum verified on ingest. On-demand checksum verification service.&lt;br /&gt;
|Built-in support for cross-cloud replication.&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|MetaArchive/GDDP&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Chronopolis&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Microsoft Azure&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Amazon S3/EC2&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2004</id>
		<title>NDSA:Cloud Presentations</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2004"/>
		<updated>2011-03-18T21:23:39Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* DuraCloud */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In each case we want to identify who would present, who will contact them, and when they will present. &lt;br /&gt;
&lt;br /&gt;
From there we can include specific questions we would like them to respond to. &lt;br /&gt;
&lt;br /&gt;
==Presentation Schedule==&lt;br /&gt;
Once we start scheduling presenters we will keep a list of the talks here.&lt;br /&gt;
# Feb 1, Tues, 1:00 EST call with iRods Reagan Moore ([[NDSA:Media:NIAID.ppt|presentation]])&lt;br /&gt;
# Feb 14, Monday, 11:00 EST call with Duracloud&lt;br /&gt;
# Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert&lt;br /&gt;
&lt;br /&gt;
==People/Projects to Contact==&lt;br /&gt;
*DuraCloud/Duraspace (Leslie to contact)&lt;br /&gt;
*Chronopolis (Mike Smorul will contact)&lt;br /&gt;
*Open questions from the Educopia Guide to Distributed Digital Preservation http://www.metaarchive.org/GDDP (Martin will contact)&lt;br /&gt;
*Irods: Reagan Moore, 2/1/2011  see slides: NIAID.ppt &lt;br /&gt;
*Commercial providers? (Who specifically would we want here? Please add them.)&lt;br /&gt;
**Azure (Leslie to contact)&lt;br /&gt;
**Amazon (Who will contact?)&lt;br /&gt;
&lt;br /&gt;
==General Guiding Questions for Presenters==&lt;br /&gt;
Here we are working on a set of general questions for presenters to develop talks around. &lt;br /&gt;
&lt;br /&gt;
# What sort of use cases is your system designed to support? What doesn&#039;t this support?&lt;br /&gt;
# What preservation strategies would your system support? &lt;br /&gt;
# What preservation standards would your system support? &lt;br /&gt;
# What resources are required to support a solution implemented in your environment? &lt;br /&gt;
# What infrastructure do you rely on?&lt;br /&gt;
# How can the cloud environment impact digital preservation activities?&lt;br /&gt;
# If we put data in your system today what systems and processes are in place so that we can get it back 50 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)&lt;br /&gt;
&lt;br /&gt;
===Responses to questions===&lt;br /&gt;
====iRODS====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The need for each storage target to support a specific set of operations, and consistently with other storage targets, seems like a risk that comes along with the elegant abstraction that iRODS provides. Clear specifications help mitigate this risk.&lt;br /&gt;
&lt;br /&gt;
====DuraCloud====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The cloud provider is generally treated as a black box, without a strong sense of the actual reliability of the underlying storage systems. Cloud providers tend to promise checksum validation of contents, but the recourse if validation fails was unknown (right?). DuraCloud adds its own checksum validation on top of the cloud storage service.&lt;br /&gt;
&lt;br /&gt;
====MetaArchive/GDDP====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] Built on LOCKSS, so data integrity assurances are provided by a robust networked software model layered on commodity hardware and storage. The federated nature provides integrity assurance but also entails a lack of central control: the accidental loss of multiple caches is unlikely, but, for example, scheduled maintenance or upgrades could coincidentally collide.&lt;br /&gt;
&lt;br /&gt;
====Chronopolis====&lt;br /&gt;
# ...&lt;br /&gt;
====Microsoft Azure====&lt;br /&gt;
# ...&lt;br /&gt;
====Amazon S3/EC2====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==General Concerns==&lt;br /&gt;
# confidential data&lt;br /&gt;
# encrypted data&lt;br /&gt;
# auditing&lt;br /&gt;
# preservation risks&lt;br /&gt;
# legal compliance&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Solution Models and Environments==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot;&lt;br /&gt;
!Name&lt;br /&gt;
!Offered as Service&lt;br /&gt;
!Deployed Locally&lt;br /&gt;
!Opensource&lt;br /&gt;
!Authentication Scheme&lt;br /&gt;
!Ingest Mechanism&lt;br /&gt;
!Export Mechanism&lt;br /&gt;
!Integrity/Validation Mechanism&lt;br /&gt;
!Replication Mechanism&lt;br /&gt;
!Administration Model (Federated, etc.)&lt;br /&gt;
!Tiering Support&lt;br /&gt;
|-&lt;br /&gt;
|iRODS&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|DuraCloud&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|MetaArchive/GDDP&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Chronopolis&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Microsoft Azure&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Amazon S3/EC2&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2003</id>
		<title>NDSA:Cloud Presentations</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2003"/>
		<updated>2011-03-18T21:22:01Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* MetaArchive/GDDP */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In each case we want to identify who would present, who will contact them, and when they will present. &lt;br /&gt;
&lt;br /&gt;
From there we can include specific questions we would like them to respond to. &lt;br /&gt;
&lt;br /&gt;
==Presentation Schedule==&lt;br /&gt;
Once we start scheduling presenters we will keep a list of the talks here.&lt;br /&gt;
# Feb 1, Tues, 1:00 EST call with iRods Reagan Moore ([[NDSA:Media:NIAID.ppt|presentation]])&lt;br /&gt;
# Feb 14, Monday, 11:00 EST call with Duracloud&lt;br /&gt;
# Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert&lt;br /&gt;
&lt;br /&gt;
==People/Projects to Contact==&lt;br /&gt;
*DuraCloud/Duraspace (Leslie to contact)&lt;br /&gt;
*Chronopolis (Mike Smorul will contact)&lt;br /&gt;
*Open questions from the Educopia Guide to Distributed Digital Preservation http://www.metaarchive.org/GDDP (Martin will contact)&lt;br /&gt;
*Irods: Reagan Moore, 2/1/2011  see slides: NIAID.ppt &lt;br /&gt;
*Commercial providers? (Who specifically would we want here? Please add them.)&lt;br /&gt;
**Azure (Leslie to contact)&lt;br /&gt;
**Amazon (Who will contact?)&lt;br /&gt;
&lt;br /&gt;
==General Guiding Questions for Presenters==&lt;br /&gt;
Here we are working on a set of general questions for presenters to develop talks around. &lt;br /&gt;
&lt;br /&gt;
# What sort of use cases is your system designed to support? What doesn&#039;t this support?&lt;br /&gt;
# What preservation strategies would your system support? &lt;br /&gt;
# What preservation standards would your system support? &lt;br /&gt;
# What resources are required to support a solution implemented in your environment? &lt;br /&gt;
# What infrastructure do you rely on?&lt;br /&gt;
# How can the cloud environment impact digital preservation activities?&lt;br /&gt;
# If we put data in your system today what systems and processes are in place so that we can get it back 50 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)&lt;br /&gt;
&lt;br /&gt;
===Responses to questions===&lt;br /&gt;
====iRODS====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The need for each storage target to support a specific set of operations, and consistently with other storage targets, seems like a risk that comes along with the elegant abstraction that iRODS provides. Clear specifications help mitigate this risk.&lt;br /&gt;
&lt;br /&gt;
====DuraCloud====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The cloud provider is generally treated as a black box, without a strong sense of the actual reliability of the underlying storage systems. Cloud providers tend to promise checksum validation of contents, but the recourse if validation fails was unknown (right?). DuraCloud adds its own checksum validation on top of the service.&lt;br /&gt;
&lt;br /&gt;
====MetaArchive/GDDP====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] Built on LOCKSS, so data integrity assurances are provided by a robust networked software model layered on commodity hardware and storage. The federated nature provides integrity assurance but also entails a lack of central control: the accidental loss of multiple caches is unlikely, but, for example, scheduled maintenance or upgrades could coincidentally collide.&lt;br /&gt;
&lt;br /&gt;
====Chronopolis====&lt;br /&gt;
# ...&lt;br /&gt;
====Microsoft Azure====&lt;br /&gt;
# ...&lt;br /&gt;
====Amazon S3/EC2====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==General Concerns==&lt;br /&gt;
# confidential data&lt;br /&gt;
# encrypted data&lt;br /&gt;
# auditing&lt;br /&gt;
# preservation risks&lt;br /&gt;
# legal compliance&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Solution Models and Environments==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot;&lt;br /&gt;
!Name&lt;br /&gt;
!Offered as Service&lt;br /&gt;
!Deployed Locally&lt;br /&gt;
!Opensource&lt;br /&gt;
!Authentication Scheme&lt;br /&gt;
!Ingest Mechanism&lt;br /&gt;
!Export Mechanism&lt;br /&gt;
!Integrity/Validation Mechanism&lt;br /&gt;
!Replication Mechanism&lt;br /&gt;
!Administration Model (Federated, etc.)&lt;br /&gt;
!Tiering Support&lt;br /&gt;
|-&lt;br /&gt;
|iRODS&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|DuraCloud&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|MetaArchive/GDDP&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Chronopolis&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Microsoft Azure&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Amazon S3/EC2&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2002</id>
		<title>NDSA:Cloud Presentations</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2002"/>
		<updated>2011-03-18T21:14:15Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* DuraCloud */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In each case we want to identify who would present, who will contact them, and when they will present. &lt;br /&gt;
&lt;br /&gt;
From there we can include specific questions we would like them to respond to. &lt;br /&gt;
&lt;br /&gt;
==Presentation Schedule==&lt;br /&gt;
Once we start scheduling presenters we will keep a list of the talks here.&lt;br /&gt;
# Feb 1, Tues, 1:00 EST call with iRods Reagan Moore ([[NDSA:Media:NIAID.ppt|presentation]])&lt;br /&gt;
# Feb 14, Monday, 11:00 EST call with Duracloud&lt;br /&gt;
# Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert&lt;br /&gt;
&lt;br /&gt;
==People/Projects to Contact==&lt;br /&gt;
*DuraCloud/Duraspace (Leslie to contact)&lt;br /&gt;
*Chronopolis (Mike Smorul will contact)&lt;br /&gt;
*Open questions from the Educopia Guide to Distributed Digital Preservation http://www.metaarchive.org/GDDP (Martin will contact)&lt;br /&gt;
*Irods: Reagan Moore, 2/1/2011  see slides: NIAID.ppt &lt;br /&gt;
*Commercial providers? (Who specifically would we want here? Please add them.)&lt;br /&gt;
**Azure (Leslie to contact)&lt;br /&gt;
**Amazon (Who will contact?)&lt;br /&gt;
&lt;br /&gt;
==General Guiding Questions for Presenters==&lt;br /&gt;
Here we are working on a set of general questions for presenters to develop talks around. &lt;br /&gt;
&lt;br /&gt;
# What sort of use cases is your system designed to support? What doesn&#039;t this support?&lt;br /&gt;
# What preservation strategies would your system support? &lt;br /&gt;
# What preservation standards would your system support? &lt;br /&gt;
# What resources are required to support a solution implemented in your environment? &lt;br /&gt;
# What infrastructure do you rely on?&lt;br /&gt;
# How can the cloud environment impact digital preservation activities?&lt;br /&gt;
# If we put data in your system today what systems and processes are in place so that we can get it back 50 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)&lt;br /&gt;
&lt;br /&gt;
===Responses to questions===&lt;br /&gt;
====iRODS====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The need for each storage target to support a specific set of operations, and consistently with other storage targets, seems like a risk that comes along with the elegant abstraction that iRODS provides. Clear specifications help mitigate this risk.&lt;br /&gt;
&lt;br /&gt;
====DuraCloud====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The cloud provider is generally treated as a black box, without a strong sense of the actual reliability of the underlying storage systems. Cloud providers tend to promise checksum validation of contents, but the recourse if validation fails was unknown (right?). DuraCloud adds its own checksum validation on top of the service.&lt;br /&gt;
&lt;br /&gt;
====MetaArchive/GDDP====&lt;br /&gt;
# ...&lt;br /&gt;
====Chronopolis====&lt;br /&gt;
# ...&lt;br /&gt;
====Microsoft Azure====&lt;br /&gt;
# ...&lt;br /&gt;
====Amazon S3/EC2====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==General Concerns==&lt;br /&gt;
# confidential data&lt;br /&gt;
# encrypted data&lt;br /&gt;
# auditing&lt;br /&gt;
# preservation risks&lt;br /&gt;
# legal compliance&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Solution Models and Environments==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot;&lt;br /&gt;
!Name&lt;br /&gt;
!Offered as Service&lt;br /&gt;
!Deployed Locally&lt;br /&gt;
!Opensource&lt;br /&gt;
!Authentication Scheme&lt;br /&gt;
!Ingest Mechanism&lt;br /&gt;
!Export Mechanism&lt;br /&gt;
!Integrity/Validation Mechanism&lt;br /&gt;
!Replication Mechanism&lt;br /&gt;
!Administration Model (Federated, etc.)&lt;br /&gt;
!Tiering Support&lt;br /&gt;
|-&lt;br /&gt;
|iRODS&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|DuraCloud&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|MetaArchive/GDDP&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Chronopolis&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Microsoft Azure&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Amazon S3/EC2&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2001</id>
		<title>NDSA:Cloud Presentations</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=2001"/>
		<updated>2011-03-18T21:05:23Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* iRODS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In each case we want to identify who would present, who will contact them, and when they will present. &lt;br /&gt;
&lt;br /&gt;
From there we can include specific questions we would like them to respond to. &lt;br /&gt;
&lt;br /&gt;
==Presentation Schedule==&lt;br /&gt;
Once we start scheduling presenters we will keep a list of the talks here.&lt;br /&gt;
# Feb 1, Tues, 1:00 EST call with iRODS Reagan Moore ([[NDSA:Media:NIAID.ppt|presentation]])&lt;br /&gt;
# Feb 14, Monday, 11:00 EST call with DuraCloud&lt;br /&gt;
# Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert&lt;br /&gt;
&lt;br /&gt;
==People/Projects to Contact==&lt;br /&gt;
*DuraCloud/Duraspace (Leslie to contact)&lt;br /&gt;
*Chronopolis (Mike Smorul will contact)&lt;br /&gt;
*Open questions from the Educopia Guide to Distributed Digital Preservation http://www.metaarchive.org/GDDP (Martin will contact)&lt;br /&gt;
*iRODS: Reagan Moore, 2/1/2011; see slides: NIAID.ppt &lt;br /&gt;
*Commercial providers? (Who specifically would we want here? Please add them.)&lt;br /&gt;
**Azure (Leslie to contact)&lt;br /&gt;
**Amazon (Who will contact?)&lt;br /&gt;
&lt;br /&gt;
==General Guiding Questions for Presenters==&lt;br /&gt;
Here we are working on a set of general questions for presenters to develop talks around. &lt;br /&gt;
&lt;br /&gt;
# What sort of use cases is your system designed to support? What doesn&#039;t it support?&lt;br /&gt;
# What preservation strategies would your system support? &lt;br /&gt;
# What preservation standards would your system support? &lt;br /&gt;
# What resources are required to support a solution implemented in your environment?&lt;br /&gt;
# What infrastructure do you rely on?&lt;br /&gt;
# How can the cloud environment impact digital preservation activities?&lt;br /&gt;
# If we put data in your system today, what systems and processes are in place so that we can get it back 50 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)&lt;br /&gt;
&lt;br /&gt;
===Responses to questions===&lt;br /&gt;
====iRODS====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
Other general notes:&lt;br /&gt;
&lt;br /&gt;
* [Snavely] The need for each storage target to support a specific set of operations, consistently with the other storage targets, seems like a risk that comes along with the elegant abstraction that iRODS provides. Clear specifications help mitigate this risk.&lt;br /&gt;
&lt;br /&gt;
====DuraCloud====&lt;br /&gt;
# ...&lt;br /&gt;
====MetaArchive/GDDP====&lt;br /&gt;
# ...&lt;br /&gt;
====Chronopolis====&lt;br /&gt;
# ...&lt;br /&gt;
====Microsoft Azure====&lt;br /&gt;
# ...&lt;br /&gt;
====Amazon S3/EC2====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==General Concerns==&lt;br /&gt;
# confidential data&lt;br /&gt;
# encrypted data&lt;br /&gt;
# auditing&lt;br /&gt;
# preservation risks&lt;br /&gt;
# legal compliance&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Solution Models and Environments==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot;&lt;br /&gt;
!Name&lt;br /&gt;
!Offered as Service&lt;br /&gt;
!Deployed Locally&lt;br /&gt;
!Open Source&lt;br /&gt;
!Authentication Scheme&lt;br /&gt;
!Ingest Mechanism&lt;br /&gt;
!Export Mechanism&lt;br /&gt;
!Integrity/Validation Mechanism&lt;br /&gt;
!Replication Mechanism&lt;br /&gt;
!Administration Model (Federated, etc.)&lt;br /&gt;
!Tiering Support&lt;br /&gt;
|-&lt;br /&gt;
|iRODS&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|DuraCloud&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|MetaArchive/GDDP&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Chronopolis&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Microsoft Azure&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Amazon S3/EC2&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
	<entry>
		<id>https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=1996</id>
		<title>NDSA:Cloud Presentations</title>
		<link rel="alternate" type="text/html" href="https://wiki.diglib.org/index.php?title=NDSA:Cloud_Presentations&amp;diff=1996"/>
		<updated>2011-02-07T18:47:58Z</updated>

		<summary type="html">&lt;p&gt;Csnavely: /* Solution Models and Environments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In each case we want to identify who would present, who will contact them, and when they will present. &lt;br /&gt;
&lt;br /&gt;
From there we can include specific questions we would like them to respond to. &lt;br /&gt;
&lt;br /&gt;
==Presentation Schedule==&lt;br /&gt;
Once we start scheduling presenters we will keep a list of the talks here.&lt;br /&gt;
# Feb 1, Tues, 1:00 EST call with iRODS Reagan Moore ([[NDSA:Media:NIAID.ppt|presentation]])&lt;br /&gt;
# Feb 14, Monday, 11:00 EST call with DuraCloud&lt;br /&gt;
# Feb 17, Thurs, 11:00 EST call with MetaArchive/GDDP Katherine Skinner, Matt Schultz and Martin Halbert&lt;br /&gt;
&lt;br /&gt;
==People/Projects to Contact==&lt;br /&gt;
*DuraCloud/Duraspace (Leslie to contact)&lt;br /&gt;
*Chronopolis (Mike Smorul will contact)&lt;br /&gt;
*Open questions from the Educopia Guide to Distributed Digital Preservation http://www.metaarchive.org/GDDP (Martin will contact)&lt;br /&gt;
*iRODS: Reagan Moore, 2/1/2011; see slides: NIAID.ppt &lt;br /&gt;
*Commercial providers? (Who specifically would we want here? Please add them.)&lt;br /&gt;
**Azure (Leslie to contact)&lt;br /&gt;
**Amazon (Who will contact?)&lt;br /&gt;
&lt;br /&gt;
==General Guiding Questions for Presenters==&lt;br /&gt;
Here we are working on a set of general questions for presenters to develop talks around. &lt;br /&gt;
&lt;br /&gt;
# What sort of use cases is your system designed to support? What doesn&#039;t it support?&lt;br /&gt;
# What preservation strategies would your system support? &lt;br /&gt;
# What preservation standards would your system support? &lt;br /&gt;
# What resources are required to support a solution implemented in your environment?&lt;br /&gt;
# What infrastructure do you rely on?&lt;br /&gt;
# How can the cloud environment impact digital preservation activities?&lt;br /&gt;
# If we put data in your system today, what systems and processes are in place so that we can get it back 50 years from now? (Take for granted a sophisticated audience that knows about multiple copies etc.)&lt;br /&gt;
&lt;br /&gt;
===Responses to questions===&lt;br /&gt;
====iRODS====&lt;br /&gt;
# ...&lt;br /&gt;
====DuraCloud====&lt;br /&gt;
# ...&lt;br /&gt;
====MetaArchive/GDDP====&lt;br /&gt;
# ...&lt;br /&gt;
====Chronopolis====&lt;br /&gt;
# ...&lt;br /&gt;
====Microsoft Azure====&lt;br /&gt;
# ...&lt;br /&gt;
====Amazon S3/EC2====&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==General Concerns==&lt;br /&gt;
# confidential data&lt;br /&gt;
# encrypted data&lt;br /&gt;
# auditing&lt;br /&gt;
# preservation risks&lt;br /&gt;
# legal compliance&lt;br /&gt;
# ...&lt;br /&gt;
&lt;br /&gt;
==Solution Models and Environments==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot;&lt;br /&gt;
!Name&lt;br /&gt;
!Offered as Service&lt;br /&gt;
!Deployed Locally&lt;br /&gt;
!Open Source&lt;br /&gt;
!Authentication Scheme&lt;br /&gt;
!Ingest Mechanism&lt;br /&gt;
!Export Mechanism&lt;br /&gt;
|-&lt;br /&gt;
|iRODS&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|DuraCloud&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|MetaArchive/GDDP&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Chronopolis&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Microsoft Azure&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Amazon S3/EC2&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Csnavely</name></author>
	</entry>
</feed>