NDSA:Penn State: Difference between revisions

From DLF Wiki

Revision as of 09:45, 11 April 2011

Penn State Response to Implementations of Large Scale Storage Architectures

  1. What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)
    • Digital Library Technologies, a unit of Information Technology Services at Penn State, provides IT systems and services to support the teaching, research, and outreach mission of the Penn State University Libraries and the institution. A core goal of Digital Library Technologies is advancing a joint initiative with the Penn State University Libraries to build an institutional digital stewardship program. The program addresses extant and emerging digital content and asset management needs in areas such as digital library collections, scholarly communications, electronic records archiving, and e-science/e-research data management. Accordingly, we have numerous preservation goals including all of the examples listed.
  2. What large scale storage or cloud technologies are you using to meet that challenge? Further, which service providers or tools did you consider and how did you make your choice?
    • We are not currently working with cloud providers. Our storage infrastructure is largely built on HP StorageWorks Enterprise Virtual Arrays (e.g., the 8400), which provide virtualized Fibre Channel storage area network (SAN) storage.
  3. Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)?
    • Most of our materials now are image- and text-based, but we have bits and pieces of everything. We have initiatives under way that may bring in a large amount of electronic records (largely text-based but also web archives) and research data sets.
  4. How big is your collection? (In terms of number of objects and storage space required)
    • Because our collections are currently spread across at least four legacy delivery applications, we lack a good way to keep track of the number of objects. We estimate that our collection is a few dozen terabytes, including archival masters and delivery copies.
  5. What are your performance requirements?
    • Not yet identified.
  6. What storage media have you elected to use? (Disk, Tape, etc)
    • We use hard disks for all of our storage; at the moment, tape is used only for backups.
  7. What do you think are the key advantages of the system you use?
    • TBD
  8. What do you think are the key problems or disadvantages your system presents?
    • TBD
  9. What important principles informed your decision about the particular tool or service you chose to use?
    • TBD
  10. How frequently do you migrate from one system to another?
    • TBD
  11. What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)
    • TBD
  12. What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)
    • We have just built a proof-of-concept services layer above the storage layer that provides functionality useful in the long-term preservation context, such as on-demand checksum validation, provenance event logging, and version control. We plan to build our prototype out soon, and it will include some other functionality (e.g. replication).
  13. Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?
    • None come to mind, though we will likely build out a production-ready version of our prototype with some TRAC guidelines in mind.
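
The response to question 12 mentions on-demand checksum validation and provenance event logging in a services layer above storage. The following is a minimal sketch of how those two functions could fit together; it is not Penn State's implementation. The function names, the SHA-256 algorithm choice, the JSON Lines log format, and the event field names are all assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large masters need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_fixity(path: Path, expected: str, event_log: Path) -> bool:
    """Recompute the checksum, compare it with the stored value, log the outcome."""
    actual = sha256_of(path)
    ok = actual == expected
    event = {
        "event_type": "fixity_check",  # hypothetical event vocabulary
        "object": str(path),
        "outcome": "pass" if ok else "fail",
        "expected": expected,
        "actual": actual,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only JSON Lines file stands in for a provenance event store.
    with event_log.open("a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
    return ok


def demo() -> list:
    """Check one object twice: once intact, once after simulated corruption."""
    with tempfile.TemporaryDirectory() as tmp:
        obj = Path(tmp) / "master.tif"  # hypothetical object name
        log = Path(tmp) / "events.jsonl"
        obj.write_bytes(b"archival master bytes")
        stored = sha256_of(obj)  # checksum recorded at ingest
        first = validate_fixity(obj, stored, log)
        obj.write_bytes(b"archival master bytez")  # simulate bit-level damage
        second = validate_fixity(obj, stored, log)
        lines = log.read_text(encoding="utf-8").splitlines()
        outcomes = [json.loads(line)["outcome"] for line in lines]
        return [first, second, outcomes]


if __name__ == "__main__":
    print(demo())
```

Logging every check, including the passing ones, gives the object an audit trail over time rather than only a record of failures, which is the kind of evidence a TRAC-minded review would likely look for.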