NDSA:NYU Response

From DLF Wiki

1. What is the particular preservation goal or challenge you need to accomplish? (for example, re-use, public access, internal access, legal mandate, etc.)

NYU Libraries processes, enables access to, and preserves digital materials that come from both the NYU community and from collaborating partner organizations.


2. What large scale storage or cloud technologies are you using to meet that challenge? Further, why did you choose these particular technologies?

Our current repository asset store runs on Sun Fire X4500 and X4540 storage servers. The data servers are mirrored and backed up to tape. We are building a new repository system using Isilon storage arrays. The Isilon arrays are mirrored, geographically distributed, and backed up to tape.

We are not pursuing cloud storage at this time.


3. Specifically, what kind of materials are you preserving (text, data sets, images, moving images, web pages, etc.)

Our preservation repository contains:
- texts
- images
- video
- audio


4. How big is your collection? (In terms of number of objects and storage space required)

Combined existing and new repository systems: 22,594 objects, 81 TB (63 TB of video)


5. What are your performance requirements? Further, why are these your particular requirements?

The storage solution must be fast enough to support ongoing fixity, ingest, and access operations.


6. What storage media have you elected to use? (Disk, Tape, etc) Further, why did you choose these particular media?

We use both disk and tape (for backup). The first and second copies are stored on disk. The third copy is stored on tape.

We need content on disk because we serve some content directly from repository storage. We also transcode to create access copies served through streaming media servers.


7. What do you think are the key advantages of the system you use?

The new system is under construction, but will be able to support various curation, publication, and preservation workflows. The underlying storage solution will allow us to easily add capacity to the system as needed.


8. What do you think are the key problems or disadvantages your system presents?

Ingest in our current system can be rather slow due to the ingest mechanisms in our application.


9. What important principles informed your decision about the particular tool or service you chose to use?

We requested that the storage system be scalable, and ideally present a single filesystem to the applications using the storage. Our systems group then researched multiple storage solutions.


10. How frequently do you migrate from one system to another? Further, what is it that prompts you to make these migrations?

We are coming up on our first major migration in approximately four years. In addition to the content in our preservation repository, we have legacy content that is stored across multiple systems. The new repository should allow us to aggregate and manage all of our content in a single system.


11. What characteristics of the storage system(s) you use do you feel are particularly well-suited to long-term digital preservation? (High levels of redundancy/resiliency, internal checksumming capabilities, automated tape refresh, etc)

The Isilon storage system is designed to scale and includes configurable data integrity and data recovery features.


12. What functionality or processes have you developed to augment your storage systems in order to meet preservation goals? (Periodic checksum validation, limited human access or novel use of permissions schemes)

We run ongoing fixity checks and "completeness" checks.
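To make the two kinds of check concrete, here is a minimal sketch in Python. It is not NYU's implementation, which this response does not describe. It assumes a hypothetical manifest that maps each relative file path to a SHA-256 digest recorded at ingest; the function names `sha256_of` and `audit`, and the choice of SHA-256, are illustrative assumptions. A fixity check recomputes each digest and compares it to the recorded one. A completeness check confirms that every file in the manifest is present and that nothing unlisted has appeared.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under `root` with a manifest of {relative path: digest}.

    The manifest format is a hypothetical one chosen for this sketch.
    """
    root = Path(root)
    # Every regular file currently on disk, as POSIX-style relative paths.
    on_disk = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    report = {"missing": [], "unexpected": [], "corrupt": []}
    for name, expected in sorted(manifest.items()):
        if name not in on_disk:
            report["missing"].append(name)      # completeness failure
        elif sha256_of(root / name) != expected:
            report["corrupt"].append(name)      # fixity failure
    # Files on disk that the manifest does not list.
    report["unexpected"] = sorted(on_disk - set(manifest))
    return report
```

Run on a schedule against each copy, a report with three empty lists would indicate that the copy is both complete and unchanged since the digests were recorded.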


13. Are there tough requirements for digital preservation, e.g. TRAC certification, that you wish were more readily handled by your storage system?

Not at this time.