NDSA:Tuesday, Mar 25, 2014: Difference between revisions

From DLF Wiki

Revision as of 12:19, 26 March 2014

Roster

  • Trevor Owens, Library of Congress
  • Karen Cariani, WGBH
  • Barrie Howard, Library of Congress
  • Dave MacCarn, WGBH
  • Jim Harper, Property Records Industry Association (PRIA)
  • Joe Pawletko, New York University
  • Martin Jacobson, U.S. National Archives and Records Administration
  • Shawn Nicholson, Michigan State University
  • Kevin McCarthy, U.S. National Archives and Records Administration
  • Martin Kong, Chicago State University
  • Chelcie Rowell, Wake Forest University
  • Kat Bell, Dance Heritage Coalition
  • Leah Prescott, Georgetown University Law Center
  • Ernest Bryant, U.S. National Archives and Records Administration

Agenda

  1. Update on 2015 National Agenda for Digital Stewardship
  2. Fixity check factsheet
  3. Update on NDSA Storage Survey report
  4. Ideas for potential speakers - ArchivesSpace was well attended, and awaiting interview responses
  5. Digital Preservation 2014 meeting
  6. Open source software in digital preservation projects (interview series)
  7. Future directions

Action Items

  • Draft a lightning talk proposal on the fixity check factsheet - Trevor
  • Pull out thematic topics from blog posts, and share the ideas with the group - Trevor

Discussion

Update on 2015 National Agenda for Digital Stewardship
Please read the 2014 National Agenda, if you haven't. This group had a call, put together some ideas, and passed them along to the Coordinating Committee. If you have anything to add, email your ideas back to the list, or pass them along to Trevor or Karen.

Fixity check factsheet
There hasn't been a lot of feedback from the blog post about the fixity check document. This document was developed on a previous call where the group worked up a draft factsheet. The response from people has been positive, so we're in a position to call it done and release a first version. A general announcement should go out to NDSA-All providing another week or two for responses, and then we'll call it a day.

Update on NDSA Storage Survey report
It's taking a little time to get the data into shape and the analysis done. Once it comes together, someone needs to lead writing the actual report. The previous report provides a foundation, so the writing will involve updating from last time. It will still be a while before the charting is finished, so maybe in about 2-3 months. Leah Prescott will help lead the writing. The next storage survey will be done in 2015.

Ideas for potential speakers
The ArchivesSpace webinar was well attended, and Trevor is awaiting interview responses from Brad. What topics are of interest to infrastructure people, and the NDSA as a whole? There has been a presentation on DPN, but maybe someone who is an implementer can speak. Mark Leggott spoke recently about Islandora. One potential topic is the Olive Library Project from Carnegie Mellon, which is providing emulation and virtualization as a service. Another is a service for moving LTO tape from generation 4 to 6. Open source Swift nodes could be an interesting topic. Amazon is built on top of object stores, and there are a number of projects, e.g., SwiftStack, that follow that model of interfacing with your storage. If anyone has any ideas, please share them over the list.

Digital Preservation 2014 meeting
The call for proposals yielded over 80 proposals. The meeting takes place in the DC Metro Area from July 22-24, so mark your calendars. If it would be interesting to have a face to face, that can be arranged. The group can ask for meeting space, or just meet up for happy hour. A lot of special interest groups meet for breakfast before the program starts. There was consensus that it would be good to meet face to face for those who will be attending.

Open source software in digital preservation projects (interview series)
Trevor will pull out some thematic topics from the blog posts, and share the ideas with the group.

Future directions
An open discussion followed where the callers discussed their local setups:

  • PRIA works with local governments in backing up their records, and is interested in learning about best practices in the preservation of electronic records. The industry and the technology are changing, and new talent wants to do things differently, so they need to keep up, figure out how to monitor these changes, and figure out how to be sustainable. PRIA does a variety of things, but their core business is to help preserve the documentation of the exchange of property.
  • Georgetown Law is beginning to implement a new server to store bagged files and METS records for metadata, and to document the workflow process for digital content. They basically have virtual storage in a server farm. It's not something the Law Library has done before. They are also working on developing procedures for born-digital content with the Washington Research Library Consortium (WRLC), and with the Georgetown main campus to acquire a DAMS.
  • NYU is doing a lot of in-house digitization, and has a lot of images, audio, and video. They are using BagIt, and Git at upload, plus Amazon storage, and a microservices approach to fixity checks. They are currently re-engineering their message architecture to include an event logger to log things as they happen. Joe could talk about this in a couple of months.
  • NARA is re-architecting its Electronic Records Archive (ERA), but can't disclose any details at the moment. They will be looking at their digital processing environment, and at pre-processing materials that come in prior to putting them into a repository. The ERA may include a cloud-based staging area with some tools. Kevin will find out what he can share, and get back to the group.
  • Dance Heritage Coalition does a lot of digitization for their partners, and is looking into implementing LTO-6 tape drives, but can't yet share details publicly. Digitizing happens through hubs that are in DC, NY, and SF. There are a lot of unique moving image materials, and they are creating preservation copies and access nodes. Dave Rice, BAVC, is the main technical consultant, and the best person to talk to. He has developed some quality control tools through an NEH grant, and they're getting into the final stage of the project. A development bootcamp is being held in San Francisco on March 26. BAVC received the grant from NEH.
  • WGBH received an NEH digital preservation grant to build a Hydra stack on Fedora, and is wrapping it up. They wanted to see if they could build something to handle large files, and have it replicated. It was modeled on work at Penn State, but needed to accommodate WGBH's specific needs. It will take all file tapes. They have run into challenges managing large files, and moving things around. They were using a proprietary tape robot for years, but tested the Hydra implementation. They are managing user expectations, because you're not going to get that 100MB file back immediately. They are re-working their workflow: instead of relying on a robot system, they go back to a vault, pull things from the archive, pull the LTO tape, and pull the file back to their computer. It is very difficult to put a lot of money into infrastructure. They could give a talk on this down the road. The Hydra system works, and they have total control over the code. Fedora 4 has come out, which raises questions: how do you migrate from 3 to 4, and what does it offer?
  • Trevor has been working on a best editions statement on software, and LC may be putting out some format guidance in the near future. It'll be broadly distributed.
  • MSU just sent 40 VHS tapes to Media Preserve, and discussed what mezzanine format to use. They have a Fedora repository, and store on a SAN. They tried Archivematica with Fedora, but are pretty much an Islandora shop. They have 12 TB of data with mezzanine files. They had Drupal for access on top of Archivematica, but then lost a key staff member, which has derailed things.
  • Martin, CSU, sent out the Digital POWRR grant. They and others have issues with access and preservation, especially money. They tried to look at tools that are available out there, either open source or for purchase. The project involves five different institutions of different sizes and with different constituencies. They tested DuraCloud, MetaArchive, Archivematica, and Preservica, and will take the results and use them going forward. They are not sure if this will be consortially addressed. A law was passed giving state universities an open access mandate, which includes a mandate to store and provide perpetual access.
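
Several callers mentioned fixity checks on bagged files, along with logging events as they happen. As a rough illustration only, and not a description of any caller's actual system, a fixity check against a BagIt-style manifest might be sketched as below. The function names, the event fields, and the JSON-lines log format are assumptions made for this example; the manifest layout (one "checksum path" pair per line in manifest-sha256.txt) follows the BagIt convention.

```python
import hashlib
import json
import time
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Stream a file through SHA-256 so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_manifest(manifest_path):
    """Parse lines of the form '<checksum> <relative path>' into a dict."""
    expected = {}
    text = Path(manifest_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        if line.strip():
            checksum, rel_path = line.split(maxsplit=1)
            expected[rel_path.strip()] = checksum.lower()
    return expected


def check_fixity(bag_dir, manifest_name="manifest-sha256.txt", log_path=None):
    """Compare each listed file with its recorded checksum.

    Returns one event per manifest entry with outcome 'ok', 'mismatch',
    or 'missing'. If log_path is given, events are appended as JSON lines
    (a hypothetical log format chosen for this sketch).
    """
    bag_dir = Path(bag_dir)
    events = []
    for rel_path, expected in read_manifest(bag_dir / manifest_name).items():
        target = bag_dir / rel_path
        if not target.is_file():
            outcome = "missing"
        elif sha256_of(target) == expected:
            outcome = "ok"
        else:
            outcome = "mismatch"
        events.append({"time": time.time(), "file": rel_path, "outcome": outcome})
    if log_path is not None:
        with open(log_path, "a", encoding="utf-8") as log:
            for event in events:
                log.write(json.dumps(event) + "\n")
    return events
```

Calling check_fixity on a bag directory on a schedule, and reviewing any events whose outcome is not "ok", is one simple way to act on the results. Streaming the file in chunks matters for the large audio and video files several callers described.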

Documents

File:NDSA Fixity Check Project Concept Draft v6 5.pdf

CSU is part of an IMLS grant working on tools for digital preservation for small to mid-sized academic institutions. They finished the tool testing, and are about to finish workshops? Digital POWRR

The Fixity Check factsheet is missing, add as #2. The call for the summer meeting has closed, but if there are any requests send them to Trevor, #5.