NDSA:Tuesday, Mar 25, 2014
Latest revision as of 14:20, 11 February 2016
Roster
- Trevor Owens, Library of Congress
- Karen Cariani, WGBH
- Barrie Howard, Library of Congress
- Dave MacCarn, WGBH
- Jim Harper, Property Records Industry Association (PRIA)
- Joe Pawletko, New York University
- Martin Jacobson, U.S. National Archives and Records Administration
- Shawn Nicholson, Michigan State University
- Kevin McCarthy, U.S. National Archives and Records Administration
- Martin Kong, Chicago State University
- Chelcie Rowell, Wake Forest University
- Kat Bell, Dance Heritage Coalition
- Leah Prescott, Georgetown University Law Center
- Ernest Bryant, U.S. National Archives and Records Administration
Agenda
- Update on 2015 National Agenda for Digital Stewardship
- Fixity check factsheet
- Update on NDSA Storage Survey report
- Ideas for potential speakers - the ArchivesSpace webinar was well attended; awaiting interview responses
- Digital Preservation 2014 meeting
- Open source software in digital preservation projects (interview series)
- Future directions
Action Items
- Draft a lightning talk proposal on the fixity check factsheet - Trevor
- Pull out thematic topics from blog posts, and share the ideas with the group - Trevor
Discussion
Update on 2015 National Agenda for Digital Stewardship
Please read the 2014 National Agenda (http://www.digitalpreservation.gov/ndsa/nationalagenda/), if you haven't. This group had a call, put together some ideas, and passed them along to the Coordinating Committee. If you have anything to add, email your ideas back to the list, or pass them along to Trevor or Karen.
Fixity check factsheet
There hasn't been a lot of feedback on the blog post about the fixity check document (http://blogs.loc.gov/digitalpreservation/2014/02/check-yourself-how-and-when-to-check-fixity/). This document was developed on a previous call where the group worked up a draft factsheet (http://blogs.loc.gov/digitalpreservation/files/2014/02/NDSA-Checking-your-digital-content-Draft-2-5-14.pdf). The response people have given has been positive, so we're in a position to call it done and release a first version. A general announcement should go out to NDSA-All providing another week or two for responses, and then we'll call it a day.
Update on NDSA Storage Survey report
It's taking a little time to get the data into shape and the analysis done. Once it comes together, someone needs to lead writing the actual report. The previous report provides a foundation, so the writing will involve updating from last time. It will still be a while before the charting is finished, maybe about 2-3 months. Leah Prescott will help lead the writing. The next storage survey will be conducted in 2015.
Ideas for potential speakers
The ArchivesSpace (http://www.archivesspace.org/) webinar was well attended, and Trevor is awaiting interview responses from Brad. What topics are of interest to infrastructure people, and to the NDSA as a whole? There has been a presentation on DPN (http://www.dpn.org/), but maybe someone who is an implementer could speak. Mark Leggott spoke recently about Islandora (http://islandora.ca/). One potential topic is the Olive Library Project from Carnegie Mellon, which is providing emulation and virtualization as a service. Another is a service for migrating LTO tape from generation 4 to 6. Open source Swift nodes could also be an interesting topic. Amazon is built on top of object stores, and a number of projects, e.g., SwiftStack (https://swiftstack.com/openstack-swift/architecture/) and Ceph (http://ceph.com/), follow that model of interfacing with your storage. If anyone has any ideas, please share them over the list.
Digital Preservation 2014 meeting
The call for proposals yielded over 80 proposals. The meeting takes place in the DC Metro Area from July 22-24, so mark your calendars. If it would be interesting to have a face to face, that can be arranged. The group can ask for meeting space, or just meet up for happy hour. A lot of special interest groups meet for breakfast before the program starts. There was consensus that it would be good to meet face to face for those who will be attending.
Open source software in digital preservation projects (interview series)
Trevor will pull out some themes from the blog posts, and share the ideas with the group.
Future directions
An open discussion followed in which the callers discussed their local setups:
- PRIA works with local governments in backing up their records, and is interested in learning about best practices in the preservation of electronic records. There are industry and technology changes, and new talent wants to do things differently, so they need to keep up, figure out how to monitor these changes, and work out how to be sustainable. PRIA does a variety of things, but their core business is to help preserve the documentation of the exchange of property.
- Georgetown Law is beginning to implement a new server to store bagged files and METS records for metadata, and to document the workflow process for digital content. They basically have virtual storage in a server farm. It's not something the Law Library has done before. They are also working with the Washington Research Library Consortium (WRLC, http://www.wrlc.org/) on developing procedures for born-digital content, and with the Georgetown main campus to acquire a DAMS.
- NYU is doing a lot of in-house digitization, and has a lot of images, audio, and video. They are using BagIt (http://en.wikipedia.org/wiki/BagIt) and Git at upload, plus Amazon storage, and a microservices approach to fixity checks. They are currently re-engineering their message architecture to include an event logger to log things as they happen. Joe could talk about this in a couple of months.
- NARA is re-architecting its Electronic Records Archive (ERA), but can't disclose any details at the moment. They will be looking at their digital processing environment, and at pre-processing incoming materials before they go into a repository. The ERA may include a cloud-based staging area with some tools. Kevin will find out what he can share, and get back to the group.
- Dance Heritage Coalition does a lot of digitization for their partners. They are looking into implementing LTO-6 tape drives, but can't yet share details publicly. Digitizing happens through hubs in DC, NY, and SF. There are many unique moving image materials, and they are creating preservation copies and access nodes. Dave Rice of BAVC (https://www.bavc.org/dave-rice) is the main technical consultant, and the best person to talk to. He has developed some quality control tools through an NEH grant, and they're getting into the final stage of the project. A development bootcamp is being held in San Francisco on March 26. BAVC received the grant from NEH (https://www.bavc.org/BAVC-awarded-NEH-preservation-grant).
- WGBH received an NEH digital preservation grant to build a Hydra stack on Fedora, and is wrapping it up. They wanted to see if they could build something to handle large files, and have it replicated. It was modeled on a system at Penn State, but needed to accommodate WGBH's specific needs. They have run into some challenges managing large files, and moving things around. Prior to this project they had used a proprietary tape robot for years, but then tested the Hydra implementation. Managing the expectations of users has been a big lesson learned, because you're not going to get a 100GB file back immediately. They are re-working their workflow. Instead of relying on the robot system, they are moving back to using a vault. People will have to go pull LTO tapes from the archive so that a particular file can be pulled back to the user's computer. They have found it's very difficult to put a lot of money into infrastructure. The Hydra system works, and they have total control over their code. Fedora 4 has come out, and they're thinking about how to migrate from 3 to 4, and what 4 offers. They can give a talk on this down the road.
- The Library of Congress has been working on best editions statements regarding the deposit of software, and may be putting out some format guidance in the near future. It'll be broadly distributed.
- Michigan State University just sent Media Preserve 40 VHS tapes to reformat, and will receive preservation masters and mezzanine formats. They have a Fedora repository, and store on a SAN. They tried Archivematica (https://www.archivematica.org/wiki/Main_Page) with Fedora underneath and Drupal on top for access. They are now pretty much an Islandora shop. They hold 12 TB of data with mezzanine files. They lost a key staff member, which derailed things for a while.
- Chicago State University has been working on the Digital POWRR project (http://digitalpowrr.niu.edu/), funded by IMLS. They, and other small- to medium-sized institutions, have issues with access and preservation, especially funding. They looked at tools that are available either as open source or for purchase. Five institutions of different sizes and constituencies participated in the project. They have tested DuraCloud, MetaArchive, Archivematica, and Preservica (http://preservica.com/). They are building capacity and knowledge, and will use what they learned going forward. They are not sure if any next steps will be consortially addressed. A State of Illinois legal mandate to provide open access to research articles, the Open Access to Research Articles Act (Public Act 098-0295, http://www.ilga.gov/legislation/publicacts/fulltext.asp?Name=098-0295), is one driver for the work of the Digital POWRR project.
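
For readers less familiar with the fixity checks mentioned in the factsheet item and in the NYU update above, the following is a minimal sketch of what such a check can look like against a BagIt-style bag. It is not NYU's implementation or the factsheet's recommendation. It assumes a bag directory containing a `manifest-sha256.txt` file with one "checksum path" pair per line; the function names `sha256_of` and `check_fixity` are hypothetical, and the sketch ignores details of the BagIt specification such as tag manifests and percent-encoded paths.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(bag_dir, manifest_name="manifest-sha256.txt"):
    """Compare each file listed in the manifest with its recorded checksum.

    Returns a dict mapping each listed path to 'ok', 'changed', or 'missing'.
    Simplified assumption: every manifest line is '<checksum> <relative path>'.
    """
    bag = Path(bag_dir)
    results = {}
    manifest_text = (bag / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(None, 1)
        rel_path = rel_path.strip()
        target = bag / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif sha256_of(target) == expected.lower():
            results[rel_path] = "ok"
        else:
            results[rel_path] = "changed"
    return results
```

Calling `check_fixity("/path/to/bag")` on a schedule and recording the returned statuses would be one simple way to combine fixity checking with the kind of event logging described above. How often to run it, and where to store the results, depends on local storage and staffing.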