NDSA:Monday, November 22, 2010

From DLF Wiki
Revision as of 14:17, 11 February 2016 by Dlfadm (talk | contribs) (2 revisions imported: Migrate NDSA content from Library of Congress)

NDSA Infrastructure Working Group Notes for First Phone Meeting Monday, November 22, 2010

In attendance

  • Leslie Johnston, Library of Congress (co-chair)
  • Karen Cariani, WGBH (co-chair)
  • Jimi Jones, Library of Congress (note taker)
  • Micah Altman, Harvard University
  • Aaron Binns, Internet Archive
  • Joe Pawletko, New York University
  • Kris Carpenter, Internet Archive
  • Dean Farrell, NCDCR
  • Daniel Dodge, Thomson Reuters
  • Michelle Gallinger, Library of Congress
  • Michael J. Giarlo, Penn State
  • Martin Halbert, University of North Texas
  • Mike Smoral, University of Maryland
  • Butch Lazorchak, Library of Congress
  • Cal Lee, University of North Carolina
  • Trevor Owens, Library of Congress
  • Curtis Pulford, State of Wisconsin Geographic Information Office
  • David Minor, San Diego Supercomputer Center
  • John Spencer, BMS-Chace
  • John Unsworth, University of Illinois at Urbana-Champaign
  • Andrew Woods, DuraSpace
  • William Ying, ArtStor
  • Daphane DeLeon, Nevada State Library and Archives

Meeting Notes

After a round-robin session of introductions Karen led us into a discussion of the Infrastructure Working Group Charter draft that she had been compiling.

Statement of Purpose Discussion

  • The Infrastructure Working Group will work to build a community of sharing information and best practices about technical developments for systems to support digital preservation.
  • The focus of the Infrastructure Working Group is the development and maintenance of tools for curation and preservation, and providing storage, hosting, migration, or similar services for the long term preservation of digital content.

John Unsworth said that both the above options have implicit statements that we should make explicit.

Leslie said that we should be more explicit about the use of the term “providing” in #2. The NDSA/Library of Congress will not be providing the services listed in that statement.

Curt noted that the two statements are very different from one another. There was general discussion about the two, and the general opinion was that #2 is probably closer to what this group can reasonably do with the time and tools at our disposal.

Micah preferred #2 because it, to his mind, provides more of a focus for the first 1-3 years of this working group’s life. Curt agreed with this and furthered it by saying that we could blend the two statements together. He prefers the term “investigating” rather than “development.”

We then moved on to the Practices section of the scope statement. This section refers to the means by which the group will work – the tools the group will employ to communicate and collaborate.

Micah asserted that we should make explicit in the scope statement that attendance at the annual NDSA conference is not mandatory since budgets are different for the various members. This suggestion was met with general assent.

Leslie mentioned that one tool LC is developing is a wiki for the various NDSA working groups. Michelle said that the wiki should be done in the next 2-3 weeks.

Martin said that all the tools listed seem perfectly good and that some of them overlap (listservs and email, for example). He asked whether there are any expectations about which will be used the most. In response, Karen registered her preference for emails and phone calls for actually getting work done and communicating. There was agreement on this. A point was made that emails and phone calls should be group-accessible. A side note from your note taker: perhaps using WebEx where appropriate and recording the sessions (audio and slides/white board) could help with this?

There was also a discussion of using Google Docs because this is a good tool for simultaneous editing. Concerns over privacy were raised but there was agreement that Google Docs could be a good and powerful tool in cases where privacy need not be a critical issue.

Martin noted that the term “social media” is pretty generic. Leslie joked that we likely won’t create an NDSA Infrastructure Group Facebook page. It was noted, however, that social media like Twitter could be a great way to disseminate results if not necessarily an appropriate place to do the work.

Participation Section Discussion

There was a brief discussion of the term “Action Teams” under this section. Michelle said that this term was adapted for NDSA from another effort, the National Geospatial Advisory Committee (NGAC, http://www.fgdc.gov/ngac). The purpose of action teams within NDSA working groups will be to break work down into chunks and allow group members to interface with outside experts who are not part of the NDSA. Action team members who are not NDSA members will not have the same privileges or ability to vote that NDSA members have. This does, however, allow flexibility with respect to whom working group members interface with on projects.

Karen went through the participant requirements in this section to general assent. Leslie noted that these requirements can be revisited and edited over time – this is not a static section. (This presumably applies to the entire scope statement document.)

Micah identified the need to rectify scope statements between groups so there is no “cross cutting.” Leslie agreed that this is necessary and said that this is one of the purposes of the upcoming December NDSA “Constitutional Congress” meeting.

It was also noted that action teams will have to create their own scopes of work for themselves as needed.

Scope of Work Discussion

Karen then moved us to the real meat of the meeting: the Scope of Work section.

In this section there are five different ideas for work for this group. The ideas were culled from the IdeaSpace section for this group (http://ndsa.ideascale.com/). The ideas are as follows:

  • Promote collaborative development of free and open source tools using sustainable software development processes, and create a structure for the support of the software over time.
  • Explore the use of computer forensics tools for the appraisal, processing, and preservation of born-digital collections.
  • Design and participate in more initiatives that investigate and document potential preservation best practices in the use of large-scale storage and cloud infrastructures.
  • Promote and suggest strategies and best practices for *storage* of digital objects to be preserved, for preservation architecture, and for protocols for content submission and exchange.
  • Encouraging communities with highly specialized needs (e.g., geospatial, datasets, observational data) to develop storage networks or access services that can serve the entire community.

Karen decided to read through the five ideas and elicit feedback on them in order to tease out which ideas seemed the best fit for the next year or so’s worth of work for this group. She noted that we should focus on those ideas with achievable tasks in that period of time for a group composed of already-full-time workers.

Regarding Idea #1, Leslie noted that she isn’t sure how to translate this into a concrete, achievable deliverable. In response, a meeting participant suggested that we consider “identifying best practices” with respect to open-source tool creation.

In response to a question about the duration of the work in this scope statement Leslie said we should look at this as one year – three years is too daunting.

Another participant said that for #1 we could generate a guide to relevant open-source tools and discuss what each tool does. Another participant said that this would be highly difficult to do because we can’t possibly survey all of the tools out there. Plus this is a moving target as more tools are being developed all the time.

John Unsworth suggested we identify 5 or 6 topics relevant to Idea #1 and discuss one topic per monthly phone call. This could serve as a kind of “brown bag” discussion in which we invite an expert speaker to discuss each topic and then record the discussion in some way. This idea was met with general assent and interest from the participants.

There was more discussion of the “surveying tools” idea, and many seemed to agree that maintaining such a survey and keeping it current would require much more effort than this group could reasonably expend. However, an idea came up to identify the necessary components/characteristics of creating an open source tool: a kind of guide to “what you should do if you’re interested in this kind of thing.”

Karen said that perhaps Idea #1 wouldn’t be the best idea for us to pursue at this point, so we moved on to #2.

John Unsworth said that this idea seems more focused than #1 and therefore may be more manageable. Leslie said that the University of Maryland is working on a very similar project about forensics, so perhaps we should wait at least until they’ve published their findings, which should be very soon.

Cal said Idea #2 is something that would almost certainly need brainpower from outside this group, and that the Maryland report does not cover best practices. Micah argued for waiting on this idea because it’s a bit too narrow a topic for this group. He said he’s leaning towards the “best practices” version of Idea #1.

Cal said that a lot of initiatives (like the Planets work) focus on the bitstream. He said that something that gets ignored is looking at the information that is on the media that you receive. This is the first step in acquiring materials – perhaps this group could focus on that? Leslie said that forensics in the case of transferring materials over a network (via FTP, for example) is potentially very important. There was general agreement on Cal and Leslie’s discussion points.

Karen then turned the discussion to what kinds of activities we want to focus on. Do we want to generate surveys? Best practices? Other?

John Spencer said that he liked the idea that Cal put forth. He said that BMS-Chace deals with lots of data coming off of lots of esoteric formats. This kind of analysis could be highly beneficial for their work.

Karen asked the group if best practices should, generally, be the kind of work we want to generate. There was general assent to this question. Someone remarked that generating best practices would be a good thing to do for the larger digital preservation community.

Karen then summarized what we had as ideas so far:

  • Best practices for forensics (from Idea #2)
  • Best practices for creating open-source tools (from Idea #1)

We then moved to Idea #3. A participant noted that emphasizing “the cloud” in what we do with this group is a good idea because issues of privacy are very hot right now.

John Spencer mentioned a group called the Open Data Center Alliance, which is made up of companies like Marriott, BMW and others. He said the Alliance is investing considerable money in working with vendors to generate cloud storage specifications for their respective needs. John said our working group would do well to stay on top of what the Alliance is doing.

As we started getting pressed for time we moved on to Idea #4. Karen asked if this was really a Standards working group question. Mike said that this is probably an idea that is broader than the Standards group’s work and can really be seen as exploring best practices for choosing relevant products and services. Martin agreed with Mike on this.

We moved on to Idea #5.

Curt opined that there is so little real organization to the GIS community that it would be hard to do work that would persuade such a disparate group. He said that #5 would be “a good idea but tough to do.” Butch said that the Library of Congress has a lot of GIS-related activities going right now. He would be happy to periodically report to the Infrastructure group about what’s going on at LC in this arena. Karen said it might be best to leave this one there and figure out over time how best to address it. She then said that it seems like the group is leaning toward some kind of best practices work.

Kris noted that whatever we decide to do it will be very important to engage outside experts. Martin agreed with this and went on to say that he sees “best practices” as a loaded term since our field is so emergent. We might consider refining that term somehow.

Karen then brought us to considering what our deliverables will be for the group’s work. It seems that a chart and/or some kind of written document will be the chief deliverables. John Unsworth, referring back to his idea for how to implement Idea #1, said that a deliverable could be one of these brown bag conference calls with an expert and the documentation of the call. Kris said she very much likes using wikis because of their “living” and continually-updatable quality. John Spencer added that whatever we produce we should be explicit about the length of time we will maintain that resource.

Curt said that we should take care not to use the terms “free” and “open source” too liberally in our scope statement. These terms, he said, are not necessarily interchangeable. Leslie quipped that this is like a “Free Puppies” sign: you discover how much you have to invest in the puppies once you get them home. We have to be conscious of the fact that there are maintenance and licensing issues involved in open source software, and these can be hidden costs on the back end.

Karen then concluded the meeting by saying she will take the notes from the meeting and use them to help her edit the scope statement. She will also disseminate a Doodle poll to determine the date/time of the next phone meeting.