NDSA:November 2 Blog Preservation Meeting Minutes

From DLF Wiki
Revision as of 14:47, 7 November 2011 by Abgr (talk | contribs) (→‎Discussion)

November 2, 2011, 11am ET

Attendees

  • Anderson, Janice Snyder | Georgetown University Law Library | anderjan@law.georgetown.edu
  • Anderson, Martha | Director, NDIIPP, Library of Congress | mande@loc.gov
  • Baker, Timothy D. | Maryland State Archives | timb@MDSA.NET
  • Beers, Elizabeth | University of Michigan Library | embeers@umich.edu
  • Carpenter, Kris | Internet Archive | kcarpenter@archive.org
  • Chudnov, Dan | George Washington University | dchud@gwu.edu
  • Fallon, Tessa | Columbia University | taf2111@columbia.edu
  • Fino-Radin, Ben | Rhizome | ben.finoradin@rhizome.org
  • Grotke, Abbie | Library of Congress, Co-Chair of the NDSA Content Working Group | abgr@LOC.GOV
  • Hanna, Kristine | Internet Archive | kristine@ARCHIVE.ORG
  • Hartman, Cathy | University of North Texas/ Co-Chair of the NDSA Content Working Group | cathy.hartman@UNT.EDU
  • Jones, Gina | Library of Congress | gjon@loc.gov
  • Johnston, Leslie | Library of Congress | lesliej@loc.gov
  • Moffatt, Christie | National Library of Medicine | moffattc@mail.nlm.nih.gov
  • Nacin, Andrew | WordPress | andrewnacin@gmail.com
  • Owens, Trevor | Library of Congress | trow@loc.gov
  • Potter, Abbey | Library of Congress | abpo@LOC.GOV
  • Reib, Linda | Arizona State Library, Archives, and Public Records | lreib@LIB.AZ.US
  • Schmitz Fuhrig, Lynda | Smithsonian Institution | SchmitzfuhrigL@si.edu
  • Smith, Stephanie | Maryland State Archives
  • Taylor, Nicholas | Library of Congress | ntay@loc.gov
  • Wurl, Joel | National Endowment for the Humanities | jwurl@neh.gov

(Attendees from NDSA member organizations in bold)

AGENDA

Welcome/Introductions

Attendees introduced themselves and talked about their specific interests in this project.

Brief report on background of this idea

Abbie provided a quick report on how we got to this meeting (referring to the distributed blog proposal). She summed up the three ideas listed in that proposal, but explained that this meeting was to focus particularly on the "Flag for opt-in to preservation and harvesting" idea.

Discussion

Some of the questions received before the meeting included:

  • If a blog owner opts in, would that guarantee preservation?

The idea is that, most likely, yes: IA will crawl everything that is flagged by site owners for preservation. Other NDSA members will be able to select from the list what they would like to include in their own archives. Multiple organizations may collect the same URLs (duplication is not a bad thing).

  • Will blog owners expect backup services if they opt in?

We talked quite a bit about making sure the purpose of the pilot is clear to site owners, and that this particular plugin is not meant to provide backup services. Ideas #1 and #2, about downloading a copy for personal backup, would most likely cover this sort of request.

  • How do we get notified that blogs have opted in? Are notices sent somewhere? Some other ideas:

    • A Google spreadsheet that gets auto-updated; preservationists could refer to it and pick and choose what to preserve.
    • Machine-readable tag that could be used to auto-detect sites that have opted in (a la Creative Commons)
    • A feed of some sort?
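The machine-readable tag idea could be sketched roughly as follows. This is only an illustration, not something the group agreed on: the tag name `ndsa-preservation` and the value `opt-in` are hypothetical placeholders, and a real crawler would fetch the page first rather than take HTML as a string.

```python
from html.parser import HTMLParser


class OptInTagFinder(HTMLParser):
    """Scans HTML for a hypothetical <meta name="ndsa-preservation" content="opt-in"> tag."""

    def __init__(self):
        super().__init__()
        self.opted_in = False

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags are of interest; everything else is ignored.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        content = attr_map.get("content", "").strip().lower()
        # Tag name and value are assumptions made for this sketch.
        if name == "ndsa-preservation" and content == "opt-in":
            self.opted_in = True


def has_opt_in_tag(html_text):
    """Return True if the page carries the (hypothetical) opt-in meta tag."""
    finder = OptInTagFinder()
    finder.feed(html_text)
    return finder.opted_in
```

A harvester could call `has_opt_in_tag(page_html)` on a blog's home page and add the site to its crawl list only when the result is true.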

Andrew demonstrated a mockup of what a simple plugin might look like, which is essentially just a "submit for preservation" button. The group discussed a number of options for what would happen upon submitting. The easiest approach is to have that data get sent to an established URL to populate a database. More details on what we will do initially are below, in the proposal for moving ahead / next steps.

  • How often/frequently would notifications or updates to whatever process we put in place occur?

We discussed frequency of archiving a bit but didn't go into great detail about the frequency of notifications. Basically, this will occur in real time: as the site owner clicks to opt in, data will be sent to our NDSA database. If the site owner changes his/her mind and decides to STOP participating, that notice will also be sent. We need to make it very clear that if they OPT OUT after opting in, we will not DELETE their content already preserved. We will just stop archiving moving forward.
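The opt-in / opt-out behavior described above might look something like the sketch below. It is a minimal in-memory model under stated assumptions: the class and method names (`OptInRegistry`, `opt_in`, `opt_out`, `add_capture`) are invented for illustration and do not describe the actual NDSA database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SiteRecord:
    url: str
    active: bool = True
    captures: list = field(default_factory=list)  # IDs of captures already preserved
    history: list = field(default_factory=list)   # (event, timestamp) pairs


class OptInRegistry:
    """Hypothetical registry of sites that have opted in to preservation."""

    def __init__(self):
        self._sites = {}

    def _log(self, record, event):
        record.history.append((event, datetime.now(timezone.utc)))

    def opt_in(self, url):
        # Called when the site owner clicks to opt in.
        record = self._sites.setdefault(url, SiteRecord(url))
        record.active = True
        self._log(record, "opt-in")
        return record

    def opt_out(self, url):
        # Stops future archiving only; existing captures are NOT deleted.
        record = self._sites[url]
        record.active = False
        self._log(record, "opt-out")
        return record

    def add_capture(self, url, capture_id):
        # Refuse new captures for sites that have opted out.
        record = self._sites[url]
        if not record.active:
            return False
        record.captures.append(capture_id)
        return True

    def captures_for(self, url):
        return list(self._sites[url].captures)

    def crawl_list(self):
        # URLs that should still be archived going forward.
        return sorted(url for url, rec in self._sites.items() if rec.active)
```

The point of the design is that `opt_out` flips a flag rather than removing the record, so content preserved before the opt-out stays in place while the site drops off the crawl list.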

  • Is a license/agreement needed?

We didn't really go into details about this specifically, but there were concerns about copyright and whether or not preservation could include comments and posts by people other than the site owner who installs the plugin. We discussed whether parts of the sites could be identified for archiving; this seems too difficult to manage.

If a site owner opts in, does that give us an explicit right to archive even if the blog/site contains content produced by others? We discussed this topic. LC, when asking permission, lets the site owner tell us if they can't grant permission for all of the content; if they say they can't, then we don't archive. For this project, we're assuming that if the site owner opts in, it's okay to preserve. We discussed providing some language that could be posted to their sites telling commenters/contributors that they are potentially being archived.

We will have text describing what they are opting in to within the plugin (or available via a link to somewhere describing it all).

  • What sorts of information would we want from the blog owner besides permission? (Category/subject? Frequency of change information? Other data?)

We discussed and came to agreement on this core set for now:

    • URL (will be sent automatically)
    • Title (will be scraped from site automatically)
    • Category (NDSA to come up with a list for site owner to select from)
    • Description (site owner to fill out)
    • Name/Contact information
    • Option to select a Creative Commons license to go along with it (based on how IA does it with donations to the archive)
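As a rough illustration of the core set above, the record a plugin sends to the established URL might be modeled as follows. Everything here is an assumption made for the sketch: the field names, the placeholder category list (NDSA had not yet drawn one up), and the choice of JSON as the wire format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

# Placeholder categories; the real list was still to be defined by NDSA.
CATEGORIES = ("Arts", "Government", "Science")


@dataclass
class OptInSubmission:
    url: str                       # sent automatically by the plugin
    title: str                     # scraped from the site automatically
    category: str                  # chosen by the site owner from the list
    description: str               # written by the site owner
    contact: str                   # name/contact information
    license: Optional[str] = None  # optional Creative Commons license

    def validate(self):
        if not self.url.startswith(("http://", "https://")):
            raise ValueError("url must start with http:// or https://")
        if self.category not in CATEGORIES:
            raise ValueError("unknown category: " + self.category)
        if not self.contact.strip():
            raise ValueError("contact information is required")


def to_payload(submission):
    """Validate the submission and serialize it as a JSON string."""
    submission.validate()
    return json.dumps(asdict(submission), sort_keys=True)
```

For example, `to_payload(OptInSubmission("http://example.org/blog", "Example Blog", "Arts", "A blog about art.", "Jane Doe"))` would produce the JSON body that a plugin could post to the collection endpoint.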

  • Do we want data shared about who is preserving what? If organizations are picking and choosing what to preserve from the available blogs, it might be good to have that information available to others (publicly or among NDSA members?)


Other topics that came up during the discussion:

  • Archiving will capture the look and feel/functionality, images, etc., not just the text. We discussed capture of feeds, but at this time we're not exploring that.
  • This would not include FTP access to content.
  • If NDSA members are interested in encouraging blog owners to install the plugin so that archiving is easier, they could send a link to the plugin page with instructions for installing. The pilot project will not include a "permissions letter" per se since the plugin opt-in grants the permissions required.

Next Steps/Action Items