NDSA:PDF Exploration
Revision as of 13:54, 22 January 2013
Title of Activity or Project
NDSA PDF/A-3 Scoping Project
One Sentence Description:
NDSA PDF/A-3 Scoping Project working group members will research the pros and cons of using the PDF/A-3 standard as an all-purpose wrapper in different preservation scenarios. These include its use as an extension to PDF/A-1 and PDF/A-2 in circumstances for which those formats have been adopted or recommended, and its use as a wrapper for various digital asset/media types, such as textual, audio, video, photo, and GIS data.
Statement of the Problem and Goals for Addressing the Problem:
The single extension to PDF/A-2 in PDF/A-3 is the ability to embed files of any type within a PDF/A document. PDF/A-3 was designed to accommodate supplementary media files for text documents. Issues raised by this extension include:
- Is PDF/A-3 appropriate as a de facto normalization wrapper format for some or all media types or in particular circumstances?
- For circumstances where PDF/A-2 has already been deemed an appropriate preservation format (primarily for textual documents), what are the risks and opportunities offered by the ability to embed content in non-PDF formats?
The goal is to develop guidelines for the appropriate use of PDF/A-3 with respect to different scenarios that include both detailed technical information and a practical quick reference guide for end-users.
Strategic Value of Activity:
- Improve understanding of best practices for using PDF/A-3 in digital preservation activities
- Enhance consistency and improve long-term viability of digitally preserved content
- Provide guidance to those considering PDF/A-3 as a long-term archiving format
Required Resources:
- Time of working group members
- Publishing venue(s)
- Communication channels
Roadmap:
- Hold regular working group conference calls (monthly, between NDSA Standards WG calls)
- Draft document and review
- Invite broader NDSA member feedback
- Publish document (digitalpreservation.gov, others?)
Dissemination of Knowledge:
- Publish report on digitalpreservation.gov
- Write a blog post
- Announce on NDSA member organization communication channels
- Present at conferences that members (and non-members?) are attending
Signifiers of Success and Outcomes:
- Completed guidelines document published on digitalpreservation.gov
- Guidelines document referenced on related Wikipedia pages
- Guidelines referenced in FDD (format description document) for PDF/A-3 [1]
- Guidelines in use or recommended by NDSA participating organizations or others
- Presentation at conferences or publication in journals
Questions to Ask and Answer
- Talk about background (what is PDF/A-3 and how is it different from earlier versions of PDF/A?)
- Enumerate categories of materials, use cases, and concrete examples where it makes sense to use PDF/A-3, and other categories where it doesn't. Example: if you're sending a video file, don't put it in a PDF! By contrast, a journal article that has a static version of a spreadsheet in the document and a malleable version embedded might argue for it.
- Risks of the format (scenarios in which this might be bad, and why)
- Possibilities of the format (scenarios in which this might be good, and why)
- Have a list of defined terms in our document. How do these relate to the terms in the ISO spec? Leverage the NDSA Levels of Preservation glossary. Link to the glossary.
PDF/A-3 Use Case Scenarios
Add them here! We can create a separate page as necessary.
Example: Federal agency with a document management system puts an MPEG video file (and nothing else) into a PDF/A-3 file to store and then, later, to submit as an SIP (Submission Information Package) to NARA for long-term management.
Example: Publisher has a text-only article and puts it into a PDF/A-3 file, even though, in the past, the publisher used PDF/A-2. The article is then sent to a library where it will be preserved for the long term.
Example: Publisher has an article that includes a complicated table, "frozen" in place, and puts it into a PDF/A-3 file, along with the Excel file from which the table was generated, in order to make it easier for a future researcher to have a malleable version of the table for use when writing another article on the same subject.
Example: Data creator has a digital map, a report, a database, digital photos, and detailed metadata that comprise a whole and wants to archive these together for the long term.
Example from Luratech Webinar used to show primary intent of PDF/A-3: PDF/A document with diagram based on data, with embedded spreadsheet associated with diagram, metadata associated with subsection of document, source word-processing file, and audio rendering of the document (perhaps for accessibility).
Use case #1 from Luratech Webinar: Scanned documents, with the scanned image as the main PDF/A content, with native metadata in XML embedded.
Use case #2 from Luratech Webinar: "Hybrid archiving," used when a document is still in its active life cycle and further versions might be created. Create a PDF/A-3 file as the archive-ready rendition and embed the document in its native (e.g., word-processor) format. Built into a standard workflow, this would leave documents "archive ready" at all times.
Use case #3 from Luratech Webinar: Human-readable invoice with embedded data marked up in CEN Core Invoice Standard (XML).
Members
- Caroline Arms, Library of Congress (caar@loc.gov)
- Don Chalfant, NARA (Donald.Chalfant@nara.gov)
- Kevin DeVorsey, NARA (Kevin.DeVorsey@nara.gov)
- Chris Dietrich, National Park Service (chris_dietrich@nps.gov)
- Carl Fleischhauer, Library of Congress (cfle@loc.gov)
- Butch Lazorchak, Library of Congress (wlaz@loc.gov)
- Sheila Morrissey, Ithaka (Sheila.Morrissey@ithaka.org)
- Kate Murray, NARA (Kate.Murray1@nara.gov)
Calls and Notes
Call information:
- Call-in toll-free number (US/Canada): 866-469-3239
- Participant access code: 21408589
Next call: Tuesday Jan. 22, 2013, 2:00 P.M.
Notes:
[January 22, 2013 Call]
Background Materials
- Library of Congress Sustainability of Digital Formats DRAFT PDF/A-3 format description document (FDD) COMMENTS PLEASE to caar@loc.gov and cfle@loc.gov
- Blog Post on PDF/A-3 on the Signal
- Sheila M. Morrissey, The Network is the Format: PDF and the Long-term Use of Digital Content, Archiving 2012, pg. 200-203 (2012)
- Ithaka comments on ISO 19005-3 draft
- Caroline's thoughts on PDF/A-3 circulated in late November, 2012
- Video of Webinar by Luratech on PDF/A-3, Nov 8, 2012. Includes use cases and demos.
- Slides used for Luratech Webinar, Nov 8, 2012. Includes use cases and demos. Do not distribute.
- In the future, set up calls with Steve Levinson (U.S. Courts) and Leonard Rosenthal (Adobe)