NDSA:January 22, 2013 Call




Agenda

Participants

Kevin DeVorsey, Don Chalfant, Kate Murray, Sheila Morrissey, Carl Fleischhauer, Caroline Arms, Butch Lazorchak

Meeting Notes

Discussion on PDF as a "wrapper" and what that means for this effort.

What is the "use case" where someone would include embedded files in a PDF/A-3 document but NOT feel that information was important to preserve?

Example: EPUB 3 packaging, which defines behaviors for handlers of different kinds of content. This is a "less encapsulated" container than PDF/A-3.

Complexity equals risk. It is hard to say that we'd ever get to the point where we'd differentiate between "record" material and "non-record" material within a single file.

Downstream uses of content include not only recreating the user experience of 2013 but also manipulating an entire corpus of files. This is different from the preservation of the original user experience.

SIP-based formats for bundling things together are more flexible.

Embedding things in a PDF with such a limited description of what they are is troubling. You're forced to provide a MIME type, but that's probably not good enough. That should be documented as part of this analysis.

Are more people using XML/SGML now? These are received from almost every publisher, but largely as header/abstract rather than full text. The page image becomes the full intellectual content of the document for something like 20 million journal articles.

The use of tablets and phones is putting pressure on PDF as a format.

In scholarly communication, just what exactly is a publication? Many more things are out in the universe, not necessarily XML. Adobe's suite of tools can now produce EPUB.

Are there some narrowly defined uses where PDF/A-3 would be useful? One example is redundant information, such as a spreadsheet that represents the information frozen in the PDF.

Imagining the use cases is one of the outputs of this group. The aim is not necessarily outright prohibition but to articulate the positive side of it.

In academic circles, there is interest in making data available as well as the conclusions. Some of these packaged-up presentations would make it easier to put the data together with the conclusions. Is there anything in the profile that would support the effort to share data and make it available?

Supplementary materials: not the main body of an article but ancillary and enriching materials that support but are not essential to the original document.

Embedding in a PDF for an article is not a solution for preserving the data. The data should be preserved in an archive so that it is available to people searching across datasets. Embedding is appropriate for distributing to the immediate generation of readers.

Would need more metadata in the PDF/A to describe the embedded materials.

We just need to articulate the terrain really well, not solve all the problems.