NDSA:January 22, 2013 Call: Difference between revisions

From DLF Wiki

Revision as of 15:19, 22 January 2013


Agenda

Participants

Kevin DeVorsey, Don Chalfant, Kate Murray, Sheila Morrissey, Carl Fleischhauer, Caroline Arms, Butch Lazorchak

Meeting Notes

Discussion on PDF as a "wrapper" and what that means for this effort.

What is the "use case" where someone would include embedded files in a PDF/A-3 document but NOT feel that information was important to preserve?

Example: EPUB 3 packaging, which defines behaviors for handlers of different kinds of content. This is a "less encapsulated" container than PDF/A-3.

Complexity equals risk. It is hard to say that we would ever get to the point of differentiating between "record" material and "non-record" material within a single file.

Downstream uses of content include not only recreating the user experience of 2013 but also manipulating entire corpora of files. This is different from preserving the original user experience.

SIP-based formats for bundling things together are more flexible.

Embedding things in a PDF with such a limited description of what they are is troubling. You're forced to provide a MIME type, but that's probably not good enough. This should be documented as part of this analysis.
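
To illustrate the concern about MIME types, here is a minimal Python sketch. It does not use a PDF library or reproduce any part of the PDF/A-3 specification. The EmbeddedFileDescription record, its field names, and the two sample payloads are hypothetical, chosen only to show that two quite different XML files can end up with the same MIME type.

```python
import hashlib
import mimetypes
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddedFileDescription:
    """Hypothetical descriptor for a file embedded in a container document."""
    filename: str
    mime_type: str    # the one description the notes say is required
    size_bytes: int   # assumed extra field, not taken from any specification
    sha256: str       # assumed extra field, not taken from any specification


def describe(filename: str, payload: bytes) -> EmbeddedFileDescription:
    # Guess the MIME type from the file extension alone; fall back to a generic type.
    guessed, _encoding = mimetypes.guess_type(filename)
    return EmbeddedFileDescription(
        filename=filename,
        mime_type=guessed or "application/octet-stream",
        size_bytes=len(payload),
        sha256=hashlib.sha256(payload).hexdigest(),
    )


# Two made-up payloads: one looks like article markup, the other like settings.
article = describe("article.xml", b"<article><title>Example</title></article>")
settings = describe("settings.xml", b"<config><debug>true</debug></config>")

# Both files receive the same MIME type even though their content differs.
same_mime_type = article.mime_type == settings.mime_type
same_content = article.sha256 == settings.sha256
```

In this sketch the MIME type cannot tell the article markup apart from the settings file. Only the extra, assumed fields distinguish them, and even those say nothing about what role the embedded file plays.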

Are more people using XML/SGML now? These come from almost every single publisher, but largely as header/abstract rather than full text. The page image becomes the full intellectual content of the document for something like 20 million journal articles.