NDSA:Preservation Storage Topic JPEG2000

From DLF Wiki
Revision as of 15:19, 11 February 2016 by Dlfadm (talk | contribs) (4 revisions imported: Migrate NDSA content from Library of Congress)

Statement of Purpose

In February 2012, the Infrastructure Working Group initiated a series of open conversations on detailed aspects of preservation storage. These conversations are conducted over the listserv, and each topic is discussed over the course of 2-3 weeks. Topic 2 was compression. In connection with that topic, this research, originally done for the FADGI group, is reproduced here to illustrate how the choice between compressed and uncompressed file formats can influence digital preservation policies and infrastructure.

Files/PDFs referenced in this report

For efficiency's sake, most of the PDFs and pages linked below are included in the attached PDF. I have not had time to upload all 35 individually, so they are merged into one document. Please find them there, as many of the links below will not work.

File:JP2K documents.pdf

Recent JPEG 2000 Literature

Applied Earth Observations and Geoinformation - Color Orthophotos - April 2012

Techworld - British Newspaper Archive - 6 February 2012

  • Sophie Curtis - British Newspaper Archive: Digitising the Nation's Memory
  • British Library and Brightsolid
  • Stance on JPEG 2000: Used as preservation standard because of compression capabilities
  • The pages are scanned in TIFF format and then converted into JPEG 2000 files
  • "We throw away the TIF files because they're just too big to keep. To put it into perspective, we've probably got something like 250TB of JPEG 2000, and we have 3 copies of each file, so it's a lot of data. If we'd just been going with the uncompressed TIF, that would probably be something in excess of a petabyte and a half." - Malcolm Dobson, chief technology officer at brightsolid
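The figures in the quote above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal units (1 PB = 1000 TB) and, because the quote does not say whether the 1.5 PB estimate covers one copy or all three, it computes the implied TIFF-to-JPEG 2000 size ratio under both readings. All names in the code are hypothetical.

```python
# Rough check of the storage figures quoted by brightsolid.
# Assumption (not from the article): decimal units, 1 PB = 1000 TB.

JP2_TB_PER_COPY = 250      # "something like 250TB of JPEG 2000"
COPIES = 3                 # "3 copies of each file"
TIFF_ESTIMATE_TB = 1500    # "in excess of a petabyte and a half"


def implied_ratio(tiff_tb, jp2_tb):
    """Return how many times larger the TIFF estimate is than the JPEG 2000 holding."""
    return tiff_tb / jp2_tb


# Total JPEG 2000 storage across all copies.
total_jp2_tb = JP2_TB_PER_COPY * COPIES

# Reading A: 1.5 PB is the size of ONE uncompressed TIFF copy.
ratio_single_copy = implied_ratio(TIFF_ESTIMATE_TB, JP2_TB_PER_COPY)

# Reading B: 1.5 PB is the size of ALL THREE uncompressed TIFF copies.
ratio_all_copies = implied_ratio(TIFF_ESTIMATE_TB, total_jp2_tb)

print(f"JPEG 2000 stored across {COPIES} copies: {total_jp2_tb} TB")
print(f"Implied ratio, reading A (one TIFF copy): {ratio_single_copy:.0f}:1")
print(f"Implied ratio, reading B (three TIFF copies): {ratio_all_copies:.0f}:1")
```

Under either reading the saving is substantial, but the two readings differ by a factor of three, so the quote alone does not pin down the compression ratio in use.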

Telecommunications Systems 49.2 - Satellite Image Communication - February 2012

  • M. Abolfathi and R. Amirfattahi - Design and implementation of a reliable and authenticated satellite image communication
    • Full-text unavailable as of March 1, 2012, but this issue will be available at the Library of Congress
  • Telecommunications Systems 49.2, February 2012, p.171-177
  • Stance on JPEG 2000: Unclear from the abstract, but satellite-based projects tend to use JPEG 2000 for compression purposes
  • This article came up during a JPEG 2000 search, which suggests that JPEG 2000 compression plays an important role in this satellite project.

States News Service - Mars Observer - 7 December 2011

  • Shelley Littin, NASA Space Grant intern - With "Google Earth" for Mars, Explore the Red Planet from Home
  • University of Arizona
  • Stance on JPEG 2000: Lossless image compression ideal for storing high-resolution images
  • "This file format allows software supporting the JPEG2000 networking capabilities to download and view just a portion of any image in less than 30 seconds, resolving the need to spend a long time downloading an entire image and ensuring there is enough local storage space to hold the image."

Journal of Digital Imaging 24.6 - Compressed Dental Image Comparison - December 2011

  • B. Güniz Baksi and Aleš Fidler - Fractal Analysis of Periapical Bone from Lossy Compressed Radiographs: A Comparison of Two Lossy Compression Methods
  • School of Dentistry, Department of Oral Diagnosis and Radiology, Ege University, Izmir, Turkey
  • Stance on JPEG 2000: Performs slightly worse than compressed JPEGs, but the difference is negligible
  • "The JPEG compression method performed only slightly better than JPEG2000 since it showed less FD difference at the same compressed file size down to JPEG 30 CL. However, the difference between the two methods was small and it may be negligible in a clinical setting."

Proceedings of SPIE - Hyperspectral Image Compression - November 2011

  • Milosz Ciznicki, Krzysztof Kurowski, Antonio Plaza - GPU implementation of JPEG2000 for hyperspectral image compression
  • Proceedings of SPIE - The International Society for Optical Engineering
  • Stance on JPEG 2000: Invaluable for hyperspectral imaging of Earth
  • "JPEG2000 is an important technique for data compression which has been successfully used in the context of hyperspectral image compression, either in lossless and lossy fashion ... Specifically, we develop GPU (graphics processing units) implementations of the lossless and lossy modes of JPEG2000."

Proceedings of SPIE - Compression of Astronomical Images - 24-26 August 2011

  • P. Pata - CCD noise influence on JPEG2000 compression of astronomical images
    • Full-text unavailable
  • Proceedings of SPIE - The International Society for Optical Engineering
  • Stance on JPEG 2000: Lossless JPEG 2000 does not compress enough to be practical. Lossy JPEG 2000 is being researched for astronomical imaging uses.
  • From abstract: "This work deals with the influence of noise generated in the CCD structure to the defined quality criteria. It will also be shown the impact of the lossy standard JPEG2000 on quality of image data in astronomy."

Journal of Digital Imaging 24.4 - Mobile Tele-Radiology Imaging System - August 2011

  • Dong Keun Kim - A Mobile Tele-Radiology Imaging System with JPEG2000 for an Emergency Care
  • National Research Foundation of Korea funded by the Ministry of Education, Science and Technology
  • Stance on JPEG 2000: Highly recommended for emergency medical images
  • "In order to overcome the data bandwidth limitation at wireless communication links and improve the efficiency of mobile teleradiology systems for transferring massive Digital Imaging and Communications in Medicine (DICOM) medical images, we adopted the JPEG2000 coding method."
  • Upgraded to JPEG 2000 after using JPEG as the old standard for medical imaging

IEEE International Conference - JPEG 2000 Compression Quality Assessment - 11-15 July 2011

  • J-F Pambrun - Perceptual quantitative quality assessment of JPEG2000 compressed ct images with various slice thicknesses
    • Full-text unavailable
  • 2011 IEEE International Conference on Multimedia and Expo (ICME 2011)
  • Stance on JPEG 2000: There is reluctance to use lossy JPEG 2000 because of its relatively poor quality, but the space savings are crucial to the medical imaging community
  • From abstract: "In this paper, we present an objective quantitative quality assessment of compressed CT images using Visual Signal to Noise Ratio. Our results show that visual fidelity can be significantly affected by two factors, slice thickness and exposure time, for images compressed using the same compression ratio."

D-Lib Magazine - JPEG 2000 for long-term preservation - May/June 2011

  • Johan van der Knijff - JPEG 2000 for Long-term Preservation: JP2 as a Preservation Format
  • National Library of the Netherlands
  • Stance on JPEG 2000: Needs to be slightly altered before it can be the international preservation standard
  • "In the case of ICC profiles, a strict interpretation of the standard even completely prohibits the use of ICC profiles for defining working colour spaces, which would make the format unsuitable for any applications that require colour support beyond the sRGB colour space. For preservation, this results in a number of risks, because images may not be rendered properly by future viewers, and colour space and resolution information may be lost in future migrations. These issues could be remedied by some small adjustments of JP2's format specification, which would create minimal backward compatibility problems, if any at all."

Robert Buckley - JPEG 2000 DC Summit Presentation - 12 May 2011

  • Robert Buckley - The Benefits of JPEG 2000 for Image Access and Preservation
  • University of Rochester
  • Stance on JPEG 2000: It should be every institution's preservation standard
  • Cites reduced storage costs, enhanced image handling, and new opportunities as reasons to adopt JPEG 2000 as the standard.
  • Questions regarding sustainability, color support, and implementation remain.

Alabama Digital Preservation Network - DDP Networks in North America - 15 February 2010

  • Aaron Trehub, Thomas C. Wilson - Keeping it simple: the Alabama Digital Preservation Network (ADPNet)
  • Auburn University and University of Alabama Libraries
  • Stance on JPEG 2000: Will not adopt until it's the standard in more institutions
  • "If JPEG2000 is accepted as a standard preservation format, CONTENTdm’s ability to use this format for presentation will simplify the process of archiving and caching high-quality archival images in a PLN (Private LOCKSS Network)."

Digital Preservation Coalition - TIFF or JPEG2000? - 27 January 2010

  • Conversation between digital preservationists - TIFF or JPEG2000?
  • Participants from: Natural History Museum, Oxford Archaeology, Wellcome Library, Tate Britain, and NARA
  • Stance on JPEG 2000: Mixed; generally wary of the new format
  • Polly Parry of the Natural History Museum asks whether she should consider TIFF or JPEG 2000 for her preservation format: "While the general consensus of responses ... seems to be TIFF, there is an element of horizon-scanning and if JPEG2000 is the next big thing, maybe we should just bite the bullet?"

JEADV 24 - JPEG and JPEG 2000 Compression of Dermatological Images - 10 November 2009

  • Gulkesen, KH - Evaluation of JPEG and JPEG2000 Compression Algorithms for Dermatological Images
  • Journal of the European Academy of Dermatology and Venereology
  • Department of Biostatistics and Medical Informatics and Department of Dermatology at Akdeniz University in Turkey
  • Stance on JPEG 2000: Works better than JPEG for maintaining image quality after compression
  • "When JPEG and JPEG2000 algorithms were compared, it was observed that JPEG2000 algorithm was more successful than JPEG for all compression rates. However, loss of image quality is recognizable in some of images in all compression rates."

Multimedia Systems 15 - Survey on JPEG 2000 Encryption - January 2009

  • Dominik Engel, Thomas Stütz, Andreas Uhl - A survey on JPEG2000 encryption
  • Multimedia Systems 15, January 2009, p.243–270
  • Stance on JPEG 2000: Neutral; this is a survey to see how well it works regarding encryption
  • "In this survey we have discussed and compared various techniques for protecting JPEG2000 codestreams by encryption technology. As to be expected, some techniques turn out to be more beneficial than others and some methods hardly seem to make sense in any application context. In any case, a large variety of approaches exhibiting very different properties can be considered useful and covers almost any thinkable multimedia application scenario."

Volker Heydegger - File Formats and Data Integrity - June 2008

  • Volker Heydegger - Analysing the Impact of File Formats on Data Integrity
  • Proceedings of the Archiving Conference
    • June 2008 in Bern, Switzerland
  • Stance on JPEG 2000: Performs well for a compressed format, but uncompressed formats perform better regarding bit rot
  • "As compression is a widely used feature in many file formats, for some explicitly dedicated to (e.g., JP2), compression can be considered as one of the most important features of file formats and therefore is one of the crucial factors for a file formats impact on data integrity ... Nevertheless JP2 compression is, compared to other compressions, quite successful in producing images which keep their visual quality, especially in case of low corruption rates, although there are moderate differences in pixel data."



Institutions that Have Implemented JPEG 2000

Biodiversity Heritage Library

British Library

  • Publication: Preservation Plan for Microsoft - Update
    • 19 June 2007
  • Type of use: Preservation master
  • “The JP2 files fulfill the role of master file but a lack of industry take-up is a slight concern from a preservation viewpoint ... However, the format is well defined and documented and poses no immediate risk.”

Early European Books (Arts & Humanities Proquest | Cambridge, UK)

Federal Bureau of Investigation (FBI)

  • Publication: Profile for 1000ppi Fingerprint Compression
    • April 2004
  • Type of use: Master fingerprint images
  • The FBI uses JPEG 2000 as the fingerprint image standard because of the high compression rate and retention of detail.
  • "This document specifies a format for use in compressing 1000ppi fingerprints. This format is a profile (usage subset) of the ISO/IEC 15444-1 JPEG 2000 image compression standard."

Google Books

  • Publication: Jeff Breidenbach - JPEG 2000 and Google Books
    • 13 May 2011 (JPEG 2000 Summit Presentation at the Library of Congress)
  • Type of use: Images for the Google Books project (archive masters)
  • From presentation:
    • "JPEG2000
      • pre-processed images
      • processed illustrations and color images
      • library return format
      • illustrations inside PDF files"

Harvard University Library

  • Publication: File Formats & Guidelines : Digital Preservation : Office for Information Systems
    • Current as of 2 March 2012
  • Type of use: Preservation master, but highest priority objects are still done in TIFF
  • "While JPEG 2000 is becoming more acceptable in the library community as a preservation format, there are still advantages to TIFF over JPEG 2000 for preservation. TIFF uncompressed is a simpler format internally and has more general tool support."

Internet Archive

  • Publication: R. Miller, “Internet Archive (IA): Book Digitization and Quality Assurance Processes,” confidential document for IA partners. (cited in A Status Report on JPEG 2000 Implementation for Still Images: The UConn Survey)
    • 7 May 2009
  • Type of use: Visually lossless JPEG 2000 for scanned book preservation and access
  • "The Internet Archive has developed its visually lossless JPEG 2000-based benchmark in concert with partner institutions."

Library and Archives Canada (LAC)

Library of Congress (LOC) - United States

National Digital Newspaper Program

National Audio Visual Conservation Center

  • Publication: James Snyder - JPEG2000 in Moving Image Archiving
    • 13 May 2011 (JPEG 2000 Summit Presentation at the Library of Congress)
  • Type of use: Preservation master for audiovisual materials
  • Using: "JPEG2000 ‘lossless lossless’ (reversible 5x3) (ISO 15444)"

National Diet Library - Japan

National Library of Norway

  • Publication: Digitization of books in the National Library – methodology and lessons learned
    • September 2007
  • Type of use: Preservation master for book digitization project
  • The Library is digitizing books at 400 dpi and a color depth of 24 bits. They chose to use JPEG 2000 over TIFF to save storage space by approximately 50%, and they performed tests to make sure that they could convert JPEG 2000 into uncompressed TIFF with no information loss.
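To give a sense of scale for these scanning parameters, the sketch below estimates the uncompressed size of a single page image at 400 dpi and 24 bits per pixel, and the size after the roughly 50% saving the Library reports. The page dimensions (6 x 9 inches) are a made-up example rather than a figure from the publication, and file-format overhead such as headers and metadata is ignored.

```python
def uncompressed_bytes(width_in, height_in, dpi=400, bits_per_pixel=24):
    """Estimate raw pixel data size in bytes for a scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical page size in inches (an assumption, not from the source).
PAGE_W_IN, PAGE_H_IN = 6, 9

# Approximate saving reported for JPEG 2000 over TIFF.
SAVING = 0.5

raw = uncompressed_bytes(PAGE_W_IN, PAGE_H_IN)
jp2_estimate = raw * (1 - SAVING)

print(f"Uncompressed: {raw / 1e6:.2f} MB per page")
print(f"JPEG 2000 estimate: {jp2_estimate / 1e6:.2f} MB per page")
```

Multiplying the per-page figure by the number of pages in a digitization project shows why a 50% saving matters at national-library scale.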

National Library of the Czech Republic

National Library of the Netherlands

New York Public Library (NYPL)

University of Connecticut Libraries

  • Publication: Where We are Today: An Update to the UConn Survey on JPEG 2000 Implementation for Still Images
    • 12 May 2011 (JPEG 2000 Summit Presentation at the Library of Congress)
  • Type of use: Lossless JPEG 2000 for archival masters and lossy JPEG 2000 for processed masters
  • "... makes it easier to also archive raw DNG “safety masters” along with a rendered format (JPEG 2000). For a given image, storage footprint results in something smaller than a single uncompressed TIFF."

Wellcome Library

  • Publication: JPEG 2000 as a Preservation and Access Format for the Wellcome Trust Digital Library
    • August 2009
  • Type of Use: Archival and production masters
  • "This report recommends irreversible JPEG2000 compression for the preservation and access formats of single grayscale or color images. Initially specifying a minimally lossy datastream will result in overall compression ratios around 4:1; the exact value will depend on image content."
  • "This report recommends that the preservation format for single grayscale and color images be a JP2 file containing a minimally lossy irreversible JPEG 2000 datastream, typically with five resolution levels and multiple quality layers."

Yale University Library

  • Publication: JPEG 2000 Page Image Compression for Large-Scale Digitization at Yale
    • 17 January 2008
  • Type of use: Lossy JPEG 2000 used for access
  • "We recommend that Yale adopt lossy compression at Kirtas’ quality level 90 as the standard for Yale’s copy of the images produced in the Microsoft/Kirtas large-scale digitization project. We are confident that this level of compression provides the best compromise between image quality and manageable file size for an access-oriented project of this type."



Institutions That Have Not Implemented JPEG 2000 for Preservation

National Archives and Records Administration (NARA)

Princeton University Digital Library

  • Publication: The link to the imaging standards is currently broken but was frequently cited as http://diglib.princeton.edu/?_xq=html&_xsl=imaging.xsl
  • Type of use: TIFF as preservation masters with derived .jp2 images
  • "For the most part, we'll be deriving JPEG2000 images from the master TIFF files ... JPEG2000s are still not viewable in most browsers, so we've acquired the Aware JPEG2000 server software for dynamically displaying JPEG2000s as JPEGs, while still retaining the same flexible end user interaction and tools."

University of Utah

  • Publication: Email from Kenning Arlitsch of the University of Utah
    • 4 January 2005
  • Type of use: JPEG 2000 for access but TIFF for preservation masters
  • "We view JPEG2000 as very useful for delivery purposes, but are sticking with TIFF as our archival file format. As the others have mentioned JP2 is still a very young format, and TIFF is the recognized and accepted archival standard. And at 15:1 compression I have to believe there will be some loss of image quality, though I haven't done any side-by-side print comparison." (Comments from 2005)



Criteria for Digital Preservation Formats

Library of Congress

  • List of 7 criteria
    1. Disclosure
    2. Adoption
    3. Transparency
    4. Self-documentation
    5. External dependencies
    6. Impact of patents
    7. Technical protection mechanisms
  • Note: Even though LC lists "adoption" as a preservation requirement, they have still chosen JPEG 2000 as the preservation master standard.

NARA

  • List of 12 criteria
    1. Ubiquity
    2. Support
    3. Disclosure
    4. Documentation quality
    5. Stability
    6. Ease of identification and validation
    7. Intellectual Property Rights
    8. Metadata Support
    9. Complexity
    10. Interoperability
    11. Viability
    12. Re-usability
  • Note: "Adoption" is not listed specifically as a preservation requirement, but NARA is concerned enough about institutional acceptance of a format that it continues to use TIFF instead of JPEG 2000 as its preservation format.

Recurring Issues with JPEG 2000

  1. Belief that lossless JPEG 2000 still loses some information at high rates of compression
  2. Reluctance to change from a format that works (TIFF)
  3. Issues converting between TIFF and JPEG 2000
  4. The need to download special software to view JPEG 2000 causes people to worry about long-term preservation
    • Are digital orphans imminent?
  5. Concerns about future migration from JPEG 2000 into a newer format
  6. General lack of industry take-up
  7. Lack of published research and/or case studies to show that lossless JPEG 2000 is truly lossless
  8. Considered an access format but not a preservation format

Recurring Positive Traits of JPEG 2000

  1. Less storage space required for high-resolution images
    • Leads to reduced storage costs and more storage possibilities for small institutions
  2. Quick download of large image files or image subsets
  3. Possible to convert to TIFF with no information loss
  4. Format is well-defined and documented
  5. Many national libraries have adopted it as the preservation standard
  6. In general, science and medical professionals prefer JPEG 2000 to other file formats because of the high compression rates.
    • However, they are not as concerned with preservation as cultural heritage institutions are.
  7. Open standard and license-free
  8. Graceful degradation
  9. Mathematically lossless compression possible at around a 2:1 ratio
    • Visually lossless at around 10:1 or 20:1 compression
  10. Region of Interest (ROI) coding
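
The ratios in item 9 translate directly into storage estimates. As a minimal sketch, assuming a hypothetical collection of 100 TB of uncompressed images and taking the quoted ratios at face value (actual ratios depend on image content), the footprint at each ratio would be:

```python
# Hypothetical collection size; not a figure from any source cited here.
UNCOMPRESSED_TB = 100

# Approximate ratios quoted in the list above (N:1 stored as N).
RATIOS = {
    "mathematically lossless (about 2:1)": 2,
    "visually lossless (about 10:1)": 10,
    "visually lossless (about 20:1)": 20,
}


def compressed_size(uncompressed, ratio):
    """Return the compressed size for an N:1 compression ratio."""
    return uncompressed / ratio


footprints = {
    label: compressed_size(UNCOMPRESSED_TB, ratio)
    for label, ratio in RATIOS.items()
}

for label, size_tb in footprints.items():
    print(f"{label}: {size_tb:g} TB")
```

The gap between the lossless and visually lossless rows is the practical trade-off that several of the institutions listed above weigh when choosing a preservation master format.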
