NDSA:2014 National Agenda Outline

From DLF Wiki

Draft Outline

Introduction

a. Description of the National Agenda for Digital Stewardship

i. The document is intended to inspire the planning of digital preservation work and to present the observations of the joint leadership group. It also evaluates the state of digital preservation activity and the key emerging issues for the year.

ii. The document is not intended to be prescriptive or to serve as a directive to working groups, nor is it intended to replace any organizational efforts, planning, goals, or opinions.

iii. Hoped-for impact

b. Description of the NDSA, NDSA goals, and how the 2014 Agenda furthers those goals (i.e., inform and inspire individual, working group, and organizational work plans)

c. Intended audience: NDSA members and the wider digital preservation community

d. Authored by the joint leadership group

Section topics

Trends in Digital Content

i. Emerging content types, formats, or challenges that are of interest to the digital preservation community. Special focus on content themes as they address the interests and needs of the collecting organizations.


Start input from S&P Working Group on trends in digital content

  • Web archiving
  • Research data
  • Big data
    • Computational consumption of archives
  • How do you connect annotations to content? Should we preserve those connections?
  • How do we provide access with appropriate limits?
    • (government classification, copyright restrictions, donor agreements, licenses, human subject research restrictions). Rights metadata standards?
  • Compound, complex objects
    • Dynamic content, integrating resources
    • Not just documents (video, digital art / new media, etc.)
  • Preservation of social media
  • How to connect related publications (within and between repositories)
  • Findability and discoverability of content
  • Accessibility of digital content (e.g., usable via screen reader)
    • Accessibility of data sets
    • In the context of open access requirements / mandates / etc.

End input from S&P Working Group on trends in digital content

Research Priorities

The Research Priorities section focuses on two distinct aspects of research: the long-term preservation of research data (e-science, data sets, and so forth), and the need for research on digital preservation activities.

Research Data

[EXAMPLE] Education Workforce Development Research: Sentence to paragraph description with rationale for including the topic in the 2014 National Agenda. Recommendation for action included if relevant.

Research Related to Digital Preservation Practices

Applied Research

In the near term, specific areas of applied research around digital preservation lifecycle issues need attention. Currently there are limited models for estimating the cost of ongoing storage of digital content. Cost estimation models need to be robust and flexible. Different approaches to cost estimation should be explored and existing models compared, with emphasis on reproducibility of results. Auditing models also need to be strengthened and further developed. The SafeArchive system and other bit-level auditing practices could be connected to the NDSA Levels of Preservation work to help organizations determine and validate the costs of scaling different auditing schemes. For both topics, cost models and auditing practices need to address multiple storage models (locally stored data, distributed preservation networks, data cooperatives, cloud storage, brokered cloud storage systems, and hybrid systems) so that organizations can make informed, cost-effective digital preservation decisions.
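As a point of reference for the bit-level auditing mentioned above, the following is a minimal sketch of a checksum-based fixity audit. It is not a description of how SafeArchive or any particular system works. The function names (`sha256_of`, `build_manifest`, `audit`) and the manifest layout (relative path mapped to SHA-256 digest) are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest.

    Hypothetical manifest layout, chosen for this sketch.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root against a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Recorded in the manifest but no longer on disk.
        "missing": sorted(set(manifest) - set(current)),
        # On disk but never recorded in the manifest.
        "unexpected": sorted(set(current) - set(manifest)),
        # Present in both, but the digest has changed.
        "corrupted": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
    }
```

An audit of this kind reads every byte of every file on each pass, so its cost grows with collection size and audit frequency. That is one reason the cost of scaling different auditing schemes is listed here as a research question rather than a solved problem.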


4. Research in Curriculum Development [Helen]

ii. Theoretical Framework (3-6 year horizon) [Helen]

1. Information valuation/selection: models for estimating the future private and public value of information

2. Models for estimating future risks

iii. Information Equivalence (3-6 year) [Jefferson]: Significant properties, fingerprints, authenticity

iv. Preservation at Scale (3-6 year) [Jefferson]

1. Preserving 'big data': storage scale

2. Preserving high-velocity/dynamic content

3. Scalable models for information provenance, equivalence, and quality

4. Information valuation and portfolio management

5. Privacy and confidentiality at scale

v. Policy Research (3-6 year) [Micah]

1. Trust engineering, trust frameworks

vi. Education Workforce Development Research (3-6 year) [Helen]

vii. Evidence Base for Preservation Methodologies & Policies (cross-cutting / 10 years / grand challenge) [Micah]

1. Experimental: labs/testbeds/field experiments

  • Methodologies for digital preservation research that can provide useful results with simulation of long time periods
  • Methodologies for digital preservation research that provide reliable test plans
  • Methodologies that combine aspects of different research areas (e.g., computer science, materials science)

2. Observational: random sampling/systematic trend/coverage

3. Computational: replicable, theoretically grounded computer models

4. Research in a lab or test-bed environment, with a focus on methods to test research results and implement effective strategies from the research lab or test-bed. Frameworks that allow people to apply their specialized knowledge and skills to specific problems.


Start input from S&P Working Group on research

  • Findability and discoverability of content
  • Large scale integration of emulation into delivery (connect to work done internationally)
  • Format migration testing
  • Integration of emulation and migration (hybrid approach)
  • How do we leverage tools and practices in the digital forensics community (and other fields)?

End input from S&P Working Group on research


Research Priorities References:

Infrastructure Development

i. Infrastructure can be generally defined as the set of interconnected structural elements that provide the framework supporting an entire structure of development. This includes both physical and institutional elements. --Micah altman 18:31, 13 February 2013 (UTC)

ii. Examples:

• Trends in data protection standards

• Best practices for using cloud concepts within a digital preservation strategy

• Cost-benefit analysis techniques for infrastructure planning


Start input from S&P Working Group on infrastructure

  • Development of commercial products for digital preservation; creating and maintaining relationships with the private sector
  • Consolidating and keeping alive the palette of tools we need to do our work of digital preservation, and for rendering in the future
    • Shared tool development or reusing tools developed by other communities
  • Common packaging (general and specialized)
    • In a perfect world, record-keeping systems in federal agencies would all know how to create a package, so that all sorts of systems become interoperable; would achieve huge economies for the government
  • Use and access – tends to be divorced from preservation, but needs to be more integrated
    • Preservation is ensuring access over time
    • Need to involve researchers more
    • “Archlive” – shouldn’t be places of storage, but of dynamic activities
    • Have yet to pursue the other end of the OAIS model – the consumer archive
    • New demands for API and federated access to our content coming out of initiatives like DPLA, edX, jdarchive
  • What tools are available to do things like package and annotate content (i.e., in lieu of PDF/A-3)?
  • Storage concerns at scale.
  • Tools for risk assessment or other archive management tasks (e.g. preservation planning)

End input from S&P Working Group on infrastructure


Organizational Roles, Policies, and Practices

i. Preservation happens through the work of individuals and institutions. Just as it is critical to refine and develop infrastructure and basic research, it is similarly critical to refine and develop workflows, practices, roles, and responsibilities, both inside institutions and within networks of institutions, to ensure long-term access to digital content.

ii. Examples:

• Need for models for licensing old software for long term virtualization

• Need for creation of more dedicated FTEs to staff digital preservation initiatives

• Development of policies around crowdsourcing as part of the digital preservation life cycle

• Expanded use of machine-readable licensing for data under long-term preservation

Raw notes from S&P Working Group:

  • Sustainable budgetary models for long-term preservation
  • Articulating the compendium of best practices
  • Continuum of policies ranging from high-level organizational policies to lower-level rules
  • Role of national efforts, e.g. DPN, Academic Preservation Trust
  • International efforts and leveraging other preservation groups
  • Aligning National Approaches to Digital Preservation publication as a reference
  • Need for creation of more dedicated FTEs to staff digital preservation initiatives
    • Findings from the staffing survey (needs gaps, characteristics of needed staff)
  • What are the barriers to hiring qualified staff? Is it training? Budget? Finding people?
  • Collection of position descriptions that people could use as models.
  • How do we convince management that digital preservation is important and deserves resources?
  • Audit and certification
  • Scope of what we’re responsible for as practitioners has been broadening (data management, ...), and at different levels (department, institution, community)
  • Role of disciplinary repositories (how does our organization’s repository fit into the network of repositories?)
  • Changing rules for compliance


Conclusion

a. Possible ways to engage with the topics and issues detailed in the agenda