What should a citation consist of?
The California Digital Library's Datapub blog has a good summary of data citation basics, which documents the following "core" and "recommended" components of dataset citations:
Minimal components of a data citation:
Creator (Year) Title. Publisher. Identifier
- Creator(s): Individual(s) or organization responsible for creating the dataset.
- Year: Year the dataset was published, not necessarily created.
- Title: Should be as descriptive as possible.
- Publisher: Organization that provides access to the dataset (e.g. Dryad, Zenodo).
- Identifier: Persistent, unique identifier (e.g. a DOI).
Recommended additional components:
- Location / Availability: The web address of the dataset is essential when the identifier can’t be used to reach the dataset.
- Version / Edition: Version of the dataset used in the present publication. Needed to reproduce analysis of versioned dynamic datasets.
- Access Date: Date of access for analysis in the present publication. Needed to reproduce analysis of continuously updated dynamic datasets.
- Format / Material Designator: e.g. database, CD-ROM.
- Feature Name: A description of the subset of the dataset used. May be a formal title or a list of variables (e.g. concentration, optical density).
- Verifier: Used to confirm that two datasets are identical. Most commonly a UNF or MD5 checksum.
- Series: Used if the dataset is part of a series of releases (e.g. monthly, yearly).
- Contributor: e.g. editor, compiler.
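As a rough illustration of how these components fit together, here is a minimal Python sketch that assembles the "Creator (Year) Title. Publisher. Identifier" pattern and appends a few of the optional components. The class name `DataCitation`, the helper `md5_verifier`, the bracketed placement of the optional elements, and all sample values are assumptions made for this example, not part of the Datapub summary or any citation standard.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    # Core components
    creators: str
    year: int
    title: str
    publisher: str
    identifier: str
    # A few of the optional components
    version: Optional[str] = None
    access_date: Optional[str] = None
    verifier: Optional[str] = None

    def format(self) -> str:
        """Render the core pattern, then any optional components in brackets."""
        core = (
            f"{self.creators} ({self.year}) {self.title}. "
            f"{self.publisher}. {self.identifier}"
        )
        extras = []
        if self.version:
            extras.append(f"Version {self.version}")
        if self.access_date:
            extras.append(f"Accessed {self.access_date}")
        if self.verifier:
            extras.append(f"MD5 {self.verifier}")
        # Bracketed, semicolon-separated extras are an arbitrary choice here.
        return f"{core} [{'; '.join(extras)}]" if extras else core


def md5_verifier(data: bytes) -> str:
    """Return the MD5 hex digest of the dataset bytes, for use as a Verifier."""
    return hashlib.md5(data).hexdigest()


# Placeholder values only; this is not a real dataset or DOI.
citation = DataCitation(
    creators="Doe J",
    year=2014,
    title="Example dataset title",
    publisher="Zenodo",
    identifier="doi:10.0000/example",
)
print(citation.format())
# Doe J (2014) Example dataset title. Zenodo. doi:10.0000/example
```

Keeping the core fields required and the rest optional mirrors the split between the minimal pattern and the additional components. Two copies of a dataset can be compared by checking that `md5_verifier` returns the same digest for both.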
Digitized special collections bring more to the table, however. For example, many institutions prefer to include citation information for the containing collection and the holding institution. How best can we do that?