
AGU data citation community of practice - Credit for creators of data within collections using the concept of a reliquary
  • Justin Buck,
  • Deb Agarwal,
  • James Ayliffe,
  • Chris Erdmann,
  • Carole Goble,
  • Ugis Sarkans,
  • Daniel Noesgaard,
  • Uwe Schindler,
  • Shelley Stall,
  • Martin Fenner,
  • Martina Stockhause,
  • Paolo Manghi
Justin Buck, National Oceanography Center
Corresponding Author: [email protected]
Deb Agarwal, Lawrence Berkeley National Laboratory
James Ayliffe, National Oceanography Center
Chris Erdmann
Carole Goble, University of Manchester
Ugis Sarkans, EMBL-European Bioinformatics Institute
Daniel Noesgaard, GBIF Secretariat
Uwe Schindler, MARUM - University of Bremen
Shelley Stall, American Geophysical Union
Martin Fenner, Front Matter
Martina Stockhause, German Climate Computing Centre (DKRZ)
Paolo Manghi, Consiglio Nazionale delle Ricerche (CNR)


A gap in community practice on data citation emerged during the AGU Fall Meeting 2020 Data FAIR Town Hall, “Why Is Citing Data Still Hard?”, whose goal was to address the use case of citing a large number of datasets such that credit for individual datasets is assigned properly. The discussion covered the concept of a “Data Collection” and the infrastructure and guidance still needed to fully implement this capability, so that it is easier for researchers to use and to receive credit when their data are cited in this manner. Such collections may contain thousands to millions of elements, and a citation may need to include subsets of elements, potentially from multiple collections. Such citations will be crucial for enabling reproducible research and for giving credit to the creators of data and digital objects. To address this gap, the data citation community of practice was formed, with members from data centres, research journals, informatics research communities, and data citation infrastructure. The community aims to recommend an approach that leverages existing infrastructure and is realistic for researchers to use and for each stakeholder to implement. To achieve data citation of these subsets of large data collections, the concept of a “reliquary” is introduced. In this context, a reliquary is a container of persistent identifiers (PIDs) or references that define the objects used in a research study; it can include any number of elements. The reliquary can then be cited as a single entity in academic publications. The reliquary concept will enable data citation use cases such as the citation of elements within a data collection that are formed from numerous underlying datasets with their own PIDs, the unambiguous citation of data used in IPCC Assessment Reports, and the citation of subsets of research data collections that contain millions of elements.
Discussions over the course of 2021 have developed a theoretical concept; at the time of writing, formal use cases and initial applications are being defined. The recommendation developed by this effort will be made available for review and comment by communities such as ESIP and RDA. All are welcome.
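
As an illustration only, the reliquary described above (a container of PIDs that is itself cited as a single entity) could be modelled along the following lines. This is a minimal sketch, not the community's recommendation: the class names `Reliquary` and `Member`, their fields, the JSON layout, and every identifier value are hypothetical placeholders introduced here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Member:
    """One object used in a study, identified by a PID or other reference."""
    pid: str               # placeholder identifier, not a real DOI
    collection: str = ""   # optional: collection the object was drawn from


@dataclass
class Reliquary:
    """Hypothetical container of PIDs that is cited as a single entity."""
    pid: str               # identifier assigned to the reliquary itself
    title: str
    members: list = field(default_factory=list)

    def add(self, member: Member) -> None:
        # List each object once, even if it is added repeatedly.
        if member not in self.members:
            self.members.append(member)

    def collections(self) -> set:
        # Members may come from several collections.
        return {m.collection for m in self.members if m.collection}

    def to_json(self) -> str:
        # Assumed serialisation; no standard layout is implied.
        return json.dumps(
            {
                "pid": self.pid,
                "title": self.title,
                "members": [asdict(m) for m in self.members],
            },
            indent=2,
        )


# All identifiers below are made-up placeholders.
reliquary = Reliquary(
    pid="doi:10.0000/reliquary-example",
    title="Data used in an example study",
)
reliquary.add(Member("doi:10.0000/dataset-a", collection="collection-1"))
reliquary.add(Member("doi:10.0000/dataset-b", collection="collection-2"))
reliquary.add(Member("doi:10.0000/dataset-a", collection="collection-1"))  # duplicate, ignored

print(reliquary.to_json())
```

In this sketch a publication would cite only the reliquary's own identifier, while the member list preserves the link to each underlying dataset, across collections, so that credit can be traced back to individual creators.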