Daniel Heydebreck

and 5 more

Within the AtMoDat project (Atmospheric Model Data, www.atmodat.de), a standard has been developed to improve the FAIRness of atmospheric model data published in repositories. Atmospheric model data form the basis for understanding and predicting natural events, including atmospheric circulation, local air quality patterns, and the planetary energy budget. Such data should be made available for evaluation and reuse by scientists, the public sector, and relevant stakeholders. In many respects, atmospheric modeling is ahead of other fields in moving towards FAIR (Findable, Accessible, Interoperable, Reusable; see e.g. Wilkinson et al. (2016, doi:10.1038/sdata.2016.18)) data: many models write their output directly into netCDF or into file formats that can be converted into netCDF. NetCDF is a non-proprietary, binary, and self-describing format, which supports interoperability and facilitates reusability. Nevertheless, consistent human- and machine-readable standards for discipline-specific metadata are also necessary. While standardisation of file structure and metadata (e.g. the Climate and Forecast Conventions) is well established for some subdomains of the earth system modeling community (e.g. the Coupled Model Intercomparison Project, Juckes et al. (2020, https://doi.org/10.5194/gmd-13-201-2020)), other subdomains still lack such standardisation. For example, standardisation is not well advanced for obstacle-resolving atmospheric models (e.g. for urban-scale modeling). The ATMODAT standard, which will be presented here, includes concrete recommendations related to the maturity, publication, and enhanced FAIRness of atmospheric model data. The recommendations include requirements for rich metadata with controlled vocabularies, structured landing pages, file formats (netCDF), and the structure within files. Human- and machine-readable landing pages are a core element of this standard and should hold and present discipline-specific metadata at the simulation and variable level.
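As an illustration of what a machine-readable metadata requirement can look like in practice, the following minimal sketch checks a set of netCDF global attributes for completeness. It is not the ATMODAT standard's own checker: the attribute list is the one recommended by the Climate and Forecast Conventions (plus `Conventions`), the function name `missing_global_attributes` is hypothetical, and the attributes are passed as a plain dictionary so that no netCDF library is assumed.

```python
# Global attributes recommended by the CF Conventions (description of file
# contents), plus "Conventions". This is an assumption for illustration;
# the ATMODAT standard's own attribute list is not reproduced here.
RECOMMENDED_GLOBAL_ATTRS = (
    "Conventions",
    "title",
    "institution",
    "source",
    "history",
    "references",
    "comment",
)


def missing_global_attributes(attrs, required=RECOMMENDED_GLOBAL_ATTRS):
    """Return the names in `required` that are absent or blank in `attrs`."""
    return [name for name in required if not str(attrs.get(name, "")).strip()]


# Hypothetical global attributes, as they might be read from a netCDF file.
example_attrs = {
    "Conventions": "CF-1.8",
    "title": "Hypothetical urban-scale simulation",
    "institution": "Example Institute",
    "source": "example-model v1.0",
    "history": "",  # present but blank, so it is reported as missing
}

missing = missing_global_attributes(example_attrs)
print("Missing or blank global attributes:", missing)
```

With a netCDF library, the dictionary would be filled from the file's global attributes; the check itself stays the same.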

Amandine Kaiser

and 3 more

Data maturity describes the degree of formalisation/standardisation of a data object with respect to FAIRness and the quality of the (meta)data. Therefore, high (meta)data maturity increases the reusability of data. Data maturity is also an important topic in data management, as reflected by the growing number of tools and theories that try to measure it, e.g. the FAIR testing tools assessed by RDA (1) or the NOAA maturity matrix (2). If the results of stewardship tasks cannot be shown directly in the metadata, those who want to reuse data cannot easily recognise which datasets are easy to reuse. For example, the DataCite Metadata Schema does not provide an explicit property to link or store information on data maturity (e.g. FAIRness or quality of data/metadata). The AtMoDat project (Atmospheric Model Data) (3) aims to improve the reusability of published atmospheric model data by scientists, the public sector, companies, and other stakeholders. These data are valuable because they form the basis for understanding and predicting natural events, including the atmospheric circulation and ultimately the atmospheric and planetary energy budget. As most atmospheric data has been published with DataCite DOIs, it is highly important that the maturity of the datasets can be easily found in the DOI's metadata. Published data from other fields of research would also benefit from easily findable maturity information. Therefore, we developed a Maturity Indicator concept and propose to introduce it as a new property in the DataCite Metadata Schema. This indicator is generic and independent of any scientific discipline and data stewardship tool. Hence, it can be used in a variety of research fields.

(1) https://doi.org/10.15497/RDA00034
(2) Peng et al., 2015: https://doi.org/10.2481/dsj.14-049
(3) www.atmodat.de
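To make the idea of a maturity property in DOI metadata more concrete, the sketch below attaches a maturity entry to a simplified, DataCite-like metadata record. Everything about the entry is an assumption made for illustration: the abstract does not specify how the proposed Maturity Indicator is structured, so the key names `maturityIndicators`, `maturityIndicatorName`, `maturityIndicatorValue`, and `schemeURI`, the function `with_maturity_indicator`, and the example values are all hypothetical and are not part of the current DataCite Metadata Schema.

```python
import copy
import json


def with_maturity_indicator(record, name, value, scheme_uri=None):
    """Return a copy of `record` with one hypothetical maturity entry added.

    The key names used here are illustrative only; they are not defined
    by the DataCite Metadata Schema.
    """
    updated = copy.deepcopy(record)  # leave the caller's record untouched
    entry = {"maturityIndicatorName": name, "maturityIndicatorValue": value}
    if scheme_uri is not None:
        entry["schemeURI"] = scheme_uri
    updated.setdefault("maturityIndicators", []).append(entry)
    return updated


# Simplified placeholder record; the DOI and all values are made up.
record = {
    "doi": "10.0000/example",
    "titles": [{"title": "Hypothetical atmospheric model dataset"}],
    "publicationYear": 2021,
}

enriched = with_maturity_indicator(
    record,
    name="Example maturity scheme",
    value="level 3 of 5",
    scheme_uri="https://example.org/maturity-scheme",
)
print(json.dumps(enriched, indent=2))
```

Keeping the entry generic (a scheme name, a value, and an optional scheme URI) mirrors the stated goal that the indicator be independent of any scientific discipline and data stewardship tool.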