
Nick McKay edited A flexible container for paleoclimate data.tex almost 9 years ago

Commit id: 0c9c48015e9ec40a50fc1746cb3f45f77433e249


Describing the columns of the data table in the LiPD framework allows explicit encoding of key metadata that are commonly lost or misunderstood in current data structures. For example, the ``climateInterpretation'' section above allows the scientist to describe explicitly how the parameter ``senses'' climate. When encoded as above and explicitly defined and linked, the knowledge that this record is interpreted to record May through July sea surface temperature, and that those temperature estimates were derived from the Mg/Ca calibration equations of \citet{Barker_2005} and \citet{Thornalley_2009}, becomes built into the dataset and readable by both people and computers. It is queryable, linked to other datasets, and transparent when datasets are used in ways that fall outside the published interpretations. An advantage of using JSON as the default container for this information is that it is an extremely common vehicle for all manner of data and can be parsed by nearly all modern programming languages. Because each LiPD dataset comprises a JSON-LD file and one or more CSV files, each dataset is packaged using BagIt\footnote{https://en.wikipedia.org/wiki/BagIt}, which provides a simple format for collecting and validating files for distribution and can be readily serialized into a compressed file for exchange between users. To facilitate input and output of LiPD datasets, we are developing code to move LiPD datasets easily into and out of Matlab (as structured arrays), R (as lists) and Python (as dictionaries).
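
As a rough illustration of the packaging and loading steps described above, the following Python sketch builds a toy bag, checks its payload files against a BagIt-style MD5 manifest, and reads the JSON metadata and CSV values into a single dictionary. It is a minimal sketch under stated assumptions and not the LiPD code base: the file names, the metadata keys other than ``climateInterpretation'', the column numbering convention, and all numeric values are invented for illustration, and only the Python standard library is used.

```python
import csv
import hashlib
import json
import tempfile
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def build_example_bag(bag_dir):
    """Write a toy bag: payload under data/, plus bagit.txt and a manifest.

    All file names, keys, and values here are hypothetical examples.
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    metadata = {
        "dataSetName": "ExampleCore",
        "columns": [
            {"number": 1, "variableName": "age", "units": "yr BP"},
            {
                "number": 2,
                "variableName": "temperature",
                "units": "degC",
                "climateInterpretation": {
                    "variable": "sea surface temperature",
                    "seasonality": "May-July",
                },
            },
        ],
    }
    (data_dir / "metadata.jsonld").write_text(json.dumps(metadata, indent=2))
    (data_dir / "table.csv").write_text("100,11.2\n200,10.8\n300,11.5\n")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n"
    )
    # One manifest line per payload file: "<checksum>  <relative path>".
    lines = [
        f"{md5_of(path)}  data/{path.name}"
        for path in sorted(data_dir.iterdir())
    ]
    (bag_dir / "manifest-md5.txt").write_text("\n".join(lines) + "\n")


def verify_manifest(bag_dir):
    """Return True if every file listed in the manifest matches its checksum."""
    manifest = bag_dir / "manifest-md5.txt"
    for line in manifest.read_text().splitlines():
        checksum, rel_path = line.split(maxsplit=1)
        if md5_of(bag_dir / rel_path) != checksum:
            return False
    return True


def load_dataset(bag_dir, metadata_name, table_name):
    """Load the JSON metadata and attach CSV values to each described column."""
    data_dir = bag_dir / "data"
    dataset = json.loads((data_dir / metadata_name).read_text())
    with open(data_dir / table_name, newline="") as handle:
        rows = list(csv.reader(handle))
    for column in dataset["columns"]:
        index = column["number"] - 1  # assumed 1-based column numbering
        column["values"] = [float(row[index]) for row in rows]
    return dataset


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp) / "example_bag"
        build_example_bag(bag)
        if verify_manifest(bag):
            dataset = load_dataset(bag, "metadata.jsonld", "table.csv")
            temperature = dataset["columns"][1]
            print(temperature["climateInterpretation"]["seasonality"])
            print(temperature["values"])
```

Because the interpretation travels inside the same dictionary as the values, a query such as ``which columns record May through July conditions'' reduces to a lookup on the loaded structure. A production workflow would more likely rely on an existing BagIt library than on a hand-written manifest check.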