Nick McKay edited Introduction.tex  over 9 years ago

Commit id: 2e7814ab9624c094f8fff51222145c7981b6185c


Science is entering a data-intensive era, where insight is increasingly gained by extracting information from large volumes of data \cite{Hey_2012}. This step is particularly critical in paleoclimatology, as understanding past changes in the climate system requires observations across large spatial and temporal scales. Paleoclimatic observations are typically limited to small geographic domains, so investigating large scales requires integrating many disparate studies and datasets. In recognition of this, the paleoclimate community has made a major effort to make its data broadly available, largely through the World Data Center for Paleoclimatology and the Pangaea data archives. Nevertheless, the lack of consistent formatting and metadata standards has made the re-use of such data needlessly labor-intensive, because it prevents computers from participating in the task of making connections across datasets. As the number of records in these archives has grown, making connections manually has become increasingly challenging, and approaching paleoclimate questions from the viewpoint of the vast array of archived data remains relatively rare at a time when it should be flourishing. The Linked Data paradigm \cite{Bizer_2009} was designed to address this problem, and to allow for data-driven discovery across datasets that would be unlikely or impossible otherwise. In this technical note, we present a new, flexible linked-data container designed for paleoclimate data. Such a data container is a necessary first step towards a ``semantic web of paleoclimatology'' \cite{Emile_Geay_2013}, and provides a straightforward framework in which communities and researchers can explicitly describe their data and metadata in common terms that both the community and computers can understand.