Leslie Hsu edited background.md  over 9 years ago

Commit id: 9bdb679132c4c15035610c778656a31cca92f677


One of the first online sample-based geochemical databases was PetDB, the Petrological Database, formerly the Petrological Database of the Ocean Floor. The database was built on a sample-based data model \citep{Lehnert_2000}, which served as a foundational structure for several disciplinary databases developed in the following decade, including SedDB (Lehnert et al., 2005), [GEOROC](http://georoc.mpch-mainz.gwdg.de/georoc/) (Sarbas and Nohl, 2009), NAVDAT (Walker et al., 2004), and VentDB \citep{34e0d125-4bec-4225-8afa-59a6c7565821}. These databases combine data from numerous sources into a single relational synthesis database, which allows integrated datasets to be produced rapidly and significantly reduces the time previously needed to compile the same data manually from the original sources. The state of the art of geochemical data publication was laid out a decade ago by \citet{Staudigel_2003}, with the goal of initiating discussion of data formats and metadata in geochemistry at the “earliest stages of [geochemistry’s] exploitation of Information Technology”. \citet{Staudigel_2003} highlight complexities within the organizational structure relating to standardization, conventions, lack of tabular data, and incomplete metadata. These issues have not disappeared, but their management and mitigation have improved significantly. In the last decade, editors, reviewers, professional societies, and funding agencies have implemented improvements such as governmental data policy statements (e.g. U.S. Office of Management and Budget Memo M-13-13, [Open Data Policy: Managing Information as an Asset](http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf)), endorsement of best practices, and stricter rules regarding data reporting (e.g. CODATA Scientific Data Policy Statements).
Editors from several peer-reviewed journals that publish manuscripts including geochemical data agreed on minimum standards for documenting data quality and sample information, and for the format and accessibility of the data. These standards were published as the Editors Roundtable document “Requirements for the Publication of Geochemical Data” (Goldstein et al., 2014) \citep{http://dx.doi.org/10.1594/IEDA/100426}. Some journals have implemented the recommendations, but strict enforcement is not yet common. Data management software that works directly with laboratory equipment is one of the most efficient ways to overcome the hurdle of initiating data management. Alongside the development of suggested reporting norms for geochemical data, the Geochron (www.geochron.org) software works directly with mass spectrometers and data reduction programs to retain the essential sample metadata (Bowring et al., 2011; Walker et al., 2011) \citep{http://dx.doi.org/10.1029/2010GC003479}. This automated software improves the workflow and streamlines metadata preservation by bringing data directly from the instrument to data management and visualization software on the computer. Software of this type has greatly increased the ability of scientists to collect, manage, and publish data that can be easily contributed to sample-based databases. Maintaining integrated synthesis databases is an arduous task that involves sustaining controlled vocabularies, obtaining data from authors, and tracking data in a way that captures the complex metadata relationships. Increasingly, investigators seek rapid publication of their data, along with the ability to search multiple disciplinary databases at once. To address these needs and provide useful search and discovery tools, EarthChem has built several complementary systems to support its user community.