August Muench edited discussion.tex  over 10 years ago

Commit id: a67283ba87889dca6b721bc5cf46fc90826c5006


\item provide a data citation for every dataset uploaded. The citation includes a persistent identifier that links to the data, and it can be added to the references section of any publication.
\end{enumerate}

For the everyday astronomer, TheAstroData flips the equation of data sharing in a virtual observatory context on its head. It trades the interoperability that comes with homogenized datasets for ease of data sharing by astronomers. Search functions focus on descriptive metadata instead of quantified slicing of datasets by physical quantities such as location on the sky. This trade-off is not permanent, and we assert that the kinds of data access envisioned by Szalay \& Gray for small published datasets can be achieved ex post facto. We plan to re-index and qualify shared data files, exposing their numerical file-level metadata and extracting additional metadata fields to enable finer-grained search. Further, the audience for TheAstroData is completely transparent and focused on individual scientists or projects that have derived (and often heterogeneous) datasets to share or to publish alongside a refereed paper.

TheAstroData datasets are already linked to literature publication records in two ways. First, we provide primary publication-to-dataset links to the SAO/NASA Astrophysics Data System (ADS) \url{http://adsabs.harvard.edu/}, which is the universal literature resource for all of astronomy; an astronomer's TheAstroData datasets appear as ``Data Archive'' links in the primary publication's ADS record. Second, our records are listed in the Thomson-Reuters Data Citation Index \url{http://wokinfo.com/products_tools/multidisciplinary/dci/}, which makes use of the Dataverse Network's OAI-PMH harvesting interface.
Our future plans include translating the rich DDI metadata adopted by the Dataverse Network, enhanced with our astronomy-specific extensions, into VO standards and exporting this version to indexing tools such as the VO Registry (or a similar data publishing registry). We anticipate that our adoption of the Dataverse Network for TheAstroData has two additional benefits for everyday astronomers:
\begin{enumerate}
\item In future iterations of TheAstroData, we plan to reuse the data analysis capabilities of the Dataverse Network software to allow integration of astronomers' FITS data with new visualization or analysis tools, for example, GlueViz \url{glueviz.org};
\item The stamping of TheAstroData datasets with a standardized data citation will facilitate the adoption of data citation by publishers. It is critical that this type of citation become part of the references sections of publications and be easily traceable so that its impact can be measured. We are in conversations with relevant astronomy publishers.
\end{enumerate}
% As in the case of Social Science, the central repository does not serve merely as a file system for dropping and accessing data files; it provides the tools to understand the nature of the datasets and how they can be reused. It accomplishes this by allowing descriptive metadata about the dataset and complementary files such as documentation and code to be added, and by extracting metadata automatically from the data file. In quantitative social science, the most common data formats are R, SPSS and STATA, formats that allow researchers to have rich statistical metadata for data tables. These data files are recognized by the Dataverse software, and the rich metadata is extracted not only for searching but also for providing summary statistics and analysis tools for these data types.
Our extension in astronomy is to provide similarly rich functionality for FITS files: in the first iteration, to support searching of FITS files, and in future iterations, to allow integration with visualization or analysis tools. The Dataverse also provides the infrastructure to export that metadata and make it accessible through the Open Archives Initiative (OAI) protocol, or through data and metadata RESTful APIs, so that it can be easily harvested by other systems, making the datasets more easily discoverable by the astronomy community.
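
To make the idea of extracting searchable metadata from a FITS file concrete, the following is a minimal sketch in Python. It is not the Dataverse Network implementation; it only assumes the standard FITS header layout (80-character cards, a value indicator \texttt{= } in columns 9 and 10, and a closing \texttt{END} card). The function names \texttt{parse\_value}, \texttt{read\_header}, and \texttt{make\_card}, as well as the example header values, are hypothetical and chosen for illustration. A production system would more likely rely on an established FITS library.

```python
# Minimal sketch: pull keyword/value pairs out of a FITS primary header.
# Assumption: standard fixed-format 80-character header cards.
# This is an illustration, not the Dataverse Network's extraction code.

CARD = 80     # bytes per header card
BLOCK = 2880  # FITS headers are padded to a multiple of 2880 bytes


def parse_value(field):
    """Convert the value field of a card to str, bool, int, or float."""
    field = field.strip()
    if field.startswith("'"):
        # Quoted string; a doubled quote ('') stands for a literal quote.
        body, i = [], 1
        while i < len(field):
            if field[i] == "'":
                if field[i + 1:i + 2] == "'":
                    body.append("'")
                    i += 2
                    continue
                break
            body.append(field[i])
            i += 1
        return "".join(body).rstrip()
    field = field.split("/", 1)[0].strip()  # drop the trailing comment
    if field in ("T", "F"):
        return field == "T"
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field  # leave anything unrecognized as text


def read_header(data):
    """Return {keyword: value} for the first header found in `data` (bytes)."""
    header = {}
    for offset in range(0, len(data) - CARD + 1, CARD):
        card = data[offset:offset + CARD].decode("ascii", errors="replace")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        # Only cards with the value indicator carry a keyword value;
        # commentary cards such as COMMENT and HISTORY are skipped.
        if keyword and card[8:10] == "= ":
            header[keyword] = parse_value(card[10:])
    return header


def make_card(text):
    """Pad a card to 80 ASCII bytes (helper for the example only)."""
    return text.ljust(CARD).encode("ascii")


# Hypothetical header, built in memory so the sketch needs no data file.
example = b"".join(make_card(c) for c in [
    "SIMPLE  =                    T / conforms to FITS standard",
    "BITPIX  =                  -32 / 32-bit floating point",
    "NAXIS   =                    2",
    "OBJECT  = 'NGC 1333'           / target name",
    "RA      =             52.29375 / [deg] hypothetical pointing",
    "COMMENT illustrative header only",
    "END",
]).ljust(BLOCK, b" ")

fields = read_header(example)
```

The resulting dictionary (\texttt{fields}) holds typed values that could be handed to a search index alongside the descriptive metadata supplied by the depositor, which is the kind of finer-grained indexing described above.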