Alyssa Goodman I significantly re-structured the intro over the past hour, and made it less about "optical" surveys.  over 10 years ago

Commit id: 3567e5ef5fd33b5b48f61a08a0a4df6cb43653c1


The physical and astronomical sciences have a well-established reputation as disciplines with a strong culture of data sharing. Astronomy has pioneered Open Access to both publications and data. In fact, the data generated by large sky surveys, such as those indicated above, are collected under national grants (e.g., NASA), archived by national institutions (e.g., the Space Telescope Science Institute, STScI), and made publicly available to anyone (e.g., at \url{http://archive.stsci.edu/}). The fact that astronomical data from large surveys are publicly available is remarkable, but by no means surprising. Astronomers collect data about the Universe, and thus they feel a moral obligation to share collected data openly. Moreover, these data are collected under national programs that require data to be made openly available. Astronomers often have access to efficient and robust mechanisms that serve to archive, curate, and make primary data available (e.g., MAST, \url{http://archive.stsci.edu}, NED, SIMBAD xxadd linksxx). However, very few parallel systems exist for derived data, and none is yet robust. Because most, if not all, scientific articles in astronomy are based on derived data, making such data visible, intelligible, and available to the public is of fundamental importance.
In this article, we analyze how the processes of sharing, archiving, and citing derived astronomical data are {\it presently} accomplished. Our research is based upon a quantitative link structure analysis and a qualitative interview study. The results of this article are divided into two sections, accordingly. In the first part of the results, we report on a link analysis performed on all articles published in the main astronomy journals between 1997 and 2008. We explore all these articles for outgoing links. If links are present in an article, do those links point to data? Are the links still valid and reachable? We find that 1) astronomers have increasingly used links in papers to provide pointers to derived data, and 2) the availability of these data deteriorates with time (broken links), especially when derived data are hosted on personal websites. In the second part of the results section, we report on a personal interview study conducted with a dozen astronomers at the Harvard-Smithsonian Center for Astrophysics. The purpose of the interviews was to document astronomers' data practices in a semi-structured format.
We find that 1) astronomers by and large produce derived data in standard astronomical formats, 2) they are overwhelmingly willing to share their data with their peers and the public, and 3) they are normally unaware of mechanisms for archiving and citing derived data.
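
The link analysis summarized above asks three questions of each article: does it contain outgoing links, where do they point, and are they still reachable? The sketch below illustrates that kind of check in Python. It is a minimal illustration under stated assumptions, not the procedure used in this study: the function names, the URL pattern, and the `/~` test for personal websites are all hypothetical choices made for the example.

```python
import re
import urllib.request

# Assumed pattern: an http(s) URL runs until whitespace, a brace, a bracket,
# a backslash, or a quote. This also handles LaTeX \url{...} wrappers.
URL_PATTERN = re.compile(r"https?://[^\s{}\\()\[\]<>\"']+")


def extract_links(text):
    """Return the unique http(s) URLs in text, in order of first appearance."""
    links = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop sentence punctuation stuck to the URL
        if url not in links:
            links.append(url)
    return links


def looks_personal(url):
    """Crude heuristic (an assumption): a '/~user' path suggests a personal page."""
    return "/~" in url


def is_reachable(url, timeout=10.0):
    """Return True if a HEAD request succeeds with an HTTP status below 400."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # Covers HTTP errors, DNS failures, timeouts, and malformed URLs.
        return False


def link_report(text, check=is_reachable):
    """Map each URL found in text to its personal-page flag and reachability."""
    return {
        url: {"personal": looks_personal(url), "reachable": check(url)}
        for url in extract_links(text)
    }
```

Passing a different `check` function to `link_report` lets the same code run against archived snapshots or a cached list of results instead of the live web, which matters when the goal is to measure how availability changes over time.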