Alberto Pepe edited introduction.tex  over 10 years ago

Commit id: 525366617b89a6310385f1aadc72e568eb6b4fb9


The physical and astronomical sciences have a well-established reputation as disciplines with a strong culture of data sharing. Astronomy has pioneered Open Access to both publications and data. In fact, the data generated by large sky surveys, such as those indicated above, are collected under national grants (e.g., NASA), archived by national institutions (e.g., the Space Telescope Science Institute, STScI), and made publicly available (e.g., at \url{http://archive.stsci.edu/}). That astronomical data from large surveys are publicly available is remarkable, but by no means surprising. Astronomers collect data about the Universe and thus feel a moral obligation to share those data openly. Moreover, these data are collected under national programs that require data to be made openly available.

Astronomers often have access to efficient and robust mechanisms that serve to archive, curate, and make primary data available (e.g., MAST, \url{http://archive.stsci.edu/}; NED; SkyView; SIMBAD xxadd linksxx). However, very few parallel systems exist for derived data, and none is yet robust. Because most, if not all, scientific articles in astronomy are based on derived data, making such data visible, intelligible, and available to the public is of fundamental importance.

In this article, we analyze how the processes of sharing, archiving, and citing derived astronomical data are {\it presently} accomplished. Our research is based on a quantitative link structure analysis and a qualitative interview study, and the results are divided into two sections accordingly. In the first part of the results, we report on a link analysis performed on all articles published in the main astronomy journals between 1997 and 2008. We examine each of these articles for outgoing links: if links are present in an article, do they point to data? Are they still valid and reachable?
We find that 1) astronomers have increasingly used links in papers to provide pointers to derived data, and 2) the availability of these data deteriorates with time (broken links), especially when derived data are hosted on personal websites. In the second part of the results, we report on a personal interview study conducted with a dozen astronomers at the Harvard-Smithsonian Center for Astrophysics. The purpose of the interviews was to document astronomers' data practices in a semi-structured format. We find that 1) astronomers by and large produce derived data in standard astronomical formats, 2) they are overwhelmingly willing to share their data with their peers and the public, and 3) they are normally unaware of mechanisms for archiving and citing derived data.