Alyssa Goodman edited introduction.tex  over 10 years ago

Commit id: 6a12a6a59ddc5062fb45f73b28c7fcf1009df269

Astronomers often have access to efficient and robust mechanisms that archive, curate, and make primary data available (e.g., \url{http://archive.stsci.edu/}, \url{http://ned.ipac.caltech.edu/}, \url{http://skyview.gsfc.nasa.gov/}, \url{http://simbad.u-strasbg.fr/simbad/}). But very few parallel systems exist for derived data, and none is yet robust. Because most, if not all, scientific articles in astronomy are based on derived data, making such data visible, intelligible, and available to the public is of fundamental importance. In this article, we analyze how the processes of sharing, archiving, and citing derived astronomical data are {\it presently} accomplished. Our research is based upon a quantitative link structure analysis and a qualitative study composed of interviews and a survey. The results of this article are divided into two sections, accordingly. In the first part of the results, we report on a link analysis performed on all articles published in the main astronomy journals between 1997 and 2008. We examine all of these articles for outgoing links. If links are present in an article, do those links point to data? Are the links still valid and reachable? We find that 1) astronomers have increasingly used links in papers to provide pointers to derived data, and 2) the availability of these data deteriorates with time (broken links), especially when derived data are hosted on personal websites. In the second part of the results section, we report findings from a personal interview study conducted with a dozen astronomers at the Harvard-Smithsonian Center for Astrophysics and a follow-up survey conducted at the same institution (175 respondents). The purpose of this dual qualitative study was to document astronomers' data use and sharing practices in a semi-structured format.
We found that 1) astronomers produce derived data in standard astronomical formats, 2) they are overwhelmingly willing to share their data with their peers and the public, 3) they are normally unaware of mechanisms for archiving and citing derived data, and 4) they rely upon non-automated, non-standard methods to acquire and provide derived data (e.g., they put derived data on their websites and link to them, or they contact paper authors to obtain data).