Ivelina Momcheva

INTRODUCTION

Much of modern astronomy research depends on software. Digital images and numerical simulations are central to the work of most astronomers today, and anyone actively involved in astronomy research has a variety of software techniques in their toolbox. Furthermore, the sheer volume of data has increased dramatically in recent years, and the efficient and effective use of large data sets increasingly requires more than rudimentary software skills. Finally, as astronomy moves towards an open-code model, propelled by pressure from funding agencies and journals as well as from the community itself, the readability and reusability of code will become increasingly important (Figure [fig:xkcd]). Yet we know few details about the software practices of astronomers. In this work we aim to gain a better understanding of the prevalence of software tools, the demographics of their users, and the level of software training in astronomy.

The astronomical community has, in the past, provided funding and support for software tools intended for the wider community. Examples include the Goddard IDL library (funded by the NASA ADP), IRAF (supported and developed by AURA at NOAO), STSDAS (supported and developed by STScI), and the Starlink suite (funded by PPARC). As the field develops, new tools are required, and we need to focus our efforts on those that will have the widest user base and the lowest barrier to adoption. For example, as our work here shows, the much larger astronomy user base of Python relative to R suggests that tools written in the former language are likely to attract many more users and contributors than tools written in the latter.

More recently, there has been a growing discussion of the importance of data analysis and software development training in astronomy (e.g., the special sessions at the 225th AAS meeting, “Astroinformatics and Astrostatistics in Astronomical Research: Steps Towards Better Curricula” and “Licensing Astrophysics Codes”, which were standing room only). Although astronomy and astrophysics went digital long ago, the formal training of astronomy and physics students rarely involves software development or data-intensive analysis techniques. Such skills are increasingly critical in the era of ubiquitous “Big Data” (e.g., the 2015 NOAO Big Data conference). Better information on the needs of researchers, as well as on the current availability (or lack) of training opportunities, can be used to inform, motivate, and focus future efforts towards improving this aspect of the astronomy curriculum.

In 2014 the Software Sustainability Institute (SSI) carried out an inquiry into the software use of researchers in the UK (see also the associated presentation). That survey provides useful context for software usage by researchers, as well as a useful definition of “research software”:

Software that is used to generate, process or analyze results that you intend to appear in a publication (either in a journal, conference paper, monograph, book or thesis). Research software can be anything from a few lines of code written by yourself, to a professionally developed software package. Software that does not generate, process or analyze results - such as word processing software, or the use of a web search - does not count as ‘research software’ for the purposes of this survey.

However, that survey was limited to researchers at UK institutions. More importantly, it was not focused on astronomers, who may have quite different software practices from scientists in other fields.
Motivated by these issues and by related discussions during the .Astronomy 6 conference, we created a survey to explore software use in astronomy. In this paper, we discuss the methodology of the survey in §[sec:datamethods], the results from the multiple-choice sections in §[sec:res], and the free-form comments in §[sec:comments]. In §[sec:ssicompare] we compare our results to the aforementioned SSI survey, and in §[sec:conc] we conclude. We have made the anonymized results of the survey and the code used to generate the summary figures available at https://github.com/eteq/software_survey_analysis. This repository may be updated in the future if a significant number of new respondents fill out the survey[1].

[1] http://tinyurl.com/pvyqw59