Jacob Hummel edited Introduction.tex  about 8 years ago

Commit id: d9485f7f96e5559dfb50370d4868300567b0e922


Fortunately, the challenge of data management and analysis is not unique to astronomy. The resulting overlap with the needs of the broader scientific community and of industry at large provides a large pool of scientific software developers to tackle these common problems. In recent years, this broader community has settled on Python as its programming language of choice due to its efficacy as a `glue' language and the rapid development it allows. This has led to a robust scientific software ecosystem, with packages for numerical data analysis like NumPy (Oliphant 2006; Van Der Walt et al. 2011), SciPy (Jones et al. 2001), pandas (McKinney 2010), and scikit-image; Matplotlib (Hunter 2007) and seaborn for plotting; scikit-learn for machine learning; and statistics and modeling packages like scikits-statsmodels, pymc, and emcee \citep{Foreman-Mackeyetal2013}. Python is quickly becoming the language of choice for astronomers as well, with the advent of the Astropy project \citep{Robitailleetal2013} and its affiliated packages providing a coordinated set of Python tools implementing the core astronomy-specific functionality needed by researchers. The analysis capabilities provided by the nascent pandas library will only strengthen that trend in the future. Pandas is a thoroughly documented, open-source, BSD-licensed library, backed by a strong community of developers, that provides high-performance, easy-to-use data structures and data analysis tools for Python. With this in mind, we present a pandas-based framework for analyzing GADGET-HDF5 files: the GADGET dataframe library, or GADFLY.