Comparison height width

Alyssa Goodman

and 2 more

Let's use Authorea to keep track of B5 materials.

FITS cubes
Ready for Glue volume rendering.

yt movies
The blue/green is \(C^{18}O\) (2-1), and red/orange is \(NH_3\) (1,1).

Keynote science
The three slides have been uploaded.

Alvaro-style 3D movies (clustering by the FoF method)
The clustering is done using a code I wrote following the explanation in Alvaro's paper. The friends-of-friends threshold in Alvaro's paper is 3 km s\(^{-1}\) pc\(^{-1}\) (~1 \(c_s\) per half beam). Using that threshold, the (extended) B5 would be grouped into one single component. The clustering in the movies below therefore uses a threshold of 1 km s\(^{-1}\) pc\(^{-1}\) (one third of Alvaro's threshold), in order to cluster the data points into multiple components -- not too bad a choice, since Alvaro was using CO (1-0), which has a broader line width. The clustering is run on the combined Gaussian fit: we take the peak from the one-component Gaussian fit where its residual is smaller, and the two peaks from the two-component fit where that fit's residual is smaller.

- movie0_opaque_linewidth: 3D visualization of the Gaussian-fitted peaks without friends-in-velocity (FIVE) clustering. Brighter/white circles mark where the (Gaussian-fitted) emission is higher; the size is scaled with the (Gaussian-fitted) line width.
- movie0_transparent_linewidth: the same movie, with alpha.
- movie_opaque: 3D visualization of the Gaussian-fitted peaks with FIVE clustering. The size is NOT scaled with the line width.
- movie_transparent: the same movie, with alpha.
- movie_opaque_linewidth: as movie_opaque, but with the size scaled with the line width.
- movie_transparent_linewidth: the same movie, with alpha.

Clustering by the DBSCAN method
The clustering is done using the DBSCAN (density-based spatial clustering of applications with noise) method in scikit-learn. DBSCAN should perform better than the FoF method (which is similar to K-means). In practice, the mean silhouette coefficient -- which measures how well the clustering performs, ranging from -1 to 1 for each data point, with -1 meaning the clustering is not appropriate for that point and 1 meaning it is good -- shows that the DBSCAN result (mean silhouette score ~0.06) is better than the FoF result (mean silhouette score ~ -0.26; to be fair, the FoF score is calculated on the same standardized dataset used in the DBSCAN analysis). DBSCAN also identifies a number of data points that cannot be clustered (the "noisy samples"). See the scikit-learn clustering page for an overview of various clustering algorithms.

To implement DBSCAN, the PPV positions of the fitted Gaussian peaks (of C\(^{18}\)O 2-1) are first standardized; no other scaling is applied. The best DBSCAN parameters are found by maximizing the mean silhouette coefficient within a reasonable range. DBSCAN (set up with the best parameters) finds 12 components, compared to 10 found by FoF. A sketch of this comparison is given below. movie_DBScan and movie_DBScan_linewidth in the Google Drive folder show the result in the original RA-Dec-velocity space; the smaller, black data points are those DBSCAN categorizes as the "noisy samples".
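For concreteness, here is a minimal sketch of the FoF-versus-DBSCAN comparison described above, assuming the fitted Gaussian peaks are available as an (n, 3) array of PPV coordinates. The placeholder data, the standardized linking length, and the parameter grid are illustrative stand-ins, not the values used in the actual analysis:

# Minimal sketch of the FoF vs. DBSCAN comparison (placeholder data:
# in practice `peaks` would hold the RA-Dec-velocity positions of the
# fitted Gaussian peaks).
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

peaks = np.random.default_rng(0).normal(size=(500, 3))  # placeholder PPV points

def fof_labels(points, linking_length):
    # Friends-of-friends: link all pairs closer than the linking length,
    # then take connected components of the resulting graph as clusters.
    tree = cKDTree(points)
    pairs = tree.query_pairs(r=linking_length, output_type='ndarray')
    n = len(points)
    graph = csr_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
                       shape=(n, n))
    return connected_components(graph, directed=False)[1]

# Standardize RA, Dec, and velocity so the three axes are comparable;
# this is the "standardized dataset" on which both silhouette scores
# are computed.
X = StandardScaler().fit_transform(peaks)

# FoF in the standardized space. NOTE: the 1 km/s/pc threshold quoted
# above is in physical units; the value here is an illustrative stand-in.
fof = fof_labels(X, linking_length=0.3)
if len(set(fof)) > 1:
    print("FoF mean silhouette:   ", silhouette_score(X, fof))

# Scan DBSCAN's eps for the best mean silhouette coefficient (the grid
# and min_samples are illustrative, not the values actually used).
best = None
for eps in np.linspace(0.1, 1.0, 19):
    labels = DBSCAN(eps=eps, min_samples=5).fit(X).labels_
    kept = labels != -1  # exclude DBSCAN's "noisy samples" from the score
    if len(set(labels[kept])) > 1:
        score = silhouette_score(X[kept], labels[kept])
        if best is None or score > best[0]:
            best = (score, eps, labels)

if best is not None:
    print("DBSCAN mean silhouette:", best[0], "at eps =", best[1])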
Galileo

Alberto Pepe

and 1 more

INTRODUCTION

In the early 1600s, Galileo Galilei turned a telescope toward Jupiter. In his log book each night, he drew to-scale schematic diagrams of Jupiter and some oddly-moving points of light near it, and he labeled each drawing with the date. Eventually he used his observations to conclude that the Earth orbits the Sun, just as the four Galilean moons orbit Jupiter. History shows Galileo to be much more than an astronomical hero, though. His clear and careful record keeping and publication style not only let Galileo understand the Solar System; it continues to let _anyone_ understand _how_ Galileo did it. Galileo's notes directly integrated his DATA (drawings of Jupiter and its moons), key METADATA (timing of each observation, weather, telescope properties), and TEXT (descriptions of methods, analysis, and conclusions). Critically, when Galileo included the information from those notes in _Sidereus Nuncius_, this integration of text, data and metadata was preserved, as shown in Figure 1. Galileo's work advanced the "Scientific Revolution," and his approach to observation and analysis contributed significantly to the shaping of today's "Scientific Method".

Today most research projects are considered complete when a journal article based on the analysis has been written and published. The trouble is that, unlike Galileo's report in _Sidereus Nuncius_, the amount of real data and data description in modern publications is almost never sufficient to repeat, or even statistically verify, the study being presented. Worse, researchers wishing to build upon and extend work presented in the literature often have trouble recovering the data associated with an article after it has been published. More often than scientists would like to admit, they cannot even recover the data associated with their own published works.

Complicating the modern situation, the words "data" and "analysis" have a wider variety of definitions today than at the time of Galileo. Theoretical investigations can create large "data" sets through simulations (e.g. The Millennium Simulation Project). Large-scale data collection often takes place as a community-wide effort (e.g. The Human Genome Project), which leads to gigantic online "databases" (organized collections of data). Computers are so essential in simulations, and in the processing of experimental and observational data, that it is often hard to draw a dividing line between "data" and "analysis" (or "code") when discussing the care and feeding of "data." Sometimes, a copy of the code used to create or process data is so essential to the use of those data that the code should almost be thought of as part of the "metadata" description of the data. Other times, the code used in a scientific study is more separable from the data, but even then, many preservation and sharing principles apply to code just as well as they do to data.

So how do we go about caring for and feeding data? Extra work, no doubt, is associated with nurturing your data, but care up front will save time and increase insight later. Even though a growing number of researchers, especially in large collaborations, know that conducting research with sharing and reuse in mind is essential, it still requires a paradigm shift: most people are still motivated by piling up publications and by getting to the next one as soon as possible. But the more we scientists find ourselves wishing we had access to extant but now unfindable data, the more we will realize why bad data management is bad for science.
How can we improve? THIS ARTICLE OFFERS A SHORT GUIDE TO THE STEPS SCIENTISTS CAN TAKE TO ENSURE THAT THEIR DATA AND ASSOCIATED ANALYSES CONTINUE TO BE OF VALUE AND TO BE RECOGNIZED. In just the past few years, hundreds of scholarly papers and reports have been written on questions of data sharing, data provenance, research reproducibility, licensing, attribution, privacy, and more --- but our goal here is _not_ to review that literature. Instead, we present a short guide intended for researchers who want to know why it is important to "care for and feed" data, with some practical advice on how to do that. The Appendices at the close of this work offer links to the types of services referred to throughout the text. BOLDFACE LETTERING below highlights actions one can take to follow the suggested rules.
All1

Hope How-Huan Chen

and 1 more

ABSTRACT. ρ Ophiuchi is a group of five B-stars embedded in the nearby Ophiuchus molecular cloud, at a distance of ∼119 pc. A “bubble”-like structure is found in dust thermal emission around ρ Oph, and the circular structure on the Hα map further indicates that this bubble is physically connected to the source at its center. The goal of this paper is to estimate the impact of feedback from these embedded B-stars on the molecular cloud, by comparing the energy associated with the material entrained in the bubble to the total turbulent energy of the cloud. We combine data from the COMPLETE Survey, which includes ¹²CO (1-0) and ¹³CO (1-0) molecular line emission from FCRAO, an extinction map derived from 2MASS near-infrared data using the NICER algorithm, and far-infrared data from IRIS (60/100 μm), with data from the Herschel Science Archive (PACS 100/160 μm and SPIRE 250/350/500 μm). With this wealth of data tracing different components of the cloud, we try to determine the best strategy to derive physical properties and to estimate the energy budget in the shell and in the cloud. We also experiment with the hierarchical Bayesian fitting technique introduced by , in an effort to eliminate the bias in the derived column densities and/or temperatures induced by noise in the far-IR data. We find that the energy entrained in the bubble is ∼12% of the total turbulent energy of the Ophiuchus molecular cloud. This fraction is similar to the number given for the Perseus molecular cloud, and it suggests that B-stars play a non-negligible role in driving the turbulence in clouds. We expect that a complete survey of “bubbles” in the Ophiuchus cloud will reveal the importance of B-star winds in molecular clouds.
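For reference, the comparison described above amounts to an energy ratio of the following form (a sketch in my own notation, assuming the standard kinetic-energy estimates rather than the paper's exact method; \(M_{\rm shell}\) and \(v_{\rm exp}\) are the mass and expansion velocity of the entrained material, and \(M_{\rm cloud}\) and \(\sigma_v\) are the cloud mass and turbulent velocity dispersion):

\[ \frac{E_{\rm bubble}}{E_{\rm turb}} \approx \frac{\tfrac{1}{2}\, M_{\rm shell}\, v_{\rm exp}^{2}}{\tfrac{1}{2}\, M_{\rm cloud}\, \sigma_{v}^{2}} \sim 0.12 \]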
1nessie findingchart

Alyssa Goodman

and 10 more

ABSTRACT The very long, thin infrared dark cloud Nessie is even longer than previously claimed, and an analysis of its Galactic location suggests that it lies directly in the Milky Way’s mid-plane, tracing out a highly elongated, bone-like feature within the prominent Scutum-Centaurus spiral arm. Re-analysis of mid-infrared imagery from the Spitzer Space Telescope shows that this IRDC is at least 2, and possibly as many as 8, times longer than originally claimed by Nessie’s discoverers ; its aspect ratio is therefore at least 150:1, and possibly as large as 800:1. A careful accounting for both the Sun’s offset from the Galactic plane (∼25 pc) and the Galactic center’s offset from the \((l^{\rm II}, b^{\rm II}) = (0, 0)\) position defined by the IAU in 1959 shows that the latitude of the true Galactic mid-plane at the 3.1 kpc distance of the Scutum-Centaurus Arm is not b = 0 but instead closer to b = −0.5, which is the latitude of Nessie to within a few pc. Apparently, Nessie lies _in_ the Galactic mid-plane. An analysis of the radial velocities of low-density (CO) and high-density (\({\rm NH}_3\)) gas associated with the Nessie dust feature suggests that Nessie runs along the Scutum-Centaurus Arm in position-position-velocity space, which means it likely forms a dense ‘spine’ of the arm in real space as well. No galaxy-scale simulation to date has the spatial resolution to predict a Nessie-like feature, but extant simulations do suggest that highly elongated over-dense filaments should be associated with a galaxy’s spiral arms. Nessie is situated in the closest major spiral arm to the Sun toward the inner Galaxy and appears almost perpendicular to our line of sight, making it the easiest feature of its kind to detect from our location (a shadow of an Arm’s bone, illuminated by the Galaxy beyond). Although the Sun’s ∼25 pc offset from the Galactic plane is not large in comparison with the half-thickness of the plane as traced by Population I objects such as GMCs and HII regions (∼200 pc; ), it may be significant compared with the extremely thin layer that might be traced out by Nessie-like “bones” of the Milky Way. Future high-resolution extinction and molecular-line data may therefore allow us to exploit the Sun’s position above the plane to gain a (very foreshortened) view “from above” of the dense gas in the Milky Way’s disk and its structure.
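As a quick back-of-the-envelope check on the quoted mid-plane latitude (the 25 pc solar offset and the 3.1 kpc arm distance are from the abstract; the small-angle expression itself is an assumption of this sketch):

\[ b_{\rm mid} \approx -\arctan\!\left(\frac{z_{\odot}}{d}\right) = -\arctan\!\left(\frac{25~{\rm pc}}{3100~{\rm pc}}\right) \approx -0.46^{\circ}, \]

consistent with the quoted value of b ≈ −0.5.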
Link volume

Alberto Pepe

and 4 more

We analyze the data sharing practices of astronomers over the past fifteen years. An analysis of URL links embedded in papers published by the American Astronomical Society reveals that the total number of links in the literature rose dramatically from 1997 until 2005, when it leveled off at around 1500 per year. This rise indicates an increased interest in data sharing over the same period in which the web saw its most dramatic growth in usage in the developed world. The analysis also shows that the availability of linked material decays with time: in 2011, 44% of links published a decade earlier, in 2001, were broken. A rough analysis of link types reveals that links to data hosted on astronomers’ personal websites become unreachable much faster than links to datasets on curated institutional sites. To further gauge astronomers’ current data sharing practices and preferences, we performed in-depth interviews with 12 scientists and online surveys with 173 scientists, all at a large astrophysical research institute in the United States: the Harvard-Smithsonian Center for Astrophysics in Cambridge, MA. Both the in-depth interviews and the online survey indicate that, in principle, there is no philosophical objection to data sharing among astronomers at this institution, and nearly all astronomers would share as much of their data as others wanted if it were practicable. Key reasons that more data are not shared more efficiently in astronomy include: the difficulty of sharing large data sets; over-reliance on non-robust, non-reproducible mechanisms for sharing data (e.g. emailing it); unfamiliarity with options that make data sharing easier (faster) and/or more robust; and, lastly, a sense that other researchers would not want the data to be shared. We conclude with a short discussion of a new effort to implement an easy-to-use, robust system for data sharing in astronomy, at theastrodata.org, and we analyze the uptake of that system to date.
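A check of link availability like the one described above could be sketched as follows (a minimal illustration; the paper's actual link-extraction pipeline, retry policy, and URL sample are not reproduced here, and the URLs below are placeholders):

# Minimal link-liveness check; `urls` stands in for the links extracted
# from articles of a given publication year (placeholder values).
import requests

urls = ["http://example.com/dataset1", "http://example.com/dataset2"]

broken = 0
for url in urls:
    try:
        # Try HEAD first to avoid downloading large datasets; fall back
        # to GET because some servers reject HEAD requests.
        r = requests.head(url, allow_redirects=True, timeout=10)
        if r.status_code >= 400:
            r = requests.get(url, allow_redirects=True, timeout=10, stream=True)
        alive = r.status_code < 400
    except requests.RequestException:
        alive = False
    broken += not alive

print(f"{broken}/{len(urls)} links broken ({100 * broken / len(urls):.0f}%)")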

Alberto Pepe

and 1 more

If you are a scholar and haven't spent the last ten years in a vacuum, you have heard of Open Access: the emerging practice of providing unrestricted access to peer-reviewed scholarly works, such as journal articles, conference papers, and book chapters. Open Access comes in two flavors, _green_ and _gold_, depending on how an article is made available to the public. The "green road" to Open Access involves an author making her article publicly available after publication, e.g. by depositing the article's post-print in an open institutional repository. For many, however, the preferred avenue to Open Access is the "gold road": an author publishes an article directly in an OA journal. THE FACT THAT OPEN ACCESS, REGARDLESS OF ITS FLAVOR, HAS INNUMERABLE BENEFITS FOR RESEARCHERS AND THE PUBLIC AT LARGE IS BEYOND DISCUSSION --- even the most traditional scholarly publishers would have to agree. Importantly, the vision of universal Open Access to scholarly knowledge, i.e., the idea that the entire body of published scholarship should be made available to everyone free of charge, is not too far-fetched. In practice, through a combination of green and gold OA practices, this vision is already a reality in some scientific fields, such as physics and astronomy. So: Open Access is both fundamentally necessary and bound to happen. BUT WHETHER OPEN ACCESS, ALONE, CAN GUARANTEE REPRODUCIBILITY AND TRANSPARENCY OF RESEARCH RESULTS IS A DIFFERENT AND COMPELLING QUESTION. Do research articles contain enough information to exactly (or even approximately) replicate a scientific study? Unfortunately, very often the answer is no. As science, and scholarship in general, inevitably become more computational in nature, the experiments, calculations, and analyses performed by researchers are too many and too complex to be described in detail in a research article. As such, the minutiae of research activity are often hidden from view, making science unintelligible and irreproducible not only for the public at large, but also for scientists and experts --- paradoxically, even for the very scientists who conducted the research in the first place, if they have not documented their exact workflows elsewhere. A parallel movement to Open Access --- Open Science --- is building momentum in scholarly circles. Its mission is to provide open, universal access to the full sources of scientific research.
1nessie findingchart

Alyssa Goodman

and 6 more

Instructions for Co-Authors

The full file repository for this paper is in a shared Google Drive directory, https://drive.google.com/#folders/0BxIRxiTe1u6BcGlnUGt2ckU1Vms, shared with all co-authors. NOTE: THE “AAS” (PRESS CONFERENCE) SLIDES AT HTTPS://DRIVE.GOOGLE.COM/#FOLDERS/0BXIRXITE1U6BRKLQRZLUAUNUUUU GIVE A BETTER IDEA OF WHERE THIS DRAFT IS GOING THAN THE TEXT/FIGURES HERE AS OF NOW... AG WILL UPDATE ALL BY C. 1/1/13!

The Mendeley library “Nessie and Friends” used to house the references for this work, at http://www.mendeley.com/groups/2505711/nessie-and-friends/, but since Authorea works more directly with ADS links, we’ll use the ADS Private Library at http://adsabs.harvard.edu/cgi-bin/nph-abs_connect?library&libname=Nessie+and+Friends&libid=488e32b08b instead. The Mendeley library is the source of the nessie.bib file in the “Bibliography” folder here on Authorea, but I am not sure how to get the ADS references out as a .bib file. xxAlberto?xx

The Glue software used to intercompare the data sets in this work is online at http://glue-viz.readthedocs.org/en/latest/

We are using Authorea.com as an experimental platform to compile this paper. The manual steps we will need to take before submission include:
- download the LaTeX file
- modify the LaTeX file to use the AAS macros
- insert needed information (e.g. about authors, running header) into the AAS version of the LaTeX manuscript
- extract the needed figures from the relevant folders here and bundle them with the LaTeX manuscript and macros
- create a .bib file from the ADS Private Library
- add the .bib file to the folder with the manuscript and figures
- fix the in-line referencing so that the \citet and \citep commands work