
Alyssa Goodman

and 2 more

Let's use Authorea to keep track of B5 materials...

fits cubes: ready for Glue volume rendering.

YT movies: the blue/green is \(C^{18}O\) (2-1), and the red/orange is \(NH_3\) (1, 1).

keynote science: the three slides have been uploaded.

Alvaro-style 3D movies (clustering by the FoF method): The clustering is done using a code I wrote following the explanation in Alvaro's paper. The friends-of-friends threshold in Alvaro's paper is 3 km s\(^{-1}\) pc\(^{-1}\) (~1 \(c_s\)/half beam). Using that threshold, the (extended) B5 would be grouped into one single component. The clustering in the movies below therefore uses a threshold of 1 km s\(^{-1}\) pc\(^{-1}\) (one third of Alvaro's threshold), in order to split the data points into multiple components (which isn't too bad a choice, since Alvaro was using CO (1-0), which has a broader line width). In the movies below, the clustering is run on the combined Gaussian fit: a position contributes one peak from the one-component Gaussian fit if the one-component fit has the smaller residual, and two peaks from the two-component fit if the two-component fit has the smaller residual.

- movie0_opaque_linewidth: 3D visualization of the Gaussian-fitted peaks without friends-in-velocity (FIVE) clustering. Brighter/white circles are where the (Gaussian-fitted) emission is higher; the size scales with the (Gaussian-fitted) line width.
- movie0_transparent_linewidth: the same movie, with alpha.
- movie_opaque: 3D visualization of the Gaussian-fitted peaks with friends-in-velocity (FIVE) clustering. The size is NOT scaled with the line width.
- movie_transparent: the same movie, with alpha.
- movie_opaque_linewidth: 3D visualization of the Gaussian-fitted peaks with FIVE clustering; the size scales with the line width.
- movie_transparent_linewidth: the same movie, with alpha.

Clustering by the DBSCAN method: The clustering is done using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) implementation in scikit-learn. DBSCAN should perform better than the FoF method, which behaves similarly to K-Means. In practice, the mean silhouette coefficient, which measures how well the clustering performs (it ranges from -1 to 1 for each data point, with -1 meaning the clustering is not appropriate for that point and 1 meaning it is good), shows that the DBSCAN result (mean silhouette score ~0.06) is better than the FoF result (mean silhouette score ~ -0.26; to be fair, the FoF score is calculated on the same standardized dataset used in the DBSCAN analysis). DBSCAN also identifies a number of data points that cannot be clustered (the "noisy samples"). See the scikit-learn clustering page for an overview of various clustering algorithms.

To implement the DBSCAN method, the PPV (position-position-velocity) positions of the fitted Gaussian peaks (of \(C^{18}O\) 2-1) are first standardized; no other scaling is applied. The best parameters for setting up DBSCAN are found by maximizing the mean silhouette coefficient within a reasonable range, as in the sketch below. DBSCAN, set up with the best parameters, finds 12 components, compared to the 10 components found by FoF. movie_DBScan and movie_DBScan_linewidth in the Google Drive folder show the result in the original RA-Dec-velocity space; the smaller, black data points indicate those categorized by DBSCAN as the "noisy samples".
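For concreteness, here is a minimal Python sketch of the standardize-then-cluster procedure described above, using scikit-learn. This is not the actual analysis code: the `peaks` array, the `cluster_ppv_peaks` helper, the parameter grid, and the choice to mask the "noisy samples" before computing the silhouette score are all assumptions made for illustration.

```python
# A minimal sketch of the DBSCAN + silhouette selection described above;
# the parameter grid and noise handling are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score

def cluster_ppv_peaks(peaks, eps_grid=np.linspace(0.1, 1.0, 19),
                      min_samples_grid=(5, 10, 20)):
    """Standardize PPV peak positions, then pick the DBSCAN parameters
    that maximize the mean silhouette coefficient over a small grid.

    `peaks` is assumed to be an (N, 3) array of fitted Gaussian peak
    positions in (RA, Dec, velocity).
    """
    # Standardize each coordinate; no other scaling is applied.
    X = StandardScaler().fit_transform(peaks)

    best_score, best_labels = -np.inf, None
    for eps in eps_grid:
        for min_samples in min_samples_grid:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            clustered = labels != -1          # DBSCAN flags "noisy samples" as -1
            n_clusters = len(set(labels[clustered]))
            if n_clusters < 2:                # silhouette needs >= 2 clusters
                continue
            # Score only the clustered points; lumping all noise points into
            # a single -1 "cluster" would distort the mean silhouette.
            score = silhouette_score(X[clustered], labels[clustered])
            if score > best_score:
                best_score, best_labels = score, labels
    return best_labels, best_score

# For the like-for-like comparison quoted above, score the FoF labels on the
# same standardized coordinates:
#   fof_score = silhouette_score(StandardScaler().fit_transform(peaks), fof_labels)
```

Masking the noise points before scoring is one possible convention; what matters for the comparison above is that the FoF labels are scored on the same standardized dataset as the DBSCAN labels.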

Alberto Pepe

and 4 more

We analyze the data-sharing practices of astronomers over the past fifteen years. An analysis of URL links embedded in papers published by the American Astronomical Society reveals that the total number of links included in the literature rose dramatically from 1997 until 2005, when it leveled off at around 1500 per year. This rise indicates an increased interest in data sharing over the same period during which the web saw its most dramatic growth in usage in the developed world. The analysis also shows that the availability of linked material decays with time: in 2011, 44% of links published a decade earlier, in 2001, were broken. A rough analysis of link types reveals that links to data hosted on astronomers’ personal websites become unreachable much faster than links to datasets on curated institutional sites. To further gauge astronomers’ current data-sharing practices and preferences, we performed in-depth interviews with 12 scientists and online surveys with 173 scientists, all at a large astrophysical research institute in the United States: the Harvard-Smithsonian Center for Astrophysics in Cambridge, MA. Both the in-depth interviews and the online survey indicate that, in principle, there is no philosophical objection to data sharing among astronomers at this institution, and nearly all astronomers would share as much of their data as others wanted if it were practicable. Key reasons that more data are not shared more efficiently in astronomy include: the difficulty of sharing large data sets; over-reliance on non-robust, non-reproducible mechanisms for sharing data (e.g., emailing it); unfamiliarity with options that make data sharing easier (faster) and/or more robust; and, lastly, a sense that other researchers would not want the data to be shared. We conclude with a short discussion of a new effort to implement an easy-to-use, robust system for data sharing in astronomy, at theastrodata.org, and we analyze the uptake of that system to date.

Alberto Pepe

and 1 more

If you are a scholar and haven't spent the last ten years in a vacuum, you have heard of Open Access: the emerging practice of providing unrestricted access to peer-reviewed scholarly works, such as journal articles, conference papers, and book chapters. Open Access comes in two flavors, _green_ and _gold_, depending on how an article is made available to the public. The "green road" to Open Access involves an author making her article publicly available after publication, e.g., by depositing the article's post-print in an open institutional repository. According to many, however, the preferred avenue to Open Access is the "golden road," in which an author publishes an article directly in an OA journal. The fact that Open Access, regardless of its flavor, has innumerable benefits for researchers and the public at large is beyond discussion; even the most traditional scholarly publishers would have to agree. Importantly, the vision of universal Open Access to scholarly knowledge, i.e., the idea that the entire body of published scholarship should be made available to everyone free of charge, is not too far-fetched. In practice, through a combination of green and gold OA practices, this vision is already a reality in some scientific fields, such as physics and astronomy. So: Open Access is both fundamentally necessary and bound to happen.

But whether Open Access alone can guarantee reproducibility and transparency of research results is a different and compelling question. Do research articles contain enough information to exactly (or even approximately) replicate a scientific study? Unfortunately, the answer is very often no. As science, and scholarship in general, inevitably become more computational in nature, the experiments, calculations, and analyses performed by researchers are too many and too complex to be described in detail in a research article. As such, the minutiae of research activity are often hidden from view, making science unintelligible and irreproducible not only for the public at large, but also for scientists and experts, and, paradoxically, even for the very scientists who conducted the research in the first place, who may not have documented their exact workflows elsewhere. A parallel movement to Open Access, Open Science, is building momentum in scholarly circles. Its mission is to provide open, universal access to the full sources of scientific research.

Alyssa Goodman

and 6 more

Instructions for Co-Authors

The full file repository for this paper is in a shared Google Drive directory, https://drive.google.com/#folders/0BxIRxiTe1u6BcGlnUGt2ckU1Vms, shared with all co-authors. Note: the "AAS" (press conference) slides at https://drive.google.com/#folders/0BXIRXITE1U6BRKLQRZLUAUNUUUU give a better idea of where this draft is going than the text/figures here as of now... AG will update all by c. 1/1/13!

The Mendeley library "Nessie and Friends" used to house the references used in this work, at http://www.mendeley.com/groups/2505711/nessie-and-friends/, but since Authorea works more directly with ADS links, we'll use the ADS Private Library at http://adsabs.harvard.edu/cgi-bin/nph-abs_connect?library&libname=Nessie+and+Friends&libid=488e32b08b instead. The Mendeley library is the source of the nessie.bib file in the "Bibliography" folder here on Authorea, but I am not sure how to get the ADS references out as a .bib file. xxAlberto?xx

The Glue software used to intercompare the data sets used in this work is online at http://glue-viz.readthedocs.org/en/latest/

We are using Authorea.com as an experimental platform to compile this paper. The manual steps we will need to take before submission include:

- download the LaTeX file
- modify the LaTeX file to use the AAS macros
- insert needed information (e.g., about authors, running header) into the AAS version of the LaTeX manuscript
- extract the needed figures from the relevant folders here and bundle them with the LaTeX manuscript and macros
- create a .bib file from the ADS Private Library
- add the .bib file to the folder with the manuscript and figures
- fix in-line referencing so that the \citet and \citep commands work

Arzu Coltekin

and 1 more

PLEASE NOTE: This is an in-process DRAFT, not yet a publication. Thank you for understanding. Please email the authors if you have questions.

Abstract

Visualization is critical to the work of nearly all scientists, but not all scientists are equally adept at visualization. Software defaults and instinct often guide scientists’ visualization designs. However, not all software is optimized for visualization design, and instinct does not always lead us to good functional design choices. Most software is not developed in collaboration with design or science professionals, and software developers are not always trained in the cognitive and perceptual sciences needed to assess the implications of their design choices for humans. Furthermore, while instinct in visualization design can produce outcomes that please a personal aesthetic, it may not lead to visualizations that are correctly understood and interpreted. In other words, many scientists are unintentionally producing ineffective visualizations because they are not aware of the body of knowledge emerging from empirical studies of how people experience visualizations. There is a knowledge-transfer gap between the visualization research domain and those who ‘practice’ the art. This paper distills and synthesizes the visualization research literature into (potentially) practicable rules of thumb optimized for scientists and data journalists who are not trained in visualization design. Specifically, we offer the reader a set of questions to ask themselves before making visualization decisions.

Hope How-Huan Chen

and 5 more