The B5 Superhighway

Let's use Authorea to keep track of B5 materials...

  • FITS cubes: ready for Glue volume rendering.
    • The blue/green is \(C^{18}O\) (2-1), and red/orange is \(NH_3\) (1, 1).
    • The three slides have been uploaded.
    • The clustering is done with a code I wrote following the explanation in Alvaro's paper. The friends-of-friends (FoF) threshold in Alvaro's paper is 3 km s\(^{-1}\) pc\(^{-1}\) (~ 1 \(c_s\) per half beam). With the same threshold, the (extended) B5 would be grouped into a single component. The clustering in the movies below instead uses a threshold of 1 km s\(^{-1}\) pc\(^{-1}\) (one third of Alvaro's threshold), so that the data points are clustered into multiple components. This isn't too bad a choice, since Alvaro was using CO (1-0), which has a broader line width. In the movies below, the clustering is run on the combined Gaussian fit, which keeps one peak from the one-component fit where that fit has the smaller residual, and two peaks from the two-component fit where that fit has the smaller residual.
    • movie0_opaque_linewidth: a movie made from a 3D visualization of the Gaussian-fitted peaks, without friends-in-velocity (FIVE) clustering. Brighter/white circles mark where the (Gaussian-fitted) emission is higher. The circle size is scaled with the (Gaussian-fitted) line width.
    • movie0_transparent_linewidth: the same movie, with alpha.
    • movie_opaque: a movie made from a 3D visualization of the Gaussian-fitted peaks, with friends-in-velocity (FIVE) clustering. The circle size is NOT scaled with the line width.
    • movie_transparent: the same movie, with alpha.
    • movie_opaque_linewidth: a movie made from a 3D visualization of the Gaussian-fitted peaks, with friends-in-velocity (FIVE) clustering. The circle size is scaled with the line width.
    • movie_transparent_linewidth: the same movie, with alpha.
    • The clustering is done using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method in scikit-learn. DBSCAN should perform better than the FoF method (which is similar to K-Means). In practice, the mean silhouette coefficient supports this. The silhouette coefficient measures how well the clustering performs for each data point, ranging from -1 (the clustering is not appropriate for that data point) to 1 (the clustering is good). The mean silhouette score of DBSCAN (~ 0.06) is better than that of the FoF method (~ -0.26); to be fair, the FoF score is calculated on the same standardized dataset used in the DBSCAN analysis. DBSCAN also identifies a number of data points that cannot be clustered (the "noisy samples"). See the scikit-learn clustering page for an overview of various clustering algorithms.
    • To implement DBSCAN, the PPV (position-position-velocity) positions of the fitted Gaussian peaks (of \(C^{18}O\) (2-1)) are first standardized. No other scaling is applied. The best DBSCAN parameters are found by comparing the mean silhouette coefficient over a reasonable range of parameter values. DBSCAN (set up with the best parameters) finds 12 components, compared to 10 components found by FoF.
    • movie_DBScan and movie_DBScan_linewidth in the Google Drive folder show the result in the original RA-Dec-velocity space. The smaller, black data points are those categorized by DBSCAN as the "noisy samples".
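
The DBSCAN steps described above (standardize the PPV positions, scan a parameter range, keep the setup with the highest mean silhouette coefficient) can be sketched roughly as follows. This is a minimal illustration, not the code used for the movies: the function name `select_dbscan`, the parameter grids, and the synthetic three-blob data are all made up for the example, and treating the noisy samples as their own group when computing the silhouette score is an assumption.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def select_dbscan(ppv, eps_grid, min_samples_grid):
    """Standardize PPV points and keep the DBSCAN run with the highest mean silhouette."""
    X = StandardScaler().fit_transform(ppv)  # zero mean, unit variance per axis
    best = None
    for eps in eps_grid:
        for min_samples in min_samples_grid:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            n_labels = len(set(labels))
            n_clusters = n_labels - (1 if -1 in labels else 0)
            # silhouette_score requires 2 <= n_labels <= n_samples - 1
            if n_clusters < 2 or n_labels > len(X) - 1:
                continue
            # Assumption: noisy samples (label -1) are scored as their own group.
            score = silhouette_score(X, labels)
            if best is None or score > best["score"]:
                best = {
                    "eps": eps,
                    "min_samples": min_samples,
                    "score": score,
                    "labels": labels,
                    "n_clusters": n_clusters,
                }
    return best


# Synthetic stand-in for (RA, Dec, velocity) positions of fitted peaks.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 2.0]])
ppv = np.vstack([c + 0.05 * rng.standard_normal((50, 3)) for c in centers])

# Hypothetical parameter grids; a real search would cover a range suited to the data.
result = select_dbscan(ppv, eps_grid=[0.2, 0.3, 0.5], min_samples_grid=[5, 10])
n_noise = int(np.sum(result["labels"] == -1))
print(result["eps"], result["min_samples"], result["n_clusters"], n_noise)
```

Because DBSCAN labels unclustered points as -1, the number of components is the number of distinct labels minus the noise label, and the noisy samples can be picked out with `labels == -1` for plotting as smaller, black points.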