Authorea

Kyle Willett edited Analysis Plan.tex about 8 years ago

Commit id: 2ea22a6050ff2166035950c3699a53828d47be9d


GZH2 will also use FERENGI images to correct for redshift bias, but the process will incorporate several improvements over the GZH method. First, the SDSS galaxies included in the FERENGI sample will be selected to better overlap the HST galaxies in their surface brightness and redshift distributions. Because corrections to the vote fractions were calculated in discrete bins of surface brightness and redshift, it was necessary to have a large number of FERENGI images in \emph{each} bin. In GZH, the distributions of surface brightness and redshift of the FERENGI images were offset from the space occupied by the HST galaxies (see Figure~\ref{fig:eyeofsauron}); as a result, corrections could not be applied to the vote fractions of the many HST galaxies that fell in regions of surface brightness-redshift space not covered by the FERENGI data. Because of this limitation, 25\% of the GZH sample could not be corrected for redshift bias. The new FERENGI sample will be selected to maximize the overlap in this space, so that the maximum number of galaxies in GZH2 can be corrected. The FERENGI method for debiasing vote fractions was also limited in GZH by the structure of the decision tree. As described above, the GZH question tree only required a minimum number of users to classify each galaxy; it did not require a minimum number of users to answer each question. This problem is doubly significant for the FERENGI galaxies, since the vote fractions for a galaxy in \emph{both} its low-redshift and high-redshift images must be statistically significant; in other words, for each data point analyzed in FERENGI, enough users must answer a given question for each of the two images. This strict requirement is not met by most pairs of galaxy images in FERENGI for higher-tier questions.
For this reason, there were not enough FERENGI images in each surface brightness-redshift bin to measure a relationship between the vote fractions at high and low redshift, and therefore only the vote fractions for the first question in GZH could be corrected for redshift bias. In GZH2, the FERENGI sample will be classified using the new decision tree method, which will ensure that enough data are obtained for all questions in the tree; as a result, vote fractions for all morphological features will be corrected.

\item In-browser fitting tool

In addition to the discrete labels applied to galaxy morphology in visual classification (including Galaxy Zoo), fitting the galaxy light profile to an analytic model allows for a quantitative decomposition of the light into various components. The simplest distinction between early- and late-type galaxies, for example, is whether a radially averaged light profile is better fit by $I(r)\sim e^{\alpha(r/r_e)^{1/4}}$ (for elliptical galaxies; commonly known as a de~Vaucouleurs profile) or $I(r)\sim e^{\alpha(r/r_e)}$ (for disk galaxies; an exponential profile). Many galaxies with both bulge and disk components can be well fit with a linear combination of the two profiles (e.g., as adopted in the SDSS model magnitude pipelines). For galaxies with a fixed (or otherwise known) mass-to-light ratio, this provides a simple way of characterizing the distribution of luminous matter in a disk that can be more easily compared to theory and simulations, while also testing the scale invariance of the various components over a range of effective radii ($r_e$). Extraction of galaxy structural components through automatic fitting has been implemented in a variety of codes, among the most widely used of which is GALFIT \citep{Peng_2002}.
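
For reference, both profiles quoted above are special cases of the S\'ersic profile, which is the usual way of writing them with a single shape parameter. The form below is a standard textbook expression rather than anything specific to this project; the symbols $I_e$, $n$ and $b_n$ are introduced here for illustration, and $-b_n$ plays the role of the constant $\alpha$ used above:
\begin{equation}
I(r) = I_e \exp\left\{ -b_n \left[ \left( \frac{r}{r_e} \right)^{1/n} - 1 \right] \right\},
\qquad b_n \simeq 2n - \frac{1}{3},
\end{equation}
where $I_e$ is the surface brightness at the effective radius $r_e$. Setting $n=4$ ($b_4 \simeq 7.67$) recovers the de~Vaucouleurs profile and $n=1$ ($b_1 \simeq 1.68$) the exponential profile. A simple bulge$+$disk decomposition is then the sum
\begin{equation}
I(r) = I_{\rm bulge}(r;\, n=4) + I_{\rm disk}(r;\, n=1),
\end{equation}
with a separate $I_e$ and $r_e$ for each component.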
The original version of GALFIT models azimuthal 2-D profiles for an arbitrary number of components, but is also capable of handling a variety of radial functional forms in addition to more irregular morphologies (such as warps or boxy components). Advanced versions of GALFIT \citep{Peng_2010} provide additional profile components often seen in high-resolution images of galaxies, such as truncated shapes, rings, irregular morphologies, and power-law spirals. Automatically fitting parametric models to galaxy images is an extremely powerful method of quantifying the morphology and the relationship between a galaxy's dynamical state and its luminosity.

One of the major challenges to this approach, however, is running unsupervised decomposition codes on large samples of images. While a number of error minimization algorithms can be used to determine the ``best fit'' of a model to an image, galaxies with extensive structure (especially in HST images with sub-arcsecond resolution) often require a large number of model components, and the appropriate number is impossible to determine \textit{a priori}. In addition, the accuracy of the fitted model parameters (e.g., from $\chi^2$ minimization) can be sensitive to relatively small changes in the initial conditions. For example, a model may pick unusual and likely unphysical radial profiles to achieve a formal minimization of the residuals if the center position of the galaxy is off, sometimes by as little as a few pixels. Finally, galaxies are real objects observed alongside (and sometimes obscuring or overlapping) many other objects; these can include image artifacts, foreground stars in the Milky Way, and nearby companion galaxies. Deciding which of the objects should be modeled (and which components should be used) is a non-trivial task for real data.
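
For concreteness, the quantity minimized in fits of this kind is conventionally the reduced $\chi^2$ between the image and the model. A generic form is given below; the symbols $f_{\rm data}$, $f_{\rm model}$, $\sigma$, $N_{\rm dof}$, $n_x$ and $n_y$ are introduced here for illustration and are not taken from a specific pipeline:
\begin{equation}
\chi^2_\nu = \frac{1}{N_{\rm dof}} \sum_{x=1}^{n_x} \sum_{y=1}^{n_y}
\frac{\left[ f_{\rm data}(x,y) - f_{\rm model}(x,y) \right]^2}{\sigma(x,y)^2},
\end{equation}
where the sums run over the pixels of an $n_x \times n_y$ image, $\sigma(x,y)$ is the per-pixel uncertainty, and $N_{\rm dof}$ is the number of degrees of freedom (pixels minus free parameters). The numerator is the square of the residual image, $f_{\rm data} - f_{\rm model}$. A small offset in the assumed galaxy center therefore changes the residual in many pixels at once, which is one way a formally lower $\chi^2_\nu$ can be reached with unphysical profile parameters.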
This approach benefits significantly from the interaction of humans with the modeling software. While the profile-fitting code is capable of quickly matching the light and providing instant feedback (through both the output model and the residuals left after subtracting it from the image), the pattern recognition capabilities of humans are critical for reaching a good solution: they specify the number and type of components and keep the model values within reasonable physical bounds. For example, citizen scientists have already demonstrated the ability to distinguish between complicated models of galaxies in $N$-body simulations of mergers \citep{Holincheck_2016}. We will extend the current methods of Galaxy~Zoo morphological measurements by adding a separate task in which users manipulate an analytic model of the galaxy light profile in the browser and assess the goodness-of-fit by examining the residuals in real time. The current model (while still in alpha development) runs entirely within the browser, yet can make accurate fits to the light profile through efficient use of fast Fourier transforms and the fitsjs library \citep{444f7c4f-f54a-4397-a97a-3699a5c6364e}. \textbf{Need figure; may be problem, since the heroku app currently isn't working. Discuss how many images this will be run on and what science we'll do with it.}
\end{enumerate}
\item Description of final catalog, how to be used by public
\item Timeline