ROUGH DRAFT authorea.com/104972

# Scientific Justification

The ‘Scientific Justification’ section of the proposal (see Section 9.1) should include a description of the scientific investigations that will be enabled by the final data products, and their importance.

6 page limit, total proposal + figures can be 11.

One of the most powerful observational tools for constraining the physics of galaxy formation and evolution is morphology. The structural features of a galaxy are known to correlate closely with its physical properties; e.g., the links between star formation rate and Hubble type (Masters et al., 2010; Bundy et al., 2010; Schawinski et al., 2014) or spiral arms (Willett et al., 2015), between bars and AGN (Oh et al., 2012; Hao et al., 2009; Galloway et al., 2015), and between bars and atomic gas content (Masters et al., 2012). [lots more possibilities of examples - help with more non-galaxy zoo examples?] The demographics of most morphological features are not, in general, constant with redshift. This is not surprising, given that key elements involved in the formation of galaxies also change as the Universe evolves; e.g., star formation is known to peak at $$z\sim1$$ and drop steadily thereafter.

[few paragraphs of more descriptive examples of how galaxy physics is related to morphology + reasons for studying $$0<z<2$$]

Obtaining morphological data for such large numbers of galaxies is a unique challenge: to date, no automated method produces morphologies that are both accurate and complete. The problem becomes more severe with increasing redshift, for two reasons. First, images of distant galaxies are less well resolved, making it difficult to distinguish finer features. Second, galaxy shapes become increasingly irregular in the early Universe, due to the higher merger rate and the clumpy nature of star formation. As large telescopes become more capable of imaging these distant galaxies, we continue to discover large-scale structures that do not exist at low $$z$$, which makes it difficult to define an automated categorization for these unique types. Until automated methods overcome these challenges, visual classification by humans remains the most accurate method of measuring galaxy morphology, especially for galaxies beyond the local Universe.

Visual classification is of course not without its own challenges, chiefly time and efficiency. While humans produce more accurate and complete classifications than a computer, the time required is prohibitive given the wealth of data becoming available from large surveys. The Galaxy Zoo project has developed an innovative method for overcoming the time drawback while maintaining the accuracy of visual classification. Displaying images of SDSS galaxies to volunteers via a simple and engaging web interface, Galaxy Zoo asks people to classify the images by eye. Within its first year, each of the $$\sim1$$ million SDSS galaxies had already been classified an average of 40 times through the efforts of hundreds of thousands of members of the general public, who provided $$\sim40$$ million classifications (Lintott et al., 2008; Fortson et al., 2012).

In 2010, Galaxy Zoo moved beyond the local Universe by including $$\sim100,000$$ HST galaxies in a project known as Galaxy Zoo: Hubble. All galaxies were classified at least 40 times by late 2012. This project enabled the first direct, morphologically accurate studies of the evolution of galaxies, several of which have already been completed with the preliminary data, including bar fraction with redshift (Cheung et al., 2014; Melvin et al., 2014) and passive disk fraction with redshift (Galloway et al., 2016). These represent only a small fraction of the scientific investigations possible with these data; disk/spheroid distinction, bars, spiral arms, clumpiness, and bulge dominance are only some of the morphological information provided by this catalog (for the full list see Figure [fig:decision tree]).

Our aim with this proposal is to develop the next phase of Galaxy Zoo: Hubble (GZH), which we will hereafter refer to as Galaxy Zoo: Hubble 2 (GZH2). The motivation for extending this project is twofold. First, although visual classification has been immensely successful thus far in obtaining robust morphologies for large ($$\sim100,000$$) samples of galaxies, automated methods have improved since the first release in the form of powerful machine-learning algorithms. These are still not capable on their own of accurate classification for galaxies at all redshifts; however, combining them with the current system of human classifications has been shown to reduce the classification time of galaxies by 80% (can we cite something/ provide a figure Melanie?), thereby significantly improving both the efficiency and accuracy of GZH classifications. The details of this process are explained in full in the Analysis Plan. Second, in addition to the original GZH galaxies, a further XX,XXX HST galaxies will be added to the project to be classified by this new method.

By combining machine learning with human classifications, GZH2 will provide the most morphologically accurate data for the widest redshift range (to $$z\sim1.2$$) currently available. These data will enable new studies of galaxy evolution at a level of morphological accuracy that has not previously been possible. With the funding from this proposal, our team will focus on two science cases: clumpy galaxies (need better zinger description) and the mass-metallicity relation.

# Science 1: Clumps

Things we know: Clumps are found more frequently in higher-redshift galaxies. They are believed to form through instabilities resulting from high gas fractions, and to lead to bursts of star formation. Simulations show that large clumps fall toward the center of the galaxy, forming classical bulges.

What we don’t know: It is unknown whether the instabilities are purely secular or whether mergers / local environment play a role in the creation of clumps. Understanding the role environment may have in clump formation may give insight into why we do find clumpy galaxies in the local Universe (albeit rarely).

There are several ways in which clumps tend to be distributed within galaxies. Elmegreen et al. (2005) define four categories: chains, in which the clumps align linearly; double clumps, in which the system is dominated by two similar clumps; tadpoles, in which the system is dominated by a single large clump that is off-center from a diffuse linear emission; and clump clusters, which consist of multiple clumps distributed asymmetrically.

Examples of all classes are found in the GZH data, using the following parameters:

• Chain: $$p_\mathrm{line}$$ and $$p_\mathrm{chain}$$

• Double clump: $$p_\mathrm{number\ clumps\ 2}$$ and $$p_\mathrm{bright\ clump\ no}$$

• Tadpole: $$p_\mathrm{number\ clumps\ 2}$$ and $$p_\mathrm{bright\ clump\ yes}$$ *or* $$p_\mathrm{number\ clumps\ 1}$$

• Clump clusters: ??

A more detailed exploration may reveal additional classes.
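As a rough illustration of how the parameter combinations listed above could be turned into candidate labels, the following Python sketch maps a galaxy's vote fractions onto the Elmegreen et al. (2005) classes. The dictionary keys are hypothetical names modeled on the parameters in the list, the 0.5 threshold is an arbitrary placeholder, and two readings are assumed: that $$p_\mathrm{line}$$ and $$p_\mathrm{chain}$$ are combined by summing, and that the tadpole rule groups as (two clumps and a bright clump) or (one clump). Clump clusters are left unassigned because no parameters have been identified for them yet.

```python
def candidate_classes(votes, threshold=0.5):
    """Return every candidate clump class whose rule a galaxy satisfies.

    `votes` maps hypothetical vote-fraction names to values in [0, 1].
    `threshold` is an illustrative placeholder, not a calibrated cut.
    """
    def p(key):
        # Missing vote fractions are treated as zero.
        return votes.get(key, 0.0)

    labels = []

    # Chain: combined line + chain vote fraction (assumed combination rule).
    if p("p_line") + p("p_chain") > threshold:
        labels.append("chain")

    two_clumps = p("p_number_clumps_2") > threshold

    # Double clump: two clumps, neither clearly brighter than the other.
    if two_clumps and p("p_bright_clump_no") > threshold:
        labels.append("double clump")

    # Tadpole: two clumps with one clearly brighter, or a single clump.
    if (two_clumps and p("p_bright_clump_yes") > threshold) or (
        p("p_number_clumps_1") > threshold
    ):
        labels.append("tadpole")

    # Clump clusters: no rule yet, so no label is assigned here.
    return labels
```

Returning a list rather than a single label avoids imposing a precedence between classes before the thresholds have been studied.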

With the GZH data, we would be able to identify classes of clump distributions in clumpy galaxies and investigate what physical processes may give rise to different clump morphologies. Elmegreen et al. (2005) acknowledge that the different morphological types may not imply physical differences; e.g., double clumps may simply be smaller versions of chains, or chain galaxies and clump clusters may be identical but viewed at different angles. They identified $$\sim100$$ galaxies per morphological type and did not find significant differences in the distributions of colors between the samples. With $$N_\mathrm{arrangement} > 10$$ and $$p_\mathrm{line} > 0.6$$, we find 700 galaxies that are highly likely to be of the chain morphological type; thus we can expect to have significantly larger samples to compare.
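The chain selection quoted above, and one possible way of comparing color distributions between samples, can be sketched in Python as follows. The cuts on `N_arrangement` and `p_line` are the ones stated in the text; the catalog layout (a list of dictionaries) and the `color` key are hypothetical. The two-sample Kolmogorov-Smirnov statistic is used only as one common choice for comparing distributions; it is not necessarily the test used by Elmegreen et al. (2005).

```python
def select_chains(catalog, min_votes=10, min_p_line=0.6):
    """Select likely chain galaxies using the cuts quoted in the text."""
    return [
        g for g in catalog
        if g["N_arrangement"] > min_votes and g["p_line"] > min_p_line
    ]


def ks_statistic(a, b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


if __name__ == "__main__":
    # Toy rows with made-up values, for illustration only.
    catalog = [
        {"N_arrangement": 25, "p_line": 0.8, "color": 0.4},
        {"N_arrangement": 5, "p_line": 0.9, "color": 0.7},
        {"N_arrangement": 30, "p_line": 0.2, "color": 0.9},
    ]
    chains = select_chains(catalog)
    others = [g for g in catalog if g not in chains]
    print(len(chains), ks_statistic(
        [g["color"] for g in chains], [g["color"] for g in others]
    ))
```

A statistic near 0 means the two color distributions are similar, while a value near 1 means they barely overlap; turning it into a significance level would require the sample sizes as well.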

Things to compare between types:

-Environment - nearest neighbor? Not sure what data exists

-Colors

-SFR, can we measure per clump...?

-Clump size; perhaps there are size constraints limiting how clumps can be arranged?

-redshift - how do distributions evolve spatially over time?

[paragraph about what has been done @ high redshift, how can we do similar/more with GZH]

Although clumpy galaxies are most abundant at high redshift, there is a significant population that exists locally ($$z < 0.2$$). Analysis of this population is ideal for revealing the finer substructure of the clumps, which have typical sizes of 100 pc to 1 kpc. HST/ACS imaging is capable of resolving [some size] at $$z=0.2$$, while SDSS can only resolve features of [this size] at the same redshift.
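The physical size corresponding to a given angular resolution at a given redshift follows from the angular diameter distance. The Python sketch below shows the conversion under stated assumptions: a flat $$\Lambda$$CDM cosmology with $$H_0 = 70$$ km/s/Mpc and $$\Omega_m = 0.3$$, and illustrative angular resolutions of roughly 0.1 arcsec for HST/ACS and 1.4 arcsec for SDSS seeing. None of these values come from this draft, and the resolutions in particular should be checked against the instrument documentation before the numbers are quoted.

```python
import math

C_KM_S = 299792.458                        # speed of light [km/s]
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # arcseconds per radian


def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Angular diameter distance [Mpc] in flat LambdaCDM (assumed parameters)."""
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        # 1 / E(z) for a flat universe with matter and a cosmological constant.
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Trapezoidal integration of dz / E(z) from 0 to z.
    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(k * dz) for k in range(1, steps))
    integral *= dz

    comoving_mpc = (C_KM_S / h0) * integral
    return comoving_mpc / (1.0 + z)


def resolved_size_kpc(theta_arcsec, z, **cosmo):
    """Physical size [kpc] subtended by an angle theta_arcsec at redshift z."""
    d_a_kpc = angular_diameter_distance_mpc(z, **cosmo) * 1000.0
    return d_a_kpc * theta_arcsec / ARCSEC_PER_RAD


if __name__ == "__main__":
    # Illustrative resolutions only (assumptions, not values from the draft).
    assumed_resolution_arcsec = {"HST/ACS": 0.1, "SDSS": 1.4}
    for name, theta in assumed_resolution_arcsec.items():
        print(f"{name}: {resolved_size_kpc(theta, 0.2):.2f} kpc at z = 0.2")
```

Because the conversion is linear in the angle, the ratio of resolvable sizes between the two surveys at fixed redshift is simply the ratio of their angular resolutions, independent of the assumed cosmology.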