ROUGH DRAFT authorea.com/12308

# Introduction

## Fake Galaxies Generation

The fake galaxy models are generated with the Sersic profile in the GalSim package, based on parameters from an input catalog. The procedure can be summarized in the following steps:

1. The input magnitude is converted into flux using the calibration extracted from the HSC pipeline, and the input effective radius is converted into pixel units (reff_pix).
2. Using the input flux and reff_pix, a GalSim.Sersic object is generated with its flux truncated at 10 times reff_pix. This keeps the output image at a reasonable size even when the Sersic index is high, without losing much flux at large radii.
3. Using the input axis ratio (b/a), a simple q=b/a shear is applied to the Sersic object to turn it into an elliptical model. Since GalSim preserves the total area under this shear, the actual major-axis half-light radius of the output model becomes reff_pix / sqrt(b/a).
4. Using the input position angle (PA), the elliptical Sersic object is rotated.
5. The model is convolved with the PSF image extracted from the HSC data products at the desired location.
6. An output image of the model is generated using the GalSim drawImage function.
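
The unit conversions in step 1 and the area-preserving geometry in step 3 can be sketched in plain Python. This is only an illustration, not the pipeline code: the zeropoint of 27.0, the pixel scale of 0.168 arcsec/pixel, the assumption that the catalog radius is given in arcsec, and all function names are hypothetical choices made for the example.

```python
import math


def mag_to_flux(mag, zeropoint):
    """Convert a magnitude into flux (image counts) for a given zeropoint."""
    return 10.0 ** (-0.4 * (mag - zeropoint))


def arcsec_to_pixels(reff_arcsec, pixel_scale):
    """Convert an effective radius from arcsec to pixels."""
    return reff_arcsec / pixel_scale


def sheared_semi_axes(reff_pix, axis_ratio):
    """Semi-major and semi-minor axes after an area-preserving shear.

    The shear keeps a * b = reff_pix**2 while setting b / a = axis_ratio,
    so a = reff_pix / sqrt(q) and b = reff_pix * sqrt(q).
    """
    root_q = math.sqrt(axis_ratio)
    return reff_pix / root_q, reff_pix * root_q


# Hypothetical inputs, chosen only to give round numbers.
flux = mag_to_flux(22.0, zeropoint=27.0)
reff_pix = arcsec_to_pixels(0.84, pixel_scale=0.168)
semi_major, semi_minor = sheared_semi_axes(reff_pix, axis_ratio=0.25)
```

With these inputs the circular model has reff_pix close to 5 pixels, and the sheared model has a semi-major axis of about 10 pixels, which is the reff_pix / sqrt(b/a) relation quoted in step 3.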

We have tested these GalSim models by running Galfit on images generated in exactly the same way. The results indicate that our models are reliable.

## Input Catalog

To make sure that the fake galaxies we inject into the images are as realistic as possible, we use the models of COSMOS galaxies from (Mandelbaum 2014). The catalog is based on Exponential (Exp), de Vaucouleurs (Dev), and single-component Sersic (Sersic) fits to galaxies with $$I_{F814W} \le 23.5\ mag$$ on the ACS high-resolution (0.03''/pix) images.

From the full COSMOS catalog, we select appropriate Exp, Dev, and Sersic models according to the following criteria:

1. Exp models: $$mag \le 23.0$$, $$b/a \ge 0.4$$, $$2.0 \le R_e \le 20.0$$, and $$MADEXP\_DEV > 1.0$$. This gives 9630 models in the catalog.
2. Dev models: $$mag \le 23.0$$, $$b/a \ge 0.5$$, $$2.0 \le R_e \le 20.0$$, and $$MADEXP\_DEV \le 1.0$$. This gives 6390 models in the catalog.
3. Sersic models: $$mag \le 23.0$$, $$b/a \ge 0.4$$, $$3.0 \le R_e \le 20.0$$, and $$0.8 \le n_{Sersic} \le 4.0$$. This gives 7523 models in the catalog.

MADEXP_DEV is the ratio of the MAD (median absolute deviation) of the Exp and Dev models. A MADEXP_DEV smaller than 1.0 indicates that the galaxy is more Exp-like; a value larger than 1.0 means it is more Dev-like. The cuts at low axis ratio and low Sersic index are applied simply because GalSim sometimes fails to generate such models within the maximum number of iterations allowed.
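
The three sets of selection cuts can be written as simple predicates. The sketch below copies the thresholds exactly as they appear in the list of criteria; the function and argument names are hypothetical and do not correspond to actual column names in the COSMOS catalog.

```python
def is_exp_model(mag, axis_ratio, r_e, madexp_dev):
    """Cuts for Exp models, as listed in the selection criteria."""
    return (mag <= 23.0 and axis_ratio >= 0.4
            and 2.0 <= r_e <= 20.0 and madexp_dev > 1.0)


def is_dev_model(mag, axis_ratio, r_e, madexp_dev):
    """Cuts for Dev models, as listed in the selection criteria."""
    return (mag <= 23.0 and axis_ratio >= 0.5
            and 2.0 <= r_e <= 20.0 and madexp_dev <= 1.0)


def is_sersic_model(mag, axis_ratio, r_e, n_sersic):
    """Cuts for single-Sersic models, as listed in the selection criteria."""
    return (mag <= 23.0 and axis_ratio >= 0.4
            and 3.0 <= r_e <= 20.0 and 0.8 <= n_sersic <= 4.0)


# Hypothetical catalog entries used only to exercise the predicates.
exp_ok = is_exp_model(mag=22.5, axis_ratio=0.6, r_e=5.0, madexp_dev=1.3)
dev_ok = is_dev_model(mag=22.5, axis_ratio=0.45, r_e=5.0, madexp_dev=0.8)
sersic_ok = is_sersic_model(mag=21.0, axis_ratio=0.7, r_e=2.5, n_sersic=2.0)
```

In this example the Dev candidate is rejected because its axis ratio is below 0.5, and the Sersic candidate is rejected because its effective radius is below 3.0.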

## Fake Galaxies Injection

At this point, we only work on single-frame images. A group of 22 "clean" images is selected from the visit=1236 COSMOS-UDEEP i-band data for this test. These images are from CCDs that are close to the center of the camera. We also visually inspect the images to ensure that contamination from bright saturated stars is minimal.

For each run, 50 models are randomly selected from the input catalog and injected into these 22 images at random pixel positions. The galaxies and pixel positions are the same for each CCD. The calibration parameters and PSF models are extracted at the exact X-Y locations and passed to the function that generates the fake galaxy image. Appropriate noise is also added to the models before we put them on the images. We make sure the random image coordinates are not too close to the edge, but make no special effort to avoid real objects on the images. The X-Y coordinates of these fake galaxies, along with their IDs, are recorded in the image headers.
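
Drawing one set of random positions that stays away from the CCD edges, and reusing it for every CCD, could look like the sketch below. The CCD dimensions (2048 x 4176 pixels), the 50-pixel margin, and the function name are assumptions made for the illustration; the draft does not state the values actually used.

```python
import random


def draw_positions(n_objects, width, height, margin, seed=None):
    """Draw random (x, y) pixel positions at least `margin` pixels from any edge."""
    rng = random.Random(seed)
    return [
        (rng.uniform(margin, width - margin),
         rng.uniform(margin, height - margin))
        for _ in range(n_objects)
    ]


# One set of 50 positions, drawn once and reused for every CCD in the run.
positions = draw_positions(50, width=2048, height=4176, margin=50, seed=1)
positions_per_ccd = {ccd: positions for ccd in range(22)}
```

Fixing the seed makes a run reproducible, which helps when the same positions need to be regenerated for the cross-match.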

After that, the fake-injected images are passed to the pipeline for source detection and photometric measurements. We cross-match the X-Y coordinates of the fake objects with those estimated by the pipeline using a maximum separation of 2 pixels. For fake objects that return multiple matches, we keep the one with the smallest separation (Claire has tried a different approach, which is to keep all the matched objects; it has a very small impact on the results). Meanwhile, we also keep a record of the fake objects without any match.
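
A minimal sketch of this nearest-neighbour cross-match is given below. It is a brute-force illustration rather than the code used for the test, and the function name and data layout (a dictionary of fake positions keyed by ID, a list of detected positions) are hypothetical.

```python
import math


def match_nearest(fake_xy, detected_xy, max_sep=2.0):
    """Match each fake object to its closest detection within max_sep pixels.

    Returns a dict mapping fake ID to the index of the matched detection,
    or to None when no detection lies within max_sep.
    """
    matches = {}
    for fake_id, (fx, fy) in fake_xy.items():
        best_index, best_sep = None, max_sep
        for index, (dx, dy) in enumerate(detected_xy):
            sep = math.hypot(dx - fx, dy - fy)
            if sep <= best_sep:
                best_index, best_sep = index, sep
        matches[fake_id] = best_index
    return matches


# Hypothetical positions: fake 1 has two candidates, fake 2 has none.
fake_xy = {1: (100.0, 200.0), 2: (500.0, 500.0)}
detected_xy = [(100.5, 200.0), (101.5, 200.0), (900.0, 900.0)]
matches = match_nearest(fake_xy, detected_xy)
unmatched = [fake_id for fake_id, index in matches.items() if index is None]
```

Here fake object 1 is matched to the detection 0.5 pixels away rather than the one 1.5 pixels away, and fake object 2 ends up in the unmatched list.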

To make sure that the input models sample the intrinsic distributions of key parameters of the COSMOS galaxy models, we repeat this process 9 times. The same model can be selected in different runs, but only rarely. In general, we have 420-440 different models for the Exp, Dev, and Sersic cases. For each model, the average, median, and standard deviation of important photometric parameters are estimated from all the detections (for most cases >15 out of 22) and are compared with the input values. Normally, for each run, 5-7% of the fake objects (22 CCDs x 50 models = 1100 fake objects) have no match within 2 pixels. Most of these cases are due to the faintness of the model and/or proximity to bright objects.
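
The per-model statistics can be illustrated with the standard library. The sketch assumes the matched detections are available as (model ID, measured value) pairs, and it uses the sample standard deviation; both are choices made for the example, and the magnitudes are made-up numbers rather than results.

```python
import statistics
from collections import defaultdict


def summarize_by_model(detections):
    """Group measurements by model ID and compute mean, median, and std."""
    grouped = defaultdict(list)
    for model_id, value in detections:
        grouped[model_id].append(value)
    summary = {}
    for model_id, values in grouped.items():
        summary[model_id] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # The sample standard deviation needs at least two detections.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Made-up CModel magnitudes for two models detected on several CCDs.
detections = [(101, 21.9), (101, 22.0), (101, 22.1), (101, 22.0), (202, 20.5)]
summary = summarize_by_model(detections)
offset_101 = summary[101]["mean"] - 22.0  # measured minus input magnitude

# Objects injected per run, and the range implied by a 5-7% unmatched rate.
n_injected = 22 * 50
n_unmatched_range = (0.05 * n_injected, 0.07 * n_injected)
```

For 1100 injected objects, an unmatched fraction of 5-7% corresponds to roughly 55 to 77 objects per run.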

At this point, we focus on comparing the input parameters with the magnitude, size, and shape measured by the CModel method in the pipeline. We do notice that, among all the fake objects injected in each run, 6-8% have failed CModel photometry. It is not clear what exactly causes this problem. To make the comparison more relevant to the photometric measurement itself, we further exclude all matched detections with $$nChild > 0$$ (normally, >10/22).
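
The final cleaning step can be sketched as a simple filter. The record keys `n_child` and `cmodel_failed` are hypothetical stand-ins for the corresponding pipeline outputs, and the sketch assumes that detections with failed CModel photometry are dropped together with those that have $$nChild > 0$$.

```python
def keep_for_comparison(detection):
    """Keep detections with no deblended children and valid CModel photometry."""
    return detection["n_child"] == 0 and not detection["cmodel_failed"]


# Hypothetical matched detections for three fake objects.
detections = [
    {"fake_id": 1, "n_child": 0, "cmodel_failed": False},
    {"fake_id": 2, "n_child": 2, "cmodel_failed": False},
    {"fake_id": 3, "n_child": 0, "cmodel_failed": True},
]
clean = [d for d in detections if keep_for_comparison(d)]
```

Only the first record survives both cuts in this example.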