3.1 Ploidy identification and variant calling
Whole-genome resequencing of the 151 individuals generated a total of ~10.6 G clean paired-end reads (70.2 M per individual on average), giving an average sequencing depth of 12.2× (Table S1). On average, 99.14% of reads mapped to the goldfish reference genome. Allele frequency distributions at biallelic variants showed a single Gaussian distribution with a mean of 0.5 for diploids and bimodal distributions with means of 0.33 and 0.67 for triploids (Figure S1). Of the 151 individuals, 73 were identified as diploids and 78 as triploids. Among the invasive populations, the LL population comprised 29 diploids and 9 triploids, and the CBL population comprised 2 diploids and 22 triploids (Table 1). In the direct source regions, samples from Ningxia and Sichuan were almost all triploids, with only one diploid detected, in Sichuan. In the indirect source regions, 29 diploids and 6 triploids were detected. After variant calling and filtering, 16,888,283 SNPs were identified in diploids and 17,954,020 in triploids, of which 10,719,815 were shared between the two ploidy categories.
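The allele-frequency logic behind the ploidy assignment can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes that alternate-allele read counts at heterozygous biallelic sites are binomially distributed, with an expected alternate-allele fraction of 1/2 in diploids and an equal mixture of 1/3 and 2/3 in triploids, and it assigns the ploidy with the higher log-likelihood. The function names (`infer_ploidy`, `simulate_sites`), the equal mixture weights, the simulated data, and the fixed depth of 12 (chosen to be close to the reported mean depth) are all illustrative assumptions.

```python
import math
import random


def binom_pmf(k, n, p):
    """Probability of k alternate reads out of n reads, given fraction p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def log_likelihood(sites, alt_fractions):
    """Log-likelihood of (alt_reads, depth) pairs under an equally weighted
    mixture of the expected alternate-allele fractions."""
    total = 0.0
    for alt, depth in sites:
        mix = sum(binom_pmf(alt, depth, p) for p in alt_fractions)
        total += math.log(mix / len(alt_fractions))
    return total


def infer_ploidy(sites):
    """Return 2 or 3, whichever model explains the read counts better."""
    ll_diploid = log_likelihood(sites, (0.5,))           # one peak at 0.5
    ll_triploid = log_likelihood(sites, (1 / 3, 2 / 3))  # peaks at 0.33, 0.67
    return 2 if ll_diploid >= ll_triploid else 3


def simulate_sites(ploidy, n_sites=2000, depth=12, seed=0):
    """Toy data: heterozygous sites sequenced at a fixed depth (assumption)."""
    rng = random.Random(seed)
    fractions = (0.5,) if ploidy == 2 else (1 / 3, 2 / 3)
    sites = []
    for _ in range(n_sites):
        p = rng.choice(fractions)  # alternate-allele dosage at this site
        alt = sum(rng.random() < p for _ in range(depth))
        sites.append((alt, depth))
    return sites


if __name__ == "__main__":
    for true_ploidy in (2, 3):
        called = infer_ploidy(simulate_sites(true_ploidy))
        print(f"simulated ploidy {true_ploidy}: called {called}")
```

A likelihood comparison is used here rather than the distance of each observed fraction to the nearest expected peak, because at ~12× depth the sampling noise per site is large and a nearest-peak rule tends to favour the two-peak model. Real data would also need site filtering (for example on depth and mapping quality), which this sketch omits.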