2.2.1 | Simulating the admixed population, effective population size, and sampling individuals
At each generation, MetHis performs simple, individual-centered, forward-in-time Wright-Fisher simulations (Fisher, 1922; Wright, 1931) in a panmictic population of diploid effective size \(N_{g}\). For a given individual in population H at the following generation (g + 1), MetHis independently draws each parent from source population S with probability \(s_{S,g}\) (Figure 1, Table 1), or from population H with probability \(h_{g}=1-\sum_{S\in\{Afr,Eur\}}s_{S,g}\), randomly builds a haploid gamete of independent markers for each parent, and pairs the two constructed gametes to create the new individual.
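The parent-drawing and gamete-building steps described above can be sketched as follows. This is a minimal illustration in Python, not the MetHis implementation: the function names (`draw_parent`, `make_gamete`, `next_generation`) and the data layout (an individual stored as a pair of haplotype lists) are assumptions made for this example only.

```python
import random

def draw_parent(sources, admixed, s, rng):
    """Draw one parent: from source S with probability s[S], else from H.

    sources: dict mapping a source name to its list of diploid individuals.
    admixed: list of diploid individuals of population H at generation g.
    s: dict mapping a source name to its contribution s_{S,g}.
    """
    names = list(s)
    # h_g = 1 - sum of source contributions (clipped against rounding error).
    h = max(0.0, 1.0 - sum(s.values()))
    pools = [sources[name] for name in names] + [admixed]
    weights = [s[name] for name in names] + [h]
    pool = rng.choices(pools, weights=weights, k=1)[0]
    return rng.choice(pool)

def make_gamete(parent, rng):
    """Build a haploid gamete of independent markers.

    At each marker, one of the two parental alleles is picked at random.
    """
    hap_a, hap_b = parent
    return [rng.choice((a, b)) for a, b in zip(hap_a, hap_b)]

def next_generation(sources, admixed, s, n_g, rng):
    """Create the n_g diploid individuals of population H at generation g + 1."""
    new_pop = []
    for _ in range(n_g):
        # The two parents are drawn independently of each other.
        first = draw_parent(sources, admixed, s, rng)
        second = draw_parent(sources, admixed, s, rng)
        new_pop.append((make_gamete(first, rng), make_gamete(second, rng)))
    return new_pop
```

At the founding generation, population H is empty, so the source contributions must sum to 1 for this sketch to run; at later generations, any remainder \(h_{g}\) is drawn from H itself.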
Here, we decided to neglect mutation over the 21 generations of admixture considered. This is reasonable when studying relatively recent admixture histories and considering independent genotyped SNP markers. Nevertheless, for users interested in microsatellite variation and longer admixture histories, MetHis readily implements a standard General Stepwise Mutation Model allowing for insertion or deletion (Estoup, Jarne, & Cornuet, 2002), with parameters set by the user (Supplementary Note S1).
To focus on the admixture process itself without excessively inflating the parameter space, we consider, for each of the nine competing models, an admixed population H with constant effective population size \(N_{g}\) = 1000 diploid individuals. Note, however, that MetHis readily allows the user to parametrize stepwise or continuous changes in effective population size (Supplementary Note S1).
After each simulation, we randomly draw individual samples matching the sample sizes in our observed dataset (see 2.4.3). We sample individuals until the sample set contains no individuals related up to the first-degree-cousin level, either within each population or between population H and either source population, based on explicit parental flagging during the last two generations of the simulations. Note that this is done to best mimic, a priori, the observed case-study dataset; excluding related individuals is an option set by the user in MetHis (Supplementary Note S1).
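One way to picture the relatedness filter is sketched below in Python. It is an illustration under stated assumptions rather than the MetHis procedure: we assume that the parental flagging yields, for each individual, the set of its parent and grandparent identifiers, and that two individuals count as related when these sets overlap (siblings share a parent, first cousins share a grandparent). The function name `sample_unrelated` and the greedy draw-and-reject strategy are hypothetical.

```python
import random

def sample_unrelated(ancestors, sample_size, rng):
    """Draw sample_size individuals with no shared parent or grandparent.

    ancestors: dict mapping an individual id to the frozenset of ids of its
    parents and grandparents (flagged over the last two generations).
    """
    candidates = list(ancestors)
    rng.shuffle(candidates)  # random visiting order
    kept = []
    for ind in candidates:
        # Accept the candidate only if it is unrelated to every kept individual.
        if all(ancestors[ind].isdisjoint(ancestors[other]) for other in kept):
            kept.append(ind)
            if len(kept) == sample_size:
                return kept
    raise ValueError("not enough unrelated individuals for the requested sample")
```

To also exclude relatives between population H and the source populations, the same check could be applied to a single dictionary holding the flagged ancestors of individuals from all populations.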
2.2.2 | Simulating source populations
MetHis, in its current form, does not simulate the source populations for the admixture process modeled in Verdu and Rosenberg (2011). Source populations can be simulated separately using existing genetic data simulation software such as the sequential coalescent simulator fastsimcoal2 (Excoffier, Dupanloup, Huerta-Sanchez, Sousa, & Foll, 2013; Excoffier & Foll, 2011).
Another possibility for simulating source populations arises if genetic data are already available for the known source populations, as is the case in our case studies of enslaved-African descendants in the Americas (see 2.4.3). We consider here that the African and European source populations are very large populations at drift-mutation equilibrium, accurately represented by the Yoruban YRI and British GBR datasets investigated here (see 2.4.3). Therefore, we first build two separate datasets, each comprising 20,000 haploid genomes of 100,000 independent SNPs, each SNP being randomly drawn from the site frequency spectrum (SFS) observed for the YRI and GBR datasets, respectively. These two datasets are used as fixed gamete reservoirs for the African and European sources separately, at each generation of the forward-in-time admixture process. From these reservoirs, we build an effective individual gene pool of diploid size \(N_{g}\) by randomly pairing gametes while avoiding selfing. These virtual source populations provide the parental pool for simulating individuals in the admixed population H with MetHis at each generation. Thus, while our gamete reservoirs are fixed, the parental genetic pools are randomly built anew at each generation. Again, note that this step is not necessary for using MetHis to investigate complex admixture histories; source populations can be simulated separately by the user at will.
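The reservoir construction and the per-generation pairing can be sketched as follows, again in Python and only as an illustration. Several points are our assumptions and are not stated in the text: that drawing a SNP from the SFS means picking an observed allele frequency at random and then assigning alleles to gametes as independent Bernoulli draws, that "avoiding selfing" means never pairing a gamete with itself, and that each source is handled separately (the sketch does not address how SNPs are matched across the two sources). The names `build_reservoir` and `build_source_pool` are hypothetical.

```python
import random

def build_reservoir(observed_freqs, n_gametes, n_snps, rng):
    """Build a fixed reservoir of haploid gametes for one source population.

    observed_freqs: allele frequencies observed in the source dataset,
    standing in here for its empirical site frequency spectrum.
    """
    # One frequency per simulated SNP, drawn at random from the observed ones.
    snp_freqs = [rng.choice(observed_freqs) for _ in range(n_snps)]
    # Each gamete carries allele 1 at a SNP with that SNP's frequency.
    return [[1 if rng.random() < f else 0 for f in snp_freqs]
            for _ in range(n_gametes)]

def build_source_pool(reservoir, n_g, rng):
    """Pair gametes at random into n_g diploid individuals, without selfing."""
    pool = []
    for _ in range(n_g):
        # rng.sample returns two distinct indices, so a gamete is never
        # paired with itself.
        i, j = rng.sample(range(len(reservoir)), 2)
        pool.append((reservoir[i], reservoir[j]))
    return pool
```

In the setting described above, `build_reservoir` would be called once per source with 20,000 gametes and 100,000 SNPs, whereas `build_source_pool` would be called anew at every generation with \(N_{g}\) = 1000; a pure-Python list representation at that scale is slow and memory-hungry, so the sizes would need to be reduced to try this sketch.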