In silico selection of microsatellite loci
In total, 217 of the 269 investigated loci were not optimal for microsatellite genotyping of catarrhines. For 147 of them, one or both primer binding sites, or the complete locus, were located in repetitive elements. This increases the likelihood of amplifying off-target PCR products, particularly in multiplex settings, where many primers that can bind at multiple sites in the respective genome are combined in a single PCR. For an additional 32 loci, we could not find conserved primer binding sites near the microsatellite, and a further 15 loci contained relatively long microsatellite repeat regions in one or more species, resulting in long PCR products (>250 bp). Longer PCR products are often difficult to generate if only degraded DNA is available and can result in null alleles. Further problems included loci located directly next to each other on the same chromosome, which increases the risk of linkage. In addition, duplicate entries of loci under different names and gaps in some of the reference genomes (especially for Y-chromosomal loci) impeded the screening process. A full list of screened loci, including the respective reasons for their exclusion, is provided in Table S6. Of the 52 loci that fulfilled our criteria, we selected 45 (1-3 loci per human chromosome, including gonosomes) for downstream analyses.
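The exclusion criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Locus` record, its field names and the order in which criteria are checked are assumptions made for illustration, and only the 250 bp threshold is taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold from the text: products longer than 250 bp were considered long.
MAX_PRODUCT_BP = 250


@dataclass
class Locus:
    """Hypothetical per-locus screening record (field names are assumptions)."""
    name: str
    in_repetitive_element: bool       # primer site(s) or whole locus in a repeat
    has_conserved_primer_sites: bool  # conserved sites found near the microsatellite
    max_product_bp: int               # longest expected product across species


def exclusion_reason(locus: Locus) -> Optional[str]:
    """Return the first matching exclusion reason, or None if the locus passes."""
    if locus.in_repetitive_element:
        return "repetitive element"
    if not locus.has_conserved_primer_sites:
        return "no conserved primer binding sites"
    if locus.max_product_bp > MAX_PRODUCT_BP:
        return "long PCR product"
    return None
```

Criteria such as physical proximity of loci or duplicate database entries require comparisons between loci and are therefore not covered by this per-locus check.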
The newly designed primers for the 45 loci (comprising di-, tri- and tetranucleotide repeats) amplify PCR products of 56-215 bp (according to available genome data; Table S2). Compared to the originally published primers, we were able to reduce PCR product sizes by 2-225 bp (mean 75.9 bp) for 37 loci, whereas for five loci the new primers amplify a moderately longer fragment (elongation by 2-15 bp; mean 7.6 bp). PCR product size for the remaining three loci did not change. As primer binding sites were not always perfectly conserved among the 17 investigated catarrhine reference genomes, primers for 21 loci contain wobble positions. Mismatches in primer binding sites that were found in only a few (1-2) of the investigated species were disregarded in primer design and probably result in less efficient or no amplification of the respective locus in the given species (0-12 loci with mismatches per species, mean 3.4; Table S2).
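How wobble positions can arise from aligned primer binding sites may be sketched as follows. This is an illustrative assumption rather than the procedure used here: the function name `consensus_primer`, the parameter `max_ignored_species` and the per-column rule (bases seen in at most two species are ignored, the remaining bases are encoded as an IUPAC ambiguity code) are hypothetical, and the input is assumed to consist of ungapped, equal-length sequences containing only A, C, G and T.

```python
from collections import Counter

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus_primer(sites, max_ignored_species=2):
    """Build a degenerate primer from aligned binding sites (one per species).

    Bases observed in at most `max_ignored_species` sequences are treated as
    rare mismatches and ignored (an assumed rule, not the study's method).
    """
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all binding sites must have the same length")
    primer = []
    for column in zip(*sites):
        counts = Counter(base.upper() for base in column)
        kept = {base for base, n in counts.items() if n > max_ignored_species}
        if not kept:
            # Every base is rare (few input sequences): fall back to the most common one.
            kept = {counts.most_common(1)[0][0]}
        primer.append(IUPAC[frozenset(kept)])
    return "".join(primer)
```

For example, with 17 aligned sites of which ten read `ACGT`, five read `ACAT` and two read `TCGT`, the third position becomes the wobble base `R` (A or G), whereas the `T` at the first position, present in only two sequences, is ignored.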