The number of sequences in each set ranged from 17 to 350 (84.3±5.5)¹. The original length of the fragments ranged from 366 to 1153 sites (652±13.7). The analysis of intraspecific structure identified up to 11 groups in the original fragment (3.2±0.2) and up to 9 groups in the reduced fragment (2.4±0.2). About 27.7% of the original datasets (23 sets) showed no intraspecific structure, as did 12 (14.5%) of the reduced datasets (Table S2). Seventy-nine datasets (95.2%) gave a reliable determination of the number of clusters (a density value of 80% or higher according to calculations on the non-reduced set), while for the remaining 4 (J. singaporensis (Crustaceans), G. adustus, R. formosanus and S. formosus (Fishes)) even a repeated run using more generations did not increase the reliability of the results (the density value ranged from 24 to 60%). The Wilcoxon paired test indicated a significant decrease in the number of clusters when moving to the reduced dataset (Fig. 2A). In contrast, one species (C. sapidus (Crustaceans)) showed the opposite trend, but this was associated with a decrease in the confidence of the intraspecific analysis (see Table S2). A decrease in validity when moving to the reduced dataset was also recorded for L. armata (Crustaceans). Apart from these two cases, the length reduction did not decrease validity in any of the other sets (see Table S2). Overall, setting the pairwise test aside and considering the sparse data from the original fragment analysis, neither the number of sequences in the set and of sites with SNPs (Fig. 2B) nor the length of the sequences (Fig. 2C) explained the increased number of detectable intraspecific clusters (R² ≤ 0.3 at p = 0.05). In the reduced dataset, the numbers of reliably detected clusters, whether consistent or inconsistent with the estimates based on the original set, likewise showed no trends that could be extrapolated to the global level.
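For readers who wish to reproduce this kind of comparison, the two statistics used above can be sketched in a few lines of Python. This is a minimal illustration only, not the software used in this study: the function names are hypothetical, the cluster counts in the usage example are toy values rather than data from Table S2, and the p-value uses a plain normal approximation without tie or continuity correction, which is rough for small samples.

```python
from math import erfc, sqrt


def signed_rank_sums(original, reduced):
    """Wilcoxon signed-rank sums for paired counts.

    Returns (W_plus, W_minus, n), where n is the number of non-zero
    differences. Zero differences are dropped; ties get average ranks.
    """
    diffs = [o - r for o, r in zip(original, reduced) if o != r]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over the run of equal absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


def normal_approx_p(w_plus, n):
    """Two-sided p-value from the normal approximation (assumption:
    no tie or continuity correction)."""
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return erfc(abs(z) / sqrt(2))


def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Toy cluster counts for six hypothetical species (not data from this study).
clusters_original = [3, 5, 2, 4, 1, 6]
clusters_reduced = [2, 3, 2, 2, 1, 3]
w_plus, w_minus, n = signed_rank_sums(clusters_original, clusters_reduced)
p_value = normal_approx_p(w_plus, n)
```

A large W_plus relative to W_minus corresponds to fewer clusters in the reduced set, which is the direction reported for Fig. 2A; `r_squared` corresponds to the R² values quoted for Fig. 2B and 2C.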