Fig. 4: Evaluation of the number of clusters: value of the cross-entropy criterion as a function of the number of populations K in sNMF, for K=1-6 with 10 randomized repetitions. The retained value of K is K=3.
We assessed the spatial distribution of SNPs to determine whether they form aggregated clusters. Cross-validation statistically supported three clusters, although the difference in cross-entropy between K=3 and K=4 is negligible (Fig. 4). In addition, four clusters remain stable when the cluster analysis is repeated. When more clusters are specified (K=5 or K=6), the composition of cluster 5 and cluster 6 differs from run to run.
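The choice between K=3 and K=4 described above can be sketched as a simple selection rule. The snippet below is a minimal illustration, not the procedure used in this study: the function name `select_k`, the `tolerance` parameter, the use of the best (minimum) run per K, and all numeric values are assumptions introduced for the example. It takes the lowest cross-entropy over the repetitions for each K and returns the smallest K whose value lies within `tolerance` of the overall minimum.

```python
def select_k(cross_entropy, tolerance=0.0):
    """Pick the number of clusters from repeated cross-entropy runs.

    cross_entropy: dict mapping K -> list of cross-entropy values,
                   one value per repetition (hypothetical input format).
    tolerance:     differences up to this size are treated as negligible
                   (hypothetical parameter, not taken from the study).
    """
    # Best (lowest) cross-entropy reached for each K across repetitions.
    best = {k: min(values) for k, values in cross_entropy.items()}
    lowest = min(best.values())
    # Prefer the smallest K that is not meaningfully worse than the minimum.
    return min(k for k, ce in best.items() if ce - lowest <= tolerance)


# Placeholder values for illustration only; they are NOT the study's results.
runs = {
    1: [0.950, 0.960],
    2: [0.800, 0.810],
    3: [0.700, 0.710],
    4: [0.699, 0.705],
    5: [0.720, 0.740],
    6: [0.750, 0.780],
}

strict_k = select_k(runs)                      # strict minimum
lenient_k = select_k(runs, tolerance=0.005)    # negligible gaps ignored
print(strict_k, lenient_k)
```

With these placeholder numbers the strict minimum falls at K=4, while a small tolerance selects the more parsimonious K=3. This mirrors the situation in which two adjacent values of K are nearly indistinguishable by cross-entropy alone, so that stability across runs has to inform the final choice.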