Benchmarks with Synthetic Data Sets:
ROHMM's performance was compared against PLINK and bcftools roh using our synthetic datasets. Across the tested levels of homozygosity and erroneous sites, ROHMM's allele distribution model performed comparably to its competitors in both the simulated genome and simulated exome scenarios. Additionally, the allele frequency model implemented in ROHMM performed similarly to, if not better than, bcftools roh under all conditions (Figure 2).
To test the stability and performance of ROHMM at various levels of data density, we used randomly down-sampled synthetic exome samples from our simulated datasets. Under down-sampling, ROHMM's false positive rate did not increase by more than 0.06% and its false negative rate did not increase by more than 3.3%. Additionally, ROHMM's alternative allele frequency model showed even smaller changes in both error types, comparable to those published for bcftools roh (Narasimhan et al., 2016) (Figure 3).
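The down-sampling evaluation can be illustrated with a minimal sketch. This is not ROHMM's evaluation code: the function names (`downsample`, `error_rates`), the per-site definition of the false positive and false negative rates, and the interval representation of ROH calls are all assumptions made for illustration, since the exact definitions are not stated here.

```python
import random


def downsample(sites, fraction, seed=0):
    """Keep a random fraction of the sites, preserving their original order.

    Hypothetical helper: models random thinning of variant sites.
    """
    rng = random.Random(seed)
    k = round(len(sites) * fraction)
    keep = sorted(rng.sample(range(len(sites)), k))
    return [sites[i] for i in keep]


def in_any(pos, intervals):
    """True if pos lies inside any closed interval (start, end)."""
    return any(start <= pos <= end for start, end in intervals)


def error_rates(positions, truth_roh, called_roh):
    """Per-site false positive and false negative rates (an assumed definition).

    FPR = sites called ROH but truly non-ROH / all truly non-ROH sites
    FNR = sites truly ROH but not called    / all truly ROH sites
    """
    fp = fn = n_true = n_false = 0
    for p in positions:
        is_true = in_any(p, truth_roh)
        is_called = in_any(p, called_roh)
        if is_true:
            n_true += 1
            fn += not is_called
        else:
            n_false += 1
            fp += is_called
    fpr = fp / n_false if n_false else 0.0
    fnr = fn / n_true if n_true else 0.0
    return fpr, fnr


# Toy example: 100 sites, one simulated ROH, and a hypothetical call set.
positions = list(range(1, 101))
truth = [(11, 50)]
called = [(11, 45), (61, 62)]

fpr_full, fnr_full = error_rates(positions, truth, called)

# Re-evaluate on half of the sites. In a real benchmark the caller would be
# re-run on the thinned data; here the calls are held fixed for brevity.
thinned = downsample(positions, 0.5, seed=1)
fpr_thin, fnr_thin = error_rates(thinned, truth, called)
```

Comparing the rates obtained on the full and thinned site sets gives the change in each error type attributable to reduced data density, which is the quantity reported above.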