Performance on single-sample clinical data:
To test the performance of ROHMM on single-sample clinical data, we used the publicly available case data from Pippucci and colleagues (Pippucci et al., 2013). This test also served to demonstrate the unique abilities of ROHMM in the single-sample case, where homozygous reference sites are almost never present in the variant call format (VCF) file. We tested ROHMM using three different settings to infer homozygosity from these data. ROHMM detected the long homozygous stretch containing the CACNA2D2 NM_006030.4:c.1295del mutation in the proband, consistent with the original study (Pippucci et al., 2013). The authors of the original study used a predefined set of SNPs to infer homozygosity, and we simulated a similar input by supplying a BED file containing the same set of SNPs to ROHMM’s spike-in function (Setting3). We observed that the spike-in functionality further enabled the detection of potentially cryptic short ROHs that are otherwise not visible from data containing variant sites only (Figure 7).
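The idea of spiking in a predefined SNP set can be illustrated with a minimal sketch. It is not ROHMM's actual implementation: the function names `parse_bed_snps` and `spike_in`, the genotype strings, and the example coordinates are hypothetical. The sketch also assumes that any panel site absent from the single-sample VCF can be treated as homozygous reference (`0/0`).

```python
from typing import Dict, Iterable, List, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position)


def parse_bed_snps(bed_lines: Iterable[str]) -> List[Site]:
    """Convert BED intervals (0-based, half-open) to 1-based SNP positions."""
    sites: List[Site] = []
    for line in bed_lines:
        # Skip blank lines and BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        # A single-base SNP at 1-based position p is stored as [p-1, p).
        for pos in range(int(start) + 1, int(end) + 1):
            sites.append((chrom, pos))
    return sites


def spike_in(observed: Dict[Site, str], panel: List[Site]) -> Dict[Site, str]:
    """Add panel sites missing from the VCF as assumed homozygous reference.

    Genotypes already observed in the VCF are never overwritten.
    """
    merged = dict(observed)
    for site in panel:
        merged.setdefault(site, "0/0")  # assumption: absent site == hom-ref
    return dict(sorted(merged.items()))


# Hypothetical single-sample calls: only variant sites are present.
observed_calls = {("chr3", 100): "1/1", ("chr3", 250): "0/1"}
# Hypothetical SNP panel supplied as BED records.
panel_bed = ["chr3\t99\t100", "chr3\t149\t150", "chr3\t199\t200"]

merged_calls = spike_in(observed_calls, parse_bed_snps(panel_bed))
for (chrom, pos), genotype in merged_calls.items():
    print(chrom, pos, genotype)
```

In this toy example the two spiked-in sites add homozygous observations between the called variants, which is the kind of extra evidence that could make a short homozygous stretch visible to a downstream model.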