Searching for non-B DNA-forming repeats in flanking sequences
Non-B DNA-forming repeats within each flanking sequence were obtained from the non-B DB database (Cer et al., 2011; Cer et al., 2013), with custom filters applied to mirror repeats (Table S1): the mirror repeats were restricted to those predicted by non-B DB to be triplex-forming motifs (subset=1). Six types of non-B DNA-forming repeat were considered in this study: direct repeats (DR), inverted repeats (IR), mirror repeats (MR), G-quartets (GQ), short tandem repeats (STR), and Z-DNA (Z) (Ghosh & Bansal, 2003; Kondrashov & Rogozin, 2004; Wells, 2007). More detailed information on each type of non-B DNA-forming repeat is provided in the Supplementary Material (Table S1). The frequencies of the non-B DNA-forming repeats in the flanking sequences of the pathogenic deletions were compared with the corresponding frequencies in the simulated control1 dataset. Statistical significance was assessed by means of Student’s t-test, with a Bonferroni correction applied to allow for multiple testing.
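For illustration only, the statistical comparison described above could be sketched as follows. The sketch assumes that the input is a list of per-sequence repeat counts for each repeat type, uses the pooled-variance (equal-variance) form of Student's t-test, and multiplies each p-value by the number of tests for the Bonferroni correction. All function names and the toy counts are hypothetical, and the software actually used for the analysis is not specified here. The p-value is obtained by simple numerical integration of the t density, which is adequate for a sketch but loses precision for extremely small p-values.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The area on [0, |t|] covers half of the central mass of the distribution.
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def student_t_test(a, b):
    """Pooled-variance two-sample t-test; returns (t, two-sided p).

    Assumes both samples have at least two values and are not all constant.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, two_sided_p(t, df)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


def compare_repeat_frequencies(deletions, control):
    """Bonferroni-adjusted p-value per repeat type (hypothetical helper)."""
    raw = {rt: student_t_test(deletions[rt], control[rt])[1] for rt in deletions}
    return bonferroni(raw)


# Toy, made-up counts per flanking sequence; not data from this study.
deletion_counts = {"DR": [3, 4, 5, 4, 6], "IR": [2, 2, 3, 2, 3], "Z": [0, 1, 0, 1, 1]}
control_counts = {"DR": [1, 2, 1, 2, 2], "IR": [2, 3, 2, 2, 3], "Z": [0, 1, 1, 0, 1]}

for repeat_type, p_adj in compare_repeat_frequencies(deletion_counts, control_counts).items():
    print(f"{repeat_type}: adjusted p = {p_adj:.4g}")
```

With six repeat types tested, as in this study, the Bonferroni factor in such a sketch would be 6, so a repeat type would be called significant at the 0.05 level only if its unadjusted p-value were below 0.05/6.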