Gross deletions and microdeletions are naturally partitioned
Our analysis indicates that the frequencies of non-B DNA-forming repeats, GC content, and the frequencies of specific sequence motifs were all correlated with deletion length when the deletion length was shorter than a given threshold. Figure 7A shows the Pearson correlation coefficient (PCC) values plotted against deletion length. Here, PCC measures the extent of the correlation between deletion length and each of three features: the frequency of non-B DNA-forming repeats, GC content, and the frequencies of the sequence motifs explored. As indicated in Figure 7A, when the deletion length was <25 bp, motif frequency and deletion length were negatively correlated (negative PCC values). The PCC for non-B DNA-forming repeat frequency versus deletion length attained its maximum at a deletion length of 25 bp, and the PCC for GC content versus deletion length was highest at 29 bp. Thus, we conclude that 25-30 bp may represent a natural threshold for functionally distinguishing gross deletions from microdeletions in terms of their underlying generative mechanisms.
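To make the threshold scan concrete, the following is a minimal sketch of one way such a PCC-versus-threshold curve could be computed. It is not the procedure used to generate Figure 7A: the function names `pearson` and `pcc_by_threshold`, the strict "shorter than the threshold" cutoff, the minimum of three data points, and the toy data are all assumptions introduced for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def pcc_by_threshold(lengths, feature, thresholds, min_points=3):
    """For each threshold t, correlate deletion length with a feature
    using only deletions shorter than t (assumed cutoff: length < t)."""
    curve = {}
    for t in thresholds:
        pairs = [(l, f) for l, f in zip(lengths, feature) if l < t]
        if len(pairs) < min_points:
            continue  # too few deletions below this threshold
        ls, fs = zip(*pairs)
        curve[t] = pearson(ls, fs)
    return curve


# Hypothetical toy data (NOT from the study): deletion lengths in bp and a
# per-deletion feature value, e.g. a repeat frequency or GC content.
lengths = [1, 2, 3, 4, 5, 6, 7, 8]
feature = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]

curve = pcc_by_threshold(lengths, feature, thresholds=range(5, 10))
best = max(curve, key=curve.get)  # threshold at which the PCC peaks
print(best, curve[best])
```

In this sketch, the threshold at which the PCC peaks plays the role of the 25 bp and 29 bp maxima described above; applied to real data, the same scan would be run once per feature (repeat frequency, GC content, motif frequency).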