Gross deletions and microdeletions are naturally partitioned
Our analysis indicates that the frequencies of non-B DNA-forming repeats,
GC content, and the frequencies of specific sequence motifs all correlated
with deletion length when the deletion length was shorter than a given
threshold. The PCC values are plotted against deletion length in Figure
7A. Here, each PCC measures the strength of the correlation between
deletion length and one of three quantities: the frequencies of non-B
DNA-forming repeats, GC content, or the frequencies of the sequence motifs
being explored. As indicated in
Figure 7A, when the deletion length was <25 bp, motif frequency was
negatively correlated with deletion length. The PCC for the
correlation between non-B DNA-forming repeat
frequencies and deletion length attained its maximum value when the
deletion length was 25 bp. The highest PCC value for the correlation
between the deletion length and GC content was observed when the
deletion length was 29 bp. Because both maxima fall within a narrow
range, we conclude that 25-30 bp may be a natural threshold for
functionally distinguishing gross deletions from microdeletions in terms
of the underlying generative mechanisms.
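
The threshold scan described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes that PCC denotes the Pearson correlation coefficient and that, for each candidate threshold, the coefficient is computed over all deletions whose length does not exceed that threshold. The function names (pearson, pcc_by_threshold, peak_threshold) and the min_points cutoff are hypothetical and introduced only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Returns None when either sequence has zero variance, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def pcc_by_threshold(lengths, feature, thresholds, min_points=3):
    """Map each threshold to the PCC between deletion length and a feature.

    Assumption: only deletions with length <= threshold contribute.
    Thresholds with fewer than min_points deletions are mapped to None.
    """
    result = {}
    for t in thresholds:
        pairs = [(l, f) for l, f in zip(lengths, feature) if l <= t]
        if len(pairs) < min_points:
            result[t] = None
            continue
        sub_lengths, sub_feature = zip(*pairs)
        result[t] = pearson(sub_lengths, sub_feature)
    return result


def peak_threshold(pcc):
    """Return the threshold at which the PCC is largest (None values skipped)."""
    valid = {t: r for t, r in pcc.items() if r is not None}
    return max(valid, key=valid.get)
```

Applied separately to each feature (non-B DNA-forming repeat frequency, GC content, motif frequency), peak_threshold would return the deletion length at which the corresponding curve attains its maximum. Whether the comparison should be inclusive (<=) or strict (<) is a choice made here for illustration and may differ from the original analysis.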