Molecular characterization of the duplication
The SNP array from III4 also detected the duplication previously
observed by MLPA, which spans ~124 kb. We were able to delimit the
5’ breakpoint to an interval defined by SNPs rs12856332
(NC_000023.11:g.32367273) and rs1801187 (NC_000023.11:g.32362879), and
the 3’ breakpoint to an interval defined by SNPs rs111931446
(NC_000023.11:g.32229089) and rs143786489 (NC_000023.11:g.32227327).
Because the distances between these SNPs were too long (~4 kb and
~2 kb, respectively), the SNP-array results alone did not provide
enough data to set up a specific PCR amplification system, even under
the head-to-tail fusion hypothesis (Figure 2A).
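The interval widths quoted above follow directly from the SNP coordinates. A minimal sketch of that arithmetic is given here; the dictionary and function names are illustrative only and are not part of any analysis pipeline described in this work.

```python
# Chromosome X SNP coordinates as quoted in the text.
SNPS = {
    "rs12856332": 32367273,
    "rs1801187": 32362879,
    "rs111931446": 32229089,
    "rs143786489": 32227327,
}


def interval_width(snp_a: str, snp_b: str) -> int:
    """Distance in bp between two SNP positions on the same chromosome."""
    return abs(SNPS[snp_a] - SNPS[snp_b])


# Width of the interval bounding each breakpoint.
five_prime_width = interval_width("rs12856332", "rs1801187")
three_prime_width = interval_width("rs111931446", "rs143786489")

print(five_prime_width)   # 4394 bp, i.e. ~4 kb
print(three_prime_width)  # 1762 bp, i.e. ~2 kb
```

Both intervals are an order of magnitude longer than a typical Sanger-compatible amplicon, which is why the array data alone could not anchor a junction-specific PCR.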
WGS results were essential to resolve the duplication's breakpoint
junction. The identification of chimeric reads formed by sequences of
introns 37 and 43 and mapping at the limits of the double-coverage
region (e.g. ID22075, ID17613, ID18803, ID15702) was consistent with a
head-to-tail tandem duplication (Figure 2B, Supplementary Figure 1B).
Notably, we found two discrepancies between the SNP array and WGS:
rs1801187 and rs143786489, which the array called as double and single
copy, respectively, turned out to be in single and double copy.
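The reasoning that links chimeric reads to a head-to-tail arrangement can be illustrated with a simplified sketch. It assumes a read split into two aligned pieces on the same strand, each described by the number of bases clipped to its left in the SAM CIGAR and by its leftmost reference coordinate. The `Segment` type, the function name and the coordinates in the usage example are hypothetical, and the sketch ignores the short insertions found at the actual junction.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    left_clip: int  # bases clipped before this piece in the SAM CIGAR
    ref_start: int  # leftmost reference coordinate of the piece
    strand: str     # "+" or "-"


def junction_signature(a: Segment, b: Segment) -> str:
    """Classify a two-piece split read by the order of its pieces.

    SAM CIGAR strings are written in reference-forward orientation, so for
    two pieces on the same strand the one with the smaller left clip is the
    left part of the read as aligned.
    """
    if a.strand != b.strand:
        return "inversion-like"
    first, second = sorted((a, b), key=lambda s: s.left_clip)
    # In a tandem duplication the end of the duplicated segment is joined to
    # its start, so the right part of the read maps upstream of the left part.
    if second.ref_start < first.ref_start:
        return "tandem-duplication-like"
    return "deletion-like or colinear"


# Toy coordinates, not taken from the reads reported here.
left_part = Segment(left_clip=0, ref_start=1_130_000, strand="+")
right_part = Segment(left_clip=80, ref_start=1_000_000, strand="+")
print(junction_signature(left_part, right_part))  # tandem-duplication-like
```

A deletion would produce the opposite ordering, with the right part of the read mapping downstream of the left part, which is why the ordering of the two pieces is informative about the rearrangement type.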
The alignment of overlapping chimeric WGS reads allowed the
determination of the breakpoint sequence. This information permitted the
design of a duplication-specific head-to-tail PCR with a 366 bp
amplicon, in order to confirm the duplication breakpoint
characterization by Sanger sequencing (Figure 2C). The size of the
duplicated region was determined to be 131,284 bp
(NM_004006.3:c.6291-5371_6291-5370ins[TAAAATGCAATTTCATTT;5326-5188_6291-5370]).
Finally, characterization of the duplication breakpoint junction
revealed a complex rearrangement formed by a 7 bp inverted insertion
followed by an 11 bp direct insertion, both with sequence identity to
intron 43. This suggests three template-switching (TS) events, with a
1 bp microhomology (“T”) at the first TS and a 2 bp microhomology
(“TC”) at the third TS (Figure 2C, Table 1).