Bioinformatic analysis of deletion and duplication breakpoints
DNA intervals of 10 bp to 50 bp centered on each 5’ and 3’
breakpoint of the deletions and duplications were screened in silico
for microhomologies, repetitive elements, non-B DNA, secondary
structures and recombinogenic DNA motifs. These elements constitute a
heterogeneous group of sequences of different lengths (3-18 bp) that
may stimulate double-strand breaks (DSB), triggering incorrect DNA
repair or DNA replication that leads to non-allelic recombination.
The study was
performed using the Human Reference Genome (GRCh38) [NC_000023.11:
31641233-32372273, downloaded 5-Sep-2018 from the NCBI website,
www.ncbi.nlm.nih.gov/] and was based on a recently reported strategy
(Abelleyro et al., 2020).
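As an illustration of how intervals centered on a breakpoint could be cut from the downloaded reference slice, the following is a minimal sketch and not the procedure actually used in the study. The function names, the convention that the break lies immediately after the given 1-based coordinate, and the 10 bp step between window sizes are all assumptions made for the example; only the slice start coordinate comes from the text.

```python
REGION_START = 31_641_233  # first base of the NC_000023.11 slice cited in the text


def centered_window(seq, breakpoint, size, region_start=REGION_START):
    """Return up to `size` bases of `seq` centred on a breakpoint.

    Assumptions (not from the source): `breakpoint` is a 1-based genomic
    coordinate and the break lies between that base and the next one.
    Windows are clipped at the ends of `seq`.
    """
    # 0-based index in `seq` of the first base after the break
    junction = breakpoint - region_start + 1
    if not 0 <= junction <= len(seq):
        raise ValueError("breakpoint lies outside the supplied sequence")
    left = size // 2
    start = max(0, junction - left)
    end = min(len(seq), junction + (size - left))
    return seq[start:end]


def breakpoint_windows(seq, breakpoint, sizes=range(10, 51, 10),
                       region_start=REGION_START):
    """Map each window size (bp) to the interval centred on the breakpoint.

    The default sizes (10, 20, ..., 50 bp) are an assumed sampling of the
    10-50 bp range mentioned in the text.
    """
    return {size: centered_window(seq, breakpoint, size, region_start)
            for size in sizes}


if __name__ == "__main__":
    toy = "ACGTACGTAC"  # toy sequence, not genomic data
    print(centered_window(toy, breakpoint=105, size=4, region_start=101))
```

Each window returned this way could then be passed to the screening tools listed in this section.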
For this analysis, only the DSB-stimulating motifs that showed
significant expected values (E-values < 0.05) at random points in the
referred study were considered
(Abelleyro et al., 2020).
Bioinformatic analysis was performed mainly with the SeqBuilder and
MegAlign programs [LaserGene DNA Star], the ClustalW algorithm
[www.ebi.ac.uk/Tools/msa/clustalw2/] and the BLAST algorithm
[blast.ncbi.nlm.nih.gov/Blast.cgi]. The RepeatMasker algorithm and
Dfam [www.dfam.org/] were used to
identify repetitive elements. Analysis of non-B DNA sequences was
achieved by the non-B DNA motif search tool (nBMST)
[nonb-abcc.ncifcrf.gov/apps/nBMST/default/] and confirmed by
RepeatAround [portugene.com/repeataround.html] and QGRS mapper
[bioinformatics.ramapo.edu/QGRS/analyze.php]. Secondary structures
were modelled using mfold
[unafold.rna.albany.edu/?q=mfold]. Finally, the recombinogenic
motifs screened using SeqBuilder [LaserGene DNA Star] included the
Scaffold Attachment Region (SAR), the Ig heavy chain switch and the
hexanucleotide motifs targeted by the endonuclease/reverse transcriptase
of mammalian retroposons (Jurka motifs)
(Jurka, 1997).
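The motif screening described above was carried out in SeqBuilder; purely to illustrate the idea, a short motif can be searched on both strands of a breakpoint window as sketched below. This is a hedged example, not the software's algorithm: the function names are hypothetical, and the hexamer in the usage lines is an illustrative placeholder rather than the motif set used in the study.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Reverse complement of a DNA string, including IUPAC ambiguity codes."""
    return motif.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (start, strand, matched_text) hits of `motif` in `seq`.

    `start` is a 0-based offset on the forward strand. Minus-strand hits are
    found by searching the forward strand for the reverse-complemented motif,
    so a palindromic motif is reported once per strand. Overlapping hits are
    kept by using a lookahead.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = set()
    for strand, m in (("+", motif), ("-", reverse_complement(motif))):
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in m) + "))")
        for match in pattern.finditer(seq):
            hits.add((match.start(), strand, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Placeholder hexamer and toy windows, for illustration only
    for window in ("GGTTAAAACC", "CCTTTTAAGG"):
        print(window, find_motif(window, "TTAAAA"))
```

Searching the reverse-complemented motif on the forward strand keeps all reported coordinates in a single frame, which makes it easier to relate hits to the breakpoint position.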