Lynch syndrome (LS) is an autosomal dominant inherited disease and its
prevalence is 1–3% in unselected colorectal or endometrial cancer
patients (de la Chapelle, 2005). It is
characterized by increased risks for early-onset tumor development,
especially for colorectal cancer (CRC), endometrial cancer, ovarian
cancer, and other extracolonic tumors such as hepatobiliary, urothelial,
brain or central nervous system tumors, as well as sebaceous tumors
(Cohen & Leininger, 2014). Lynch syndrome
is caused by germline mutations in one of the mismatch repair (MMR) genes or in EPCAM (Da Silva, Wernhoff,
Dominguez-Barrera, & Dominguez-Valentin, 2016). Tumors from LS
patients normally exhibit high microsatellite instability (MSI-H) and
loss of expression of one or more MMR proteins
(Boland, Koi, Chang, & Carethers, 2008).
Substitutions, small insertions/deletions, large deletions/duplications,
and inversions (Liu et al., 2016;
Mork et al., 2017;
Rhees, Arnold, & Boland, 2014), as well
as retrotransposon insertions, have been reported in the MMR genes as
causes of LS (Peltomaki & Vasen, 2004;
van der Klift, Tops, Hes, Devilee, &
Wijnen, 2012).
Retrotransposons are DNA sequences that proliferate in the genome using
an RNA intermediate and a ‘copy-and-paste’ retrotransposition
mechanism. Retrotransposons can be subdivided into two groups
distinguished by the presence or absence of long terminal repeats
(LTRs). Retrotransposons without LTRs include Long Interspersed Element-1
(LINE-1, L1), Alu elements (Short Interspersed Elements, SINEs) and SVA
(SINE-VNTR-Alu) elements (Cordaux &
Batzer, 2009; Rebollo, Romanish, &
Mager, 2012). Approximately 124 retrotransposon insertions associated
with human disease have been previously reported
(Hancks & Kazazian, 2016).
To date, ten gross insertions larger than 20 base pairs (bp) have
been recorded in MMR genes in the Human Gene Mutation Database (HGMD
Professional 2019.4). Five of these large insertions involved
retrotransposons: four were Alu insertions, two in each of
the MLH1 (Leclerc et al., 2018;
Solassol et al., 2019) and MSH2 genes (Kloor et al., 2004;
Marshall, Isidro, & Boavida, 1996), and
one was an SVA insertion in PMS2 (van der Klift et al., 2012). Up to now,
no SVA insertion has been reported in MSH2. In this study, we
report an insertion of an SVA element at c.1972 in exon 12 of MSH2 as a novel cause of Lynch syndrome.
Our proband is a 49-year-old man who was diagnosed with colon cancer at
age 43. A four-generation pedigree (Fig. 1) indicated that other family
members were affected with early-onset colorectal cancer (CRC) under age
50. The proband’s mother was diagnosed with metachronous endometrial cancer and
CRC, and one maternal aunt was diagnosed with CRC at age 50. One of the
proband’s brothers had colon polyps and was subsequently diagnosed with
a proximal colon cancer at age 54, which was MSH2 and MSH6 deficient on
immunohistochemical (IHC) staining. Another brother was diagnosed with a
screen-detected colon cancer at age 38, which also demonstrated loss of
expression of the MSH2 and MSH6 proteins by IHC. However, no mutation
was identified through next-generation sequencing (NGS), and the 10 Mb
inversion in MSH2 was not detected
(Rhees et al., 2014).
Another maternal aunt of the index case was diagnosed with CRC at age 35.
Her daughter was diagnosed with endometrial cancer at age 38; the tumor
demonstrated MSI-H and loss of expression of MSH2 and MSH6 proteins by
IHC. This family member was initially tested in 2007 by MLH1 and MSH2 sequencing and large rearrangement analysis in a reference laboratory
and was found to have an MSH2 intron 12 rearrangement, which
was classified as a variant of uncertain significance (VUS). Multiple
family members affected with colon or endometrial cancer were tested and
no mutation was identified, although tumor tissues from several
individuals showed loss of MSH2 and MSH6 proteins by
IHC. A weak, aberrant, larger transcript was
identified but not further characterized in lymphocyte RNA isolated from
one of these family members who was affected with colorectal cancer (age
38) that showed loss of MSH2 and MSH6 expression (Fig. 2a). Additional
Southern blot analysis on genomic DNA of the same patient indicated the
presence of a 3 kb insertion, possibly a large LINE-1 or SVA insertion
(Fig. 2b); restriction fragment analysis narrowed the location of
the insertion to a 1.45 kb region around MSH2 exon 12 (Fig. 2c). The
same rearrangement was shown with Southern blot analysis in the genomic
DNA from another more distantly related family member; this individual
presented with endometrial cancer at age 34, and showed loss of MSH2 and
MSH6 in tumor tissue. However, the type of retrotransposon and the exact
genomic location of the insertion were not determined.
The proband was seen at the Clinical Genetics Service (CGS) at Memorial
Sloan Kettering Cancer Center (MSKCC). IHC
analysis indicated loss of MSH2 and MSH6 proteins in the tumor. No MSH2 inversion was detected. Given the strong family history of
colon cancer, a colorectal multi-gene panel test (sequencing and large
rearrangement analysis of APC, EPCAM (large rearrangement
only), MLH1, MSH2, MSH6, MUTYH, PMS2, POLD1, and POLE, with add-on
genes PTEN, BRCA1, and BRCA2) was performed at the Diagnostic
Molecular Genetics laboratory at MSK. Testing identified aberrant
sequences before position c.1954 and after position c.1972 in exon 12 of MSH2 (Fig. 3a). No other mutations or VUSs were identified in the
remaining eleven genes analyzed. The copy number of MSH2 exon 12
was normal based on our next-generation sequencing (NGS) analysis (Fig.
3b), which was confirmed by multiplex ligation-dependent probe
amplification (MLPA) (Fig. 3c), indicating that the aberrant sequence
was probably not due to a genomic deletion or duplication of the coding
region of MSH2.
To investigate the nature and origin of the abnormal sequence,
long-range (LR) PCR was performed on genomic DNA from the patient using
the TaKaRa LA PCR Kit according to the manufacturer’s protocol, with an
M13-tagged forward primer located in intron 11 (5’-GTA AAA CGA CGG CCA
GT GGGTTTTGAATTCCCAAATG-3’) and an M13-tagged reverse primer in intron
12 (5’-CAG GAA ACA GCT ATG AC AAAACGTTACCCCCACAAAG-3’). One band of about
400 bp in length was present in negative controls (Fig. 4a). Another
band of a larger size (~3–4 kb) was observed in the
patient but absent in the negative controls (Fig. 4a). The larger
aberrant fragment was extracted and sequenced with M13 forward and
reverse primers. Sequence analysis of the extracted aberrant fragment
revealed a target site duplication of 19 bp, with the location of the
insertion at c.1972, and part of the inserted sequence (660 bp) in an
antisense orientation with respect to the MSH2 transcription
direction (Fig. 4b, 4c). RepeatMasker
(http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker) indicated
that the inserted sequence belongs to an SVA element. To map the inserted
sequence, we performed a BLAST search
(https://blast.ncbi.nlm.nih.gov/Blast.cgi, human genome GRCh38.p12
reference, Annotation Release 109). The 27 hits were distributed across
all chromosomes except chromosomes 16, 18, and Y. The first alignment
showed close to 100% identity between 559 bp of the inserted sequence
(except one nucleotide) and a region on chromosome 3 (Chr3: 48,210,600 -
48,211,258). RepeatMasker indicated that an SVA repeat was present
at this location.
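The sequence bookkeeping in this step can be sketched in a few lines of Python. The sketch below separates each M13-tagged primer quoted above into its universal tag and its gene-specific portion, computes the length of an inclusive coordinate span such as c.1954_1972, and reverse-complements a sequence, which is the operation needed when an insert is read in antisense orientation. The function names are hypothetical, and treating the space-separated leading portion of each primer as the M13 tag is an assumption made for illustration; this is not the software used in the study.

```python
# Illustrative helpers only; names are hypothetical, not the study's pipeline.

# Primers exactly as quoted in the text (whitespace is removed by clean()).
FORWARD_PRIMER = "GTA AAA CGA CGG CCA GT GGGTTTTGAATTCCCAAATG"
REVERSE_PRIMER = "CAG GAA ACA GCT ATG AC AAAACGTTACCCCCACAAAG"

# Assumption: the spaced leading portion of each primer is the M13 tag.
M13_FORWARD_TAG = "GTAAAACGACGGCCAGT"
M13_REVERSE_TAG = "CAGGAAACAGCTATGAC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace and upper-case a nucleotide string."""
    return "".join(seq.split()).upper()


def split_tag(primer, tag):
    """Return (tag, gene_specific_part) for a 5'-tagged primer."""
    primer = clean(primer)
    if not primer.startswith(tag):
        raise ValueError("primer does not start with the expected tag")
    return tag, primer[len(tag):]


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def inclusive_span(start, end):
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


if __name__ == "__main__":
    _, fwd_specific = split_tag(FORWARD_PRIMER, M13_FORWARD_TAG)
    _, rev_specific = split_tag(REVERSE_PRIMER, M13_REVERSE_TAG)
    print("forward gene-specific:", fwd_specific)
    print("reverse gene-specific:", rev_specific)
    # c.1954_1972 spans 19 positions, matching the reported duplication.
    print("span length:", inclusive_span(1954, 1972))
```

Keeping the tag separate from the gene-specific portion makes it easy to confirm that sequencing with the universal M13 primers reads into the amplified fragment from either end.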
The sequence from 48,210,600 to 48,220,000 in chromosome 3 was retrieved
from the NCBI database and sequencing primers were designed based on the
retrieved sequence. Apart from a short gap in the VNTR region that could
not be sequenced due to its repetitive structure, we were able to
decipher a total of 2,937 bp of inserted sequence, excluding the
polyA tail, in the antisense strand (Fig. 5a). The inserted sequence
identified in our proband starts with a guanine nucleotide followed by a
295 bp exon 1 of the MAST2 gene, an Alu-like element (37 bp), an
approximately 2.2 kb VNTR region with tandem repeats ranging from 37 to
54 bp long, a SINE-R region (492 bp), the putative polyadenylation
signal AATAAA, and a long polyA tail. The inserted sequence is followed
by a target site duplication (TSD) of 19 bp (MSH2 c.1954_1972)
at the 3’ end of the insertion (Fig. 4c, 5a). Interestingly, the inserted
sequence is highly similar to the sequence from 48,210,600 to 48,213,111 in
chromosome 3, except that the insertion has a longer VNTR (Fig. 5a, 5b).
Further analysis characterized it as a member of a human-specific SVA subfamily of
retrotransposons termed SVA_F1, which contains a MAST2 5’
transduction group and is a fusion of MAST2 exon 1, containing a CpG
island, and a 5’-truncated SVA (Bantysh &
Buzdin, 2009; Damert et al., 2009;
Hancks, Ewing, Chen, Tokunaga, & Kazazian,
2009) (Fig. 5c, 5d). Seventy-six members have been identified in the
SVA_F1 subfamily in the human genome. In 96% of SVA_F1 members, the
SVA element insert starts with a guanine residue
(Bantysh & Buzdin, 2009), and the SVA_F1
insertion in this case also starts with a ‘G’ (Fig. 4c, 5a).
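As a rough consistency check on the insert structure described above, the following Python sketch tallies the component lengths stated in the text and derives the length left over for the VNTR region. It rests on assumptions that the text does not confirm: that the components are contiguous and non-overlapping, that the 2,937 bp total includes the six-nucleotide AATAAA signal, and that the SINE-R length does not already include it. The labels and function name are hypothetical.

```python
# Back-of-envelope tally; component labels and function name are illustrative.

# Lengths in bp as stated in the text.
FIXED_COMPONENTS = {
    "leading G": 1,
    "MAST2 exon 1": 295,
    "Alu-like element": 37,
    "SINE-R": 492,
    "AATAAA signal": 6,  # assumption: counted separately from SINE-R
}

# Total deciphered insert length, excluding the polyA tail (from the text).
TOTAL_DECIPHERED_BP = 2937


def implied_vntr_length(total_bp, fixed_components):
    """Length remaining for the VNTR after subtracting the other components."""
    return total_bp - sum(fixed_components.values())


if __name__ == "__main__":
    vntr_bp = implied_vntr_length(TOTAL_DECIPHERED_BP, FIXED_COMPONENTS)
    print("sum of non-VNTR components:", sum(FIXED_COMPONENTS.values()))
    print("implied sequenced VNTR length:", vntr_bp)
```

Under these assumptions the residual is about 2.1 kb, the same order as the approximately 2.2 kb VNTR described, bearing in mind that a short stretch of the VNTR could not be sequenced.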
The SVA insertion in MSH2 exon 12 likely occurred through
LINE-1-mediated retrotransposition as it exhibits several classical
features of this process (Hancks et al.,
2009; Raiz et al., 2012) as shown in
Fig. 4c and 5a: (1) insertion at the consensus LINE-1 endonuclease cleavage
site 5’-TTTT/AA-3’ (where “/” denotes the cleavage site); (2) the
presence of a direct repeat TSD of 19 bp in length, within the size
range of 4–20 bp that is typical for LINE-1-mediated
retrotranspositions; (3) a long polyA tail preceded by the putative
polyadenylation signal AATAAA; and (4) the presence of 5’ transduction and
truncation, a structural variation found in more than 8% of all SVA
elements in the human genome (Damert et
al., 2009; Raiz et al., 2012;
Wang et al., 2005).
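Two of the hallmarks listed above lend themselves to a simple programmatic check. The Python sketch below tests whether a TSD length falls in the 4–20 bp range quoted in the text and whether the 3’ end of an insert carries an AATAAA signal followed by a terminal polyA run. The function names, the thresholds `min_tail` and `max_gap`, and the toy sequence are hypothetical choices for illustration; they are not taken from the study, and the toy sequence is not the patient's insert.

```python
import re

# TSD size range quoted in the text as typical for LINE-1-mediated events.
TSD_TYPICAL_RANGE = (4, 20)


def tsd_in_typical_range(tsd_length):
    """True if the TSD length lies within the typical 4-20 bp range."""
    low, high = TSD_TYPICAL_RANGE
    return low <= tsd_length <= high


def has_signal_then_polya(seq, min_tail=10, max_gap=30):
    """True if AATAAA is followed, within max_gap bases, by a terminal A run.

    min_tail and max_gap are arbitrary illustrative thresholds.
    """
    pattern = re.compile(r"AATAAA[ACGT]{0,%d}?A{%d,}$" % (max_gap, min_tail))
    return pattern.search(seq.upper()) is not None


def summarize_hallmarks(tsd_length, insert_3prime_end):
    """Collect the two checks into a dictionary for reporting."""
    return {
        "tsd_in_typical_range": tsd_in_typical_range(tsd_length),
        "signal_then_polya": has_signal_then_polya(insert_3prime_end),
    }


if __name__ == "__main__":
    # Hypothetical 3' end: signal, short spacer, then a 25-nt polyA run.
    toy_3prime_end = "GCGTAATAAAGTCTGC" + "A" * 25
    # 19 bp is the TSD length reported in the text.
    print(summarize_hallmarks(19, toy_3prime_end))
```

The cleavage-site motif and the 5’ transduction are left out of this sketch because checking them requires flanking genomic sequence and strand information that a short example cannot represent faithfully.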
In summary, we describe here for the first time an SVA insertion into
the coding sequence of MSH2 mediated by LINE-1 protein machinery.
Precise localization of the SVA insertion and determination of the specific SVA
sequence in the MSH2 gene are important for cancer management, to
guide genetic testing of family members and potentially preimplantation
genetic testing. Furthermore, cancer-affected family members identified
as having Lynch syndrome may further benefit from immune checkpoint
inhibitors, which are FDA-approved for MMR-deficient and MSI-H tumors,
the hallmark of Lynch syndrome-associated tumors. Therefore,
identification and characterization of SVA elements and their roles
in cancer predisposition genes pave the way for genomic precision
medicine and cancer prevention and therapy.