Number of disease-causing and benign MECP2 genetic
variants available
Of the 13 genotype-phenotype databases identified by Townend et al.
(2018), five did not meet the inclusion criteria for this study:
DisGeNET, dbSNP, dbVAR, Café Variome, and HGMD. DisGeNET, dbSNP, and
dbVAR did not provide unambiguous descriptions of variations, because the RS
identifier only indicates the location of a polymorphism; the
nucleotide change has to be derived from additional information that is
sometimes ambiguous. Café Variome provided only the protein change, which,
although highly relevant in itself, cannot be translated back to an
unambiguous genetic change. HGMD, the only commercial database, did not
allow re-use and redistribution of its content. The eight databases
that did fulfil our inclusion criteria, together with previously anonymized
data from local RTT patients, were used in this study (see Table 2). At the
time of research, these databases contained a total of 12,158 MECP2
variation entries. Individual databases contained between 34 (DECIPHER)
and 4,706 (RettBASE) MECP2 variations (Table 2). Between 15% and 100%
of these variations were unique database entries (occurring only once, in a
single database). Multiple entries of the same variation were found
frequently in disease-specific databases, which gives an indication of the
abundance of the variant and also confirms its pathogenicity. In
total, we identified 4,573 RTT-causing MECP2 variants (of which
863 were unique) in databases that annotate genetic information with a diagnosis
(RettBASE, ClinVar, Maastricht Rett dataset, KMD) and/or with clear phenotype
descriptions (DECIPHER) stating that the variants cause RTT or a similar
disorder (e.g., X-linked mental retardation) (intake criteria in Sup. Table 1). We
identified 617 benign MECP2 variants, of which 209 were unique,
in two of the databases that annotate variants with diagnosis information
(RettBASE and ClinVar); these were clearly stated to be benign. Nineteen
variants were annotated as both RTT-causing and benign (Sup. Table
2).
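The counting described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the record layout, the labels "pathogenic" and "benign", and the placeholder variant identifiers are all assumptions made for illustration, and "unique" is interpreted here as a variant that occurs exactly once across all entries.

```python
from collections import Counter, defaultdict

# Toy records (database, variant identifier, classification).
# All values are hypothetical placeholders, not data from this study.
entries = [
    ("RettBASE", "var_A", "pathogenic"),
    ("RettBASE", "var_A", "pathogenic"),
    ("ClinVar", "var_A", "pathogenic"),
    ("ClinVar", "var_B", "benign"),
    ("RettBASE", "var_B", "pathogenic"),
    ("DECIPHER", "var_C", "pathogenic"),
    ("ClinVar", "var_D", "benign"),
]


def summarize(records):
    """Return (number of distinct variants, single-entry variants, conflicts)."""
    counts = Counter(variant for _, variant, _ in records)
    labels = defaultdict(set)
    for _, variant, label in records:
        labels[variant].add(label)

    # Variants that occur exactly once across all databases.
    singletons = sorted(v for v, n in counts.items() if n == 1)
    # Variants annotated both as disease causing and as benign.
    conflicting = sorted(
        v for v, seen in labels.items() if {"pathogenic", "benign"} <= seen
    )
    return len(counts), singletons, conflicting


distinct, singletons, conflicting = summarize(entries)
print(distinct, singletons, conflicting)
```

In this toy input, var_B plays the role of the variants that carry both annotations and would need manual review.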
In total, we collected 12,158 MECP2 variant entries; curation and
integration reduced these to 10,968 variants (5,038 unique).
The processed datasets are available as CSV files on Google Drive
(link).
Of the 10,968 curated MECP2 variations, only 11 each occur in more
than 1% of all database entries; together, these account for 53.7% of all
database entries (data not shown).
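A frequency cut-off of this kind could be computed as in the following sketch. The function name `frequent_variants`, the default threshold argument, and the toy identifiers are assumptions for illustration only; the numbers do not reproduce the figures reported here.

```python
from collections import Counter


def frequent_variants(variant_ids, threshold=0.01):
    """Return variants whose share of all entries exceeds `threshold`,
    together with the combined share of entries they account for."""
    counts = Counter(variant_ids)
    total = sum(counts.values())
    frequent = {v: n for v, n in counts.items() if n / total > threshold}
    combined_share = sum(frequent.values()) / total
    return frequent, combined_share


# Hypothetical toy input: two recurrent variants and 100 single-entry variants.
toy = ["var_A"] * 60 + ["var_B"] * 40 + [f"rare_{i}" for i in range(100)]
frequent, share = frequent_variants(toy)
print(frequent, share)
```

In the toy input, each single-entry variant makes up 0.5% of the 200 entries and falls below the cut-off, while the two recurrent variants together cover half of all entries.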