MaxSpliZer: A Tool to Predict Effects of Splice Variants Based on the Maximum Entropy Model
Abstract
The Clinical Genome Resource Consortium (ClinGen) recommends the MaxEntScan (MES) model to predict the effects of LDLR splice variants. We developed “MaxSpliZer”, a software tool that automates the implementation of MES, and validated it using ClinVar and UK Biobank (UKBB) data. We tested the concordance of MaxSpliZer predictions with ClinVar pathogenicity classifications for variants in LDLR and FBN1 with a potential effect on splicing. We also annotated LDLR splice variants in 200,618 UKBB participants, categorizing them using MaxSpliZer as deleterious (n=90) or non-deleterious (n=7,404). Low-density lipoprotein cholesterol (LDL-C) levels were compared between these two groups after adjustment for lipid-lowering medication use. MaxSpliZer predictions were concordant with the ClinVar classification for 96% of LDLR variants and 98% of FBN1 variants. In the UKBB, splice variants predicted as deleterious by MaxSpliZer were associated with higher LDL-C than those predicted as non-deleterious (158.7±47.4 vs. 146.0±34.8 mg/dL, p = 0.014). Compared with a manual curation time of 12±7 min per variant, MaxSpliZer took 0.52±0.11 min for single entries and 1.5 s per variant for biobank-scale data. MaxSpliZer, a software tool that implements MES based on the ClinGen guideline, can accurately classify splice variants in a rapid, automated fashion.
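The abstract does not state which statistical test produced the LDL-C comparison. As a rough sanity check only, the sketch below computes a Welch (unequal-variance) t statistic directly from the reported summary values. It assumes that the ± values are standard deviations and that the group sizes are n=90 and n=7,404; it does not model the adjustment for lipid-lowering medication, so it is not the authors' analysis. The function name `welch_t_from_summary` is hypothetical.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of the mean, group 1
    v2 = sd2 ** 2 / n2  # squared standard error of the mean, group 2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Summary values as reported in the abstract (mg/dL).
# Assumption: the +/- values are standard deviations, not standard errors.
t_stat, df = welch_t_from_summary(158.7, 47.4, 90, 146.0, 34.8, 7404)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

Converting the statistic to a two-sided p-value requires the Student t distribution, for example `2 * scipy.stats.t.sf(abs(t_stat), df)` if SciPy is available. Because the smaller group has only 90 members, its variance dominates the standard error of the difference.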