Association of a rare NOTCH4 coding variant with systemic sclerosis: a family-based whole exome sequencing study


Systemic sclerosis, also known as SSc or scleroderma, is an autoimmune disease characterized by a triad of microvascular dysfunction, immune dysfunction, and generalized fibrosis in connective tissues and organs [1]. One of the most concerning aspects of the disease is that mortality has not improved greatly over the last several decades because there is a critical lack of therapies to address the fibrotic process [2]. This urgent need for innovation in SSc is one of the motivations for the genetics community to explore the hereditary underpinnings of the condition. Genetic epidemiology has provided convincing evidence of familial aggregation, with increased risk to siblings and first-degree relatives as well as substantial epidemiologic overlap with other autoimmune diseases [3]. The etiology of the disease is multifactorial, with poorly understood environmental influences and a complex mode of genetic inheritance. Since SSc is a relatively rare disease, most cases appear sporadically, without family history [3]. Recent advances in genomic technology, such as high-density genotyping on microarrays, have made possible genome-wide association studies (GWAS) that have enhanced the genetic understanding of SSc.
The strongest genetic risk for SSc is conferred by variants in the major histocompatibility complex (MHC), which contains the human leukocyte antigen (HLA) genes [4], a pattern seen in a wide range of autoimmune diseases. The first large GWAS revealed associations with non-coding SNPs at a number of loci in addition to the HLA, including IRF5, STAT4, CD247, CDH7, and IRF4 [4]. Later GWAS on specific biomarkers and clinical phenotypes [5], as well as high-density genotyping of selected regions on the Immunochip [6], have yielded additional associations. A recent study used whole exome sequencing (WES) in a modest number of cases to identify protein-altering variants specifically, revealing a low-frequency variant in ATP8B4 that was enriched among SSc cases compared to controls (\(OR = 6.1\)) [7].
Of particular interest is a GWAS association with the NOTCH4 locus, which lies on chromosome 6p21 in proximity to the HLA region. This locus was associated with the presence of anti-centromere antibody (ACA) or anti-topoisomerase I antibody (ATA) in SSc (\(P < 8.84 \times 10^{-21}\), \(OR = 0.55\)), and the association was independent of the HLA class II associations [5]. The NOTCH4 locus has previously been associated, independently of the HLA, with other disorders, including the autoimmune diseases ulcerative colitis [8], rheumatoid arthritis [9], and alopecia areata [10], as well as age-related macular degeneration [11].
NOTCH4 is a member of a four-gene family (NOTCH1 to NOTCH4) and is expressed specifically in endothelial cells [12]. NOTCH proteins are transmembrane receptors activated by transmembrane ligands of the DSL family (Delta/Serrate/Lag-2). Based on structural investigation of the well-studied NOTCH1 family member, binding of the ligand triggers a conformational change in the negative regulatory region (NRR), which consists of LNR repeats and a heterodimerization (HD) region; the HD region in turn comprises a NOD and a NODP domain (NOTCH domain) [13, 14]. This conformational change in the NRR unmasks protease cleavage sites, leading to cleavage and release of the intracellular domain of the NOTCH1 receptor. The free intracellular domain translocates to the nucleus and binds to the DNA-binding transcription factor RBP-Jk, activating transcription (Fig. 1).
Activation of NOTCH4 in mouse models produces multiple phenotypic manifestations. Ectopic overexpression of the free NOTCH4 intracellular domain in mammary epithelium leads to oncogenic transformation and mammary carcinogenesis [14, 15]. Expression of the free intracellular domain in vascular endothelium is embryonic lethal, with disorganized vascular networks, fewer small vessels, and compromised vessel-wall integrity, demonstrating an important role for NOTCH4 signaling in the development of the vascular system [16]. The role of NOTCH4 in vascular development has significant implications for SSc because the pathological process is thought to be driven by damage to the microvasculature caused by dysfunctional endothelial cells. Morphological changes and activation of endothelial cells are often the earliest detectable signs of disease [17]. This vascular damage leads to a reduction in the number of small vessels, thickening of the vessel wall, and luminal narrowing, eventually resulting in tissue hypoxia [17]. The connection between vasculopathy and fibrosis is unclear but is under investigation.
Here we describe a family presenting with a three-generation history of SSc in an apparently autosomal-dominant mode of inheritance. We used whole exome sequencing to identify rare variants that segregate as expected in the pedigree and that might contribute to the development of the disease. Our characterization of a very rare missense variant in the NOTCH4 NODP domain is described below. The NODP domain is of particular interest because, in the homologous NOTCH1 receptor, mutations in this domain result in constitutive activation and consequent T cell acute lymphoblastic leukemia [18].



Whole exome sequence analysis

The SSc phenotype of the proband was determined by a senior pediatric rheumatologist and family history was confirmed.
After written informed consent was obtained, genomic DNA was extracted from the peripheral blood lymphocytes of the proband, mother, affected maternal aunt, unaffected maternal uncle, and unaffected maternal grandmother. Whole exome capture was carried out for the two patients and the unaffected maternal grandmother using the SureSelect Human All Exon version 3 kit (Agilent Technologies, Santa Clara, CA), according to the manufacturer’s protocols. Sequencing was carried out on the HiSeq 2000 instrument (Illumina, San Diego, CA) using the manufacturer’s recommended procedure. Next-generation sequencing reads were mapped with the Burrows-Wheeler Aligner (BWA) [19], and variants were called using the Genome Analysis Toolkit (GATK) [20]. Using ANNOVAR [21], the results were filtered under an autosomal dominant model to exclude synonymous variants, variants with a minor allele frequency greater than 0.5%, and variants previously identified in controls in our in-house exome variant database. ANNOVAR also produced the data in Supplementary Table 1, including functional impact scores (SIFT [22], PolyPhen2 [23], and GERP [24]). The kinship coefficient was calculated between every pair of samples with KING to confirm reported relationships [25]. Co-segregation patterns were confirmed by Sanger sequencing of standard PCR amplicons in the five family members whose DNA was available.
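The filtering logic described above can be illustrated with a minimal Python sketch. It is not the ANNOVAR-based pipeline used in the study; the `Variant` record, the sample identifiers, the gene names, and the example values are all hypothetical and serve only to show how the three exclusion criteria combine with a dominant segregation check. The sketch also assumes full penetrance (no unaffected carriers), which is a simplification not stated in the text.

```python
from dataclasses import dataclass

MAF_THRESHOLD = 0.005  # 0.5% minor allele frequency cutoff stated in the text


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant record (not an ANNOVAR format)."""
    gene: str
    effect: str              # e.g. "missense", "synonymous"
    maf: float               # population minor allele frequency
    in_house_control: bool   # seen in the in-house control exomes
    carriers: frozenset      # sequenced family members carrying the variant


def passes_filters(v, affected, unaffected):
    """Return True if the variant survives all filters."""
    if v.effect == "synonymous":
        return False
    if v.maf > MAF_THRESHOLD:
        return False
    if v.in_house_control:
        return False
    # Dominant segregation: every sequenced affected member carries the
    # variant. Assumption (full penetrance): no sequenced unaffected
    # member carries it.
    return affected <= v.carriers and not (unaffected & v.carriers)


# Hypothetical sample IDs for the three exome-sequenced individuals.
affected = frozenset({"affected_1", "affected_2"})
unaffected = frozenset({"unaffected_1"})

both = affected
variants = [
    Variant("GENE_A", "missense", 0.0001, False, both),    # kept
    Variant("GENE_B", "synonymous", 0.0001, False, both),  # synonymous
    Variant("GENE_C", "missense", 0.02, False, both),      # too common
    Variant("GENE_D", "missense", 0.0, True, both),        # seen in controls
    Variant("GENE_E", "missense", 0.0, False,
            frozenset({"affected_1"})),                    # not in all affected
    Variant("GENE_F", "missense", 0.0, False,
            both | unaffected),                            # unaffected carrier
]

candidates = [v.gene for v in variants
              if passes_filters(v, affected, unaffected)]
print(candidates)
```

In this toy input only the first variant survives. Relaxing the full-penetrance assumption would amount to dropping the final check on unaffected carriers.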