The genetics of Alzheimer’s disease; what has GWAS ever done for us?

Introduction

Alzheimer’s disease (AD) is an incurable neurodegenerative disease and the most common form of dementia, affecting approximately 850,000 people in the United Kingdom (Janssen 1992) (Alzheimer’s Society 2014. Dementia 2014: An opportunity for change). AD typically manifests clinically as insidious cognitive decline with episodic memory loss and, neuropathologically, as gross atrophy of the temporal cortex. Neuronal neurofibrillary tangles of hyperphosphorylated tau protein and extracellular plaques of the amyloid-\(\beta\) peptide are the classic hallmarks of AD, and provide definitive evidence of AD post-mortem.

Familial forms of AD classically present early (early-onset Alzheimer’s disease, or EOAD) and are rare. EOAD is the result of highly penetrant, autosomal dominant mutations in genes on the ‘amyloid’ pathway (APP, PSEN1 and PSEN2). However, approximately 95 percent of AD cases are of late onset (late-onset Alzheimer’s disease, or LOAD). Typically presenting after the fifth decade, LOAD is aetiologically highly complex, involving multiple genetic and environmental risk factors. Although LOAD is not a familial disease, it has been estimated that upwards of 60 percent of LOAD liability is genetic (Van 2015).

The first methodological approach to pay dividends was linkage analysis, which identifies genetic loci that segregate with a disease phenotype among affected family members. In 1993, Corder et al. identified a haplotype in the apolipoprotein-E gene (APOE) on chromosome 19, which remains the strongest genetic risk factor for LOAD to this day (Corder 1993). One and two copies of the \(\epsilon\)4 haplotype increase LOAD risk approximately fourfold and sixteenfold respectively. However, despite this early success, it would be another twenty years until the next genetic risk factor for LOAD was discovered. In hindsight, APOE was the low-hanging fruit: because of the atypically large effect it imparts on a complex phenotype, APOE was uniquely amenable to a family-based linkage approach in a modest sample size. A fundamental change in approach, and in technology, would be required to identify smaller genetic effects in unrelated samples.

The era of the genome-wide association study

The completion of the Human Genome Project in 2003 ushered in a new era of genomics. Equipped with a reference genome, new initiatives could explore uncharted waters. The International HapMap Project (HapMap), launched in 2003, genotyped 1.6 million single nucleotide polymorphisms (SNPs) in 1184 samples from 11 different ethnic populations, with the aim of understanding genetic variability between individuals [http://hapmap.ncbi.nlm.nih.gov] (Altshuler 2010). For the first time it was possible to map the haplotype structure of the human genome and calculate linkage disequilibrium (LD) between SNPs. Because SNPs in strong LD carry largely redundant information, the genetic variability within a ‘gene of interest’ could now be captured with a smaller number of ‘tag-SNPs’, vastly reducing experiment cost and time.
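The idea behind tag-SNPs can be illustrated with a minimal Python sketch. It assumes phased haplotypes coded as 0/1 alleles, uses the squared correlation r^2 as the LD measure, and picks tags greedily until every SNP is in LD with at least one tag. The r^2 threshold of 0.8, the function names and the toy data are all assumptions made for illustration; this is not the procedure used by HapMap or by any particular tagging tool.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs.

    a, b: equal-length lists of 0/1 alleles, one entry per phased haplotype.
    """
    n = len(a)
    p_a = sum(a) / n                                  # frequency of allele 1 at SNP a
    p_b = sum(b) / n                                  # frequency of allele 1 at SNP b
    p_ab = sum(x & y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                              # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return 0.0 if denom == 0 else d * d / denom       # monomorphic SNPs give r^2 = 0


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Greedily choose tags so each SNP has r^2 >= threshold with some tag.

    haplotypes: dict mapping SNP name -> list of 0/1 alleles.
    The threshold of 0.8 is an illustrative assumption.
    """
    snps = list(haplotypes)
    # captured[s] holds every SNP that s would tag, including s itself.
    captured = {s: {s} for s in snps}
    for s, t in combinations(snps, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            captured[s].add(t)
            captured[t].add(s)

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: len(captured[s] & untagged))
        tags.append(best)
        untagged -= captured[best]
    return tags


# Hypothetical toy data: 4 SNPs observed on 8 haplotypes.
toy_haplotypes = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 0, 1, 1, 1, 1],  # identical to snp1
    "snp3": [1, 1, 1, 1, 0, 0, 0, 0],  # mirror image of snp1, still r^2 = 1
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the others
}
tags = greedy_tag_snps(toy_haplotypes)
```

In this toy example three of the four SNPs are in perfect LD, so genotyping two tags recovers the variation at all four sites. Real tagging has to work from much larger panels and usually from unphased genotypes, which this sketch ignores.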

October 2015 saw completion of the 1000 Genomes Project (1KP) which, like HapMap, sought to catalogue human genetic variability, albeit on a much finer scale (Auton 2015). The combination of whole-genome (7.4x) and deep whole-exome (65.7x) sequencing in over 2500 samples has produced a comprehensive catalogue of over 88 million variant sites, including rare and structural changes. This has several important applications, including designing arrays of rarer coding changes (‘exome chips’) and providing a reference set of haplotypes for genotype imputation.

Genome-wide association studies (GWAS) became a reality when commercial microarray providers (Affymetrix and Illumina) began to produce ‘SNP chips’ containing hundreds of thousands of non-redundant,