Christopher Medway edited Introduction.tex over 8 years ago

\usepackage{pdflscape}
\section{Introduction}
Alzheimer's disease (AD) is an incurable neurodegenerative disease and the most common form of dementia, affecting approximately 750,000 people in the United Kingdom. AD typically manifests as insidious cognitive decline with episodic memory loss and, neuropathologically, as gross cortical atrophy of the temporal cortex. Histologically, neuronal neurofibrillary tangles of hyperphosphorylated tau protein and extracellular plaques of the amyloid-$\beta$ peptide are the classic hallmarks of AD, and both are required for definitive diagnosis post-mortem. Familial forms of AD exist but are rare; they manifest early (early-onset Alzheimer's disease or EOAD) and are the result of highly penetrant, autosomal dominant mutations within genes on the 'amyloid' pathway (\textit{APP}, \textit{PSEN1} and \textit{PSEN2}). However, approximately 95 percent of AD cases are of late onset (late-onset Alzheimer's disease or LOAD). Typically presenting after the fifth decade, LOAD is aetiologically highly complex, involving multiple genetic and environmental risk factors. Although LOAD is not a familial disease, it has been estimated that upwards of 60 percent of LOAD liability is genetic. The first methodological approach to pay dividends was linkage analysis, which looks for loci that segregate with disease among affected family members. In 1993 ?? et al identified a haplotype in the apolipoprotein-E gene (\textit{APOE}) on chromosome 19, which remains the strongest risk factor for LOAD to this day. One or two copies of the $\epsilon$4 haplotype increase LOAD risk fourfold and sixteenfold respectively.
However, despite this early success, it would be another twenty years until the next genetic risk factor for LOAD was discovered. In hindsight \textit{APOE} was the low-hanging fruit; due to the substantial risk it imparts, \textit{APOE} was uniquely amenable to a family-based linkage approach in a modest sample size. A fundamental change in approach, and in technology, would be required to identify smaller genetic effects in unrelated samples.

\section{The era of the genome-wide association study}
The completion of the Human Genome Project in 2003 ushered in a new era of genomics. The reference genome empowered new initiatives to venture into uncharted waters. The International HapMap Project (HapMap), launched in 2003, set its sights on determining the variability between individuals by genotyping single nucleotide polymorphisms (SNPs) throughout hundreds of human genomes. Upon its completion in 2005 the project had genotyped 1.6 million SNPs in 1184 individuals from 11 different ethnic populations [http://hapmap.ncbi.nlm.nih.gov] \cite{20811451}. For the first time it was possible to map the haplotype structure of the human genome and calculate linkage disequilibrium (LD) between SNPs. Using this information, the genetic variability within a 'gene of interest' could be captured with a smaller number of 'tag-SNPs', vastly reducing cost and time. However, it was only when the commercial microarray providers Illumina and Affymetrix designed high-throughput arrays of non-redundant SNPs, capturing the genetic variability across the human genome, that the genome-wide association study (GWAS) became feasible. October 2015 saw completion of the 1000 Genomes Project (1KP). Like HapMap, 1KP provided a map of human genetic variability, albeit on a much finer scale.
The combination of whole-genome (7.4x) and whole-exome (65.7x) sequencing in over 2500 samples has enabled a more comprehensive catalogue of over 88 million variant sites to be discovered, including rare and structural changes. This has several important applications, including the design of arrays of rarer coding changes ('exome chips') and use as a database to exclude pathogenic variants. However, it is the ability to impute GWAS datasets with a larger set of variants using 1KP reference haplotypes that has been a game changer \cite{26432245}. The seminal GWAS was published in 2007 by the Wellcome Trust Case Control Consortium (WTCCC) \cite{17554300}. With a series of 3,000 healthy controls and 14,000 combined cases across seven common human diseases, the consortium identified 24 novel genetic associations with diabetes (types I \& II), coronary artery disease, Crohn's disease, rheumatoid arthritis and bipolar disorder. In order to obtain sufficiently large case-control series for GWAS, previously independent genetics groups began to form large consortia and use their combined resources to unearth genetic risk factors for complex human diseases. The era of the GWAS had arrived. According to the NHGRI-EBI GWAS catalogue, as of November 2015 there have been 2312 published GWAS \cite{24316577}.