Introduction

Brief literature review

Most genetic association studies focus on common variants.
Yet rare genetic variants can play major roles in influencing complex traits \cite{Pritchard_2001,Schork_2009}.
The rare susceptibility variants identified through sequencing have the potential to explain some of the `missing heritability' of complex traits \cite{Eichler_2010}.
However, standard methods to test for association with single genetic variants are underpowered for rare variants unless sample sizes are very large \cite{Lee_2014}.
The lack of power of single-variant approaches holds in fine-mapping as well as genome-wide association studies.
In this report, we are concerned with fine-mapping a genomic region that has been sequenced in cases and controls to identify disease-risk loci.
A number of methods have been developed to evaluate disease association with single variants and with multiple variants in a genomic region.
Besides single-variant methods, we consider three broad classes of methods for analysing sequence data: pooled-variant, joint-modelling and tree-based methods.
Overview of three types of analysis methods (besides single-variant methods)
- Pooled-variant methods evaluate the cumulative effects of multiple genetic variants in a genomic region. The score statistics from marginal models of the trait association with individual variants are collapsed into a single test statistic, either by combining the information for multiple variants into a single genetic score or by evaluating the distribution of the pooled score statistics of individual variants \cite{Lee_2014}.
- Joint-modelling methods estimate the effects of multiple genetic variants simultaneously. These methods can assess whether a variant carries any further information about the trait beyond what is explained by the other variants. When trait-influencing variants are in low linkage disequilibrium, this approach may be more powerful than pooling test statistics for marginal associations across variants \cite{Cho_2010}.
- Tree-based methods.
  - A local genealogical tree represents the ancestry of the sample of haplotypes at each locus in the genomic region being fine-mapped.
  - Haplotypes carrying the same disease risk alleles are expected to be related and cluster on the genealogical tree at a disease risk locus.
  - Tree-based methods assess whether trait values co-cluster with the ancestral tree for the haplotypes (e.g., \citeNP{Bardel_2005}).
  - \citeNP{Mailund_2006} have developed a method to reconstruct genealogies and score them according to how cases and controls cluster.
- Review Burkett et al. study briefly(!), what it found.
  - In practice, true trees are unknown. However, cluster statistics based on true trees represent a best case for detecting association because tree uncertainty is eliminated.
  - Burkett et al. use known trees to assess the effectiveness of such a tree-based approach for detection of rare, disease-risk variants in a candidate genomic region under various models of disease risk in a haploid population.
  - They found that Mantel statistics computed on the known trees outperform popular methods for detecting rare variants associated with disease.
  - Following Burkett et al., we use clustering tests based on true trees as benchmarks against which to compare the popular association methods.
  - However, unlike Burkett et al., who focus on detection of disease-risk variants, we focus here on localization of the association signal in the candidate genomic region. Moreover, we use a diploid disease model instead of a haploid disease model.
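
As a point of reference for the Mantel statistics mentioned above, the generic form of a Mantel test can be sketched as follows. The symbols $d^{T}_{ij}$, $d^{Y}_{ij}$, $Z$, $B$ and $\pi_b$ are notation introduced here for illustration only, and the particular tree and trait distances used by Burkett et al. are not specified in this section. For $n$ sampled sequences, let $d^{T}_{ij}$ denote a distance between sequences $i$ and $j$ on the genealogical tree at a given locus, and let $d^{Y}_{ij}$ denote a distance between their trait values. The Mantel statistic is the sum of cross-products of the two sets of pairwise distances:
\begin{equation}
Z = \sum_{1 \le i < j \le n} d^{T}_{ij} \, d^{Y}_{ij} .
\end{equation}
Significance is commonly assessed by permutation. Assuming $B$ random permutations $\pi_1, \ldots, \pi_B$ of the trait labels, with the tree held fixed, a permutation $p$-value can be written as
\begin{equation}
Z^{(b)} = \sum_{1 \le i < j \le n} d^{T}_{ij} \, d^{Y}_{\pi_b(i)\pi_b(j)} ,
\qquad
\hat{p} = \frac{1 + \#\{\, b : Z^{(b)} \ge Z \,\}}{B + 1} .
\end{equation}
When both $d^{T}$ and $d^{Y}$ are dissimilarities, co-clustering of trait values on the tree corresponds to an observed $Z$ that is unusually large relative to its permutation distribution.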