Authorea

Charith Bhagya Karunarathna edited untitled.tex almost 8 years ago

Commit id: 66025b9a8ef4151ea6bede3b4e68aac7a0c5341e


The variable-threshold (VT) test of Price et al.\cite{Price_2010} is based on the regression of phenotypes on counts of mutations that meet a MAF threshold. Price et al. assume that variants with MAF below some threshold are more likely to be functional than variants with higher MAF. For each possible MAF threshold, a genotype score is computed based on a given collapsing scheme. The chosen MAF threshold is the one that maximizes the association signal, and permutation testing is used to adjust for the multiple thresholds considered. In their simulations, Price et al. found that the VT approach had high power to detect association between rare variants and a disease trait when the effects are in one direction. Unlike the VT test, the C-alpha test of Neale et al. is a non-burden, single-gene association method that improves power over burden tests, especially when the effects of rare variants are in different directions. The C-alpha test examines the variance of genetic effects under the assumption that the rare variants observed in cases and controls are a mixture of deleterious, protective and neutral variants. We applied both the VT test and the C-alpha test across the simulated region using sliding windows of 20 SNVs overlapping by 5 SNVs.

\subsection{Joint-modeling method}

CAVIARBF\cite{Chen_2015} is a fine-mapping method that uses marginal test statistics for the SNVs and the pairwise associations among them within a Bayesian framework. This method approximates the Bayesian multivariate regression implemented in BIMBAM\cite{Servin_2005}. Chen et al. found that both CAVIARBF and BIMBAM perform better than PAINTOR and other methods. However, CAVIARBF is much faster than BIMBAM because it computes Bayes factors using only the SNVs in each causal model. These Bayes factors can be used to calculate the probability that each SNV in the region is causal. To compute this probability, a set of models and their Bayes factors has to be considered.
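As a sketch in generic notation, the way model-level Bayes factors combine into a per-SNV probability can be written as a posterior inclusion probability. The symbols $\mathcal{M}$, $\mathrm{BF}_M$, $\pi(M)$ and $D$ are introduced here for illustration and are assumptions, not notation taken from Chen et al.; the exact prior and parametrization used by CAVIARBF may differ.
\[
P(\mathrm{SNV}\ j\ \mathrm{causal} \mid D) \;=\; \frac{\sum_{M \in \mathcal{M} :\, j \in M} \mathrm{BF}_M \, \pi(M)}{\sum_{M' \in \mathcal{M}} \mathrm{BF}_{M'} \, \pi(M')}
\]
Here $\mathcal{M}$ is the set of causal models considered, $\mathrm{BF}_M$ is the Bayes factor of model $M$ against the null model with no causal SNV (so the null model has Bayes factor 1), $\pi(M)$ is the prior probability of model $M$, and $D$ denotes the data. The numerator sums over the models that include SNV $j$, and the denominator sums over all models in $\mathcal{M}$.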
Let $p$ be the total number of SNVs in a candidate region; the number of possible causal models is then $2^p$. Since evaluating all models is infeasible for large $p$, the approach limits the number of causal variants allowed in a model. This limit reduces the number of models to evaluate to $\sum_{i=0}^{l} \dbinom{p}{i}$, where $l$ is the maximum number of causal SNVs in a model. Since there are 2630 SNVs in our data, we set $l=2$ to keep the computational load down, which leaves $1 + 2630 + \dbinom{2630}{2} = 3{,}459{,}766$ models.\\ Elastic-net\cite{Zou_2005} is a hybrid regularization and variable selection method that linearly combines the $L_1$ and $L_2$ regularization penalties of the Lasso\cite{Tibshirani_2011} and Ridge\cite{Cessie_1992} methods in multivariate regression.
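
To make the combination of penalties concrete, one common parametrization of the elastic-net criterion for a linear model is sketched below. The symbols $y$ (phenotype vector), $X$ (genotype matrix), $\beta$ (coefficient vector) and the tuning parameters $\lambda_1, \lambda_2 \ge 0$ are notation assumed here for illustration; the parametrization used in a particular software implementation may differ.
\[
\hat{\beta} \;=\; \arg\min_{\beta} \left\{ \| y - X\beta \|_2^2 \;+\; \lambda_1 \|\beta\|_1 \;+\; \lambda_2 \|\beta\|_2^2 \right\}
\]
Setting $\lambda_2 = 0$ recovers the Lasso criterion, and setting $\lambda_1 = 0$ recovers the Ridge criterion. The $L_1$ term is what sets some coefficients exactly to zero and thereby performs variable selection.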