Jinko Graham edited untitled.tex  almost 8 years ago

Commit id: b0016139eecd3737cdc9b5c1f2cef4f029bed238


\item Association studies based on next-generation sequencing data
\item Sequence data allow us to study both common and rare variants
\item The trees underlying the sequence data (where mutations occur on the tree). CAN YOU FLESH THIS OUT WITH SOME SUBPOINTS?
\item Using variant data simulated from coalescent trees, we investigate the ability of several methods to localize association signal within a candidate region harbouring risk variants. Our work extends that of Burkett et al., which investigated the ability to detect association signal in the candidate region.
\end{itemize}

\end{itemize}
\item Joint-modeling method
\begin{itemize}
  \item CAVIARBF \cite{Chen_2015}: A fine-mapping method that uses marginal test statistics for the SNVs and their pairwise association. It approximates the Bayesian multivariate regression implemented in BIMBAM \cite{Servin_2007}. CAN YOU DESCRIBE HOW BIMBAM MODELS ALL POSSIBLE COMBINATIONS OF 1, 2, 3, ETC. SNVS AND THEIR INTERACTION TERMS? THEN SAY THAT, TO KEEP THE COMPUTATIONAL LOAD DOWN, WE CONSIDERED ALL POSSIBLE COMBINATIONS OF SNVS UP TO PAIRS ONLY.
  \item Elastic-net \cite{Zou_2005}: A hybrid regularization and variable-selection method that linearly combines the L1 and L2 regularization penalties of the Lasso and Ridge methods in multivariate regression. WE DO NOT CONSIDER ANY INTERACTIONS AMONGST SNVS, ONLY MAIN EFFECTS, IN OUR ELASTIC NET MODELS.
  \begin{itemize}
    \item Particularly useful when the number of predictors exceeds the number of observations.
  \end{itemize}
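
  As a sketch of the combined penalty described in the Elastic-net item, and not necessarily the parameterization used in this study, the estimate can be written as
  \[
    \hat{\beta} = \arg\min_{\beta} \left\{ \ell(\beta) + \lambda \left[ \alpha \|\beta\|_{1} + \frac{1-\alpha}{2} \|\beta\|_{2}^{2} \right] \right\},
  \]
  where the notation is assumed here for illustration only: $\ell(\beta)$ is the loss (for example, squared error or a negative log-likelihood), $\beta$ holds one main-effect coefficient per SNV, $\lambda \ge 0$ sets the overall penalty strength and $\alpha \in [0,1]$ mixes the two penalties. Setting $\alpha = 1$ gives the Lasso penalty and $\alpha = 0$ gives the Ridge penalty.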

\item \citeNP{Mailund_2006} found Blossoc to be a fast and accurate method for localizing {\bf common} disease-causing variants, but how well does it work with rare variants?
\item Can use either phased or unphased genotype data. However, it is impractical to apply it to unphased data with more than a few SNPs due to the computational burden associated with phasing. We will therefore assume that the SNV data are phased, as might be done in advance with a fast phasing algorithm such as fastPHASE (ref), BEAGLE (ref), IMPUTE2 (ref) or MACH (ref).
\end{itemize}
\item True trees (MT-rank of the coalescent events, \citeNP{Burkett_2013}): Detect association between the disease trait and variants through their co-clustering on genealogical trees.
\begin{itemize}
  \item In practice, the true trees are unknown. However, the cluster statistics based on true trees represent a best case insofar as tree uncertainty is eliminated. A previous simulation study \cite{Burkett_2013} established the optimality of these true-tree-based clustering tests for detecting association. We therefore include such a clustering test based on true trees as a benchmark for comparison in this investigation of how well various methods localize (rather than detect) association signal.
  \item Upweight the short branches at the tip of the tree. (DESCRIBE BRIEFLY HOW WE ACHIEVE UPWEIGHTING OF THE SHORT BRANCHES AT THE TIPS). We assign a branch length of one to all branches, even the relatively longer branches that are close to the time to the most recent common ancestor. [NOW CAN REMOVE: The expected time for the final two of $k$ lineages to coalesce is $ E(T_{2}) \approx 0.5 \times E(TMRCA) $ for large $k$. So, if we rank the coalescence events (i.e.\ intercoalescence times are 1 time unit), $ T_{2} $ becomes 1, as does $T_{k}$. So, this has the effect of upweighting the short branches at the tips relative to the long branches near the root.]
  \item Success in localization was declared if the strongest signal was in the risk region.
\end{itemize}   \end{itemize}
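
For intuition about the rank-based branch lengths in the true-tree item, the standard coalescent expectations can be written out. This is only a sketch: it assumes the usual coalescent time scaling, under which $T_{j}$, the time during which the sample has $j$ lineages, has mean $2/(j(j-1))$; that scaling is an assumption made here and is not stated in the outline.
\[
  E(T_{j}) = \frac{2}{j(j-1)}, \qquad
  E(\mathrm{TMRCA}) = \sum_{j=2}^{k} E(T_{j}) = 2\left(1 - \frac{1}{k}\right), \qquad
  \frac{E(T_{2})}{E(\mathrm{TMRCA})} = \frac{k}{2(k-1)} .
\]
Under this scaling the deepest interval, $T_{2}$, is expected to take up about half of the tree height when $k$ is large, whereas the interval nearest the tips, $T_{k}$, takes up only $E(T_{k})/E(\mathrm{TMRCA}) = 1/(k-1)^{2}$. If every intercoalescence interval is instead assigned length one, each of the $k-1$ intervals takes up the same share, $1/(k-1)$, of the tree height, so the short intervals near the tips gain weight relative to the long intervals near the root.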