Charith Bhagya Karunarathna edited subsection_Analysis_and_Approaches_begin__.tex  over 7 years ago

Commit id: 8e1aed31b43bc1dd23cb62cda7793f1d164be0d9


\item Elastic-net \cite{Zou_2005} is a hybrid regularization and variable selection method that linearly combines the L1 and L2 regularization penalties of the Lasso \cite{Tibshirani_2011} and Ridge \cite{Cessie_1992} methods in multiple regression.
\begin{itemize}
\item This combination of Lasso and Ridge penalties provides more precise prediction than standard multiple regression when SNVs are in high linkage disequilibrium \cite{Cho_2009}.
\item In addition, the elastic-net can accommodate situations in which the number of predictors exceeds the number of observations. We used the elastic-net to select risk SNVs by considering only the main effects.% WE CONSIDER ONLY MAIN EFFECTS FOR SNVs IN OUR ELASTIC NET MODELS.
\item We used the SNV's variable inclusion probability (VIP), a frequentist analog of the Bayesian posterior inclusion probability, as a measure of its importance for predicting disease risk \cite{Cho_2010}.
\item To obtain the VIP for a SNV, we re-fitted the elastic-net model using $100$ bootstrap samples and calculated the proportion of samples in which the SNV was included in the fitted model.
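
As an illustrative sketch, in notation that is assumed here rather than taken from the cited references, let $\beta$ denote the vector of SNV main effects, $\ell(\beta)$ the regression loss, and $\lambda_1, \lambda_2 \ge 0$ the tuning parameters of the L1 and L2 penalties. The elastic-net estimate can then be written as
\[
  \hat{\beta} = \arg\min_{\beta} \left\{ \ell(\beta) + \lambda_1 \sum_{j} |\beta_j| + \lambda_2 \sum_{j} \beta_j^2 \right\},
\]
so that $\lambda_2 = 0$ gives the Lasso penalty and $\lambda_1 = 0$ gives the Ridge penalty. Writing $\hat{\beta}_j^{(b)}$ for the fitted coefficient of SNV $j$ in bootstrap sample $b$, with $B = 100$ bootstrap samples and $I(\cdot)$ the indicator function, the VIP described above corresponds to
\[
  \mathrm{VIP}_j = \frac{1}{B} \sum_{b=1}^{B} I\!\left( \hat{\beta}_j^{(b)} \neq 0 \right),
\]
that is, the proportion of bootstrap fits in which SNV $j$ has a non-zero coefficient.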

\item Blossoc aims to localize the risk variants by reconstructing genealogical trees at each SNV. This method approximates perfect phylogenies for each SNV, assuming an infinite-sites model of mutation, and scores them according to the non-random clustering of affected individuals.
\item The underlying idea is that genomic regions containing SNVs with high clustering scores are likely to harbour risk variants.
\item Blossoc can be used for both phased and unphased genotype data. However, the method is impractical to apply to unphased data with more than a few SNVs due to the computational burden associated with phasing.
\item We therefore assumed the SNV data are phased, as might be done in advance with a fast-phasing algorithm such as fastPHASE \cite{Scheet_2006}, BEAGLE \cite{Browning_2011}, IMPUTE2 \cite{Howie_2009} or MACH \cite{Li_2010,Li_2009}, and evaluated Blossoc with the phased haplotypes, using the probability-score criterion, which is the recommended scoring scheme for small datasets \cite{Mailund_2006}.
%%%%% Mantel test
\item In practice, the true trees are unknown. However, the cluster statistics based on true trees represent a best case insofar as tree uncertainty is eliminated \cite{Burkett_2014}.
\item We therefore include two versions of the Mantel test as a benchmark for comparison.
\begin{itemize}