Wen Jenny Shi edited sectionDiscussionlab.tex  over 9 years ago

Commit id: c0a10bd542e89895d5bf5821051e5886dea709e5


\section{Discussion}\label{Sec:discussion}
We introduce a Dirichlet mixture model for detecting and clustering changes in allele frequencies in DNA or RNA sequence data from a population sampled at different time points. This annotation-free approach is particularly useful for RNA viruses and other organisms where the secondary structure of the RNA can influence evolution in ways not predicted by standard analysis methods. To identify significant changes in allele frequency, our algorithm uses a combination of a hierarchical divisive clustering tree and a block Metropolis-Hastings sampler. This approach does not require a prior distribution on the number of mixture components. It automatically produces both an appropriate upper bound for the number of clusters (mixture components) and good initial states for the Gibbs sampler applied to the joint sequence. The hierarchical tree structure enables parallel computing and overcomes the computational difficulties that any direct Markov chain Monte Carlo method presents. Our method outperforms direct Gibbs approaches, with the additional benefits of avoiding an ad hoc choice of the number of mixture components and of computational efficiency gained from parallel computing. The threshold for identifying substitution sites is derived from a comparison of the posterior distributions for the time collections without treatment. It is chosen by examining the curvature in the graph of the number of members in the noise set, rather than by selecting an ad hoc cutoff.
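
The curvature-based choice of threshold can be illustrated with a simple discrete criterion. This is a sketch under stated assumptions and not necessarily the exact rule used in our analysis: the symbols $\tau_k$, $h$, $N(\tau)$ and $\tau^{*}$ are introduced here for illustration only. Suppose candidate thresholds $\tau_1 < \tau_2 < \dots < \tau_K$ lie on an equally spaced grid with spacing $h$, and let $N(\tau_k)$ denote the number of members in the noise set at threshold $\tau_k$. A second difference then serves as a proxy for the curvature of the graph of $N$:
\begin{equation}
\Delta^{2} N(\tau_k) = \frac{N(\tau_{k+1}) - 2N(\tau_k) + N(\tau_{k-1})}{h^{2}}, \qquad 1 < k < K,
\end{equation}
and one possible data-driven choice is the interior grid point at which the graph bends most sharply,
\begin{equation}
\tau^{*} = \arg\max_{1 < k < K} \left| \Delta^{2} N(\tau_k) \right|.
\end{equation}
Because $N(\tau_k)$ is a count and may be noisy, it may be preferable in practice to smooth the curve before taking differences.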