Graham McVicker edited Modeling Allelic Imbalance.tex  over 9 years ago

Commit id: 61d53105f7b14563eac1872db511cb4bb8bc6972


\subsection{Modeling the allelic imbalances}
Allele-specific read counts are sometimes modeled using the binomial distribution \cite{Reddy_2012}; however, we have found that allele-specific read counts are overdispersed. We instead model allele-specific read counts with a beta-binomial (BB) distribution and include a parameter $\Upsilon_i$ (estimated separately) that captures the overdispersion for each individual. The likelihood of the data given the parameters is then:
\[
\textrm{L}\left(D \left| \alpha_h, \beta_h \right. \right) = \prod_i \prod_k \Pr_{\mathrm{BB}} \left( Y = y_{ik} \left| n_{ik}, p_h, \Upsilon_i \right. \right)
\]
where $y_{ik}$ is the number of allele-specific reads from the reference haplotype and $n_{ik}$ is the total number of allele-specific reads for individual $i$ at target SNP $k$. The expected fraction of allele-specific reads from the reference allele is $p_h = \frac{\alpha_h}{\alpha_h + \beta_h}$.
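
For concreteness, the beta-binomial term in the likelihood above can be written out explicitly. The text does not specify how $\Upsilon_i$ enters the distribution, so the following is only a sketch under one common parameterization: the shape parameters $a_{ih}$ and $b_{ih}$ are notation introduced here for illustration, and we assume that $\Upsilon_i \in (0, 1)$ acts as an intra-class correlation.
\[
\Pr_{\mathrm{BB}} \left( Y = y_{ik} \left| n_{ik}, p_h, \Upsilon_i \right. \right) = \binom{n_{ik}}{y_{ik}} \frac{\mathrm{B}\left( y_{ik} + a_{ih},\; n_{ik} - y_{ik} + b_{ih} \right)}{\mathrm{B}\left( a_{ih}, b_{ih} \right)}, \qquad a_{ih} = p_h \frac{1 - \Upsilon_i}{\Upsilon_i}, \quad b_{ih} = (1 - p_h) \frac{1 - \Upsilon_i}{\Upsilon_i}
\]
where $\mathrm{B}(\cdot, \cdot)$ is the beta function. Under this assumed parameterization the expected count is $n_{ik} p_h$ and the variance is $n_{ik} p_h (1 - p_h) \left[ 1 + (n_{ik} - 1) \Upsilon_i \right]$, so the binomial model is recovered as $\Upsilon_i \to 0$ and larger values of $\Upsilon_i$ correspond to stronger overdispersion.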