Authorea

Graham McVicker edited Modeling Allelic Imbalance.tex over 9 years ago

Commit id: 61d53105f7b14563eac1872db511cb4bb8bc6972

\subsection{Modeling the allelic imbalances}
Allele-specific read counts are sometimes modeled using the binomial distribution \cite{Reddy_2012}; however, we have found that allele-specific read counts are overdispersed. We instead model allele-specific read counts with a beta-binomial (BB) distribution and include a parameter $\Upsilon_i$ (estimated separately) that captures the overdispersion for each individual. The likelihood of the data given the parameters is then:
\[
\textrm{L}\left(D \left| \alpha_h, \beta_h \right. \right) = \prod_i \prod_k \Pr_{\mathrm{BB}} \left( Y = y_{ik} \left| n_{ik}, p_h, \Upsilon_i \right. \right)
\]
where $y_{ik}$ is the number of allele-specific reads from the reference haplotype and $n_{ik}$ is the total number of allele-specific reads for individual $i$ at target SNP $k$. The expected fraction of allele-specific reads from the reference allele is $p_h = \frac{\alpha_h}{\alpha_h + \beta_h}$.
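The product over individuals and target SNPs can be illustrated with a minimal Python sketch that evaluates the log-likelihood. The text does not state how $\Upsilon_i$ enters the beta-binomial density, so the sketch assumes a mean/overdispersion parameterization with shape parameters $a = p_h (1 - \Upsilon_i) / \Upsilon_i$ and $b = (1 - p_h)(1 - \Upsilon_i) / \Upsilon_i$ for $0 < \Upsilon_i < 1$; this mapping, the function names and the input layout are all hypothetical and may differ from the parameterization actually used.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(y, n, p, upsilon):
    """log Pr_BB(Y = y | n, p, upsilon).

    ASSUMPTION: upsilon in (0, 1) is an overdispersion parameter and the
    beta-binomial has mean fraction p, with shape parameters
    a = p * (1 - upsilon) / upsilon and b = (1 - p) * (1 - upsilon) / upsilon.
    """
    a = p * (1.0 - upsilon) / upsilon
    b = (1.0 - p) * (1.0 - upsilon) / upsilon
    log_choose = (math.lgamma(n + 1) - math.lgamma(y + 1)
                  - math.lgamma(n - y + 1))
    return log_choose + log_beta(y + a, n - y + b) - log_beta(a, b)


def log_likelihood(alpha_h, beta_h, counts, upsilon):
    """log L(D | alpha_h, beta_h), summed over individuals i and SNPs k.

    counts  : dict mapping individual -> list of (y_ik, n_ik) pairs
    upsilon : dict mapping individual -> overdispersion (estimated separately)
    """
    p_h = alpha_h / (alpha_h + beta_h)  # expected reference fraction
    total = 0.0
    for i, sites in counts.items():
        for y_ik, n_ik in sites:
            total += betabinom_logpmf(y_ik, n_ik, p_h, upsilon[i])
    return total


# Toy data (made up for illustration): two individuals, two target SNPs each.
counts = {"ind1": [(5, 10), (6, 12)], "ind2": [(7, 14), (4, 9)]}
upsilon = {"ind1": 0.05, "ind2": 0.10}
balanced = log_likelihood(1.0, 1.0, counts, upsilon)    # p_h = 0.5
imbalanced = log_likelihood(9.0, 1.0, counts, upsilon)  # p_h = 0.9
```

Working on the log scale turns the double product into a sum and avoids numerical underflow when many individuals and SNPs are combined. Under this assumed parameterization the likelihood depends on $\alpha_h$ and $\beta_h$ only through $p_h$.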