Graham McVicker edited Correcting for incorrect genotype calls.tex  over 9 years ago

Commit id: 4f41df835aefb601e21a8396847fe7d802a3f6f1


\subsection{Correcting for incorrect genotype calls}
SNP genotypes that are incorrectly called as heterozygous are a major source of false positives, because the reads that overlap them appear to come from only one allele. To account for this, we assume that allele-specific reads are drawn from a mixture of two beta-binomials with mixture weights $H_{ik}$ and $1-H_{ik}$, where $H_{ik}$ is the probability that individual $i$ is heterozygous for SNP $k$. Reads from heterozygous individuals contain the reference allele with probability $p_{h}$. We assume that reads from homozygous individuals still have a small probability of coming from the other allele due to sequencing errors, which occur with probability $p_{\textrm{err}}$. The probability of observing $y_{ik}$ reads from the reference allele for individual $i$ at SNP $k$ then becomes:
\begin{eqnarray*}  & \Pr_{\mathrm{BB-mix}}\left(Y = y_{ik} \left| p_{h}, n_{ik}, \Upsilon_i \right. \right) =