Graham McVicker edited Correcting read depth and GC.tex  almost 10 years ago

Commit id: e305d2af802a48fb40dcd36e0e96f7e8e28502a9

For each target region, $j$, we count the total number of reads across individuals, $v_j = \sum_i x_{ij}$, and calculate the GC content, $w_j$. Then, for each individual $i$, we find maximum likelihood estimates of the coefficients $a_{0i}, a_{1i}, \ldots, b_{4i}$ that define the adjusted total read depth $T^{*}_{ij}$, given the observed read counts and GC content (denoted $D_i$):
\[
\textrm{L}\left(a_{0i}, a_{1i}, \ldots, b_{4i} \left| D_i \right. \right) = \prod_j \Pr_{\mathrm{Pois}} \left(X_{ij} = x_{ij} \left| T^{*}_{ij} \right. \right)
\]
where
\[
T^{*}_{ij} = \exp\left(a_{0i} + a_{1i} w_j + a_{2i} w_j^2 + a_{3i} w_j^3 + a_{4i} w_j^4 \right) \left(b_{1i} v_j + b_{2i} v_j^2 + b_{3i} v_j^3 + b_{4i} v_j^4 \right)
\]
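
As a worked step, and assuming that $T^{*}_{ij}$ is the mean of the Poisson distribution for $X_{ij}$, taking the logarithm of the likelihood gives an objective that could be maximized numerically for each individual:
\[
\log \textrm{L}\left(a_{0i}, a_{1i}, \ldots, b_{4i} \left| D_i \right. \right) = \sum_j \left[ x_{ij} \log T^{*}_{ij} - T^{*}_{ij} - \log\left(x_{ij}!\right) \right]
\]
The final term does not depend on the coefficients, so it can be dropped during optimization.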