Authorea

Friederike Dündar details of TMM routine almost 9 years ago

Commit id: eff51ee76e05b733db9a238fd65c907b295e3cb6


\end{itemize}

To identify genes that are differentially expressed \textit{between two conditions}, we must therefore calculate the fraction of each gene's reads relative to the total number of reads. While the number of sequenced reads is known, the total RNA library and its complexity are unknown and can vary drastically from sample to sample (e.g., due to contamination as well as biological reasons).

\begin{table}[h!]
%\caption[Normalization methods I.]{\textsf{Normalization methods for the comparison of gene read counts between different conditions.}}\label{tab:normMethods}
\begin{small}
\begin{tabular}{lll}
\textbf{Name} & \textbf{Details} & \textbf{Comment} \\ %\tabularnewline
\toprule %-----------------------
Total Count & $\frac{gene\,read\,count}{total\,read\,number}$ & inappropriate if some genes are only expressed in one condition or are extremely highly expressed \\ %\tabularnewline
\midrule %-----------------------
Trimmed Mean of M-values (TMM) & 1. calculate gene-wise $\log_2$ fold changes (= M-values): $M_g = \log_2\left( \frac{ \frac{obs.\,gene\,count_1}{total\,read\,number_1} }{ \frac{obs.\,gene\,count_2}{total\,read\,number_2} } \right)$; 2. trimming: removal of the upper and lower 30\% of the M-values; 3. precision weighting: the inverse of the estimated variance is used as a weight to account for the lower variance of genes with larger counts
%$log_2(TMM^r_{gk}) = \frac{\displaystyle\sum_{g \in G^*} w^r_{gk}M^r_{gk}}{\displaystyle\sum_{g \in G^*} w^r_{gk}} $
& details in \citet{RobinsonOshlack2010} \\ %\tabularnewline
\midrule %-----------------------
DESeq & & \\ %\tabularnewline
\midrule %-----------------------
Upper quartile & & \\ %\tabularnewline
\midrule %-----------------------
Median & & \\ %\tabularnewline
\midrule %-----------------------
Quantile & & \\ %\tabularnewline
\bottomrule %-----------------------
\end{tabular}
\end{small}
\end{table}

\begin{itemize}
\item Total Count
\item Upper Quartile
\item Median
\item DESeq
\item TMM (Trimmed Mean of M-values) \citep{RobinsonOshlack2010}
\item Quantile
\end{itemize}

For the comparison of absolute gene expression values \textit{between different genes}, the differing gene lengths must also be taken into account.

\begin{itemize}
\item RPKM (reads per kilobase of exons per million mapped reads)