Bryce van de Geijn edited Correcting for unknown covariates using principal components.tex  over 9 years ago

Commit id: c18e5d071b89d621bad7646b181484bb0b83cffe


\subsection{Correcting for unknown covariates using principal components}  Both measurable covariates (such as time of experiment or age of sample) and unknown covariates affect molecular trait measurements and confound QTL studies. Principal component analysis (PCA) is commonly used to capture and remove these effects \cite{Stephens_Gilad_Pritchard_2010}\cite{Pickrell,XXXXX}. To leverage PCA while maintaining the discrete nature of the count data, the CHT models the covariate effects directly. To do this, we include a user-defined number of PCA loadings $u_{i\bullet}$ and fit coefficients $c_{h\bullet}$ when calculating $\lambda_{hi}$. \[  \lambda_{hi} = \left\{ 

\end{array} \right.  \]  Unfortunately, fitting too many coefficients simultaneously can make numerical optimization slow and inaccurate. However, since the principal components are by definition orthogonal, we can optimize their coefficients one at a time without losing accuracy. We then use the fitted coefficients to calculate $\lambda_{hi}$ for the null and alternative models.
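
The one-at-a-time idea can be illustrated with a minimal sketch. It rests on a loud assumption: ordinary least squares stands in for the CHT likelihood, which is not what the CHT optimizes. The vectors \texttt{u1}, \texttt{u2}, \texttt{noise} and the function \texttt{fit\_one\_at\_a\_time} are hypothetical names introduced only for this example. Under that simplification, fitting each coefficient on the running residual recovers the same values regardless of the order in which the orthogonal components are visited.

```python
# Illustrative only: least squares is a stand-in for the CHT likelihood.
# All names and numbers here are hypothetical, not taken from the CHT code.

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def fit_one_at_a_time(y, components):
    """Fit one coefficient per component, sequentially, on the running residual."""
    residual = list(y)
    coefs = []
    for u in components:
        c = dot(residual, u) / dot(u, u)  # closed-form 1-D least-squares fit
        coefs.append(c)
        residual = [r - c * x for r, x in zip(residual, u)]
    return coefs, residual

# Two mutually orthogonal "loadings" over four samples.
u1 = [1.0, 1.0, -1.0, -1.0]
u2 = [1.0, -1.0, 1.0, -1.0]
# Synthetic trait: 2*u1 - 3*u2 plus a part orthogonal to both components.
noise = [0.5, -0.5, -0.5, 0.5]
y = [2.0 * a - 3.0 * b + e for a, b, e in zip(u1, u2, noise)]

coefs, residual = fit_one_at_a_time(y, [u1, u2])
# Visiting the components in the opposite order yields the same coefficients.
coefs_rev, _ = fit_one_at_a_time(y, [u2, u1])
print(coefs, coefs_rev[::-1])
```

With correlated (non-orthogonal) covariates the sequential fit would generally depend on the visiting order, which is why orthogonality of the principal components matters for this shortcut.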