David Andrew Eccles edited Comparison With Simple Ranking.tex  almost 9 years ago

A ranking statistic is necessary for the bootstrap process to determine which markers are more likely to be associated with the phenotype of interest. The purpose of the ranking statistic is to rank the effectiveness of markers in distinguishing groups, rather than to give a precise indication of their utility. This means that the actual statistic used is not important, as long as it is generally able to rank an informative marker higher than a less informative marker. In this case, a genotype-based $\chi^2$ statistic was chosen for evaluating marker effectiveness. This statistic considers situations where a 

\label{tab:T1D-snprank-random10-10}
\end{table}

\subsubsection{Identifying Group-specific Markers}
\label{sec:sig-thy-gsms}

Markers that fall outside the top 5\% of markers in \emph{any} sub-sample are excluded from further analysis. When using this process on the T1D discovery group, a \emph{bootstrap-consistent set} of 458 group-specific markers (GSMs) was found in the top 24501\footnote{$24501 = \lfloor 0.05 \times 490032 \rfloor$} markers in \emph{all} 100 sub-samples. Of these GSMs, 182 (40\%) are located between 30~Mb and 33~Mb from the beginning of chromosome 6, near the HLA region. The remaining 276 GSMs are distributed fairly evenly throughout the genome (see Figure~\ref{fig:sig-thy-consistent-association-458}). From these observations of chromosomal location, T1D appears to have a very strong association signal near the HLA region on chromosome 6, and limited signal elsewhere in the genome.
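
The selection rule described above can be sketched in set notation. The symbols $M$, $B$, $k$, $r_b(m)$ and $G$ are introduced here for illustration only and are not taken from the original analysis. Let $M = 490032$ be the total number of markers, $B = 100$ the number of sub-samples, and $r_b(m)$ the rank of marker $m$ in sub-sample $b$, assuming that rank 1 is given to the marker with the highest ranking statistic. Under these assumptions, the rank cutoff and the bootstrap-consistent set are
\[
k = \lfloor 0.05 \times M \rfloor = \lfloor 24501.6 \rfloor = 24501,
\qquad
G = \bigcap_{b=1}^{B} \left\{ m : r_b(m) \le k \right\}.
\]
For the T1D discovery group this gives $|G| = 458$, of which $182 / 458 \approx 0.40$ lie in the 30--33~Mb window on chromosome 6 and the remaining $458 - 182 = 276$ lie elsewhere in the genome.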