Bootstrap Sub-sampling of the Discovery Group

\label{sec:meth-summ-bootstrapping}

A bootstrap sub-sampling procedure was then carried out, generating 100 sub-sample replicates of the discovery group. Each replicate comprised 490 T1D cases and 364 NBS controls, drawn without replacement from the original discovery set so as to retain the case-to-control proportions of the original 981 cases and 729 controls. Within each replicate the SNPs were ranked by \(\chi^{2}\), and a bootstrap-consistent set of 458 SNPs was identified, each of which ranked in the top 5\% of SNPs (24501 SNPs) in every bootstrap sub-sample (see Figure \ref{fig:sig-thy-bootstrap-consistency}-A). Most of these 458 SNPs had a maximum rank across replicates below 5000, whereas most of the remaining 489574 SNPs had a maximum rank above 350000 (see Figure \ref{fig:sig-thy-bootstrap-consistency}-B).
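
As an arithmetic check on these figures (assuming the 5\% cut-off was rounded down to a whole number of SNPs, which the text does not state explicitly):
\[
\lfloor 0.05 \times 490032 \rfloor = \lfloor 24501.6 \rfloor = 24501, \qquad 458 + 489574 = 490032, \qquad \frac{490}{981} \approx \frac{364}{729} \approx 0.50 ,
\]
so each replicate contains roughly half of the individuals in each group.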

The bootstrap sub-sampling method was used to eliminate, from the initial X chromosome filtered set of 490032 SNPs, those markers that were not effective for genetically distinguishing the case and control groups. In each iteration of the bootstrap process, a sub-sample of individuals was drawn from each group, and the markers were then ranked by a statistic that evaluates the effectiveness of each marker (see Figure \ref{fig:bootstrap-procedure}). Markers that consistently had a high association statistic in every bootstrap sub-sample were selected for the next stage of the process.
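
The selection loop described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study. It assumes genotypes coded as 0, 1 or 2 copies of the minor allele and an allelic \(2 \times 2\) Pearson \(\chi^{2}\) statistic, whereas the text specifies only that SNPs were ranked by \(\chi^{2}\). The function names \texttt{allelic\_chi2} and \texttt{consistent\_snps}, and all parameter names, are hypothetical.

```python
import random


def allelic_chi2(case_genos, control_genos):
    """Pearson chi-square on the 2x2 allele-count table for one SNP.

    Assumption: genotypes are coded as 0, 1 or 2 copies of the minor allele.
    """
    a = sum(case_genos)                  # minor alleles in cases
    b = 2 * len(case_genos) - a          # major alleles in cases
    c = sum(control_genos)               # minor alleles in controls
    d = 2 * len(control_genos) - c       # major alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                       # monomorphic SNP or empty group
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def consistent_snps(cases, controls, n_reps=100, frac=0.5,
                    top_frac=0.05, seed=0):
    """Return indices of SNPs ranked in the top `top_frac` in every replicate.

    `cases` and `controls` are lists of individuals; each individual is a
    list holding one genotype per SNP.
    """
    rng = random.Random(seed)
    n_snps = len(cases[0])
    top_k = max(1, int(top_frac * n_snps))   # cut-off rounded down (assumption)
    n_case = int(frac * len(cases))
    n_ctrl = int(frac * len(controls))
    surviving = set(range(n_snps))
    for _ in range(n_reps):
        # Draw individuals from each group without replacement.
        sub_cases = rng.sample(cases, n_case)
        sub_ctrls = rng.sample(controls, n_ctrl)
        stats = [
            allelic_chi2([ind[j] for ind in sub_cases],
                         [ind[j] for ind in sub_ctrls])
            for j in range(n_snps)
        ]
        # Rank SNPs from the largest statistic to the smallest.
        ranked = sorted(range(n_snps), key=lambda j: stats[j], reverse=True)
        # Keep only SNPs that were also in the top set of this replicate.
        surviving &= set(ranked[:top_k])
    return sorted(surviving)
```

With the defaults \texttt{n\_reps=100}, \texttt{frac=0.5} and \texttt{top\_frac=0.05}, a discovery set of 981 cases and 729 controls yields replicates of 490 cases and 364 controls. A SNP survives only if it lies in the top 5\% in all 100 replicates. Intersecting the top sets, rather than averaging ranks, mirrors the "every bootstrap sub-sample" criterion stated in the text.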