Supplemental Figure S1. Flowchart of the subject-level quality control (QC) process. Of a total of 2967 children genotyped, subject-level QC removed 56 whose genotype did not match the reported sex, 39 for genotype heterozygosity, and 37 who were likely related as indicated by identity by descent of alleles, leaving 2835 subjects for further analysis.
Supplemental Figure S2. Principal component analysis. The y-axis shows the fraction of the variance explained by each individual PC (variance explained / total variance) and the x-axis shows the index of the eigenvalues tested. The red vertical line at x=3 marks the inclusion threshold we chose, which is where a steep decline in explained variance is observed.
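The quantity on the y-axis of Figure S2 can be illustrated with a minimal sketch. It assumes the eigenvalues of the genotype covariance matrix are already available as a list; the function name and the example values are hypothetical and are not taken from the study.

```python
def explained_variance_fraction(eigenvalues):
    """Return, for each PC, its eigenvalue divided by the sum of all eigenvalues."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    return [value / total for value in eigenvalues]


# Made-up eigenvalues for illustration only (not study data).
toy_eigenvalues = [4.0, 3.0, 2.0, 1.0]
toy_fractions = explained_variance_fraction(toy_eigenvalues)
```

Plotting these fractions against the PC index gives a scree-type plot like the one described, from which an inclusion threshold can be chosen by eye.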
Supplemental Figure S3. Overlay of the first two principal components (PC1 and PC2) from the CHILD cohort and the 1000 Genomes Project (phase 3). The two dotted lines in the plot indicate the inclusion thresholds for PC1 and PC2, chosen to identify subjects with Central European (CEU) ancestry in the CHILD cohort.
Supplemental Figure S4. Manhattan plot showing GWAS results of recurrent wheeze in the Caucasian sub-cohort of the CHILD Study. The y-axis depicts the -log10 transformed p-value and the x-axis shows chromosomal positions of SNVs. Suggestive significance (p=5e-5) and genome-wide significance (p=5e-8) are marked as blue and red horizontal lines, respectively.
Supplemental Figure S5. Annotations of 98 SNVs associated with recurrent wheeze in the full, admixed CHILD Study cohort. The pie chart on the left shows the breakdown of all predicted consequences and the pie chart on the right shows the predicted consequences of protein-coding variants.
Supplemental Figure S6. Identification of the GRS associated with recurrent wheeze. This plot illustrates the process of determining the variants for inclusion in the GRS analysis. The y-axis shows the -log10 transformed p-value of the GRS and the x-axis shows the number of variants included in that GRS, taken from the 100 most strongly associated variants (results from a previously published GWAS). The GRS with the lowest p-value occurs with the inclusion of the first (i.e. most strongly associated) 4 variants, which is indicated by the red bar (p=10^-7.8).
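The selection procedure described for Figure S6 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the score here is an unweighted sum of risk-allele counts, and the association test is a Welch-type z statistic with a normal approximation standing in for whatever regression model was actually used. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def grs(genotypes, k):
    """Unweighted score: sum of risk-allele counts over the first k variants.

    Assumes each row holds one subject's allele counts (0, 1 or 2),
    with variants already ordered from most to least strongly associated.
    """
    return [sum(row[:k]) for row in genotypes]


def association_p(scores, is_case):
    """Two-sided p-value comparing mean scores of cases and controls.

    Assumption: normal approximation to a Welch-type statistic, used only
    as a simple stand-in for a regression-based test.
    """
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    se = math.sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    if se == 0:
        return 1.0
    z = (mean(cases) - mean(controls)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def best_variant_count(genotypes, is_case):
    """Return (k, p_values), where k is the number of variants with the lowest p-value."""
    n_variants = len(genotypes[0])
    p_values = [association_p(grs(genotypes, k), is_case)
                for k in range(1, n_variants + 1)]
    best_index = min(range(n_variants), key=p_values.__getitem__)
    return best_index + 1, p_values


# Made-up toy data for illustration only (not study data):
# 4 cases followed by 4 controls, 3 variants per subject.
toy_genotypes = [
    [2, 1, 0], [1, 2, 0], [2, 2, 0], [1, 1, 0],
    [0, 0, 2], [1, 0, 2], [0, 1, 2], [1, 1, 2],
]
toy_is_case = [True, True, True, True, False, False, False, False]
toy_k, toy_p_values = best_variant_count(toy_genotypes, toy_is_case)
```

Plotting -log10 of each entry in `toy_p_values` against the number of variants gives a bar chart of the kind the legend describes, with the tallest bar marking the selected variant count.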
Supplemental Figure S7. GRS performance in predicting recurrent wheeze and asthma. Performance of the GRS is compared for prediction of recurrent wheeze at 2-5 years and asthma diagnosed at 5 years. The y-axis of each plot shows the -log10 transformed p-value and the x-axis shows the number of variants included in the GRS. Both plots show that the best performance is reached when the first 4 variants are used in the GRS.