If there is a "winner's curse" in the included studies, then when a SNP a) is reported to be associated with intelligence in an initial study and b) this association is replicated in a subsequent study, the statistical significance of the association should be lower in the replication study or studies than in the initial study. Therefore, I first isolated all SNPs reported more than once in this dataset. I then calculated the average negative log p-value (-log(P)) for each SNP based on whether this was the first, second, third, etc. time that SNP had been reported as associated with intelligence. Because taking the negative log reverses the ordering of p-values, weaker and less statistically significant associations have smaller -log(P) values; if the later associations are weaker, the -log(P) should therefore be smaller for subsequent replications than for the same SNP in the earlier study or studies.
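The comparison described above can be sketched as follows. This is a minimal illustration rather than the code actually used: it assumes a base-10 logarithm, assumes that the averages are taken across SNPs within each report rank (first, second, third, etc.), and uses made-up SNP labels and p-values. The function name `mean_neglog_by_rank` is hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical records: (snp_label, publication_order, p_value).
# These values are invented for illustration, not taken from the catalog.
records = [
    ("snp_a", 1, 1e-12),
    ("snp_a", 2, 1e-9),
    ("snp_b", 1, 1e-10),
    ("snp_b", 2, 1e-8),
    ("snp_b", 3, 1e-8),
    ("snp_c", 1, 1e-9),  # reported only once, so it is skipped below
]


def mean_neglog_by_rank(records):
    """Mean -log10(p) per report rank, using only SNPs reported more than once."""
    by_snp = defaultdict(list)
    for snp, order, p in records:
        by_snp[snp].append((order, p))

    sums = defaultdict(float)
    counts = defaultdict(int)
    for reports in by_snp.values():
        if len(reports) < 2:
            continue  # keep only SNPs reported at least twice
        reports.sort()  # earliest report first
        for rank, (_, p) in enumerate(reports, start=1):
            sums[rank] += -math.log10(p)
            counts[rank] += 1
    return {rank: sums[rank] / counts[rank] for rank in sorted(sums)}


means = mean_neglog_by_rank(records)
```

Under a winner's curse, the mean for rank 1 would be expected to exceed the means for later ranks, as it does in this invented example.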
Results
Included studies
The data downloaded from the GWAS Catalog included 14 different studies, in which 2,719 genome-wide significant associations between a SNP and intelligence were reported in total. These studies used a total of 15 samples, and included a combined total of up to 1,651,560 individuals.1
Upon closer inspection, though the GWAS Catalog website lists 18 different studies, only 14 of them are included in the downloaded dataset. The reasons for this appear to be twofold. First, the paper by Lam et al. (2017)\cite{Lam2017} was counted twice on the Catalog's website, because it measured SNP associations with cognitive ability once with the novel technique of multi-trait analysis of GWAS (MTAG) and once without it. Second, three studies\cite{Davies2011}\cite{Davis2010}\cite{Butcher2008} were listed on the website but excluded entirely from the downloaded dataset, apparently because none of them reported any genome-wide significant associations. Consequently, after removing the single instance of double counting, the website's list includes only 17 unique studies.\cite{Hill2019,Gialluisi2014,Lam2017,Davis2010,Butcher2008,Trampush2017,Benyamin2014,Davies_2015,Sniekers_2017,Savage_2018,Loo_2012,Lencz2014,Kirkpatrick2014,Davies2011,Davies_2018,Zabaneh_2017,Coleman2019}
I will be using the total number of 17 studies in most of the remainder of this paper, but when analyzing SNP associations I will only be examining the 14 studies that reported at least one such association. All 17 included studies are listed alphabetically by title in Table 1 below, with the titles of the three studies reporting no associations in bold.
A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. |
A genome-wide association study for extremely high intelligence. |
A three-stage genome-wide association study of general cognitive ability: hunting the small effects. |
Biological annotation of genetic loci associated with intelligence in a meta-analysis of 87,740 individuals. |
Childhood intelligence is heritable, highly polygenic and associated with FNBP1L. |
Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53949). |
Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. |
Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. |
Genome-wide association studies establish that human intelligence is highly heritable and polygenic. |
Genome-wide association study of intelligence: additive effects of novel brain expressed genes. |
Genome-wide quantitative trait locus association scan of general cognitive ability using pooled DNA and 500K single nucleotide polymorphism microarrays. |
Genome-wide screening for DNA variants associated with reading and language traits. |
GWAS meta-analysis reveals novel loci and genetic correlates for general cognitive function: a report from the COGENT consortium. |
Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets. |
Molecular genetic evidence for overlap between general cognitive ability and risk for schizophrenia: a report from the Cognitive Genomics consorTium (COGENT). |
Results of a "GWAS plus:" general cognitive ability is substantially heritable and massively polygenic. |
Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function. |
Table 1. The 17 unique published studies included in the present review. Papers whose titles are in bold in this table were excluded from association analyses because none of them reported any genome-wide significant associations. The paper whose title is underlined included two sets of associated SNPs and was therefore counted twice in the GWAS Catalog (hereafter "the catalog").
Inconsistent reporting
All of the included SNPs were reported in the catalog (specifically in the column labeled "Strongest SNP Risk Allele") using the corresponding NCBI-assigned Reference SNP (rs) number (e.g. rs10010325), with four exceptions: in the catalog, 4 of the 13 SNPs associated with intelligence in the study by Trampush et al. (2017)\cite{Trampush2017} were listed in the format "chrA:B", where A is the number of the chromosome and B (also a number) is the position of the SNP on the chromosome. The value in the "location" column for these SNPs in the catalog's page for Trampush et al. is listed as "Mapping not available".\cite{catalog} The four locations formatted in this way in the catalog are: chr17:43463493, chr17:43569909, chr17:44210933, and chr17:44366572. Examining Supplemental Table 2 in Trampush et al. (2017)\cite{Trampush2017} indicates that three of these locations (chr17:43463493, chr17:43569909, and chr17:44366572) are in intergenic regions, and the other one (chr17:44210933) is in an intronic region. This may explain why the rs identifiers for these SNPs were not present in the dbSNP database.
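One way to separate the two identifier formats described above is a pair of regular expressions. The sketch below is illustrative only: the function name `classify_identifier` and the label strings are hypothetical, and it assumes that each identifier has already been extracted from the catalog column with any allele suffix removed.

```python
import re

# "rs" followed by digits, e.g. rs10010325
RS_PATTERN = re.compile(r"^rs\d+$")
# "chrA:B" where A is a chromosome label and B is a numeric position
CHR_POS_PATTERN = re.compile(r"^chr(\d+|X|Y):(\d+)$")


def classify_identifier(identifier):
    """Label an identifier as 'rs', 'chr_pos', or 'other' (hypothetical labels)."""
    identifier = identifier.strip()
    if RS_PATTERN.match(identifier):
        return "rs"
    if CHR_POS_PATTERN.match(identifier):
        return "chr_pos"
    return "other"


# Identifiers quoted in the text above
examples = [
    "rs10010325",
    "chr17:43463493",
    "chr17:43569909",
    "chr17:44210933",
    "chr17:44366572",
]
labels = {identifier: classify_identifier(identifier) for identifier in examples}
```

Anything labeled "other" would need manual inspection rather than automatic handling.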
In addition, though almost all SNPs were each reported on a separate line, 12 lines turned out to be exceptions, as they each contained 4 SNPs. They are nevertheless each treated as a single SNP here because they were reported in the database with the same associated characteristics (p-value, location in the genome, etc.).2 Each of these lines corresponded to a haplotype of four SNPs that were tested together by Loo et al. (2012).\cite{Loo_2012} This was the only study included in the present review to test multiple SNPs for association with intelligence simultaneously.
Finally, in the Catalog, each SNP was classified into exactly one functional category (see "Functional annotation" section below), except for the same 12 lines (each containing four SNPs) noted above and four additional lines (each containing one SNP but no functional category whatsoever). The 12 lines that each contained 4 SNPs also each contained 4 category names (e.g. "intron_variant"), clearly with each name corresponding to one of the SNPs on the line. Each of the four names on each of these lines was counted as a separate SNP for functional analyses. The four lines with no functional category were excluded from functional analyses.
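The handling of these lines for functional analyses can be sketched as below. This is an assumption-laden illustration, not the procedure actually used: the semicolon separator, the function name `expand_line`, the SNP labels, and every category name other than "intron_variant" are hypothetical.

```python
def expand_line(snp_field, category_field, sep=";"):
    """Pair each SNP on a catalog line with its functional category.

    Returns an empty list when no category is listed, so that the line
    is excluded from functional analyses.
    """
    snps = [s.strip() for s in snp_field.split(sep) if s.strip()]
    categories = [c.strip() for c in category_field.split(sep) if c.strip()]
    if not categories:
        return []  # no functional category: excluded
    if len(snps) != len(categories):
        raise ValueError("number of SNPs and categories differ on this line")
    return list(zip(snps, categories))


# A hypothetical four-SNP haplotype line: counted as four SNPs
haplotype_pairs = expand_line(
    "snp_a; snp_b; snp_c; snp_d",
    "intron_variant; intron_variant; intergenic_variant; intron_variant",
)

# A hypothetical single-SNP line with no category: excluded
uncategorized_pairs = expand_line("snp_e", "")

# An ordinary single-SNP line: counted once
single_pairs = expand_line("snp_f", "intron_variant")
```

Raising an error on mismatched counts is a deliberate choice here, since a silent mismatch would assign categories to the wrong SNPs.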
Associations
The 2,719 reported SNP associations with intelligence involved 2,335 unique SNPs. Of these, 2,047 (87.7%) were reported only once. In total, only 6 unique SNPs (0.3% of the unique SNPs) were reported five times, and none were reported more often than that. The number and percentage of unique SNPs reported a given number of times in the included studies (ranging from 1 to 5) is shown in Table 2.
# of times reported | # of unique SNPs | % of unique SNPs |
1 | 2047 | 87.7% |
2 | 219 | 9.4% |
3 | 48 | 2.1% |
4 | 15 | 0.6% |
5 | 6 | 0.3% |
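As a consistency check, the counts in Table 2 can be recombined to recover both totals quoted in the text. The short sketch below uses only the numbers in the table; the variable names are illustrative.

```python
# Number of unique SNPs reported exactly k times (from Table 2)
times_reported = {1: 2047, 2: 219, 3: 48, 4: 15, 5: 6}

# Each unique SNP is counted once, however often it was reported
unique_snps = sum(times_reported.values())

# A SNP reported k times contributes k associations
total_associations = sum(k * n for k, n in times_reported.items())

# Share of unique SNPs in each row, rounded to one decimal place
percentages = {
    k: round(100 * n / unique_snps, 1) for k, n in times_reported.items()
}
```

The two sums come to 2,335 unique SNPs and 2,719 associations, matching the figures given above, and the rounded percentages match the last column of the table.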
A general view of the strength of the reported SNP-intelligence associations can be seen in Fig. 1 below, which shows the frequency of the -log(p) value corresponding to each association in the dataset. Note that this includes all reported associations, so individual SNPs that were reported multiple times necessarily appear as multiple values. This figure illustrates that most associations are relatively weak, as is evident from the strong right-skew of the histogram; this is because stronger associations will have smaller p-values, which in turn translate into larger -log(p) values.