Results

Clinical characteristics of subjects

To our knowledge, we assembled the largest Chinese childhood obesity cohort to date, comprising 6,484 subjects, for this association study. We performed WES on 150 subjects (76 cases vs 74 controls) to discover obesity-associated candidate variants and then validated them in a further 6,334 subjects (2,480 cases vs 3,854 controls). The characteristics of all subjects are shown in Table 1. For the 150 subjects in the discovery stage, sex, age, and birth weight were roughly matched between cases and controls (P > 0.05, Chi-square test), whereas obesity-related traits such as weight, BMI, FMP, WHtR, serum lipid levels (triacylglycerol and high-density lipoprotein cholesterol), and blood pressure differed significantly (P < 0.001, Chi-square test). Similar patterns were observed between cases and controls in the validation cohort, except for the following parameters: age, sex, fat-free mass index, total cholesterol, and LDL-C (each with P < 0.001, Chi-square test).

Identification of obesity-associated variants

To discover obesity-associated variants, especially causal ones, we performed whole-exome sequencing on 76 children with obesity and 74 controls in the discovery stage. In total, we obtained 6,725,533 variants from the 150 subjects at an average depth of 23×. After quality control (Figure 1), 921,287 variants remained for association testing. Among the variants showing nominal (P < 0.05) association with obesity, we retained only loss-of-function and deleterious ones as candidates for further investigation. This yielded 26 obesity-associated candidates for the validation stage. We genotyped all candidates in a cohort of 6,334 individuals, including 2,480 cases and 3,854 controls. After genotyping and quality control on call rate, 23 SNVs remained for the association study, which was adjusted for age, sex, and population stratification (Figure 1 and Supplementary Table 1).
We performed logistic regression on the 23 variants and found that 2 of them were significantly associated with common obesity at the genome-wide level: rs1059491 (SULT1A2:c.704T>G, P = 7.71E-24, OR = 2.296, 95% CI = 1.953-2.699) and rs189326455 (MAP3K21:c.162G>C, P = 6.16E-11, OR = 0.2187, 95% CI = 0.1387-0.3449) (Table 2). Here, we used BMI to define common obesity according to the WHO and IOTF standards. When FMP was used as the phenotype to define ‘fatty obesity’, only rs1059491 was associated with obesity at the genome-wide significance level (P = 1.17E-11, OR = 1.935, 95% CI = 1.599-2.342). rs1059491 was also significantly associated with dyslipidaemia (P = 5.01E-10, OR = 1.66, 95% CI = 1.415-1.948). Furthermore, we performed an association test on rs1059491 genotypes called from the transcriptomes of blood and adipose tissues and found the variant to be significantly associated with obesity (P = 0.01408, OR = 3.533, 95% CI = 1.29-9.675). When the WES, validation, and transcriptome data were combined, we found no significant heterogeneity, and the associations of rs1059491 (P = 2.57E-28, OR = 2.405, 95% CI = 2.058-2.811) and rs189326455 (P = 8.98E-12, OR = 0.2122, 95% CI = 0.1359-0.3313) with obesity in Chinese children became more significant.
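The relationship between the reported odds ratios, confidence intervals, P values, and the heterogeneity check can be illustrated with a minimal sketch. It assumes Wald-type 95% intervals and an inverse-variance fixed-effect model with Cochran's Q; neither the pooling model nor the software is stated in the text, and the function names are hypothetical. Because the discovery-stage (WES) estimate for rs1059491 is not reported here, only the validation and transcriptome estimates are pooled, so the output is not expected to reproduce the combined values above.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error, assuming a Wald 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(odds_ratio), se

def wald_p(beta, se):
    """Two-sided P value of the Wald z statistic beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2))

def fixed_effect(estimates):
    """Inverse-variance pooling of (beta, se) pairs.

    Returns the pooled beta, its standard error, and Cochran's Q.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, pooled_se, q

# rs1059491 estimates as reported in the text (BMI-defined obesity).
validation = log_or_and_se(2.296, 1.953, 2.699)
transcriptome = log_or_and_se(3.533, 1.29, 9.675)

p_validation = wald_p(*validation)
pooled, pooled_se, q = fixed_effect([validation, transcriptome])
pooled_or = math.exp(pooled)
# Chi-square survival function; this closed form holds for 1 degree of
# freedom only, i.e. exactly two pooled studies.
q_p = math.erfc(math.sqrt(q / 2))

print(f"validation Wald P = {p_validation:.2e}")
print(f"pooled OR = {pooled_or:.3f}, Cochran's Q = {q:.3f}, P(Q) = {q_p:.3f}")
```

Under these assumptions, the Wald P value recovered from the validation interval falls in the same order of magnitude as the reported P value, and a large P value for Q corresponds to the absence of significant heterogeneity between the two estimates.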
To further examine whether the identified variants affect obesity-related quantitative traits such as BMI, FMI, FMP, FFMI, and LDL, we performed multiple linear regression analyses (Table 3). After adjustment for sex, age, and population stratification, rs1059491 showed genome-wide significant associations with BMI (P = 1.41E-18, β = 1.953, 95% CI = 1.519 to 2.386), FMP (P = 7.22E-13, β = 3.282, 95% CI = 4.175 to 7.197), FMI (P = 1.87E-11, β = 1.157, 95% CI = 0.82 to 1.493), FFMI (P = 8.14E-09, β = -0.8541, 95% CI = -1.144 to -0.5643), and LDL level (P = 5.42E-09, β = 0.1583, 95% CI = 0.1052 to 0.2114), whereas rs189326455 showed a genome-wide significant association only with BMI (P = 3.36E-10, β = -2.655, 95% CI = -3.481 to -1.828; Table 3).

Functional predictions of genome-wide significant variants

From the association study, we identified two novel obesity-associated SNVs in Chinese children. rs1059491 (SULT1A2:p.N235T) is a missense variant in exon 2 of SULT1A2 (Figure 2A), and rs189326455 (MAP3K21:p.E54D) is a missense variant in exon 1 of MAP3K21 (Figure 2B). Both sites are highly conserved across mammals (Figure 2A,B), and rs1059491 is predicted to be a deleterious mutation by SIFT, PolyPhen-2, and MutationTaster (Supplementary Table 2). However, neither SNV was predicted to affect the secondary structure of SULT1A2 or MAP3K21 (Supplementary Figure 1). Therefore, neither variant is likely to affect the phenotype by changing protein conformation.
Because both SNVs reside in promoter or promoter-flanking regions (Figure 2A,B), we investigated their roles in regulating gene expression. rs1059491 has been reported to enhance binding affinity for transcription factors such as PPAR_2 and RXRA, which are involved in the pathway of regulation of lipid metabolism by peroxisome proliferator-activated receptor alpha (Supplementary Table 3). The MEME suite also predicted that rs1059491 affects the binding affinity of transcription factors such as the RXRA homodimer, PPARG, PPARG::RXRA, VDR, and NR1H2::RXRA (Figure 2C). rs189326455 resides in the binding sites of transcription factors including POL2, CHD2_disc3, and ESR2 (Consortium, 2011; Kheradpour & Kellis, 2014), and is predicted to drastically reduce the binding affinity of CHD2 and ESR2 (Supplementary Table 3). Taken together, both rs1059491 and rs189326455 may affect the expression of certain genes by changing binding affinity for transcription factors.

The role of rs1059491 in gene expression

According to the GTEx Portal, rs1059491 is an eQTL associated with the differential expression of multiple genes and transcripts (Supplementary Table 4). The genotype of rs1059491 is correlated with the expression of 17 genes, including SULT1A2, SULT1A1, EIF3C, LAT, and SH2B1, in the subcutaneous adipose tissue or blood data deposited in GTEx. In our transcriptome sequencing data, we confirmed that rs1059491 genotypes correlated with the expression of SGF29, SPNS1, SULT1A2, and TUFM in blood, and of EIF3C in adipose tissue (Figure 3A).
We further compared the expression of these eQTL target genes between obese and normal-weight children in blood and adipose tissues. NPIPB9 was significantly differentially expressed between cases and controls in blood, whereas ATXN2L, SULT1A2, and TUFM were differentially expressed in adipose tissue (Figure 3B). SULT1A2 and TUFM were significantly up-regulated in obese individuals, while ATXN2L and NPIPB9 were down-regulated. In conclusion, rs1059491 regulates the expression of several genes in blood and adipose tissue, which may account for its association with obesity.