Results
Clinical characteristics of subjects
We assembled what is, to our knowledge, the largest Chinese childhood
obesity cohort, comprising 6,484 subjects, for an association study. We
performed WES on 150 subjects (76 cases vs 74 controls) to discover
obesity-associated candidate variants and then validated them in a
further 6,334 subjects (2,480 cases vs 3,854 controls). The
characteristics of all subjects are shown in Table 1. For the 150
subjects in the discovery stage, sex, age, and birth weight were roughly
matched between cases and controls (P > 0.05, Chi-square test), whereas
obesity-related traits such as weight, BMI, FMP, WHtR, serum lipid
levels (triacylglycerol and high-density lipoprotein cholesterol), and
blood pressure differed significantly (P < 0.001, Chi-square test).
Similar patterns were observed between cases and controls in the
validation cohort, except for age, sex, fat-free mass index, total
cholesterol, and LDL-C, each of which differed between groups
(P < 0.001, Chi-square test).
Identification of obesity-associated variants
To discover obesity-associated variants, especially causal ones, we
performed whole-exome sequencing on 76 children with obesity and 74
controls at the discovery stage. In total, we obtained 6,725,533
variants from the 150 subjects at an average depth of 23×. After quality
control (Figure 1), 921,287 variants remained for association testing.
Among all variants showing nominal (P < 0.05) association with obesity,
we retained only loss-of-function and deleterious variants as candidates
for further investigation, which yielded 26 obesity-associated
candidates for the validation stage. We genotyped all candidates in a
cohort of 6,334 individuals, including 2,480 cases and 3,854 controls.
After genotyping and quality control on call rate, 23 SNVs remained for
the association study, which was adjusted for age, sex, and population
stratification (Figure 1 and Supplementary Table 1).
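The candidate-selection rule described above (nominal association plus a loss-of-function or deleterious annotation) can be sketched as a simple filter. This is a minimal illustration only: the `Variant` fields, the set of consequence labels treated as loss-of-function, and the example records are all hypothetical and are not taken from the study's pipeline.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    variant_id: str              # hypothetical identifier
    p_value: float               # discovery-stage association P value
    consequence: str             # hypothetical annotation label
    predicted_deleterious: bool  # hypothetical flag from a prediction tool


# Assumption: consequence labels counted as loss-of-function in this sketch.
LOF_CONSEQUENCES = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
}


def select_candidates(variants, alpha=0.05):
    """Keep nominally associated variants that are LoF or predicted deleterious."""
    return [
        v for v in variants
        if v.p_value < alpha
        and (v.consequence in LOF_CONSEQUENCES or v.predicted_deleterious)
    ]


# Made-up records, for illustration only.
example_variants = [
    Variant("var_a", 0.003, "stop_gained", False),
    Variant("var_b", 0.20, "frameshift_variant", False),
    Variant("var_c", 0.01, "missense_variant", True),
    Variant("var_d", 0.04, "synonymous_variant", False),
]

candidates = select_candidates(example_variants)
print([v.variant_id for v in candidates])
```

Here `var_b` is dropped because it fails the nominal P threshold, and `var_d` is dropped because it is neither loss-of-function nor predicted deleterious.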
We performed logistic regression on the 23 variants and found that 2 of
them were significantly associated with common obesity at the
genome-wide level: rs1059491 (SULT1A2:c.704T>G, P = 7.71E-24,
OR = 2.296, 95% CI = 1.953-2.699) and rs189326455 (MAP3K21:c.162G>C,
P = 6.16E-11, OR = 0.2187, 95% CI = 0.1387-0.3449) (Table 2). Here, we
used BMI to define common obesity according to the WHO and IOTF
standards. When FMP was used as the phenotype for ‘fatty obesity’, only
rs1059491 was associated with obesity at the genome-wide significance
level (P = 1.17E-11, OR = 1.935, 95% CI = 1.599-2.342). rs1059491 was
also significantly associated with dyslipidaemia (P = 5.01E-10,
OR = 1.66, 95% CI = 1.415-1.948). Furthermore, we tested the association
using rs1059491 genotypes derived from the transcriptomes of blood and
adipose tissues, and found the variant significantly associated with
obesity (P = 0.01408, OR = 3.533, 95% CI = 1.29-9.675). When the WES,
validation, and transcriptome data were combined, we found no
significant heterogeneity, and the associations of rs1059491
(P = 2.57E-28, OR = 2.405, 95% CI = 2.058-2.811) and rs189326455
(P = 8.98E-12, OR = 0.2122, 95% CI = 0.1359-0.3313) with obesity in
Chinese children became more significant.
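As a reading aid, the relationship between a reported odds ratio, its 95% confidence interval, and the P value can be sketched as follows. The sketch assumes a Wald-type test on the log-odds scale, which the text does not state explicitly, and it uses the conventional genome-wide threshold of 5E-8 as an assumption. Under those assumptions, the recovered P values should be of the same order as the reported ones.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover the log-OR standard error, Wald z, and two-sided P from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    # On the log scale the interval is symmetric: log(OR) +/- Z_975 * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = log_or / se
    # Two-sided normal tail probability, P = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported in the text for the validation cohort.
se_a, z_a, p_a = wald_from_or_ci(2.296, 1.953, 2.699)     # rs1059491
se_b, z_b, p_b = wald_from_or_ci(0.2187, 0.1387, 0.3449)  # rs189326455

GENOME_WIDE_ALPHA = 5e-8  # assumed conventional threshold; not stated in the text

for name, z, p in [("rs1059491", z_a, p_a), ("rs189326455", z_b, p_b)]:
    print(f"{name}: z = {z:.2f}, P = {p:.2e}, genome-wide: {p < GENOME_WIDE_ALPHA}")
```

An odds ratio below 1 gives a negative z, as for rs189326455, which corresponds to a lower odds of obesity for carriers of the tested allele.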
To further examine whether the identified variants affect
obesity-related quantitative traits such as BMI, FMI, FMP, FFMI, and
LDL, we performed multiple linear regression analyses (Table 3). After
adjustment for sex, age, and population stratification, rs1059491 showed
genome-wide significant associations with BMI (P = 1.41E-18, β = 1.953,
95% CI = 1.519 to 2.386), FMP (P = 7.22E-13, β = 3.282, 95% CI = 4.175
to 7.197), FMI (P = 1.87E-11, β = 1.157, 95% CI = 0.82 to 1.493), FFMI
(P = 8.14E-09, β = -0.8541, 95% CI = -1.144 to -0.5643), and LDL level
(P = 5.42E-09, β = 0.1583, 95% CI = 0.1052 to 0.2114), whereas
rs189326455 showed a genome-wide significant association only with BMI
(P = 3.36E-10, β = -2.655, 95% CI = -3.481 to -1.828, Table 3).
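The covariate-adjusted model behind such β estimates can be illustrated with a small ordinary least squares sketch. It assumes an additive genotype coding (0, 1, or 2 copies of the tested allele), which the text does not specify, and it omits the population-stratification covariates for brevity. The data are synthetic and noise-free, constructed so that the true genotype effect is exactly 2.0; they are not study data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations apply to both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(design, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic subjects: (genotype dosage, sex coded 0/1, age in years).
subjects = [(0, 0, 7), (1, 1, 8), (2, 0, 9), (0, 1, 10),
            (1, 0, 11), (2, 1, 12), (0, 0, 13), (1, 1, 9)]

# Synthetic trait with a known genotype effect of 2.0 and no noise.
trait = [16.0 + 2.0 * g + 0.5 * sex + 0.3 * age for g, sex, age in subjects]

# Design matrix columns: intercept, genotype, sex, age.
design = [[1.0, float(g), float(sex), float(age)] for g, sex, age in subjects]

intercept, beta_genotype, beta_sex, beta_age = ols(design, trait)
print(f"adjusted genotype effect: {beta_genotype:.3f}")
```

Because sex and age enter the same model, `beta_genotype` is the per-allele change in the trait with those covariates held fixed, which is the sense in which the reported β values are adjusted.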
Functional predictions of genome-wide significant variants
From the association study, we identified two novel obesity-associated
SNVs in Chinese children. rs1059491 (SULT1A2:p.N235T) is a missense
variant in exon 2 of SULT1A2 (Figure 2A), and rs189326455
(MAP3K21:p.E54D) is a missense variant in exon 1 of MAP3K21
(Figure 2B). Both SNVs are highly conserved across mammals
(Figure 2A,B), and rs1059491 is predicted to be deleterious by SIFT,
PolyPhen-2, and MutationTaster (Supplementary Table 2). However, neither
SNV was predicted to affect the secondary structure of SULT1A2 or
MAP3K21 (Supplementary Figure 1). Therefore, neither variant is likely
to affect the phenotype by changing protein conformation.
Because both SNVs reside in promoter or promoter-flanking regions
(Figure 2A,B), we investigated their roles in regulating gene
expression. rs1059491 has been reported to enhance binding affinity for
transcription factors such as PPAR_2 and RXRA, which are involved in the
pathway of regulation of lipid metabolism by peroxisome
proliferator-activated receptor alpha (Supplementary Table 3). The MEME
suite also predicted that rs1059491 affects the binding affinity of
transcription factors such as the RXRA homodimer, PPARG, PPARG::RXRA,
VDR, and NR1H2::RXRA (Figure 2C). rs189326455 resides in the binding
sites of transcription factors including POL2, CHD2_disc3, and ESR2
(Consortium, 2011; Kheradpour & Kellis, 2014), and would drastically
reduce the binding affinity of CHD2 and ESR2 (Supplementary Table 3).
Taken together, both rs1059491 and rs189326455 may affect the expression
of certain genes by changing the binding affinity for transcription
factors.
The role of rs1059491 in gene expression
According to the GTEx Portal, rs1059491 is an eQTL associated with the
differential expression of multiple genes and transcripts
(Supplementary Table 4). The rs1059491 genotype is correlated with the
expression of 17 genes, including SULT1A2, SULT1A1, EIF3C, LAT, and
SH2B1, in the subcutaneous adipose tissue or blood data deposited in
GTEx. In our transcriptome sequencing data, we confirmed that rs1059491
genotypes correlated with the expression of SGF29, SPNS1, SULT1A2, and
TUFM in blood, and of EIF3C in adipose tissue (Figure 3A).
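A genotype-expression correlation of this kind is often summarised as the slope of expression on allele dosage. The following minimal sketch shows that calculation under an additive coding assumption; the dosages and expression values are made up for illustration, and the function is not the method used by GTEx or in this study.

```python
from statistics import mean


def dosage_slope(dosages, expression):
    """Least-squares slope of expression on allele dosage (0, 1, or 2)."""
    mean_x, mean_y = mean(dosages), mean(expression)
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("all samples share the same genotype")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    return sxy / sxx


# Hypothetical samples: allele dosage and normalised expression of one gene.
example_dosages = [0, 0, 1, 1, 2, 2]
example_expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

slope = dosage_slope(example_dosages, example_expression)
print(f"expression change per allele: {slope:.2f}")
```

A positive slope means expression rises with each additional copy of the allele; a full eQTL analysis would also adjust for covariates and test the slope for significance.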
We further compared the expression of these eQTL target genes between
obese and normal-weight children in blood and adipose tissues. NPIPB9
was significantly differentially expressed between cases and controls in
blood, whereas ATXN2L, SULT1A2, and TUFM were differentially expressed
in adipose tissue (Figure 3B). SULT1A2 and TUFM were significantly
up-regulated in obese individuals, while ATXN2L and NPIPB9 were
down-regulated. In conclusion, rs1059491 regulates the expression of
several genes in blood and adipose tissue, which may account for its
association with obesity.