Phenotypic and genomic data, and plant height-seeds weight
scaling
Ready-to-use phenotypic data for plant height and thousand-seed weight
(hereafter seeds weight) and a high-density genome-wide SNP dataset for
approximately 13,000 barley (Hordeum vulgare L.) accessions were
obtained from the Federal ex situ Genebank for Agricultural and
Horticultural Plant Species (IPK) in Germany. The panel includes both
domesticated barley (cultivars and landraces) and its conspecific wild
progenitor H. vulgare ssp. spontaneum (K. Koch) Thell.
Plant height (from the soil surface to the top of the spike, including
awns) and seeds weight (as thousand-seed weight) were
assessed during seed regeneration using plots of at least 3
m² (Gonzalez et al., 2018). SNP profiles were derived
from single plants of the accessions in the IPK barley collection through
the genotyping-by-sequencing (GBS) method (Milner et al., 2019).
We retained samples with both phenotypic and genotypic data available
for further analysis. The retained phenotype and genotype data were
then filtered further, keeping samples with <10%
missing genotypes and SNPs with minor allele frequency (MAF) > 0.01.
Consequently, we obtained 133,588 SNPs for 12,828 samples,
including wild types, landraces, and cultivars, from 85 countries and
regions across all continents with agriculture. The samples also cover
different habits (winter-type, with vernalisation required for flowering,
or spring-type, with a relaxed vernalisation requirement) and
growth forms (two-rowed or six-rowed), and contain sufficient variation
in life history to capture the general scaling law.
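The two filtering criteria above can be illustrated with a minimal
sketch. It assumes genotypes are coded as alternate-allele counts (0, 1
or 2, with None for a missing call) and that the sample filter is
applied before the SNP filter; the function name, the coding, and the
order of the two steps are illustrative assumptions, not a description
of the pipeline actually used.

```python
from typing import List, Optional, Tuple

# Hypothetical coding: 0, 1 or 2 copies of the alternate allele; None = missing.
Genotype = Optional[int]


def filter_genotypes(
    matrix: List[List[Genotype]],
    max_sample_missing: float = 0.10,
    min_maf: float = 0.01,
) -> Tuple[List[int], List[int]]:
    """Return indices of retained samples (rows) and SNPs (columns)."""
    n_snps = len(matrix[0])

    # Step 1: keep samples whose fraction of missing calls is below the cutoff.
    kept_samples = [
        i
        for i, row in enumerate(matrix)
        if sum(g is None for g in row) / n_snps < max_sample_missing
    ]

    # Step 2: among retained samples, keep SNPs whose MAF exceeds the cutoff.
    kept_snps = []
    for j in range(n_snps):
        calls = [matrix[i][j] for i in kept_samples if matrix[i][j] is not None]
        if not calls:
            continue  # no observed calls left for this SNP
        alt_freq = sum(calls) / (2 * len(calls))
        maf = min(alt_freq, 1.0 - alt_freq)
        if maf > min_maf:
            kept_snps.append(j)

    return kept_samples, kept_snps
```

Applying the sample filter first means that allele frequencies are
computed only from samples that pass quality control; the reverse order
would also be defensible and can give slightly different SNP sets.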
Plant height and seeds weight scaling was first evaluated through
bivariate linear model analysis using PAST V3 (Hammer et al., 2001). If
the correlation between plant height and seeds weight is determined by
shared genomic factors, the two traits would be expected to be
evolutionarily correlated independently of their phylogenetic
relationship. We therefore tested the evolutionary correlation of
the two traits after controlling for phylogenetic relatedness among the
samples. To do so, we first used RAxML to construct a
phylogenetic tree of the 12,828 samples following a maximum likelihood
procedure (Stamatakis, 2014). We then implemented a generalized least
squares regression analysis and used phylogenetic generalized ANOVA to
test the correlation of the two traits after controlling for their
phylogenetic relationship, using the software package Phylocom (Webb et al., 2008).
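As an illustration of how a generalized least squares regression
controls for relatedness, the sketch below fits y ~ x with an error
covariance matrix C, solving (X' C^-1 X) b = X' C^-1 y. It assumes that
C has already been derived from the tree (for example from shared branch
lengths under a Brownian-motion model, which is an assumption made here
for illustration). The helper names are hypothetical and this is not the
Phylocom implementation.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def pgls_fit(x: List[float], y: List[float], cov: Matrix) -> List[float]:
    """Return the GLS estimate [intercept, slope] for y ~ x given cov."""
    ones = [1.0] * len(x)
    # Apply C^-1 to each design column and to y without forming the inverse.
    ci_ones = solve(cov, ones)
    ci_x = solve(cov, x)
    ci_y = solve(cov, y)
    # Normal equations (X' C^-1 X) b = X' C^-1 y for the design X = [1, x].
    xtx = [[dot(ones, ci_ones), dot(ones, ci_x)],
           [dot(x, ci_ones), dot(x, ci_x)]]
    xty = [dot(ones, ci_y), dot(x, ci_y)]
    return solve(xtx, xty)
```

With an identity covariance matrix this reduces to ordinary least
squares; off-diagonal entries down-weight the evidence contributed by
closely related samples.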
Heritability, genetic correlation, and genome-wide
association studies for plant height and seeds weight
We evaluated the heritability of plant height and seeds weight in
barley. We employed a genome-based restricted maximum likelihood method
(GREML-LDMS) to estimate the narrow-sense SNP-based heritability
(h²_SNP) (Yang et al., 2015). To
do so, we computed linkage disequilibrium (LD) scores between SNPs with
a block size of 100 kb using GCTA (Yang et al., 2011), then used GREML
(a function within GCTA) to calculate the proportion of variance in a
phenotype explained by the SNPs following an LD score regression, as
h²_SNP (Yang et al., 2015). We
further estimated the genetic correlation between the two traits
following the bivariate GREML procedure using GCTA (Yang et al., 2011).
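For readers less familiar with these quantities, the sketch below shows
how the two summary statistics follow from estimated variance
components: SNP-based heritability is the summed genetic variance across
SNP groups divided by the total phenotypic variance, and the genetic
correlation is the genetic covariance scaled by the two genetic standard
deviations. The function names and the example values are hypothetical;
in practice the components are estimated by GCTA.

```python
from math import sqrt
from typing import Sequence


def snp_heritability(
    genetic_variances: Sequence[float], residual_variance: float
) -> float:
    """Proportion of phenotypic variance explained by all SNP groups.

    genetic_variances holds one variance component per SNP group
    (for example, per LD-score bin); residual_variance is the error term.
    """
    total_genetic = sum(genetic_variances)
    return total_genetic / (total_genetic + residual_variance)


def genetic_correlation(
    genetic_covariance: float, genetic_var_1: float, genetic_var_2: float
) -> float:
    """Genetic covariance of two traits scaled to a correlation."""
    return genetic_covariance / sqrt(genetic_var_1 * genetic_var_2)


# Illustrative values only, not results from this study.
h2 = snp_heritability([0.2, 0.1, 0.1], residual_variance=0.6)
rg = genetic_correlation(0.3, 0.4, 0.9)
```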
We further identified SNPs associated with either plant height
or seeds weight through GWAS analysis. We first calculated the first five
principal eigenvectors from principal components analysis (PCA) using
GCTA (Yang et al., 2011) and included them as covariates in the GWAS
model to account for population genetic structure. GWAS analysis was
conducted using the program FaST-LMM, which calculates kinship as a
realised relationship matrix and fits a Factored Spectrally Transformed
Linear Mixed Model (Listgarten et al., 2012). We used Bonferroni
correction to determine significant SNPs.
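The Bonferroni step can be sketched as follows. The family-wise error
rate of 0.05 is an assumption, since the text does not state the level
used, and the SNP identifiers are hypothetical. The number of tests is
taken to be the number of SNPs analysed.

```python
from typing import Dict, List, Optional


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value cutoff that controls the family-wise error at alpha."""
    return alpha / n_tests


def significant_snps(
    p_values: Dict[str, float],
    alpha: float = 0.05,
    n_tests: Optional[int] = None,
) -> List[str]:
    """Return SNP identifiers whose p-value falls below the cutoff.

    n_tests defaults to the number of p-values supplied; pass it
    explicitly when p_values is only a subset of all tested SNPs.
    """
    if n_tests is None:
        n_tests = len(p_values)
    cutoff = bonferroni_threshold(n_tests, alpha)
    return [snp for snp, p in p_values.items() if p < cutoff]


# Assumed alpha of 0.05 with the 133,588 SNPs retained after filtering.
cutoff = bonferroni_threshold(133_588)
```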
Finally, we evaluated linkage disequilibrium (LD) decay using the r²
parameter between all pairwise SNP comparisons within a genomic window
of 5 Mb, using PLINK ver 1.9 (Chang et al., 2015) and PopLDdecay (Zhang
et al., 2019). We examined the distances between immediately
neighbouring SNP pairs in which one SNP was significantly associated
with plant height and the other with seeds weight, and evaluated them
against the global LD decay pattern according to their separation on
the chromosome.
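A minimal sketch of the LD decay summary is given below. It computes r²
as the squared Pearson correlation of allele counts between two SNPs and
averages it within distance bins up to 5 Mb. The 100 kb bin width, the
function names, and the 0/1/2 genotype coding are assumptions made for
illustration; the sketch is not a substitute for PLINK or PopLDdecay,
which handle missing data and large datasets efficiently.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence


def r_squared(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Pearson correlation between two allele-count vectors."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_a * var_b)


def ld_decay(
    positions: Sequence[int],
    genotypes: Sequence[Sequence[float]],
    max_dist: int = 5_000_000,
    bin_size: int = 100_000,
) -> Dict[int, float]:
    """Mean r² per distance bin for all SNP pairs within max_dist bp.

    positions[i] is the base-pair position of SNP i on one chromosome and
    genotypes[i] holds its allele counts across samples. Keys of the
    returned dict are the lower bounds of the distance bins.
    """
    binned: Dict[int, List[float]] = defaultdict(list)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            dist = abs(positions[j] - positions[i])
            if dist > max_dist:
                continue
            binned[dist // bin_size].append(r_squared(genotypes[i], genotypes[j]))
    return {b * bin_size: mean(v) for b, v in sorted(binned.items())}
```

The r² of a height-associated and seeds-weight-associated neighbouring
pair can then be compared with the mean r² of the bin that contains
their physical distance.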