Gene families and phylogenetic analysis
OrthoMCL was used to identify orthologous groups among 12 Gramineae
species (O. kokonorica, C. songorica, O. thomaeum, E. tef,
S. bicolor, Z. mays, Digitaria exilis, Panicum hallii, Setaria italica,
Brachypodium distachyon, Hordeum vulgare and O. sativa), two commelinid
species (Ananas comosus and Musa acuminata), one other monocot species
(Phalaenopsis equestris)
and one rosid species (A. thaliana). For the three tetraploids
(i.e., O. kokonorica, C. songorica and E. tef),
both subgenomes were used for orthologous group construction and
phylogenetic analysis. Gene family expansion and contraction across these
16 species were inferred using CAFE v. 3.1 (Han et al., 2013), and
significantly expanded or contracted gene families were identified at
p < 0.01. We then performed GO enrichment and KEGG pathway analyses on
the expanded gene families in O. kokonorica.
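
As an illustration of the significance filter described above, the following minimal Python sketch keeps only families with p < 0.01 and labels each as expanded or contracted from the change in gene count. The record layout, the field names and the family identifiers are hypothetical and do not reproduce the CAFE report format; only the 0.01 cutoff comes from the text.

```python
from dataclasses import dataclass

P_CUTOFF = 0.01  # significance threshold stated in the text


@dataclass
class FamilyChange:
    """Hypothetical per-family record for one focal species."""
    family_id: str     # made-up orthologous-group identifier
    parent_count: int  # inferred gene count at the parent node
    child_count: int   # gene count in the focal species
    p_value: float     # p-value reported for this family


def classify(families, cutoff=P_CUTOFF):
    """Return (expanded, contracted) family IDs with p_value < cutoff."""
    expanded, contracted = [], []
    for fam in families:
        if fam.p_value >= cutoff:
            continue  # not significant at the chosen threshold
        delta = fam.child_count - fam.parent_count
        if delta > 0:
            expanded.append(fam.family_id)
        elif delta < 0:
            contracted.append(fam.family_id)
        # delta == 0: significant but no net size change; left unlabelled
    return expanded, contracted


# Made-up example values, for illustration only.
families = [
    FamilyChange("OG0001", parent_count=4, child_count=11, p_value=0.0004),
    FamilyChange("OG0002", parent_count=9, child_count=3, p_value=0.0080),
    FamilyChange("OG0003", parent_count=5, child_count=6, p_value=0.2500),
    FamilyChange("OG0004", parent_count=7, child_count=7, p_value=0.0010),
]
expanded, contracted = classify(families)
print("expanded:", expanded)
print("contracted:", contracted)
```

In this sketch the expanded list is what would be passed on to the GO and KEGG analyses; how the enrichment itself was computed is not stated in the text and is not modelled here.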
Single-copy orthologous genes from the orthologous clustering results
were extracted and aligned using MAFFT v. 7.158b (Katoh & Standley,
2013). Gblocks v. 0.9171 (Castresana, 2000) was then used to remove
poorly aligned and highly divergent regions from the multiple sequence
alignments. A maximum likelihood phylogenetic tree was reconstructed
from the single-copy orthologous gene data set using RAxML v. 8.1.17
(Stamatakis, 2014) with the PROTGAMMAJTTX model and 1,000 bootstrap
replicates. Nucleotide substitution rates and divergence times were
estimated from the four-fold degenerate sites (4DTv) of single-copy
orthologous genes. The nucleotide substitution rate was estimated using
BASEML v. 4.8a, a program within PAML v. 4.0 (Yang, 2007). Species
divergence times were inferred with MCMCtree in the PAML package,
calibrated with the approximate divergence time between P. equestris
and M. acuminata (102-120 Ma) from the TimeTree database
(http://www.timetree.org).
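
To make the four-fold degenerate site statistic concrete, the sketch below counts transversions at the third positions of four-fold degenerate codons in a pair of codon-aligned coding sequences. It is a simplified illustration, assuming the standard genetic code and gap-free, in-frame input; it applies no correction for multiple substitutions and does not stand in for the BASEML or MCMCtree analyses. The function name and the example sequences are hypothetical.

```python
# First two bases of the eight four-fold degenerate codon families in the
# standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def four_dtv(seq_a, seq_b):
    """Return (number of 4D sites, uncorrected transversion fraction).

    Both sequences must be codon-aligned, in frame and of equal length.
    A codon pair counts as a 4D site only if both codons share the same
    four-fold degenerate two-base prefix.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and equal in length")
    sites = 0
    transversions = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        prefix = codon_a[:2]
        if prefix != codon_b[:2] or prefix not in FOURFOLD_PREFIXES:
            continue  # not a shared four-fold degenerate codon
        third_a, third_b = codon_a[2], codon_b[2]
        if third_a not in BASES or third_b not in BASES:
            continue  # skip gaps and ambiguous bases
        sites += 1
        # A transversion swaps a purine for a pyrimidine or vice versa.
        if (third_a in PURINES) != (third_b in PURINES):
            transversions += 1
    if sites == 0:
        raise ValueError("no shared four-fold degenerate sites found")
    return sites, transversions / sites


# Made-up sequences: codons GCT/GCA, GGA/GGG, ACC/ACC, CTG/CTT, ATG/ATG.
seq_a = "GCTGGAACCCTGATG"
seq_b = "GCAGGGACCCTTATG"
n_sites, rate = four_dtv(seq_a, seq_b)
print(n_sites, rate)
```

In the example, four codon pairs qualify as four-fold degenerate sites; two of them differ by a transversion (T/A and G/T), one by a transition (A/G) and one is identical, giving an uncorrected value of 0.5.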