2.7 Phylogenetic and gene family evolutionary analyses
Single-copy orthologs from all species were analyzed using the longest
transcript of each gene. The single-copy
orthologous genes shared by the above 11 species (including C. fluminea)
were aligned using MUSCLE (version 3.8.31) (Edgar, 2004). The
super-alignment of nucleotide sequences was used to infer a reference tree
topology with PhyML (version 3.3) (Guindon et al., 2010). The
divergence times among species were roughly estimated by the MCMCTree
program of the PAML package (version 4.7a) (Yang, 2007) with the
approximate likelihood calculation method. Calibration times were taken
from the TimeTree database (http://www.timetree.org/) (Kumar, Stecher,
Suleski, & Hedges, 2017).
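
The construction of a super-alignment from per-gene alignments can be illustrated with a minimal Python sketch. It assumes each single-copy ortholog has already been aligned (for example with MUSCLE) and is available as FASTA text with one sequence per species. The function names, species labels, and toy sequences are hypothetical and are not taken from the study.

```python
from collections import OrderedDict


def parse_fasta(text):
    """Parse FASTA text into an ordered {species: sequence} mapping."""
    records = OrderedDict()
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the species label.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate_alignments(alignments, species):
    """Join per-gene alignments end to end, giving one row per species."""
    parts = {sp: [] for sp in species}
    for aln in alignments:
        # A single-copy ortholog alignment must cover every species once.
        if set(aln) != set(species):
            raise ValueError("alignment does not contain exactly the expected species")
        # All rows of one alignment must have the same number of columns.
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences in an alignment must have equal length")
        for sp in species:
            parts[sp].append(aln[sp])
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy input: two aligned genes for two hypothetical species.
gene1 = ">spA\nATG-CA\n>spB\nATGGCA\n"
gene2 = ">spA\nTTAG\n>spB\nTT-G\n"
supermatrix = concatenate_alignments(
    [parse_fasta(gene1), parse_fasta(gene2)], ["spA", "spB"]
)
```

Every row of the resulting matrix has the same length, which is what tree-building programs expect of a concatenated alignment.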
Based on the divergence times and phylogenetic relationships, CAFE
(version 4.2) (De Bie, Cristianini, Demuth, & Hahn, 2006) was used to
analyze gene family evolution. Gene family expansion and contraction
were assessed by comparing each species with its ancestor. Genes from
the expanded families of C. fluminea were extracted and subjected to GO
and KEGG functional enrichment analysis to infer their functions.
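
The two ideas in this step, comparing family sizes against the ancestor and testing expanded genes for functional enrichment, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study. CAFE fits a probabilistic birth-death model rather than the plain count comparison shown here, and the text does not name the enrichment statistic, so the one-sided hypergeometric test is an assumption. All function names and numbers are hypothetical.

```python
from math import comb


def classify_family(ancestor_count, species_count):
    """Label a gene family by comparing its size in a species with its ancestor."""
    if species_count > ancestor_count:
        return "expanded"
    if species_count < ancestor_count:
        return "contracted"
    return "unchanged"


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background set
    K: background genes annotated with the term (GO or KEGG)
    n: genes in the test set (e.g. genes from expanded families)
    k: test-set genes annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy family sizes: {family: (ancestor_count, species_count)}.
families = {"fam1": (3, 7), "fam2": (5, 2), "fam3": (4, 4)}
labels = {name: classify_family(a, s) for name, (a, s) in families.items()}

# Toy enrichment: 2 of 2 test genes carry a term held by 2 of 4 background genes.
p_value = hypergeom_enrichment_p(k=2, n=2, K=2, N=4)
```

In practice the p-values for all tested terms would also need a multiple-testing correction, which is omitted here.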