2.6 Gene family identification
Protein sequences from C. fluminea and ten other representative species (Capitella teleta, Lingula anatina, Octopus vulgaris, Lottia gigantea, Crassostrea gigas, Crassostrea virginica, Pinctada imbricata, Mizuhopecten yessoensis, Mytilus coruscus, and Bathymodiolus platifrons) were retrieved from the corresponding databases and aligned using BLAST (version 2.2.31) (Altschul, Gish, Miller, Myers, & Lipman, 1990) with a maximum E-value of 1e−5. Proteins longer than 100 amino acids were searched against the Pfam database (https://pfam.xfam.org) using Pfam scan (El-Gebali et al., 2018). Ortholog groups representing gene families were clustered using OrthoMCL (version 2.0.9) (Li, Stoeckert, & Roos, 2003). C. fluminea and four selected shellfish (C. gigas, L. gigantea, B. platifrons, and C. virginica) were grouped together to analyze gene family characteristics.
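The two numeric filters described above (the E-value cutoff of 1e−5 and the length requirement of more than 100 amino acids) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that the BLAST results are available in the standard 12-column tabular format (produced by `-outfmt 6` in BLAST+), which the text does not specify, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

MAX_EVALUE = 1e-5  # E-value cutoff stated in the text
MIN_LENGTH = 100   # proteins must be longer than this many amino acids


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {sequence_id: sequence}."""
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def filter_by_length(proteins: Dict[str, str],
                     min_length: int = MIN_LENGTH) -> Dict[str, str]:
    """Keep only proteins strictly longer than min_length residues."""
    return {name: seq for name, seq in proteins.items() if len(seq) > min_length}


def filter_blast_hits(lines: Iterable[str],
                      max_evalue: float = MAX_EVALUE
                      ) -> Iterator[Tuple[str, str, float]]:
    """Yield (query, subject, evalue) for hits at or below the cutoff.

    Assumes 12-column tabular BLAST output, where column 11 (index 10)
    holds the E-value. This format is an assumption, not from the paper.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            yield fields[0], fields[1], evalue
```

In this sketch, the length-filtered proteins would be the input to the Pfam search, and the E-value-filtered hits would be the input to ortholog clustering; the actual input formats expected by Pfam scan and OrthoMCL are not reproduced here.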