Phylogenetic analyses
The nucleotide sequences of the 13 protein-coding genes (PCGs) and two ribosomal RNA genes (12S rRNA and 16S rRNA), as well as the amino acid sequences of the 13 PCGs, were aligned using MAFFT v7.394 (Katoh & Standley, 2013) with the high-accuracy L-INS-i strategy, trimmed using trimAl v1.4.1 (Capella-Gutiérrez et al., 2009) with the heuristic method ‘automated1’ to remove gap-only and ambiguous-only positions, and concatenated using FASconCAT-G v1.04 (Kück & Longo, 2014). We generated three matrices for tree inference: (1) amino acid sequences of the 13 PCGs (PCGs_faa); (2) nucleotide sequences of the 13 PCGs with third codon positions excluded (PCG12_fna); and (3) PCG12_fna plus the two rRNA genes (PCG12_fna plus two rRNAs). Third codon positions were excluded from the nucleotide-based analyses to reduce the risk of bias or long-branch attraction caused by substitution saturation among species belonging to different genera. We applied both partitioned and non-partitioned approaches to phylogenetic inference. Partitioned maximum likelihood (ML) reconstructions were performed using IQ-TREE v1.6.3 (Nguyen et al., 2015) with 1,000 ultrafast bootstrap (UFBoot) replicates (Hoang et al., 2018) and 1,000 SH-aLRT replicates (Guindon et al., 2010). The option ‘-m MFP+MERGE’ was applied to all three matrices to select the partitioning scheme and substitution models. Non-partitioned reconstructions used site-heterogeneous models in both ML and Bayesian inference (BI). For the PCGs_faa matrix, the posterior mean site frequency (PMSF) model (Wang et al., 2018) was applied in IQ-TREE by specifying a profile mixture model with the option ‘-m mtInv+C60+FO+R’, with the corresponding partitioned tree (PCGs_faa matrix) used as the initial guide tree. BI was also performed for the PCGs_faa matrix using PhyloBayes MPI v1.8b (Lartillot et al., 2013).
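To illustrate how the PCG12_fna matrix differs from a full codon alignment, the following minimal Python sketch removes third codon positions from an in-frame alignment. It is an illustration only, not the procedure performed by the programs cited above. The function names and toy sequences are hypothetical, and the sketch assumes that each gene alignment begins at codon position 1 and has a length divisible by three.

```python
from typing import Dict


def drop_third_codon_positions(aligned_cds: str) -> str:
    """Return an aligned coding sequence with every third codon position removed.

    Assumption: the alignment is in frame (column 1 is codon position 1).
    """
    if len(aligned_cds) % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")
    # Keep codon positions 1 and 2 (0-based column index modulo 3 is 0 or 1).
    return "".join(base for i, base in enumerate(aligned_cds) if i % 3 != 2)


def build_pcg12(alignment: Dict[str, str]) -> Dict[str, str]:
    """Apply the filter to every taxon of one in-frame gene alignment."""
    return {taxon: drop_third_codon_positions(seq) for taxon, seq in alignment.items()}


# Hypothetical toy alignment: two taxa, four codons each.
example = {"taxon_A": "ATGGCC---TAA", "taxon_B": "ATGGCTAAATAA"}
pcg12 = build_pcg12(example)
```

Gap columns are filtered by position in the same way as nucleotides, so the codon frame is preserved across taxa and each retained gene alignment is two thirds of its original length.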
Two independent chains were run for 10,000 generations under the CAT+GTR model (Lartillot & Philippe, 2004), using the tree from the PMSF ML analysis as the starting tree. Convergence was assessed with the programs bpcomp (maxdiff value) and tracecomp (minimum effective size); chains were considered converged when the maxdiff value was below 0.3 and the minimum effective sizes exceeded 50. In addition, pairwise distances (p-distances) calculated from the nucleotide sequences of the 13 PCGs are shown in Table S3.
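The convergence rule stated above can be written as a simple check. The sketch below is a minimal illustration that assumes the maxdiff value and the minimum effective size have already been read from the bpcomp and tracecomp output; the function name, parameter names and example values are hypothetical, and only the two thresholds (0.3 and 50) come from the text.

```python
def chains_converged(maxdiff: float,
                     min_effsize: float,
                     maxdiff_cutoff: float = 0.3,
                     effsize_cutoff: float = 50.0) -> bool:
    """Return True when both convergence criteria from the text are met.

    maxdiff      : largest bipartition frequency difference between chains
                   (as reported by bpcomp)
    min_effsize  : smallest effective sample size across traced parameters
                   (as reported by tracecomp)
    """
    # Both conditions are strict inequalities, matching "smaller than 0.3"
    # and "larger than 50".
    return maxdiff < maxdiff_cutoff and min_effsize > effsize_cutoff


# Hypothetical values, for illustration only (not results from this study).
ok = chains_converged(maxdiff=0.12, min_effsize=180.0)
not_ok = chains_converged(maxdiff=0.35, min_effsize=180.0)
```

Keeping the cutoffs as parameters makes it straightforward to apply a stricter rule (for example, a lower maxdiff cutoff) without changing the logic.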