Phylogenetic analyses
Cox1 . We conducted a maximum likelihood (ML) phylogenetic analysis for the cox1 data set using the program IQTREE with 10,000 ultrafast bootstrap replicates (UFBoot2; Nguyen et al. 2015; Hoang et al. 2018) at the CEBIV, Austria web server (Trifinopoulos et al. 2016). We considered three partitions according to codon positions and selected the most appropriate models of evolution for each partition with the program ModelFinder (Kalyaanamoorthy et al. 2017). The selected evolutionary models were position 1= HKY+F+I, position 2= HKY+F, position 3= TIM2+F+G4. Branches with UFBoot2 support ≥95% were considered as well supported (Hoang et al. 2018). We also built a haplotype network in POPART (Leigh and Bryant 2015) using TCS (Clement et al. 1997) to have a better visualization of the relationships amongcox1 haplotypes.
3RAD. We carried out ML and Bayesian phylogenetic analyses with the four selected matrices including the outgroup (clustering threshold = 98%, min sample locus= 25, 28, 30, 33). The ML analyses were conducted with the program RAxML version 8.0 (Stamatakis 2014) using the GTR-GAMMA model. Branch support was estimated using the automatic bootstrap function, which calculates a stopping rule to determine when enough replicates have been generated (Pattengale et al. 2010). We conducted the Bayesian analyses with the program ExaBayes version 1.5 (Aberer et al. 2014). These analyses were run using the generalized time-reversible model (GTR+G) and five independent MCMC chains of 1,000,000 generations each. The first 100,000 trees (10%) were discarded as burn-in for each MCMC run prior convergence (i.e. when maximum discrepancies across chains <0.1). We assessed burn-in, convergence among runs and run performance examining the resulting parameter files with the program TRACER version 1.7.0 (Rambaut et al 2018). We computed consensus trees using the consensus utility of ExaBayes.
We also employed the program SVDquartets version 1.0 (Chifman and Kubatko 2014) implemented in PAUP version 4.0a (Swofford et al. 2003) to carry out lineage-level phylogenetic analyses with SNP data using the above matrices. SVDquartets is a fast and robust method implemented in the latter program for species- and lineage-level phylogenetic analysis using multi-locus or SNP data under the coalescent model. We assessed variability in the estimated tree using a nonparametric bootstrapping with 500 replicates.
UCEs. We conducted ML analyses for the five UCE matrices (75, 80, 90, 95, 100% taxon occupancy) using the program RAxML version 8 (Stamatakis 2014) with the best tree plus rapid bootstrap search (‘-f a’ option) and 200 bootstrap replicates. We used the GTR+Γ model of sequence evolution for the best tree and bootstrap searches. We carried out these analyses using three different partition schemes (unpartitioned, data partitioned by locus, data pre-partitioned by locus). We selected the best evolutionary model for these partitions using the program PARTITIONFINDER version 2 (Lanfear et al. 2017) based on the Bayesian Information Criterion (BIC) and the rcluster option, which is more appropriate for larger data sets.
We carried out Bayesian analyses with the program Exabayes version 1.4.1 (Aberer et al. 2014). Each analysis consisted of two independent runs of 10 million generations each, two independent runs with Metropolis-Coupling in parallel to better sample parameter space, three heated and one cold chain per run, and sampling trees every 1000 generations. We linked branch lengths across partitions and ran each partitioned search for one million generations. Mixing and stationarity were monitored with the program TRACER version 1.6 (Rambaut and Drummond 2003). We built consensus trees using the consensus utility contained in the program Exabayes version 1.4.1 (Aberer et al. 2014) using a burn-in of 25%. To evaluate for the presence of reticulation, we used the neighbor-net method (Bryant and Moulton 2004) implemented in the program SplitsTree (Huson and Bryant 2006) for computing an unrooted phylogenetic network based on the alignment of the concatenated phased loci with 100% taxon occupancy.
For tree estimation based on the multispecies coalescent model, we generated gene trees from the 100% taxon occupancy matrix, phasing loci with the program RaxML with 200 bootstrap replicates. We then used the resulting gene trees to carry out analyses with the program ASTRAL v.4.10.8 (Mirarab et al. 2014; Mirarab and Warnow 2015), which used unrooted trees and missing data. We calculated nodal support with 200 multi-locus bootstrap replicates (Seo 2008).