Phylogenetic analyses
Cox1 . We conducted a maximum likelihood (ML) phylogenetic
analysis for the cox1 data set using the program IQTREE with
10,000 ultrafast bootstrap replicates (UFBoot2; Nguyen et al. 2015;
Hoang et al. 2018) at the CEBIV, Austria web server (Trifinopoulos et
al. 2016). We considered three partitions according to codon positions
and selected the most appropriate models of evolution for each partition
with the program ModelFinder (Kalyaanamoorthy et al. 2017). The selected
evolutionary models were position 1= HKY+F+I, position 2= HKY+F,
position 3= TIM2+F+G4. Branches with UFBoot2 support ≥95% were
considered as well supported (Hoang et al. 2018). We also built a
haplotype network in POPART (Leigh and Bryant 2015) using TCS (Clement
et al. 1997) to have a better visualization of the relationships amongcox1 haplotypes.
3RAD. We carried out ML and Bayesian phylogenetic analyses with the four
selected matrices including the outgroup (clustering threshold = 98%,
min sample locus= 25, 28, 30, 33). The ML analyses were conducted with
the program RAxML version 8.0 (Stamatakis 2014) using the GTR-GAMMA
model. Branch support was estimated using the automatic bootstrap
function, which calculates a stopping rule to determine when enough
replicates have been generated (Pattengale et al. 2010). We conducted
the Bayesian analyses with the program ExaBayes version 1.5 (Aberer et
al. 2014). These analyses were run using the generalized time-reversible
model (GTR+G) and five independent MCMC chains of 1,000,000 generations
each. The first 100,000 trees (10%) were discarded as burn-in for each
MCMC run prior convergence (i.e. when maximum discrepancies across
chains <0.1). We assessed burn-in, convergence among runs and
run performance examining the resulting parameter files with the program
TRACER version 1.7.0 (Rambaut et al 2018). We computed consensus trees
using the consensus utility of ExaBayes.
We also employed the program SVDquartets version 1.0 (Chifman and
Kubatko 2014) implemented in PAUP version 4.0a (Swofford et al. 2003) to
carry out lineage-level phylogenetic analyses with SNP data using the
above matrices. SVDquartets is a fast and robust method implemented in
the latter program for species- and lineage-level phylogenetic analysis
using multi-locus or SNP data under the coalescent model. We assessed
variability in the estimated tree using a nonparametric bootstrapping
with 500 replicates.
UCEs. We conducted ML analyses for the five UCE matrices (75, 80, 90,
95, 100% taxon occupancy) using the program RAxML version 8 (Stamatakis
2014) with the best tree plus rapid bootstrap search (‘-f a’ option) and
200 bootstrap replicates. We used the GTR+Γ model of sequence evolution
for the best tree and bootstrap searches. We carried out these analyses
using three different partition schemes (unpartitioned, data partitioned
by locus, data pre-partitioned by locus). We selected the best
evolutionary model for these partitions using the program
PARTITIONFINDER version 2 (Lanfear et al. 2017) based on the Bayesian
Information Criterion (BIC) and the rcluster option, which is more
appropriate for larger data sets.
We carried out Bayesian analyses with the program Exabayes version 1.4.1
(Aberer et al. 2014). Each analysis consisted of two independent runs of
10 million generations each, two independent runs with
Metropolis-Coupling in parallel to better sample parameter space, three
heated and one cold chain per run, and sampling trees every 1000
generations. We linked branch lengths across partitions and ran each
partitioned search for one million generations. Mixing and stationarity
were monitored with the program TRACER version 1.6 (Rambaut and Drummond
2003). We built consensus trees using the consensus utility contained in
the program Exabayes version 1.4.1 (Aberer et al. 2014) using a burn-in
of 25%. To evaluate for the presence of reticulation, we used the
neighbor-net method (Bryant and Moulton 2004) implemented in the program
SplitsTree (Huson and Bryant 2006) for computing an unrooted
phylogenetic network based on the alignment of the concatenated phased
loci with 100% taxon occupancy.
For tree estimation based on the multispecies coalescent model, we
generated gene trees from the 100% taxon occupancy matrix, phasing loci
with the program RaxML with 200 bootstrap replicates. We then used the
resulting gene trees to carry out analyses with the program ASTRAL
v.4.10.8 (Mirarab et al. 2014; Mirarab and Warnow 2015), which used
unrooted trees and missing data. We calculated nodal support with 200
multi-locus bootstrap replicates (Seo 2008).