Sequence analysis
The resulting reads were demultiplexed with zero mismatches allowed and
then merged using PEAR (Zhang et al. 2014) with a minimum overlap of 50
bp and a minimum quality of 20. The merged read pairs were additionally
quality filtered for a minimum of 90 % of bases at a quality of ≥ 30
and transformed to fasta format using the fastx toolkit (Gordon &
Hannon 2010). PCR primers were trimmed and all separate samples were
merged into a combined file. USEARCH (Edgar 2010 & 2016) was used to
dereplicate the sequences and cluster them into zero radius OTUs (zOTUs)
using the unoise3 command and 3 % radius OTUs (OTUs) using the
cluster_otus command. While OTUs should approximate species in
the dataset, zOTUs represent haplotypic variation in individual species.
Using BLASTn (Altschul et al. 1990), we linked OTUs and matching zOTUs.
Each 3 % radius OTU should consist of one or more zOTUs, with a maximum
dissimilarity of 3 %. We then assigned taxonomy to all resulting OTUs
and their matching zOTUs using BLASTn against the complete GenBank
database (downloaded 02/2021) with a maximum number of 10 matches. We
only retained sequences identified as arthropods and assigned order
status to them. We refrained from annotations to lower taxonomic levels,
due to the incomplete barcode reference libraries for most Hawaiian
arthropods. Spiders (order Araneae) are a notable exception, with
well-developed reference libraries. The majority of spiders could be
classified to species level. The resulting, well-resolved taxonomic
information for spiders was used to classify OTUs into native and
non-native species in Hawaiʻi. Using spiders as a model taxon, we could
thus test the resilience of different habitat types against biological
invasions.
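
The read-level quality criterion described above (at least 90 % of bases with a quality of ≥ 30) can be illustrated with a minimal Python sketch. This is not the fastx toolkit code used in the study. It only restates the criterion, and it assumes Phred+33 encoded quality strings. The function names and the in-memory record format are hypothetical.

```python
def passes_quality_filter(quality_string, min_quality=30, min_percent=90,
                          phred_offset=33):
    """Return True if at least min_percent of bases have quality >= min_quality.

    Assumption: qualities are Phred+33 encoded ASCII characters.
    """
    if not quality_string:
        return False
    n_good = sum(1 for ch in quality_string
                 if ord(ch) - phred_offset >= min_quality)
    # Compare as integers to avoid floating-point rounding at the threshold.
    return n_good * 100 >= min_percent * len(quality_string)


def filter_to_fasta(records, **kwargs):
    """Keep records passing the filter and format them as FASTA text.

    `records` is a hypothetical list of (read_id, sequence, quality) tuples.
    """
    lines = []
    for read_id, sequence, quality in records:
        if passes_quality_filter(quality, **kwargs):
            lines.append(f">{read_id}")
            lines.append(sequence)
    return "\n".join(lines)
```

In Phred+33 encoding, "I" corresponds to a quality of 40 and "5" to a quality of 20, so a ten-base read with one "5" passes the filter, whereas a read with two does not.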
A zOTU table was created using the otutab command in USEARCH. The
resulting table consisted of five entries for each sample, i.e., the
four size categories and the Collembola sample. We then used the number
of specimens in each of the five categories to rarefy this table and
merge the size categories into one final sample according to Lim et al.
(2022). Briefly, the total read number for each category was subsampled
by the number of specimens in that category. Each sampled specimen was
represented by 15 reads in the final table. After rarefaction, the five
categories were merged into final counts for each site. Using this final
table, we calculated patterns of alpha and beta diversity for zOTUs and
3 % radius OTUs in vegan (Oksanen et al. 2007) in R (R Core Team
2021).
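
The specimen-weighted rarefaction described above can be sketched in Python as follows. This is one possible reading of the procedure and not the implementation of Lim et al. (2022). It assumes that each category is subsampled without replacement to 15 reads per specimen, and that a category with fewer reads than this target is kept unchanged. All function names and the input format are hypothetical.

```python
import random
from collections import Counter

READS_PER_SPECIMEN = 15  # value stated in the text


def rarefy_category(zotu_counts, n_specimens, rng):
    """Subsample one category to READS_PER_SPECIMEN * n_specimens reads.

    `zotu_counts` maps zOTU labels to read counts for a single category.
    Assumption: categories with fewer reads than the target are kept as-is.
    """
    target = READS_PER_SPECIMEN * n_specimens
    # Expand counts into one label per read, then draw without replacement.
    pool = [zotu for zotu, count in zotu_counts.items() for _ in range(count)]
    if len(pool) <= target:
        return Counter(pool)
    return Counter(rng.sample(pool, target))


def merge_categories(categories, seed=0):
    """Rarefy every category and sum the results into one per-site table.

    `categories` is a list of (zotu_counts, n_specimens) pairs, e.g. the
    four size categories plus the Collembola sample.
    """
    rng = random.Random(seed)
    merged = Counter()
    for zotu_counts, n_specimens in categories:
        merged.update(rarefy_category(zotu_counts, n_specimens, rng))
    return merged
```

For example, a category with 2 specimens and 150 reads would contribute 30 reads to the merged table under these assumptions, so that read-rich categories do not dominate the final counts.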