Sequence analysis
The resulting reads were demultiplexed with zero mismatches allowed and then merged using PEAR (Zhang et al. 2014) with a minimum overlap of 50 bp and a minimum quality of 20. The merged reads were then quality filtered, retaining only reads in which at least 90 % of bases had a quality score of ≥ 30, and converted to fasta format using the fastx toolkit (Gordon & Hannon 2010). PCR primers were trimmed and all samples were merged into a combined file. USEARCH (Edgar 2010 & 2016) was used to dereplicate the sequences and cluster them into zero radius OTUs (zOTUs) using the unoise3 command and into 3 % radius OTUs (OTUs) using the cluster_otus command. While OTUs should approximate species in the dataset, zOTUs represent haplotypic variation within individual species. Using BLASTn (Altschul et al. 1990), we linked each OTU to its matching zOTUs. Each 3 % radius OTU should consist of ≥ 1 zOTUs, with a maximum dissimilarity of 3 % (i.e., a minimum similarity of 97 %). We then assigned taxonomy to all resulting OTUs and their matching zOTUs using BLASTn against the complete GenBank database (downloaded 02/2021) with a maximum of 10 matches. We retained only sequences identified as arthropods and assigned them to order. We refrained from annotation at lower taxonomic levels because barcode reference libraries are incomplete for most Hawaiian arthropods. Spiders (order Araneae) are a notable exception, with well-developed reference libraries, and the majority of spiders could be classified to species level. The resulting well-resolved taxonomic information for spiders was used to classify OTUs into native and non-native species in Hawaiʻi. Using spiders as a model taxon, we could thus test the resilience of different habitat types to biological invasions.
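The read-level quality criterion described above (at least 90 % of bases at a quality of ≥ 30) can be illustrated with a minimal Python sketch. It is not the fastx toolkit's code and was not part of the analysis. The function names are hypothetical, and the Phred+33 quality encoding is an assumption.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_line]


def passes_quality_filter(quality_line, min_quality=30, min_percent=90, offset=33):
    """Return True if at least min_percent % of bases have quality >= min_quality."""
    scores = phred_scores(quality_line, offset)
    if not scores:
        # An empty read cannot satisfy the criterion.
        return False
    n_good = sum(1 for q in scores if q >= min_quality)
    # Integer comparison avoids floating-point rounding at the threshold.
    return n_good * 100 >= min_percent * len(scores)
```

Under these assumptions, a 10-base read with nine bases at Q40 and one at Q20 is retained (exactly 90 %), whereas a read with two bases at Q20 is discarded.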
A zOTU table was created using the otutab command in USEARCH. The resulting table consisted of five entries for each sample, i.e., the four size categories and the Collembola sample. We then used the number of specimens in each of the five categories to rarefy this table and merge the categories into one final sample, following Lim et al. (2022). Briefly, the reads of each category were subsampled in proportion to the number of specimens in that category, such that each specimen was represented by 15 reads in the final table. After rarefaction, the five categories were merged into final counts for each site. Using this final table, we calculated patterns of alpha and beta diversity for zOTUs and 3 % radius OTUs using the vegan package (Oksanen et al. 2007) in R (R Core Team 2021).
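The specimen-based rarefaction described above can be sketched in Python as follows. This is only an illustration of the principle under stated assumptions, not the procedure of Lim et al. (2022) as implemented. The function names are hypothetical, reads are assumed to be drawn at random without replacement, and a category with fewer reads than its target depth is assumed to be kept in full.

```python
import random
from collections import Counter

READS_PER_SPECIMEN = 15  # each specimen is represented by 15 reads


def rarefy_category(zotu_counts, n_specimens, rng):
    """Subsample one category to 15 reads per specimen, without replacement."""
    target = READS_PER_SPECIMEN * n_specimens
    # Expand the counts into a pool containing one entry per read.
    pool = [zotu for zotu, count in zotu_counts.items() for _ in range(count)]
    if len(pool) <= target:
        # Assumption: keep all reads if the category is shallower than the target.
        return Counter(pool)
    return Counter(rng.sample(pool, target))


def merge_categories(categories, seed=0):
    """Rarefy each (zotu_counts, n_specimens) pair and sum the results per site."""
    rng = random.Random(seed)
    merged = Counter()
    for zotu_counts, n_specimens in categories:
        merged.update(rarefy_category(zotu_counts, n_specimens, rng))
    return merged
```

For example, a category with 10 specimens would contribute 150 reads to the merged site-level counts, regardless of how many reads it originally received.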