DNA sequence data processing
Our DNA sequence data processing is detailed in Bessey et al. (2021) and
directly follows the procedure described at
https://pythonhosted.org/OBITools/wolves.html;
we briefly outline those procedures here. Data generated by
Illumina sequencing were processed using OBITools
(https://pythonhosted.org/OBITools/) command ‘ngsfilter’ to assign
each sequence record to the corresponding sample based on tag and
primer. Then ‘obiuniq’ was used to dereplicate reads into unique
sequences. Reads shorter than 190 bp and reads with counts below 10 were
discarded. Denoising was performed using ‘obiclean’ to retain only
sequences with no variants having a count greater than 5% of their
own. Sequences were assigned to taxa using ‘ecotag’, and a result table
was generated using ‘obiannotate’.
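The dereplication and filtering steps above can be illustrated with a minimal Python sketch. This is not the OBITools implementation: the function names and the toy reads are hypothetical, and only the two thresholds (190 bp and a count of 10) come from the text. Denoising and taxonomic assignment are not reproduced here.

```python
from collections import Counter

MIN_LENGTH = 190  # minimum sequence length in bp (threshold from the text)
MIN_COUNT = 10    # minimum read count per unique sequence (from the text)


def dereplicate(reads):
    """Collapse identical reads into unique sequences with their counts."""
    return Counter(reads)


def filter_sequences(counts, min_length=MIN_LENGTH, min_count=MIN_COUNT):
    """Keep unique sequences that are both long enough and abundant enough."""
    return {
        seq: n
        for seq, n in counts.items()
        if len(seq) >= min_length and n >= min_count
    }


# Hypothetical toy reads for illustration only, not data from the study.
long_a = "ACGT" * 50   # 200 bp, seen 12 times: kept
long_b = "TTGCA" * 40  # 200 bp, seen 3 times: discarded (count too low)
short = "ACGT" * 10    # 40 bp, seen 20 times: discarded (too short)
reads = [long_a] * 12 + [long_b] * 3 + [short] * 20

kept = filter_sequences(dereplicate(reads))
for seq, n in kept.items():
    print(len(seq), n)
```

A sequence is discarded if it fails either criterion, so a retained sequence must satisfy both the length and the count threshold.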
Our reference database was built in silico using our universal fish
primer assay on 03/08/2021. Only fish species with identities ≥ 90% and
whose sequence variants could be assigned to family level or lower were
included. Each variant was assigned a single name (e.g., to family, genus
or species) and directly compared to the known species in the mesocosm
(Table 2). For example, an assignment to genus could be compared to the
species of that genus that are known to inhabit the mesocosm.
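The comparison between an assignment and the known mesocosm species can be sketched as a simple lookup. The following Python example is illustrative only: the `Species` record, the `candidate_species` function and the placeholder taxon names are assumptions and do not reproduce Table 2. Only the 90% identity threshold and the three ranks come from the text.

```python
from dataclasses import dataclass

RANKS = ("family", "genus", "species")
MIN_IDENTITY = 0.90  # identity threshold from the text


@dataclass(frozen=True)
class Species:
    family: str
    genus: str
    species: str  # full binomial name


def candidate_species(rank, name, identity, known):
    """Return the known species compatible with an assignment at `rank`."""
    if rank not in RANKS:
        raise ValueError(f"unsupported rank: {rank}")
    if identity < MIN_IDENTITY:
        return []  # assignment falls below the identity threshold
    return [sp for sp in known if getattr(sp, rank) == name]


# Placeholder taxa for illustration only, not the species in Table 2.
known = [
    Species("Aidae", "Aus", "Aus primus"),
    Species("Aidae", "Aus", "Aus secundus"),
    Species("Aidae", "Bus", "Bus tertius"),
    Species("Cidae", "Cus", "Cus quartus"),
]

# A genus-level assignment is compared to every known species of that genus.
matches = candidate_species("genus", "Aus", 0.97, known)
print([sp.species for sp in matches])
```

Under this scheme, an assignment at a higher rank yields a larger set of candidate species, while a species-level assignment yields at most one.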
Statistics
A Box-Cox transformation normalized the data (assessed with the
Shapiro-Wilk test), which allowed the use of parametric statistics. We
used an analysis of variance on the linear model fit of mean Cq value by
material, followed by Tukey's Honest Significant Difference test to
compare materials. We also
used an analysis of variance on the linear model fit between mean Cq
value and submersion duration for each material. A linear model fit of
mean Cq values by material and submersion duration, and their
interactions, produced the same results. The same analyses were used to
test for differences in the number of species detected between
materials and submersion intervals.
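As a rough illustration of the transformation and the analysis of variance described above, the following Python sketch applies a Box-Cox transformation with a fixed λ and computes the F statistic of a one-way analysis of variance. It is not the R code used in the study: the function names, the value of λ, the material labels and the Cq values are all hypothetical. The estimation of λ, the Shapiro-Wilk test, p-values and the Tukey comparisons are omitted.

```python
import math
from statistics import fmean


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(v for g in groups for v in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical mean Cq values per material, not data from the study.
cq_by_material = {
    "material_a": [28.1, 29.4, 27.8, 28.9],
    "material_b": [31.2, 30.5, 32.0, 31.1],
    "material_c": [33.4, 34.1, 32.8, 33.9],
}
LAMBDA = 0.5  # assumed for illustration; in practice lambda is estimated

transformed = [box_cox(v, LAMBDA) for v in cq_by_material.values()]
f_stat, df_b, df_w = one_way_anova_f(transformed)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic is the ratio of the between-group mean square to the within-group mean square; converting it to a p-value requires the F distribution, which a statistics package such as R provides.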
We fitted a smoothing spline to the interval data to visualize how mean
Cq values and species detections varied with time. All statistics and
graphics were produced using R (version 2.14.0; R Development Core Team
2011), and graphics were edited in Inkscape (https://inkscape.org/).