Data processing for microbiome analysis
Fastq files were imported and processed with QIIME 2 (v2021.11)
[https://qiime2.org/] following the public tutorial for the Casava 1.8
paired-end demultiplexed sample format. Sequences were denoised with the
DADA2 method via qiime dada2 denoise-paired to obtain an amplicon
sequence variant (ASV) table. Taxonomy was assigned to ASVs with qiime
feature-classifier (classify-sklearn) against the pre-trained Silva v138
database for the V4 region (515F/806R) using RESCRIPt. Untargeted sequences
were removed. The potential for contamination was addressed by
co-sequencing DNA amplified from specimens and from two each of
template-free controls and extraction kit reagents processed the same
way as the specimens. ASVs were considered putative contaminants (and
were removed) if their mean abundance in controls reached or exceeded
25% of their mean abundance in specimens. Finally, samples with fewer
than 1000 high-quality reads were excluded from downstream analysis. An
average of 8197 quality-filtered reads were generated per wild sample.
An average of 23408 quality-filtered reads were generated per laboratory
sample. The total number of ASVs from the wild and laboratory datasets
was 4555 (including singletons, i.e., ASVs occurring once with a count
of 1).
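
The contaminant rule and the read-depth cutoff described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the sample and ASV identifiers, and the toy counts are hypothetical, raw read counts are assumed as the measure of abundance, and the sketch assumes that the 1000-read threshold is applied after contaminant removal. ASVs absent from all controls are never flagged, which is also an assumption.

```python
from statistics import mean

def flag_contaminants(counts, specimen_ids, control_ids, ratio=0.25):
    """Return ASVs whose mean control abundance is >= ratio * mean specimen abundance."""
    flagged = set()
    for asv, per_sample in counts.items():
        spec_mean = mean(per_sample.get(s, 0) for s in specimen_ids)
        ctrl_mean = mean(per_sample.get(c, 0) for c in control_ids)
        # Assumption: an ASV never seen in controls cannot be a contaminant.
        if ctrl_mean > 0 and ctrl_mean >= ratio * spec_mean:
            flagged.add(asv)
    return flagged

def filter_samples(counts, sample_ids, excluded_asvs, min_reads=1000):
    """Keep samples with at least min_reads reads after dropping excluded ASVs."""
    kept = []
    for s in sample_ids:
        depth = sum(per_sample.get(s, 0)
                    for asv, per_sample in counts.items()
                    if asv not in excluded_asvs)
        if depth >= min_reads:
            kept.append(s)
    return kept

# Hypothetical toy table: ASV -> {sample: read count}.
counts = {
    "ASV1": {"S1": 900, "S2": 700, "NTC1": 0, "NTC2": 2, "KIT1": 0, "KIT2": 2},
    "ASV2": {"S1": 40, "S2": 60, "NTC1": 30, "NTC2": 20, "KIT1": 10, "KIT2": 20},
    "ASV3": {"S1": 300, "S2": 0},
}
specimens = ["S1", "S2"]
controls = ["NTC1", "NTC2", "KIT1", "KIT2"]  # two template-free, two kit-reagent

flagged = flag_contaminants(counts, specimens, controls)
kept = filter_samples(counts, specimens, flagged)
print(sorted(flagged), kept)
```

In this toy table, ASV2 averages 20 reads in controls against 50 in specimens, so it meets the 25% threshold and is flagged; sample S2 then retains 700 reads and falls below the 1000-read cutoff.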