Abstract
Arsenic (As) is the most ubiquitous toxic metalloid in nature. Microbe-mediated
As metabolism plays an important role in global As biogeochemical
processes, greatly changing the toxicity and bioavailability of As. While
metagenomic sequencing may advance our understanding of the As
metabolism capacity of microbial communities in different environments,
accurate metagenomic profiling of As metabolism remains challenging due
to low coverage and inaccurate definitions of As metabolism gene
families in public orthology databases. Here we
developed a manually curated As metabolism
gene database (AsgeneDB) comprising 414,773 representative sequences
from 59 As metabolism gene families, which are
affiliated with 1,653 microbial genera from 46 phyla. We then applied
AsgeneDB for functional and taxonomic profiling of As metabolism in
metagenomes from various habitats (freshwater, hot spring, marine
sediment, and soil).
Compared with other databases, AsgeneDB substantially improved the mapping ratio
of short reads in metagenomes from various environments.
Our results indicate that the diversity and importance of microbial arsenic
metabolism in the environment remain to be explored. In addition, we
developed an R package, Asgene, to facilitate the statistical analysis of
metagenomic data. AsgeneDB and the associated R package Asgene will greatly
promote the study of arsenic metabolism in microbial communities in various
environments.