Tree construction
Maximum-likelihood trees were constructed using IQ-Tree (Nguyen et al., 2015) under the best-fit model selected by ModelFinder (-m MFP option) (Kalyaanamoorthy et al., 2017) (Table S6). Branch support was assessed using IQ-Tree's ultrafast bootstrap and SH-aLRT (SH-like approximate likelihood ratio test) options (Minh, Nguyen and Von Haeseler, 2013; Hoang et al., 2018). Full treefiles, supports, and expanded accession data for all sequences are provided in Supplementary Datafile S5.
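As an illustration of the analysis described above, the following minimal Python sketch assembles an IQ-Tree command line that combines ModelFinder, ultrafast bootstrap, and SH-aLRT. It is not the exact invocation used in this study: the function name, the alignment filename, and the replicate counts (1000 each) are assumptions chosen for the example, and `-bb` is the IQ-TREE 1.x spelling of the ultrafast bootstrap option (IQ-TREE 2 documents it as `-B`).

```python
from typing import List


def build_iqtree_command(
    alignment: str,
    ufboot_reps: int = 1000,   # assumed value; not stated in the text
    alrt_reps: int = 1000,     # assumed value; not stated in the text
    iqtree_bin: str = "iqtree",
) -> List[str]:
    """Return an argument list for an IQ-Tree run (hypothetical helper).

    -s     input alignment
    -m MFP ModelFinder model selection followed by tree inference
    -bb    ultrafast bootstrap replicates (IQ-TREE 1.x option name)
    -alrt  SH-aLRT replicates
    """
    return [
        iqtree_bin,
        "-s", alignment,
        "-m", "MFP",
        "-bb", str(ufboot_reps),
        "-alrt", str(alrt_reps),
    ]


# "alignment.fasta" is a placeholder filename.
command = build_iqtree_command("alignment.fasta")
print(" ".join(command))
```

The argument list can be passed to `subprocess.run(command, check=True)` on a system where IQ-Tree is installed; it is built as a list rather than a single string so that filenames containing spaces are handled safely.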
Scripts and Jupyter Notebook files (Kluyver et al., 2016) used for automating alignment or treefile analysis, curation, and visualization are available at https://github.com/slschwartz/fournierlab-scripts.
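To show the kind of treefile analysis such scripts can perform, the sketch below extracts paired support values from a Newick string. It is an independent example, not code from the repository above. It assumes internal node labels are written as SH-aLRT/UFBoot (the order IQ-Tree uses when both tests are requested) and that taxon names contain no parentheses. The example tree, the function names, and the thresholds (SH-aLRT ≥ 80 and UFBoot ≥ 95, a commonly cited guideline) are illustrative and are not criteria stated in this study.

```python
import re
from typing import List, Tuple

# Matches an internal-node label of the form ")<SH-aLRT>/<UFBoot>".
_SUPPORT = re.compile(r"\)(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)")


def parse_supports(newick: str) -> List[Tuple[float, float]]:
    """Return (SH-aLRT, UFBoot) pairs for every labeled internal node."""
    return [(float(alrt), float(ufboot)) for alrt, ufboot in _SUPPORT.findall(newick)]


def well_supported(
    pairs: List[Tuple[float, float]],
    min_alrt: float = 80.0,    # assumed threshold
    min_ufboot: float = 95.0,  # assumed threshold
) -> List[Tuple[float, float]]:
    """Keep only nodes that meet both support thresholds."""
    return [p for p in pairs if p[0] >= min_alrt and p[1] >= min_ufboot]


# Hypothetical five-taxon tree used only for demonstration.
example = "((A:0.1,B:0.2)92.5/100:0.05,(C:0.1,D:0.3)61.2/88:0.02,E:0.4);"
supports = parse_supports(example)
print(supports)                  # [(92.5, 100.0), (61.2, 88.0)]
print(well_supported(supports))  # [(92.5, 100.0)]
```

A regular expression is sufficient for this simple label format; for trees with quoted names or nested comments, a dedicated Newick parser would be a safer choice.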
Gene and enzyme structural analysis
FIND (Murali et al., 2019) was used to identify structural features and conserved denitrification pathway genes in deep subsurface genomes. Putative domains within denitrification gene ORFs were identified and compared across genomes using NCBI's Conserved Domain Database (CDD) (Marchler-Bauer et al., 2015; Lu et al., 2020) and EMBL-EBI InterPro (Mitchell et al., 2019).
Existing enzyme structures for canonical denitrification genes were downloaded from the RCSB Protein Data Bank (PDB) (Berman et al., 2000). Anaerolineales-type enzyme structures were predicted using SWISS-MODEL (Waterhouse et al., 2018). All enzyme structures were visualized and analyzed in PyMOL (The PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC).
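For readers who wish to retrieve reference structures programmatically, the following minimal sketch builds a download URL for a PDB entry using the RCSB file server's `https://files.rcsb.org/download/<ID>.pdb` pattern. The helper name is hypothetical, the identifier `1ABC` is a placeholder rather than a structure used in this study, and the validation covers only classic four-character PDB identifiers.

```python
import re

RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"

# Classic PDB identifiers: one digit followed by three alphanumeric characters.
_PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def pdb_download_url(pdb_id: str) -> str:
    """Return the RCSB download URL for a four-character PDB identifier."""
    pdb_id = pdb_id.strip().upper()
    if not _PDB_ID.match(pdb_id):
        raise ValueError(f"not a four-character PDB identifier: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


# "1abc" is a placeholder identifier for illustration only.
print(pdb_download_url("1abc"))
```

The resulting URL can be fetched with `urllib.request.urlretrieve(url, "structure.pdb")`, and the saved file opened directly in PyMOL.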