Figure 1. Framework of AsgeneDB construction. (a) Core database construction: a core database was constructed for selected gene (sub)families by retrieving protein sequences from the UniProt databases (Swiss-Prot and TrEMBL) using keywords. Sequences that failed to cluster at 30% identity were manually rechecked to remove outlier sequences. (b) Full database construction: As metabolic gene families and homologous gene families were retrieved from public orthology databases and the NCBI RefSeq database, and representative sequences were extracted and included in the full database. (c) Metagenomic profiling: AsgenePackage generates both gene abundance and taxonomic profiles of environmental samples.