To assess the density of hypothetical genes, genome data were retrieved in .gff format from PATRIC 3.2.73 \cite{Wattam_2016}. For each genome in turn, coding sequences (CDS) annotated with the product keywords ``hypothetical'' and ``unknown'' were tallied against the total number of CDS. Duplicate CDS were removed based on genome coordinates.
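
The tallying step can be illustrated with a minimal Python sketch; it is not the script used in this work. Several details are assumptions made for illustration only: the function name \texttt{hypothetical\_density}, the use of the GFF3 \texttt{product} attribute, case-insensitive substring matching of the keywords, the definition of a duplicate as a CDS sharing sequence ID, start, end, and strand, and the removal of duplicates before counting.

```python
from urllib.parse import unquote

# Keywords taken from the text; matching is case-insensitive (assumption).
KEYWORDS = ("hypothetical", "unknown")


def parse_attributes(field):
    """Parse a GFF3 column-9 string such as 'ID=x;product=y' into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def hypothetical_density(gff_lines):
    """Return (hypothetical CDS, total CDS, fraction) for one genome.

    Duplicates are assumed to be CDS rows sharing seqid, start, end and
    strand; only the first occurrence of each is counted.
    """
    seen = set()
    total = 0
    hypothetical = 0
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section ends the feature table
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # keep only well-formed CDS rows
        key = (cols[0], cols[3], cols[4], cols[6])
        if key in seen:
            continue  # duplicate coordinates
        seen.add(key)
        total += 1
        product = parse_attributes(cols[8]).get("product", "").lower()
        if any(word in product for word in KEYWORDS):
            hypothetical += 1
    fraction = hypothetical / total if total else 0.0
    return hypothetical, total, fraction


# Made-up rows for illustration; not data from this study.
example = [
    "##gff-version 3",
    "chr1\tsrc\tCDS\t1\t300\t.\t+\t0\tID=a;product=hypothetical protein",
    "chr1\tsrc\tCDS\t1\t300\t.\t+\t0\tID=b;product=hypothetical protein",
    "chr1\tsrc\tCDS\t400\t900\t.\t-\t0\tID=c;product=DNA polymerase III",
    "chr1\tsrc\tCDS\t1000\t1200\t.\t+\t0\tID=d;product=protein of unknown function",
    "chr1\tsrc\tgene\t1000\t1200\t.\t+\t.\tID=e",
]
counts = hypothetical_density(example)
```

For a real genome, the function could be applied to an open file handle, for example \texttt{hypothetical\_density(open(path))}, where \texttt{path} is a hypothetical location of a downloaded .gff file; repeating the call per file gives the per-genome tallies.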