Gene content and arrangement of the maternally inherited markers
Mitochondrial minichromosomes. Neither SPAdes nor aTRAM succeeded in reconstructing complete sequences of the eleven mitochondrial minichromosomes. However, the aTRAM assemblies contained whole coding regions and were used for both the phylogenetic reconstruction and the comparison of gene content between the SE and SW lineages. Despite considerable genetic differences (Table S2), the mitochondrial minichromosomes show an identical gene arrangement (shared synteny) in the SW and SE lineages (Table S3). This arrangement is also very similar to that of the related louse species, P. spinulosa. Concatenation of the minichromosomes produced a 15,693 bp matrix. Phylogenetic analysis of this matrix yielded a tree with two very distant clusters corresponding to the SE and SW lineages (Figures 4b, S2). Within both clusters, the distances were significantly lower than the distance between the clusters. However, the overall range of distances was greater within the SW cluster than within the SE cluster (Table S2, Figure 4), likely reflecting the broader geographic sampling of the SW lineage.
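The concatenation and the within- versus between-cluster comparison described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the sample names, the toy sequences, and the helper functions `concatenate`, `p_distance`, and `distance_ranges` are all hypothetical, and the uncorrected p-distance is assumed here only as the simplest possible distance measure.

```python
from itertools import combinations

def concatenate(alignments, samples):
    """Join per-locus alignments (dict: sample -> aligned sequence) end to end."""
    return {s: "".join(aln[s] for aln in alignments) for s in samples}

def p_distance(a, b):
    """Proportion of differing sites, skipping columns with gaps or ambiguity codes."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    return sum(x != y for x, y in pairs) / len(pairs)

def distance_ranges(matrix, lineage):
    """Return (min, max) p-distance within each lineage and between lineages."""
    buckets = {}
    for s1, s2 in combinations(sorted(matrix), 2):
        key = lineage[s1] if lineage[s1] == lineage[s2] else "between"
        buckets.setdefault(key, []).append(p_distance(matrix[s1], matrix[s2]))
    return {k: (min(v), max(v)) for k, v in buckets.items()}

# Toy data (hypothetical): two short "minichromosome" alignments, four samples.
locus1 = {"SW1": "ACGTACGT", "SW2": "ACGTACGA", "SE1": "TCGAACCT", "SE2": "TCGAACCT"}
locus2 = {"SW1": "GGCA", "SW2": "GGCA", "SE1": "GACT", "SE2": "GACT"}
lineage = {"SW1": "SW", "SW2": "SW", "SE1": "SE", "SE2": "SE"}

matrix = concatenate([locus1, locus2], sorted(lineage))
ranges = distance_ranges(matrix, lineage)
for group, (low, high) in sorted(ranges.items()):
    print(f"{group}: {low:.3f} to {high:.3f}")
```

In this toy example the between-lineage distances exceed every within-lineage distance, which is the qualitative pattern reported for the SE and SW clusters.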
Legionella. Genomes of the symbiont L. polyplacis revealed a phylogenomic structure parallel to that of the mtDNA (Figures 4a, S3), with a deep genetic split between the SW and SE lineages. The complete genomes were highly similar, with all pairwise comparisons exceeding 99% identity across the 530,063 bp matrix. The contrast between the intra- and inter-cluster comparisons is better illustrated by the counts of observed differences, which were 215-213 within the SW cluster and 0-113 within the SE cluster, compared to 3,702-3,727 between the clusters (Table S2). When comparing the genome sequences, we did not find any clear instance of a missing gene. The majority of the gaps introduced by the genomic alignment span just one or two nucleotides and are located in intergenic regions (only one deletion spans 26 nucleotides, and it is also located between gene coding sequences). The annotations provided by RAST contained several differences between the two clusters, suggesting that a gene present in one cluster is shortened or missing in the other. In all of these cases, however, the differences were not caused by a genuine absence of the gene sequence but rather by the failure of the algorithm to recognize the sequence as encoding a gene, most likely due to the aberrant nature of highly derived symbiont genomes.
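The relationship between the difference counts and the percent identity quoted above can be checked with a short Python sketch. It assumes that each observed difference corresponds to one column of the 530,063 bp matrix, which is a simplification (gaps may have been handled differently in the original analysis). The function `percent_identity` and the dictionary labels are illustrative names, not part of the original workflow.

```python
ALIGNMENT_LENGTH = 530_063  # matrix length reported in the text

# Difference counts taken from the text (Table S2); labels are illustrative.
reported = {
    "within SE (min)": 0,
    "within SE (max)": 113,
    "between clusters (min)": 3_702,
    "between clusters (max)": 3_727,
}

def percent_identity(differences, length=ALIGNMENT_LENGTH):
    """Percent identity, assuming one differing alignment column per difference."""
    return 100.0 * (1.0 - differences / length)

identities = {label: percent_identity(n) for label, n in reported.items()}
for label, value in identities.items():
    print(f"{label}: {value:.3f}% identity")
```

Under this assumption even the most divergent between-cluster comparison stays above 99% identity, which is why the raw difference counts separate the two clusters more clearly than the percentages do.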