## Human-to-human Transmission of Multiple EBOV Genomes

Intrahost single-nucleotide variants (iSNVs) that appear during the course of the epidemic may provide valuable information about human-to-human transmission. In particular, shared iSNVs have been used to estimate the relative size of the transmission bottleneck \cite{Khiabanian_2015} and to identify human-to-human transmission chains \cite{Gire_2014}. In the current data set, which includes 85 samples with at least one iSNV (Figure S3A), several iSNVs are shared among two or more patients, often spanning several months of the EVD epidemic (Figure 2A). The existence of shared iSNVs could be explained by patient infection from multiple sources (superinfection), sample contamination, recurring mutations (with or without balancing selection to reinforce mutations), or co-transmission of slightly diverged viruses that arose by mutation earlier in the transmission chain.

We can rule out superinfection and contamination as primary explanations for the iSNVs in our data because none of the iSNVs are located at common SNP positions. For example, a SNP at position 14,019 is at intermediate frequency in the population (found in approximately 40% of the samples we sequenced) and defines the SL4 lineage (see Figure 1A). If superinfection were common among EVD patients, we would expect to sometimes see both SL3 virus and SL4 virus in the same patient, which would appear as an iSNV at that position. Contamination would result in a similar pattern, with intermediate-frequency SNPs appearing as iSNVs in contaminated samples. Additionally, contamination would be most visible in low-coverage, low-RNA-content samples, because contaminants would make up more of the RNA available for sequencing, and samples with extremely high coverage would be the most visible as contaminants (Figure S3B). The highest-coverage sample (G4960.1) contains genomes belonging to lineage SL3 only and lacks the SL4 SNP, so if there were widespread contamination we should see a low-frequency iSNV at position 14,019 in SL4 samples with iSNVs. Since SL3 and SL4 samples were processed together (8 of 9 sequencing batches contained multiple samples from both lineages), and we saw no instances of an iSNV at that position, we conclude that superinfection and contamination are not important contributors to iSNVs.
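The screen described above, looking for mixed alleles at a lineage-defining position, can be illustrated with a minimal sketch. It is not the pipeline used for this study: the function name `flag_mixed_samples`, the input layout (per-sample allele read counts at a single position), and the `min_minor_freq` and `min_depth` thresholds are all assumptions chosen for illustration.

```python
def flag_mixed_samples(allele_counts, min_minor_freq=0.05, min_depth=100):
    """Return IDs of samples that carry a mixed allele at one position.

    allele_counts: dict mapping sample ID -> dict of allele -> read count
                   at a single genome position (hypothetical input layout).
    min_minor_freq, min_depth: illustrative thresholds, not values from the study.
    """
    flagged = []
    for sample, counts in allele_counts.items():
        depth = sum(counts.values())
        if depth < min_depth:
            # Too few reads to estimate an allele frequency reliably.
            continue
        # Reads that do not support the majority allele.
        minor_reads = depth - max(counts.values())
        if minor_reads / depth >= min_minor_freq:
            flagged.append(sample)
    return flagged


# Toy read counts at a lineage-defining position (invented sample IDs and numbers).
toy_counts = {
    "sample_1": {"A": 990, "G": 10},   # 1% minor allele: below threshold
    "sample_2": {"A": 600, "G": 400},  # 40% minor allele: flagged
    "sample_3": {"A": 5, "G": 5},      # depth too low: skipped
}
print(flag_mixed_samples(toy_counts))
```

Under the argument in the text, superinfection or contamination would produce flagged samples at a position such as 14,019, whereas an empty result across samples from both lineages is what the absence of these processes predicts.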

The remaining possible sources for persistently shared iSNVs are co-transmission and recurrent mutation. In either case, the iSNV could be maintained by balancing selection, or could be evolving neutrally. Figure 2A suggests that selection is not the primary cause of persistence, since synonymous and nonsynonymous variants are equally common among the shared iSNVs, and selective pressures are likely to be different for the two classes of variant. It is unlikely that all shared iSNVs are simply the product of recurring mutation: if they were, their frequency spectrum should be heavily weighted toward low frequencies, as is characteristic of new mutations. That is not the case. For example, the variant at position 18,911 is found at >15% frequency in eight different samples (Figure S3C), a much higher frequency than expected if the change represented a de novo mutation in each sample.

In summary, we conclude that a combination of human-to-human transmission and recurrent mutations is likely responsible for the iSNV pattern observed in Figure 2A. This hypothesis is supported by the iSNV at position 18,911: samples containing this variant often cluster on the phylogenetic tree (Figure 2B), although more isolated samples may represent separate mutation events. More generally, pairs of samples that share an iSNV are typically located near one another phylogenetically; these pairs are separated by an average of 0.16 years of evolution, whereas random pairs are separated by an average of 0.30 years ($p < 10^{-4}$, randomization test). These results suggest transmission of iSNVs in at least some cases, and therefore suggest that the transmission bottleneck is wide enough to facilitate the transmission of low- or intermediate-frequency variants between hosts.
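The text does not describe how the randomization test was carried out, so the following is only one plausible sketch. It assumes that the null distribution is built by repeatedly drawing the same number of sample pairs at random from all possible pairs and comparing their mean phylogenetic separation with that of the iSNV-sharing pairs. The function names, the `dist` lookup keyed by `frozenset` pairs, and the toy distances are hypothetical.

```python
import random
from itertools import combinations


def mean_pair_distance(pairs, dist):
    """Average phylogenetic separation over a collection of sample pairs."""
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def randomization_test(sharing_pairs, samples, dist, n_iter=10000, seed=0):
    """One-sided test: are iSNV-sharing pairs closer than random pairs?

    sharing_pairs: list of (sample, sample) tuples that share an iSNV.
    samples: list of all sample IDs.
    dist: dict mapping frozenset({a, b}) -> separation (e.g. in years).
    Returns the observed mean separation and an empirical p-value.
    """
    rng = random.Random(seed)
    all_pairs = list(combinations(samples, 2))
    observed = mean_pair_distance(sharing_pairs, dist)
    k = len(sharing_pairs)
    hits = 0
    for _ in range(n_iter):
        # Draw k distinct pairs at random as one null replicate.
        drawn = rng.sample(all_pairs, k)
        if mean_pair_distance(drawn, dist) <= observed:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (hits + 1) / (n_iter + 1)
    return observed, p_value


# Toy example with invented distances: only the pair (A, B) shares an iSNV.
samples = ["A", "B", "C", "D"]
dist = {frozenset(p): 0.30 for p in combinations(samples, 2)}
dist[frozenset(("A", "B"))] = 0.10
observed, p_value = randomization_test([("A", "B")], samples, dist, n_iter=2000)
print(observed, p_value)
```

With the add-one correction, the smallest reportable value is 1 / (`n_iter` + 1), so reporting a p-value below $10^{-4}$ under this scheme would require more than 10,000 replicates.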