IBD relatedness and selection analysis
Shared ancestry and relatedness between isolates were estimated using
identity-by-descent (IBD). PED and MAP files were created using
VCFtools from an LD-pruned VCF dataset of the full genome (core plus
(sub)telomeric and low-complexity regions of the 14 chromosomes).
IBD-sharing between pairs of samples was calculated using the isoRelate
package in R, which can analyse IBD in haploid recombining
microorganisms in the presence of multiclonal infections. Genetic
distance was calculated using an estimated mean map unit size from
Plasmodium chabaudi of 13.7 kb/centimorgan (cM). We set the
thresholds of IBD at the minimum number of SNPs (n = 20) and minimum
length of IBD segments (5000 bp) reported to reduce false-positive
calls, using an error of 0.001. IBD has been shown to be superior to probabilistic
models such as STRUCTURE for understanding the relatedness and
interconnectivity of parasite populations. Networks of IBD-sharing
(>10% of the genome shared) between individuals were
created using the igraph package in R, and the cumulative level of
IBD-sharing between isolates in countries in the network was plotted as
a connection map with SCImago Graphica and used as a measure of
connectivity between countries.
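
The thresholds described above can be illustrated with a minimal sketch. This is not the isoRelate or igraph implementation; it only shows, under stated assumptions, how a physical position could be converted to a genetic position at 13.7 kb/cM, how IBD segments could be filtered by SNP count and length, and how pairs sharing more than 10% of the genome could be selected as network edges. The segment records, the sample names, the toy genome length and all function names are hypothetical, and segments for a given pair are assumed not to overlap.

```python
from collections import defaultdict

KB_PER_CM = 13.7            # mean map unit size quoted in the text (kb per cM)
MIN_SNPS = 20               # minimum number of SNPs per IBD segment (from the text)
MIN_LENGTH_BP = 5000        # minimum IBD segment length in bp (from the text)
MIN_SHARED_FRACTION = 0.10  # network edge threshold (from the text)


def bp_to_cm(position_bp):
    """Convert a physical position (bp) to a genetic position (cM)."""
    return position_bp / (KB_PER_CM * 1000)


def segment_length(segment):
    """Inclusive physical length of a segment in bp."""
    return segment["end_bp"] - segment["start_bp"] + 1


def keep_segment(segment):
    """Apply the SNP-count and length thresholds to one IBD segment."""
    return (segment["n_snps"] >= MIN_SNPS
            and segment_length(segment) >= MIN_LENGTH_BP)


def shared_fraction_by_pair(segments, genome_length_bp):
    """Fraction of the genome in retained IBD segments, per sample pair.

    Assumes segments belonging to the same pair do not overlap.
    """
    totals = defaultdict(int)
    for seg in segments:
        if keep_segment(seg):
            pair = tuple(sorted((seg["sample_a"], seg["sample_b"])))
            totals[pair] += segment_length(seg)
    return {pair: total / genome_length_bp for pair, total in totals.items()}


def network_edges(fractions):
    """Pairs sharing more than the threshold fraction of the genome."""
    return sorted(pair for pair, frac in fractions.items()
                  if frac > MIN_SHARED_FRACTION)


# Hypothetical toy data: a 1 Mb "genome" and four inferred segments.
segments = [
    {"sample_a": "A", "sample_b": "B", "start_bp": 1, "end_bp": 150000, "n_snps": 300},
    {"sample_a": "A", "sample_b": "C", "start_bp": 1, "end_bp": 3000, "n_snps": 25},
    {"sample_a": "B", "sample_b": "C", "start_bp": 1, "end_bp": 50000, "n_snps": 10},
    {"sample_a": "B", "sample_b": "C", "start_bp": 100001, "end_bp": 150000, "n_snps": 40},
]
fractions = shared_fraction_by_pair(segments, genome_length_bp=1_000_000)
edges = network_edges(fractions)
```

In this toy example the A-C segment is dropped for being shorter than 5000 bp and the first B-C segment for containing fewer than 20 SNPs, so only the A-B pair exceeds the 10% threshold and would appear as an edge in the network.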
For the samples from Latin America, the proportion of pairs of isolates
sharing IBD, as well as the significance of IBD-sharing, were calculated using
the isoRelate package in R for all samples together and subdivided by
population, based on country, as a measure of positive selection.
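
As an illustration of the quantity described above, the following sketch computes the proportion of isolate pairs that are IBD at each SNP, for all samples together and for pairs within each population. It is a simplified stand-in, not the isoRelate code, and it does not reproduce the significance calculation. The input layout (a per-pair list of 0/1 IBD flags, one per SNP), the sample and population labels, and the function names are all hypothetical.

```python
def ibd_proportion_per_snp(ibd_by_pair):
    """Proportion of pairs that are IBD at each SNP.

    ibd_by_pair maps (sample_a, sample_b) to a list of 0/1 flags,
    one flag per SNP, all lists having the same length.
    """
    flag_lists = list(ibd_by_pair.values())
    if not flag_lists:
        return []
    n_pairs = len(flag_lists)
    n_snps = len(flag_lists[0])
    return [sum(flags[i] for flags in flag_lists) / n_pairs
            for i in range(n_snps)]


def ibd_proportion_by_population(ibd_by_pair, population_of):
    """Per-SNP IBD proportion using only pairs from the same population."""
    grouped = {}
    for (a, b), flags in ibd_by_pair.items():
        if population_of[a] == population_of[b]:
            grouped.setdefault(population_of[a], {})[(a, b)] = flags
    return {pop: ibd_proportion_per_snp(pairs)
            for pop, pairs in grouped.items()}


# Hypothetical toy data: four isolates, two populations, three SNPs.
population_of = {"s1": "popA", "s2": "popA", "s3": "popB", "s4": "popB"}
ibd_by_pair = {
    ("s1", "s2"): [1, 1, 0],
    ("s3", "s4"): [0, 1, 0],
    ("s1", "s3"): [0, 1, 0],
}
overall = ibd_proportion_per_snp(ibd_by_pair)
by_population = ibd_proportion_by_population(ibd_by_pair, population_of)
```

SNPs at which an unusually high proportion of pairs are IBD are the candidates that a downstream significance test would evaluate as possible signals of positive selection.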