pcadapt is an R package that applies principal component analysis (PCA) to single-nucleotide polymorphisms (SNPs) across a sampled population to detect loci that may be under selection \cite{Luu_2016}. Because the leading principal components capture the major axes of genetic variation among samples, pcadapt can also be used to infer population structure. We ran pcadapt on a dataset of 3,784 nuclear SNPs derived from our 94 Malus sp. trees sampled in Boulder County, plus 20 previously published samples that included five domestic apple (Malus domestica) varieties mentioned in a 1922 Boulder County orchard survey, several other domestic apple varieties, Malus sieversii (the Central Asian wild ancestor of the domestic apple), Malus sylvestris (crabapple), and Malus hupehensis (tea crabapple). A score plot of the first two principal components shows that the samples cluster into two major groups, one of which consists of all 94 Boulder County Malus sp. samples and no others. This result is inconclusive: it could indicate that Boulder County Malus trees are evolving as a geographic cohort, or it could be a result of batch effects, which cannot be ruled out because the two sets of samples come from different sources.
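The idea behind the score plot can be illustrated with a minimal Python sketch. This is not the pcadapt implementation and does not use our data: it assumes a complete (no missing calls) matrix of 0/1/2 allele counts, standardizes each SNP, and obtains principal component scores by singular value decomposition. The function name \texttt{pca\_scores} and the simulated two-group dataset are hypothetical and serve only as illustration.

```python
import numpy as np


def pca_scores(genotypes, n_components=2):
    """Return PC scores for an (individuals x SNPs) matrix of 0/1/2 allele counts.

    Assumes no missing genotypes. Each SNP is centred and scaled to unit
    variance; monomorphic SNPs are dropped because they carry no signal.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)            # centre each SNP column
    sd = g.std(axis=0)
    keep = sd > 0                     # drop monomorphic SNPs
    g = g[:, keep] / sd[keep]         # scale remaining SNPs to unit variance
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Toy data (simulated, not from this study): two groups of 10 individuals
# whose allele frequencies differ at 200 SNPs.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0, 0.3, size=200), 0.05, 0.95)
group_a = rng.binomial(2, freq_a, size=(10, 200))
group_b = rng.binomial(2, freq_b, size=(10, 200))

scores = pca_scores(np.vstack([group_a, group_b]))
# Plotting scores[:, 0] against scores[:, 1] gives a score plot of PC1 vs. PC2.
```

Note that any systematic difference between two sample sets, whether biological or technical, will tend to appear on the first component in such a plot, which is why a clean split alone cannot distinguish divergence from batch effects.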
Population structure was inferred with the program Ohana \cite{Cheng_2017}, with the number of ancestral populations (K) set to integer values from 2 to 8. Ohana’s qpas function uses allele frequencies and genotype likelihoods to estimate, for each sample, a vector of ancestry component proportions; these vectors are output together as a Q matrix. The Q matrix allows the admixture of each sample to be viewed as a bar plot. Ohana’s nemeco function can also produce a covariance matrix, from which a Newick tree showing relationships among the populations can be inferred.
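As an illustration of how a Q matrix can be read before plotting, the following Python sketch parses a small matrix and reports the largest ancestry component for each sample. The file layout is an assumption: a first line giving the number of rows and columns, followed by one whitespace-separated row of proportions per sample. The helper names and the example values are hypothetical and are not results from this study.

```python
def parse_q_matrix(text):
    """Parse a Q matrix: a 'rows cols' header line, then one row per sample.

    The layout is an assumption about the file format, not a documented API.
    """
    lines = [ln.split() for ln in text.strip().splitlines()]
    n_rows, n_cols = (int(x) for x in lines[0])
    rows = [[float(x) for x in ln] for ln in lines[1:]]
    if len(rows) != n_rows or any(len(r) != n_cols for r in rows):
        raise ValueError("matrix body does not match the declared dimensions")
    return rows


def dominant_component(q_rows):
    """Return, for each sample, the index of its largest ancestry proportion."""
    return [max(range(len(r)), key=r.__getitem__) for r in q_rows]


# Made-up example with 3 samples and K = 2.
example = """3 2
0.90 0.10
0.25 0.75
0.50 0.50
"""

q = parse_q_matrix(example)
labels = dominant_component(q)  # ties resolve to the lower component index
```

Each row of the parsed matrix corresponds to one stacked bar in an admixture bar plot, with the K proportions in a row summing to one.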