Bayesian hybrid index and genomic cline estimation with the R package
gghybrid
- Richard Bailey

Abstract
Admixture, the creation of individuals with combined genomic material
from multiple differentiated source populations, is now known to be a
dominant evolutionary force. Admixture increases polymorphism and can
generate novel phenotypes and selection pressures, often leading to both
novel adaptation and reproductively isolated hybrid taxa. When a wide
variety of recombinant types and admixture proportions between two
source populations exists, both geographic and genomic cline analysis are
suitable methods for inferring biased, restricted, or excessive gene
flow at individual loci into the foreign genomic background. Hence,
cline analysis can provide evidence for reproductive isolation,
selection across an environmental transition, balancing selection, and
adaptive introgression in naturally hybridizing populations. Of the two
cline methods, genomic cline analysis makes fewer assumptions and is
applicable in a wider variety of circumstances. Here, I introduce
gghybrid, an R package for Bayesian estimation of genome-wide hybrid
index and locus-specific genomic clines using bi-allelic data, suitable
for both small and large datasets. gghybrid uses Buerkle's likelihood
formula to estimate hybrid index and Fitzpatrick's logit-logistic
genomic cline function to infer restricted, extreme, or biased gene
flow. It employs the commonly available Structure file format for data
input, is highly parallelizable, and allows use of admixture proportions
estimated by other software. Parameters can be pooled across test
subjects or fixed at chosen values, and models can be compared using
both AIC and WAIC. I describe the functions, pipeline, and
statistical properties of gghybrid.
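
As an illustration of the logit-logistic genomic cline function named above, the following minimal Python sketch computes the expected locus-specific ancestry probability from a genome-wide hybrid index. It is not gghybrid code and does not use the package's API. It assumes the commonly cited form theta = h^v / (h^v + (1 - h)^v * exp(u)), with slope v and u = v * logit(centre), where centre is the hybrid index at which theta equals 0.5. The function name and argument names are hypothetical.

```python
import math


def logit_logistic_cline(h, v, centre):
    """Expected locus-specific ancestry probability (theta) at hybrid index h.

    Assumed parameterization (illustrative, not taken from gghybrid):
      v      > 0       cline steepness; v = 1 with centre = 0.5 gives theta = h
      centre in (0, 1) hybrid index at which theta = 0.5
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    if v <= 0.0:
        raise ValueError("v must be positive")
    if not 0.0 < centre < 1.0:
        raise ValueError("centre must lie strictly between 0 and 1")

    # u shifts the cline centre: u = v * logit(centre)
    u = v * math.log(centre / (1.0 - centre))

    numerator = h ** v
    denominator = numerator + (1.0 - h) ** v * math.exp(u)
    return numerator / denominator


# Null expectation: locus ancestry simply tracks the genome-wide hybrid index.
null_theta = logit_logistic_cline(0.3, v=1.0, centre=0.5)

# Steep cline (v > 1): ancestry changes faster than the genome-wide average,
# the pattern expected under restricted gene flow at that locus.
steep_theta = logit_logistic_cline(0.6, v=4.0, centre=0.5)

# Shifted centre: theta reaches 0.5 at h = 0.7 instead of h = 0.5,
# the pattern expected under biased gene flow.
shifted_theta = logit_logistic_cline(0.7, v=3.0, centre=0.7)
```

Under this parameterization, v = 1 and centre = 0.5 recover the null cline, so departures of v from 1 or of centre from 0.5 correspond to the steep or shifted clines that the abstract associates with restricted or biased gene flow.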