Heritable genetic variation in dispersal
We first estimated the variance in dispersal (i.e. natal dispersal
between any of the study islands) that was attributable to additive
genetic effects and several environmental effects using a basic genetic
groups animal model (basic GGAM), where individuals born in the farm and
non-farm island habitat types were allowed to differ in mean breeding
values for dispersal but where the additive genetic variance of
dispersal was assumed to be the same in the two habitat types (Muff et al., 2019; Wolak
& Reid, 2017). Next, we formulated an extended genetic groups animal
model (extended GGAM; Aase et al., 2022; Muff et al., 2019) where the
additive genetic variance in dispersal was allowed to differ for farm
and non-farm island habitat types. The two genetic groups corresponded
to the farm and non-farm island habitat types, where the genomes of
individuals were proportionally assigned to their origin from either the
farm or non-farm genetic group. The proportional assignment to farm or
non-farm genetic group origin was based on the metapopulation level
pedigree that included all 3116 successfully SNP-typed individuals and
an additional 440 dummy individuals that were assigned as parents to
identify known relationships among recruits (such as full- or
half-sibling relationships) when one or both of the true genetic parents
were not genotyped (Niskanen et al., 2020). More specifically, the
assignment of these 3556 real or dummy individuals’ genomes to the two
genetic groups was done based on information on the assumed natal island
habitat type of the phantom (i.e. unknown) parents of individuals in our
metapopulation level pedigree. To obtain the assumed natal island habitat
type of the phantom parents, we first identified the known or most likely
natal island habitat type of each individual in the pedigree. For 2741
of the 3116 real individuals their natal island was known either from
ecological or genetic assignment data (Saatoglu et al., 2021). Because
most house sparrows in our study metapopulation are resident individuals
(Ranke et al., 2021; Saatoglu et al., 2021), we used the first island
on which each of the remaining 375 real individuals was recorded as the
most likely proxy for its natal island. Furthermore, dummy individuals that
had at least one known parent (N = 169) were assigned the same natal
island as their parent(s). Finally, dummy individuals without any known
parent(s) (N = 271) were assigned the island where their offspring were
born as their natal island. In the metapopulation level pedigree, 592
real and 303 dummy individuals had either both parents (N = 646), their
mother (N = 93) or their father (N = 156) missing. These unknown parents
represent the pedigree’s phantom parents, and were assigned to the same
natal island habitat type as their (real or dummy) offspring. Finally,
the proportional genetic group contributions ($q_{ij}$, the proportion of
the genome of individual $i$ that originates from genetic group $j$) of the
farm and non-farm genetic groups were calculated for each individual in
the metapopulation level pedigree, based on the phantom parents’ assumed
natal island habitat types, using the “ggcontrib” function from the R
package nadiv (Wolak, 2012).
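The proportional assignment described above can be illustrated with a minimal sketch. It is not the nadiv implementation; it only assumes the usual genetic-groups rule that each parent passes on half of its own group proportions, and that a phantom parent belongs entirely to the group given by its offspring's assumed natal habitat type. The toy pedigree, the individual labels and the function name `group_contributions` are hypothetical.

```python
# Toy pedigree (hypothetical): individual -> (dam, sire); None marks a phantom parent.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("C", None),
}
# Assumed natal island habitat type of each individual (hypothetical values).
NATAL_HABITAT = {"A": "farm", "B": "non-farm", "C": "farm", "D": "non-farm"}
GROUPS = ("farm", "non-farm")


def group_contributions(pedigree, natal_habitat):
    """Return {individual: {group: q}} with q summing to 1 for each individual."""
    cache = {}

    def q(ind):
        if ind in cache:
            return cache[ind]
        total = dict.fromkeys(GROUPS, 0.0)
        for parent in pedigree[ind]:
            if parent is None:
                # Phantom parent: assigned wholly to the group of its
                # offspring's assumed natal habitat type.
                total[natal_habitat[ind]] += 0.5
            else:
                # Known parent: passes on half of its own proportions.
                parent_q = q(parent)
                for group in GROUPS:
                    total[group] += 0.5 * parent_q[group]
        cache[ind] = total
        return total

    return {ind: q(ind) for ind in pedigree}


contributions = group_contributions(PEDIGREE, NATAL_HABITAT)
```

In this toy example, individual D has one known parent (C, with proportions 0.5 and 0.5) and one phantom parent assigned to the non-farm group, which gives D proportions of 0.25 (farm) and 0.75 (non-farm).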
Our basic GGAM, which partitions variation in dispersal probability while
allowing group-specific mean breeding values to differ, was defined as a
binomial regression model with a logistic link function and a linear
predictor for individual i given as
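$$\eta_i = \mu + \mathbf{X}_i^{\top}\boldsymbol{\beta} + \mathrm{island}_i + \mathrm{hyear}_i + u_i, \qquad u_i = q_{i2}\, g_2 + a_i, \tag{1}$$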
where $\mu$ is the intercept, $\mathbf{X}_i$ is a vector
of the fixed covariates of individual $i$ and $\boldsymbol{\beta}$ is a vector
of fixed effects. Individual sex was included as a fixed effect to
account for differences between sexes in dispersal propensity (Saatoglu
et al., 2021), and the proportional genetic contribution from the
non-farm genetic group was included as a fixed effect (continuous
covariate) to account for any mean differences in dispersal probability
between the genetic groups. Both fixed effects variables were mean
centered. The random effects included individual $i$’s natal island
($\mathrm{island}_i \sim N(0, \sigma^2_{\mathrm{island}})$) and hatch year
($\mathrm{hyear}_i \sim N(0, \sigma^2_{\mathrm{year}})$), and captured the
variance in dispersal attributable to spatio-temporal environmental
variation. Furthermore, the total additive genetic effect of individual
$i$ is given as $u_i$, which is the weighted
genetic group mean effect for group 2 ($g_2$; we
defined group 2 as the non-farm genetic group) plus the breeding value
$a_i$, distributed as $\mathbf{a}^{\top} = (a_1, \ldots, a_n)^{\top} \sim N(\mathbf{0}, \sigma^2_A \mathbf{A})$ with additive
genetic variance $\sigma^2_A$ and
additive genetic relatedness matrix $\mathbf{A}$ that represents the
relatedness among individuals (Kruuk, 2004). Thus, the genetic group
mean effect for the farm group was set to 0 (i.e. $g_1 = 0$) for identifiability reasons, and the
estimate for $g_2$ is the difference in the non-farm
group’s mean total additive genetic effect compared to the farm group’s
mean total additive genetic effect. Note that, because our animal models
were formulated as logistic regression models, there is no residual
variance component (de Villemereuil, Schielzeth, Nakagawa, & Morrissey,
2016).
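As an illustration of how an additive genetic relatedness matrix can be derived from a pedigree, the following minimal sketch uses the standard tabular method. This is a textbook approach shown for clarity, not necessarily the routine used in the analyses; the function name and the five-individual toy pedigree are hypothetical, and individuals are assumed to be ordered so that parents precede their offspring.

```python
def relationship_matrix(pedigree):
    """Tabular method for the additive relationship matrix.

    pedigree: list of (dam, sire) index pairs, one per individual, with
    parents listed before their offspring and None for unknown parents.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (dam, sire) in enumerate(pedigree):
        both_known = dam is not None and sire is not None
        # Diagonal: 1 plus the inbreeding coefficient, which is half the
        # relatedness between the two parents.
        A[i][i] = 1.0 + (0.5 * A[dam][sire] if both_known else 0.0)
        for j in range(i):
            # Off-diagonal: mean relatedness of j with the parents of i.
            a_dam = A[j][dam] if dam is not None else 0.0
            a_sire = A[j][sire] if sire is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_dam + a_sire)
    return A


# Hypothetical pedigree: 0 and 1 are founders, 2 and 3 are full siblings,
# and 4 is the offspring of a mating between 2 and 3.
toy_A = relationship_matrix([(None, None), (None, None), (0, 1), (0, 1), (2, 3)])
```

Here the full siblings 2 and 3 have a relatedness of 0.5, and the inbred individual 4 has a diagonal element of 1.25.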
The basic GGAM was extended to allow estimation of group-specific
additive genetic variances. Our extended GGAM was thus formulated as a
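logistic regression model with linear predictor given as
$$\eta_i = \mu + \mathbf{X}_i^{\top}\boldsymbol{\beta} + \mathrm{island}_i + \mathrm{hyear}_i + u_i, \qquad u_i = q_{i2}\, g_2 + a_{i1} + a_{i2}, \tag{2}$$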
where the total additive genetic effect of individual $i$ is again
given as $u_i$, which is now the sum of the genetic
group mean effect for group 2 ($g_2$; the non-farm
genetic group) multiplied by the genetic group 2 proportion of
individual $i$ ($q_{i2}$), plus group-specific
additive genetic values of group 1
($a_{i1}$, the farm genetic group) and
group 2 ($a_{i2}$, the non-farm
genetic group). As in model (1), the genetic group mean effect for the
farm group was set to 0 (i.e. $g_1 = 0$) so that the
estimate for $g_2$ is the difference in the non-farm
group’s mean total additive genetic effect compared to the farm group’s
mean total additive genetic effect. However, the breeding value $a_i$ in model (1) is now split into two
group-specific components $a_{i1}$ and $a_{i2}$, with
$\mathbf{a}_j^{\top} = (a_{1j}, \ldots, a_{nj})^{\top} \sim N(\mathbf{0}, \sigma^2_{A_j} \mathbf{A}_j)$
for both groups $j = 1, 2$, where $\sigma^2_{A_j}$ is the additive genetic
variance in group $j$, and $\mathbf{A}_j$ are
group-specific relatedness matrices calculated as in Muff et al. (2019).
We denote $a_{i1}$ and $a_{i2}$ as the partial breeding
values, because they represent the contributions to the breeding value
of individual i that are inherited from group 1 and 2,
respectively.
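To make the structure of the extended model concrete, the sketch below assembles the total additive genetic effect from the non-farm group proportion, the group mean effect and the two partial breeding values, adds the remaining terms of the linear predictor, and applies the inverse logit. All numerical inputs are hypothetical and the function name is illustrative only. The argument `fixed_part` stands for the contribution of the other fixed covariates (here assumed to be sex only), so that the genetic group term is not counted twice.

```python
import math


def dispersal_probability(mu, fixed_part, island, hyear, q2, g2, a1, a2):
    """Dispersal probability of one individual under the extended model.

    q2: proportion of the genome from group 2 (non-farm)
    g2: mean effect of group 2 relative to group 1 (whose effect is set to 0)
    a1, a2: partial breeding values inherited from groups 1 and 2
    """
    u = q2 * g2 + a1 + a2                      # total additive genetic effect
    eta = mu + fixed_part + island + hyear + u  # linear predictor (logit scale)
    return 1.0 / (1.0 + math.exp(-eta))         # inverse logit


# Hypothetical values, for illustration only.
p_example = dispersal_probability(
    mu=-1.5, fixed_part=0.2, island=0.1, hyear=-0.05,
    q2=0.75, g2=0.4, a1=0.1, a2=-0.2,
)
```

Setting `a2` to zero and passing the single breeding value as `a1` gives the corresponding calculation for the basic model.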
Narrow-sense heritabilities for dispersal probability were obtained from
the variance component estimates of i) the basic GGAM for the whole study
population combined and ii) the extended GGAM for the farm and non-farm
genetic groups separately, using the formula (shown for the extended
GGAM case)
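$$h_j^2 = \frac{\sigma^2_{A_j}}{\sigma^2_{A_j} + \sigma^2_{\mathrm{island}} + \sigma^2_{\mathrm{year}} + \pi^2/3}, \tag{3}$$
where the variances are defined as above, and residual variance was
approximated by $\pi^2/3$ (Nakagawa & Schielzeth, 2010).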
The heritability estimate for dispersal from the basic GGAM
($h^2$) was obtained by using $\sigma^2_A$ instead of $\sigma^2_{A_j}$ in formula (3). The
proportion of phenotypic variance in dispersal explained by the natal
island and hatch year was also estimated for the basic GGAM and the
extended GGAM using the same formulas, but with $\sigma^2_{\mathrm{island}}$ or $\sigma^2_{\mathrm{year}}$ as the numerator,
respectively (instead of $\sigma^2_A$ or $\sigma^2_{A_j}$). Note that, for
heritabilities and other proportions of phenotypic variance explained,
we assume that the island and year variances are the same within the farm
and non-farm habitats.
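The calculation of these proportions can be sketched as follows. The sketch assumes that the total phenotypic variance on the logit scale is the sum of the additive genetic, natal island and hatch year variances plus the residual approximation $\pi^2/3$; this is our reading of the description above, not a confirmed detail. The function name and the variance values are hypothetical.

```python
import math

# Residual variance approximation for a logistic model.
LOGIT_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def variance_proportion(numerator, sigma2_a, sigma2_island, sigma2_year):
    """Proportion of the (assumed) total phenotypic variance on the logit scale."""
    total = sigma2_a + sigma2_island + sigma2_year + LOGIT_RESIDUAL_VARIANCE
    return numerator / total


# Hypothetical variance component estimates, for illustration only.
sigma2_a, sigma2_island, sigma2_year = 0.8, 0.5, 0.1

h2 = variance_proportion(sigma2_a, sigma2_a, sigma2_island, sigma2_year)
island_share = variance_proportion(sigma2_island, sigma2_a, sigma2_island, sigma2_year)
year_share = variance_proportion(sigma2_year, sigma2_a, sigma2_island, sigma2_year)
```

For a group-specific heritability from the extended GGAM, the group's own additive genetic variance would be passed as both `numerator` and `sigma2_a`.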
The basic GGAM and the extended GGAM were fitted in a Bayesian framework
with integrated nested Laplace approximations using R-INLA (Rue,
Martino, & Chopin, 2009), which is a fast and accurate alternative to
MCMC (Holand, Steinsland, Martino, & Jensen, 2013; Steinsland, Larsen,
Roulin, & Jensen, 2014). In order to prevent overfitting, a penalized
complexity prior was used for the precisions of the environmental random
components (with u = 2, α = 0.02) (Simpson, Rue, Riebler, Martins, &
Sørbye, 2017).
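For readers unfamiliar with penalized complexity priors, the sketch below shows what the two prior parameters imply. It assumes the usual R-INLA convention for the precision prior, in which the standard deviation $\sigma$ of a random effect receives an exponential prior with rate $\lambda = -\ln(\alpha)/u$, so that $P(\sigma > u) = \alpha$. Whether the reported values follow exactly this parameterisation is an assumption, and the function names are illustrative.

```python
import math


def pc_prior_rate(u, alpha):
    """Rate of the exponential prior on sigma such that P(sigma > u) = alpha."""
    return -math.log(alpha) / u


def prob_sd_exceeds(x, u, alpha):
    """Prior probability that the standard deviation exceeds x."""
    return math.exp(-pc_prior_rate(u, alpha) * x)


# Values reported in the text: u = 2, alpha = 0.02.
rate = pc_prior_rate(2.0, 0.02)
tail_at_u = prob_sd_exceeds(2.0, 2.0, 0.02)
```

Under this assumption the prior places only 2% of its mass on standard deviations above 2 and shrinks each environmental random effect towards zero variance.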