2 MATERIALS AND METHODS

2.1 Sampling and laboratory analyses

We collected 86 golden eagle samples from across Eurasia (Figure S1) dated between 1885 and 2017 (Table S1). The skin (N = 82) and a feather (N = 1) samples were taken from museum collections, while shed feathers (N = 3) were taken from captive adults. We extracted genomic DNA from skin samples with E.Z.N.A.® Tissue DNA Kit following the modified Mouse Tail Snips Protocol (Omega Bio-Tek, USA). The volume of TL Buffer was increased to 400 µl to fully cover the tissues. To enhance the dissolving of keratin from residue feathers, we added 20 µl of dithiothreitol (DTT) to each sample before an overnight incubation. When lysis was incomplete, an extra 20 µl of protease solution was added, and the sample was vortexed and left in an incubator for additional 60–90 min. The amount of DNA wash buffer was decreased to 650 µl for both washing steps. Elution was done only once with 50 µl of Elution buffer.
Feather samples were purified with chlorine prior to extraction. First, we cut off the quill ends of feathers to Eppendorf tubes. When possible, a part of a quill with a blood clot was taken. Then, we added 10% chlorine to each sample, and vortexed and centrifuged the samples for 30 min at maximum speed. After that, we discarded the chlorine, washed the sample three times with sterile water and left them to dry with open lids at room temperature. The subsequent DNA extraction was done the same way as with the skin samples. The DNA concentration was measured with NanoDrop (Thermo Scientific, Walthan, MA, USA), and it varied between 0.25 and 477 ng/µl (median of 60.0, mean of 95.5 ng/µl). All DNA extractions were stored at -20°C.
We sequenced a fragment of a mitochondrial DNA control region (mtDNA CR) with primers modGOEA_CR1L (5’-CCCCCGTATGTATTATTGTA-3’, Nebel et al. , 2015) and GOEA_CR595H (5’-GCAAGGTCGTAGGACTAACC-3’ Sonsthagen et al. , 2012) and genotyped golden eagles with 12 microsatellite loci: Aa02, Aa04, Aa11, Aa15, Aa26, Aa27, Aa35, Aa36, Aa39, Aa43, NVHfr142, and NVHfr206 (Table S2; Nesje & Røed, 2000; Martinez-Cruz et al. , 2002; Bielikovaet al. , 2010) following Kylmänen et al. (2023). We added five randomly selected golden eagles from Finland that had been genotyped and sequenced in our previous study (Kylmänen et al. , 2023) to better cover the Palearctic distribution of the species.

2.2 Sequencing and genotyping quality

The mtDNA sequences were manually edited and aligned based on CLUSTAL W Multiple Alignment (Thompson, Higgins, & Gibson, 1994) in BioEdit v.7.2.5 (Hall, 1999). We performed microsatellite genotyping of all individuals twice, and for samples with weaker amplification (20% of the duplicates), thrice. The alleles were scored with GeneMapper v.5.0. (Thermo Scientific, Walthan, MA, USA). We created consensus genotypes by selecting alleles that were consistently replicated, or by including a heterozygote when two out of three replicates showed both a homozygote and a heterozygote. When alleles in the replicates were inconsistent, the specific marker was called missing.
We estimated genotyping success with Microsat_errcalc (Honka & Merikanto, 2020) by calculating error, allelic dropout (ADO) and false-allele (FA) rates. The presence of null alleles and stuttering was checked with MicroChecker 2.2.3 (Van Oosterhout et al. , 2004) and null allele frequencies were calculated with FreeNA (Chapuis & Estoup, 2007). Deviations from Hardy-Weinberg equilibrium (HWE) and linkage equilibrium were tested with Genepop v.4.0.10 (Rousset, 2008) with 1000 permutations.

2.3 Data analyses

2.3.1 Datasets

Our dataset for microsatellites consisted of 91 golden eagles (Table S1), while for the mitochondrial analyses we used a larger dataset that consisted of our samples and sequences from GenBank (326 bp; N = 321–581, depending on the analysis, see below; Table S3) to present a more comprehensive overview of the species in Eurasia.
We used three grouping criteria for the downstream analyses. First, the samples were grouped according to their geographical origin. With only our data (N = 91), we divided individuals into four groups: (1) Northern Europe, (2) Central and Eastern Europe, (3) Central Asia and Caucasus, and (4) Far East (Figure S1). For the analysis including the GenBank sequences, we created a fifth geographical group: (5) Western Europe. The split between Northern Europe and Central and Eastern Europe was according to division of Russian federal districts, with samples from the Northwestern federal district assigned to Northern Europe, and samples from the Central and the Volga federal districts assigned to Central and Eastern Europe. Central Asia and Caucasus included samples from the Caucasus region (Russian North Caucasian federal district and Azerbaijan) and Central Asian countries (Kazakhstan, Kyrgyzstan, Turkmenistan, and Uzbekistan) including Iran. Samples east from the Ural Mountains, which were not included in Central Asia and Caucasus, formed the Far East group. Western Europe consisted of GenBank sequences from Spain and non-Alpine France.
The second grouping was done according to the mitochondrial lineages defined by Nebel et al. (2015): Mediterranean and Holarctic. The third subdivision was based on temporal groups: a Bottleneck group (until the year 1984) and a Post-bottleneck group (from the year 1985 onwards). The temporal groups were chosen based on information on protective legislation and generation time of golden eagles. We calculated golden eagle generation time using the formula described by Brown (1966) as the time taken by two adults to replace themselves: 2/(preadult survival rate × number of chicks per pair per year). The survival and reproduction success values were taken from published data for golden eagles from Finland, Sweden, and Scotland (Table S4; Sulkavaet al. , 1984; Whitfield et al. , 2004; Ollila, 2019). Thus, the average generation length of golden eagles was 14.56 years ranging between 9.14 and 23.50 years. Since golden eagles were protected in many countries by the 1970s, by adding one generation as a buffer, we considered the year 1985 as the borderline between the two temporal groups.

2.3.2 MtDNA analyses

We calculated the number of haplotypes (H), haplotype (h) and nucleotide (π) diversities, and Watterson’s theta (θ, from the number of segregating sites) using DnaSP v.6.12 (Rozas et al. , 2017) for every group and for the entire Eurasian population. To account for different sample sizes, we performed sample size-based rarefaction-extrapolation for the number of haplotypes using the “iNEXT” package (Chao et al. , 2014; Hsieh, Ma, & Chao, 2016) in R v.4.0.4 (R Core Team, 2021) with 1000 bootstrap replications. The rarefaction-extrapolation analyses were only performed for the dataset that included the GenBank sequences (N = 434 for geographical and mitochondrial lineage groups, and N = 393 for temporal groups).
A median joining-haplotype network (Bandelt, Forster, & Röhl, 1999) for the entire Holarctic region was constructed in PopART (Leigh & Bryant, 2015). For that, we used golden eagles sequenced in this study (N = 82) and GenBank sequences of golden eagles from Eurasia with location information (N = 352, Nebel et al. , 2015, 2019, Kylmänen et al. , 2023) and North America (N = 229, Sonsthagen et al., 2012; Nebel et. al 2015; Craig et al. , 2016; Judkins & van den Bussche, 2017; Table S3). Haplotype and trait files for PopART were created using packages “pegas” (Paradis, 2010) and “ape” (Paradis & Schliep, 2019) in R. Haplotype frequencies from the previous studies were used as reported in Nebel et al. (2015, 2019), Craig et al.(2016), Judkins & Van Den Bussche (2017), and Kylmänen et al.(2023), while for Sonsthagen et al. (2012) the frequencies were unknown, and we used one sequence per haplotype.
To further investigate the genetic structure, we calculated pairwise Φst values and performed analysis of molecular variance (AMOVA) for the three groupings in Arlequin v.3.5 (Excoffier & Lischer, 2010) with 10 000 permutations and using Kimura 2-parameter distance model with gamma distribution (shape parameter of 0.05). Both the distance model and the shape parameter were estimated in MEGA X (Kumar et al. , 2018), and the model with the lowest Bayesian Information Criterion (BIC) value was chosen. BIC was selected over Akaike Information Criterion (AIC), because it performs better in selecting the correct model for explaining the existing data (Chakrabarti & Ghosh, 2011; Ahoet al. , 2014). We checked for signals of population size changes with the three most powerful neutrality tests in detecting expansions (Ramírez-Soriano et al. , 2008): Tajima’s D (Tajima, 1989), Fu’s Fs (Fu & Li, 1993), and Ramos-Onsins and Rozas R2(Ramos-Onsins & Rozas, 2002) in DnaSP. We also constructed mismatch distribution graphs in DnaSP as indicators of population demographic changes (Harpending, 1994). For these analyses, we included our sequences (N = 82) and the GenBank data (N = 352; Nebel et al. , 2015; Kylmänen et al. , 2023).
To explore temporal population dynamics, we constructed a temporal haplotype network (TempNet, Prost & Anderson, 2011) with sequences from this study (N = 72) and Eurasian golden eagles from GenBank (N = 321; Nebel et al. , 2015; Kylmänen et al. , 2023) which had available information on the year of the sample. With the same dataset, we generated Bayesian skyline plots (BSP, Drummond et al. , 2005) in BEAST 2 (Bouckaert et al. , 2014) to identify possible fluctuations in effective female population sizes (Nef) of the total Eurasian population, and of Mediterranean and Holarctic lineages separately. We applied a strict molecular clock fixing the rate to 1, because no estimates of the clock rate for golden eagles or closely related species were available. We used years of samples as tip dates and applied the Hasegawa-Kishino-Yano (HKY) substitution model. We estimated the substitution rates, HKY frequencies and kappa, chose two gamma categories with the shape parameter of 0.50 and the proportion of invariable sites of 0.86. We ran 100 million Markov Chain Monte Carlo (MCMC) with a 10% burn-in, sampling model parameters and genealogies every 1000 iterations. After the first run, we implemented the recommended corrections to operators (Table S5) and performed multiple simultaneous runs, which were afterwards combined using LogCombiner (implemented in BEAST) to achieve sufficient effective sample sizes (ESS > 200). The ESSs above 200 were achieved after eight runs for the total dataset, after six runs for the Holarctic dataset, and after two runs for the Mediterranean dataset (Table S5). Additionally, we repeated the analyses treating these groups as independent datasets, but since the results did not change, we only report the settings described above. The Bayesian Skyline reconstruction was done in Tracer v.1.7.2 (Rambaut et al. , 2018).

2.3.3 Microsatellite analyses

We calculated the polymorphic information content (PIC) for each locus in Cervus 3.0.7 (Kalinowski, Taper, & Marshall, 2007) and the number of alleles (A), observed heterozygosity (HO), and unbiased expected heterozygosity (HE) in GenAlEx 6.5 (Peakall & Smouse, 2012). The allelic richness (AR) and inbreeding coefficients (FIS) were calculated using FSTAT 2.9.4 (Goudet, 2003). Estimates of allelic richness were based on eight diploid individuals for the four geographical groups, on six diploid individuals for the mitochondrial lineage groups, on 11 diploid individuals for the temporal dataset, and on 47 diploid individuals for the total dataset. To check whether FIS significantly deviated from zero, we used the one-sample Wilcoxon signed rank test in R. Differences in genetic diversity (A, HO, HE, FIS) between the mitochondrial lineage groups and between the temporal groups were tested using the Mann-Whitney U test (Wilcoxon rank sum test) in R. To statistically compare genetic diversity among the four geographical groups, we used the Kruskal-Wallis rank sum test in R for simultaneous comparison of the groups, and the Mann-Whitney U test for pairwise comparison between groups. Statistical comparisons of FIS between the temporal groups and among the geographical groups were done with 11 loci: locus NVHfr206 was excluded, because expected heterozygosities in the Post-bottleneck group and in the Northern Europe group for this locus were zero, making calculation of the inbreeding coefficient impossible. Furthermore, we used ADZE (Szpiech, Jakobsson, & Rosenberg, 2008) to calculate the private allelic richness (PAR) per group.
Population structure was studied with STRUCTURE 2.3.4 (Pritchard, Stephens, & Donnelly, 2000), where we applied 500 000 MCMC chains with 20% burn-in and performed 10 iterations for each of 1 to 5 possible clusters (K). We used the admixture ancestry model (alpha inferred from the data) with correlated allele frequencies. We ran STRUCTURE de novo, and using geographical groups, mitochondrial lineages, and temporal groups as LOCPRIORs. The most likely number of clusters was estimated using three methods: Evanno ΔK (Evanno, Regnaut, & Goudet, 2005), standard log probability test (L(K), Pritchard et al. , 2000) and the Puechmaille’s optimal K (Puechmaille, 2016), all implemented in Structure Selector (Li & Liu, 2018). We examined the results of all four Puechmaille’s K estimators (median of medians, median of means, maximum of medians, and maximum of means), and if differences were observed, we chose the median of medians (MedMedK) , because this parameter is less affected by incorrect grouping of individuals into populations and presence of migrants, and it helps to avoid overestimation, which might occur with estimators based on the maximum (MaxMeaK and MaxMedK, Puechmaille, 2016). We visualized the results of assignment with the online tool POPHELPER (Francis, 2016).
We also analyzed genetic structure using Discriminant analysis of principal components (DAPC, Jombart et al. , 2010) implemented in “adegenet” 2.1.3 (Jombart, 2008) package in R. We performed DAPC analysis for all pre-defined groups and with de novo grouping of individuals into clusters. For the de novo assignment we used the find.clusters() command and selected K according to the lowest BIC score. For choosing the optimal number of principal components (PCs) to retain in the discriminant analysis, we applied the cross-validation method with 1000 permutations. In addition, we used GenePlot in the “geneplot” (McMillan & Fewster, 2017) package in R to visualize the genetic assignment of individuals. In GenePlot, we performed Principal Components Analysis (PCA) for geographical groups, log genotype probability (LGP) test for pairwise comparisons of these groups, LGP test to compare Mediterranean and Holarctic lineages, and LGP test to compare the temporal groups. For these analyses, we only included individuals with a minimum of eight genotyped loci. We calculated pairwise FST values between groups and performed AMOVA with number of different alleles as a distance method using 10 000 permutations in Arlequin. To test for isolation by distance (IBD) we used the Mantel test and the spatial autocorrelation test, both implemented in GenAlEx.
To additionally check for signs of population bottlenecks, we ran the Bottleneck program (Piry, Luikart, & Cornuet, 1999) using the two-phase mutation model (TPM) with variance set to 30, and a proportion of SMM (stepwise mutation model) of 80% in TPM. We used the Wilcoxon sign rank test for heterozygote excess with 10 000 replications and the mode-shift test to identify groups with signs of bottlenecks.