2.5.3 Genetic relationship with crown baldness
During sampling, we noted that eight King Island scrubtits had a distinct bald patch on the crown (Figure S3), so we explored whether this feature was linked to genome-wide heterozygosity or to particular SNP loci. We used “inbreedR” v0.3.3 (Stoffel et al. 2016) to calculate multi-locus heterozygosity (MLH) for each individual, then tested for a heterozygosity-fitness correlation using logistic regression in “lme4” v1.1-31 (Bates et al. 2015), with a binomial response (bald or not bald) and MLH as the predictor. We fitted this model to (i) the entire scrubtit sample and (ii) King Island scrubtits only, and included genotypic sex as a random term in both models.
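The MLH calculation described above can be illustrated with a minimal sketch. It assumes the usual definition of MLH, the number of heterozygous loci divided by the number of loci typed in that individual, and a 0/1 genotype coding with missing calls. The function name and the input layout are hypothetical and written in plain Python for illustration; this is not the inbreedR interface.

```python
import math

def multilocus_heterozygosity(genotypes):
    """Return one MLH value per individual.

    genotypes: list of per-individual lists, one entry per SNP locus, coded
    1 = heterozygous, 0 = homozygous, None = missing call
    (coding assumed for this sketch).
    """
    mlh = []
    for individual in genotypes:
        # Only loci that were successfully typed count towards the denominator
        typed = [g for g in individual if g is not None]
        if typed:
            mlh.append(sum(typed) / len(typed))
        else:
            # No typed loci: MLH is undefined for this individual
            mlh.append(math.nan)
    return mlh

# Toy data: three individuals scored at four loci (values are invented)
toy = [
    [1, 0, None, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
print(multilocus_heterozygosity(toy))
```

Each resulting value would then serve as the predictor for one individual in the logistic regression of bald versus not bald.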
To determine whether any loci were significantly associated with baldness, we used a latent factor mixed modelling (LFMM) approach. This method tests the explanatory significance of a trait variable on the genotypic matrix, allowing inference about the genetic basis of the trait. As LFMM requires complete data, we imputed missing genotypes with the ‘impute’ function of the “LEA” package (Frichot & François, 2015), using four ancestral populations and method = “mode”. We then used the ‘lfmm2’ exact least-squares function of the LEA package to build the LFMM object and identify loci whose allele frequencies were correlated with baldness (Caye et al., 2019). This method controls for population structure via a number of latent factors equal to the number of ancestral populations. We adjusted the p-values for each SNP using the robust estimate of the genomic inflation factor (Martins et al., 2016) and a Benjamini-Hochberg correction (Benjamini & Hochberg, 1995) to ensure a low rate of false discovery (corrected to 1 in 10,000 SNPs). We then produced a Manhattan plot showing the positions of candidate SNPs. We located the genomic coordinates of the candidate SNPs in the transcriptome-guided genome annotation to determine whether they were genic or non-genic, and to infer the putative function of the gene or nearest candidate gene. If the gene was not annotated by FGENESH++, we queried the protein sequence against the National Center for Biotechnology Information (NCBI) RefSeq non-redundant protein sequences database using the BLASTp webserver to search for homology to known genes (Johnson et al., 2008).
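The p-value adjustment can be sketched as two steps. The sketch assumes that the inflation factor is estimated as the median of the squared z-scores divided by the median of a chi-square distribution with one degree of freedom, which is a common robust estimator and may differ in detail from the one applied here. Function names and inputs are hypothetical, and the code is standard-library Python rather than the LEA interface.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364

def calibrate_pvalues(z_scores):
    """Rescale squared z-scores by a median-based inflation factor.

    Returns (inflation_factor, calibrated_p_values).
    """
    chi2 = [z * z for z in z_scores]
    gif = median(chi2) / CHI2_1DF_MEDIAN
    # Upper tail of chi-square(1): P(X > x) = erfc(sqrt(x / 2))
    pvals = [math.erfc(math.sqrt(x / gif / 2.0)) for x in chi2]
    return gif, pvals

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals

# Toy z-scores for five SNPs (values are invented)
gif, p = calibrate_pvalues([0.3, -1.1, 4.2, 0.8, -0.5])
q = benjamini_hochberg(p)
```

SNPs whose adjusted value falls below the chosen false discovery threshold would be retained as candidates and carried forward to the annotation step.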