2.5.3 Genetic relationship with crown baldness
During sampling, we noted that eight King Island scrubtits had a
distinct bald patch on the crown of their heads (Figure S3), so we explored
the possibility that this feature was linked to genome-wide
heterozygosity or particular SNP loci in these individuals. We used
“inbreedR” v0.3.3 (Stoffel et al., 2016) to calculate multi-locus
heterozygosity (MLH) values for each individual, then implemented a
heterozygosity-fitness correlation analysis using logistic regression
via the package “lme4” v1.1-31 (Bates et al., 2015), modelling a binomial
response (bald or not bald) against MLH for (i) the entire scrubtit sample; and
(ii) only King Island scrubtits. We included genotypic sex as a random
term in both models.
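As an illustration of the MLH statistic only, the following minimal Python sketch computes the proportion of heterozygous loci among the successfully genotyped loci of each individual. It is not the inbreedR implementation used in this study; the function name, the 0/1/2 genotype coding with None for missing calls, and the two example birds are hypothetical. The resulting per-individual values would then serve as the predictor in the logistic regression described above.

```python
import math


def multilocus_heterozygosity(genotypes):
    """Proportion of heterozygous loci among typed loci for one individual.

    genotypes: alternate-allele counts per locus (0, 1 or 2), with None
    marking a missing call. This coding is an assumption for illustration.
    """
    typed = [g for g in genotypes if g is not None]
    if not typed:
        return math.nan  # no genotyped loci, so MLH is undefined
    heterozygous = sum(1 for g in typed if g == 1)
    return heterozygous / len(typed)


# Hypothetical genotype vectors, one per individual
individuals = {
    "bird_A": [0, 1, 1, 2, None, 1],
    "bird_B": [0, 0, 2, 2, 1, None],
}

mlh = {name: multilocus_heterozygosity(g) for name, g in individuals.items()}
for name, value in mlh.items():
    print(name, round(value, 3))
```

Missing calls are excluded from both the numerator and the denominator, so individuals with different amounts of missing data remain comparable.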
To determine if any loci were significantly associated with baldness, we
used a latent factor mixed modelling (LFMM) approach. This method tests
the explanatory significance of a trait variable on the genotypic
matrix, allowing for inference regarding the genetic basis of the trait.
As LFMM requires no missing data, we imputed missing genotypes via the
‘impute’ function of the “LEA” package (Frichot & François, 2015),
utilising four ancestral populations and method = “mode”. We then used
the ‘lfmm2’ exact least-squares function of the LEA package to build the
LFMM object and identified allele frequencies that were correlated with
the baldness trait (Caye et al., 2019). This method
controls for population structure via a number of latent factors equal
to the number of ancestral populations. We adjusted the p-values
for each SNP using the robust estimate of the genomic inflation factor
(Martins et al., 2016) and a Benjamini-Hochberg algorithmic correction
(Benjamini & Hochberg, 1995) to ensure a low rate of false discovery
(corrected to 1 in 10,000 SNPs). We then produced a Manhattan plot
showing the positions of candidate SNPs. We identified the genomic
coordinates of the candidate SNPs in the transcriptome-guided genome
annotation to determine if they were genic or non-genic and the putative
function of the gene or nearest candidate gene. If the gene was not
annotated by FGENESH++, we queried its protein sequence against the
National Center for Biotechnology Information (NCBI) RefSeq
non-redundant protein sequences database using the BLASTp web server to
identify homology to known genes (Johnson et al., 2008).
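The p-value calibration and false discovery control described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the LEA code used in this study: it assumes the genomic inflation factor is estimated as the median of the squared z-scores divided by the median of a chi-square distribution with one degree of freedom, and it reads "1 in 10,000 SNPs" as a Benjamini-Hochberg level of q = 1e-4. The function names and the z-scores are hypothetical.

```python
import math
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom
CHI2_MEDIAN_DF1 = 0.4549364


def calibrate_pvalues(z_scores):
    """Return (lambda, p-values) after dividing z^2 by the inflation factor."""
    chi2 = [z * z for z in z_scores]
    lam = median(chi2) / CHI2_MEDIAN_DF1
    # Chi-square (1 df) survival function: P(X > x) = erfc(sqrt(x / 2))
    pvals = [math.erfc(math.sqrt(c / lam / 2.0)) for c in chi2]
    return lam, pvals


def benjamini_hochberg(pvals, q):
    """Return indices of tests rejected by the BH step-up rule at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank whose p-value is under its BH threshold
        if pvals[idx] <= q * rank / m:
            k_max = rank
    return sorted(order[:k_max])


# Hypothetical per-SNP z-scores; index 5 mimics a strongly associated locus
z_scores = [0.1, -0.5, 0.8, -1.2, 0.3, 9.0, -0.7, 0.6, -0.2, 1.0]

lam, pvals = calibrate_pvalues(z_scores)
candidates = benjamini_hochberg(pvals, q=1e-4)
print("inflation factor:", round(lam, 3))
print("candidate SNP indices:", candidates)
```

Using the median rather than the mean of the squared z-scores keeps the inflation estimate from being driven by the few truly associated loci, which is why it is described as robust.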