2.3 Amplification, sequencing and analysis of MHC genes
A fragment of exon 2 of the MHC class II DRB gene (beta 1, 231 bp) was targeted using the primers designed by Becker et al. (2009) for Mustela lutreola, and a fragment of exon 2 (alpha 1) of MHC class I was targeted using the primers Meme-MHC-Iex2F and PpLAa1L250 designed by Sin et al. (2012) for mustelids. PCRs were carried out in 25 μl volumes containing 0.9 μl of primer mix, 5 μl of GoTaq reaction buffer (Promega), 2 μl of MgCl2, 0.04 μl of BSA, 0.8 μl of dNTPs, 0.125 μl of GoTaq G2 DNA polymerase (Promega, France) and 3 μl of DNA. PCR cycling used a touchdown protocol in which the annealing temperature decreased from 65°C to 56°C, with annealing for 30 s. Each sample was amplified in duplicate, and the duplicates were pooled after quantification using the Quant-iT PicoGreen dsDNA Assay Kit (Thermo Fisher Scientific Inc., Austria). Library preparation and sequencing were performed by Novogene (UK) using their designated library protocol. For MHC genotyping, 2×250 bp paired-end sequencing at a depth of 50,000 reads per sample was completed on an Illumina NovaSeq platform (Illumina Biotechnology Co., Novogene, UK).
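The reaction composition above can be checked with a short calculation. The following is a minimal sketch, not the authors' protocol sheet: it assumes that the volume not accounted for by the listed components is made up with water (the text does not state this), and the function names and the 10% pipetting overage are hypothetical choices for illustration.

```python
REACTION_VOLUME_UL = 25.0

# Per-reaction volumes in microlitres, as listed in the text.
COMPONENTS_UL = {
    "primer mix": 0.9,
    "GoTaq reaction buffer": 5.0,
    "MgCl2": 2.0,
    "BSA": 0.04,
    "dNTPs": 0.8,
    "GoTaq G2 DNA polymerase": 0.125,
    "template DNA": 3.0,
}


def water_volume_ul(components, total_ul):
    """Volume left to reach total_ul (assumed here to be water)."""
    remainder = total_ul - sum(components.values())
    if remainder < 0:
        raise ValueError("components exceed the reaction volume")
    return remainder


def master_mix_ul(components, total_ul, n_reactions, overage=0.10,
                  template="template DNA"):
    """Scale all non-template components, plus the assumed water,
    to n_reactions with a fractional pipetting overage (hypothetical)."""
    factor = n_reactions * (1.0 + overage)
    mix = {name: vol * factor
           for name, vol in components.items() if name != template}
    mix["water (assumed)"] = water_volume_ul(components, total_ul) * factor
    return mix


if __name__ == "__main__":
    print(f"water per reaction: {water_volume_ul(COMPONENTS_UL, REACTION_VOLUME_UL):.3f} ul")
    for name, vol in master_mix_ul(COMPONENTS_UL, REACTION_VOLUME_UL, 96).items():
        print(f"{name}: {vol:.2f} ul")
```

The listed components sum to 11.865 μl, so under the stated assumption each reaction would be completed with 13.135 μl of water.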
To analyze MHC-I and MHC-II amplicon sequences, we used the three-step pipeline AmpliSAS (Sebastian et al., 2015). Low-quality sequences with Phred scores lower than 20 were removed, and clustering was conducted using the default parameters for Illumina sequences. Previously identified MHC-II DRB alleles for E-mink were extracted from NCBI (Becker et al., 2009), as were MHC-I exon 2 sequences from closely related species (Mustela putorius and Mustela itatsi). If an NCBI BLAST search revealed 100% sequence identity between an allele discovered in this study and a previously identified one, the allele name was replaced by the accession number of the matching sequence. For the subsequent analyses, we focused on the translated amino acid sequences (referred to as MHC motifs), as they are in direct contact with bacteria. We measured motif richness as the number of sequences per individual for each locus. We calculated functional distances between individuals following the approach described in Strandh et al. (2012). A maximum-likelihood tree was constructed based on the chemical binding properties of the amino acids, as described by five physico-chemical descriptor variables (z-descriptors) for each amino acid, using sequences of Meles meles, Meles leucurus, Meles anakuma and Martes zibellina retrieved through NCBI BLAST as the out-group (Figure S1). For each gene, the tree was used as a reference from which the functional distances between individuals were calculated using unweighted UniFrac (Lozupone & Knight, 2005). Following Bolnick et al. (2014), the genetic distance among the amino acid sequences within each individual (Faith's PD) was calculated and is hereafter referred to as motif divergence.
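The three per-individual and pairwise measures described above (motif richness, Faith's PD and unweighted UniFrac) can be illustrated on a toy tree. This is a minimal sketch under stated assumptions, not the pipeline used in this study: the tree, motif names and branch lengths are hypothetical, the tree is treated as rooted, and Faith's PD is computed with the convention that branches on the path to the root are included.

```python
# Hypothetical rooted tree: child -> (parent, branch length).
# Tips m1..m4 stand in for MHC motifs; n1 and n2 are internal nodes.
TREE = {
    "n1": ("root", 0.5),
    "n2": ("root", 0.3),
    "m1": ("n1", 0.2),
    "m2": ("n1", 0.1),
    "m3": ("n2", 0.4),
    "m4": ("n2", 0.6),
}


def covered_branches(motifs, tree):
    """Branches (named by their child node) on the paths from the
    given motifs up to the root."""
    covered = set()
    for node in motifs:
        while node in tree:          # stops once the root is reached
            covered.add(node)
            node = tree[node][0]
    return covered


def motif_richness(motifs):
    """Number of distinct motifs carried by one individual."""
    return len(set(motifs))


def faith_pd(motifs, tree):
    """Faith's PD: total length of the branches covered by the motifs."""
    return sum(tree[b][1] for b in covered_branches(motifs, tree))


def unweighted_unifrac(motifs_a, motifs_b, tree):
    """Fraction of covered branch length that leads to motifs of
    only one of the two individuals."""
    a = covered_branches(motifs_a, tree)
    b = covered_branches(motifs_b, tree)
    total = sum(tree[x][1] for x in a | b)
    unique = sum(tree[x][1] for x in a ^ b)
    return unique / total if total else 0.0


if __name__ == "__main__":
    ind_a = {"m1", "m2"}   # hypothetical individual A
    ind_b = {"m2", "m3"}   # hypothetical individual B
    print(motif_richness(ind_a), faith_pd(ind_a, TREE))
    print(unweighted_unifrac(ind_a, ind_b, TREE))
```

In this toy case the two individuals share motif m2 and the branch leading to n1, so only the branches leading to m1, m3 and n2 count as unique. In practice, an established implementation such as the `faith_pd` and `unweighted_unifrac` functions in scikit-bio would normally be preferred over hand-written code.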