2. Genotyping
The methods used to sequence and genotype MHC IIβ alleles were identical to
those of Stutz & Bolnick (2017). Briefly, genomic DNA was extracted from fin
clips using a Promega Wizard 96-well extraction kit. We used PCR to
amplify the second exon of MHC IIβ genes in each fish, with primers and
PCR cycles as described in Stutz & Bolnick (2017). This exon contains
the hypervariable peptide-binding region that binds to possible parasite
antigens (Sommer, 2005). Each specimen was barcoded with a unique
combination of forward and reverse primer tags for multiplexing. We used
Quant-iT PicoGreen kits (Invitrogen P11496) to quantify DNA
concentrations of magnetic bead-purified (Agencourt AMPure XP beads) PCR
products, then pooled up to 400 samples in equimolar amounts to
construct a library. We used an Illumina MiSeq to sequence these
multiplexed amplicon libraries. Then, we used a Stepwise Threshold
Clustering (STC) program (Stutz & Bolnick, 2014), implemented in the
AmpliSAS web software (URL:
http://evobiolab.biol.amu.edu.pl/amplisat/index.php?amplisas;
Sebastian, Herdegen, Migalska, & Radwan, 2016), to distinguish real
sequence variants from sequencing errors and PCR chimeras. The algorithm
was originally validated by sequencing cloned amplicon products (Stutz
& Bolnick, 2017). The software outputs a table of individual fish
(rows) and unique MHC sequences (columns) with read depths.
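The equimolar pooling step described above can be illustrated with a minimal sketch. It assumes that all amplicons have the same length, so that equal DNA mass corresponds to equal molar amounts. The function name, sample labels, concentrations, and the 10 ng target are hypothetical and are not taken from the protocol.

```python
def pooling_volumes(conc_ng_per_ul, target_ng=10.0):
    """Return the volume (in µL) of each purified PCR product to pool.

    Assumption: every amplicon has the same length, so contributing the
    same mass (target_ng) from each sample gives an equimolar pool.
    """
    volumes = {}
    for sample, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        volumes[sample] = target_ng / conc
    return volumes


# Hypothetical PicoGreen readings (ng/µL) for three barcoded samples.
concs = {"fish_001": 20.0, "fish_002": 5.0, "fish_003": 12.5}
vols = pooling_volumes(concs, target_ng=10.0)
```

More dilute samples receive proportionally larger volumes, so each barcoded sample contributes the same amount of template to the library.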
To efficiently process data in AmpliSAS, we capped the read depth for
each individual at 5000 reads, which was sufficient to retrieve all
the possible MHC alleles (Stutz & Bolnick, 2014). Because low
sequencing coverage could bias the number of MHC alleles
identified, individual fish with coverage lower than 450 reads were excluded
from this study (Fig. S1). After excluding the 160 individuals with low
coverage (leaving N = 1277 individuals in the subsequent analyses),
there was no longer a significant linear relationship between sequencing
coverage and the number of MHC alleles (t = 1.46, p = 0.14). The number
of unique MHC alleles was inferred from the translated protein
sequences, thus merging distinct exonic sequences that encode identical
amino acid sequences.
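The coverage filter and the protein-level allele count described above can be sketched as follows. This is a sketch under stated assumptions, not the pipeline used in the study: it assumes that coverage is the total read depth summed across an individual's retained variants, that every sequence starts in reading frame, and that the standard genetic code applies. The function names and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a nucleotide sequence, assuming it starts in frame.

    Trailing bases that do not fill a codon are ignored; codons with
    ambiguous bases are reported as 'X'.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def count_protein_alleles(depths, min_coverage=450):
    """Count unique translated alleles per individual.

    depths maps individual -> {nucleotide sequence: read depth}.
    Individuals whose summed read depth is below min_coverage are dropped.
    """
    counts = {}
    for individual, seq_depths in depths.items():
        if sum(seq_depths.values()) < min_coverage:
            continue  # excluded for low coverage
        proteins = {translate(s) for s, d in seq_depths.items() if d > 0}
        counts[individual] = len(proteins)
    return counts


# Toy data: fish_A carries two synonymous variants (GCT/GCC both encode Ala).
depths = {
    "fish_A": {"GCTGAA": 300, "GCCGAA": 200, "GCTAAA": 100},
    "fish_B": {"GCTGAA": 120, "GCTAAA": 80},  # 200 reads, below threshold
}
counts = count_protein_alleles(depths)
```

In this toy example, fish_A has three nucleotide variants but only two protein-level alleles, and fish_B is excluded because its summed depth falls below the threshold.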