Synonymous/nonsynonymous vs functional/nonfunctional
For most pairs of sister species that recently evolved from a common
ancestor and now have a DNA barcode gap, there is no difference in the
amino acid sequence of the portion of the COX1 gene serving as the
barcode gene. However, this does not necessarily mean that there are no
functional changes to the other mt protein-coding genes, which encode seven
subunits of Complex I, one subunit of Complex III, two (additional)
subunits of Complex IV, and two subunits of Complex V. Indeed, the seven
mitochondrially encoded protein subunits of Complex I are much more
frequently implicated in adaptive divergences between sister taxa than
Complex IV subunits (da Fonseca et al. 2008; Garvin et al. 2014). At
least some sister taxa also carry fixed differences in amino acid
sequence for subunits of Complexes III and V (reviewed in Hill 2019a).
A hypothesis that is worthy of testing is that the pattern of little
variation within species but substantial differences between species in
mt DNA sequence arises entirely as a consequence of strong selection on
adaptive amino acid substitutions in mt-encoded proteins (da Fonseca et
al. 2008). Given available data, however, I do not think that an
adaptive protein evolution hypothesis will be the primary solution to
the paradox of the mt DNA barcode gap, because purifying selection is,
indisputably, the dominant force in the evolution of all mt
protein-coding genes (Stewart et al. 2008; Kerr 2011).
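The distinction drawn at the start of this section, between nucleotide divergence and amino acid divergence, can be made concrete with a minimal sketch. It assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), so it would need a different table for invertebrate taxa. It also assumes aligned, in-frame, uppercase sequences with no gaps. The function name and the two short sequences are hypothetical illustrations, not data from any study cited here. The sketch counts differing codons rather than estimating dN/dS.

```python
from itertools import product

BASES = "TCAG"
# Assumption: NCBI translation table 2 (vertebrate mitochondrial code),
# listed in TCAG codon order. '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Count codons that differ between two aligned in-frame sequences.

    Returns (synonymous, nonsynonymous): differing codons that encode the
    same amino acid versus differing codons that encode different ones.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3]
        codon_b = seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons contribute no difference
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical sequences: three of four codons differ at the nucleotide
# level, yet both translate to the same peptide (M-L-G-W).
species_a = "ATGCTAGGATGA"
species_b = "ATACTGGGCTGA"
print(classify_codon_differences(species_a, species_b))
```

In this toy case every difference is synonymous, which is the pattern described above for the COX1 barcode region in most sister species: measurable nucleotide divergence with no change in the encoded protein.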
I propose that the key to explaining the evolution of the mt DNA barcode
gap lies in giving full consideration to the fact that all of the genes
encoded by the animal mitochondrial genome will evolve via natural
selection primarily in response to the internal genomic environment
(Sunnucks et al. 2017; Sloan et al. 2018; Hill et al. 2019). Most of the
genes in the mitochondrial genome code for products other than
proteins; in bilaterian animals, 24 out of 37 mitochondrial genes (65%)
code for transfer RNA (tRNA) or ribosomal RNA (rRNA) (Rand et al. 2004;
Burton and Barreto 2012). Every one of these genes maintains
coadaptation with the N-encoded genes through coevolution; in other
words, there is a prediction of perpetual directional adaptive evolution
of all of the products of the mt genome in response to the internal
genomic environment (Kivisild et al. 2006; Hill 2019a; Wei et al. 2019;
Zaidi and Makova 2019). The selective driver of this process of adaptive
evolution of mt genes is compensatory coevolution, whereby mt genes
evolve so as to compensate for deleterious N genotypes and vice versa
(Rand et al. 2004; Dowling et al. 2008; Osada and Akashi 2012; Barreto
et al. 2018; Hill 2020). Even the non-coding region of the animal
mitochondrial genome, which contains the sites where transcription and
replication are initiated, coevolves with N genes (Gaspari et al.
2004; Ellison and Burton 2008a, 2010). The expression of mt and N genes
that code for co-functioning units must also be co-regulated, another
important level of mitonuclear coadaptation (Barshad et al. 2018; Calvo
et al. 2019).
There is a large and rapidly growing literature showing that single
nucleotide substitutions in each of the non-protein-coding genes of the
animal mitochondrion have important fitness consequences (reviewed in
Hill 2019). Many of these fitness effects play out in relation to the
external environment of the organism (Hoekstra et al. 2013), but the
source of the hypothesized perpetual evolutionary change of all of the
products of the mt genome would be selection to maintain coadaptation
with products of the N genome to enable cellular respiration (Meiklejohn
et al. 2013; Barreto et al. 2018; Hill 2020). Because it is dependent on
random mutations in both the N and mt genomes, coevolution of
co-functioning mt and N genes to maintain mitochondrial function will be
idiosyncratic, unpredictable, and not repeatable (Blount et al. 2018).
Directional selection on both the mt and N genomes to maintain
mitonuclear coadaptation will create the sort of divergence in mt
genotypes between species that gives rise to a DNA barcode gap (Burton
and Barreto 2012; Hill 2016). The key missing element is: how would
divergence in a tRNA, rRNA, or the control region affect the barcode
region of the COX1 gene or other synonymous substitutions in
protein-coding genes?