WGD-2 causes duplication of SrCYP76AK6–8 genes and site-specific
mutations in molecular docking sites
Carnosic acid and carnosol are the primary diterpenes in S.
rosmarinus leaves, and their biosynthesis has been elucidated. These
compounds are derived from the precursors IPP and DMAPP through the MEP
pathway in the plastids, with downstream steps catalyzed by enzymes
including diterpene synthases and cytochrome P450s. In the S. rosmarinus
genome, we identified three genes encoding SrCYP76AK6, two encoding
SrCYP76AK7, and two encoding SrCYP76AK5 on
pseudochromosome 11. All of these genes were clustered within a 0.33 Mb
region (Figure 6d), and one, four, and one homologous gene were
identified in the syntenic positions in S. miltiorrhiza, S.
splendens, and S. baicalensis, respectively (Figure 5e). These
findings suggest that substantial duplication of SrCYP76AK5,
SrCYP76AK6 and SrCYP76AK7 occurred on pseudochromosome 11 after the
speciation of S. rosmarinus. Moreover, SrCYP76AK5-2, SrCYP76AK6-1, and
SrCYP76AK6-2 are highly expressed in
rosemary leaves, with expression levels 6.59-fold, 5.64-fold, and
6.25-fold higher than in roots, respectively (Figure 7d). Therefore, the
clustering, expansion, and high expression of the genes encoding
SrCYP76AK5, SrCYP76AK6 and SrCYP76AK7 might have
contributed to the accumulation of carnosol in S. rosmarinus.
In addition to the SrCYP76AK genes on pseudochromosome 11, we
also identified one SrCYP76AK5 gene and one SrCYP76AK8 gene on
pseudochromosome 3. Our analysis of the evolutionary trajectory of the
S. rosmarinus chromosomes suggested that the duplication of CYP76AK8
occurred as a result of WGD-2 and that the copies were subsequently
affected by chromosomal rearrangements and fusions on pseudochromosomes
3 and 11, respectively (Figure S29b). The Ks values between homologous
gene pairs (SrCYP76AK5-1 vs SrCYP76AK5-2 and SrCYP76AK5-1 vs
SrCYP76AK6-2) were all close to the Ks value of WGD-2, indicating that
these duplications occurred during this event, after which SrCYP76AK6-2
was duplicated to give SrCYP76AK7-2. The duplications of SrCYP76AK7-1 to
SrCYP76AK7-2 and of SrCYP76AK6-1 to SrCYP76AK6-2 occurred close to the
present (Table S31). We hypothesize that SrCYP76AK5-1 and SrCYP76AK8-1
on Chr3 were copied to Chr11 during WGD-2, followed by a recent tandem
duplication on Chr11 and then by duplications of SrCYP76AK6-2 and
SrCYP76AK6-3 to form the cluster of six SrCYP76AK copies. Moreover,
further duplications of chromosome fragments led to the clustering of
six SrCYP76AK6–8 genes on pseudochromosome 11 within a 0.33 Mb region
(Figure 7e). This is supported by our phylogenetic analysis of the
proteins encoded by homologous gene pairs (Figure S30) and by Ks
calculations (Table S29).
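The timing argument above compares each gene pair's Ks with the Ks peak
of WGD-2. A minimal sketch of that comparison follows; the function name,
the peak value, the tolerance, and the gene-pair Ks values are all
hypothetical placeholders chosen for illustration, not values from
Table S29 or Table S31.

```python
def classify_duplication(ks, wgd_peak, tolerance):
    """Label a gene pair by how its Ks compares with a WGD Ks peak."""
    if abs(ks - wgd_peak) <= tolerance:
        return "WGD-associated"
    if ks < wgd_peak - tolerance:
        return "post-WGD (recent)"
    return "pre-WGD (ancient)"

# Hypothetical values for illustration only (not from the paper's tables).
WGD2_PEAK = 0.30
TOLERANCE = 0.05
example_pairs = {
    ("geneA", "geneB"): 0.28,  # close to the assumed peak
    ("geneC", "geneD"): 0.02,  # much smaller Ks, i.e. a recent duplication
}

labels = {
    pair: classify_duplication(ks, WGD2_PEAK, TOLERANCE)
    for pair, ks in example_pairs.items()
}
for pair, label in labels.items():
    print(pair, label)
```

A fixed tolerance is a simplification; in practice the width of the Ks
peak would determine what counts as "close to" the WGD value.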
To gain a comprehensive understanding of the evolution of the CYP76AK
subfamily, we examined the proteins encoded by CYP76AK1 and CYP76AK6–8
in 24 different species and extracted a total of 18
protein sequences, mainly from Salvia species. Using an Ocimum
basilicum CYP76 gene as an outgroup, a maximum likelihood (ML)
tree of CYP76 genes was reconstructed (Figure S29a). The phylogenetic
relationships revealed that the proteins encoded by CYP76AK1s,
CYP76AK2s, CYP76AK3s, CYP76AK5s, CYP76AK6s, CYP76AK7s and CYP76AK8s
fall into four distinct groups. The evolutionary tree of the CYP76AK
subfamily showed two clades: the clade containing CYP76AK3 and CYP76AK7
was sister to the clade containing CYP76AK5, CYP76AK6, CYP76AK8,
CYP76AK1 and CYP76AK2. CYP76AK6, CYP76AK1 and CYP76AK2 did not form an
independent clade, which indicates that CYP76AK6, CYP76AK1 and CYP76AK2
evolved from CYP76AK8, and that CYP76AK1 and CYP76AK2 evolved from
CYP76AK6.
We observed that SrCYP76AK6–8 catalyzed the conversion of
11-hydroxy-ferruginol into capraldehyde in S. rosmarinus, while
SmCYP76AK1 catalyzed the production of 11,20-dihydroxy-ferruginol
in S. miltiorrhiza (Figure 7a). To further investigate the
catalytic mechanism of the CYP76AK subfamily, we performed homology
modeling and molecular docking to infer the key amino acid sites in
SmCYP76AK1 and in the SrCYP76AKs that are highly expressed in rosemary
leaves. Using the SmCYP76AH1 structure (PDB ID: 5YM3) as a template, we
generated 3D models of SmCYP76AK1 and the SrCYP76AKs and docked the
substrate 11-hydroxy-ferruginol into them. Our results showed that
position C-20 of 11-hydroxy-ferruginol was closer to the heme iron when
docked with SrCYP76AK5-2, SrCYP76AK6-1, and SrCYP76AK6-2 than when
docked with SmCYP76AK1 (Figure 6b). This closer proximity may have led
to sequential oxidation at C-20, resulting in the accumulation of
carnosol precursors. We hypothesized that mutations in essential amino
acids could result in functional differentiation of the CYP76AKs,
leading to the accumulation of carnosol in the leaves of S. rosmarinus
and of tanshinone in the roots of S. miltiorrhiza.
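The docking comparison above rests on a single geometric quantity, the
distance between the ligand C-20 atom and the heme iron. A minimal
sketch of how that distance can be computed and compared across docked
models is given below. The model names and all coordinates are
hypothetical and chosen only for illustration; they are not taken from
the docking results.

```python
import math

def c20_to_iron_distance(c20_xyz, fe_xyz):
    """Euclidean distance (in Å) between the ligand C-20 atom and the heme iron."""
    return math.dist(c20_xyz, fe_xyz)

# Hypothetical coordinates (Å) for two docked models; illustration only.
poses = {
    "model_A": {"C20": (1.0, 2.0, 2.0), "FE": (0.0, 0.0, 0.0)},
    "model_B": {"C20": (3.0, 4.0, 0.0), "FE": (0.0, 0.0, 0.0)},
}

distances = {
    name: c20_to_iron_distance(pose["C20"], pose["FE"])
    for name, pose in poses.items()
}
# The model whose C-20 sits nearest the iron.
closest = min(distances, key=distances.get)
print(distances, closest)
```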
Furthermore, we investigated amino acid mutations within 8 Å of the
active pocket to understand their potential influence on the
proximity of the ligand to the heme iron (Figure 7c). To identify key
residues at the docking sites, we compared the residues within this
range between the SrCYP76AKs and SmCYP76AK1 and found nine candidate
residues that differed. We then conducted
remodeling and docking experiments by replacing the corresponding
residues of the SrCYP76AKs with those of SmCYP76AK1, and vice
versa. Specifically, we mutated S445 and I449 of SrCYP76AK5-2,
SrCYP76AK6-1, and SrCYP76AK6-2 to I445 and M449,
respectively, to mimic SmCYP76AK1. Conversely, we mutated I445
and M449 of SmCYP76AK1 to S445 and I449, respectively, to mimic
SrCYP76AK5-2, SrCYP76AK6-1, and SrCYP76AK6-2. We then docked these
remodeled proteins with 11-hydroxy-ferruginol. The results showed that
the co-mutation of S445I and I449M in SrCYP76AK5-2, SrCYP76AK6-1, and
SrCYP76AK6-2 led to the ligand docking farther from the heme iron, while
the co-mutation of I445S and M449I in SmCYP76AK1 resulted in docking
closer to the heme iron (Figure S31).
Therefore, we hypothesized that the residues at positions 445 and 449
play a significant role in determining the distance of the ligand from
the heme iron and may have contributed to the functional divergence of
SmCYP76AK1 from SrCYP76AK6–8. Our findings suggest that these residues
are critical for ligand binding and may have important implications for
understanding the functional differences between these enzymes.
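The residue screen described above has two steps: select residues
within 8 Å of the ligand, then keep those that differ between the two
enzymes. The sketch below illustrates this under simplifying
assumptions: each residue is represented by a short list of atom
coordinates, and both proteins share one residue numbering. The
function names, the coordinates, and residues 300 and 446 are
hypothetical; only the S445/I445 and I449/M449 differences come from
the text.

```python
import math

def residues_within(residue_atoms, ligand_atoms, cutoff=8.0):
    """Residue numbers with at least one atom within `cutoff` Å of any ligand atom."""
    return sorted(
        resnum
        for resnum, atoms in residue_atoms.items()
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms)
    )

def differing_positions(seq_a, seq_b, positions):
    """(position, residue_a, residue_b) for positions where the two proteins differ."""
    return [(p, seq_a[p], seq_b[p]) for p in positions if seq_a[p] != seq_b[p]]

# Hypothetical coordinates (Å); one representative atom per residue.
ligand_atoms = [(0.0, 0.0, 0.0)]
residue_atoms = {
    300: [(20.0, 0.0, 0.0)],  # hypothetical residue outside the 8 Å shell
    445: [(1.0, 0.0, 0.0)],
    446: [(0.0, 2.0, 0.0)],   # hypothetical conserved residue
    449: [(0.0, 7.5, 0.0)],
}
# Residue identities: 445 and 449 follow the text; 300 and 446 are made up.
sr_like = {300: "A", 445: "S", 446: "L", 449: "I"}
sm_like = {300: "G", 445: "I", 446: "L", 449: "M"}

pocket = residues_within(residue_atoms, ligand_atoms)
candidates = differing_positions(sr_like, sm_like, pocket)
print(pocket, candidates)
```

Residue 300 differs between the two toy sequences but is excluded
because it lies outside the cutoff, which mirrors restricting the
comparison to the active pocket.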