Among all phytocannabinoids, THC is the major psychotropic one. Chemically, however, all of the molecules mentioned above are very similar in structure and are produced from the same precursor molecules (Figure 4). CBDA and THCA are biochemically synthesized by two closely related enzymes, CBDA synthase and THCA synthase (Shoyama et al., 2012; Taura et al., 1996). Both are synthesized from CBGA, which in turn is synthesized by a prenyltransferase from two non-cannabinoids, olivetolic acid and geranyl pyrophosphate (Fellermeier and Zenk, 1998) (Figure 4). Cannabichromenic acid (CBCA) synthase converts CBGA to CBCA (Morimoto et al., 1997) and is closely related to THCA and CBDA synthase (Figure 5), but the CBCA content of most mature Cannabis flowers is low (de Meijer et al., 2009a). Interestingly, CBDA synthase-like genes have been found in other plants and fungi (Aryal et al., 2019; Vergara et al., 2019).
Cannabis plants can have very high levels of phytocannabinoids, close to no phytocannabinoids at all, or anything in between (Aizpurua-Olaizola et al., 2016; de Meijer et al., 2009a). This has stimulated the description of different chemotypes that are characterized by their distinct phytocannabinoid profiles. Chemotypes are a very useful concept for chemical classification and for breeding programmes. It should be kept in mind, however, that they do not necessarily constitute a phylogenetic classification based on evolutionary relationships (de Meijer et al., 2009b; Small and Beckstead, 1973). Cannabis plants can roughly be categorized into five different ‘chemotypes’ (Figure 4). Plants of chemotype I (‘type I’ for short) produce high levels of THCA and only low levels of CBDA and CBGA (Small and Beckstead, 1973), so that the THCA/CBDA ratio is much larger than 1. In type II Cannabis plants, THCA and CBDA are produced in approximately equal amounts (Small and Beckstead, 1973). Both type I and type II plants are usually classified as ‘marijuana’ and can be subject to strict regulation, depending on the country or jurisdiction. These plants are bred to produce up to 20 % of their dry mass as phytocannabinoids.
In contrast, type III plants have high CBDA levels and low to very low amounts of THCA.
Chemotypes IV and V refer to Cannabis plants that have CBGA as their dominant phytocannabinoid or very low levels of phytocannabinoids overall, respectively (de Meijer et al., 2009a; de Meijer and Hammond, 2005) (Figure 4).
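The five chemotypes described above can be read as a simple decision rule on the relative amounts of THCA, CBDA and CBGA. The following Python sketch illustrates this reading only. The function name `classify_chemotype` and both numerical cut-offs (`low_total` and `ratio_cutoff`) are hypothetical values chosen for demonstration; they are not taken from the cited studies, which do not define the chemotypes by fixed numbers.

```python
def classify_chemotype(thca, cbda, cbga, low_total=0.5, ratio_cutoff=5.0):
    """Assign a chemotype (I-V) from THCA, CBDA and CBGA content (% dry mass).

    ASSUMPTIONS (illustrative only, not from the literature):
      low_total    - total content below which a plant counts as type V
      ratio_cutoff - THCA/CBDA (or CBDA/THCA) ratio above which one
                     compound is considered clearly dominant
    """
    total = thca + cbda + cbga

    # Type V: very low phytocannabinoid levels overall.
    if total < low_total:
        return "V"

    # Type IV: CBGA is the dominant phytocannabinoid.
    if cbga > thca and cbga > cbda:
        return "IV"

    # Type I: THCA/CBDA ratio much larger than 1.
    if cbda == 0 or thca / cbda >= ratio_cutoff:
        return "I"

    # Type III: high CBDA, low to very low THCA.
    if thca == 0 or cbda / thca >= ratio_cutoff:
        return "III"

    # Type II: THCA and CBDA in roughly comparable amounts.
    return "II"
```

With these assumed cut-offs, a made-up profile of 18 % THCA, 0.5 % CBDA and 0.5 % CBGA is assigned to type I, whereas 6 % THCA and 7 % CBDA is assigned to type II. Real classification schemes may draw the boundaries differently.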
In addition to the five chemotypes, the hemp-marijuana distinction is used to characterize different Cannabis plants (Figure 4). Plants whose THC/THCA content in the dry flower mass is below a threshold of 0.2-1 % (the exact value depends on the jurisdiction) are usually categorized as hemp, and plants above it as marijuana (Brunetti et al., 2020; Mead, 2017). The differentiation between hemp and marijuana can typically also be drawn genetically, with hemp and marijuana varieties forming two genetically distinct populations (Sawler et al., 2015). Further, hemp and marijuana can be phenotypically quite distinct: marijuana plants are generally bushier and carry a dense set of inflorescences, while hemp plants tend to be taller and less branched, with less dense flower structures. However, there are also plants with low THC/THCA content (type III) which strongly resemble marijuana in overall plant and inflorescence architecture (Grassa et al., 2018). Hence, the terms hemp and marijuana do not necessarily refer to distinct genetic populations or phylogenetic categories. As the critical distinction between hemp and marijuana is the THC/THCA content, they can also be considered broader categories of chemotypes.
The underlying genetics of the different chemotypes have been studied in considerable detail over the last two decades (de Meijer et al., 2009a, 2009b, 2003; de Meijer and Hammond, 2005; Pacifico et al., 2006; Toth et al., 2020; Weiblen et al., 2015; Welling et al., 2016). However, the complex nature of the Cannabis genome, with its many transposable elements, low-complexity regions and high heterozygosity, has made a conclusive analysis of the loci controlling phytocannabinoid production challenging (Grassa et al., 2018; Laverty et al., 2019; McKernan et al., 2018).
Different genetic loci encoding the different types of synthases have been postulated to determine a plant’s chemotype. At locus B, two codominant alleles were hypothesized to exist: the allele BT encodes the THCA synthase and BD the CBDA synthase (Figure 4) (de Meijer et al., 2003). Depending on the presence of either or both alleles, the plant will be chemotype I (BT/BT), chemotype II (BT/BD) or chemotype III (BD/BD) (de Meijer et al., 2003; Toth et al., 2020; Welling et al., 2016). Additionally, non-functional alleles of the synthase gene (B0) are predicted to be associated with chemotype IV, in which neither CBDA nor THCA is produced and the precursor, CBGA, accumulates (Figure 4) (de Meijer and Hammond, 2005; Onofri et al., 2015; Welling et al., 2016).
Further, according to this model, CBCA synthase is encoded by an independent locus (C) while another independent locus (O) is relevant for precursor production, with a knockout resulting in overall minimal phytocannabinoid levels (Figure 4) (de Meijer et al., 2009a, 2009b).
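The single-locus model summarized above amounts to a lookup from genotype to predicted chemotype. The Python sketch below encodes only the combinations stated in the text. The function name `predicted_chemotype`, the string labels for the alleles and the boolean `o_knockout` flag (which collapses the O locus into a yes/no state) are hypothetical simplifications introduced for illustration. Genotypes the text does not describe, such as BT/B0, deliberately return `None` rather than a guess.

```python
from typing import Optional, Tuple

# Genotype at locus B -> chemotype, as stated in the model.
# A frozenset makes the lookup independent of allele order,
# and a homozygote such as BT/BT collapses to {"BT"}.
B_LOCUS_CHEMOTYPE = {
    frozenset(["BT"]): "I",        # BT/BT: THCA synthase only
    frozenset(["BT", "BD"]): "II", # BT/BD: both synthases (codominant)
    frozenset(["BD"]): "III",      # BD/BD: CBDA synthase only
    frozenset(["B0"]): "IV",       # B0/B0: no functional synthase, CBGA accumulates
}


def predicted_chemotype(b_alleles: Tuple[str, str],
                        o_knockout: bool = False) -> Optional[str]:
    """Predict the chemotype under the classical B/O locus model.

    b_alleles  - the two alleles at locus B, e.g. ("BT", "BD")
    o_knockout - ASSUMPTION: True if precursor production (locus O) is
                 knocked out, giving minimal phytocannabinoid levels
    Returns None for genotypes not covered by the model as summarized here.
    """
    if len(b_alleles) != 2:
        raise ValueError("a diploid genotype needs exactly two alleles")
    if o_knockout:
        return "V"
    return B_LOCUS_CHEMOTYPE.get(frozenset(b_alleles))
```

For example, `predicted_chemotype(("BT", "BD"))` yields type II, and any genotype combined with `o_knockout=True` yields type V. The independent locus C for CBCA synthase is left out of this sketch.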
The genetic basis of the chemotypes was analysed in detail by crossing high-THC Purple Kush (chemotype I) with low-THC Finola (chemotype III). This resulted in an F1 generation of mainly type II plants, producing both THCA and CBDA (Weiblen et al., 2015). This confirmed earlier findings from crosses between type I and type III plants, which resulted in intermediate type II individuals (de Meijer et al., 2003). The segregation of phytocannabinoid profiles in the F2 generation pointed towards a Mendelian inheritance pattern: type I, type II and type III plants were all observed in the F2 generation in the expected ratio of 1:2:1 (de Meijer et al., 2003; Weiblen et al., 2015). The expression of either THCA or CBDA synthase was also found to correlate with the respective chemotype, and the THCAS/CBDAS locus could be mapped (Weiblen et al., 2015).
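The 1:2:1 expectation follows directly from selfing or intercrossing BT/BD heterozygotes under the single-locus model, and observed F2 counts can be compared with it by a chi-square goodness-of-fit test. The Python sketch below shows both steps using only the standard library. The function names are hypothetical, and any counts passed to `chi_square_1_2_1` in an example are made up for illustration; they are not data from the cited crosses. The p-value uses the closed form for two degrees of freedom, p = exp(-x/2).

```python
import math
from collections import Counter
from itertools import product


def f2_genotype_counts(parent1=("BT", "BD"), parent2=("BT", "BD")):
    """Enumerate the Punnett square for a cross of two diploid parents.

    Each parent contributes one of its two alleles with equal probability,
    so the four allele combinations are equally likely.
    """
    return Counter(tuple(sorted(pair)) for pair in product(parent1, parent2))


def chi_square_1_2_1(n_type1, n_type2, n_type3):
    """Chi-square goodness-of-fit test of observed counts against 1:2:1.

    Returns (statistic, p_value). With three classes there are two degrees
    of freedom, for which the survival function is exactly exp(-x / 2).
    """
    observed = [n_type1, n_type2, n_type3]
    total = sum(observed)
    if total == 0:
        raise ValueError("at least one plant must be observed")
    expected = [0.25 * total, 0.5 * total, 0.25 * total]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-statistic / 2)
    return statistic, p_value
```

Calling `f2_genotype_counts()` gives one BT/BT, two BT/BD and one BD/BD combination, which is the 1:2:1 ratio of type I, II and III plants. A large p-value from `chi_square_1_2_1` means the counts are compatible with this ratio; it does not by itself prove that a single locus is responsible.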
However, although these findings were consistent with the idea of codominant alleles at a single locus, it became apparent that the situation is more complex (Grassa et al., 2018; Laverty et al., 2019; Weiblen et al., 2015). New draft genomes generated with third-generation sequencing technology indicated that the THCA and CBDA synthases do not seem to be encoded by alleles of one and the same gene, but rather by distinct loci in marijuana and hemp, respectively, without a clear counterpart in the other genome (Grassa et al., 2018; Laverty et al., 2019). Sequencing of the hemp cultivar ‘Finola’ and the marijuana cultivar ‘Purple Kush’ indicates that a functional CBDA synthase gene is present only in the ‘Finola’ genome, while the ‘Purple Kush’ genome encodes only a functional THCA synthase (Laverty et al., 2019). Although the respective synthase genes map to approximately the same region in both genomes, the DNA sequences surrounding them are drastically different from each other. Further, a low but still detectable recombination rate between the two loci supports the notion that they are genetically distinct (Laverty et al., 2019). The sequencing of a different Cannabis variety (‘CBDRx’), which is a chemotype III hemp-marijuana hybrid, revealed an even more complex genomic arrangement with a number of pseudogenes and functional synthase genes in three different cassettes on the same chromosome (Figure 5) (Grassa et al., 2018).
The CBDA and THCA synthase genes themselves seem to be embedded in cassettes of multiple tandem duplications of putatively non-functional synthase genes, which are regularly interspersed with long terminal repeat (LTR) retrotransposons, making the assembly and analysis of these loci even more challenging (Figure 5) (Grassa et al., 2018; Laverty et al., 2019). This is also the reason why these complex loci could not be resolved in the first published Cannabis genome, which relied on short-read sequencing data (van Bakel et al., 2011). This genomic constitution, in which the difference between marijuana and hemp comes down to a large structural variation, is, if true, very unusual. Hence, the aforementioned locus “B” with its different alleles might look very different from the simple isoforms of a single gene that were previously assumed.