DISTRIBUTION AND CLASSIFICATION OF BZIP TRANSCRIPTION FACTORS
At least 64 families of transcription factors have been identified in plants (Pérez-Rodriguez et al., 2010). Transcription factors are grouped into families according to their DNA-binding domains, such as bZIP, NAC, MYB, EREBP/AP2, and zinc-finger. Among these, bZIPs, named for their basic region/leucine zipper motif, are widely found in humans, other animals, plants, insects, and microorganisms. To date, a large number of bZIP transcription factors have been identified in almost all eukaryotes. In plants, 77, 89, 247, 92, 89, 69, 125, 64, 55, and 114 bZIP transcription factors have been found in Arabidopsis thaliana, Oryza sativa, Glycine max, Sorghum bicolor, Hordeum vulgare L., Solanum lycopersicum, Zea mays, Cucumis sativus, Vitis vinifera, and Malus domestica, respectively (Baloglu et al., 2014; Corrêa et al., 2008; Li et al., 2015, 2016; Liu et al., 2014b; Nijhawan et al., 2008; Pourabed et al., 2015; Wang et al., 2011; Wei et al., 2012; Zhang et al., 2018). By contrast, only 25, 21, and 21 bZIP transcription factors were found in yeast, nematode, and fruit fly, respectively (Riechmann et al., 2000). Compared with other eukaryotes, plants have more bZIP homologs, and the amino acid sequences of these homologs are more conserved (Ali et al., 2016). Studies have shown that the structure of a bZIP protein is closely related to its biological function. Jakoby et al. (2002) used MEME (Multiple EM for Motif Elicitation) to analyze a large number of bZIP transcription factors in Arabidopsis thaliana. Based on the characteristics of both the bZIP domain and other conserved motifs, the 75 bZIPs in Arabidopsis thaliana were classified into 10 subfamilies: A, B, C, D, E, F, G, H, I, and S. Using a similar method, the bZIP transcription factor family genes of other plants have also been categorized. For example, the 131 bZIP transcription factors isolated from the soybean genome were divided into the same 10 subfamilies, A-I and S (Liao et al., 2008).
The 89 members of the bZIP transcription factor family in rice were also divided into 10 subfamilies, although subfamily S was replaced with J (Nijhawan et al., 2008). Most of these bZIP subfamilies therefore appear to be conserved among different plants. Corrêa et al. (2008) identified what are possibly the complete, non-redundant sets of bZIPs in rice (92 proteins) and in black cottonwood (89 proteins). Based on similarities in both the bZIP domain and other conserved motifs, these bZIPs, together with the 77 bZIPs from Arabidopsis, were categorized into 13 subfamilies (A, B, C, D, E, F, G, H, I, J, K, L, and S), of which J, K, and L were newly added.
With advances in bioinformatics, a growing number of conserved motifs other than the bZIP domain have been identified and used to define bZIP subfamilies. As a result, the classification of bZIP transcription factors has become increasingly refined. In recent years, there have been increasing reports on the mechanisms by which various bZIPs regulate different stress responses (Hwang et al., 2014; Ji et al., 2013; van Leene et al., 2016; Liu et al., 2012; Tsugama et al., 2016; Wang et al., 2019; Zhang et al., 2017a, b). Because many functionally annotated bZIPs have now been assigned to known subfamilies, the specific roles of bZIPs in different subgroups might also be mapped to corresponding biological pathways.