R2R3-MYB Transcription Factor Family in Boehmeria nivea (L.) Gaudich: Potential Associations between Cd2+ Stress and Flavonoid Metabolism


 Background: The characterization of gene families especially MYB being the largest transcription factor (TFs) families in plants is a crucial step for functional studies. The completion of ramie genome sequencing provides a great opportunity to investigate the organization and evolutionary traits of ramie MYB genes at the genome-wide level. Results: A total of 105 BnGR2R3-MYB genes were identified in ramie and divided into 35 distinct subfamilies according to phylogeny divergence and sequences similarity. These genes were unevenly distributed among 14 chromosomes. Collinearity analysis showed that the segmental and tandem duplication events is the dominant form of the gene family expansion, and duplications prominent in distal telomeric regions. 88 BnGR2R3-MYB genes showed syntenic relationship with those in Apocynum venetum, followed by Cannabis sativa (58), Arabidopsis (53), maize (8) and rice (4). 103 of the 105 BnGMYB proteins were predicted nuclear subcellular, the remaining two were in either chloroplast or cytoplasm. The combination of transcriptome data and phylogenetic tree allows us to propose some powerful MYB candidates that might be involved in the regulation of secondary wall-associated cellulose synthases (BnGMYB14) secondary cell wall thickening (BnGMYB66/67) and flavonoids synthesis (BnGMYB60). The MYBs subgroups involve in regulating anthocyanin were different from arabidopsis and tomato pointing that BnGMYB in other groups play a role in regulating anthocyanin synthesis. qPCR results revealed 8 MYB TFs candidate genes for cadmium stress in ramie. There was an increased in synthesis of procyanidins under the cadmium stress, which suggest a new regulatory pathway in response to the stress. The predicted network identifies the interface between flavonoid metabolic pathways and adversity stress, and found evidence for the involvement of flavonoid synthetic pathways in the stress regulation.Conclusions: This work provides a basic understanding of BnGR2R3-MYB gene family characteristics and provides valuable information that may contribute in improving the tolerance of ramie to cadmium stress and fiber quality.


Background
Transcription factors (TFs) also called trans-acting factors are DNA-binding proteins that can speci cally interact with the cis-acting elements of eukaryotic genes and activate or inhibit the transcription of genes.
From the perspective of protein structure, these TFs are generally composed of four functional regions: DNA binding domain, transcription regulation domain (including activation domain or inhibition domain), oligomerization site, and nuclear localization signal [1]. Structure determines function, so most transcription factors can be classi ed into different gene families according to their conserved sequence structure. The MYB TFs, which are distinguished by the highly conserved MYB domain, comprise one of the largest TF families in the plant kingdom [2] and are usually composed of one to four adjacent imperfect tandem repeats which share a highly conserved MYB binding domain at the N-terminus. Each repeat is approximately 50-53 amino acid residues in length, containing three α-helices that together form a helix-turn-helix (HTH) secondary structure that interacts with the major DNA at the speci c recognition site C/TAACG/TG during transcription [3]. On the contrary, the C-terminal region is highly variable that leads to the wide range of regulatory roles of the MYB gene family [4]. According to the number of the adjacent MYB repeats, MYB genes are distinguished into the following four subfamilies: 1R-MYB, R2R3-MYB, R1R2R3-MYB and 4R-MYB. Generally, the R2R3-MYB members, which have diverse functions, are the most extensively studied [5].
The MYB TF is present in all eukaryotes. Zea mays C1, the rst plant MYB gene to be cloned and functionally characterized, is responsible for regulation of anthocyanin biosynthesis [6]. Nowadays, with the advance of plant genome-wide association and molecular biology methods, an increasing number of studies have shown that the MYB proteins are involved in the regulation of plant primary and secondary metabolism, the control of plant development, and the partake in response to various biotic and abiotic stresses [7][8][9][10]. For example, CsTSI regulates the biosynthesis of theanine in different organs and tissues of tea plants [11]. While CaMYB31 is reported to involve in regulation of biosynthesis of capsaicinoids in Capsicum annum [12], DkMyb4 controls proanthocyanidin biosynthesis in persimmon [13]. A number of MYB TFs regulating anthocyanin biosynthetic pathway have been identi ed from many species, such as AtMYB75, AtMYB90, AtMYB113 and AtMYB114 in Arabidopsis [14][15][16], CsRuby1 and CsMYB3 in citrus [17] and SlMYB12 in tomato [18]. Many of the reported MYB TFs have been shown to control cellular morphogenesis, secondary cell wall biosynthesis, meristem formation and cell cycle regulation [19][20][21][22]. A rice R2R3-MYB gene (OsCSA) directly triggers expression of sugar partitioning and metabolic genes during pollen and seed development [23]. GbMYB5 confers drought tolerance in cotton and transgenic tobacco plants [24]. Another rice MYB (OsMYB3R-2) exhibited enhanced cold tolerance by alteration in cell cycle and ectopic expression of stress genes [25].
Ramie is one of the most important ber crops in the family Urticaceae. It is also widely used as feed and other industrial raw materials [26][27][28]. The completion of the genome sequencing project at the chromosome level of ramie provides a valuable platform for revealing the genomic fabric of ramie R2R3-MYB gene family and exploring the evolutionary characteristics among different species [26,29]. R2R3-MYB transcription factor plays an important role in plant speci c processes making the MYB TFs a major research topic in many plants. Despite that, very limited reports are present to date on the functional characteristics of R2R3-MYB transcription factor in ramie. Therefore, this study was carried out to elucidate the function, evolution and expression pro le of BnGR2R3-MYB transcription factor in ramie based on sequencing data. Comprehensive investigations involving gene structures, gene duplications, chromosomal locations, phylogeny and cis-acting elements of the BnGR2R3-MYB genes were performed.
Expression pro les under cadmium stress and different stages of ber development were used to focus and discuss the regulatory role of BnGR2R3-MYB genes.

Materials And Methods
Identi cation of MYB protein in Boehmeria nivea (L.) Gaudich To identify more comprehensive MYB TFs, 168 MYB family sequences and 97 MYB-related family sequences (A. thaliana), 208 MYB family sequences and 85 MYB-related family sequences (G. arboreum), 130 MYB family sequences and 106 MYB-related family sequences (O. sativa) were retrieved from the PlantTFDB (http://planttfdb.gao-lab.org/). Protein sequences of these plant species were used as reference to lter the possible sequence of target specie by Blast Wrapper, and the expectation cut-off value (E-value) of 0.01 set as threshold signi cance. Subsequently, the MYB sequences obtained were aligned to Swissport Database / Reference Species Whole Protein Sequence Library, and screened by annotation information. The resulting datasets of MYB sequences were con rmed based on the completeness of MYB domains using Pfam, CDD and SMART with an E-value of 1e-2. To correct for deletion of some conserved sites, we used FGENESH-M (http://www.softberry.com/berry.phtml) to predict multiple variants potential genes in genomic DNA. All ramie R2R3-MYB proteins were manually inspected to ensure that the putative gene models contained two complete MYB domains and ultimately identi ed and classi ed all members of the MYB family in Boehmeria nivea. Length of sequences, molecular weights, GRAVY and pI of MYB proteins were analyzed using the ExPASy online tool[80]and subcellular localization using the pLoc-mPlant tools for Batch Prediction.

Sequence analysis and structural characterization of BnGR2R3-MYB genes
Multiple sequence alignments of the MYB domains sequences were performed using MEGA-X with default parameters. The deduced amino acid sequences in MYB motifs were then adjusted manually with Jalview software. WEBLOGO was used to show up the sequence logos of R2 and R3 MYB domain repeats with default setting. The exon-intron organizations of the BnGR2R3-MYB genes, including intron distribution patterns, phases, intro-exon boundaries and highlighted region of the MYB domains (R2, R3), were graphically displayed by TBtools [81]. The conserved motifs of the ramie MYB proteins were predicted by using the MEME[82] version 5.1.0 online program, with optimized parameters: zero or one per sequence; the maximum number of motifs, 20 respectively, and visualized by using TBtools.
Analyses of chromosome distribution, gene duplication and synteny for BnGR2R3-MYB genes  [42]) were obtained based on the description in corresponding literatures, and adopted the same method as above.
To detect WGD events, the paralogous genes of ramie were detected using all-vs-all homology searches in BLASTP with an E-value threshold of 1e-5. Syntenic blocks within a genome were identi ed based on the detected homologous gene pairs using MCscan.
Identi cation of the cis-elements of BnGR2R3-MYB genes and GO annotation To investigate the promoter regions of the BnGR2R3-MYB genes, 1000 bp genomic DNA upstream sequences of each of the 105 BnGR2R3-MYB genes were selected and the cis-elements predicted using PlantCare (http://bioinformatics.psb.ugent.be/web tools/plantcare/html/). Response class elements such as light-responsiveness, plant growth, stress-responsiveness and phytohormone-responsiveness were ltered and preserved. The cis-acting elements were visualized in TBtools, and each elements classi ed and the result displayed using heatmap. Functional annotation of BnGMYB proteins was performed in Blast2GO Tool by rst blasting in BLASTP using Swissport database, followed by mapping Gene Ontology (GO) terms, and nally visualization with R ggpolt2.
Expression patterns of BnGR2R3-MYB genes in the representative tissues of the ramie To explore the expression patterns of the BnGR2R3-MYB genes, 9 samples involving different tissues at different developmental stages were used, RNA-Seq data was obtained from Genome Sequence Archive database and transcript abundance of the BnGR2R3-MYB genes was calculated as fragments per kilobase of exon model per million mapped reads (FPKM). Log2(FPKM + 1) values were displayed according to the color code. Detailed FPKM values were listed in Table S7.
Stress treatment (Cd + 2 ) of ramie under hydroponic conditions Two ramie varieties (ZZ_1; HX_1) were used for cadmium stress treatment. Shoots from each of the varieties at similar growth stage during the same period were selected for hydroponic cutting according to

Prediction of BnGMYB protein-protein interaction network
The orthologous relationships of eight selected cadmium-responsive MYB regulatory genes were determined between Arabidopsis thaliana and ramie using OrthoVeen2 [88]. Prediction of interactions between BnGMYB proteins and other proteins based on the Arabidopsis homologs was obtained using the online program STRING version 11.5 with high con dence > 0.700[89], ltered genes were used to construct the correlation network. The interaction network was visualized by Cytoscape v3.8.2.

RNA isolation and expression analysis of BnR2R3-GMYB genes
Total RNA was extracted using EasySpin Plus plant RNA rapid extraction Kit (Aid-lab Biotechnologies Co., Ltd). The RNA was reverse-transcribed (Thermo Scienti c RevertAid First cDNA Synthesis Kit) into cDNA and quantitative RT-PCR (qPCR) analysis conducted using gene speci c primers (Table S8). 18s gene was used as internal control. The qPCR was conducted using SYBR Green Premix Pro Taq HS qPCR Kit (Accurate Biotechnology Co., Ltd) on a CFX96 Touch Deep Well Real-Time PCR System (Bio-Rad) according to standard procedure. Relative transcript levels were calculated using the 2 -ΔΔCt formula and the result displayed using histograms drawn with GraphPad Prism 8 software and all the histograms merged using Adobe Photoshop (2020) software. All qPCR analyses were performed with three biological and technical replications.

Results And Discussion
Identi cation and sequence feature of ramie MYB genes To identify the MYB family genes in ramie genome, the amino acid sequence of hidden Markov model (HMM) pro le of the Pfam MYB domain (PF00249) was used as a query in BLASTP searches. 245 deduced amino acid sequences that contain MYB or MYB-like repeats were examined. The redundant sequences and MYB genes with incomplete ORFs were corrected by FGENESH-M. Subsequently, 211 sequences were further examined to verify the presence of the MYB domains on the basis of the PROSITE, Pfam and NCBI-CDD analyses and a total of 200 non-redundant ramie MYB proteins were identi ed which including 89 MYB-related proteins (1R-MYB), 105 R2R3-MYB proteins (2R-MYB), 5 R1R2R3-MYB proteins (3R-MYB) and 1 4R-MYB proteins. According to previous reports, R2R3-MYB family members not only represent more than half of the proportion of the total MYB proteins, but also contain most functional genes, and thus constitute the functional group of MYB genes. Therefore, only the R2R3-MYB family members were further analyzed in this study.
Based on their order on the chromosome, the 105 BnGR2R3-MYB genes were renamed BnGMYB1 to BnGMYB102 and also BnGMYB103 -BnGMYB105 for the three MYB genes (Maker00022368, Maker00031449, and Maker00022368) which could not be conclusively mapped to any chromosome (Table S1). BnGR2R3-MYB genes were further characterized by computing such information as length of the CDS (Coding Sequence), the length of the protein sequence, the protein molecular weight (MW), isoelectric point (pI) and the subcellular localization (Table S1). Among the 105 BnGMYB proteins, BnGMYB52 was the smallest protein with 135 amino acids (aa), whereas the largest one was BnGMYB1 (593 aa). The MW of the proteins ranged from 15.6 to 65.0 kDa, and the pI ranged from 4.56 (BnGMYB23) to 10.08 (BnGMYB96). The predicted subcellular localization results showed that 103 of the 105 BnGMYB proteins were located in the nuclear region, whereas the remaining two were in either chloroplast or cytoplasm.
R2, R3 repeats are signi cantly conserved sequences within the MYB domain regions. To investigate the homologous domain sequence features and the frequency of amino acids at each position into the ramie R2R3-MYB domains, multiple alignments (MEGA 7.0) and sequence logos were constructed ( Fig. 1). In general, R2R3-MYB conservative region contains 108 basic residues (including the linker region) on average. The results revealed that 11 and 6 completely conserved amino acid residues were identical among all BnGMYB proteins detected in the R2 and R3 MYB repeat regions, respectively (Fig. 1). These included the most prominent series of evenly distributed and highly conserved triplet tryptophan (Trp., W) residues in each repeat, and the characteristic W residues was located at positions 9, 29 and 49 of the R2 repeat and 28 and 47 of the R3 repeat in ramie. These characteristic amino acids are known to play an important role in sequence-speci c DNA binding, and are considered as landmarks of plant MYB proteins [30].
R2 repeats contain three fully conserved tryptophan residues; for 101 out of the 105 BnGMYB proteins, the R3 repeat sequences contained two Trp residues, whereas, these conserved sequences were not immutable, three R3 repeats (BnGMYB66, BnGMYB56, BnGMYB67) only contained rst Trp residues, the last Trp was replaced by Tyrosine (Tyr., Y) or phenylalanine (Phe., F). The results were consistent with those of other species [31,32]. It has been reported that the insertion of a leucine residue (Leu., L) between the second and third helices of the R2 repeat represents the evolutionary relationship of the R2R3-MYB domain [33]. In this study, 83 (79.05%) BnGR2R3-MYBs were observed to have Leu-38 inserted at the same site (Table S2). As shown in Fig. 1, the highly conserved amino acid residues of the MYB domain were mainly distributed near the third conserved Trp residues indicating that the last HTH domain of the MYBs was the most conserved among the 105 BnGMYB proteins.

The classi cation, gene structure and motif composition of BnGMYBs
We performed a phylogenetic analysis of R2R3-MYB proteins using neighbor-joining (NJ) and maximum likelihood (ML) methods with 1000 bootstrap replicates. The tree topologies obtained using the two methods were largely similar with a very few variations in the gene classi cation ( Fig. 2a and Figure S1).
Since the ML method can choose the optimal alternative model and optimize the evolutionary tree with certain topological structure and branch length, we adopted it for further characterization. The R2R3-MYB members of ramie were subdivided into 30 subgroups (designated A1-A30 in this study) using Arabidopsis MYB proteins [4] as reference based on the phylogenetic analysis result (Fig. 2a), each subgroup included 2-8 members and most subgroups were supported with high bootstrap value. Most of the large subgroups in our classi cation were consistent with those of Arabidopsis, and only one small subgroup (AtMYB24; S19) not retrieved from the previously constructed phylogenetic trees of Arabidopsis MYB protein. Nine clades only contained ramie R2R3-MYB members, and four did not t into any subgroup.
To investigate the relationship between gene structural function and evolution, we analyzed the structural diversity of the BnGR2R3-MYB family, which showed genes having the same genetic structure clustering together as obtained in the phylogenetic tree. Most of gene coding sequences were disrupted by introns, except for almost all members from A30 subgroup, ungrouped BnGMYB27/37 and BnGMYB72 (Fig. 2c). The number of introns ranges from zero to eleven, but most ramie MYB genes possessed three exons and two introns in line with previous reports [34]. Interestingly, the majority of intron insertions occur in R2, R3 conserved domains indicating the important role of these conserved domains in plants. In conclusion, the highly conserved gene structure and domain in uence each other, which provided important evidence for the division of subgroups.
MYB conservative motifs as obtained using MEME online tool with the aim of unraveling diversi cation of these BnGR2R3-MYB genes (Fig. 2b) revealed the presence of 20 conserved motifs in the ramie MYB proteins (Table S3). Similarly, we also found that the protein architecture was conserved within a speci c subgroup. Phylogenetic tree, gene structure, and motif analysis suggested that MYB proteins within the same subgroup were likely to share similar functions. In addition to the conserved motif representing R2R3, other similar motifs were shared by the BnGR2R3-MYB members within the same subgroup with motifs such as 11/17/19 found to be unique in A25, A23 and A1 respectively, while motif 8/9/12/18 were unique in several subgroup, which strongly supporting our subgroup classi cations.
Chromosomal distribution and synteny analysis of BnGMYB genes As shown in Fig. 3, ramie R2R3-MYB genes were distributed throughout all 14 chromosomes, thought it was uneven with Chr10 having the largest number (14) of the R2R3-MYB genes followed by Chr4 (13) and only 2 genes were found on chr12. No signi cant correlation was obtained between the chromosome length and the number of BnGR2R3-MYB genes, though, it's concentrated on almost every chromosome, and on the top and bottom of chromosome (Fig. 3). This was in line with the theory that the closer the more prone to cross both ends of the chromosomes, ectopic, and other mutations [35].
The segmental and tandem duplication events is the dominant form of the gene family expansion [36]. Based on BLASTP and MCScanX methods, we investigated the gene duplication events. Reference to Holub[37] for a description of tandem duplication, among the BnGR2R3-MYB genes, the blue lines in Fig. 3 showed four pairs of tandemly duplicated genes (BnGMYB32/33, BnGMYB51/52, BnGMYB61/62, BnGMYB93/94) located in chr5, chr7, chr8, chr13, respectively, while the red lines showed 39 segmental (37.1%) duplication pairs between BnGR2R3-MYB genes (Table S4) including two MYB-related genes (Maker00025472; Maker00000772) and one 3R-MYB gene (Maker00072694). Some BnGMYBs located on chr13, chr6 and chr4, such as BnGMYB91, BnGMYB90, BnGMYB89, BnGMYB21, BnGMYB44 and BnGMYB45, were involved in multiple duplication events. Interestingly, chr12 (BnGMYB89), chr13 (BnGMYB90/91) and chr6 (BnGMYB45) were interrelated with one another in a segmental duplication manner, all segmentally duplicated genes belonged to A30,and in the same cluster as Arabidopsis AtMYB45 (Fig. 2a) making us speculate that this speci c population of R2R3-MYB genes duplication in ramie might be closely related to photomorphogenesis [38], and might play a vital role in the growth and development of the plant.
The positive pressure criteria were selected based on M. Lynch[39]: Ka/Ks < 1 stands for purifying selection, Ka/Ks = 1 means neutral selection, while Ka/Ks > 1 signi es positive selection. We found that there was a high sequence divergence value among the four segmental gene pairs. These sequences diverged greatly and evolution distance was long, owing to the occurrence of a large number of synonymous mutations. The remaining segmental and tandem duplicated BnGMYB gene pairs were Ka/Ks < 1 except BnGMYB66/67 (Ka/Ks = 1.04). These results suggest that the ramie R2R3-BnGMYB gene family had evolved under the effect of purifying selection (Table S4).
To further explore the potential evolutionary mechanisms of BnGR2R3-MYB genes, we constructed ve comparative syntenic maps of ramie with ve other plants, three from which are dicots (A. thaliana, C. sativa, A. venetum) and two monocots (rice and maize) (Fig. 4). The numbers of orthologous pairs between the ramie and A. venetum, C. sativa, A. thaliana, O. sativa and Z. mays were 88, 58, 53, 8 and 4 respectively. Collinearity analysis showed that ramie and A. venetum and C. sativa had the most collinearity segments, which was consistent with the phylogenetic divergence time estimation which supported the placement of ramie within the same clade as C. sativa [29]. Some of the BnGMYBs were found to be associated with multiple syntenic gene pairs, particularly BnGMYB89 and four A. thaliana genes, suggestion that these genes might have played an important role in MYB gene family during evolution. Remarkably, the number of collinear gene pair between ramie and A. venetum, C. sativa, and A. thaliana (dicotyledonous) was far greater than between ramie and O. sativa/Z. mays (monocotyledonous)[29, 40, 41]. 19 BnGMYBs identi ed in dicotyledonous plant were absent in monocotyledonous plant, suggestion that these orthologous pairs (19 BnGMYBs) formed after the divergence of dicotyledonous and monocotyledonous plants. Additionally, one collinear pair (BnGMYB64) was identi ed in all the six detected plants, indicating that these orthologous pairs might had already existed before the ancestral divergence (Table S5).
Comparative phylogenetic analysis of BnGMYB and R2R3-MYB family from six different plant species A Neighbor-joining (NJ) phylogenetic tree was constructed using the 105 full-length R2R3-MYB protein sequences of ramie and those from Arabidopsis (126), rice (81), maize (156), tomato (121) and pineapple (93). Due to the large number of proteins (682), only condensation tree and statistics were provided (Fig.  5). The complete version was shown in Figure S2. The phylogenetic analysis generated 35 clades (C1-C35) among six species and the numbers of the MYB members in each species were listed. The fact that this tree was in good agreement with the classi cation results of Arabidopsis [4] and other plants [31,32,42] demonstrated the reliability of the data. Moreover, the comprehensive phylogenetic analysis of the R2R3-MYB proteins in these six species might provide more evolutionary historical clues. For example, monocotyledon proteins in the same group were often found clustered together, while MYB proteins from dicotyledonous plants were clustered side by side in lateral branches. These data were also in the order Poales. Meanwhile, several clades were found only in some particular species. For example, C5, C11, and C16 not exited in rice, maize, and pineapple. C31 was only found in monocotyledons, but not in dicotyledons. This indicated a great evolutionary distance between monocotyledons and dicotyledons, and these MYB genes might not be present prior to monocotyledonous genomic differentiation.
Most of the clades contained members from all six plants, suggesting that these genes were present in the common ancestor before the species diverged, while each clade had a different number of representative MYB proteins suggesting that MYB genes expansion may be more active in some plant species, and this expansion inequality may be related to the well conserved chromosome karyotype of ramie [43].
Interestingly, A subgroup (C24) containing all members of other species but ramie may have disappeared during the evolutionary course of ramie,and their function need further exploration. C12 contains only Arabidopsis members, further demonstrating that the Arabidopsis corresponding S12 subgroup was a speci c clade of Arabidopsis that is involved in the regulation of Glucosinolates and Camalexin biosynthesis which is consistent with the results of the Prunus salicina MYB study [31]. Maker00077618 (BnGMYB33), Maker00077519 (BnGMYB32) in ramie were closely clustered with AtMYB103 (from Arabidopsis), SlMYB52 (from tomato) which negatively regulate endoreduplication during trichome initiation and expansion, suggestion that the two genes in ramie might also involve in trichomes and tapetal cells functions. In addition, the annotation in the reference model plant showed the unique functions of each clade and provided a direction for the functional identi cation of ramie MYB proteins [4,38,44].
Polyploidy and subsequent genes loss exist in most species is an important driving force of species evolution. Through comparative genomic analysis, it could be inferred ( Figure S3) that ramie has undergone at least one round of genome-wide duplication events. Phylogenetic analysis of MYB gene family showed that the number of BnGMYB members in some clades was consisted with rice and maize. Ramie also undergone the genome-wide duplication event common to rice [45] and maize [46].
Analysis of the cis-acting elements of the BnGR2R3-MYB genes and Gene ontology annotation The upstream promoter regions (1000bp) of all BnGR2R3-MYB genes were predicted to better understand the functions of the R2R3-MYB genes in the ramie plants. 41 responsive cis-acting elements obtained were categorized into 4 groups ( Fig. 6 and Table S6), which made up of 22 light responsive elements, 8 phytohormone responses, 6 plant growth development-related responses and 5 abiotic stresses (adversity). Light response elements being the most abundant are present in all genes. Box-4, G-box (light responsiveness), ARE (anaerobic induction) and ABRE (abscisic acid responsiveness) were the most prominent elements, and more than 100 were found in the 105 BnGR2R3-MYB genes, suggesting that these elements might play an important role in regulating gene expression.
A total of 65 (61.9%) genes were found to contain ABRE. 118 CGTCA-motif and TGACG-motif were involved in the response to MeJA and as well 77 TGA-element and TCA-element involved in the response to auxin. These results indicate the potential function of BnGR2R3-MYB genes in response to hormonal stresses.
Almost all BnGR2R3-MYB genes as obtained in this study contained several cis-acting elements associated with abiotic stresses such as drought-inducibility, anaerobic induction, low-temperatureresponsive, defense and stress-responsive, and wound-responsive which were similar to what was reported [47]. The abiotic stress response elements were mostly clustered, which might be related to the function of gene groups. For example: A29 and A30 groups of phylogenetic trees in the red line in Fig. 6 (corresponding to the grouping C32 and C33 constructed by NJ method in Fig. 5) constructed by ML method showed concentrated distribution of abiotic stress response elements, which were consistent with their functional clustering. This further illustrates the accuracy of the results.
Plant growth elements were dispersed (Fig. 6), and CAT-box and O2-site were the main responses to plant growth and development. Summarily, the overall results indicated that BnGR2R3-MYB genes play an irreplaceable role in plant growth and development, stress and phytohormone responses.
Gene ontology (GO) annotations of 105 BnGMYB proteins were obtained using protein blast in Blast2GO tool. Based on the identi ed GO terms, the involvement of the identi ed BnGMYB proteins in biological processe (BP), cellular component (CC) and molecular function (MF) were shown in Fig. 7[48]. The GO term "binding" (GO: 0005488) best described the greatest number of genes (96, 92.38%), and "singleorganism developmental process" (GO: 0044767), "developmental process" (GO:0032502) were signi cantly enriched in top 20 of biological process. These GO annotations of BnGMYB proteins were in agreement with the experimental ndings in other species [49][50][51].

Expression pro ling of BnGR2R3-MYB genes with RNA-seq in different tissues
To explore the temporal and spatial expression patterns of BnGMYBs, we analyzed their transcript levels using transcriptome data of 9 different tissues (phloem, root, leaf) or developmental stages of ramie ( Fig. 8 and Table S7a). The results showed most BnGMYBs have different expression patterns, consistent with the results of other plants [52]. Three BnGMYBs can't detected in all nine samples, suggesting that they were pseudogenes, or only expressed in other tissues or special temporal (Fig. 8). We performed hierarchical cluster analysis using expression data of 102 other BnGMYBs from 9 different samples. BnGMYBs were categorized into three main groups, high expression, preferential expression, relatively lower expression. We found that the phylogenetic grouping was inconsistent with the expression grouping in most cases, indicating that the expression patterns (period, location) of genes with similar functions were signi cantly different. However, some closely related MYB genes showed highly similar transcript levels, for example, three members of A30 (BnGMYB89/90/91) were all grouped in high expression group. There were 57 genes expressed in all tissues, and 16 of them showed constitutive expression (FPKM > 2). In order to better understand the preferential expression of genes, we ltered genes based on meeting two set rules: FPKM > 2 (at least in one tissue); tissues with the highest expression levels should have twice as much expression in at least one of the other tissues. Based on this, 11 genes in phloem_third period, 19 genes in leaf of variety ZZ_1, 22 genes in terrestrial root were found to have exhibited preferential expression over the others ( Table S7c).
As a bast ber crop, ramie ber underwent successively co-growth with the stem. The expression level of Maker00025443 (BnGMYB14) increased gradually during the ve developmental stages of phloem (Fig.  8). The homologous gene AtMYB46/83 are gatekeeper of secondary wall biosynthesis in Arabidopsis, which directly regulates the gene expression of secondary wall-associated cellulose synthases [53,54]. Therefore, we speculated that BnGMYB14 was related to the synthesis of secondary cell wall in ramie. The bark thickness and phloem ber length were important yield traits of ramie and increased with the growth of ramie [29], which is consistent with the expression patterns of some MYB TFs. For example, Maker09720 (BnGMYB67), Maker00041853 (BnGMYB66) both clustered into C30 subgroup and also have higher homologies to AtMYB69, which was linked to regulating biosynthesis of lignin, xylan and cellulose, participating in secondary cell wall thickening [55]. The expression levels of these two genes in the fth cell wall thickening stage were signi cantly higher than those in the rst four stages. According to its expression pattern and phylogenetic relationship, the two genes could be involved in cell wall regulation during ramie ber development. In previous studies, we explored the mechanism of red leaf formation using transcriptomic analysis, based on which, we identi ed MYB TFs that regulate leaf color. It is also reported that, the highly conserved MYB-bHLH-WD repeat (MBW) transcriptional complex model plays an important role in regulating avonoid pigment pathway [56]. In this study, we found some MYB TFs related to avonoid syntheses which are expressed in HX_1 leaves more than in ZZ_1. C1 cluster subgroup have several experimentally veri ed anthocyanin regulators, such as AtMYB75 [57], AtMYB90 [15], AtMYB113/114[58] from Arabidopsis, SlMYB75 [14] from tomato and was the largest anthocyanin subgroup. However, no BnGMYB was classi ed into this subgroup, which must mean that MYB in other groups such as group C4 [59] and C6[60] also play a role in regulating anthocyanin synthesis, and further pointed to the fact that the mechanisms of anthocyanin synthesis regulation are differed with species. Furthermore, adequate avonoids and other components in ramie may be the reason for its mildew-proof and bacteriostatic effects[61, 62]. Maker00087487 (BnGMYB60, C4) was closely clustered with AtMYB12, which acted as an activator and enhanced the biosynthesis of avonoids in Arabidopsis [59,63], and its expression was higher in leaves of ZZ_1 than that of HX_1.
Thus, BnGMYB60 might be involved in the regulation of avonoids synthesis ( Table S7b). The above two substances represented the avanols synthesis pathway and proanthocyanin synthesis pathway[64, 65], respectively, and there may be a substrate competition relationship (antagonistic relationship) between them. Therefore, it was not di cult to understand that the expression of proanthocyanin biosynthesis related gene (C1) was high in the red variety, while the expression of avanols biosynthesis related gene (BnGMYB60) was high in the green variety. Further study should be conducted to verify these results.

Expression pro les of BnGMYBs under Cd + 2 stress
Increasing studies revealed that R2R3-MYB family genes were involved in various stress response. However, not much information is available about the MYB genes involvement into the cadmium tolerance response. It is well known that ramie has strong cadmium tolerance and accumulation ability, and is a promising plant for remediation of land polluted by heavy metal cadmium [66]. With aid of RNAseq expression pro le data combined with phylogenetic analysis, we preliminarily screened 11 MYB genes related to stress and proanthocyanidin synthesis in ramie and investigated the expression pattern of these MYB genes.
According to the results, BnGMYB gene showed different levels of expression under different cadmium treatment times, suggesting that they would be candidate genes for cadmium tolerance in ramie (Fig. 9).
Among the 11 genes investigated, no two genes showed exactly the same expression trend. Most of the genes were up-regulated, while some genes were down-regulated. BnGMYB41/104 in the stem and BnGMYB89 in the root remained unchanged. The expression trend of BnGMYB78 showed a two-stage pattern: its expression level in stem and leaf rst increased before decreased after the cadmium stress treatment, while the trend was reversed in the root. In previous studies,Liu et al[67] selected BnGMYB78 as a corresponding factor to cadmium stress. BnGMYB78 and BnGMYB28 belong to the same group, though their patterns of expression were inconsistent indicating the complexity of the response pathway of cadmium stress in ramie. Previous studies have reported that under cadmium stress, expression of AtMYB4 was induced in Arabidopsis and resulted in cadmium tolerance regulation via enhanced protection against oxidative damage and increases expression of PCS1 and MT1C[60]. Interestingly, BnGMYB41 and AtMYB4 are orthologous genes, and the expression of this gene in roots and leaves increased with the extension of cadmium treatment time reaching a signi cant level. This implied that BnGMYB41 might also activate the protective mechanism of oxidative damage in ramie, contributing to ramie cd-tolerance.
Coi ncidentally, the expression patterns of the 11 genes in the three tissues were consistent in the two ramie varieties, which indicated similarity in the response mechanism of different ramie varieties to cadmium stress. However, the response of many other genes such as BnGMYB28 in root; BnGMYB9 in stem; BnGMYB78 in leaf varies greatly between the two, with the degree of responses more intense in of HX_1 than in ZZ_1. This might be the reason for the differences in cadmium tolerance between the two.
Some genes like BnGMYB9/11/41/89 showed a bottom-up (root-stem-leaf) time sequential response pattern, which might be related to the direction of cadmium transport and signal transduction. All genes were up-regulated in leaves after cadmium treatment. BnGMYB10/11/12 were all up-regulated in all the three tissues, and their expression trends were very similar. BnGMYB10 has been identi ed as a cadmium stress response factor and belongs to the same C1 subgroup with the other two (BnGMYB11/12) and by extension, these three were found very close to the S5 subfamily of Arabidopsis thaliana, whose function was related to proanthocyanidin synthesis [4]. Therefore, we inferred those plants initiate proanthocyanidin synthesis under cadmium stress, and proanthocyanidin functions in alleviating the damage caused due to the stress via producing an antioxidant active substance [68] or produced a large number of reactive oxygens to alleviate.
According to report, AtMYB49 regulates cadmium accumulation[69], AtMYB59 regulates calcium signaling during plant growth and stress response [70]. In ramie, we found their respective homologous genes, BnGMYB9 and BnGMYB 104 and their expression patterns were similar to those of AtMYB49/59 under cadmium treatment [69,70], so we deduced that they also had similar functions. BnGMYB45/89/90 which fall to the C33 group were according to phylogenetic analysis valuable candidate for abiotic stress [71][72][73]. Taken together, these above ndings might be valuable for improving the cadmium resistance of ramie via the manipulation of the BnGMYBs.

BnGR2R3-MYB protein-protein interaction network prediction
To perform a functional analysis of BnGR2R3-MYB proteins, interaction network was generated using protein families based on their orthologues AtR2R3-MYB in Arabidopsis. Several interactions were predicted and we could preliminarily infer the possible function and potential contact of some BnGR2R3-MYB genes (Fig. 10). Previous studies have shown AtMYB4 (homolog of BnGMYB41), AtMYB12 (homolog of BnGMYB10) and AtMYB123 (homolog of BnGMYB12) involvement in the regulation of multiple avonoid synthesis pathway genes like F3H, FLS, C4H, ANR. These regulatory genes, in addition to their role avonoid biosynthesis also regulate some stress resistance-related proteins such as LPP2 (Lipid phosphate phosphatase 2). And AtMYB123 [74] which involved in the control of avonoid late metabolism in developing siliques also plays a key role in determining tissue-speci c activation of leucoanthocyanidin reductase, implying that the mechanism by which LPP2 responded to stress via attenuating the signal generated by phosphodiesterase was also present in ramie [75]. We found that there is a strong bidirectional interaction between SAD2 and AtMYB4. SAD2 functions in regulating abscisic acid (ABA)-mediated pathways in response to cold or salt stress in the form of an autonomous nuclear transport receptor or a junction-like protein [76,77]. Furthermore, multiple genes in the avonoid metabolic pathway interact with JOX4 (Jasmonate-induced oxygenase 4) involved in the oxidation of jasmonate (JA) and stress-induced synthesis of phytohormones [78,79], indicating an intrinsic link between avonoid metabolic pathways and stress.
It is worth mentioning that AtMYB30 (homolog of BnGMYB28), AtMYB49 (homolog of BnGMYB9) and AtMYB96 (homolog of BnGMYB78), which are associated with biotic and abiotic stresses, interact with an extremely rich set of resistance proteins including SIZ1 (participating in abiotic stress-induced sumoylation), RD22 (moisture stress), YAK1 (ABA-mediated, drought response), ABI5 (response to abscisic acid, chitin, gibberellin, salt stress, water deprivation), and nally connects the two possible coercive coping mechanisms at the TGG1 node (Fig. 10). Also, this veri es our later conjecture. In conclusion, the predicted network identi es the interface between avonoid metabolic pathways and adversity stress, and found evidence for the involvement of avonoid synthetic pathways in the stress regulation.         The relative expression levels of selected MYB genes in two ramie varieties in different tissue sites (root, stem, leaf) at different periods under Cd+2 treatment (0h, 6h, 12h, 24h, 48h). A represents root; B represents stem; C represents leaf. The error bar represents the standard deviation of the three biological duplicates. * represents signi cant difference (P<0.05), ** represents extremely signi cant difference (P< 0.01).