Genome-level analysis of BpR2R3-MYB family genes transcribed in seedlings of Betula platyphylla and BpR2R3-MYB15 enhanced flavonoid production

Flavonoids have a wide range of biological activities in plant development, stress resistance and human health, etc. R2R3-MYBs are one of the key elements in regulation of flavonoid production, but their functional importance in Betula platyphylla remains elusive. The full-length transcriptome data of 30-day-old seedlings of Betula platyphylla were used to identify BpR2R3-MYB family genes, and their gene structure, chromosome distribution and syntenic relationships were predicted by bioinformatics methods. Agrobacterium-mediated transient transformation was used to verify the function of BpR2R3-pMYB15 in flavonoid production. 44 BpR2R3-MYB family genes expressed in seedlings of Betula platyphylla were identified and found to be unevenly distributed in 11 chromosomes. Among them, 90.90% of the BpR2R3-MYBs had introns, and only four genes had no introns. Five gene pairs with segment duplication were found, and their Ka/Ks ratios were less than 1. Thirty orthologs between Betula platyphylla and Arabidopsis thaliana and 68 orthologs between Betula platyphylla and Populus trichocarpa were detected. Five BpR2R3-MYBs were clustered with R2R3-MYB genes related to flavonoid synthesis, and BpR2R3-pMYB15 had the highest correlation coefficients between the value of gene expression and flavonoid content. BpR2R3-pMYB15 was cloned, and its transient overexpression obtained using Agrobacterium-mediated transformation positively regulated flavonoid production. This work enriches the collection of R2R3-MYBs related to flavonoid production in seedlings of Betula platyphylla.


Background
Flavonoids are a class of polyphenol phytochemicals that are abundant in many leaves, stems, flowers, fruits and other tissues or organs of higher plants [1]. About 8000 flavonoids have been discovered to date, ant they can be divided into six main subclasses, namely, the flavonols, flavanones, flavones, flavanols, isoflavones, and anthocyanidins [2,3]. Flavonoids have a variety of biological activities in plant growth, development, and stress resistance to harsh environments, including (I) regulation of axillary bud or pollen tube growth [4,5]; (II) provision of pigmentation for leaves, flowers, and fruits [3,6]; (III) protection against biological and non-biological stresses [7][8][9]; (IV) acting as signal molecules between plant and microbe interactions [3]. Flavonoids also have medicinal properties including antitussive, expectorant, antibacterial, antifungal, anti-free radical, and anti-oxidation [10]. Concerns have been raised about their potential functions in plant and human health, and thus, understanding the molecular basis of flavonoid biosynthesis is crucial.
Betula platyphylla is a pioneer hardwood tree species with ecological, economic, and pharmacological activities, and it thrives in northeastern China, Russian Far East, Siberia, Mongolia, Northern Korea, and Japan [18]. Flavonoids are one of the main secondary metabolites in leaf extracts of B. platyphylla, which exhibits antifungal, anti-free radical, and antioxidant activities [19]. The overexpression of BpCHS3 promotes flavonoid production and enhances the salt tolerance of B. platyphylla [20]. Given the importance of R2R3-MYB proteins in flavonoid biosynthesis in plants, the functional characterization of R2R3-MYB family members in flavonoid biosynthesis of B. platyphylla, especially in seedlings, has not been systematically investigated.
In this study, the latest B. platyphylla reference genome was used to characterize R2R3-MYB family members [18], which were screened from full-length transcriptome data of 30-day-old B. platyphylla seedlings. BpR2R3-MYB15, a R2R3-MYB gene predicted to be involved in flavonoid synthesis, was cloned and verified via transient transformation in B. platyphylla. The results of this study can contribute to the functional characterization of R2R3-MYB transcription factors in seedlings of B. platyphylla.

Chromosomal distribution analysis
The B. platyphylla genomic sequence was inputted into the function module of Genome Length Filter in TBtools software to obtain chromosome information [22]. Then, the chromosome and location information of the 44 R2R3-MYBs were entered into the function module of Gen Location Visualize (Advanced) in TBtools software to visualize the chromosomal distribution of the 44 R2R3-MYBs.

Construction of the phylogenetic trees
The amino acid sequences of the 44 BpR2R3-MYB proteins, 126 A. thaliana R2R3-MYB proteins, and 4 R2R3-MYB proteins related to flavonoid synthesis were aligned using Clustal W in MEGA X software [23][24][25]. PHYLOGENY in MEGA X software was used to construct a neighbor-joining tree through 1000 bootstrap replications.

Conserved motifs and gene structures
Multiple expectation maximization for motif elicitation (MEME) (https:// meme-suite. org/ meme/) was used to analyze the conserved motifs of the 44 BpR2R3-MYB proteins. The maximum number of motifs was set to 10 (width range of motif = 6-300 residues). PHYLOG-ENY in MEGA X software was adopted to construct a maximum-parsimony tree through 1000 bootstrap replications. Gene Structure View (Advanced) in TBtools software was used to visualize the exon/intron structures [22].

Collinearity analysis
The gene synteny, tandem, and segmental duplications of BpR2R3-MYBs among B. platyphylla, A. thaliana, and P. trichocarpa were analyzed using the One-Step MCScanX function in TBtools software [22]. The Advanced Circos and Multiple Synteny Plot in TBtools software were utilized to visualize intra-genomic and inter-genomic collinearity. Five gene pairs with segmental duplication were selected for the calculation of Ka (non-synonymous substitution sate) and Ks (synonymous substitution rate). The values of Ka and Ks were calculated with the Simple Ka/Ks Calculator (NG) in TBtools software. Generally, Ka/Ks < 1.0 indicates purifying or negative selection, Ka/ Ks = 1.0 denotes neutral selection, and Ka/Ks > 1.0 means positive selection [26].

Plant materials
Leaves of 18 ten-year-old B. platyphylla trees planted in Northeast Forestry University were collected (3-h intervals on August 3-4, 2020, 45° 72ʹ 66″ N, 126° 64′ 47″ E) for daily cycle analysis, which provided a basis for sampling time in seedling. Thirty-day-old seedlings of B. platyphylla obtained from sterile seeds were treated with 90 μmol L −1 Cd treatment for 1 and 4 days, and 15-dayold calli of B. platyphylla (easily transformed at this stage) obtained from the stem of tissue-cultured seedlings were used for Agrobacterium-mediated transient transformation. The seedlings were planted in a woody plant medium supplemented with 20 g L −1 of sucrose. The calli were cultured in Gamborg's B 5 medium supplemented with 0.3 mg L −1 of 6-benzyladenine, 0.6 mg L −1 of thidiazuron, and 20 g L −1 of sucrose. The pH of the medium was adjusted to 5.6 ± 0.2 prior to autoclaving. Fresh samples frozen with liquid nitrogen were used for gene expression, and samples dried through the ovendrying method were used for the analysis of flavonoid or procyanidin content.

Cloning of full-length BpR2R3-MYB15 and BpR2R3-MYB21
The full-length sequences of BpR2R3-MYB15 and BpR2R3-MYB21 were amplified by the following PCR primers: PCR amplification was performed as follows: 94 C for 5 min; 35 cycles of 98 C for 10 s, 50 C for 45 s, and 72 C for 1 min; and 72 C for 10 min. Positive colonies (purified PCR amplification fragment ligated with pMD TM 18-T vector) were sequenced at Rui Biotech (Beijing).

Agrobacterium-mediated transient transformation
Agrobacterium tumefaciens strain LBA4404 harboring Cam1304-SubC-BpMYB15 (overexpression vector) or RhRNA-pTRV2-BpMYB15 (RNAi vector) was used to infect 15-day-old B. platyphylla calli (soaked in 25% sucrose for 5 min) for 1 h [27]. The infection solution consisted of 2 mM L −1 of MES-KOH (pH = 5.4), 10 mM L −1 of CaCl 2 , 120 μM L −1 of acetosyringone (AS), 2% sucrose, 270 mM L −1 of mannitol, and 200 mg L −1 of dithiothreitol + 0.02% Tween. The infected calli were cultured in B 5 liquid medium containing 100 μmol L −1 of AS for 2 days in the dark at 28 °C. Then, the infected calli were washed with distilled water for analysis of gene expression and total flavonoid content. The transient expression of GUS was also histochemically assayed by staining the infected calli with X-GLUC solution in dark at 37 ℃ for 1 h.

Determination of the total flavonoid and procyanidin content
The collected fresh samples were dried at 105 ℃ for 15 min, and then dried to constant weight at 60 ℃. Dried samples (0.05 g) were accurately weighed and soaked in 5 mL of 65% ethanol for 24 h. After centrifugation at 2504g for 10 min, 1 mL of the supernatant solution was obtained for content analysis. The total flavonoid content was determined using the AlCl 3 colorimetric method with quercetin as the standard [28], and the linear equation was y = 0.1151x + 0.0504 (R 2 = 0.996), where x indicates the absorbance of the solution at 510 nm. Procyanidin content was analyzed via vanillin-hydrochloric acid spectrophotometric quantification with procyanidin B1 as the standard, and the linear equation was y = 0.0018x + 0.0027 (R 2 = 0.991), where x indicates the absorbance of the solution at 500 nm [29].

Gene expression analysis
The CTAB-based method was used to isolate the total RNA. The Taqman probes and primers are presented in Additional file 1: Table S1. PCR amplification was performed on an ABI Prism7500 real-time PCR system as follows: 95 °C for 30 s, followed by 40 cycles at 95 °C for 5 s and 60 °C for 34 s. Each RT-qPCR analysis was performed with three technical replicates. Gene expression data were calculated relative to the reference gene (α-tubulin, TU) following the 2 −ΔΔCt method [30]. The data of gene expression were log 10 transformed and used for heatmap construction with TBtool software.

Statistical analysis
The experiments were repeated three times. The data presented in the figures are means ± standard error. The data were analyzed through one-way ANOVA by using SPSS version 21.0. The different letters show significant differences among treatment means (P < 0.05, Tukey's test) [7].

Identification of BpR2R3-MYBs
On the basis of the sequencing data of the full-length transcriptome of B. platyphylla, the genes with R2R3-MYB conserved structure domains were screened using the HMMer database, and the screened BpR2R3-MYB family genes were further verified using Pfam and CDD databases. Forty-four BpR2R3-MYB family genes were identified and numbered according to the order in which they were found ( Table 1). The 44 BpR2R3-MYB proteins contained 186 (BpMYB24) to 484 (BpMYB41) amino acids, with molecular weights of 20.97 kDa (BpMYB24) and 54.60 kDa (BpMYB41) and an isoelectric point ranging from 5.12 (BpMYB38) to 9.57 (BpMYB10). The subcellular location predicted that most of the proteins were nuclear proteins. Only BpMYB2, BpMYB27, and BpMYB27 were distributed in the chloroplast. BpMYB16 and BpMYB23 were distributed in the cytoplasm and mitochondria, respectively.

Analysis of the gene structure and conserved BpR2R3-MYB domains
The gene structure and domains of the BpR2R3-MYB proteins were analyzed using the online software MEME and TBtools. As shown in Fig. 2A, motifs with similar structures and domains were clustered into one clade, indicating that they had an analogous function. A total of 10 conserved amino acid motifs were identified in the BpR2R3-MYB proteins (Fig. 2B). Among them, Motifs 5,6,7,8,9,and 10 (Fig. 3). In the R3-MYB structural domain, the first W residue (position 9) was frequently substituted by phenylalanine (F) and less frequently by isoleucine (I), leucine (L), methionine (M), or tyrosine (Y).

Evolutionary analysis of BpR2R3-MYBs
The Ka/Ks ratio was calculated to explore the evolutionary constraints on the BpR2R3-MYB gene family. The results showed that 20% (9/44) of the BpR2R3-MYB genes exhibited fragment duplication, and they were    Fig. 4). Five gene pairs with segment duplication, namely, BpMYB11&BpMYB19, BpMYB43&BpMYB22, BpMYB6&BpMYB26, BpMYB3&BpMYB17, and BpMYB3&BpMYB14 (Table 2), were found in the chromosomes. The Ka/Ks ratios of the five gene pairs were less than 1. In addition, the syntenic relationships of the R2R3-MYB genes showed that 30 orthologs existed between B. platyphylla and A. thaliana, and 68 orthologs existed between B. platyphylla and Populus trichocarpa (Fig. 5).

Correlation analysis of the flavonoid content and gene expression of BpMYB15 and BpMYB21
The transcriptome sequencing data of nitrosoglutathione reductase (GSNOR) gene-silenced B. platyphylla plants (BpGSNOR-RNAi) and wild-type plants (WT) in our laboratory were used to analyze the correlation coefficients of the gene expression of the five BpR2R3-MYBs (BpR2R3-MYB12, 15, 21, 36, and 37) and the key enzyme genes of flavonoid synthesis (Additional file 1: Fig. S1 and Table S1). The results showed that the correlation coefficients of BpR2R3-MYB15 and BpR2R3-MYB21 were higher than those of the three other genes. Hence, we cloned BpR2R3-MYB15 and BpR2R3-MYB21 via PCR (Additional file 1: Fig. S2, 3). The flavonoid content and gene expression of BpR2R3-MYB15 and BpR2R3-MYB21 were further analyzed under one daily cycle and 90 μmol L −1 Cd treatment (Fig. 7). In one daily variation, the flavonoid content peaked at 18:00, and the time with high flavonoid content was from 15:00 to 0:00. The gene expression of BpR2R3-pMYB15 and BpR2R3-MYB21 peaked at 21:00 and 12:00, respectively. The flavonoid content and gene expression of BpR2R3-pMYB15 and BpR2R3-MYB21 in the leaves of the B. platyphylla plants reached the highest one day after Cd treatment, but the gene expression of BpR2R3-pMYB15 and BpR2R3-MYB21 in the stem and root of B. platyphylla mostly decreased after Cd treatment. The correlation coefficient of the gene expression of BpR2R3-pMYB15 and flavonoid content was higher than that of of BpR2R3-pMYB21 and flavonoid content ( Table 3).

Overexpression of BpR2R3-MYB15 enhanced flavonoid production
After 3 days of Agrobacterium-mediated transient transformation, the overexpression of BpR2R3-MYB15 in B. platyphylla calli (5.72 times than that of untransformed calli) significantly enhanced the flavonoid and procyanidin contents and increased the gene expression of BpCHI1, BpF3H, and BpDFR, which are key enzyme genes for flavonoid synthesis. The silencing of BpR2R3-MYB15 in B. platyphylla calli (0.68 times than that of untransformed calli) decreased the flavonoid and procyanidin contents and reduced the gene expression of BpCHI1, BpF3H, and BpDFR (Fig. 8).

Discussion
Transcriptome analysis provides insights into the spatiotemporal genes transcribed during plant growth and development processes or stress responses [31]. To clarify the function of R2R3-MYB family genes in the seedling development period of B. platyphylla, complete full-length transcriptome data of 30-day-old seedlings were generated using the PacBio Sequel System II, and 44 typical BpR2R3-MYB family genes with complete domains were identified. This study updated the collection of BpR2R3-MYB family genes in B. platyphylla.
The reported B. platyphylla genome provided an opportunity to investigate the gene structure and synteny of the identified BpR2R3-MYB family genes [18]. The 44 BpR2R3-MYB proteins had highly conserved R2 (-W-(X19)-W-(X19)-W-) and R3 (-F-(X19)-W-(X19)-W-) structural domains. The first W residue (position 9) in the R3-MYB structural domain is frequently substituted by phenylalanine (F) and less frequently by isoleucine (I), leucine (L), methionine (M), or tyrosine (Y). These substitutions in the R3 structural domain may result in the recognition of novel target genes and/or may significantly impair the DNA-binding activity [32]. Phylogenetic analysis of the BpR2R3-MYBs in this study showed that the genes in the same subgroups or subclades generally contained the same intron pattern, and most genes had no more than two introns. This result is in line with the results for other plants [33]. In addition, four genes had no intron, and BpR2R3-MYB 41 had 12 introns. The BpR2R3-MYBs with different numbers of introns may be one of the reasons for the enlarged family members or functional diversity in B. platyphylla.
Plants experience gene duplication events, including tandem, fragment, and conversion duplication, in the process of evolution [34]. In our study, five gene pairs with segment duplication were found, and their Ka/ Ks ratios were all less than 1. This result indicates that 22.73% of the genes of BpR2R3-MYBs (10/44) evolved under the effect of purifying selection. In addition, the number of orthologs between B. platyphylla and P. trichocarpa (68) was twice that between B. platyphylla and A. thaliana (30). We infer that B. platyphylla may be closer to P. trichocarpa than to A. thaliana in evolutionary branch, and similar results have been reported for other family genes of B. platyphylla [35,36]. Forty-four BpR2R3-MYBs were unevenly distributed in 11 chromosomes of B. platyphylla, and similar results have been derived for Arabidopsis, P. trichocarpa, and six Rosaceae species [37,38]. Combined with the results of the phylogenetic analyses, we found that 52.27% (23/44) of the BpR2R3-MYBs clustered into one subgroup were distributed in the same chromosome (Additional file 1: Table S3). In general, the members of the same subgroup had similar functions. Whether the above-mentioned BpR2R3-MYB genes in a subgroup have the same function and whether there is substitution or superposition of BpR2R3-MYB genes with the same function in the same chromosome will be verified experimentally in the future.
To clarify the identified BpR2R3-MYBs in flavonoid biosynthesis of B. platyphylla, BpR2R3-MYB15 related to flavonoid biosynthesis was screened out based on phylogenetic analyses. The correlation coefficients between the gene expression of BpR2R3-MYBs and the key enzymes of flavonoid synthesis were determined using GSNOR transgenic seedlings via RNAi silencing or by using wild plants under one daily cycle and 90 μmol L −1 Cd treatment. On this basis, we further verified the function of BpR2R3-MYB15 in flavonoid biosynthesis by using Agrobacterium-mediated transient transformation in the calli of B. platyphylla. Next, We will obtain BpR2R3-MYB15transformed birch plants to further analyze its function and mechanism in flavonoid synthesis. The results of this study lay a foundation for analyzing how BpR2R3-MYBs regulate flavonoid biosynthesis and the function of flavonoids in seedling development.

Conclusions
In this study, 44 BpR2R3-MYB family genes were identified based on the full-length transcriptome of 30-day-old seedlings of Betula platyphylla, and their gene structure, chromosome distribution, and syntenic relationships were analyzed at the genomic level. BpR2R3-pMYB15, one of the five BpR2R3-MYBs clustered with R2R3-MYB genes related to flavonoid synthesis, positively regulated flavonoid production via Agrobacterium-mediated transient transformation.