Comparative transcriptome analyses reveal genes related to pigmentation in the petals of a flower color variation cultivar of Rhododendron obtusum

Rhododendron is an important woody ornamental plant, and breeding varieties with different colors is a key research goal. Although there have been a few reports on the molecular mechanisms of flower colors and color patterning in Rhododendron, it is still largely unknown what factors regulate flower pigmentation in Rhododendron. In this study, the flower color variation cultivar ‘Yanzhi Mi’ and the wild-type (WT) cultivar ‘Dayuanyangjin’ were used as research objects, and the pigments and transcriptomes of their petals during five flower development stages were analyzed and compared. The results showed that derivatives of cyanidin, peonidin and pelargonidin might be responsible for the pink color of mutant petals and that the S2 stage was the key stage of flower color formation. In total, 412,910 transcripts and 2780 differentially expressed genes (DEGs) were identified in pairwise comparisons of WT and mutant petals. GO and KEGG enrichment analyses of the DEGs showed that ‘DNA-binding transcription factor activity’, ‘Flavonoid biosynthesis’ and ‘Phenylpropanoid biosynthesis’ were more active in mutant petals. Early anthocyanin pathway candidate DEGs (CHS3-CHS6, CHI, F3Hs and F3′H) were significantly correlated and were more highly expressed in mutant petals than in WT petals in the S2 stage. An R2R3-MYB unigene (TRINITY_DN55156_c1_g2) was upregulated approximately 10.5-fold in ‘Yanzhi Mi’ petals relative to ‘Dayuanyangjin’ petals in the S2 stage, and an R2R3-MYB unigene (TRINITY_DN59015_c3_g2) that was significantly downregulated in ‘Yanzhi Mi’ petals in the S2 stage was found to be closely related to Tca MYB112 in cacao. Taken together, the results of the present study could shed light on the molecular basis of anthocyanin biosynthesis in two Rhododendron obtusum cultivars and may provide a genetic resource for breeding varieties with different flower colors.


Introduction
Rhododendron is not only the largest genus of the family Ericaceae but is also an important ornamental woody plant group that includes approximately 1000 species and thousands of commercial hybrids [1]. Rhododendron plants are familiar ornamental shrubs that are widespread around the world and have beautiful flowers. Flower color is an important ornamental trait that greatly affects the economic value of Rhododendron. There is a remarkably broad range of Rhododendron flower colors, including white, red, pink, purple, yellow, green, and blue [2]. Previous studies have shown that many factors (cell shape, copigmentation, pH, environmental conditions, and so forth) affect petal coloration, among which the pigment composition is the most important [3]. The petals of Rhododendron flowers contain flavonols and anthocyanins as their major pigments [4]. The main flavonols of Rhododendron petals are quercetin, kaempferol, and myricetin, although many others exist [5][6][7]. The flavonol composition and content have some effect on flower color by copigmentation, but might not play a major role in coloration [8]. The major anthocyanidins Xiaobo Sun and Lisi He have contributed equally to this work.
* Chang Li changli529@foxmail.com Extended author information available on the last page of the article of Rhododendron petals are cyanidin, peonidin, delphinidin, petunidin, malvidin, and pelargonidin [4,9,10]. The anthocyanin composition and content are fundamental elements that determine Rhododendron flower colors [4,8,11]. Therefore, the study of anthocyanin biosynthesis in the petals of Rhododendron plants and the related regulatory mechanisms are important for elucidating the formation of different flower colors in Rhododendron.
To date, there have been a few reports on the molecular mechanisms of flower colors and color patterning in Rhododendron. CHS and DFR were cloned from R. simsii hybrids and the expression of these two genes was examined in eight azalea cultivars, the results showed that there was no significant correlation between the flower color phenotype and expression levels of the CHS and DFR [21,22]. The main enzymes in the anthocyanin biosynthesis pathway in azalea include CHS, CHI, F3H, F3ʹH, F3ʹ5ʹH, DFR, ANS, and flavonol synthase (FLS) [17]. The genes encoding these biosynthetic enzymes were isolated from evergreen azalea R. × pulchrum Sweet 'Oomurasaki', and expression analysis showed that only F3′5′H gene expression was strongly correlated with the accumulation of total anthocyanins in petals [23]. A comparative study between purple-flowered 'Oomurasaki' and its red-flowered mutant showed that 'Oomurasaki' contained anthocyanidins from both the cyanidin and delphinidin series, whereas the red flower mutant contained only pigments from the cyanidin series. The transcript levels of F3ʹ5ʹH in the mutant were 0.14-fold those in 'Oomurasaki', suggesting that low transcript levels of F3ʹ5ʹH in the red-flowered mutant resulted in no accumulation of delphinidin-derived anthocyanin [9]. Despite these reports, it is still largely unknown what factors regulate flower pigmentation in Rhododendron.
Our laboratory bred the flower color variant cultivar 'Yanzhi Mi' (pink petals), which was derived from a bud variation of Rhododendron obtusum 'Dayuanyangjin' (white petals with pink stripes). The only difference from 'Dayuanyangjin' observed in 'Yanzhi Mi' is a difference in flower color, and the genetic background is otherwise the same as that of 'Dayuanyangjin'; thus, 'Yanzhi Mi' provides ideal experimental material for studying the mechanism of Rhododendron petal coloration. In the current study, we qualitatively and quantitatively characterized and compared pigmentation in the petals of the mutant 'Yanzhi Mi' and the corresponding wild-type (WT), 'Dayuanyangjin', in five stages of flower development. Then, we used the Illumina sequencing platform to conduct a transcriptome sequencing analysis of mixed RNA extracted separately from the petals of 'Yanzhi Mi' and 'Dayuanyangjin' in the same five stages of flower development. After analyzing the data, we identified some key candidate genes related to anthocyanin biosynthesis in R. obtusum. These transcriptome sequences may provide a valuable genomic resource for better understanding the molecular mechanisms of color formation in R. obtusum.

Plant materials
R. obtusum 'Dayuanyangjin' (D) and 'Yanzhi Mi' (Y) were grown in a greenhouse at the Jiangsu Academy of Agricultural Sciences (Nanjing, China, 118.881E, 32.039 N). In March 2019, flowers in five developmental stages [closed buds (S1 stage), buds showing color at the top but with scales still present (S2 stage), the initial flowering stage (S3 stage), the full flowering stage (S4 stage) and the last flowering stage (S5 stage)] were sampled from ten plants of each cultivar (Fig. 1). The petals were immediately cut off, frozen in liquid nitrogen, and then stored at − 80 °C for pigment measurement and RNA-seq analyses. Three biological replicates were analyzed for each sample.

Extraction and measurement of pigment
Anthocyanins and flavonoids were extracted and measured according to a previous study [24]. One gram of fresh azalea petal tissue was immersed in 5 ml of an extraction mixture consisting of acetonitrile, water and formic acid (5:5:1, v:v:v) at 4 °C for 24 h, and extraction was then performed by ultrasonication at ambient temperature for 30 min. The samples were subsequently filtrated, and the residue was extracted with 5 ml of extraction solution for 30 min and filtered again. After two filtration steps, the two filtrate samples were combined and concentrated to 5 ml under a 1 3 nitrogen gas stream. The concentrated extract was passed through a SPEC18 Sep-pak (GRAE company, USA), and the eluent was used for testing. All samples were analyzed on an Agilent 1260 ultra-high performance liquid chromatography (UHPLC) system coupled with a 6530 quadrupole-time of flight (Q-TOF) mass spectrometer (HPLC-Q-TOF-MS) operating in positive ion mode. Chromatographic separation was performed in an Agilent Poroshell 120 SB-Aq column (4.6 × 100 mm, 2.7 μm). Separation was carried out via the following 50 min multistep linear gradient using a mobile phase consisting of (A) 1% v aqueous formic acid in water and (B) acetonitrile: from 0 to 20 min, increase from 5% B to 25% B; from 20 to 40 min, increase 25% B to 100% B; and then maintenance of 100% B for 10 min, not including post-time. The flow rate was 0.3 mL min −1 , the column temperature was 35 °C, the injection volume was 20 μL, and detection was performed at 530 nm and 254 nm. The source temperature was set at 350 °C, capillary voltage at + 4.0 kV, N2 drying gas flow at 10 mL min −1 , nebulizer pressure at 50 psi, and fragmentor voltage at 175 V. The collision energy was set at 25 V for MS/MS analysis.
The compounds in the sample extracts were identified by comparison with the retention times of standards. The characteristics of the UV-Vis spectra of peaks and the mass spectrometric information were analyzed using a Mass-Hunter B0.05.0 Workstation (Agilent, USA). The relative contents of anthocyanins and flavonoids were calculated from the peak areas of the samples based on the intensity of the corresponding standard compounds, including cyanidin, pelargonidin, peonidin, cyanidin-3-rutinoside, peonidin 3,5-diglucoside, cyanidin-3-glucoside, kaempferol, quercetin, quercetin-3-glucoside, Isorhamnetin and proanthocyanidin. For compounds lacking the corresponding standards, quantification was performed using similar compounds. The mean values and standard deviations (SDs) were calculated from three biological replicates.

RNA extraction, cDNA library construction and RNA-seq
Total RNA was isolated from 1 mg of mixed petal power from ten plants using a TRIzol kit according to the manufacturer's instructions (Invitrogen, Carlsbad, CA, USA). The RNA concentration and purity were assessed using a Qubit RNA Assay Kit on a Qubit 2.0 Fluorometer (Life Technologies, CA, USA) and a NanoDrop2000 spectrophotometer (IMPLEN, CA, USA), respectively. RNA integrity was examined using an Agilent 2100 Bioanalyzer (Santa Clara, CA, USA) according to an RNA integrity number (RIN) > 8.0. Three biological replicates were included for each flower development stage in each variety. Library construction and RNA-seq analyses were performed by Genepioneer Biotechnology Corporation (Nanjing, China) using the Illumina HiSeqTM 2500 platform. A total of 30 RNAseq libraries were generated (Table S1). All sequencing data were deposited in NCBI Sequence Read Archive under accession number SRP304986.

Transcriptome assembly and gene annotation
After removing the poor-quality reads (adaptor reads, ambiguous nucleotides and low-quality reads) [25], the clean reads were assembed using Trinity software as previously described for de novo transcriptome assembly without a reference genome [26]. The assembled unigenes were searched against public databases, including the NCBI, NR, COG, SwissProt, Pfam, GO, KOG and KEGG database.

DEG analysis
The gene expression levels of unigenes were estimated using the FPKM method in RSEM software [27]. Clean data from each sample were mapped back onto the assembled transcripts using bowtie2 software [28], and the normalized expression values RPKM (reads per kilobase per million mapped reads) of each unigene in the 30 libraries were used to represent gene expression levels [29]. Differential expression analysis was carried out with DESeq2 [30]. Threshold criteria based on the FDR statistical method were used to determine significant differences in gene expression, and the normalized expression values RPKM were compared according to the thresholds of a P-value ≤ 0.01 and a |log2 (FC)|≥ 1 based on an FDR value ≤ 0.05. DEGs (FDR value ≤ 0.05 and |log2 (FC)|≥ 1) were determined and then analyzed through GO and KEGG pathway enrichment analysis using GoSeq and KOBAS (2.0), respectively [31,32].

Phylogenetic analysis
A phylogenetic tree of selected R2R3-MYB transcription factors was constructed using MEGA 5.1 software via the neighbor-joining method with bootstrap analysis of 1000 replicates [33].

Validation of GEGs by quantitative real-time PCR
qPCR analysis was performed using a SYBR® Premix Ex TaqTM Kit (TaKaRa, Dalian, China) and an ABI 7500 Real-Time PCR system (Applied Biosystems, CA, USA). The RNA samples used for the qRT-PCR assays were the same samples employed for sequencing. Total RNA (1 μg) was used to synthesize cDNA with the PrimeScript RT Reagent Kit (TaKaRa, Dalian, China) following the recommended procedures. All qRT-PCR assays were performed in a total volume of 20 μL comtaining 10 μL of SYBR Master Mix, 2 μL of cDNA template, 0.4 μL of 10 μM forward and reverse primers and 7.2 μL of dd H 2 O. The cycling conditions were as follows: one cycle of 95 °C for 30 s, followed by 45 cycles of 95 °C for 5 s, 60 °C for 15 s, and 72 °C for 30 s. The housekeeping gene Actin was used as the control gene [34]. All primers used in these assays are listed in Table S2.
The specificity of each primer pair was checked by agarose gel electrophoresis and melting curve analysis. The relative expression levels of the genes were calculated using the 2 −ΔΔCT formula [35]. Both biological and technical triplicates of each sample were analyzed.

Statistical analysis
The statistical analyses of anthocyanin and flavonoid contents were conducted with the one-way ANOVA LSD test using the IBM SPSS Statistics (version 19) statistical software. P values < 0.05 were considered significant. Correlation tests were performed by calculating the Pearson correlation coefficient (Pearson's r) with a two-tailed test. Correlations were considered positive when Pearson's r > 0.65 and P < 0.05.

Qualitative and quantitative pigments analyses
The total anthocyanin contents of 'Yanzhi Mi' and 'Dayuanyangjin' petals were similar in the S1 stage. In the other four floral development stages, total anthocyanins accumulated to significantly higher levels in 'Yanzhi Mi' petals, especially in the S2 stage; the anthocyanin contents of 'Yanzhi Mi' petals were 35.9 times higher than those of 'Dayuanyangjin' petals in the S2 stage ( Fig. 2A). In 'Yanzhi Mi' petals, the anthocyanin contents were highest in the S2 stage, decreased rapidly in the S3 stage, and remained unchanged in the S4 and S5 stages. In contrast, the anthocyanin contents of 'Dayuanyangjin' petals were highest in the S1 stage, decreased rapidly in the S2 stage, and remained unchanged until the S5 stage. The flavonoid contents in 'Dayuanyangjin' petals were higher than those in 'Yanzhi Mi' petals in the S1 stage and were similarin the other four floral development stages ( Fig. 2A). The results indicated that anthocyanins might play vital roles in pink flower color formation of 'Yanzhi Mi'. Therefore, we further identified the anthocyanin composition of the petals of 'Yanzhi Mi' and 'Dayuanyangjin'. In total, seven anthocyanins were identified in the petals of the two cultivars, which included cyanidin-3-glucoside-5-xyloside, cyanidin-3-rutinoside, cyanidin-3-glucoside, cyanidin-3-xyloside, peonidin -3-glucoside-5-xyloside, peonidin 3, 5-diglucoside and pelargonidin-3-diphrhamnoside. Among the seven anthocyanins, pelargonidin-3-diphrhamnoside, cyanidin-3-glucoside-5-xyloside and cyanidin-3-xyloside are the three most abundant anthocyanins in the petals of 'Yanzhi Mi' in the S2 stage ( Fig. 2B and Table S3). These results suggested that the derivatives of cyanidin, peonidin and pelargonidin might be responsible for the pink coloration of 'Yanzhi Mi' petals.

Transcriptome sequencing, de novo assembly, annotation and PCA clustering
Based on the analysis of anthocyanins, we further investigated petal transcriptome variation in 'Yanzhi Mi' and 'Dayuanyangjin' in five floral development stages by using Illumina technology. Following data cleaning and quality checking, 722,321,099 reads with Q30 values ≥ 90.00% were obtained from the 30 libraries. Among the clean reads, > 96% showed quality scores at the Q20 level (Table S1). The raw sequencing data have been deposited in the NCBI database and can be accessed in the Short Read Archive (SRP304986).
The total length of the clean reads was 216.70 Gb, which was equivalent to ~ 310-fold coverage of the genome of R. obtusum (approximately 0.7 Gb). All clean reads were merged and de novo assembled using Trinity platform software, resulting in the generation of 412,910 transcripts with an average length of 1276.10 bp and an N50 length of 1881 bp (Table S4). Most of the reads could be mapped back to the assembled transcripts, and the total length of all transcripts was approximately 527 Mb. These transcripts were further subjected to cluster and assembly analyses, which resulted in 137,018 unigenes with an average length of 1069.22 bp and an N50 value of 1549 bp, among which 34.19% of the unigenes (46,853) were more than 1 kb in length (Table S4).
Sequence similarity searches were carried out against protein sequences available in various databases using the BLASTX algorithm with an E value threshold of 1e − (Table S5). Principal Component Analysis (PCA) is made through the gene expression matrix of each RNA-seq librarie (Fig. S1), and the PCA clustering diagram showed that three biological replicates at each floral development stage of 'Yanzhi Mi' and 'Dayuanyangjin' clustered together respectively, suggesting their repeatability is good. Except Y2 and D2, Y1 and D1, Y3and D3, Y4 and D4, and Y5 and D5 clustered together respectively, indicating that there were significant differences in gene expression between Y2 and D2.

Analysis of DEGs in five floral development stages
The criteria of a false discovery rate (FDR) ≤ 0.05 and a |log2 (FC)|≥ 1 were used as the standards for the pairwise differential expression analysis of the five sample groups (Y-S1vs D-S1, Y-S2 vs D-S2, Y-S3 vs D-S3, Y-S4 vs D-S4 and Y-S5 vs D-S5) (Fig. S2) (Fig. S2). Overall, the number of upregulated genes identified in 'Yanzhi Mi' petals was greater than the number of downregulated genes during floral development. These results indicate that the S2 stage is the key stage for determining the difference in flower color between 'Yanzhi Mi' and 'Dayuanyangjin'.
We conducted GO enrichment and KEGG pathway analyses for illustrating the main biological functions of the DEGs. A GO enrichment analysis can give a description of gene products in terms of their associated Molecular Function (MF), Cellular Component (CC) and Biological Process (BP) [36].
The DEGs were assigned into a total of 53 GO categories, and 1,607 DGEs identified in Y-S2 vs D-S2 were categorized into all 53 GO categories. Forty-two significant enriched GO categories (p-value ≤ 0.05) were identified in Y-S2 vs D-S2, and 'binding', 'cell', and 'cellular process' were the most highly represented GO terms in MF, CC, and BP, respectively, among which the most highly enriched terms, "cell", was enriched with 1169 DEGs (Fig. S3A). Significantly upregulated DEGs were observed in six GO categories ('DNA-binding transcription factor activity; GO:0003700', 'sequence-specific DNA binding; GO:0043565', 'DNA replication initiation; GO:0006270', 'flavonoid biosynthetic process; GO:0009813', 'MCM complex; GO:0042555'and 'naringenin-chalcone synthase activity;GO:0016210) in the S2 stage (Fig. S3B). Among these GO categories, four (DNA-binding transcription factor activity, sequencespecific DNA binding, flavonoid biosynthetic process and naringenin-chalcone synthase activity) were found to be related to the color of flowers. The GO categories pectin catabolic process (GO:0045490), apoplast (GO:0048046), chitinase activity (GO:0004568) and pectatelyase activity (GO:0030570) were found to more highly enriched among downregulated DEGs in the S2 stage (Fig. S3B).
All DEGs were annotated into 94 metabolic pathways using the KEGG database. A total of 427 DEGs in the S2 stage (including 220 upregulated unigenes and 207 downregulated unigenes) were categorized into 92 pathways. KEGG pathway analysis indicated that 'Flavonoid biosynthesis', 'Circadian rhythm-plant', 'Photosynthesis-antenna proteins', 'Phenylpropanoid biosynthesis' and 'Pentose and glucoronate interconversions' were the five most enriched pathways in the S2 stage (Fig. S4). Among these pathways, 'Flavonoid biosynthesis' was the most significantly enriched (− log10 (p-value): inf) of the up-regulated DEGs; in contrast, the 'Pentose and glucuronate interconversions' pathway was the most significantly enriched of the downregulated DEGs. Interestingly, two functional pathways ('Phenylpropanoid biosynthesis' and 'Plant hormone signal transduction') were found to be enriched not only among the upregulated DEGs but also among the downregulated DEGs in the S2 stage. Among these pathways, two KEGG pathways ('Flavonoid biosynthesis' and 'Phenylpropanoid biosynthesis') were found to be related to the color of flowers, which included 20 and 34 unigenes, respectively (Fig. S4).

DEGs involved in the anthocyanin biosynthesis pathway and their expression patterns
Qualitative and quantitative pigment analyses of 'Yanzhi Mi' and 'Dayuanyanjin' petals showed that anthocyanins might play vital roles in petal color variation in 'Yanzhi Mi'. Therefore, the focus of this study was mainly on the unigenes involved in the anthocyanin biosynthesis pathway. Based on the gene annotation and DEGs analysis, the targeted genes associated with anthocyanins and the related biosynthesis pathways were screened and compared.
According to the results of pigment determination and the GO and KEGG annotation of DEGs, 62 DEGs related to flower color formation were identified (Table 1).
Among these DEGs, 16 candidate anthocyanin structural genes encoding seven putative enzymes were involved in anthocyanin biosynthesis in 'Yanzhi Mi' petals. The synthetic pathways and expression profiles of these genes are shown in Fig. 3A. These genes included CHS (6), CHI (1), F3H (2), F3′H (1), DFR (2), ANS (1), and GT (3), and they exhibited significant differential expression patterns in the petals of 'Yanzhi Mi' vs 'Dayuanyangjin'. We found that except for the GT genes, all of the genes encoding the six enzymes showed similar expression levels in the petals of 'Yanzhi Mi' and 'Dayuanyangjin' in the S1 stage, and the three genes encoding GTs showed lower expression levels in 'Yanzhi Mi' petals than in 'Dayuanyangjin' petals. In the S2 stage, the expression levels of thirteen genes (except for GTs) were higher in 'Yanzhi Mi' petals than in 'Dayuanyangjin' petals, with a maximum multiple of 2.82 (DFR1) and a minimum multiple of 1.75 (F3H2); three GTs showed lower expression levels in the petals of 'Yanzhi Mi'. In the S3 stage, thirteen genes showed higher transcript levels in 'Yanzhi Mi' petals than in 'Dayuanyangjin' petals, with a maximum multiple of 2.9 (3GT1) and a minimum multiple of 1.19 (CHI); F3H1 and F3H2 showed similar expression levels in the petals of 'Yanzhi Mi' and 'Dayuanyangjin', but 5,3GT showed a lower expression level in 'Yanzhi Mi' petals. Only two genes, CHS5 (TRINITY_DN49577_c0_g1) and F3'H (TRIN-ITY_DN68439_c2_g2), showed higher expression levels in 'Yanzhi Mi' petals than in 'Dayuanyangjin' petals in the S4 stage. In the S5 stage, the expression levels of seven genes were similar in the petals of 'Yanzhi Mi' and 'Dayuanyangjin', those of eight genes were higher in 'Yanzhi Mi' petals, and 5,3GT (TRINITY_DN64021_c3_g2) was not expressed in 'Yanzhi Mi' and 'Dayuanyangjin' petals.
In 'Yanzhi Mi', many genes in the anthocyanin synthetic pathways showed maximum expression in the S2 stage, after which their expression decreased in each subsequent developmental stage. For example, CHS3, CHS4, CHS5, CHS6, CHI, F3H1, and F3′H were expressed more highly in the S2 stage than in other floral development stages; among these genes F3′H (TRINITY_DN68439_c2_g2) was upregulated 1.14-, 4.22-, 5.20-and 5.83-fold in the S2 stage compared to its expression in the S1, S3, S4 and S5 stages, respectively. However, in 'Dayuanyanjin' flowers, the expression levels of ten genes (CHSs, CHI, F3Hs and F3'H) were highest in the S1 stage. For example, F3'H (TRINITY_DN68439_c2_g2) was upregulated 2.21-, 6.13-, 6.24-and 5.93-fold in the S1 stage, respectively, compared to its expression in the S2, S3, S4 and S5 stages.
The correlations of the expression profiles of the abovementioned 16 genes were detected by Pillet's method [37], and positive correlations between these genes (Pearson's r > 0.65, P < 0.05) were identified (Fig. 3B, C). Compared with the gene expression correlation map of 'Dayuanyangjin' petals, we found that eight early anthocyanin pathway genes (including CHS3, CHS4, CHS5, CHS6, CHI, F3H1, F3H2 and F3'H) were significantly related and shared more than two positive correlations with other genes in 'Yanzhi Mi' petals. Interestingly, a late anthocyanin pathway gene, 3GT1, showed a different regulatory mechanism and shared negative correlations with six early anthocyanin pathway genes  (CHS3-CHS6, F3H1 and F3H2). This correlation pattern clearly suggests that these anthocyanin pathway genes might be coregulated by the same transcription factors. A ternary complex that comprised three groups of transcription factors (R2R3-MYB, bHLH, and WDR) can coordinately regulate the expression of most of the structural genes in the anthocyanin biosynthetic pathway [38]. Therefore, the expression levels of the regulatory factors that potentially control anthocyanin biosynthesis were analyzed.
The results showed that the S2 stage is the key stage in which the flower color difference between 'Yanzhi Mi' and 'Dayuanyangjin' is determined. Thus, we selected the transcription factors that were related to flower color formation and were differentially expressed in the two cultivars in the S2 stage for further analysis. We 35 DEGs encoding candidate transcription factors from the DEG data were identified, including 26 unigenes encoding R2R3-MYBs and 9 encoding bHLHs. Among the 26 predicted MYB, 14 were highly expressed in 'Yanzhi Mi' (in cluster 2), and 12 were highly expressed in 'Dayuanyangjin' (in cluster 1) (Fig. 4A). Five unigenes were differentially expressed with absolute log2 (FC) values > 2, including two downregulated genes (TRIN-ITY_DN62378_c2_g1 and TRINITY_DN59015_c3_g2) and three upregulated genes (TRINITY_DN55156_c1_g2, TRINITY_DN53621_c6_g1 and TRINITY_DN62502_c1_ g3). Notably, TRINITY_DN55156_c1_g2 was upregulated approximately 10.5-fold in 'Yanzhi Mi' relative to 'Dayuanyangjin', and TRINITY_DN59015_c3_g2 was found to be closely related to Tca MYB112 in cacao (Fig. S5) [39]. Among the 9 predicted bHLHs, 8 were highly expressed in 'Yanzhi Mi' (in cluster 2), and 1 was highly expressed in 'Dayuanyangjin' (in cluster 1) (Fig. 4B). Four unigenes were differentially expressed with absolute log2 (FC) values > 2.0, and they were all upregulated (TRINITY_DN50930_c0_g3, TRINITY_DN44252_c0_g1, TRINITY_DN42619_c0_g1 and TRINITY_DN53618_c2_g2). TRINITY_DN53618_c2_ g2 showed particularly high upregulation, with a maximum multiple of 2.43.

qPCR validation of the DEGs of the flavonoid pathway
To confirm the transcripts obtained in the sequencing analysis, the reliability of the comparative transcriptional data was further verified. A total of 10 structural gene transcripts and transcription factor transcripts that might be involved in anthocyanin biosynthesis were randomly selected for qPCR test. The relative expression levels of these transcripts in the five floral development stages of 'Yanzhi Mi' and the transcripts per million clean tags (TPM) values of these transcripts obtained from the sequencing data from the 15 'Yanzhi Mi' samples are shown in Fig. S6. These results showed that the expression patterns obtained by qPCR were consistent with the digital expression data and indicated that the transcriptomic data used in this study for anthocyanin synthesis-related gene analysis were reliable and highly reproducible.

Discussion
Flower color is one of the most important ornamental traits of rhododendrons, and the study of the mechanism underlying the formation of different flower colors is of great significance. The major anthocyanidins in rhododendron petals are cyanidin, peonidin, delphinidin, petunidin, malvidin, and pelargonidin [4,9,10]. Seven anthocyanins were previously identified in the petals of 30 Rhododendron species, among which the red-flowered species mainly contained 2 cyanidin monoglycosides [8]. A total of 5 anthocyanin compounds were identified in 10 Rhododendron species, and cyanidin derivatives were the major anthocyanins found in most of the red-purple group and in R. triflorum (red flower) [40]. The flowers of R. schlippenbachii Maxim contained 2 anthocyanins, cyanidin-3,5-diglucoside and cyanidin-3-sambubioside, and among 3 different flower colours found in R. schlippenbachii, red flowers exhibited higher amounts of total anthocyanins than violet flowers, whereas no anthocyanins were detected in white flowers [11]. Previous studies have not clarified the mechanism responsible for pink flowers, but some researchers consider pink coloration to result from a pigment gradient [41]. De Keyser et al. also confirmed that pink coloration in Rhododendron can be observed to result from a lower intensity of red (carmine) pigmentation by means of image analysis [22]. In this study, we compared the pigmentation of the petals of the 'Yanzhi Mi' mutant and the corresponding WT 'Dayuanyangjin' during floral development. According to HPLC-Q-TOF-MS analysis, 'Yanzhi Mi' and 'Dayuanyangjin' showed similar contents of flavonoids, but significantly different contents of anthocyanins with the same compositions. During floral development, the pink petals of 'Yanzhi Mi' exhibited a higher content of anthocyanins, mainly composed of seven derivatives of cyanidin, pelargonidin and peonidin. These results are consistent with previous reports indicating that the presence of cyanidin and peonidin leads to the formation of red petals [31]. The white petals of 'Dayuanyangjin' mainly contained flavonoids, and the trace levels of anthocyanins in the white petals may be responsible for the presence of pink stripes in them.
The biosynthesis of anthocyanins is crucial to expand the range of flower colors. Previous studies have shown that C4H, CHS, F3H, F3H, F3′5′H, DFR and ANS are the key enzymes involved in the biosynthesis of anthocyanins [42]. Flower color regulation is achieved mainly via the coordinated transcriptional control of structural genes [43,44]. In this study, we identified 16 DEGs in the petal transcriptomes of 'Yanzhi Mi' vs 'Dayuanyangjin' that encoded seven putative enzymes involved in anthocyanin biosynthesis. By comparing their gene expression profiles during floral development, we found that except for the GTs, all of the other genes encoding the six enzymes showed similar expression levels in the petals of 'Yanzhi Mi' and 'Dayuanyangjin' in the S1 stage. However, most of the genes showed significantly higher transcript levels in 'Yanzhi Mi' petals than in 'Dayuanyangjin' petals in the S2 and S3 stages, especially in the S2 stage. The comparative analysis of the petal transcriptomes of 'Yanzhi Mi' vs 'Dayuanyangjin' showed that from stage S1 to stage S5, the DEGs were mainly concentrated in stage S2, which accounted for 91.6% of the total DEGs. Based on the identification of no DEGs in the S1 stage and the similar anthocyanin contents of 'Yanzhi Mi' vs 'Dayuanyangjin' petals in this stage, we can conclude that the S2 stage is the key stage for flower color formation in 'Yanzhi Mi'.
In R. × pulchrum flower buds in the S1 stage, five genes (CHS, CHI, F3H, DFR, and ANS) were expressed, but no anthocyanins were detected. Among these genes, the expression of CHS, F3H, and ANS was highest in the S1 stage and decreased during flower development, whereas CHI and DFR transcripts were expressed at the highest levels in the S2 stage [23]. However, the anthocyanin content was highest at the candle stage (buds showing color at the top but without any remaining scales) [21,23]. Anthocyanin synthesis and gene expression in R. × pulchrum petals were not coincident. In our study, only two CHS candidate genes (CHS1 and CHS2) showed the highest transcript accumulation in the S1 stage of 'Yanzhi Mi' flowers, and other early pathway candidate genes (CHS3-6, CHI, F3Hs, and F3′H) were strongly expressed in 'Yanzhi Mi' petals in the S2 stage, after which their expression declined in the subsequent stages of floral development. The maximum expression of later pathway candidate genes (DFRs, ANS and UFGTs) was found in the S3, S4 and S5 stages (Fig. 3A). The lowest content of anthocyanins was detected in the S1 stage in 'Yanzhi Mi' flowers, and the highest content of anthocyanins was found in the S2 stage in 'Yanzhi Mi' flowers ( Fig. 2B). However, in 'Dayuanyanjin', the expression levels of all early pathway candidate genes (CHS1-6, CHI, F3Hs, and F3′H) were highest in the S1 stage and decreased in each subsequent developmental stage. Late pathway candidate genes (DFRs, ANS and UFGTs) were highly expressed in the S2, S4 and S5 stages (Fig. 3A). The anthocyanin content of 'Dayuanyangjin' petals in the S1 stage, which was highest among the five floral development stages, was similar to that of 'Yanzhi Mi' petals in the S1 stage. Combined with the above results, we suggest that in 'Yanzhi Mi' and 'Dayuanyangjin', anthocyanin synthesis and gene expression in petals are coincident in the key stage of flower color formation, the S2 stage, which is different from the situation in R. × pulchrum flowers; flower color formation in 'Yanzhi Mi' and 'Dayuanyangjin' subsequently begins to diverge in the S2 stage. A combination of early pathway genes (CHS, CHI, F3H, and F3′H) are correlated with the differentiation between the pink flowers of 'Yanzhi Mi' and the white flowers of 'Dayuanyangjin'. Therefore, we speculated that CHS, CHI, F3H and F3′H were the most likely candidates responsible for flower coloration in 'Yanzhi Mi'.
R2R3-MYB, bHLH and WD40 proteins are the main transcription factors responsible for regulating the expression of structural genes in the anthocyanin biosynthesis pathway [26]. Often, these transcription factors form a complex (the MYB-bHLH-WDR, or MBW, complex) that can coordinately activate or repress the expression of a set of target genes in the anthocyanin biosynthesis pathways to modulate pigment production [18,45]. In Phalaenopsis spp., PeMYB11 has been shown to activate the expression of the anthocyanin biosynthetic genes PeF3H5, PeDFR1 and PeANS3 [46]. In Dendrobium hybrid orchids, DhMYB2 has been reported to interact with DhbHLH1 to regulate the expression of DhDFR and DhANS [47]. In addition, MiMYB1 has been shown to interact with MibHLH2 and MiWDR1 to activate the transcription of the anthocyanin biosynthetic genes MiF3ʹH, MiDFR, and MiANS in Matthiola incana flowers [48]. Some R2R3-MYB repressors have also been identified in plants in recent years, including PtrMYB182 and PtrMYB57 from poplar, PpMYB17-20 from peach, MdHB1, MdMYB16, and MdMYB15L from apple, NtMYB2 from Chinese narcissus, and CmMYB#7 from chrysanthemum [49][50][51][52][53][54][55]. In the present study, 26 R2R3-MYBs and 9 bHLHs were identified, and their expression profiles in the S2 stage were analyzed. The R2R3-MYBs were used to construct a phylogenetic tree with anthocyanin-related R2R3-MYBs from other plants. Notably, one R2R3-MYB (TRINITY_DN55156_c1_g2) exhibited 10.5-fold higher expression levels in 'Yanzhi Mi' petals than in 'Dayuanyangjin' petals in the S2 stage; one R2R3-MYB (TRINITY_ DN59015_c3_g2) that was significantly downregulated in 'Yanzhi Mi' petals in the S2 stage was found to be closely related to the anthocyanin-regulated gene Tca MYB112 in cacao [39]. Further research on these candidate transcription factors is needed to confirm our hypothesis.

Conclusions
In conclusion, a combination of analytical chemistry and transcriptome analyses was performed to elucidate the molecular basis underlying pink pigmentation in the flowers of the 'Yanzhi Mi' mutant and white pigmentation in the flowers of WT 'Dayuanyangjin' plants. Seven derivatives of cyanidin, pelargonidin and peonidin were deemed to be the main contributors to pink color formation. We identified potential candidate genes encoding key enzymes in the anthocyanin biosynthetic pathway, such as CHS, CHI, F3H, and F3′H, which are significantly differentially expressed in the pink flowers of 'Yanzhi Mi' and the white flowers of 'Dayuanyangjin' in the key stage of flower color formation, the S2 stage. These genes were the most likely candidates responsible for flower coloration in 'Yanzhi Mi'. In particular, two R2R3-MYB transcription factors that might be involved in pink flower coloration regulation in 'Yanzhi Mi' were also identified. The most likely cause of color variation in the flowers of 'Yanzhi Mi' was proposed and discussed.