Flavonoids in the pulp of B. ramiflora.
Flavonoids and phenylpropanoids are precursors of anthocyanin synthesis. Flavonoids have a variety of stress-protective effects on plants and health-promoting effects on the human body29. A branch of the flavonoid biosynthetic pathway is involved in the production and regulation of anthocyanins, procyanidins, and flavonols29. The metabolomic analysis of the fruit pulp of B. ramiflora identified 35 flavonoids (Table S2); no carotene was identified. The main flavonoids were rhusflavanone, procyanidin B1, and (+)-catechin. Nine metabolites of the flavonoid and anthocyanin biosynthetic pathways were identified in the fruit pulp of B. ramiflora: naringenin chalcone, naringenin, eriodictyol, taxifolin/dihydroquercetin, kaempferol, quercetin, (+)-catechin, (-)-epicatechin, and procyanidin B1.
Dynamic changes in anthocyanin pathway metabolites during LR and BR pulp development.
The changes in the nine flavonoids of the anthocyanin synthesis pathway across the developmental stages are shown in Fig. 2. Overall, their contents declined over time. Compared with BR, the pathway metabolites in LR declined more rapidly from 52 DAF to 73 DAF, which may be associated with the more abundant anthocyanins in the pulp of BR. These substances accumulated mainly in the first two stages, indicating that the anthocyanins in the fruit pulp of B. ramiflora were mainly synthesized in the early stage.
Differential metabolites in LR and BR pulp during development stages.
Naringenin chalcone showed a significant decrease only in LR4 vs. LR3 (p = 0.028, FC -2.437, VIP 1.045). Dihydroquercetin significantly decreased in BR5 vs. BR4 (p = 0.020, FC -1.338, VIP 1.209), LR3 vs. BR3 (p = 0.011, FC -1.346, VIP 1.282), and LR4 vs. BR4 (p = 0.006, FC -2.070, VIP 1.661). Dihydroquercetin is an essential precursor for the synthesis of cyanidin. We speculate that the higher dihydroquercetin content in BR relative to LR was an important reason for the pink colour of the pulp. Kaempferol significantly decreased in BR5 vs. BR4 (p = 0.021, FC -1.339, VIP 1.256), LR3 vs. BR3 (p = 0.011, FC -1.378, VIP 1.335), and LR4 vs. BR4 (p = 0.004, FC -2.070, VIP 1.661). Procyanidin B1 showed a decreasing trend during the entire fruit development process, and we speculate that the gradual decrease in the bitterness and astringency of the fruit pulp was related to this decline. (+)-Catechin is an upstream substance for the synthesis of procyanidin B1, and it also exhibited a decreasing trend during development, which is consistent with the decrease in procyanidin B1. However, (+)-catechin showed a significant decrease only in LR4 vs. LR3 (p = 0.017, FC -2.140, VIP 1.002). The gradual decrease in procyanidin B1 and (+)-catechin may be related to the synthesis of anthocyanidins or to reduced synthesis of upstream substrates.
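The screening logic behind these comparisons can be sketched as a simple filter over the reported statistics. The cutoffs used here (p < 0.05 and VIP > 1) are assumptions chosen because they are common in metabolomics and are consistent with every value reported above; the text does not state the thresholds actually applied. The record type and function names are hypothetical.

```python
from typing import NamedTuple


class Comparison(NamedTuple):
    metabolite: str
    contrast: str
    p_value: float
    fold_change: float  # signed value, copied as reported in the text
    vip: float


# Values transcribed from the paragraph above.
REPORTED = [
    Comparison("naringenin chalcone", "LR4 vs. LR3", 0.028, -2.437, 1.045),
    Comparison("dihydroquercetin", "BR5 vs. BR4", 0.020, -1.338, 1.209),
    Comparison("dihydroquercetin", "LR3 vs. BR3", 0.011, -1.346, 1.282),
    Comparison("dihydroquercetin", "LR4 vs. BR4", 0.006, -2.070, 1.661),
    Comparison("kaempferol", "BR5 vs. BR4", 0.021, -1.339, 1.256),
    Comparison("kaempferol", "LR3 vs. BR3", 0.011, -1.378, 1.335),
    Comparison("kaempferol", "LR4 vs. BR4", 0.004, -2.070, 1.661),
    Comparison("(+)-catechin", "LR4 vs. LR3", 0.017, -2.140, 1.002),
]


def is_differential(c, p_cutoff=0.05, vip_cutoff=1.0):
    """Assumed screening rule: significant p-value and VIP above the cutoff."""
    return c.p_value < p_cutoff and c.vip > vip_cutoff


def direction(c):
    """A negative signed fold change is read as a decrease in the first group."""
    return "down" if c.fold_change < 0 else "up"


significant = [c for c in REPORTED if is_differential(c)]
for c in significant:
    print(f"{c.metabolite:22s} {c.contrast:12s} {direction(c)}")
```

Under these assumed cutoffs, every comparison listed in the text passes the filter and is classified as a decrease, which matches the description above.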
Transcriptome sequencing and evaluation.
RNA from the fruit pulp at different developmental stages was sequenced using PE150 on the Illumina NovaSeq platform. The filtered data are shown in Table S3. The number of clean reads after processing ranged from 22,698,981 to 27,405,036, and the base data after processing ranged from 6.80 Gb to 8.21 Gb, which met the requirements we set. The GC content was in the range of 43.65%–46.99%. Q20 was in the range of 96.86%–97.65%, and Q30 was between 91.56% and 92.98%, both greater than 88%, indicating that the RNA sequencing data in this study were of good quality and met our requirements. Alignment of the clean reads from the 30 RNA-seq samples to the B. ramiflora genome showed that the total number of reads in each sample was between 20,335,899 and 33,541,270, and the alignment rate reached 88.97%–92.79%, further indicating that the quality of the transcriptome sequencing data was satisfactory.
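For readers less familiar with the Q20 and Q30 metrics, the following minimal sketch shows how they relate to base-calling error. It relies on the standard Phred definition, Q = -10 log10(P), so Q20 corresponds to a 1% error probability and Q30 to 0.1%. The per-base quality list is hypothetical illustration data, the function names are assumptions, and the 88% floor is taken from the text above.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def fraction_at_or_above(qualities, threshold):
    """Fraction of bases whose Phred score is at least `threshold`."""
    if not qualities:
        raise ValueError("qualities must not be empty")
    return sum(1 for q in qualities if q >= threshold) / len(qualities)


def passes_qc(q20_pct, q30_pct, minimum_pct=88.0):
    """Check both percentages against the 88% floor mentioned in the text."""
    return q20_pct > minimum_pct and q30_pct > minimum_pct


# Hypothetical per-base Phred scores, for illustration only.
example_qualities = [38, 37, 25, 12, 36, 30, 19, 40]
q20_fraction = fraction_at_or_above(example_qualities, 20)
q30_fraction = fraction_at_or_above(example_qualities, 30)

# Lowest Q20 and Q30 percentages reported for the samples in this study.
worst_sample_ok = passes_qc(96.86, 91.56)
```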
The Pearson correlation coefficient is an important indicator of the reliability of gene expression experiments across samples and of the rationality of sample selection. Based on the fragments per kilobase per million (FPKM) expression values of all genes in each sample, the intragroup and intergroup correlation coefficients of the samples were calculated, and a heatmap was plotted to display the reproducibility of samples within each group and the differences between groups. The heatmap of the square of Pearson’s correlation coefficient (R2) of the 30 fruit pulp samples is shown in Fig. 3. The closer R2 is to 1, the more similar the expression patterns of the samples are. In general, biological replicates are required to have R2 > 0.8. The R2 between samples in this study was between 0.807 and 0.997. The three biological replicates therefore had good repeatability and met the requirement for the correlation coefficient.
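As a minimal sketch of the calculation described above, the squared Pearson correlation between every pair of samples can be computed from their per-gene expression vectors and compared with the R2 > 0.8 requirement. The sample names and FPKM values below are hypothetical, and no log transformation is applied because the text does not mention one; the pipeline actually used may differ.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def r_squared_matrix(samples):
    """Map each (sample_a, sample_b) pair to the square of Pearson's r."""
    names = list(samples)
    return {(a, b): pearson_r(samples[a], samples[b]) ** 2
            for a in names for b in names}


def replicates_pass(matrix, threshold=0.8):
    """True when every pairwise R2 exceeds the threshold."""
    return all(value > threshold for value in matrix.values())


# Hypothetical FPKM values for four genes in three replicates of one stage.
fpkm = {
    "LR1_rep1": [10.0, 0.5, 200.0, 35.0],
    "LR1_rep2": [11.0, 0.4, 190.0, 36.0],
    "LR1_rep3": [9.5, 0.6, 210.0, 33.0],
}
r2 = r_squared_matrix(fpkm)
print(replicates_pass(r2))
```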
PCA is often used to assess intergroup differences and intragroup sample reproducibility. PCA reduces the dimensionality of the gene expression data and calculates the principal components through linear algebra. Ideally, samples from the same group should cluster together. The PCA of the three biological replicates at each stage is shown in Fig. 4. The samples at the first developmental stage of LR and BR were closely clustered, indicating that the reproducibility of the samples was good; the samples at the second and third stages were more scattered, indicating that the samples at these two developmental stages differed considerably, although overall they still formed a single cluster; and the clustering of the samples from the last two developmental stages indicated that the biological reproducibility was good. The samples from the five stages were divided into three categories: 30 DAF was the young fruit stage, 52 DAF and 73 DAF were the growth stage, and 93 DAF and 112 DAF were the ripening stage. PCA indicated that the fruit started to ripen between 73 DAF and 93 DAF, consistent with the metabolomic analysis.
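The dimensionality reduction described above can be illustrated with a small, dependency-free sketch: the expression matrix is centred, its covariance matrix is computed, and the first principal component is found by power iteration. This is only one of several ways to compute PCA and is not necessarily the method used in this study. The expression values and group labels are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each gene's (column's) mean across samples (rows)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of the gene columns."""
    n = len(centered)
    p = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector of the covariance matrix by power iteration."""
    p = len(cov)
    v = [float(i + 1) for i in range(p)]  # non-uniform starting vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def project(centered, axis):
    """Score of each sample along the given axis."""
    return [sum(a * b for a, b in zip(row, axis)) for row in centered]


# Hypothetical expression of three genes in two groups of three replicates.
early = [[1.0, 1.0, 10.0], [1.2, 0.9, 10.5], [0.9, 1.1, 9.8]]
late = [[8.0, 7.0, 2.0], [8.3, 7.2, 1.8], [7.9, 6.8, 2.2]]

centered = center_columns(early + late)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)
early_scores, late_scores = scores[:3], scores[3:]
```

In this toy example the replicates of each group receive PC1 scores of the same sign, and the two groups fall on opposite sides of zero, which is the kind of separation a PCA plot is meant to reveal.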
Structural genes in the anthocyanin biosynthetic pathway in fruit pulp of B. ramiflora.
The biosynthetic pathways of flavonoids in plants are relatively well understood and relatively conserved among different species14,30. In this study, after the DEGs were obtained, the structural genes in the biosynthetic pathways of flavonoids and anthocyanins in the fruit pulp of B. ramiflora and the DEGs at different developmental stages were identified based on GO and KEGG enrichment analyses of the DEGs at different developmental stages (Table S1), and a pathway map containing the expression heatmap of the structural genes in the anthocyanin biosynthesis pathway in the fruit pulp was constructed (Fig. 5). A total of 43 structural genes related to the flavonoid biosynthetic pathway were found, of which 39 were differentially expressed (Table S4). The early structural genes of anthocyanin synthesis included PAL (ctg2456.g21765, ctg836.g07135), C4H (ctg733.g06345), C4L (ctg3279.g28008, ctg2580.g23157, ctg836.g07174, ctg3058.g26217 and ctg3058.g26218), CHS (ctg655.g05350, ctg3105.g26705), CHI (ctg965.g08335, ctg502.g04473), F3H (ctg1825.g16638, ctg1147.g09820), F3′5′H (ctg3090.g26553, ctg1305.g11051, ctg2135.g19559, ctg2839.g24678 and ctg17.g00099), F3′H (ctg2495.g22288, ctg287.g02877, ctg2135.g19557), DFR (ctg1578.g14126), and FLS (ctg1560.g13893, ctg2313.g20660, ctg2657.g23512). The structural genes in the middle and late stages of anthocyanin synthesis included LAR (ctg2548.g22735, ctg1760.g15871), LDOX (leucoanthocyanidin dioxygenase) (ctg438.g03741, ctg3056.g26143), UFGT/3GT (ctg2661.g23571, ctg1496.g13357, ctg1652.g14649, ctg1652.g14646, ctg1306.g11055, ctg1210.g10432, ctg2440.g21453, ctg2170.g19838, ctg3000.g25733, ctg2661.g23551, ctg2055.g18534), and ANR (ctg1329.g11603, ctg2699.g23958). In the transcriptome analysis, the ANS gene was not annotated, but two homologous LDOX sequences were annotated. Cyanidin is produced from leucoanthocyanidin by ANS, an iron-dependent 2-oxoglutarate dioxygenase also known as LDOX31. The ANS and LDOX genes therefore have the same biological function. In addition, the ANR gene, which acts in the procyanidin pathway, was identified as a DEG here, and its role is of great significance for further understanding the procyanidin pathway.
DEGs in the anthocyanin synthesis pathway.
Among the DEGs of LR1 vs. BR1, ctg2580.g23157 and ctg836.g07174 of C4L were downregulated; ctg965.g08335 of CHI was upregulated; ctg1825.g16638 of F3H was upregulated, whereas ctg1147.g09820 was downregulated; three F3′5′H genes (ctg1305.g11051, ctg2839.g24678 and ctg17.g00099) were downregulated; ctg287.g02877 of F3′H was upregulated; ctg1578.g14126 of DFR was upregulated; ctg1760.g15871 of LAR was upregulated; and ctg1306.g11055 and ctg1210.g10432 of UFGT/3GT were downregulated.
Among the DEGs of LR2 vs. BR2, ctg836.g07174 of C4L was downregulated; ctg1147.g09820 of F3H was downregulated; ctg2495.g22288 of F3′H was downregulated; and ctg2661.g23571, ctg1652.g14646, ctg1306.g11055, ctg2661.g23, and ctg2055.g18534 of UFGT/3GT were all downregulated.
Among the DEGs of LR3 vs. BR3, ctg836.g07174 of C4L was downregulated; ctg965.g08335 and ctg502.g04473 of CHI were both upregulated; ctg1305.g11051, ctg2839.g24678 and ctg17.g00099 of F3′5′H were all upregulated; ctg1578.g14126 of DFR was upregulated; ctg1560.g13893 of FLS was upregulated; and ctg1210.g10432 of UFGT/3GT was upregulated.
Among the DEGs of LR4 vs. BR4, only ctg1147.g09820 of F3H was downregulated.
Among the DEGs of LR5 vs. BR5, ctg2456.g21765 of PAL was downregulated; ctg2580.g23157 and ctg836.g07174 of C4L were downregulated; ctg1147.g09820 of F3H was downregulated; ctg1305.g11051 and ctg2135.g19559 of F3′5′H were upregulated; ctg2495.g22288 of F3′H was downregulated; and for UFGT/3GT, ctg1496.g13357 and ctg1652.g14649 were both upregulated, whereas ctg1306.g11055 was downregulated.
Transcription factors associated with the anthocyanin biosynthesis pathway.
Structural genes involved in anthocyanin biosynthesis are transcriptionally regulated by the MBW (MYB-bHLH-WD40) complex30,32. The transcription factors that were related to the regulation of flavonoid and anthocyanin biosynthetic pathways in this study included MYB (ctg100.g00957-0F, ctg1410.g12339-1F, ctg1554.g13859-0F, ctg1784.g16146-2F, ctg2470.g22007-0F, ctg2470.g22008-0F, ctg2470.g22009-0F, ctg842.g07305-0F, ctg911.g07979-0F, ctg1317.g11116-0F) and MYB-related (ctg1226.g10502-0F, ctg2353.g20962-0F). We speculate that these 12 MYB/MYB-related transcription factors are involved in the regulation of anthocyanin synthesis in the pulp of B. ramiflora.
qRT-PCR gene expression analysis.
To verify the transcriptome sequencing results for the B. ramiflora pulp, 8 DEGs related to anthocyanin synthesis pathways (CHS: ctg655.g05350; CHI: ctg965.g08335; F3′5′H: ctg2839.g24678 and ctg2135.g19559; UFGT: ctg1923.g16986; UFGT89B2: ctg2170.g19838; and UFGT88F3: ctg2661.g23571) were selected for qRT-PCR analysis. As shown in Fig. 6, the expression levels of the DEGs measured by qRT-PCR were similar to those in the RNA-seq data, indicating that the RNA-seq data were reliable. This supports the reliability of the subsequent DEG and enrichment analyses of the transcriptome data.