Analyses of key gene networks controlling carotenoid metabolism in Xiangfen 1 banana

Background: Banana fruits are rich in various high-value metabolites and play a key role in the human diet. Of these components, carotenoids have attracted considerable attention due to their physiological roles and health-promoting functions. However, carotenoid accumulation patterns and genome-wide gene expression during banana fruit development have not been comprehensively evaluated.
Results: In the present study, an integrative analysis of metabolite and transcriptome profiles was performed in banana fruit at three developmental stages. A total of 11 carotenoid compounds were identified, and most of these compounds showed markedly higher abundances in mature green and/or mature fruit than in young fruit. These results were linked to the high expression of carotenoid synthesis and regulatory genes in the middle and late stages of fruit development. Co-expression network analysis revealed that 79 differentially expressed transcription factor genes may be responsible for the regulation of LCYB (lycopene β-cyclase), a key enzyme catalyzing the biosynthesis of α- and β-carotene.
Conclusions: Collectively, this study provides new insights into the dynamic changes in carotenoid content and gene expression levels during banana fruit development.


Background
Banana fruits play a key role in the human diet due to their desirable palatability and high nutritional value [1,2]. Bananas are rich in various metabolites, such as soluble sugars, vitamins, carotenoids, phenolics, and minerals [3]. Of these components, carotenoids represent a large and diverse class of biological compounds and fulfill many important physiological functions [4]. However, the mechanism underlying carotenoid biosynthesis in banana remains unclear. Oxidative cleavage of plant carotenoids produces a series of compounds named apocarotenoids, which include volatile compounds that contribute to the aroma of flowers, leaves, and fruits, as well as well-known phytohormones such as abscisic acid and strigolactones [5]. Carotenoids are typically tetraterpene (C40) molecules with multiple conjugated double bonds [6]. These bonds enable carotenoids to selectively absorb certain wavelengths of the visible light spectrum and thus give bright colors, such as yellow, orange, and red, to fruits, flowers, and vegetables [7,8]. Owing to this property, carotenoids have been used as dyes in various industrial applications. Furthermore, carotenoids serve as precursors for the biosynthesis of vitamin A and of many flavor-related compounds, which confer sensory attributes valued by consumers [9]. Carotenoids have also been used in the food, nutraceutical, and pharmacological industries due to their various beneficial effects on human and animal health [10].
Similar to other isoprenoids, carotenoids are synthesized via successive condensations of the five-carbon molecule isopentenyl diphosphate (IPP) and its isomer dimethylallyl diphosphate (DMAPP) [11]. Plants have two distinct routes for IPP and DMAPP biosynthesis: the cytosolic mevalonic acid pathway and the plastidial methylerythritol 4-phosphate pathway [12,13]. Geranylgeranyl pyrophosphate (GGPP) is formed from three IPP molecules and one DMAPP molecule in plastids. First, the colorless carotenoid phytoene is formed by the condensation of two molecules of GGPP. Then, phytoene is converted into red lycopene via a series of desaturation and isomerization reactions. Lycopene is subsequently converted, via various end-group modifications, into a large variety of carotenoids with different physical properties, such as α-carotene, β-carotene, zeaxanthin, and lutein [7,14]. In addition to the structural genes, some transcription factors have been reported to be involved in carotenoid synthesis by regulating the expression of carotenoid biosynthetic genes, such as MADS-box [15], SBP-box [16], NAC [17], AP2/ERF [18], MYB [19], HD-Zip [20], and NF-Y [21].
Integrative analysis of metabolome and transcriptome profiles is informative because the accumulation of metabolites is preceded by coordinated increases in the transcriptional level of relevant genes. Based on this correlation, the method has been widely applied to fig [22], asparagus [23], peach [24], Ginkgo biloba [25], kiwifruit [26], and other plants. Nevertheless, integrated investigations on carotenoid biosynthesis characteristics and regulators are relatively few. Xiangfen 1, a novel flavonoid-rich banana germplasm, was used in this study for dynamic metabolite and transcriptome analyses of banana pulp at three developmental stages to identify the accumulation patterns of carotenoids and their underlying regulation. An understanding of the dynamic changes in carotenoid content and gene expression levels during fruit development is essential for the breeding of special banana subgroups with high carotenoid contents.

Identification of differentially expressed genes (DEGs)
Using |log2 fold change| ≥ 1 and FDR ≤ 0.05 as the thresholds, a total of 4590 (1703 upregulated and 2887 downregulated), 14,149 (6207 upregulated and 7942 downregulated), and 15,991 (6782 upregulated and 9209 downregulated) differentially expressed genes (DEGs) were identified in the three comparison groups: young versus mature green, mature green versus mature, and young versus mature fruits, respectively. The majority of DEGs were downregulated during fruit development (Fig. 2A). The Venn diagram showed that 2703, 3737, and 12,195 DEGs were shared by pairs of comparison groups, and 2205 DEGs were common to all three comparison groups (Fig. 2B).
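The thresholding rule stated above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the `GeneResult` record, the function names, and the example values are hypothetical, and only the two cut-offs (|log2 fold change| ≥ 1, FDR ≤ 0.05) come from the text.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a differential expression table (hypothetical layout)."""
    gene_id: str
    log2_fc: float  # log2 fold change between two stages
    fdr: float      # false discovery rate (adjusted p-value)


def classify_deg(result, min_abs_log2fc=1.0, max_fdr=0.05):
    """Return 'up', 'down', or None when the gene fails either threshold."""
    if result.fdr > max_fdr or abs(result.log2_fc) < min_abs_log2fc:
        return None
    return "up" if result.log2_fc > 0 else "down"


def count_degs(results):
    """Count up- and downregulated DEGs in one comparison group."""
    counts = {"up": 0, "down": 0}
    for result in results:
        label = classify_deg(result)
        if label is not None:
            counts[label] += 1
    return counts


# Made-up example values, not data from the study.
example = [
    GeneResult("gene_A", 2.3, 0.001),   # passes both thresholds, upregulated
    GeneResult("gene_B", -1.0, 0.05),   # exactly on both thresholds, downregulated
    GeneResult("gene_C", 0.8, 0.0001),  # fold change too small
    GeneResult("gene_D", -3.1, 0.20),   # FDR too high
]
print(count_degs(example))  # {'up': 1, 'down': 1}
```

Both thresholds are inclusive here, matching the "≥" and "≤" in the text.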

Enrichment of GO terms and KEGG pathway analysis
Gene Ontology (GO) terms were assigned to the identified DEGs to evaluate gene expression during fruit development (Fig. 3A, B, C). GO analysis classified 18,839, 17,800, and 17,469 genes into the biological process, cellular component, and molecular function categories, respectively. Among the biological process categories, cellular and metabolic processes accounted for the highest proportion, followed by biological regulation, response to stimulus, and regulation of biological process. The most highly represented terms within the cellular component categories were cell, cell part, organelle, membrane, and membrane part. Meanwhile, the most highly represented terms in the molecular function categories included binding, catalytic activity, and transcription regulator activity.

Expression of genes related to carotenoid biosynthesis
Carotenoid concentration is one of the main features that give esthetic and nutritional value to banana fruit. Seven DEGs representing six genes were involved in carotenoid biosynthesis in banana in this study. The expression analysis of these DEGs is displayed in Fig. 4. The expression level of the two genes encoding CRTB gradually decreased with fruit development, whereas the expression of the genes encoding Z-ISO, LCYB, LCYE, and CRTZ gradually increased during fruit development. The gene encoding VDE demonstrated high expression levels in the young fruit and low expression levels in the mature green and mature fruits.

Transcription factors involved in carotenoid biosynthesis
Gene expression in plant carotenoid biosynthesis is strictly controlled by transcription factors. A total of 646 differentially expressed transcription factor genes were identified between the young and mature green fruits. Among these genes, 170 transcription factor genes were assigned to MADS-box (4 upregulated and 9 downregulated), SBP-box (0 upregulated and 13 downregulated), NAC (11 upregulated and 20 downregulated), AP2/ERF (16 upregulated and 29 downregulated), MYB (17 upregulated and 43 downregulated), and NF-Y (3 upregulated and 5 downregulated). Interestingly, most of the transcription factor genes demonstrated downregulation between the young and mature green fruits (Table 1).
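The family-level counts reported above can be tallied to check the stated total of 170 and the observation that most of these genes were downregulated. The snippet below is only an arithmetic sketch; the dictionary name is illustrative, and the numbers are copied from the text.

```python
# (upregulated, downregulated) counts per family, as reported in the text
# for the comparison between young and mature green fruits.
tf_family_counts = {
    "MADS-box": (4, 9),
    "SBP-box": (0, 13),
    "NAC": (11, 20),
    "AP2/ERF": (16, 29),
    "MYB": (17, 43),
    "NF-Y": (3, 5),
}

total_up = sum(up for up, _ in tf_family_counts.values())
total_down = sum(down for _, down in tf_family_counts.values())
total = total_up + total_down

print(total_up, total_down, total)  # 51 119 170
```

Of the 170 genes in these six families, 119 (70%) were downregulated.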

Co-expression network analysis of metabolites, genes, and transcription factors related to carotenoid biosynthesis
A correlation network was constructed combining 10 metabolites, 7 enzyme genes, and 108 transcription factors related to carotenoid biosynthesis. Only the correlation pairs with a Pearson correlation coefficient > 0.8 were included in this analysis (Fig. 5). The network visualized in Cytoscape showed that a total of 125 nodes were connected by 910 edges. Correlations were based on gene-to-gene FPKM values and gene-to-metabolite accumulation patterns. Lycopene β-cyclase (LCYB) is a key enzyme catalyzing the biosynthesis of α-carotene and β-carotene. In Fig. 5, 79 (15 upregulated and 64 downregulated) differentially expressed transcription factor genes were filtered by direct correlation with the gene encoding LCYB.
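The edge-selection step can be sketched as below, under stated assumptions. The profile names and values are made up for illustration, `pearson` and `build_edges` are hypothetical helper names, and the filter keeps pairs with r > 0.8 exactly as worded in the text; whether the study also retained strongly negative correlations is not stated.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def build_edges(profiles, threshold=0.8):
    """Return (node_a, node_b, r) for every pair whose r exceeds the threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r > threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical profiles across young, mature green, and mature fruit
# (FPKM for genes, relative content for the metabolite).
profiles = {
    "LCYB": [1.0, 4.0, 9.0],
    "TF_1": [2.0, 8.5, 18.0],
    "TF_2": [9.0, 5.0, 1.0],
    "beta_carotene": [0.1, 0.9, 2.2],
}

for a, b, r in build_edges(profiles):
    print(f"{a} -- {b}: r = {r:.3f}")
```

In this toy input, `TF_2` falls with development while the other profiles rise, so it forms no edge under a positive-only threshold. An edge list of this shape (source, target, weight) is a common format for network visualization tools.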

Validation of transcriptomic data by quantitative real-time PCR (qRT-PCR)
The expression levels of 23 DEGs (5 carotenoid biosynthetic pathway genes and 18 transcription factor genes) were analyzed in YF (young fruit), MGF (mature green fruit), and MF (mature fruit) using RT-qPCR to validate the key RNA-Seq results. The expression patterns of these genes were similar to the RNA-Seq results, with correlation coefficients (R²) > 0.91 (Fig. 6). These results confirmed the reliability of the RNA-Seq data, with RT-qPCR showing good consistency for both upregulated and downregulated genes.

Discussion
Carotenoids are widely distributed secondary metabolites that are not only crucial in plant physiology but also beneficial to human health as dietary components [27]. A total of 18 carotenoids were targeted by LC-MS/MS in the present study to investigate the accumulation pattern of carotenoids during the entire developmental period of the fruit. However, seven of these carotenoids remained undetected, either because their content in the samples was below the detection limit of the instrument or because they were absent from the samples. Previous studies revealed that α-carotene, β-carotene, and lutein displayed a dramatic increase during banana fruit development [28,29]. This finding is consistent with our results, in which most of the carotenoid compounds were undetectable or at considerably low levels in young fruits but markedly increased in mature green and/or mature fruits. Together, these results suggest that the synthesis of carotenoids mainly occurs in the middle and late stages of fruit development [28,29].
RNA sequencing of the samples at three critical developmental stages was performed to understand the genome-wide expression patterns during fruit development. The large number of DEGs across the samples revealed a stage-specific transcriptome profile during fruit development [30]. The GO analysis classified 18,839, 17,800, and 17,469 genes into the biological process, cellular component, and molecular function categories, respectively. These functional annotations demonstrated that the genes expressed in banana encode diverse metabolism-related proteins [23]. KEGG analysis revealed that DEGs were mainly involved in the biosynthesis of secondary metabolites, arachidonic acid metabolism, plant hormone signal transduction, and endocrine and other factor-regulated calcium reabsorption. This study focused on differential carotenoid accumulation during fruit development.
Carotenoid accumulation in plants is a complex process associated with the expression of genes involved in carotenoid biosynthesis, degradation, and storage [31]. Carotenoid biosynthesis was enriched in the comparison of young and mature green fruits. Seven DEGs involved in carotenoid biosynthesis were identified, suggesting that these genes may be responsible for the differential carotenoid accumulation during fruit development. A putative road map of carotenoid biosynthesis was also drawn. Notably, the expression of most of these DEGs gradually increased with fruit development, which is consistent with the carotenoid metabolic characteristics discussed above and with previous reports [28,32]. In the current study, the expression of the gene encoding Z-ISO gradually increased with fruit development, which is directly correlated with the accumulation of lycopene [28].
The expression of the genes encoding lycopene β-cyclase (LCYB), lycopene ε-cyclase (LCYE), and β-carotene hydroxylase gradually increased with fruit development, consistent with the high carotenoid contents at the middle and late stages of fruit development. Moreover, the expression level of the gene encoding violaxanthin de-epoxidase (VDE) gradually decreased with fruit development, which resulted in the low content of violaxanthin in mature green and mature fruits. These results suggest that the content of carotenoids is closely related to the expression of structural genes [33]. The transcriptional regulation of carotenoid biosynthetic genes is the first level of control and an important mechanism for carotenoid production in fruits [34]. Transcription factors are critical for regulating the expression of these biosynthetic genes. LCYB is crucial in directing the metabolic flux into either α-carotene in the β,ε-branch or β-carotene in the β,β-branch of the pathway [34-36]. In the present study, co-expression network analysis revealed that 79 differentially expressed transcription factor genes may be responsible for the regulation of LCYB. Functional analysis of these DEGs will contribute to the understanding of the molecular mechanism of carotenoid accumulation in bananas.

Conclusion
The mechanisms of carotenoid accumulation during banana fruit development were analyzed in this study using dynamic metabolite profiling, transcriptome sequencing, and qRT-PCR. A total of 11 carotenoid compounds were identified, and most of these compounds accumulated to high levels at the middle and late stages of fruit development. Furthermore, a series of carotenoid biosynthetic and regulatory genes were analyzed by RNA-seq and qRT-PCR. Collectively, these findings provide new information on the mechanisms of carotenoid accumulation during banana fruit development and a series of candidate genes with applications in the breeding of special banana subgroups with high carotenoid contents. Improving fruit quality by conventional breeding is difficult; however, molecular breeding based on gene editing technology might enable the targeted breeding of bananas with high carotenoid content.

Plant materials and treatment
The Xiangfen 1 banana plants used in this study were planted in an orchard at South Subtropical Crop Research Institute, Chinese Academy of Tropical Agricultural Science, Zhanjiang, Guangdong, China (21°27′N, 110°35′E). Xiangfen 1 banana fruit samples at three different developmental stages (45, 85, and 85 + 3 days after flower cut-off) were collected from the banana plantation. The fruits collected on these three days represented three typical stages of banana (young, mature green, and mature fruits, respectively). All flesh samples were immediately frozen in liquid nitrogen and stored at −80 °C until further use.

Sample preparation and extraction
Fresh plant materials were freeze-dried and stored at −80 °C until needed. All analyses were performed in triplicate. The dried plant materials were then homogenized and powdered in a mill. Dried powder (50 mg) was extracted with a mixed solution of n-hexane:acetone:ethanol, and an internal standard was added. The extract was vortexed for 20 min at room temperature. The supernatant was collected after centrifugation. The residue was re-extracted by repeating the steps above. Both supernatants were combined, evaporated to dryness under a nitrogen gas stream, and reconstituted in a mixed solution of methanol:MTBE. The solution was filtered through a 0.22 μm filter for further LC-MS analysis [37].

APCI-q trap-MS/MS
Mass spectrometry was performed on an API 6500 Q TRAP LC/MS/MS System equipped with an APCI Turbo Ion-Spray interface, operating in positive ion mode and controlled by Analyst 1.6.3 software (AB Sciex). The APCI source operation parameters were as follows: ion source, APCI+; source temperature, 350 °C; curtain gas (CUR), 25.0 psi; collision gas (CAD), medium. Declustering potential (DP) and collision energy (CE) were further optimized for individual MRM transitions. A specific set of MRM transitions was monitored for each period according to the carotenoids eluted within that period [39].

RT-qPCR validation
RT-qPCR was applied to investigate gene expression patterns. First-strand cDNA was generated from 1 μg total RNA isolated from the seven pericarp samples using the PrimeScript™ RT reagent kit (TaKaRa, Japan). RT-qPCR primers were designed using Primer Premier 5.0 software (Premier, Canada) and synthesized by Sangon Biotech (Shanghai, China) Co., Ltd. The relative expression levels of the genes were calculated using the 2^−ΔΔCt method.
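As a worked illustration of the 2^−ΔΔCt calculation, the sketch below computes the relative expression of a target gene in a sample against a calibrator. The function name and the Ct values are hypothetical; the text does not name the reference gene or the calibrator sample, so both are assumptions here. The method also assumes amplification efficiencies close to 100% for the target and reference genes.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression of a target gene by the 2^-ddCt method.

    Each argument is a (mean) Ct value. The 'ref' values belong to an
    internal reference gene; the calibrator is the sample that all other
    samples are expressed relative to (an assumption in this sketch).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Made-up Ct values: the target amplifies 3 cycles earlier in the sample
# than in the calibrator, with an unchanged reference gene.
fold_change = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,
    ct_target_calibrator=25.0, ct_ref_calibrator=18.0,
)
print(fold_change)  # 8.0
```

Here ΔΔCt = (22 − 18) − (25 − 18) = −3, so the target is 2^3 = 8 times more abundant in the sample than in the calibrator.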

Statistical analysis
To reduce the dimensionality of the data and simplify the transcriptome data, principal component analysis (PCA), a multivariate statistical analysis method, was used in this study. The differential metabolites and genes were annotated using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database (http://www.kegg.jp/kegg/pathway.html).