Flesh Color Assessment and Carotenoid Content Variation During Watermelon Fruit Development of Five Genotypes
Watermelon flesh features at different developmental stages have been shown in Fig. 1, we further determined the color space values to confirm flesh color variations. At the 10 DAP stage, the fruits were small, white and hard fleshed, there was no significant difference on color space parameters among varieties at early stages (Additional file1: Table S1). At the 20 DAP stage, the flesh started to become soften with flesh colors ranging from white to pink and white to yellow, this is the key period of continuous and rapid accumulation of pigments. At the 34 DAP, the fruits were matured and characterized by fleshy water-storing, the flesh water is saturated and has the brightest colors except for white flesh cultivar. The significant differences of L*, a*, b*, and Chroma (C) can be observed in different fleshed fruits at 20 DAP and 34 DAP. The difference in flesh color appeared at the 20 DAP and become more evident at 34 DAP.
Genotype R, P, O, Y, and W represents the red, pink, orange, yellow, and white flesh color varieties, respectively; DAP, days after pollination. Scale bars = 5 cm
Flesh color in watermelon fruits is determined by carotenoids composition and content. Generally, the lycopene, β-carotene, violaxanthin are responsible for the red, orange, and yellow flesh colors, respectively [10]. The major carotenoids metabolic changes in three stages with different flesh colors were measured using liquid chromatography-mass spectrometry (Additional file2: Table S2). The PCA result of the time- course and color comparative analyses suggesting the reliable metabolic data (Additional file3: Fig. S1). At 10 DAP,There were no pigments or only trace phytoene could be detected in the young fruits. At 20 DAP, lycopene, β-carotene, and violaxanthin are the major pigments in red/pink fruits, orange fruits and yellow fruit, low level carotenoids were detected for the white flesh. At 34 DAP༌the fruit is ripe, the highest level of lycopene was noted in the red fleshed fruits, Orange flesh has the highest level of β-carotene, the highest violaxanthin content was noted in the yellow fleshed fruits. For the white fruits, only trace levels of violaxanthin, antheraxanthin and lutein were observed (Additional file2: Table S2, Fig. 1).
Taken together, the color space values, and carotenoid levels revealed substantial variations among red, pink, orange, yellow, and white genotypes. It is conceivable that the DEGs at the three stages among the 5 different flesh colored watermelons fruits may play important roles in determining flesh color formation.
Transcript Sequencing of Developing Watermelon Fruits with Different Flesh Colors
To explore the potential molecular mechanisms underlying the flesh coloration during the development of watermelon fruit among different genotypes, RNA-Sequencing analyses were conducted on fruit flesh to generate transcriptome profiles. Samples of fruit flesh at three critical stages (10 DAP, 20 DAP, and 34 DAP) were obtained from five genotypes (Fig. 1), all samples were analyzed in three independent biological replicates.
In total, 45 libraries were constructed and analyzed. After removing low quality reads, the average number of reads per library was over 51.9 million for 45 samples, with an average GC content of 44.15% (Additional file 4: Table S3). The RNA-Seq reads were aligned with the reference map of the watermelon (97103) genome (http://cucurbitgenomics.org/organism/1), using HISAT(version 2.0.4) [23]. More than 97% of the total clean reads had Phred-like quality scores at the Q20 level (Supplementary Table S3). Ultimately, 24,794 genes (including 1354 novel genes) were identified by Cufflinks v2.1.1 [24]. Via comparative transcriptome analysis, this high-quality RNA-Seq data provided a solid foundation for identifying key genes participating in carotenoid syntheses during watermelon development and ripening.
The numbers of transcripts identified in each sample, were expressed in FPKMs. Approximately 39.08% of expressed genes were in the 0–1 FPKM range, and 13.48% of expressed genes were above 60 FPKM, exhibited high expression level (Additional file 5: Table S4). Genes with normalized reads lower than 1 FPKM were removed from the subsequent analysis. These data showed sufficient coverage of the transcriptome during fruit development in five cultivars. The gene expression level among different experimental groups are compared in Additional file 6: Fig. S2a. The expression patterns among biologically repeated samples are highly consistent similarity (Additional file 6: Fig. S2b) and the correlation coefficient is close to 1(Additional file 7: Table S5), it indicates that the high quality of the replicates and reliable results of subsequent differential gene analysis.
Identification of Differentially Expressed Genes Among the Genotypes and Different Fruit Developmental Stages
To identify the genes correlating with the flesh color in different development stages, we conducted pairwise comparison at each developmental stage among five genotypes. The DEGs were screened with FDR < 0.05, |log2(FoldChange)| >1 as a threshold, the number and list of significantly differentially expressed gene (up-regulated and down-regulated) of each pairwise comparisons are shown in Additional file 8: Table S6 and Additional file 9–12: dataset 1–4, 16,781 genes were differentially expressed between at least one comparison.
At the early development stage (10 DAP), 5318 significantly differentially expressed genes were identified (Fig. 3a, Additional file 9: dataset 1). Specifically, 510,262༌588༌349 significantly differentially expressed genes were identified in red flesh variety compare to pink, orange, yellow, and white flesh variety. The numbers of other pairwise comparisons were listed in Additional file 8: Table S6. The fruit shape candidate gene ClFS1 (Cla011257) were differentially expressed in different fruits, consistent to the previous research [25]. Cla019403 encode Xyloglucan endotransglucosylase/hydrolase, is related to the growth of plant cell [26], and highly expressed at this stage (Additional file 13: Fig. S3a, Additional file 14: Table S7). An Auxin response factor(ARF, Cla009800), A Growth-regulating factor 5 (GRF, Cla006802), An Auxin-induced SAUR-like protein (Cla016617), were obviously higher expressed at early development stage (Additional file 13: Fig. S3a, Additional file 14: Table S7) and associated with the fruit development and expansion [27]. There are less DEGs at 10 DAPs compared to later stages, due to little phenotypic differences among varieties at this stage.
At the pigment accumulation stage (20 DAP), 11814 significantly differentially expressed genes were identified (Fig. 3b, Additional file 10: dataset 2). Specifically, 2498,4830༌3123༌4876 significantly differentially expressed genes were identified in red flesh variety compare to pink, orange, yellow, and white flesh variety. The numbers of other pairwise comparisons were listed in Supplementary Table 6. Geranylgeranyl pyrophosphate synthase (Cla015797), Phytoene synthase protein (Cla005425), Phytoene desaturase (Cla010898), Carotenoid isomerase (Cla017593), Lycopene cyclase (Cla016840), Violaxanthin de-epoxidase-related protein (Cla007759), 9-cis-epoxycarotenoid dioxygenase (Cla015245) are carotenoid biosynthesis genes and differentially expressed among materials(Additional file 13: Fig. S3b, Additional file 14:Table S7), directly control the flesh color formation of fruit. AP2-EREBP (Cla000701, Cla017389) and bHLH (Cla020193, Cla022119 ) were differentially expressed among materials(Additional file 13: Fig. S3b, Additional file 14:Table S7) and maybe the potential color regulator in watermelon [28, 29].
At the mature stage (34 DAP), 10779 significantly differentially expressed genes were identified (Fig. 3c, Additional file 11: dataset 3). Specifically, 2097,2572༌2429༌3316 significantly differentially expressed genes were identified in red flesh variety compared to pink, orange, yellow, and white flesh variety. The numbers of other pairwise comparisons were listed in Additional file 8: Table S6. Other comparison combinations are shown in the Additional file 9–11: dataset 1–3. Most of the carotenoids pathway genes and TFs are differentially expressed among genotypes. Geranylgeranyl reductase (Cla003139, Cla019109) Geranylgeranyl pyrophosphate synthase (Cla015797, Cla020121), Phytoene synthase protein (Cla005425, Cla009122), Phytoene desaturase (Cla010898), Carotenoid isomerase (Cla017593, Cla011810), Lycopene cyclase (Cla005011, Cla017416, Cla016840), 9-cis-epoxycarotenoid dioxygenase (Cla015245, Cla009779, Cla005404, Cla005453, Cla019578), Beta-carotene hydroxylase (Cla011420, Cla006149), Zeta-carotene desaturase (Cla003751), Zeaxanthin epoxidase (Cla020214), and many TFs including (AP2-EREBP, MADS, MYB, G2-like, NAC, AUX), were differentially expressed at 34 DAP (Additional file 13: Fig. S3c, Additional file 14: Table S7). Cla015245 and Cla005404 were highly expressed in white flesh compared to yellow flesh, which may lead to the degradation of xanthophyll colorless flesh (Additional file 13: Fig. S3c, Additional file 14: Table S7). In particular, Cla017389 and Cla015515 (AP2-ERFBP) are highly expressed in red and pink color fruit. Cla006599 and Cla022119 (bHLH) are decreased at 20 and 34DAP compared to 10 DAP, similar to the expression pattern of CpbHLH1/2 in papaya [30], these genes may be important regulator gene.
For each genotype, there are 6430, 9019, 9605, 9147, and 10881 developmental DEGs were obtained in red, pink, orange, yellow, and white fleshed watermelon fruit, respectively, during the fruit development (Fig. 3d, Additional file 12: dataset 4). These results indicate that many genes are involved in the regulation of fruit development. More genes are differentially expressed in the white color flesh variety, suggesting a more complicated regulatory network of gene expression in this variety. We also identified some genes related to fruit development using comparative transcriptome analysis. A Cytokinin dehydrogenase gene (Cla022463) that was highly expressed at 10 DAP (Additional file 14: Table S7), which may contribute to the early fruit development. The pyrabactin resistance 1-like protein (PYL8) can regulate the plant growth and stress responses by mediate ABA signaling in Arabidopsis [31]. Here, we found that the expression of four Abscisic acid receptor PYL8 (Cla004235, Cla004904, Cla015009, Cla021167) were significantly different in different watermelon flesh (Additional file 13: Fig. S3d, Additional file 14: Table S7). These genes involved in fruit development and ripening.
The global hierarchical clustering (Additional file 15: Fig. S4a) and principal component analysis (Additional file 15: Fig. S4b) were performed based on the FPKM values for all the DEGs. The results revealed that the 45 samples could be generally assigned into 3 main groups corresponding to development stages based on gene expression pattern, the samples from 10 DAP were distinctly clustered as one group, but the samples from 20 DAP and 34 DAP are relatively near, except for the samples at 34 DAP of red flesh (Additional file 15: Fig. S4a), suggesting that very complex regulatory networks in the middle and late stages of fruit development among different genotypes. In the PCA, three biological repeats for red color at 34 DAP are not together (Additional file 15: Fig. S4b), may be caused by a distinct regulatory mechanism in each fruit of one variety and that susceptible to environmental impact, but we used as valid data owing to the Pearson-correlation coefficient among the three repeats is up to 0.899 (Additional file 7: Table S5). These results indicate that the Differentially expressed genes between each pairwise comparison may be associated with the difference in watermelon fruit flesh coloration.
Clustering of DEGs into six Groups based on Gene Expression Patterns
Based on the pattern of gene expression, the 16,781 DEGs are grouped into 6 different subclusters using the h-cluster clustering method (Fig. 4a, Additional file 16: Table S8), the genes which grouped in the same cluster shared the similar expression pattern and have a similar function or participate in the same biological process. In the subcluster 1, 8,931 genes display a relatively stable expression levels across different stages and varieties. In the subclusters 2 and 3, gene expression exhibited a general downward trend according to the development stage. Genes in subclusters 4, 5, and 6 exhibited a general upward trend (Fig. 4a).
To functionally characterize the biological roles of these DEGs in the subcluster 1, GO enrichment analyses were performed. GO term analysis enriched almost the same proportion of genes into biological process, cellular component, and molecular function, GO terms related to various basic life activities, such as binding, cellular macromolecule metabolic process, intracellular, and cell (Additional file 17: Fig. S5a, Additional file 18: dataset 5). In addition, DEGs are enriched to the Spliceosome, RNA transport, and Ribosome biogenesis in eukaryotes pathways by KEGG analysis (Fig. 4b, Additional file 18: dataset 5). This result suggests that these DEGs are involved in basic growth and development of fruits, are to maintain the basic growth of plants.
3282 genes have a slightly higher expression at 20 DAP and 34DAP in the subcluster 2 (Fig. 4a). GO classification showed that GO terms such as single-organism metabolic process, small molecular metabolic process, and organonitrogen compound metabolic process were enriched (Additional file 17: Fig. S5b, Additional file 18: dataset 5). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis showed that the most significantly enriched pathways are proteasome, biosynthesis of amino acids, carbon metabolism pathway, TCA-cycle, Glycolysis/ Gluconeogenesis proteasome and pyruvate metabolism pathway (Fig. 4b, Additional file 18: dataset 5). The proteasome pathway containing 26S protease regulatory subunit genes and proteasome subunit type genes. The pyruvate metabolism pathway including Acetyl-CoA carboxylase biotin carboxylase, Pyruvate kinase, malate dehydrogenase, and others.
Subcluster 3 represents the genes with high expression at 34 DAP, the change ranges were more marked than subcluster1 (Fig. 4a). 754 DEGs in this cluster mainly allocated into molecular function and biological process according to GO term analysis, with 310 and 287 DEGs were classified into the metabolic process and catalytic activity (Additional file 17: Fig. S5c, Additional file 18: dataset 5). Notably, the DEGs were significantly involved in pathways associated with photosynthesis-antenna proteins biosynthesis, photosynthesis, protein processing in endoplasmic reticulum and biosynthesis of unsaturated fatty acids (Fig. 4b, Additional file 18: dataset 5). Cla006149 (Beta-carotene hydroxylase), Cla011420 (Beta-carotene hydroxylase), Cla009122 (Phytoene synthase), and Cla009779 (9-cis-epoxycarotenoid dioxygenase) are related to carotenoid pathway, these two genes showed increased expression at 20DAP and 34DAP (Additional file 13: Figs. S3b and S3c, Additional file 14:Table S7), indicating that accumulation of pigments occurs at this stage. In addition, some MYBs, AP2-ERFBPs, bHLHs, NACs, and WRKYs are also in subcluster 3 (Additional file 18: dataset 5), they are important regulators in fruit development and ripening. Two MYB (Cla018631 and Cla006739) and two WRKY (Cla002243 and Cla002084) are highly expressed in yellow and pink color flesh (Additional file 13: Fig. S3e, Additional file 14:Table S7), respectively, may be involved in the regulation of yellow and pink color formation.
There are 2274, 1174, and 366 genes in the subclusters 4, 5, and 6, respectively, are highly expressed at 10 DAP and decreased to a low gene expression levels at the later stages, but the changing extents were different (Fig. 4a). Go enrichment analysis of subcluster 4 indicated that biological processes were most enriched. (Additional file 17: Fig. S5d, Additional file 18: dataset 5). KEGG analysis showed that genes significantly involved in the pathway of plant hormone signal transduction, such as signal transduction histidine kinase (Cla000685, Cla005808), auxin responsive protein (Cla003635), Ein3-binding f-box protein (Cla020970), and so on (Fig. 4b, Additional file 18: dataset 5). which are important for fruit early development. GO enrichment of subcluster 5 genes assigned to the GO terms of biological process and molecular function with no significance, such as protein phosphorylation, protein kinase activity (Additional file 17: Fig. S5e, Additional file 18: dataset 5). Genes are enriched into KEGG plant hormone signal transduction, Alanine, aspartate and glutamate metabolism, and others (Fig. 4b, Additional file 18: dataset 5). These genes are associated to the fruit growth. Subcluster 6, the enriched Go term were predominantly related to molecular function and biological process, such as “enzyme inhibitor activity” and “endopeptidase regulator activity” (Additional file 17: Fig. S5f, Additional file 18: dataset 5). KEGG enrichment analysis mostly related to the pathway of plant hormone signal transduction pathway, phenylpropanoid biosynthesis, and Phenylalanine metabolism (Fig. 4b, Additional file 18: dataset 5). Cla019806, Cla0004102, Cla002975, and Cla016617 were involved in hormone synthesis, responsible for fruit enlargement and highly expressed during early stages of fruit development as compared to later fruit developmental stages (Additional file 13: Fig. S3f, Additional file 18: dataset 5, Additional file 14:Table S 7).
Genotype R, P, O, Y, and W represents the red, pink, orange, yellow, and white fleshed cultivar, respectively; the “_1, _2, _4” represent the sample group at 10, 20, 34 DAP, respectively.
Co-expression network analysis identified carotenoid-related DEG s
To identify the potential genes (structural genes and putative transcription factors (TFs) that are highly associated with different kinds of carotenoids accumulation, the carotenoids content in each sample were used as a phenotypic data and 16,781 DEGs were used to perform the weighted gene co-expression network analysis (WGCNA).
A sample dendrogram and trait heatmap was constructed to illustrate the expression of each phenotypic parameter at different developmental stages (Additional file 19: Fig. S6a). The best parameter value determination for module construction is 7.7 for this dataset (Additional file 19: Fig. S6b). A total of 40 distinct co-expression modules were formed according to the pairwise correlations of gene expression across all samples and co-expression patterns of individual genes, as shown in the cluster dendrogram (Additional file 20: Fig. S7a), the modules labeled with different colors. Moreover, a network heatmap of all the DEGs in gene-modules was also drawn to exhibit the correlation between modules (Additional file 20: Fig. S7b). Notably, 6 coexpression modules (indicated with red underlines) have a high positive correlation with most of the carotenoids (Fig. 5a), indicating that the genes in these modules play an important role in the carotenoid accumulation.
‘Yellow’ module contains 846 Genes (including 34 TFs)(Additional file 21: Table S9) exhibited a stronger positive relationship with Zeaxanthin (correlation coefficient, cc = 0.91), Neoxanthin(cc = 0.91), Antheraxanthin (cc = 0.84), Violaxanthin (cc = 0.79), Phytofluene (cc = 0.81), apocarotenal (cc = 0.76), β-cryptoxanthin (cc = 0.68), lutein (cc = 0.69), and α-carotene (cc = 0.62) (Fig. 5a). In this module, a set of genes related to cellular metabolic process, and involved in pyruvate metabolism and proteasome pathway (Additional file 22: Fig. S8). Pyruvate is an important mediator of carbohydrate, fat and protein metabolism, and participates in several important metabolic functions in vivo. The proteasome is related to the regulation of carotenoid content in tomato [32]. This module containing genes that highly differentially expressed in the yellow/orange sample (Additional file 23: Fig. S9a), maybe are positive factors involved in yellow pigment accumulation. According to gene function annotation, Cla005011 is lycopene beta-cyclase in watermelon [19]. Cla003751 encodes a Zeta-carotene desaturase, Cla020214 encode s zeaxanthin epoxidase, they are involved in the carotenoid pathway (Additional file 13: Fig. S3c, Additional file 14 and 21: Table S7 and S9). The Cla014416 (Plastid-lipid-associated protein, ClPAP) is homologous to SlPAP(NCBI Reference Sequence: NP_001234183.1) that affect carotenoid content in tomato [33]. The expression level of Cla014416 significantly higher in yellow and orange flesh color materials in this study (Additional file 13: Fig. S3g, Additional file 14 and 21: Table S7 and S9), different from the previous report that ClPAP highly expressed in red and orange color lines [34]. Cla000655 encodes a cytochrome P450, is related to the protein lutein deficient 5 (CYP97A3) in Arabidopsis thaliana [35] and highly expressed in yellow and orange color fruits (Additional file 13: Fig. S3g, Additional file 14 and 21: Table S7 and S9). Cla010839 is homologous to 15-cis-zeta-carotenoid isomerase in Arabidopsis thaliana [36]. Cla018347 encodes a cytochrome P450, is related to carotenoid epsilon-monooxygenase (CYP97C1) in Arabidopsis thaliana [37] (Additional file 13: Fig. S3g, Additional file 14 and 21: Table S7 and S9). The MYB transcription factor can regulate the carotenoid content in Mimulus lewisii flowers [38], Cla013280 and Cla010722 belongs to MYB family and highly expressed in yellow color fruit (Additional file 13: Fig. S3g, Additional file 14 and 21: Table S7 and S9). The hub genes linked to this module were further analyzed using Cytoscape cytoHubba (Fig. 5b) and ATP synthase protein I, (Cla013542), Cysteine desulfuration protein SufE (Cla003340), Membrane protein (Cla003760), and others are important candidate gene for yellow color formation (Additional file 24: Fig. S10, Additional file 21 and 25: Table S9 and S10).
The ‘darkred’ module containing 111 genes were highly associated with the content of Antheraxanthin and Violaxanthin, having a correlation coefficient of 0.76 and 0.71 respectively. Heatmaps (Additional file 23: Fig. S9b) show that the ‘darkred’ module-specific genes were over-represented in samples (yellow, orange, white) be rich in Antheraxanthin and Violaxanthin. Cla004704 encodes a Photosystem II oxygen evolving complex protein PsbP, Cla005429 encode an Oxygen-evolving enhancer protein 2, chloroplastic, PsbP. Cla004746 encodes a Chlorophyll a-b binding protein 6A (Additional file 13: Fig. S3g, Additional file 14 and 21: Table S7 and S9). Cla021635 encodes a Photosystem I reaction center subunit II, rank as the top hub gene in this module (Fig. 5c, Additional file 24: Fig. S10, Additional file 25: Table S10). Many genes in this module are also related to chloroplast or photosystem I, II (Additional file 14: Table S7).
The “mediumpurple 3” module, with 32 identified genes, was highly correlated to α-carotene, Violaxanthin, Neoxanthin, lutein, and Zeaxanthin with correlation coefficient of 0.55, 0.70, 0.72, 0.66, and 0.73 respectively (Fig. 5a). The heatmap analysis were shown in (Additional file 23: Fig. S9c). Cla005637, Cla017046, and Cla011297 were identified as candidate hub genes for this module (Additional file 24: Fig. S10, Additional file 25: Table S10). The black, and steelblue module were specific to the lycopene (cc = 0.57), and γ-Carotene content (cc = 0.55), respectively (Fig. 5a). The hub genes were listed in Additional file 25: Table S10, and expression level are showed in Supplementary Fig. 10.
By WGCNA, we found that most carotenoid pathway genes are included in the yellow module, which was highly correlated with the yellow and orange color samples (Additional file 23: Fig. S9a). Through network analysis, we identified hub genes for watermelon flesh carotenoid content (Additional file 25: Table S10).
Validation of the Expression of Key DEGs by qRT-PCR
The qRT-PCR analysis was used to validate the quality of RNA-Seq data. Five genes that previously reported and involved in the regulation of carotenoid synthesis during watermelon fruit development and seven key genes selected in this study were used for qRT-PCR analysis. Here, we found the strong correlation between the RNA-Seq and qRT-PCR data, indicating the reliability of our transcriptomic profiling data (Additional file 26: Fig. S11), and the selected gene from this study can be used as candidate genes for further research on watermelon fruit flesh color formation.