Carotene and xanthophyll accumulation in the petals of ‘V–01’ and its natural mutant ‘V–01M’
The marigold inbred line ‘V–01’ (orange flowers) and its natural mutant ‘V–01M’ (yellow flowers) had very similar botanical characteristics, except for their petal color (Fig. 1A). Marigold flower development can be divided into four stages (Fig. 1B); in the first stage, the ligulate flowers are tightly packed and green (Stage I), after which the outermost ligulate flowers begin to expand (Stage II). Next, the ligulate flowers elongate, with pigmentation starting to appear from the outermost layer (Stage III), and finally, all of the ligulate flowers expand and spread evenly to form the marigold inflorescence (Stage IV). We found that the orange variety ‘V–01’ and its natural mutant ‘V–01M’ had visible color differences starting from Stage III, resulting in completely different flower colors at Stage IV.
To analyze the differences in color at the biochemical level, we assessed the accumulation of carotenoids in the Stage-IV ligulate flowers of ‘V–01’ and ‘V–01M’ using HPLC and mass spectrometry. A total of nine carotenoids were detected, which could be divided into two subgroups: carotenes (orange pigments) and xanthophylls (yellow pigments). The carotenes included α-carotene, β-carotene, lycopene, and capsanthin, while the xanthophylls included lutein, violaxanthin, zeaxanthin, neoxanthin, and antheraxanthin. The xanthophylls were significantly more abundant in the yellow mutant ‘V–01M’ than in the orange line ‘V–01’ (Fig. 1C and D). In ‘V–01M’, the contents of all five xanthophylls were significantly higher than in ‘V–01’, especially those of lutein and zeaxanthin (Fig. 1D). In contrast, no significant differences in carotene contents were detected between ‘V–01’ and ‘V–01M’ (Fig. 1C). These results suggested that the higher accumulation of yellow pigments (xanthophylls) in the ‘V–01M’ mutant likely resulted in its yellow petal color. This accumulation of xanthophylls could be caused by the promotion of the biosynthesis pathway or the repression of the degradation pathway.
Illumina sequencing, de novo assembly, and functional annotation
To elucidate the mechanism of flower color biosynthesis and carotenoid metabolism in marigold, we performed de novo transcriptome sequencing of the orange and yellow varieties. A total of 24 RNA libraries were constructed from the flowers of the two lines (‘V–01’ and ‘V–01M’) at the four developmental stages, with three biological replicates for each stage (Supplementary Table S1). The raw sequencing data were filtered to remove low-quality reads that could affect the data quality and subsequent analysis. We obtained 132.954 Gb of clean reads, which were used to assemble a de novo transcriptome using Trinity. The assembly yielded 65,015 transcripts with an average length of 1,130 bp, a GC content of 39.75%, and an N50 of 1,635 bp. These 65,015 transcripts belonged to 49,217 unigenes, which had an average length of 1,015 bp, a GC content of 40.1%, and an N50 of 1,501 bp (Table 1). The size distributions of the transcripts and unigenes are given in Figure 2, with 43.53% of all transcripts and 37.28% of all unigenes being longer than 1 kb.
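For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is commonly computed from a list of sequence lengths. The function name `n50` and the example lengths are illustrative assumptions, not part of the pipeline used in this study; Trinity reports this statistic through its own utilities.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first length where the running sum reaches half the total.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not data from this study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because N50 is weighted by bases rather than by sequence count, it usually exceeds the mean length, as is the case for both the transcripts and the unigenes here.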
Gene function was annotated based on the homology of the unigenes to sequences listed in the following databases: Swiss_Prot, TrEMBL, NR, Pfam, KOG, GO, and KEGG. Putative homologs were identified for 33,810 of the unigenes (68.70%) in the NR database, while 33,646 (68.40%), 27,502 (55.90%), 28,629 (58.20%), 22,558 (45.8%), 27,020 (54.90%), and 9,790 (19.90%) unigenes showed significant similarity to sequences in the TrEMBL, Pfam, KOG, Swiss_Prot, GO, and KEGG databases, respectively (Table 2). Among the 33,810 unigenes with a match in the NR database, 8.8% were most similar to sequences from grape (Vitis vinifera), followed by sesame (Sesamum indicum; 7%), robusta coffee (Coffea canephora; 6.2%), and wild tobacco (Nicotiana tomentosiformis; 4.3%) (Fig. 3A). The predicted functions and gene classifications of the marigold unigenes were determined using the KOG and GO databases. A total of 1,779 unigenes were annotated as ‘signal transduction mechanisms’ based on the KOG database, and the most common category was ‘general function prediction only’ (3,200 unigenes) (Fig. 3B). Furthermore, the unigenes were annotated with GO terms, with the most common biological process categories being ‘metabolic process’ and ‘cellular process’ (Fig. 3C).
Expression dynamics of the DEGs during flower development
Cufflinks software was used to identify DEGs between the four developmental stages in both marigold genotypes. The FPKM values were used to estimate the gene expression levels, and volcano plots were constructed to describe the distribution of all DEGs identified in the library comparisons (Fig. 4). These results indicated that, in both ‘V–01’ and ‘V–01M’, the most dramatic changes in gene expression occurred between Stages I and III, as well as between Stages I and IV. This suggested that a large number of genes are significantly differentially expressed throughout flower development. Furthermore, the fold changes in the expression of the DEGs between Stages II and III were greater than those of the DEGs between Stages I and II in both ‘V–01’ and ‘V–01M’, suggesting that a more dramatic change in gene expression occurred between Stages II and III than between Stages I and II.
Similarly, significantly fewer DEGs (1,513) were identified in V–01_I_VS_V–01_II (the comparison between Stages I and II in ‘V–01’) than in V–01_II_VS_V–01_III (4,891) and V–01_III_VS_V–01_IV (7,488), and comparisons of these stages in the mutant plant ‘V–01M’ also followed a similar DEG pattern (Fig. 5; Supplemental Table S2). This suggested that Stages II and III are the key phases of flower development with the most dramatic changes in gene expression. This is consistent with the observed accumulation of carotenoids and the color changes in the marigold flowers beginning in Stage III (Fig. 1B). The coloring of the ligulate marigold flowers began at Stage III, and the ‘V–01’ (orange) and ‘V–01M’ (yellow) began to visibly differentiate during this stage.
The most dramatic changes in gene expression, both in fold change (Fig. 4) and in the number of DEGs (Fig. 5), occurred between flower development Stages II and III. We therefore performed a detailed comparative analysis of the DEGs between Stages II and III (V–01_II_VS_V–01_III and V–01M_II_VS_V–01M_III), during which the ligulate flowers transitioned into the critical period of color formation. In total, 4,891 DEGs (2,369 upregulated and 2,522 downregulated) were identified in the V–01_II_VS_V–01_III comparison and 4,189 DEGs (2,453 upregulated and 1,736 downregulated) were identified in the V–01M_II_VS_V–01M_III comparison (Fig. 5; Supplemental Table S2). The similar numbers of DEGs in these two comparisons likely reflect the similar genetic backgrounds of the two lines.
We further used a KEGG analysis for the functional classification and pathway assignment of the DEGs between Stages II and III in both ‘V–01’ and ‘V–01M’. For V–01_II_VS_V–01_III, a total of 904 upregulated DEGs and 609 downregulated DEGs were grouped into the KEGG pathways. Similarly, 778 upregulated DEGs and 519 downregulated DEGs from the V–01M_II_VS_V–01M_III comparison were grouped into the KEGG pathways. The most significantly enriched pathways associated with the DEGs were “metabolic pathways” and “biosynthesis of secondary metabolites” in both ‘V–01’ and ‘V–01M’ (Fig. 6), suggesting a potentially important role for secondary metabolites in floral development.
Identification of genes involved in carotenoid biosynthesis and degradation
Carotenoids are major pigments in marigold flowers, and the differing carotene (orange pigments) and xanthophyll (yellow pigments) contents in different genotypes largely contribute to the diversity of their flower colors. According to the KOG classification, about 2.2% of transcripts in the marigold flowers were assigned to the secondary metabolite biosynthesis category. Many of these encoded enzymes known to catalyze the biosynthesis of various carotenoids, including α-carotene, β-carotene, lycopene, capsanthin, lutein, violaxanthin, zeaxanthin, neoxanthin, and antheraxanthin.
To elucidate the genetic regulatory mechanism of marigold pigment accumulation, the genes involved in the carotenoid biosynthesis pathway were identified according to the annotations of the transcriptome data. The first step in carotenoid biosynthesis is the conversion of colorless phytoene to red lycopene. Genes encoding the enzymes involved in this step were identified in the marigold transcriptomes, including PDS (TR15738), Z-ISO (TR8655), ZDS (TR6555), and CRTISO (TR5914). Lycopene is then converted into carotene by the cyclases LCY-B and LCY-E, the genes for which were also identified in the marigold transcriptomes (LCY-B (TR13418) and LCY-E (TR11756)). Carotene can be further converted into a number of xanthophylls, including lutein, zeaxanthin, violaxanthin, and neoxanthin, a process that involves the genes HYD-B (TR20167) and HYD-E (TR27505). Finally, in addition to the biosynthesis genes, three CCD genes (TR9765, TR16287, and TR24544) and four NCED genes (TR2330, TR3442, TR4240, and TR22914), all of which encode enzymes that catalyze the degradation of carotenoids, were identified in the transcriptome data (Table 3). The enzymes encoded by these genes are likely involved in the biosynthesis and degradation of carotenoids. Changes in their expression may therefore be vital for the final color of the marigold flowers (Fig. 7).
Transcriptome dynamics of genes involved in carotenoid biosynthesis
During the four stages of flower development, most of the genes involved in carotenoid biosynthesis and degradation were differentially regulated. Four genes encoding enzymes that catalyze the conversion of phytoene into lycopene, PDS (TR15738), Z-ISO (TR8655), ZDS (TR6555), and CRTISO (TR5914), were significantly upregulated in Stages III and IV (Fig. 7). Similarly, LCY-B (TR13418), LCY-E (TR11756), HYD-B (TR20167), and HYD-E (TR27505) were upregulated in Stages III and IV. The upregulation of the carotenoid biosynthesis genes probably reflects the pigmentation of the flowers at developmental Stages III and IV, during petal expansion. ‘V–01’ and ‘V–01M’ showed similar expression patterns for these genes.
Carotenoid degradation genes are differentially expressed between ‘V–01’ and ‘V–01M’
To identify the genes responsible for the different accumulation patterns of carotenoids in ‘V–01’ and ‘V–01M’, we compared the expression patterns of the carotenoid biosynthesis genes between the two lines. The genes involved in the conversion of phytoene to lycopene, including PDS (TR15738), Z-ISO (TR8655), ZDS (TR6555), and CRTISO (TR5914), were more highly expressed in ‘V–01’ than in ‘V–01M’ at developmental Stages III and IV (Fig. 7).
The ‘V–01M’ flowers accumulated significantly more xanthophylls than ‘V–01’ (more than a ten-fold difference); therefore, we also investigated their expression of HYD-B (TR20167) and HYD-E (TR27505), which are involved in the biosynthesis of a series of xanthophylls. Surprisingly, no significant difference was observed in the expression of either HYD-B or HYD-E between ‘V–01’ and ‘V–01M’ in any of the four developmental stages, suggesting that this is not the reason for the color differences observed in these lines.
We next examined the expression of the genes responsible for xanthophyll degradation, the most important of which encode the CCD enzymes. Among the three CCD genes identified in our transcriptome data, TR9765 was noticeably downregulated (8.96-fold decrease) in ‘V–01M’ compared with ‘V–01’ at developmental Stage III. Similarly, another CCD gene, TR16287, was expressed to a level 4.30 times lower in ‘V–01M’ than in ‘V–01’ at developmental Stage III (Fig. 7).
In addition to the CCDs, the degradation of zeaxanthin also involves the NCEDs. At developmental Stage IV, two of the four NCED genes identified in the marigold transcriptome were expressed at a dramatically lower level in ‘V–01M’ than in ‘V–01’; 22.62-fold and 12.35-fold decreases were observed in the expression of TR22914 and TR2330, respectively, in the mutant flowers (Fig. 7).
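The fold decreases reported above for the CCD and NCED genes compare expression in the mutant against the orange line. As an illustration only, a fold decrease and the corresponding log2 fold change could be derived from a pair of FPKM values as sketched below. The function names, the optional pseudocount, and the example FPKM values are assumptions made for this sketch; they are not the settings or data used by Cufflinks in this study.

```python
import math


def log2_fold_change(fpkm_mutant, fpkm_reference, pseudocount=0.0):
    """log2 of (mutant / reference); negative values mean lower expression in the mutant."""
    numerator = fpkm_mutant + pseudocount
    denominator = fpkm_reference + pseudocount
    if numerator <= 0 or denominator <= 0:
        raise ValueError("FPKM plus pseudocount must be positive in both samples")
    return math.log2(numerator / denominator)


def fold_decrease(fpkm_mutant, fpkm_reference, pseudocount=0.0):
    """Ratio reference / mutant, i.e. how many times lower the mutant expression is."""
    return 2.0 ** (-log2_fold_change(fpkm_mutant, fpkm_reference, pseudocount))


# Hypothetical FPKM values (not data from this study).
example_log2fc = log2_fold_change(fpkm_mutant=1.0, fpkm_reference=8.0)
example_decrease = fold_decrease(fpkm_mutant=1.0, fpkm_reference=8.0)
```

A pseudocount is sometimes added so that genes with zero FPKM in one sample do not produce undefined ratios; it is left at zero by default here so that the ratio is exact.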
The CCDs and NCEDs are the enzymes responsible for the degradation of xanthophylls; therefore, the low expression of CCDs and NCEDs in the ‘V–01M’ mutant likely resulted in its accumulation of xanthophylls and consequently the yellow color of its flowers.