2.1 Flowering induction of pitaya by supplementary light treatment in the winter season
Our flowering induction experiment results demonstrate that sufficient supplementary light can induce flowering in pitaya plants if the minimum air temperature is ≥ 15 °C. In this experiment, NL represented the control group (no light, no flowering) (Fig. 1A), L0 represented no flowering in the light-treated group (Fig. 1B), L1 represented the flower bud stage in the light-treated group (Fig. 1C), and L2 represented one week after the bud stage in the light-treated group (Fig. 1D). Approximately 20–25 days after the buds appeared, the pitaya flowers were in full bloom at night (Fig. 1E), signifying the success of this flower induction system with respect to pitaya production in the winter season.
2.2 Sequencing, de novo assembly, and annotation
To elucidate the underlying molecular events involved in light-induced flowering, we sent pitaya plant samples from the NL, L0, L1, and L2 stages to the Gene Denovo Biotechnology Co. (Guangzhou, China) for de novo assembly of RNA-Seq data. More than 39 Gb of raw data reads were retrieved. The numbers of high-quality clean reads in different samples were showed as Supplementary Fig. S1. These findings indicate that the proportion of high-quality clean reads after filtration of each sample was relatively high, signifying that the sequencing quality was good. The de novo assembly resulted in a total of 60580077 bases, but only 68113 unigenes (not less than 200 nt) had an N50 of 1730 nt (Table 1).
The average length of our assembly was 889 nt, of which the maximum length of a unigene was 36165 nt, the minimum length was 201 nt, and the GC (guanine-cytosine) percentage was 41.96%. Supplementary S2 presents the size distribution of the unigenes. To search for homologous sequences, annotation analysis was performed on the unigenes using four public databases: KOG, KEGG, SwissProt, and NR. In total, 29959 unigenes (after removing duplicates) were found in at least one of these databases (Table 2). There were 29782 unigenes with functional annotations in the NR database, 20716 annotations in SwissProt, 18088 annotations in KOG, and 11059 annotations in KEGG. We also observed that 8618 unigenes were annotated by all four major databases (Fig. 2). Moreover, we found only 6220, 75, 33, and 32 unique annotations in the NR, SwissProt, KOG, and KEGG databases, respectively.
Table 1 Results of assembly
Gene number
|
GC percentage
|
N50
|
Max length
|
Min length
|
Average length
|
Total assembled bases
|
68113
|
41.96
|
1730
|
36165
|
201
|
889
|
60580077
|
Table 2 Results of annotations in four databases
Total unigenes
|
NR
|
SwissProt
|
KOG
|
KEGG
|
Genes with annotation
|
Genes without annotation
|
68113
|
29782
|
20716
|
18088
|
11059
|
29959
|
38164
|
2.3 Global analysis of differential expression profiling
The abundance of each gene was determined by counting reads per kilobase per million reads (RPKM) to infer the expression level. Therefore, this method could be used to directly compare the differences in gene expression among the samples. The correlation between the gene expression levels among the samples was a key criterion in determining whether the experiments were reliable and whether the samples chosen were suitable. If one sample had a high degree of similarity to another, the correlation between them would be very close to 1. We calculated the correlation values between samples based on the RPKM results. According to the standard recommended by the Encyclopedia of DNA Elements (ENCODE) project [15], the square of the correlation value should be ≥ 0.92 (under ideal experimental conditions and with suitable samples).
The heatmap of correlations for these samples is presented in Fig. 3. From the figure, we can see that the correlation between the three repetitions for the NL stage was above 0.98, and that between the three repetitions for the L0 stage was above 0.96. Similarly, the correlations between the three repetitions for the L1 and L2 stages were above 0.99 and 0.97, respectively, signifying that the levels of expression between all biological replicates were highly correlated. Conversely, we found a low level of correlation between the different samples, indicating that the differences between the four samples were significant.
In the present study, comparisons between different samples resulted in different numbers of DEGs (Fig 4). When NL and L0 were compared, we found only 51 up-regulated genes and 88 down-regulated genes, totaling 139 DEGs. A total of 12352 DEGs were obtained when we compared NL and L1, of which 8640 genes were up-regulated and 3712 genes were down-regulated. Thus, compared with the NL sample, the number of DEGs in the L1 sample was approximately nine times that in the L0 sample, of which the number of up-regulated genes in the L1 sample was 169 times that in the L0 sample and the number of down-regulated genes in the L1 sample was 41 times that in the L0 sample. Similar results were found in other comparisons of the groups.
Gene ontology (GO) functional enrichment analysis was applied in our subsequent experiment to annotate the expression patterns of the DEGs to the selected GO terms: molecular function, cellular component, and biological process. As a single gene often has multiple different functions, the same gene could appear under different terms, and each histogram could be statistically independent of any other.
In our study, 50 GO terms relating to molecular function, 30 GO terms relating to cellular component, and 158 GO terms relating to biological process were enriched in the L0 stage, compared with the NL stage, after light treatment (Supplementary file S3, S4, S5). Under the same light treatment, DEGs associated with the L1 stage were enriched by 399, 162, and 1042 GO terms relating to molecular function, cellular component, and biological process, respectively (Supplementary file S6, S7, S8). Moreover, findings regarding the GO analysis of other groups were similar to those described above. Most notably, we found that the main enriched terms of all groups were the same, regardless of ontologies. In particular, “cellular component” was mainly related to the cell part, membrane, and organelle; “molecular function” was mainly related to binding, catalytic activity, and transport activity; and “biological process” was mainly related to cellular, metabolic, and single-organism transport, suggesting that DEGs primarily enriched in those terms may be associated with the induction of pitaya blossoms.
To further investigate the gene expression profiles, we performed KEGG pathway analysis to determine some of the DEGs involved in important biochemical, metabolic, and signal transduction pathways during the light-induced flowering of pitaya plants. For example, two auxin response factors (unigene0028447 and unigene0002811) were up-regulated, two genes related to JA (unigene0030164 and unigene0033693) were up-regulated, and five genes related to starch and sucrose metabolism (unigene0052631, unigene0036015, unigene0040385, unigene0041472, and unigene0035017) were also up-regulated.
2.4 Energy-related genes in the process of flowering induction
During our investigation of the light-induced flowering of pitaya plants, we focused on the DEGs in the NL, L0, and L1 stages. Before a plant flowers, essential organic nutrients accumulate to accommodate the large consumption of nutrients during flowering. Soluble sugars and soluble proteins are important nutrients that are closely associated with plant flowering. Large increases in ATP concentration initiated by changes in environmental conditions, especially different forms of light treatment, have been observed [16].
Our analysis revealed that two genes, unigene0046195 (beta-glucosidase BoGH3B-like) and unigene0014135 (ATP synthase), both associated with the Krebs cycle of glucose metabolism, were up-regulated in the L0 stage compared with the NL stage. Moreover, the expression levels of eight genes (unigene0006689, unigene0031088, unigene0044754, unigene0009228, unigene0045801, unigene0047216, unigene0050048, and unigene0053090) involved in sucrose synthesis were all increased in the L1 stage, three of which were up-regulated in the L1 stage compared with the L0 stage. These findings related to energy metabolism demonstrate that large amounts of sucrose are needed as an energy supply during the flowering process.
2.5 Plant hormone and signal transduction related genes
Three types of phytohormones—auxins, JA, and brassinosteroids (BRs)—have been individually connected to floral timing, although they have not been extensively studied. These hormones can be transported as signal molecules in plants, producing signal transduction cascades that direct a series of metabolic activities.
Genetic evidence has revealed that the polar transport of auxins in Arabidopsis controls flower formation and differentiation [17]. Genes regulating floral organ development and gynoecium vascularization have been discovered, indicating the probable involvement of auxins in flower development. In the present study (Fig. 5A and Supplementary file S9), we found an auxin-related gene (unigene0029163) that was significantly down-regulated in L0 compared with NL. Twenty-three genes were significantly expressed in L1, of which two genes (unigene0027380 and unigene0030569) were down-regulated and 21 genes were up-regulated.
A previous study reported that the flowering of the BR biosynthesis-deficient det2 mutant of Arabidopsis thaliana is delayed by at least ten days compared with the wild type; the level of endogenous BR was 10% lower than that of the wild type [18]. Compared with NL, only one up-regulated gene was detected in L0 (unigene0026249), whereas six up-regulated DEGs (unigene0004470, unigene0025032, unigene0030389, unigene0038596, unigene0035058, and unigene0042473) were observed in L1. Furthermore, compared with L0, eight up-regulated genes involved in the BR metabolism pathway were observed in L1.
In general, JA is known to activate transcription factors (TFs) that trigger a large-scale response to various abiotic and biotic stresses, and it also plays an important role in regulating flower opening. Jasmonate-ZIM domain (JAZ) proteins have been shown to regulate the levels of JA to counteract flower abscission in Nicotiana attenuata plants [19]. In the present study, we found one gene that was down-regulated (unigene0021973) in L0 and four genes that were up-regulated (unigene0030164, unigene0032865, unigene0034754, and unigene0049406) in L1. In addition, only one up-regulated gene (unigene0033693) was detected in L1 compared with L0.
GA hormone signaling is very important to floral transition. In the present study,, we have found three DELLA protein GAI genes (Unigene0005738, Unigene0031663, Unigene0046182) were significantly up-regulated during the flower induction and development stages from NL vs L1, L0 vs L1 (Supplementary file S9).
2.6 CO and FT or other directly regulating DEGs
In plants, seasonal changes in day length are sensed by the leaves, which initiate long-distance signaling that induces flowering at the shoot apex. In Arabidopsis, FLOWERING LOCUS T (FT), now known as florigen, is crucial for the appropriate timing of flowering[20], and CONSTANS (CO) plays a key role in the activation of FT expression[21]. Thus, FT acts partially downstream of CONSTANS (CO), which promotes flowering in response to long days. In the present study (Fig. 5B and Supplementary file S10), relative to NL, in the light treatment group L0, one gene—the CONSTANS-related gene (unigene0025406)—was found to be significantly down-regulated, and the FT gene was not significantly expressed. Compared with NL, 13 CONSTANS-LIKE genes were significantly differentially expressed in L1, among which three genes were up-regulated (unigene002346462, unigene0027129, and unigene0035118) and ten genes were down-regulated (unigene0025406, unigene0027332, unigene0028118, unigene0029386, unigene0030366, unigene0034031, unigene0034048, unigene0035131, unigene0043102, and unigene0050191). Furthermore, two up-regulated FT genes (unigene0001751 and unigene0005649) were significantly differentially expressed. One FT gene (unigene0025148) was down-regulated, and six FT genes (unigene0002836, unigene0006372, unigene0025249, unigene0038103, unigene0042376, and unigene0045680) were up-regulated in L1 compared with L0.
In addition to these well-known CO and FT genes, other specific genes are closely related to the regulation of plant flowering processes. For example, the protein encoded by Hd3a, a rice ortholog of FT, moves from the leaf to the shoot apical meristem and induces flowering in rice, suggesting that the Hd3a protein may be the rice florigen [22]. In our analysis, an HD3a-like gene (unigene0019970), the expression of which was decreased under light treatment, was detected in L0. The HD3a-like gene may be up-regulated during the rice flowering period but down-regulated during the pitaya flowering period.
The LEAFY gene is an important element in the transition from the vegetative to the reproductive phase, as LEAFY is both necessary and sufficient to initiate the growth of individual flowers. On long days, Arabidopsis plants flower soon after germination, in parallel with rapid up-regulation of LEAFY [23]. In the present study, we found one LEAFY homolog gene (unigene0024655) to be significantly up-regulated in L1. We also detected two genes encoding the flowering time control protein in L1, where the expression of FPA (unigene0045587) was down-regulated and that of FCA (unigene0049096) was up-regulated. One flowering-promoting factor 1-like gene (unigene0023129) was up-regulated in L1.
Light is one of the clock entrainment signals that can induce a plant’s circadian rhythm and act as a bridge in the signaling network between environmental stimulation and the internal processes of the plant. According to the KEGG pathway analysis, some DEGs were associated with the circadian rhythm in pitaya plants. One gene encoding pseudo response regulator APRR5-like protein (unigene0045107) was down-regulated in L0, but not in NL. Two down-regulated circadian clock associated 1-like genes (unigene0036436 and unigene0049229), two down-regulated genes encoding EARLY FLOWERING 4-like protein (unigene0022870 and unigene0039817), and an up-regulated gene encoding EARLY FLOWERING 1-like protein (unigene0043786) were expressed in L1 compared with NL.
2.7 Transcription factors control the course of flowering
Cycling DoF factor (CDF) proteins belong to the larger family of plant-specific DoF (DNA binding with one finger) TFs, which presumably include a single C2-C2 zinc finger and a highly conserved DNA-binding domain [24]. Several Arabidopsis DoF transcription factors—CDF1, 2, 3, and 5—are implicated in circadian rhythms and photoperiodism, and their expression at elevated levels is sufficient to repress flowering [25]. In this study (Fig. 5C and Supplementary file S11), we found two CDF genes (unigene0033003 and unigene0031426) that were down-regulated in L0 compared with NL. Similarly, three CDF genes (unigene0031426, unigene0026114, and unigene0033003) were down-regulated and two genes (unigene0029432 and unigene0006016) were up-regulated in L1.
During the development and evolution of flowering plants, the MADS domain transcription factor plays an important role, because it determines the establishment of flower morphology and flower development. The MADS-box gene contains a conserved MADS-box motif and is generally classified as type I or type II with a conserved K-, I-, and C-domain subfamily. In general, the type II subfamily can be divided into ABCDE models during the development of the flower [26]. Compared with NL, no significantly differentially expressed MADS-box genes were detected in L0. In contrast, one gene encoding the AGAMOUS-like MADS-box protein AGL19 (unigene0027094) was down-regulated and the other six genes encoding AGAMOUS-like MADS-box proteins (unigene0018425, unigene0021927, unigene0025394, unigene0037726, unigene0047376, and unigene0202108) were up-regulated in L1.
Other TFs that can bind to the CO promoter include the TEOSINTE BRANCHED 1/CYCLOIDEA/PROLIFERATING CELL NUCLEAR ANTIGEN FACTOR (TCP) transcription factor family [27]. Members of this family control multiple traits in a diverse range of plant species, including flower and petal asymmetry [28]. We discovered no DEG-related TCP TFs in L0, whereas all six genes encoding TCP (unigene0005880, unigene0009351, unigene0025734, unigene0028812, unigene0030793, and unigene0033768) were up-regulated in L1 compared with NL.
2.8 Validation of the selected DEGs by quantitative RT-PCR analysis
We considered the control of floral development genes or transcription factors that regulate downstream genes to be more important than other aspects of plant flowering processes. One cycling DoF factor (CDF) transcription factor gene in each of the L0 and L1 samples was verified via quantitative RT-PCR (qRT-PCR). One gene encoding response regulator-like APRR5 related to the circadian rhythm in L0 and two genes encoding the circadian clock associated protein in L1 were selected to check the mRNA levels by qRT-PCR. One and two CONSTANS-LIKE (CO) genes in L0 and L1, respectively, were chosen for real-time quantitative PCR. Furthermore, one flowering locus T (FT) gene and HEADING DATE 3A-like (HD3a-like) ortholog involved in L1 and L0 respectively, were verified by qRT-PCR. In addition, two TCP TF genes and a gene encoding phytochrome B in L1 were also selected for qRT-PCR (Table 3). The UBQ gene was used as a reference gene [29].
The qRT-PCR results from the L0 and L1 stages are shown in Fig. 6 and 7. The expression profiles of all 13 detected genes revealed a similar trend and consistent results between qRT-PCR and RNA-Seq. Three DEGs of L0 exhibited significant down-regulation in plants under light treatment compared with NL, including unigene0031426 (cycling DoF factor 1), unigene0025406 (zinc finger protein CONSTANS-LIKE 2), and unigene0019970 (HEADING DATE 3A-like). However, unigene0045107 (response regulator-like APRR5) associated with the circadian clock, displayed no remarkable decrease in its mRNA levels.
In the L1 stage, all nine DEGs exhibited significant differential expression based on the qRT-PCR results. Two genes exhibiting functional homology to circadian clock associated 1 in the L1 stage showed a significant reduction in transcript levels under light treatment. A similar situation was found for two genes encoding TCP TF, for which the mRNA abundance was substantially enhanced in the L1 sample following light treatment. Interestingly, two CO genes revealed an opposite trend in expression in the L1 stage: one was up-regulated and the other was down-regulated. This finding indicates that different CO gene families may be subjected to different types of regulation, resulting in opposite types of expression. In addition, CDF3-like was down-regulated, whereas FT displayed a higher mRNA level in L1 during light treatment than in NL, as determined by qRT-PCR.
2.9 Tissue-specific expression of key genes
To further explore whether the genes associated with the flowering process of pitaya plants are tissue-specific, various growth stages of the stem, flower, and fruit as depicted in Fig. 8 were selected for analysis. We selected CDF-3 like (Unigene0026114) and CO-15 like (Unigene0035118) for the tissue-specific analysis, and the results are presented in Fig. 9. The results revealed that CDF-3 like displayed specific expression levels in different tissues at different time points, where its expression was increased in the pistils and styles and highest in the small green peel, compared with old stems. The expression level of CO-15 like was higher in the flowers and fruits than in old stems, although the specificity was not obvious.