Comparative transcriptomic analysis of flowering induction in pitaya induced by supplementary lighting

Background Pitayas are currently attracting considerable interest as a fruit with many health benefits. However, the lack of natural light after November in Hainan, China, severely restricts the production of pitaya in winter. To further explore the molecular mechanisms regulating flowering in pitaya, we used de novo RNA sequencing-based transcriptomic analysis for four stages of pitaya subjected to light induction. Results We assembled 68113 unigenes in total, comprising 29782 unigenes with functional annotations in the NR database, 20716 annotations in SwissProt, 18088 annotations in KOG, and 11059 annotations in KEGG. Comparison between different samples revealed different numbers of significantly differentially expressed genes (DEGs). A number of DEGs involved in energy metabolism-related processes and plant hormones were detected. Moreover, we discovered many CONSTANS-LIKE, FLOWERING LOCUS T and other DEGs involved in direct regulation of flowering, along with CDF and TCP, which function as typical transcription factor genes in the flowering process. At the transcriptomic level, we confirmed 13 DEGs with different functions in the time-course response to light-induced flowering by quantitative reverse-transcription PCR analysis. because FKF1 can abolish CDF inhibition of CO and induce This study also identified and verified the down-regulation of FKF1 transcription by qPCR, which is related to the biological state of flowering pitaya. This pathway map only speculates partial network on the flowering of pitaya, and the complete regulatory network and verification needs further exploration.


Background
Pitaya (Hylocereus undulatus Britt) belongs to the Cactaceae family and Hylocereus genus, and is a tropical or subtropical fruit tree. The specific origin of the pitaya remains inconclusive. At present, the academic community generally believes that it originated in southern Mexico. Nowadays, it is geographically widespread, occurring in Israel, Japan's Okinawa, China's Guangxi, Hainan, and Taiwan, and Southeast Asia [1]. The earliest literature report on pitaya is from the General and Natural 4 directly promotes SOC1 and LFY expression and increases the level of SOC1 mRNA, which, in turn, activates the downstream genes LFY and AP1 to induce flowering [12]. One study found that tomato plants with jai1-1 mutants exhibited delayed flower opening, showing that jasmonic acid (JA) acts as a positive regulator of flowering [13]. IAA (indole-3-acetic acid) is likely involved in the mechanism that controls the growth of the male gametophyte to the egg cell in the ovule, demonstrating auxin is a major controlling signal that synchronizes flower development in Arabidopsis [14].
Understanding the detailed flowering mechanism is important for an understanding of the plant growth cycle, especially in the transition from vegetative to reproductive development. For pitaya, few studies have been conducted on molecular mechanisms that promote flowering by light supplementation. Therefore, understanding gene regulation for flowering could facilitate the development of pitaya varieties with photoperiod insensitivity, which could be helpful for reducing energy costs and sustaining year-round production.

Flowering induction of pitaya by supplementary light treatment in the winter season
These findings showed that sufficient supplementary light can induce pitaya flowering if the minimum air temperature is ≥ 15 °C. In this supplementary light experiment, NL represents the control group (no light, no flowering) (Fig. 1A), L0 represents no flowering in the light-treated group (Fig. 1B), L1 represents the flower bud stage in the light-treated group (Fig. 1C), and L2 represents 1 week after the bud stage in the light-treated group (Fig. 1D). About 20~25 days after the buds appeared, the pitaya flowers were in full bloom at night (Fig. 1E), signifying the success of this flower induction system with respect to pitaya fruit production in the winter season.

Sequencing, de novo assembly, and annotation
To understand the underlying molecular events involved in the response to light-induced flowering, we sent pitaya samples from the NL, L0, L1, and L2 stages to the Gene Denovo Biotechnology Co.
(Guangzhou, China) for de novo assembly of RNA-seq data. More than 39 Gb raw data reads were retrieved. The high-quality clean reads number of the three biological replicates for the NL sample were 45444076 (97.78%), 42546349 (97.75%), and 42956302 (97.82%); the clean reads for the L0 sample were 35518498 (97.83%), 38856464 (97.71%), and 49626346 (98.11%); the clean reads for the L1 sample were 42262290 (97.78%), 52312394 (97.94%), and 49023546 (98.01%); and the clean reads for the L2 sample were 49899592 (98.17%), 48213826 (98.08%), and 45191594 (98%) (Additional file 1). These findings indicated that the proportion of high-quality clean reads after filtration of each sample was relatively high, signifying that the sequencing quality was good. The de novo assembly resulted in a total of 60580077 bases, but only 68113 unigenes (not less than 200 nt) with an N50 of 1730 nt ( Table 1). The average length of our assembly was 889 nt, of which the maximum length of unigene was 36165 nt, the minimum length 201 nt, and the GC (guaninecytosine) percentage 41.96%. Figure 2 showed the scatter plot of size distribution of the unigenes. In order to search for homologous sequences, annotation analysis was performed on the unigenes using four public databases: KOG, KEGG, Swissprot, and NR. In total, 29959 unigenes (removing duplicates) were found in at least one of these databases (Table 2). There were 29782 unigenes with functional annotations in the NR database, 20716 annotations in SwissProt, 18088 annotations in KOG, and 11059 annotations in KEGG. We also observed that 8618 unigenes were annotated by all four major databases (Fig. 3). At the same time, we found only 6220, 75, 33, and 32 unique annotations in the NR, Swissprot, KOG, and KEGG databases, respectively. Table 1 Results of assembly  Table 2 Results of annotations in four databases Figure 3 Venn diagram of the four database annotations.

Global analysis of differential expression profiling
The abundance of each gene was determined by counting reads per kilobase per million reads (RPKM) to infer the expression level. Therefore, this method could be used to directly compare the difference in gene expression among samples. The correlation between gene expression levels among the samples was a key criterion in determining whether the experiments were reliable and whether the samples chosen were suitable. If one sample had a high degree of similarity to another, the correlation value between them would be very close to 1. We calculated the correlation values between samples based on the RPKM results. According to the standard recommended by the ENCODE (Encyclopedia of DNA Elements) project [15], the square of the correlation value should be ≥0.92 (under ideal experimental conditions and with suitable samples). The heatmap of correlations for these samples is shown in Figure 4. From the figure, we can see that the coefficient between the three repetitions for the NL stage was found to be above 0.98, and the correlation between the three repetitions for the L0 stage was above 0.96. Similarly, the correlation between the three repetitions for the L1 and L2 stages was found to be above 0.99 and 0.97, respectively, signifying that the levels of expression between all the biological replicates were highly correlated. In other words, we found a low level of correlation between the different samples, indicating that the difference between these four samples was significant. In our study, comparison between different samples resulted in different numbers of DEGs (Fig 5).
When NL and L0 were compared, we found only 51 up-regulated genes and 88 down-regulated genes, totaling 139 DEGs. A total of 12352 DEGs were obtained when we compared NL and L1, of which 8640 genes were up-regulated and 3712 genes were down-regulated. In other words, compared with NL, the number of DEGs in the L1 sample was found to be about nine times that in the L0 sample, where the number of up-regulated genes was 169 times that of L1 and the number of down-regulated genes was 41 times higher than that of L1. Similar results were found in other comparison groups. Gene ontology (GO) functional enrichment analysis was applied in our subsequent study in order to annotate the expression patterns of the DEGs to the chosen GO terms: molecular function, cellular component, and biological process. Since a gene often has multiple different functions, the same gene will appear under different terms and each histogram is statistically independent of any others.
In our research, 50 GO terms relating to molecular function, 30 GO terms relating to cellular component, and 158 GO terms relating to biological process were enriched with respect to the L0 stage, as compared to the NL stage, after light treatment (Additional file 2,3,4). Under the same light treatment, DEGs associated with the L1 stage were enriched by 399, 162, and 1042 GO terms relating to molecular function, cellular component, and biological process, respectively (Additional file 5,6,7).
Moreover, findings regarding the GO analysis of other groups were similar to those described above.
Most importantly, we found that the main enriched terms of all groups were the same regardless of ontologies. In particular, 'cellular component' mainly related to the cell part, membrane, and organelle; 'molecular function' mainly related to binding, catalytic activity, and transport activity; and 'biological process' mainly related to cellular, metabolic, single-organism transport, suggesting that DEGs primarily enriched in those terms may be associated with induction of pitaya blossoms. To further investigate the gene expression profiles, we made use of KEGG pathway analysis to determine some of the DEGs involved in important biochemical metabolic and signal transduction pathways during light-induced flowering of pitaya. For example, two auxin response factors (unigene0028447, unigene0002811) were up-regulated, two genes related to jasmonic acid (unigene0030164, unigene0033693) were up-regulated, and five genes related to starch and sucrose metabolism 8 (unigene0052631, unigene0036015, unigene0040385, unigene0041472, unigene0035017) were also up-regulated.

Energy-related genes in the process of flowering induction
During our investigation of light-induced pitaya flowering, we focused on the differentially expressed genes in the NL, L0, and L1 stages. It is well known that plant flowering is a normal physiological process. Before a plant flowers, some essential organic nutrients accumulate in the plant so as to accommodate the significant uptake of nutrients during flowering. Soluble sugar and soluble protein are important nutrients that are closely associated with plant flowering. Large increases in ATP concentration initiated by changes in environmental conditions, especially by different kinds of light treatments, have already been observed [16]. Our analysis found that two up-regulated genes, unigene 0046195 (beta-glucosidase BoGH3B-like) and unigene 0014135 (ATP synthase), in the Krebs cycle of glucose metabolism, were involved in the L0 stage, but not the NL stage. Moreover, the expression levels of eight genes (unigene0006689, unigene0031088, unigene0044754, unigene0009228, unigene0045801, unigene0047216, unigene0050048, unigene0053090) involved in sucrose synthesis in L1 were found to have increased, and three of these genes were up-regulated in L1, but not in L0. This finding, which relates to energy metabolism, demonstrates that large amounts of sucrose are needed as an energy supply during the flowering process.

Plant hormone and signal transduction related genes
Three types of phytohormone-auxin, jasmonic acid (JA), and the brassinosteroids (BRs)-have been individually connected to floral timing, though they have not been extensively studied. These hormones can be transported as signal molecules in plants, which in turn produce some signal transduction cascades that direct a series of metabolic activities in plants.
Genetic evidence has revealed that in Arabidopsis, the polar transport of auxin controls flower formation and differentiation [17]. Genes regulating floral organ development and gynoecium vascularization have been discovered, indicating the probable involvement of auxins in flower development. In our study (Fig 6 and Additional file 8), we found an auxin-related gene (unigene 0029163) that was significantly up-regulated in L0, but not in NL. Twenty-three genes were significantly expressed in L1, of which two genes (unigene 0027380, unigene 0030569) were downregulated and 21 genes were up-regulated. Studies have reported that in Arabidopsis thaliana, the BR biosynthesis-deficient det2 mutant is delayed flowering by at least 10 days compared to the wild type. The level of endogenous BR is lower 10% than that of the wild type [18]. Compared with NL, only one up-regulated gene was expressed in L0 (unigene0026249), but six up-regulated DEGs (unigene0004470, unigene0025032, unigene0030389, unigene0038596, unigene0035058, unigene0042473) were expressed in L1. At the same time, compared with L0, eight up-regulated genes involved in the brassinosteroid metabolism pathway were expressed in L1. Generally speaking, JA is known to activate transcription factors (TFs) that trigger a large-scale process of various abiotic and biotic stresses, but also plays an important role in regulating flower opening. Jasmonate-ZIM domain (JAZ) proteins have been shown to regulate the level of JA, to counteract flower abscission in Nicotiana attenuata plants [19]. In our own study, we found one gene that was down-regulated (unigene 0021973) in L0, and four genes that were up-regulated (unigene 0030164, unigene 0032865, unigene 0034754, unigene 0049406) in L1. In addition, only one up-regulated gene (unigene0033693) was expressed in L1, but not in L0.

CO and FT or other directly regulating DEGs
In plants, seasonal changes in day length are perceived in leaves, which initiate long-distance signaling that induces flowering at the shoot apex. In Arabidopsis, FLOWERING LOCUS T (FT) is now known as florigen, which is crucial for the proper timing of flowering, and CONSTANS (CO) is a key player in the activation of FT expression. In other words, FT acts partially downstream of CONSTANS (CO), which promotes flowering in response to long days. In this study ( Fig. 7 and Additional file 9), relative to NL, in the light treatment group L0, one gene-the CONSTANS-related gene (unigene 0025406)-was found to be significantly down-regulated, and the FT gene was not significantly expressed. In comparison with NL, 13 CONSTANS-LIKE genes were significantly differentially expressed in L1, among which three genes were up-regulated (unigene002346462, unigene0027129, unigene0035118) and 10 genes were down-regulated (unigene0025406, unigene0027332, unigene0028118, unigene0029386, unigene0030366, unigene0034031, unigene0034048, unigene0035131, unigene0043102, unigene0050191). At the same time, two upregulated FT genes (unigene0001751, unigene0005649) were significantly differentially expressed. One FT gene (unigene0025148) was down-regulated, and six FT genes (unigene0002836, unigene0006372, unigene0025249, unigene0038103, unigene0042376, unigene0045680) were up-regulated in L1, but not in L0.
In addition to these well-known CO and FT genes, certain genes are closely related to the regulation of plant flowering processes. For example, the protein encoded by Hd3a, a rice ortholog of FT, moves from the leaf to the shoot apical meristem and induces flowering in rice, suggesting that the Hd3a protein may be the rice florigen [20]. In our analysis, an HD3a-like gene (unigene0019970), whose expression decreased under light treatment, was detected in L0. Maybe the HD3a-like gene was upregulated during the rice flowering period but down-regulated during the pitaya flowering period.
The LEAFY gene is an important element in the transition from the vegetative to the reproductive phase, as LEAFY is both necessary and sufficient to initiate the growth of individual flowers. On long days, Arabidopsis plants flower soon after germination, in parallel with rapid up-regulation of LEAFY [21]. In this research, we found one LEAFY homolog gene (unigene0024655) to be significantly up-regulated in L1. We also found two genes encoding flowering time control protein to be involved in L1, and FPA (unigene0045587) to be down-regulated, while the reverse was true for the expression of 11 FCA (unigene0049096). One flowering-promoting factor 1-like gene (unigene0023129) was upregulated in L1.
Light is one of the clock entrainment signals and can induce a plant's circadian rhythm, acting as a bridge in the signaling network between the environmental stimulation and its internal processes.
According to KEGG pathway analysis, some DEGs are associated with circadian rhythm in pitaya. One gene encoding Arabidopsis pseudo response regulator (APRR5)-like protein (unigene0045107) was found to be down-regulated in L0, but not in NL. Two down-regulated circadian clock associated 1-like genes (unigene0036436, unigene0049229), two down-regulated genes encoding EARLY FLOWERING 4-like protein (unigene0022870, unigene0039817), and an up-regulated gene encoding EARLY FLOWERING 1-like protein unigene0043786 were expressed in L1, but not in NL.

Validation of the selected DEGs by quantitative RT-PCR analysis
We consider the control of floral development genes or transcription factors that regulate downstream genes to be more important than other aspects of plant flowering processes. One cyclic DoF factor (CDF) transcription factor gene in each of the L0 and L1 samples was detected by quantitative RT-PCR. One gene encoding response regulator-like APRR5 related to circadian rhythm involved in course in L0 and two genes encoding circadian clock associated protein involved in L1 were selected to aid the detection of mRNA levels by quantitative RT-PCR. One and two CONSTANS-LIKE (CO) genes in L0 and L1, respectively, were chosen for real-time quantitative PCR. Furthermore, one flowering locus T (FT) gene and its ortholog encoding HEADING DATE 3A-like (HD3a-like), involved in L1 and L0 13 respectively, were detected by quantitative RT-PCR. In addition, two TCP TF genes and a gene encoding phytochrome B in L1 were also selected for quantitative RT-PCR (Table 1)   between the columns by T test using SPSS statistical software. Data in columns with the same letters showed no significant difference (p>0.05).

Tissue specific expression of key genes
In this study, in order to further explore whether the genes associated with the flowering process of Pitaya are tissue-specific, different growth stages of stem, flower and fruit in Figure 11 were selected for specific analysis. We selected CDF-3 like(Unigene0026114) and CO-15 like(Unigene0035118) for tissue-specific analysis in Figure 12. The results showed that CDF-3 like had a certain expression level in various tissues and various periods, while its expression was increased in pistils and styles and highest in small green peel compared with in the old stem. The expression level of CO-15 like in flowers and fruits is higher than that of old stems, and the specificity is not obvious. . In our analysis, many genes (unigene0004150, unigene0021847, unigene0032942, and so on) encoding auxin-responsive protein also displayed upregulation. Defects along the jasmonic acid (JA) signaling pathway affect not only floret opening and anther dehiscence, but also spikelet development, in rice [29]. In our study, we observed that some genes encoding JAZ-like protein (e.g., MYC2 TFs) were involved in the JA signaling pathway.
Light signals perceived by phytochrome A (phyA) and cryptochrome (cry) photoreceptors stabilize CO protein on long days [30]. In our research, some CO-like genes were up-regulated, while others were down-regulated, under the same conditions, indicating that some CO genes belonging to specific 16 families may be repressed by certain proteins or miRNAs in pitaya. Photoreceptors regulate CO stability and act antagonistically to generate daily rhythms in CO abundance [31]. Moreover, a phyB mutation has been shown to cause an increase in CO protein abundance in the early phase of the photoperiod [31], indicating that phyB mutations also accelerate flowering. So, we can conclude that the amount of CO is positively regulated by the photoreceptors cry1, cry2, and phyA, but the mRNA expression of CO is inhibited by phyB. A similar conclusion was drawn from our own study. The gene encoding phyB (unigene0002396) was down-regulated by qPCR, and the expression pattern was found to be the same as that for transcriptome sequencing, suggesting that PHYB may negatively regulate CO throughout the pitaya flowering process.
The circadian clock gene, which is a key transcriptional activator of the photoperiodic pathway, CO, is essential for proper photoperiodic flowering. In the current model of the Arabidopsis circadian clock, most components function as repressors [32]. Fujiwara et al. demonstrated that the circadian clock proteins LATE ELONGATED HYPOCOTYL (LHY) and CIRCADIAN CLOCK-ASSOCIATED1 (CCA1) repressed floral transition, and in particular repressed FT expression under long-day conditions [33]. In our analysis, the transcriptome level of most DEGs associated with circadian rhythm were declined, consistent with previous reports.
The FT gene was first identified in Arabidopsis thaliana, in which it plays a major role in the photoperiod pathway and in initiating the flowering signal in the apical meristem [34]. Unlike many other floral regulators, the deduced sequence of the FT protein suggests that it does not directly control transcription or transcript processing [35]. FT expression is affected by a large number of factors, some of which act as repressors and others as positive regulators. The former class includes several MADS box TFs [36], AP2-like proteins and B-box proteins, while the latter class includes CO [37], NUCLEAR FACTOR Y, and GIGANTEA (GI) [38]. This finding was also revealed in our own study; except for one down-regulated FT gene (unigene0025148), all the other genes were up-regulated.
CYCLING DoF FACTOR 1 (CDF1) and its homologs play an important role in floral transition by repressing the expression of floral activator genes such as CO and FT in Arabidopsis [39]. One study indicated that the CDF repression operates through the formation of a CDF-TPL transcriptional complex, which binds the CO and FT promoters to reduce their expression levels throughout the morning, for seasonal flowering [40]. The results of our quantitative RT-PCR analysis revealed downexpression of two CDF-like genes (unigene0031426, unigene0026114) but up-expression of a CO-like 15 (unigene0027129) and FT gene (unigene0005649). This finding demonstrated that CDF may repress CO and FT transcriptome expression during the pitaya flowering process. Simultaneously, we observed that the expression levels of a few CDF-like genes were higher than the expression level of NL, which may explain why some CO and FT genes in our study were down-regulated.
The TCP gene family encodes transcription factors that control diverse aspects of the developmental and growth traits of plants. According to the structure of the conserved region, they are classified into classes I and II in Arabidopsis [41], orchestrating the switch from vegetative to reproductive to senescence stages in the life cycle. They may be upstream and activate of SOC1 to modulate flowering [41]. TCP4 is directly associated with CO promoters through TCP binding sites, and promotes CO expression around dusk, together with other CO activators (FLOWERING BHLH)-FBHs [42]. Furthermore, these TCPs not only interact with the flower-repressing CDFs, but physically interact with FBHs to facilitate CO transcription [43]. Our research demonstrated that two transcription factor TCP4 genes (unigene0009351, unigene0033768), and some other TCP genes, such as TCP 15, TCP 18, were up-regulated. Moreover, the expression level of CO-like genes increased, indicating that TCP TFs may activate CO transcript expression in pitaya induced to flower through supplementary lighting. Figure 13 shows the partial regulatory pathways of flowering based on the genes selected in this study and reference [44]. CO acts on the FT promoter region and is a major positive regulator of FT [31]. CDF binds to the CO promoter region to inhibit CO expression, and CDF is inhibited by F-box protein (Flavinbindingkelch-repeat 1, FKF1), because FKF1 can abolish CDF inhibition of CO and induce flowering [45]. This study also identified and verified the down-regulation of FKF1 transcription by qPCR, which is related to the biological state of flowering pitaya. This pathway map only speculates partial network on the flowering of pitaya, and the complete regulatory network and verification needs further exploration.

Plant material and experimental design
The plant material, 'Jindu No.1' red flesh pitaya, was cultivated at the Enhong Agricultural Technology

Sample preparation and sequencing
We performed three biological replicates for each stage, each involving four plant samples for mixing well. Every sample was selected to be 10 cm in length from the top of the stem and then included 2 cm of tissue around the base of the thorn. These samples were then frozen immediately using liquid nitrogen and stored at −80 °C before being sent to Gene Denovo Biotechnology Co. (Guangzhou, China) for de novo and RNA-seq using the Illumina HiSeqTM 4000 platform. Raw data were filtered by removing reads that included adapter, poly-N>10%, and low-quality reads in order to obtain clean reads as a foundation for further analysis. De novo assembly of the transcriptome was carried out using the short-read assembly program Trinity [46]. The rRNA-free, high-quality clean reads were mapped to the reference transcriptome using the short-read alignment tool Bowtie2[47], employing its default parameters. The RPKM (reads per kb per million reads) measure was used to determine the expression level of each gene, which was calculated as the fold change in expression of a gene expressed in different samples.

Identification of DEGs
In our study, genes with a false discovery rate (FDR) ≤0.05 and an absolute value of Log2Ratio ≥1 in a comparison were determined to be significantly differentially expressed genes (DEGs) that could be subjected to enrichment analysis of GO (gene ontology) functions and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. All DEGs were mapped to GO terms in the Gene Ontology database (http://www.geneontology.org/), and gene numbers were calculated for every term. Significantly enriched GO terms were defined by a corrected p value ≤0.05. Pathway enrichment analysis, which can be calculated using the same formula as that in GO analysis, identified significantly enriched metabolic pathways or signal transduction pathways among DEGs. Our raw data and processed data 20 have been uploaded in the Gene Expression Omnibus(GSE125083) at the NCBI

Quantitative RT-PCR
Total RNA was separately extracted from four groups, using the CTAB (cetyltrimethylammonium bromide) method. The resulting RNA (1 µg) was treated with gDNA Eraser to remove contaminating genomic DNA and synthesized to the first-strand complementary DNA using a PrimeScript RT reagent kit with gDNA Eraser (TaKaRa) according to the manufacturer's instructions. Primer pairs for quantitative RT-PCR were designed using Primer Premier 5.0 (PREMIER Biosoft International, Palo Alto, California, USA). The cDNA reverse-transcription products were used as templates for qRT-PCR.
An Applied Biosystems 7500 Real-Time PCR System (Life Technologies, USA) was used for quantitative RT-PCR, and the reaction consisted of 10 µL 2×SYBR Premix Ex Taq II, 0.4 µL 50×ROX Reference Dye or Dye II, 2 µL cDNA, 6 µL sterilized purified water, 0.8 µL forward primer (10 μM), and 0.8 µL reverse primer (10 μM) in a total volume of 20 µL. No-template controls were also prepared for each primer pair. The relative quantification method (Delta-Delta cycle threshold) was used to evaluate quantitative variation between the replicates examined. We selected 13 DEGs for quantitative RT-PCR (Table 3). Table 3

Consent for publication
Not applicable

Competing interests
The authors declare that they have no competing interests.        The result of CDF-3 like(Unigene0026114) and CO-15 like(Unigene0035118) tissue-specific expression. Expression was normalized to that of UBQ. The transcript levels from the old stem were set at 1.