RcTPS gene identification and sequence features
The present study originally obtained 81 nonredundant candidate gene models corresponding to PF01397 or PF03936. After aligning the candidate sequences by using MAFFT software, the sequences that did not contain the typical TPS domains were manually removed. A total of 74 putative TPS genes were identified, 49 of which were predicted to encode functional TPS enzymes. These genes encoding proteins longer than 500 amino acids (aa) contained typical TPS domains comprising either the Mg2+ binding (DDxxD/E) and NSE/DTE regions or the DxDD motif (Figure S1). The remaining 25 TPS genes, most of which encoded polypeptides shorter than 400 amino acids, were categorized as partial/ pseudo genes as they contained partial or mutant TPS domains. The genes were then renamed based on their order on the linkage groups (Table S1) [28]. Among the 49 complete RcTPS proteins, the molecular weight (MW) of the proteins ranged from 57.2 to 98.5 kDa, whereas the isoelectric point (pI) ranged from 5.06 to 6.17. The candidate TPS genes identified were RcGDS (RcTPS30, homologous gene of RhGDS), RcLINS (RcTPS-p9 + RcTPS-p10), RcLIN-NERS1 (RcTPS20), RcLIN-NERS2 (RcTPS22), and RcLIN-NERS3 (RcTPS23).
To explore the evolutionary relationship of the Rosa TPS family, a maximum likelihood phylogenetic tree was constructed using all putative and characterized RcTPSs (Fig. 1a). ent-kaurene synthase from Physcomitrella patens (PpCPS/KS), an ancient bifunctional TPS, was considered as the root of the phylogenetic tree [29]. The phylogenetic analysis separated RcTPSs into five groups, namely TPS-c, -e/f, -g, -b, and -a subfamilies. Among these 49 putative functional RcTPSs, 32 belonged to the TPS-a subfamily, which was found to be the largest subfamily. The number of complete RcTPSs in the subfamilies TPS-b, -g, -e/g, and -c were fewer, with 8, 4, 3, and 2 genes respectively. A comparison of the subfamily gene numbers with those in other eudicot plants exhibited that the gene number of the TPS-a subfamily in the R. chinensis genome was much higher than that in the other plants, which suggested a substantial species-specific expansion for the TPS-a subfamily in R. chinensis (Table 1).
Subsequently, 10 conserved motifs were identified in the rose TPS genes through the MEME program (Fig. 1b). Compared with RcTPS genes within the same subfamily, the RcTPS-p genes comprised incomplete motifs. The DDxxD and NSE/DTE motifs were represented by Motifs 1 and 7, respectively (Figure S2), and were found to be present in all putative functional RcTPSs, except those belonging to the TPS-c subfamily. Although the figure illustrates that the TPS-c genes also contained Motif 1, they did not contain the DDxxD motif. The characteristic motif DxDD was located in Motif 5; however, it was not intuitive (Figure S2). The submitted sequence failed to reflect its characteristic domains due to the presence of only a few TPS-c genes.
The gene structure analysis confirmed that every complete RcTPS gene had a TPS open reading frame of expected size and organization (Fig. 1c). The members of TPS-c and -e/f subfamilies belonged to class I TPSs, which have a large number of exon regions ranging from 11 to 15, encoding larger proteins with the number of aa residues ranging from 724 to 857. The TPS-a/b/g subfamily members belonged to class III TPSs, which normally comprise 6–7 exon regions, encoding smaller proteins with the number of aa residues ranging from 501 to 601.
Table 1 Statistics of the TPS subfamily gene numbers in the genome of Rosa chinensis and other plants
Species
|
(putative) Functional TPS
|
Partial/pseudo
TPS
|
TPS-a
|
TPS-b
|
TPS-c
|
TPS-e/f
|
TPS-g
|
All
|
Rosa chinensis
|
32
|
8
|
2
|
3
|
4
|
49
|
25
|
Malus ‘Golden Delicious’ [11]
|
4
|
1
|
1
|
1
|
3
|
10
|
45
|
Populus trichocarpa [30]
|
16
|
14
|
2
|
3
|
3
|
38
|
19
|
Glycine max [31]
|
6
|
5
|
3
|
2
|
7
|
23
|
7
|
Vitis vinifera [13]
|
30
|
19
|
2
|
1
|
17
|
69
|
83
|
Arabidopsis thaliana [10]
|
22
|
6
|
1
|
2
|
1
|
32
|
8
|
Solanum lycopersicum [8]
|
15
|
9
|
2
|
6
|
2
|
34
|
18
|
Chromosomal localization and gene duplication
The 74 rose TPS and related genes (10 complete PT and 10 RcNUDX1 genes) in the R. chinensis genome were mapped to the seven chromosomes of the R. chinensis genome sequence assembly based on their localization information (Fig. 2). The chromosomal localization analysis exhibited the uneven distribution of the 49 RcTPS and 25 RcTPS-p genes across all seven chromosomes. RcChr 5 contained the largest number of TPS genes, including 18 RcTPS and 8 RcTPS-p genes, which suggest the multiple duplication and recombination events on this chromosome [30]. Most pseudogenes were distributed near the putative full-length TPS genes. Six tandemly duplicated genes were present in the R. chinensis genome, which occurred in the TPS-a, -b, and -g subfamilies, forming five gene clusters. No segmentally duplicated TPS genes were detected in the R. chinensis genome, indicating that tandem duplication is the predominant form of expansion for the TPS gene family in R. chinensis. Additionally, four of the seven full-length TPT genes, namely RcFPPS1, RcGPPS, RcSPPS, and RcFPPS2, were mapped on RcChr 5. Some RcTPS genes were localized in the vicinity of putative rose PT genes, which indicated that some RcTPS and PT genes probably evolved through genomic duplication [31].
RcTPS gene collinearity analysis
The comparative collinearity maps of R. chinensis associated with other representative species were constructed to further infer the phylogenetic mechanisms of the TPS gene family (Fig. 3). Four RcTPS genes, namely RcTPS18, RcLIN-NERS1, RcTPS27, and RcTPS-p8, exhibited genomic shuffling across the Rosaceae species and grape, indicating that these genes were derived probably from the ancestors of dicotyledonous plants. Unlike the other three genes with only one collinear gene pair in each plant genome, RcLIN-NERS1 had three collinear gene pairs in the R. rugosa genome and three collinear gene pairs in the strawberry genome. RcLIN-NERS1 and its two collinear genes, namely FaNES1 and FaNES2, in strawberry are bifunctional terpene synthase that can efficiently convert GPP and FPP into linalool and nerolidol, respectively [24, 32]. This indicates that collinear TPS genes may exhibit similar catalytic function in relative plants. RcTPS27 (TPS-a) and RcLIN-NERS1 (TPS-b) in R. chinensis shared the same collinear TPS genes in R. rugosa and strawberry, indicating that TPS-a (RcTPS27) is probably derived from TPS-g. RcTPS-p8, a putative pseudogene that encodes a short protein (455 aa), had a conserved collinear region in all representative plants, which might be due to the breakage or loss of genes during evolution.
Effect of developmental stages and dark treatment on RcTPS expression
Nine RcTPS genes belonging to the TPS-a, -b, and -g subfamilies were expressed in MU opening flower samples, whereas the other two RcTPS genes were only expressed in the buds. These genes exhibited great differences in the expression levels and patterns. RcGDS, which was responsible for catalyzing the synthesis of germacrene D as the only product, exhibited the highest average expression level [19]. RcTPS32 and RcTPS33 exhibited the second and third highest expression levels, respectively. Although the RcTPS32 and RcTPS33 coding sequences were identical, their expression levels varied. In all samples of opened MU flowers, the RcTPS32 expression level was approximately thrice that of RcTPS33. The expression of other RcTPSs in petals and buds was relatively low (Fig. 4a).
The oscillations in RcTPS expression in different developmental stages of MU flowers were analyzed. These genes exhibited several expression patterns. The first group comprised RcLINS, RcTPS18, RcLIN-NERS1, and RcGDS, whose expression peaked in the buds about to open (S3) or in the early opening flowers (D1) and declined in old flowers. The second group comprised RcTPS39, whose peak expression was on the second day after flowering (D2). The third group comprised four RcTPS genes, whose expression levels increased when the flowers opened and peaked in old flowers. The last group comprising RcTPS8 and RcTPS9 was expressed only in the buds about to open (S3) and hardly expressed in open flowers (Fig. 4b). The effect of dark treatment on RcTPS expression in flowering samples was also studied. Compared with control samples (LD), dark treatment resulted in the decreased expression of four genes, namely RcLINS, RcLIN-NERS1, RcLIN-NERS3, and RcGDS, and increased expression of three genes, namely RcTPS32, RcTPS33, and RcTPS46. Additionally, it exhibited no significant effect on RcTPS18 and RcTPS39 expression (Fig. 4c).
Nine RcTPSs expressed in the opening MU petals were selected to further validate the RNA-seq results in three samples using real-time quantitative polymerase chain reaction (qRT-PCR). The results confirmed the differential TPS gene expression patterns in different samples (Figure S3). The concordance between qRT-PCR and RNA-seq results demonstrated the reliability of RNA-seq data in the present study.
Expression profiles of terpene-related genes
The expression profiles of related genes involved in the synthesis of terpenoid volatiles were analyzed, and the RhUBI2 expression level was used as an internal control. Most genes involved in the MEP pathway exhibited decreased expression in old petals, whereas most genes involved in the MVA pathway exhibited peak expression levels in old flowers and buds or fresh flowers (Fig. 4a). RcIDI and RcIDK exhibited different expression patterns, reaching peak expression in buds and senescent flowers, respectively. Among the five TPT genes involved in the biosynthesis of floral volatile terpenes, RcGGPPS1 and RcSSUII exhibited highest expression in buds or fresh flowers. RcGPPS expression peaked in senescent flowers, whereas the expression of two RcFPPS genes peaked in buds and old flowers. Dark treatment resulted in increased RcGGPPS1 expression and decreased RcFPPS1 expression, whereas no significant effect was observed on the expression of the other three TPT genes (Fig. 4c).
The six homologous genes of RcNUDX1 involved in geraniol synthesis were highly expressed in MU petals (Fig. 4a). All six RcNUDIX1 genes exhibited significantly increased expression after flowering. Four of the RcNUDIX1 genes exhibited constant expression throughout the flowering stage, whereas the other two genes exhibited significantly higher expression in old flowers than in fresh flowers (Fig. 4c). Furthermore, the expression levels of all six RcNUDX1 genes significantly increased after dark treatment (Fig. 4b).
Phylogenetic analysis of RcTPSs expressed in butterfly rose petals
Phylogenetic analysis was performed using the maximum likelihood method and included 11 RcTPS genes expressed in the butterfly rose petals or buds and other representative TPS genes (Fig. 5). Of these, four RcTPS-b genes were grouped into three clusters. RcTPS8 and RcTPS9 were clustered with ocimene synthases, whereas RcLINS was clustered with other linalool synthases. RcTPS18 was clustered together with other angiosperm α-farnesene synthases (AFSs) from apple (MdAFS) [33], Prunus campanulata (PcTPS7) [34], peach (PpTPS2) [35], Glycine max (GmAFS) [36], Populus trichocarpa (PtTPS2) [37], tomato (SlTPS27) [8], grape (VvGwbOciF) [13], and Ricinus communis (RcSeTPS7) [38], forming an AFS cluster. All genes in this cluster were predicted or identified as being localized in the cytoplasm. Sequence analysis showed that RcTPS18 exhibited 59% to 66% identity with other AFSs of Rosaceae plants. RcTPS18 and other AFSs exhibited conserved structural features including the DDxxD motif, NSD/DTE motif, and H-α1 loop. The last motif demonstrated function in the binding of the metal ion K+ in MdAFS (Figure S4) [39]. Moreover, RcTPS18 and PpTPS2 were collinear gene pairs located in a large collinear region with other TPS, TPT, and P450 genes (Figure S5). Thus, RcTPS18 might encode (E,E)-α-farnesene synthase, which is consistent with its annotation in the genome.
The five RcTPS-a genes expressed in rose petals were grouped into two clusters. RcTPS39 was clustered with some genes that can catalyze monoterpenoid products such as FvPINS (strawberry) [32], PcTPS2 and PcTPS5 (P. campanulata) [34], PdTPS1 (Prunus dulcis) [40], and MdPIN/CAM (apple) [11]. An evolutionary analysis on the TPS-a genes of Poaceae exhibited that some TPS-a members can convert GPP into monoterpenes derived from an initial C6-C1 closure [41]. Further studies must be conducted to investigate the ability of RcTPS39 to catalyze monoterpene products. The other four RcTPS-a genes were mainly clustered with TPSs that catalyzed (E,E)-FPP to C15 products by an initial C10-C1 or C11-C1 closure. RcTPS32, RcTPS32, and RcTPS46 might have the ability to convert (E,E)-FPP to produce sesquiterpenes.
Differential RcTPS expression and volatile terpenes in two flower developmental stages
Based on the differential expression patterns of RcTPS and related genes, the butterfly rose petals on the anthesis day (D1, fresh flower) and day 3 post-anthesis (D3, old flower) were selected for differential gene expression analysis (Log2FC ≥ 0.8 and P < 0.05) within nine RcTPS and six RcNUDX1 genes (Fig. 6a). Compared with D1 samples, four RcTPSs and one RcNUDX1 were upregulated and three RcTPS genes were downregulated in D3 samples.
Then, the volatile terpenes emitted from these two samples were analyzed. In total, 24 monoterpenoids and 30 sesquiterpenoids were identified (Fig.6b), whose content was higher than the content of those detected from headspace volatiles due to the high extraction temperature (Supplemental Table S4). The levels of almost all monoterpenes emitted from D3 samples were higher than the levels of those from D1 samples, whereas different sesquiterpenes exhibited peak release from D1 or D3 samples (Fig. 6c). As the variable importance in projection (VIP) value of most volatile terpenes was very low, differential metabolites were screened based on log2FC ≥ 0.8 and VIP ≥ 0.6. Compared with D1 samples, 14 compounds in the D3 sample including two sesquiterpenes and 12 monoterpenoids were upregulated, whereas five sesquiterpenoids were downregulated.
The relationship between differentially expressed genes (DEGs) and differential volatiles was analyzed. RcGDS expression was higher in D1 samples than in D3 samples, which was consistent with the decreased germacrene D emission in D3 samples (40% of D1 samples). However, germacrene D was not classified as a differential volatile because of its low VIP value. Increased RcLINS and decreased RcLIN-NERS3 were expressed in D1 samples, whereas the linalool emission level was higher in D3 samples (130% of D1 samples). The chirality of linalool could not be detected due to the detection method limitations. Thus, it is difficult to analyze the correlation between linalool and these two genes. The emission abundances of geraniol, nerol, and some monoterpenes [myrcene, (Z)-ocimene, and (E)-β-ocimene] from the D3 samples were 1.3–2.1 times those from the D1 samples, which may be related to the upregulated expression of RcNUDX1-1a4 in D3 samples.
The two sesquiterpenes upregulated in D3 samples were not identified, and the RcTPS32/33 and RcTPS46 functions could be speculated. RcTPS18 expression was downregulated in D3 samples, and the (E,E)-α-farnesene release in D3 samples also decreased (67% of D1 samples). However, (E,E)-α-farnesene was classified as a nonsignificant volatile because of its low VIP value. Rosa ‘Qinglian Xueshi’ that released high levels of (E,E)-α-farnesene was selected as control for further analysis. The emission amounts of (E,E)-α-farnesene in the three rose samples had the same trend as the RcTPS18 expression levels (Fig. 6d). Although the emission of five sesquiterpenes was upregulated in D1 samples, the genes responsible for their synthesis remain unknown.
Similar expression patterns of RcTPS32, RcTPS33, and RcTPS46 genes
To analyze the similar expression patterns of RcTPS32, RcTPS33, and RcTPS46 genes, the following three criteria were applied to identify these genes: (1) the average FPKM value in the petal samples at five developmental stages was ≥ 1, (2) the expression of genes had to be upregulated in D3 samples and downregulated in D1 and S3 samples, and (3) the expression of genes was significantly different between D2 and bagging treatment samples (LD and DD). The DEGs between different samples were screened out (Fig. 7a). Venn analysis exhibited similarity in the expression patterns of 91 genes with those of RcTPS32, RcTPS33, and RcTPS46 (Fig. 7b). These genes included some cell wall metabolism-related genes (e.g., β-xylosidase, β-galactosidase, cellulose synthase-like, pectinesterase/pectinesterase inhibitor), sugar metabolism-related genes (e.g., trehalose-phosphate synthase and UDP-glycosyltransferase), reactive oxygen species scavenging enzymes and antioxidants (e.g., glutathione S-transferase and peroxidases), purple acid phosphatase (PAP) genes, and four transcription factors (namely RcMYB108 [42], RcWRKY55 [43], RcNAC062 [44], and RhPMP1 [45]) (Fig. 7c). These four transcription factors were selected and further validated through qRT-PCR, confirming their expression patterns in different samples (Figure S6).