Comparison of two peach fruit quality indices
In this study, a marked difference in appearance between the Olecranon honey peach and its bud variant was observed (Fig. 1A and 1B). The tip of the Olecranon honey peach is shaped like an eagle’s beak, and is large and warped. The two types of peaches also differ significantly in some quality indicators, such as weight, longitudinal diameter, transverse diameter, hardness, juice rate, color and luster, nutrients, and mineral elements (Table 1). The quality indicators, such as cluster analysis and NMDS analysis, of the two types of peaches also differ (Fig. 5A and 5B). Notably, the weight of the Olecranon honey peach was 205.50 ± 0.98 g, whereas that of the budding variant was 116.26 ± 1.23 g. The fruit shape index was calculated as the ratio of the longitudinal diameter to the transverse diameter. Their fruit shape indices did not differ significantly at 1.02 and 1.06, respectively. This implies that the shape of both the Olecranon honey peach and the bud mutant variant is almost circular. However, there were large differences in the longitudinal and transverse diameters of these peaches.
Transcriptomic analysis from ONT sequencing
Young fruits were sequenced because fruit shape, textural properties, and cutting content depend on events that occur early during fruit development [12]. To compare the characteristics based on the fruit type, gene expression dynamics and full-length transcripts of the young fruit of the two types of peaches were determined by ONT, and total clean data in the fastq format were further filtered to eliminate short reads and low-quality reads. After quality filtering, the BeadNum ranged from 5 687 141 to 7 035 534 with N50 ranging from 1263 to 1423; the mean q-score was Q9 in each library (Table 2). Through cDNA sequencing, reads can be identified as full-length sequences when primers have recognized both ends of the reads. The numbers of full-length reads in the six libraries were 5 195 328, 481 897, 4 795 588, 5 346 222, 4 312 262, and 4 847 747, respectively. The full-length percentages (FL%) were found to be 77.57%, 77.81%, 77.20%, 77.86%, 77.68%, and 76.97%, respectively (Table 3). The consensus isoform was obtained by polishing the full-length reads. Fusion transcript prediction was performed on obtained consistent transcripts, and the fusion transcripts from 7 to 12 were obtained for each sample. (S-Table 1).
Using the minimap2 software (Li 2018), all consensus isoforms were aligned to the reference genome for elimination of redundancy analysis, after obtaining 58 596 consensus isoforms. Transcripts above 500 base pairs were screened from the non-redundant transcripts, and MISA software was used for SSR analysis. A total of 21 745 SSRs were obtained, including 6 372 mononucleotide, 10 271 dinucleotide, 4 718 trinucleotide, 235 tetranucleotide, 48 pentanucleotide, and 101 hexanucleotide repeats. Density distribution of the different SSR types was statistically analyzed (Fig. 1D); the perfect dinucleotide (p2) was found to have the highest density, followed by the perfect mononucleotide (p1) and the perfect trinucleotide (p3). The perfect pentanucleotide (p5) and the hybrid SSR (p6) were found to have the lowest density. The non-redundant transcripts of all the samples were compared with the known annotations of the reference genome, revealing 2530 new gene loci and 37 364 new transcripts. Sequence analysis of the newly discovered transcripts revealed 24 205 complete open reading frame (ORF) sequences, 2603 transcription factors, and 1005 lncRNAs. By comparison with eight protein databases, 34 665 new transcripts were functionally annotated. In total, 8502 transcripts were annotated in the Cluster of Orthologous Groups of proteins (COG) database; 20 467 were annotated in the Gene Ontology Consortium (GO) database; 15 074 were annotated in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database; 19 471 were annotated in the euKaryotic Ortholog Groups (KOG) database; 14 296 were annotated in protein family (Pfam); 20 851 were annotated in Swiss-Prot; 31 324 were annotated in eggNOG; and 34 595 were annotated in NR. Ultimately, 34 665 transcripts were annotated in the eight databases (Table 4).
Alternative splicing
Alternative splicing (AS), the process of removing introns from pre-mRNA and rearranging exons to produce several types of mature transcripts, occurs before protein synthesis [13]. In the present study, 18,322 AS events were identified in the transcripts by AStalavista [14]. The main types of AS included exon skipping (ES), alternative 3′ splice site (A3SS), mutually exclusive exon (MEE), alternative 5′ splice site (A5SS), and intron retention (IR) (Fig. 1E). Among the five main types of alternatively spliced transcripts, IR was found to be predominant, accounting for 36.18% of the AS transcripts, followed by A3SS (32.58%), A5SS (15.97%), ES (14.16%), and MEE (1.11%) (S-Table 2). This finding differs from those of previous studies in various plants, such as tea, maize, cotton, rice, moso bamboo, Arabidopsis, and Populus [15-18], in which IR was predominant, followed by ES, A3SS, A5SS, and MEE. AS events were significantly more frequent in the Olecranon honey peach (from 3581 to 3860, an average of 3731) than those in the bud peach (from 2310 to 2446, an average of 2376) (S-Table 2). In the two groups of samples, the percentages of A5SS and ES in the Olecranon honey peach were significantly lower than those in the bud peach. However, IR showed the opposite results (Fig. 1C).
Differential expression analysis of gene and transcripts
Gene and transcript expression are time and location-specific. Under different conditions, the expression levels of genes and transcripts significantly differ. There were 457 differential expression genes (DEGs) in the two groups (Olecranon honey peach and the bud peach), including 169 up-regulated genes and 288 down-regulated genes (Fig. 2A). Up-regulated genes showed higher expression levels in the bud peach than in the Olecranon honey peach, whereas the down-regulated genes showed the opposite results. Hierarchical clustering analysis was performed for the screened DEGs, and genes with the same or similar expression patterns were clustered. The genes were divided into two subclasses (Fig. 2B). The 457 DEGs were annotated using the best BLAST hit from eight protein databases. A total of 390 genes were annotated in the eight databases, and 226 genes were annotated using the GO database. These 226 genes were annotated as DEGs and divided into three main GO functional categories (biological process, cellular component, and molecular function) and 36 subcategories (Fig. 2C). In the cellular component (CC) classification, the major subcategories were “membrane,” “membrane part,” “cell part,” and “cell.” For the molecular function (MF) classification, “catalytic activity” and “binding” were the dominant subgroups. “Metabolic process,” “cellular process,” and “single-organism process” were the major subcategories in the biological process (BP) classification. The GO annotation classifications of all genes were divided into 50 subcategories. The major subcategories of all genes were consistent with those of the DEGs. Ninety-nine DEGs were annotated in the COG database; “Carbohydrate transport and metabolism,” “transcription,” and “secondary metabolites biosynthesis, transport and catabolism” were the three significant subcategories in the COG functional classification analysis of the DEGs (Fig. 2D). To better explain the function of these DEGs, the DEGs were searched in the KEGG pathway database. Among the genes, 118 DEGs were annotated in the KEGG database, and 55 KEGG pathways were identified. Most were metabolism-related pathways, such as brassinosteroid biosynthesis (ko00905), starch and sucrose metabolism (ko00500), cysteine and methionine metabolism (ko00270), and other secondary metabolic pathways of peach plants. Additionally, there were 10 DEGs involved in genetic information processing, and 2 DEGs involved in cellular processes, environmental information processing, and organismal systems (Fig. 2E). Moreover, 164 DEGs were annotated in the KOG database; 385 DEGs were annotated in NR; 242 DEGs were annotated in Pfam; 256 DEGs were annotated in Swiss-Prot; and 325 DEGs were annotated in eggNOG.
Among the two groups (Olecranon honey peach and the bud mutant variant), 1519 transcripts were differentially expressed, of which 552 were up-regulated and 997 were down-regulated. These were divided into four subclasses (Fig. 3A). When the FDR was found to be more than 2, the transcripts were considered to be different. The expression of some transcripts in the two groups showed a fold-change greater than 10 (Fig. 3B). A total of 1332 differentially expressed transcripts (DETs) were annotated, with 796 annotated by GO. These 796 DETs were assigned into three main GO functional categories (BP, CC, and MF) and 41 subcategories (Fig. 3C). The major subcategories were “membrane,” “cell part,” and “cell” in the CC classification. As observed for the DEGs, for MF classification, “catalytic activity” and “binding” were the dominant major subgroups. “Metabolic process,” “cellular process,” and “single-organism process” were the major subcategories in the BP classification. GO annotation classifications of all transcripts were divided into 52 subcategories. A total of 470 of the 1332 DETs were annotated by KEGG and matched 91 pathways (S-Table 3); most annotated pathways in the transcriptome were related to carbon metabolism and amino acid biosynthesis in a total of 17 DETs. Additionally, 349 DETs were annotated in the COG database; 611 DETs were annotated in the KOG database; 1320 DETs were annotated in NR; 721 DETs were annotated in Pfam; and 805 DETs were annotated in Swiss-Prot. Further, 1125 DETs were annotated in eggNOG, and most DETS were related to the post-translational modification, protein turnover, chaperone classes, and unknown functions (Fig. 3D). On the basis of these results, the number of annotated DEGs was found to be significantly lower than the number of DETs.
Differential expression of the genes for the fruit shape in two types of peaches
Fruit shape is an important trait affecting the appearance of peaches [3], including the Olecranon honey peach. In the bud sports, we found that both the peach diameter and the part resembling an eagle’s beak became smaller. Studies have shown that plant hormones affect plant growth and are closely related to fruit development, including the shape of the fruit. For example, auxin and cytokinin are important plant hormones that regulate fruit development [19]. The concentration of auxin in the center of the fruit is higher than that of the peel, and gibberellin is related to cell division [20] and is responsible for swelling in the tomato fruit development [21]. We identified 18 DEGs related to plant hormones (S-Table 4), which were divided into up-regulated genes (ONT.7681, gene 5670, gene 25091, ONT.9923) and down-regulated genes (ONT.16582, gene 2826, ONT.13765, gene 22504, ONT.16845, gene 6609, gene 2822, gene 2460, ONT.1953, gene 2824, ONT.1950, gene 9229, gene 2966, gene 26004); these genes were divided into three subgroups (S-Table 5). According to the GO annotation, the genes were related to biological regulation, reproductive processes, development processes, and growth (S-Table 5). The SAUR family protein (SAUR) is related to gene 9229 regulation in the signal transduction of plant hormones (ko04075) according to the KEGG annotation. It may play some role in the auxin signal transduction pathway involving calcium and calmodulin [22]. Gene 26004 regulates indole-3-pyruvate monooxygenase (YUCCA) in the tryptophan metabolism pathway (ko00380). This gene can promote the growth factors because it plays a role as a key auxin biosynthesis enzyme [23]. Gene 22504 codes for 3-epi-6-deoxocathasterone 23-monooxygenase (CYP90C1/ROT3) in the brassinosteroid biosynthesis pathway (ko00905). CYP90C1/ROT3 indicates the redundant brassinosteroid C-23 hydroxylase [24]. ONT.1950, ONT.1953, gene 2822, gene 2824, and gene 2826 regulate the PHYB activation-tagged suppressor named CYP73A1/BAS1 in ko00905, based on the KEGG annotation. ONT.1950, ONT.1953, gene 2822, gene 2824, and gene 2826 were shown to be involved in the brassinosteroid biosynthesis pathway with gene 22504. CYP90C1 and CYP73A1 play a key role in brassinolide biosynthesis.
Differential expression of the transcripts for the fruit shape in two types of peaches
In the analysis of the signal transduction pathways for plant hormones, we identified seven DETs by the KEGG annotation: ONT.5631.5, ONT.5631.9, rna13157, ONT.942.6, ONT.4473.2, rna4079, and rna4080. Both ONT.5631.5 and ONT.5631.9 were annotated as AUX1/LAX as well as auxin influx carriers (AUX1/LAX family) in the ko04075 database. Rna13157 was annotated as a SAUR family protein that has been known to promote cell enlargement and plant growth (Fig. 4). ONT.942.6 was annotated as AHP, which is a histidine-containing phosphotransfer protein (Fig. 4). This protein is important in promoting cell division. ONT.4473.2 was annotated as EIN3 (Fig. 4) and was defined as an ethylene-insensitive protein 3. This protein facilitates fruit ripening and senescence. Rna4079 and rna4080 were annotated as TGA and defined as the transcription factor TGA by the KO annotation. TGA affects the resistance of plants to diseases [25]. In the analysis of the tryptophan metabolism pathway by the KEGG annotation, we identified three DETs: rna36192, ONT.10241.19, and ONT.10241.4. Rna36192 is annotated as YUCCA, an indole-3- pyruvate monooxygenase (Fig. 4), and ONT.10241.19 and ONT.10241.4 were annotated as katE/CAT, belonging to the catalase family. In the brassinosteroid biosynthesis pathway (ko00905), we identified 11 DETs, which were divided into two classes: CYP734A1 or BAS1, including ONT.1922.5, ONT.1923.2, ONT.1923.3, ONT.1949.1, ONT.1950.1, ONT.1953.1, rna3897, rna3899, and rna3900 (Fig. 4) and CYP90C1/ROT3, including ONT.15392.3 and rna31428. The former is defined as PHYB activation-tagged suppressor 1, whereas the latter is a 3-epi-6-deoxocathasterone 23-monooxygenase (Fig. 4). Using DataViz visual data analysis software to further analyze the relationship between differential transcriptome and fruit shape, we discovered that the fruit shape index was related to ONT.4473.2, ONT.5631.5, rna13157, rna4080, ONT.10241.19, ONT.10241.4, rna36192, ONT.15392.3, ONT.1923.2, ONT.1923.3, ONT.1949.1, ONT.1950.1, ONT.1953.1, rna3897, rna3899, and rna3900, but not to ONT.5631.9, ONT.942.6, rna4079, or ONT.1922.5. Moreover, the transverse diameter was related to all transcriptomes, expect for ONT.5631.9. In contrast, the longitudinal diameter did not correlate with any feature of the transcriptome (fig. 5C).