Changes in pericarp structure
First, we evaluated the dynamic structures of fruit pericarps during different stages of ripening (Fig. 1). In the noncracking stage (PS), the pericarp cells and cuticles were densely arranged, with small intercellular spaces and continuous distribution (Fig. 1a, 1d, 1g). However, in the initial cracking stage, the cell wall thinned, the cell volume increased, the number of cell layers decreased, and the cells were loosely arranged with poor integrity; moreover, the spacing between the cells increased, and the cell of the exocarp and mesocarp began to degrade in the initial cracking stage (PM; Fig. 1b, 1e, 1h). Irregularly arranged layers with continuously reduced numbers and larger spaces were observed as an evidence of cell degradation throughout the cracking stages (PL; Fig. 1c, 1f, 1i).
Transcriptomic analysis overview
Nine cDNA libraries were constructed, and 47.05, 46.92, and 54.00 million raw sequence reads were generated from the PS, PM, and PL libraries, respectively. After removing low-quality reads and adaptor sequences, 46.45, 46.35, and 53.46 million clean reads with 91.58−93.75% Q30 bases and 45.94−48.48% GC content, respectively, were obtained (Supplementary Table 1). The resulting A. trifoliata transcriptome contained 241,376 transcripts, ranging from 201 to 2000 bp, and 186,054 unigenes (> 200 bp; Table 1 and Fig. S1).
All unigenes were annotated using Basic Local Alignment Search Tool (BLAST) searches against the following five databases: National Center for Biotechnology Information nonredundant protein sequences database (NR; 100,329; 53.9% and 41.57% of all identified unigenes and transcripts, respectively), SwissProt (56,346; 30.3% and 23.3%), Protein Families database (Pfam; 34,428; 18.5% and 14.3%), Gene Ontology database (GO; 44,558; 23.9% and 18.5%), and Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG; 30,298; 16.3% and 12.6%). This indicated that the NR database provided the largest number of annotations, suggesting that 100,329 unigenes corresponded with sequences from at least one of the public databases, and 7283 unigenes were annotated to all databases. Moreover, 17,601, 19,281, and 13,525 unique unigenes and 45,866, 48,104, and 43,303 absent unigenes were identified in PS, PM, and PL, respectively. Most of these were uncharacterized, unknown, and hypothetical proteins in the annotation results (Tables S2-S3).
Among these unigenes, 9301 were identified as differently expressed genes (DEGs), including 2703 (779 up- and 1924 downregulated), 4694 (3815 up- and 879 downregulated) and 1904 co-expressed in PM versus PS and PL versus PM groups, respectively (Table 2), which are presented in a volcano plot (Fig. S2a-S2b). The summary information of these DEGs is shown in Table S4.
Functional classification of identified DEGs
Bioinformatics analysis indicated that most DEGs in the GO terms of biological process (BP) were involved in cellular amide metabolic processes and amide metabolic processes. Genes related to structural molecular activity and oxidoreductase activity accounted for the highest proportions of DEGs in the molecular function (MF) category in both the PM_PS and PL_PM groups. Cytoplasmic parts and intracellular ribonucleoprotein complexes comprised the highest proportions of DEGs in the cell components (CCs) in the PM_PS and PL_PM groups, respectively (Fig. 2a-2b).
KEGG pathway analysis indicated that many DEGs were enriched in metabolic pathways and ribosomes in the PM_PS comparison and in the biosynthesis of secondary metabolites and metabolic pathways in the PL_PM comparison (Fig. 2e-2f). Cluster analysis showed that cell wall-related DEGs (285), including pentose and glucuronate interconversion, the phenylpropanoid pathway, galactose metabolism, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism, and transcription factors, were clustered closely in the PM_PS and PL_PM groups. Most DEGs were downregulated in the PM_PS group, but these were upregulated in the PL_PM group (Fig. 3).
Quantitative proteome analysis
In total, 812,625 spectra, 68,151 identified spectra, 12,456 peptides, 10,572 unique peptides, and 2839 proteins were determined via proteomic analysis (Table 1 and Table S3). In terms of protein mass distribution, proteins with molecular weights greater than 9 kDa had a wide range and good coverage, with a maximum distribution area of 10–40 kDa. Peptide quantitative analysis of the proteins showed that protein quantity decreased with an increase in matching peptides (Fig. S3).
Among these 2839 proteins, 223 were identified as differentially abundant proteins (DAPs), including 173 (75 up- and 98 downregulated), 33 (20 up- and 13 downregulated), and 17 co-expressed proteins in the PM_PS and PL_PM groups, respectively (Table 2), which are presented in a volcano plot (Fig. S2c-S2d). The summary information of these DAPs is shown in Table S4.
Functional classification of the identified DAPs
GO analysis showed that most DAPs in the BP category were involved in cellular responses to chemical stimulus and cellular oxidant detoxification processes in the PM_PS group and in metabolic processes and macromolecule metabolic processes in the PL_PM group. The highest proportions of DAPs in the MF category were involved in oxidoreductase activity and antioxidant activity in the PM_PS group and structural constituent of ribosome and structural molecule activity in the PL_PM group. The extracellular region in the PM_PS group and cytoplasmic parts in the PL _PM group showed the highest portions of DAPs in the CC category (Fig. 2c-2d). KEGG pathway analyses indicated that many proteins were enriched in the two-component system and ribosome pathways in the PM_PS and PL_PM groups, respectively (Fig. 2g-2h).
Hierarchical cluster analysis showed that 40 cell wall-related DAPs, including pentose and glucuronate interconversions, the phenylpropanoid pathway, galactose metabolism, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism, and other cell wall metabolism-related proteins, were clustered closely in the PM_PS and PL_PM groups. Notably, most DAPs were upregulated in both the PM_PS and PL_PM groups. Alternatively, DAPs involved in phenylpropanoid pathways and galactose metabolism were downregulated in the PM_PS group but upregulated in the PL_PM group (Fig. 3g-3l). Moreover, the protein-protein interaction (PPI) network analysis indicated that 85 DAPs, including 29 upregulated (28 and 1) and 56 downregulated DAPs (46 and 10), were involved in the interaction networks in the PM_PS and PL_PM groups, respectively (Fig. 4).
Comparative analysis between protein abundance and gene expression
There were more DEGs (11205) than DAPs (240) and shared DEGs (1904) than shared DAPs (17) in cracked fruit than in non-cracking fruit. Most were downregulated in the PM_PS group but upregulated in the PL_PM group (Table 2). Moreover, integration of the proteome and transcriptome data showed that 14 and 4 DAPs were matched with their DEGs. In addition, 12 (4 up- and 8 downregulated) and 4 DAPs (2 up- and 2 downregulated) showed the same tendency as DEGs in the PM_PS and PL_PM groups, respectively, with two showing the opposite tendency as DEGs in the PM_PS group (Fig. 5a-5b). Furthermore, the fold-changes in DAPs indicated differentially positive correlations with their corresponding DEGs based on Pearson’s correlation tests. A limited correlation (r = 0.03 and 0.11) was detected between the proteome and transcriptome, and a relatively higher positive correlation was identified with the same trend (r = 0.9161 and 0.8) for DEGs and DAPs in the PM_PS and PL_PM groups, respectively (Fig. 5c-5f).
Identification of DAPs and DEGs associated with candidate pathways
The correlation of proteome and transcriptome GO enrichment indicated that the largest groups of those DEGs and DAPs within the BP, MF, and CC categories were linked to metabolic and cellular processes, catalytic activity and binding, and cells and cell parts, respectively, in both the PM_PS and PL_PM groups (Fig. S4a-4b). KEGG enrichment showed that most were significantly enriched in phenylpropanoid biosynthesis, pentose and glucuronate interconversions, amino sugar and nucleotide sugar metabolism, and galactose metabolism in the PM_PS group. Most were significantly enriched in pentose and glucuronate interconversions and the galactose metabolism pathway in the PL_PM group for both the proteome and transcriptome (Fig S4c-4d). Comparative analysis showed that two pathways, i.e., pentose and glucuronate interconversion and galactose metabolism pathways, were shared in both the PM_PS and PL_PM groups. In contrast, the phenylpropanoid biosynthesis pathway was only shared in the PM_PS group.
Validation of data reliability through reverse transcription real-time quantitative PCR (qPCR) and parallel reaction monitoring (PRM)
qPCR experiments were performed for 20 selected DEGs, and the results are shown in Fig. 6. In the PM_PS group, the phenylpropanoid pathway-related genes 4-coumarate-COA-ligase (4CL), peroxidase (PRX), and PRX2 were downregulated, whereas cinnamyl-alcohol dehydrogenase (CAD) and shikimate O-hydroxycinnamoyltransferase (HCT) were upregulated. The galactose metabolism-related genes β-GAL1 and β-GAL2; amino sugar and nucleotide sugar metabolism-related gene beta-d-xylosidase (BXL); and starch and sucrose metabolism-related genes cellulase (CEL), cellulose synthase-like protein (CSLG), and glucan endo-1,3-beta-d-glucosidase (ENDOB) were downregulated. The cell wall metabolism genes NAC, NAC-like, and EXP1 were downregulated, whereas the BHLH transcription factor and dirigent protein (DIR2) were upregulated. The pentose and glucuronate interconversion-related genes PL, PG, and PE were upregulated. Most of these genes, except 4CL, CAD, β-GAL, and EXP1, were significantly upregulated in the PL_PM group. Moreover, the expression of 13 candidate genes, including DIR2, NAC-like, EXP1, CAD, β-GAL1, β-GAL2, 4CL, ENDOB, PE, BHLH, PG3, CEL, and PRX2, showed strong correlations and that of seven genes showed poor correlations with the corresponding RNA-seq data (Fig. 6; Table 3).
Moreover, 20 DAPs were selected for PRM analysis, of which 18 exhibited significantly different levels. Of these 18 DAPs, 14 (77.8%) showed the same trends in abundance between PRM and TMT quantification, including PE, PL, PG2, furostanol glycoside 26-O-beta-glucosidase (F26G), β-GAL2, auxin efflux carrier, alpha/beta hydrolase (α-HY), PRX2, PG4, PRX5, PRX3, endoglucanase 19, endoglucanase 8, and DIR1. Additionally, four genes (PRX, beta-fructofuranosidase, BXL, and BGLU33) showed inconsistent abundance compared with protein levels quantified by TMT (Table 4). In general, the trends in the expression changes measured by PRM and TMT were consistent. Notably, in the PM_PS and PL_PM groups, several genes involved in cell wall metabolism pathways showed consistent upregulation/downregulation in the transcriptome and proteome (Fig. 7).