Identification and chromosomal distribution of E3 ubiquitin ligase genes in peach
In peach, 765 PpE3 ligase genes were identified through BLAST analysis using protein sequences from Arabidopsis and grape as queries against the peach genome database (Table 1; Additional file 1: Table S1). The PpE3 ligase genes account for almost 3.0% of the predicted proteins in the peach genome. The number of putative E3 ligase genes in peach was greater than that identified in V. vinifera (677) but considerably lower than the numbers in the six other species compared (Table 1).
The E3 ligase gene family normally comprises nine subfamilies. Only eight were identified in the peach genome, namely the BTB, Cullin, DDB, F-box, HECT, RING, SKP and U-box subfamilies; the RBX subfamily was not found. Subfamily size varied widely. The RING subfamily was the largest, with 338 genes, followed by the F-box subfamily, with 267. Together, the RING and F-box subfamilies represented 79% of the predicted PpE3 ligase genes. The smallest was the DDB subfamily, with only 3 genes.
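As a quick arithmetic check on the proportion reported above, the following minimal Python sketch recomputes the combined share of the RING and F-box subfamilies. Only the counts come from the text; the variable and function names are illustrative.

```python
# Counts taken from the text; names are illustrative only.
TOTAL_PPE3 = 765
subfamily_counts = {"RING": 338, "F-box": 267}


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


combined = sum(subfamily_counts.values())
combined_share = share(combined, TOTAL_PPE3)
print(f"RING + F-box: {combined} genes ({combined_share:.1f}% of {TOTAL_PPE3})")
```

The two subfamilies together contribute 605 of the 765 genes, which rounds to the 79% stated in the text.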
For each PpE3 ligase gene, the exons, introns, additional domains and the length of each domain were analyzed (Additional file 1: Table S2). Most of the PpE3 ligase genes contained introns, with the number per gene varying from 0 to 30. According to the number of introns, the PpE3 ligase genes were divided into 5 groups. Most of the PpE3 ligase genes (408) contained between 1 and 5 introns. Of these genes, 191 lacked an intron in the second position. Only 4 PpE3 ligase genes, 3 in the RING and 1 in the HECT subfamily, had more than 20 introns. The differences in intron number among subfamilies might be related to their different functions.
All 765 of the identified PpE3 ligase genes were mapped onto the eight peach chromosomes (Additional file 2: Figure S1). The largest number of genes (173) was located on chromosome 1, including 16 BTB, four Cullin, 78 F-box, two HECT, 61 RING, one SKP and 11 U-box genes. Only 68 genes were found on chromosome 8 (Additional file 1: Table S3). Members of the BTB, U-box, F-box and RING subfamilies were found on every chromosome, whereas there was no Cullin gene on chromosomes 2, 6, or 7, no SKP gene on chromosomes 3, 5, or 6, no HECT gene on chromosomes 2, 3, or 5, and no DDB gene on chromosomes 1, 3, 4, 5, or 8. The more abundant E3 ligase gene subfamilies were mainly present on the longer chromosomes (Chr 1 and Chr 6). These results indicate that the PpE3 ligase genes are not evenly distributed among the chromosomes.
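The chromosome 1 tally above can be verified, and compared with a naive uniform expectation, using a short Python sketch. The per-subfamily counts and the chromosome 8 total come from the text; the uniform baseline of 765/8 genes per chromosome is an illustrative assumption that ignores differences in chromosome length.

```python
# Counts for chromosome 1 as listed in the text.
chr1_counts = {
    "BTB": 16, "Cullin": 4, "F-box": 78, "HECT": 2,
    "RING": 61, "SKP": 1, "U-box": 11,
}
TOTAL_PPE3 = 765
N_CHROMOSOMES = 8
chr8_total = 68  # stated in the text

chr1_total = sum(chr1_counts.values())
# Assumed baseline: genes spread uniformly over chromosomes (ignores length).
even_share = TOTAL_PPE3 / N_CHROMOSOMES

print(f"Chr 1 total: {chr1_total}")
print(f"Uniform expectation per chromosome: {even_share:.1f}")
print(f"Chr 1 share: {100 * chr1_total / TOTAL_PPE3:.1f}%")
print(f"Chr 8 share: {100 * chr8_total / TOTAL_PPE3:.1f}%")
```

The listed subfamily counts sum to the reported 173 genes, and chromosomes 1 and 8 fall on opposite sides of the uniform baseline.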
Gene duplication pattern analysis
To explain the expansion and evolution of the E3 ligase gene family in peach, the patterns of gene duplication were analyzed and compared across the peach genome (Table 2). Dispersed duplication (DSD, 48% of 765) accounted for the largest share of the PpE3 ligase genes, followed by tandem duplication (TD, 20%) and whole-genome duplication (WGD, 19%). However, the PpE3 subfamilies did not all expand in the same way. For the BTB subfamily, the largest share of genes was derived from DSD (51% of 67), followed by WGD (21%), TD (16%) and proximal duplication (PD, 4%). DSD was also the most common duplication mode for the F-box (39% of 267), HECT (100% of 7), RING (53% of 338), SKP (44% of 16), and U-box (59% of 54) subfamilies. WGD was the second most common mode of duplication for the RING (27%), SKP (31%), U-box (30%) and Cullin (30%) subfamilies. In the F-box subfamily, TD (37% of 267) was the second most common mode of gene duplication. No duplicated gene pairs were detected in the DDB subfamily.
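To give a sense of scale, the percentages above can be converted back into approximate gene counts. The sketch below is illustrative only: because the reported percentages are rounded, the resulting counts are estimates, and the exact values are those in Table 2. Function and variable names are hypothetical.

```python
def approx_count(percent, total):
    """Estimate a gene count from a rounded percentage of a known total."""
    return round(percent * total / 100)


# Percentages and totals as reported in the text.
overall_pct = {"DSD": 48, "TD": 20, "WGD": 19}   # of 765 PpE3 genes
fbox_pct = {"DSD": 39, "TD": 37}                 # of 267 F-box genes

approx_overall = {mode: approx_count(p, 765) for mode, p in overall_pct.items()}
approx_fbox = {mode: approx_count(p, 267) for mode, p in fbox_pct.items()}

for mode, n in approx_overall.items():
    print(f"All PpE3, {mode}: about {n} genes")
for mode, n in approx_fbox.items():
    print(f"F-box, {mode}: about {n} genes")
```

In the F-box subfamily the estimated DSD and TD counts are close to each other, which is consistent with tandem duplication playing a larger role there than in the family as a whole.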
The genomic distribution of the different types of gene duplications found in the PpE3 family was examined (Fig. 1; Additional file 1: Table S4). The DSD, TD, WGD, PD and singleton patterns of each subfamily are shown in Fig. 1 as concentric layer 0, layer 1, layer 2, layer 3 and layer 4, respectively. Each syntenic pair is linked by a colored line, with the colors representing the different subfamilies. The DDB subfamily contained only singleton genes. In the other subfamilies, DSD (layer 0) was the most prevalent gene duplication pattern and was found on every peach chromosome. The other patterns, including TD, WGD, PD and unduplicated singletons, were distributed among the different chromosomes without an obvious pattern. These results provide further insights into the expansion of the PpE3 family in peach.
Classification and phylogenetic analysis
The SMART and Pfam databases were used to detect the subfamily-specific domains, shared domains, and other domains of the predicted PpE3 proteins in the eight subfamilies in peach. All E3 proteins in peach carried their subfamily-specific domain. In all, 81 other types of domains were identified in five of the E3 subfamilies (BTB, F-box, U-box, RING and HECT), some of which appeared in more than one subfamily. The classification of the BTB, F-box and U-box genes was based on the subfamily-specific domains plus additional domains.
According to the domains present in the BTB proteins, the BTB subfamily could be divided into 14 subgroups (Fig. 2), with 1 to 22 genes in each subgroup (Additional file 1: Table S5). Notably, 21 BTB genes containing the NPH3 domain were detected in the BTBN subgroup. Four new subgroups of BTB proteins with different combinations of domains were identified in peach and named BTBAND (BTB-ANK-NPR-DUF), BTBBL (BTB-BACK-LRR), BTBP (BTB-Pentapeptide) and BTBAN (BTB-ANK-NPR). The presence of these new BTB subgroups in peach implies that these genes might have novel functions during the growth and development of peach. The results of the phylogenetic analysis of the BTB subfamily are shown in Fig. 3. Members of most subgroups, such as BTBN, BTBM and BTBT, clustered together. The results were consistent between the SMART and Pfam databases.
F-box proteins contain different domains and are classified into differing numbers of subgroups in different species. In peach, the FBX subgroup had 128 members with no identified domain other than the F-box motif. The other nine subgroups were named according to a previous study [27]. Subgroup size varied widely, with 64 genes in FBA, 19 in FBD, 2 in FBDL, 14 in FBL, 12 in FBK, 6 in FBP, 2 in FBW, 8 in FBDUF and 12 in FBO (Additional file 2: Fig. S2; Additional file 1: Table S5). Most members of some subgroups, such as FBX, FBA, FBP, FBK, FBL, and FBD, clustered together. Members of the FBO subgroup were scattered across the tree, possibly because of the non-uniform domains in this subgroup (Additional file 2: Fig. S3; Additional file 1: Table S5).
The U-box proteins of peach were divided into seven subgroups according to the identity of the additional domain (Additional file 2: Fig. S4; Additional file 1: Table S5). The U-box subgroup has 15 members that carry only the U-box domain without any other specific domains. Twenty-five U-box proteins contained the U-box domain and the ARMADILLO (ARM) domain, which was present in 1 to 6 repeats. Eight U-box proteins contained a Pkinase domain. The TPR (TetratricoPeptide Repeat), UFD2 (Ubiquitin Fusion Degradation 2), and KAP (Kinesin-associated protein) subgroups each contained one member, and the WD40 subgroup had three members. In the phylogenetic tree of the U-box proteins, members of the ARM, U-box, Pkinase and WD40 subgroups clustered together (Additional file 2: Fig. S5; Additional file 1: Table S5). The phylogenetic analysis of the seven subgroups partially supports our classification based on the SMART and Pfam domain analyses.
A total of 352 RING domains were found in the 338 predicted RING proteins identified in the peach genome (Additional file 2: Fig. S6; Additional file 1: Table S5). According to the spacing between the metal-ligand residues or substitutions at one or more of the metal-ligand positions, the RING subfamily was classified into six subgroups: RING-C2 (18), RING-G (3), RING-HC (198), RING-H2 (109), RING-S/T (4) and RING-v (20). In the phylogenetic tree, members of most subgroups clustered together, with the exception of Prupe.1G303200 and Prupe.5G119800, two members of the RING-S/T subgroup that clustered with the RING-HC subgroup (Additional file 2: Fig. S7; Additional file 1: Table S5). This may be because these two genes evolved from the RING-HC subgroup.
The C-terminal end of the HECT proteins from peach contained an approximately 350-amino acid HECT domain (Additional file 1: Table S5). Proteins in the HECT subfamily also contained other domains, such as UBA (Ub-associated) and UIM (Ub-interacting), both of which are potentially important for ubiquitin ligase function. Based on the presence of additional protein motifs as predicted by the SMART and Pfam databases, the HECT subfamily could be divided into three subgroups: (i) Subgroup I, containing only the HECT domain (4); (ii) Subgroup II, containing UBA, UIM and DUF domains (2); and (iii) Subgroup III, containing a UBQ (Ubiquitin homologues) domain (1) (Additional file 2: Fig. S8; Additional file 1: Table S5). The phylogenetic tree of the HECT subfamily agreed with the classification obtained using the SMART and Pfam databases (Additional file 2: Fig. S9; Additional file 1: Table S5).
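The subgroup sizes reported above for the F-box, U-box, RING and HECT subfamilies can be cross-checked against the subfamily totals given earlier. The following Python sketch is only a bookkeeping aid: all counts are copied from the text, and the dictionary layout and names are illustrative.

```python
# Subgroup sizes as reported in the text.
subgroups = {
    "F-box": {"FBX": 128, "FBA": 64, "FBD": 19, "FBDL": 2, "FBL": 14,
              "FBK": 12, "FBP": 6, "FBW": 2, "FBDUF": 8, "FBO": 12},
    "U-box": {"U-box only": 15, "ARM": 25, "Pkinase": 8, "TPR": 1,
              "UFD2": 1, "KAP": 1, "WD40": 3},
    "RING": {"RING-C2": 18, "RING-G": 3, "RING-HC": 198,
             "RING-H2": 109, "RING-S/T": 4, "RING-v": 20},
    "HECT": {"I": 4, "II": 2, "III": 1},
}

# Totals stated in the text. For RING, the subgroup counts add up to the
# number of RING domains (352), not the number of RING proteins (338).
expected = {"F-box": 267, "U-box": 54, "RING": 352, "HECT": 7}

totals = {name: sum(groups.values()) for name, groups in subgroups.items()}
for name, total in totals.items():
    status = "matches" if total == expected[name] else "differs from"
    print(f"{name}: {total} ({status} reported {expected[name]})")
```

All four sums agree with the reported figures. The RING subgroup counts match the number of RING domains rather than the number of RING proteins, which suggests that proteins carrying more than one RING domain were counted once per domain.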
Expression of E3 ubiquitin ligase genes during fruit ripening in MF and SH peach
To reveal the expression patterns of PpE3 ligase genes in peach fruit, the fruit transcriptome was analyzed in an MF and an SH cultivar across four stages of ripening. An average of 37,438,865 paired-end reads were obtained after filtering out low-quality reads and reads that mapped to rRNA. About 95.5% of the high-quality reads were mapped to the peach reference genome (Additional file 1: Table S6).
Among the 515 expressed PpE3 ligase genes, 231 differentially expressed genes (DEGs) were identified at the same stage between the two peach cultivars (MF vs. SH) (Table 3; Additional file 1: Table S7). Fifteen randomly selected PpE3 ligase genes and eight genes related to the ethylene, auxin and ABA pathways were used to confirm the expression levels by quantitative real-time PCR (qRT-PCR) during fruit ripening in the MF and SH cultivars. The qRT-PCR results were consistent with those of RNA-seq (Fig. 4). The number of DEGs differed among subfamilies (Table 3). It was highest for the RING subfamily, while the highest rate of differential expression within a subfamily was found for the HECT subfamily.
According to their expression patterns at the same stage of ripening, the 231 DEGs could be classified into eight clusters (Fig. 5). In cluster I, 47 DEGs showed lower expression (1.7- to 42.0-fold) in MF at stage S4III. Twenty-two of the cluster I DEGs belonged to the RING subfamily. Notably, the expression of Prupe.3G223000 was 42.0-fold lower in MF than in SH fruit. Prupe.3G223000 was annotated as a BTB/POZ protein with an NPH3 domain and has high homology with At3G22104 (function unknown). Meanwhile, one F-box gene, Prupe.8G253300 (PpTIR1), which is predicted to function as the auxin receptor TIR1 (TRANSPORT INHIBITOR RESPONSE1), showed a 3.3-fold lower expression level in MF-S4III than in SH-S4III. Prupe.1G097200 (PpXBAT32), in the RING subfamily, has high homology with AtXBAT32 (At5G57740).
DEGs in cluster II showed higher expression levels in MF peach at almost all analyzed stages of ripening. Twenty genes belonged to this cluster, including two BTB genes, eight F-box genes, nine RING genes and one U-box gene. Prupe.1G020100, a member of the RING subfamily, showed the greatest fold change between the MF and SH peaches (7.2-fold) at stage S4I. Another gene in this cluster, Prupe.8G193500 (PpPUB9), is a U-box gene with high homology to AtPUB9 (At3G07360).
The transcript abundance of DEGs in cluster III was lower in MF than in SH at stage S3, but higher in MF at stage S4III. This cluster contained three BTB genes, one F-box gene, eight RING genes and six U-box genes. Prupe.6G222100, a BTB gene, showed the greatest fold change (-4.6-fold in MF) at stage S3. This gene (PpBT4) had high homology with AtBT4 (At5G67480, BTB and TAZ domain protein 4). Prupe.3G125000, a RING gene, showed the greatest fold change (27.0-fold in MF) at stage S4III. Prupe.3G125000 (PpATL43) has high homology with ATL43 (At5G05810).
Cluster IV, the second largest group with 42 genes, showed lower expression in MF at stage S3 but equivalent transcript levels at the other three stages. Expression at stage S3 was 2.0- to 3.8-fold lower in MF than in SH. Notably, four of the five CULLIN DEGs were in this cluster: Prupe.1G138700, Prupe.5G063200, Prupe.5G063700 and Prupe.8G255500. The CULLIN 1 homolog (Prupe.5G063700), which encodes a protein that forms an SCF complex with SKP1 and an F-box protein, showed the maximum fold change (-3.8-fold) in this cluster.
The DEGs in cluster V, the third largest group with 33 genes, showed higher expression levels at stage S4I and lower expression levels at the other stages in the MF fruits. Prupe.5G140900, a RING gene, showed the highest fold change (64.9-fold in MF) at stage S4I but was 1.43-fold lower in MF at both stages S4II and S4III. Prupe.5G140900 had high homology with AtORTH/VIM (At1G57820, ORTHRUS/VARIANT IN METHYLATION). Prupe.3G242700, another RING gene, showed the greatest reduction in expression (-8.1-fold) at stage S4III. The BTB gene Prupe.8G240700 showed the second greatest reduction (-5.9-fold) in MF compared to SH at stage S4III.
Genes in cluster VI showed lower expression levels in MF at stages S3 and S4I, but higher expression levels at stages S4II and S4III. Prupe.1G493000 had the greatest swing in expression levels, with the greatest expression reduction (-4.7-fold) at stage S4I and the greatest expression increase (21.1-fold) at stage S4II. Prupe.1G493000 (PpBB) is a RING protein highly homologous to AtBB (At3G63530, AtBIG BROTHER).
Cluster VII included 29 genes, most of which showed lower expression levels in MF peaches at all four stages. Prupe.6G014500, a U-box gene with high homology to AtPUB11 (At1G23030), showed the greatest fold change (-26.0-fold) at stage S3. However, the function of AtPUB11 is not yet clear.
Cluster VIII contained one BTB, five F-box, nine RING and seven U-box genes that had their highest expression levels in MF at stage S4III. The U-box gene Prupe.8G024500 (PpPUB29, highly homologous to MdPUB29, MDP0000928620) showed the greatest fold change (56-fold) in MF compared to SH at stage S4III. Among the five F-box genes in this cluster, two (Prupe.1G480700 and Prupe.7G244300) were annotated as EBFs (EIN3-binding F-box proteins).
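The fold changes quoted for the clusters above mix positive values (higher in MF) with negative values (lower in MF). The Python sketch below shows one common way to compute such signed fold changes. It assumes that a negative value denotes the SH/MF ratio with a minus sign, as notation such as "-4.6-fold in MF" suggests; the function name and the expression values in the example are hypothetical and are not taken from the data set.

```python
def signed_fold_change(mf_expr, sh_expr):
    """Signed fold change of MF relative to SH.

    Positive result: MF is higher, value = MF / SH.
    Negative result: MF is lower, value = -(SH / MF).
    Both inputs must be positive expression values (e.g. FPKM).
    """
    if mf_expr <= 0 or sh_expr <= 0:
        raise ValueError("expression values must be positive")
    if mf_expr >= sh_expr:
        return mf_expr / sh_expr
    return -(sh_expr / mf_expr)


# Hypothetical expression values chosen only to illustrate the sign convention.
higher_in_mf = signed_fold_change(56.0, 1.0)
lower_in_mf = signed_fold_change(2.0, 52.0)
print(higher_in_mf, lower_in_mf)
```

Under this convention a gene that is expressed equally in both cultivars has a fold change of 1, and no value falls between -1 and 1.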