Identification of BAHD genes in seven Rosaceae species
The characteristic domain of the BAHD superfamily (Pfam: PF02458) and its corresponding Hidden Markov Model (HMM) profile were used to identify BAHD members. The online tool SMART (http://smart.embl-heidelberg.de/) was used to analyze the protein sequences of candidate genes and to confirm the presence of the BAHD domain. This yielded 773 putative BAHD candidate genes across the seven species (E-value < 1e−10). A multiple sequence alignment was then conducted to verify the presence of the two characteristic conserved domains (HXXXD and DFGWG) in the BAHD family genes [8]. Because they lacked both domains, three, six, six, six, seven, eight and 20 genes were removed from Prunus persica (peach), Pyrus bretschneideri (Chinese white pear), Pyrus communis (European pear), Rubus occidentalis (black raspberry), Fragaria vesca (strawberry), Malus × domestica (apple) and Prunus avium (sweet cherry), respectively, for a total of 56 removed sequences. Ultimately, 717 BAHD genes were identified and analyzed (Table 1). Detailed information on the features of these BAHD genes is available in Additional file 1: Table S1.
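The filtering logic described above can be sketched in Python. This is a minimal illustration, not the authors' pipeline: the study verified the motifs through a multiple sequence alignment, whereas this sketch assumes a plain regular-expression scan of each protein sequence. The function names and the toy sequences are hypothetical; only the E-value cutoff and the two motif patterns come from the text.

```python
import re

# Conserved BAHD motifs named in the text; X stands for any residue.
HXXXD = re.compile(r"H[A-Z]{3}D")
DFGWG = re.compile(r"DFGWG")

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def lacks_both_motifs(protein_seq: str) -> bool:
    """Return True when neither HXXXD nor DFGWG occurs in the sequence."""
    seq = protein_seq.upper()
    return HXXXD.search(seq) is None and DFGWG.search(seq) is None


def filter_candidates(candidates):
    """Keep gene IDs that pass the E-value cutoff and carry at least one motif.

    candidates: iterable of (gene_id, e_value, protein_seq) tuples
    (a hypothetical input layout chosen for this sketch).
    """
    kept = []
    for gene_id, e_value, seq in candidates:
        if e_value >= E_VALUE_CUTOFF:
            continue  # domain hit not significant enough
        if lacks_both_motifs(seq):
            continue  # removed, as for the 56 sequences in the text
        kept.append(gene_id)
    return kept


# Toy sequences invented purely for illustration.
toy = [
    ("geneA", 1e-30, "MKSHKVADGAAFYEADFGWGKP"),  # both motifs present
    ("geneB", 1e-20, "MKSAAAAGGLLP"),            # neither motif
    ("geneC", 1e-5, "MHKVADDFGWG"),              # fails the E-value cutoff
]
kept_ids = filter_candidates(toy)
```

A strict `DFGWG` pattern will miss proteins that carry a variant of the motif, so an alignment-based check as used in the study is the more robust choice for real data.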
Table 1 Genomic information and identified BAHD gene numbers in Rosaceae species
| Common name | Scientific name | Chromosome number | Release version | Genome gene number | Identified BAHD genes |
| --- | --- | --- | --- | --- | --- |
| Chinese white pear | Pyrus bretschneideri | 34 | NJAU, v1.0 | 42,341 | 114 (120) |
| Apple | Malus × domestica | 34 | JGI, v1.1 | 63,541 | 141 (149) |
| Strawberry | Fragaria vesca | 14 | GDR, v4.0 | 32,831 | 89 (96) |
| European pear | Pyrus communis | 34 | GDR, v2.0 | 37,445 | 97 (103) |
| Sweet cherry | Prunus avium | 16 | GDR, v1.0 | 43,679 | 125 (145) |
| Peach | Prunus persica | 16 | JGI, v1.1 | 27,864 | 82 (85) |
| Black raspberry | Rubus occidentalis | 14 | GDR, v3.0 | 33,286 | 69 (75) |
Database addresses: NJAU (http://peargenome.njau.edu.cn/); GDR (http://www.rosaceae.org/); JGI (http://www.jgi.doe.gov/). Numbers in parentheses are the gene counts before filtering out unanchored genes and genes lacking the conserved domains.
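As a quick consistency check, the counts in Table 1 can be tallied against the totals reported in the text. The sketch below only re-adds the numbers from the table; the variable names are illustrative.

```python
# (identified, before_filtering) per species, copied from Table 1.
table1 = {
    "Chinese white pear": (114, 120),
    "Apple": (141, 149),
    "Strawberry": (89, 96),
    "European pear": (97, 103),
    "Sweet cherry": (125, 145),
    "Peach": (82, 85),
    "Black raspberry": (69, 75),
}

total_identified = sum(kept for kept, _ in table1.values())
total_before = sum(before for _, before in table1.values())

# Genes removed per species = count before filtering minus final count.
removed = {sp: before - kept for sp, (kept, before) in table1.items()}
total_removed = sum(removed.values())

print(total_before, total_removed, total_identified)
```

The per-species differences reproduce the removal counts given in the text (for example, 20 for sweet cherry and three for peach).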
Phylogenetic and conserved motif analyses of BAHD genes in Chinese white pear
The amino acid sequences encoded by BAHD genes in Chinese white pear, European pear, Arabidopsis and Populus were used to construct a phylogenetic tree. The Chinese white pear BAHD genes were classified into five clades (I, II, III-a, IV and V; Fig. 1). Clade I consisted of two subclades (I-a and I-b). The Arabidopsis genes belonging to clade I-a are involved in modifying aromatic and aliphatic alcohols in Arabidopsis and Populus [26, 27]; therefore, we speculated that the Chinese white pear BAHD genes clustered in clade I-a might encode proteins with similar functions. The members of clade I-b have functions related to the biosynthesis of lignin monomeric intermediates [28, 29] and include the tobacco and Arabidopsis shikimate hydroxycinnamoyltransferases [30]. Clade II also consisted of two subclades (II-a and II-b). The functions of the Arabidopsis genes belonging to clade II-a are unknown. Clade II-b contained two Arabidopsis genes, AT3G29590.1 (At5MAT) and AT1G03940.1 (At3AT1), which are associated with anthocyanin biosynthesis [31, 32]. Clade III-a contained few members, including four genes from Chinese white pear, three genes from Arabidopsis and one gene from poplar. Clade V contained nine members, including one well-studied Arabidopsis gene, AT4G24510.1 (CER2), which is involved in regulating cuticular wax biosynthesis [3]. Clade IV contained many members involved in catalyzing the acetylation of aromatic alcohols and of small- or medium-chain alcohols [27]. Additionally, only three and four members clustered in clades III-a and V, respectively. By contrast, ~28.1% (32 of 114) of the BAHD genes were in clade I and ~36.0% (41 of 114) were in clade IV. BAHD genes closely related to pear volatile ester contents, such as AAT, belonged to clades I and IV. A maximum-likelihood tree of the European pear BAHDs, with Arabidopsis and Populus as outgroups, was also constructed (Additional file 2: Figure S1).
Paralleling the phylogenetic tree built for Chinese white pear, the European pear BAHD genes were divided into five clades. Clades V and III-a each contained only four members, whereas ~36.1% (35 of 97) of the genes clustered in clade IV and ~26.8% (26 of 97) clustered in clade I. Additionally, five functionally characterized Arabidopsis acyltransferases [AT3G03480.1 (CHAT), AT2G23510.1 (SDT), AT5G48930.1 (AtHCT), AT3G29590.1 (At5MAT) and AT1G03940.1 (At3AT1)] were classified into two subgroups (I-a and II-b). These results provide putative candidates for the study of gene functions.
We detected 20 conserved motifs in the BAHD proteins of Chinese white pear using the online tool MEME (Fig. 2). All the BAHD family members contained motif1 or motif3, and ~65.8% (75 of 114) contained both. Based on a gene structural analysis, we determined that motif1 (CGGFAIGLSMSHKVADGSSLSTFINSWAE) and motif3 (FYEADFGWGKP) correspond to the HXXXD and DFGWG domains, respectively. Members of subclades I-a, I-b, II-a, II-b and III-a, as well as clade V, do not contain motif17, except for Pbr005916.1, nor do they contain motif19, except for Pbr010925.1. Members of clade IV do not contain motif10, except for Pbr005746.1, Pbr014025.1, Pbr035166.1 and Pbr036245.1. Motif14 and motif16 were detected only in clade II-b. The types and distributions of the conserved motifs within the same subclade were similar, further supporting the classification in the phylogenetic tree (Fig. 2). Information on the conserved motifs is shown in Additional file 3: Table S2.
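The correspondence between the two MEME motifs and the conserved BAHD signatures can be checked directly from the motif sequences quoted above. This is a small illustrative check, assuming that X in HXXXD stands for any single residue; the variable names are not from the study.

```python
import re

# Motif consensus sequences as quoted in the text.
motif1 = "CGGFAIGLSMSHKVADGSSLSTFINSWAE"
motif3 = "FYEADFGWGKP"

# HXXXD: histidine, any three residues, aspartate.
hxxxd_hit = re.search(r"H[A-Z]{3}D", motif1)
# DFGWG: literal five-residue signature.
dfgwg_hit = re.search(r"DFGWG", motif3)

print("motif1 contains:", hxxxd_hit.group())  # the HXXXD instance
print("motif3 contains:", dfgwg_hit.group())

# Share of Chinese white pear members carrying both motifs (75 of 114).
both_pct = round(100 * 75 / 114, 1)
```

In motif1 the HXXXD instance is HKVAD, and motif3 contains DFGWG verbatim.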
Gene duplication events identified in the pear BAHD superfamily and a BAHD collinearity analysis of seven Rosaceae species
Different modes of gene duplication have jointly promoted the evolution of the BAHD family, including whole-genome duplication (WGD) or segmental duplication, tandem duplication (TD), proximal duplication (PD), transposed duplication (TRD) and dispersed duplication (DSD) [34, 35]. We used DupGen_finder [36] to detect duplicated BAHD gene pairs in the seven Rosaceae genomes. All the BAHD family members were assigned to WGD, PD, TD, TRD or DSD. The numbers of WGD duplications in Chinese white pear and apple were 29 and 59, respectively, but there were only three each in strawberry and peach, four each in black raspberry and sweet cherry, and 23 in European pear. The numbers of DSDs in Chinese white pear, European pear and sweet cherry were 113, 95 and 142, respectively. Additionally, there were 91 in strawberry and 81 in apple, more than in peach (76) and black raspberry (70). Genomic rearrangements and gene loss may account for the large proportion of DSDs in these species. Moreover, RNA- and DNA-based TRD events can also produce this result [34]. WGDs and DSDs shaped the evolution of the BAHD superfamily in Chinese white pear, apple and European pear (Fig. 3). In peach and strawberry, TDs and DSDs were the main forces, whereas PDs and DSDs played major roles in black raspberry and sweet cherry. In Chinese white pear, ~57.1% (113 of 198) of BAHD genes were involved in DSD events, compared with 66.9% (91 of 136) in strawberry, 44.0% (81 of 184) in apple, 60.3% (76 of 126) in peach, 67.3% (70 of 104) in black raspberry, 66.4% (142 of 214) in sweet cherry and 55.2% (95 of 172) in European pear (Additional file 4: Table S3). These results indicate that DSDs were ubiquitous in all the investigated species.
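The DSD proportions quoted above follow directly from the reported counts. The short sketch below recomputes them; the dictionary layout and names are illustrative, and the numerators and denominators are copied from the text.

```python
# (DSD count, total reported in the text) per species.
dsd_counts = {
    "Chinese white pear": (113, 198),
    "Strawberry": (91, 136),
    "Apple": (81, 184),
    "Peach": (76, 126),
    "Black raspberry": (70, 104),
    "Sweet cherry": (142, 214),
    "European pear": (95, 172),
}

# Percentage of DSD-derived genes, rounded to one decimal as in the text.
dsd_pct = {
    species: round(100 * dsd / total, 1)
    for species, (dsd, total) in dsd_counts.items()
}

for species, pct in sorted(dsd_pct.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{species}: {pct}%")
```

Sorting by percentage makes it easy to see that apple has the lowest DSD share and black raspberry the highest among the seven species.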
In addition, we identified intra-genomic synteny blocks for each species [34]. As shown in Fig. 4a, the BAHD genes of Chinese white pear are randomly distributed across the 17 chromosomes, with only one gene on chromosome 13. Similarly, the BAHD genes were randomly distributed in the other species. We found a total of 78 syntenic gene pairs across the seven Rosaceae species. Of these, 17, 21 and 29 syntenic pairs were identified in European pear (Fig. 4g), Chinese white pear (Fig. 4a) and apple (Fig. 4b), respectively, compared with only three each in strawberry (Fig. 4f), peach (Fig. 4d) and black raspberry (Fig. 4e), and two in sweet cherry (Fig. 4c) (Additional file 5: Table S4).
Nonsynonymous (Ka) and synonymous (Ks) substitutions per site, and a Ka/Ks analysis for BAHD family genes
The timing of a WGD event is usually estimated using Ks [37–39]. In addition to the ancient WGD [Ks = 1.5–1.8, ~140 million years ago (Mya)], a γ-paleohexaploidization event shared by core eudicots [40], a more recent WGD was detected in pear and dated to 30–45 Mya (Ks = 0.15–0.30) [25]. As shown in Additional file 6: Table S5, the Ks values of WGD-derived gene pairs in Chinese white pear ranged from 0.006 to 3.909, and the Ks ranges for gene pairs derived from TD, PD, TRD and DSD were 0.001–4.247, 0.07–3.670, 0.029–4.381 and 0.005–5.066, respectively. Similar results were found in apple. In Chinese white pear, nine WGD-derived gene pairs had Ks values between 0.15 and 0.30, suggesting that they may be derived from the recent WGD (30–45 Mya) [25]. Some other duplicated gene pairs had higher Ks values (1.992–3.909), implying that they probably originated from more ancient duplication events. The Ks values of the WGD-derived gene pairs in black raspberry, European pear, peach and sweet cherry were 1.356–2.965, 0.145–4.288, 1.416–4.357 and 1.469–4.210, respectively. The higher Ks values of WGD-derived gene pairs in peach, black raspberry and sweet cherry suggest that these pairs were duplicated and retained from more ancient WGD events, which is consistent with the absence of a more recent WGD in these species.
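The Ks windows cited above can be turned into a simple lookup for sorting gene pairs by likely duplication age. This is a minimal sketch: the window boundaries are the ones quoted in the text, while the function name, the labels and the inclusive treatment of the boundaries are assumptions made for illustration.

```python
# Ks windows quoted in the text (boundaries treated as inclusive here).
RECENT_WGD = (0.15, 0.30)  # recent pear WGD, 30-45 Mya
GAMMA_WGD = (1.5, 1.8)     # gamma paleohexaploidization, ~140 Mya


def classify_ks(ks: float) -> str:
    """Assign a Ks value to one of the two WGD windows, if it falls in either."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if RECENT_WGD[0] <= ks <= RECENT_WGD[1]:
        return "recent WGD (30-45 Mya)"
    if GAMMA_WGD[0] <= ks <= GAMMA_WGD[1]:
        return "gamma paleohexaploidization (~140 Mya)"
    return "outside both windows"


# 3.909 is the upper end of the WGD-derived range reported for Chinese white pear;
# 0.20 and 1.60 are illustrative values, not measurements from the study.
for ks in (0.20, 1.60, 3.909):
    print(ks, "->", classify_ks(ks))
```

A value outside both windows is not necessarily uninformative: very high Ks values are prone to saturation, so they are better read as "older than the recent WGD" than as a precise date.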
Deleterious mutations can be removed by negative (purifying) selection. Conversely, new favorable mutations can be accumulated by positive (Darwinian) selection and spread through the population [6]. To detect the selection pressure acting on BAHD genes, we analyzed the Ka and Ka/Ks values in the seven Rosaceae species (Additional file 6: Table S5). The direction and magnitude of the selection pressure were inferred from the Ka/Ks ratio (Ka/Ks > 1: positive selection; Ka/Ks = 1: neutral evolution; Ka/Ks < 1: purifying selection) [42]. The Ka/Ks values of all the BAHD gene pairs in strawberry (Fig. 5d), peach (Fig. 5c) and European pear (Fig. 5b) were less than one, indicating that these genes evolved under purifying selection. Similar results were found in the other four Rosaceae species [Chinese white pear (Fig. 5f), sweet cherry (Fig. 5a), black raspberry (Fig. 5g) and apple (Fig. 5e)], except for a few gene pairs with Ka/Ks values greater than one. The box plots also indicated that the data distributions were concentrated, especially in Chinese white pear, sweet cherry and apple.
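The Ka/Ks decision rule stated above is easy to express as a function. The sketch below is illustrative only: the function name and the tolerance used to decide when a ratio counts as equal to one are assumptions, and the example values are invented rather than taken from Table S5.

```python
import math


def selection_mode(ka: float, ks: float, rel_tol: float = 1e-9) -> str:
    """Classify a gene pair by its Ka/Ks ratio using the thresholds in the text."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, rel_tol=rel_tol):
        return "neutral evolution"
    return "positive selection" if ratio > 1 else "purifying selection"


# Invented Ka and Ks values, for illustration only.
examples = [(0.10, 0.50), (0.60, 0.30), (0.20, 0.20)]
modes = [selection_mode(ka, ks) for ka, ks in examples]
print(modes)
```

In practice a ratio exactly equal to one is rare, so real analyses usually test whether Ka/Ks differs significantly from one rather than relying on a fixed tolerance.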
Expression pattern of BAHD genes in Chinese white pear
Based on transcriptome data (Additional file 7: Table S6) from different pear tissues, we determined that most genes in Chinese white pear showed higher expression in roots (Fig. 6), and we discovered that 37 of the BAHD genes were expressed in all four stages of fruit development. Pbr014238.1 was only expressed in the four stages of pollen-tube development, while Pbr020016.1, Pbr027303.1, Pbr029551.1, Pbr014028.1 and Pbr006821.1 were highly expressed in the late stage of fruit development (Fruit_S4). Most members of the BAHD superfamily showed no expression during the four stages of pollen-tube development.
Gene expression analyses with qRT-PCR
Based on the transcriptome expression profiles and the ester content analysis, we selected five candidate Chinese white pear genes (Pbr020016.1, Pbr019034.1, Pbr014028.1, Pbr006821.1 and Pbr029551.1) that showed strong correlations with the changes in total ester content during fruit development (Fig. 7f). We used qRT-PCR to examine these candidate genes. The expression patterns of several individual genes were highly correlated with the ester content changes during pear fruit development (Fig. 7). The expression level of Pbr014028.1 (Fig. 7c) decreased from S1 (45 DAF) to S2 (75 DAF), then increased sharply from S3 (105 DAF) to S4 (145 DAF), where it reached its peak. Notably, three indices for Pbr014028.1 (Fig. 7c), namely the relative expression level, the RNA-seq data and the changes in total ester content, exhibited correlated trends across all stages. In addition, Pbr019034.1 (Fig. 7a), Pbr029551.1 (Fig. 7d) and Pbr020016.1 (Fig. 7e) showed similar expression patterns: each increased sharply from S3 (105 DAF) to S4 (145 DAF) and peaked in the last stage. Moreover, the relative expression levels of these genes and the changes in total ester content showed consistent trends across all stages. The expression pattern of Pbr006821.1 (Fig. 7b) was different. It first decreased, then increased to a peak at S3 (105 DAF), and then decreased sharply from S3 (105 DAF) to S4 (145 DAF). However, its overall expression level at S4 (145 DAF) was still greater than that at S1 (45 DAF). Thus, except for Pbr006821.1, the expression levels of these genes, the RNA-seq data and the changes in total ester content exhibited correlated trends across all stages. Therefore, the four genes Pbr020016.1, Pbr019034.1, Pbr029551.1 and Pbr014028.1 appear to be important candidate genes for ester synthesis.
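One way to quantify how closely an expression profile tracks ester content across the four stages is a correlation coefficient. The text does not state which statistic was used, so the sketch below assumes Pearson's r, implemented from its definition. The input series are synthetic placeholders, not measurements from this study; measured stage-wise values would be substituted in their place.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Synthetic placeholder series for stages S1-S4 (NOT data from the study).
expression_placeholder = [1.0, 2.0, 3.0, 4.0]
ester_placeholder = [2.0, 4.0, 6.0, 8.0]
r_same_trend = pearson_r(expression_placeholder, ester_placeholder)
r_opposite_trend = pearson_r(expression_placeholder, ester_placeholder[::-1])
```

With only four time points, a high r is weak evidence on its own, which is why agreement among qRT-PCR, RNA-seq and ester content, as reported above, is the more persuasive argument.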
To further investigate this result, the five BAHD genes from Chinese white pear (Pbr020016.1, Pbr019034.1, Pbr014028.1, Pbr006821.1 and Pbr029551.1) and 11 biochemically characterized AAT genes from other species were used to construct a maximum-likelihood tree. As shown in Fig. 8, these pear BAHD genes shared high homology with the reported AAT genes. Thus, we speculated that these genes might be closely associated with ester synthesis.