Annotation update and genomic characteristics of B3 genes
We performed re-annotation of the B3 genes for seven species, including A. thaliana, O. sativa, and five species of Solanaceae. A total of 1,039 B3 genes were annotated containing 231 (22.2%) newly identified genes (Table 1). Specifically, we found 75.3% of newly identified B3 genes in three pepper genomes (Capsicum species), ranging from 53 to 67 in individual species. We investigated the domain architectures for B3 genes and classified them into five known subfamilies: RAV, HSI, ABI3/VP1, ARF, and REM. Specifically, ABI3/VP1 and REM genes had only B3 domain(s), whereas RAV, HSI, and ARF genes had additional AP2, CW-type zinc finger, and auxin response factor domains, respectively, with B3 domain(s) (Fig. 1A). When we examined the ratio and number of B3 genes from the five subfamilies, REM genes were mostly abundant compared to those of other families, especially in three pepper species (Fig. 1B). Specifically, REM genes accounted for 78.2% of the newly identified genes in pepper genomes (Table S3). These results indicate that updated B3 genes from the re-annotation process provide accurate gene repertories of B3 genes, especially for REM genes in pepper genomes.
Table 1
The number of re-annotated B3 genes in seven species.
Species | Previously annotated genes | Newly annotated genes | Total |
O. sativa | 96 | 5 | 101 |
A. thaliana | 116 | 5 | 121 |
C. annuum | 120 | 54 | 174 |
C. baccatum | 119 | 53 | 172 |
C. chinense | 118 | 67 | 185 |
S. tuberosum | 125 | 15 | 140 |
S. lycopersicum | 114 | 32 | 146 |
Total | 808 | 231 | 1,039 |
To explore the sequence differences of the B3 domain among the five subfamilies, we examined the amino acid sequence of the B3 domain using the updated B3 genes (Fig. 1C). The B3 domain consisted of approximately 110 amino acids, including two α-helices and seven β-sheets. The position of the first deduced α-helix from the B3 domain was located between the second and third β-sheets, otherwise, the second deduced α-helix was located between the fifth and sixth β-sheets. Given the secondary structure, we subdivided sections of the B3 domain and found that most of the sections were not conserved. Because REM genes were particularly variable, we separated B3 genes in the five subfamilies into two types (REM and other families) and observed high conservation among genes in other families (Fig. S1). This indicates that the REM genes have contributed to increased sequence diversity among the B3 domains. We also examined the motif structure of the B3 domain and verified representative motifs abundant in REM or other subfamilies. Motifs #3, #9, #14, and #17 were enriched in other families, whereas motifs #5, #6, #19, and #20 were abundant in REM (Fig. 1D). This implies that REM and other families contain distinct B3 domains and thus may have different functions.
Gene Ontology (GO) analysis was performed to understand the potential functions of B3 genes in seven species (Fig. 1E). Many B3 genes were predicted to be associated with a binding in molecular function and nucleus function in the cellular component. Because transcription factors are generally known to control the transcription of specific genes [24], the B3 genes were also predicted to play a role as a transcription factor. To examine the putative function of B3 genes in subfamilies, we separated the results of GO analysis by families and found different functions in specific subfamilies. Specifically, ARF genes were thought to function in biological processes related to the auxin-activated signaling pathway, suggesting that these genes are involved in the auxin response factor and AUX/IAA proteins (Fig. S2). Taken together, our data generated from the updated B3 genes provide accurate subfamily repertories, genomic structures, and potential function of B3 genes in Solanaceae species with A. thaliana and O. sativa.
Motif compositions of REM and other subfamilies of B3 genes
To compare the motif composition of REM and other families, we analyzed the motif structure and verified 50 conserved motifs in the updated B3 genes. A total of 36 motifs were mapped at 19 positions, except for 14 motifs that were observed at multiple locations as repetitive motif sequences (Fig. 2). Specifically, two regions of B3 domains were presented in the REM genes, at positions 4th to 9th and 15th to 18th, respectively (Fig. 2A). However, we found that B3 genes from other families have a B3 domain located at positions 4th to 10th (Fig. 2B). We also analyzed the non-B3 domain regions and found significant differences between REM and other families mainly due to additional domains in RAV, HSI, and ARF families. These results illustrate distinct amino-acid sequence repertories between REM and other subfamilies. When we examined motif positions in REM or other subfamilies, enriched motifs of REM genes were mainly observed in B3 domain regions, whereas the majority of specifically abundant motifs in other subfamilies were positioned in the flanking region as well as within the B3 domain (Fig. 2). These represent that the REM genes consisted of different sequences compared to the B3 genes in other families, and thus may have undergone an independent evolutionary process. Among the B3 domains in the REM genes, we found that the first and remaining B3 domain regions have similar motif configurations (Fig. 2A). This suggests that REM genes may evolve by gaining repetitive domains.
Copy number expansion of specific REM genes in pepper
To elucidate the evolutionary relationship of B3 genes, we constructed a phylogenetic tree with 231 updated B3 genes in the seven plant genomes. Based on the motif compositions of genes and tree branches, we classified 1,010 genes into 12 subgroups (G1-12) (Fig. 3A). Specifically, REM genes were constructed into a large lineage that was grouped into G5-11. The motifs in the front and back of the B3 domain were examined to identify the characteristics of each subgroup (Fig. 3B). Specifically, we found that motif #11 was observed within the ARF family, but motifs #13 and #24 were unique to G3 and G4, respectively. This indicates that these motifs contributed to the generation of the genomic diversity of ARF genes among those subgroups. Conversely, we observed that dominant motifs among the REM subgroups were conserved overall and shared with motifs #27 and/or #4. However, motif #4 of G7 and G8 had the characteristic of being accompanied by motifs #28 and #41, respectively. The G9 and G10 also showed different characteristics for each subgroup, such as a difference in the number of B3 domains. Based on these conserved structures, this result implies that the duplication of REM genes occurred rapidly, resulting in the conservation of genomic sequences of genes in REM subgroups. We then examined the copy number of B3 genes in each subgroup by species and verified that the overall number of genes in each subgroup of Solanaceae was similar, except for the number of genes in G8 (Fig. 3C). The largest number of genes were congregated in G8, followed by G4. In the G8 subgroup, a large number of REM genes (90.64%) were clustered in three pepper genomes, whereas genes in potato and tomato were rarely observed. This suggests that the specific REM genes in pepper species have been expanded by lineage-specific evolution, resulting in the conservation of a large pool of REM genes in pepper species.
Next, we conducted the chromosome location of the genes for pepper (C. annuum), potato, and tomato (Fig. S3). We found that most of the genes were unevenly distributed across 12 chromosomes, suggesting different repertories of B3 genes among Solanaceae. In particular, the tandem array of genes in G8, located on chromosomes 1 and 3 of the pepper genome, was observed. To compare the genomic regions of the G8 genes on chromosomes 1 and 3 in the pepper genome with the corresponding regions in the Solanaceae genomes, we performed synteny analyses for chromosomes 1 and 3 of pepper and chromosomes 8 and 9 of potato and tomato, respectively (Fig. 3D). Of the 17 and 14 pepper B3 genes of G8 on chromosomes 1 and 3, respectively, we found only 12.5% and none of the orthologous genes in their corresponding regions in potato and tomato. Because we observed only a few orthologous relationships between the pepper B3 genes in G8 and the other two species, we assumed that the copy number expansion of the pepper B3 genes in G8 had recently occurred mainly after speciation. To verify this, synonymous substitution rate (Ks) values were calculated between duplication pairs of G8 in pepper, potato, and tomato, respectively, to estimate the emergence time of genes from G8 in the three species (Fig. 3E). The average MYA of pepper, potato, and tomato were about 14.9 (0.21 Ks), 30.8 (0.43 Ks), and 50.1 (0.7 Ks), respectively. Because 80% of the G8 genes in pepper were smaller than 21.6 MYA (0.3 Ks), we constructed dendrogram based on duplication time for the pepper G8 genes that located on the chromosomes 1 and 3 to identify how tandem duplications were occurred (Fig. S4). Considering that the divergence point between Capsicum and Solanum species was around 0.3 Ks [19], our result suggests that the pepper REM genes in G8 have been rapidly duplicated and specifically expanded after speciation by tandem duplication especially in chromosome 1 and 3. Taken together, our data indicates that pepper-specific copy number expansion of REM genes via recent tandem duplication events probably has played a crucial role in the construction of distinct B3 gene repertoires in pepper compared to other Solanaceae species.
Expression and potential roles of pepper B3 genes under abiotic stress conditions
To investigate the potential role of the B3 genes in pepper (C. annuum) under abiotic stress conditions, we performed RNA-seq analyses and identified differentially expressed genes (DEGs) under cold, heat, mannitol, and salt stresses. Because previous studies reported the speculation of gene functions by detecting genes that had similar expression patterns with DEGs under certain conditions [25, 26], we also compared the expression of B3 genes and DEGs and grouped them into three clusters (C1-3) for each stress condition given similar expression patterns (Fig. 4A). The expressed B3 genes were most abundant in mannitol stress with 85 genes, followed by 84 genes, 79 genes, and 69 genes for heat, salt, and cold stress, respectively. This suggests that these B3 genes, which were similarly expressed with DEGs, may have roles related to each stress. Among the subgroups, we observed many REM genes of G8 in clusters, containing 15 genes under cold, 24 genes under heat, 22 genes under mannitol, and 22 genes under salt (Fig. 4B). Specifically, these genes in G8 were abundant in specific clusters. For example, 60% (cold C3), 42% (heat C3), 59% (mannitol C3), and 59% (salt C2) of the B3 genes were in G8. These suggest that the potential role of pepper-specific expanded REM genes is relevant to various abiotic stresses. When we examined the GO terms of whole genes in two clusters by each abiotic stress condition, including abundant REM genes in G8, we observed various abiotic stress-related functions, such as response to auxin (GO:0009733) in cold and heat, small molecule biosynthetic process (GO:0044283) in heat and mannitol, and signaling receptor activity (GO:0038023) in salt (Fig. 4C). In addition, many genes in specific clusters, such as cold C1, heat C3, mannitol C3, and salt C3, were associated with an RNA modification (GO:0009451) encoding a pentatricopeptide repeat (PPR) domain. Our results suggest the potential role of pepper B3, especially REM genes in G8 under abiotic stress conditions via association with a variety of other genes.
Co-expression network and functional association of pepper REM genes under abiotic stress conditions
We detected B3 DEGs in expression clusters and verified an overall similar distribution across four stresses: 33 (cold), 26 (heat), 26 (mannitol), and 26 (salt). Of these, the REM DEGs, especially in the G8, were mostly abundant regardless of the stresses (Fig. 5A). Furthermore, we identified 25 stress-specific DEGs, and 16 (8) of them belonged to REM (G8) (Fig. 5B). Co-expression networks of target genes suggest their potential roles given the repertories of linked genes in the expression network [27]. We constructed co-expression networks of stress-specific REM DEGs with other DEGs in the same expression clusters to understand the specific roles of pepper REM genes under abiotic stress conditions. Our analyses revealed that pepper REM genes in G8 could be involved in cold and mannitol stress-induced functions of various genes (Fig. 5C). In particular, we detected that CaREM210 and CaREM205 were co-expressed with three PPRs and a variety of genes under cold and mannitol conditions, respectively. Previous studies have reported the functions of PPRs under cold and mannitol conditions. The repressed expression of TCD10 (LOC_Os10g28600) and OsV4 (LOC_Os04g39970) genes in rice caused abnormal chloroplast development at low temperatures [28, 29]. SOAR1 (At5g11310) in Arabidopsis was the positive response to particularly cold and osmotic stress through the regulation of ABA signaling [30]. In addition to the PPR genes, OsRH42 (Os08g0159900), which encodes a DEAD-box helicase in rice, is important for pre-mRNA splicing under cold stress [31]. The Rboh gene (LOC107862088, LOC107875997), encoding several domains such as EF-hand, NADPH oxidase, and FAD-binding in pepper, was known to be activated by the binding of Ca cations to the EF-hand, causing the accumulation of malondialdehyde in response to mannitol stress [32]. Our data suggest that pepper REM genes in G8 could be involved in the regulation of cold and mannitol stress-related traits with PPR and other various genes via examination of co-expressed genes with pepper REM genes in G8 under those stress conditions.
Furthermore, we observed diverse genes that co-expressed with pepper REM genes in G8 and G9 under salt and heat stress (Fig. 5D). In salt stress, we also found genes involved in the regulation of ABA [30, 33], the regulation of ROS production [34], and the early events of the signal transduction pathway [35] such as CCT, PPR, and protein kinase domain, as described in previous studies. For example, overexpression of AtCOL4 (At5g24930), which encodes the CCT domain in Arabidopsis, is known to regulate ABA synthesis and stress-related genes under salt stress [33]. These results suggest that the pepper REM genes in G8 may play a role in the salt stress condition through co-expression with other salt stress-related genes. In heat stress, CaREM365 in G9 was co-expressed with nine genes having various domains such as protein kinase, α-amylase, and lactate/malate dehydrogenase, suggesting the putative role of the pepper REM gene in G9 with these genes. Taken together, our data comprehensively suggest the underlying roles of pepper REM genes, especially those belonging to the pepper-specific expanded G8 via association with a variety of genes involved in abiotic stress responses based on the investigation of DEGs and co-expression networks.