Identification and classification of Hsf genes
Total 25 Hsfs genes were identified in Nipponbare genome, including 13 class A Hsfs, 8 class B Hsfs and 4 class C Hsfs. In O. barthii, O. glumaepatula, O. meridionalis, O. nivara, O. punctate and O. rufipogon genomes, 22, 23, 24, 24, 25 and 25 Hsfs genes were identified respectively. They all had 13 class A Hsfs. However, the different number of class B and C Hsfs were found in the 6 wild rice genomes. There were 6 class B Hsfs in Oryza barthii, 7 in Oryza glumaepatula and Oryza meridionalis, while 8 in Oryza nivara, Oryza punctate and Oryza rufipogon. Oryza barthii, Oryza glumaepatula and Oryza nivara genomes had 3 class C Hsfs, while Oryza meridionalis, Oryza punctate and Oryza rufipogon genomes had 4 class C Hsfs.
As shown in Supplementary Tables 1 and 2, HsfB2b, HsfB4d and HsfC2b genes were present in the Nipponbare genome but absent in O. barthii genome. HsfB2c and HsfC1b were not present in O. glumaepatula genome. O. meridionalis genome did not have HsfB4a gene, while O. nivara did not have HsfC1b gene. And two Hsf genes were found to be translocated among the 6 tested rice varieties. The HsfA8a gene was located on chromosome 3 in Nipponbare genomes, but the gene was located on chromosome 6 in O. nivara genome. HsfB1a gene was located on chromosome 9 in Nipponbare genome, but it was located on chromosome 3 in O. nivara genome.
Evolutionary analysis of Hsf genes
The full-length amino acid sequences of Hsf genes in the 6 wild and 1 cultivated rice genomes were aligned to evaluate the evolutionary relationship of Hsf proteins, and the orthologous pair of Hsf genes among the test 7 rice varieties were also identified by BLASTp. ObHsfA7b, ObHsfB2c and ObHsfC2a were removed because their alignment length was less than 200 bp. The numbers of orthologous pairs between Nipponbare and the six wild rice, O.barthii, O.glumaepatula, O. meridionalis, O. nivara, O. punctata and O. rufipogon, were 19, 23, 24, 24, 25 and 25, respectively.
As shown in Fig. 1, all the Hsf proteins were divided into 3 different clades corresponding to the main Hsf classes A, B, and C. Within the Hsf gene A clade, eight distinct subclades (A1, A2, A3, A4, A5, A6, A7 and A8) were resolved and contained all of the Hsf A genes. The class A Hsf genes were divided into 13 orthologous gene clusters (OGCs). The C-type Hsf genes from the tested 7 rice varieties also constituted one distinct clade, had 4 OGCs and appeared to be closely related to the Hsf A group. Correspondingly, the B type Hsf genes were grouped into a separate clade subdivided into three groups (B1, B2 and B4). Notably, the B2 sub-clade was obviously distinct from the other two subclades and formed earlier than other subgroups, hinting that the B2 subgroup was the ancestor of other subgroups. Particularly, OrHsfB2b and ObHsfB2c genes were separated, suggesting that these two genes might be the ancestors of the rice Hsf genes, and other Hsf genes might have evolved through random replication and mutation of these two genes.
Selective constraints on OGCs of Hsf genes
To test the selective constraints on the evolution of each Hsf-family OGC, we detected the variations in dN/dS ratio among codon positions within each cluster, which was shown in Table 1. Overall, the average dN/dS ratio for the 25 clusters under model M0 was 0.217 (with a standard deviation of 0.112), which was statistically lower than 1 but greater than 0 (one-sample t-test, p < 0.01), illustrating that purifying selection was the predominant constraint on the evolution of Hsf gene family in rice. However, the log-likelihood differences between M3 and M0 were statistically significant for 15 OGCs, suggesting that the selective-constraint levels differed across these 15 Hsf OGCs. To evaluate whether positive selection facilitated the evolution of each Hsf OGC in rice, we compared M8 and M7 models to determine whether they had undergone positive selection. According to Yang et al. (2014), the following 3 criteria was used, including an estimate of ω > 1 under M8, any sites found to be under positive selection and a statistically significant likelihood ratio test (LRT) [17]. Finally, we found 8 Hsf OGCs had undergone positive selection during the evolution of rice because they satisfied the above 3 criteria (Table 1).
Table 1
Parameter of positive selection under site-specific models for each Hsf OGC in rice.
Group | No. | Hsf | dN/dS (ω) under M0 | 2Δl, M3 vs. M0 | 2Δl, M8 vs. M7 | Parameter estimates under M8 | Positively selected sites |
OGC1 | 6 | HsfB2b | 0.14049 | 14.604** | 0.02 | p1 = 0.13271,w = 1.00000 p = 3.87480 ,q = 98.99983 | NAN |
OGC2 | 5 | HsfB2c | 0.18177 | 144.38** | 40.388** | p = 0.01317,q = 0.09377 p1 = 0.04901,w = 10.27485 | 11 |
OGC3 | 7 | HsfB2a | 0.038 | 0 | 0 | p = 3.96425,q = 99.00000 p1 = 0.00001,w = 1.00000 | NAN |
OGC4 | 7 | HsfB1a | 0.10086 | 0.279 | 0 | p = 0.89716 ,q = 7.69407 p1 = 0.00001 ,w = 1.00000 | NAN |
OGC5 | 7 | HsfB4c | 0.1591 | 32.029** | 0.067 | p = 1.37462 ,q = 99.00000 p1 = 0.16360 ,w = 1.11258 | NAN |
OGC6 | 6 | HsfB4a | 0.22777 | 2.057 | 0 | p = 0.27556 ,q = 0.85859 p1 = 0.00001 ,w = 1.00000 | NAN |
OGC7 | 6 | HsfB4d | 0.30055 | 140.628** | 22.899** | p = 0.01708 ,q = 0.05438 p1 = 0.05865 ,w = 6.91555 | 10 |
OGC8 | 7 | HsfB4b | 0.24141 | 48.148** | 1.617 | p = 5.97909 ,q = 99.00000 p1 = 0.17175 ,w = 1.70970 | 1 |
OGC9 | 7 | HsfA4d | 0.26651 | 3.821 | 0.375 | p = 0.00500 ,q = 0.75866 p1 = 0.20236 ,w = 1.47764 | NAN |
OGC10 | 7 | HsfA4a | 0.22383 | 0 | 0 | p = 28.64176,q = 99.00000 p1 = 0.00001w = 33.18236 | NAN |
OGC11 | 7 | HsfA5a | 0.17942 | 3.047 | 0 | p = 0.05818 ,q = 0.24971 p1 = 0.00674 ,w = 1.00000 | NAN |
OGC12 | 7 | HsfA7a | 0.47716 | 53.843** | 22.701** | p = 31.49855 ,q = 99.00000 p1 = 0.05797 ,w = 7.63926 | 5 |
OGC13 | 6 | HsfC2a | 0.24124 | 27.25** | 0.376 | p = 5.81163 ,q = 99.00000 p1 = 0.18562 ,w = 1.29356 | NAN |
OGC14 | 6 | HsfC2b | 0.17291 | 51.512** | 5.467 | p = 0.01413 ,q = 0.35141 p1 = 0.10764 ,w = 2.40297 | 1 |
OGC15 | 7 | HsfC1a | 0.08375 | 17.029** | 0.794 | p = 0.00500 ,q = 3.01839 p1 = 0.06962 ,w = 1.47290 | NAN |
OGC16 | 5 | HsfC1b | 0.12084 | 18.875** | 2.87 | p = 0.26773 ,q = 2.63611 p1 = 0.01325,w = 11.10354 | 1 |
OGC17 | 6 | HsfA7b | 0.17444 | 6.771 | 0.47 | p = 13.23116 ,q = 99.00000 p1 = 0.04866 ,w = 1.79039 | NAN |
OGC18 | 7 | HsfA2e | 0.06398 | 0 | 0 | p = 6.85736 ,q = 99.00000 p1 = 0.00001 ,w = 1.00000 | NAN |
OGC19 | 7 | HsfA8a | 0.44664 | 104.571** | 54.387** | p = 14.66857 ,q = 99.00000 p1 = 0.06573,w = 16.87959 | 14 |
OGC20 | 7 | HsfA1a | 0.32209 | 46.417** | 18.311** | p = 7.36998 ,q = 99.00000 p1 = 0.05482 ,w = 7.16638 | 7 |
OGC21 | 7 | HsfA3a | 0.42448 | 40.071** | 16.113** | p = 29.51001 ,q = 99.00000 p1 = 0.04751 ,w = 8.53702 | 4 |
OGC22 | 7 | HsfA6b | 0.26424 | 1.109 | 0 | p = 0.29028 ,q = 0.77927 p1 = 0.00001 ,w = 1.00000 | NAN |
OGC23 | 7 | HsfA6a | 0.20438 | 20.264** | 7.07* | p = 8.27014 ,q = 98.95912 p1 = 0.03015 ,w = 6.96406 | 3 |
OGC24 | 7 | HsfA2b | 0.18087 | 29.57** | 6.052* | p = 1.78956 ,q = 99.00000 p1 = 0.08047 ,w = 2.96326 | 2 |
OGC25 | 7 | HsfA2a | 0.18992 | 8.449 | 2.631 | p = 16.44481 ,q = 99.00000 p1 = 0.01004 ,w = 8.66996 | NAN |
* NAN means there is no positively selected site |
Synteny analysis of Hsf genes
The 3 main types of gene duplication result from chromosomal rearrangements, including retrotranspositions, tandem duplications or segmental duplications, and whole genome duplication (WGD), might drive the evolution of protein-coding gene families[18]. In this research work, MCScanX package was used to detect the origins of duplicate genes of Hsf gene family in the tested 7 rice variety genomes. As shown in Table 2, each member of Hsf gene family was grouped into one of five different categories: singleton, WGD, tandem, proximal or dispersed, while percentage of total Hsf family member in each category was calculated.
Table 2
Numbers of Hsf genes from different origins in the test 7 rice variety genomes*.
Species | No. of Hsf genes | No. and percentage of Hsf genes from different origins |
Singleton | WGD | Tandem | Proximal | Dispersed |
Nipponbare | 25 | 9(36.0) | 5(20.0) | 0 | 0 | 11(44.0) |
O.meridionalis | 24 | 3(12.5) | 5(20.8) | 1(4.2) | 1(4.2) | 14(58.3) |
O.glumaepatula | 23 | 0 | 11(47.8) | 0 | 0 | 12(52.2) |
O.barthi | 22 | 0 | 10(45.5) | 0 | 0 | 12(54.5) |
O.nivara | 24 | 0 | 11(45.8) | 0 | 0 | 13(54.2) |
O.punctata | 25 | 0 | 16(64.0) | 0 | 0 | 9(36.0) |
O.rufipogon | 25 | 0 | 14(56.0) | 0 | 0 | 11(44.0) |
* Data in ( ) indicated the percentage of Hsf genes. |
All the Hsf genes in O.barthi, O.glumaepatula, O.nivara, O.punctata and O.rufipogon genomes were duplicated and retained from dispersed duplication and WGD (Table 2). This result indicated that dispersed duplication and WGD played an important role in the expansion of Hsfs family genes for wild rice. Compared to the 6 wild rice varieties, the Hsf genes in Nipponbare had considerably the lowest proportions of WGD duplication (20.0%) and the highest proportion of singleton (36.0%), showing that Nipponbare experienced singleton since it divergenced from the wild rice. Only Hsf genes in O. meridionalis had tandem and proximal duplication event.
Meanwhile, we identified the conservation blocks both in intra- and inter-genomic pairwise to analyze the collinearity of Hsf genes as introduced by Qiao et al. [12]. According to the definition of fragment replication, if there are two or more common linear regions in one species, these regions are considered as fragment replication regions. Totally, 3, 4, 7, 0, 6, 12 and 10 collinear regions were found in Hsf genes in Nipponbare, O. barthii, O. glumaepatula, O. meridionalis, O. nivara, O. punctate and O. rufipogon, respectively (Fig. 2 ).
Expression profile of Hsf genes
Studying the gene-expression patterns of all members of a gene family would provide insight into their functional divergence[19]. To analyze the expression profile of Hsf genes in cultivated rice and their immediate ancestral progenitors in response to same stress condition, we searched and downloaded two transcription data from the previous studies, one was Nipponbare rice seedling under 1 h salinity stress while the other was Oryza rufipogon rice seedlings under 12 days salinity stress. In early response of Nipponbare rice seedling to salinity stress, three up-regulated shoot Hsf genes, 7 up-regulated and 7 down-regulated root Hsf genes were found. Whilst, in the late response of Oryza rufipogon rice seedling to salinity stress, six up-regulated and 5 down-regulated shoot Hsf genes, 4 up-regulated and 1 down-regulated root Hsf genes were detected (Table 3). This result indicated that more Hsf genes were regulated in the shoot of rice seedling in response to long-term than short-term salinity stress, the reverse was true in the root. However, the same tissue of rice seeding under short and long-term salinity stress did not have the common regulated Hsf genes, expect 4 root Hsf genes (Table 3). This result implied that different Hsf family members were function to response to different stage of salinity stress, or that cultivated rice and its immediate ancestral progenitor employed different Hsf genes to response to salinity stress. Four root Hsf genes, including HsfA3a, HsfA4d, HsfC2a and HsfC2b, were simultaneously up-regulated in Nipponbare under 1 h and Oryza rufipogon under 12d salinity stress, showing that they might function critically in the early and late response to salinity stress.
Table 3
Differentially expressing Hsf genes in the shoots and roots of O. rufipogon and Nipponbare rice seedling under salinity stress.
| Up-regulated | Down-regulated |
Shoot | Root | Shoot | Root |
Oryza rufipogon (12d salinity stress) | HsfA7a, HsfB2a HsfB2b, HsfB2c HsfC1b, HsfC2b | HsfA3a, HsfA4d HsfC2a, HsfC2b | HsfA2b, HsfB4b HsfB4c, HsfB4d HsfC1a | HsfB4a |
Nipponbare (1 h salinity stress) | HsfA3a, HsfC1a HsfC2a | HsfA3a, HsfA4d HsfC2a, HsfC2b HsfA7a, HsfC1a HsfC1b | | HsfA6a, HsfA2b HsfA6b, HsfA2e HsfB2a, HsfB2b HsfB2c |