Identification and syntenic analysis of the Gossypium H3 gene family
A total of 34 G. hirsutum genes belonging to the H3 gene family were identified (Table 1). They were distributed on 16 chromosomes across the A and D subgenomes (Fig. 1). Several chromosomes carried clusters of H3 genes; for example, three genes were clustered together on each of At_chr10 and Dt_chr10 and showed high similarity to one another. The H3 genes on chromosomes 5, 6, 8, 10 and 13 of the A subgenome showed high collinearity with those in the D subgenome (Fig. 1).
In addition, 18 H3 genes were distributed on 8 chromosomes and 2 scaffolds (scaffold3134 and scaffold 3624) of G. arboreum, while 16 H3 genes of G. raimondii were distributed on 7 chromosomes and 4 scaffolds (scaffold 371, scaffold266, scaffold266 and scaffold23) (Supplementary Table S1). Syntenic analysis of the H3 genes across the cotton species revealed that three genes of G. arboreum (Cotton_A_23780, Cotton_A_28035 and Cotton_A_34310) and one gene of G. raimondii (Cotton_D_gene_10004657) were lost during evolution, while five genes of G. hirsutum (Gh_A05G1915, Gh_A07G1271, Gh_D07G0357, Gh_D07G1382 and Gh_D10G0981) appeared during evolution (Fig. 2).
Evolution and structure analysis of histone H3 genes in G. hirsutum
The 34 identified H3 genes of G. hirsutum were divided into four subclasses, CENH3, H3.1, H3.3 and H3-like, by comparison with the H3 genes of Arabidopsis (Fig. 4). Of these 34 genes, 22 and 9 belonged to subclasses H3.1 and H3.3, respectively, while subclasses CENH3 and H3-like contained only two genes and one gene, respectively (Fig. 3). Structural analysis of the G. hirsutum H3 genes showed that the H3.1 subclass contained no introns, except for Gh_D10G0981, which had one intron, whereas the genes of subclasses H3.3, CENH3 and H3-like contained multiple introns. These results are similar to those of previous studies (Okada T1 and Endo M 2005; Cui J and Zhang Z 2015).
Conserved sequence analysis in the H3 gene family
The conserved sequences of the H3 protein family in G. hirsutum were analyzed (Fig. 5). Of the 34 G. hirsutum H3 proteins, 31 (22 H3.1 proteins and 9 H3.3 proteins) had highly conserved sequences. Only four conserved amino acid substitutions, at sites 31, 41, 87 and 90, were found between the H3.1 and H3.3 subclasses: the residues are A31F41S87A90 for H3.1 and T31Y41H87L90 for H3.3. These four-residue signatures could be used to discriminate H3.1 from H3.3. In contrast, the two CENH3 proteins (Gh_A07G1271 and Gh_D07G1382) and the H3-like protein (Gh_D07G0357) had highly divergent sequences containing InDels. The CENH3 and H3-like subclasses carried the R31(P/R)41S87(H/L)90 and N31Q41P87Y90 signatures, respectively.
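The signature-based discrimination described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: it assumes that sites 31, 41, 87 and 90 are 1-based positions in an aligned protein sequence, and the function names (`residues_at_sites`, `classify_h3`, `toy_sequence`) and the toy sequences are hypothetical.

```python
# Diagnostic residues at sites 31, 41, 87 and 90, as reported in the text.
# Each entry holds the residues allowed at that site; "PR" means P or R.
SIGNATURES = {
    "H3.1": ("A", "F", "S", "A"),
    "H3.3": ("T", "Y", "H", "L"),
    "CENH3": ("R", "PR", "S", "HL"),
    "H3-like": ("N", "Q", "P", "Y"),
}
SITES = (31, 41, 87, 90)  # assumed to be 1-based alignment positions


def residues_at_sites(aligned_seq, sites=SITES):
    """Return the residues found at the given 1-based positions."""
    return tuple(aligned_seq[pos - 1] for pos in sites)


def classify_h3(aligned_seq):
    """Return the first subclass whose signature matches, or None."""
    observed = residues_at_sites(aligned_seq)
    for subclass, allowed in SIGNATURES.items():
        if all(res in options for res, options in zip(observed, allowed)):
            return subclass
    return None


def toy_sequence(signature, length=100):
    """Build a made-up sequence carrying `signature` at the four sites."""
    seq = ["G"] * length
    for pos, res in zip(SITES, signature):
        seq[pos - 1] = res
    return "".join(seq)


if __name__ == "__main__":
    for sig in ("AFSA", "TYHL", "RRSL", "NQPY"):
        print(sig, "->", classify_h3(toy_sequence(sig)))
```

With real data, the positions would first have to be mapped from the alignment in Fig. 5 to each protein, since InDels in CENH3 and H3-like shift the raw coordinates.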
Previous studies showed that histone H3 lysine 4 trimethylation (H3K4me3) and H3 lysine 36 di- and trimethylation (H3K36me2/me3) are associated with active gene expression, while H3 lysine 9 methylation and H3 lysine 27 trimethylation (H3K27me3) are involved in gene repression (Cui J, Zhang Z and Shao Y 2015). In our study, K4, K9, K27 and K36 were highly conserved in H3.1 and H3.3, whereas the CENH3 members carried K4, K9, S27 and K/N36 and the H3-like protein carried H4, L9, A27 and E36, indicating that, among these lysine modification sites, only K4 and K9 were also conserved in CENH3.
The Ka/Ks ratio can be used to assess whether Darwinian positive selection was associated with H3 gene divergence after duplication (Prince VE and Pickett FB 2002; Vandepoele K 2003). In our study, 16 gene pairs had low Ka/Ks ratios (smaller than 0.5) and one gene pair had a Ka/Ks ratio between 0.5 and 1.0. Only one gene pair (Gh_A07G1271-Gh_D07G1382) had a Ka/Ks ratio larger than 1, which might be due to relatively rapid evolution following duplication (Table 2). As most of the Ka/Ks ratios were smaller than 1.0, we presume that the cotton H3 gene family has undergone strong purifying selection, with limited functional divergence after segmental duplication and whole-genome duplication (WGD).
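The binning of gene pairs by Ka/Ks ratio can be illustrated with a short Python sketch. The thresholds (0.5 and 1.0) come from the text; the function names, the treatment of boundary values, and the three example pairs are assumptions made for illustration and are not values from Table 2.

```python
from collections import Counter


def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    return ka / ks


def ka_ks_bin(ratio):
    """Assign a ratio to one of the three ranges used in the text.

    Boundary handling (0.5 and 1.0 fall in the middle bin) is an assumption.
    """
    if ratio < 0.5:
        return "<0.5"      # commonly read as strong purifying selection
    if ratio <= 1.0:
        return "0.5-1.0"   # weaker constraint
    return ">1.0"          # commonly read as positive selection


def summarize(pairs):
    """Count gene pairs per Ka/Ks bin; `pairs` maps a name to (Ka, Ks)."""
    return Counter(ka_ks_bin(ka_ks_ratio(ka, ks)) for ka, ks in pairs.values())


if __name__ == "__main__":
    # Made-up Ka and Ks values, for demonstration only.
    example_pairs = {
        "pair_1": (0.01, 0.10),
        "pair_2": (0.06, 0.10),
        "pair_3": (0.15, 0.10),
    }
    print(summarize(example_pairs))
```

Applied to the 18 gene pairs reported here, such a tally would give 16, 1 and 1 pairs in the three bins.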
Expression profile of H3 genes across different tissues and developmental stages
To understand the temporal and spatial expression of the H3 genes, publicly deposited RNA-seq data were used to assess their expression profiles across different tissues and developmental stages. Among the 34 H3 genes, four groups of genes were identified with FPKM > 0.5 in at least one of the selected tissues and developmental stages (Fig. 6). The H3 genes were widely expressed in vegetative (root, stem and leaf) and reproductive (petal, stamen and pistil) tissues as well as in fiber (5, 10, 20 and 25 DPA), suggesting that they have multiple biological functions in different tissues. Interestingly, genes belonging to the H3.1 subgroup were up-regulated in pistil and in all vegetative tissues (root, stem and leaf) and down-regulated in the other tissues, suggesting a critical role in pistil development. Some genes were up-regulated in one tissue and down-regulated in the others. For example, the high expression of Gh_A02G0886 in stamen suggests that it may play a critical role in stamen development. Similarly, Gh_D07G1382 was up-regulated in pistil and stamen and down-regulated in other tissues. In contrast, some genes showed stage-specific expression during fiber development; for example, Gh_A11G1633 and Gh_D02G0973 were up-regulated in the early stages (5 and 10 DPA) and down-regulated at 20 and 25 DPA.
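The expression filter mentioned above (FPKM > 0.5 in at least one tissue or stage) can be sketched as follows. The threshold is taken from the text; the function name, the data layout (a gene-to-tissue dictionary) and all gene names and FPKM values in the example are hypothetical.

```python
FPKM_THRESHOLD = 0.5  # cutoff stated in the text


def expressed_genes(fpkm_table, threshold=FPKM_THRESHOLD):
    """Return genes whose FPKM exceeds `threshold` in at least one sample.

    `fpkm_table` maps a gene name to a {sample: FPKM} dictionary.
    """
    return sorted(
        gene
        for gene, by_sample in fpkm_table.items()
        if any(value > threshold for value in by_sample.values())
    )


if __name__ == "__main__":
    # Made-up values, for demonstration only.
    example = {
        "gene_a": {"root": 0.1, "stamen": 12.4, "fiber_5DPA": 0.3},
        "gene_b": {"root": 0.2, "stamen": 0.4, "fiber_5DPA": 0.5},
        "gene_c": {"root": 3.0, "stamen": 0.0, "fiber_5DPA": 8.1},
    }
    print(expressed_genes(example))
```

Because the comparison is strict, a gene whose highest value is exactly 0.5 (like `gene_b` here) is excluded.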
Silencing of the GhCENH3 gene affects leaf size and the numbers of stomata and stomatal chloroplasts
Functional analysis of GhCENH3 was performed in TM-1 using VIGS to examine its role in leaf size and in the numbers of stomata and stomatal chloroplasts in cotton. After Agrobacterium-mediated infection, the phenotypes of both silenced and non-silenced plants were observed regularly to check the efficiency of gene silencing. The mutant phenotypes of the VIGS-treated plants started to emerge one week after infection. Plants injected with pYL156-PDS lost their normal green coloration and showed an albino phenotype in the leaves (Fig. 7), confirming the efficiency of gene silencing. Plants infiltrated with pYL156-CENH3 had significantly smaller leaves (Fig. 7), whereas infiltration with pYL156 had no effect on the leaves (Fig. 7). To check the silencing efficiency, qRT-PCR was performed, which showed significantly lower expression of the candidate gene in plants infected with pYL156-CENH3 (Fig. 8). In addition, the number of stomata was 57 ± 1.414 (coefficient of variation 0.055) for pYL156-CENH3, 21 ± 0.632 (coefficient of variation 0.067) for pYL156 and 15 ± 0.748 (coefficient of variation 0.109) for pYL156-PDS (Fig. 9). The number of stomatal chloroplasts was 46.8 ± 5.805 (coefficient of variation 0.1240) for pYL156-CENH3, 21.2 ± 2.864 (coefficient of variation 0.1351) for pYL156 and 0 in leaves with the photo-bleaching phenotype (Fig. 10). Moreover, the stomatal chloroplasts varied in size in plants injected with pYL156-CENH3; further research is needed to identify the cause of this size variation.
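As a worked check of the summary statistics, the coefficient of variation can be computed as the standard deviation divided by the mean. The sketch below assumes that the ± term in the stomatal chloroplast counts is a standard deviation; the text does not state this, and the function name is hypothetical.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation, defined as standard deviation / mean."""
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return sd / mean


# (mean, spread) for stomatal chloroplast counts, as reported in the text.
# Treating the spread as a standard deviation is an assumption.
chloroplast_counts = {
    "pYL156-CENH3": (46.8, 5.805),
    "pYL156": (21.2, 2.864),
}

if __name__ == "__main__":
    for construct, (mean, sd) in chloroplast_counts.items():
        cv = coefficient_of_variation(mean, sd)
        print(f"{construct}: CV = {cv:.4f}")
```

Under this assumption the calculation reproduces the reported coefficients of 0.1240 and 0.1351 for the chloroplast counts. The same calculation does not reproduce the coefficients reported for the stomatal counts, so the ± term there may be a different statistic (for example a standard error); the text does not specify which.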