3.1 Characterization of putative OMT genes
The genome wide analyses revealed that the OMT family genes have varying sizes and diverse annotations in Gossypium species (S3 Table. Sheet A). There various features and characterizations of OMTs in Gossypium might result in differentiation of functions of this gene family. A total of 192 OMT members were identified in three Gossypium species, including 82 in G. hirsutum, 55 in G. arboreum, and 55 in G. raimondii (S3 Table. Sheet A). For phylogenetic analysis (Ma et al. 2016a), 33 OMT members in A. thaliana, and 26 members in T. cacao species were also identified (https://phytozome.jgi.doe.gov/pz/portal.html) (S3 Table. Sheet B). Retrieving information of OMT genes in G. hirsutum revealed that GhOMT81, which was detected in scaffold, coded the smallest peptide of 62 amino acids (aa) with a molecular weight of 6.642 kDa. While GhOMT40, which was identified on chromosome Dt02, coded the largest peptide of 969 aa with a molecular weight of 108.296 kDa among all OMT members in three Gossypium species.
As domain wise characterizing OMT family genes in Gossypium species, the results revealed that almost 3/4 (G. hirsutum) to more than 4/5 (G. arboretum and G. raimondii) of the OMT family genes harbored methyltransf_2 domain, while only from 1/4 to less than 1/5 harbored methyltransf_3 domain. As comparison, a little less than 2/3 of the OMT family genes in A. thaliana harbored methyltransf_2 domain and in T. Cacao, this number was almost 2/2.
3.2 Chromosomal distribution, collinearity, duplication, and loss of OMT genes
We confirmed chromosomal locations of the OMT family genes in Gossypium as previously described (Zhang et al. 2019). A total of 161 OMT genes were positioned on their respective chromosomes, while seven of G. raimondii, one of G. arboreum, and 23 of G. hirsutum were positioned in scaffolds (S4 Figure). In G. raimondii (D genome), chr11 was mapped with 13 genes followed by chr08 with nine genes. The minimum number of genes in a chromosome was one in chr2, chr6, and chr10 respectively. There was no OMT family members identified in chr01 and chr07 (S4 Figure.a). In G. arboreum (A-genome) (S4 Figure.b), 54 OMT genes were mapped in all chromosomes except chr1. Chr10 harbored 13 OMT genes which were the highest per chromosome, followed by chr12 and chr04 with 10 and 9 genes respectively. The minimum number of genes located in a chromosome was one in chr02 and chr11 respectively. In G. hirsutum (AtDt genome) (S4 Figure.c), unexpectedly, there were no OMT genes in At02, At05, At07, Dt03, Dt09, and Dt11 chromosomes. The distribution of genes in Dt sub-genome (33 genes) was higher than in At sub-genome (26 genes). The maximum number of genes in a chromosome was seven in Dt04 and At12, followed by four in Dt10 and At10 chromosomes, respectively. Dt01, Dt05, At01, At06, and At11 only had one OMT gene, and Dt06, Dt07, At03, At08, At09, At13 two OMT genes and Dt02, Dt08, and Dt13 three OMT genes respectively (S4 Figure.c). A collinearity analysis of the OMT family genes in Gossypium species chromosomes was shown in Fig. 1. The results demonstrated a pair wise collinearity of OMT genes between the chromosomes on which OMT family genes were mapped. Noticeably, a number of available genes in At and Dt scaffolds were collinear with their homologues in A and D genomes suggesting the collinearity of the DNA fragments between the scaffolds and chromosome where these OMT genes locate (Fig. 1). Taken the OMT gene numbers identified in each A/D genome or At/Dt sub-genome, collinearity analysis also revealed that there were totally 21 and 19 OMT genes exclusively detected in A and D genomes respectively. Their homologous counterparts in AtDt genomes of G. hirsutum are lost. There are also a few OMT genes that are exclusively detected in AtDt genome of G. hirsutum without homologous counterparts in A and D genomes (S4 Figure).
According to studies there are five types of duplications including singleton, dispersed, proximal, tandem, and segmental or whole-genome duplication (Qiao et al. 2019). A total of 73 members of the OMT genes of three Gossypium species were identified to have dispersed duplications, fifty-one genes (including 15 in A genome, 12 in D genome, and 24 in AtDt genome) to have segmental duplications, while thirty-four to have singleton duplications (S5 Table Sheet A).
3.3 Analysis of selection pressure
The ratio of the number of non-synonymous substitutions per non-synonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks) represents selection pressure of the gene (Ma et al. 2016b). Ka/Ks < 1 demonstrates high purifying selection pressure, while Ka/Ks > 1 shows positive selection pressure. Analysis of Ka/Ks ratio of homologous OMTs in three Gossypium species revealed that they are under purifying selection pressure. The Ka/Ks ratio of homologous OMTs in G. raimondii and G. arboreum ranged from 0.09 to 0.8, in G. raimondii and G. hirsutum ranged 0 to 0.7, and in At and Dt of G. hirsutum ranged 0.4 to 0.7 (S5 Table Sheet B).
3.4 Phylogenetic Analyses, Sequences alignment, conserved motifs and gene structure
The phylogenetic analysis included all of the 251 OMT genes identified in this study, including 192 genes from three Gossypium species (Fig. 2.a), 33 from A. thaliana, and 26 from T. cacao species (Fig. 2.b). The evolutionary relationship of OMT genes in three Gossypium species was closer and more similar with each other (Fig. 2.a) than with A. thaliana and T. cacao (Fig. 2.b). According to the topology of constructed tree, the OMT gene family is divided into five clades (I, II, III, IV, and V) in Gossypium, A. thaliana, and T. cacao species. Previous study has also identified five clades of OMT genes in Catalpa bungei (Lu et al. 2019). The results showed that each clade of OMT genes were symmetrically distributed within Gossypium species (Fig. 2.a), while in A. thaliana and T. cacao, OMT genes were identified in cluster forms (Fig. 2.b). The results demonstrated that these Gossypium OMT members might be evolutionary close within respective species and their identified clades.
To examine the conserved motifs of each clade, the analysis of representative motif logo and conserved motifs prediction were conducted (S6 Figure). The results revealed that motif 1, enriched with leucine, valine, and glycine, motif 2, enriched with leucine and valine, and motif3, motif 4, motif 5, and motif 6 were common in clades I, II, III, IV and clade V. While motif 7 was found missing in some members of clade V, which was then replaced with motif 8 at same positions (S6 Figure). The enriched amino acid residues of conserved motif1 (L/VDVGGG/TG) was previously identified in S-adenosyl-l-methionine (SAM)-dependant OMTs that shared 95% similarity with G. hirsutum OMT which further evidenced the biological function of identified genes in response to diverse abiotic stresses (Kim et al. 2013).
Methyltransferases serve different functions although they may have high sequence similarity. Investigation of gene structure has uncovered the different number of exons and introns of OMT genes. Exon and intron number of OMT genes varied from the least one exon and no intron to the most 7 to 9 exons and 6 to 8 introns (S7 Table). Two members in G. hirsutum including GhOMT67 and GhOMT30 contain nine exons as the highest (S7 Table). Same as, two members including GaOMT47 in G. arboreum, and GrOMT21 in G. raimondii contain 9 exons (S7 Table). Gene structure analysis revealed that the OMT genes with higher number of exons had a shorter exons and introns, and vise versa. These results demonstrated that OMT members possess different structural patterns in accordance with their features.
3.5 Identification of cis-regulatory elements in OMT family
The promoter regions of the OMT family contain precisely a large number of cis-regulatory elements. The analysis of cis-regulatory elements revealed the enrichment of MYB cis-regulatory elements, which was detected more than 350 times in OMT genes (S8 Figure). The MYC was another important element that was found 183 times in enlisted OMT genes. Box 4 (part of a conserved DNA module involved in light responsiveness) was found 152 times in 43/82 genes in G. hirsutum. ABRE elements was detected 119 times in 29/82 OMT genes in G. hirsutum. The ERE element was detected 113 times in 37/82and G-Box 97 times in 37/82 G. hirsutum OMT genes. An auxin RR-core and cis-acting regulatory element involved in the MeJA-responsiveness (TGACG-motif) were also observed in Gossypium OMT genes where this element was identified 48 times in 25/82 genes. Some other important cis-regulatory elements including wun-motif 44 times in 26/82, W-box 39 times in 31/82, GATA-motif 32 times in 27/82, O2-site 30 times in 22/82 OMT genes respectively, in G. hirsutum (S8 Figure). These cis-regulatory elements might function collectively in accordance with their specific roles and with specific conditions as well as growth and development stages (S8 Figure).
3.6 Sub-cellular localization prediction of OMT genes
Understanding and determining the sub-cellular localization of proteins is an important strategy to identify the function of protein at cellular level (Binder et al. 2014). This approach includes proteomic-based experiments and microscopic high throughputs (Andersen et al. 2002; Herold et al. 2009). Several sequence-based approaches have been developed to predict the sub-cellular localization by providing amino acid sequences including PSORT (Horton and Nakai 1997), Yloc (Briesemeister et al. 2010), BaCelLO (Pierleoni et al. 2006), LOCtree (Goldberg et al. 2012). According to Cello prediction, most of OMT genes were located in the cytoplasm (Table 1), while seven genes were predicted in periplasm, including, GhOMT31, GhOMT33, GhOMT16, GhOMT55, GhOMT68, GhOMT69, and GhOMT71. Five OMTs were predicted to be localized in both periplasm and cytoplasm, including GhOMT32, GhOMT5, GhOMT50, GhOMT20, and GhOMT70. Two genes GhOMT67 and GhOMT30 were predicted in the outer membrane. Only GhOMT29 was predicted in inner membrane and cytoplasm (Table. 1). The function of the OMT genes might be related to their predicted localizations, though the experimental approach is still needed for further confirmation.
Table 1
Predicted Subcellular localization of OMT genes of G. hirsutum
Gene ID | Predicted Localization | Reliability* | Gene ID | Predicted Localization | Reliability* | Gene ID | Predicted Localization | Reliability* |
GhOMT1 | Cp | 3.712 | GhOMT29 | IM/Cp | 1.807/2.478 | GhOMT57 | Cp | 2.352 |
GhOMT2 | Cp | 3.597 | GhOMT30 | OM | 2.374 | GhOMT58 | Cp | 4.675 |
GhOMT3 | Cp | 2.985 | GhOMT31 | Pp | 4.34 | GhOMT59 | Cp | 3.572 |
GhOMT4 | Cp | 3.897 | GhOMT32 | Pp/Cp | 2.121/2.358 | GhOMT60 | Cp | 2.968 |
GhOMT5 | Pp/Cp | 2.094/2.624 | GhOMT33 | Pp | 4.154 | GhOMT61 | Cp | 3.224 |
GhOMT6 | Cp | 4.545 | GhOMT34 | Cp | 4.87 | GhOMT62 | Cp | 4.112 |
GhOMT7 | Cp | 4.122 | GhOMT35 | Cp | 4.493 | GhOMT63 | Cp | 4.124 |
GhOMT8 | Cp | 2.707 | GhOMT36 | Cp | 4.637 | GhOMT64 | Cp | 4.385 |
GhOMT9 | Cp | 4.033 | GhOMT37 | Cp | 3.636 | GhOMT65 | Cp | 4.927 |
GhOMT10 | Cp | 4.822 | GhOMT38 | Cp | 4.092 | GhOMT66 | Cp | 2.781 |
GhOMT11 | Cp | 4.154 | GhOMT39 | Cp | 4.228 | GhOMT67 | OM | 2.226 |
GhOMT12 | Cp | 4.234 | GhOMT40 | Cp | 3.13 | GhOMT68 | Pp | 3.91 |
GhOMT13 | Cp | 4.275 | GhOMT41 | Cp | 4.475 | GhOMT69 | Pp | 2.49 |
GhOMT14 | Cp | 4.675 | GhOMT42 | Cp | 4.619 | GhOMT70 | Pp/Cp | 1.961/2.316 |
GhOMT15 | Cp | 4.743 | GhOMT43 | Cp | 3.384 | GhOMT71 | Pp | 4.276 |
GhOMT16 | Pp | 3.107 | GhOMT44 | Cp | 4.717 | GhOMT72 | Cp | 4.474 |
GhOMT17 | Cp | 3.838 | GhOMT45 | Cp | 4.717 | GhOMT73 | Cp | 4.312 |
GhOMT18 | Cp | 2.229 | GhOMT46 | Cp | 4.735 | GhOMT74 | Cp | 4.534 |
GhOMT19 | Cp | 1.577 | GhOMT47 | Cp | 3.973 | GhOMT75 | Cp | 4.641 |
GhOMT20 | Pp/Cp | 1.995/1.743 | GhOMT48 | Cp | 3.924 | GhOMT76 | Cp | 4.885 |
GhOMT21 | Cp | 3.577 | GhOMT49 | Cp | 2.531 | GhOMT77 | Cp | 3.973 |
GhOMT22 | Cp | 3.508 | GhOMT50 | Pp/Cp | 1.967/2.566 | GhOMT78 | Cp | 2.277 |
GhOMT23 | Cp | 4.529 | GhOMT51 | Cp | 3.273 | GhOMT79 | Cp | 2.154 |
GhOMT24 | Cp | 4.404 | GhOMT52 | Cp | 4.607 | GhOMT80 | Cp | 3.576 |
GhOMT25 | Cp | 4.069 | GhOMT53 | Cp | 3.946 | GhOMT81 | Cp | 2.09 |
GhOMT26 | Cp | 4.655 | GhOMT54 | Cp | 4.695 | GhOMT82 | Cp | 3.123 |
GhOMT27 | Cp | 3.408 | GhOMT55 | Pp | 3.275 | | | |
GhOMT28 | Cp | 4.921 | GhOMT56 | Cp | 4.391 | | | |
Cp: Cytoplasmic, Pp: Periplasmic, OM: outer membrane, IM: inner membrane |
*Reliability: Lower reliability value shows the stronger possibility of predicted localization |
3.7 Enrichment analysis
To understand the functional annotations of OMT family genes of G. hirsutum, 82 genes in G. hirsutum were undergone through gene ontology (GO) enrichment, kyoto encyclopedia of genes and genomes (KEGG Pathway), and InterPro analyses. In GO term analysis, all of the OMT family genes in G. hirsutum were enriched mainly in three molecular function categories, namely all 82 genes in O-methyltransferase activity, 62 of the 82 genes in methyltransferase activity, and 53 of the 82 genes in protein dimerization activity (Fig. 3.a). In KEGG Pathway analysis, the OMT family genes are categorized into two groups according to their domain functions. Twenty-nine OMT family genes were involved in monolignol biosynthesis, phenylpropanoid, secondary metabolism, and metabolic pathways and eleven genes were involved in phenylalanine and flavonoid biosynthesis pathways (Fig. 3.b). In InterPro analysis (http://www.ebi.ac.uk/interpro/), 82 OMT genes were categorized as common as the functions of S-adenosyl-L-methionine-dependent methyltransferase (Fig. 3.c). Followed by the second most enriched categories of methyltransferase_2 and O-methyltransferase COMT-type, with Sixty-two genes predicted in each of them (Fig. 3.c). Fifty-seven OMT genes were enriched in winged helix-turn-helix DNA-binding domain, among which 53 genes were also predicted in plant methyltransferase dimerization category (Fig. 3.c).
3.8 Expression profiling of OMT genes and their homologues
In order to verify the biological functions of OMT family genes, several transcriptome data sets including TM-1 (Hu et al. 2019), G. arboreum, G. raimondii, CSSLs (Li et al. 2017), and RILs (Zhang et al. 2020b), were applied to analyze their expression profiles in different developmental stages, organs, or tissues, and responses to various abiotic stress treatments. The transcriptome clusters showed that the OMT genes can be assorted into three basic groups (Fig. 4A): Those that have a broad responses to different developmental stages from germination to fiber maturation, typical examples of which included GhOMT33, GhOMT71, GhOMT16 and GhOMT55; those that have specific responses to root development, including GhOMT28, GhOMT4 and GhOMT34; and those that have responses to early germination in seed, cotyledon, root and stem, including GhOMT58, GhOMT32 and GhOMT46. When fiber specific transcriptome data sets of G. arboreum, G. raimondii were applied to observe the expression profiling diploid OMT family genes, the result also supported specific expression profiling of some OMT genes in diploid species of G. arboreum (Fig. 4.B) and G. raimondii (Fig. 4.C).
The gene expression profiling was further verified with trancriptome datasets of RILs (Fig. 5.A) two CSSLs (Fig. 5.B and 5.C). The results showed that the genes that had specific expressions during fiber development (Fig. 4.A) also had specific expressions in fiber development of RILs and CSSLs materials. These genes had a highly consistent expression profiling among the different cotton cultivars and lines during fiber development. Some selected GhOMT examples genes, GhOMT16 (Fig. 5.D), GhOMT27 (Fig. 5.E), GhOMT33 (Fig. 5.F), GhOMT43 (Fig. 5.G), and GhOMT55 (Fig. 5.H), were verified through qRT-PCR using sGK9708 and 0-153, the two parental lines of the RIL population with different fiber quality traits. The results showed that GhOMT16, GhOMT33, and GhOMT55 were significantly up-regulated during fiber development in sGK9708 than in 0-153 Fig. 5.D, 5.F and 5.H) and that GhOMT 27 and GhOMT43 did not show differences between the two cultivars (Fig. 5.E and 5.G). Noticeably, GhOMT16 and GhOMT55 reached the highest expression levels at 20 DPA and their high expression lasted in a short time as compared with that of GhOMT33. GhOMT33 had a rapid expression increase from 10 DPA to 15DPA and then its expression steadily increased until 25 DPA when it reached its highest expression level.
Based on the expression profiling of the OMT gene family in responses to cold, hot, osmotic, and salt stress treatments (Fig. 6.A), two genes specific in salt stress responses, GhOMT27 and GhOMT43, and three genes specific in fiber development, GhOMT16, GhOMT55, and GhOMT33 were verified by qRT-PCR with RNA samples extracted from salt treatment. The results indicated that both GhOMT27 and GhOMT43 showed an elevated expression in salt treatments in salt-tolerant cultivar as compare to the control treatments (Fig. 6.B and 6.C). These two genes had different expression profiles from 2 h to 6 h after salt treatment. GhOMT27 had the highest expression at 2 h and then its expression went down at 6 h; whereas GhOMT43 had an increasing expression pattern from 2 h to 6 h. Both genes had much higher expression in roots than in stem or leaf.