Phylogeny and evolutionary analysis of cotton SRO genes
To explore the phylogenetic relationships between cotton SRO genes and well-characterized SRO genes from Arabidopsis, maize, potato and apple, an unrooted tree was generated from their protein sequences. In this tree, all 45 SRO genes fell into three groups (Group I, Group II and Group III), which consisted of 12, 17 and 16 genes, respectively (Fig. 1). Cotton SRO genes were distributed across all three groups. Among the GhSRO genes, GhSRO12, GhSRO6, GhSRO10 and GhSRO4 clustered in Group III with most of the Arabidopsis SRO genes (AtSRO2, AtSRO3, AtSRO4 and AtSRO5) and one potato SRO gene (SlocSRO2), whereas four GhSRO genes (GhSRO8, GhSRO2, GhSRO7 and GhSRO1) clustered in Group I. Group II contained the largest number of SRO genes, including GhSRO11, GhSRO5, GhSRO3 and GhSRO9 from upland cotton, two Arabidopsis SRO genes (AtRCD1 and AtSRO1), one maize SRO gene (ZmSRO5) and one apple SRO gene (MdRCD1). As seen in Fig. 1, many orthologous genes were found among the four cotton species, but no orthologues were found between Arabidopsis and the cotton species. The largest number of orthologous pairs (seven) was found between G. hirsutum and G. barbadense (GhSRO6/GbSRO6, GhSRO8/GbSRO8, GhSRO7/GbSRO7, GhSRO1/GbSRO1, GhSRO11/GbSRO10, GhSRO3/GbSRO3 and GhSRO9/GbSRO9). Two orthologous pairs were found between G. arboreum and G. barbadense (GaSRO6/GbSRO4 and GaSRO2/GbSRO2), one between G. raimondii and G. hirsutum (GrSRO3/GhSRO10) and one between G. arboreum and G. hirsutum (GaSRO5/GhSRO5). Interestingly, no orthologous pair was found between the two diploid cotton species.
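The orthologue counts per species pair can be re-derived from the gene pairs listed above. The following is a minimal sketch rather than the pipeline used in the study: the pairs are transcribed from the text, and the helper names (`species_of`, `tally_by_species_pair`) are hypothetical. It assumes that the first two letters of each gene name identify the species (Gh, Gb, Ga, Gr).

```python
from collections import Counter

# Orthologous pairs transcribed from the paragraph above (Fig. 1).
ORTHOLOG_PAIRS = [
    ("GhSRO6", "GbSRO6"), ("GhSRO8", "GbSRO8"), ("GhSRO7", "GbSRO7"),
    ("GhSRO1", "GbSRO1"), ("GhSRO11", "GbSRO10"), ("GhSRO3", "GbSRO3"),
    ("GhSRO9", "GbSRO9"),
    ("GaSRO6", "GbSRO4"), ("GaSRO2", "GbSRO2"),
    ("GrSRO3", "GhSRO10"),
    ("GaSRO5", "GhSRO5"),
]


def species_of(gene: str) -> str:
    """Return the two-letter species prefix (assumed naming convention)."""
    return gene[:2]


def tally_by_species_pair(pairs):
    """Count orthologous pairs per unordered species pair."""
    counts = Counter()
    for left, right in pairs:
        key = tuple(sorted((species_of(left), species_of(right))))
        counts[key] += 1
    return counts


counts = tally_by_species_pair(ORTHOLOG_PAIRS)
for species_pair, n in sorted(counts.items()):
    print(species_pair, n)
```

Looking up the diploid pair `("Ga", "Gr")` in `counts` gives zero, which matches the observation that no orthologous pair was found between the two diploid species.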
Chromosomal distribution and syntenic study of cotton SRO genes
To explore the evolutionary dynamics and syntenic relationships among cotton SRO genes, a Circos plot was generated for the two diploid and two allotetraploid cotton species. The putative GaSRO, GrSRO, GhSRO and GbSRO genes were unevenly distributed across three chromosomes in each of the A2 (Ga-A05, Ga-A08, Ga-A12), D5 (Gr-D04, Gr-D08, Gr-D09), G. hirsutum At (Gh-A05, Gh-A08, Gh-A12) and Dt (Gh-D05, Gh-D08, Gh-D12), and G. barbadense At (Gb-A05, Gb-A08, Gb-A12) and Dt (Gb-D05, Gb-D08, Gb-D12) genomes (Fig. 2). The number of genes per chromosome ranged from one to four: one gene on Ga-A08, Gr-D04, Gh-A08, Gh-D08, Gb-A08 and Gb-D08; two genes on Ga-A05, Gr-D09, Gh-A05, Gh-D05, Gh-D12, Gb-A05, Gb-D05 and Gb-D12; three genes on Ga-A12, Gh-A12, Gh-D12 and Gb-A12; and the maximum of four genes on Gr-D08 (Table S3, Fig. 2). Gene duplication analysis of GhSRO genes revealed six duplicated pairs sharing more than 95% nucleotide sequence similarity: GhSRO1/GhSRO7, GhSRO2/GhSRO8, GhSRO3/GhSRO9, GhSRO4/GhSRO10, GhSRO5/GhSRO11 and GhSRO6/GhSRO12 (Fig. S1). Interestingly, all the duplicated gene pairs arose through segmental duplication and had undergone purifying selection (Table S4).
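Purifying selection on duplicated pairs is conventionally inferred from the ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates, with Ka/Ks < 1 indicating purifying selection, Ka/Ks = 1 neutral evolution and Ka/Ks > 1 positive selection. The sketch below illustrates that rule only. It assumes that Table S4 reports Ka and Ks for each pair, and the function name and numeric values are hypothetical, not taken from the study.

```python
import math


def classify_selection(ka: float, ks: float, rel_tol: float = 1e-9) -> str:
    """Classify selection pressure from Ka and Ks using the usual Ka/Ks rule."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, rel_tol=rel_tol):
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical (Ka, Ks) values for illustration; these are NOT the Table S4 values.
example_pairs = {
    "pair_A": (0.02, 0.10),
    "pair_B": (0.30, 0.10),
}

calls = {name: classify_selection(ka, ks) for name, (ka, ks) in example_pairs.items()}
print(calls)
```

A duplicated pair would be described as under purifying selection when this classification returns `"purifying"` for its Ka and Ks values.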
Structure and motifs analysis of GhSRO genes
Gene structure analysis of exon/intron number and distribution showed that Group II members had the highest number of exons and introns (six exons and five introns). Members of Group I and Group III possessed four exons and three introns, except for two members of Group III, which contained five exons and four introns (Fig. 3A&B). Collectively, the structural diagram revealed that the number and distribution of exons/introns were conserved among closely related members within each group. Analysis of conserved domains/motifs provides clues about gene duplication and functional conservation during evolution. Using the online MEME tool, seven motifs were found in GhSRO genes, and their distributions are displayed in Fig. 3C&D. Different GhSRO genes contained different motifs; however, motifs 1 and 4 were conserved among all GhSRO genes. Moreover, GhSRO genes with similar structures harbored similar motifs within subgroups.
Cis-elements analysis
The promoter region of every gene contains elements that control or regulate gene expression; these regulatory elements are known as cis-elements. To understand the specific roles of GhSRO genes during cotton development and in responses to different stress factors, we investigated their cis-regulatory elements. As shown in Fig. 4A, four main types of elements were found in GhSRO genes: light-responsive, growth- or development-related, hormone-responsive and stress-related. Light-responsive elements were predominant (59%), followed by hormone-responsive (17%), stress-related (17%) and plant growth-related (7%) elements. Each GhSRO gene carried a diverse set of cis-elements, and the pattern of these elements varied among GhSRO genes. Compared with the other GhSRO genes, GhSRO4 had the largest number of low temperature-responsive, development-related, ABA-responsive and GA-related elements. Notably, only GhSRO5 and GhSRO7 carried defense- or stress-responsive cis-elements (Fig. 4B).
Identification of potential miRNA targeting sites in GhSRO genes
MiRNAs are small (∼22-nucleotide) non-coding RNAs known to participate in the regulation of genes in eukaryotes (48, 49). Recently, the regulatory roles of these small RNAs in plant developmental processes and stress responses have been widely studied (50–53). To further elucidate the functions of GhSRO genes in cotton and to predict potential miRNA target sites, the coding sequences of GhSRO genes were submitted to the psRNATarget server (http://plantgrn.noble.org/psRNATarget/home). The results revealed 23 upland cotton miRNAs targeting the 12 GhSRO genes (Fig. 5, Table S5). The predicted miRNAs are related to fiber development, growth stages and environmental stresses in plants (54). These miRNAs were previously identified through various RNA-sequencing and bioinformatics approaches. The functions of some of them have been validated experimentally, while others are only predicted. ghr-mir2868 targeted GhSRO2 and GhSRO8, and the target sites were located in the RST domain. Similarly, GhSRO4 and GhSRO10 were targeted by ghr-miR2529, and GhSRO6 and GhSRO12 by novel-miR-27. Other miRNAs, namely ghr-mir482d, ghr-miR414f, ghr-n72, ghr-miR2595, ghr-mir482a and ghr-miR482h, targeted GhSRO1, GhSRO3, GhSRO5, GhSRO7, GhSRO9 and GhSRO11, respectively. In addition, some GhSRO genes were targeted by further miRNAs: GhSRO1 was also targeted by ghr-miR2673, ghr-n3, ghr-miR2595, ghr-miR835, ghr-miR838a and novel_miR_43, and GhSRO2 by ghr-miR414f and ghr-miR7. Complete details of the predicted miRNAs and their potential targets in GhSRO genes are provided in Table S5.
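The miRNA-target relationships named in this paragraph can be organized per target gene and per miRNA. The sketch below is illustrative only: it uses just the pairs spelled out in the text (a subset of the 23 miRNAs in Table S5), it does not parse psRNATarget output, and the helper names are hypothetical.

```python
from collections import defaultdict

# (miRNA, target) pairs transcribed from the paragraph above.
# Table S5 holds the complete set; this list covers only the pairs named in the text.
MIRNA_TARGET_PAIRS = [
    ("ghr-mir2868", "GhSRO2"), ("ghr-mir2868", "GhSRO8"),
    ("ghr-miR2529", "GhSRO4"), ("ghr-miR2529", "GhSRO10"),
    ("novel-miR-27", "GhSRO6"), ("novel-miR-27", "GhSRO12"),
    ("ghr-mir482d", "GhSRO1"), ("ghr-miR414f", "GhSRO3"),
    ("ghr-n72", "GhSRO5"), ("ghr-miR2595", "GhSRO7"),
    ("ghr-mir482a", "GhSRO9"), ("ghr-miR482h", "GhSRO11"),
    ("ghr-miR2673", "GhSRO1"), ("ghr-n3", "GhSRO1"),
    ("ghr-miR2595", "GhSRO1"), ("ghr-miR835", "GhSRO1"),
    ("ghr-miR838a", "GhSRO1"), ("novel_miR_43", "GhSRO1"),
    ("ghr-miR414f", "GhSRO2"), ("ghr-miR7", "GhSRO2"),
]


def group_by_target(pairs):
    """Map each target gene to the set of miRNAs predicted to target it."""
    by_target = defaultdict(set)
    for mirna, target in pairs:
        by_target[target].add(mirna)
    return dict(by_target)


def mirnas_with_multiple_targets(pairs):
    """Return miRNAs that are listed against more than one GhSRO gene."""
    by_mirna = defaultdict(set)
    for mirna, target in pairs:
        by_mirna[mirna].add(target)
    return {m: t for m, t in by_mirna.items() if len(t) > 1}


by_target = group_by_target(MIRNA_TARGET_PAIRS)
shared = mirnas_with_multiple_targets(MIRNA_TARGET_PAIRS)

for gene in sorted(by_target, key=lambda g: int(g.removeprefix("GhSRO"))):
    print(gene, sorted(by_target[gene]))
```

Grouping the pairs this way makes it easy to see which GhSRO genes are listed against several miRNAs and which miRNAs are listed against more than one gene.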
Expression profiling of GhSRO genes in various tissues and ovule developmental stages of cotton
Gene expression patterns partly reflect gene function. SRO genes have been reported to regulate various developmental processes in plants (13, 29). As a preliminary investigation of the biological functions of GhSRO genes in various tissues and during ovule development in upland cotton, we analyzed transcriptomic data (38). All candidate GhSRO genes showed differential expression across most of the tissues studied (root, stem, leaf, sepal, petal, anther, filament, pistil, bract and torus). Of the 12 candidate genes, six (GhSRO6, GhSRO4, GhSRO12, GhSRO10, GhSRO7 and GhSRO1) were predominantly expressed in anthers. The other six genes were highly expressed in some tissues and weakly expressed or barely detectable in others. For example, GhSRO8 was highly expressed in sepals and torus, less so in anthers and filaments, and barely detectable in roots and bracts (Fig. 6A). As shown in Fig. 6B, all the candidate SRO genes were differentially induced across ovule developmental stages. Two genes (GhSRO4 and GhSRO10) were induced more specifically during the later developmental stages and two genes (GhSRO8 and GhSRO2) during the mid-developmental stages, while others were differentially induced during the initial and later stages. Moreover, four genes (GhSRO11, GhSRO5, GhSRO3 and GhSRO9) were more or less constitutively expressed during most ovule developmental stages. Remarkably, GhSRO10 was most strongly induced at 25 dpa of ovule development, suggesting a specific function at this physiological stage (Fig. 6B). Collectively, our expression profiling data support important roles for GhSRO genes during various growth and developmental stages of cotton.