Identification of SPL genes in rye
In this study, all possible members of the SPL gene family in rye were identified and extracted using two BLASTp methods. Twenty-one ScSPL genes were identified and named ScSPL1-ScSPL21 based on the chromosome number (Additional file 5: Table S1). Next, the essential characteristics of the ScSPLs were analyzed, including molecular weight (mw), isoelectric point (pI), coding sequence length, and subcellular localization (http://cello.life.nctu.edu.tw).
Among the 21 ScSPL proteins, ScSPL7 was the smallest (193 amino acids) and ScSPL11 the largest (1130 amino acids). The molecular masses of the proteins ranged from 20.127 kDa (ScSPL7) to 123.589 kDa (ScSPL11), and the pI values ranged from 5.74 (ScSPL19) to 9.87 (ScSPL11), with an average of 6.57. Three of the 21 ScSPL proteins contained the ANK domain. According to the subcellular localization predictions, all ScSPL proteins were located in the nucleus, five in the cytoplasm, chloroplast, and mitochondria, and one (ScSPL16) in the plasmid. Furthermore, the number of SPL genes in rye (21) was higher than that in A. thaliana (15), H. vulgare (17), and A. tauschii Coss (19) but lower than that in Z. mays (29), O. sativa (29), and T. aestivum (56).
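As a rough illustration of how a molecular mass can be derived from a protein sequence, and of how the reported masses relate to the reported lengths, the following minimal sketch sums average residue masses and adds one water molecule. The residue masses are standard textbook averages rather than values from this study, and the function names are hypothetical; the tool actually used for Table S1 may apply different conventions.

```python
# Average residue masses in daltons for the 20 standard amino acids.
# These are textbook averages (an assumption), not values from the study.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water molecule is added for the free N- and C-termini


def molecular_weight_kda(seq):
    """Return the average molecular mass of a protein sequence in kDa."""
    seq = seq.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


def mean_residue_mass(mw_kda, length):
    """Return the mean mass per residue in daltons for a reported mass and length."""
    return mw_kda * 1000.0 / length


# Consistency check using the values reported in the text.
smallest = mean_residue_mass(20.127, 193)    # ScSPL7
largest = mean_residue_mass(123.589, 1130)   # ScSPL11
print(round(smallest, 1), round(largest, 1))
```

Both reported proteins come out between roughly 104 and 110 Da per residue, which is in the range usually expected for plant proteins.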
Multiple sequence alignment, phylogenetic analysis, and classification of ScSPL genes.
To determine the phylogenetic relationships of SPL proteins in rye, a phylogenetic tree of the 21 ScSPL and 15 AtSPL proteins was constructed by the neighbor-joining method using MEGA6.0. The 21 ScSPL genes in the tree were classified into seven clades (groups 1–7) according to the classification proposed by Cenci and Rouard [36] and the topology of the tree. The classification of the rye SPL proteins was consistent with that of Arabidopsis, suggesting that these SPL genes remained stable during evolution (Fig. 1; Additional file 5: Table S1).
Among the seven subfamilies, subfamily II had the largest number of members (five ScSPLs), whereas subfamilies Ⅰ and VI had only one ScSPL each. Subfamilies III, VII, and VIII each had four members. A comparison of the rye phylogenetic tree with that of A. thaliana showed that some ScSPLs clustered closely with AtSPLs (bootstrap support ≥ 70), suggesting that these proteins might be orthologous and have similar biological functions.
Multiple sequence alignment of AtSPLs with the seven ScSPL subfamilies.
Next, the SPL proteins of A. thaliana were aligned with those of the seven rye SPL subfamilies. According to previous studies [37], all SPL proteins contain a conserved SBP domain consisting of zinc finger 1 (Zn1), zinc finger 2 (Zn2), and a bipartite nuclear localization signal (NLS). The basic region consists of 14 conserved amino acids, spanning 70–80 amino acids (Additional file 1: Figure S1). The results revealed that only subfamily II was not fully conserved between rye and A. thaliana. In ScSPL19 of rye subfamily II, Zn2 (Cys2HisCys-type) and the NLS were mutated, and two arginine (R) residues in the C-terminal RRRK of the SBP domain were replaced with other amino acids (Additional file 2: Figure S2). In addition, the conserved KR was replaced with MY. Otherwise, the SBP domains in A. thaliana and rye were highly conserved, indicating that the SBP domain was established early in plant evolution.
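The degree of conservation described above can be quantified per alignment column. The sketch below scores each column by the fraction of sequences that share the most common residue. The three four-residue sequences are invented toy data that only mimic a KR-to-MY substitution; they are not the real SBP domain sequences, and the function name is hypothetical.

```python
from collections import Counter


def column_conservation(aligned):
    """Return, for each alignment column, the fraction of sequences that
    carry the most common residue. Gap characters ('-') never count as a match."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned):
        residues = [res for res in column if res != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Toy alignment (hypothetical): the third sequence carries MY instead of KR.
toy = ["CKRH", "CKRH", "CMYH"]
print(column_conservation(toy))
```

Columns that score 1.0 are fully conserved; lower scores flag positions such as the substituted residues.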
Conserved motifs and gene structural analysis of ScSPL genes.
To explore the structural diversity of SPL genes in rye, their exon-intron structures were obtained by comparison with the corresponding genomic DNA sequences. The 21 ScSPL genes differed in exon number, which varied from 1 to 11, but all contained the SBP domain (Fig. 2; Additional files 5 and 6: Tables S1 and S2). ScSPL4, ScSPL9, ScSPL11, and ScSPL20 contained the conserved ANK domain, while 10 ScSPL genes shared the same structure of three exons and two introns (Fig. 2B). ScSPL4, ScSPL9, ScSPL11, and ScSPL20, all belonging to subfamily II, had the largest numbers of exons and introns (11 and 10, respectively) (Fig. 2A, B). ScSPL genes from the same subfamily generally had similar gene structures. Subfamily II showed greater variation in intron number than the other subfamilies, suggesting that its members might have more diverse functions.
To further explore different regions of the ScSPL proteins and evaluate their structural diversity, their motifs were analyzed using the online MEME software. Ten motifs were identified in the ScSPL proteins and named motif 1 to motif 10. Motif 1 and motif 2 were widely distributed among the ScSPLs and were closely associated with each other (Fig. 2C). Notably, ScSPL genes in the same subfamily typically shared a similar motif composition. For instance, subfamily Ⅰ contained motifs 1, 2, 3, and 5, while ScSPL9 and ScSPL11 in subfamily II contained all ten motifs. Subfamilies Ⅲ, Ⅳ, Ⅶ, and Ⅷ all contained the same motifs (1, 2, and 6), and subfamilies Ⅴ and Ⅵ contained motifs 1 and 2. Some motifs occupied specific positions: motif 6 was always located at the beginning of the motif arrangement, while motif 8 was always at the end. Motif 2 was always found between motif 6 and motif 1, except in subfamilies Ⅴ and Ⅵ (Fig. 2C; Table S2). Overall, these results suggested that genes from the same subfamily have similar motif compositions and gene structures and tend to cluster together, consistent with the classification in the phylogenetic tree.
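The relationship between exon and intron counts reported above (for example, three exons and two introns, or 11 exons and 10 introns) follows directly from the exon coordinates. A minimal sketch, assuming exons are given as 1-based inclusive (start, end) pairs on the genomic sequence; the coordinates and the function name are hypothetical.

```python
def gene_structure(exons):
    """Derive intron intervals and counts from exon coordinates.

    exons: list of (start, end) tuples, 1-based and inclusive.
    A gene with n non-adjacent exons has n - 1 introns.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end + 1:
            raise ValueError("exons overlap or are directly adjacent")
        introns.append((prev_end + 1, next_start - 1))
    return {"n_exons": len(ordered), "n_introns": len(introns), "introns": introns}


# Hypothetical three-exon gene, matching the most common ScSPL structure.
print(gene_structure([(1, 100), (201, 300), (401, 500)]))
```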
Chromosomal spread and gene duplication in ScSPL genes.
The physical locations of the SPL genes on the chromosomes were determined using the latest rye genome database. The 21 SPL genes were unevenly distributed on chromosomes (Chr) 1 to 7, and each SPL gene was named according to its physical location on the chromosomes (Fig. 3; Additional file 5: Table S1). Chr2 and Chr4 contained the largest number of ScSPL genes (five genes each, ~23.81%), followed by Chr7 (three genes, ~14.29%). Chr1 and ChrUn had only one gene each (~4.76%), while Chr3, Chr5, and Chr7 each contained two (~9.52%) ScSPL genes. Notably, almost half of the SPL genes were located near the top of a chromosome.
Gene duplication events mainly comprise tandem and segmental duplications, which are essential for gene family expansion and the generation of new functions [38, 39]. A chromosomal region containing two or more genes within 200 kb is defined as a tandem duplication event [40]. A duplication analysis of the SPL genes in rye was therefore performed to explore the evolutionary conservation of the gene family. No tandem or segmental duplication events were detected in the rye genome (Additional file 3: Figure S3), indicating that the ScSPL genes were relatively conserved during evolution.
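The percentages quoted for the chromosomal distribution are simple shares of the 21 ScSPL genes. The short sketch below reproduces them; the helper name is hypothetical.

```python
TOTAL_SCSPL = 21  # number of ScSPL genes reported in the text


def share(count, total=TOTAL_SCSPL):
    """Return the percentage of the gene family represented by `count` genes."""
    return round(100 * count / total, 2)


for n_genes in (5, 3, 2, 1):
    print(n_genes, share(n_genes))
```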
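The 200 kb criterion for tandem events can be sketched as a pairwise distance check between family members on the same chromosome. The gene names and coordinates below are hypothetical, and dedicated collinearity tools usually also require sequence similarity between the paired genes, which this sketch omits.

```python
from itertools import combinations

WINDOW_BP = 200_000  # distance threshold taken from the definition in the text


def tandem_pairs(genes, window=WINDOW_BP):
    """Return pairs of genes on the same chromosome whose start positions
    lie within `window` base pairs of each other.

    genes: dict mapping gene name -> (chromosome, start position in bp).
    """
    pairs = []
    for (name1, (chr1, pos1)), (name2, (chr2, pos2)) in combinations(
        sorted(genes.items()), 2
    ):
        if chr1 == chr2 and abs(pos1 - pos2) <= window:
            pairs.append((name1, name2))
    return pairs


# Hypothetical positions: only geneA and geneB fall within 200 kb on one chromosome.
example = {
    "geneA": ("Chr2", 1_000_000),
    "geneB": ("Chr2", 1_150_000),
    "geneC": ("Chr2", 5_000_000),
    "geneD": ("Chr4", 1_100_000),
}
print(tandem_pairs(example))
```

An empty result for every chromosome would correspond to the absence of tandem events reported here.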
Evolutionary analysis of ScSPL genes and SPL genes from several different species
One dicot (A. thaliana) and five monocots (H. vulgare, O. sativa, Z. mays, T. aestivum, and A. tauschii Coss) were selected for comparison. A phylogenetic tree of the 21 ScSPL genes and the SPL genes from these six plants was constructed using the NJ method in Geneious R11, and 10 conserved motifs were identified with the MEME web server. The results showed that the ScSPL genes were unevenly distributed on the phylogenetic tree (Fig. 4). Genes from the same subfamily tended to cluster together and to share the same motifs. Interestingly, almost all SPL genes from the seven plants contained motifs 1, 2, and 3, except ScSPL19 of subfamily II in rye. Subfamilies Ⅰ and II contained the most motifs and showed the greatest diversity. In subfamily II, motif 10 was always located at the beginning of the motif arrangement, while motif 9 was almost always located at the end. In subfamilies III, VI, and VIII, however, motif 7 was always located at the beginning. Overall, these results revealed that ScSPL genes from subfamilies Ⅰ and II showed higher homology with SPL gene clusters in T. aestivum. In contrast, most SPL genes in the other groups clustered with those of H. vulgare and A. tauschii Coss, implying that these SPL genes were evolutionarily closer and might have similar functions.
To further explore the evolutionary mechanisms of the SPL gene family, comparative syntenic maps between rye and six representative species were constructed, comprising one dicot (A. thaliana) and five monocots (H. vulgare, O. sativa, Z. mays, T. aestivum, and A. tauschii Coss) (Fig. 5; Additional file 7: Table S3). The results demonstrated that the 21 ScSPL genes exhibited collinear relationships with SPL genes of A. thaliana (15), A. tauschii Coss (19), H. vulgare (17), T. aestivum (56), O. sativa (29), and Z. mays (29). The numbers of homologous pairs between rye and A. thaliana, H. vulgare, O. sativa, A. tauschii Coss, Z. mays, and T. aestivum were 2, 9, 16, 16, 23, and 39, respectively.
Collinearity analysis across the six plants showed that some ScSPL genes had at least one collinear gene in every species, such as ScSPL14 with Zm00001d031451_T001/AET7Gv20605000.11/AT3G57920.1/HORVU5Hr1G073440.7/Os08t0509600-01/TraesARI7A01G276200.1, suggesting that these orthologous genes were highly conserved and possibly existed prior to ancestral divergence. Therefore, these ScSPL genes might have played an essential role in the evolution of the SPL gene family in rye. Interestingly, gene pairs collinear with 10 ScSPL genes were identified in H. vulgare, O. sativa, A. tauschii Coss, Z. mays, and T. aestivum, suggesting that these orthologous pairs might have been formed via gene duplication during the differentiation of dicotyledonous and monocotyledonous plants.
Promoter cis-element and transcription factor analysis.
Next, the promoter regions of the ScSPL genes were analyzed to explore the effects of cis-regulatory elements on gene regulation. The cis-acting elements identified in the ScSPL promoters could be classified into four categories: light-responsive, hormone-responsive, stress-responsive, and plant growth and development-related elements (Additional file 8: Table S4). Light-responsive elements accounted for the largest proportion, and individual ScSPL genes in rye covered most of the phytohormone-responsive elements, including the abscisic acid-responsive element (ABRE), auxin-responsive elements (AuxRR-core and TGA-element), gibberellin-responsive elements (GARE-motif, P-box, and TATC-box), jasmonic acid-responsive elements (CGTCA-motif and TGACG-motif), and the salicylic acid-responsive element (TCA-element). In addition, cis-regulatory elements associated with low temperature, hypoxia, drought, anaerobic conditions, and other defense and stress responses were found in all ScSPL genes.
All genes contained abscisic acid-responsive and stress-responsive elements, while 95.2% of the ScSPL genes contained MeJA-responsive elements. The promoters of ScSPL1 and ScSPL18 had auxin-, ethylene-, SA-, and gibberellin-responsive elements. The ScSPL1 promoter contained several stress-responsive elements, among which the drought-responsive elements (MYC, as-1, and MBS) were the most numerous (150). All ScSPL promoters except that of ScSPL2 contained the drought-related element MYC, and the as-1 element was identified in all promoters except that of ScSPL10 (Additional file 4: Figure S4). These results suggested that some cis-acting elements might regulate expression in different tissues (seed, root, endosperm, fenestrated chloroplast, and phloem) during development, and that ScSPL genes might be involved in tissue development and in responses to various abiotic stresses.
Promoter cis-elements combined with TFs can regulate the precise initiation and efficiency of transcription. Therefore, PlantTFDB was used to predict potential TFs binding to the ScSPL promoters. The largest number of transcription factors was predicted to regulate ScSPL20, and the smallest number to regulate ScSPL17. Seventeen ERF transcription factors were mainly involved in the regulation of ScSPL genes, and all of them responded positively to MeJA and ABA stresses. Of great interest, most (57%) of the SPL TFs, such as ScSPL1, ScSPL3, ScSPL4, ScSPL5, ScSPL7, ScSPL8, ScSPL9, ScSPL11, ScSPL12, ScSPL13, ScSPL14, and ScSPL15, could bind to each other (Fig. 6). ERFs have been shown to regulate the expression of target genes and to defend against Botrytis cinerea in Arabidopsis thaliana in a JA-dependent manner [41, 42]. Therefore, it is speculated that ScSPL genes might be involved in JA responses, with a potential antifungal function, through ERF regulation. In addition, bZIP and NAC TFs mainly regulated ScSPL19 and ScSPL20. Five bZIP TFs were predicted to bind ScSPL19 and ScSPL20, most of which (bZIP3/44/48/53) belong to the group S subfamily of bZIPs. The group S subfamily has been confirmed to play an important role in stress responses and developmental processes [43], so ScSPL19 and ScSPL20 might have similar functions. AtMYB17 is involved in early inflorescence development and seed germination [44]. AtMYB74 responds to osmotic stress, water deficit, and seed development by regulating ABA [45]. These findings suggest a direction for studying the functions of MYB family members in the regulation of ScSPL1.
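Counting cis-elements in a promoter amounts to searching the sequence for short core motifs. The sketch below counts overlapping occurrences on the forward strand only. The CGTCA-motif and TGACG-motif cores follow from the element names; the ABRE core ACGTG is an assumption, the promoter sequence is invented, and the function name is hypothetical. This is not the procedure used to produce Table S4.

```python
import re

# Core sequences of three hormone-related elements. ACGTG as the ABRE core
# is an assumption; the other two cores are given by the element names.
MOTIFS = {
    "CGTCA-motif": "CGTCA",
    "TGACG-motif": "TGACG",
    "ABRE-core": "ACGTG",
}


def count_elements(promoter, motifs=MOTIFS):
    """Count overlapping occurrences of each motif on the forward strand.

    Only one strand is scanned because CGTCA and TGACG are reverse
    complements of each other, so scanning both strands would count every
    site under both names.
    """
    seq = promoter.upper()
    return {
        name: len(re.findall(f"(?={re.escape(core)})", seq))
        for name, core in motifs.items()
    }


# Invented toy promoter fragment.
print(count_elements("TTCGTCAAATGACGTGCC"))
```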
Expression patterns of ScSPL genes in different plant organs.
To further evaluate the potential functions of the ScSPL genes, qRT-PCR was used to analyze the expression of the 21 ScSPL genes in five organs (roots, stems, leaves, flowers, and fruits). The ScSPL genes exhibited different expression patterns across these organs and might therefore play diverse regulatory roles. Notably, all genes were least expressed in roots. Five genes (ScSPL3, ScSPL13, ScSPL16, ScSPL19, and ScSPL20) were highly expressed in fruits. In addition, 13 genes (ScSPL1, ScSPL2, ScSPL5, ScSPL6, ScSPL7, ScSPL8, ScSPL9, ScSPL10, ScSPL11, ScSPL12, ScSPL14, ScSPL15, and ScSPL18) had the highest expression in flowers, while ScSPL4 and ScSPL19 were highly expressed in leaf and stem tissues (Fig. 7C) (*p < 0.05). Importantly, genes in the same subfamily showed similar expression patterns and might have similar functions. The low expression of all SPL genes in roots indicates that SPL genes might be more closely associated with the development of stems, leaves, and flowers.
Moreover, since SPL genes have been reported to regulate fruit development and thus influence nutrient composition and developmental rate [3, 4], the expression patterns of the 21 ScSPL genes were analyzed at five stages after anthesis (7D, 14D, 21D, 28D, and 35D). The results showed that most ScSPL genes exhibited different expression patterns across these five stages of rye fruit development. Thirteen genes showed a significant down-regulation trend during fruit development, while the expression of ScSPL7 was up-regulated. ScSPL14 showed the highest expression level at day 21 of fruit development, while four genes (ScSPL6, ScSPL12, ScSPL14, and ScSPL19) had the highest expression level at day 28 (Fig. 7A) (*p < 0.05). These results indicated that SPL genes also play an essential role in fruit development, thus providing a theoretical basis for studying the nutritional value of rye.
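Relative expression from qRT-PCR data is commonly calculated with the 2^-ΔΔCt method. This section does not state which quantification method was used, so the sketch below assumes that method for illustration only; the Ct values, the reference gene, and the function name are all hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Return the fold change of a target gene by the 2^-ddCt method.

    Each dCt is the target Ct minus the reference-gene Ct; ddCt is the
    dCt of the sample minus the dCt of the control (calibrator).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the control, relative to the same reference gene.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

A result above 1 indicates higher expression in the sample than in the control, and a result below 1 indicates lower expression.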
Expression patterns of ScSPL genes under various treatments.
To further determine whether different abiotic stresses affected the expression of ScSPL genes, qRT-PCR analysis was performed to evaluate the expression of ScSPL genes in roots, leaves, and stems of plants under six abiotic stresses. ScSPL genes exhibited significant up- and down-regulated expression patterns under different stress treatments. Notably, the expression of most genes was significantly different in all three tissues. For example, twelve ScSPL genes were highly expressed in leaves under cold stress. ScSPL3, ScSPL4, ScSPL8, and ScSPL9 were significantly up-regulated in roots, leaves, and stems under different treatments. Under flooding stress, ScSPL1, ScSPL2, ScSPL9, ScSPL10, and ScSPL13 were significantly up-regulated in roots, while ScSPL14 was significantly down-regulated (Figure. 8) (*p < 0.05).
Interestingly, most genes were significantly affected in the early stages of treatment. ScSPL8, ScSPL10, ScSPL15, and ScSPL21 genes demonstrated similar expression patterns. Different tissues showed a tendency to be up-regulated over time to generate stress responses. However, some genes showed different expression patterns at transcriptional levels under different treatments. Most genes exhibited the highest expression in roots under heat stress treatment and the highest expression in leaves under cold stress treatment. Notably, ScSPL8, ScSPL9, and ScSPL18 showed similar expression patterns under heat and cold stresses and were highly expressed in roots in response to stress (Figure. 8) (*p < 0.05).
Furthermore, the expression levels of the ScSPL genes under ABA, IAA, and GA3 treatments were examined. The genes exhibited significant expression changes under the different treatments (Fig. 9). All genes except ScSPL16 responded to the hormones, but each gene had a different expression pattern under each treatment. ScSPL2 and ScSPL5 had the highest expression levels under GA3 treatment. ScSPL11 and ScSPL14 had the highest expression levels under IAA treatment, while only ScSPL5 had the highest expression level under ABA treatment (Fig. 9) (*p < 0.05). Notably, ScSPL1 was highly expressed in all tissues under hormone and abiotic stress treatments and should therefore be considered a potential candidate gene for further study.