Retrieval of genomic sequence
The 5346 bp wheat SSIII cDNA sequence (accession no. AF258608) was retrieved from the NCBI database and used as the query to search for and retrieve the genomic sequence of the TaSSIII gene from the IWGSC wheat genome database. The BLAST report lists the query ID, database, aligned genome (subject), alignment score, identities (query length), aligned percentage, expected value (E-value), and the start and end positions of the aligned query sequence on the A, B and D genomes. The SSIII cDNA query aligned best with chromosome 1 of the A, B and D genomes, with alignment scores of 98%, 96% and 96%, respectively, and an E-value of 0.0. It also aligned with chromosome 2 of the A, B and D genomes, with alignment scores of 76% in each case and very low E-values of 3e−152, 1e−157 and 8e−153, respectively.
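The grouping of hits described above can be sketched in a few lines of Python. The percentages and E-values are those reported in the text; the 90% cut-off and the names `HITS` and `split_by_identity` are assumptions made for illustration and are not part of the original analysis.

```python
# Reported BLAST hits: (chromosome, alignment percentage, E-value).
HITS = [
    ("1A", 98.0, 0.0),
    ("1B", 96.0, 0.0),
    ("1D", 96.0, 0.0),
    ("2A", 76.0, 3e-152),
    ("2B", 76.0, 1e-157),
    ("2D", 76.0, 8e-153),
]


def split_by_identity(hits, threshold=90.0):
    """Split hits into best matches and secondary matches.

    The threshold is an illustrative assumption, not a value from the study.
    """
    best = [chrom for chrom, pct, _ in hits if pct >= threshold]
    secondary = [chrom for chrom, pct, _ in hits if pct < threshold]
    return best, secondary


best, secondary = split_by_identity(HITS)
```

With the reported values, the chromosome 1 hits fall in the first group (the TaSSIIIa copies) and the chromosome 2 hits in the second (the TaSSIIIb copies).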
Three full-length homoeologous copies of the TaSSIIIa gene, 12,168 bp, 12,839 bp and 10,529 bp long, were found on the forward (+) strand of chromosomes 1A, 1B and 1D at positions 84,002,521–84,014,688, 141,283,445–141,296,283 and 87,756,557–87,767,085, respectively (Table 1). Similarly, three homoeologous copies of the TaSSIIIb gene, 8,244 bp, 8,440 bp and 9,194 bp long, were identified on the reverse (−) strand of chromosomes 2A, 2B and 2D at positions 712,573,140–712,581,384, 690,026,216–690,034,656 and 574,104,017–574,113,211, respectively. Detailed descriptions of the genome-specific TaSSIIIa and TaSSIIIb genes are given in Table 1.
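As a consistency check, the span of each copy can be recomputed from its reported start and end positions. The sketch below assumes 1-based, inclusive coordinates (the text does not state the convention), and the names `COPIES`, `inclusive_length` and `size_differences` are illustrative only. Under that assumption the three TaSSIIIa spans reproduce the reported sizes exactly, while the three TaSSIIIb spans come out 1 bp longer than reported, which suggests those sizes were computed as end minus start.

```python
# Reported values: (start, end, strand, reported size in bp).
COPIES = {
    "TaSSIIIa1A": (84_002_521, 84_014_688, "+", 12_168),
    "TaSSIIIa1B": (141_283_445, 141_296_283, "+", 12_839),
    "TaSSIIIa1D": (87_756_557, 87_767_085, "+", 10_529),
    "TaSSIIIb2A": (712_573_140, 712_581_384, "-", 8_244),
    "TaSSIIIb2B": (690_026_216, 690_034_656, "-", 8_440),
    "TaSSIIIb2D": (574_104_017, 574_113_211, "-", 9_194),
}


def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


def size_differences(copies=COPIES):
    """Recomputed span minus reported size, per gene copy."""
    return {
        name: inclusive_length(start, end) - reported
        for name, (start, end, _strand, reported) in copies.items()
    }


if __name__ == "__main__":
    for name, diff in size_differences().items():
        print(f"{name}: recomputed - reported = {diff} bp")
```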
Comparative analysis of genome-specific copies of TaSSIII gene
Comparison of the exons and introns of the genome-specific copies of the TaSSIII gene on the A, B and D genomes showed that the gene has sixteen exons. Of these, the third exon was the largest and the second the smallest. Most exons were conserved in length. The exceptions were exon 1 and exon 3, which varied the most among the studied genome-specific copies of the TaSSIII gene of wheat (Triticum aestivum), and exon 4, which was 218 bp in all three chromosome 1 copies and 215 bp in all three chromosome 2 copies. The length variation in exons 1 and 3 was due to insertions and deletions in the different copies of the gene (Fig. 2).
Fifteen introns were identified in all copies of the TaSSIII gene; introns 1, 5 and 8 were the largest, and introns 6, 13, 14 and 15 the smallest. The average total intron length across the copies (TaSSIIIa1A, TaSSIIIa1B, TaSSIIIa1D, TaSSIIIb2A, TaSSIIIb2B and TaSSIIIb2D) was slightly higher than the coding sequence length. Intron lengths varied more among the copies than exon lengths. TaSSIIIa was in the forward orientation (+ strand) in each copy on chromosome 1, whereas TaSSIIIb was in the reverse orientation (− strand) in each copy on chromosome 2 (Table 2).
Table 1
Detailed description of Genome specific TaSSIII gene
Genome-specific SSIII gene | TaSSIIIa1A | TaSSIIIa1B | TaSSIIIa1D | TaSSIIIb2A | TaSSIIIb2B | TaSSIIIb2D
Gene ID | TraesCS1A02G091500 | TraesCS1B02G119300 | TraesCS1D02G100100 | TraesCS2A02G46800 | TraesCS2B02G491700 | TraesCS2D02G468900
Chromosome no. | 1A | 1B | 1D | 2A | 2B | 2D
Gene orientation strand | Forward strand | Forward strand | Forward strand | Reverse strand | Reverse strand | Reverse strand
Location | 84,002,521–84,014,688 | 141,283,445–141,296,283 | 87,756,557–87,767,085 | 712,573,140–712,581,384 | 690,026,216–690,034,656 | 574,104,017–574,113,211
Size of gene (bp) | 12168 | 12839 | 10529 | 8244 | 8440 | 9194
Orthologues | 87 | 87 | 87 | 86 | 86 | 86
Paralogues | 38 | 32 | 37 | 38 | 32 | 37
Transcripts (splice variants) | 1 | 3 | 3 | 3 | 2 | 2
Transcript-level details (one column per splice variant)
Gene | TaSSIIIa1A | TaSSIIIa1B | TaSSIIIa1B | TaSSIIIa1B | TaSSIIIa1D | TaSSIIIa1D | TaSSIIIa1D | TaSSIIIb2A | TaSSIIIb2A | TaSSIIIb2A | TaSSIIIb2B | TaSSIIIb2B | TaSSIIIb2D | TaSSIIIb2D
Transcript no. | 1 | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 1 | 2
cDNA (bp) | 5505 | 5637 | 5466 | 3552 | 5348 | 5322 | 5493 | 4568 | 4436 | 4514 | 4460 | 4514 | 4561 | 4644
Exons | 16 | 16 | 15 | 5 | 16 | 15 | 16 | 16 | 17 | 16 | 16 | 16 | 16 | 17
Introns | 15 | 15 | 14 | 4 | 15 | 14 | 15 | 15 | 16 | 15 | 15 | 15 | 15 | 16
Protein (amino acids) | 1629 | 1610 | 1553 | 1183 | 1613 | 1554 | 1611 | 1307 | 1263 | 1289 | 1098 | 1276 | 1332 | 1403
Domain & Features | 29 | 25 | 25 | 16 | 24 | 26 | 25 | 33 | 30 | 31 | 25 | 27 | 33 | 34
Variant allele | 771 | 741 | 741 | 410 | 468 | 468 | 468 | 521 | 521 | 521 | 609 | 609 | 325 | 326
Oligo probes | 22 | 21 | 22 | 12 | 17 | 17 | 28 | 8 | 8 | 8 | 14 | 16 | 21 | 18
|
Table 2
Comparative lengths of exons and introns in base pairs (bp) among genome-specific copies of TaSSIII gene
Region | Gene | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16
Exon | TaSS-IIIa1A | 99 | 64 | 2757 | 218 | 271 | 176 | 108 | 110 | 103 | 171 | 129 | 183 | 132 | 112 | 124 | 133
Exon | TaSS-IIIa1B | 96 | 64 | 2703 | 218 | 271 | 176 | 108 | 110 | 103 | 171 | 129 | 183 | 132 | 112 | 124 | 133
Exon | TaSS-IIIa1D | 105 | 64 | 2703 | 218 | 271 | 176 | 108 | 110 | 103 | 171 | 129 | 183 | 132 | 112 | 124 | 133
Exon | TaSS-IIIb2A | 90 | 64 | 1803 | 215 | 271 | 176 | 108 | 110 | 103 | 171 | 129 | 183 | 132 | 112 | 124 | 133
Exon | TaSS-IIIb2B | 102 | 64 | 1698 | 215 | 271 | 176 | 108 | 110 | 103 | 171 | 129 | 183 | 132 | 112 | 124 | 133
Exon | TaSS-IIIb2D | 102 | 64 | 1866 | 215 | 271 | 176 | 108 | 110 | 103 | 171 | 129 | 183 | 132 | 112 | 124 | 133
Intron | TaSS-IIIa1A | 2523 | 468 | 259 | 360 | 771 | 113 | 249 | 899 | 424 | 125 | 88 | 118 | 96 | 89 | 81 |
Intron | TaSS-IIIa1B | 2906 | 471 | 259 | 361 | 762 | 113 | 249 | 895 | 554 | 125 | 88 | 152 | 99 | 87 | 81 |
Intron | TaSS-IIIa1D | 1228 | 468 | 258 | 337 | 774 | 114 | 246 | 895 | 264 | 125 | 88 | 117 | 96 | 90 | 81 |
Intron | TaSS-IIIb2A | 438 | 88 | 259 | 134 | 454 | 75 | 431 | 591 | 427 | 106 | 135 | 250 | 83 | 88 | 118 |
Intron | TaSS-IIIb2B | 391 | 88 | 265 | 135 | 910 | 75 | 527 | 694 | 190 | 105 | 124 | 127 | 91 | 88 | 117 |
Intron | TaSS-IIIb2D | 395 | 88 | 259 | 134 | 447 | 75 | 525 | 720 | 205 | 105 | 124 | 123 | 83 | 89 | 118 |
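The comparison of total intron and exon lengths can be reproduced from the values in Table 2. The sketch below sums the tabulated lengths per gene copy and compares the averages. It treats the summed exon length as a stand-in for coding sequence length, which is an assumption, and the names `EXONS`, `INTRONS` and `totals` are illustrative only.

```python
from statistics import mean

# Exon lengths (bp), exons 1-16, copied from Table 2.
EXONS = {
    "TaSS-IIIa1A": [99, 64, 2757, 218, 271, 176, 108, 110, 103, 171, 129, 183, 132, 112, 124, 133],
    "TaSS-IIIa1B": [96, 64, 2703, 218, 271, 176, 108, 110, 103, 171, 129, 183, 132, 112, 124, 133],
    "TaSS-IIIa1D": [105, 64, 2703, 218, 271, 176, 108, 110, 103, 171, 129, 183, 132, 112, 124, 133],
    "TaSS-IIIb2A": [90, 64, 1803, 215, 271, 176, 108, 110, 103, 171, 129, 183, 132, 112, 124, 133],
    "TaSS-IIIb2B": [102, 64, 1698, 215, 271, 176, 108, 110, 103, 171, 129, 183, 132, 112, 124, 133],
    "TaSS-IIIb2D": [102, 64, 1866, 215, 271, 176, 108, 110, 103, 171, 129, 183, 132, 112, 124, 133],
}

# Intron lengths (bp), introns 1-15, copied from Table 2.
INTRONS = {
    "TaSS-IIIa1A": [2523, 468, 259, 360, 771, 113, 249, 899, 424, 125, 88, 118, 96, 89, 81],
    "TaSS-IIIa1B": [2906, 471, 259, 361, 762, 113, 249, 895, 554, 125, 88, 152, 99, 87, 81],
    "TaSS-IIIa1D": [1228, 468, 258, 337, 774, 114, 246, 895, 264, 125, 88, 117, 96, 90, 81],
    "TaSS-IIIb2A": [438, 88, 259, 134, 454, 75, 431, 591, 427, 106, 135, 250, 83, 88, 118],
    "TaSS-IIIb2B": [391, 88, 265, 135, 910, 75, 527, 694, 190, 105, 124, 127, 91, 88, 117],
    "TaSS-IIIb2D": [395, 88, 259, 134, 447, 75, 525, 720, 205, 105, 124, 123, 83, 89, 118],
}


def totals(table):
    """Sum the tabulated lengths for each gene copy."""
    return {gene: sum(lengths) for gene, lengths in table.items()}


exon_totals = totals(EXONS)
intron_totals = totals(INTRONS)

if __name__ == "__main__":
    for gene in EXONS:
        print(gene, "exons:", exon_totals[gene], "introns:", intron_totals[gene])
    print("mean exon total:", mean(exon_totals.values()))
    print("mean intron total:", mean(intron_totals.values()))
```

Averaged over the six copies, the summed intron length exceeds the summed exon length, although this does not hold for every individual copy.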
Phylogenetic analysis of the TaSSIII genes present in different species
Molecular phylogenetic analysis was carried out by the Maximum Likelihood method. The reference protein sequences used for each species were: TaSSIIIa1A, TaSSIIIa1B, TaSSIIIa1D, TaSSIIIb2A, TaSSIIIb2B, TaSSIIIb2D (Triticum aestivum), AtSSIII (#NM-001198036), BdSSIII (XM-010236006), HvSSIII (#JN256948), SbSSIIIa & SbSSIIIb (#XM-021465270 & #EU620721), GmSSIII (#XM-003541570), ZmSSIIIa (dull1) & ZmSSIIIb (#AF023159 & #EF472250), OsSSIIIa & OsSSIIIb (#XM-015795185 & #XM-015780729), StSSIII (#NM-001287873), and VuSSIII (#AJ225088) (Fig. 3).
For the phylogenetic analysis, the genome-specific copies of the TaSSIII gene on chromosome 1 of the three genomes were named TaSSIIIa1A, TaSSIIIa1B and TaSSIIIa1D, and those on chromosome 2 were named TaSSIIIb2A, TaSSIIIb2B and TaSSIIIb2D, because the analysis indicated that they are genetically close to the reported SSIIIa and SSIIIb copies of other species, respectively. SSIII protein sequences retrieved from 4 dicot and 6 monocot crop species were subjected to the analysis. The resulting phylogenetic tree (dendrogram) depicts the genetic similarity of the SSIII genes of these species to the genome-specific SSIII genes of wheat (TaSSIII). The SSIII genes of the different species grouped into evolutionary clusters according to their similarity. The SSIII cluster diverges into two sub-clusters, corresponding to the divergence of monocots and dicots, and the monocot SSIII further diverges into two forms, SSIIIa and SSIIIb. The two variants found in wheat, TaSSIIIa and TaSSIIIb, were accordingly separated into two evolutionary groups.
Within the SSIIIa group, TaSSIIIa1A and TaSSIIIa1D were more similar to each other than to TaSSIIIa1B. Among the monocots under study, maize (ZmSSIIIa) and sorghum (SbSSIIIa), which were closely related to each other, diverged first from the rest of the group, followed by rice (OsSSIIIa). Brachypodium (BdSSIIIa) was the next to diverge from barley and wheat, and barley HvSSIIIa was the last to diverge from the three copies of wheat TaSSIIIa. Within the SSIIIb group, SSIIIb of maize and sorghum were closest to each other and were the first to diverge from the rest of the monocots under study; rice OsSSIIIb was the next to diverge from wheat TaSSIIIb. TaSSIIIb2A was closer to TaSSIIIb2D than to TaSSIIIb2B.
The SSIII genes of wheat and barley diverged from that of Brachypodium, whereas the SSIII gene of rice diverged from those of maize and sorghum, which were genetically closer to each other. The greatest genetic dissimilarity in the tree was between monocot and dicot crops. GmSSIII of soybean was closest to VuSSIII of cowpea, whereas StSSIII of potato was close to AtSSIII of mustard. The maximum divergence was observed between TaSSIIIa1A of wheat and AtSSIII of mustard.
Domain analysis of genome-specific TaSSIII gene
Starch synthase III (SSIII) is one of the enzymes involved in plant starch biosynthesis. It contains a putative N-terminal transit peptide followed by an SSIII-specific domain (SSIII-SD) with three internal tandem repeats of starch binding domains (SBD) and a C-terminal catalytic domain (Table 3; Fig. 4). The homoeologs of both copies of the SSIII gene showed variation. Among TaSSIII1A, TaSSIII1B and TaSSIII1D, the differences amounted to 1 residue in starch binding domain 1 (SBD-1), 4 residues in SBD-2, 2 residues in SBD-3, and 1 residue in each of the starch synthase catalytic domain (SS-CD) and the glycosyltransferase (GT) domain. Among TaSSIIIb2A, TaSSIIIb2B and TaSSIIIb2D, the differences amounted to 4 residues in SBD-1, 3 residues in each of SBD-2 and SBD-3, and 4 residues in SS-CD; no variation was found among these homoeologs in the GT domain. The amino acid residues known to be involved in the SBD, SS-CD and GT domains were well conserved. The GT domain was the most conserved of all domains across all copies of the TaSSIII gene and their homoeologs.
3D protein structure & functional analysis of the catalytic domain of TaSSIII
Motifs were predicted using Prosite, and a comprehensive list of all predicted motifs is shown in Fig. 5. Two N-glycosylation motifs (PS00001), five PKC-Phospho motifs (PS00005), eleven CK2-Phospho motifs (PS00006) and eight N-Myristyl motifs (PS00008) were predicted in the domains of the TaSSIIIa protein, whereas eight N-glycosylation motifs (PS00001), three PKC-Phospho motifs (PS00005), fourteen CK2-Phospho motifs (PS00006) and eight N-Myristyl motifs (PS00008) were predicted in the domains of the TaSSIIIb protein. In addition, a TYR-Phospho motif (PS00007) and an Amidation motif (PS00009) were found in TaSSIIIa and TaSSIIIb, respectively. Starch synthase belongs to the GT1 family of the CAZy (Carbohydrate-Active enZymes) classification and includes two evolutionarily conserved putative ADP-glucose binding motifs, KVGGL and KTGGL, found at the N-terminus of the SS-CD domain and the C-terminus of the GT-1 domain, respectively. In all homoeologs of TaSSIIIa, KVGGL and KTGGL were located, on average, between residues 1175 and 1547, and in TaSSIIIb between residues 841 and 1250. Both ADP-glucose binding motifs were well conserved in all species. In addition, two new conserved motifs forming ADP binding pockets, ITRLT and FEPCGLT, were found at the C-terminus of the catalytic domains of both TaSSIIIa and TaSSIIIb and were well conserved among all species. FEPCGLT was strictly conserved among all species, whereas ITRLT showed conservative substitutions of valine (V) to isoleucine (I) and threonine (T) to serine (S) among species (Fig. 5C and 5D).
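Locating the conserved motifs named above in a protein sequence amounts to a simple exact-match search. The following is a minimal sketch, not the Prosite procedure used in the study; the function name `find_motifs` is illustrative, and the toy sequence is made up for the example and is not a TaSSIII sequence.

```python
import re

# Motifs named in the text (exact-match search only).
MOTIFS = ("KVGGL", "KTGGL", "ITRLT", "FEPCGLT")


def find_motifs(protein: str, motifs=MOTIFS) -> dict:
    """Return the 1-based start positions of every exact motif occurrence."""
    seq = protein.upper()
    return {
        motif: [hit.start() + 1 for hit in re.finditer(f"(?={re.escape(motif)})", seq)]
        for motif in motifs
    }


# Toy sequence invented for illustration; NOT a real TaSSIII sequence.
toy = "MAAKVGGLGDVVTSAAITRLTAAFEPCGLTQLAAKTGGLVDTV"
positions = find_motifs(toy)
```

The lookahead pattern reports overlapping occurrences as well. Variants with conservative substitutions, such as those described for ITRLT, would need a pattern with alternative residues rather than an exact string.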
Ligand binding analysis predicted multiple ligand binding sites in the catalytic portion (starch synthase forms part of the GT-1 domain) of all six homoeologs of the TaSSIII protein. Five predicted amino acid clusters were found in TaSSIIIa1A, three each in TaSSIIIa1B and TaSSIIIa1D, four each in TaSSIIIb2A and TaSSIIIb2B, and six in TaSSIIIb2D. A total of nine heterogens (ligands) were found for the binding sites: ADP (adenosine diphosphate), G6P (glucose-6-phosphate), GLC (α-D-glucose), PLP (pyridoxal-5-phosphate), AMP (adenosine monophosphate), G1P (glucose-1-phosphate), F1P (fructose-1-phosphate), CFF (caffeine) and MG (magnesium). Cluster I had the largest number of ligands in each homoeologous copy. The maximum was predicted in TaSSIIIa1B, with 50 heterogen ligands, and the minimum in TaSSIIIb2B, with 29. The 50 ligands predicted in cluster I of TaSSIIIa1B comprised 7 ADP, 1 G6P, 1 G1P, 14 PLP and 27 GLC, and eight amino acids (ASP, HIS, LYS, GLU, CYS, GLY, LEU and THR) are involved in the binding site. The 29 ligands predicted in cluster I of TaSSIIIb2B comprised 7 ADP, 1 G6P, 12 PLP and 9 GLC, and eleven amino acids (ASP, ARG, 3 LEU, 2 CYS, GLU, PRO, GLY and THR) are involved in the binding site. Information for the remaining homoeologous copies of TaSSIII is presented in Supplementary Table 1. To confirm and validate the ligand sites in the active site of the enzyme, glucose was selected from the ligands for molecular docking with TaSSIIIa1B and TaSSIIIb2B. The docking placed glucose at the same binding site as predicted by the 3D ligand site software (Fig. 6A and 6B).
Table 3
Position of protein domains on TaSSIII gene
Domain (residue positions) | TaSSIII1A (1629 AAs) | TaSSIII1B (1610 AAs) | TaSSIII1D (1611 AAs) | TaSSIII2A (1307 AAs) | TaSSIII2B (1276 AAs) | TaSSIII2D (1332 AAs)
Starch binding domain (SBD-1) | 745–830 | 726–811 | 727–812 | 424–509 | 393–478 | 449–534
Starch binding domain (SBD-2) | 920–1011 | 901–992 | 902–993 | 599–690 | 568–659 | 624–715
Starch binding domain (SBD-3) | 1083–1172 | 1064–1153 | 1065–1154 | 761–850 | 730–819 | 786–875
Starch synthase, catalytic domain (SS-CD) | 1182–1369 | 1163–1350 | 1164–1351 | 860–1047 | 829–1015 | 885–1072
Glycosyl transferase (GT) | 1428–1551 | 1409–1532 | 1410–1533 | 1106–1228 | 1075–1197 | 1131–1253
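The domain boundaries in Table 3 can be turned into domain lengths to see how well each domain is conserved in size across the six proteins. The sketch below assumes the positions are inclusive residue numbers; the names `DOMAINS`, `PROTEINS` and `domain_lengths` are illustrative only.

```python
PROTEINS = ("TaSSIII1A", "TaSSIII1B", "TaSSIII1D", "TaSSIII2A", "TaSSIII2B", "TaSSIII2D")

# (start, end) residue positions copied from Table 3, in PROTEINS order.
DOMAINS = {
    "SBD-1": [(745, 830), (726, 811), (727, 812), (424, 509), (393, 478), (449, 534)],
    "SBD-2": [(920, 1011), (901, 992), (902, 993), (599, 690), (568, 659), (624, 715)],
    "SBD-3": [(1083, 1172), (1064, 1153), (1065, 1154), (761, 850), (730, 819), (786, 875)],
    "SS-CD": [(1182, 1369), (1163, 1350), (1164, 1351), (860, 1047), (829, 1015), (885, 1072)],
    "GT": [(1428, 1551), (1409, 1532), (1410, 1533), (1106, 1228), (1075, 1197), (1131, 1253)],
}


def domain_lengths(domains=DOMAINS, proteins=PROTEINS):
    """Length in residues of each domain in each protein (inclusive positions assumed)."""
    return {
        name: {prot: end - start + 1 for prot, (start, end) in zip(proteins, spans)}
        for name, spans in domains.items()
    }


lengths = domain_lengths()

if __name__ == "__main__":
    for name, per_protein in lengths.items():
        print(name, per_protein)
```

Under this assumption the three starch binding domains have identical lengths in all six proteins (86, 92 and 90 residues), while SS-CD and GT differ by a single residue between some copies.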
|
Expression analysis from RNA-seq data
The expression profiles of the two homologous genes, TaSSIIIa and TaSSIIIb, varied across tissues at the vegetative and reproductive stages. The expression level of TaSSIIIb was much higher than that of TaSSIIIa. Both genes were expressed constitutively in all tissues at relatively low levels. Tissue-specific expression of TaSSIIIa (Fig. 7A) was highest in grain at Z75, 14 DAA (12 FPKM), and in spike at Z39 (5.7 FPKM); its expression in root and leaves was insignificant. Tissue-specific expression of TaSSIIIb (Fig. 7B) was highest in leaf at Z23, the three-tiller stage (20.3 FPKM), in spike at Z39 (19.4 FPKM) and in grain at Z71, 2 DAA (23.4 FPKM). Expression of TaSSIIIb in leaf increased after Z10 and stayed constant (~20 FPKM) until anthesis. Expression in spike also varied little from Z32 to Z65, with a combined expression of all homoeologous transcripts of 32–39.2 FPKM. The transcript of TaSSIIIa-1A was expressed at a much higher level than those of its homoeologs. In contrast, among the TaSSIIIb homoeologs, expression of TaSSIIIb-2B was highest, followed by TaSSIIIb-2D and TaSSIIIb-2A.
Effect of heat stress on the expression of the genes in pot and field grown genotypes
Pot experiment:
In the current study, the expression patterns of the starch synthase III genes TaSSIIIa1D and TaSSIIIb2D in the flag leaf tissue of six pot-grown wheat genotypes were examined by quantitative expression analysis under control and heat stress conditions. Heat stress-induced up-regulation of the starch synthase III genes was observed in the flag leaves of all six genotypes, but the increase in transcript level varied significantly across genotypes. In IC252874, the transcript levels of TaSSIIIa1D and TaSSIIIb2D increased 14.6-fold and 9.8-fold, respectively, under heat stress; this up-regulation was significantly higher than in the other five genotypes analyzed. The transcript level of TaSSIIIb2D increased 5.8-fold in RAUWB-7 under stress, which was significantly lower than in IC252874 but significantly higher than in the other four genotypes (PBW 343, DBW 187, DH5 167 and HD 2967) (Fig. 10).
Field experiment:
In the pot experiment, the genotypes IC252874 and HD-2967 showed contrasting patterns of starch synthase III gene expression. A field study was therefore conducted with these two check genotypes to confirm the consistency of the gene expression. The quantitative expression of TaSSIIIa1D and TaSSIIIb2D was examined in the flag leaf and peduncle tissues of both genotypes under normal-sown (control) and late-sown (heat stress) conditions. As in the pot experiment, heat stress induced up-regulation of the starch synthase III genes in the flag leaf and peduncle tissues of both genotypes, but the up-regulation of TaSSIIIa1D and TaSSIIIb2D was significantly higher in IC252874 than in HD-2967 in both tissues. Under heat stress, the transcript level of TaSSIIIa1D increased 7.5-fold and 14-fold in the flag leaf and peduncle of IC252874, whereas HD-2967 showed only 0.4-fold and 2.1-fold increases in the same tissues, respectively. The transcript level of TaSSIIIb2D increased 6.6-fold and 6.3-fold in the flag leaf and peduncle of IC252874, whereas HD-2967 showed only 2.0-fold and 1.2-fold increases, respectively. Under heat stress, the change in TaSSIIIa1D transcript level was significantly higher in peduncle than in flag leaf, whereas the change in TaSSIIIb2D transcript level was numerically higher in flag leaf than in peduncle in both genotypes (Fig. 11).
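The text does not state how the fold changes were calculated. One widely used convention for quantitative PCR data is the 2^-ΔΔCt method, sketched below purely as an assumption; the function name `fold_change` and the Ct values are hypothetical and are not data from this study.

```python
def fold_change(ct_target_stress, ct_ref_stress, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (assumed, not confirmed by the text).

    Each delta-Ct is the target gene Ct minus the reference gene Ct in the
    same sample; ddCt is the stress delta-Ct minus the control delta-Ct.
    """
    delta_stress = ct_target_stress - ct_ref_stress
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_stress - delta_control)


# Hypothetical Ct values for illustration only.
example = fold_change(
    ct_target_stress=22.0,
    ct_ref_stress=18.0,
    ct_target_control=24.0,
    ct_ref_control=18.0,
)
```

Under this convention a value of 1 means no change relative to the control, values above 1 indicate up-regulation, and values below 1 indicate down-regulation.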
Among the grain layers, expression of TaSSIIIb was highest in the inner pericarp (~1.8 RPKM), followed by the outer pericarp and endosperm (Fig. 8B), whereas expression of TaSSIIIa was highest in the endosperm (~24.02 RPKM), followed by the inner pericarp (~17.97 RPKM) and outer pericarp (~4.3 RPKM) (Fig. 8A). Of all homoeologous transcripts of TaSSIIIa, TaSSIIIa1A contributed more than 50% of the expression during development of the grain layers. The combined expression of TaSSIIIa was highest at 10 DPA (~125.4 FRKM) and decreased rapidly at 20 DPA (~47.4 FRKM) and 30 DPA (~31.1 FRKM). In contrast, none of the homoeologous copies of TaSSIIIb except TaSSIIIb2B contributed significantly during development of the grain layers, and expression of TaSSIIIb remained constant throughout grain layer development.
Expression of the TaSSIII gene was down-regulated under both drought and heat stress, after both 1 and 6 hours of stress. This indicates that the TaSSIII genes are negatively regulated by heat and drought stress (Fig. 9).