Isolation of TaGS3 genes in wheat
In wheat, three TaGS3 copies, located on 4AL (TraesCS4A02G474000), 7AS (TraesCS7A02G017700) and 7DS (TraesCS7D02G015000), were identified by searching the genome database. Primers were designed from regions specific to each sequence, and PCR products of 2445 bp, 2393 bp and 2409 bp were obtained for TaGS3-4A (GenBank accession: KY888174), TaGS3-7A (GenBank accession: KY888186) and TaGS3-7D (GenBank accession: KY888197), respectively, from Changzhi6406 (in P-HGW). The predicted coding sequences of the three genes were 513 bp, 510 bp and 510 bp long, encoding putative proteins of 170, 169 and 169 amino acids, respectively. TaGS3-4A, TaGS3-7A and TaGS3-7D showed an exon-intron structure similar to that of OsGS3, consisting of five exons and four introns (Figure 1). The coding regions of TaGS3-4A, TaGS3-7A and TaGS3-7D shared 45.92%, 43.94% and 44.87% similarity with that of OsGS3, respectively, and the deduced amino acid sequences shared 45.26%, 44.83% and 42.24% similarity, respectively.
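The relationship between the coding-sequence lengths and the protein lengths reported above can be checked with simple arithmetic. The sketch below assumes that each predicted coding sequence includes a terminal stop codon (the text does not state this explicitly); the function name `protein_length` is illustrative only.

```python
def protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp base pairs."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_bp // 3
    # Assumption: the stop codon is counted in the CDS length but encodes no residue.
    return codons - 1 if includes_stop else codons


# CDS lengths (bp) as reported in the text.
cds_lengths = {"TaGS3-4A": 513, "TaGS3-7A": 510, "TaGS3-7D": 510}
protein_lengths = {gene: protein_length(bp) for gene, bp in cds_lengths.items()}

for gene, n_aa in protein_lengths.items():
    print(f"{gene}: {cds_lengths[gene]} bp -> {n_aa} amino acids")
```

Under this assumption the computed lengths agree with the 170, 169 and 169 amino acids given in the text.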
Phylogenetic analysis of GS3 genes in plants
The phylogenetic tree included 14 deduced amino acid sequences of GS3 homologs obtained from Aegilops tauschii, Hordeum vulgare, Panicum hallii, Setaria italica, Triticum aestivum, Zea mays, Triticum urartu, Sorghum bicolor, Glycine max and Arabidopsis thaliana. In addition, DEP1, another reported gene with an atypical Gγ domain, was used as the outgroup control. The tree separated these sequences into two main groups, GS3 orthologs and DEP1 orthologs (Figure 2a). The GS3 orthologous group contained ten sequences, all derived from monocotyledons, whereas the DEP1 group comprised sequences from both monocotyledons and dicotyledons.
Within the GS3 orthologous group, the relationships were congruent with the species phylogeny, except that the GS3 protein encoded by GS3-7A in Triticum aestivum was more distantly related to that of Triticum urartu, the progenitor of the wheat A sub-genome, than to GS3 of Hordeum vulgare. The conserved domains shared 85.14% similarity overall and were of similar length (60-66 amino acids), except for that of Triticum urartu (46 amino acids), whereas the length of the cysteine-rich region varied and fell into three classes. Class I consisted of Triticeae species, with a conserved length of 92 to 94 amino acids (85 in Triticum urartu, as an exception); Class II comprised Oryza species, with the longest region, of 150 amino acids; Class III consisted of Panicoideae species, also with a conserved length, of 111-116 amino acids. Comparing the conserved atypical Gγ subunit domain among the GS3 orthologous sequences, we found that they all possessed the identical amino acid fragment “PRP--RLQLAVDALHR--FLEGEI” (“--” represents a non-conserved region; the number of “-” characters does not correspond to the number of amino acids).
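The conserved fragment described above, three fixed blocks separated by non-conserved stretches of variable length, can be expressed as a regular expression. The following is a minimal sketch, not the procedure used in this study; the helper `find_motif` and the toy sequence are hypothetical, and the toy sequence is not a real GS3 protein.

```python
import re

# Conserved blocks taken from the text; each "--" gap is modelled as one or
# more residues of any kind (an assumption about the gap length).
MOTIF = re.compile(r"PRP(.+?)RLQLAVDALHR(.+?)FLEGEI")


def find_motif(protein: str):
    """Return (start, end, gap1_length, gap2_length) of the first match, or None."""
    match = MOTIF.search(protein)
    if match is None:
        return None
    return match.start(), match.end(), len(match.group(1)), len(match.group(2))


# Made-up toy sequence for illustration only (NOT a real GS3 protein).
toy_protein = "MAAA" + "PRP" + "SPAG" + "RLQLAVDALHR" + "EIS" + "FLEGEI" + "CCC"
hit = find_motif(toy_protein)
print(hit)
```

Applied to aligned or unaligned ortholog sequences, a search of this kind reports where the fragment lies and how long each non-conserved stretch is in a given species.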
Polymorphism of TaGS3-4A and TaGS3-7A among wheat accessions
Ten sequences each were obtained for TaGS3-4A and TaGS3-7A from the wheat accessions. To avoid false variants introduced by sequencing errors, only variants with a minor allele frequency >0.1 (that is, present at least twice among the ten sequences) were retained, giving 17 and 18 variants for TaGS3-4A and TaGS3-7A, respectively. For TaGS3-4A, there were 3, 1, 6, 0, 3, 0, 0, 0, 0, 2 and 2 variants in the 5’ upstream region, 1st exon, 1st intron, 2nd exon, 2nd intron, 3rd exon, 3rd intron, 4th exon, 4th intron, 5th exon and 3’ downstream region, respectively. Correspondingly, 4, 0, 2, 0, 4, 0, 0, 1, 3, 3 and 1 variants were found in the same regions of TaGS3-7A (Table 2 and Table 3). Among them, three variants causing amino acid changes were identified in each gene: in the 1st exon (1) and 5th exon (2) of TaGS3-4A, and in the 4th exon (1) and 5th exon (2) of TaGS3-7A. The variant (GC/AT) at position 70 of TaGS3-4A was located in the atypical Gγ subunit domain, but the GC allele accounted for 80% and 60% of sequences in the heavy and light pools, respectively, indicating that this variant was not associated with grain weight. In contrast, the frequency of allele A at position 1907 (amino acid change: Ala/Thr) was 0.6 in the heavy grain weight pool, whereas it was absent from the light pool. Therefore, this allele was far more likely to be associated with grain weight in wheat.
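The minor allele frequency filter and the per-region variant totals described above can be illustrated as follows. This is a sketch under stated assumptions rather than the pipeline used here: the function names and the three example alignment columns are hypothetical, while the per-region counts are those reported in the text.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at one site (0.0 if monomorphic)."""
    ordered = Counter(alleles).most_common()
    if len(ordered) < 2:
        return 0.0
    return ordered[1][1] / len(alleles)


def passes_filter(alleles, threshold=0.1):
    """Keep a site only if its minor allele frequency exceeds the threshold."""
    return minor_allele_frequency(alleles) > threshold


# Hypothetical alignment columns across ten sequences (illustration only).
singleton_site = list("GGGGGGGGGA")    # minor allele seen once: treated as a possible error
doubleton_site = list("GGGGGGGGAA")    # minor allele seen twice: retained
monomorphic_site = list("GGGGGGGGGG")  # no variation

# Variant counts per region as reported in the text, ordered from the
# 5' upstream region through exons/introns 1-5 to the 3' downstream region.
region_counts = {
    "TaGS3-4A": [3, 1, 6, 0, 3, 0, 0, 0, 0, 2, 2],
    "TaGS3-7A": [4, 0, 2, 0, 4, 0, 0, 1, 3, 3, 1],
}
totals = {gene: sum(counts) for gene, counts in region_counts.items()}
print(totals)
```

With ten sequences, a strict threshold of 0.1 discards variants seen only once, which matches the "at least twice in ten sequences" criterion, and the regional counts sum to the 17 and 18 variants reported for the two genes.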
Molecular marker design and validation
Based on the SNP mining results, a KASP marker was designed for the SNP at position 1907 (A/G). Calls for the TaGS3-7A-A allele, representing the potential light grain weight group, clustered along the X-axis, whereas calls for the TaGS3-7A-G allele, representing the potential heavy grain weight group, clustered along the Y-axis (Figure 3). The clusters clearly distinguished the two alleles, and the marker was therefore used to validate the allelic effects on grain weight in the natural population.
The association between genotypes and grain traits was analyzed for this marker in 224 mini-core collection (MCC) germplasm accessions (Table S1). Based on the KASP SNP calls, significant differences between the TaGS3-7A-G and TaGS3-7A-A alleles were observed for kernel width, kernel weight and kernel thickness, whereas no significant difference was observed for kernel length (Table 3). In addition, the allele TaGS3-7A-A, which was associated with superior traits, occurred at a low frequency of 13.107%, significantly lower than that of TaGS3-7A-G. However, the frequency of TaGS3-7A-A was significantly higher in cultivars (18.085%) than in landraces (0.980%).