GAMYB gene in rye – sequence, polymorphisms, map location, allele-specific markers and relationship with selected agronomic traits

Background. GAMYB, a MYB transcription factor, is a master GA-induced regulatory protein that is crucial for the development and germination of cereal grain and is involved in anther formation. It activates a large number of genes, including the high-molecular-weight glutenin and α-amylase gene families. This paper presents the first attempt to characterize the rye gene encoding GAMYB with respect to its sequence, polymorphisms and phenotypic effects. Results. The ScGAMYB gene was identified and mapped on rye chromosome 3R using high-density DArT/DArTseq-based maps developed in two mapping populations. Comparative analysis of the gene sequence revealed a high level of homology to its wheat and barley orthologues. Single nucleotide polymorphisms detected among rye inbred lines allowed the development of AS-PCR markers for ScGAMYB (ten pairs of primers), which might be used to detect this gene in a wide range of rye and triticale genetic stocks. Segregation of ScGAMYB alleles showed a significant relationship with quantitative traits including plant height, thousand-grain weight, α-amylase activity, earliness per se and leaf rolling. Conclusions. The research showed strong similarity of the rye GAMYB sequence to its orthologues in other Gramineae and confirmed a genomic position consistent with the collinearity of cereal genomes. A statistically significant, although moderate, association of ScGAMYB with many agronomic traits was found, which indicates that this gene is a QTL of pleiotropic character. The effect of ScGAMYB on flowering time was statistically the most significant. The developed sequence-based, allele-specific PCR markers could be useful for research and application purposes.

binding to a highly conserved GA-responsive element (GARE, TAACAA/GA) in the promoter [1]. Constitutive expression of GAMYB in aleurone cells in the absence of GA is sufficient to activate the α-amylase promoter, whereas silencing or loss-of-function mutation of GAMYB blocks α-amylase activity in GA-treated aleurone cells [2,3]. Thus, GAMYB activity is indispensable for elevated expression of α-amylase genes in response to the gibberellin signal [4]. GAMYB is also involved in the production of storage proteins during grain development [5,6] and in the developmental mechanisms of anther formation [4,7].
GAMYB production in the aleurone layer is controlled by the quantitative ratio of gibberellins (GA) to abscisic acid (ABA). Genes encoding GAMYB are suppressed by the ABA signal transduced by the protein kinase PKABA1 [8] and by the GAMYB-binding protein KGM, a MAK-like kinase [9]. GA induces a rapid increase in HvGAMYB gene expression in barley aleurone layers through degradation of its repressor SLN1, a DELLA protein [4,10,11].
There is only one copy of the GAMYB gene per cereal genome, and it is located in a syntenic position on the homologous group 3 chromosomes of barley and wheat [12] and on the collinear rice chromosome 1 [13]. The sequences of Hv- and Ta-GAMYB comprise four exons and three introns and are differentiated into several haplotypes within wide germplasm collections of both species [12].
Functional polymorphisms in the GAMYB gene may be an important factor affecting variation in α-amylase activity (AA) and possibly other important traits of cereals. This possibility should be explored, since chromosome 3 of wheat, barley and rye was shown to contain a number of QTL for AA, preharvest sprouting (PHS) and plant height (PH) [14-19]. Until now, neither the ScGAMYB sequence, polymorphisms and map location nor its relationship with agronomic traits have been characterized in rye.
This paper reports on the sequence identification and mapping of the ScGAMYB gene and on the association of its polymorphisms with selected quantitative traits of rye.

Sequence of ScGAMYB
The 769 bp fragment of ScGAMYB amplified from DNA of the Ot1-3 and 541 parental lines showed 95-97% identity with orthologous genes of wheat and barley deposited in the NCBI database (Table 1). The E value between rye and wheat or barley sequences was 0.0.
primers allowing to generate allele-specific products (Table 2). AS-PCR markers uncovered polymorphisms not only between the parental 541/Ot1-3 lines but also between the S32N/RXL10 lines and within mapping populations. In total, 10 pairs of primers were designed (Table 2). All of them amplified stable and repeatable products specific to the alleles tested.
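The percent identity reported above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as `-`) and counts identity only over columns where neither sequence has a gap, which is one of several possible conventions and not necessarily the one used by the NCBI tools. The function name and the toy sequences are hypothetical and are not data from this study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment,
    so they have equal length and '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (a convention chosen for this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, invented sequences used only to exercise the function.
toy_rye = "ATGGCGTACGTTAGC-TA"
toy_wheat = "ATGGCGTACGATAGCCTA"
print(f"{percent_identity(toy_rye, toy_wheat):.2f}% identity")
```

Counting gap columns as mismatches instead would lower the value, so the convention should be stated whenever identity figures from different tools are compared.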
Geneious software, version 10.2.4, was used to align the fragment of the ScGAMYB sequence to the whole-genome shotgun sequence assembly of rye cultivar Lo7 [20]. This approach made it possible to identify the homologous gene in scaffold no. Sc170168, a DNA fragment containing the entire sequence of ScGAMYB and located on chromosome 3R at position 92.15706326 cM. The identity coefficients between the analyzed sequences and that found within the scaffold were 97.46% and 96.56% for the Ot1-3 and 541 lines, respectively. Additionally, bioinformatic analysis of the raw sequence data deposited in the Sequence Read Archive (SRA), in GenBank (NCBI), for the DS2, RXL10, M12 and L35 rye inbred lines made it possible to disclose the complete mRNA sequence of the ScGAMYB gene (accessions SRX2636904-SRX2636920). The alignment of the obtained rye GAMYB sequences gave a total gene length of 3,700 bp for the M12 and DS2 lines. Sequences for the RXL10 and L35 rye inbred lines were incomplete within exon 1. All sequences contained the entire coding sequence of ScGAMYB. Comparative analysis of these four sequences revealed SNPs in 22
The structure of the ScGAMYB gene was derived by comparing sequencing data reported in this paper with rye DNA sequences presented by Bauer et al. [20] and those deposited in
Table 3). While relationships with leaf rolling, amylase activity, grain number per spike and grain weight per spike were detected in one mapping population and in one year of study, the remaining traits showed significant relationships across years and populations (spike length, plant height) or across years of study (thousand-grain weight and flowering date).
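The comparison of several aligned sequences for SNPs, as described in this section, can be sketched as follows. This is a minimal illustration that assumes a multiple alignment is already available as equal-length strings. The function name, the line labels and the toy sequences are hypothetical, and the sketch does not reproduce the Geneious workflow used in the study.

```python
from typing import Dict, List, Tuple


def find_snp_columns(aligned: Dict[str, str]) -> List[Tuple[int, Dict[str, str]]]:
    """Return (0-based column, {name: base}) for every alignment column
    that shows at least two different unambiguous bases (A, C, G, T).

    Gaps and ambiguous symbols are ignored when deciding whether a
    column is polymorphic (an assumption made for this sketch).
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    snps = []
    for col in range(lengths.pop()):
        bases = {name: seq[col].upper() for name, seq in aligned.items()}
        observed = {b for b in bases.values() if b in "ACGT"}
        if len(observed) > 1:
            snps.append((col, bases))
    return snps


# Toy, invented alignment of four hypothetical inbred lines.
toy_alignment = {
    "line_a": "ATGCCGTA",
    "line_b": "ATGCCGTA",
    "line_c": "ATGTCGTA",
    "line_d": "ATGTCG-A",
}
for column, bases in find_snp_columns(toy_alignment):
    print(column, bases)
```

In the toy alignment only column 3 is reported, because the gap in `line_d` is not treated as an allele.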

Discussion
The sequence of ScGAMYB characterized in this paper shows high homology to TaGAMYB and HvGAMYB in wheat and barley [12]. The gene structure is also similar to that of the wheat and barley orthologues, with four exons and three introns, where the start codon and the functional MYB domain are located in exon 2. The ScGAMYB map position identified on the proximal part of the long arm of chromosome 3R is syntenic to that found in wheat and in barley [12]. The alignment of the ScGAMYB sequence to the whole-genome shotgun sequence assembly of rye cultivar Lo7 [20] confirmed this location. The gDNA scaffold no. Sc170168, containing the entire sequence of ScGAMYB, was also located on chromosome 3R, at position 92 cM [20].
Out of 22 polymorphisms (SNPs) found within the coding sequence, 5 affected the amino acid composition and secondary structure of the ScGAMYB protein. The level of polymorphism detected in rye was thus higher than that reported for wheat and barley within a much wider genetic material [12]. This is not surprising, since rye, as an outcrossing species, is more heterogeneous than self-pollinated cereals. Finding SNPs that affect the secondary structure of ScGAMYB offers an opportunity to develop functional markers for this key regulatory protein.
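The distinction drawn above between SNPs that change the amino acid and those that do not can be illustrated with a short sketch. It assumes the standard genetic code and an in-frame coding sequence that starts at position 0. The function name and the toy sequence are hypothetical, and predicting effects on secondary structure is outside the scope of the sketch.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds: str, pos: int, alt: str) -> Tuple[str, str, str]:
    """Classify a single-base substitution in a coding sequence.

    cds: in-frame coding sequence (assumed to start at codon position 1)
    pos: 0-based position of the SNP within cds
    alt: alternative base
    Returns (reference amino acid, alternative amino acid, label).
    """
    cds = cds.upper()
    codon_start = pos - pos % 3
    ref_codon = cds[codon_start:codon_start + 3]
    if pos < 0 or len(ref_codon) != 3:
        raise ValueError("position does not fall inside a complete codon")
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, label


# Toy, invented coding sequence: ATG GCT GAA (Met Ala Glu).
toy_cds = "ATGGCTGAA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC
print(classify_snp(toy_cds, 6, "A"))  # GAA -> AAA
```

Only nonsynonymous changes alter the protein sequence, which is why they are the natural starting point when looking for functional markers.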
ScGAMYB is located within a near-centromeric region of chromosome 3R where QTL for α-amylase activity were found in rye [17,19,21]. As expected, the analysis performed here indicated a significant relationship between ScGAMYB allelic segregation and α-amylase activity in grain. The relationship between ScGAMYB and α-amylase activity revealed in this paper is consistent with its function as a transcriptional activator of α-amylase structural genes in cereal grain. This molecular function of GAMYB may, however, be negatively affected by interactions with a number of regulatory proteins such as SLN1, Vp1 and PKABA1 [8,10,22]. Interference from two members of the WRKY family, i.e. ABF1 and ABF2, the zinc finger protein HRT, the MAK-like kinase KGM and the DOF transcription factor BPBP [4,9,23,24] may also reduce GAMYB's effectiveness in α-amylase induction. In spite of this complex regulatory network, ScGAMYB seems to be a candidate gene for at least partial control of α-amylase production in rye grain.
The relationship between the segregation of ScGAMYB alleles and variation in the other studied traits does not have such a straightforward explanation as that for α-amylase activity. However, knowing that GAMYB is active during grain development in promoting protein synthesis [5], its role in enhancing thousand-grain weight seems possible. A connection with flowering date and grain number per spike can also be found in the literature, since GAMYB's activity in the development of anthers and pollen grains has been established. Overexpression of GA-related genes often leads to male sterility and failure to set seed; for example, transgenic barley overexpressing HvGAMYB exhibits increased male sterility, which causes a loss of grain production [25]. Important signaling and/or response roles of GAMYB factors in flowering have been demonstrated, inter alia, for Arabidopsis thaliana [26] and Lolium temulentum [27]. Rice GAMYB is involved in almost all instances of GA-regulated gene expression in anthers [7].
Enhancement of plant, spike and leaf growth by GAMYB is also possible, at least as a pleiotropic effect of this potent regulatory gene. Our previous study [18] identified QTL for plant height, thousand-grain weight and awn length within the map interval containing ScGAMYB, which is consistent with the results presented here. Further study of ScGAMYB polymorphisms using a wider collection of genetic stocks should bring more information about the functions of this gene in various aspects of plant development.

Conclusions
Plant genomes have undergone significant reshaping during evolution, enabling each species to adapt to its ecological niche. Many changes can be detected within a family, a genus or even a single species. Sequence analyses of cereal genomes indicate unusually rapid evolution of intergenic regions, which has consequences for gene conservation.
The priority of cereal genomics should be to develop efficient tools for the isolation of agronomic genes in every important family [28]. Rye, belonging to the family Gramineae, genus Secale, has great research potential as a species with a much larger basic genome than other crops such as rice or even the more closely related barley and wheat. It can be used to analyze similarities and confirm orthology, as well as to test hypotheses for other species, especially with respect to gene function. It can also be a source of markers useful in the process of improving wheat and triticale cultivars, enabling breeding progress.

Genetic mapping
ScGAMYB was genetically mapped on the 541×Ot1-3 (RIL-K) and S32N/07×RXL10 (BSR-F2) high-density DArT-based maps developed by Milczarski et al. [33] and Myśków et al. (unpublished). The genetic map of the BSR-F2 population was constructed using Multipoint 3.2 software [34]. The group of DArTseq markers segregating in the combination "b, d" (the same as GAMYB segregation) was used to construct the genetic map. The "order" command was used for marker groups formed at a maximum recombination frequency threshold of 0.005. The "control of monotony" command was used to detect and remove problematic markers that caused neighborhood instabilities. Finally, the ordering was repeated. To reduce the inflation of genetic distances on a high-density genetic map, the average length of the consensus map [33] was used for scaling the obtained linkage groups, as previously described [35,36].

Figure 1
The structure of ScGAMYB gene in rye.

Figure 2
The changes in secondary structure of ScGAMYB resulting from SNPs.

Figure 3
Relationships between rye inbred lines and related species established based on GAMYB sequences using UPGMA method.

Figure 4
Location of the ScGAMYB gene on the chromosome 3R of populations BSR-F2 and RIL-K [33]. To integrate these two maps the map of RIL-S population [37] was used.
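The scaling of linkage groups to a consensus map length, mentioned under Genetic mapping, can be sketched as a simple proportional rescaling of marker positions. This is only an illustration under that assumption. The exact procedure is the one described in the cited references [35,36] and is not reproduced here, and the function name and the toy positions are hypothetical.

```python
from typing import List


def rescale_positions(positions_cm: List[float], target_length_cm: float) -> List[float]:
    """Proportionally rescale marker positions so that the linkage group
    spans exactly target_length_cm, starting at 0 cM.

    Marker order and relative spacing are preserved; only the overall
    length changes (a linear scaling assumed for this sketch).
    """
    if not positions_cm:
        return []
    start = min(positions_cm)
    span = max(positions_cm) - start
    if span == 0:
        # All markers co-segregate; nothing to stretch or shrink.
        return [0.0 for _ in positions_cm]
    factor = target_length_cm / span
    return [(p - start) * factor for p in positions_cm]


# Toy, invented positions of three markers on an inflated 200 cM group,
# rescaled to a hypothetical consensus length of 100 cM.
print(rescale_positions([0.0, 50.0, 200.0], 100.0))
```

Because the scaling is linear, the relative position of a gene between its flanking markers does not change, which keeps rescaled maps comparable across populations.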
