Characterization of Potential Candidate Genes for Grain Size in Wild Emmer Wheat Triticum dicoccoides


Food security in staple crops demands continual attention, and crop yield is positively correlated with grain weight. Grain size, determined by grain length and width, is an essential component of final grain weight in cereals. Wild relatives of wheat are a rich source of useful variation for traits of interest, including the component traits of grain size. To use these species effectively in wheat improvement, it is crucial to understand how grain size is formed and to identify the underlying genes that control it. In this study, gene expression analysis was performed on developing grains of the wild tetraploid progenitor Triticum dicoccoides (AABB) to identify candidate genes involved in determining grain size. Four T. dicoccoides accessions were selected: two (pau5228 and pau5322) with higher grain length and weight and two (pau14703 and pau14756) with comparatively smaller grains. Six of the eight genes selected for the expression study, viz., GL7, TaGL3, TaGS5, GS3, SRS3, and TaGASR7, were upregulated from 8 days post-anthesis (DPA) to 20 DPA in both large-grain accessions, while TaGW2 was upregulated in both small-grain accessions. TGW6 was downregulated in all accessions at all stages of grain development. The results indicate that the selected genes play an essential role in grain size formation by controlling the individual morphometric components of grain length and width. Targeted introgression of genes controlling grain size components will eventually aid in improving grain yield.


Introduction
Wheat is one of the major global cereal grains; it occupies 17% of the crop area worldwide and feeds approximately 30% of the world's population (Eversole et al. 2014). Demand for food production is rising with the continued growth of the population, which is expected to reach 9 billion by 2050, increasing the pressure on agricultural systems (FAO et al. 2015). Increasing crop yield is the only sustainable route towards meeting this demand. However, a decline in the rate of yield increase has been observed, which is a major bottleneck to achieving the expected doubling in crop production (Ray et al. 2013).
Grain yield in wheat is a complex trait controlled by several components, such as productive spike number per unit area, grain number per spike, grain size, and thousand grain weight (TGW). TGW is the most stable yield component in wheat and is positively correlated with crop yield, highlighting the need for studies of its genetic control to enhance breeding efficiency (Kuchel et al. 2007; Yu et al. 2019). TGW is mainly determined by individual grain size and the morphometric components of grain length, width, and area. During domestication, grain size was the primary target that was widely selected and manipulated to increase grain yield (Gegas et al. 2010).
In the last decade, functional genomics has increased the understanding of grain size. A number of genes associated with grain size and weight have been identified in model crops such as rice and Arabidopsis (Zuo and Li 2014). Several signaling pathways and diverse mechanisms that influence the growth of the endosperm and maternal tissue, and thereby determine seed size, have been reported. A number of these genes act in major metabolic pathways and are involved in regulating cell expansion and cell division.
Despite these advances, an understanding of the control of grain size is limited in wheat. However, some QTLs for grain weight and size have also been characterized in wheat (Breseghello and (Hanif et al. 2016) and TaGL3 ). Traits like grain length, grain width, and grain thickness are often positively correlated with grain size, and these traits are further used to evaluate the grain weight in the breeding programs. Also, grain weight has been more extensively studied in wheat than subcomponents like grain length and width. Hence, there is a need to look into the genetic mechanism controlling these yield components, ultimately contributing to overall yield.
Studying the grain development process is essential to uncover the genetic architecture and the genes involved in grain size formation. Understanding the regulatory mechanisms underlying early gene expression and selecting candidate genes for grain size are of great significance for yield and quality improvement in wheat (Guan et al. 2019).
There has been a radical change in wheat genomics in the last few years in terms of resources such as transcriptomic databases (Borill et al Significant variations in grain size and weight occur among wild species of diploid, tetraploid, and hexaploid wheat (Gegas et al. 2010). T. dicoccoides, known as wild emmer wheat, is the progenitor of cultivated wheat and contributed to grain size during wheat evolution (Feldman and Kislev 2007). Punjab Agricultural University has a collection of more than 110 T. dicoccoides accessions, and a study of its grain size traits in previous years indicated that this germplasm is a good source of variation for grain size-related traits. The present investigation aimed to understand the role of some potential genes responsible for controlling grain length and grain width in T. dicoccoides accessions that differ in grain length and width.

Plant material
The plant material comprised four T. dicoccoides accessions selected on the basis of contrasting thousand grain weight (TGW). Two of these accessions, pau5228 and pau5232, originated in Turkey and had larger grains; they are hereafter called LG accessions. The other two accessions, pau14703 and pau14756, originated in Israel and had smaller grains; they are hereafter called SG accessions. The four selected accessions were evaluated for grain size parameters in developing grain. They were planted in three replications in a completely randomized design (CRD) for two consecutive years, 2018-19 and 2019-20, at the experimental field area, School of Agricultural Biotechnology, PAU, Ludhiana.

Measurement of grain size
Grain size was determined in both developing and mature grains. Developing grains were collected at seven post-anthesis stages: the first sample was collected at 4 days post-anthesis (4 DPA), followed by samples at 8 DPA, 12 DPA, 16 DPA, 20 DPA, 24 DPA, and 28 DPA. Mature grains were also collected. Sampling was carried out for two consecutive wheat seasons, 2018-19 and 2019-20. The spikes in each accession of T. dicoccoides were tagged with the anthesis date to maintain uniformity in sampling. Developing seeds were collected only from the primary florets of three spikelets selected from the middle of each spike. In the developing grain, grain length (GL), grain width (GW), and grain area (GA) of the collected seeds were measured, while in mature grain GL, GW, and GA were recorded using a Canon5600 scanner and GrainScan software (Whan et al. 2014). Because manual threshing of one thousand grains from T. dicoccoides spikelets is difficult owing to their hard-threshing nature, hundred-grain weight was recorded and converted into thousand-grain weight (TGW). The data were analyzed in the statistical software R (R Core Team 2018), using the "aov" function for analysis of variance and the Duncan Multiple Range Test (DMRT) to compare the mean values of the four T. dicoccoides accessions. For further analysis, the adjusted means of the two environments were represented as a third environment. The data were plotted using the R packages "ggplot2" version 3.3.2 and "ggpubr" version 0.4.0 in RStudio (Kassambara and Kassambara 2020; Wickham 2016).
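The weight conversion and the variance analysis described above can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the function names and the hundred-grain weights are hypothetical, and the sketch computes only the one-way ANOVA F statistic, without the DMRT step.

```python
from statistics import mean


def hundred_to_thousand_grain_weight(hgw_g):
    """Scale a hundred-grain weight (g) to a thousand-grain weight (g)."""
    return hgw_g * 10.0


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label (here, an accession) to its replicate values.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of the replicates around their own group mean
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical hundred-grain weights (g), three replicates per accession
hgw = {
    "LG_a": [4.1, 4.3, 4.2],
    "LG_b": [4.0, 4.2, 4.4],
    "SG_a": [2.9, 3.1, 3.0],
    "SG_b": [3.2, 3.0, 3.1],
}
tgw = {
    acc: [hundred_to_thousand_grain_weight(w) for w in ws]
    for acc, ws in hgw.items()
}
f_stat, df_b, df_w = one_way_anova_f(tgw)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with the reported degrees of freedom indicates that the accessions differ in mean TGW; a post hoc test such as DMRT is then needed to say which accessions differ.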

Identification of candidate genes for grain size
Candidate genes for grain size component traits were selected from the available literature in wheat and rice. Nucleotide sequences of complete genes were retrieved from the RGAP (Rice Genome Annotation Project) and NCBI (National Center for Biotechnology Information) databases and were used as queries for online BLAST against RefSeqv1.0 of wheat. Sequences showing maximum alignment with chromosomes of the A and B genomes were selected, and BLAST was run again against the T. dicoccoides genome in the Ensembl Plants database (http://www.ensemblgenomes.org/id/%s). The final sequences obtained were filtered based on low e-value, query coverage, and gene functional annotation.
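The filtering step can be sketched as follows. This is an illustration only: the `BlastHit` record, the `filter_hits` helper, and the e-value and coverage thresholds are assumptions, since the text does not state the cut-offs used, and the functional-annotation check is left as a manual step.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    subject_id: str        # hypothetical identifier of the matched sequence
    chromosome: str        # e.g. "5A", "7B", "2D"
    evalue: float
    query_coverage: float  # percent of the query covered by the alignment


def filter_hits(hits, max_evalue=1e-10, min_coverage=80.0, genomes=("A", "B")):
    """Keep hits on the chosen subgenomes that pass both thresholds.

    The default thresholds are placeholders, not values from the study.
    """
    kept = [
        h for h in hits
        if h.evalue <= max_evalue
        and h.query_coverage >= min_coverage
        and h.chromosome[-1] in genomes
    ]
    # Best hits first: lowest e-value, then highest query coverage
    return sorted(kept, key=lambda h: (h.evalue, -h.query_coverage))


# Hypothetical hits for one query gene
hits = [
    BlastHit("hit1", "5A", 1e-50, 95.0),
    BlastHit("hit2", "5B", 1e-45, 92.0),
    BlastHit("hit3", "5D", 1e-60, 97.0),  # D genome, absent in tetraploids
    BlastHit("hit4", "2A", 1e-3, 40.0),   # weak, partial match
]
for hit in filter_hits(hits):
    print(hit.subject_id, hit.chromosome, hit.evalue, hit.query_coverage)
```

Restricting hits to the A and B subgenomes mirrors the tetraploid (AABB) genome of T. dicoccoides.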

RNA extraction and cDNA synthesis
Total RNA was isolated from developing seeds using RNAiso Plus reagent (Takara, Japan) in three technical replicates. RNA concentration was measured by spectrophotometry using a Nanodrop™ 1000 (Thermo Scientific, USA), and quality was confirmed on a MOPS gel visualized under UV light in a gel documentation unit, Gbox3 (SYNGENE G:Box, USA). RNA was converted to cDNA using a 1st strand cDNA synthesis kit (PrimeScript™, Takara, Japan) as per the manufacturer's instructions. The cDNA was confirmed by polymerase chain reaction (PCR) amplification with 26S rRNA as an internal control (CACAATGATAGGAGGAGCCGAC and CAAGGGAACGGGCTTGGCAGAATC).

PCR primer design and quantitative real-time PCR
qRT-PCR primers were designed for the selected genes using Primer3 software (Thornton and Basu 2010). qRT-PCR was carried out using SYBR Green™ Premix Ex Taq (Promega) in a LightCycler96 Real-Time PCR System (Roche Applied Science, Germany). The reaction mixture contained 10 µl of 2X PCR SYBR green ready mix, 1 µl of each primer, and 2 µl of cDNA template in a final volume of 20 µl. Cycling conditions were 94 °C for 3 min, followed by 40 cycles of 94 °C for 10 sec, 60 °C for 30 sec, and 55 °C for 30 sec. For each sample, the transcript abundance of the potential candidate genes was analyzed across three biological replicates for each developmental stage. TaActin was used as the internal control to normalize gene expression (Guan et al. 2019). Melting curve analysis was performed at 55 °C with the LightCycler96 software package supplied by Roche. The quantification cycle (Cq) values for both the reference and target genes were estimated in each sample. The 4 DPA stage was taken as the control stage for analyzing gene expression at the other developmental stages. ΔCq is the Cq of the target gene normalized to the reference gene, and ΔΔCq is the difference in ΔCq between each developmental stage and the control stage. The 2^(−ΔΔCq) method was used to calculate the relative expression level in terms of fold change (Livak and Schmittgen 2001). Statistical data analysis was done according to the ΔΔCq method based on relative expression and fold change values:
\[ \Delta C_q = C_q(\text{target gene}) - C_q(\text{reference gene}) \]
\[ \Delta\Delta C_q = \Delta C_q(\text{test stage}) - \Delta C_q(\text{4 DPA}) \]
\[ \text{Fold change} = 2^{-\Delta\Delta C_q} \]
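As an illustration of this relative-expression calculation, a minimal Python sketch of the 2^(−ΔΔCq) computation is given here. The function names and the Cq values are hypothetical; averaging the replicate Cq values before subtraction is one common convention and is an assumption, since the text does not state how replicates were combined.

```python
from statistics import mean


def delta_cq(target_cq, reference_cq):
    """Normalize the target gene to the reference gene: mean Cq difference."""
    return mean(target_cq) - mean(reference_cq)


def fold_change(stage_target, stage_reference, control_target, control_reference):
    """Relative expression of a stage versus the control stage, 2^(-ddCq)."""
    ddcq = (delta_cq(stage_target, stage_reference)
            - delta_cq(control_target, control_reference))
    return 2.0 ** (-ddcq)


# Hypothetical Cq values from three replicates
control_gene = [26.1, 25.9, 26.0]    # target gene at the control stage (4 DPA)
control_actin = [18.0, 18.1, 17.9]   # reference gene at the control stage
stage_gene = [24.0, 24.2, 23.8]      # target gene at a later stage
stage_actin = [18.0, 17.9, 18.1]     # reference gene at that stage

fc = fold_change(stage_gene, stage_actin, control_gene, control_actin)
print(f"Fold change relative to 4 DPA: {fc:.2f}")
```

A fold change above 1 means the gene is upregulated relative to the 4 DPA control stage, and a value below 1 means it is downregulated.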

Results
Wheat grain yield has improved exponentially since the green revolution and continues to improve, although the pace of increment is decreasing. Despite a reasonable level of global wheat production, work on improving yields has always remained at the forefront, both to meet the demands of an ever-increasing population and to answer the scientific question of how much yield potential can be improved. Wild species of wheat encompass a wide diversity of alleles, and T. dicoccoides is one of the important species, housing vast variation in grain size and novel alleles for grain size-related component traits. In wheat, anthesis begins in the central part of the spike and continues bidirectionally towards the basal and apical parts. Furthermore, the proximal (primary) florets of the central spikelet are fertilized two to four days earlier than the distal florets; hence grains from these florets usually have higher weight (Bonnett 1936; Kirby 1974; Peterson 1965; Simmons and Crookston 1979). In the present study, the primary floret of the central three spikelets was selected to assess the different grain parameters and so maintain uniformity in the experimental material.
Grain development in wheat has been divided into three phases after anthesis: grain enlargement (0-14 DPA), grain filling (15-35 DPA), and physiological maturity (36-50 DPA). After an initial period of isotropic growth, the length of the developing grain increases significantly and reaches its maximum at around 15 DPA, contributing more to grain area than width does (Brinton et  . We observed higher expression of GL7 in LG accessions of T. dicoccoides than in SG accessions, which is consistent with a role for this gene in producing longer and heavier grains. In rice, GL7 expression was higher in long-grain varieties because of increased copy number or mutations in the promoter (Wang et al. 2015b). Sequence analysis of this gene in LG accessions of T. dicoccoides might lead to the identification of novel alleles associated with longer grains. GS5 functions as a positive regulator of grain size, and higher expression of GS5 is correlated with larger grain size (Li et al. 2011; Xu et al. 2015). We observed higher expression of TaGS5 at the initial grain development stages in LG accessions, but very low levels at all developmental stages in SG accessions, indicating that higher expression of TaGS5 might be involved in the development of larger grains. Ma et al. (2016) investigated the temporal and spatial expression patterns of the TaGS5 homoeologues, orthologs of the rice gene OsGS5, in various tissues of wheat and found higher expression in seedlings, young spikes, and developing grains. GS3 in rice encodes a putative transmembrane protein and has been identified as a major QTL for grain length and weight and a minor QTL for grain width. A loss-of-function allele of GS3 promotes cell proliferation and forms long grains, while gain of function produces short grains (Fan et al. 2006).
The cause of mutation by premature stop codons between grain size in rice suggests that orthologous genes and related regulatory processes for this type of trait may be conserved across a broad range of taxa, from monocot to dicot species. Our results showed higher expression of GS3 at the mid-grain developmental stages in LG accessions, indicating a role for this orthologue in regulating grain length in wheat.
In rice, GL3 encodes a protein phosphatase with Kelch-like repeat domains (OsPPKLs), restricting cell division in spikelet hulls that increase grain length, weight, and yield (Qi et TaGL3-5A, its expression pattern was similar with increasing grain size at the early (8 DPA) and middle (20 DPA) stages of seed development, suggesting that TaGL3 plays a role at an early phase of seed development. Association analysis revealed that the TaGL3-5A-G allele was significantly correlated with longer grains and higher TGW, and the frequency of the allele in hexaploid wheat was slightly lower than in T. dicoccoides. wheat to unravel the genetic architecture of grain size, where the homolog of rice SRS showed successively higher expression across the early to middle stages of grain development. Our study also observed higher expression of this gene at the early and middle stages of grain development, which correlates with the phenotypic gain in grain size in LG accessions.
OsGASR7 in rice showed similarity to Arabidopsis GASA4 and was considered a candidate gene determining grain length. The wheat ortholog of OsGASR7 was also reported to play the same role and to be involved in grain length development. Zhang et al. (2014) studied the expression patterns of TaGASR7 in immature seeds of a synthetic hexaploid wheat. GASR7B was highly expressed from 6 DPA to 14 DPA in developing seeds of the tetraploid accessions and began to decrease at 17 DPA. Our results also showed increased expression up to 12 DPA, indicating that TaGASR7 is also involved in the increase in grain length during seed development in T. dicoccoides. GW2 in rice and Arabidopsis affects grain size by suppressing cell proliferation (Song et al. 2007; Xia et al. 2013). Our results showed a negative relationship between TaGW2 expression and grain size in T. dicoccoides, as this gene was more highly expressed in SG accessions than in LG accessions. TaGW2 has been associated with kernel width and weight and has been validated as a negative regulator of grain size in wheat by gene editing and mutant analysis. Association analysis indicated that the mutated TaGW2 allele significantly increased kernel width (KW) and thousand-kernel weight (TKW) and slightly improved kernel length (KL) in tetraploid and hexaploid wheat. The increase in grain width and length was consistent across grains of different sizes, suggesting that the effect of the mutation is stable across the ear and within spikelets (Simmonds et al. 2016; Su et al. 2011; Wang et al. 2018; Yang et al. 2012; Zhang et al. 2018). Ishimaru et al. (2013) identified a novel gene for grain length and weight, TGW6, which encodes a novel protein related to indole-3-acetic acid (IAA) synthesis; loss of function of TGW6 enhances grain length and weight. TaTGW6-A1, an ortholog of rice TGW6, is associated with grain weight and yield in bread wheat (Hanif et al. 2016).
Very low expression of TGW6 was found in all the accessions used in the present study across different grain developmental stages, indicating a functional allele in the selected T. dicoccoides accessions.
Notably, we observed the expression peak of six genes in the LG accessions at around 8 DPA to 20 DPA, corresponding to the change in grain size, suggesting that these genes play an important role in the early phase of grain development. A correlation was also observed between gene expression and grain size traits at the early and middle stages of seed development. Cluster analysis illustrated by the heat map showed that the expression patterns of GL7, TaGL3, TaGS5, GS3, SRS3, and TaGASR7 were more similar to the grain size traits (Fig. 5). In contrast, TaGW2 and TGW6 showed a negative correlation with grain size traits at the developing stages.
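One way to quantify such a relationship between an expression profile and a grain size trait across developmental stages is a correlation coefficient. The sketch below uses Pearson's r as an assumption, since the text does not state which measure or clustering method was used, and all values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical values at 4, 8, 12, 16 and 20 DPA for one accession
fold_change = [1.0, 2.4, 3.9, 4.6, 5.2]   # relative expression of one gene
grain_area = [6.1, 9.8, 13.5, 15.9, 17.0]  # grain area (mm^2)

print(f"r = {pearson_r(fold_change, grain_area):.3f}")
```

Values of r near +1 correspond to a gene whose expression rises with the trait, and values near −1 to a gene whose expression falls as the trait increases.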
The correlation between grain size-related traits and the expression profiles of the selected genes at the early stages of grain development shows the scope for targeting the grain development process to improve yield. T. dicoccoides has large variation in grain size; thus, we speculate that allelic variation in multiple genes involved in grain size is responsible for this variation. Although other studies have reported these genes in wheat, the expression patterns observed in the present study suggest the presence of novel alleles in the T. dicoccoides germplasm. The LG T. dicoccoides accessions studied in the present investigation and the characterized genes can form a basis for systematic marker-assisted breeding to enhance grain size in breeders' germplasm.

Relation between expression profiles of eight selected genes and phenotypic changes in grain area of T. dicoccoides accessions at different grain developmental stages from 4 DPA to 28 DPA.