Genotyping USDA Rice Mini-Core Collection With Functional Markers for Important Agronomic Traits

The USDA rice mini-core collection was established to capture the diversity of an entire collection of over 18,700 accessions of global origins for efficient germplasm evaluation and exploration. Previous studies have investigated its genetic diversity and population structure using genome-wide SSR markers. Many important agronomic traits that are fundamental to rice breeding programs, however, remain to be explored. Functional markers can be developed based on polymorphic sites within genes affecting phenotypic variation in, e.g., starch physicochemical properties, nutritional qualities and biotic resistance. These markers can be used for genotyping and hence differentiating phenotypes among rice accessions. In this study, we employed 12 pairs of functional markers (SNP and InDel) to genotype all 217 accessions constituting the USDA rice mini-core. These markers are highly associated with starch physicochemical properties (intron 1 G/T SNP, 23 bp duplication in exon 2, exon 6 C/A SNP and exon 10 C/T SNP of the Waxy gene, GC/TT SNPs of the SSIIa gene, G/C SNP of the SBE3 gene), glutelin content (3.5 kb deletion in the Lgc1 gene), grain length (C/A SNP in the GS3 gene), brown planthopper resistance (InDel in the Bph14 gene) and rice blast resistance (InDels in the Pi54 and Pit genes). Using these functional markers, all 217 accessions of the mini-core were characterized for the alleles/genes associated with the aforementioned agronomic traits. The results of this study will help breeders select parental materials with desirable allele/gene combinations and phenotypes among mini-core accessions for rice breeding programs.


Introduction
Rice is the major staple food for nearly half of the world's population. Rice breeding programs pursue various goals, e.g. raising yield, improving quality and enhancing biotic and abiotic resistance, and they require diverse germplasm as raw material for parent selection. The USDA rice mini-core collection, which is composed of 217 accessions, was selected from the whole USDA rice gene bank. As a set of germplasm potentially valuable for breeding, its diversity has been evaluated with phenotypic and genotypic markers in previous studies (Agrama et al. 2009; Li et al. 2010). The results confirmed that this mini-core collection is a good representative of the over 18,700 genotypes from 115 countries contained in the USDA rice gene bank. Some successful studies on its application in gene-trait association analysis are available (Li et al. 2011; Bryant et al. 2013). However, with respect to breeding practice, much fundamental yet important information on agronomic traits remains to be explored.
Knowledge of phenotypic and genotypic traits is a prerequisite for parent and offspring selection in breeding. DNA markers that can identify desired genes/alleles in rice germplasm are valuable because they can reveal the presence of the genes without laborious phenotype investigation. However, many markers are distant from the targeted genes, and occasional uncoupling of the marker from the trait may happen over cycles of meiosis, which can lead to errors in the selection of desired traits (Perumalsamy et al. 2010). Functional molecular markers (FMs) are derived from functional polymorphisms within or around genes that causally affect phenotypic variation (Anderson and Lubberstedt 2003). FMs have a clear advantage over other DNA markers because they are fully diagnostic of the target trait allele. As information on cloned genes becomes available, FM development is enabled, and FM genotyping should serve as a reliable alternative to cumbersome phenotype investigation, because FMs show consistency between functional traits and sequence polymorphisms in alleles.

Functional markers for rice starch physicochemical properties
As the major component of the rice grain, starch accounts for over 90% of endosperm weight, and its physicochemical properties strongly influence rice eating and cooking qualities. Two types of starch polymers exist in rice endosperm: amylose and amylopectin. Amylose is linear with few branches, while amylopectin is highly branched (Tester et al. 2004, Hizukuri et al. 2006). It has been established that amylose content and the fine structure of amylopectin are the main determinants of rice eating and cooking qualities (Juliano 1985, Fujita et al. 2003, Zhang et al. 2011, Syahariza et al. 2013).
Apparent amylose content (AAC) is one of the key parameters for evaluating rice eating and cooking qualities. Rice cultivars can be grouped into five classes according to AAC: glutinous (AAC 1-2%), very low (2-9%), low (10-20%), intermediate (20-25%) and high (> 25%) (Kumar and Khush, 1987). Genetic studies have revealed that the Waxy (Wx) gene, which encodes granule-bound starch synthase (GBSS), is the primary gene responsible for amylose biosynthesis in rice endosperm. Naturally occurring or human-generated sequence variations in the Wx gene affect GBSS activity at the transcriptional or post-transcriptional level, and AAC phenotypic variation follows. Three functional alleles at the Waxy locus, namely Wxa, Wxb and wx, were first revealed by Sano (1984). Glutinous rice has the recessive wx allele, which harbors a 23 bp duplication in exon 2 of the Wx gene. This insertion disrupts gene expression: translation terminates prematurely and normal GBSS is not synthesized (Wanchana et al. 2003). Rice cultivars with the Wxb allele have low AAC, while those with the Wxa allele have intermediate or high AAC (Sano 1984; Sano et al. 1985). The difference between the Wxa and Wxb alleles is a G/T substitution at the putative 5' splice site of intron 1 of the Wx gene. The G to T mutation in Wxb reduces the efficiency of GBSS pre-mRNA processing, resulting in a lower level of spliced mature mRNA and consequently a lower AAC (Bligh et al. 1998; Cai et al. 1998; Hirato et al. 1998; Isshiki et al. 1998). Two additional functional Wx alleles, namely Wxop and Wxin, were found by Mikami et al. (2008). Wxop and Wxin are associated with very low (< 10%) and intermediate (20-25%) AAC, respectively. An A to G substitution in exon 4 characterizes Wxop, and an A to C substitution in exon 6 characterizes Wxin. In a study involving 70 rice cultivars (Tian et al. 2009), Wx genes were classified into the Wx-I, Wx-II and Wx-III haplotypes, which were associated with glutinous, intermediate and high AAC phenotypes, respectively. Finally, in a more recent study (Teng et al. 2012), a total of five functional Wx alleles were found in single-segment substitution lines involving 17 parental rice cultivars. Among the five functional Wx alleles reported in that paper, wx, Wxt and Wxg1 were identical to the previously reported wx, Wxb and Wxin alleles (Sano 1984; Mikami et al. 2008). Compared with Wxg1, Wxg3 had a C to A substitution in exon 6, and Wxg2 had an extra C to T substitution in exon 10. Importantly, these five Wx alleles, namely Wxg3, Wxg2, Wxg1, Wxt and wx, were perfectly associated with five AAC classes: High II, High I, Intermediate, Low and Glutinous.
Amylopectin is the other type of starch in the rice grain; its content and fine structure are also crucial to rice eating, cooking and processing qualities. ADP-glucose pyrophosphorylase (AGPase), soluble starch synthase (SS), starch branching enzyme (SBE) and starch debranching enzyme (DBE) work together to synthesize amylopectin (Nakamura 2002, Nakamura et al. 2010). Each group of enzymes has multiple subunits or isoforms with specific roles. Among them, SSIIa has been reported as the major gene responsible for gelatinization temperature (GT) (Umemoto et al. 2004). Two consecutive GC/TT SNPs can differentiate rice cultivars with low GT from those with intermediate or high GT, and PCR-based markers targeting this polymorphism have been designed (Bao et al. 2006). Additionally, the C/G SNP downstream of the SBE3 gene was significantly associated with amylose content and viscosity properties (Lu et al. 2012).

Functional markers for blast resistance, bacterial blight resistance and brown planthopper resistance
Rice blast is one of the most devastating diseases affecting the rice crop worldwide. To date, more than 50 blast resistance genes have been characterized in rice. Among them, the major blast resistance gene Pi54 (formerly named Pikh) exhibited universal resistance to the predominant races of the pathogen (Sharma et al. 2002). In allele mining of the Pi54 gene, Ramkumar et al. (2011) revealed a 144 bp insertion/deletion (InDel) in the exonic region of the gene and designed Pi54 MAS, a PCR-based co-dominant molecular marker targeting this InDel. In a subsequent validation on 105 rice genotypes, Pi54 MAS distinguished nearly all the resistant and susceptible cultivars, more accurately than any earlier reported marker for the Pi54 gene; hence this functional marker is suitable for marker-assisted selection (MAS).
Pit is another rice blast resistance gene conferring rice cultivars broad-spectrum resistance (Kiyosawa 1972). The resistance function was mainly conferred by up-regulated promoter activity through the

Other functional markers
High yield has not been the only goal in recent rice breeding programs. Increasing interest has been attracted by the improvement of rice eating and cooking quality and the development of functional food. With the knowledge generated by the isolation of agronomically important genes, more information is available for developing FMs that can be used directly in germplasm evaluation, gene/allele identification and offspring selection in breeding.
Low glutelin content rice is characterized by less mature glutelin and more prolamine than normal varieties. Since absorption of large amounts of glutelin can disturb protein metabolism and aggravate the condition, low glutelin rice is viewed as a functional food for patients with kidney failure. The genes Lgc1 (glu-1), glu-2 and glu-3 were revealed to be responsible for low glutelin content (Miyahara et al. 1996 GS3 is the most important grain length QTL; a single nucleotide polymorphism, C/A, in its exon 2 principally differentiates rice grain length (Fan et al. 2006). SF28 is a cleaved amplified polymorphic sequence (CAPS) marker targeting the C/A SNP in the GS3 gene. In an association analysis, 38 out of 180 rice varieties with the GS3-A allele had grain lengths ranging from 8.8 to 10.7 mm, while the remaining 142 with the GS3-C allele had grain lengths ranging from 6.4 to 8.8 mm (Fan et al. 2009), which suggests that this marker can be used in MAS with great reliability.
The objective of the present study was to genotype the USDA rice mini-core collection with twelve functional markers associated with eight important agronomic traits, in order to provide guidelines for selecting parent materials with desired genes/alleles for rice breeding. In addition, the polymorphisms detected in the Waxy gene in this study can be used in a subsequent association analysis with AAC variation, to achieve a better understanding of rice amylose biosynthesis and to find marker combinations that explain more of the AAC variation.

Plant materials
The USDA rice mini-core collection used in this study was provided by USDA-ARS.

DNA isolation
Rice seeds were germinated, and whole genomic DNA was extracted from five seedlings of each accession using the CTAB method (Doyle 1991
Two primers, oligo 484 and oligo W2R, and the restriction endonuclease AccI (BRL) (Ayers et al. 1997) were used to genotype the intron 1 G/T SNP. Two fragments (129 and 128 bp) indicated a G SNP, whereas no digestion indicated a T SNP.

Functional markers for Lgc1 and GS3
Primers InDel-Lgc1-2 were used to identify the 3.5 kb deletion in the low glutelin content 1 gene (Lgc1); accessions without this deletion produced a 509 bp fragment (Chen et al. 2010).
Primers SF28 and the restriction endonuclease PstI were used to determine the C/A SNP in the GS3 gene (Fan et al. 2009); two fragments (110 and 26 bp) indicated the C SNP, whereas no digestion indicated the A SNP.
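The digestion-based calls described for the AccI and PstI markers can be sketched as a small lookup. This is only an illustration: the function name, the 3 bp sizing tolerance and the treatment of a mixed pattern are assumptions, and the uncut product size is assumed to equal the sum of the two cut fragments reported in the text.

```python
# Hypothetical helper: call a CAPS genotype from the band sizes read off a gel.
# Cut-fragment sizes are taken from the text; the uncut size is ASSUMED to be
# the sum of the two cut fragments (257 bp for Wx-intron 1, 136 bp for GS3).
CAPS_MARKERS = {
    "Wx-intron1": {"enzyme": "AccI", "cut": (129, 128),
                   "cut_allele": "G", "uncut_allele": "T"},
    "GS3": {"enzyme": "PstI", "cut": (110, 26),
            "cut_allele": "C", "uncut_allele": "A"},
}


def call_caps(marker, observed_bp, tolerance=3):
    """Return the allele call for one accession.

    observed_bp: iterable of band sizes (bp) seen after digestion.
    tolerance:   assumed sizing error of the gel, in bp.
    """
    spec = CAPS_MARKERS[marker]
    uncut_size = sum(spec["cut"])

    def seen(size):
        # A band counts as present if any observed band is within tolerance.
        return any(abs(size - band) <= tolerance for band in observed_bp)

    has_cut = all(seen(size) for size in spec["cut"])
    has_uncut = seen(uncut_size)

    if has_cut and has_uncut:
        return "heterozygous or mixed"
    if has_cut:
        return spec["cut_allele"]
    if has_uncut:
        return spec["uncut_allele"]
    return "no call"


print(call_caps("GS3", [110, 26]))   # digested product
print(call_caps("Wx-intron1", [257]))  # undigested product
```

Because the 129 and 128 bp AccI fragments would co-migrate on most gels, a single band near 128 bp is treated as both fragments being present.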

DNA marker analysis
The PCR was performed in a 10 µl reaction mixture containing 20 ng of template DNA, 1X PCR buffer, 2 mM MgCl2, 0.2 mM dNTPs, 0.2 µM of each primer and 1 unit of Taq DNA polymerase. All amplifications were performed on a PTC-100 thermal cycler (MJ Research, Inc.) under the following conditions: 5 min at 94℃; 35 cycles of 45 s at 94℃, 60 s at TA (annealing temperature, Table 1) and 60 s at 72℃; and 7 min at 72℃ for a final extension.
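As a quick check on the cycling program, the total programmed hold time can be worked out from the step durations given above. The sketch below is illustrative only; the variable names are assumptions, and ramp times between temperatures are ignored, so the real run on the thermal cycler would be somewhat longer.

```python
# Thermal program as stated in the text. Ramp times between steps are NOT
# included (assumption), so this is a lower bound on the actual run time.
INITIAL_DENATURATION_S = 5 * 60          # 5 min at 94 C
CYCLE_STEPS_S = {
    "denature_94C": 45,
    "anneal_TA": 60,                     # TA is marker-specific (Table 1)
    "extend_72C": 60,
}
CYCLES = 35
FINAL_EXTENSION_S = 7 * 60               # 7 min at 72 C


def total_hold_time_s():
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


print(total_hold_time_s() / 60)  # 108.25 (minutes)
```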
The PCR products of the SSIIa, Pit, Pi54, Bph14 and Lgc1 genes were resolved on a 2.0% agarose gel containing 0.05 µl/mL GelRed in 1X TBE buffer and then visualized using a gel documentation system. PCR products of Wx-exon 2 and Wx-exon 6 were separated on 8% denaturing polyacrylamide gels and visualized by silver staining (Bassam et al. 1991).
For Wx-intron 1, Wx-exon 10, SBE3, xa5 and GS3, 5 µl of PCR product was digested with restriction endonucleases (Biolabs) (Table 1) according to the manufacturer's instructions. The digestion products of Wx-exon 10, SBE3 and xa5 were separated by electrophoresis on a 2.0% agarose gel and visualized with a gel documentation system. For Wx-intron 1 and GS3, the digested products were separated on an 8% polyacrylamide gel and detected by silver staining.

Results
All 217 accessions in the mini-core collection were genotyped using twelve functional markers for eight agronomic traits (Table 1, Supplementary Table 1).

Allelic frequencies of Wx gene, SSIIa gene and SBE3 gene
Four Wx gene markers were chosen for genotyping the collection: the G/T SNP in the 5' splice site of intron 1 (Wx-Intron 1), the 23 bp duplication in exon 2 (Wx-Exon 2), the A/C SNP in exon 6 (Wx-Exon 6) and the C/T SNP in exon 10 (Wx-Exon 10).
Concerning the Wx-Intron 1 G/T SNP, 154 rice accessions (71.0%) carried the G allele while only 63 (29.0%) carried the T allele (  (Table 2), as a complement to the Wx-Intron 1 G/T and Wx-exon 6 A/C polymorphisms, Wx-exon 10 can further divide rice cultivars with high AAC into high I and high II subclasses (Teng et al. 2012).
Two pairs of confronting markers targeting the GC/TT polymorphism in the SSIIa gene were chosen to genotype the mini-core. The GC allele was predominant, harbored by 182 accessions (83.9%). In contrast, only 35 accessions (16.1%) had the TT allele (Table 2). The TT allele has been reported to be associated with low gelatinization temperature (GT) and the GC allele with intermediate and high GT. A similar pattern was observed for the SBE3 C/G SNP, as 145 accessions (66.8%) carried the G allele while 72 accessions (33.2%) had the C allele (Table 2).
In general, for each of the six functional markers targeting starch physicochemical properties employed in the present study, one allele was always more frequent than its counterpart in the USDA mini-core collection. The most evenly distributed alleles occurred at the SBE3 C/G polymorphism, but the ratio was still 66.8% SBE3-G to 33.2% SBE3-C. The most uneven distribution was observed for Wx-Exon 2, where the 23 bp duplication, which has been reported to be responsible for the glutinous phenotype, was observed in only 6.9% of accessions.
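The allelic frequencies reported here are simple proportions of the 217 accessions. A minimal sketch of the calculation, using the Wx-Intron 1 and SBE3 counts given above (the dictionary layout and function name are illustrative assumptions, not part of the original analysis):

```python
# Allele counts as reported in the text for two of the starch-related markers.
TOTAL_ACCESSIONS = 217
allele_counts = {
    "Wx-Intron 1": {"G": 154, "T": 63},
    "SBE3": {"G": 145, "C": 72},
}


def allele_frequencies(counts):
    """Convert per-allele accession counts to percentages (one decimal)."""
    total = sum(counts.values())
    return {allele: round(100 * n / total, 1) for allele, n in counts.items()}


for marker, counts in allele_counts.items():
    # Every accession should be scored for exactly one allele per marker.
    assert sum(counts.values()) == TOTAL_ACCESSIONS
    print(marker, allele_frequencies(counts))
# Wx-Intron 1 {'G': 71.0, 'T': 29.0}
# SBE3 {'G': 66.8, 'C': 33.2}
```

The same check (counts summing to 217) is a convenient way to catch transcription slips when tabulating the other markers.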

Allelic frequencies of Pi54 gene, Pit gene and Bph14 gene
Pi54 and Pit are both genes conferring rice blast resistance. The Pi54 resistant allele has a 144 bp deletion in the exonic region of Pi54, and the Pit resistant allele has an insertion of a long terminal repeat retrotransposon upstream of Pit. In the mini-core, 99 accessions (45.6%) carried the Pi54 resistant allele, but the Pit resistant allele was found in only 4 accessions (1.8%) (Table 2). Notably, two accessions, "SHIMIZU MOCHI" and "Padi Pohon Batu", were found to be heterogeneous for the Pi54 allele (Fig 2a). Furthermore, three accessions, "Warrangal Culture 1252", "Tranoeup Beykher" and "4484", had the resistant alleles of both Pi54 and Pit (Supplementary Table 1).
The resistant allele of the Bph14 gene was not found in the USDA rice mini-core collection; nor was it found in the Chinese rice mini-core collection, which comprises 200 varieties selected from more than 60,000 accessions of Chinese cultivated rice (Zhou et al. 2013).

Allelic frequencies of Lgc1 gene and GS3 gene
For the GS3 C/A SNP, the C allele was more frequent in the mini-core: 151 accessions (69.6%) carried the C allele, whereas only 66 accessions (30.4%) had the A allele (Table 2). The C allele was also more frequent in another study involving 180 rice genotypes (Fan et al. 2009), 142 (79%) of which had the C SNP.
In the USDA rice mini-core collection, 58 accessions (26.7%, Table 2) had the 3.5 kb deletion in the Lgc1 gene, which causes the low glutelin content phenotype.

Discussion
Germplasm resources are essential for parent selection in plant breeding, and phenotype and genotype investigation is a prerequisite for parent selection in any breeding program. Although the USDA rice mini-core collection is a valuable set of germplasm, little is known about its phenotypic and genotypic characteristics for many important agronomic traits, such as disease resistance and eating and cooking qualities. Moreover, phenotype investigation is usually laborious. For example, identifying parental materials with pathogen resistance usually requires an inoculation test and long-term observation.
Molecular marker-assisted selection (MAS) helps save cost and labor in parent and offspring selection in breeding. However, many markers are distant from the responsible genes, and when undesired recombination between the marker and the candidate gene happens, the marker loses its accuracy and reliability in selecting traits of interest (Rafalski and Tingey, 1993). Functional markers are derived from polymorphisms within or around genes that causally affect phenotypic variation, and thus have been directly employed with great efficiency and reliability to identify desirable alleles in many breeding programs ( Since starch is the major component of the rice grain, accounting for nearly 90% of endosperm weight, the nucleotide sequence variations in starch biosynthesis related genes and their association with starch physicochemical properties have long been a research focus in rice breeding. The Waxy gene is primarily responsible for amylose biosynthesis. Sequence variations at five polymorphic sites of the Wx gene can explain most of the amylose content variation. Firstly, due to a 23 bp duplication in exon 2 (wx allele), translation is prematurely terminated, so amylose cannot be synthesized in glutinous rice because normal GBSS is absent (Wanchana et al. 2003). In the USDA rice mini-core collection, 15 accessions harbored this 23 bp duplication, so these accessions are expected to be glutinous/waxy rice with AAC below 2%. For rice without this duplication, a T SNP at the putative 5' splice site of Wx intron 1 lowers the efficiency with which the first intron is spliced out, reducing the amount of mature mRNA and of the GBSS enzyme. In short, rice with the Wx-intron 1-T allele tends to show low amylose content (Sano 1984, Sano et al. 1985). In the mini-core, the Wx-intron 1-G allele was much more frequent (71.0%); accessions that carry the T allele and lack the duplication should fall into the low AAC class. As a complement to the Intron 1-G allele, the A/C SNP in exon 6 can distinguish intermediate AAC from low and high AAC.
In detail, the C SNP is associated with the intermediate AAC class, while the A SNP is associated with both the low and high AAC classes (Larkin and Park, 2003). Furthermore, the C/T SNP in exon 10 can distinguish rice with high I AAC from rice with high II AAC: the G-1-A-C haplotype is associated with higher AAC (high II) than G-1-A-T (high I) (Teng et al. 2012). So far, however, no allele has been reported that distinguishes very low AAC from low AAC.
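The rules summarized above can be read as a short decision procedure over the four Wx sites. The sketch below is one possible encoding, not the classification used by the cited studies: the function name, the argument encoding and, in particular, the order in which the sites are checked (exon 2 duplication first, then intron 1, then exon 6, then exon 10) are assumptions made for illustration. The predicted class is an expectation from the genotype, to be confirmed by AAC measurement.

```python
def predict_aac_class(intron1, exon2_duplication, exon6, exon10):
    """Predict the expected AAC class from four Wx marker genotypes.

    intron1:            'G' or 'T' (5' splice site SNP of intron 1)
    exon2_duplication:  True if the 23 bp duplication is present
    exon6:              'A' or 'C'
    exon10:             'C' or 'T'
    The precedence of the checks is an assumption for this sketch.
    """
    if exon2_duplication:
        return "glutinous"          # wx allele: no functional GBSS
    if intron1 == "T":
        return "low"                # reduced splicing of intron 1
    if exon6 == "C":
        return "intermediate"       # G at intron 1 with C at exon 6
    # G at intron 1 with A at exon 6: exon 10 separates the two high classes
    return "high II" if exon10 == "C" else "high I"


print(predict_aac_class("G", False, "A", "C"))  # high II (haplotype G-1-A-C)
print(predict_aac_class("G", False, "A", "T"))  # high I  (haplotype G-1-A-T)
```

With this ordering, any haplotype carrying T at intron 1 is predicted as low AAC regardless of its exon 6 and exon 10 alleles.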
Based on these four polymorphic sites, all 217 accessions in the USDA rice mini-core collection were assigned to eight Waxy gene haplotypes. G-1-A-C, with 75 accessions (35.6%), was the predominant Wx haplotype in the mini-core; it is associated with the high II AAC class (Teng et al. 2012). Apart from the five previously reported haplotypes, three new haplotypes, T-1-C-C, T-1-A-T and T-1-C-T, were found in 17 accessions. According to previous knowledge, the T SNP in Waxy intron 1 leads to a low AAC phenotype; hence these 17 accessions may show low AAC. Nevertheless, besides the Wx gene, other genes also play roles in AAC determination. Among them, a C to G transversion in the SBE3 gene was reported to lead to decreased AAC and an increased RVA profile (Lu et al. 2012). This substitution occurs at the 63rd nucleotide downstream of the termination codon of the OsBEIIb gene. Rather than directly causing an amino acid substitution, it may affect translation activity. In the mini-core, 145 accessions carried the SBE3-G allele.
The question is how Wx alleles interact with the alleles of other starch biosynthesis related genes, for instance with the C/G alleles of the SBE3 gene. In addition, the low AAC class is conventionally further divided into low and very low AAC subclasses (Kumar and Khush, 1987). Can the new Wx haplotypes found in the mini-core distinguish rice genotypes with very low AAC (2-9%) from those with low AAC (10-20%)? In a following study, AAC will be measured, both because it is fundamental information about the mini-core and because the data will allow an association analysis that sheds light on the questions raised here.
Gelatinization temperature (GT) is one of the important indicators of rice eating, cooking and processing quality. Genetically, GT is determined by the starch synthase IIa (SSIIa) gene (Umemoto et al. 2004). Two consecutive SNPs can differentiate rice with low GT from rice with intermediate or high GT with an accuracy of 90%, and PCR-based markers targeting GC/TT have been developed (Bao et al. 2006). This marker can be used to predict the GT of cultivars and applied in marker-assisted selection to improve rice grain quality (Lu et al. 2010). In the mini-core, the GC allele is more frequent (182 accessions, 83.9%); rice with the GC allele shows high or intermediate GT in 90% of cases. The GC allele was also frequent in another survey involving 334 rice breeding lines and 172 rice landraces: in total, 346 rice genotypes (68.38%) carried the GC allele and, notably, only two of the 172 landraces had TT.
Pi54 and Pit are both important genes used in rice breeding programs. The functional markers employed in the present study were designed to target functional polymorphisms of the Pi54 and Pit genes, and their reliability has been validated. In the mini-core, 99 accessions (45.6%) carried the functional allele of the Pi54 gene. By comparison, in another set of diverse rice varieties, 38 out of 105 (36%) carried functional alleles, an allele frequency close to that in our study (Ramkumar et al. 2011). For the Pit gene, only 4 accessions had the resistant allele; this low frequency is not surprising, as in another study the functional Pit allele was found in only 5 of 68 cultivars of the NIAS (National Institute of Agrobiological Sciences, Japan) global rice core collection (Hayashi et al. 2010). Notably, three of the four accessions harboring the functional Pit allele, namely "Warrangal Culture 1252", "Tranoeup Beykher" and "4484", also carried the functional Pi54 allele. With functional markers, individuals with disease resistance alleles can be easily identified, and through gene pyramiding, cultivars with combined alleles are expected to show wider-spectrum resistance.
The resistant allele of the Bph14 gene was not found in the USDA rice mini-core collection, nor was it found in the Chinese rice mini-core collection, which comprises 200 varieties selected from more than 60,000 accessions of Chinese cultivated rice (Zhou et al. 2013). This indicates that the Bph14 resistant allele may not be present in O. sativa.
Grain length helps determine grain appearance and milling quality and affects grain yield. It is therefore an important agronomic trait in rice breeding. For the GS3 C/A SNP, the C allele was more frequent in the mini-core: 151 accessions (69.6%) carried the C allele, whereas only 66 accessions (30.4%) had the A allele. The C allele was also more frequent in another study involving 180 rice genotypes, 142 (79%) of which carried the GS3-C allele; rice with the C allele showed shorter grain length (Fan et al. 2009).
As revealed by Kusaba et al. (2003), a 3.5 kb deletion in the Lgc1 gene results in remarkable suppression of glutelin accumulation in rice. For breeding low glutelin rice, low glutelin content mutants generated by physical or chemical mutagenesis have conventionally been an important genetic resource (Iida et al. 1993, 1997; Qu et al. 2002). In the present study, a total of 58 accessions (26.7%) with the 3.5 kb deletion in the Lgc1 gene were identified in the USDA rice mini-core collection. It is conceivable that new and suitable parent materials for breeding low glutelin rice varieties could be found among these 58 accessions.

Conclusions
In conclusion, all 217 accessions of the USDA rice mini-core collection were genotyped using twelve functional markers for eight agronomic traits. We hope the results of this study will help breeders select parental materials with desirable allele/gene combinations and phenotypes among mini-core accessions for rice breeding programs. In addition, the Wx gene polymorphisms identified in this study could be included in an association analysis with amylose content variation, in order to develop a powerful marker combination that can differentiate rice cultivars belonging to different AAC classes.