A Genetic Resource for Rice Improvement: Introgression Library of Agronomic Traits for All AA Genome Species in Genus Oryza

Background: Rice improvement depends on the availability of genetic variation, and AA genome Oryza species are a natural reservoir of favorable genes useful for rice breeding. Introgression libraries developed from multiple AA genome species have rarely been reported. Results: In this study, to systematically evaluate and utilize potentially valuable QTLs/genes and allelic variations, 6372 introgression lines (ILs) were developed, based on the evaluation and selection of agronomic traits, by crossing 330 accessions of 7 AA genome species as donor parents with three elite cultivars of O. sativa, Dianjingyou 1, Yundao 1 and RD23, as recurrent parents. Further, twenty-six, twenty-six and nineteen loci were detected in multiple donors using 1,401 ILs in the Dianjingyou 1 background for grain length, grain width, and the ratio of grain length to grain width, respectively. Interestingly, ten loci had opposite effects on grain length in different donors, and the same was true for grain width. Moreover, one locus for grain width, qGW3.1, was validated using a segregating population derived from the O. glumaepatula donor. Conclusions: This introgression library provides a powerful resource for future rice improvement and genetic dissection of allelic variations. Selection of favorable alleles present in wild relatives is a driving force of rice domestication.

Introgressive hybridization from early japonica to proto-indica and proto-aus resulted in the indica and aus subspecies (Choi et al. 2017). In this study, in order to explore and utilize relatives of cultivated rice for rice improvement, we systematically introduced foreign segments from eight different AA genome species (O. longistaminata, O. barthii, O. glumaepatula, O. meridionalis, O. nivara, O. rufipogon, O. glaberrima and upland rice of O. sativa) into three elite, highly productive O. sativa varieties (Dianjingyou 1, Yundao 1, RD23) to provide introgression lines (ILs) as the basis for QTL mapping, discovery of genes and allelic variations, and rice breeding. Of the 6372 ILs, 1,401 in the Dianjingyou 1 background were used for genotyping and for discovering novel alleles for grain size. Two QTLs for grain width were identified in the BC4F2 and BC4F3 populations. ... (Table S1). Three elite varieties, Dianjingyou 1, Yundao 1 and RD23, were used as the recurrent parents.
Three hundred and twenty-nine donor accessions (all except 1 accession of O. longistaminata) were crossed as male parents with the two recurrent parents Dianjingyou 1 and Yundao 1 as female parents, followed by successive backcrossing and selfing to generate ILs. The F1 plants were used as female parents and backcrossed to their respective recurrent parents to produce the BC1F1 generation. More than 200 BC1F1 seeds were generated for each combination. Plants that headed too early or too late were discarded, and about 30 plants from each cross combination were then selected for backcrossing to the recurrent parents, yielding about 200 BC2F1 seeds. From each BC2F1 progeny, individuals that showed significant agronomic differences from the recurrent parents were selected for further backcrossing or selfing. After 2-3 rounds of backcrossing and 2-7 rounds of selfing, progeny with stable target traits were developed as ILs.
The ILs derived from the cross between 1 accession of O. longistaminata as the donor parent and RD23 as the recurrent parent were generated following the same procedure.
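As a rough guide to the backcrossing scheme described above, the expected donor share of the genome halves with each backcross: an F1 carries 1/2 donor genome, and a BCnF1 carries on average (1/2)^(n+1). The sketch below computes this textbook expectation for the 2-3 backcrosses used here. It assumes no selection, whereas the ILs in this study were selected phenotypically in each generation, so actual donor proportions may deviate. The function name is illustrative and not part of the original work.

```python
from fractions import Fraction


def expected_donor_fraction(n_backcrosses: int) -> Fraction:
    """Expected donor genome share after n backcrosses, assuming no selection.

    n = 0 corresponds to the F1 (1/2 donor); each backcross halves the share.
    Subsequent selfing does not change this expectation, only heterozygosity.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return Fraction(1, 2) ** (n_backcrosses + 1)


# The study used 2-3 rounds of backcrossing.
for n in (2, 3):
    share = expected_donor_fraction(n)
    print(f"BC{n}: expected donor genome = {float(share):.2%}")
# BC2: expected donor genome = 12.50%
# BC3: expected donor genome = 6.25%
```

Exact fractions are used so that the expectation is not blurred by floating-point rounding.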

Agronomic traits evaluation
Each introgression line was planted in a randomized complete block design in two environments. Agronomic traits were recorded from five randomly chosen plants in each of three replications. Border plants were excluded from phenotypic evaluation.
To measure grain size, seed was collected from the primary panicle and stored at room temperature for at least 3 months before testing. Twenty grains from each plant were used to measure grain width (GW). Grains of each individual were photographed under a stereomicroscope, and grain width was then measured with ImageJ software. The average width of the 20 grains was used as the phenotypic value. The 1,000-grain weight was measured by weighing fertile, fully mature grains from five panicles. Prostrate growth habit was assessed by the tiller angle at three main stages: booting, heading and grain filling.
For spreading panicle, the angles of the primary and secondary branches were observed. Erect or drooping panicle was evaluated according to the angle between the line connecting the panicle pedestal with the panicle tip and the extension line of the stem. Simply inherited traits (awn, pericarp color and kernel color) were observed directly in the field.
Tiller number was recorded from five random plants. Spikelets per panicle (SPP) was calculated as the total number of spikelets of the whole plant divided by its total number of panicles. Aerobic adaptation was evaluated by the differences in biomass, yield, harvest index, heading date and plant height between upland and irrigated environments. Plant height was measured from ground level to the tip of the tallest panicle.
To evaluate blast resistance, introgression lines were inoculated with Magnaporthe oryzae 3 weeks after sowing by spraying with a conidial suspension. After seven days, lesion types on rice leaves were observed and scored according to a standard reference scale based on the dominant lesion type.

DNA extraction and PCR protocol
The experimental procedure for DNA extraction was performed as previously described ... (Figure 1A, Table S5). In addition, 265 ILs were derived from the cross between 1 accession of O. longistaminata as the donor and the indica variety RD23 as the recurrent parent (Figure 1D, Table S5). Thus, the introgression library derived from multiple donors in different backgrounds showed abundant genetic variation.
For the same donor parent, phenotypic variation in agronomic traits differed significantly between the Dianjingyou 1 and Yundao 1 backgrounds. More ILs showing erect panicle, dense panicle, lax panicle, awn, plant height, pericarp color, 1,000-grain weight, drought resistance and aerobic adaptation were found in the Yundao 1 background than in the Dianjingyou 1 background, whereas fewer ILs exhibiting spreading panicle, prostrate growth, kernel color, glabrous hull, grain length and grain width were found in the Yundao 1 background than in the Dianjingyou 1 background (Figure 1B). This suggests that the expression of target traits depends on the background of the recurrent parent. Developing introgression libraries in different backgrounds will help express hidden genes from the donor parents and uncover more genetic variation for further study.

Characteristics of chromosome substituted segments in the introgression library
A total of 168 SSR markers distributed on the 12 chromosomes were selected to genotype the introgression library in the Dianjingyou 1 background. The interval between two adjacent markers ranged from 0.2 to 5.5 Mb, with an average of 2.22 Mb on the rice physical map (Figure 2; Table 1). In addition, the distribution of introgression segments on the 12 chromosomes differed among the 7 AA genome donor species. More introgression segments were found on chromosome 3 than on the other chromosomes (Figure S1-S7), which may be related to chromosome structure and location. We also found that introgression segments from the upland rice donor on chromosome 1 were detected in almost all ILs, possibly because this donor segment was tightly linked with agronomic traits under selection pressure (Figure S7).
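The marker-interval summary reported above (minimum, maximum and mean distance between neighboring markers) can be derived from physical positions as in the following minimal sketch. The marker positions shown are hypothetical placeholders, not the 168 SSR markers used in this study, and the function name is an assumption made for illustration.

```python
from statistics import mean


def marker_intervals(positions_by_chrom):
    """Return the lengths (Mb) of intervals between adjacent markers.

    positions_by_chrom maps a chromosome name to a list of marker positions
    in Mb. Intervals are computed within each chromosome only.
    """
    intervals = []
    for positions in positions_by_chrom.values():
        ordered = sorted(positions)
        intervals.extend(b - a for a, b in zip(ordered, ordered[1:]))
    return intervals


# Hypothetical positions in Mb, for illustration only.
example = {"chr1": [0.5, 2.5, 5.0], "chr2": [1.0, 1.5]}
lengths = marker_intervals(example)
print(min(lengths), max(lengths), round(mean(lengths), 2))
# 0.5 2.5 1.67
```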

Detection of allelic variations for grain size in the introgression library
Seed size plays an important role in rice yield (Xing and Zhang, 2010). Seed size not only determines seed appearance, but also affects the milling, cooking and eating quality of rice (Fan et al., 2006). Significant variation was observed for grain length (GL), grain width (GW) and the ratio of grain length to grain width (RLW) in the introgression library with multiple donors in the Dianjingyou 1 background. Some ILs were significantly superior to the recurrent parent Dianjingyou 1 for GL, GW and RLW. For grain length, 133 and 125 ILs were significantly longer than Dianjingyou 1 in the two seasons, respectively. For grain width, 412 and 508 ILs were significantly wider than the recurrent parent in the two environments. For the ratio of grain length to grain width, 277 and 178 ILs had a higher ratio than Dianjingyou 1 in the two seasons, respectively (Figure 3). These results suggest that abundant genetic variation for grain size exists in the wild and cultivated accessions of rice.
To explore favorable allelic variation for grain size, potential loci were detected based on the genotype and phenotype data. Forty-one loci linked with grain length, forty-four loci linked with grain width and thirty-two loci linked with the ratio of grain length to grain width were identified in both seasons, indicating that an abundant gene pool for grain size exists in the AA genome species (Figure 4-6). Among these, 26 loci for grain length were detected from multiple donors: 12, 11, 2 and 1 loci were detected from donors of two, three, four and six species, respectively. This suggests that the same locus contributing to grain length may carry allelic variation from different donors. Moreover, 4 loci from different donors were responsible only for long grain, 12 loci derived from multiple donors contributed only to short grain, and 10 loci from different species controlled both long grain and short grain (Figure ...). Potential allelic genes were detected in the different donors, suggesting that some loci for grain size are conserved in the genus Oryza. Taken together, these results indicate whether the loci for grain size from different donors represent the same or different haplotypes; they also show that the IL library with donors from 7 AA genome species is an excellent resource and tool for discovering favorable allelic variations and new QTLs/genes for rice improvement.
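The grouping of multi-donor loci into "long grain only", "short grain only" and "both directions" can be illustrated with a small sketch. It assumes each locus has a signed effect per donor species relative to the recurrent parent (positive meaning longer grain); the locus names and effect values below are hypothetical and are not data from this study.

```python
from collections import Counter


def classify_locus(effects_by_donor):
    """Classify one locus from the direction of its effect in each donor.

    effects_by_donor maps a donor species name to a signed effect on grain
    length relative to the recurrent parent (positive = longer grain).
    """
    longer = any(e > 0 for e in effects_by_donor.values())
    shorter = any(e < 0 for e in effects_by_donor.values())
    if longer and shorter:
        return "both"
    if longer:
        return "long only"
    if shorter:
        return "short only"
    return "no effect"


# Hypothetical loci and effect values, for illustration only.
loci = {
    "locus_A": {"O. nivara": 0.3, "O. barthii": 0.2},
    "locus_B": {"O. glaberrima": -0.4, "O. rufipogon": -0.1},
    "locus_C": {"O. glumaepatula": 0.5, "O. meridionalis": -0.2},
}

# Keep loci detected in at least two donor species, then tally the classes.
multi_donor = {name: eff for name, eff in loci.items() if len(eff) >= 2}
summary = Counter(classify_locus(eff) for eff in multi_donor.values())
print(dict(summary))
# {'long only': 1, 'short only': 1, 'both': 1}
```

Applied to the 26 multi-donor loci for grain length, a tally of this kind would correspond to the 4, 12 and 10 loci reported in the text.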

Confirmation of loci for grain width
To confirm the loci for grain width identified above, a BC3F7 IL derived from the cross between O. glumaepatula (IRGC100184) as the donor parent and the temperate japonica variety Dianjingyou 1 was used; grain width differed highly significantly between this IL and Dianjingyou 1 (Figure 7A). In the BC4F2 population, phenotypic values of grain width showed a continuous distribution, skewed toward the large-value parent, the IL (Figure 7B). These results indicated that grain width was controlled by quantitative trait loci. This population was used to confirm the locus for grain width.
To validate the QTL for grain width, 443 SSR markers distributed on the 12 chromosomes were again used to screen for polymorphism between Dianjingyou 1 and the IL. A total of 22 polymorphic markers, including 15 on chromosome 3 and 7 on chromosome 6, were then selected to genotype the BC4F2 population. Based on the phenotype and genotype data, two QTLs for GW were detected in the BC4F2 population and validated in the BC4F3 population. qGW3.1 was identified in the region between RM186 and RM416, and it explained 3% and 10% of the phenotypic variation with additive effects of -0.06 mm and -0.02 mm in the two generations, respectively. Another QTL for GW, qGW6.1, was detected in the region between RM253 and RM19623 on chromosome 6, explaining 10% and 15% of the total phenotypic variation with additive effects of -0.06 mm and -0.05 mm, respectively (Table 3). Thus, the location of qGW3.1 was consistent with the locus for thin grain from O. glumaepatula, indicating that exploring favorable allelic variation using multiple ILs is reliable. Moreover, a novel QTL, qGW6.1, was found only in the segregating population, suggesting that exploring new genes or QTLs using segregating populations can provide additional information.
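To make the reported quantities concrete, the sketch below shows one common way an additive effect and the proportion of phenotypic variation explained can be estimated at a single marker. The source does not state which mapping software or statistical model was used, so this single-marker regression is an assumption for illustration only. The genotype coding (0 = recurrent-parent homozygote, 1 = heterozygote, 2 = donor homozygote), the sign convention (negative = donor allele reduces grain width) and the data are all hypothetical.

```python
from statistics import mean


def additive_effect(pheno, geno):
    """Half the difference between donor and recurrent homozygote means."""
    donor = [p for p, g in zip(pheno, geno) if g == 2]
    recurrent = [p for p, g in zip(pheno, geno) if g == 0]
    return (mean(donor) - mean(recurrent)) / 2


def variance_explained(pheno, geno):
    """R^2 from a simple linear regression of phenotype on genotype code."""
    mg, mp = mean(geno), mean(pheno)
    sxy = sum((g - mg) * (p - mp) for g, p in zip(geno, pheno))
    sxx = sum((g - mg) ** 2 for g in geno)
    syy = sum((p - mp) ** 2 for p in pheno)
    return sxy ** 2 / (sxx * syy)


# Hypothetical grain widths (mm) and marker genotypes for six plants.
geno = [0, 0, 1, 1, 2, 2]
gw = [3.4, 3.5, 3.35, 3.45, 3.3, 3.4]

print(f"additive effect = {additive_effect(gw, geno):.2f} mm")
print(f"variation explained = {variance_explained(gw, geno):.0%}")
# additive effect = -0.05 mm
# variation explained = 40%
```

Interval-based methods that use flanking markers, as implied by the RM186-RM416 and RM253-RM19623 intervals, would generally give more precise estimates than this single-marker version.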

Discussion
Exploration of natural allelic variations from AA genome species
Genetic diversity and alleles were lost during domestication from wild rice species to cultivated rice (Sun et al. 2002), and the resulting narrow genetic basis has led to a yield bottleneck in Asian cultivated rice. Exploration and utilization of favorable alleles in AA genome species is therefore an accessible approach to improving rice breeding. In recent years, mining and utilization of useful alleles have made great progress in rice breeding. For example, allelic variation in the Wx gene and SSSI was shown to contribute greatly to the differences in eating and cooking qualities (ECQs) between the two subspecies. Allelic variation at the E1/Ghd7 locus allowed expansion of the rice cultivation area through adjustment of heading date (Saito et al. 2019). ... introgression lines were derived from a single accession of an AA genome species in a single background, leading to the loss of some valuable allelic information. In this study, the introgression library with multiple donors from different relatives of Asian cultivated rice is a powerful resource platform for discovering novel and functional alleles of QTLs/genes. One locus for grain length and one locus for grain width were detected from six and five different donor species, respectively. Two loci for grain length, three loci for grain width and one locus for the ratio of grain length to grain width were detected from donors of four species, respectively (Figure 4 ... 2) Target genes/QTLs for the same phenotype could be validated by different donors, which would indicate that these target genes/QTLs may share the same haplotype; 3) genes or QTLs responsible for opposite phenotypes, for example long grain and short grain, could also be confirmed using different populations from multiple donors, and these may represent different haplotypes; 4) introduction of new genetic variation and selection of favorable alleles from wild relatives could speed up the fixation of useful genes. Therefore, constructing an IL library is not only a breeding method but also a domestication force. Thus, these IL libraries will help improve rice breeding, the discovery and utilization of genes of interest, and the understanding of rice domestication.
Introgression library for QTL mapping and cloning
Introgression lines harboring one or more donor chromosome segments exhibit distinct traits compared with the recurrent parent. Because the background of introgression lines is similar to that of the recurrent parent, making it easier to correlate a particular chromosomal region with phenotypic variation, they have been used for mapping and cloning QTLs or genes for complex traits. Many QTLs or genes have been mapped based on introgression lines, such as GS2 for grain size and weight (Hu et ...
Hybrid sterility is the major barrier to developing interspecific introgression libraries, which are also an ideal model for studying the relationship between reproductive isolation and speciation
The major barrier is hybrid sterility, which manifests as complete or partial pollen sterility and/or spikelet sterility in crosses between AA genome species and O. sativa. We observed that pollen fertility of F1 varied from 1 ... Although reproductive isolation, especially hybrid sterility, existed in the hybrids between Asian cultivated rice and AA genome species, direct crosses and backcrosses could be made to develop the introgression library. Thus, the introgression library from AA genome species is an ideal model for studying the relationship between reproductive isolation and the divergence of AA genome species. Using this resource platform, we identified a series of QTLs or genes for interspecific hybrid sterility, including S1, S29(t), S37(t), S38(t), S39(t), S40, S44(t), S51(t), S52(t), S53(t), S54(t), S55 ... (Table S3-S5). Thus, interspecific hybridization could be an important mechanism for rice species diversity and for increasing adaptability to different environments, which is a valuable resource for meeting the challenges of rice breeding.
... (an indica variety) as the recurrent parents, respectively. Forty-one loci for grain length, forty-four loci for grain width and thirty-two loci for the ratio of grain length to grain width were identified, and twenty-six, twenty-six and nineteen loci were detected in multiple donors for grain length, grain width, and the ratio of grain length to grain width, respectively. Interestingly, ten loci conferred long grain in some donors and short grain in other donors, and ten loci controlled both wide grain and thin grain in different donors. One locus for grain width, qGW3.