QTL Mapping of Novel Genomic Regions for Yield Related Traits in Doubled Haploid (DH) Population Derived from the Popular Rice Hybrid KRH-2

Background: Rice, being the principal food crop and major nutritional source for more than half of the global population, is also an important source of livelihood in many South and South-East Asian countries. Amidst diminishing natural resources and many biotic and abiotic stresses, increasing the yield of rice varieties remains a challenging task. Identification of novel, yield-augmenting alleles from stable rice hybrids is crucial to facilitate their marker-assisted transfer into various genetic backgrounds. Results: Quantitative trait loci (QTL) mapping using a population of 125 doubled haploid (DH) lines developed from the cross IR58025A/KMR3R and 126 polymorphic SSR and EST-derived SSR markers led to the identification of 12 major effect and 12 minor effect QTLs for yield related traits. Major effect QTLs were detected for the traits days to fifty percent flowering, test (1,000) grain weight, plant height, panicle weight, panicle length, flag leaf width, flag leaf length, biomass and total grain yield/plant, explaining phenotypic variability in the range of 29.95%-56.75%. QTL hotspots were detected on chromosome 3 for panicle length and total grain yield/plant, and on chromosome 6 for panicle length, flag leaf length and total grain yield/plant. Though many of these QTLs co-localized with QTL regions reported in earlier studies, five novel major effect QTLs for panicle length, biomass, flag leaf width, panicle weight and plant height, and three novel minor effect QTLs for panicle weight and fertile grains per panicle, were identified in this study. Conclusions: Through this study, novel major and minor effect QTLs for crucial yield related traits, viz., fertile grains per panicle, panicle length and panicle weight, were identified.
Further, the QTL hotspots identified on two different chromosomes for flag leaf length, panicle length and total grain yield/plant should not only help in understanding the underlying genetic mechanisms of yield regulation but also provide insight into the genetic synchrony among the various yield related traits in contributing to yield heterosis. The identified QTL hotspots, after their validation, can be deployed in [...]

[...] their application in QTL mapping for heterosis and yield associated QTLs is limited. In the present study, using a doubled haploid (DH) population developed from the cross IR58025A/KMR3R and by employing SSR and EST-derived SSR markers, new regions in the rice genome associated with yield and its component traits have been mapped.
A total of 14 QTLs for various traits, namely DFF, YLD, FGP, PW, FLL, FLW and BM, were contributed by IR58025A. Likewise, a total of 10 QTLs for the traits YLD, TGW, PW, PH, PL and FGP were contributed by KMR-3R. As pointed out by Marathi et al. (2012), these observations point to the phenotypic and allelic variation existing among the parental lines and to the possibility of combining these QTLs in a single parent for significant enhancement of yield and heterosis levels.

[...] specific for the minor and major fertility restoration loci, respectively [84].

Therefore, only those QTLs which demonstrate consistent expression across various environments, with a lower predominance of genotype (G) × environment (E) interaction, are important from a plant breeding perspective [33,34].
Genetic architecture of rice yield involves a study of the yield related trait components which influence its expression. Moreover, understanding the various modes of gene action which contribute to the expression of the yield trait is also a crucial part of genetic architecture. Of the various agro-morphological traits that influence the genetic architecture of rice yield, two traits, namely the number of productive tillers or tillering number (TN) and panicle morphology, are known to have a higher influence [35] and share a common developmental biology [36]. As explained by [36], apical growth and branching regulate the tiller number (TN) and panicle architecture. The first group of genes, i.e. the monoculm (MOC) genes [39], was observed to regulate axillary bud formation, thus influencing the TN and yield traits. The second group of genes, the tillering dwarf genes, was identified to effectively regulate the tillering number [41,42]. The third group of genes was observed to influence the TN trait by regulating the plant hormonal pathway [43,44,45]. Other genetic mechanisms which regulate the TN are as follows: negative regulation of axillary bud growth [46] and positive regulation of TN through enhanced photosynthesis [47]. Jiang et al. [48] identified four candidate genes which were assumed to have a role in influencing the variation in tiller number and are awaiting validation. Pleiotropic interactions between two traits, viz., yield and days to fifty percent flowering, were determined by [49] using a rice MAGIC population.
Though there are several reports on the development of immortal populations from elite rice hybrids for QTL mapping [50,51], the deployment of doubled haploid (DH) populations developed from elite hybrids is limited [52,53]. Considering the above-mentioned points, the present investigation was undertaken with the objectives of developing a doubled haploid (DH) population from the cross IR58025A/KMR3R and identifying QTLs for yield and its related traits in this population by employing hyper-variable SSR markers. Further, efforts were also made to identify the architecture of novel QTLs.

Statistical analysis of agro-morphological trait performance of DHLs
The frequency distribution of the key agro-morphological traits indicates their normal distribution for three consecutive seasons (Figure 1). Table 1 [...] Table 4). At the 5% level of significance, the traits YLD and FGP were observed to be positively correlated (r = 0.04). A negative correlation was noted between YLD and DFF (r = -0.11) and between YLD and FLL (r = -0.06) at the 5% level of significance. Moreover, a positive genotypic correlation was also observed between PL and the traits GP (r = 0.15), TGW (r = 0.24), PW (r = 0.59), PH (r = 0.22), FLW (r = 0.14) and PT (r = 0.27) at the 1% level of significance. Genetic variability estimates (Supplementary Table 5) demonstrated a very high level of genotypic variance (Vg) and phenotypic variance (Vp) for the traits FGP (2750 and 3088, respectively) and GP (2517 and 2950, respectively). Among all the traits, the lowest Vg and Vp values were observed for FLW (0.02 and 0.14, respectively). For all the agro-morphological traits, the phenotypic variance was higher than the genotypic variance. High genotypic coefficient of variation (GCV) and phenotypic coefficient of variation (PCV) values (>20%) were observed for the traits PW, PL, BM and YLD, among which the highest GCV and PCV values were observed for BM (37.49 and 38.36, respectively). Moderate GCV and PCV values (10%-20%) were observed for the traits GP, FGP, TGW, PH, FLL, FLW and PT. Low GCV and PCV values (less than 10%) were observed for DFF. Broad sense heritability (H²) ranged from 16.80% (FLW) to 95.52% (BM). The highest genetic advance (GA) value was observed for FGP (101.94), whereas the lowest value of 0.13 was associated with FLW. The genetic advance as percent of mean (GAM) ranged from 6.41 (DFF) to 75.49 (BM).
The DH population demonstrated significant statistical variability for the traits under study; hence, the population can be considered suitable for QTL mapping analysis [50].
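The variability parameters reported above can be related to one another through formulas commonly used in quantitative genetics. The following is a minimal sketch only: the text does not state which formulas or selection intensity were applied, so the expressions, the constant k = 2.06 (5% selection intensity) and the function name are assumptions made for illustration. Under these assumptions, the Vg and Vp values reported for FGP give a genetic advance of about 101.94, in line with the reported value.

```python
import math


def variability_parameters(vg, vp, mean=None, k=2.06):
    """Illustrative (assumed) formulas for broad-sense heritability and genetic advance.

    vg, vp : genotypic and phenotypic variance
    mean   : trait mean, needed only for GCV, PCV and GAM
    k      : selection differential; 2.06 (5% selection intensity) is an assumption
    """
    h2 = vg / vp                      # broad-sense heritability as a fraction
    ga = k * math.sqrt(vp) * h2       # expected genetic advance
    result = {"H2_pct": 100 * h2, "GA": ga}
    if mean is not None:
        result["GCV_pct"] = 100 * math.sqrt(vg) / mean
        result["PCV_pct"] = 100 * math.sqrt(vp) / mean
        result["GAM_pct"] = 100 * ga / mean
    return result


# Vg and Vp for fertile grains per panicle (FGP) as reported in the text
fgp = variability_parameters(vg=2750, vp=3088)
print(round(fgp["H2_pct"], 2), round(fgp["GA"], 2))
```

GCV, PCV and GAM need the trait mean, which is not given in the text, so they are computed only when a mean is supplied.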

QTL identification with SSR markers
Using a set of 126 hyper-variable SSR and EST-derived SSR markers (Supplementary Table 6), a total of 12 major effect and 12 minor effect QTLs were detected among the 125 DHLs for all the traits except total number of grains per panicle (GP) and number of productive tillers (PT) (Additional file 1; Table 1 and Supplementary Table 7).
As per [54], a QTL with a PVE greater than 20% is considered to have a major effect. Twelve major effect QTLs were detected for nine traits, viz., days to fifty percent flowering (DFF), test (1,000) grain weight (TGW), biomass (BM), panicle weight (PW), panicle length (PL), flag leaf length (FLL), flag leaf width (FLW), plant height (PH) and total grain yield/plant (YLD), on chromosomes 3, 4, 6, 7, 9 and 12. The LOD scores of these QTLs were in the range of 2.70-16.51, with PVE% between 29.95% and 56.75%. With 126 hyper-variable SSRs scored on 125 DHLs, 15,750 data-points were expected. Of these, 652 data-points (4.13%) did not amplify and were therefore treated as missing data.
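The major/minor distinction used here reduces to a simple threshold rule. The sketch below assumes the 20% PVE cut-off of [54] together with the minimum LOD of 2.5 given in the mapping procedure; the function and constant names are hypothetical, and the LOD/PVE pairs are the values reported in the text for five of the major effect QTLs.

```python
MAJOR_PVE_THRESHOLD = 20.0   # percent, cut-off attributed to [54] in the text
MIN_LOD = 2.5                # minimum LOD threshold stated in the mapping procedure


def classify_qtl(lod, pve_percent):
    """Label a QTL as major or minor effect; names and labels are illustrative."""
    if lod < MIN_LOD:
        return "below threshold"
    return "major" if pve_percent > MAJOR_PVE_THRESHOLD else "minor"


# (LOD, PVE%) pairs as reported in the text
reported = {
    "qDFF12-1": (3.64, 48.60),
    "qYLD3-1": (16.51, 56.75),
    "qYLD6-1": (14.36, 35.29),
    "qTGW7-1": (3.06, 29.95),
    "qFLW4-1": (3.29, 48.22),
}
labels = {name: classify_qtl(lod, pve) for name, (lod, pve) in reported.items()}
print(labels)
```

A QTL at the lower end of the ranges reported for the FGP loci (LOD 2.66, PVE 9.06%) would be labelled minor by the same rule.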

Days to fifty percent flowering (DFF):
A major effect QTL, qDFF12-1, was identified with a LOD score of 3.64 and a PVE value of 48.60%. This QTL, 5.01 cM in size, was flanked by the markers RM27966 (46.68 cM) and RM235 (51.69 cM). Of the two, RM27966 was the closest marker; the additive effect value was 9.77 and the QTL was inherited from IR58025A (Additional file 1; Table 1, Figure 2). The cumulative effect (RSq value) of the major and minor effect QTLs is shown in Additional file 1; Table 1 and Supplementary Table 7, respectively.

Total grain yield per plant (YLD):
Two major effect QTLs, qYLD3-1 and qYLD6-1, were identified with LOD scores of 16.51 and 14.36 and PVE% of 56.75% and 35.29%, respectively. qYLD3-1, with a size of 19.71 cM, was flanked by the markers RM448 (93.41 cM) and RM15679 (113.12 cM) and was closely associated with RM15679. qYLD6-1, with a size of 15.15 cM, was flanked by RM7023 (53.95 cM) and RM586 (69.10 cM), with RM7023 being the closely associated marker. The negative additive effect value of -7.81 for qYLD3-1 indicated that the favorable allele was inherited from KMR-3R (Table 1, [...]).

Test (1,000) grain weight (TGW):
Two major QTLs, qTGW6-1 and qTGW7-1, were identified for this trait. The first major QTL, qTGW6-1, was flanked by RM19410 [...] (Figure 2). qTGW7-1 was located between the markers RM20948 (167.52 cM) and RM21649 (172.53 cM), with RM21649 identified as the closely associated marker. The LOD score and PVE% of this QTL were 3.06 and 29.95%, respectively. This QTL, 5.01 cM in size, had a negative additive effect value of -1.24, indicating its inheritance from KMR-3R (Additional file 1; Table 1, Figure 2). The cumulative effect of both these QTLs on the phenotype accounted for up to 72.94% (Additional file 1; [...] Figure 2).

Flag leaf width (FLW):
qFLW4-1, a novel major effect QTL with a LOD score of 3.29 and a PVE% of 48.22%, was identified between the flanking markers RM252 (7.96 cM) and RM3524 (10.46 cM), with RM3524 being the closest to the QTL. This 2.5 cM QTL was inherited from IR58025A with an additive effect value of 0.17, accounting for up to 48.22% of the cumulative phenotypic variance (RSq) (Additional file 1; Table 1, Figure 2).

Panicle length (PL):
Two major effect QTLs, qPL3-1 and qPL6-1, were identified with LOD scores of 16.51 and 14.36 and PVE% of 56.75% and 35.29%, respectively. The 19.71 cM QTL, qPL3-1, was flanked by RM448 (93.41 cM) and RM15679 (113.12 cM), with RM15679 being the closely associated marker. A negative additive effect value of -7.80 indicated that the favorable allele of this QTL was inherited from KMR-3R. qPL6-1 (15.15 cM in size) was flanked by RM7023 (53.95 cM) and RM586 (69.10 cM), with RM7023 being the closest marker. The additive effect value of 7.43 indicated that this QTL was inherited from IR58025A. The cumulative phenotypic variance (RSq) of both these QTLs was 97.26%, indicating that they are major QTLs (Additional file 1; Table 1, Figure [...]).
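The interval sizes and parental origins quoted for the individual QTLs follow directly from the flanking marker positions and the sign of the additive effect. The sketch below assumes, as the descriptions above suggest, that a positive additive effect indicates the IR58025A allele and a negative one the KMR-3R allele; the function name and the returned field names are hypothetical.

```python
def describe_qtl(name, left_cm, right_cm, additive_effect):
    """Interval size (cM) and inferred donor parent for one QTL (illustrative helper)."""
    size_cm = round(right_cm - left_cm, 2)
    if additive_effect > 0:
        donor = "IR58025A"      # assumed sign convention, taken from the text
    elif additive_effect < 0:
        donor = "KMR-3R"
    else:
        donor = "undetermined"
    return {"qtl": name, "size_cM": size_cm, "donor": donor}


# Marker positions and additive effects as reported in the text
examples = [
    describe_qtl("qDFF12-1", 46.68, 51.69, 9.77),
    describe_qtl("qPL3-1", 93.41, 113.12, -7.80),
    describe_qtl("qPL6-1", 53.95, 69.10, 7.43),
]
for row in examples:
    print(row)
```

The computed sizes (5.01, 19.71 and 15.15 cM) agree with the interval sizes stated for these QTLs.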

Biomass (BM):
A novel biomass QTL, qBM4-1, (11.81 [...] Figure 2, Supplementary Figure 3a-II). The details of co-localization of the major effect QTLs with earlier studies are presented in Additional file 1; Table 2.

A total of 12 minor effect QTLs were identified for the following traits: fertile grains per panicle (FGP), panicle weight (PW) and panicle length (PL). For the trait FGP, a total of nine QTLs, namely qFGP4-1, qFGP5-1, qFGP6-1, qFGP6-2, qFGP8-1, qFGP8-2, qFGP8-3, qFGP9-1 and qFGP12-1, were identified, with LOD scores in the range of 2.66-4.25 and PVE% in the range of 9.06%-9.78%. The cumulative phenotypic effect (RSq) of all these QTLs on the trait accounted for up to 45 [...]

Epistatic interactions of QTLs with LOD peak > 2.5 and PVE% > 20% can be considered significant and major effect in nature. Significant epistatic interactions were observed between QTLs located on chromosomes 2, 3, 4, 6 and 11 controlling three traits, namely days to fifty percent flowering (DFF), panicle weight (PW) and flag leaf length (FLL) (Additional file 1; Table 3, Supplementary Figure 2). For the trait DFF, an epistatic interaction was observed between the QTLs located on chromosomes 11 and 12, with a LOD score of 6.63 and PVE% of up to 79.58%. The RSq value of this significant epistatic interaction was up to 51.02%. Similarly, for the trait panicle weight (PW), complex epistatic interactions were observed, as shown in Supplementary Figure 2. Among these, three significant epistatic interactions were observed between the QTLs located on chr 2 and 6, chr 6 and 7, and chr 3 and 9, with the total phenotypic variance (RSq) accounting for 46.4%. An epistatic interaction was also observed between QTLs located on chr 4 and 12 for the trait flag leaf length (FLL), with a LOD score of 5.25 and a phenotypic variance (PVE%) value of 78.26%.
As shown in Supplementary Figure 3, among the DHL population, two high yielding DHLs, namely DHL-1 (RP6301-189-17-2) and DHL-2 (RP6301-188-15-47), were the best performing, with 32.13 g and 31.28 g of total grain yield per plant (YLD), respectively. DHL-1 had longer panicles, more productive tillers and medium bold (MB) grain type, whereas DHL-2 had longer, dense panicles with the highly desirable medium slender (MS) grain type.

In-silico analysis of the major effect QTLs
[...] Supplementary Table 10 [...] Table 1).

The first group, comprising 104 individuals, had both fertility restoration loci. The second group, consisting of nine individuals, had the Rf3 locus only. The third group consisted of 11 individuals with the Rf4 locus only, and one DHL was negative for both loci.

Discussion
Large scale cultivation of rice hybrids is one of the feasible options for enhancing rice production amidst plateauing yield levels, diminishing natural resources and an increasing population. It is estimated that through the adoption of this technology, rice productivity could be increased by 35%-40% [55]. In India, out of a total of 44.15 million hectares under rice cultivation, hybrid rice occupies less than three million hectares [4]. The limited adoption of hybrid rice in India can be attributed primarily to the lower yield advantage of these hybrids over conventional varieties, coupled with the lack of good grain quality features in rice hybrids [5,56]. Therefore, in order to popularize the hybrids, newly released hybrids need at least a 20-30% yield advantage over the existing varieties, along with appropriate supporting policies for their better adoption [5,57,58]. As opined by [7], development of such highly heterotic [...]

Correlation is an indirect tool of selection that gives a deeper understanding of the degree and direction of selection.

ICIM based QTL mapping
Thomson et al. (2003) [71], in order to address the lack of reliability and consistency of identified QTLs (particularly when the same population was employed for QTL mapping across various environments), suggested the inclusion of an empirical threshold value for consistent and robust QTL identification, along with the consideration of data generated through multiple replicates. Keeping this in view, the agro-morphological data for yield and its allied traits were collected across different seasons with at least ten plants of each DHL entry.
Consistency in the QTLs was confirmed by comparing the genomic regions of the QTLs identified. Out of the 24 QTLs identified, 16 had been reported by previous research groups using various types of populations and different types of markers for QTL mapping. A total of eight novel QTLs, including five major effect QTLs associated with panicle weight (qPW9-1), panicle length (qPL3-1), flag leaf width (qFLW4-1), plant height (qPH12-1) and biomass (qBM4-1), and three minor effect QTLs for panicle weight (qPW3-1 and qPW8-1) and fertile grains per panicle (qFGP5-1), were identified.
At least one QTL for each trait was identified in this study, except for the traits GP and PT. It was interesting to observe that the majority (58.33%) of the identified QTLs were from the IR58025A parent.

Major effect QTLs co-localized with those from earlier reports
Seven out of the 12 major effect QTLs identified in this study overlapped with the genomic regions of QTLs mapped in earlier studies. For example, qDFF12-1 (12.16-26.17 Mb) co-localized with the QTL identified by [72,73]. Another major QTL, qYLD3-1 ([...] .87 Mb), co-localized with the yield (GY) QTL identified by [74]. Similarly, qYLD6-1 (6.98 Mb-1.47 Mb) co-localized with the QTL qGY6-1 reported by [75]. The QTL for 1,000 grain weight, qTGW6-1 (2.91 Mb-3.41 Mb), overlapped with the QTL region controlling grain weight reported by [50]. The second major effect QTL for the TGW trait, qTGW7-1 ( [...]

Novel major and minor QTLs identified
Five major effect QTLs for the traits panicle weight (qPW9-1), panicle length (qPL3-1), flag leaf width (qFLW4-1), plant height (qPH12-1) and biomass (qBM4-1) were identified in genomic regions where no other QTLs for the same trait have been reported, and hence can be considered novel. It is pertinent to note that all the novel QTLs are associated with traits of agronomic importance and hence may have significant utility in breeding programs. We are developing QTL-near isogenic lines (QTL-NILs) in order to fine-map the novel major QTLs and identify the causative genes underlying them. In addition to the major QTLs, a total of three minor effect QTLs were identified as novel in this study. Also, since the two high yielding DHLs, namely DHL-1 and DHL-2, were observed to have longer panicles with more productive tillers and dense panicles, these lines may carry the novel major effect QTLs qPL3-1 and qPW9-1.

QTL hotspots
A QTL hotspot was identified in the genomic region on chromosome 3 flanked by RM448 (34.49 Mb) and RM15679 (26.87 Mb), wherein two major effect QTLs, viz., qYLD3-1 and qPL3-1, were located. A part of this QTL hotspot, i.e. the physical position 31.24 Mb, was earlier reported by [74] to possess a major QTL governing the yield (GY) trait. As the two traits PL and GY are positively correlated with each other [56,79], it is possible that a single candidate gene controls both traits. A second QTL hotspot region, flanked by the SSR markers RM7023 (6.98 Mb) and RM586 (1.47 Mb) on chr 6, was observed to govern three traits, namely YLD (qYLD6-1), FLL (qFLL6-1) and PL (qPL6-1), which are functionally interrelated. Concerning the minor effect QTLs, one QTL hotspot on chr 8 at physical position 24.18 Mb was associated with the traits panicle weight and panicle length, which may not be related to each other at the functional level.
The population of 125 DH lines employed in this study could be considered optimal, as some previous research groups, viz., [ [...]

In-silico analysis of novel major effect QTLs
Putative candidate gene(s) could be identified for three major effect QTLs, viz., qPH12-1, qPL3-1 and qBM4-1. For the plant height QTL, qPH12-1, three candidate genes with the RAP-DB locus IDs Os12t0479400-01, Os12t0479400-02 and Os12t0479400-03 were identified at 17.57 Mb on chromosome 12. These three genes were observed to encode different transcription factors, namely transcription factor B3 (IPR003340) and Aux/IAA-ARF-dimerization protein (IPR011525), that are associated with the expression of AUX/IAA protein (IPR003311) (Rice Annotation Project Database (RAP-DB); https://rapdb.dna.affrc.go.jp/). As the plant growth hormones auxin/indole acetic acid (IAA) are well characterized and are known to regulate overall plant architecture, including plant height [82, 83], the putative candidate genes underlying this QTL may be influencing plant height. Productive tillers and the morphology of the panicle influence the total grain yield/plant (YLD) architecture in rice.
The panicle morphology is further influenced by the length and weight of the panicle. The biomass (BM) trait has also been well documented to influence YLD and therefore its genetic architecture. In our study, using the DHL population of 125 individuals and employing 126 hyper-variable SSRs and EST-derived SSRs, the novel QTLs for panicle length (qPL3-1) and biomass (qBM4-1) were assessed for the identification of putative candidate genes. Two putative candidate genes with the RAP-DB locus IDs Os03t0742900-03 and Os03t0742900-01, associated with the panicle length QTL qPL3-1, were also observed to be auxin (AUX)/IAA response proteins that could be controlling panicle length. A putative candidate gene identified for the biomass QTL qBM4-1, with the RAP-DB locus ID Os04t0491700-01, was observed to be a sugar/inositol transporter (IPR003663). As the physiological transport of sugar(s) from source to sink is crucial for plant biomass accumulation, this gene may have a role in augmenting the trait value of the biomass QTL identified in this study. Hence, as the molecular functions of the above-mentioned putative candidate gene(s) are consistent with the traits governed by the newly identified QTLs, it is conjectured that they might have a role in regulating trait expression and therefore in influencing the genetic architecture.
As the selection efficiency of the functional markers specific for Rf4 and Rf3 was observed to be 83.4% and 52.5%, respectively, it can be concluded that the presence of the Rf4 + Rf3 loci or of the Rf4 locus alone would impart the line with complete restoration potential. However, the presence of only the Rf3 locus would confer partial restoration capability, as reported in earlier studies [84,85]. In our study, using these functional markers, 104 DHLs were observed to possess both fertility restoration loci, 11 DHLs had only the major fertility restoration locus, i.e., Rf4, and the rest were partial restorers. It would be interesting to test-cross some of the selected high yielding and better performing DHLs with popular cytoplasmic male sterile (CMS) lines to validate their complete fertility restoration potential.
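The grouping of DHLs by restorer genotype can be expressed as a small decision rule. The sketch below is illustrative only: it encodes the interpretation given above (Rf4 present, with or without Rf3, taken as complete restoration; Rf3 alone as partial), uses the group sizes reported in the results, and treats the single line lacking both loci as a separate class, which is an assumption. The function name and labels are hypothetical.

```python
def restoration_class(has_rf4, has_rf3):
    """Predicted restoration class from marker genotype (illustrative rule)."""
    if has_rf4:
        return "complete"        # Rf4 alone or Rf4 + Rf3
    if has_rf3:
        return "partial"         # Rf3 only
    return "no restorer locus detected"   # assumed label for lines lacking both loci


# Group sizes reported in the text, keyed by (has_rf4, has_rf3)
group_sizes = {
    (True, True): 104,
    (False, True): 9,
    (True, False): 11,
    (False, False): 1,
}

tally = {}
for (has_rf4, has_rf3), n_lines in group_sizes.items():
    label = restoration_class(has_rf4, has_rf3)
    tally[label] = tally.get(label, 0) + n_lines

print(tally, sum(tally.values()))
```

The four groups add up to the 125 DHLs genotyped; marker-based predictions of this kind would still need confirmation through test crosses.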

Conclusion
Using doubled haploid lines (DHLs) developed from the cross IR58025A/KMR3R, a total of 24 QTLs, comprising 12 major effect and 12 minor effect QTLs controlling agro-morphological traits associated with yield and allied parameters, have been identified. Five novel major effect QTLs controlling the yield related traits panicle length, panicle weight, plant height, flag leaf width and biomass, with high LOD and RSq values, have been detected. Two major effect QTL hotspot regions, one on chromosome 3 governing total grain yield per plant (YLD) and panicle length (PL), and one on chromosome 6 regulating YLD, PL and flag leaf length (FLL), have also been identified. The major QTLs and QTL hotspot regions identified in this study could be transferred through marker assisted backcross breeding (MABB) into various genetic backgrounds.

Experimental material
The seeds of the popular, medium duration rice hybrid KRH-2, developed from a cross between IR58025A and KMR-3R, with long-bold grain type and high yield potential and released for the irrigated ecology, were obtained along with its parents from the Hybrid Rice Section, Indian Institute of Rice Research (IIRR), Hyderabad, India, and were used as the experimental material. The varietal checks Akshayadhan (AKD) and Varadhan (VRD) were used as controls, and their seeds were also obtained from the above-mentioned section of IIRR, Hyderabad, India. Crosses were effected between the cytoplasmic male sterile (CMS) line IR58025A, used as the female parent, and the elite restorer KMR-3R to produce the hybrid KRH-2 in the dry season of 2014. As IR58025A does not set seeds due to sterile pollen, IR58025B was used for the agro-morphological evaluation of the DHL population.

Agro-morphological evaluation of developed DHLs
Following the anther culture protocol standardized by [52], a total of 125 regenerated true, fertile and highly stable DHLs (D0) were produced at [...]

Extraction of genomic DNA and genotyping of the DHL population with hypervariable SSRs
The genomic DNA was isolated from fresh and healthy leaves of the 125 DHLs (D3 generation) and the parents IR58025A and KMR-3R, following the CTAB MiniPrep method described by [88]. The polymerase chain reaction (PCR), including the PCR profile, was set up as described in [87]. The amplified PCR products were resolved in a 4% agarose gel (Sigma, USA) using the Protean II gel casting and electrophoresis apparatus (BioRad, USA). As the agarose gel was stained with 0.5 µg/ml ethidium bromide, the amplified bands were visualized using a gel documentation system (Alpha Innotech, USA). A total of 1,904 SSR markers were used for the polymorphism survey, of which the 134 markers polymorphic between the parents were used for genotyping the DHL population. [...] segregation distortion. Therefore, these eight distorted markers (5.97% of the 134 polymorphic markers) were omitted from the QTL mapping exercise. Thus, only the 126 SSRs with no segregation distortion were employed for linkage map construction and QTL mapping. The details of the genomic SSRs are available at GRAMENE (www.gramene.org), while the EST-derived SSR markers were selected from [91]. Manual allele scoring was done as per the expected allele sizes listed in GRAMENE (www.gramene.org) and in [91]. Bands with unambiguous amplification were scored against a 50 bp ladder. For each DHL and SSR marker, the allele was scored as 1 if an amplicon of the expected size was observed and as 0 if it was absent. The binary data matrix resulting from this scoring scheme was used for QTL mapping.
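The criterion used to flag segregation distortion is not stated in the surviving text. A common choice for a DH population, where the two parental alleles are expected in a 1:1 ratio, is a chi-square goodness-of-fit test; the sketch below assumes that test and a 5% significance level, and the allele counts in the example are made up purely for illustration. All function and constant names are hypothetical.

```python
CHI2_CRITICAL_DF1_P05 = 3.841   # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_1to1(n_parent_a, n_parent_b):
    """Chi-square statistic against an expected 1:1 ratio of the two parental alleles."""
    expected = (n_parent_a + n_parent_b) / 2
    return ((n_parent_a - expected) ** 2 + (n_parent_b - expected) ** 2) / expected


def is_distorted(n_parent_a, n_parent_b):
    """True if the marker departs from 1:1 at the assumed 5% level."""
    return chi_square_1to1(n_parent_a, n_parent_b) > CHI2_CRITICAL_DF1_P05


# Made-up allele counts for two hypothetical markers scored on 125 lines
print(is_distorted(65, 60))   # close to 1:1
print(is_distorted(90, 35))   # strongly skewed towards one parent

# Share of polymorphic markers dropped, using the counts given in the text
dropped_pct = round(100 * 8 / 134, 2)
print(dropped_pct)
```

The last line reproduces the 5.97% figure quoted for the eight omitted markers.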

Percent of missing data-points among the SSR and EST-derived markers
In order to determine the percentage of missing data-points among the SSR and EST-derived SSR markers, the number of data-points expected to amplify with 126 hyper-variable SSRs in the DHL population was computed on the basis of the number of DH individuals used for QTL mapping. Then, the number of data-points that did not amplify was recorded. The ratio of unamplified data-points to the total number of expected data-points was multiplied by 100 to obtain the percentage.
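The calculation described above is a one-line ratio. The sketch below uses the counts given in the results (126 markers, 125 DH lines, 652 unamplified data-points) and takes the number of expected data-points as the denominator, which is the choice that comes closest to the reported figure; the variable names are illustrative.

```python
n_markers = 126        # hyper-variable SSR and EST-derived SSR markers
n_lines = 125          # DH individuals used for QTL mapping
n_missing = 652        # data-points that did not amplify (from the results)

n_expected = n_markers * n_lines                 # data-points expected to amplify
missing_pct = 100 * n_missing / n_expected       # percentage of missing data

print(n_expected, round(missing_pct, 2))
```

This gives 15,750 expected data-points and a missing fraction of roughly 4.14%, which matches the reported 4.13% up to rounding.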

Construction of QTL map
The inclusive composite interval mapping (ICIM) method was employed for the construction of the linkage map using the binary data matrix of hyper-variable SSRs with IciM ver. 4.0.1 [89, 90] (https://www.isbreeding.net). The Kosambi map function was used for mapping the QTLs with the above software, and significant QTLs (P = 0.05) were determined by setting the permutation number to 1,000 with a minimum LOD threshold value of 2.5, the LOD value ranging from 2.5 to 3.3. Upon detection of a significant LOD peak with a minimum value of 2.5, the other parameters which influence QTL mapping, viz., the logarithm of odds (LOD) score, the phenotypic variation explained (PVE, in percent) and the additive effect, were determined. The statistical identification of the QTLs based on the LOD score was done as per the procedure described in [92]. As mentioned in [54], QTLs were termed major effect if their phenotypic variation explained (PVE%) was more than 20%. QTL nomenclature followed [93]. Manual identification of the QTLs was done as per the procedure described in Marathi et al. (2012), with slight modifications. The QTL hotspots were identified by searching for genomic regions within a window size of 20 cM wherein two or more QTLs were co-located. The construction of the linkage QTL map and the determination of epistatic (digenic) interactions between pairs of markers were done using QTL IciMapping software ver. 4.0.1 [89,90]. For determining the quantitative epistatic interactions (QEIs) among the identified QTLs, a minimum LOD threshold value of 3.0 was set. Stepwise regression analysis was used to determine the co-factors of QTL mapping, namely the LOD values, PVE% and additive effect. The maximum-likelihood method was used for estimating whether a QTL is a major or minor effect QTL.
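For readers unfamiliar with the Kosambi map function mentioned above, the sketch below shows the standard textbook conversion between recombination fraction and map distance in centiMorgans. It is not taken from the mapping software, and the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse: recombination fraction for a map distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


d = kosambi_cm(0.10)
print(round(d, 2), round(kosambi_r(d), 4))
```

For small recombination fractions the distance in cM is close to 100 times r; the function departs from this as r approaches 0.5 because it allows for crossover interference.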
Using IciM, the interaction between QTL (genotype) and environment was determined with the MET functionality. To determine whether the QTLs detected in this study had been reported earlier by other research groups or were novel, the physical positions of these QTLs were compared with those identified in earlier reports from the GRAMENE QTL database (http://archive.gramene.org/qtl/) and the QTARO database (http://qtaro.abr.affrc.go.jp/qtab/table#as_table:21:undefined:undefined). As described in [50], the QTLs identified in this study were reported as novel if the physical positions of the SSR marker intervals did not match those of the earlier studies. For identifying the functions of the putative candidate gene(s) within the physical positions of the novel major effect QTLs, the Rice Annotation Project Database (RAP-DB) (https://rapdb.dna.affrc.go.jp/) was used.
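The hotspot criterion used in this study (two or more QTLs co-located within a 20 cM window) can be illustrated with a short grouping routine. This is a sketch under stated assumptions, not the procedure actually followed: QTLs are represented by the midpoint of their flanking marker interval, windows are anchored at the first QTL of each group, and the function name is hypothetical. The intervals are those reported in the results.

```python
from collections import defaultdict


def find_hotspots(qtls, window_cm=20.0, min_qtls=2):
    """Group QTLs on the same chromosome whose interval midpoints lie within window_cm."""
    by_chrom = defaultdict(list)
    for name, chrom, left_cm, right_cm in qtls:
        by_chrom[chrom].append(((left_cm + right_cm) / 2, name))

    hotspots = {}
    for chrom, items in by_chrom.items():
        items.sort()                       # order by midpoint, then by name
        group = [items[0]]
        groups = []
        for pos, name in items[1:]:
            if pos - group[0][0] <= window_cm:
                group.append((pos, name))
            else:
                groups.append(group)
                group = [(pos, name)]
        groups.append(group)
        kept = [[n for _, n in g] for g in groups if len(g) >= min_qtls]
        if kept:
            hotspots[chrom] = kept
    return hotspots


# (name, chromosome, left marker cM, right marker cM) as reported in the text
qtls = [
    ("qYLD3-1", 3, 93.41, 113.12),
    ("qPL3-1", 3, 93.41, 113.12),
    ("qYLD6-1", 6, 53.95, 69.10),
    ("qPL6-1", 6, 53.95, 69.10),
    ("qDFF12-1", 12, 46.68, 51.69),
    ("qFLW4-1", 4, 7.96, 10.46),
]
hotspots = find_hotspots(qtls)
print(hotspots)
```

With these inputs only chromosomes 3 and 6 are returned, in agreement with the two major effect hotspots described in this study; qFLL6-1 is omitted from the example because its map positions are not given in the text.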

Molecular assessment of fertility restoration loci in DHL population
For identifying complete and partial restorers within the 125 DHLs without resorting to tedious test crosses, genomic DNA was isolated from fresh, young leaves of the DHL plants along with the hybrid KRH-2 and its parents using the method described by [...]

Availability of data and material: The data and information generated from the study are available in our laboratory as hard and soft copies and can be shared on request.

Figure 1
Frequency distribution histograms of the means of the 125 individuals of the doubled haploid (DH) population for various traits. The mean values of the parents, IR58025B and KMR-3R, are indicated by arrows.