Finding new QTL for yield traits based on a high-density genetic map in the super hybrid rice Nei2You No.6

Background: Rice is one of the most important food crops in the world. To determine the genetic basis of yield components in the super hybrid rice Nei2You No.6, 387 recombinant inbred sister lines (RISLs) were developed for mapping quantitative trait loci (QTL) responsible for yield-associated traits: 1000-grain weight (TGW), grain number per plant (GNP), number of panicles per plant (NP), and grain yield per plant (GYP).
Results: Using whole-genome re-sequencing, a high-density linkage map consisting of 3203 bin markers was constructed with a total genetic coverage of 1951.1 cM and an average density of 0.61 cM. In the multi-environment test, 43 yield-related QTL were mapped across all 12 chromosomes, for 28 of which the allele inherited from Nei2B had a positive effect on yield traits. Nine QTL, qTGW-1a, qTGW-5, qTGW-7, qTGW-10b, qTGW-10c, qTGW-12, qNP-7, qGNP-6c, and qGYP-6b, showed stable effects across multiple environments. Five of the nine QTL were co-located with previously reported QTL, and four novel loci, qTGW-7, qTGW-12, qGNP-6c, and qNP-7, were identified in the present study. Subsequently, qNP-7, qTGW-12, and qTGW-7 were validated using corresponding paired lines that differed only in the target region.
Conclusions: The RISL population is an effective tool for mapping and validating QTL of complex traits such as yield-associated traits, and the newly detected QTL provide new genetic resources for research on yield components and molecular breeding in rice.

and yield [13]. The recently reported GL6 encodes a plant-specific PLATZ transcription factor and positively regulates grain length to increase TGW [10].
The efficiency of QTL mapping is determined by the accuracy and saturation of the genetic map. Recent advances in sequencing technology have enabled rapid, high-throughput genotyping of single nucleotide polymorphisms (SNPs), which provides saturating molecular markers for the genetic dissection of complex yield traits [18]. For example, there are 1,703,176 SNPs between Nipponbare and 9311, equivalent to one SNP per 268 bp [19]. Compared with low-throughput molecular markers such as simple sequence repeats (SSRs), sequencing-based genotyping is faster and yields far denser markers. A bin map uses a run of consecutive SNPs as the unit for detecting recombination events, and the parental origin of each bin is inferred to obtain a genome-wide physical map of each offspring. For example, a high-density genetic map consisting of 3016 bins was constructed and 26 QTL for six yield traits were located, among which two novel QTL, qAGB6 and qHI9, were identified [20]. In another report, a total of 79 QTL for 15 yield traits were mapped using a high-density linkage map with 3569 bin markers [21].
Most studies focus on mapping yield-associated QTL that differ between high- and low-yielding varieties. More attention should therefore be paid to genetic factors that vary among high-yielding lines. The three-line indica hybrid cultivar Nei2You No.6 is a super hybrid rice released by the China National Rice Research Institute (CNRRI) in 2006, with a yield of up to 8.89 t/ha (http://www.chinariceinfo.com/variety/). To investigate the genetic basis of its high yield, we constructed a RISL population derived from the two parents of Nei2You No.6. Using high-density bin mapping by resequencing, QTL associated with NP, GNP, TGW, and GYP were scanned in two environments across two years. These QTL help to elucidate the high-yield mechanism of Nei2You No.6 and provide guidance for high-yielding rice breeding.

Mapping population
As shown in Fig. 1, the recombinant inbred sister lines (RISLs) were derived by the single-seed descent (SSD) method from a cross between the maintainer line Nei2B and the restorer line Zhonghui8006 (R8006).
At the F7 generation, two individuals were selected randomly from each of 194 lines, followed by selfing to F15. The 387 RISLs were named Q plus line number.

Evaluation of yield-related traits
After drying at 37°C for 2 weeks, all harvested individuals were measured for four yield-associated traits: TGW, GNP, NP, and GYP. The filled grains of each plant were weighed to measure GYP, and effective panicles were counted manually for NP. GNP and TGW were analyzed using an automatic seed examination analysis meter (Wanshen SC-G).

DNA extraction and population resequencing
At the F15 generation, leaves of Nei2B, R8006, and the 387 RISLs were sampled at the tillering stage for re-sequencing. To obtain high-quality DNA, a cetyltrimethylammonium bromide (CTAB) method was used to extract genomic DNA [22]. DNA quality and concentration were assessed using the NanoDrop 2000C Spectrophotometer.
High-throughput sequencing was conducted by Berry Genomics using the Illumina HiSeq X Ten sequencing platform (Illumina). The sequencing depths of Nei2B and R8006 were 60.48-fold and 79.38-fold, respectively. In addition, the average sequencing depth of the RISL population was 8.01-fold. For SNP calling, the Burrows-Wheeler Aligner (BWA) with default parameters was used to align the clean data to the R498 reference genome sequence (http://www.mbkbase.org/R498/) [23]. The Genome Analysis Toolkit (GATK) was used to detect SNP and InDel loci [24]. Annovar software was used to annotate SNPs [25]. In addition, Clipping REveals STructure (CREST) and CNVnator were used to detect and annotate structural and copy number variations, respectively [26,27].

Linkage map construction
We used a sliding-window method to merge consecutive non-recombinant SNPs along the genome into bins [28]. The window size for genotyping was 15 consecutive SNPs without missing data. The genotype of each window was determined from the ratio of SNP alleles in the window derived from each parent. Recombination breakpoints were placed at the junction between two regions of different genotype. A bin map was constructed from the recombination breakpoints of all individuals in this study. Bin markers were then screened to filter out distorted and low-coverage markers. To remove markers showing segregation distortion, the 3273 candidate markers were filtered at a ratio of 1:3, and 25 distorted markers were filtered out. Markers with genotypes covering at least 75% of the 387 RISLs were retained. Heterozygous genotypes were deleted. The Kosambi mapping function was used to calculate the genetic distance between markers.
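The sliding-window genotyping and the Kosambi conversion described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the `min_major` cutoff of 12 out of 15 SNPs, and the toy chromosome are assumptions made only for illustration, while the window size of 15 SNPs comes from the text.

```python
import math

WINDOW = 15  # SNPs per window, as stated in the text


def call_window(snps, min_major=12):
    """Call one window as parent A, parent B, or heterozygous (H).

    min_major is a hypothetical cutoff, not a value taken from the paper.
    """
    n_a = snps.count("A")
    n_b = snps.count("B")
    if n_a >= min_major:
        return "A"
    if n_b >= min_major:
        return "B"
    return "H"


def sliding_calls(snps, window=WINDOW):
    """Genotype call for every window position along one chromosome."""
    return [call_window(snps[i:i + window])
            for i in range(len(snps) - window + 1)]


def breakpoints(calls):
    """Window indices at which the called genotype changes."""
    return [i for i in range(1, len(calls)) if calls[i] != calls[i - 1]]


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


# Toy chromosome (invented): 30 SNPs from parent A, then 30 from parent B.
toy = ["A"] * 30 + ["B"] * 30
calls = sliding_calls(toy)
segments = [calls[0]] + [calls[i] for i in breakpoints(calls)]
print(segments)
print(round(kosambi_cm(0.1), 2))
```

In this toy case the windows that straddle the crossover are called H and flank the true breakpoint; a production pipeline would resolve such transition windows into a single breakpoint before merging intervals into bins.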

QTL analysis
Additive QTL were analyzed using the composite interval mapping (CIM) model in Windows QTL Cartographer v.2.5 (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm). The threshold for declaring a putative QTL (P < 0.05) was estimated using 1000 permutations with a search step of 1 cM. The percentage of phenotypic variation explained (R²) and additive effects (AE) were also obtained using this software. The position of the highest LOD value was taken as the location of the QTL. QTL nomenclature followed the recommendations of McCouch [29]: q plus the trait abbreviation and chromosome number; if more than one QTL is detected on the same chromosome for one trait, a letter (a, b, etc.) is appended to the name.
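The idea behind a permutation-derived threshold can be sketched as follows. This is a simplified stand-in, not the CIM procedure of Windows QTL Cartographer: it scans markers one at a time with a single-marker regression LOD, and all names and the toy data are invented for illustration. Only the 1000 permutations and the 0.05 significance level come from the text.

```python
import math
import random


def single_marker_lod(genos, phenos):
    """LOD score from regressing the trait on one 0/1-coded marker."""
    n = len(genos)
    mg = sum(genos) / n
    mp = sum(phenos) / n
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genos, phenos))
    sgg = sum((g - mg) ** 2 for g in genos)
    spp = sum((p - mp) ** 2 for p in phenos)
    if sgg == 0 or spp == 0:
        return 0.0  # monomorphic marker or constant trait
    r2 = min(sgp * sgp / (sgg * spp), 1 - 1e-12)
    return -(n / 2) * math.log10(1 - r2)


def genome_max_lod(markers, phenos):
    """Largest LOD over all markers in one genome scan."""
    return max(single_marker_lod(m, phenos) for m in markers)


def permutation_threshold(markers, phenos, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD
    when trait values are shuffled among lines."""
    rng = random.Random(seed)
    shuffled = list(phenos)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_max.append(genome_max_lod(markers, shuffled))
    null_max.sort()
    return null_max[min(n_perm - 1, int((1 - alpha) * n_perm))]


# Toy data (invented): 10 markers, 80 lines, marker 3 affects the trait.
rng = random.Random(1)
markers = [[rng.randint(0, 1) for _ in range(80)] for _ in range(10)]
phenos = [2.0 * g + rng.gauss(0, 1) for g in markers[3]]

threshold = permutation_threshold(markers, phenos)
observed = single_marker_lod(markers[3], phenos)
print(threshold, observed)
```

Shuffling phenotypes breaks any marker-trait association while preserving marker correlations, so the quantile of the genome-wide maximum controls the false-positive rate for the whole scan rather than for a single position.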

Statistical analyses
Statistical analyses were carried out on the raw data of all individuals using Microsoft Excel (2016) and Prism 5 software (GraphPad). The frequency distribution of each trait was drawn in Microsoft Excel 2016 to identify the pattern of variation of each trait within the population. Correlations among the four traits in the two years were estimated using the Pearson correlation coefficient. For QTL validation, data of paired RISLs were compared using Student's t-test, with significance levels denoted as * P < 0.05 and ** P < 0.01.
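For readers who want to reproduce these statistics outside Excel or Prism, the two quantities can be computed as in the sketch below. It is an illustration under stated assumptions: the function names and toy values are invented, the t statistic assumes equal variances (pooled), and the P value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Toy values (invented), not data from this study.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
t, df = student_t([1, 2, 3], [4, 5, 6])
print(round(r, 3), round(t, 3), df)
```

The t statistic is then compared with the critical value for `df` degrees of freedom at the chosen significance level.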

Phenotypic variation and correlation analysis of RISLs
Phenotypic variation of the RISLs and the two parents is illustrated in Fig. 2. All four traits of the 387 RISLs displayed continuous distributions and transgressive segregation across the four environments, indicating quantitative inheritance. Coefficients of variation (CV) for GYP, GNP, and NP ranged from 23.8 to 27.9%, 17.8 to 25.7%, and 21.9 to 27.6% across the four environments, respectively, indicating a wide range of variation among the RISLs. These CV values suggest that GYP, GNP, and NP were susceptible to environmental influences. In contrast, the CV of TGW ranged from 6.2 to 9.5%, indicating that TGW was a relatively stable quantitative trait.
The correlations among the four yield-related traits are shown in Fig. 3. GYP was positively correlated with NP and GNP, while NP and GNP were negatively correlated. TGW was negatively correlated with GNP and NP. However, there was no significant correlation between TGW and GYP. The high yield performance of Nei2You No.6 may be mainly caused by the increase in GNP and NP.
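The coefficient of variation reported above is the standard deviation expressed as a percentage of the mean. A minimal sketch follows; the sample standard deviation is an assumption, since the text does not say whether the sample or population form was used, and the toy values are invented.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


# Toy trait values (invented), not data from this study.
cv = cv_percent([20, 25, 30])
print(cv)
```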

Population sequencing and linkage map construction
In total, 221,340,292 high-quality reads were generated for Nei2B and 303,049,072 high-quality reads for R8006, with average sequencing depths of 60.48-fold and 79.38-fold, respectively. A total of 1,589,836 single nucleotide polymorphisms (SNPs) were identified between Nei2B and R8006. All 387 RISLs were subjected to whole-genome sequencing, yielding a total of 1.05T of clean data, with approximately 8.10-fold depth for each RISL. Using a sliding-window method, a total of 59,890 recombination breakpoints were generated, with an average of 155.16 breakpoints per line. We initially obtained 3,273 bin markers containing 930,361 high-quality SNPs. After filtering out markers with segregation distortion or low coverage, 3,203 effective markers were selected and used for linkage analysis to construct a genetic linkage map (Fig. 4). The length of bin markers ranged from 50 kb to 1.4 Mb, with a mean of 119 kb. The numbers of SNPs and bins on each chromosome are shown in Table 1. The total genetic distance of the map was 1951.1 cM, with an average linkage distance of 0.61 cM between adjacent markers. The largest linkage group was Chr. 1, with 419 bin markers and a length of 263.6 cM, and the smallest was Chr. 9, with 163 bin markers and a length of 69.3 cM. The ratio of linkage distance to physical distance ranged from 2.8 to 6.1 cM/Mb, with a mean ratio of 5.0 cM/Mb.
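The reported average marker spacing can be checked directly from the map totals. The short calculation below uses only figures given in the text; the variable names are illustrative, and the two conventions (per marker versus per interval between adjacent markers on the same chromosome) are shown because the text does not state which was used.

```python
total_cm = 1951.1    # total map length from the text
n_bins = 3203        # effective bin markers from the text
n_chromosomes = 12   # rice chromosomes

# Average spacing counted per marker.
per_marker = total_cm / n_bins

# Average spacing counted per interval between adjacent markers,
# where each chromosome has one fewer interval than markers.
per_interval = total_cm / (n_bins - n_chromosomes)

print(round(per_marker, 2), round(per_interval, 2))
```

Both conventions round to the same value at two decimal places, so the choice does not affect the reported density.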

Coverage of genotypic difference in paired RISLs
To verify that the RISL population was constructed successfully, phylogenetic analysis was performed. The phylogeny showed that the 387 RISLs clustered in pairs (Additional file 1: Figure S1).
We then compared the genomes of each pair of sister lines. To determine the coverage of genotypic difference, we counted the bins that differed between paired RISLs in the F15 generation (Fig. 5 and Additional file 2: Table S1). Each bin was covered by 77 pairs of RISLs on average, with a maximum of 119 pairs and a minimum of 54 pairs. The proportion of bins differing between paired lines ranged from 8.6% to 52.1%, with an average of 39.1%. In the F15 generation, the heterozygous region ranged from 0 to 27.5% of the genome, with an average of 2.6%. This indicates that the genome did not become nearly homozygous until the F15 generation. The high heterozygosity of some lines may have been caused by cross-pollination in advanced generations, whereas the degree of homozygosity of the population as a whole is consistent with a genetically stable population. We therefore conclude that heterozygosity persisted longer than expected in theory.
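The theoretical expectation referred to above can be made explicit. Under strict selfing with no selection, the heterozygous fraction at loci that differ between the parents halves every generation, starting from 100% in the F1. The sketch below encodes that textbook expectation; the function name is illustrative.

```python
def expected_heterozygosity(generation):
    """Expected heterozygous fraction in the F<generation> under strict selfing.

    Assumes no selection and no outcrossing; the F1 is fully heterozygous
    at loci that are polymorphic between the two parents.
    """
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 0.5 ** (generation - 1)


for gen in (7, 15):
    print(f"F{gen}: {100 * expected_heterozygosity(gen):.4f}%")
```

By this expectation the F15 should retain well under 0.01% heterozygosity, far below the observed average of 2.6%, which is the gap the paragraph above attributes in part to cross-pollination.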

QTL analysis using the RISL population
A total of 43 QTL were identified across all twelve chromosomes, nine of which were repeatedly detected in multiple environments. Detailed information about all 43 QTL is summarized in Additional file 3: Table S2 and Fig. 6.
Eight QTL for GYP were detected on chromosomes 1, 6, 9, 10, and 12, and the total PVE ranged from 0.0 to 10.9% of the total phenotypic variation across four environments. qGYP-6b was detected in LS15 and FY15, with contributions to phenotypic variance of 4.3% and 4.2%, respectively. The other seven QTL were each detected in one environment. The positive alleles for five of the eight QTL for GYP were derived from Nei2B.

QTL verification
For QTL validation, sister lines were selected for phenotyping in Fuyang in 2018 (Fig. 7). For NP, lines Q277 and Q278, which shared the same genetic background and harbored no detected QTL other than qNP-7, were selected. The phenotypic data showed a significant difference in NP between Q277 and Q278, which led to a significant improvement in GYP (Fig. 7a, b). Similarly, lines Q239 and Q240, which differed at qNP-7, also showed a significant difference in NP.
For qTGW-1a, lines Q124 and Q125 shared the same genetic background except for the target region. The phenotypic data showed a significant difference in TGW between Q124 and Q125 (Fig. 7c). qTGW-7 showed a similar result to qTGW-1a (Q378 vs Q379, Fig. 7d). For qTGW-10b and qTGW-10c, lines Q317 and Q318 showed a significant difference in TGW (Fig. 7e). qTGW-12 was validated using lines Q70 and Q71 (Fig. 7f), which differed significantly in TGW. Similarly, the pairs Q146-Q147 and Q380-Q381 also differed at qTGW-12 and showed a significant difference in TGW.

Discussion
A high-density genetic linkage map is an effective tool for QTL mapping. Genetic maps constructed with traditional molecular markers have low density and overly large localization intervals, which leads to overestimation of the phenotypic variance explained by QTL and makes gene cloning and the development of markers for pyramiding breeding difficult [30,31]. With the development of sequencing technology, whole-genome sequencing has been widely used in rice studies. Bioinformatic analysis of the rice genome reveals many SNPs, InDels, and structural variations (SVs), which can be used to develop numerous molecular markers for building a high-density genetic map. In this study, 387 RISLs were selected for whole-genome resequencing. A high-density genetic linkage map was constructed using 3203 bin markers for QTL analysis of yield-associated traits.
Primary and secondary populations are used for genome-wide QTL analysis, QTL validation, and fine mapping. Near-isogenic lines (NILs) are widely used for QTL validation and fine mapping, and residual heterozygous lines (RHLs) came into use later. NILs are usually constructed by continuous backcrossing, whereas RHLs are obtained by selfing in advanced generations; both depend on marker-assisted selection (MAS). We constructed recombinant inbred sister lines, where "sister" means that the two individuals used to found each pair of lines originate from the same family in the F7 generation.
Compared with NILs and RHLs, RISLs are more efficient and labor-saving because MAS is omitted. In addition to QTL mapping, the RISLs can be used efficiently for QTL validation. As a supplement, a library consisting of 1700 F11, 2780 F12, and 2464 F13 lines was obtained while constructing the RISLs, from which target residual heterozygous lines can be selected by library screening and used for fine mapping.
A total of 43 QTL for four yield-related traits were identified in four environments, distributed in different regions of the 12 chromosomes, and the phenotypic variance explained by a single QTL ranged from 2.4 to 10.2%. For 65.1% of the 43 QTL, the positive allele was inherited from the parent Nei2B. Consistent with the probable presence of major QTL in both parents, these results suggest that the yield-associated traits of Nei2You No.6 are mainly controlled by multiple minor- or middle-effect QTL. On the other hand, QTL effects and PVE are always influenced by the genetic background. For example, Gn1a and DEP1 explained different degrees of phenotypic variance in japonica and indica genetic backgrounds [21]. Additionally, it has been recognized that both major- and minor-effect QTL play important roles in controlling complex traits [32]. Increasing attention has been paid to QTL with minor or middle effects in recent years [33-35].
Comparison of the QTL detected in our study with those in previous studies indicates that four loci, namely qNP-7, qGNP-6c, qTGW-7, and qTGW-12, are probably novel QTL. Re-phenotyping of paired lines confirmed the stable effects of qNP-7, qTGW-7, and qTGW-12. In future studies, these four QTL will be genetically dissected and fine mapped using secondary segregating populations obtained by screening the RISLs or the F11/F12/F13 library. Fine mapping and map-based cloning of the novel QTL detected in this study will provide genetic resources for high-yield rice breeding.

Conclusions
A RISL population was obtained from the cross between Nei2B and R8006. After sequencing the genomes of the RISLs, a total of 3203 effective bin markers were used to construct a high-density genetic linkage map spanning 1951.1 cM, with an average distance of 0.61 cM between markers. In total, 43 QTL were identified for GYP, NP, GNP, and TGW in the multi-environment test, nine of which were stably detected across environments. Comparison with previous studies indicated that four of these nine, qNP-7, qGNP-6c, qTGW-7, and qTGW-12, are novel.
Subsequently, qNP-7, qTGW-7, and qTGW-12 were validated using corresponding paired lines. In brief, our results provide a new method for constructing QTL mapping populations, and the RISLs used in the present study are a useful tool for mapping and validating QTL of complex yield traits. Next, we aim to develop markers tightly linked with qNP-7, qTGW-7, and qTGW-12 and to fine map these QTL using secondary segregating populations. The QTL analysis results lay an important foundation for identifying candidate genes for yield-related traits in rice.

Consent for publication
Not applicable.

Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional files. Nei2You No.6 is an indica type hybrid rice bred by our research group. The seeds of the recombinant inbred sister line population and the parents are available from the corresponding author on reasonable request.

Competing interests
All authors declare that they have no competing interests.

The bin map of 387 RISLs. The horizontal axis shows the twelve chromosomes and the vertical axis lists the lines. Red, blue, and yellow indicate Nei2B, R8006, and heterozygous genotypes, respectively.

Figure 5
Overview of genotypic difference in RISLs. The inner track indicates, for each bin across the whole genome, the number of paired sister lines whose differing interval covers that bin.

Figure 6
Graphical representation of QTL detected using RISLs. The scale on the left indicates map distance in centimorgans (cM). The darker bands indicate the markers. Symbols with different colors and shapes indicate the different traits and environments of the detected QTL. Names of stably detected QTL are marked in the figure.