The use of heterosis in hybrid rice has become increasingly important since hybrid rice extension began in China (Ma GH and Yuan LP 2015). Hybrid rice has contributed greatly to food security in China and the world. The average yield of rice in China increased from 3.5 t/ha in 1975 to 6 t/ha in 1995 and to 7 t/ha in 2018 (FAOSTAT), and the grain quality of rice has also improved. About 50% of newly registered rice varieties in China (national approval) have grade-one grain quality (Lu F et al. 2019). However, developing a hybrid variety by conventional breeding remains time- and labor-consuming, even when marker-assisted selection (MAS) is used. Future breeding of hybrid rice will benefit from new breeding technologies that integrate genetics, genomics, computational science and artificial intelligence.
Rice is a model species for genomic studies of monocotyledonous plants. The rice genome was fully sequenced in 2005 (International Rice Genome Sequencing Project and Sasaki T 2005), and more than 3000 genes have been cloned and analyzed (Yao W et al. 2018). A large number of molecular markers have been developed for MAS of important traits such as plant height, blast resistance, leaf blight resistance, submergence tolerance and fragrance (Jena KK and Mackill DJ 2008). However, the success of MAS depends heavily on the heritability and genetic architecture of the selected traits. MAS is not effective for traits controlled by a large number of genes/QTLs, each with a small effect. With the development of high-throughput sequencing and chip technology, genome-wide association study (GWAS) has been used to identify useful genes/QTLs, and genomic selection (GS), also called genome-wide selection (GWS), has been proposed as a promising tool and applied to animal and plant genetic improvement (Meuwissen THE et al. 2001). GS achieves higher genetic gain than MAS for complex traits controlled by a large number of QTLs (Crossa J et al. 2017b). However, GS has not yet been successfully used in hybrid rice breeding.
Genomic selection uses the genotypes and phenotypes of individuals in a training population to build prediction models, and then uses those models to predict the genomic estimated breeding values (GEBVs) of individuals in a test population based on their genotypes (Crossa J et al. 2017a). The approach rests on the assumption that, with high-density SNP markers distributed across the whole genome, at least one SNP is in linkage disequilibrium with each quantitative trait locus (QTL) affecting the target trait, so that the effect of each QTL is captured by SNP markers (Meuwissen T 2007). The statistical models of genomic selection can be roughly divided into two categories. The first is the direct method, which treats the individual as a random effect and uses the genetic relationship matrix, constructed from the genotypes of the reference and prediction populations, as the variance-covariance matrix; variance components are estimated iteratively, and the predicted breeding value of each individual is then obtained. The second is the indirect method, which first estimates marker effects in the reference population and then sums these effects over the genotypes of individuals in the prediction population to obtain their estimated breeding values (Zhang Z et al. 2011; Misztal I and Legarra A 2017). Because different prediction models rely on different statistical methods, their efficiency needs to be compared and validated before they are used for breeding selection.
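The two categories described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The data are simulated, and all function names, population sizes, and the shrinkage parameter `lam` are hypothetical choices made for illustration. In particular, `lam` (the ratio of residual variance to marker-effect variance) is fixed here, whereas in practice the variance components would be estimated iteratively, for example by REML. The indirect method is represented by ridge regression on marker effects, and the direct method by a GBLUP-style predictor built on an unscaled genomic relationship matrix.

```python
import numpy as np


def simulate(n=300, m=200, n_qtl=20, h2=0.8, seed=0):
    """Simulate SNP genotypes (coded 0/1/2) and phenotypes. All settings are hypothetical."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=m)                  # minor allele frequencies
    X = rng.binomial(2, maf, size=(n, m)).astype(float)  # n individuals x m markers
    beta = np.zeros(m)
    beta[rng.choice(m, size=n_qtl, replace=False)] = rng.normal(size=n_qtl)
    g = X @ beta                                         # true genetic values
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)        # noise scaled to the target heritability
    y = g + rng.normal(scale=noise_sd, size=n)
    return X, y


def indirect_marker_effects(X_ref, y_ref, X_pred, lam):
    """Indirect method: estimate marker effects in the reference population,
    then sum them over the genotypes of the prediction population."""
    col_means = X_ref.mean(axis=0)
    Z = X_ref - col_means                                # center markers on the reference population
    mu = y_ref.mean()
    m = Z.shape[1]
    effects = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ (y_ref - mu))
    return mu + (X_pred - col_means) @ effects


def direct_gblup(X_ref, y_ref, X_pred, lam):
    """Direct method: treat individuals as random effects whose covariance is a
    genomic relationship matrix built from both populations."""
    col_means = X_ref.mean(axis=0)
    Z_all = np.vstack([X_ref, X_pred]) - col_means
    K = Z_all @ Z_all.T                                  # unscaled relationship matrix
    n_ref = X_ref.shape[0]
    K_rr = K[:n_ref, :n_ref]                             # reference x reference block
    K_pr = K[n_ref:, :n_ref]                             # prediction x reference block
    mu = y_ref.mean()
    weights = np.linalg.solve(K_rr + lam * np.eye(n_ref), y_ref - mu)
    return mu + K_pr @ weights


def predictive_ability(y_obs, y_hat):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_obs, y_hat)[0, 1])


if __name__ == "__main__":
    X, y = simulate()
    n_ref = 240                                          # first 240 individuals form the reference population
    lam = 50.0                                           # assumed fixed, not estimated
    gebv_indirect = indirect_marker_effects(X[:n_ref], y[:n_ref], X[n_ref:], lam)
    gebv_direct = direct_gblup(X[:n_ref], y[:n_ref], X[n_ref:], lam)
    print("max difference between methods:", np.abs(gebv_indirect - gebv_direct).max())
    print("predictive ability:", predictive_ability(y[n_ref:], gebv_direct))
```

Under these simplifying assumptions (the same centering, the same `lam`, and a relationship matrix computed directly from the centered markers), the two functions return the same predictions up to numerical error. The direct and indirect formulations differ in practice mainly when the marker effects are given non-normal priors or when the relationship matrix is built from other information.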
Genomic selection has been successfully used in animal breeding programs to increase the rate of genetic gain in dairy cattle, pigs, dairy goats, layer chickens, and fish (García-Ruiz A et al. 2016; Samorè AB and L. 2015; Mucha S et al. 2015; Wolc A et al. 2015; López M et al. 2015). In recent years, simulations and experimental studies have been conducted to validate the efficiency of this method in plant breeding. In rice specifically, the predictive ability for heading date, culm length, panicle length, panicle number, grain length and grain width varied from 0.4 to 0.8 in a population of 110 rice cultivars using nine prediction methods (Onogi A et al. 2015). The highest predictive abilities for spikelets per panicle, heading date, plant height and protein content were 0.44–0.7 in a diverse population of 413 rice inbred lines from 82 countries genotyped with a 44K SNP chip (Isidro J et al. 2015). The GEBVs of other traits such as grain shape, grain yield, nitrogen balance index, panicle weight, grain weight, and blast resistance have been predicted using inbred lines or cultivars (Spindel J et al. 2015; Yabe S et al. 2018; Iwata H et al. 2015; Grenier C et al. 2015; Hassen M et al. 2018; Huang M et al. 2019). Genomic prediction has also been conducted for grain yield, thousand-grain weight, and indices of different traits in hybrid rice (Wang X et al. 2017; Xu S et al. 2014; Xu Y et al. 2018; Wang W et al. 2018; Cui Y et al. 2020). The predictive abilities reported across these populations were broadly similar, suggesting that genomic selection is a reliable method for rice breeding.
In this study, we investigated the agronomic, yield-related, and grain quality-related traits of 404 hybrid rice lines that were genotyped with a 56K SNP chip, and conducted a genome-wide association study and genomic prediction for 20 traits using 15 statistical methods. The objectives of this study were to validate the predictability of different models and to identify the best-fitting statistical methods for predicting different traits.