Large scale genomic and transcriptomic profiles of rice hybrids revealed a novel universal mechanism underlying yield heterosis


 The utilization of heterosis (or hybrid vigor) is a revolutionary technology in agricultural. However, its genetic mechanisms are still unclear in plants. Here we develop, sequence and record the phenotypes of 418 hybrids from crosses between two testers and a diverse mini core collection. Phenotypic analysis showed that heterosis is an extensive but not necessary phenomenon, which varied by combinations and environments. Evidence from both GWAS on the 418 hybrids and their parents and transcriptomics of the traditional rice hybrid Liangyoupei 9, indicated that dominance and overdominance are the main genetic contributions to heterosis. Furthermore, cumulation or complementation of repulsive genetic factors may account for 37.8% of the overdominant QTL and nearly half of the genes with overdominant expression pattern. We systematically compared non-additive and additive factors and observed a common phenomenon that non-additive factors are more sensitive to background than that of additive ones across species, phenotypes, QTLs and transcription levels, further evidence from both simulations and experiment demonstrated a novel universal molecular mechanism underlying heterosis, i.e. homo-insufficiency under insufficient background (HoIIB), which expounds that heterosis in most cases is not the heterozygote advantage but the homozygote disadvantage under the insufficient genetic background. The HoIIB model can explain most known hypotheses and phenomena about heterosis, thus provides a novel theory for future hybrid rice breeding.


43
Hybrid breeding is a revolutionary technology in agricultural production and for food security. 44 Due to their dramatic increased yield by tens of percent and even double compared to inbreds 1 , 45 hybrid have been the important and even the main variety type for agricultural plants and animals 2 . 46 Rather different from traditional inbred breeding, which mainly exploit accumulation of 47 displayed a similar pattern of residual effects on both inbreds and hybrids, that is, the highest for 164 GWP, followed by PNP, SPP and SPP-related traits (PBP and SBP), and the lowest for KGW in 165 both japonica and indica subspecies ( Supplementary Fig 8a; Supplementary Table 3 and 4). 166 However, the proportion of environment effect on hybrids was generally much higher than that on 167 the corresponding inbreds, especially for SPP and its related traits (Fig 1b; Supplementary Table  168 3 and 4). We used the same method to analyze the data of 266 maize hybrids and their parents in 169 four environments, the results also showed that the effect of environment on hybrids was generally 170 stronger than that of the inbred parents ( Supplementary Fig 8b-c) 44 . Another recent maize panel 171 also showed the similar result (data not shown) 45 . In summary, it is a common phenomenon that 172 hybrids are more sensitive to the environment than their inbred parents. 173 Using our diverse and large scale of hybrids, investigation of the strength of heterosis in terms 174 of different traits, combinations, and environments indicated that heteosis, especially the better-175 parent heterosis, are just the potentiality rather than the apodeictic result of hybridization or 176 heterozygosity, despite which is predominant over the cases (Fig 1c-d; Supplementary Fig 9a-h). this was apparently higher than that in Sanya (-8.53%-53.30% with an average of 8.21%) (Fig 1c-182 d; Supplementary Table 2b). However, For PNP and KGW, the dHmp of all combinations 183 generally appeared to be no obvious difference between the two locations. Particularly, the 184 proportion of positive overdiminant (POD) heterosis of all traits across the combinations in 185 Changsha (averagely 63.19%) was much higher than that in Sanya (averagely 32.40%). And the 186 decrease of POD from long-day to short-day environments was distinctly represented by spikelet 187 number related trait (PBP, SBP and SPP), compared to the other traits, Compared to the intra-188 subspecific combinations, only PNP displayed stronger heterosis for the inter-subspecific 189 combinations evaluated under the two environments. Regarding heterosis of the other traits, the 190 differences between inter-and intra-subspecific combinations varied by traits, testers, subspecies, 191 and environments (Fig 1c-d). Similarly, the proportion of POD heterosis of inter-subspecific 192 combinations for all traits was apparently higher than that of intra-subspecific combinations in traits, subspecific hybrids, and environments. Consistent with the observation that SPP related trait 250 did not exhibit obvious heterosis in Sanya, fewer overdominant F1_QTL and Hmp_QTL were 251 identified in Sanya than that in Changsha. On average, 75.7% of the F1_QTLs identified in 252 Changsha expressed as overdominance for the SPP related traits, however, the proportion 253 significantly reduced to 42.6% in Sanya. Compare the two subspecies, we found that the reduction 254 is more remarkable in japonica hybrids (from 71.0% in Changsha to 22.3% in Sanya) than that in 255 indica hybrids (from 80.4% in Changsha to 62.8% in Sanya) ( Supplementary Fig 24a-c). On the 256 other hand, the F1_QTLs and Hmp_QTLs for KGW and PNP did not show such consistent changes 257 between subspecies and between environments, varying by combinations ( Supplementary Fig  258   24d-e). When comparing environmental stability among the QTLs of additive, dominant and 259 overdominant ones, we found that a large proportion of additive QTLs showed environment-stable 260 than non-additive QTLs, regarding all types of combinations for all traits except for PNP. In 261 addition, a higher proportion of dominant QTLs indicated environment-stable than overdominant 262 QTLs for most of the combinations and traits (Supplementary Fig 25). Thus, our results indicated 263 that the higher magnitude of the dominant effect of a QTL, the stronger the environmental 264 sensitivity of the hybrid. The distinctively larger proportion of unstable factors including 265 overdominant and dominant QTLs identified in hybrids or heterosis than that in inbreds, consistent 266 with the fact that the response of hybrid to environment was generally stronger than that of inbreds 267 (Fig 1b and Supplementary Fig 8b). 268 Dominance / partial-dominance cumulation and complementation are prevalent 269 genetic basis of overdominance 270 The above mentioned evidence from both phenotyping and QTLs mapping indicated that 271 hybrids and heterosis mainly attributes to the non-additive effects (dominance and over-dominance) 272 and non-additive effects are more environmental sensitive than additive ones. It is worth noting that 273 heterosis and non-additive effect for some loci may not be the necessary results of heterozygosity, 274 but as a potential possibility depending on different subspecies, testers, traits and environments. 275 Apparently, it is challenging to take advantage of the maximum effects of the loci, before we 276 understand the molecular mechanism underlying the additive and non-additive effect. Currently, there are three easily comprehensible genetic mechanisms that produce the non-additive effects and 278 especially the overdominant effect through two or more repulsive factors, i.e. the multiplication of 279 additive or dominant factors 10 , the cumulation of dominant factors 37,47 and the complementation of 280 two factors. In our observation, the effect of multiplication between/among repulsive additive 281 factors contributed only a little to yield heterosis ( Supplementary Fig 26). Therefore, we mainly 282 focus on the last two genetic mechanisms at both the QTL and transcription level in this section. 283 To estimate the contribution of repulsive dominant allele (RDA) cumulation to non-additive 284 effect and especially the overdominance, we calculated the repulsive degree of the SNP with the 285 same direction (positive or negative) of dominant effect in each dominant or overdominant QTL 286 (see Methods). As expected, the proportion of higher repulsive degree (0.2-0.4 and >0.4) in QTLs 287 with overdominance (38.0%) was much higher than that in QTLs with dominance (14.57%), and 288 this feature was prevalent across all traits, combinations and environments (Fig 2a-c; 289 Supplementary Fig 27a-c). This phenomenon was also observed among the dominant and 290 overdominant QTLs in 1086 three-line hybrids 34 ( Supplementary Fig 28-29). When compared 291 both the inter-and intra-subspecific combinations, the inter-subspecific combinations represent 292 distinctly higher proportion (52.2%) of combinations with RDA averaged by the QTLs containing 293 RDA than intra-subspecific combinations (25.6%) (Fig 2d-f; Supplementary Fig 27d-f). This 294 indicated that the probability of RDA cumulation in the inter-subspecific hybrids is double of that 295 in the intra-subspecific hybrids. Further, the probability of RDA cumulation in the J×J combinations 296 was much lower than that in the I×I ones (Fig 2d-f; Supplementary Fig 27d-f). More interestingly, 297 we found that the overdominant QTL with higher repulsive degree tended to be stable between two 298 environments ( Supplementary Fig 30). Given that additive effect was more stable across 299 environments than dominant effect ( Supplementary Fig 25), we may expect that the 300 complementation of additive alleles with repulsive phase contributed apparent effect on the 301 overdominant QTLs besides of the RDA cumulation. But we may not clearly distinguish the 302 complementation of alleles from the RDA cumulation at the QTL level, and we thus estimated the 303 contribution of complementation mainly at the transcription level (see below). Considering that the 304 repulsive phase at short distance was not easy to be broken, we anticipated that the cumulation of 305 dominant alleles or the complementation from repulsive phased alleles will continue to play an 306 important role in heterosis utilization.
One novel universal molecular mechanism of dominance and overdominance -308 homo-insufficiency under insufficient background (HoIIB)

309
As mentioned in introduction and above, the heterosis and non-additive phenomenon at the 310 phenotype level are often resulted from the integrated effects of multi-factors at various 311 intermediate and fundamental levels (such as different genes, QTLs, gene expression, and 312 physiological traits), thus it is challenging to investigate the molecular mechanism of heterosis at 313 the phenotypic level. Transcription is such an intermediate step for a gene to perform its functions 314 in development of complex phenotypes. Therefore, it is informative to explore gene expression 315 patterns between parents and their F1 hybrid, in order to understand molecular mechanisms 316 underlying heterosis. Here we investigated transcriptome profile of young panicles from the hybrid 317 LYP9 and its two parents PA64S and 9311. As a whole, 8,248 genes showed differential expressions 318 between the two parents and their F1 hybrid in at least one of three tissues (1 mm, 2 mm, 3 mm 319 young panicles). Expression patterns can be classified as additive (A) (13%), dominant (D) (39%), 320 and overdominant (OD) (48%) in at least one of three tissues (Fig 3a, Supplementary Table 9). 321 The OD can be further grouped as negative and positive (NOD and POD), which mean expression 322 level in hybrids is lower and higher than that in both parents, respectively. We identified that the 323 NOD and POD accounted for 21.0% and 81.0%, respectively. Many of them belong to the directly 324 observed complementary pattern, including the complementary negative or positive OD, that is, 325 expression was observed only in both parents but absent in hybrids (CNOD (5.8%)), or vice versa 326 (CPOD (36.7%)). The directly observed complementary effects (CNOD and CPOD) accounted for 327 about 40% of the overdominant expression, and no more than 20% of the 7,248 non-additive 328 expressed genes. These results strongly suggested that the complementarity may be involved in the 329 mechanism of overdominance at transcriptional level. 330 When compared the stability of genes with additive and non-additive expression, we were 331 surprised to find that the dominant and overdominant expression showed dramatically more 332 variability across tissues than the additive expression (Fig 3b); in another word, non-additive or 333 heterosis of expression is more tissue-specific and may be more background-dependent. This is 334 consistent with the results mentioned above, where hybrids are more variable than the inbreds and 335 non-additive QTLs are more variable than additive QTLs (Fig 1b; Supplementary Fig 8 and 25).
Although we repeatedly found that non-additive factors were more dependent on the 337 backgrounds, it is still difficult to directly evaluate factors and their background at phenotypic and 338 QTL level. Fortunately, it is possible to examine the relations between expressing genes and their 339 transcription factors that directly regulate them, since many species had a well documented 340 transcription factors and target gene annotations. In this sense, transcription factor can be 341 considered as the background of its target expressing genes. Given these speculations, we carefully 342 investigated the dependency between the types of genetic effects (additive, dominant, and 343 overdominant) and the genetic background, by analyzing the correlation of expression levels 344 between a gene and its transcription factors. The results indicated that the expression of genes with 345 dominant and overdominant effects represented apparently stronger dependency on their 346 transcription factors than those genes with additive effects (Fig 3c). The phenomenon was also 347 observed in the transcriptomes of three Arabidopsis thaliana combinations ( Supplementary Fig  348   31) 16 . These results implied an interesting phenomenon that expression of genes with non-additive 349 effects is more sensitive to the dosage changes of its genetic background. Does the genetic 350 background dependency of non-additive effects represent a universal molecular mechanism 351 underlying heterosis? It is well known that no factor is absolutely independent in the biology   Fig 32). 367 For the positive regulation, when the activator as the background is insufficient (smaller X/K) 368 for the functional allele of the receptor, the receptor will express as positive (partial-)dominance. In 369 contrast, when the background is sufficient (larger X/K), the receptor will express as additive effect 370 (Fig 4a and Supplementary Fig 33a-b). Apparently, it is the insufficient ligand background that 371 can only activate partial function of two homo-alleles in parents, but relatively full function of one 372 allele in the F1, which results in the positive (partial-)dominance. For the negative regulation, the 373 performance is similar, but the receptor expresses as negative (partial-)dominance, when the 374 insufficient ligand background can only suppress partial function of two homo-alleles in parents, 375 instead relatively full function of one allele in the F1 ( Supplementary Fig 33c-d). It is common 376 between positive and negative regulations that the reaction is dramatically more sensitive to the 377 ligand (activator or repressor) concentration change under insufficient ligand background, where 378 the (partial-)dominance is easy to be observed. It should be noted that there is no overdominance 379 for this scenario if no synergistic effect were involved (when n is equals to 1). 380 Scenario 2: Two alleles of one polymorphic site under two independent backgrounds, that is, 381 two alleles of one polymorphic site of the receptor can be bound by two respective and independent 382 ligands as the backgrounds of the receptor ( Supplementary Fig 34-35). 383 For the positive regulation, as expected in Scenario 1, the receptor easily appears positive 384 dominance when the activator background for the allele with larger maximum function of the 385 receptor is insufficient (smaller X/K). As different from Scenario 1, we can also observe positive 386 overdominance under Scenario 2, when the receptor in F1 can cumulate the effect from the 387 (partial-)dominant allele with a larger function and that from the other allele with a smaller function. 388 When both backgrounds of two alleles are sufficient (higher X/K), the receptor in both parents and 389 F1 can express the full function as two alleles and one allele, respectively, as a result the receptor 390 expresses as additive (Fig 4b and Supplementary Fig 36). The performances of negative 391 regulation are similar, but the receptor expresses as negative (partial-)dominance or overdominance 392 under insufficient background ( Supplementary Fig 37). It is common between positive and 393 negative regulations that the reaction is dramatically more sensitive to the ligand (activator or 394 repressor) concentration change under insufficient ligand background (smaller X/K), where the 395 non-additive effect is easy to be observed. Scenario 3: Two alleles of one polymorphic site with shared background, that is, two alleles 397 of one polymorphic site of the receptor can be bound by the same ligand as the background of the 398 receptor. But these two alleles may have different affinities (K) to the ligand and show different 399 maximum functions (μ). Thus we considered two situations: (1) One allele has higher affinity and 400 shows a larger maximum function, and the other has lower affinity and shows a smaller maximum 401 function (abbreviated as HALF/LASF) (Fig 4c and Supplementary Fig 38-39); (2) One allele has 402 higher affinity but shows smaller a maximum function, and the other has lower affinity but shows 403 a larger maximum function (abbreviated as HASF/LALF) (Fig 4d and Supplementary Fig 40) . 404 Before considering the above two situations, we found from the simulation that there is only 405 additive effect if the ligand randomly and equally binds to two alleles (see Supplementary note). 406 In spite of the positive or negative regulations, function and affinity are similar to scenario 2 that 407 the reaction tends to appears non-additive under insufficient ligand background, especially for the 408 allele with a larger maximum function. The insufficient ligand background renders the reaction 409 dramatically more sensitive to the ligand (activator or repressor) concentration, compared to the 410 sufficient ligand background ( Supplementary Fig 38-42). But we can only observe the non-411 additive effect, when the background is dramatically insufficient under HALF/LASF. In addition, 412 the degree of non-additive effect is apparent weaker in the HALF/LASF situation, compared to the 413 HASF/LALF, because in the latter situation the background in F1 can be reallocated to the allele 414 with LALF from the allele with HASF when the background for latter has been saturated. Taken 415 together, we suppose that overdominance results from the cumulation or compensation between the 416 (partial-)dominance of the allele with a larger function and the effect of the other allele with a 417 smaller function. 418 According to the above simulations, we put forward one model that explains a unique and 419 important molecular mechanism underlying the non-additive effects and heterosis: homo-420 insufficiency under insufficient background (HoIIB) (Fig 4e). As indicated by HoIIB, it is the 421 genetic background insufficient to maximize the function of two homo-alleles in parents but 422 relatively or even completely sufficient to maximize the function of one-allele in F1, thus resulting 423 in the insufficient function of two homo-alleles in parents but the relatively or completely sufficient 424 function of one-allele in F1, that renders the target locus non-additive in effect, as contributing to 425 heterosis. And there were three main features of this theoretic model, according to the simulation.
First, the background insufficiency of the allele with a larger function is the driving force for non-427 additive effects. What we see dominance and heterosis is not the consequence of a stronger F1 428 hybrid, but the consequence of the lower down of the parent with homo-allele of larger function. In 429 other word, the observable function of two homo-alleles is lower than their maximum function 430 under insufficient background. Second, if there is no synergy (n = 1), the overdominance can only 431 be found when both alleles are functional, which result from the cumulation or complementation 432 between the (partial-)dominance of the allele with a larger function and the effect of the other allele 433 with a smaller function. Third, we observed one universal phenomenon in the three scenarios 434 mentioned above, that is, the reaction is dramatically more sensitive to the ligand (activator or 435 repressor) concentration under insufficient ligand background, where the non-additive effect is easy 436 to be observed. 437 The HoIIB model was supported by different levels of evidence 438 It is intrigue that in the observed experiments we have found extensive evidence that can 439 represent the three features of the HoIIB model mentioned above. First, we observed the Homo-440 insufficiency of the allele with a large function and the cumulation or complementation from the 441 allele with a smaller function at various levels including transcription, QTL, and traits (Fig 5a-c). 442 Using transcriptome profile from the 1, 2, and 3 mm young panicles of 9311, PA64S, and their 443 hybrids (LYP9), we investigated expression levels in the two parents for those genes with additive, 444 positive dominant, and positive overdominant effects, respectively. The homo-insufficient 445 expression was substantially observed in the higher parent for genes with dominant and 446 overdominant transcription, compared to those with additive transcription (Fig 5a; Supplementary  447   Fig 43). Meanwhile, the homozygous genotypes in lower parent showed increased expression for 448 the positive overdominance in most cases (Fig 5a; Supplementary Fig 43). Then we compared the 449 QTL with different types of genetic effects that were identified by our GWAS on the three main 450 yield components (PNP, SPP and KGW). Apparently, the homozygous genotypes with lager effects 451 of the dominant and overdominant QTLs represented decreased phenotype, compared to those of 452 the additive QTLs (Fig 5b; Supplementary Fig 44-46). We also compared the QTL that were 453 identified by the 278 immortal F2 lines from the crosses between randomly selected RILs derived 454  Fig 47). We further investigated the distribution 458 of the degrees of middle-parent heterosis for the five yield traits (SPP, PBP, SBP, PNP and KGW) 459 among the MCC combinations evaluated under the two environments. The stronger heterosis 460 tended to be found among the combinations whose higher parents show decreased phenotypes ( Fig  461   5c; Supplementary Fig 48). 462 Second, under the HoIIB model, we may expect that the expression or the observable function 463 of those genes with stronger heterosis are subject to more serious homo-insufficiency background 464 and thus will show stronger response to the change of background, compared to those with weaker 465 heterosis. The instability of the genes with (over-) dominant effects was reflected by their higher 466 variance of expression levels across the three tissues, compared to the genes with additive effects 467 ( Supplementary Fig 49). Examining variance of the QTLs identified by the MCC GWAS or by 468 the immortalized F2 mapping panel, we observed that both homozygous and heterozygous 469 genotypes of QTLs with (over-) dominant effects exhibited higher variability, compared to those 470 with additive effects (Fig. 5d; Supplementary Fig 50-51) 33 . The combinations with higher degree 471 of dominance also showed higher variability for most of the traits (Supplementary Fig 52). 472 The HoIIB model was experimentally validated in yeast 473 To verify the HoIIB model, we designed an experiment to see whether we can manipulate the 474 performance of heterosis of one gene by changing its background sufficiency within a living 475 organism. In order to reduce the experimental complexity as possible as, we used the transcription 476 level as the performance indicator of the target gene and the transcription factor as its background, 477 and carried out the experiment in the simple organism, yeast. We screened the reported transcription 478 factors and its target genes in yeast according to the following criteria: (1) the promoter region 479 being bound by a transcription factor has been clearly validated; (2) there is strong and simple 480 regulatory relationship between the transcription factor and its target gene. After investigating the 481 co-expression of six pairs of genes (WAR1 vs PDR12, VHR1 vs VHT1, VHR1 vs BIO5, AZF1 vs its transcription factor FZF1 in strain BY4743 of Saccharomyces cerevisiae (R 2 = 0.88, 484 Supplementary Table 10b). So we selected FZF1 and its target gene SSU1. According to the 485 reported binding features between two genes 50 , we knocked out the FZF1 recognition motif in SSU1 486 promoter region, then constructed the heterozygous (SSU1/ssu1) and homozygous (ssu1/ssu1) 487 knockout strain of SSU1 in BY4743 ( Supplementary Fig 53a-c). The ssu1/ssu1 genotype showed 488 apparently decreased expression compared to wild genotype of SSU1 (SSU1/SSU1), indicating the 489 effective mutation. Gene SSU1 indicated overdominant expression in the system comprising 490 genotypes SSU1/SSU1, SSU1/ssu1 and ssu1/ssu1, implying that FZF1 supply the insufficient 491 background to SSU1 in BY4743, and we may expect that we can decrease the dominance degree of 492 SSU1 if we can regulate up the expression of its background FZF1 (Fig 5e and Supplementary 493 Table 10c). In the strains with overexpressed FZF1, we really observed dramatically decreased 494 dominance degree among genotypes SSU1/SSU1, SSU1/ssu1 and ssu1/ssu1 of SSU1 along with 495 the increasing of expression level of FZF1, and SSU1 even nearly transited into additive expression 496 when the expression of FZF1 upregulated more than 10 folds (Fig 5e-f) The systematic HoIIB phenomenon related to rice yield heterosis 501 The model and the results mentioned above revealed that insufficient background contributing 502 to the homo-insufficiency is not only the limiting factor for (over-) dominant loci to reach their 503 maximum function, but also the one that causes the instability of the target genes. Therefore, 504 identification of (over-) dominant loci will provide us with a start point or hint to discover the key 505 limiting factors along the genome, or gene regulatory network that impacts such important traits as 506 yield, and thus guide the improvement of hybrids. 507 We performed candidate genes analysis for the identified SPP related trait QTLs, combining 508 GWAS, transcription analysis. The candidate genes selected within the overdominant QTLs 509 affecting SPP related traits include: (1) previously reported genes that controls SPP related traits; 510 (2) highly expressed genes in inflorescence less than 4 mm (http://ricexpro.dna.affrc.go.jp/) 51 ; (3) 511 genes with significantly differential expression between the two parents, and with expression levels 512 in the hybrids significantly deviated from the middle expression level of the two parents, based on 513 the transcriptome profiles of < 4 mm inflorescences from two combinations of PA64S×9311 and 514 JBY×ZH100 (Methods). In total, we identified 3,983 candidate genes out of the 6,906 annotated 515 genes relevant in F1_QTLs and Hmp_QTLs for SPP and the related traits (Supplementary Table  516 11). Among the candidate genes, 33 genes were the cloned genes related to rice spikelet or grain 517 number, 2,414 genes expressed in non-additive pattern in the two combinations, 1,678 genes 518 expressed highly in inflorescence, and 32 genes exhibited as panicle specific expression pattern, 519 which including the cloned genes of OSH1, OSH3 and FZP (Supplementary Table 11). Among 520 the cloned genes, OSH1, which acts as key regulatory factor in SAM development 53 , exhibits the 521 negative overdominant (NOD) expression pattern in the hybrids of JBY×ZH100, and associates 522 with both F1 and Hmp in I×9311 and I×Nip ( Supplementary Fig 54a-c). d35, which regulates the 523 panicle size as showed in two reports 54,55 , encodes the gibberellin biosynthesis enzyme and acts as 524 negative overdominance in J×Nip in both Changsha and Sanya. Further examination of non-525 synonymous SNP T/C in two-line and three-line hybrid combinations revealed that the homozygote 526 of inferior allele C has been avoided in majority of commercial hybrids ( Supplementary Fig 54d- as well as the F1 QTLs identified in 1086 three line hybrids 37 , followed by gene set enrichment 532 analysis using the candidate genes repeatedly identified by GWAS (Supplementary Fig 55). 533 Results indicated that the Nipponbare combinations have apparently more colocalized non-additive 534 QTLs than did the 9311 combinations, consistent with the fact that Nipponbare is less productive 535 than 9311 and suggesting that Nipponbare may represent a more constrained background and thus 536 easily result in non-additive effect in its F1 hybrids compared to 9311. Regarding different 537 subspecific combinations, negative overdominant QTLs were identified more frequently in indica 538 combinations for traits related to SPP and PNP but in japonica ones for KGW; however, negative 539 dominant and positive non-additive QTLs tended to be detected in japonica combinations for all traits. Regarding different environments, the colocalized non-additive QTLs tended to be detected 541 in Sanya compared to Changsha (Supplementary Table 13). These results indicated that the HoIIB 542 appeared to be taxa-and environment-systematic to some degree, but mainly determined by two 543 specific parents in the combination investigated. Secondly, the GO enrichment indicated that those 544 genes within additive QTLs seldom show enrichment, but those genes within non-additive 545 (dominant and overdominant) QTLs are frequently involved in many kinds of catalytic activities 546 and binding functions (Supplementary Fig 56, Supplementary Table 14). Compared to those 547 genes with non-additive performance in non-lethal deletion yeast strains grown in five different 548 media 52 , we also found that they enriched in the GO terms of catalytic activity (Supplementary 549 Fig 57). The enrichment in catalytic activity for non-additive genes may be explained by the reports 550 that most enzymes in organism usually operate at unsaturated substrate concentration 53,54 , i.e. at the 551 lower level of substrates, which may result in the insufficient background of these enzymes and 552 thus their non-additive performance. Further checking those genes encoding rate-limiting enzymes 553 (RLE) showed that the proportion of RLE genes in non-additive QTLs was generally higher than 554 that in additive QTLs (Supplementary Fig 58), consistent with the fact that most of the RLE 555 First, it has been recognized early that complex traits, such as grain yield in rice, often display 571 observable degrees of heterosis 11,33 . It is essential to distinguish concepts of hybrid vigor and 572 heterozygote advantage. As mentioned in the introduction, many lines of evidence have confirmed 573 that hybrid vigor can be easily achieved by cumulation or complementation of a series of balanced 574 and hierarchical additive factors. For instance, a recent study in rice indicated that additive 575 multiplication of components traits are proven to be one source of heterosis in complex trait 34 . 576 Similarly, yield heterosis in barley was predicted by the products of yield components, including 577 ears/plant, kernels/ear, and average kernel weight 56 . Heterosis for plant height in snap bean could 578 be well defined by multiplying internode length by internode number 30 . 579 However, physiologic constrains often impact the total capacity of a biological system. When 580 component traits are put together to form a more complicated trait, such as grain yield in rice, it is 581 constrained by the "short" component(s), which display considerable difference when compared to 582 their per se performance. Therefore, the degree of heterosis may be reduced in such traits as yield. 583 Regarding the relationships between yield heterosis and multiplicative effect of the yield 584 components, our results indicated that, if no dominant effect exist, the geometric multiplication of 585 additive component traits did not contribute much to yield heterosis in rice ( Supplementary Fig  586   26). 587 Second, non-additive genetic effects have been widely observed in F1 hybrids of many plant 588 species. In maize, dominance seems to be the main factors contributing to grain yield and its 589 components, with a moderate role of overdominance and possible epistatic effect 57,58 . In complementation (also refereed to as pseudo-overdominance). At the transcription level, we found 599 that there were nearly 50% of the overdominant expression genes could be explained by 600 complementary expression, which is thought as a common phenomenon 59,60 . As our observation 601 and many reports suggested, dominant cumulation and complementation between loci may be the 602 major genetic basis affecting rice yield heterosis ( Supplementary Fig 59-62). But it should be 603 noted that the dominant cumulation and complementation hypotheses were always challenged by 604 one question, that is, why does heterosis not significantly decrease after pyramiding of superior 605 alleles at dominant loci? Although this can be partially answered from the statistics point of view 1,61 , 606 we believe that our HoIIB model can give more persuasive answer as discussed below. 607 Despite the fact that multiplicative, cumulative, and complementary effects between two genes 608 is the bases of heterosis, it is rarely reported that hybrid vigor can result from the heterozygote In the present study, as evidenced by the results from yield heterosis of many rice hybrids, QTL 619 mapping, and transcriptome profiling, as well as from the theoretical kinetic simulation, we propose 620 one novel universal molecular mechanism underlying heterosis of single polymorphic locus, the 621

homo-insufficiency under insufficient background (HoIIB). 622
The HoIIB model suggests that the non-additive effect is not the intrinsic feature of the gene 623 under study, and heterosis is not the heterozygote advantage. Instead, the non-additive effect is a Furthermore, evidence from our repeatedly detected non-additive genes, followed by GO 642 enrichment analyses, indicated that enzyme catalytic activity may be a systematic HoIIB 643 phenomenon that causes the non-additive effect, i.e. heterosis ( Supplementary Fig 56-57 To investigate generality of the HoIIB, we compared it with several known mechanisms, 653 models, and phenomena about heterosis. The nonlinearity of the enzyme catalytic system was 654 frequently described to explain heterosis 65,66 . But as our simulations indicated, nonlinearity is not the absolute feature of enzyme catalytic activity, it only occurs when the substrate is insufficient to 656 support the full function of the enzyme. In fact, the gene related to the enzyme can also be linear or 657 additive when the substrate concentration is sufficient. BÄurger and Bagheri insisted that the 658 dominance can be evolutionally modified, after comparisons of the models proposed by Wright and 659 Kacser-Burns, respectively 67,68 , they pointed out that the output gain curve will change from non-660 linear to relative linear, and the dominance will transit to additive effect, if one mutation results in 661 a decrease in Kcat that leads to a lower saturation level (i.e. a status that substrate saturates the The balance between genes involved in a biological complex system is another important 672 hypothesis about heterosis. This hypothesis suggests that an imbalance in the concentration of the 673 subcomponents of a protein-protein complex / pathway / network can be deleterious. The typical 674 example for gene balance was reported by Balazs and coleagues 52,71 . Their studies indicated that 675 mutation of the subunit in a complex (or the factor in an interaction pair) can result in imbalance 676 and thus is harmful, which might indicate the impact of gene imbalance on dominance. However, 677 these studies did not consider the effects from the counterpart background. Thus we simulated the 678 effects of complex background on dominance. Our results indicated that the non-additive effect will 679 become weaker and even loss, when background of the considered subunit gets sufficient 680 ( Supplementary Fig 63a-c). These lines of evidence indicated that heterosis is the result of low 681 function of homozygote (homo-insufficiency), rather than the heterozygote advantage. The HoIIB model may affect future utilization of heterosis in several aspects. First, our HoIIB 705 model indicated that in most cases heterosis is not the consequence of heterozygote advantage, but 706 homozygote disadvantage under insufficient background. This implies that current utilization of 707 heterosis is not the best way to take advantage of maximum function of target genes 37 . Therefore, 708 we need to identify and improve the constrained factor(s), or the target genes. In the present study, 709 we extensively investigated yield QTLs or genes affecting parental lines, hybrids, and heterosis, 710 followed by dissection of their genetic effect ( Supplementary Fig 24 and 55). This may provide 711 us with references to identify the limiting factors. In theory, the frequently detected additive factors 712 may represent the systematic limiting factors that constrain the dominant or even overdominant 713 factors from maximizing their functions. When comparing the frequently detected additive vs 714 overdominant QTLs affecting yield traits, we observed that the candidate genes within the additive 715 QTLs displayed distinctly lower expression, compared to those within the overdominant ones 716 (Supplementary Fig 64). Of course, lower expression just represents one aspect of the insufficient 717 function of the genes, we may expect to observe the other aspects of insufficient functions, such as 718 enzyme activity and affinity. These results implied that the frequently detected additive factors, 719 rather than the dominant or overdominant factors, should be focused in future breeding programs. 720 For genes involved in a complex where the gene balance theory applies, we can take advantage of 721 their maximum functions in homozygote by improving the background rather than their 722 compromised functions in heterozygote in order to fit the insufficient background. Theoretically, 723 we can easily make use of homozygote that can maximize the functions of the target genes of 724 interest, which can be achieved by the improvement of the corresponding factors as the insufficient 725 genetic background of target genes. 726 Second, although the above discussion may illude us to think that hybrid breeding is not 727 necessary, our point is that utilization of heterosis will still be an important breeding strategy for a 728 long time and even forever. First, from the perspective of favorable alleles accumulation, even 729 though all insufficient factors can be improved to their maximum functions, followed by integration 730 into an inbred line in theory, it is impossible to be realized in a short time (Supplementary Fig 65). 731 Second, it is an extremely long process to construct the regulation network and thus clearly 732 understand the mutual dependency between genes. Third, the mechanism of HoIIB implies that 733 non-additive is common phenomenon in life system. The reason is simple, that is, it is not expected 734 that the factors in a system operate on the exactly required dependency each other. Thus, one most 735 insufficient factor will result in a batch of factors that present different degrees of homo-736 insufficiency. So we may expect to find less additive factors than non-additive ones, including 737 partial-dominance, dominance and over-dominance. This is really consistent to our results. We 738 detected distinctly less additive QTLs (about 19%) than non-additive ones (including partial 739 domiance), and less genes with additive expression pattern (about 13%) than non-additive ones. 740 Fourth, genetic improvement is a dynamic process, involving the alleviation of insufficiency for 741 one factor, followed by induction of insufficiency for the counterpart factor, that is, breakdown of 742 old balance along with establishment of new unbalance, plus the background change under different environments. For example, it has been proved that improvement of corn hybrids is mainly 744 attributed to the improvement of their inbred parental lines. The high performance of inbreds did 745 not decrease the degree of heterosis in hybrid corn breeding 1 . This phenomenon has also been 746 observed in other organisms, such as cotton 61 . These results indicated that the dynamic breeding 747 process contributes substantially for the continuous improvement of both inbreds and hybrids. Our 748 results also indicated that we may consider different aspects, when we try to improve a variety. For 749 example, we need to overcome the weakness of heterosis for SPP related traits under short-day 750 environment (Fig 1d). 751 Third, the HoIIB model helps our understanding and utilization of general combining ability 752 (GCA) and special combining ability (SCA), and provides guidance in breeding by genome 753 selection. It was well known that additive effects contribute mainly to GCA, and non-additive 754 effects, including dominance and epistasis, to SCA 75 . Our HoIIB model suggested that the additive 755 factors are less background-sensitive, compared to the non-additive factors, which explain why 756 SCA is more difficult to predict than GCA does. In addition, improvement of GCA through 757 accumulation of additive superior alleles has proven to be an efficient strategy in hybrid breeding 76 . The parental varieties were re-sequenced as part of in the rice 3,000 rice genomes 801 project 41 . Genomic DNA was prepared from the leaves of a single young plant for each variety 802 by a modified CTAB method. After the quality control, at least 3 μg genomic DNA of each 803 sample was randomly fragmented by sonication and size-fractionated by electrophoresis, and 804 DNA fragments of approximately 500 bp were purified. Each sequencing library was 805 sequenced in six or more lanes on the HiSeq2000 platform and 90 bp paired-end reads were 806 generated. Subsequently, the reads from each sample were extracted based on their unique 807 nucleotide multiplex identifiers as 83 bp reads (90 -6 -1, where 1 is the ligation base "T"). To 808 ensure high quality, raw data was filtered by deleting reads having adapter contamination or 809 containing more than 50% low quality bases (quality value ≤ 5). 810 The 83-bp paired-end reads of 267 rice varieties were mapped to the temperate japonica 811 Nipponbare reference genome (IRGSP-1.0) using the BWA software with default parameters 812 except for "aln -m 10000 -o l -e 10 -t 4". The alignment results were then merged and indexed 813 as BAM files 77 . SNP calling was based on the alignment using the Genome Analysis Toolkit The middle parent heterosis value (Hmp) of each combination for each trait was measured 852 as: F1-(P1+P2)/2, i.e. the deviation of F1 from middle parent performance, where F1, P1 and P2 853 represent the phenotypic values of each trait in F1, P1 and P2 respectively. In addition, we 854 denoted the over higher parent heterosis (OHP) when the F1 shows the phenotypic value over 855 the higher parent, range between the two parents (RBP) when the F1 shows the phenotypic 856 value between two parents and below lower parent heterosis (BLP) when the F1 shows the 857 phenotypic value below the lower parent. Here, y represents phenotype, X represents genotype matrix, P is the matrix of population 884 structure and K is the matrix of genetic similarity between individuals. α and β represent fixed 885 effects of genotype and population structure, and μ and e represent random effects of kinship 886 and residuals. The first five principal components were used to estimate the population 887 structure. The matrix of genetic similarity based on simple SNP matching coefficients was used 888 to model the variance-covariance matrix of the random effect. 889 To avoid the over correction of the Bonferroni method, we used FDR to control overall 890 errors as following. Permutation tests were used to estimate the FDR 81 . For each examined 891 trait, we reshuffled the original phenotype data, and then performed association analysis using 892 GAPIT with the same parameters. After 1000 permutations, we got 1000 association p value 893 from permutation (p_per) for each SNP and we set the highest -log(p_per) as the FDR of that 894 SNP. The SNP was denoted as significant association when the -log(p_GWAS) is no less than 895 the highest -log(p_per), where p_GWAS is the association p value for each SNP  H. The additive effect of each SNP was half of the absolute difference between the two 906 homozygotes, i.e. a = abs(PA-(PA+PB)/2). The traditional estimation for dominance effect was 907 expressed as d = FH-(PA+PB)/2, here we rescaled the dominance effects identified as d = FH-908 (FA-PA)-(PA+PB)/2. In which FA -PA means the background heterozygous effects. 909 Then, we delimitated those significant SNP with tight LD (linkage disequilibrium) as one 910 QTL and selected the tagSNP of each QTL. We first constructed the LD block using GAB 911 algorithm for all significant SNPs with r 2 >= 0.8 and selected out one tagSNP for each block 82 . 912 If one block size was larger than 20kb and there were no less than 3 significant SNPs in the 913 block, we defined the block as one QTL and the QTL was named by the position of its tagSNP. 914 The additive and dominant effects of this QTL were estimated by the average additive and 915 dominant effects of the significant SNPs in the QTL. 916 Finally, we defined the additive, dominant and over-dominant QTLs. A QTL is referred to 917 as over-dominance preferred if the absolute ratio of dominant effect to additive effect (|d/a|, 918 degree of dominance) is no less than 1.5, and (partial-) dominance preferred if 0.5≤|d/a|<1.5, 919 and additive preferred if |d/a|<0.5. The dominant and over-dominant QTLs can further be 920 classified as positive ones when their d > 0 or negative ones when their d < 0. 921

922
The repulsive effects of more than one locus within one QTL may be one mechanism 923 resulting in non-additive effect 36 . So we estimated the repulsive degree in each QTL. If there 924 are n significant SNPs showing the same type of dominant effect (positive or negative) in one 925 QTL, then its maximum pairwise SNPs number is N = . We denoted the SNP as Sref if 926 phenotype effect of its allele with genotype being same to tester, and Salt otherwise, then the 927 maximum possible pairwise repulsive SNPs number is R = Sref×Salt. Thus, the repulsive degree 928 of the QTL was RD = R/N. The RD is larger in one QTL, the higher potential the QTL involves 929 repulsive effect. 930 gene was classified as negative over-dominance (NOD); if LYP9 is significantly higher than