Endosperm types analysis within different subgroups
A total of 2108 rice accessions, including 1965 non-glutinous (or non-waxy) and 143 glutinous (or waxy) rice accessions (http://snp-seek.irri.org/), were used to identify waxiness-related genetic loci and to analyze the differentiation underlying the development of both endosperm types (Table S1). In addition, 17,132,232 SNPs of the rice panel were obtained from 3KRGP. Subsets of these data were further filtered and used in the subsequent analyses.
A sound assessment of population structure helps to detect phenotypic differences and supports subsequent GWAS in a natural population. Using Admixture software, we estimated ancestry at varying values of K within the rice population. The indica and japonica subpopulations appeared clearly at K = 2 (Fig.1A). Principal component (PC) analysis indicated that the top three PCs explained 17.0%, 6.1% and 2.3% of the genetic variation within the rice panel, which supported the presence of two main subpopulations (Fig.1B). Referring to the recent results for 3,010 rice accessions, we classified the panel into two major subpopulations, 1298 indica and 810 japonica, although there were several atypical indica and japonica accessions (Table S1). Hence, the endosperm types of the rice subgroups were compared, and the top three PCs were used as covariates to control for subgroup structure in GWAS.
Among the 143 glutinous rice accessions, there were 70 indica and 73 japonica (Table S1, Fig.1B), suggesting that broad genetic variation of the trait occurred in both indica and japonica. To study the external factors underlying glutinous differentiation, we investigated the geographic distribution of accessions with different endosperm types. The vast majority of glutinous rice accessions were distributed in SEA, SER and EAR, with 75, 31 and 19 accessions, respectively (Fig.1C). In contrast, non-glutinous rice, as the major endosperm type, was widely distributed across the whole rice-growing area (Fig.1D). This geographic distribution was consistent with previous research that reported artificial selection of glutinous rice in Southeast Asia [11, 29]. Taken together, these results suggested that there was large genetic differentiation among glutinous rice accessions, although they were relatively concentrated geographically.
Identification of waxy trait QTLs by GWAS
Under a linear mixed model (LMM) with the kinship matrix (K) and the top three PCs (Q), GWAS was performed to study the genetic basis of endosperm types. Quantile-quantile (Q-Q) plots showed that the LMM efficiently controlled for population structure and relatedness, as there were no inflated P values and the majority (95%) of markers exhibited P values equal to or lower than those expected under the null hypothesis (Fig.2A, B and C). In total, 3,338 SNPs located in 399 annotated genes (including the gene region and the 2 kb promoter region) were identified as associated with endosperm type at a threshold of -log(P) = 5.6 (Table S2). Taking into account the large genetic differences between the glutinous accessions of japonica and indica (Fig.1A and B), we further conducted GWAS within indica and within japonica to explore subpopulation-specific waxy genes. According to the above criteria, a total of 2,670 and 1,034 associated SNPs were identified in indica and japonica, located in 262 and 157 annotated genes, respectively (Fig.2B and C, Table S2). The detection efficiency of GWAS was highest in the whole panel, where the most associated sites were found (Table S2). Comparing the GWAS results of the three populations, 1,424 significant SNPs (53.3%) of indica and 546 significant SNPs (70.3%) of japonica were also detected in the whole panel (Table S2). Interestingly, some significant loci were detected simultaneously in the whole panel and in both subpopulations, including 244 common SNPs located in 32 annotated genes (Fig.2D and E, Table S3), indicating that these genes are important and play a conserved role in endosperm type across subpopulations.
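As a rough illustration of the thresholding and cross-panel comparison described above, the following sketch selects SNPs whose -log(P) reaches 5.6 and computes the share of one panel's significant SNPs that is also significant in another. The SNP identifiers and P values are invented for illustration, the function names are hypothetical, and the base-10 logarithm is an assumption, since the text writes only -log(P).

```python
import math

THRESHOLD = 5.6  # -log(P) cutoff quoted in the text; base 10 is assumed here


def significant_snps(pvalues, threshold=THRESHOLD):
    """Return the set of SNP ids whose -log10(P) meets the threshold."""
    return {snp for snp, p in pvalues.items()
            if p > 0 and -math.log10(p) >= threshold}


def recovered_fraction(subset_hits, panel_hits):
    """Fraction of subset_hits that is also significant in panel_hits."""
    if not subset_hits:
        return 0.0
    return len(subset_hits & panel_hits) / len(subset_hits)


# Toy P values (hypothetical, not taken from Table S2).
whole = {"snp1": 1e-9, "snp2": 3e-7, "snp3": 0.02, "snp4": 1e-6}
indica = {"snp1": 1e-8, "snp2": 0.4, "snp3": 1e-7, "snp4": 2e-6}

whole_hits = significant_snps(whole)
indica_hits = significant_snps(indica)
common = whole_hits & indica_hits  # SNPs significant in both panels
print(sorted(common), recovered_fraction(indica_hits, whole_hits))
```

Applied to the real result tables, the same set intersection would give the common SNPs shared by the whole panel and the subpopulations.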
Exploration of candidates for endosperm type in rice
Genes that truly affect rice endosperm type are expected to be adequately expressed in seeds at the grain-filling stage, as is the case for OsAGPL2, Wx and OsSSIIIa (Fig.S1). To further screen candidate genes for endosperm type in the QTL regions, we first analyzed the expression levels of the candidate genes in rice seeds at two periods of seed development (7-8 and 10-14 days after flowering). Among them, 127 of 399 candidates in the whole rice panel, 81 of 262 candidates in indica, and 48 of 156 candidates in japonica showed moderate expression in at least one period (FPKM and RPKM > 10) (Table S4). To further verify the reliability of the combined analysis of GWAS and expression level, we compared the GWAS-detected candidate genes with the known waxy genes. Three starch synthesis-related genes, Wx (LOC_Os06g04200), SSG6 (LOC_Os06g03990) and OsSSIIa (LOC_Os06g12450), were detected across the three populations. As the Manhattan plots showed, these known genes gave top signals in the whole rice panel and in the subpopulations (Fig.3A, B and C). Meanwhile, OsSSI (LOC_Os06g06560) showed association with endosperm type in the GWAS results of the whole rice panel (Fig.3D). According to functional reports, all of these genes are involved in starch synthesis: Wx is a well-known gene controlling amylose synthesis, and multiple alleles of it have been identified [30-34]. The other three genes affect the morphology and amylose content of starch [35-37]. The comparison of the GWAS results with known genes indicated that the GWAS results for endosperm type were credible and that the four known genes were key loci for natural variation in rice endosperm type.
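The expression filter described above can be sketched as follows: a candidate is kept if its expression exceeds 10 in at least one of the two seed-development periods. The gene names and expression values are placeholders, and the function name is hypothetical; this is only a minimal sketch of the stated criterion, not the pipeline actually used.

```python
MIN_EXPRESSION = 10.0  # FPKM/RPKM cutoff quoted in the text


def expressed_candidates(expression, cutoff=MIN_EXPRESSION):
    """Keep genes whose expression exceeds the cutoff in at least one period."""
    return [gene for gene, periods in expression.items()
            if any(value > cutoff for value in periods.values())]


# Hypothetical expression values; DAF = days after flowering.
expression = {
    "geneA": {"7-8 DAF": 35.2, "10-14 DAF": 50.1},  # expressed in both periods
    "geneB": {"7-8 DAF": 2.3, "10-14 DAF": 11.0},   # expressed in one period
    "geneC": {"7-8 DAF": 0.4, "10-14 DAF": 9.9},    # below cutoff in both
}

kept = expressed_candidates(expression)
print(kept)
```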
Natural variation in three key genes responsible for rice endosperm type
Exploring the natural variation of key endosperm-type genes is beneficial for breeding high-quality rice. We performed haplotype analysis to identify elite alleles of the three key genes (Wx, SSG6 and OsSSIIa) for rice endosperm type. First, candidate-gene association analysis was performed between endosperm type and 537 SNPs with MAF > 0.01 located in the three known genes. Of these, 100 SNPs were significantly associated with rice endosperm type (-log(P) > 2). Here, we focused on non-synonymous SNPs, SNPs at splice sites and SNPs in promoters (Table S5), as these SNPs could be responsible for functional variation through changes in expression and protein sequence [25, 26, 38]. A total of 37 significant SNPs were identified within the Wx gene, including two non-synonymous SNPs, one SNP at a splice site, and 34 in promoter or UTR regions. Twenty-six haplotypes, named Wx-1 to Wx-26, were identified in the whole panel (Fig.4A). Twenty-four of the 26 haplotypes were detected in indica, eight of which showed moderate frequencies ranging from 5% to 23.3%. By comparison, 47.8% and 24.1% of japonica accessions carried Wx-8 and Wx-9, respectively, suggesting that genetic variation of Wx was larger in indica than in japonica (Fig.4A). Previous studies showed that Chr6_1765761 is a key functional SNP for post-transcriptional modification of Wx. A mutation in the fifth exon of Wxb has been reported to induce a lower AC than that of glutinous rice. In our study, we did not detect a Wx haplotype unique to waxy rice. Wx-9 (allele T at Chr6_1765761) was considered the waxy haplotype, as 37 of 108 indica and 51 of 173 japonica accessions carrying it were glutinous rice (Fig.4A and Fig.S2). The results suggested that Wx is not the only key gene accounting for natural variation in the proportion of amylopectin and amylose in rice. Waxiness of rice, as a physiological trait, is often the result of continuous joint changes in multiple biochemical processes of starch biosynthesis.
Based on 17 significant SNPs within SSG6 (twelve in the promoter, one in the 5’UTR, three non-synonymous SNPs and one in the 3’UTR), nine major haplotypes, named SSG6-1 to SSG6-9, were identified in the whole panel. SSG6-2, SSG6-3, SSG6-7 and SSG6-8 were predominant in indica varieties, accounting for 15.2%, 43.4%, 10.1% and 22.3% of the total, respectively. In japonica, SSG6-1, SSG6-3, SSG6-5, SSG6-6 and SSG6-7 were predominant, accounting for 10.1%, 11.2%, 14.4%, 19.0% and 29.1%, respectively (Fig.4B). The results indicated a certain degree of genetic differentiation of SSG6 between indica and japonica, although two haplotypes were shared between them. Further analysis showed that SSG6-7 could be considered the main waxy haplotype, because 41 of 131 indica and 20 of 236 japonica accessions carrying SSG6-7 were glutinous rice. Additionally, a japonica-specific glutinous haplotype, SSG6-5, was detected: 47 of 117 japonica accessions carrying SSG6-5 were glutinous rice (Fig.4B and Fig.S2). Meanwhile, we detected five haplotypes of the OsSSIIa gene (named OsSSIIa-1 to OsSSIIa-5), based on 10 significant SNPs (five in the promoter, one in the 5’UTR, two non-synonymous SNPs and two in the 3’UTR). OsSSIIa-1, OsSSIIa-2 and OsSSIIa-3 were predominant in the whole panel (Fig.4C). There was no obvious genetic differentiation of OsSSIIa between indica and japonica. OsSSIIa-1 could be considered the waxy haplotype, as 58 of 296 indica and 54 of 212 japonica accessions carrying OsSSIIa-1 were glutinous rice (Fig.4C and Fig.S2).
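The haplotype tallies reported above can be reproduced in outline with a few lines of code. The sketch below builds each accession's haplotype as the ordered string of its alleles at the selected SNPs, then counts carriers and glutinous carriers per subpopulation. The SNP identifiers, alleles and accessions are invented, and the function name is hypothetical; the actual haplotype calling may have handled missing or heterozygous genotypes differently.

```python
from collections import defaultdict


def haplotype_summary(accessions, snp_order):
    """Tally carriers and glutinous carriers per (subpopulation, haplotype)."""
    counts = defaultdict(lambda: {"total": 0, "glutinous": 0})
    for acc in accessions:
        # A haplotype is the ordered string of alleles at the chosen SNPs.
        haplotype = "".join(acc["alleles"][snp] for snp in snp_order)
        entry = counts[(acc["subpop"], haplotype)]
        entry["total"] += 1
        entry["glutinous"] += int(acc["glutinous"])
    return dict(counts)


# Toy accessions with two hypothetical SNPs, s1 and s2.
accessions = [
    {"subpop": "indica", "alleles": {"s1": "T", "s2": "A"}, "glutinous": True},
    {"subpop": "indica", "alleles": {"s1": "T", "s2": "A"}, "glutinous": False},
    {"subpop": "indica", "alleles": {"s1": "G", "s2": "A"}, "glutinous": False},
    {"subpop": "japonica", "alleles": {"s1": "T", "s2": "A"}, "glutinous": True},
]

summary = haplotype_summary(accessions, ["s1", "s2"])
for (subpop, hap), n in sorted(summary.items()):
    print(subpop, hap, n["glutinous"], "of", n["total"])
```

Dividing the two counts gives the glutinous share of a haplotype; for example, the Wx-9 counts reported in the text correspond to 37/108, about 34.3%, in indica and 51/173, about 29.5%, in japonica.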
Taken together, we identified the key glutinous haplotype of each gene (Fig.S2), although none of them completely determined the waxiness of rice. This suggests that waxiness of rice, as a physiological trait, is determined by a complex network rather than by single genes of the biochemical synthesis pathway in the traditional sense. To test this hypothesis, we first examined the geographical distribution of the different haplotype combinations of the three genes. In total, there were 27 haplotype combinations in 124 glutinous rice accessions; the combinations carried by more than three accessions are listed (Fig.5A). Among the 75 glutinous rice accessions of SEA, 33 carried the haplotype combination of Wx-9, SSG6-7 and OsSSIIa-1, and 12 carried the combination of Wx-9, SSG6-5 and OsSSIIa-1. The combination of Wx-9, SSG6-5 and OsSSIIa-1 was also predominant in SER, while most glutinous accessions of EAR carried the combinations of Wx-10, SSG6-7 and OsSSIIa-1 or Wx-10, SSG6-5 and OsSSIIa-1 (Fig.5B), indicating that haplotype combinations carrying more glutinous alleles gave rise to more glutinous rice and that glutinous alleles were conducive to the formation of glutinous rice in SEA, SER and EAR.
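The tabulation of haplotype combinations described above amounts to counting tuples. The sketch below counts three-gene combinations among glutinous accessions, keeps those carried by more than three accessions, and breaks the counts down by region. The haplotype and region labels follow the text, but the records and their counts are invented, and the function name is hypothetical.

```python
from collections import Counter


def combination_counts(records, min_carriers=4):
    """Count (Wx, SSG6, OsSSIIa) combinations overall and per region.

    records: iterable of (region, wx, ssg6, ossiia) tuples, one per
    glutinous accession. Combinations with fewer than min_carriers
    accessions are dropped from the overall listing.
    """
    overall = Counter(rec[1:] for rec in records)
    frequent = {combo: n for combo, n in overall.items() if n >= min_carriers}
    by_region = Counter((rec[0], rec[1:]) for rec in records)
    return frequent, by_region


# Toy records (hypothetical counts, not those of Fig.5).
records = (
    [("SEA", "Wx-9", "SSG6-7", "OsSSIIa-1")] * 4
    + [("SER", "Wx-9", "SSG6-5", "OsSSIIa-1")] * 2
    + [("SEA", "Wx-9", "SSG6-5", "OsSSIIa-1")] * 2
    + [("EAR", "Wx-10", "SSG6-7", "OsSSIIa-1")]
)

frequent, by_region = combination_counts(records)
print(frequent)
print(by_region[("SEA", ("Wx-9", "SSG6-7", "OsSSIIa-1"))])
```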
Population structure and genetic differentiation of three key genes between both endosperm types
The sequence alignment of the three key waxy genes and the geographical distribution of their haplotype combinations suggested that the genetic differences underlying the waxiness trait were greater among regions than between subpopulations in rice. To test this hypothesis, we investigated the population structure and admixture patterns of each gene in the whole rice panel. We first estimated the ancestry proportions of Wx, SSG6 and OsSSIIa for individual accessions with Admixture. The population structure based on each of the three genes differed from that based on the whole genome. The admixture model using 202 SNPs within the Wx gene indicated that 53 of 70 glutinous indica accessions clustered with glutinous japonica accessions (Fig.6A). The admixture model using 123 SNPs within the SSG6 gene indicated that 42 of 70 glutinous indica accessions clustered with glutinous japonica accessions, and that one glutinous japonica accession clustered with the other 28 glutinous indica accessions (Fig.6B). Additionally, the admixture model using 194 SNPs within the OsSSIIa gene showed that 66 of 70 glutinous indica accessions clustered with glutinous japonica accessions, and that one glutinous japonica accession clustered with the other 4 glutinous indica accessions (Fig.6C). The results confirmed that there was no obvious genetic differentiation of the three key waxy genes between japonica and indica distributed in SEA, SER and EAR, which was supported by further PC analysis (Fig.6C).
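One simple way to turn ancestry proportions into the kind of counts given above is sketched below. It assigns each accession to its largest ancestry component and counts how many glutinous indica accessions fall in the component that holds most glutinous japonica accessions. This assignment rule is an assumption made for illustration; the text does not state how cluster membership was decided. The ancestry values and function names are hypothetical.

```python
from collections import Counter


def dominant_component(q_row):
    """Index of the ancestry component with the largest proportion."""
    return max(range(len(q_row)), key=lambda k: q_row[k])


def indica_sharing_japonica_cluster(accessions):
    """Return (shared, total) for glutinous indica accessions.

    shared counts those whose dominant component equals the component
    that is dominant in most glutinous japonica accessions.
    """
    glutinous = [a for a in accessions if a["glutinous"]]
    japonica_clusters = Counter(
        dominant_component(a["q"]) for a in glutinous if a["subpop"] == "japonica"
    )
    main_cluster = japonica_clusters.most_common(1)[0][0]
    indica = [a for a in glutinous if a["subpop"] == "indica"]
    shared = sum(dominant_component(a["q"]) == main_cluster for a in indica)
    return shared, len(indica)


# Toy ancestry proportions at K = 2 (hypothetical values).
accessions = [
    {"subpop": "japonica", "glutinous": True, "q": [0.9, 0.1]},
    {"subpop": "japonica", "glutinous": True, "q": [0.8, 0.2]},
    {"subpop": "indica", "glutinous": True, "q": [0.7, 0.3]},
    {"subpop": "indica", "glutinous": True, "q": [0.2, 0.8]},
    {"subpop": "indica", "glutinous": False, "q": [0.1, 0.9]},
]

shared, total = indica_sharing_japonica_cluster(accessions)
print(shared, "of", total)
```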
The exceptional genetic similarity among glutinous rice revealed by the PC and admixture analyses could have been caused by a unique domestication process. The origin of the waxy haplotypes of the three genes and how they spread in japonica and indica rice are two key questions for understanding the formation of glutinous rice. Here, we first examined the haplotypes of the three known genes in wild rice. There were 72, 64 and 52 haplotypes of Wx, SSG6 and OsSSIIa in wild rice, respectively. The waxy haplotype Wx-9 of the Wx gene was detected in 3 wild rice accessions, which were from Thailand and China. The results indicated that the waxy haplotype Wx-9 could have been inherited from wild rice, but it is very unlikely that all waxy haplotypes in both rice subpopulations originated directly from a small number of wild rice accessions (Fig.7A). Additionally, no wild rice accession carried the waxy haplotypes SSG6-5 and SSG6-7 of SSG6 or the waxy haplotype OsSSIIa-1 of OsSSIIa (Fig.7B and C), suggesting that the waxy haplotypes of SSG6 and OsSSIIa arose newly during rice domestication. Taken together, a more plausible hypothesis for the exceptional genetic similarity among glutinous rice is substantial local gene flow of Wx, SSG6 and OsSSIIa between indica and japonica in SEA, SER and EAR.
To further test the hypothesis of gene flow and examine its direction, we performed phylogenetic analyses using all haplotypes of each gene. For the Wx gene, the waxy haplotype Wx-9 clustered with other japonica haplotypes and formed a monophyletic group (Fig.7A). Meanwhile, the two waxy haplotypes SSG6-5 and SSG6-7 of SSG6 clustered together at a long genetic distance from the other haplotypes of cultivated rice (Fig.7B). Additionally, the waxy haplotype OsSSIIa-1 clustered with OsSSIIa-2 and two wild haplotypes (Fig.7C). The phylogenetic trees in cultivated rice indicated that the waxy haplotypes of each gene were closer to the corresponding japonica haplotypes than to indica haplotypes: Wx-9 was close to Wx-7/8/10, SSG6-5/7 was close to SSG6-6, and OsSSIIa-1 was close mainly to japonica, as the haplotype OsSSIIa-2 accounts for 46.5% of japonica accessions (Fig.S3). Given that japonica was first domesticated from wild rice in southern China, and that indica was subsequently developed from crosses between japonica and local wild rice, we suggest that the glutinous haplotypes of the three genes first evolved in japonica rice or were directly inherited from wild rice, and then flowed into indica rice in SER, SEA and EAR.