qCL1.2 detection using a CSSL population
In a previous study, a set of CSSLs was constructed in our laboratory. The donor wild rice parental plant has a procumbent phenotype. To identify genes controlling PH during rice domestication, we conducted a QTL analysis for PH using this CSSL population. The PHs of the CSSLs were investigated under five environmental conditions (Table 1). PH differed substantially within the CSSL population (Fig. S1). Genotyping was performed using 157 molecular markers, including 97 SSR markers and 60 InDel markers. The linkage map of the SSR/InDel markers is shown in Fig. S2a. In total, 11 QTLs correlated with PH were identified under the five environmental conditions (Table 2). One QTL, located near InDel 1-16 on Chr. 1, was detected in four environments, had the highest LOD value (45.01 in E3) and explained 48% of the PH variance (Table 2), indicating that it is likely a main-effect QTL. This QTL was named qCL1.2. One CSSL, CSSL28, which had the greatest PH in the CSSL population and harbored qCL1.2, was selected for further study. The CSSL28 genotype is shown in Fig. S2a. Only two substituted segments from wild rice were detected using SSR/InDel markers in the whole CSSL28 genome (Fig. S2a); therefore, CSSL28 was considered a near-isogenic line (NIL) of qCL1.2.
Phenotypic characteristics of parental lines CSSL28 and 9311 and their F1 generation
CSSL28 showed a significantly greater PH than the recurrent parent 9311. The PH of CSSL28 was 180.3 cm, while that of 9311 was 116.8 cm (in E5). The F1 was generated from a cross with CSSL28 as the female parent and 9311 as the male parent. The resulting F1 individuals were as tall as CSSL28 (Fig. 1a,b). A difference in PH between CSSL28 and 9311 was clearly evident at the seedling stage (Fig. S2b) and was significant from 30 d after sowing. The difference in PH between CSSL28 and 9311 was extremely significant at the heading stage, reaching ~63 cm on average (Fig. 1c). The panicle and internode lengths of 9311 and CSSL28 were also measured (Fig. 1d,e). The basal three internodes of CSSL28 were similar in length to those of 9311. However, the upper three internodes and panicles of CSSL28 were longer than those of 9311. The second and third internodes of CSSL28 were longer than those of 9311 by ~18.4 and ~17.7 cm, respectively. Together, the second and third internodes accounted for approximately 43.8% of the total culm length in CSSL28, compared with 37.6% in 9311 (Fig. 1f). Thus, the increase in CSSL28 PH was mainly caused by the elongated second and third internodes.
To determine the cause of the differences in PH, we examined transverse and longitudinal sections of the internodes of CSSL28 and 9311 (Fig. 2). The transverse sections of the third internodes from the main culms indicated that the CSSL28 cells, especially the vascular cells, were much larger than those of 9311. The longitudinal sections of the internodes suggested that there was no significant difference in cell length between CSSL28 and 9311. Similar results were observed for the second, fourth and fifth internodes. However, for the first and basal internodes, no differences between CSSL28 and 9311 were observed in the transverse sections. Because the stems of CSSL28 are much thicker than those of 9311, we deduced that the increased PH of CSSL28 resulted from an increased cell number and cell size at the first through fifth internodes, rather than from an increased cell length.
Fine mapping of qCL1.2 and gene prediction
An F2 population of CSSL28/9311, containing 402 individuals, was constructed for genetic analysis in the summer of 2017. The segregation of PH fit a 3:1 ratio (χ² = 1.76 < χ²(0.05,1) = 3.84), consistent with single-gene inheritance. In 2018, two segregating F3 populations derived from a single heterozygous plant were used for further genetic analyses. One F3 population, containing 1611 individuals, was planted in E3, and the other, containing 928 individuals, was planted in E5. As shown in Figure 3a,b, PH showed a bimodal distribution, and similar 3:1 segregation ratios were obtained (χ² = 2.20 < χ²(0.05,1) = 3.84 and χ² = 3.31 < χ²(0.05,1) = 3.84). These results indicated that the difference in PH between CSSL28 and 9311 was controlled by a single QTL, qCL1.2.
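The segregation test above is a chi-square goodness-of-fit test against a 3:1 expectation with one degree of freedom. The following is a minimal sketch of that calculation, assuming the plants have already been scored into two PH classes. The function names and the counts in the usage lines are hypothetical, because the observed class counts are not given in the text.

```python
# Chi-square goodness-of-fit test of a 3:1 (tall:short) segregation ratio.
# The counts used below are hypothetical placeholders, not data from this study.

CRITICAL_CHI2_005_DF1 = 3.84  # critical value at alpha = 0.05, 1 degree of freedom


def chi_square_3_to_1(n_tall, n_short):
    """Return the chi-square statistic for observed counts against a 3:1 ratio."""
    total = n_tall + n_short
    expected_tall = total * 3 / 4
    expected_short = total * 1 / 4
    return ((n_tall - expected_tall) ** 2 / expected_tall
            + (n_short - expected_short) ** 2 / expected_short)


def fits_3_to_1(n_tall, n_short, critical=CRITICAL_CHI2_005_DF1):
    """True if the 3:1 hypothesis is not rejected at the 0.05 level."""
    return chi_square_3_to_1(n_tall, n_short) < critical


if __name__ == "__main__":
    # Hypothetical split of a 402-plant F2 population.
    stat = chi_square_3_to_1(300, 102)
    print(f"chi-square = {stat:.3f}, fits 3:1: {fits_3_to_1(300, 102)}")
```

A statistic below 3.84 means that the 3:1 hypothesis is not rejected at the 0.05 level, which is the criterion applied to the F2 and F3 populations.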
We located qCL1.2 between RM128 and RM472 (near InDel 1-16) on Chr. 1. To narrow qCL1.2 to a smaller region, we selected molecular markers within this interval. One InDel and eight SSR markers (InDel1-14, RM486, RM5389, RM11908, RM11986, RM11928, RM11960, RM11974 and RM11982) with polymorphisms between CSSL28 and 9311 were selected. Using ~2,000 F3 segregating individuals, qCL1.2 was narrowed to a 131-kb interval between RM11974 and RM11982 (Fig. 3c). According to the Rice Genome Annotation Project Database (http://rice.plantbiology.msu.edu/), this interval may include 13 candidate genes (Table S1), including the “green revolution” gene sd1 (LOC_Os01g66100). By sequencing LOC_Os01g66100, we found a synonymous SNP in the first exon and a non-synonymous SNP in the third exon of the coding region; the latter changed a tyrosine codon in CSSL28 to a termination codon in 9311 (Fig. S3). In addition, the promoter region differed at 17 sites between CSSL28 and 9311. Functional defects in the SD1 gene result in serious PH changes. Therefore, we hypothesized that qCL1.2 is the SD1 gene and that the extremely high PH of CSSL28 results from the wild rice SD1 allele.
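To illustrate how a single SNP can convert a tyrosine codon into a termination codon, the sketch below enumerates the single-base substitutions that turn a codon into a stop codon under the standard genetic code. It is a generic illustration only: the actual nucleotide change in SD1 is shown in Fig. S3 and is not reproduced here, and the function name is hypothetical.

```python
# Illustration of how one nucleotide substitution can turn a tyrosine codon
# into a stop codon. The codon assignments are from the standard genetic code;
# the specific substitution in SD1 is not stated here, so the example is generic.

TYROSINE_CODONS = {"TAT", "TAC"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def single_base_stop_mutations(codon):
    """List (position, new_base, stop_codon) for each single-base change that gives a stop."""
    hits = []
    for pos in range(3):
        for base in "ACGT":
            if base == codon[pos]:
                continue  # not a substitution
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant in STOP_CODONS:
                hits.append((pos + 1, base, mutant))
    return hits


if __name__ == "__main__":
    for codon in sorted(TYROSINE_CODONS):
        print(codon, single_base_stop_mutations(codon))
```

For both tyrosine codons, only substitutions at the third codon position (to A or G) produce a stop codon, so a single SNP is sufficient to truncate the protein.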
Gene expression analysis
To investigate the expression patterns and regulatory network of the novel allele of the SD1 gene, total RNA was isolated from seedlings of CSSL28 and 9311 at 5, 15 and 30 d after germination for real-time PCR analysis. SD1 and genes involved in GA synthesis (EUI1) and GA signaling (SLR1 and GID1) were selected (Fig. 4). For SD1, the expression level was high at 5 d after germination in both CSSL28 and 9311, and the expression level in 9311 was higher than that in CSSL28. At 15 and 30 d into the seedling stage, the expression level of SD1 decreased in both CSSL28 and 9311.
For the SLR1 gene, which encodes a DELLA protein, and the GA receptor gene GID1, the expression levels were low at 5 d after germination and significantly increased at the 15th day of the seedling stage. At that point, the expression levels of the two genes were much higher in 9311 than in CSSL28; they then significantly decreased by the 30th day of the seedling stage. The same expression pattern was also found for the EUI1 gene, which correlates with the internode lengths at the top of rice stems.
Nucleotide diversity and haplotype network analyses of the sd1 gene
Using the rice functional genomics-based breeding database (http://www.rmbreeding.cn/index), the coding and promoter region sequences of the qCL1.2 (SD1) gene from 2,822 rice varieties were aligned. Haplotype and genetic diversity analyses were carried out using the PH phenotype data of the 2,822 cultivated rice accessions. Abundant genetic variations were detected at the LOC_Os01g66100 site in these accessions (Fig. 5). The SD1 coding region contained 27 non-synonymous SNP/InDel sites. In total, 33 haplotypes with more than 5 individuals were selected, and a total of 20 variation sites were retained (Fig. 5a). As shown in Figure 6, a network was constructed using the major haplotypes for the SD1 coding region. The 33 haplotypes fell into three groups. The left group contained 8 haplotypes and 95.5% of the japonica rice samples, and the middle group contained 16 haplotypes and 89.1% of the indica rice samples (Fig. 6a). Associations between haplotypes and PH were also analyzed (Fig. 6b). The 24 haplotypes in the left and middle groups contained 97.6% of the accessions with PH values greater than 130 cm and 95.5% of the samples with PH values between 110 and 130 cm. In the right group, most of the PH values were less than 90 cm, and 50.69% of the samples with PH values between 70 and 90 cm were in this group.
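The haplotype selection and the haplotype-by-PH summaries above can be sketched as follows. This is a minimal illustration, assuming each accession has been reduced to a string of alleles at the variable sites and assigned to a PH class. The function names, data layout and toy accessions are hypothetical and are not taken from the database.

```python
from collections import Counter


def major_haplotypes(allele_strings, min_count):
    """Count accessions per haplotype and keep haplotypes with more than min_count members."""
    counts = Counter(allele_strings.values())
    return {hap: n for hap, n in counts.items() if n > min_count}


def share_in_haplotypes(allele_strings, phenotype_class, haplotypes, target_class):
    """Fraction of accessions in target_class that carry one of the given haplotypes."""
    in_class = [acc for acc, cls in phenotype_class.items() if cls == target_class]
    if not in_class:
        return 0.0
    hits = sum(1 for acc in in_class if allele_strings[acc] in haplotypes)
    return hits / len(in_class)


if __name__ == "__main__":
    # Hypothetical toy accessions: alleles at two sites and a PH class.
    alleles = {"a1": "GR", "a2": "GR", "a3": "GR", "a4": "EQ", "a5": "EQ", "a6": "G-"}
    ph_class = {"a1": "tall", "a2": "tall", "a3": "short",
                "a4": "tall", "a5": "short", "a6": "short"}
    print(major_haplotypes(alleles, min_count=1))
    print(share_in_haplotypes(alleles, ph_class, {"GR"}, "tall"))
```

With a threshold of 5 for the coding region, the first function corresponds to the selection of haplotypes with more than 5 individuals; the second corresponds to statements such as the share of accessions taller than 130 cm that fall in a given haplotype group.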
As shown in Figure 5a, SNPs at nt 299 and 1,019 in the SD1 coding region differentiated japonica and indica rice. The amino acids at the two sites were glutamate (E) and glutamine (Q), respectively, in japonica, and glycine (G) and arginine (R), respectively, in indica. More than 98% of the indica accessions carried the SD1-GR allele, as in qCL1.2. Most of the japonica accessions carried the SD1-EQ allele. However, these two SNPs did not affect PH. In contrast to the other haplotypes, the InDels in haplotypes H_13, H_14, H_21, H_22, H_23, H_24, H_29 and H_33 resulted in frame-shifts or translational termination, leading to the dwarf plant phenotype. Additionally, in H_20, the haplotype of 9311, the SNP at nt 1,026 led to dwarfed plants.
A network for the SD1 promoter sequence was also constructed using the same database. The SD1 promoter region contained 51 SNP/InDel sites, and a total of 31 haplotypes with more than 10 individuals were selected (Fig. 5b). As shown in Figure 7a, three haplotypes, H_3, H_7 and H_28, contained 90.78% of the japonica individuals, while 93.88% of the indica individuals were in the other haplotypes. This finding suggested that SNPs at nt 35, 93, 242, 412, 447, 537, 853, 1,164 and 1,247 of the SD1 promoter region (Fig. 5b) clearly differentiated japonica from indica. Most accessions in H_21, H_24, H_29, H_23 and H_27 are ‘others,’ suggesting that SNPs at nt 35, 734, 759, 820, 1,133, 1,137 and 1,345 differentiated indica from others. Furthermore, no SNP or haplotype was found to be associated with PH in Figure 7b, indicating that the PH of rice is mostly controlled by the SD1 protein function rather than by the gene expression level.