Main agronomic characteristics of lsl2
To identify genes that regulate flower development in rice, we screened an EMS-mutagenized population for floret mutants and identified a long sterile lemma 2 (lsl2) mutant in the ZH11 background. Phenotypic comparisons between the lsl2 mutant and wild-type ZH11 plants are presented in Table 1. The results showed no significant differences in the major agronomic traits, including plant height, panicle length, number of effective panicles, spikelets per panicle, seed-setting rate, 1,000-grain weight, grain length and grain width.
Phenotypic observations and analysis of the lsl2 mutant
At the vegetative stage, the ZH11 and lsl2 plants were indistinguishable, but their spikelets differed from the boot stage to the mature stage (Table 1 and Fig. 1a, b). The sterile lemma of the lsl2 mutant was markedly longer than that of ZH11, although the other components of the spikelet were the same (Fig. 1a, b). Interestingly, no significant difference in grain size or brown rice size was found between lsl2 and ZH11 after maturation (Table 1 and Fig. 1c, d).
We compared the germination rates of the lsl2 and ZH11 seeds. On the second day, the wild-type ZH11 seeds had started sprouting (69.3%), whereas the lsl2 seeds had barely begun to germinate (2.3%) (Fig. 2 and Table 2). Compared with the wild type, the lsl2 mutant showed clearly reduced germination rates from the second day to the fourth day (Table 2).
Genetic analysis of the lsl2 mutant
To determine whether the lsl2 mutant phenotype is controlled by a single gene, we crossed the lsl2 mutant with ZH11. The F1 generation showed normal phenotypes, and the F2 population exhibited Mendelian segregation (Table 3). The segregation between wild-type and mutant plants fitted a 3:1 ratio in the two F2 populations (χ² = 0.124-0.462, P>0.5), which indicated that the lsl2 mutant phenotype is controlled by a single recessive gene.
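The goodness-of-fit test behind a 3:1 segregation claim can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name `chi_square_3_to_1` and the plant counts in the usage lines are hypothetical, and the actual counts are those reported in Table 3. With one degree of freedom, the P value can be obtained from the complementary error function.

```python
import math


def chi_square_3_to_1(wild_type, mutant):
    """Chi-square goodness-of-fit test of F2 counts against a 3:1 ratio.

    Returns (chi2, p_value) for one degree of freedom.
    """
    total = wild_type + mutant
    if total <= 0:
        raise ValueError("at least one plant must be scored")

    # Expected counts under single recessive gene segregation (3 WT : 1 mutant).
    expected_wt = 0.75 * total
    expected_mut = 0.25 * total

    chi2 = ((wild_type - expected_wt) ** 2 / expected_wt
            + (mutant - expected_mut) ** 2 / expected_mut)

    # For df = 1, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for illustration only (not taken from Table 3).
chi2, p = chi_square_3_to_1(wild_type=145, mutant=55)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

A χ² value below the 5% critical value of 3.84 (df = 1) means that the observed counts do not deviate significantly from the 3:1 expectation.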
Initial localization of the lsl2 gene
To determine which gene mutation causes the lsl2 phenotype, we mapped the lsl2 gene. Two SSR markers located on rice chromosome 7, RM4584 and RM2006, were found to be linked to the mutant trait in 193 F2 individuals. Based on the recombination frequency, the genetic distance between RM4584 and RM2006 was calculated to be 28.8 cM. Therefore, lsl2 is located in a 28.8-cM region on chromosome 7 flanked by the SSR markers RM4584 and RM2006 (Fig. 3a).
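How a recombination frequency is converted into a genetic distance can be sketched as follows. The text does not state which mapping function was used, so both the Haldane and Kosambi functions are shown as assumptions; the function names and the input values in the usage lines are hypothetical and are not data from this study.

```python
import math


def recombination_fraction(recombinant_gametes, total_gametes):
    """Fraction of scored gametes that are recombinant between two markers."""
    if total_gametes <= 0:
        raise ValueError("total_gametes must be positive")
    return recombinant_gametes / total_gametes


def _check(r):
    # Mapping functions are only defined for 0 <= r < 0.5.
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")


def haldane_cm(r):
    """Haldane mapping function (assumes no crossover interference)."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi mapping function (allows for crossover interference)."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical example: 20 recombinant gametes among 200 scored.
r = recombination_fraction(20, 200)
print(f"r = {r:.3f}, Haldane = {haldane_cm(r):.2f} cM, Kosambi = {kosambi_cm(r):.2f} cM")
```

For small recombination fractions both functions give a distance close to 100 × r cM; they diverge as r approaches 0.5.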
Fine mapping of the lsl2 gene
To delimit the gene to a smaller region, a denser map between RM4584 and RM2006 was constructed using published markers (Table 4). Through genetic linkage analysis, the lsl2 gene was mapped between the molecular markers RM8059 and RM427, which are 7.6 cM apart (Fig. 3b). For further mapping, all recombinant plants were genotyped using nine polymorphic markers (Table 4). The results showed that the lsl2 gene is located between the molecular markers Indel7-13 and Indel7-15, a physical distance of 205 kb (Fig. 3c and Table 4). For the fine mapping of the lsl2 gene, seven polymorphic indel markers used for recombinant screening (Table 4) detected one, one, three, three, six, seven and 11 recombinant plants, respectively (Fig. 3d). Thus, we localized the lsl2 gene between the molecular markers Indel7-22 and Indel7-27, a physical distance of 25.0 kb.
Candidate genes in the 25.0-kb region
Four candidate genes (LOC_Os07g04660, LOC_Os07g04670, LOC_Os07g04690, and LOC_Os07g04700) were annotated in this 25.0-kb region (Fig. 3e). According to the available annotation database, these four genes all have a corresponding full-length cDNA. LOC_Os07g04660 encodes white-brown complex homologue protein 16, and LOC_Os07g04670, LOC_Os07g04690 and LOC_Os07g04700 encode a DUF640 domain-containing protein, UDP-arabinose 4-epimerase 1, and an MYB family transcription factor, respectively.
Sequence analyses of the lsl2 gene
To determine which gene causes the mutant phenotype, we sequenced the four above-mentioned genes in ZH11 and lsl2 and found a single 1-bp substitution (T to C) in LOC_Os07g04670 between the wild-type ZH11 and lsl2 mutant plants. No differences were observed in the sequences of the other three genes. Thus, we speculated that the LOC_Os07g04670 locus corresponds to lsl2. Interestingly, the G1/ELE gene, which encodes a DUF640 domain-containing protein, is present at this locus. Based on the phenotypic similarity and the localization results, we hypothesized that the long sterile lemma phenotype of lsl2 might be caused by functional changes in the product of the LOC_Os07g04670 locus. These results suggest that the lsl2 gene might be allelic with G1/ELE.
Analysis of the open reading frame (ORF) of the LSL2 gene (LOC_Os07g04670) showed one exon and no intron. The 1-bp substitution in lsl2 results in the replacement of a serine (Ser) with a proline (Pro) (Fig. 4). Ser is a polar amino acid, whereas Pro is nonpolar. Thus, this mutation might alter the function of the protein.
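The link between a T-to-C substitution and a Ser-to-Pro replacement can be checked against the standard genetic code. The sketch below is illustrative only: it assumes that the substitution is reported on the coding strand, and it does not use the actual LSL2 sequence (the affected codon is shown in Fig. 4). It enumerates every serine codon and reports which single T-to-C changes yield a proline codon.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def ser_to_pro_by_t_to_c():
    """List (ser_codon, position, pro_codon) for each T-to-C change giving Ser to Pro.

    Positions are 1-based within the codon.
    """
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != "S":
            continue
        for pos, base in enumerate(codon):
            if base != "T":
                continue
            mutated = codon[:pos] + "C" + codon[pos + 1:]
            if CODON_TABLE[mutated] == "P":
                hits.append((codon, pos + 1, mutated))
    return hits


for ser_codon, position, pro_codon in ser_to_pro_by_t_to_c():
    print(f"{ser_codon} -> {pro_codon} (codon position {position})")
```

Under these assumptions, only a change at the first position of a TCN serine codon produces a CCN proline codon; a T-to-C change in the AGT serine codon is synonymous.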
The lsl2 gene is responsible for the long sterile lemma phenotype
To confirm that the mutant phenotype can be attributed to lsl2, we examined whether the knockout of LSL2 in the cultivar ZH11 would lead to the long sterile lemma phenotype. One sequence-specific guide RNA (sgRNA) was designed to knock out the LSL2 gene using the CRISPR/Cas9 gene editing system. A total of three plants from three independent events were obtained and confirmed by sequencing to carry insertions or deletions at the target site (Table 5).
We then investigated the panicle characteristics of these three homozygous lines after maturity and found that all three exhibited a long sterile lemma phenotype (Fig. 5), which indicated that the knockout of LSL2 in ZH11 leads to the long sterile lemma mutant phenotype.
qPCR confirms the expression status of LSL2
To verify the expression status of LSL2, empty glumes and lemma/palea were sampled for qPCR. The results showed that LSL2 expression differed significantly between the two tissues, with a significantly higher expression level in empty glumes than in lemma/palea (Fig. 6).
Comparison of the 3-D structures of the LSL2 and lsl2 proteins
Simulations of the 3-D structures of the proteins revealed differences between the lsl2 and LSL2 proteins (Fig. 7). In particular, the change of residue 79 of LSL2 from Ser to Pro induced a marked change in the predicted protein structure (Fig. 7).
Haplotype analysis of the LSL2 gene
To further investigate the genetic and evolutionary characteristics of the LSL2 gene, we performed single-nucleotide polymorphism (SNP) calling and haplotype analysis of the 3,000 sequenced rice genomes available in the CNCGB and CAAS databases and found 492 haplotypes for the LSL2 gene, including 49 haplotypes among more than 15 rice resource materials (Supplementary Table 2). However, no haplotype or SNP corresponding to the lsl2 mutation was found in the 3,000 sequenced rice genomes.