Inheritance analysis
To analyze the inheritance of leaf shape in R. sativus, a cross between WA, a CMS line with serrated leaf, and J4, a restorer line with deeply lobed leaf was conducted (Fig. 1). All F1 plants had lobed-leaves, while in F2 population (WA×J4), there were 68 plants with serrated leaves and 219 plants with lobed leaves, which fit an expected Mendelian segregation ratio of 1:3 (χ2 = 0.227, p > 0.05). These results indicated that the lobed-leaf is controlled by a dominant nuclear gene and dominant over the serrated-leaf.
Bulked segregant sequencing of the lobed leaf gene
To quick map the leaf shape gene, we did BSA-seq by incorporating BSA and GBS. After sequenced on HiSeq 4000, there were 2,500,383 reads and 3,453,168 reads obtained form J4 and WA, 2,851,995 reads and 4,844,581 reads obtained form Bk1 and BK2, respectively. All reads were aligned to the radish reference genome (Jeong et al. 2016) to identify single nucleotide polymorphisms (SNPs) and small INDELs between parents and between two Bulks. Totally, 4,544 high-quality polymorphic loci including 3,852 SNPs and 692 INDELs were identified. For each identified SNP, the SNP index was calculated, and a corresponding Δ (SNP index) plot of two bulks was drawn along each chromosome (Fig. 2). It was found that the Δ (SNP index) in a genomic region from 0.66 Mb to 8.19 Mb on chromosome R7 exceeded the threshold of 0.5, indicating that the leaf-shape gene may be encompassed.
Rs390250 is the lobed leaf candidate gene of radish
For further mapping, 16 INDEL markers among the initial candidate interval on chromosome R7 were designed based on the INDELs between parents. These markers were used to amplify the genomic DNA of WA and J4, 9 (RIL_1, RIL_2, RIL_3, RIL_7, RIL_11, RIL_12, RIL_14, RIL_15 and RIL_16) out of the 16 markers detected distinguishable polymorphisms (Fig. 3a). Then, these polymorphic INDEL markers were used for recombinant screening of 287 F2 individuals. Finally, the genomic region containing lobed-leaf gene was narrowed down to a 1.53 Mb region between RIL_3 (0.87 Mb) and RIL_11 (2.40 Mb) on chromosome R7 (Fig. 3).
Radish belongs to the tribe Brassicaceae and closely related to B. rapa and B. napus, two important commercial crops as vegetables and vegetable oil. Comparative mapping using molecular markers and whole-genome alignment revealed extensive syntenic blocks all over the genomes of R. sativus and B. rapa (Li et al. 2011; Jeong 2016). To identify the candidate gene, we did whole-genome collinearity analysis between R. sativus and B. napus. The region from 0.87 Mb to1.3 Mb of chromosome R7 on radish which within the fine mapping region showed a good collinearity with the region between 16.3-16.8 Mb of chromosome A10 on B.napus (Fig. 3b). In the collinear region of B. napus, there were two LMI1-like genes, BnaA10g26320D (BnLL1.LMI1) and BnaA10g26330D (BnA10.LMI1) which were considered as the lobed leaf genes (Ni et al. 2017; Hu et al. 2018). Both BnLL1. LMI1 and BnA10. LMI1 are orthologous to Rs390250 (899,863-901,651 bp), one of the 53 gene models in the mapping region of radish. All these genes encode homeodomain leucine zipper class I (HD-Zip I) transcription factors. Previous results indicated that several class I HD-Zip regulators control leaf shape and size (Saddic et al. 2006). We thus listed Rs390250 as the lobed leaf candidate gene of radish.
Sequence comparison analysis of Rs390250
According to the annotation of the radish genome, Rs390250 consists of three exons and two introns, and encodes a protein with 220 amino acids. To compare the nucleotide sequences of Rs390250 between J4 and WA, a 3.0 kb genomic region including the 1.0 kb upstream regulatory sequence, the 1.8 kb coding region and the 0.2 kb downstream untranslated region of Rs390250 was amplified. Comparative sequencing identified three SNPs (C84G、T177A and C425T) in the coding sequence (CDS), one 149 bp insertion in the regulatory region and one 666 bp insertion in the first intron of Rs390250 from J4 (Fig. 4a). The 149 bp insertion in the regulatory region is a duplicate copy of a neighbouring 130 bp sequence plus a 19 bp spacer with unknown function. By searching against the transposable element (TE) database (http://pmite.hzau.edu.cn) (Chen et al. 2013), we found that the 666 bp insertion is a Mutator-like TE, which has a 9 bp target site duplication (GTTTTTAAA) and 46 bp terminal inverted repeate (GAGTTATTCTTGGGTTCACCCCCTAGGGTGAACCTTTAGGTTCACC). To detect whether the TE is co-segregate with leaf shape variations, we designed a pair of primers (Rs-3F/Rs-2R) to specifically amplify the 666 bp insertion in the 68 serrated-leaf plants selected from the F2 population, and found that all the 68 serrated-leaf plants did not have the 666 bp insertion while the lobed-leaf plants had (Fig. 4c), suggesting that the TE insertion in the first intron of Rs390250 co-segregated with the lobed-leaf. To reveal if the 149 bp duplication in the regulatory region affect the expression level of Rs390250, we performed RT-PCR to measure the relative expression level of Rs390250 in J4 and WA, and found that there is no significant difference between parents (Fig. 4d), indicating that the insertion is unrelated to the difference of leaf shape between J4 and WA .
Rs390250 encodes an HD-Zip I transcription factor. HD-Zip proteins contain a homeodomain (HD) domain and a leucine zipper (LZ) domain. The HD domain binds to double helix of target DNA, and the LZ domain helps the HD to recognize the downstream genes by forming a homo- or heterodimer structure (Tron et al. 2004). Both the HD and the LZ domains are essential for the HD-Zip I proteins to regulate the transcription of target genes. Of the three SNPs in the CDS of Rs390250, T177A is a synonymous mutation while C84G and C425T are non-synonymous mutations. The C425T mutation caused an amino acid substitution from serine (S) to leucine (L), which exactly located in the LZ domain (Fig. 4b). we aligned few of amino acid sequences of the LZ domain in LMI1-like genes from different species with various lobed-leaves, the result showed that the serine (S) residue is absolutely conserved in these species, except for the serrated-leaf WA which has a leucine (L) residue (Fig. 5a). Besides, Phylogenetic analysis of the LZ domain indicated that Rs390250 belongs to RCO orthologous gene (Fig. 5b). Analysis of conserved domains using the NCBI Conserved Domains tool showed that both RCO and Rs390250-J4 have the HD domain and the LZ domain, while Rs390250-WA only has the HD domain but does not have the LZ domain, suggesting that the substitution of the conserved serine by leucine may destroys the structure of DNA-binding LZ domain in WA. We thus supposed that the C425T may cause the different morphological variation between J4 and WA. To verify our speculation, we amplified and analyzed the Rs390250 sequence from 11 more radish varieties with various leaf shape. All the serrated-leaf varieties are T/T homozygous as WA, while the lobed-leaf varieties with deep lobes are C/C homozygous as J4 or heterozygous (C/T) (Table 1). It is known that the rco mutant in Cardamine hirsuta resulted in a lighter leaf margin dissection (Vlad et al. 2014; Sicard et al. 2014). Thus, we speculated that the C425T transition is the causal variation resulting the serrated leaf of radish.