Phenotypic characterization of mutant pylm
The wild type ‘CK-51’ and pylm were both obtained from isolated microspore culture of the ‘Huaguan’ pakchoi variety. However, the latter displayed yellow leaves at germination and this phenotype was stable throughout its lifetime (Fig. 1). The mutant had a slender phenotype and weak growth. However, its yellow leaf color was nonlethal. Moreover, relative to ‘CK-51’, pylm displayed an elongated hypocotyl at the seedling stage (Fig. 1c) and early flowering at the bolting stage (Fig. 1b).
Genetic analysis of mutant pylm
F1 and F2 populations were constructed from crosses between pylm and the Chinese cabbage DH line ‘FT’ (Fig. S1). The F1 individuals from the reciprocal crosses had the same green leaf phenotype as ‘FT’. Therefore, inheritance of the etiolation phenotype in pylm is nuclear rather than cytoplasmic. Segregation of the green and yellow leaf phenotypes in the F2 population accorded with the expected Mendelian ratio of 15:1 (χ² < χ²(0.05) = 3.84). Thus, the Chl deficiency trait is controlled by two recessive nuclear genes. The BC1 progeny were obtained by backcrossing F1 separately with pylm and ‘FT’. Segregation of the green and yellow leaf phenotypes in the BC1F1 population from the cross of F1 with pylm fitted the expected Mendelian ratio of 3:1 (χ² < χ²(0.05) = 3.84). This finding confirms that the mutant trait is conferred by two recessive nuclear genes, which were designated py1 and py2. Neither gene alone can induce the yellow leaf phenotype; it appears only in plants that are homozygous recessive at both loci. Phenotypic data for all generations are listed in Table 1.
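The 15:1 and 3:1 expectations follow directly from the two-gene recessive model. The sketch below enumerates gametes to derive them; the allele symbols `A`/`a` and `B`/`b` are illustrative stand-ins for Py1/py1 and Py2/py2, and the assumption of independent assortment is ours, not a result reported here.

```python
from fractions import Fraction
from itertools import product

def gametes(genotype):
    """Return the four equally likely gametes of a two-locus genotype, e.g. ('Aa', 'Bb')."""
    return [a + b for a, b in product(genotype[0], genotype[1])]

def is_yellow(gamete1, gamete2):
    """A plant is yellow only if every allele at both loci is recessive (lowercase)."""
    return (gamete1 + gamete2).islower()

def yellow_fraction(parent1, parent2):
    """Fraction of yellow offspring from a cross, assuming independent assortment."""
    pairs = list(product(gametes(parent1), gametes(parent2)))
    yellow = sum(is_yellow(g1, g2) for g1, g2 in pairs)
    return Fraction(yellow, len(pairs))

f1 = ("Aa", "Bb")       # stand-in for Py1 py1 Py2 py2
mutant = ("aa", "bb")   # stand-in for py1 py1 py2 py2 (pylm)

f2_yellow = yellow_fraction(f1, f1)        # F1 selfed -> F2
bc1_yellow = yellow_fraction(f1, mutant)   # F1 x pylm -> BC1F1

print("F2 yellow fraction:", f2_yellow)
print("BC1F1 yellow fraction:", bc1_yellow)
```

A yellow fraction of 1/16 corresponds to 15 green : 1 yellow in F2, and 1/4 corresponds to 3 green : 1 yellow in the backcross to pylm.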
Segregation of py1 and py2
According to the genetic analysis of pylm, the reduced-Chl phenotype is controlled by the recessive nuclear genes py1 and py2. Consequently, the F2 and BC1F1 populations could not be used to map these genes separately. To separate py1 from py2, F2 individuals with green leaves may be randomly selected and self-pollinated to produce F2:3 progeny. Green plants from F2:3 families that segregate 3:1 (green-colored:yellow-colored) may then be self-pollinated to generate F3:4 progeny. In theory, ~2/3 of the F3:4 families should display the expected Mendelian segregation ratio of 3:1. Some of these families can be used to map py1 and the others to map py2 (Fig. 2).
Twenty green-colored individuals from F2 were randomly selected and self-pollinated to produce F2:3. For the twenty F2:3 families, phenotypic segregations were investigated. There were three distinct groups. Eleven populations showed no yellow-colored plants, five segregated with 15:1, and the other four with 3:1 (Table S1). These results corroborated the theoretical segregation ratio of “all green-colored:(green-colored:yellow-colored = 3:1):(green-colored:yellow-colored = 15:1) = 7:4:4” for F2:3.
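The theoretical 7:4:4 ratio can be reproduced by selfing every green F2 genotype and classifying its progeny. This is a minimal sketch under the same two-gene recessive model; the allele symbols and helper names are illustrative and do not come from the study.

```python
from collections import Counter
from itertools import product

def locus_offspring(locus):
    """Self one locus (e.g. 'Aa') and return its four equally likely offspring genotypes."""
    return ["".join(sorted(a + b)) for a, b in product(locus, locus)]

def self_cross(genotype):
    """Return the 16 equally likely two-locus offspring of a selfed plant."""
    return list(product(locus_offspring(genotype[0]), locus_offspring(genotype[1])))

def is_yellow(genotype):
    """Yellow requires homozygous recessive alleles at both loci."""
    return genotype == ("aa", "bb")

def family_class(genotype):
    """Classify the selfed progeny of a plant by its green:yellow segregation."""
    yellow = sum(is_yellow(g) for g in self_cross(genotype))
    return {0: "all green", 4: "3:1", 1: "15:1"}[yellow]

# F2 from a selfed double heterozygote; keep only the green plants
f2 = self_cross(("Aa", "Bb"))
green_f2 = [g for g in f2 if not is_yellow(g)]

classes = Counter(family_class(g) for g in green_f2)
print(classes)
```

Of the 15 green F2 classes, 7 give all-green families, 4 give 3:1 families, and 4 give 15:1 families, which matches the ratio quoted above.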
From each of the four F2:3 families that segregated 3:1, eight plants with green leaves were selected and self-pollinated to produce F3:4 families. Phenotypic segregations revealed that twenty F3:4 families (Nos. 1-20) segregated 3:1 while the other twelve showed no yellow-colored individuals (Nos. 21-32). Thus, the genotypes of their F2:3 parent plants should be Py1 py1 py2 py2/py1 py1 Py2 py2 and Py1 Py1 py2 py2/py1 py1 Py2 Py2, respectively (Table S2). Phenotypic segregations of the F3:4 families fitted the theoretical segregation ratio of “(green-colored:yellow-colored = 3:1):all green-colored = 2:1”. Therefore, the F3:4 families (Nos. 1-20) could be used to map the py1 and/or py2 loci.
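The family counts reported above can be checked against the theoretical ratios with a chi-square goodness-of-fit test. The sketch below uses only the counts given in the text (11:4:5 for the F2:3 families and 20:12 for the F3:4 families); it applies no continuity correction, and it is not necessarily the exact procedure used for Tables S1 and S2.

```python
def chi_square(observed, ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical values at alpha = 0.05 for 1 and 2 degrees of freedom
CRITICAL_0_05 = {1: 3.84, 2: 5.99}

# F2:3 families: all green : 3:1 : 15:1, expected 7:4:4 (df = 2)
chi_f23 = chi_square([11, 4, 5], [7, 4, 4])

# F3:4 families: segregating 3:1 : all green, expected 2:1 (df = 1)
chi_f34 = chi_square([20, 12], [2, 1])

print(f"F2:3 families: chi2 = {chi_f23:.3f}, fits = {chi_f23 < CRITICAL_0_05[2]}")
print(f"F3:4 families: chi2 = {chi_f34:.3f}, fits = {chi_f34 < CRITICAL_0_05[1]}")
```

Both statistics fall well below their critical values, so neither set of counts departs significantly from the expected ratio.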
BSR-Seq analysis
A total of 47,526,126 and 49,119,466 raw reads (150-bp) were generated from the G-pool and Y-pool, respectively. After quality evaluation and data filtering, 97% of the read pairs (46,456,174 for the G-pool and 47,581,728 for the Y-pool) remained. Clean reads were mapped against the Brassica reference genome with Hisat v. 2.0.14. Of these, > 66% were uniquely mapped in both pools.
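The retained fraction per pool can be recomputed from the read counts above. This is plain arithmetic on the reported numbers; it assumes the raw and clean counts are expressed in the same unit.

```python
def retained_percent(raw, clean):
    """Percentage of reads remaining after quality filtering."""
    return 100.0 * clean / raw

pools = {
    "G-pool": (47_526_126, 46_456_174),
    "Y-pool": (49_119_466, 47_581_728),
}

retained = {name: retained_percent(raw, clean) for name, (raw, clean) in pools.items()}
overall = retained_percent(
    sum(raw for raw, _ in pools.values()),
    sum(clean for _, clean in pools.values()),
)

for name, pct in retained.items():
    print(f"{name}: {pct:.2f}% retained")
print(f"Both pools combined: {overall:.2f}% retained")
```

The two pools come out at roughly 97.7% and 96.9%, which is consistent with the rounded 97% quoted above.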
Relative to the reference genome, 154,863 and 157,022 SNPs were detected in the G-pool and Y-pool, respectively. Differential SNP loci were screened for ED^5 calculation (the fifth power of the Euclidean distance), and 412 target differential SNP loci were obtained between the pools according to the top 1% ED^5 threshold. Two distinct peaks were observed on chromosomes A07 and A09 (Fig. 3). This finding was consistent with the hypothesis that the mutant trait is controlled by two recessive nuclear genes. Thus, it was predicted that the etiolation genes were located on chromosomes A07 and A09 within five chromosome regions (Table 2).
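The ED-based screening can be sketched as follows, assuming the statistic is the Euclidean distance between the per-base allele frequencies of the two pools, raised to the fifth power. The exact formula used by the analysis pipeline is not given in this section, so the function names, the frequency representation, and the example SNPs are all hypothetical.

```python
import math

BASES = "ACGT"

def euclidean_distance(freq_g, freq_y):
    """Euclidean distance between the base-frequency vectors of the two pools."""
    return math.sqrt(sum((freq_g[b] - freq_y[b]) ** 2 for b in BASES))

def ed_power(freq_g, freq_y, power=5):
    """Raise ED to a power to suppress small background differences."""
    return euclidean_distance(freq_g, freq_y) ** power

def top_percent_threshold(scores, percent=1):
    """Smallest score that still falls within the top `percent` of all scores."""
    ranked = sorted(scores, reverse=True)
    keep = max(1, len(ranked) * percent // 100)
    return ranked[keep - 1]

# Hypothetical SNPs: allele frequencies in the G-pool and the Y-pool
linked = ({"A": 0.7, "C": 0.0, "G": 0.3, "T": 0.0},
          {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0})
unlinked = ({"A": 0.5, "C": 0.0, "G": 0.5, "T": 0.0},
            {"A": 0.45, "C": 0.0, "G": 0.55, "T": 0.0})

print("linked SNP score:", ed_power(*linked))
print("unlinked SNP score:", ed_power(*unlinked))
```

Because the Y-pool should be fixed for the mutant allele near a causal locus, linked SNPs score far higher than unlinked ones, and the fifth power widens that gap before the top 1% cutoff is applied.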
Identification of differentially expressed genes
RPKM was used to measure gene expression levels. With a cutoff of RPKM ≥ 0.1, 55,250 genes were detected. These were divided into six RPKM distribution intervals (Table S3). There were 181 DEGs between the G-pool and Y-pool under the thresholds |log2 fold change| ≥ 1 and FDR ≤ 0.05. When the G-pool was compared with the Y-pool, 90 genes were upregulated and the other 91 were downregulated (Fig. S2). The DEGs are listed in Table S4.
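The expression measure and the DEG thresholds can be sketched as below. The RPKM formula is the standard definition; the gene records, the direction of the fold change (G-pool over Y-pool), and the helper names are assumptions made for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

def is_deg(log2_fold_change, fdr, min_abs_log2fc=1.0, max_fdr=0.05):
    """Apply the thresholds |log2 fold change| >= 1 and FDR <= 0.05."""
    return abs(log2_fold_change) >= min_abs_log2fc and fdr <= max_fdr

def direction(log2_fold_change):
    """Label regulation, assuming log2 fold change is computed as G-pool over Y-pool."""
    return "up" if log2_fold_change > 0 else "down"

# Hypothetical records: (gene id, log2 fold change, FDR)
records = [
    ("gene_1", 2.3, 0.001),
    ("gene_2", -1.4, 0.020),
    ("gene_3", 0.6, 0.001),   # fails the fold-change threshold
    ("gene_4", 3.1, 0.200),   # fails the FDR threshold
]

degs = [(gene, direction(lfc)) for gene, lfc, fdr in records if is_deg(lfc, fdr)]
print(degs)
```

Both conditions must hold at once, so a large fold change with a weak FDR is excluded just like a significant but small change.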
Fine mapping of py1
Ninety-six SSR markers were developed around the three predicted regions on chromosome A09 and used to detect polymorphisms between pylm and ‘FT’. After screening, thirty-seven SSR markers displayed polymorphisms between the parents. These were used to test twelve green-colored and twelve yellow-colored individuals from the No. 1 F3:4 family. SSRzk5 and SSRzk12 were located near the 23,811,435-27,563,122 region on chromosome A09 and were linked to py1 on opposite sides of the gene.
A total of 1,520 yellow-colored individuals of the No. 1 F3:4 family were selected as the py1 mapping population. Linkage analysis showed that py1 was located between SSRzk5 and SSRzk12 at estimated genetic distances of 3.2 cM and 1.8 cM, respectively (Fig. 4a). To identify molecular markers tightly linked to py1 and narrow the py1 mapping interval, new SSR and Indel markers were developed between SSRzk5 and SSRzk12. The polymorphic markers SSRzk17, SSRzk28, SSRzk29, SSRzk36, Indelzk72, and Indelzk125 were linked to py1 (Table S5). SSRzk17, Indelzk72, and Indelzk125 were located on the same side of py1 as SSRzk5, while SSRzk28, SSRzk29, and SSRzk36 were located on the same side as SSRzk12. py1 was mapped between Indelzk125 and SSRzk36 at genetic distances of 0.13 cM and 0.2 cM, respectively (Fig. 4b). Therefore, py1 was mapped to a 258.3-kb region between the two most tightly linked markers (Fig. 4c).
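One simple way to relate recombinant counts to the genetic distances above is to treat each yellow-colored (homozygous recessive) individual as contributing two informative gametes. The sketch below makes that assumption and applies no mapping function; the recombinant counts are hypothetical values chosen for illustration, since the counts and the software used are not reported in this section.

```python
def genetic_distance_cm(recombinant_gametes, n_individuals):
    """Recombination frequency (%) in a population of homozygous recessive plants.

    Assumes every individual carries two informative gametes, so the
    denominator is 2 * n_individuals. No mapping function is applied.
    """
    return 100.0 * recombinant_gametes / (2 * n_individuals)

POPULATION_SIZE = 1520  # yellow-colored individuals in the py1 mapping population

# Hypothetical recombinant gamete counts at two flanking markers
for marker, recombinants in [("marker_left", 4), ("marker_right", 6)]:
    distance = genetic_distance_cm(recombinants, POPULATION_SIZE)
    print(f"{marker}: {distance:.2f} cM")
```

With 3,040 gametes scored, a single recombinant corresponds to about 0.03 cM, which gives a sense of the resolution available at this population size.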
Candidate py1 analysis
The target DNA sequences of the 258.3-kb region between Indelzk125 and SSRzk36 were obtained from the Brassica database. Genomic sequence analysis revealed that the candidate region contained 34 genes (Fig. 4c, Table S6). Differential gene expression analysis identified BraA09004189 as the only DEG in the py1 mapping region. BraA09004189 encodes a heme oxygenase (HO1), which participates in heme catabolism. Mutants with a yellow leaf phenotype induced by defective HOs were reported in earlier studies [40, 41]. BraA09004189 was therefore predicted to be the most probable candidate gene for py1.
To confirm this hypothesis, two pairs of primers were designed to sequence BraA09004189 in pylm and ‘CK-51’ (Table S7). The BraA sequence did not differ between parents whereas the BraB sequence in pylm presented with one SNP (Fig. 5). Based on the position of BraA09004189, an SNP marker was designed to screen the 1,520 yellow-colored individuals from the No. 1 F3:4 family. The bands of all individuals in the mapping population co-segregated with py1.
qRT-PCR was performed to determine BraA09004189 expression in pylm and ‘CK-51’. In accordance with the differential gene expression analysis, the BraA09004189 expression level was much higher in ‘CK-51’ than in pylm (Fig. 6). This finding further supports BraA09004189 as the candidate gene for py1.
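Relative expression from qRT-PCR is commonly computed with the 2^-ΔΔCt method. The quantification method is not stated in this section, so the sketch below is only an illustration of that common approach; the Ct values and the use of a single reference gene are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCt method, assuming ~100% primer efficiency."""
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: mutant as the sample, wild type as the calibrator
fold = relative_expression(
    ct_target_sample=26.0, ct_ref_sample=18.0,          # mutant
    ct_target_calibrator=24.0, ct_ref_calibrator=18.0,  # wild type
)
print(f"Expression in mutant relative to wild type: {fold:.2f}")
```

A value below 1 indicates lower expression in the sample than in the calibrator, which is the direction reported for pylm relative to ‘CK-51’.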
Fine mapping of py2
Considering the size of the constructed populations, we screened the Nos. 2-5 F3:4 families with SSRzk5 and SSRzk12, the markers linked to py1, using the same strategy. Linkage to py1 was detected in the Nos. 2, 4, and 5 F3:4 families. In theory, then, the No. 3 F3:4 family segregates for py2 and may be used to map the py2 locus.
Forty-eight SSR markers were developed around the two predicted regions on chromosome A07 to detect polymorphisms between pylm and ‘FT’. After screening, eleven SSR markers displayed polymorphisms between the parents. These were used to test twelve green-colored and twelve yellow-colored individuals of the No. 3 F3:4 family. SSR84 and SSR103 were located around the 11,166,810-15,034,483 region on chromosome A07 and were linked to py2 on opposite sides of the gene (Fig. 7a; Table S8).
A total of 1,860 yellow-colored individuals from the No. 3 F3:4 family were selected as the py2 mapping population. py2 was located between SSR11 and SSR15 at estimated genetic distances of 0.24 cM and 0.02 cM, respectively (Fig. 7b). To narrow the py2 mapping interval and identify molecular markers tightly linked to py2, new SNP markers were developed between SSR11 and SSR15. Only the polymorphic marker SNP11 was linked to py2. Based on the recombinant individuals, the py2 interval was narrowed to 14,851,951-14,896,902 on chromosome A07, a region of approximately 45 kb that contained five genes (Fig. 7c).
Candidate py2 analysis
Annotation data for the five candidate genes in the py2 target region were obtained from the Brassica database (Table S9). Primers covering the cDNA of each gene were designed to sequence the candidates and identify the most likely causal gene (Table S10). There were no differences between pylm and ‘CK-51’ in BraA07001775, BraA07001776, or BraA07001777. For BraA07001773, the sequences obtained after PCR amplification were disordered, and the sequence comparisons were inconsistent across repeated experiments. An SNP between the parents was detected in the first exon of BraA07001774 (Fig. 8). It caused a single amino acid substitution from Asp (GAT) in the wild type to Asn (AAT) in pylm (Fig. 9). Therefore, BraA07001774 was taken as the most probable candidate gene for py2.
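The effect of the reported codon change can be verified with the standard genetic code. The sketch below builds the codon table and classifies the GAT to AAT substitution; only the two codons come from the text, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def changed_positions(ref_codon, alt_codon):
    """1-based codon positions at which the two codons differ."""
    return [i + 1 for i, (r, a) in enumerate(zip(ref_codon, alt_codon)) if r != a]

def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change as synonymous, nonsense, or missense."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"

wild_type, mutant = "GAT", "AAT"
print(CODON_TABLE[wild_type], "->", CODON_TABLE[mutant])
print("changed codon positions:", changed_positions(wild_type, mutant))
print("effect:", classify_substitution(wild_type, mutant))
```

The change is a G to A transition at the first codon position, replacing the acidic residue Asp (D) with the uncharged residue Asn (N), so it is a missense substitution.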
BraA07001774 is annotated as an embryo defective 1187 (emb 1187) gene and a phosphotransferase. The phenotypes of the albino mutants (pds1, pds2) in Arabidopsis thaliana may be caused by emb 71 [45]. For Arabidopsis seeds with silique defects, hypocotyl elongation was characterized during the development of F2 generation mutant seedlings [46]. We therefore propose that the mutant phenotype is determined by the mutation in BraA07001774. To validate this prediction, BraA07001774 expression in pylm and ‘CK-51’ was analyzed by qRT-PCR. BraA07001774 was dramatically downregulated in pylm (Fig. 10). Thus, it is probably the candidate gene for py2.