Plant materials and population construction
The ‘nfc’ was subjected to the non-vernalization grafting floral induction method to artificially induce flowering, following the previously reported methods (Motoki et al. 2019; Motoki et al. 2022; Motoki et al. in press). To generate segregating populations, flowering ‘nfc’ scions via grafting were crossed with three B. oleracea cultivars exhibiting different degrees of relatedness to ‘nfc’. The most closely related cultivar was ‘T15’ (B. oleracea var. capitata), which the parental line of ‘nfc’. One individual of ‘T15’(#43) were used for crossing. The second most closely related cultivar was a commercial cabbage cultivar ‘Watanabe-seiko No. 1 (W1)’ (B. oleracea var. capitata) (Genebank project, NARO, Japan, accession No. 25974), and one individual was used for crossing. The most distantly related cultivar was a commercial Chinese kale cultivar ‘kairan’ (B. oleracea var. alboglabra) (Tsurushin Seed Ltd., Japan), and one individual was used for crossing. Each F1 was cultivated in the open field as described below, and each F2 seed was obtained after the self-pollination of six individuals of ‘nfc’בT15’ F1, one individual of ‘nfc’בW1’ F1, and three individuals of ‘kairan’בnfc’ F1. Self-pollination was performed using bud pollination or CO2 treatment to overcome self-incompatibility, following the methods of previous studies (Nakanishi and Hinata 1975). To prevent cross-pollination, the inflorescences were covered with non-woven bags before flowering until pod formation.
Growth conditions in the field
For ‘nfc’, plants propagated in vitro via cuttings were used. For ‘T15’, plants propagated in vitro via cuttings or cultivated from seeds were used. For ‘W1’, ‘kairan’, F1, and F2, plants cultivated from seeds were used. Plants were cultivated between 2018 and 2021. Cultivation year represents the year the plants were transplanted into the open field. Seedlings were raised as described previously (Kinoshita et al. 2021). The seedlings were transplanted in the open field of the Kyoto Farm or the Kizu Farm, Kyoto University in September or October. Cultivation, fertilization, and pest management were performed following commercial practices. The maximum leaf length of each plant was measured on January 19, 2019, December 15–18, 2019, December 19–20, 2020, and December 18–19, 2021 of each cultivation year. Plants with defective growth were excluded from the analysis. Since March of the following year after transplanting, cabbage heads were divided vertically with a knife to enable the main stem to bolt and the lateral shoots to elongate. Cultivation was continued until July of the following year.
Quantitative evaluation of flowering characteristics
‘nfc’ is not a mutant that has lost the flowering ability but exhibits rare flowering in a field. However, the terminal bud does not flower even when flowering occurs although the lateral buds elongate and flower at the end of spring (Kinoshita et al. 2021). We consider these flowering characteristics as important indicators to characterize ‘nfc’. Therefore, we evaluated flowering characteristics quantitatively using the following three criteria: the flowering date of the plant, the flowering date of the terminal bud, and the number of flowering shoots (Fig. S1). The flowering date of the plant was defined as the day that the first flower opened irrespective of whether the flower site was a lateral bud or the terminal bud. Additionally, the flowering date of the terminal bud was examined. In plants exhibiting terminal bud flowering at first, the flowering date of the terminal bud coincides with that of the plant. Meanwhile, in plants exhibiting lateral shoot flowering at first, the flowering date of the terminal bud does not coincide with that of the plant. The terminal bud was examined to check if it developed into a flower bud or continued vegetative growth in plants whose terminal bud did not flower. The terminal bud that developed into flower buds but ceased development before the flower opening was considered an aborted bud. The number of flowering shoots was defined as the number of terminal and primary lateral shoots with flowers. This is an indicator that allows non-flowering and flowering plants to be evaluated with a continuous number. For ‘kairan’ and ‘kairan’בnfc’ F1 and F2, only the flowering date of the plant was examined. The data for ‘nfc’ and ‘T15’ cultivated in 2018 and 2019 were obtained from a previous study (Kinoshita et al., 2021).
Library preparation and sequencing
For whole genome sequencing, the genomic DNA was extracted from leaves using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA) or the method described by Mizuno et al. (2020). The DNA samples of ‘nfc’, ‘W1’, ‘T15’(#43), and ‘kairan’ were extracted from the same clone individuals that were used for crossing. ‘nfc’בT15’ F2 comprised the progenies of six F1 individuals (n = 125, 82, 68, 68, 60, and 8), and the distribution of flowering dates differed slightly depending on the F1 parent. Therefore, from each F2 population derived from five F1 individuals (n = 125, 82, 68, 68, 60), except for one F2 population (n = 8), 10% of plants (n = 13, 8, 7, 7, 6) with the earliest flowering date were selected for Early-flowering bulk and 10% of plants with the latest flowering date including non-flowering plants were selected for Late-flowering bulk. In ‘nfc’בW1’ F2, 15% of plants (n = 16) with the earliest flowering date were selected for Early-flowering bulk and 15% of plants with the latest flowering date including non-flowering plants were selected for Late-flowering bulk. Chinese kale does not require vernalization for flowering because of the loss of BoFLC2 (Tang et al. 2021). Preliminary experiments revealed that ‘kairan’ used in this study lacks BoFLC2, while ‘nfc’, ‘T15’, and ‘W1’ harbored BoFLC2. Additionally, the expression level of BoFLC2 was comparable between ‘nfc’ and ‘T15’. The genotypes for BoFLC2 in ‘kairan’בnfc’ F2 should segregate into three types. Hence, each genotype should be considered separately to prevent the detection of BoFLC2 as a QTL in QTL-seq analysis. We developed a codominant DNA marker for BoFLC2 to distinguish the genotypes of ‘kairan’בnfc’ F2 individuals using multiplex polymerase chain reaction (PCR) (Fig. S2). Consequently, ‘kairan’בnfc’ F2 individuals were divided into ‘nfc’ genotype with homozygous BoFLC2 (n = 93), ‘kairan’ genotype without BoFLC2 (n = 77), or a heterozygous genotype (n = 174). From each BoFLC2 genotype, 20% of plants (n = 18, 16, 34) with the earliest flowering date were selected for Early-flowering bulk and 20% of plants with the latest flowering date were selected for Late-flowering bulk. Before library preparation, the genomic DNA of the F2 plants selected for each bulk was quantified using a Qubit 2.0 Fluorimeter (Invitrogen, Carlsbad, CA, USA) and was mixed in equal amounts. Library preparation was performed using the TruSeq DNA PCR-Free Sample Prep Kit (Illumina, San Diego, CA, USA). The libraries of ‘nfc’ sample were sequenced using Illumina NextSeq 500 in a 151 bp paired-end mode. Meanwhile, the libraries of other samples, except for ‘nfc’, were sequenced using Illumina NovaSeq 6000 in a 151 bp paired-end mode (Table S1).
QTL-seq analysis for the flowering dates
The FASTQ files generated by whole genome sequencing were filtered and trimmed using Trimmomatic (Bolger et al. 2014). The remaining reads were mapped to Brassica oleracea reference genomes of ‘TO1000’ (Parkin et al. 2014) downloaded from National Center for Biotechnology Information (NCBI) using Burrows-Wheeler Aligner (BWA) with the default parameters (Li and Durbin 2009). The generated SAM files were converted into BAM files, and BAM files were sorted using SAM tools (version 1.9) (Li et al. 2009). The variants including both SNPs and Indels were called using Genome Analysis Toolkit (GATK) (version 4.1) (McKenna et al. 2010). The output genomic variant call format (GVCF) file of parents and their F2 bulks were merged and generated database using GATK “GenomicsDBImport” function, the variant call format (VCF) file was output from the database using GATK “GenotypeGVCFs” function, and then VCF file was converted into a tab-delimited table using the GATK “VariantsToTable” function. The tab-delimited table format file was used for downstream analyses by R (R Core Team 2020). Polymorphism used to calculate the ΔSNP-index were filtered using the method of Itoh et al. (2019) with minor modifications. Concretely, only the polymorphisms positions that met all of the following criteria were retained: both parents’ genotypes (GT column) were homozygous, there was a polymorphism between the parents, and the coverage was at least 5 reads for both parents and 10 reads for both bulks. By comparing the genotypes of ‘nfc’ and the other parent with the reference, genotypes in the F2 bulk were converted from reference (ref) homozygotes, alternative (alt) homozygotes, and heterozygotes to ‘nfc’ homozygotes, the other parent homozygotes, and heterozygotes. The SNP-index in each bulk was calculated as the frequency of ‘nfc’ allele. Then, we calculated the ΔSNP-index with a sliding window size of 1 Mb, except for polymorphisms for which both bulk SNP-indexes were less than 0.3 in 10,000 simulations, and simultaneously calculated the 99% and 99.9% confidence intervals using R package “QTLseqR” was used to identify significant QTL (Mansfeld and Grumet 2018). We used polymorphisms including not only SNPs but also Indels in the calculation of ΔSNP-index.
The data in FASTQ files were filtered and trimmed using Trimmomatic (Bolger et al. 2014) to obtain a quality score of at least 20. The remaining 150 bp paired-end reads were aligned to the reference genomes of B. oleracea TO1000 (Parkin et al. 2014) using Burrows-Wheeler Aligner (BWA) with the default parameters (Li and Durbin 2009). Next, the SAM files were converted into BAM files, which were sorted using SAM tools (Li et al. 2009). Variant calling including both single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) was performed using Genome Analysis Toolkit (GATK) (version 4) (McKenna et al. 2010). The genomic variant call format (GVCF) file of both parents and both bulks from the same F2 population were merged using GenomicsDBImport function and GenotypeGVCFs function from GATK. The final merged variant call format (VCF) file was converted into a tab-delimited table using the VariantsToTable function from GATK.
QTL-seq analyses were performed following the methods of a previous study (Abe et al. 2012; Takagi et al. 2013; Itoh et al. 2019). Polymorphisms containing both SNPs and indels were filtered under the following three conditions; (1) genotypes (GT) of both parents were homozygous; (2) presence of genotypic (GT) polymorphism between parents; (3) read depth (DP) > 9 for both parents and both bulks. By dividing the allele depth (AD) of both bulks and comparing it to the genotypes of the TO1000 reference genome and the parental genome, we calculated the AD of the ‘nfc’ type and the other parental type for each bulk. The frequency of the ‘nfc’ type AD was calculated as the SNP-index of each bulk. The SNP-index in this analysis was calculated for not only SNPs but also indels. The average ∆SNP-index (SNP-index_ Late-bulk−SNP-index_Early-bulk) was calculated using the R package “QTLseqR” (Mansfeld and Grumet 2018). Polymorphisms with SNP-index of both bulks less than 0.3 were excluded because they were considered false positive polymorphisms. Additionally, the window size was set to 10 Mb for ‘nfc’בT15’ F2 and 2 Mb for ‘nfc’בW1’ F2 or ‘kairan’בnfc’ F2 to account for the number of polymorphisms. Simultaneously, 95%, 99%, and 99.9% confidence intervals were calculated for 10,000 simulations.
Validation of the genomic region associated with the flowering date using QTL analysis
The significant ∆SNP-index peak for the flowering date identified using QTL-seq was confirmed using QTL analysis with the ‘nfc’בW1’ F2 population. Cleaved amplified polymorphic sequence (CAPS) markers were designed for candidate QTL regions on chromosome 9 (Chr. 9) using ‘nfc’ and ‘W1’ genome sequences mapped to the TO1000 reference genome (Table S2). The genomic DNA of all ‘nfc’בW1’ F2 individuals cultivated in 2020 (n = 109) was extracted from leaves, following the methods described by Mizuno et al. (2020). Specific regions were amplified with primers of each CAPS marker using KOD One® PCR Master Mix (Toyobo Co., Ltd., Osaka, Japan), following the manufacturer’s instructions. The PCR conditions were as follows: 30 cycles of 98 °C for 10 s, 65 °C for 5 s, and 68 °C for 1 s. The PCR products (1 µL) were digested with restriction enzyme (0.5–1.5 U) for at least 3 h at 37 °C and analyzed using 1 % agarose gel at 100 V for 30 min. The genotype was determined based on the lengths of the detected bands.
Days to flowering of the plant, days to flowering of the terminal bud, and the number of flowering shoots were used for phenotypic characterization. Days to flowering were calculated based on the flowering date of the earliest flowered individual (for example, 1 day for the earliest flowered individual and 6 days for the individual that flowered 5 days later). Non-flowering individuals were not included in QTL analysis for days to flowering. The number of flowering shoots of non-flowering individuals was considered zero in QTL analysis for flowering shoots.
QTL analysis was performed with R/qtl (Broman et al. 2003). The Haldane map function was used to estimate genetic distances and map order. The logarithm of odds (LOD) scores was estimated using both interval mapping (IM) and composite interval mapping (CIM). In IM analysis, the LOD scores were estimated using nonparametric interval mapping (scanone function; model = “np”) and the EM algorithm. Meanwhile, in CIM analysis, the LOD scores were estimated using Haley-Knott regression. To identify the presence of a significant QTL, 95%, 99%, and 99.9% significance thresholds were determined using a permutation test with 10,000 permutations in IM analysis and 1,000 permutations in CIM analysis.
Fine mapping of QTL region using QTL analysis
To narrow down the identified QTL regulating the non-flowering trait, QTL analysis was performed using ‘kairan’בnfc’ F2 population cultivated in 2021. CAPS markers were designed for candidate QTL regions on Chr. 9 using ‘nfc’ and ‘kairan’ genome sequences mapped to the TO1000 reference genome (Table S2). DNA purification-free PCR was performed using a previously reported method (Jia et al. 2021) with some modifications. Approximately 1 cm2 of leaf sampled from each seedling was rubbed onto filter papers that were air-dried after submerging in the soaking buffer. A disk with leaf samples and filter papers punched with a 1 mm diameter biopsy punch (KAI Corporation and KAI Industries Co., Ltd, Tokyo) was used as a template. Specific regions were amplified with primers of each CAPS marker using KOD One® PCR Master Mix (Toyobo Co., Ltd., Osaka, Japan) with 4% Tween 20 (Wako Pure Chemical Industries, Ltd., Osaka, Japan). PCR and restriction digestion were performed as mentioned above. QTL analysis using only recombinant plants was associated with a high power of detection and fine mapping of the loci (Topcu et al. 2021). Therefore, genotypes of the CAPS markers P350-351 (Chr. 9: 46,064,092) and P407-408 (Chr. 9: 53,311,640) designed at both ends of the candidate region for all seedlings were determined. Only recombinant plants with different genotypes of the two CAPS markers were transplanted in the open field and genotyped using the additional seven CAPS markers. Subsequently, the flowering dates of these plants were examined. For phenotypic characterization, days to flowering of the plant calculated based on the flowering date of the earliest flowered individual was used. QTL analysis was performed as described above.
Expression analysis
For gene expression analysis, ‘nfc’ (n = 4), ‘T15’ (plants cultivated from seed, n = 3), and ‘W1’ (n = 3) transplanted in the open field on October 15, 2021, were used. The upper leaves and the maximum leaves were sampled 15–30 min before sunset at the following time points: one week after planting (October), in the middle of each month from November to March, early April, and mid-April. Meanwhile, the plants grew gradually and encountered low temperatures during winter. The upper leaf was defined as the most recently developed leaf with a length of approximately 2 cm. As the head formed after December, new leaves were sampled by pushing aside the surrounding leaves. The maximum leaf was defined as the intact leaf with the longest length. Leaf disks of the maximum leaf were sampled with a 6 mm diameter biopsy punch (KAI Corporation and KAI Industries Co., Ltd, Tokyo).
Total RNA was extracted from the leaf samples using Sepasol RNA I Super G (Nacalai Tesque, Inc., Kyoto, Japan), purified using an Econospin™ for RNA (Epoch Life Sciences, Missouri, TX, USA), and reverse transcribed into complementary DNA (cDNA) using ReverTra Ace® (Toyobo Co., Ltd., Osaka, Japan) and RNase Inhibitor Recombinant (Toyobo Co., Ltd., Osaka, Japan), following the manufacturer’s instructions. Subsequently, 1 μL of cDNA diluted 10-fold was used as a template for quantitative real-time PCR (qPCR). qPCR analysis was performed using a THUNDERBIRD® SYBR® qPCR Mix kit (Toyobo Co., Ltd.), following the manufacturer’s instructions. PCR was performed using a LightCycler® 480 system (Roche Diagnostics K.K., Tokyo, Japan) under the following conditions: 95°C for 5 min, followed by 40 cycles of 95°C for 10 s, 60°C for 30 s, and 72°C for 30 s. Single-target product amplification was evaluated using a melting curve. The expression levels of target genes were normalized to those of the housekeeping gene BoActin. Gene expression was quantified using the comparative Ct method (2−∆Ct method). The experiments were repeated two times as technical replicates. All qPCR primers used in this study are listed in Table S3. The amplification efficiency of each primer set that was calculated using a standard curve of serially diluted samples (five-fold dilution), was 100 ± 5%.
RNA-seq analysis
RNA-seq analysis was performed using ‘nfc’ (n = 3) and ‘T15’ (plant cultivated from seed, n = 3). Total RNA extracted from the upper leaf sampled in mid-November as above was used for RNA-seq analysis. RNA-seq libraries were prepared with NEBNext Ultra Directional RNA LP kit (New England Biolabs Ltd., UK) and subjected to 150 bp pair-end sequencing using an Illumina NovaSeq 6000 (Table S4). High-quality clean data were obtained by discarding the paired reads containing adapter sequences, more than 10% uncertain nucleotides, and more than 50% low-quality nucleotides (base quality less than 5), following the standard protocol provided by Novogene (Novogene Bioinformatics Technology Co. Ltd., Beijing, China). The reads were mapped against the TO1000 reference genome, and the expression levels were calculated using RSEM (Li and Dewey 2011) with Bowtie2 alignment program. Differential expression analysis was performed using edgeR (ver 3.38.4) with glmQLFTest (Robinson et al. 2010) in R (R Core Team, 2022).
Phylogenetic analysis of B. oleracea FLC homologs
Illumina reads obtained after RNA-seq of ‘nfc’ and ‘T15’ were used for de novo assembly with Trinity (v.2.4.0) (Grabherr et al. 2011). Contigs exhibiting high similarity (Blastn, expected value < 1e−150) with the cDNA sequences of BoFLC1a (XM_013756815.1), BoFLC1b (XM_013756817.1), BoFLC2 (AY306122.1), BoFLC3 (XM_013773027.1), and BoFLC5 (Bo3g024250.1) were selected. Phylogenetic analysis was performed using the selected contigs and BoFLC homologs, as well as the following B. napus FLC homologs derived from the C genome: BnaFLC.C02 (GSBRNA2T00068991001), BnaFLC.C03a (GSBRNA2T00134620001), BnaFLC.C03b (GSBRNA2T00024568001), BnaFLC.C09a (GSBRNA2T00016124001), and BnaFLC.C09b (GSBRNA2T00016119001). A phylogenetic tree was constructed using the Create Alignment and Create Tree (Algorithm = Neighbor Joining, Bootstrap = 1,000 replicates) function in CLC Genomics Workbench (CLC Bio; QIAGEN).