Identication of Candidate Genes for Clubroot-Resistance in Brassica oleracea using QTL-Seq

Clubroot caused by Plasmodiophora brassicae is a devastating disease of cabbage (Brassica oleracea). To identify quantitative trait loci (QTLs) for clubroot resistance (CR) in B. oleracea, genomic resequencing was carried out in two sets of extreme pools that constructed from 184 F2 cloned-lines derived from cross between clubroot-resistant cabbage ‘GZ87’ (against race 4) and susceptible cabbage ‘263’. QTL-seq analyses identied one CR QTL from group I on chromosome C07 and four QTLs from group II on C04 and C07, among which three QTLs of C07 that found from group II were located within the one detected from group I. RNA-Seq and qRT-PCR were conducted in the extreme pools of group II before and after inoculation identied two potential candidate genes (Bol037115 and Bol042270) from the three QTLs interval on C07, which exhibiting up-regulation after inoculation in the resistant pool but down-regulation in the susceptible pool. A functional marker ‘SWU-OA’ was developed from one QTL on C07, exhibiting ~95% accuracy in identifying CR in 56 F2 lines. Our study will provide valuable information on digging resistance genes against P. brassicae and may accelerate breeding process of B. oleracea with CR.


Introduction
Clubroot disease caused by the obligate parasite Plasmodiophora brassicae is a devastating disease that affects Brassica species worldwide including important crops like B. oleracea, B. rapa and B. napus (Chu et al. 2014;Dixon 2009). The pathogen usually causes gall formation on plant roots which may subsequently hamper the uptake of su cient nutrients and water, leads to abnormal growth of plants and nally results in severe yield losses and economic damage of the crop (Devos et al. 2005). It is hard to control this disease once a eld is contaminated with the pathogen since the resting spores of the soil-borne pathogen can survive in the soil for as long as 20 years (Chu et al. 2014). Cultural managements and chemical fungicides are currently the common approaches to control clubroot (Friberg et al. 2005; Ludwig-Müller et al. 2009; Peng et al. 2015), but the e ciency is unstable and the environmental damages from fungicides cannot be ignored. Therefore, developing resistant cultivars is the most effective, economical and environment friendly way to control clubroot.
In a previous study , several QTLs for traits such as root length and P. brassicae content in roots were identi ed from a hybrid cabbage (B. oleracea L. var. capitata) variety 'GZ87' which was resistant to P. brassicae race 4. However, QTLs for CR in 'GZ87' have not been identi ed yet. In the present study, QTL-Seq strategy (Takagi et al. 2013) was used to identify QTLs using genome resequencing in two sets of extreme pools. Potential candidate genes were identi ed from important QTL regions and a functional marker was further developed for molecular assisted selection. Our study will provide important signi cances for the breeding of clubroot-resistant cabbage.

Plant materials and resistance test
Two cabbage lines, 'GZ87' (with CR to P. brassicae race 4) and '263' (susceptible to P. brassicae race 4), were selected as parents to develop an F2 segregating population comprising of 184 cloned-lines through tissue culture following the asexual reproduction technology (Luo et al. 2000). Vegetative plants were transplanted into 72-well plug trays after rooting and kept in a phytotron (

Whole genome resequencing and QTL-Seq analysis
In order to improve the reliability of QTL mapping, the F2 segregating population was randomly divided into two small groups, in which group I and group II contained 110 and 74 F2 lines, respectively. Each of 17/20 extreme resistant and susceptible lines were selected from group I/II. Genomic DNA was extracted from young leaves of two parents and the extreme lines following the CTAB method (Doyle 1991). A resistant (R) pool and a susceptible (S) pool were constructed within each group by mixing equal volume of DNA from the extreme resistant and susceptible lines. The two pools in group I and two parents were subjected to whole genome resequencing on an Illunima HiSeq X Ten platform in Biomarker (Beijing, China), while the two pools in group II and parents were sequenced on a NovaSeq 6000 platform in the same service provider. Reads with low-quality bases (Q≤20) or with N ratio over 10% in the pooled samples and over 1% in parental lines were removed. The high-quality reads were then mapped to the reference genome of B. oleracea replications. Differential gene expression was calculated using the 2 −ΔΔCt method. The BoActin1 was used as an internal reference control. Primer sequences for all genes were available in Online Resource 1.

Functional marker development
According to the genome sequence of interested region, a common polymorphic site between two parents and between extreme pools was developed for a molecular marker, of which the primers were 5'-TACACACGCTGATATACCAACA-3' (forward) and 5'-TACACACGCTGCCCTGGAAA-3' (reverse). The polymerase chain reaction (PCR) ampli cation was conducted among B. oleracea lines in group II. The PCR products were separated in 1.5 % agarose gels and visualized under UV light.

Clubroot resistance of parents and extreme pools
Clubroot resistance was investigated in two parental lines and the two F2 groups across two years (Online Resource 2). In group I/II, 'GZ87' exhibited a high resistance level to P. brassicae (DI = 6.  (Table 1). According to the DI value of the F2 population of group2, a skewed-right distribution bar graph was obtained (Fig. 1). This shows that the F2 population data we obtained were concentrated on a lower boundary, and the average DI in the population was greater than the median.  Table 2) were detected in group II. Comparing QTLs between two groups, the three QTLs detected in group II were all located within the locus qCRc7-1 identi ed in group I (Fig. 2).

Identi cation of QTLs
Potential candidate genes for CR RNA-seq was conducted in R and S pools of group II to investigate the expression of these genes. Over 6 Gb clean data was obtained from each sample (Q30 > 93%), of which 72.6~74.2% were aligned to the reference genome of B. oleracea (Online Resource 4). Comparing to 0 DAI, 540/500/1837 and 2656/1518/4182 DEGs were detected from the R and the S pool at 4/7/14 DAI respectively. In total, 11/16/28/16 DEGs were detected from 123/79/134/99 genes with large amounts of SNPs and indels that located within the con dence intervals of qCRc4-1/qCRc7-2/qCRc7-3/qCRc7-4 (Table 2, Online Resource 5). Within these QTL intervals, 8 DEGs exhibiting obvious differential expression patterns between the R and the S pool were found to be located in the overlap regions on chromosome C07. The qRT-PCR revealed consistent expression patterns with RNA-seq for 6 genes among these 8 genes (Fig. 3). Among the 6 genes, Bol037115 (FCS-like zinc nger protein, located in qCRc7-2 with 7 SNPs and 1 indel between two pools) and Bol042270 (plant intracellular Ras-group-related LRR protein 8, located in qCRc7-4 with 9 SNPs and 6 indels between two pools) were upregulated after inoculation in the R pool but down-regulated in the S pool. The other 4 genes exhibited downregulation after inoculation in the R pool. Therefore, Bol037115 and Bol042270 were considered as potential candidates for future certi cation.

Functional marker for CR
According to deep comparison between two parents and between extreme pools, a PCR primer pair named 'SWU-OA' was designed within the interval of qCRc7-4 (Fig. 2). After the ampli cation by SWU-OA in 54 F2 lines in group II, 28 lines that exhibited absence of the same bands (400 bp) as the resistant parent GZ87 were all susceptible to the pathogen, with DI values ranged from 20.0~75.0 (averaged in 43.2), while 26 lines that presented the 400-bp bands were found to be with DI values of 0~18.6 (averaged in 4.2), with 3 exceptions that exhibited intermediate DI values (21.4~35.0) (Fig. 4). It is indicated that the SWU-OA exhibited ~95% accuracy in identifying CR in 56 F2 lines and possibly provided a way to accelerate breeding process of B. oleracea with CR. In the Fig. 1, since the mean DI in the population was greater than the median, the F2 population data we obtained were concentrated on a lower boundary and a skewed-right distribution bar graph was obtained. Different from the standard normal distribution chart, the actual data usually be biased. When the kurtosis is positive, it will form skewed-right, otherwise it will form skewed-left. In the absence of gene action, the kurtosis is negative or close to 0, while kurtosis is positive in the presence of gene interactions (Choo and Reinbergs 1982;Kotch et al. 1992;Samak et al. 2011). It is indicated that there were genes interaction in the process of constructing F2 populations between resistant and susceptible parents, studies on the amount of gene interaction are undoubtedly needed, so as to increase the e ciency of our selection and breeding programs.

Discussion
As compared with that of B. rapa, fewer CR loci were identi ed from B. oleracea possibly due to the lack of resistant resources. The reported CR loci of B. oleracea were detected from chromosome C01 (Pb3, PbBo1), C02 (Pb-Bo(Anju)1, Pb-Bo(Anju)2, CRQTL-YC), C03 (Pb-Bo(Anju)3, Pb4), C07 (Rcr7, Pb-Bo(Anju)4) and C09 (CRQTL-GN_1, CRQTL-GN_2). In our study, we identi ed 4 QTLs for CR, of which one located on chromosome C04 and other three located adjacently on C07. No CR QTL has been detected before on C04 in B. oleracea, but two CR QTLs (SCR-C4a and SCR-C4b) were found from chromosome C04 in B. napus (Li et al. 2016). By aligning sequences of the markers in SCR-C4a and SCR-C4b to the reference genome of B. oleracea, the two QTLs were aligned to 2.49-2.51 Mb and 8.06-8.10 Mb on C04 of B. oleracea, which were obviously distant from the interval of our qCRc4-1 . By using the same approach, CR QTLs on chromosome C07 were compared among studies. The CR loci PbBo(Anju)4 (Nagaoka et al. 2010) was found to locate in 37.93-39.25 Mb which was partially overlapped with our qCRc7-2 (38.96-39.52 Mb). However, qCRc7-2 may have a limited effect to CR since PbBo(Anju)4 was reported to be with a very small effect (R 2 = 0.03) and we failed in nding candidate resistance gene from this region. The other two QTLs for CR on C07, i.e. qCRc7-3 (41.38-42.52 Mb) and qCRc7-4 (43.56-44.15 Mb) were found to be located nearby but not overlapped with Rcr7 (42.94-43.20 Mb) which is a major QTL in B. oleracea for resistance against P. brassicae pathotypes 3 and 5X (Dakouri et al. 2018). In addition, the most possible candidate gene for Rcr7 (Bo7g108760, a TIR-NBS-LRR disease resistance gene) (Dakouri et al. 2018) was not induced by P. brassicae race 4 in the present study (FPKM of 0.5, 0.6, 0.4 and 0.4 in R0, R4, R7 and R14; 1.6, 0.5, 0.8 and 0.6 in S0, S4, S7 and S14, respectively).
These suggest that qCRc7-3 and qCRc7-4 were novel loci for CR. The possible reason might be the difference on plant materials or P. brassicae pathotypes/races.
In a previous study, 23 QTLs for three CR-associated traits were identi ed from the same segregating population between 'GZ87' and '263', including disease incidence (DIC), numbers of brous roots (NFR) and P. brassicae content in roots (PCR)  .Of these, 4 QTLs (NFR.I-3, NFR.I-4, NFR.II-4 and PCR.II-3) located on chromosome C04 and 2 QTLs (NFR.II-6 and NFR.II-7) located on chromosome C07. However, after comparison on the physic positions of these loci, none of them was overlapped with the QTLs for CR found in the current study. The closest loci between the two studies were NFR.II-6 (47.77-48.30 Mb) and qCRc7-4 which showed a distance of 3.62Mb on chromosome C07. This suggests that disease incidence, numbers of brous roots and P. brassicae content in roots may be not representative indicators for CR which is usually determined by disease index.
A total of 312 genes were found to locate in the three QTLs regions on C07, including 6 resistance genes (R genes) encoding TIR-NBS-LRR disease resistance proteins. However, none of these R genes presented expression difference between R and S pools in RNA-seq (Online Resource 6). Although 61 DEGs were identi ed from the three regions, most of them presented similar expression patterns between the two pools excepting eight genes. Among these, an FCS-like zinc nger (FLZ) domain protein (Bol037115) and a plant intracellular Ras-group-related LRR (PIRL) protein (Bol042270) exhibited over 3-fold up-regulation after inoculation in the R pool but with down-regulation in the S pool. The FLZ domain proteins are implicated in the regulation of various biotic and abiotic stresses (Chen et al. 2013b; Jamsheer K and Laxmi 2015).
Members of Arabidopsis thaliana FLZ gene family is responsive to ABA and JA (Nietzsche et al. 2014). PIRLs encode a plant-speci c class of leucine-rich repeat proteins related to Ras-interacting LRRs that take part in developmental signaling in animals and fungi (Forsthoefel et al. 2005). Some PIRL family members in rice were found to respond to different hormonal treatments, including NAA (a member of the auxin family), KT (a cytokinin), and GA (a gibberellin) (You et al. 2010). It seems that both the two candidate genes are potentially involved response to hormones which are tightly associated with host response to pathogens.
Clubroot-resistant variety is of great importance in cabbage cultivation, but the breeding practice was unsuccessful due to the lack of highly resistant sources, the quantitative nature of resistance, the di culties in marker assisted selection (MAS) and the instability of resistance evaluation. In our study, a molecular marker (SWU-OA) that developed from the polymorphic region within qCRc7-4 was effective in distinguishing resistant or susceptible of F2 lines (with an accuracy of 95%), suggesting a great potential of this marker to be applied in MAS of offspring with CR. This would be helpful for lightening the labour of resistance screening in low generations and may accelerate the breeding process of B. oleracea with CR.