Genome-wide association analysis uncovers candidate genes for forage quality traits in Brassica napus stems



Abstract
Background
Brassica napus (rapeseed) is an important oilseed crop and its leaves and stems can also be used as animal feed. Lignocellulose content is closely related to the nutritional quality and palatability of animal feed. However, quantitative trait loci (QTLs) associated with acid detergent fiber (ADF) and neutral detergent fiber (NDF) contents in rapeseed stems have not yet been mapped.

Results
In this study, we used 494 B. napus accessions to perform genome-wide association studies (GWAS) of ADF and NDF contents. Ninety-two single-nucleotide polymorphism (SNP) loci and 35 simple-sequence repeat (SSR) loci were significantly associated with ADF and NDF contents, and six genetic loci associated with ADF and NDF contents were detected using both types of markers. We identified three candidate genes on chromosome A05 related to ADF content, including genes encoding chitinase-like protein 2 (CTL2) and two trichome birefringence-like 41 (TBL41) proteins. Seven genes on chromosomes A03 and A04 were related to NDF content, including genes encoding glycosyl hydrolase (GH), reversibly glycosylated polypeptide 1 (RGP1), irregular xylem 12 (IRX12), trichome birefringence-like 34 (TBL34), galacturonosyltransferase 7 (GAUT7), cytokinesis defective 1 (CYT1), and LOB domain-containing protein 15 (LBD15). These candidate genes encode factors that likely participate in secondary cell wall formation and lignocellulose biosynthesis.

Conclusions
These findings lay the foundation for identifying genes related to forage quality traits and improving the efficiency of forage utilization in rapeseed, which will be beneficial for breeding new varieties for high-quality forage with low lignocellulose content.

Background
Brassica napus is not only an important source of edible oil, but it is also a forage crop whose leaves and stems are used for ruminant animal feed [1]. Brassica forage with low fiber and high protein contents is more similar to traditional forage, grass, legumes and herbage than other types of Brassica [2]. Following fermentation to form silage, rapeseed stems contained 4.78% crude protein, 1.04% ether extract, 45.59% crude fiber, 49.72% acid detergent fiber (ADF), 62.45% neutral detergent fiber (NDF), 9.17% acid detergent lignin (ADL), 0.63% Ca and 0.08% P contents, and showed improved palatability [3]. The weight of cattle increased after they were fed rapeseed stems [4]. However, rapeseed stems have poor digestibility due to the high cellulose and lignin contents in their cell walls, limiting the value of rapeseed straw for use as animal feed [5].
The cell wall is a composite of cellulose, hemicellulose, and lignin. The ADF and NDF contents of the cell wall can be estimated using the Van Soest method [6]. NDF consists of the cellulose, hemicellulose and lignin that remain after neutral detergent treatment, and ADF consists of the cellulose and lignin that remain after acid detergent treatment. Cell wall digestibility is widely used to evaluate the nutritional quality of forage [7]. The ADF and NDF contents of crops are negatively related to the digestibility of forage, as well as its nutritional quality and palatability [8]. However, cellulose and lignin are important for the mechanical strength and structural integrity of the stem [9,10], and are correlated with lodging and disease resistance [11][12][13]. Therefore, it is important to balance the ADF and NDF contents in stems.
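Because the detergent fractions are nested, rough estimates of the individual components follow by difference. The sketch below is only an illustration: it assumes the common Van Soest convention (hemicellulose ≈ NDF − ADF, cellulose ≈ ADF − ADL, ash ignored), the function name is hypothetical, and the input values are the silage figures quoted above from [3], not data from this study.

```python
def fiber_fractions(ndf, adf, adl):
    """Estimate cell wall fractions (% of dry matter) by difference.

    Assumes NDF >= ADF >= ADL, as expected for nested detergent residues.
    """
    if not (ndf >= adf >= adl >= 0):
        raise ValueError("expected NDF >= ADF >= ADL >= 0")
    return {
        "hemicellulose": round(ndf - adf, 2),  # removed by acid detergent
        "cellulose": round(adf - adl, 2),      # ADF minus lignin
        "lignin": round(adl, 2),               # acid detergent lignin
    }


# Silage values quoted in the text: NDF 62.45%, ADF 49.72%, ADL 9.17%.
silage = fiber_fractions(ndf=62.45, adf=49.72, adl=9.17)
print(silage)
```

These difference estimates are approximate, since small amounts of protein and ash can remain in each residue.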
QTL mapping of cell wall components has been performed in various crops. Cardinal et al. (2005) [14] identified 64 QTLs related to lignin, ADF, and NDF contents in the leaf-sheaths and stalks of a maize population. In another study, single-nucleotide polymorphisms (SNPs) associated with ADF, NDF and in vitro dry matter digestibility (IVDMD) were identified in 368 maize inbred lines grown in seven different environments; each significant SNP explained 4.2-6.2% of the phenotypic variation [7]. Another study [15] detected 24 SNPs associated with NDF content on chromosomes 1, 2, 6, 7, 8, 9 and 10 and 7 SNPs associated with ADF content on chromosomes 3 and 4 in sorghum (Sorghum bicolor). In B. napus, most QTLs for ADF and NDF contents identified to date are associated with the seed coat. For example, Badani et al. (2006) [16] demonstrated that seed coat color is related to ADF content. The main QTL for this trait was detected at the same position on chromosome N18 in all three populations examined; the second QTL for ADF content was located on chromosome N13. Liu et al. (2013) [17] identified 7, 9, and 5 QTLs for seed coat lignin, cellulose, and hemicellulose contents in an RIL population of rapeseed, explaining 8.1-42.8%, 4.7-21.9% and 7.3-16.9% of phenotypic variation, respectively. However, to date, QTL mapping for ADF and NDF contents in rapeseed stems has not been reported.
In the current study, we used SNP markers from the Brassica Illumina 60K SNP array and simple-sequence repeat (SSR) markers to genotype 494 B. napus accessions. We performed genome-wide association studies (GWAS) to identify significant loci and candidate genes associated with ADF and NDF contents. This study provides an important basis for identifying and cloning genes related to ADF and NDF contents in B. napus stems.

Results
Phenotypic variation of ADF and NDF contents in B. napus
In this study, we measured ADF and NDF contents in the stems of 494 B. napus lines in both 2013 and 2014. These values showed continuous variation and approximated a normal distribution (Fig. 1). The phenotypic range of NDF content in 2013 was 70.34-79.85%, and that of ADF content was 54.87-62.86%. The coefficient of variation of NDF content in 2013 and 2014 was 2.74% and 1.97%, respectively, and that of ADF content was 2.59% and 1.95%, respectively (Table 1). These results indicate that NDF and ADF contents are typical quantitative traits.
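For readers unfamiliar with the statistic, the coefficient of variation is the sample standard deviation expressed as a percentage of the mean. The sketch below shows the calculation on made-up NDF values chosen to fall within the 2013 range; the values and the function name are illustrative assumptions, not the study data.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV as a percentage: 100 * SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * stdev(values) / m


# Hypothetical NDF contents (%) for five lines, not measured values.
ndf_example = [70.5, 73.2, 75.0, 76.8, 79.5]
cv = coefficient_of_variation(ndf_example)
print(f"CV = {cv:.2f}%")
```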

GWAS Using SNPs
We performed GWAS of 494 B. napus accessions using the GLM and MLM models. For ADF content, the QQ plot showed that the P values from the K, K + PCA and K + Q models were closest to the expected values, indicating that these models reduced false positives more effectively than the other models (Fig. 2a).
For NDF content, the P, K + PCA and K + Q models were better than the Q and K models (Fig. 2b).
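The model comparison above rests on how closely observed P values track their expectation under the null hypothesis. The sketch below shows one way to compute QQ-plot coordinates and the genomic inflation factor (a summary statistic not reported in this study). It does not implement the Q, K or PCA corrections themselves; the function names and the evenly spaced test P values are assumptions for illustration.

```python
import math
from statistics import NormalDist, median


def qq_points(p_values):
    """Pair expected and observed -log10(P), largest values first."""
    n = len(p_values)
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    # Expected quantiles of a uniform(0, 1) null distribution.
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, observed))


def genomic_inflation(p_values):
    """Median 1-df chi-square statistic divided by its null median."""
    z = NormalDist()
    # A two-sided P value p corresponds to chi2 = (z-quantile of p/2)^2.
    chi2 = [z.inv_cdf(p / 2.0) ** 2 for p in p_values]
    null_median = z.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / null_median


# Perfectly calibrated P values: points lie on the diagonal.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
points = qq_points(uniform_p)
lam = genomic_inflation(uniform_p)
print(f"lambda = {lam:.3f}")
```

An inflation factor well above 1 suggests uncorrected population structure or relatedness, which is the pattern the structure and kinship terms are meant to remove.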
We detected 11 SNPs that were significantly related to ADF content on chromosomes A05, A06, A07, and A09, which explained 3.90-5.32% of the phenotypic variation (Fig. 3 and Additional file 1). We identified the candidate gene BnaA05g23000D (chitinase-like 2, CTL2), which is located at position 17.4 Mb on chromosome A05. CTL2 binds to glucan-based polymers and regulates cellulose assembly in Arabidopsis thaliana [18,19].
In addition, we identified 81 loci that were significantly associated with NDF content on all chromosomes except A01, A08, and C06; these 81 loci explained 3.21-6.21% of the phenotypic variation (Fig. 3 and Additional file 1). We also identified several candidate genes involved in lignocellulose biosynthesis. BnaA02g09490D (galacturonosyltransferase 12, GAUT12), located at position 4.7 Mb on chromosome A02, is involved in xylan biosynthesis and lignin deposition during secondary cell wall formation [20]. BnaA03g14000D and BnaA03g14010D, located at approximately 6.4 Mb on chromosome A03, and BnaA04g17560D and BnaA04g17570D, located at approximately 14.2 Mb on chromosome A04, encode C4H, which is involved in lignin biosynthesis [21]. We also found the MYB transcription factor gene BnaA06g25640D (MYB103) on chromosome A06. In Arabidopsis, MYB103 regulates syringyl lignin and cellulose biosynthesis in the cell wall [22,23].
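The "phenotypic variation explained" by a marker is an R² value. As a rough illustration only, the sketch below computes R² from a simple regression of phenotype on allele dosage. This naive version ignores the population structure and kinship terms used by the GLM and MLM models, so it is not how TASSEL derives the figures reported above; the genotypes, phenotypes and function name are made up.

```python
from statistics import mean


def marker_r2(genotypes, phenotypes):
    """Squared Pearson correlation between allele dosage and phenotype."""
    g_bar, p_bar = mean(genotypes), mean(phenotypes)
    sxy = sum((g - g_bar) * (p - p_bar) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((p - p_bar) ** 2 for p in phenotypes)
    if sxx == 0 or syy == 0:
        raise ValueError("marker or phenotype has no variation")
    return sxy ** 2 / (sxx * syy)


# Hypothetical homozygous dosages (0 or 2) and ADF contents (%) for 8 lines.
dosage = [0, 0, 0, 2, 2, 2, 0, 2]
adf = [58.1, 57.4, 59.0, 59.2, 58.8, 60.1, 58.5, 59.6]
r2 = marker_r2(dosage, adf)
print(f"R^2 = {r2:.3f}")
```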

Common Associations Using SNPs And SSRs
We detected six genetic loci associated with ADF and NDF contents using both SNP and SSR markers (Table 3). The SSR associated with ADF content was located at position 18.

Discussion
B. napus stems are currently an underutilized resource in China. A large proportion of B. napus stems are usually burned or chopped and incorporated into the soil, which pollutes the air and disrupts the ecological balance [24]. B. napus stems have huge potential for use as a source of fuel with high sulfur content, high calorific value and low moisture content, which would add value to the crop at the farm level [25]. In addition, B. napus stems could be used as feed to meet the current demands in light of insufficient forage. However, the low digestibility of rapeseed stems due to the high cellulose and lignin contents in the stem cell walls limits the value of this material. In the current study, we identified significant markers and candidate genes associated with ADF and NDF contents, which should facilitate the improvement of rapeseed varieties in the future. In addition, studying ADF and NDF contents in stems is essential for improving lodging and disease resistance for rapeseed breeding and cultivation.
Near-infrared (NIR) spectroscopy is a rapid, efficient, non-destructive approach for predicting chemical composition in large numbers of samples. This technique requires little or no sample pretreatment and does not alter the structure of the samples, which in turn reduces analytical costs and saves labor and time [26]. In addition, NIR results are comparable in accuracy to those of other analytical techniques. NIR technology has been widely used to detect numerous nutrients and toxic elements in agricultural foods during agro-industrial production and to ensure product safety [27,28]. NIR can also be used in the petrochemical industry to determine hydrocarbons, alcohols, ketones and nitrile compounds in organic solvents [29]. Paul et al. (2018) [30] detected microplastics in the soil using a combined NIR chemometric approach, which met the demands for high-throughput analysis of large sample volumes. In addition, NIR technology has been widely used to measure chemical compounds in plants, such as lignin in roots, and to identify important QTLs [31]. In a grain × sweet sorghum population, 17 and 14 QTLs associated with ADF and NDF contents, respectively, were detected using NIR [32].
In the current study, we identified candidate genes associated with ADF and NDF contents via GWAS using SNP and SSR markers. Three candidate genes were identified for ADF content, including CTL2 and two TBL41 genes. CTL2 (BnaA05g23000D; chitinase-like 2) binds to glucan-based polymers and functions in cellulose biosynthesis [33]. TBL41 (BnaA05g24680D; trichome birefringence-like 41) might be involved in cell wall formation. In A. thaliana, TBL27 is required for xyloglucan acetylation [34], and TBL3 contributes to cellulose biosynthesis [35]. Volker et al. (2010a) [36] proposed that TBL is a pectin-binding protein or a bridging protein that binds to pectin and other cell wall polysaccharides, based on sequence and structural similarities with rhamnogalacturonan acetylesterase (RGAE) of Aspergillus aculeatus and the protein LUSTRIN A-LIKE (Oryza sativa). The role of TBL41 needs to be further verified.
Among the candidate genes for NDF content, BnaA03g22360D, which is homologous to AT2G27500, encodes a glucan endo-1,3-beta-glucosidase that participates in carbohydrate movement by hydrolyzing glycosidic bonds in cellulose and stabilizes protein morphology [37]. BnaA03g27800D, encoding reversibly glycosylated polypeptide 1 (RGP1), is homologous to AT3G02230. RGP1 in pea (Pisum sativum) can be glycosylated by UDP-Glc, UDP-Xyl or UDP-Gal, and is involved in xyloglucan and hemicellulose biosynthesis [38]. In addition, BnaA04g21810D (irregular xylem 12, IRX12), a laccase gene, is essential for lignin and secondary cell wall biosynthesis [39]. BnaA04g21970D (trichome birefringence-like 34, TBL34) encodes a protein containing DUF231 (domain of unknown function 231). TBL34 is expressed in xylem cells in Arabidopsis, and mutation of this gene caused a significant reduction in the amount of acetyl groups in xylan. The tbl34 tbl35 esk1 (ESKIMO1) triple mutant exhibited collapsed xylem vessels and retarded plant growth, indicating that TBL34 is essential for secondary cell wall biosynthesis and cellulose deposition [40].
Another candidate gene, BnaA04g22110D, encodes galacturonosyltransferase 7 (GAUT7), which plays an important role in polysaccharide biosynthesis in the cell wall matrix by interacting with GAUT1 [41]. CYT1 (BnaA04g22820D) was identified at approximately 17.0 Mb on chromosome A04. According to a previous report, CYT1 is also related to the monolignol S/G ratio and increases lignin content, and its previously reported position overlaps with the physical location determined in the current study [12]. The cyt1 mutant exhibited a 5-fold decrease in cellulose content in embryos, impairing cell wall synthesis [42]. LBD15 (BnaA03g19070D, LOB domain-containing protein 15) regulates the expression of VND7, encoding a master regulator of tracheary element differentiation that functions via a positive feedback mechanism [43].

GWAS And Candidate Gene Identification
The ADF and NDF traits were evaluated for two years with two biological replicates. The best linear unbiased prediction (BLUP) of each trait with two replicates over two years was estimated using an R script (http://www.eXtension.org/pages/61006) based on a linear model. Population structure (Q) was estimated using STRUCTURE software, setting the K value from 1 to 10, with three independent runs for each value [47]. The relative kinship between the materials was calculated using TASSEL 5 [48]. Association analysis was performed using the general linear model (GLM) and mixed linear model (MLM) in TASSEL 5 for SNP markers and TASSEL 3 for SSR markers. In addition, a quantile-quantile (QQ) plot was generated based on the observed and expected -log10(P) values, and a Manhattan plot was drawn using the R package qqman [49]. The GWAS threshold was set to P
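To illustrate what a BLUP does to raw line means, the sketch below uses a deliberately simplified stand-in for the R script cited above: a balanced one-way random-effects model in which all year-by-replicate observations of a line are pooled, with variance components estimated from ANOVA mean squares. Year effects, the pooling choice, the function name and the toy data are all assumptions made for illustration, not the authors' procedure.

```python
from statistics import mean


def balanced_blup(obs_by_line):
    """Shrink line means toward the grand mean (balanced one-way model).

    obs_by_line maps a line name to a list of r observations.
    Returns (predicted value per line, shrinkage factor).
    """
    lines = list(obs_by_line)
    g = len(lines)
    r = len(obs_by_line[lines[0]])
    if g < 2 or r < 2 or any(len(obs_by_line[k]) != r for k in lines):
        raise ValueError("need >= 2 lines with the same number (>= 2) of observations")
    line_means = {k: mean(obs_by_line[k]) for k in lines}
    grand = mean(list(line_means.values()))
    # ANOVA mean squares between lines and within lines.
    ms_line = r * sum((m - grand) ** 2 for m in line_means.values()) / (g - 1)
    ms_err = sum(
        (y - line_means[k]) ** 2 for k in lines for y in obs_by_line[k]
    ) / (g * (r - 1))
    var_line = max(0.0, (ms_line - ms_err) / r)  # genetic variance estimate
    denom = var_line + ms_err / r
    shrink = var_line / denom if denom > 0 else 0.0
    preds = {k: grand + shrink * (line_means[k] - grand) for k in lines}
    return preds, shrink


# Hypothetical ADF contents (%): 3 lines x (2 years x 2 replicates).
toy = {
    "L1": [56.0, 56.4, 55.8, 56.2],
    "L2": [58.9, 59.3, 59.1, 58.7],
    "L3": [61.0, 60.6, 61.2, 60.8],
}
blups, shrinkage = balanced_blup(toy)
print(blups, shrinkage)
```

The shrinkage factor equals the line-mean heritability in this simplified setting, so noisier traits are pulled more strongly toward the grand mean before association analysis.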

Declarations
Availability of data and materials
The data sets used and/or analysed during the current study will be available upon reasonable request to the corresponding author.

Consent For Publication
Not applicable.

Ethics approval and consent to participate
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Supplementary Files
Supplementary files associated with this preprint: additionalfile1.pdf