Regional association and transcriptome analysis revealed candidate genes controlling plant height in Brassica napus

Plant height is a key morphological trait in rapeseed, which not only plays an important role in determining plant architecture, but is also an important characteristic related to yield. Presently, the improvement of plant architecture is a major challenge in rapeseed breeding. This work was carried out to identify genetic loci related to plant height in rapeseed. In this study, a genome-wide association study (GWAS) of plant height was performed using a Brassica 60 K Illumina Infinium SNP array and 203 Brassica napus accessions. Eleven haplotypes containing important candidate genes were detected and significantly associated with plant height on chromosomes A02, A03, A05, A07, A08, C03, C06, and C09. Moreover, regional association analysis of 50 resequenced rapeseed inbred lines was used to further analyze these eleven haplotypes and revealed nucleotide variation in the BnFBR12-A08 and BnCCR1-C03 gene regions related to the phenotypic variation in plant height. Furthermore, coexpression network analysis showed that BnFBR12-A08 and BnCCR1-C03 were directly connected with hormone genes and transcription factors and formed a potential network regulating the plant height of rapeseed. Our results will aid in the development of haplotype functional markers to further improve plant height in rapeseed.

and significantly associated with plant height on chromosomes A02, A03, A05, A07, A08, C03, C06, and C09. Moreover, regional association analysis of 50 resequenced rapeseed inbred lines was used to further analyze these eleven haplotypes and revealed nucleotide variation in the BnFBR12-A08 and BnCCR1-C03 gene regions related to the phenotypic variation in plant height. Furthermore, coexpression network analysis showed that BnFBR12-A08 and BnCCR1-C03 were directly connected with hormone genes and transcription factors and formed a potential network regulating the plant height of rapeseed. Our results will aid in the development of haplotype functional markers to further improve plant height in rapeseed.

Background
Plant height, as one of the important agronomic traits in rapeseed, influences the establishment of an ideal plant type and high yield. The global production of cereal grains significantly increased in the 1960s and 1970s during the "Green Revolution," with the creation of semidwarf varieties with lodging resistance and high yields in rice and wheat through the modification of plant architecture (Hargrove and Cabanilla 1979;Khush 1999). Breeding practices in many crops have demonstrated that the application of semidwarf Abstract Plant height is a key morphological trait in rapeseed, which not only plays an important role in determining plant architecture, but is also an important characteristic related to yield. Presently, the improvement of plant architecture is a major challenge in rapeseed breeding. This work was carried out to identify genetic loci related to plant height in rapeseed. In this study, a genome-wide association study (GWAS) of plant height was performed using a Brassica 60 K Illumina Infinium SNP array and 203 Brassica napus accessions. Eleven haplotypes containing important candidate genes were detected Rui Ren and Wei Liu contributed equally to this work.

3
Vol:. (1234567890) varieties could effectively improve lodging resistance and reduce yield loss (Hedden 2003). Moderate dwarfing of plants can strengthen stalks, improve the harvest index, and improve the utilization rate of fertilizer, pesticide, and water (Shah et al. 2019). Therefore, it is of great significance to identify some important genes related to plant height and apply them to rapeseed breeding.
Plant height is mainly related to internode elongation, cell elongation, and apical meristem growth and differentiation. In rice and Arabidopsis model plants, the genetic mechanisms controlling plant height have been more thoroughly investigated than in any other plant. Previous studies have shown that homologs of Arabidopsis genes controlled similar metabolic pathways in polyploid plants; however, further elucidation of the molecular mechanism of plant height is limited due to the complexity of the polyploid genome. In polyploid rapeseed, some researchers have identified many quantitative trait loci (QTLs) associated with plant height in biparental populations in the past few decades. For example, 27 QTLs related to plant height were detected in a doubled haploid (DH) population of 348 inbred lines (Wang et al. 2015). Mei et al. (2009) identified seven QTLs associated with plant height, accounting for 8.5-28.6% of the phenotypic variation. A total of 25 QTLs related to plant height were identified by a DH population composed of Y689 and Westar (Shen et al. 2018). Previously reported QTLs have been a good way to locate plant height regulating genes, but using only two parents can influence mapping accuracy and limit further investigations into the molecular mechanisms of plant height. However, high-density SNPs and historical recombination events in large populations have provided the opportunity to methodically analyze the genetic architecture of plant height in crops by GWAS. Li et al. (2016) performed a GWAS to detect eight SNPs significantly associated with plant height by 472 rapeseed accessions with a 60 K single nucleotide polymorphism (SNP) array. Sixty-eight SNPs were identified as significantly associated with plant height by GWAS, and 48 SNPs overlapped the confidence interval of QTLs previously detected from biparental populations in rapeseed (Sun et al. 2016). A total of 27 stable quantitative trait nucleotides (QTNs) were identified significant associated with plant height in 4 environments in 286 soybean accessions (Han et al. 2021). A genome-wide association study identified 9 SNP makers significantly associated with plant height in 300 maize accessions (Zhang et al. 2019). Forty-three SNPs were consistently detected significantly associated with plant height in multiple environments in hexaploid wheat by GWAS (Muhammad et al. 2021).
At present, some studies have identified potential genes related to plant height in crops. For example, Zheng et al. (2020) used CRISPR/Cas9 editing technology to knock out the BnMAX1 gene, which is related to the synthesis of strigolactone and obtained shorter and more branched plants in B. napus. Li et al. (2019) suggested a weak gain-of-function mutation in iaa7.a03 that reduced the length of internodes in rapeseed. Yu et al. (2020) suggested that the introduction of the weak allele SD1-EQ from the japonica cultivar led not only to a substantial increase in plant height but also to an increase in yield per plant in the indica cultivar. The two semi-dwarf genes sdw1 and ari-e had an additive effect for plant height with height reductions up to 27 cm in barley (Dang et al. 2020). Dong et al. (2019) suggested that TaCOLD1 acts as a novel regulator of plant height through interfering with the formation of heterotrimeric G protein complex in bread wheat.
In this study, a GWAS was performed to identify eleven haplotypes associated with plant height in 203 B. napus accessions with a 60 K SNP array. Then, novel genes/loci affecting plant height in these eleven haplotype regions were further identified with 50 resequenced accessions and transcriptomes. These results will facilitate the development of haplotype functional markers to further improve plant height in rapeseed.

Plant material and phenotypic data
In this study, 203 semi-winter rapeseed accessions were collected from the breeding program of Southwest University in Chongqing, China (Table S1). All accessions were sown in a randomized complete block design at Gross Gerau (49.941000°N, 8.501391°E), Germany, and Rauischholzhausen (50.760932°N, 8.880663°E), Germany, in 2013 (designated as E1) and 2014 (designated as E2), respectively. We also planted these accessions in Chongqing 1 3 Vol.: (0123456789) (106.38°E, 29.84°N), China, in 2013 (designated as E3) and 2014 (designated as E4). Each line was planted in a three-row plot, with 0.05 m between plants within each row, 0.25 m between the rows, and 25 plants per row. Plant height was measured as the distance from the ground to the tip of the tallest panicle for each plot.
The R packages HMISC (Harrell and Dupont 2018) and PSYCH (Revelle 2017) were used to calculate and analyze the plant height phenotypic distribution, coefficient of variation, mean value, standard deviation, correlation coefficient, and minimum and maximum values under the four environments. The analysis of variance (ANOVA) was calculated with a general linear model procedure by the statistical software package SPSS Statistics (IBM Corp., Armonk, NY, USA), version 22. The broad sense heritability (H 2 ) for plant height was calculated via the following equation: , and 2 e are the genetic variance, variance of variety environment interaction, and residual variance components, respectively; n is the number of environments; and r is the repetitions of every environment. The best linear unbiased estimation (BLUE) under the mixed linear model was performed for plant height across all environments using the R package "lme4" (Bates et al. 2015). The BLUE value was also used for GWAS as phenotypic data.

Genome-wide association analysis
To guarantee the quality of SNP genotyping, the 60 K SNPs were filtered based on a call frequency ≥ 0.85, minor allele frequency ≥ 0.05, and heterozygosity ≤ 0.15. Finally, 24,984 SNPs were used to carry out GWAS of plant height in 203 B. napus accessions. The population structure of the 203 accessions was described in detail by Qian et al. (2014). TASSEL 5.0 software was used to calculate and analyze the relationships among the 203 accessions (Bradbury et al. 2007). To examine marker-trait associations using the mixed linear model (Q + K) as proposed by Yu et al. (2006), the calculation formula was as follows:y = + S + Pv + Zu + e , where y, µ, α, v, u, and e are the vector of phenotypic observations, the overall mean, a vector of the fixed allelic effects associated with the SNP under investigation, a vector of fixed population effects, a vector of random genetic background effects (relationships among individuals), and the vector of residuals, respectively. The matrix P contains information on population structure. S and Z are incidence matrices relating y to a and u, respectively. The variances of the random effects u and e are assumed to be normally distributed with u ~ N (0, G 2 g ) and e ~ N (0, R 2 g ), where G is a 203 × 203 genomic relationship matrix and R is a 203 × 203 matrix in which the off-diagonal elements are 0 and the diagonal elements are the reciprocal of the number of phenotypic observations per individual. The G matrix was computed according to the first method proposed by VanRaden (2008). Since every genotype was tested only once per environment and GWASs were performed for all environment-trait combinations separately, R corresponds to the identity matrix I. The R package CMplot was used to display Q-Q and Manhattan plots in GWAS of plant height (Yin et al. 2021). A significance threshold based on the false discovery rate (FDR) was used to avoid the loss of interesting loci under different environments (Benjamini and Hochberg 1995). The R package fdrtool (Strimmer 2008) was used to calculate the FDR threshold, and a threshold value of − log 10 (P) = 3.5 was used to estimate the significant association between the SNP and the plant height phenotype.

Regional association analysis
The resequencing of 50 rapeseed accessions was described in detail by Dong et al. (2018) and Yao et al. (2021). SNP loci with a heterozygous rate > 0.25 and a MAF < 0.05 were removed. Eventually, 533,046 high-quality SNPs in eleven haplotype regions were used for further analysis. The software package SPAGeDi (Hardy and Vekemans 2002) was employed to calculate the relative kinship. Principal component analysis (PCA, P) was calculated using the R package SNPRelate (Zheng et al. 2012). Based on the P and K matrices, the calculation was performed with the mixed linear model incorporated into TASSEL 5.0 software (Bradbury et al. 2007).
Haplotype analysis and gene content in the haplotype block regions The R package "LDheatmap" was used to identify haplotype blocks in the whole genome (Shin et al. 2006). Haplotype alleles with frequency > 1% were used to test for significant phenotypic differences by a two-sample t-test (assuming unequal variance). Two databases, the Brassica napus Darmorbzh reference genome v. 4.2 (Chalhoub et al. 2014) (accessed from https:// genom evolu tion. org/ CoGe/) and the Arabidopsis genome (http:// www. arabi dopsis. org/) were used to further analyze and verify the most likely gene functions from significantly associated haplotype regions.
Transcriptome sequencing and qRT-PCR verification Equivalent amounts of 2 cm stem tips from 20 accessions at the seedling stage were collected for RNA extraction, and three biological replicates were performed. These samples were immediately flash frozen in liquid nitrogen and stored at − 80 °C until RNA extraction. A total amount of 3 µg RNA per sample was used for library construction according to the manufacturer's instructions (Illumina Inc.). The Illumina HiSeq 2500 platform was employed to sequence all accession libraries and generate 125 bp paired-end raw reads. High-quality clean data was obtained by removing reads containing ploy-N or, adapters, and low-quality reads from raw data by fastp (Chen et al. 2018). HISAT2 (http:// ccb. jhu. edu/ softw are/ hisat/ index. shtml) was used to align the high-quality paired-end clean reads to the Darmor-bzh reference genome v.4.2 (accessed from https:// genom evolu tion. org/ CoGe/) using HISAT2 (Chalhoub et al. 2014). HTSeq 0.6.1 was used to count the reads mapped to each gene. The FPKM of each gene was calculated according to the length of each gene, and the read count of the gene was plotted (Anders et al. 2015).
Quantitative real-time PCR (qRT-PCR) was carried out to confirm the expression of haplotype genes of BnFBR12-A08 and BnCCR1-C03. The result analysis was performed on LightCycler 480 SYBR Green I Mastermix and a LightCycler 480II real-time PCR system (Roche, Switzerland). The transcript abundance calculated from three biological and three technical replicates. The B. napus β-actin gene (BnaC02g00690D) was used as a reference standard. The 2 -ΔΔCT method was used to estimate the relative expression level (Livak and Schmittgen 2001). The gene-specific primers sequences were listed in Table S2.

Coexpression analysis
The transcriptome data of stem tips from 20 semiwinter rapeseed varieties in China were collected to calculate the coexpression edges, and a soft threshold value of 0.9 was chosen in the weighted gene coexpression network analysis (WGCNA) R package (Langfelder and Horvath 2008). Genes with Pearson correlation coefficients (PCCs) ≥ 0.20 were used to visualize the coexpression network by CYTOSCAPE3.6 (Smoot et al. 2011). GO enrichment was carried out with TBtools (Chen et al. 2020) and visualized using the R package ggplot2 (Villanueva and Chen 2019).

Phenotypic analysis of plant height in 203 rapeseed accessions
In this study, we observed extensive phenotypic variation in plant height in the 203 B. napus inbred lines. The frequency distribution of plant height for the four different environments is summarized in Fig. 1. In the E1, E2, E3, and E4 environments, the plant height ranged from 74 to 161 cm, 77 to 163 cm, 112 to 214 cm, and 121 to 203 cm, with average values (± SD) of 119.66 ± 15.28 cm, 126.61 ± 16.91 cm, 171.86 ± 20.22 cm, and 168.88 ± 16.11 cm, respectively, and variable coefficients of 12.77%, 13.36%, 11.77%, and 9.54%, respectively (Table 1). Plant height was significantly positively correlated across all environments, with correlation coefficients of 0.31 to 0.82 (Fig. 1). The broad-sense heritability (H 2 ) of plant height was 0.67 (Table S3).
We also analyzed the phenotypic variation in plant height in 50 resequenced accessions in four different environments. Plant height was significantly correlated across the four different environments, with correlation coefficients of 0.24 to 0.77 (Fig. S1)

Genome-wide haplotype analysis of plant height
A total of 81 SNPs significantly associated with plant height were identified in four different environments and BLUE by GWAS (Fig. 2). These significant SNPs associated with plant height were investigated at a high resolution by assaying haplotype blocks (r 2 > 0.50) in flanking chromosome segments. Finally, we checked 11 haplotypes associated with plant height distribution on   Table S5).
Regional association analysis in haplotype regions Regional association analysis of 50 resequenced accessions detected three SNPs from the promoter region of FUMONISIN B1-RESISTANT12 (BnFBR12-A08, Bna-A08g19830D) that were associated with plant height on the haplotype block (2,580,486-2,752,416 bp) of chromosome A08 (Fig. 3a). Four haplotype alleles were identified in the BnFBR12-A08 gene region (Fig. 3a). A comparison of the four haplotype alleles showed that they corresponded to the plant height phenotype, and we identified BnFBR12-A08_Hap4 in the shortest accessions compared to other haplotype alleles (t-test: assuming unequal variance; Fig. 3b; Table S6). A similar study was performed on the haplotype region (49,305,972,021 bp) of chromosome Fig. 3 Regional association analysis on haplotype blocks using 50 re-sequenced rapeseed inbred lines. The green dots represent these SNPs located in BnFBR12-A08 (a) and BnCCR1-C03 (c) on chromosome A08 and C03, respectively. Four and three haplotype alleles showed frequency > 1% were detected in the BnFBR12-A08 (a) and BnCCR1-C03 (c) haplotype regions, respectively. (b, d) Boxplot showed phenotype comparative analysis among these haplotype alleles related to plant height phenotype in BnFBR12-A08 and BnCCR1-C03, respectively. *p ≤ 0.05

3
Vol.: (0123456789) C03. Two SNPs in the promoter region of CIN-NAMOYL COA REDUCTASE 1 (BnCCR1-C03; BnaC03g60490D) were associated with plant height by regional association analysis of 50 resequencing B. napus inbred lines (Fig. 3c). Three haplotype alleles were detected in BnCCR1-C03 (Fig. 3c). Comparative analysis of these three haplotype alleles related to the plant height phenotype, and revealed BnCCR1-C03_Hap1 corresponds to accessions that have the shortest plant height than those of the other haplotype alleles (t-test: assuming unequal variance; Fig. 3d; Table S6).
Based on the haplotypes of BnFBR12-A08 and BnCCR1-C03 gene regions corresponding to plant height phenotypes, two groups of extremely high and short accessions were selected. qRT-PCR analysis result showed that the expression levels of BnFBR12-A08 and BnCCR1-C03 in extremely high accessions were significantly higher than in short accessions (Fig. S4).

Coexpression network of candidate genes
To provide additional context for the proposed functions of BnFBR12-A08 and BnCCR1-C03, we used gene expression data from 20 stem tips to construct a coexpression network in B. napus. This analysis yielded 12 gene modules, each represented by a different color in the output (Fig. 4a). The modules containing candidate genes that had been detected by WGCNA, BnFBR12-A08 and BnCCR1-C03, were in the yellow and blue modules, respectively (Fig. S5). The yellow and blue modules showed significant negative and positive correlations with plant height, and the correlation coefficients were − 0.35 and 0.53, respectively (Fig. 4b).
We performed gene ontology (GO) enrichment analysis for the blue and yellow modules and created a bubble chart showing the top 10 GO terms for biological process, cellular component, and molecular function. The blue module was significantly enriched in genes involved in the response to cytokinin (GO: 0,009,735), protein transport (GO: 0,006,810), and so on ( Fig. S6a; Table S7), and the yellow module was notably enriched in genes involved in the abscisic acid-activated signaling pathway (GO: 0,009,738), cellular response to auxin stimulus (GO: 0,071,365), signal transduction (GO: 0,007,165), and so on ( Fig. S6b; Table S7).

Discussion
Rapeseed breeders are in the process of modifying plant height to obtain improved ideal plant types for increased yield and mechanized harvesting. In the model plant Arabidopsis, brassinosteroid, auxin, gibberellin, and strigolactone biosynthesis and signaling pathway genes are known to regulate plant height (Wang and Li 2008;Sun et al. 2010;Clouse 2011). Strong selection can cause multilocus tight linkage of genomic regions associated with phenotypic variation in complex traits. High-resolution genome analysis technologies provide an unprecedented level of insight into the local linkage disequilibrium (LD) patterns of complex crop genomes (Edwards et al. 2013;Voss-Fels and Snowdon 2016). In our study, eleven haplotype blocks carrying nineteen genes were involved in plant growth and development processes and were significantly associated with plant height under four different environments. We detected haplotypes containing the candidate genes SAUR30, TCP22, and GAI, which were identified by previous QTL mapping (Wang et al. 2015). The candidate gene BnSAUR30 is a small auxin-up RNA (SAUR) that belongs to the largest family of early auxin response genes. SAUR proteins are key effector outputs of hormonal and environmental signal regulation in plant growth and development (Ren and Gray 2015). TCP family genes are important transcription factors that regulate plant growth and development. TCP genes regulate plant growth by responding to GA signals. TCP22 has been shown to affect plant height in Arabidopsis (Davière et al. 2014). The candidate gene GAI regulates plant growth by negatively regulating the gibberellin (GA) signal transduction pathway. The gai mutant presents a short-plant phenotype in Arabidopsis (Dill and Sun 2001). We also identified some new QTLs containing the candidate genes BRI1 and SKP2A on chromosomes A05 and C09. A loss-offunction mutant of the brassinosteroid biosynthesisassociated gene BRI1 presents a dwarf phenotype in Arabidopsis (Noguchi et al. 1999). SKP2A affects plant growth and development by positively regulating cell division (Jurado et al. 2008). Moreover, regional association analysis detected SNP variations in BnFBR12-A08 and BnCCR1-C03 related to the phenotypic variation in plant height in the haplotype regions on chromosomes A08 and C03 in 50 resequenced rapeseed accessions. FBR12 encodes a putative eIF-5A-2 protein that regulates cell division and cell growth in a tissue-and development-specific manner (Feng et al. 2007). FBR12 affects the growth rate and organogenesis of Arabidopsis thaliana and reduces the number of leaves and adult organs, resulting in a dwarfing phenotype (Feng et al. 2007). Ren et al. (2013) suggested that the eukaryotic translation initiation factor eIF5A-2 regulates cytokinin signaling in Arabidopsis. CCR1 encodes a cinnamoyl CoA reductase that is involved in lignin biosynthesis (Lacombe et al. 1997). The Arabidopsis mutant ccr1 displayed a dwarf phenotype (Ruel et al. 2009). The ccr1 dwarf phenotype is the consequence of dramatically increased levels of FeA that keep the cellular oxidation state low, resulting in a prolonged phase of cell proliferation and dwarfism (Xue et al. 2015). Bonawitz and Chapple (2013) suggested that lignin quantity or composition are primarily responsible for dwarfing in plants. Previous studies have shown that auxin (Khadr et al. 2020), cytokinin (Guo et al. 2005), gibberellin (Biemelt et al. 2004), abscisic acid (Liu et al. 2021), and brassinosteroid (Rao and Dixon 2017) affect the biosynthesis of lignin. Moreover, the coexpression network showed a direct correlation between BnFBR12-A08 and BnCCR1-C03, which included many genes involved in auxin, cytokinin, gibberellin, abscisic acid and brassinosteroid biosynthesis, and signaling pathway. These results suggest that these two genes may be associated with plant hormone-related genes that coregulate the growth and development of oil rapeseed.
Haplotype blocks combining multiple SNPs show highly increased heterozygosity due to their inherent multiallelic nature and are thus much more informative than single biallelic SNPs (Stephens et al. 2001). Based on increasing quantities of population sequence data, more precise identification of gene variants in haplotypes will provide a basis for more precise selection of environmentally resilient cultivars. For instance, nine haplotypes were detected in ZmCCT , and Hap5 showed excellent performance for both stalk rot resistance and flowering time and is thus expected to have potential value in future maize breeding programs (Li et al. 2017). Three haplotypes were detected in ARF18 from chromosome A09, and ARF18-Hap.C accessions showed the largest seed weight and the longest silique length in rapeseed (Dong et al. 2018). Abbai et al. (2019) proposed developing next-generation tailor-made rice with superior haplotype combinations of target genes to meet future food and nutritional demands via haplotype-based breeding. In this study, we found that the favorable haplotype alleles BnCCR1-C03_Hap1 and BnFBR12-A08_Hap4 on chromosomes C03 and A08, respectively, contribute to the shortest plant heights. Our results will be beneficial for the development of haplotype functional markers to further improve plant height and provide important candidate gene information for the creation of dwarf accessions by using gene editing techniques in rapeseed.

Conclusion
Eleven haplotypes carrying nineteen genes were involved in plant growth and development processes and were significantly associated with plant height by GWAS, including 8 haplotype regions that overlapped with previously reported QTLs. We revealed that the SNP variations in BnFBR12-A08 and BnCCR1-C03 influenced plant height in 50 resequenced accessions and were variable in the transcriptome analysis of 20 accessions. BnFBR12-A08_Hap4 and Red nodes represent BnCCR1-C03 gene. These genes from the coexpression network are divided into the following categories: abscisic acid pathway (lime nodes), auxin pathway (cyan nodes), brassinosteroid pathway (lightcoral nodes), cytokinin pathway (pink nodes), ethylene pathway (magenta nodes), gibberellin pathway (lavender nodes), and transcription factor (dimgray nodes) ◂ BnCCR1-C03_Hap1 had the shortest plant height phenotypes compared with the other haplotypes. Our results will be beneficial for the development of haplotype functional markers to further improve plant height in rapeseed.