Detection of reproducible QTL associated with bioenergy traits in sorghum across several growing environments

Sweet sorghum (Sorghum bicolor L. Moench) is a promising bioenergy crop. To increase the productivity of this crop, marker-assisted breeding will be important to advance its genetic improvement. This study aimed to identify quantitative trait loci (QTL) associated with several bioenergy-related traits in sweet sorghum that include flowering time, plant height, total biomass, stem diameter, stem moisture percentage, and brix. We used 188 F7 recombinant inbred lines (RILs) derived from a cross between a sweet sorghum (Wray) and a grain sorghum (Macia). The RILs and their parental lines were grown at two locations. Application of genotyping-by-sequencing analysis of the RILs allowed for constructing a map with 979 single nucleotide polymorphisms. Using the inclusive composite interval mapping of additive QTL, a major QTL for a flowering time was detected on chromosome 6, and explained 29.45% of the phenotypic variances (PVE). Major QTL for plant height (29.51% PVE) and total biomass yield (16.46% PVE) were detected on chromosome 7, and QTL for stem diameter (9.43% PVE) was detected on chromosome 1. Several QTL for brix were associated with sugar transporter genes, providing candidate genes for further study. For example, a major QTL for brix (39.92% PVE) was detected on chromosome 3 consistently across four environments. Twenty one QTL for five traits were detected across four environments using ICIM-ADD. The identified QTL in this study should aid in developing lines and hybrids of sorghum that are suitable for producing bioenergy.


Introduction
Bioenergy is an alternative and renewable energy derived from biomass resources, which includes include wood and wood wastes, agricultural crops and their waste byproducts, municipal solid waste, animal wastes, waste from food processing and aquatic plants and algae (Demirbaş 2001). Bioenergy is expected to provide 30% of the world's energy by 2050 (Guo et al. 2015). Sweet sorghum (Sorghum bicolor) is a C4 drought-resistant species with potential bioenergy production (Calviño and Messing 2012;Carpita and McCann 2008;Zegada-Lizarazu and Monti 2013;Schittenhelm and Schroetter 2014). Sweet sorghum is highly productive, with low input requirements, and is drought-tolerant (Rooney et al. 2007;. Its drought tolerance enables it to reasonably sustain biomass production under water-limited conditions (Perrier et al. 2017). Sweet sorghum accumulates large amounts of carbohydrates in its stalk and produces total biomass as high as 30 Mg ha −1 yr −1 (Bihmidine et al. 2015;McKinley et al. 2018;Rooney et al. 2007). Stalk carbohydrates are easily converted to ethanol via the fermentation of stalk juice. The bagasse, a fibrous lignocellulosic material, can produce cellulosic ethanol, heat, power co-generation, or biofuel and thermo-electrical energy production.
While sweet sorghum has been developed as a sugar crop, its development for bioenergy purposes is rudimentary; hence, genetic improvement of sweet sorghum is expected to enhance its productivity. Towards this end, biotechnology and genomics will be important for the genetic improvement of sweet sorghum (Madhusudhana 2014;Rooney et al. 2007). Genomic regions linked to complex traits can be identified by genetic mapping and QTL (quantitative trait locus) analysis (Shehzad and Okuno 2014). The fundamental idea underlying QTL analysis is to associate genotype and phenotype in a population exhibiting genetic variation (Broman and Sen 2009).conclusion The most important target traits in sorghum breeding programs focusing on biofuel production include shortening crop cycle duration and increasing plant height, total biomass yield, and sugar or fiber contents (Regassa and Wortmann 2014). Such traits are quantitative in nature, controlled by many QTL (Mace et al. 2019), including plant height Guan et al. 2011;Murray et al. 2008aMurray et al. , 2008bRitter et al. 2008;Shiringani et al. 2010), biomass Murray et al. 2008a; Shiringani and Friedt 2011) stem diameter , and sugar content (Bian et al. 2006;Felderhoff et al. 2012;Guan et al. 2011;Lv et al. 2013;Murray et al. 2008a;Murray et al. 2008b;Ritter et al. 2008;Shiringani et al. 2010). In sorghum, several authors indicated plant height and maturity (Days to flowering), i.e., the number of days from sowing to 50% flowering, as the main determinants of biomass yield (Habyarimana et al. 2004;Upadhyaya et al. 2012;Kalpande et al. 2014), and can therefore be used for indirect selection for this trait. Plant height is the product of internode length and the number of nodes which are produced before flowering, which, in turn, is a consequence of growth duration and the rate of internode production. Sorghum biomass productivity is highly and positively correlated with maturity (days to 50% flowering) (Haussmann et al. 2002) and height Ritter et al. 2008;Zhao et al. 2009;Kebede et al. 2001;Ali et al. 2008;Upadhyaya et al. 2009). Plant height is independent of stem structural composition, i.e., cellulose, hemicellulose, and lignin content . A variety with thick and tall stems can be bred to contain more cellulose and less lignin. Overall, the genes influencing biomass productivity and sugar content in sweet sorghum remain unknown.
Recently, next-generation sequencing technologies have led to whole genome sequencing, making Genotype-by-sequencing (GBS) feasible for various species (Elshire et al. 2011). GBS has generated single-nucleotide polymorphisms (SNPs) and QTL mapping (Beissinger et al. 2013). Compared to SSRs and AFLP markers, GBS greatly improved the resolution of QTL mapping (Kong et al. 2018;Verma et al. 2015). Several QTL mapping studies of sweet sorghum bioenergy traits have already been reported (Mathur et al. 2017). However, these previous studies have generally applied low markers for mapping, providing too little resolution to narrow the number of possible candidate genes. Modern genotyping technologies based on next-generation sequencing (NGS), such as GBS (Elshire et al. 2011), can generate thousands of markers and help to increase genome saturation. Thus, the objectives of this study were (1) to identify high-resolution QTL associated with bioenergy-related traits in sweet sorghum using GBS, and (2) to confirm previously identified QTL with higher resolution in an independent genetic background derived from a cross between Macia (a grain sorghum) and Wray (a sweet sorghum). QTL were detected for the bioenergy-related traits of sweet sorghum: brix (a measure of sugar content in stem juice), biomass yield, flowering time, plant height, stem diameter, and stem moisture content for 188 F 7 recombinant inbred lines (RIL).
These higher-resolution QTL could help identify genes that influence the biomass productivity of sweet sorghum and could be used to develop molecular markers to select individuals with valuable bioenergy-related traits within a breeding population using these markers.

Field experiment
One hundred ninety lines, including the two parental lines, were planted under rain-fed conditions at the University of Nebraska-Lincoln experimental farms at Havelock and Mead, Nebraska, in the summer seasons of 2012 and 2013. In 2012, the planting dates were May 23rd at Havelock and May 25th at Mead. In 2013, the planting dates were May 24th at Havelock and May 23rd at Mead. The experimental design was an alpha lattice incomplete block design with 15 incomplete blocks of 14 plots each per replication (15 × 14 alpha lattice), with two replications in each environment. A plot was a single row measuring 4.5 m long, with 0.75 m between rows. A single row was sown at a rate of 50 seeds per row.

Phenotyping of bioenergy-related traits
Flowering time was measured as the days from planting until 50% of the plants within a plot were shedding pollen. In 2012, harvesting dates were September 18th-26th at Havelock and October 2nd-5th at Mead. In 2013, harvesting dates were September 23rd-27th at Havelock and October 1st-9th at Mead. On average, harvesting was performed 125 days after planting. Five middle plants within a plot were arbitrarily selected at harvesting times to measure plant height, total biomass, stem diameter, and brix.
Plant height was measured as the distance in cm from the plant's base to the panicle's tip.
Total biomass in Mg ha −1 was collected when plants reached their harvesting time by cutting the plants near the soil surface and separating them into panicles (heads), leaves, bottom stems (the length of 0-20 cm above ground), and remaining stems. The sub-samples were bagged and weighed immediately to obtain the wet weight and placed into an oven at 120-160°F for ten days to completely dry the samples. Dried subsamples were re-weighed and summed up to obtain the total dry weight. Total biomass was calculated as follows: Stem diameter (cm) and brix readings were obtained from the bottom stem sections between the first and the second node before drying. Brix degree (°Brix) as a measure of soluble solids (mainly sucrose) in the stem juice was measured using a handheld refractometer (MASTER-T Brix 0-32% with ATC, Atago Co., LTD, Tokyo, Japan). Because of the high volume workload in the field, stem diameter, and brix readings were measured in the lab within 2-3 days after harvest. The bottom stems were kept in covered plastic boxes and put in a cold room (4 °C) to minimize sugar metabolism and to maintain brix level until the measurement.

Phenotypic analysis
Analysis of variance (ANOVA) for bioenergy-related traits was performed for each environment and combined environments, replications, years, and genotypes using the PROC MIXED procedure (Littell et al. 2006)  following statistical model: where is the observed trait of genotype obtained in the block within the replicate, is the population means, is the effect of the genotype, is the effect of the replicate, is the effect of the incomplete block within the replicate, and is a random error and the analysis of genetic, environment and genetic x environment variance based on the following model: where is the population mean, fixed; is the effect of the genotype, is the effect of environment, is the genotype x environment interaction effect and is the residual effect. Pearson's correlation coefficients between traits were calculated using the PROC CORR procedure of SAS. Frequency distribution for bioenergy-related traits was performed for each environment using the PROC CAPABILITY. Narrowsense heritability with the standard error was determined using SAS code which is available at http:// www4. ncsu. edu/ ~jholl and/ heritability/Inbreds.html (Holland et al. 2003) following the formula; where represents heritability, is the genotypic variance component, is the genotype x environment variation, is the experimental error variance, is the number of replications, and is the number of environments.

Genotyping-by-sequencing (GBS)
One hundred ninety genomic DNA samples, including 188 RILs plus two parental lines (Macia and Wray), were sent to the Institute for Genomic Diversity, Cornell University, for GBS. The DNA samples were genotyped by GBS (Elshire et al. 2011) using the Illumina HiSeq 2000/2500 (100 bp, single-end reads). The 190 genomic DNA samples were isolated from 20-day-old leaf tissue using a magnetic bead DNA extraction method (Xin and Chen 2012). The DNA samples were diluted to 100 ng/µl and 40 µl of each sample submitted for GBS. The raw sequence reads (FASTQ files) were analyzed using the TASSEL4 GBS pipeline (Bradbury et al. 2007). The raw sequence reads were aligned to the reference sequence for sorghum (Sbi1) (Paterson et al. 2009) using Burrows-Wheeler Aligner, BWA (Li and Durbin 2009). Illumina provided the formatted reference sequence file for BWA. The TASSEL4 GBS pipeline provided the HapMap file containing high-throughput SNPs (10,424 SNPs). The HapMap file was imported into Microsoft Excel 2010 for coding markers following ICiMapping format (Wang et al. 2012).

Linkage map construction
Linkage maps were constructed by IciMapping 4.2 . Before linkage map construction, binning of redundant markers using the ICiMapping BIN function was performed to calculate the missing rate. Linkage maps were pre-constructed to evaluate a 1:1 segregation ratio of markers using the chi-square test (P ≤ 0.01). SNPs with ≥ 80% missing rate and significant SNPs from the chi-square test were removed from the data set. Finally, genetic linkage maps were reconstructed with 979 selected SNPs.
QTL analysis QTL analysis on bi-parental populations (BIP) was performed by IciMapping 4.2. Inclusive composite interval mapping of additive QTL (BIP-ICIM-ADD) was used to conduct the QTL mapping for each environment, and mapping of additive genes from multienvironmental trials (MET-ICIM-ADD) was used for the analysis of the combined environment. Genetic linkage maps mean phenotypic data for bioenergyrelated traits for 188 RILs across four environments, and genotypic data of 979 SNPs were used for QTL analysis. The Kosambi mapping function was used to convert recombination frequency to mapping distance.
Step in scanning was assigned at one centiMorgan (cM). A thousand-permutation test was applied to each data set to determine LOD (log 10 of the likelihood odds ratio) threshold (P ≤ 0.01), with LOD score > 3 used to determine significant QTL. The phenotypic variation explained by the marker (%PVE) and the additive effect of QTL were estimated by IciMapping.

Phenotypic data analysis
Significant differences (P < 0.01) were observed among genotypes and genotype x environment interactions for all traits measured in each environment and combined environments (Table 1). Wray had greater means than Macia for all traits except stem diameter. The mean values for all traits of RILs were between the parental lines  Fig. 1). The narrow-sense heritability (h 2 ), estimated for all traits and environments, ranged from 0.14 for stem moisture content at Havelock in 2013 to 0.96 for plant height at Havelock in 2013 (Table 1). Total biomass yield correlated best with plant height (r = 0.67), while brix was negatively correlated with stem diameter (r = − 0.13) and stem moisture content (r = − 0.41) ( Table 2).

Linkage map construction
A total of 979 SNPs from GBS were mapped onto 10 linkage groups (chromosomes) based on the physical position of the SNPs. The linkage map spanned a total of 1,523.46 cM ( Table 4). The maximum distance between markers was 37.97 cM (Chr 6) ( Table 4). The length of the linkage groups ranged from 81.56 cM (Chr 9) to 199.93 cM (Chr 1) ( Table 4).

QTL analysis
QTL for six bioenergy-related traits were analyzed, including flowering time, plant height, stem moisture content, biomass yield, stem diameter, and brix. Twenty-one QTL for five traits was detected for combined environments using ICIM-ADD. None of the QTL for stem moisture content were detected for combined environments. The QTL locations for four environments and combined environments are presented in Table S1,

QTL for plant height
A total of four QTL were detected for plant height on chromosomes 1, 6, and 7, and explained 3.10-29.51% PVE. Of the four QTL, three QTL on chromosomes 6 and 7 were major and had negative effects due to Macia ranging from − 17.53 to − 24.84 cm, while the QTL on chromosome 1 had an 8.04 cm additive effect due to Wray. The major QTL with the highest PVE (29.51%) was on chromosome 7 and was detected consistently across four environments (Table 5).

QTL for stem moisture content
None of the QTL for stem moisture content were detected for combined environments, but two major QTL for stem moisture content were detected individually at Mead in 2012 on chromosomes 3 and 7 (Table S1). The QTL located on chromosome 3 explained 11.82% PVE and had a 0.91% additive effect due to Wray, while the QTL located on chromosome 7 explained 12.56% PVE and had a − 0.94% additive effect due to Macia.

QTL for biomass yield
Four QTL were detected for total biomass yield on chromosomes 4, 6, and 7, and explained 5.72-16.46% PVE. Of the four QTL, three QTL on chromosomes 6 and 7 were major QTL. The major QTL on chromosome 7 (flanking markers; S7_58245093-58,601,476) was also the major QTL identified for plant height.
The QTL with the highest % PVE (16.46%) was on chromosome 7 and was detected consistently across three environments (Havelock 2012, Mead 2012, and Havelock 2013. A major QTL located on chromosome 6 (10.93% PVE) was detected consistently across four environments. All QTL for total biomass yield had negative additive effects ranging from − 0.97 to − 1.64 Mg ha −1 .

QTL for stem diameter
Five QTL for stem diameter were detected on chromosomes 1, 4, and 6, and explained 4.81-9.43% PVE. The QTL with the highest PVE was on chromosome

QTL for brix
There were six QTL associated with brix (Table 5), which were located on chromosomes 2, 3, 4, and 7 and explained 5.28-39.92% PVE. Two QTL on chromosome 3 were major QTL. The major QTL with the highest PVE was on chromosome 3 (39.92% PVE) and was detected consistently across all four environments. There were negative additive effects of the QTL detected on chromosomes 2, 3, and 7 due to Macia, which ranged from − 0.62 to − 1.71. QTL detected on chromosome 4 had positive additive effects due to Wray ranging from 0.79 to 0.82.

Discussion
The sorghum genetic map   QTL for plant height associated with dw2 and dw3 Four known loci influence plant height in sorghum (dwarfing gene) dw1 on chromosome 9 (Sobic.009G229800: 57,038,653-57,041,166), dw2 on chromosome 6 (unknown), dw3 on chromosome 7 (Sobic. 007G163800: 59,821,905-59,829,910), and dw4 on chromosome 6 (unknown) (Multani et al. 2003;Quinby 1974;Yamaguchi et al. 2016). Only dw1 and dw3 have been cloned and characterized. A major QTL for plant height in the current study was identified on chromosome 7 (PH7_2, S7_58245093-58,601,476, 82.13-87.37 Higgins et al. (2014). This region is also the putative region containing the dw2 gene reported by recent studies using GBS (Gelli et al. 2016, 41.97-43.22 Mb;Higgins et al. 2014, 41.23-44.34 Mb; Morris et al. 2013, 42.26 Mb). These results indicated that dw2 and dw3 might contribute QTL for plant height on chromosomes 6 and 7 in this study. The QTL on chromosomes 6 and 7 were major QTL and had negative additive effects showing that dw2 and dw3 were transferred from Macia, the shortgrain sorghum parent line used in this study.

QTL for flowering time associated with Ma1 and SbEHD1
Sorghum flowering time or maturity genes were previously identified at six loci, designated Ma1 ( maturity gene) on chromosome 6 (locus name: SbPRR37, Plant height and flowering time are highly related traits in grasses, in which apical growth is terminated by flowering (Lin et al. 1995). QTL associated with plant height are linked with loci controlling flowering and have been previously reported in wheat (Börner et al. 1993), maize (Khairallah et al. 1998;Thornsberry et al. 2001), rice (Yan et al. 2011), barley (Chen et al. 2009), and sorghum (Börner et al. 1993;Higgins et al. 2014;Khairallah et al. 1998;Lin et al. 1995;Madhusudhana and Patil 2013;Morris et al. 2013;Shiringani et al. 2010;Thornsberry et al. 2001). In this study, a QTL for plant height (PH6, S6_41974107-42,311,647, 6.89-8.56 cM) co-localized with a major QTL for flowering time on chromosome 6. This result supports the finding that the dwarfing locus dw2 is linked to the maturity gene, Ma1, which had been investigated in sorghum (Higgins et al. 2014;Lin et al. 1995;Madhusudhana and Patil 2013;Morris et al. 2013;Shiringani et al. 2010).
We also found that total biomass yield correlated best with plant height (r = 0.67), and major QTL for these traits co-localized on chromosomes 6 and 7. Similarly, Ritter et al. (2008), Shiringani et al. (2010), and Shiringani and Friedt (2011) also reported the colocalization on chromosome 6, while Murray et al. (2008b) and Felderhoff et al. (2012) reported the colocalization on chromosome 7. The results indicated that total biomass yield is highly determined by plant height, which was determined by the duration of growth before flowering.
QTL for brix associated with SbSWEET1A, SbSUT5, and SbSUT6 Sugar content in this study was measured using brix. Brix is used to practically analyze total solute content (mostly sugars) (Kawahigashi et al. 2013). Ritter et al. (2008) suggested that brix is a simpler phenotypic trait than quantifying sucrose and sugar content, which are more time-consuming and expensive to measure. Moreover, they identified similar QTL. In the current study, QTL for brix was identified on chromosomes 2, 3, 4, and 7. Similar results were also observed in previous QTL mapping for brix on chromosome 2 (Guan et al. 2011;Shiringani et al. 2010), chromosome 3 Guan et al. 2011;Murray et al. 2008b), chromosome 4 (Bian et al. 2006Felderhoff et al. 2012;Lekgari 2010;Shiringani et al. 2010), andchromosome 7 (Bian et al. 2006;Guan et al. 2011;Lekgari 2010;Murray et al. 2008b;Shiringani et al. 2010). The QTL for brix identified in the present study may be associated with known candidate genes controlling sugar Euphytica (2023)  content in sorghum. Genes suggested to control the accumulation of sugar in sorghum stems have been characterized, including six sucrose transporter genes (SbSUTs), three tonoplast sugar transporter genes (SbTSTs), and twenty SWEET genes (SbSWEETs), (Bihmidine et al. 2015(Bihmidine et al. , 2016Braun 2012;Braun and Slewinski 2009;Eom et al. 2015;Milne et al. 2013;Qazi et al. 2012). Intriguingly, we found that the major QTL detected on chromosome 3 (BR3_1, S3_68877230-69,154,987, 158.76-176.55 Lekgari (2010). These results suggest that the genes SbSWEET1A, SbSUT5, and SbSUT6 may underlie the QTL for brix identified on chromosomes 3, 4, and 7. Milne et al. (2013) reported that SbSUT5 and SbSUT6 might function in sucrose phloem loading in leaves and sucrose accumulation in stems. However, a previous qRT-PCR expression analysis of SbSUT5 and SbSUT6 in Wray and Macia stems did not support the hypothesis that these genes play a prominent role in sucrose phloem loading or stem sugar accumulation (Bihmidine et al. 2015). Further functional analyses of these genes should test these hypotheses and determine their contributions to whole-plant carbohydrate partitioning and sugar accumulation in stems. It is also noteworthy that although we did not detect similar QTL in our analyses, several of the other studies identified QTL for brix that overlap the position for sugar transporter genes. For example, on chromosome 2, Guan et al. (2011) found a QTL for brix near the SbSWEET11B gene, and on chromosome 3, Felderhoff et al. (2012) and Guan et al. (2011) identified QTL containing the SbSWEET2A and SbSWEET6 genes. Additionally, on chromosome 4, Bian et al. (2006), Lekgari (2010), Shiringani et al. (2010), and Felderhoff et al. (2012) identified brix QTL that are in the vicinity of SbSWEET15, SbSWEET4A, SbSWEET4B, SbSWEET4C, and SbTST2. Future research will investigate the molecular functions of these genes and their potential involvement in regulating stem sugar content.
Finally, we also found another QTL for brix on chromosome 2 that was not correlated with known sucrose transport genes in sorghum. Shiringani et al. (2010) reported a QTL for brix on chromosome 2 at positions 30-40 cM, while the QTL for brix located on chromosome 2 in this study was at positions 56.3-59.4 cM. These QTL for brix will be contributors to identifying other sugar transporter genes, sugar metabolism genes, or novel genes regulating sugar transport or metabolism.
QTL for stem moisture content QTL for all traits were detected for combined environments, except QTL for stem moisture content. The reliability of the confidence intervals associated with QTL locations depends on the heritability of the individual QTL (Kearsey and Farquhar 1998). QTL related to stem moisture content could not be detected in combined environments, but they were detected in one environment (Mead 2012) on chromosomes 3 and 7. Stem moisture content is easily affected by the environment. The heritability of stem moisture content for combined environments was as low as 0.25, whereas that of Mead in 2012 was 0.48. The low heritability of the stem moisture content QTL is likely why we failed to detect them in the analysis. This result is similar to that found by Felderhoff et al. (2012). They reported no QTL for percent moisture across all environments, but they found a QTL for moisture on chromosome 3 at only one of four environments.
Validity of the study QTL detected in this study were similar in genomic regions with previous studies, but the genomic regions were narrowed in this study. This increases our confidence in candidate gene identification and reduces the number of genes needing subsequent characterization to a feasible level. QTL for stem diameter were detected on chromosomes 1, 4, and 6. Two QTL on chromosome 1 were located at the same position as QTL reported by Shehzad and  Okuno (2015) and Shiringani et al. (2010). Still, both regions from this study were smaller than the previous studies.
In conclusion, GBS increased the precision of the QTL analysis, and candidate genes associated with the phenotypes could be identified for many QTL. The bioenergy-related QTL in sweet sorghum detected in this study were in genomic regions associated with known genes linked to the traits brix, biomass yield, plant height, and flowering time. Other QTL we identified overlap with previous studies. Tantalizingly, the association between SbSWEET1A and the major QTL for brix on chromosome 3 suggests a potential mechanism underlying sugar accumulation in the stem of sorghum and possibly other bioenergy grasses. Future validation of the bioenergy-related QTL and associated candidate genes should be conducted for the functional genomic improvement of bioenergy-related traits in sweet sorghum.

Main conclusion
By constructing a grain x sweet sorghum QTL mapping population, we identified genomic regions associated with bioenergy-relevant traits. Additionally, we identified a major QTL for brix (39.92% PVE) that is localized on chromosome 3, and detected consistently across four environments using genotyping by sequencing.
Author contributions K.T. conducted the experiments and drafted the manuscript. I.D. designed and oversee the experiment executions. D.B. and B.B. contributed ideas to the experiments. All authors read and approved the final manuscript.
Funding This work is being funded by a grant from the Plant Feedstock Genomics for Bioenergy # DE-SC0006810 to DMB et al. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Data availability Additional datasets generated by this study are included in the Supplementary Materials.

Declarations
Competing interests The authors declare no competing interests.

Consent for publication
The authors consented for the publication.