Linkage Mapping of Biomass Production and Composition Traits in a Miscanthus sinensis Population

Breeding miscanthus for biomass production and composition is essential for targeting high-yielding genotypes suited to different end-uses. Our objective was to understand the genetic basis of these traits in M. sinensis, according to different plant ages and environmental conditions. A diploid population was established in two locations according to a staggered-start design, which distinguished the plant age effect from climatic condition effect. An integrated genetic map of 2602 SNP markers distributed across 19 LGs was aligned with the M. sinensis reference genome and spanned 2770 cM. The QTL mapping was based on best linear unbiased predictions estimated across three climatic conditions and at least three ages in both locations. A total of 260 and 283 QTL were related to biomass production and composition traits, respectively. In each location, 40–60% were related to biomass production traits and stable across different climatic conditions and ages and 30% to biomass composition traits. Twelve QTL clusters were established based on either biomass production or composition traits and validated by high genetic correlations between the traits. Sixty-two putative M. sinensis genes, related to the cell wall, were evidenced in the QTL clusters of biomass composition traits and orthologous to those of sorghum and maize. Twelve of them were differentially expressed and belonged to gene families related to the cell wall biosynthesis identified in other miscanthus studies. These stable QTL constitute new insights into marker-assisted selection (MAS) breeding while offering a joint improvement of biomass production and composition traits.


Introduction
Miscanthus is a perennial C4 crop that produces valuable lignocellulosic biomass, mainly for bioenergy end-uses, biobased products, and animal bedding [1][2][3][4][5]. However, only one clone of a Miscanthus × giganteus interspecific hybrid (2n = 3x = 57) is currently available for commercialization in Europe and the USA [6,7]. It originates from a natural interspecific cross between a tetraploid M. sacchariflorus (2n = 4x = 76) and a diploid M. sinensis (2n = 2x = 38) [8] and obtains high biomass yields under various environmental conditions [9,10]. As M. × giganteus is sterile, the potential risk of invasiveness from the seeds spreading to a new environment is avoided, but this hampers the breeding of the crop [11,12]. Moreover, such narrow genetic variability can be risky in case of pest adaptation, and the corresponding phenotypic variability may not be sufficient for the different end-use requirements [8]. The two parents of M. × giganteus, originating from East Asia, present high genetic variability and are adapted to extended environmental conditions [13][14][15][16]. For example, M. sinensis reaches the same amount of biomass production as M. × giganteus under specific conditions [17]: this makes it a relevant candidate for breeding new varieties at an intraspecific level, or for creating new M. × giganteus clones at an interspecific level, despite its self-incompatibility [18]. Breeding miscanthus aims to improve biomass production and composition traits by creating new cultivars, adapted to a range of different environments. However, the optimal time to collect reliable phenotypic data for these traits is a major point of concern in miscanthus breeding. Indeed, the yield-building phase of the plants can last 3 years after crop establishment, and even this can vary between miscanthus species [19]. The plants then reach a yield plateau phase during which biomass production is more stable [20]. 
Selecting plants only on their phenotype during the yield-building phase may thus be unreliable (Raverdy et al., submitted to BioEnergy Research), especially since the different progenies may reach their full growth potential at different times. Marker-assisted selection programs would thus be a helpful tool for improving the breeding efficiency in miscanthus, as they would make it possible to select young plants on the basis of their genetic information, without waiting for them to reach the yield plateau phase.
Because of its self-incompatibility, miscanthus is an outcrossing species with a high level of heterozygosity. Therefore, the genetic mapping methods that were initially developed based on inbred line populations have to be adapted to full-sib (F1) populations, as done for other perennial crops such as sugarcane or rubber tree [21][22][23]. Full-sib diploid populations have a maximum of four different segregating alleles per locus. This makes the haplotype phase estimation more complex than for inbred line populations, for which a maximum of two alleles segregate.
The initial traditional method for building genetic maps in full-sib populations was the "pseudo-testcross" strategy [24], which was based on separate linkage maps for each parent. Later, Wu et al. [25,26] developed a method that generates an integrated genetic map and for which the estimation of linkage distances and phases between markers and locations of the quantitative trait loci (QTL) are improved. Moreover, Gazaffi et al. [27] proposed a method based on composite interval mapping and multipoint genetic mapping using molecular markers with different segregation patterns. This last method can reveal QTL that segregate in any pattern and identify dominance effects.
The initial miscanthus genetic map was developed by Atienza et al. [28] based on an M. sinensis population, by using random amplified polymorphic DNA (RAPD) markers, and was composed of 28 linkage groups (LGs). This unsaturated map was used to detect QTL for morphological traits, biomass yield, and combustion quality [29][30][31][32]: however, the 28 LGs that constitute the genetic map, compared to the base chromosome number in miscanthus (x = 19), as well as the small population size of 89 F1 individuals, make it difficult to compare the QTL with recent studies in which saturated maps are presented with the expected number of 19 LGs. Ma et al. [33] created a high-resolution linkage map of M. sinensis and identified 19 LGs for the first time. Gifford et al. [34] identified 72 QTL associated with biomass productivity, by using the integrated genetic map of M. sinensis constructed by Swaminathan et al. [35]. It was also made up of 19 LGs. Dong et al. [36] developed six high-density parental genetic maps using the pseudo-testcross strategy and two consensus maps which integrated M. sinensis and M. sacchariflorus. Using these maps derived from three interconnected miscanthus populations, they identified 109 to 288 QTL for 14 biomass production traits which mapped into 86 to 157 meta-QTL. Van der Weijde et al. [37] constructed two parental maps of M. sinensis using also the pseudo-testcross strategy and identified 86 QTL related to biomass composition and conversion efficiency traits. Most of these studies identified QTL of M. sinensis biparental populations derived from crosses between different heterozygous parents. These populations were evaluated during 2 years in one location. In apple trees, another perennial species, Segura et al. [38] carried out QTL mapping on an F1 progeny and demonstrated that the QTL detected are related to genetic, ontogenetic, and climatic factors: this was possible by a staggered-start design [39][40][41], which partitioned the year effect into age and growing season effects. The growing season effect itself corresponded to soil and climatic condition effects.
In a previous study, we estimated the heritability values and genetic and phenotypic correlations of an M. sinensis population. We highlighted moderate to high heritability values for biomass production and composition traits. Both age and climatic condition effects were considered, based on a staggered-start design carried out in two contrasted locations (Raverdy et al., submitted in BioEnergy Research). We thus expected the phenotypic data of this population to be relevant for undertaking QTL mapping. The objectives of the present study were then (1) to construct an integrated linkage map based on single-nucleotide polymorphism (SNP) markers according to the miscanthus reference genome and (2) to detect QTL for biomass production and composition traits and identify stable QTL, while considering the range of years (i.e., climatic conditions), ages, and locations evaluated. To reach these goals, the same population has therefore been analyzed, based on the staggered-start design that was established in two contrasting locations. The growing season effect corresponded to climatic condition effect as the stands were staggered twice in a single field in both locations. A genotyping-by-sequencing (GBS) approach was initially used to discover SNP markers according to the alignment with the miscanthus reference genome that was released in 2017 [42]. A next step was the development of an integrated linkage map for the population. QTL were then detected for both biomass production and composition traits over 5 consecutive years. To our knowledge, it is the first miscanthus genetic map that considers the alignment with the miscanthus reference genome. It is also the first miscanthus study for which the QTL of biomass production and composition traits are jointly detected for more than 2 years and in two contrasted locations. 
This will make it possible to assess marker-trait associations during and after the theoretical establishment phase of miscanthus and according to different ages and climatic conditions. Based on the staggered-start design per location, the "year" effect was partitioned into "plant age" and "climatic condition" effects, which made it possible to detect QTL related to such conditions for the first time in miscanthus.

Mapping Population and Experimental Design
Two diploid ornamental M. sinensis cultivars, "Malepartus" (Mal) and "Silberspinne" (Sil), were crossed in order to get an F1 full-sib progeny for mapping studies. These two ornamental parents putatively originate from central to southern Japan [15,16]. Each seed of the population was germinated in vitro, which provided 248 initial genotypes. These plants were propagated in vitro according to a protocol of shoot organogenesis and regeneration [43]. All seedlings were planted in a greenhouse that offered suitable growing conditions before being transplanted to the field. Due to variable genotype ability for in vitro propagation, only 157 genotypes were available for the field trial. However, 248 genotypes were available for genotyping.
The F1 full-sib progeny and the two parental genotypes were cultivated with single plants, in two locations in France. These locations presented contrasting soil and climatic conditions. Experimental design in both locations was based on a staggered-start design [39]. One of the experiments was established at the "GCIE" INRAE (National Institute for Research on Agriculture, Food and Environment) experimental unit of Estrées-Mons (49°53′N, 3°00′E) in a deep loam soil (Orthic Luvisol according to the World Reference Base for Soil Resources, WRB). The other experiment was established at the "GBFOr" INRAE experimental unit of Orléans (47°49′N, 1°54′E) in a sandy soil (Dystric Cambisol, WRB). Each staggered-start design was made up of two stands or groups of genotypes that were organized in two plots established in 2 successive years in the same field: the first group of genotypes (G1) was established in 2014, while the second group (G2) was established in 2015 ( Fig. 1). In each location, each plot was adjacent to the other plot in the field and separated by a border row: soil conditions were therefore similar between the two groups. In Estrées-Mons, 157 genotypes and the two parents were cultivated, with 82 genotypes common to each group (G1 and G2) ( Fig. 1 and Table S2). The two parents and 104 genotypes of those cultivated in Estrées-Mons were also cultivated in Orléans. These 106 genotypes included 59 common to G1 and G2 ( Fig. 1 and Table S2). Finally, 57 genotypes were common both to G1 and G2 and between locations (Table S2). The number of genotypes was unbalanced due to the recalcitrance of some genotypes concerning the propagation and establishment steps, as described previously. In both locations and each plot that corresponded to each group of genotypes sequentially established, single plants were organized in an incomplete randomized block design [44] with five blocks. 
The genotypes were thus replicated in four of the five blocks on average, except the two parents, which were replicated in all blocks of the two stands. Plant density was 1 plant per square meter, with single plants equally spaced 1 m apart within and between rows. For both locations, more details about the experimental design and the climatic conditions that correspond to the plant cycle are available in Raverdy et al. (submitted in BioEnergy Research).

Development of Single-Nucleotide Polymorphism Markers
The genomic DNA of the 248 individuals of the progeny and of the two parents was extracted from seedlings at the INRAE Gentyane Genomic Service platform (Clermont-Ferrand, Puy-de-Dôme, France), using a sbeadex™ livestock kit (LGC Group, United Kingdom). A GBS approach was used to discover SNP markers which corresponded to the population. It was carried out according to the protocols of Elshire et al. [45] and Cormier et al. [46], at the CIRAD (French Agricultural Research Centre for International Development) Genotyping platform core facility (GPTR, Montpellier, Hérault, France). The 96-plex libraries were prepared by digestion of the 250 DNA extractions using the PstI restriction enzyme. The single-end sequencing was then carried out in three lanes on a HiSeq™ 3000 sequencer (Illumina® Inc., San Diego, CA, USA), at the "GenoToul GeT" platform (Auzeville, Haute-Garonne, France). The quality check of the reads was conducted using the "FastQC" software v.0.11.2 [47].

Fig. 1 (caption fragment): The total number of M. sinensis genotypes (including the two parents) is indicated for each staggered-start design. The number of genotypes for each group (G1 and G2) and the genotypes common to both groups (at the intersection of blue and red circles) are also specified for each location.
The "TASSEL 5 GBS" pipeline [48] was used in order to analyze the sequencing data. The raw reads of the individuals were initially grouped in tags. By using the "Bowtie 2" v.2.3.2 software [49], tags with a minimum count of 10 reads were retained and aligned with the M. sinensis reference genome sequence that was released in December 2017 [42]. The resulting variant call format file, which contains the markers and information regarding the individuals, was then filtered. The filtering was carried out using the "vcf2pop.1.0.py" software [50] and was based on a minor allele frequency threshold of 0.05 and a maximum of 25% missing data per marker. Three marker types were available for the genetic map construction: first, markers that were heterozygous in both parents (ab × ab), called "Bridge" (Bri) markers and that segregated in a 1:2:1 ratio (aa, ab, bb); second, "Mal" markers that were heterozygous in the Malepartus parent and homozygous in the Silberspinne parent (ab × aa); and lastly, "Sil" markers that were homozygous in Malepartus and heterozygous in Silberspinne (aa × ab). The latter two marker types segregated in a 1:1 ratio (aa, ab). According to the classification of Wu et al. [25], "Bri" markers were named B3.7, "Mal" markers were named D1.10, and "Sil" markers were named D2.15.
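The filtering thresholds and the three marker classes described above can be illustrated with a minimal Python sketch. It is not the "vcf2pop.1.0.py" implementation: the function names, the genotype encoding (the strings "aa", "ab", "bb", with None for a missing call), and the choice to keep markers whose minor allele frequency is exactly at the threshold are assumptions made for the example.

```python
from typing import Optional, Sequence

MAF_MIN = 0.05       # minor allele frequency threshold given in the text
MAX_MISSING = 0.25   # maximum fraction of missing calls per marker


def passes_filters(calls: Sequence[Optional[str]]) -> bool:
    """Return True if a marker meets the MAF and missing-data thresholds.

    `calls` holds one genotype per individual: "aa", "ab", "bb" or None.
    """
    observed = [c for c in calls if c is not None]
    if not calls or not observed:
        return False
    missing_rate = 1 - len(observed) / len(calls)
    if missing_rate > MAX_MISSING:
        return False
    # Count copies of allele "b" among the observed diploid genotypes.
    b_count = sum(c.count("b") for c in observed)
    freq_b = b_count / (2 * len(observed))
    return min(freq_b, 1 - freq_b) >= MAF_MIN


def marker_type(mal: str, sil: str) -> Optional[str]:
    """Classify a marker from the Malepartus and Silberspinne genotypes."""
    if mal == "ab" and sil == "ab":
        return "B3.7"    # "Bri": expected 1:2:1 (aa, ab, bb)
    if mal == "ab" and sil == "aa":
        return "D1.10"   # "Mal": expected 1:1 (aa, ab)
    if mal == "aa" and sil == "ab":
        return "D2.15"   # "Sil": expected 1:1 (aa, ab)
    return None          # not informative in this cross
```

For instance, `marker_type("ab", "aa")` returns the D1.10 code used for "Mal" markers, while a marker that is homozygous in both parents is reported as uninformative.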

Genetic Map Construction
An integrated genetic map was constructed based on the mapping population, by using the "OneMap" R package [51,52]. First, the redundant markers were removed from the analysis. The remaining markers were tested according to expected Mendelian segregation by using a chi-square test with a global α = 0.05, corrected for multiple testing with Bonferroni correction. The recombination fraction between all pairs of markers was then determined according to two-point tests [25]. The markers were then grouped based on their position on the reference genome chromosomes and a maximum recombination frequency of 0.35. Based on this grouping, 19 linkage groups (LGs) were obtained, which corresponded to the base chromosome number of miscanthus (x = 19). A high homology was highlighted between the markers grouped in each LG and their original position in accordance with the alignment with the reference genome chromosomes. For example, 90% of the markers grouped in LG1 were initially identified in correspondence with chromosome 1 of the miscanthus reference genome sequence. The marker grouping for each of the 19 LGs was thus refined, by only keeping the markers that were in accordance with the corresponding chromosome of the miscanthus reference genome. This ensured that each marker was grouped in the right LG.
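To make the segregation test concrete, the sketch below computes a chi-square goodness-of-fit p-value for one marker and applies a Bonferroni-corrected threshold. It is an illustration only and not the OneMap code: the function names are hypothetical, the p-values use the closed-form chi-square tail for 1 and 2 degrees of freedom (sufficient for the 1:1 and 1:2:1 ratios), and the number of tests passed to the correction is left to the user because the text does not state it.

```python
import math
from typing import Sequence


def chi2_pvalue(stat: float, df: int) -> float:
    """Upper-tail chi-square p-value for 1 or 2 degrees of freedom."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or 2 are needed for these marker types")


def segregation_pvalue(counts: Sequence[int], ratio: Sequence[float]) -> float:
    """Goodness-of-fit test of observed genotype counts against a ratio.

    Example: counts of (aa, ab, bb) against ratio (1, 2, 1).
    """
    n = sum(counts)
    total = sum(ratio)
    expected = [n * r / total for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    return chi2_pvalue(stat, df=len(counts) - 1)


def is_distorted(p_value: float, n_tests: int, alpha: float = 0.05) -> bool:
    """Bonferroni correction: compare each p-value with alpha / n_tests."""
    return p_value < alpha / n_tests
```

A "Bri" marker would be tested with `segregation_pvalue(counts, (1, 2, 1))` and a "Mal" or "Sil" marker with `segregation_pvalue(counts, (1, 1))`.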
The segregation-distorted markers that were kept for the optimization of the marker grouping step were then discarded for marker ordering and phasing. This met the QTL mapping model assumption of Mendelian segregation. Marker ordering was then tested according to three different methods, and the Kosambi mapping function was used [53]. These different marker ordering methods were based either on heuristic algorithms or physical positions within the miscanthus reference genome. They had to be tested independently in order to find the best marker order among them. The best order was defined with the inspection of the expected pattern in the resulting recombination fraction matrix between markers, visualized in heatmaps. The ordering methods have already been investigated in different studies, with the aim of getting the best marker order: it was yielded either by the marker ordering algorithms or the physical positions within a reference genome [54,55].
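For reference, the Kosambi mapping function mentioned above converts a recombination fraction r into a map distance d = (1/4) ln[(1 + 2r)/(1 − 2r)] Morgans. The following short Python sketch implements this conversion and its inverse; the function names are chosen for the example.

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance (cM) for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(2 * d_cm / 100.0)
```

Under this function, the maximum recombination frequency of 0.35 used for grouping corresponds to a distance of roughly 43 cM.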
The first marker ordering method consisted of the ordering of the most informative markers (1:2:1) using an exhaustive search tool. It consisted of comparing all possible orders, and the remaining markers were positioned according to the "TRY" algorithm [56]. The resulting marker order was unsatisfactory according to the heatmaps (data not shown), even though the "RIPPLE" algorithm [56] was used in order to improve it. A second marker ordering method was thus tested based on the multi-dimensional scaling (MDS) method [57,58] that was implemented in the OneMap software [51,52]: although it improved the marker order, this approach did not provide the means to get a satisfactory marker order (data not shown). However, this approach made it possible to refine marker filtering, by removing some markers according to the principal curves method from the "MDSMap" R package [57,58]. Finally, the third method consisted in ordering the markers according to their positions identified within the miscanthus reference genome. In addition, the marker order was adjusted based on recombination fractions between the markers. This final method was retained, as it yielded the best marker order quality among the three methods.
Once the ordering was defined, the genetic distance was estimated based on multipoint approaches using hidden Markov models [56] that consider outcrossing species [26]. The presence of genotyping errors was managed, as often carried out in mapping studies [54,[59][60][61]. Thus, a genotyping error probability of 5% was considered in the hidden Markov model emission function. This function was implemented in the OneMap software, which made it possible to consider uncertainties between observed and estimated genotypes [62].
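One common way to introduce a global genotyping error into a hidden Markov model is to give the observed call a probability of 1 − e and to spread e evenly over the other possible genotypes. The sketch below illustrates this idea with e = 0.05; it is an assumption made for illustration, and the exact emission function implemented in OneMap may differ. The function name and genotype labels are hypothetical.

```python
from typing import Dict, Optional, Sequence


def emission_probs(
    observed: Optional[str],
    states: Sequence[str],
    error: float = 0.05,
) -> Dict[str, float]:
    """P(observed call | true genotype) for each possible true genotype."""
    if observed is None:
        # A missing call carries no information about the true genotype.
        return {s: 1.0 for s in states}
    if observed not in states:
        raise ValueError("observed genotype is not a valid state")
    if len(states) == 1:
        return {observed: 1.0}
    off_target = error / (len(states) - 1)
    return {s: (1 - error) if s == observed else off_target for s in states}
```

For a "Bri" marker with states (aa, ab, bb), an observed "ab" call would thus be weighted 0.95 for a true "ab" genotype and 0.025 for each of the two alternatives.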

Phenotypic Data Analysis of Biomass Production and Composition Traits
Five biomass production traits and five biomass composition traits (expressed as a percentage of dry matter, %DM, or cell wall content, %CW) were studied. These phenotypic data were acquired over 5 successive years between 2014 and 2019 in Estrées-Mons and 4 successive years between 2014 and 2018 in Orléans (Fig. 2). In order to name each trait in a relevant manner, a miscanthus ontology was developed at the INRAE BioEcoAgro research unit of Estrées-Mons (https://urgi.versailles.inra.fr/ephesis/ephesis/ontologyportal.do) by using the GnpIS multispecies integrative information system from the INRAE-URGI of Versailles [63]. Four morphological traits were evaluated: canopy height (CH_cm), plant maximum height (HMax_cm), plant stem number (PSNb), and plant circumference (C50_cm). The aboveground biomass yield (ABM_tDMha) was measured after the winter harvest in late February and was expressed as tDM/ha. The biomass composition-related traits were assessed based on near-infrared spectroscopy (NIRS) predictions for all plants of the population and a set of calibration samples for which the composition traits were assessed. These samples were analyzed by the LANO laboratory (Saint-Lô, France) according to a protocol adapted from the Van Soest method [64] and described in Belmokhtar et al. [65]. Three fractions were determined: neutral detergent fiber (NDF), acid detergent fiber (ADF), and acid detergent lignin (ADL). The NDF fraction, which corresponds to the cell wall content (CW), is considered to represent cellulose, hemicelluloses, and lignin. The ADF consists of cellulose and lignin, and the ADL consists of lignin [64]. The cellulose content (CL) was estimated by subtracting ADL from ADF, hemicelluloses content (HEM) was obtained by subtracting ADF from NDF, and finally, lignin content corresponded to ADL. 
The dry matter content of each calibration sample was determined at 103 °C to express all of the previous values (NDF, ADF, ADL, CL, and HEM) as a percentage of dry matter (%DM). The traits were also expressed as a percentage of cell wall content (%CW), except for NDF.
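The derivation of the composition traits from the three Van Soest fractions can be sketched as follows. The subtraction rules come from the text; the conversion to %CW by dividing by NDF is the usual convention and is an assumption here, as are the function names and the numerical values in the usage note.

```python
from typing import Dict


def fiber_fractions(ndf: float, adf: float, adl: float) -> Dict[str, float]:
    """Derive composition traits (all in %DM) from NDF, ADF and ADL."""
    return {
        "CW": ndf,          # cell wall content corresponds to NDF
        "CL": adf - adl,    # cellulose = ADF - ADL
        "HEM": ndf - adf,   # hemicelluloses = NDF - ADF
        "ADL": adl,         # lignin corresponds to ADL
    }


def as_percent_cw(value_dm: float, ndf_dm: float) -> float:
    """Re-express a %DM value as a percentage of cell wall content (%CW)."""
    return 100.0 * value_dm / ndf_dm
```

With purely illustrative values of NDF = 85, ADF = 55, and ADL = 10 %DM, the sketch gives CL = 45 and HEM = 30 %DM, and the three components sum to 100 %CW once divided by NDF.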
For each location considered separately, the staggered-start design made it possible to analyze the phenotyping data by distinguishing the "plant age" effect from the "climatic condition" effect, according to two linear mixed models [66]. An initial model (1), which takes into account the "age" effect, was applied to three data subsets in each location (2016-year, 2017-year or 2018-year data subsets in Fig. 2). A second model (2), which accounts for the "climatic condition" effect, was used with three other data subsets in Orléans (age 1, age 2, or age 3 data subsets in Fig. 2), and a fourth additional subset in Estrées-Mons (age 4 data subset). The two models were assessed using the restricted maximum likelihood (REML) approach, known to be suitable for analyzing unbalanced datasets, and the corresponding function was implemented in the "breedR" R package [67]. Model 1 was specified as follows:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \quad (1)$$

in which $Y_{ijk}$ represents the phenotypic value measured on plant k of genotype i at age j; $\mu$ is the overall mean; $\alpha_i$ is the random effect of genotype i; $\beta_j$ is the fixed effect of age j; $(\alpha\beta)_{ij}$ is the random interaction between genotype i and age j; and $\varepsilon_{ijk}$ is the random residual for plant k of genotype i at age j. Model 2 was specified as follows:

$$Y_{ikl} = \mu + \alpha'_i + \gamma_l + (\alpha'\gamma)_{il} + \varepsilon'_{ikl} \quad (2)$$

where each term is similar to those of Model 1, except for the age effect $\beta_j$, which is replaced by the climatic condition effect $\gamma_l$ of year l, and the interaction between genotype i and age j, $(\alpha\beta)_{ij}$, which is replaced by the interaction between genotype i and the climatic condition effect of year l, $(\alpha'\gamma)_{il}$. The terms $\alpha'_i$ and $\varepsilon'_{ikl}$ were different from the effects given by the previous Model 1 because they were estimated based on different subsets of the data. 

Fig. 2 (caption fragment): The year of establishment of each group is specified in brackets below each group name.
In both models, spatial effects were accounted for using an autoregressive correlation structure, based on x and y coordinates in the field, to partition the covariance matrix of each residual $\varepsilon_{ijk}$ and $\varepsilon'_{ikl}$ into a spatially dependent component and an independent remaining residual variance [68]. In order to minimize the border effect, the trial was surrounded by one row of border plants, which were accounted for in the estimation of the autoregressive model. However, the corresponding genotypes were discarded for subsequent QTL detections, once the best linear unbiased predictions (BLUP) of all genotypes of the trial were calculated. Block and spatial effects were both initially included in each model. As the models that only considered the spatial effect yielded the best Akaike information criterion (AIC) [69], the model was simplified and the block effect was finally not considered. It can be noted that the low performance of a statistical model that includes a block effect reinforced the similarity in the soil conditions between both stands of the staggered-start design. Based on each model described above, the BLUP of the random genotype (G) effect were estimated in order to carry out the QTL mapping of miscanthus biomass production and composition traits. For each location, the BLUP of the G effect that were estimated using Model 1 were related to the climatic condition of each studied year (i.e., data subset). They were considered as independent from the age effect between G1 and G2 groups, i.e., represented by the fixed effect of the age and the random interaction between genotype and age (Fig. 2). Reciprocally, the BLUP of the G effect estimated using Model 2 were related to each age (i.e., data subset). They were considered as independent from the climatic condition effect between G1 and G2 groups, i.e., represented by the fixed effect of the climatic condition and the random interaction between genotype and climatic conditions (Fig. 2). 
For biomass production and composition traits, the G effects estimated based on these two models were previously shown to account for the highest part of the variance components analyzed according to the staggered-start design in each location (Raverdy et al., submitted to BioEnergy Research) (Figs. 3 and 4).
For each trait and each condition, the distribution of the BLUP along with the parental values and phenotypic means were observed: as the data were normally distributed, no data transformation was needed (Fig. S5a, S5b, S5c, S5d and S5e).

Quantitative Trait Loci Mapping
The BLUP of the five biomass production traits and the five biomass composition traits (expressed as %DM or %CW) previously cited were used for QTL mapping. Each data subset of the staggered-start design per location was considered separately (Fig. 2). As miscanthus is an outcrossing species, a specific CIM model adapted to full-sib progeny was carried out by using the "fullsibQTL" R package [27,70]. Outcrossing segregation patterns were considered in the model based on a multipoint approach which estimated three genetic effects according to three contrasts: two contrasts concerned the additive effects of the QTL alleles for each parent, and the third one was related to the intra-locus interaction (dominance) between additive effects on each parent. The conditional multipoint probabilities of QTL genotypes were obtained for every 1-cM interval. For each linkage group, the cofactors were selected using multiple linear regression with a stepwise procedure. The associated model selection was based on the AIC [69]. The QTL were defined for a threshold based on a significance level of 5% across distributed LOD scores that were obtained by selecting the second LOD profile peak from 1000 permutations [71]. However, although some QTL were defined according to the same method, the corresponding threshold was based on a significance level of 10% in order to detect a supplemental set of stable QTL between conditions (Fig. S1). The QTL impacted by this methodology were displayed with a "*" in the results section (Fig. 5). The three genetic effects previously detailed, the linkage phases, segregation patterns, and the proportion of genotypic variation explained by each QTL (R²) were estimated based on the CIM model. The QTL confidence intervals were calculated using the 2-LOD drop-off method [72]. As QTL mapping was carried out for each climatic condition and each age in each location separately, it was possible to highlight QTL that were stable according to these conditions. 
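The two numerical steps described above, the permutation-based LOD threshold and the 2-LOD drop-off interval, can be sketched in Python as follows. This is a simplified illustration and not the fullsibQTL implementation: the function names are hypothetical, the list of peak LOD scores (one value per permutation, for example the second-highest peak of each permuted profile) is assumed to be computed elsewhere, and the interval function handles a single peak per LOD profile.

```python
import math
from typing import Sequence, Tuple


def permutation_threshold(peak_lods: Sequence[float], alpha: float = 0.05) -> float:
    """Empirical (1 - alpha) quantile of per-permutation peak LOD scores."""
    ranked = sorted(peak_lods)
    # Smallest value with at least (1 - alpha) of the values at or below it;
    # the small epsilon guards against floating-point noise in the product.
    idx = math.ceil((1 - alpha) * len(ranked) - 1e-9) - 1
    return ranked[max(idx, 0)]


def lod_drop_interval(
    positions: Sequence[float],
    lods: Sequence[float],
    drop: float = 2.0,
) -> Tuple[float, float]:
    """Interval around the highest peak where LOD >= peak LOD - drop."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]
```

Calling `permutation_threshold(peaks, alpha=0.10)` gives the relaxed threshold used for the supplemental set of QTL, and `lod_drop_interval` applied to a profile evaluated every 1 cM returns the bounds of the confidence interval in cM.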
The QTL were defined as stable for a given trait when they co-localized under at least two conditions (Fig. 3). QTL could be stable across different climatic conditions in each location, for example, a QTL detected in 2017 and 2018 in Estrées-Mons. QTL could also be stable across different ages in each location, for example, if a QTL was detected at age 3 and age 4 in Estrées-Mons. Moreover, QTL could also be stable across climatic conditions and ages in a given location: for example, when a QTL was detected in 2017 and at age 4 in Estrées-Mons. Finally, stable QTL were also identified when they co-localized under at least four conditions across the two locations: for example, (1) two QTL detected under two climatic conditions in each location, (2) two QTL detected for two ages in each location, and, lastly,  (3) two QTL detected for one climatic condition and one age in each location (Fig. 3). For each trait, the proportion of stable QTL across climatic conditions, ages, and/or locations was then determined: in Estrées-Mons, for example, the number of stable QTL that was detected under at least two climatic conditions was divided by the total number of QTL detected across the different climatic conditions in Estrées-Mons, in order to calculate the corresponding percentage.
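The stability rule for a single location can be expressed in a few lines of Python. In this sketch, co-localization is taken to mean that two QTL for the same trait lie on the same LG with overlapping confidence intervals, and the stable proportion is counted per detected QTL; both are assumptions made for the example, as are the class and function names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QTL:
    trait: str
    lg: str
    start_cm: float
    end_cm: float
    condition: str  # e.g. "2017" (climatic condition) or "age4" (age)


def colocalize(a: QTL, b: QTL) -> bool:
    """Same linkage group and overlapping confidence intervals."""
    return a.lg == b.lg and a.start_cm <= b.end_cm and b.start_cm <= a.end_cm


def is_stable(q: QTL, qtls: Sequence[QTL], min_conditions: int = 2) -> bool:
    """True if q co-localizes with same-trait QTL under enough conditions."""
    conditions = {q.condition}
    conditions |= {
        o.condition for o in qtls if o.trait == q.trait and colocalize(q, o)
    }
    return len(conditions) >= min_conditions


def stable_fraction(qtls: Sequence[QTL]) -> float:
    """Proportion of detected QTL that are stable."""
    if not qtls:
        return 0.0
    return sum(is_stable(q, qtls) for q in qtls) / len(qtls)
```

Stability across the two locations could be checked in the same way by pooling the QTL of both locations and raising `min_conditions` to 4.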
QTL clusters were identified for a given climatic condition or a given age in each location. These clusters corresponded to the co-localization of at least three QTL for different traits: the biomass composition traits that were either expressed as %DM or %CW were not considered as different traits. QTL clusters were made up of biomass production traits, biomass composition traits, and both biomass production and composition traits. The reliability of these QTL clusters was verified according to the Pearson correlation coefficients based on BLUP. They were computed by using the "stats" R package and visualized using the "corrplot" R package [73].
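As an illustration of the cluster definition, the sketch below groups the QTL detected on one LG under one condition into chains of overlapping confidence intervals and keeps the chains that involve at least three different traits, with the %DM and %CW versions of a composition trait counted once. The chaining rule, the "_pctDM" and "_pctCW" trait suffixes, and the function names are assumptions made for the example. A plain Pearson coefficient is included for the correlation check; the study itself used the "stats" R package.

```python
import math
from typing import List, Sequence, Tuple

Interval = Tuple[str, float, float]  # (trait, start_cM, end_cM)


def base_trait(name: str) -> str:
    """Strip the hypothetical %DM / %CW suffix from a trait name."""
    for suffix in ("_pctDM", "_pctCW"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def find_clusters(
    intervals: Sequence[Interval], min_traits: int = 3
) -> List[Tuple[float, float, List[str]]]:
    """Return (start, end, traits) for each chain with enough distinct traits."""
    groups: List[List[Interval]] = []
    group: List[Interval] = []
    group_end = float("-inf")
    for trait, start, end in sorted(intervals, key=lambda q: q[1]):
        if group and start > group_end:
            groups.append(group)          # gap found: close the current chain
            group, group_end = [], float("-inf")
        group.append((trait, start, end))
        group_end = max(group_end, end)
    if group:
        groups.append(group)
    clusters = []
    for g in groups:
        traits = sorted({base_trait(t) for t, _, _ in g})
        if len(traits) >= min_traits:
            clusters.append((min(s for _, s, _ in g), max(e for _, _, e in g), traits))
    return clusters


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```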

Identification of Putative Cell Wall-Related Genes in M. sinensis
Putative cell wall-related genes were identified in M. sinensis based on the orthologous relationships between M. sinensis and sorghum and between M. sinensis and maize. These two species are indeed relatives of miscanthus and can be considered as genetic models for miscanthus. This offers the opportunity to take into account the genetic knowledge that is currently available for maize and sorghum.
Orthologous M. sinensis genes were initially identified based on two cell wall-related gene lists, hereafter referred to as search lists, that were composed of 2148 candidate genes for sorghum and 2470 for maize (Virlouvet, personal communication). Secondly, these M. sinensis genes that are located between the two markers flanking each QTL cluster, i.e., flanking the region where the most extreme QTL confidence intervals within a cluster overlapped, were selected. This selection was based on the annotation file Msinensis_497_v7.1.gene.gff3 from Phytozome, the Plant Comparative Genomics portal of the Department of Energy's Joint Genome Institute (https://phytozome.jgi.doe.gov), which contains 67,789 genes. Among these positional M. sinensis genes, the genes that were orthologous to cell wall-related genes in sorghum and maize could finally be identified by comparison with the initial search lists.
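The positional selection of genes can be sketched as a simple scan of a GFF3 annotation. The sketch assumes the standard nine-column, tab-separated GFF3 layout, keeps only "gene" features lying entirely between the two flanking marker positions, and then intersects them with a set of M. sinensis gene identifiers that have cell wall-related orthologs. The function names, the requirement that a gene be fully contained in the region, and the structure of the ortholog set are assumptions made for this illustration.

```python
from typing import Iterable, List, Set


def genes_in_region(
    gff3_lines: Iterable[str], chrom: str, start: int, end: int
) -> List[str]:
    """Names of gene features located entirely within [start, end] on chrom."""
    genes = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene" or cols[0] != chrom:
            continue
        g_start, g_end = int(cols[3]), int(cols[4])
        if g_start >= start and g_end <= end:
            # Column 9 holds semicolon-separated key=value attributes.
            attrs = dict(
                item.split("=", 1) for item in cols[8].split(";") if "=" in item
            )
            genes.append(attrs.get("Name", attrs.get("ID", "")))
    return genes


def cell_wall_candidates(positional: Iterable[str], orthologs: Set[str]) -> List[str]:
    """Keep positional genes that appear in the ortholog-based search set."""
    return [g for g in positional if g in orthologs]
```

In practice the lines would be read from the annotation file, for example with `open("Msinensis_497_v7.1.gene.gff3")`, and the region bounds would be the physical positions of the two markers flanking a QTL cluster.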

Single-Nucleotide Polymorphism Calling and Filtering
The sequencing of GBS libraries produced 1,161,537,843 raw reads based on a read length of 150 bp. Nineteen individuals out of the 248 were removed from the analysis, due to the low quality of the corresponding sequencing data: a total of 229 individuals was thus considered for the genetic map construction. The SNP calling and the alignment with the M. sinensis reference genome made it possible to identify a final set of 149,741 biallelic SNP markers. The filtering of these markers then provided a selection of 9330 high-quality SNP markers available for genetic mapping. The details of the markers according to their segregation type and coding [25] are available in Table 1.

Genetic Map Construction
The grouping of the SNP markers resulted in 3774 SNP markers distributed across 19 LGs. Based on the initial marker dataset, redundant markers were removed, and markers unlinked to any of the 19 LGs were thus not considered for the analysis. It can be noted that LG12 was partitioned into two LGs, because it was not possible to group all three marker types together due to the presence of only four B3.7 available markers (i.e., heterozygous in both parents): for that reason, LG12a was made up of "Bri" and "Mal" markers on the one hand, while LG12b was made up of "Bri" and "Sil" markers on the other. Finally, all LGs corresponded to the 19 chromosomes of the miscanthus reference genome, when checking each marker provenance according to their physical position in relation to the original alignment step. Segregation-distorted markers were initially kept in order to optimize the grouping phase, but they were then removed before the ordering step: LGs. Moreover, a supplemental analysis using the principal curves method led to the removal of an extra set of 290 markers. After the ordering of the SNP markers according to the physical positions within the miscanthus reference genome, a few badly ordered markers were moved or removed according to the recombination fractions between them. The final integrated map was made up of 2602 markers and had a total length of 2770 cM (Table 2 and Fig. S2). The different markers on the map were as follows: 613 "Bri" markers of B3.7 type, 1256 "Mal" markers of D1.10 type, and 733 "Sil" markers of D2.15 type. In the 19 LGs, LG1 was the largest with a length of 217.3 cM and LG12a and LG12b were the shortest with a length of 49.3 and 32.6 cM, respectively. The average inter-marker distance was 1.06 cM when considering the 19 LGs.
LG4 showed the highest density with a mean interval of 0.72 cM between markers, which was in accordance with the highest number of 271 markers mapped on this LG (Table 2). For each LG, the heatmap showed the good quality of the marker order (Fig. S3).
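As a quick consistency check on the map statistics, the reported average inter-marker distance can be recomputed from the total map length and the number of markers. The sketch below is illustrative only; the function name `mean_marker_spacing` is hypothetical, and the way the authors computed the average is not stated, so both conventions are shown.

```python
def mean_marker_spacing(total_length_cm, n_markers, n_groups=0):
    """Average spacing in cM.

    With n_groups=0 the map length is divided by the marker count.
    With n_groups>0 it is divided by the number of intervals,
    i.e. markers minus one per linkage group.
    """
    return total_length_cm / (n_markers - n_groups)


# Values reported for the integrated map: 2770 cM and 2602 markers.
print(round(mean_marker_spacing(2770, 2602), 2))      # 1.06
# Counting intervals over 20 groups (LG12 split into LG12a and LG12b).
print(round(mean_marker_spacing(2770, 2602, 20), 2))  # 1.07
```

The reported value of 1.06 cM matches the first convention (map length divided by the number of markers).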

Transgressive Segregation Was Observed for Biomass Production and Composition Traits
The distributions of the BLUP and phenotypic means of the progeny were observed for biomass production and composition traits for each condition (Fig. 4; Fig. S4; Fig. S5a, S5b, S5c, S5d and S5e). These conditions were related to the climatic condition that occurred across the year, the age of the plants, and the location in which they were established. This resulted in a total of 13 conditions. Parental BLUP were reported for each trait and each condition. For both biomass production traits and composition traits, transgressive segregation was observed for each condition, except for the plant stem number (PSNb). In fact, the two parents of the population were chosen and crossed based on their highly contrasted stem number, as shown in Fig. 4. High genetic variability was observed for each of the biomass production traits: this was illustrated at age 3 by the BLUP range expressed as a percentage of its corresponding phenotypic mean. These percentages ranged between 35% for the plant maximum height (HMax_cm) in Estrées-Mons and 186% for the aboveground biomass yield (ABM_tDMha) in Orléans. This variability tended to increase over the years (i.e., climatic conditions) and ages for most of the biomass production traits.
Regarding biomass composition traits, their genetic variability was lower than the genetic variability observed for biomass production traits: the BLUP range, expressed as a percentage of its corresponding phenotypic mean, varied between 5% for NDF_%DM in Estrées-Mons and 30% for ADL_%DM in Orléans. In contrast to the biomass production traits, this genetic variability did not increase across the years or ages.
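The variability indicator used above, i.e., the BLUP range expressed as a percentage of the corresponding phenotypic mean, can be written as a short Python function. This is a sketch under stated assumptions: the function name `blup_range_pct` is hypothetical and the numerical values are invented for illustration, not taken from the study.

```python
def blup_range_pct(blups, phenotypic_mean):
    """BLUP range (max minus min) as a percentage of the phenotypic mean."""
    return 100.0 * (max(blups) - min(blups)) / phenotypic_mean


# Hypothetical BLUP for three genotypes (same unit as the trait) and a
# hypothetical phenotypic mean of 10 for the same trait and condition.
print(blup_range_pct([-1.5, 0.2, 2.0], 10.0))  # 35.0
```

Because only the difference between the extreme BLUP enters the formula, the result is the same whether BLUP are expressed as deviations or as adjusted genotype means.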

Stable QTL Were Mainly Detected Across Climatic Conditions and Ages for Biomass Production Traits, While Few Stable QTL Were Detected for Biomass Composition Traits
A total of 260 QTL was detected for biomass production traits (Tables 3 and S1) and 283 QTL for biomass composition traits (Tables 4 and S1). The number of QTL was reported for each trait under each condition (Tables 3 and 4). For each location, stable QTL (i.e., QTL that co-localized under at least two conditions) were highlighted across the years, ages, and for both years and ages. An example is given with the solid red triangle in LG8 (37 cM) in Fig. 5: in Estrées-Mons, three stable QTL for aboveground biomass yield (ABM_tDMha) co-localized in 2016, 2017, and 2018, and three QTL also co-localized for age 2, age 3, and age 4. Six QTL thus co-localized for this trait and were related to years and/or ages.
For biomass production traits considered in each location, the average proportion of stable QTL detected according to different climatic conditions or different ages was similar and around 30% (Table 3). These proportions were mainly consistent between traits and conditions. The stable QTL were not necessarily the QTL that displayed the highest percentage of genotypic variance accounted for (R²), or the most significant QTL. In each location, the proportion of stable QTL detected by considering years and ages together was higher than the proportion detected when years or ages were considered separately. This proportion was 59% in Estrées-Mons and 39% in Orléans. Thus, QTL detected in a given year (e.g., for the climatic condition in 2017 when G1 plants were 3 years old and G2 plants were 2 years old), were also detected for a given age (i.e., age 3 for plants grown in 2017 for G1 and 2018 for G2). Regarding the stability of QTL across the two locations, no stable QTL were identified for biomass production traits (Table 3).
For the biomass composition traits evaluated in each location, the proportion of stable QTL was globally higher for the QTL related to the different climatic conditions than for the QTL related to the different ages (Table 4). However, these proportions were relatively low and never exceeded 15% on average, with some traits showing no stable QTL at all. As for biomass production traits, the consideration of successive climatic conditions and ages together made it possible to increase the proportion of stable QTL for biomass composition traits to up to around 30% in each location. No stable QTL were identified across the two locations, similarly to biomass production traits (Tables 4 and 5).
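The notion of stable QTL (QTL for a given trait that co-localize under at least two conditions) can be made concrete with a short Python sketch. It is an illustration rather than the procedure applied in the study: the names `QTL`, `colocalize`, `stable_qtl`, and `stable_proportion` are hypothetical, the confidence intervals are invented, and treating two QTL as co-localized when their confidence intervals overlap on the same LG is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTL:
    trait: str
    lg: str
    ci_start: float  # confidence interval bounds, in cM
    ci_end: float
    condition: str   # a year (climatic condition) or an age


def colocalize(a, b):
    """Assumed criterion: same LG and overlapping confidence intervals."""
    return a.lg == b.lg and a.ci_start <= b.ci_end and b.ci_start <= a.ci_end


def stable_qtl(qtls):
    """QTL that co-localize with a QTL of the same trait in another condition."""
    return [
        q for q in qtls
        if any(
            o.trait == q.trait and o.condition != q.condition and colocalize(q, o)
            for o in qtls
        )
    ]


def stable_proportion(qtls):
    """Share of stable QTL among all QTL (0.0 for an empty list)."""
    return len(stable_qtl(qtls)) / len(qtls) if qtls else 0.0


# Hypothetical QTL: three co-localize on LG8 across years, one is isolated.
demo = [
    QTL("ABM_tDMha", "LG8", 30, 42, "2016"),
    QTL("ABM_tDMha", "LG8", 33, 40, "2017"),
    QTL("ABM_tDMha", "LG8", 35, 45, "2018"),
    QTL("ABM_tDMha", "LG4", 10, 20, "2017"),
]
print(stable_proportion(demo))  # 0.75
```

Passing years, ages, or both as the `condition` field reproduces the three ways of counting stability that are compared in Tables 3 and 4.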

QTL Clusters Were Identified for Biomass Production Traits and Biomass Composition Traits
Many QTL for one trait co-localized with QTL for another trait under each of the 13 conditions. Therefore, QTL clusters were defined when at least three QTL for different traits co-localized. This led to the identification of 12 QTL clusters, which were either specific to biomass production traits, biomass composition traits, or both (Fig. 5 and Table 6). Five QTL clusters were identified for biomass production traits and were located in LG4, LG7, and LG18 (Fig. 5 and Table 6). Four of these clusters were made up of QTL for canopy height (CH_cm), plant circumference (C50_cm), and aboveground biomass yield (ABM_tDMha). One of the clusters, located in LG7, was made up of QTL for canopy height, plant maximum height (HMax_cm), and aboveground biomass yield. All these clusters for biomass production traits were identified in Estrées-Mons, in 2018, and at ages 1, 2, and 4. In LG18, a particularly stable cluster, based on the same component traits, was detected in 2018, at age 2 and age 4. The reliability of the clusters was confirmed based on the large positive correlations between the BLUP of the traits that belonged to the clusters identified for each condition. These correlations were all statistically significant at the 0.05 probability level and ranged from 0.65 to 0.96 (Fig. 6).
Regarding biomass composition traits, six QTL clusters were identified (Fig. 5 and Table 6): three of them were located in LG4, LG5, and LG15 and were made up of QTL for cellulose, hemicellulose, and lignin contents (expressed as %DM or %CW). Two other clusters were made up of QTL for ADF_%DM, cellulose (as %DM or %CW), and hemicelluloses (as %CW) and were located in LG13 and LG15. The last cluster was located in LG16 and was made up of QTL related to ADF_%DM, CL_%DM, and ADL_%DM. As for biomass production traits, the QTL clusters were confirmed by the high positive or negative correlations between the biomass composition traits, statistically significant at the 0.05 probability level (Fig. 6): they ranged from −0.55 to −0.96 for negative values and from 0.48 to 0.98 for positive values (except for the correlations between cellulose and lignin, which ranged from 0.37 to 0.47). Stable QTL clusters were detected in LG15, for QTL detected in 2017 and at age 2 in Orléans. The other QTL clusters were detected in 2017, at age 2 and age 3 in Estrées-Mons, and in 2018 in Orléans.
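The cluster definition used in this section (at least three QTL for different traits that co-localize) can be sketched in Python as follows. This is one possible reading, not the authors' implementation: the function name `find_clusters` is hypothetical, the example intervals are invented, and two choices are assumptions, namely that QTL are grouped within a given LG and condition and that co-localization means overlapping confidence intervals merged by a left-to-right sweep.

```python
from collections import defaultdict


def find_clusters(qtls, min_traits=3):
    """Group overlapping QTL intervals and keep groups with enough traits.

    qtls: iterable of (trait, lg, ci_start, ci_end, condition) tuples.
    Returns one dict per cluster with its LG, condition, bounds, and traits.
    """
    by_key = defaultdict(list)
    for trait, lg, start, end, cond in qtls:
        by_key[(lg, cond)].append((start, end, trait))

    sentinel = (float("inf"), float("inf"), None)  # flushes the last group
    clusters = []
    for (lg, cond), items in sorted(by_key.items()):
        items.sort()  # sweep intervals from left to right
        group, group_end = [], None
        for start, end, trait in items + [sentinel]:
            if group and start > group_end:
                # The current interval does not overlap the running group.
                traits = {t for _, _, t in group}
                if len(traits) >= min_traits:
                    clusters.append({
                        "lg": lg, "condition": cond,
                        "start": group[0][0], "end": group_end,
                        "traits": sorted(traits),
                    })
                group, group_end = [], None
            if trait is not None:
                group.append((start, end, trait))
                group_end = end if group_end is None else max(group_end, end)
    return clusters


# Hypothetical QTL intervals (cM) for one condition.
demo = [
    ("CH_cm", "LG18", 10, 20, "2018"),
    ("C50_cm", "LG18", 15, 25, "2018"),
    ("ABM_tDMha", "LG18", 18, 30, "2018"),
    ("CH_cm", "LG4", 5, 8, "2018"),
    ("C50_cm", "LG4", 50, 60, "2018"),
]
print(find_clusters(demo))
```

With these invented intervals, only the three overlapping QTL on LG18 form a cluster; the two isolated QTL on LG4 involve fewer than three traits and are discarded.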

QTL Effects of Biomass Production and Composition Traits Were Found to Be Stronger in Orléans Than in Estrées-Mons
According to the QTL identified for each of the biomass production and composition traits, the minimum and maximum for additive effects of each parent and dominance effects were reported for each location in Table 5, as well as the ranges of R² and LOD threshold values. The proportion of significant additive and dominance effects found based on all QTL corresponding to each trait was also reported. The QTL were detected according to a LOD threshold that ranged from 5.0 to 11.5 on average. For biomass production traits, the highest percentage of genotypic variance explained by the QTL (R²) of each trait ranged from 15.9 to 19.5% in Estrées-Mons and from 19.6 to 24.5% in Orléans. The QTL detected in Orléans thus tended to explain more genotypic variance than the QTL detected in Estrées-Mons: this was consistent with the values of additive and dominance effects, which were often higher in Orléans than in Estrées-Mons. For example, a maximal dominance effect of 8.2 cm was highlighted for total plant height in Orléans, compared to a BLUP of 5.3 cm in Estrées-Mons. Interestingly, most of the significant QTL effects identified for each production trait in both locations were either due to dominance effects or additive effects of the Silberspinne parent. This was notably observed for plant circumference (C50_cm), for which 43% of the significant effects originated from the Silberspinne allelic effect in Estrées-Mons and 46% originated from the dominance effect in Orléans.
[Figure caption (start missing): … LGs out of 19. The QTL were detected according to 13 conditions. A part of the stable QTL is illustrated, as well as QTL clusters. The QTL detected for a threshold based on a 10% significance level are marked with a "*" (see the "Materials and Methods" section for more details). The length of each LG is specified to the left of the LG in cM. Panels: LG4, LG5, LG7, LG8, LG13, LG15, LG16, LG18.]
Concerning biomass composition traits, the highest percentage of genotypic variation explained by the QTL of each trait ranged from 12 to 19.3% in Estrées-Mons and from 17.2 to 29.4% in Orléans. As for biomass production traits, the QTL identified in Orléans explained a higher percentage of the genotypic variation than the QTL identified in Estrées-Mons: this was also consistent with the higher allelic and dominance effects often observed in Orléans than in Estrées-Mons. As an example, the maximum dominance effect for cellulose (%CW) was 0.41% in Orléans compared to 0.17% in Estrées-Mons. In contrast to the proportion of significant effects observed for biomass production traits, the maximum proportion of significant effects observed for biomass composition traits were either due to the Malepartus allelic effect, Silberspinne allelic effect, or dominance effect, and that was the case in both locations. This can be shown for hemicellulose content (%DM) in Orléans, with 41% of the significant effects originating from the dominance effect. Concerning cellulose content (%CW) in Estrées-Mons, 37% of the significant effects originated from the Silberspinne allelic effect. Regarding lignin content (%CW), 38% of the significant effects originated from the Malepartus allelic effect.

Some M. sinensis Genes Within the QTL Clusters Were Orthologous to Sorghum and Maize Cell Wall-Related Genes
A total of 809 and 494 M. sinensis genes within the QTL clusters were determined to be orthologous to genes in sorghum and maize, respectively. The QTL clusters contained 62 of these M. sinensis genes that were orthologous to cell wall-related genes in sorghum and maize (Table 7), knowing that the QTL clusters were either related to miscanthus biomass production or composition traits (Tables 6 and 7). It must be noted that some genes are repeated in Table 7 as they belong to different QTL clusters. The QTL cluster located in LG5 did not contain any M. sinensis genes that were orthologous to cell wall-related genes in sorghum and maize. Regarding the other clusters, some underlying genes were identified for different climatic conditions and different ages (Tables 6 and 7). The 62 M. sinensis genes belonged to three main categories: 15 genes (24%) coded for enzymes involved in polysaccharide biosynthesis, 11 genes (18%) coded for enzymes involved in the phenylpropanoid pathway that provides precursors for the biosynthesis of lignin, and five genes (8%) coded for cell-wall proteins. In addition, 17 transcription factors (27%) were identified based on the miscanthus literature review (Table 7). Finally, twelve genes among these 62 genes belong to families that were previously found to contain genes involved in the secondary cell wall (SCW) biosynthesis of miscanthus and are highlighted in bold in Table 7.

Discussion
The stability of the QTL detected for biomass production and composition traits was investigated based on 13 different conditions related to the staggered-start design established in each of the two locations. The QTL of both types of traits were found to be more stable for successive climatic conditions and ages considered together, compared to successive climatic conditions or successive ages considered separately.
The evaluation of each climatic condition and each age was made possible based on the staggered-start design: usually, other types of designs such as "single-start" designs lead to the evaluation of plants for a given year, in which the related age and climatic condition cannot be distinguished. According to our design, the biomass production traits appeared to be more stable than the biomass composition traits across the conditions evaluated. However, there was no stable QTL highlighted across both contrasted locations. The QTL clusters representing co-localizations of QTL for biomass production and/or composition traits were identified across 13 different conditions. The corresponding intervals were screened for the underlying genes that correspond to orthologous cell wall-related genes known in sorghum and maize.
Three main points will be discussed in this section: (1) the stability of biomass production and composition traits highlighted across the ages and the climatic conditions of the successive years studied, based on the staggered-start design established in each location; (2) the clusters of QTL for biomass production and composition traits that are consistent with the moderate to high genetic correlations highlighted between these traits; and (3) the different orthologous cell wall-related genes that are known in sorghum and maize, two relatives of miscanthus, and that were found in the regions of the clusters highlighted.

Stable QTL of Biomass Production and Composition Traits Were Highlighted Across Climatic Conditions and Ages Based on the Staggered-Start Design of Each Location, While No Stable QTL Were Detected Across the Two Locations
In each location, stable QTL, i.e., QTL detected for a given trait that co-localized under at least two conditions across different climatic conditions and/or across different ages, were identified for biomass production and composition traits. The assessment of QTL stability was an important objective of the study. This is why the staggered-start design was analyzed according to two different linear mixed models, related to each climatic condition and each age. It also made it possible to consider all the genotypes of the population in each location. Regarding biomass production traits, the different climatic conditions and ages considered together in each location highlighted 59% and 39% of stable QTL in Estrées-Mons and Orléans, respectively. These results can be compared to those reported by Gifford et al. [34] and Dong et al. [36], in which each of the years studied in their experimental designs was not partitioned into age and climatic condition effects. Gifford et al. [34] studied 13 biomass production traits in an M. sinensis population over 2 successive years, among which they identified 61% of stable QTL: 22 QTL re-discovered in 2012 out of 36 QTL detected in 2011. Dong et al. [36] established three interconnected miscanthus populations and carried out four different QTL analysis methods, either related to CIM or stepwise analyses. This led to the detection of 288, 264, 133, and 109 QTL for 14 biomass production traits across 2 years. In 2013, they re-identified from 48 to 56% of the QTL that had already been detected in 2012. When climatic conditions and ages were considered separately in each location of our study, around 30% of stable QTL were identified either over the years or across the ages, regardless of the location. Accordingly, these lower proportions result from the partition of each year studied into the corresponding climatic condition on the one hand, and the age on the other.
Table 7 List of the miscanthus genes that were related to orthologous cell wall-related genes in sorghum and maize (Virlouvet et al., personal communication). Each miscanthus gene was detected in at least one cluster that was detected for a specific condition. When a gene is detected in two clusters, it corresponds to two different conditions. Accordingly, the corresponding stability type is specified. Two types of orthologous relationships were assessed: miscanthus with sorghum and miscanthus with maize. A miscanthus gene can thus correspond to an orthologous sorghum gene that can also have a maize ortholog (in blue). In addition, a miscanthus gene can correspond to an orthologous maize gene that can also have a sorghum ortholog (in green). The black color corresponds to a miscanthus gene that was directly identified based on orthologous relationships with both sorghum and maize. A miscanthus gene ID starts with the root "Misin-", a sorghum gene ID starts with the root "Sobic.0-" and contains a "G" among the gene numbers and a maize gene ID starts with the root "Zm00001d-". A miscanthus gene written with a * corresponds to an identical miscanthus gene, which is displayed in two different rows as the gene was identified in two clusters with different trait types. The miscanthus genes highlighted in bold belong to a gene family for which cell-wall candidate genes were identified in miscanthus by Hu et al. [79] and Zeng et al. [80].
Regarding biomass composition traits, a higher proportion of stable QTL was also identified across the climatic conditions and ages when they were considered together rather than separately. However, these different proportions ranged from 3 to 29%, which was relatively low compared to biomass production traits. Van der Weijde et al. [37] studied an M. sinensis population for traits related to biomass composition and conversion efficiency. They reported 23% of stable QTL in 2 successive years: 20 out of 86 QTL were detected in both 2013 and 2014. These proportions confirm that, in miscanthus, the QTL of biomass composition traits seem to be less stable across different years (and ages) than those of biomass production traits. This may be explained by strong genotype × climatic condition or genotype × age interactions, as significant genotype × year interactions have already been highlighted for miscanthus biomass composition traits [74,75].
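The literature proportions quoted in this section follow directly from the QTL counts given in the text, as the short Python check below illustrates. The helper name `pct_stable` is hypothetical; the counts are those cited from Gifford et al. [34] and Van der Weijde et al. [37].

```python
def pct_stable(n_stable, n_total):
    """Percentage of stable QTL, rounded to the nearest integer."""
    return round(100 * n_stable / n_total)


print(pct_stable(22, 36))  # 61, biomass production traits [34]
print(pct_stable(20, 86))  # 23, biomass composition traits [37]
```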
In each location studied, the proportion of stable QTL for both types of traits across the climatic conditions and/or ages was rather low: this could be expected, as biomass production and composition traits can be affected by the variability related to plant age and environmental factors, such as the related climatic conditions that occur each year [74,75]. However, these stable QTL mapped across the climatic conditions and/or ages would lead to the identification of relevant targets for MAS programs. Among these, a relevant example was highlighted in LG8 (37 cM) for plant circumference (C50_cm) and aboveground biomass yield (ABM_tDMha), symbolized with a solid red triangle in Fig. 5: these stable QTL are relevant in terms of their stability over different ages and climatic conditions of successive years in Estrées-Mons, especially as they are stable for age 2, 3, and 4. It means that future genetic material could be screened at a young age, in order to select individuals that show beneficial alleles according to this QTL. It could thus speed up miscanthus breeding when based on an early selection of such individuals.
The proportion of stable QTL depends on environmental conditions, plant age and the genetic material considered, which is specific to each miscanthus study [34,36,37]. However, these prior studies did not detect QTL for more than 3 years after establishment and did not distinguish the age effect from climatic condition effect. Segura et al. [38] used a staggered-start design and carried out QTL mapping in order to dissect the apple tree architecture into genetic, ontogenetic, and environmental effects. This made it possible to determine the genetic determinism of related traits, with regard to tree ontogeny and climatic conditions. To our knowledge, the present study uses a staggered-start design in miscanthus for the first time, in order to detect stable QTL in different climatic conditions and/or at different ages. Moreover, a staggered-start design was established in each of the two contrasting locations, which led to the examination of QTL stability across locations as well, as had never been done before in miscanthus. Accordingly, when considering the different years (i.e., climatic conditions) and ages together across locations, no stable QTL were identified for biomass production and composition traits. This shows that the QTL detected are specific to each location studied. These QTL are relevant for miscanthus breeding programs, as they express themselves in specific conditions related to a given location. It indicates that other environmental effects interact with the genetic basis of biomass production and composition traits across locations. The different climatic and soil conditions could explain this: the staggered-start design was established in a deep loam soil in Estrées-Mons, whereas it was established in a sandy soil in Orléans.
In addition, the climate in Estrées-Mons is more influenced by the ocean than in Orléans: the differences in climatic conditions between both locations are presented according to different periods of the plant cycle in Raverdy et al. (submitted to BioEnergy Research). Thus, significant genotype × location interactions may explain the lack of stable QTL across locations. In a study comparing different miscanthus species across five different locations in Europe, Clifton-Brown et al. [76] and Lewandowski et al. [77] indeed highlighted genotype × location interactions for biomass production and biomass composition traits. Moreover, 53 additional progenies were grown in Estrées-Mons compared to Orléans (Fig. 1): the genetic variability was therefore not identical between the two locations. This can also be a reason why the genotypic variances (R²) explained by the QTL detected in Orléans were mostly higher than those explained by the QTL detected in Estrées-Mons. The establishment effect can impact QTL detection power, as miscanthus is mature from around 2 to 3 years [19] or 5 years after establishment [78]: a substantial number of QTL were detected from young to old plants in our study, which suggests that the effect due to the establishment may be limited within the location. However, the different establishment conditions between locations could also explain the lack of stable QTL across locations. Tejera et al. [41] used a staggered-start design and showed that the M. × giganteus yield response to fertilization was influenced by establishment conditions in each location but not by the plant age.
In this study, each staggered-start design makes it possible to highlight a higher proportion of stable QTL for a range of climatic conditions and ages considered together rather than separately. However, the stability of QTL under these conditions is higher for biomass production traits than for biomass composition traits, studied together for the first time in a miscanthus mapping population. Across locations, no stable QTL were identified, which may be due to different environmental conditions such as climate, soil, and establishment effects. This brings new insights into miscanthus breeding, as stable QTL are needed from different genetic material evaluated across different ages and climatic conditions: the comparison of stable QTL between studies would lead to the identification of the most significant genomic regions associated with biomass production and composition traits. Such QTL that are specific to a given location will benefit the breeding of miscanthus, notably to target the conditions encountered in a particular region.

Clusters of QTL for Biomass Production and Composition Traits Were Consistent with the Moderate to High Genetic Correlations Highlighted Between These Traits
The QTL clusters identified for biomass production and composition traits were in agreement with the moderate to high genetic correlations between the traits. The QTL clusters related to biomass production traits were identified in LG4, LG7, and LG18. They were made up of QTL that overlapped at similar positions, for traits such as canopy height, total plant height, plant circumference, and aboveground biomass yield. The corresponding significant and moderate to high genetic correlations suggest that QTL overlapping is not random. Moreover, the stability of the QTL cluster is shown in LG18, as QTL were detected in 2018 and at ages 2 and 4 in Estrées-Mons. This is possible based on each staggered-start design evaluated over 5 years in two locations. Gifford et al. [34] identified QTL clusters in LG3 and LG6, which were re-identified in 2 subsequent years and that were consistent with high genetic and phenotypic correlations as well. These clusters were made up of QTL related to the plant circumference, stem diameter, plant stem number, aboveground biomass yield, or characteristics of the leaves such as leaf width, length, and area. These QTL identified for leaf-related traits are relevant: as canopy height refers to the height of the different leaves of the plant that contributes to yield, the QTL identified for canopy height in our study can in fact be related to different phenotypic characteristics of the leaves. However, none of their different QTL clusters were common to our QTL clusters. Dong et al. [36] identified different QTL clusters in their three interconnected miscanthus populations: these clusters were related to plant height, plant circumference, stem volume and density, and the aboveground biomass yield. They were identified in various LGs depending on the population and were in agreement with the moderate to high phenotypic and genetic correlations between these traits. For one of their populations originating from a cross between an M. sinensis and an M. sacchariflorus cultivar, they identified QTL clusters in LG4 and LG7: these LGs were common to the LGs in which we identified QTL clusters for the same type of traits related to plant height, plant circumference, and aboveground biomass yield. However, an investigation of the QTL cluster positions in their study would need to be based on the alignment with the M. sinensis reference genome in order to determine if the same genomic regions are involved.
Regarding biomass composition traits, we identified QTL clusters in LG4, LG5, LG13, LG15, and LG16, which were also in agreement with the significant moderate to high correlations between these traits. The stability of the clusters is also notable, because two clusters were identified in 2017 and for age 2 in Orléans. This is made possible based on the staggered-start design as well. They were located in LG15 and made up of traits related to cellulose, hemicelluloses, lignin, and ADF contents. The co-localization of QTL related to ADF content with those related to cellulose and lignin content is not surprising, as ADF content represents the sum of cellulose and lignin contents [64]. Van der Weijde et al. [37] identified a major QTL cluster for traits related to conversion efficiency and composition traits: this cluster was located on chromosome 6 according to the Sorghum bicolor reference genome that was used in the construction of their two parental miscanthus genetic maps. In miscanthus, the corresponding chromosomes are chromosomes 11 and 12, as the miscanthus genome has been shown to be the result of chromosomal duplication and fusion based on the sorghum genome [33,35,42]. However, none of our QTL clusters is common with their QTL clusters, because we did not identify QTL clusters in LG11 and LG12.
Our study was conducted by considering biomass production and composition traits together: this led to the identification of a QTL cluster in LG15, which was made up of both biomass production and composition traits. The corresponding traits were canopy height, NDF (%DM), and hemicellulose content: the moderate and significant correlations of canopy height with these composition traits (respectively, 0.44 and −0.55) tend to validate the existence of this cluster. However, further analysis according to the genes underlying this cluster is necessary to confirm this assumption.
The QTL clusters identified for biomass production and composition traits could be explained by different genetic factors, such as the pleiotropic effects of the genes underlying these QTL or linked genes. Sometimes, these clusters can originate from genomic regions with segregation distortion, but this may not be possible in our study as we carefully filtered the distorted markers for the construction of our integrated genetic map. The staggered-start design led to the identification of QTL clusters located in LG4, LG5, LG7, LG13, LG15, LG16, and LG18 for a range of climatic conditions and ages that consider biomass production and composition traits together, for the first time in miscanthus.

Orthologous Cell Wall-Related Genes Previously Identified in Sorghum and Maize Enabled the Identification of Putative Cell Wall-Related Genes in M. sinensis
Some of the 62 M. sinensis genes that were identified in the QTL clusters based on the orthologous cell wall-related genes known in sorghum and maize belong to specific gene families. Twelve genes among the 62 genes belong to families that were previously found to contain genes involved in the secondary cell-wall (SCW) biosynthesis of miscanthus [79,80]. Hu et al. [79] carried out a transcriptome analysis of genes involved in secondary cell-wall biosynthesis in developing internodes of M. lutarioriparius: they highlighted different gene members in specific gene families. These families included genes encoding 4-coumarate-CoA ligase (4CL) and cinnamoyl-CoA reductase (CCR), both involved in the biosynthesis of several classes of phenylpropanoids, as well as laccase (LAC), involved in the polymerization of lignin [81]. They also identified the cellulose synthase-like (CSL) and glycosyltransferase (GT) gene families that are involved in the biosynthesis of cellulose and hemicellulose components in plants. Finally, three other gene families in common with Hu et al. [79] were identified: the fasciclin-like arabinogalactan (FLA) gene family, for which genes are involved in cell wall modification and assembly, and the NAC transcription factor (TFNAC) and WRKY transcription factor (TFWRKY) families, which contain transcription factors for the regulation of secondary cell wall development. Based on genetic and transcriptional analyses in M. × giganteus, Zeng et al. [80] identified several genes that are common to the genes we highlighted: the 4CL and CCR families that were also reported by Hu et al. [79], as well as the shikimate hydroxycinnamoyl transferase (HCT) family.
Based on these different comparisons, we hypothesize that twelve M. sinensis genes out of the 62 genes previously identified are involved in secondary cell wall development. This hypothesis is supported by the fact that these genes were mainly located in the QTL clusters composed of M. sinensis biomass composition traits, especially for the clusters located in LG4, LG13, and LG15. Cluster 2 in LG4 and clusters 1 and 3 in LG15 were particularly notable, as the R² of the related QTL mainly ranged from 11.3 to 29.4% (Table 6).

Conclusion
In this study, an integrated genetic map of 2770 cM was constructed based on 2602 SNP markers distributed across 19 LGs and was aligned with the released M. sinensis reference genome. This highly saturated integrated genetic map led to the identification of 260 and 283 QTL related to biomass production and composition traits, respectively. The staggered-start design established in each of the two contrasting French locations led to the detection of QTL that were stable across different climatic conditions and different ages. For both types of traits, QTL stability was higher when the climatic conditions were considered together with the different ages than when they were considered separately. These differences were highlighted by distinguishing the plant age effect from the climatic condition effect. For a given location, the most stable QTL identified across different climatic conditions and different ages would be of interest to miscanthus breeders, as they are stable regardless of the condition assessed in our experiment. They are thus important resources for future MAS programs. This would be especially true for the QTL found to be stable at age 3 and age 4, as they could be relevant for screening young plants without the need to wait until they reach maturity. This approach would be better suited to biomass production traits, as the biomass composition traits were found to be less stable across the conditions. However, no stable QTL were identified across locations: this highlights that the QTL detected in this study were specific to the conditions encountered in Estrées-Mons or in Orléans, and it shows the relevance of carrying out the study in two locations. These location-specific results may be explained by the existence of QTL that correspond to the genotype × age and genotype × climatic condition interaction effects. These effects were specifically assessed in the two models used to analyze the staggered-start designs, but the corresponding QTL have not yet been mapped, and their future detection would be desirable.
Clusters of QTL were then identified for biomass production and composition traits under different conditions: this suggests that linked genes or pleiotropic effects of the genes underlying these QTL would make it possible to jointly improve these different traits. Moreover, these QTL clusters contained 62 M. sinensis genes that were orthologous to cell wall-related genes in sorghum and maize. Twelve of these genes were identified as putatively involved in secondary cell wall biosynthesis. In summary, all these QTL clusters, which correspond to different traits or stability types, and their underlying candidate genes constitute targets of interest for miscanthus breeders seeking to evaluate and create new miscanthus cultivars that would be adapted to different environments, with a high biomass yield and a composition suited to bioenergy, biomaterials, or animal bedding.