Unraveling the Role of Maize Cell Wall Hydroxycinnamates in Digestibility, Biofuel Production and Pest Resistance Using A Multi- Parent Advanced Generation Intercross (MAGIC) Approach


 Background: Mechanical resistance due to higher hydroxycinnamate content makes maize tissues more recalcitrant to damage by insects, less digestible by ruminants, and less suitable for biofuel production. The integrated study of the maize functional genetic variability for each hydroxycinnamate component could be crucial to identify relevant genetic variants that may be incorporated into selection programs to breed maize varieties for multiple uses. A Genome Wide Association study was carried out a in a maize Multiparent-Advanced Intercross (MAGIC) Population to indentify Single Nucleotide Polymorphisms (SNPs) associated with cell wall bound hydroxycinnamates;and we checked thereafter their relationship with SNPs significantly associated with saccharification efficiency, digestibility of organic matter and corn borer damage.Results: We found 24 SNPs, corresponding to15 QTL, significantly associated with cell wall bound hydroxycinnamates. Each SNP explained between 6 and 8% of the total variability. We define new genomic regions and genes involved in polysaccharide synthesis and modifications, and the oxidative coupling associatted to cell wall hydroxycinnamates content. Conclusions: SNPs explained a small proportion of the variability for hydroxycinnamates, saccharification efficiency, digestibility or insect damage, therefore we recommend a genomic selection approach for future breeding programs of these traits. In addition, no colocalizations were found between hydroxycinnamates and final-use-related traits so breeding strategies can be focus on each particular trait with no side effects on the others.


Background
The structural and functional properties of the plant cell wall are controlled by the composition and organization of each of its individual components [1]. In this sense, cell walls are mainly composed by micro brils of celluloses embebed in a matrix of hemicellulose, lignin and other phenolic compounds (mostly hydroxicinnamic acids). Particularly, the role of stiffening, streghtening and forti cation that hydroxycinnamic acid play within the cell wall has been either positively or negatively correlated with economically important uses and characteristics of cereal grasses such as resistance to insects, ethanolic production and feedstock digestibility [2][3][4].
The most common hydroxycinnamates in those grasses are p-coumaric (PCA) and ferulic (FA) acid. PCA is involved in the radical coupling of S units of the lignin polymer and is known to be ester bond to the γ position of the side chains of the lignin S units [5]. On the other hand, FA is ester bond to arabinose residue of arabinoxylans chains, later crosslinked through ether bond and C-C bonds to the G unit of lignin, working as nucleation site of this polymer. Furthermore, FA is able to form links between proteins and polysaccharides trough cysteine and tyrosine residues, and can form dimers (DFAs) and even oligomers through enzymatic or non-enzymatic oxidative coupling reactions, cross-linking this way hemicelullose polysaccharides. This net confers greater mechanical resistance to the cell wall [6][7][8].
The relationship between cell wall cross-linking and insect resistance was rst suggested by Fry [1]. Since then, many studies have pointed out the role of hydroxycinnamate content in plant resistance to pests and diseases [9][10][11][12]. Greater amount of hydroxycinamates monomers and diverse diferulic isomers, have been found in the pith tissues of borer resistant genotypes compared to susceptible ones. In addition, Barros-Rios et al. [11] found negative correlations between stem tunneling by borers and total DFAs, DFA-8-5'-and PCA esters, or negative correlations between larval weight in feeding bioassays and total DFAs content. At rst sight, modi cations of their concentration would results in the improvement of borer resistance, however, a successful divergent selection program for changing diferulate ester concentration resulted in changes in the cell wall features [13]. When DFAs increased, glucose concentration and total cell walls concentration decreased, affecting properties related to animal digestibility [14]. Likewise, both, xylan-to-xylan ferulate linking [15] and ferulate-to-lignin cross-links [16,17] limit the enzymatic depolymerization of cell wall polysaccharides increasing cell wall recalcitrance to deconstruction. This cell wall recalcitrance impacts negatively in digestiblity, decreasing cell wall deconstruction by microorganisms and enzymes, thence limiting the energy intake by the catle [18][19][20].
Finnally, strategies that could reduce the incidence of ferulate cross-links in the cell wall have the potential to improve cell wall degradability properties relevant to cellulosic ethanol production [16]. Cell wall polysaccharides are the main substrate for ethanol fermentation, and cell wall recalcitrance increases the energy requirements, the cost and complexity of bio re nery operations and/or reduces the recovery of biomass carbon into desired products [21]. Some maize genomic regions mediating hydroxycinnamate accumulation have been detected throughout the genome [22][23][24]. Fontaine et al. [25], in a subset of 135 maize RILs, found seven QTL associated with FA and PCA. In addition, several QTL for hydroxycinnamate content, including PCA and FA, were identi ed by Barrière et al. [22] in a bi-parental population obtained from the cross F286 x F838, one of which colocalized with a QTL previously described by Fontaine et al. [25]. In the IBM population, Lorenzana et al. [26] found eight QTL associated with PCA and eight QTL associated with FA, whereas analyzing pith samples of second internode below the main ear in a biparental population Santiago et al. [27] found two QTL associated with FA, seven with PCA and seven with total diferulates. Moreover, in one of the rst high resolution association mapping analyses made to detect QTL associated to hydroxycinnamate content in the maize pith using a diversity panel, López-Malvar [24] found 22 QTL associated with cell wall bound hydroxycinnamates. However, in the mentioned studies, mainly biparental populations have been used. In these populations you have limited variability, and although they are useful to identify QTL, the region where the QTL is located is usually quite wide. The use of inbred panels resolves both problems: they are diverses and enable high-resolution mapping of QTL to narrow genomic regions, where the searching for genes contributing to trait variability is more feasible. In this sense, as previously mentioned, a diversity panel of 272 inbreds was used to study cell wall bound hydroxycinnamates by our group [24]. Association studies using diversity panels still showing some disadvantages, as limited power to detect QTLs due to the small effect and/or low frequency (rare alleles) of some genetic variants, therefore we could have many undetected rare alleles that could be on the other hand valuable for breeding purposes when they show major effects [28][29][30]. Results from QTL mapping in Multi-Parent Advanced Generation Inter-Cross (MAGIC) populations could be complementary to those obtained by linkage mapping in biparental populations or association mapping using inbred panels. In this case several alleles can be simultaneously studied and none of them would be in low frequency. Besides, MAGIC populations present a known underlying structure that prevents from false positive associations [28]. Finally, in contrast to maize inbred panels constructed by inbreds of diverge origins, MAGIC populations, in general, are better adapted to a particular environment, so adaptation differences would not impair the phenotyping.
Therefore, a complete study of the maize functional genetic variability for cell wall hydroxycinnamates and nal-use-related traits could be practical in order to identify relevant genetic combinations that may be incorporated into selection programs. The nal objective will be to breed maize varieties for multiple purposes, without compromising each other.
To reach this goal we use cell wal bound hydroxycinnmates data and reports from three previous studies involving the same MAGIC population (López-Malvar et al. 2020 a,b submitted; [31]). We described genes in high resolution genomic regions..

Plant Materials
The MAGIC population was developed by the Maize Genetics and Breeding group at Mision Biologica de Galicia-CSIC. The eight parents of the MAGIC population were chosen because they show partial resistance to Mediterranean Corn Borer attack and high speci c combining ability with the Reid germplasm group. Six of them come from European germplasm (EP17, EP43, EP53, EP86, PB130 and F473) and two from American germplasm (A509, EP125). Procedures used to release the 672 recombinant inbred lines of the population have been previously reported [31,32]. The pedigree of each founder line is shown in Table 1.

Experimental Design
A subset of 408 RILs of the MAGIC population together with the eight founders were tested in a single augmented design with 10 blocks in Pontevedra, Spain (42° 24'N, 8° 38'W and 20 m above sea level) in 2016 and 2017. Forty-two non replicated RILs plus the eight parents of the MAGIC (PB130 was replaced by EC212 in 2017 and EP43 by EP80 in both years due to lack of seed availability) were randomly assigned to each block. Only 30 RILs were evaluated in block 10. Each experimental plot consisted of a single row with 13 single-kernel hills planted manually, spacing between consecutive hills in a row being 0.18 m and 0.8 m between rows, obtaining a nal density of ~ 70,000 plants ha -1 .
Local agronomical practices were ful lled.

Phenotypic Data
Biochemical analyses were performed in stover samples that were harvested approximately 55 days after silking (days from planting until half of the plants in the plot showed visible silks). Each sample was composed of tissue from at least two plants per plot. Samples were chopped, pre-dried at 35 °C in a forced air camera and dried at 60 °C in a stove. Last of all, dry stover samples from each plot were grounded in a Wiley mill with a 0.75 mm screen to be used in subsequent biochemical analysis.
A recently optimized protocol was used for hydroxycinnamate quanti cation [33]. Brie y, subsamples of 500 mg were extracted in 30 ml of 80% methanol and mixed using a Polytron mixer. Samples were extracted for 1 h and then centrifuged for 10 min at 1000 g. The remaining pellet containing the cell wall-bound material was shaken in 20 ml of 2 N NaOH under a nitrogen ow for 4 h. Digested samples were neutralized with 6 N HCl, and the pH was adjusted to 2.0. After centrifugation, the supernatant was collected and the pellet washed twice with distilled water (10 ml each). Supernatants were pooled and then extracted twice with ethyl acetate (40 ml each). Collected organic fractions were combined and dried using a SpeedVac for 6 h at a medium setting without a radiant cover. The nal extract was dissolved in 1.5 ml of HPLC grade methanol and stored at −20 °C prior to HPLC analysis. Standards and samples were ltered through a 22 μm tetra uoroethylene lter before being analysed.
Chromatographic quanti cation was completed using a 2690 Waters Separations Module equipped with a model 996 photodiode array detector and a Waters YMC ODS-AM narrow bore column (100 × 2 mm internal diameter; 3 μm particle size). The solvent system consisted of acetonitrile (solvent A) and tri uoroacetic acid (0.05 %) in water (solvent B) as follows: initial conditions of 10:90 A:B, changing to 30:70 over 3.5 min, 32:68 over 6.5 min, 100:0 over 4 min, isocratic elution at 100:0 for 4.5 min, and nally returning to the initial conditions (10:90) over 3 min. The mobile phase ow rate was 0.3 ml/min, and the total analysis time was 17.5 min. The sample injection volume was 4 μl. Phenolic standards ferulic acid (FA) and p-coumaric acid (PCA) were purchased from Sigma-Aldrich Quimica SL, Madrid, Spain. The identities of FA dimers were con rmed by a comparison with the authentic 5−5 standard or published retention times and UV spectra. The total diferulate content (DFAT) was calculated as the sum of the following three identi ed and quanti ed DFA isomers: DFA 8-O-4, DFA 5-5, and DFA 8-5.

Statistical Analysis
Inbred lines were previously genotyped with 955.690 SNPs using a genotyping-by-sequencing (GBS) method. Genotype matrix was ltered , i.e. SNPs with more than 50% missing data and a minor allele frequency less than 5% were omitted. Heterozygous genotypes were considered missing data. After ltering, 215.131 SNPs distributed across the maize genome were retained.
For each phenotypic trait, data from indivual as well as from combined trials were analyzed according to the mixed model procedure (PROC MIXED) of the SAS program (version 9.4) [34] and the best linear unbiased estimator (BLUE) of each RIL was calculated based on the combined data for the 2-year analysis.. The BLUES constituted the phenotype matrix The comparison of means was carried out using the Fisher's protected least signi cant difference (LSD). Heritabilities (ĥ 2 ) were estimated for each trait on a family mean basis as previously described by Holland et al. [35]. The genetic (r g ) and phenotypic (r p ) correlation coe cients between each pair of traits were calculated using REML estimates according to a published SAS mixed model procedure [36].
We used the current eld trials and population in previous studies in order to quanti ed the average sacchari cation e ciency, the digestibility of the organic matter (DOM) (López-Malvar et al. 2020 a,b submitted) and the tunnel length borers resistance [31]. In order to understand the interrelation among hydroxycinnamates and those nal-use-related traits, multiple linear regression models using the stepwise method of the SAS PROC REG procedure . Sacchari cation e ciency, DOM and tunnel length were considered as dependent variables.
Variables with a p value less than 0.15 were retained or excluded in the regression model.
The genome-wide association analysis was completed with Tassel 5 [37] based on a mixed linear model using a genotype-phenotype matrix and a kinship matrix using the centered IBS method [38]. Among the mixed linear model options, we used the optimum compression level and P3D to estimate the variance components.

SNPs, QTL and Candidate Gene Selection
Two approaches were used to calculate the comparison-wise threshold for declaring signi cant an association between a trait and a SNP: (i) A modi cation of the classic Bonferroni approach where, rst the number of independent tests was estimated by the Haploview program using the option four gamete rules, resulting in 12397 independent comparisons [39][40][41]. Then, the comparison-wise threshold was the coe cient between the experiment-wise threshold established (0.3) and the number of independent tests (12397); (ii) it was determined as the point where the observed and expected F test statistics deviated in the Q-Q plot of the model, performed with Tassel 5 [37] (Supplementary Figure 1).
We considered a +/-700 kbp con dent interval region around each signi cant SNP following previous association studies using the same mapping population [42]. In case con dence intervals of two SNPs overlapped they were assigned to a single QTL. The two described genes that delimit the +/-700 kbp region around the SNP in the reference genome assembly version 2 were positioned in version 4 of the reference genome, and all genes contained in the region delimited by those genes were then identi ed and characterized based on the maize B73 reference genome assembly (version 4) available on the MaizeGDB browser [43]. Those genes are listed in Supplementary Table 1.

Means, Analysis of Variance and Heritabilities
Means and ranks are shown in Table 2. Signi cant differences were found among the founders means and among RILs means for all the quanti ed hydroxycinnamates. p-coumaric and ferulic acid were the most abundant cell wall bound hydroxycinnamates. We found signi cant variation for all traits; and also noted that the founder lines of the MAGIC population showed high variation for hydroxicinnamic acids as in previous evaluations [10]. Monomers presented moderate heritability values (0.59 for PCA and 0.60 for FA), whereas, dimers and total diferulate amount presented lower values (ranging from 0.32-0.42).

Genotypic and Phenotypic Correlations
Correlation coe cients are detailed in Table 3. We found high positive genotypic and phenotypic correlation (r g,p <0.66) between FA, total diferulates and individual dimers. On the other hand, very low correlation values were found between PCA and any other trait.

Multiple Linear Regression Analysis
We looked for the multiple linear regression models that better t data on sacchari cation e ciency, digestibility of the organic matter, and tunnel length damage using cell wall bound hydroxycinnamates as independent variables. For each model, the percentage of the partial variance for the dependent trait explained by each independent variable, the percentage of accumulated variance explained after incorporating that particular independent variable in the model, as well as estimates for regression coe cients are shown in Table 4.
The best model for sacchari cation e ciency only explained less than 1 % of the variance, mainly by PCA. Variation among RILs for DOM was partially determined by PCA (11 %) and DFA 8-5-l (2 %). By last, only 0.8 % of the variation for tunnel length was explained by cell wall bound hydroxycinnamates.

Association Analysis
Following the modi cation of Bonferroni approach a marker was considered signi cantly associated with a trait at p values less than 2.42×10 -5 ( p-value = 4.6). For the Q-Q plots we stabilised that a marker was signi cantly associated with a trait with values less than 1.00x10 -4 ( p-value = 4.0). We followed the results of the Bonferroni modi cation approach for every trait with the exception of DFAT and DFA 5-5. We considered a +/-700kbp region around each signi cant SNP as the SNP con dence interval and two SNPs were clustered in the same QTL when their con dence intervals overlapped (Table 5).
A total of 24 SNPs, corresponding with 15 QTL, were signi cantly associated with cell wall bound hydroxycinnamates ( levels, while minor frequency alleles generally increased cell wall-bound DFA 8-5 and total diferulates concentrations; nally, major frequency alleles increased p-coumaric and ferulic acid concentrations.

Candidate Gene Selection
The genes containing or physically close to SNPs signi cantly associated with traits were identi ed and characterized according to the maize B73 reference genome assembly (version 4). Analyses of +/-700kbp regions surrounding signi cant SNPs resulted in the identi cation of the genes listed in Supplementary Table 1.

Discussion
First of all, we have observed genetic variation for cell wall bound hydroxycinnamates in the maize MAGIC population evaluated, for both the RILs and the founders agreeing with former evaluations [10]. The correlations between cell wall components traits followed the trends previously reported in the literature [2,24,44]. FA, particular dimers and DFAT showed co-variation. This means that if the target for improvement is FA, individual and total dimers would be also modi ed. On the other hand, this would not happen for PCA, as this trait was not correlated with any other trait.
Furthemore, moderate to high heritability for hydroxycinnamate contents agreed with the results obtained in previous studies [2,24]. From those heritabiliy values, we would expect a good response to selection, since additive effects are more important than additive × environment interaction effects. Although phenotypic selection could be effective based on those heritability estimates, a genomic selection approach could be implemented to speed up selection, in contrast to a time consuming phenotyping method.

QTL Co-localization
With respect to other studies, novel genomic regions involved in hydroxycinnamate content were found, such as those in bins 10.04 (PCA), 5.06 and 7.01 (for FA), 2.08 and 7.02 (for DFA 5-5), 7.04 (for DFA 8-O-4), 5.06 (for DFA 8-5) and 5.01 and 7.02 (for DFAT). Nine of the ten SNPs associated with PCA, that correspond to three QTL, are located in bin 10.04. These QTL appear to be good candidates for selection targeting PCA. We found overlapping QTLs for FA and DFA 8-5 in the bin 5.06 and for DFA 5-5 and DFAT in bin 7.02. Those co-localizations have sense because genotypic correlations between those traits are high.
In addition, SNPs signi cantly associated with a trait in the current study co-localized with QTL for the same trait found by other authors. This is the case of the marker associated with PCA in bin 1.02, that co-localized in the same bin with one QTL associated with the same trait portrayed by García-Lara et al. [45], studying the cell wall phenolic composition of maize pericarp tissues in 163 F2:3 families; or the case of

Candidate Genes
Among genes proposed as candidates for cell-wall bound hydroxycinnamateswe highlight peroxidases involved in oxidative coupling of FA to form dimers, genes that are responsible for transcriptional control of the phenylpropanoid pathway, implicated in xyloglucan and arabinoxylan synthesis, and involved in polysaccharides synthesis and modi cation. Finally, we also spot genes involved in gibberellin and suberin biosynthesis.
Cellulose and glucurono-arabinoxylans are the main constituents of ligni ed secondary walls; among the genes involved in the upstream parts of cell wall carbohydrate biosynthesis, we found several UDP-glycosyltranferases within the supporting intervals of QTL for PCA (qPCA_10_3), DFA 5-5 (qDFA5-5_7_1) and 8-O-4 (qDFA-8-O-4_1_1), and total diferulates (qDFAT_7_1). UDP-glycotransferases are enzymes involved in the elongation of carbohydrate chains using nucleotide sugar as substrates and thereby causing variation in the cell wall polysaccharides structure [46]. For example, in rice, the gene underlying the brittle-culm-14 mutants, was a nucleotide sugar transporter that causes reduced mechanical strength by decreasing cellulose content and altering wall structure, including higher xylan extractability. In addition, we spotlight a reduced residual arabinose 3 gene, which encodes an arabinosyltransferase that adds the second arabinose residue in a β-1,2 linkage in arabinoxylan chains, as candidate for the QTL qDFA8-O-4_7_1 [47]. Besides, FA is ester bond to arabinose residues of arabinoxylans chains, and FA dimers crosslink hemicellulose chains binding speci cally to arabinose [48,49]. Modifying arabinosyl transferase activities could be a promising strategy for modulating ferulate cross-linkages in the walls [50,51], so we consider Zm00001d021974 a good candidate for DFA 8-O-4 content. Similarly, in the con dence interval of DFA 5-5, we found a glycosyl hydrolase (Zm00001d006940). Glycosyl hydrolases are mainly cellulases and xylanases; that modify the phenolic composition of cell walls as they cleave phenolic ester linkages. Thus, we considered Zm00001d006940, which encodes a glycosyl hydrolase, a good candidate for qDFA5-5_2_1. Besides, FA could be also bound to α-xylosil of xyloglucan (XyG) [18] and diferulate crosslinking could also anchor the xyloglucan chains. XyG participates in the formation of cellulose-XyG network, allowing the attachment of cellulose to other wall polymers and playing an important role in cell wall extension during plant growth [52]. In this sense , we spot a xyloglucan endotransglucosylase/hydrolase gene that codi es for an enzyme involved in modi cations of the xyloglucan for cell wall elongation as candidate for the overlapping QTL for FA and DFA 8-5.
Similarly, it has been hypothesized that GA could in uence phenolic cross-linking via an effect on peroxidases [53]. As we mentioned throughout this work, phenolic groups can be oxidatively coupled by peroxidase + H 2 O 2 [54] and/or by oxidases (laccases) + O 2 [55] to form dimers. The conditions that increase the number of diferuoyl crosslinking included, among others, low gibberellin supply [56]. In this context, within the supporting intervals of qDFA8-O-4_1_1, we highlight a gibberllin 20-oxidase 2, as candidate for DFA 8-O-4 content. In the same way, we also spotlight several peroxidase genes as candidates for FA and individual dimers in QTL qFA_5_1, qDFA8-5_5_1 and qDFA5-5_2_1.
Among the genes found in the interval of qPCA_3_1, we spot MYB tf 41 which has been classi ed by Du et al. [57] as involved in the "Phenylpropanoid Pathway". Barrière et al. [58] proposed ZmMYB041 as the probable gene underlying a QTL for lignin content,and this could be also involved in PCA biosynthesis because increased ligni cation has been commonly associatedto higher PCA concentration [59].
Finally, we found, in the QTL interval for PCA in chromosome 10, a gene encoding a ω-hydroxypalmitate O-feruloyl transferase (Zm00001d024864) which takes part in the esteri ed suberin biosynthesis pathway. Suberin is a polymeric constituent of plant cell wall, which consists of two domains that are cross-linked. Concerning the aromatic fraction of suberin, hydroxycinnamates esters, such as PCA, fortify the crosslinking between arabinoxilans and suberin fatty acids; besides PCA deposition has been associated with highly suberized tissues. Even though there is some of the esteri ed PCA esteri ed to polysaccharides, as previously mentioned, most of the PCA in grasses is ester bond to S units of lignin [60,61].

Conclusions
To sum up, we pointed out new genomic regions associated to cell wall bound hydroxycinnamates in maize stover that could have an impact on their content across different genetic backgrounds. Based on their annotated functions the putative candidate genes for the traits under study are involved in polysaccharides synthesis and oxidative coupling, however, the genes that are not yet characterized, or the ones encoding a "hypothetical protein" that fell within QTLs intervals, could have an in uence on the nal hydroxycinnamates content. However, the effects of individual SNPs signi cantly associated with hydroxycinnamate content were low, and each SNP explained a low percentage of total genetic variability. Based on these results, and on the moderate heritability estimates observed, we suggest that the best breeding strategy to improve hydroxycinnamates content would be a genomic selection. On the other hand, we determine that the variation for cell wall bound hydroxycinnamates will not vary tunnel length damage, sacchari cation e ciency or forage digestibility, so we recommend carried out a direct selection program independently for each nal use related trait.

Declarations
Ethics approval and consent to participate Not applicable in this study.

Consent for publication
Not applicable in this study.

Availability of data and materials
The data sets used and/or analysed during the current study will be availableupon reasonable request to the corresponding author.

Competing interests
The authors declare that they have no competing interests.

Funding
This research has been developed in the frame of the "Agri-Food Research and Transfer Centre of the Water Campus (CITACA)" at the University of Vigo (Spain), which is economically supported by the Galician Government, and in the Misión Biológica de Galicia (CSIC). It was funded by the "Plan Estatal de Ciencia y Tecnología de España" (projects RTI2018-096776-B-C21, and RTI2018-096776-B-C22 co-nanced with European Union funds under the FEDER program). A. López-Malvar's scholarship for the PhD ful lment has been granted by University of Vigo and by a contract charged to the project RTI2018-096776-B-C22. The funding body played no role in study design, data analysis and manuscript preparation.    Table 4 Multiple linear regression models (using stepwise selection) with sacchari cation e ciency, digestibility of organic matter and tunnel lenght on cell wall composition as dependent variables, and hydroxycinnamate-related traits in a maize MAGIC population.
Step Wise Selection Sacchari cation E ciency (nmol mg − 1 material − 1 hour − 1 ) Step Variable introduced in the Model  Table 5 SNPs and QTL signi cantly associated with cell wall component traits: including SNP's chromosome, bin and position within chromosome, allelic variants and additive effect for the SNP, proportion of total variance explained by the SNPs and P-value for the association between the SNP and the phenotype