A non-destructive method to quantify leaf starch content in red clover

doi:10.21203/rs.2.22508/v2

Research article

https://doi.org/10.21203/rs.2.22508/v2

This work is licensed under a CC BY 4.0 License

Version 2

Background: Grassland-based ruminant livestock production provides a sustainable alternative to intensive production systems relying on concentrated feeds. However, grassland-based roughage often lacks the energy content required to meet the productivity potential of modern livestock breeds. Forage legumes, such as red clover, with increased starch content could partly replace maize and cereal supplements. However, breeding for increased starch content requires efficient phenotyping methods. This study is unique in evaluating a non-destructive hyperspectral imaging approach to estimate leaf starch content in red clover for enabling efficient development of high starch red clover genotypes.

Results: We assessed prediction performance of partial least squares regression (PLSR) models using cross-validation, and validated model performance with an independent test set. Starch content of the training set ranged from 0.1 to 120.3 mg g^-1 DW. The best cross-validated PLSR model explained 56% of the measured variation and yielded a root mean square error (RMSE) of 17 mg g^-1 DW. Model performance decreased when applying the trained model to the independent test set (RMSE = 29 mg g^-1 DW, R² = 0.36). Different variable selection methods did not increase model performance.

Conclusion: The non-destructive spectral method presented here provides a tool to detect large differences in leaf starch content of red clover. The major benefit of the method is that it can be applied repeatedly to the same plants, thus providing a means to follow starch concentrations over time and over a broad range of environments.

Plant Physiology and Morphology

Plant Molecular Biology and Genetics

Red clover

starch content

hyperspectral imaging

partial least square regression

forage quality

grassland

Temporary and permanent grassland account for roughly 70% of agricultural land and play a significant role in sustainable agriculture worldwide by providing roughage for ruminant livestock production. Pasture and grassland-based agroecosystems maintain carbon balances, nutrient cycles, biodiversity and water quality [1]. However, they have gradually been replaced by intensified production systems, where the high feed energy content required by today’s livestock breeds is largely covered through starch from cereals and maize. Starch is an important form of assimilated carbohydrates in plants, which diurnally accumulates in the leaf and is nocturnally mobilized to support growth [2–4]. The accumulation of starch and its linear degradation at night is thought to be crucial for stable growth and directly correlated to plant biomass [5, 6]. However, plant biomass and leaf starch content are not always negatively correlated in species such as birdsfoot trefoil (Lotus japonicus L.) or red clover (Trifolium pratense L.) [7, 8]. Starch accumulation and degradation vary not only among plant species but also among genotypes, seasons, and management regimes [9–12].

Red clover is one of the most important forage legumes in temperate climates [13]. Its high yield potential, high crude protein content and high digestibility make it an excellent feed, not only for cattle but also for other livestock and poultry [14, 15]. Red clover has the potential to accumulate up to one third of its leaf dry mass as starch, and some genotypes degrade less than 50% of their starch during the night [8]. Thus, selecting for red clover plants with high starch content and low degradation rates is likely to result in high starch cultivars. These could provide an alternative, high energy feed source, which would significantly improve sustainability of ruminant livestock production.

Developing a high starch red clover requires a better understanding of the starch metabolism in red clover and an efficient method to quantify starch in leaf tissue. Starch is commonly quantified with an enzymatic method, where leaf samples are flash frozen, ground and weighed before extraction is performed [16]. This procedure is laborious, expensive and involves destructive sampling. A non-destructive method to measure leaf starch content would enable detailed studies of starch turnover in red clover plants, because dynamic changes during plant development could be traced on the same plant throughout the entire season. Specifically, different genotypes could be investigated under different management regimes and across different environments over an extended period of time. Furthermore, this method could be applied to develop high starch red clover cultivars.

Hyperspectral imaging and near infrared spectroscopy (NIRS) are routinely used to estimate biochemical compounds such as lignin, cellulose, starch, sugars and proteins in numerous crops [17–19]. These two methods have largely replaced wet chemistry as the standard analytical procedure for detection and quantification of plant biochemical compounds in the food industry [20, 21]. Infrared spectra result from the fundamental vibrational absorptions of photons in the mid-infrared region (500–4000 cm^-1, corresponding to 2500–20000 nm) by bonds within specific functional groups of molecules. Overtones and combinations of these absorptions appear in the NIR region [21]. Multivariate statistics, chemometrics, or machine learning methods are then used to quantify and classify specific compounds or properties [22]. NIRS and other spectral techniques are most accurate when using dried and homogenized (i.e. milled) plant material. For example, starch has been accurately quantified on dried cotton leaves or dry forage maize using NIRS (R² > 0.9) [19, 23–25]. Estimating chemical compounds with spectral measurements on fresh leaf tissue is often less reliable due to masking effects of light absorption by the cuticle or the leaf water content [26, 27]. For successful spectroscopy-based diagnostics using fresh leaf tissue, spectral pre-processing and statistical modelling are essential to at least partially correct for confounding effects [26, 27].
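Wavenumber and wavelength limits of a spectral region are related by λ (nm) = 10^7 / ν (cm^-1). The following minimal sketch applies this conversion to the mid-infrared limits of 500 and 4000 cm^-1; the function name is chosen for illustration only.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    # 1 cm equals 1e7 nm, so wavelength (nm) = 1e7 / wavenumber (cm^-1).
    return 1e7 / wavenumber_cm


# Mid-infrared limits, given from high to low wavenumber.
mid_ir_limits_cm = (4000.0, 500.0)
mid_ir_limits_nm = tuple(wavenumber_to_wavelength_nm(v) for v in mid_ir_limits_cm)
print(mid_ir_limits_nm)  # (2500.0, 20000.0)
```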

This study aimed to develop a non-destructive spectroscopic method to estimate leaf starch content in fresh leaf tissue of red clover. Although developed in the greenhouse under controlled conditions, such a method could, once validated in the field, make it possible to monitor starch turnover on the same genotype over a longer period, under different management regimes, and under various environmental conditions.

For the development of a non-destructive method to analyse leaf starch content in red clover, leaf starch was determined in two independent sets of plants (i.e., a training set and a test set) using wet lab analysis and leaf spectroscopy.

Wet lab analysis for starch quantification

Starch concentration of training set samples harvested at the end of the day (ED) ranged from 2.0 to 120.3 mg starch per g dry weight (DW), with a median of 46.3 mg g^-1 DW. For the samples harvested at the end of the night (EN), starch concentration ranged from 0.1 to 47.8 mg g^-1 DW, with a median of 9.6 mg g^-1 DW. Starch concentration for the test set ranged from 26.41 to 125.44 mg g^-1 DW for ED harvested samples, with a median of 66.18 mg g^-1 DW. Plants harvested at EN had lower starch concentrations, ranging from 3.66 to 79.51 mg g^-1 DW, with a median of 23.28 mg g^-1 DW. Differences between samples harvested at ED and EN were statistically significant (p < 0.05) for both sets (Fig. 2). To test the reproducibility of the enzymatic method, three technical replicates of the 24 plants of the training set were analysed, resulting in a standard error (SE) of 0.096 mg g^-1 DW (data not shown). In addition, no substantial difference in dry matter content (dry weight / fresh weight) was observed between samples from different plants or samples taken at different time points, indicating that water content per se was not responsible for the differences in starch content observed by spectral analysis (data not shown).

The iodine-stained leaves displayed analogous patterns. Plants harvested at ED showed higher starch accumulation than plants harvested at EN, indicated by a dark coloration of the leaves (Fig. 3A, B). Differences in coloration were visible not only between diurnal time points, but also across and within leaves (Fig. 3B, C). Coloration varied within a plant, showing a clear pattern in which young leaves accumulated less starch than old leaves. While starch accumulation varied within individual leaflets, no clear pattern was distinguishable at this scale based on iodine visualization (Fig. 3C). Observations from the iodine staining were confirmed by starch quantification in different leaf types. Young leaves had a significantly lower (p < 0.05) starch concentration than mature leaves, old leaves or entire plants (Fig. 4). Some differences between genotypes of the training set harvested at ED were statistically significant (p < 0.05), but a high variation within genotypes was observed (Fig. 5).

Spectral measurements and modelling

Leaf spectra were measured with a spectroradiometer on leaf discs cut from multiple leaflets per plant. Spectral data were pre-processed and subsequently modelled using partial least squares regression (PLSR).

The average reflectance spectra of the training set revealed similar patterns for both harvest time points ED and EN (Fig. 6A). VNIR/SWIR (350 nm-2500 nm) spectra had three main absorption regions, around the absorption bands of 700 nm, 1400 nm, and 1900 nm. Savitzky-Golay smoothed spectra showed very similar patterns across the entire wavelength range for samples harvested at ED and those harvested at EN (Fig. 6B).

The best PLSR training model with pre-processed spectra resulted in an accurate starch prediction for the training set (R² = 0.72, RMSE = 13 mg g^-1 DW, bias = -0.0), using seven PLS components (Fig. 7A). Five-times repeated 10-fold cross-validation performed on the same data set yielded a moderate coefficient of determination (R²_CV) of 0.56, an RMSE_CV of 17 mg g^-1 DW, and a residual bias of -0.2 (Fig. 7B).

PLSR modelling using pre-processed spectra performed better than modelling using raw spectra as predictors (Additional file 3: Fig. S2). Separating cross-validated predictions by ED and EN resulted in lower coefficients of determination of R²_CV = 0.39 and R²_CV = 0.25 for ED and EN, respectively (Additional file 4: Fig. S3). Restricting the predictors to the wavelengths most relevant for estimating starch content did not increase model performance substantially (Table 1). Three selection approaches were tested: first, filtering by the training variable importance in the projection (VIP > 1); second, selecting the 50 wavelengths most strongly correlated with starch (PCC); and third, MLR on starch-assigned bands divided by the reflectance at the wavelength with the minimum standard error.

Model evaluation with an independent test set

Independent test set predictions (n = 57) using the best training PLSR model calibrated with pre-processed spectra (n = 337; ncomp = 7, all wavelengths) yielded a substantially lower R² of 0.36 and a larger RMSE of 29 mg g^-1 DW (Fig. 8). The three training models calibrated with variable selection (VIP > 1, top 50 correlations, and MLR with normalized assigned starch bands) resulted in inferior accuracy when applied to the test set (Table 1).

Table 1 Cross-validated training set (n = 337) and test set (n = 57) evaluation of vis–NIR PLSR models for leaf starch that were developed with different feature selection methods on the training set. RMSE and bias are given in mg g^-1 DW; b is the slope of the linear regression.

Method	Training R²	Training b	Training RMSE	Training bias	Test R²	Test b	Test RMSE	Test bias	Description
VIP > 1	0.58	1.09	16.6	-1.11	0.22	0.58	29.5	-0.78	Filtering based on the training variable importance in the projection (VIP)
PCC	0.47	1.04	18.5	-0.03	0.37	0.94	32.1	-20.4	Filtering based on the top 50 starch-correlated wavelengths
MLR	0.26	0.96	21.8	0.01	0.18	30.2	38.0	21.3	Reflectance at 556 nm, 702 nm, 1300 nm, and 1960 nm divided by reflectance at 670 nm (minimum standard deviation)

Models were developed using different filtering methods before performing partial least squares regression (PLSR): variable importance in the projection (VIP), the top 50 starch-correlated wavelengths (PCC), and multiple linear regression (MLR). For each filtering method, the best model as determined by five-times repeated 10-fold cross-validation was used to estimate leaf starch content of the independent test set.

Hyperspectral imaging on dry homogenized material is a widely used and well established technique, but applying this method to fresh plant material is not yet a standard analytical procedure [26, 27]. Accordingly, the correlation between hyperspectral measurements and wet lab results for starch reported in this study was clearly lower than for NIRS measurements on dried plant material, where coefficients of determination (R²) reached 0.99 for nitrogen and starch content of cotton leaves [19]. One challenge of using fresh leaf material is the water in the leaves. Liquid water is a strong absorber of infrared radiation, with predominant bands in the regions near 1200 nm, 1450 nm, and 1950 nm [26], where important wavelengths were present in this study (Additional file 5: Fig. S4). It is therefore likely that water absorption masked the absorption bands of starch molecules, impairing prediction of starch content to some extent [26, 27]. Starch absorption characteristics can be obscured not only by water absorption, but also by the cell structure of fresh plants, which scatters light as it passes through multiple air and water boundaries. Furthermore, the distribution of starch in fresh leaves is not uniform with respect to the organization of cells and organelles [26]. The problems associated with the prediction of starch content in fresh leaves might be reduced if spectral data are pre-processed [35]. Indeed, pre-processing of the spectra considerably improved predictive accuracy compared to unprocessed reflectance spectra (Additional file 3: Fig. S2), by removing systematic variation in spectra such as light scattering and thereby increasing the signal-to-noise ratio [35].

Total starch concentration of the plant material in the training set was between 0.2% and 12% for plants harvested at ED and ranged from 0.01% up to 5% for the plant material harvested at EN. The starch concentrations of the training set were slightly lower than the concentrations of the test set. The total starch concentration was substantially lower than the values published by Ruckle et al. [8], where leaf starch concentration ranged from 6% up to 35% for ED harvested plants. This difference most likely arose from different growing conditions, since light and temperature have a strong impact on starch accumulation.
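The percentages quoted here follow from the wet lab concentrations by a unit conversion: 1% of dry weight corresponds to 10 mg g^-1 DW. A minimal sketch, using the training set ranges reported in the results (the function name is illustrative):

```python
def mg_per_g_to_percent(concentration_mg_per_g: float) -> float:
    """Convert mg starch per g dry weight to percent of dry weight."""
    # 1 g = 1000 mg, so 10 mg per g corresponds to 1%.
    return concentration_mg_per_g / 10.0


# Training set ranges in mg g^-1 DW as reported in the results.
ed_range = (2.0, 120.3)
en_range = (0.1, 47.8)

ed_percent = tuple(round(mg_per_g_to_percent(v), 2) for v in ed_range)
en_percent = tuple(round(mg_per_g_to_percent(v), 2) for v in en_range)
print(ed_percent, en_percent)
```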

Many studies have shown that starch content depends strongly on the diurnal cycle [3, 5, 37, 38]. A more than 3-fold difference in starch content was observed between ED harvested plants and plants harvested at EN (Fig. 4). Mean genotypic differences for the training set ranged from 31.6 to 59.7 mg g^-1 DW for the ED harvested plants (Fig. 5) and from 2.2 to 37.6 mg g^-1 DW for the EN harvested plants (data not shown), showing high variation within genotypes. The best cross-validated PLSR training model explained 56% (R² = 0.56) of the measured starch variation with an RMSE of 17 mg g^-1 DW. The ratio of performance to deviation (RPD) followed the trend indicated by the R² values (Fig. 7, Fig. 8, Additional file 3: Fig. S2 and Additional file 4: Fig. S3). The cross-validated overall bias was almost zero for the training set, while predictions on the test set had a bias of -10.7 mg g^-1 DW.

These results imply that the developed vis–NIR PLSR model can predict differences between harvest time points and differences between extreme genotypes. Nevertheless, the 56% of starch variation explained by our model is lower than the proportions reported by Shorten et al. [40]. They used hyperspectral imaging systems (550 nm–1700 nm) to estimate more than ten different quality compounds in perennial ryegrass (Lolium perenne L.). Low and high weight sugars were estimated separately, and the best model prediction for the high weight sugars using PLS regression resulted in an R² of 0.68 and an RMSE of 19.9 mg g^-1. Assigning two thirds of the data to calibration and using the remaining data for validation resulted in a slightly lower model performance (R² = 0.63 and RMSE of 21.6 mg g^-1) [40]. Filtering spectral variables by a variable importance in the projection threshold (VIP > 1) did not considerably improve model performance (Table 1). This is in contrast to comparable studies where selecting important wavelengths improved model accuracy and reduced the redundancy effects of wavelengths that had low weight in the model [35, 36]. Our model results indicate that PLSR restricted to a subset of important spectral variables did not estimate starch as effectively as PLSR on the full-range vis–NIR data, suggesting that many spectral features are important for starch prediction. For example, the wavelengths near 550 nm, 770 nm, 850 nm, 1440 nm, 1920 nm and 2160 nm, and from 1650 nm to 1850 nm, had a relatively high model contribution for estimating starch content in the training set (VIP analysis, Additional file 5: Fig. S4). The red-edge region around 700 nm, where a local maximum of the first derivative is located and which is typically indicative of chlorophyll, had relatively low model importance. However, wavelengths adjacent to the red-edge were moderately important. The highest VIP in the training set was around 550 nm. This region was shown to be the second most important region in the vis–NIR for the spectral estimation of total carbon, nitrogen, leaf mass per unit area, protein and nitrate from wet leaves of 8 crop species [41]. Starch absorbance in fresh leaves was further associated with wavelengths in the regions of 556 nm, 702 nm, 1300 nm and 1960 nm [42]. These absorptions partly corresponded with the VIP patterns across wavelengths for the data of the present study. In addition, linking spectra, model and compound by explanatory inference is hampered by spectral overlaps due to dominant water bands and signals of other compounds related to starch. In fact, plant leaves contain many biochemical compounds with vis–NIR absorption regions that overlap with starch absorptions, or whose concentrations directly or indirectly correlate with starch, such as cellulose, water or lignin, all having signals from O–H vibrations in the regions around 1450 nm and 1940 nm [42]. Curran et al. [42] performed both a correlative and a stepwise regression analysis between 12 abundant structural, productive and storage compounds, and vis–NIR first derivative spectra of ground and dried slash pine needles. Among the components tested, starch exhibited the lowest coefficient of determination with first derivative spectra, and selected starch wavelengths were 1208 nm, 1418 nm, and 2172 nm, whereas 978 nm and 1208 nm were linked to starch absorption features. Native plant starch consists of a variable ratio of amylose and amylopectin. Amylose content in various mixtures was accurately discriminated with vis–NIR reflectance, showing major spectral feature differences between 1700 nm and 1800 nm in the pure form [43]. We found two VIP peaks with moderate importance (around 1.2) that might be linked to amylose and amylopectin signals. PLSR and the variable importance analysis were thus able to explain a significant proportion of the starch variability.

A model built from a single set of training observations is often not adequate to predict an independent data set [32, 33]. If a model is tested on the same data that was used to fit the model, performance is often overestimated [32]. Our study showed that the cross-validated PLSR model underestimated high starch contents (Fig. 7). The independent dataset from a second experiment (test set) allowed us to further validate model performance, in addition to cross-validation during training. As expected, the test prediction resulted in a 1.7-fold increase in RMSE (Fig. 8). Moreover, models including only a subset of wavelengths were validated on the test set, resulting in lower predictive performance (Table 1). The VIP analysis of the two independent sets (training and test) indicated that some important wavelength regions occurred in both sets, but with different VIP magnitudes (Additional files 5, 6: Figs. S4, S5). For example, the absorption feature near 1450 nm was less important for the test set model fitting than for the model developed for the training set. Further, the training model had important features between 500 nm and 750 nm, whereas the re-calibrated test model had important wavelengths below 500 nm. These differences in VIP magnitudes and the additional regions relevant for prediction partly explain the poorer prediction performance on the test set when applying the training model. Despite the fact that two of the three genotypes from the test set were included in the training set, the spectra and models had only limited generalization capacity for starch contents.

Recalibration using only test data led to a slight decrease in RMSE compared to test prediction and substantially reduced bias. Thus, a new calibration may be needed for each independent trial, or the current red clover starch spectral library needs to be augmented with more measurements from different independent trials with both genotypic and phenotypic variance in starch. Various environmental growth conditions influence starch accumulation and can thereby mask genotypic effects [38]. Hence, we suggest follow-up research to verify whether a separate spectral model is required for each independent trial to improve predictive accuracy under substantial genotype × environment interaction. In addition, all measurements were taken under controlled conditions where leaves were completely removed from the plants. As a next step, it is crucial to evaluate the method under field conditions, where leaves are left on the plant and measured repeatedly.

Despite the relatively low prediction accuracy, performance of the best PLSR training model was sufficient to detect differences between red clover genotypes with very high or very low levels of starch content. Therefore, once validated in the field, the method developed has the potential to contribute to the breeding of high energy red clover cultivars.

The success of breeding forage crops with increased energy content was previously demonstrated by breeding perennial ryegrass cultivars with high levels of water-soluble carbohydrates (WSC). These WSC cultivars can substantially increase animal performance and nitrogen use efficiency in pasture-based animal production systems [44]. Red clover and ryegrasses are often cultivated in mixtures, not only due to their attractive diet composition, but also due to the transfer of N between species. In addition, grass-clover mixtures require fewer pesticide and herbicide applications, and protect soils against erosion [45, 46]. Therefore, high starch red clover cultivars in mixtures with high WSC ryegrasses appear to be a particularly promising option, which brings us one step closer to environmentally sustainable feed production that meets the high energy requirements of modern livestock production.

This study is unique in developing and testing a non-destructive method to predict leaf starch content in red clover plants. The described method is suitable to differentiate between high and low starch content in red clover genotypes. However, model performance is not sufficient to trace small changes in starch accumulation. Therefore, the method is only partially suited to monitor starch metabolism in detail or to investigate the effect of environmental influences or management regimes throughout an entire season on the same plant. We suggest follow-up studies to enlarge the current red clover starch spectral library by means of additional measurements from different independent trials, covering both genotypic and phenotypic variation in starch, and to validate the method under field conditions. Currently, the level of resolution is sufficient for the method to differentiate large differences in starch, and the method can thus be integrated into existing breeding programs to get a first impression of starch levels of different red clover cultivars under controlled conditions.

Leaf starch was determined in two independent sets (i.e., a training set and a test set) of plants grown in two separate experiments. Wet laboratory measurements were taken on exactly the same material as used for the hyperspectral measurements.

Plant material and growth conditions

Thirty red clover plants obtained from IPK Gatersleben, Seeland, Germany and Agroscope, Zurich, Switzerland, were used for the study (Additional file 1: Table S1). All plants were clonally propagated, using cuttings that contained only one shoot and one root meristem to ensure comparable physiological states of all plants. These clonally propagated plants were grown in a climate chamber for 90 days before harvesting (3:2:1 soil:peat:perlite substrate, photoperiod of 14:10 h L:D, day temperature 20 ± 2 °C, night temperature 15 ± 2 °C and relative humidity 60 ± 10%). Samples were taken at the end of the night (EN; before lights were turned on), and at the end of the day (ED; before lights were turned off). The training set included 18 randomly selected genotypes, six thereof clonally duplicated, resulting in a total of 24 plants (Additional file 1: Table S1). Starch was measured on twelve plants at EN and on twelve plants at ED, using 15 leaf cuts per plant, taken on all three leaflets of five leaves (the youngest, fully emerged leaf [y], the oldest leaf [o] and three intermediate leaves [m; Fig. 1]). In total, the training set included 360 measurements (24 plants × 3 leaflets × 5 leaves per plant). The test set included six plants from three different genotypes. Three plants were harvested at ED and three plants at EN (Additional file 1: Table S1). For the test set, ten leaf cuts were taken per plant on randomly selected leaflets of mature leaves, resulting in a total of 60 measurements (6 plants × 10 leaflets per plant).

Leaf spectroscopy

Leaflets were cut using a round, sharpened tube with a diameter of 12 mm to standardize leaf area (Additional file 2: Fig. S1). These leaf cuts were placed on the matt black surface of the FieldSpec4 pro device (Analytical Spectral Devices, Boulder, Colorado, USA). The device is not influenced by external light sources, potentially enabling its application in field experiments. Radiance between 350 and 2500 nm was measured. The spectrometer’s contact probe was fixed on a clamp, and the sample was placed so that no light escaped through the sides. Leaf samples were referenced to a Spectralon white reference every fifth recording and the radiance measurements were transformed to reflectance. Immediately after taking spectral measurements, leaf cuts were flash frozen in liquid nitrogen and freeze-dried for 48 h.
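The transformation from radiance to reflectance can be sketched as a per-wavelength ratio of the sample signal to the white reference signal. The snippet below is a minimal illustration, not the instrument software's procedure: the function name, the assumption of an ideal reference (reflectance of 1.0) and the radiance values are all hypothetical.

```python
def to_reflectance(sample_radiance, white_radiance, white_reflectance=1.0):
    """Convert radiance readings to reflectance, wavelength by wavelength.

    white_reflectance is the assumed reflectance of the reference panel;
    1.0 treats the panel as an ideal diffuse reflector (an assumption).
    """
    if len(sample_radiance) != len(white_radiance):
        raise ValueError("sample and reference must cover the same wavelengths")
    return [white_reflectance * s / w
            for s, w in zip(sample_radiance, white_radiance)]


# Hypothetical radiance readings at three wavelengths.
leaf = [0.2, 0.5, 0.3]
white = [0.8, 1.0, 0.6]
print(to_reflectance(leaf, white))  # [0.25, 0.5, 0.5]
```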

After taking spectral measurements, whole plants from the training set were cut 2 cm above ground, flash-frozen and freeze-dried for starch quantification.

Wet lab analysis for starch quantification

Starch in leaf cuts and whole plants was quantified as described by Ruckle et al. [8]. Two additional clones of one genotype, not included in the correlation model, were iodine stained to visualize the starch pattern within a plant. Plants were harvested either at ED or at EN, washed with tap water and placed in 80% (v / v) boiling ethanol. After two hours, when plants were transparent, they were removed and placed in Lugol’s solution. After 10 minutes, the Lugol’s solution was rinsed off to destain the non-target areas. The plants were photographed on a light-table.

Statistical analyses

Statistical analyses were performed using the R statistical software version 3.6.0 [28]. Significance testing was performed using the library MASS [29]. Assumptions of homoscedasticity of variances and normality of residuals were not met; therefore, an exact Wilcoxon rank sum test at α = 5% was performed.
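To illustrate the logic of an exact rank sum test, the following sketch enumerates every possible assignment of the pooled ranks to the first group and counts how many are at least as extreme as the observed one. It is a plain Python illustration under stated assumptions, not the R routine used in the study: the function names and starch values are hypothetical, ties receive mid-ranks, and the two-sided p-value is defined by the absolute deviation of the rank sum from its expectation.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum_test(x, y):
    """Return the rank sum of x and the exact two-sided p-value."""
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n_x, n_total = len(x), len(pooled)
    observed = sum(ranks[:n_x])
    expected = n_x * (n_total + 1) / 2
    deviation = abs(observed - expected)
    n_assignments = n_extreme = 0
    # Under the null hypothesis every subset of ranks is equally likely for x.
    for subset in combinations(range(n_total), n_x):
        n_assignments += 1
        if abs(sum(ranks[i] for i in subset) - expected) >= deviation - 1e-12:
            n_extreme += 1
    return observed, n_extreme / n_assignments


# Hypothetical starch values (mg g^-1 DW) for two harvest time points.
ed = [46.3, 60.1, 120.3, 75.0, 52.4]
en = [9.6, 0.1, 47.8, 12.2, 20.5]
rank_sum, p_value = exact_rank_sum_test(ed, en)
print(rank_sum, p_value)
```

Full enumeration is only practical for small groups; the number of assignments grows combinatorially with sample size.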

Pre-processing of spectral data

Spectral analysis was carried out using the R package simplerspec [30]. The mean reflectance values of 10 measurements per sample were used. Leaf spectra were pre-processed prior to modelling. Gaps between the different detector arrays at λ = 1000 nm and at λ = 1800 nm were splice corrected. Spectra were smoothed with the Savitzky-Golay first derivative filter using a 3rd-order polynomial and a 21-point window (21 nm at a resampled spectrum interval of 1 nm; R package prospectr [31]). Spectral pre-processing is crucial to reduce noise and baseline drift resulting from light scatter before establishing a correlation model. After Savitzky-Golay smoothing, the spectral variables were centered and scaled prior to relating them to leaf starch using partial least squares regression (PLSR), so that variables were considered equally, independent of their variation in absolute values. PLS regression is a well-established chemometric method, which can cope with multicollinearity in spectra and delivers robust calibration models with many predictors and few observations [32, 33]. To further reduce collinearity in processed spectra, only every fourth wavelength was kept for modelling, resulting in 533 spectral predictor variables.
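The pre-processing chain described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the simplerspec/prospectr code used in the study: it assumes that the filter drops the 10 edge points on each side of the 21-point window and that subsampling starts at the first remaining wavelength, and it uses a synthetic cubic curve in place of a real spectrum. Under these assumptions, the 2151 wavelengths between 350 and 2500 nm reduce to 2131 derivative values and then to the 533 predictors mentioned in the text.

```python
def savgol_first_derivative_weights(half_window):
    """Weights of a Savitzky-Golay first-derivative filter (cubic fit).

    For a symmetric window the least-squares cubic has a closed-form
    slope at the window centre, which gives these weights directly.
    """
    offsets = range(-half_window, half_window + 1)
    s2 = sum(k ** 2 for k in offsets)
    s4 = sum(k ** 4 for k in offsets)
    s6 = sum(k ** 6 for k in offsets)
    determinant = s2 * s6 - s4 * s4
    return [(s6 * k - s4 * k ** 3) / determinant for k in offsets]


def savgol_first_derivative(values, half_window=10):
    """Apply the filter; edge points without a full window are dropped."""
    weights = savgol_first_derivative_weights(half_window)
    offsets = range(-half_window, half_window + 1)
    return [
        sum(w * values[i + k] for k, w in zip(offsets, weights))
        for i in range(half_window, len(values) - half_window)
    ]


wavelengths = list(range(350, 2501))                # 1 nm steps, 2151 points
spectrum = [1e-9 * wl ** 3 for wl in wavelengths]   # synthetic cubic curve

derivative = savgol_first_derivative(spectrum)      # 21-point window
derivative_wavelengths = wavelengths[10:-10]        # 360 ... 2490 nm

# Keep every fourth variable to reduce collinearity between neighbours.
kept_wavelengths = derivative_wavelengths[::4]
kept_derivative = derivative[::4]
print(len(derivative), len(kept_wavelengths))       # 2131 533
```

Because the filter fits a cubic polynomial, it reproduces the derivative of the synthetic cubic curve up to rounding error, which makes the sketch easy to check.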

Model development

Leaf reflectance data from the training set was modelled by PLSR [32], using either raw or pre-processed spectra as predictors. A 5-times repeated 10-fold cross-validation scheme was used to fit the models, to determine the best number of components (ncomp), and to estimate model performance of the final model. A constant random seed was set for resampling, yielding identical hold out data across all models. Model reliability was assessed by the coefficient of determination (R²) and slope (b) of a linear regression with intercept, the root mean square error (RMSE, Eq. 1), the bias or mean error (Eq. 2), and the ratio of performance to deviation (RPD, Eq. 3). The evaluation metrics were calculated by aggregating all holdout predictions from the repeated 10-fold cross-validation (ŷ_i) and corresponding observed values (y_i) grouped by ncomp. (see Equations 1-3 in the Supplemental Files)
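Because Equations 1 to 3 are only available in the supplemental files, the sketch below uses common textbook definitions of these metrics; they may differ in detail from the authors' formulas. Assumptions: bias is the mean of predicted minus observed values, RPD is the sample standard deviation of the observed values divided by the RMSE, and the slope b comes from regressing observed on predicted values. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def evaluate_predictions(observed, predicted):
    """Summarise agreement between observed and predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = sqrt(sum(e * e for e in errors) / n)
    bias = mean(errors)                    # assumed sign: predicted - observed
    rpd = stdev(observed) / rmse           # assumed: sample SD over RMSE

    # Linear regression with intercept of observed on predicted (assumed).
    mean_obs, mean_pred = mean(observed), mean(predicted)
    s_op = sum((o - mean_obs) * (p - mean_pred)
               for o, p in zip(observed, predicted))
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    slope = s_op / s_pp
    r_squared = s_op ** 2 / (s_pp * s_oo)

    return {"r2": r_squared, "b": slope, "rmse": rmse,
            "bias": bias, "rpd": rpd}


# Hypothetical holdout predictions pooled over all cross-validation repeats.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = evaluate_predictions(observed, predicted)
print(metrics)
```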

Variable influence on projection scores (VIP) is a measure of variable importance tailored to PLS regression [34–36]. VIP scores were calculated from the PLS regression parameters taking multicollinearity into account, which is likely to occur because of the nature of spectroscopic data. VIP scores are considered as a robust measure to identify relevant predictors, here important wavelengths. A variable with VIP above 1 contributes more than the average variable to the model prediction. The VIP value vj was calculated for each wavelength variable j as (see Equation 4 in the Supplemental Files)

where w_aj are the PLS regression weights for the a^th component for each of the wavelength variables and SS_a is the sum of squares explained by the a^th component (Eq.4). The sum of squares SS_a for the a^th component was calculated from the score q_a of the predicted variable y and the t_a scores of the spectral matrix X (Eq. 5): (see Equation 5 in the Supplemental Files)

VIP scores were also used to filter important predictors with a threshold of VIP > 1 within the training set, and the identified predictors were used to re-calibrate the test set and assess performance. This separation in the VIP filtering by independent tests was needed to avoid overfitting and over-optimistic assessment that typically occurs when identifying subsets of features on the modelling data.

In addition to the VIP-based filtering, two other procedures were applied for wavelength selection. First, the 50 wavelengths most strongly correlated with starch content according to Pearson’s correlation coefficient (r) were used to refit the PLS regression. Second, four wavelengths that had been assigned to starch in previous literature were normalized by the reflectance at the wavelength with the smallest standard deviation across the entire wavelength range, prior to performing a multiple linear regression (MLR; [22]).
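The two additional selection procedures can be sketched in Python as follows. Several details are assumptions, since the text does not specify them: wavelengths are ranked by the absolute value of r, normalization is taken to mean division by the reflectance at the most stable wavelength, and the population standard deviation is used to find that wavelength. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0  # constant input: no correlation


def top_k_by_correlation(spectra, starch, k=50):
    """Return the column indices of the k wavelengths with the largest |r|.

    spectra : list of samples, each a list of reflectance values
    starch  : list of measured starch contents, one per sample
    """
    n_wl = len(spectra[0])
    r = [pearson_r([row[j] for row in spectra], starch) for j in range(n_wl)]
    ranked = sorted(range(n_wl), key=lambda j: abs(r[j]), reverse=True)
    return sorted(ranked[:k])


def normalise_by_stable_band(spectra, selected):
    """Divide selected bands by the band with the smallest SD (assumed ratio)."""
    n_wl = len(spectra[0])
    sd = [pstdev([row[j] for row in spectra]) for j in range(n_wl)]
    ref = min(range(n_wl), key=lambda j: sd[j])
    return [[row[j] / row[ref] for j in selected] for row in spectra]


# Toy data: four samples, three wavelengths (made-up numbers)
spectra = [[1.0, 5.0, 4.0], [2.0, 5.1, 3.0], [3.0, 5.0, 1.0], [4.0, 5.1, 2.0]]
starch = [1.0, 2.0, 3.0, 4.0]
selected = top_k_by_correlation(spectra, starch, k=2)
ratios = normalise_by_stable_band(spectra, selected)
```

To keep the test set independent, both the ranking and the reference wavelength would be determined on the training spectra only and then applied unchanged to the test spectra.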

Model evaluation using the test set

The best training model, tuned by cross-validation and refitted on all training data, and the training models obtained with the three different wavelength selection (filtering) methods were tested on the independent test data set (60 samples). The predictive ability of these final models was again evaluated using R² and RMSE on the test set. In addition to these test set predictions, a PLSR model was re-calibrated using only the test data. This re-calibration made it possible to determine whether the test set contained different or differently weighted spectral features relevant for starch prediction, which would explain why the relationships learned from the training set did not generalize to this independent test experiment.

Abbreviations

ED: at the end of the day

EN: at the end of the night

DW: dry weight

PLSR: partial least squares regression

VIP: variable importance in the projection

PCC: top 50 starch correlated wavelengths

MLR: multiple linear regression

Declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and materials

Raw data and scripts can be accessed via:

https://zenodo.org/record/3598699#.XhOLvdko_d4

Competing interests

The authors declare that they have no competing interests

Funding

This research was partially supported by the Coop–Research Fellowship Program through the ETH–World Food System Center (WFSC). The funding body contributed to the design of the study but was not involved in collection, analysis and interpretation of data, nor was it involved in the writing of the manuscript.

Author contribution statement

LAF performed the research, analysed the wet lab data and drafted the manuscript. PB analysed the spectral data and HA assisted with data analysis. PB and HA helped to improve the manuscript. RK and BS supported the research design and helped to interpret the results and draft the manuscript. All authors read and approved the final version of the manuscript.

Acknowledgements

We thank Dr. Michael Ruckle for initiating the research on high starch red clover and for helping with the wet lab starch analyses, Prof. Dr. Achim Walter for the allocation of the FieldSpec4 pro device and Dr. Dániel Carrera and Ms Verena Knorst for the excellent technical support.

References

1. Steinfeld H, Gerber P, Wassenaar T, Castel V, Rosales M, De Haan C. Livestock’s long shadow. Rome: FAO; 2006
2. Stettler M, Eicke S, Mettler T, Messerli G, Hörtensteiner S, Zeeman SC. Blocking the metabolism of starch breakdown products in Arabidopsis leaves triggers chloroplast degradation. Mol Plant 2009, 2:1233-1246
3. Geiger DR, Servaites JC. Diurnal regulation of photosynthetic carbon metabolism in C3 plants. Annu Rev Plant Physiol Plant Mol Biol 1994, 45:235-256
4. Stitt M, Gibon Y, Lunn JE, Piques M. Multilevel genomics analysis of carbon signalling during low carbon availability: Coordinating the supply and utilisation of carbon in a fluctuating environment. Funct Plant Biol 2007, 34:526-549
5. Graf A, Schlereth A, Stitt M, Smith AM. Circadian control of carbohydrate availability for growth in Arabidopsis plants at night. Proc Natl Acad Sci USA 2010, 107:9458-9463
6. Mugford ST, Fernandez O, Brinton J, Flis A, Krohn N, Encke B, et al. Regulatory properties of ADP glucose pyrophosphorylase are required for adjustment of leaf starch synthesis in different photoperiods. Plant Physiol 2014, 166:1733-1747
7. Vriet C, Welham T, Brachmann A, Pike M, Pike J, Perry J, et al. A suite of Lotus japonicus starch mutants reveals both conserved and novel features of starch metabolism. Plant Physiol 2010, 154:643-655
8. Ruckle ME, Meier MA, Frey L, Eicke S, Kölliker R, Zeeman SC, et al. Diurnal leaf starch content: An orphan trait in forage legumes. Agronomy 2017, 2:1-15
9. Liu W, Su J, Li S, Lang X, Huang X. Non-structural carbohydrates regulated by season and species in the subtropical monsoon broad-leaved evergreen forest of Yunnan Province, China. Sci Rep 2018, 8:1-10
10. Moraes MG, Chatterton NJ, Harrison PA, Filgueiras TS, Figueiredo-Ribeiro RCL. Diversity of non-structural carbohydrates in grasses (Poaceae) from Brazil. Grass Forage Sci 2013, 68:165-177
11. Griggs TC, MacAdam JW, Mayland HF, Burns JC. Temporal and vertical distribution of nonstructural carbohydrate, fiber, protein, and digestibility levels in orchardgrass swards. Agron J 2007, 99:755-763
12. Pelletier S, Tremblay GF, Bélanger G, Bertrand A, Castonguay Y, Pageau D, et al. Forage nonstructural carbohydrates and nutritive value as affected by time of cutting and species. Agron J 2010, 102:1388-1398
13. Taylor NL. A century of clover breeding developments in the United States. Crop Sci 2008, 48:1-13
14. Halling MA, Hopkins A, Nissinen O, Paul C, Tuori M, et al. Forage legumes – productivity and composition. Landbauforsch Voelkenrode, Sonderh 234 2001, 8-9
15. Broderick GA. Desirable characteristics of forage legume for improving protein utilisation in ruminants. J Anim Sci 1995, 73:2760-2773
16. Hostettler C, Kölling K, Santelia D, Streb S, Kötting O, Zeeman SC. Chloroplast research in Arabidopsis. Chloroplast Res Arab Methods Protoc 2011, 775:387-410
17. Goetz AFH, Gao BC, Wessman CA, Bowman WD. By spectrum matching techniques. Brows Conf, Geosci Remote Sensing 1990, 971-974
18. Yoder BJ, Pettigrew-Crosby RE. Predicting nitrogen and chlorophyll content and concentrations from reflectance spectra (400-2500 nm) at leaf and canopy scales. Remote Sens Environ 1995, 53:199-211
19. Hattey JA, Sabbe WE, Baten GD, Blakeney AB. Nitrogen and starch analysis of cotton leaves using near infrared reflectance spectroscopy (NIRS). Commun Soil Sci Plant Anal 1994, 25:9-10
20. Barton FE. New methods for the structural and compositional analysis of cell walls for quality determinations. Anim Feed Sci Technol 1991, 32:1-11
21. Card DH, Peterson DL, Matson PA, Aber JD. Prediction of leaf chemistry by the use of visible and near infrared reflectance spectroscopy. Remote Sens Environ 1988, 26:123-147
22. Kumar L, Schmidt K, Dury S, Skidmore A. Imaging spectrometry and vegetation science. In Imaging Spectrometry. Volume 4. Edited by van der Meer F and de Jong SM. Dordrecht: Springer; 2002:111-154
23. Hetta M, Mussadiq Z, Wallsten J, Halling M, Swensson C, Geladi P. Prediction of nutritive values, morphology and agronomic characteristics in forage maize using two applications of NIRS spectrometry. Soil Plant Sci 2017, 67:326-333
24. Lu X, Sun J, Mao H, Wu X, Gao H. Quantitative determination of rice starch based on hyperspectral imaging technology. Int J Food Prop 2017, 00:1-8
25. Kjær A, Nielsen G, Stærke S, Clausen MR, Edelenbos M, Jørgensen B. Prediction of starch, soluble sugars and amino acids in potatoes (Solanum tuberosum L.) using hyperspectral imaging, dielectric and LF-NMR methodologies. Potato Res 2016, 59:357-374
26. Fourty T, Baret F. On spectral estimates of fresh leaf biochemistry. Int J Remote Sens 1998, 19:1283-1297
27. Curran PJ, Dungan JL, Macler BA, Plummer SE, Peterson DL. Reflectance spectroscopy of fresh whole leaves for the estimation of chemical concentration. Remote Sens Environ 1992, 39:153-166
28. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2017
29. Venables WN, Ripley BD. Modern Applied Statistics with S. 4th ed. New York: Springer; 2002
30. Baumann P. SimplerSpec. 2016. https://github.com/philipp-baumann/leaf-starch-spc. Accessed 21 Aug 2019
31. Stevens A, Ramirez Lopez L. An introduction to the prospectr package. 2014:1-22
32. Kuhn M, Johnson K. Measuring performance in regression models. In Applied Predictive Modeling. Volume 5. New York: Springer; 2013:59-100
33. Naes T, Martens H. Comparison of prediction methods for multicollinear data. Communications in Stat 1985, 14:545-576
34. Zhao N, Wu ZS, Zhang Q, Shi XY, Ma Q, Qiao YJ. Optimization of parameter selection for partial least squares model development. Sci Rep 2015, 5:1-10
35. Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S. DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least-squares projections to latent structures. Anal Chim Acta 1993, 277:239-253
36. Chong IG, Jun CH. Performance of some variable selection methods when multicollinearity is present. Chemom Intell Lab Syst 2005, 78:103-112
37. Claessens A, Castonguay Y, Bertrand A, Bélanger G, Tremblay GF. Breeding for improved nonstructural carbohydrates in alfalfa. In Breeding in a World of Scarcity: 13-17 September 2015: Gent. Edited by Roldán-Ruiz I, Baert J, Reheul D. Cham: Springer; 2016:231-235
38. Holt DA, Hilst AR. Daily variation in carbohydrate content of selected forage crops. Agron J 1969, 61:239-242
39. Bellon-Maurel V, Fernandez-Ahumada E, Palagos B, Roger JM, McBratney A. Critical review of chemometric indicators commonly used for assessing the quality of the prediction of soil attributes by NIR spectroscopy. TrAC Trends Anal Chem 2010, 29:1073-1081
40. Shorten PR, Leath SR, Schmidt J, Ghamkhar K. Predicting the quality of ryegrass using hyperspectral imaging. Plant Methods 2019, 15:63
41. Ely KS, Burnett AC, Lieberman-Cribbin W, Serbin SP, Rogers A. Spectroscopy can predict key leaf traits associated with source–sink balance and carbon–nitrogen status. J Exp Bot 2019, 70:1789-1799
42. Curran PJ, Dungan JL, Peterson DL. Estimating the foliar biochemical concentration of leaves with reflectance spectrometry: testing the Kokaly and Clark methodologies. Remote Sens Environ 2001, 76:349-359
43. Fertig CC, Podczeck F, Jee RD, Smith MR. Feasibility study for the rapid determination of the amylose content in starch by near-infrared spectroscopy. Eur J Pharm Sci 2004, 21:155-159
44. Rasmussen S, Parsons AJ, Xue H, Newman JA. High sugar grasses – harnessing the benefits of new cultivars through growth management. Proc New Zeal Grassl Assoc 2009, 71:167-175
45. McKenna P, Cannon N, Conway J, Dooley J. The use of red clover (Trifolium pratense) in soil fertility-building: A review. Field Crops Res 2018, 221:38-49
46. Dhamala NR, Rasmussen J, Carlsson G, Søegaard K, Eriksen J. Nitrogen fixation in red clover grown in multi-species mixtures with ryegrass, chicory, plantain and caraway. Mult roles Grassl Eur 2016, 21:576-578

Additional File Legends

Additional file 1

Table S1 Name and origin of the used red clover genotypes

Additional file 2

Figure S1 A) Round sharpened tube used to cut out leaflets, B) leaf cuts and C) FieldSpec4 pro device, opened and closed

Additional file 3

Figure S2 PLS regression of raw spectra and best model performance of the cross-validation (ncomp = 8; n = 337). Different colours and shapes indicate the age of the leaves: m for mature leaves (red, circle), o for the oldest leaf (green, rectangle) and y for the youngest fully emerged leaf (blue, square). The regression line (dashed line) and the 1:1 line (fine black line) are shown. Statistics refer to the validation results and are given for complete statistical reporting

Additional file 4

Figure S3 PLS regression of the training set, shown separately for the two harvest times at the end of the day (ED; ncomp = 7, n = 165) and at the end of the night (EN; ncomp = 7, n = 172). Different colours and shapes indicate the age of the leaves: m for mature leaves (red, circle), o for the oldest leaf (green, rectangle) and y for the youngest fully emerged leaf (blue, square). The regression line (dashed line) and the 1:1 line (fine black line) are shown. Statistics refer to the validation results and are given for complete statistical reporting

Additional file 5

Figure S4 Reflectance spectra (top panel), pre-processed reflectance (Savitzky-Golay pre-processed; second panel), VIP filtering (third panel) and PLSR beta regression coefficients (bottom panel) for the training set

Additional file 6

Figure S5 Reflectance spectra (top panel), pre-processed reflectance (Savitzky-Golay pre-processed; second panel), VIP filtering (third panel) and PLSR beta regression coefficients (bottom panel) for the test set
