NMR profiles of Ganoderma-infected oil palm stem tissue in 4 severity indices
Based on the 1H NMR analysis, 90 metabolites were identified across the 4 disease severity levels: 65 metabolites for severity index 1, 51 for severity index 2, and 72 each for severity indices 3 and 4. Enrichment analysis was applied to classify these metabolites into 9 main classes of organic compounds: organic acids, carbohydrates, benzenoids, organoheterocyclic compounds, organic nitrogen compounds, fatty acyls, organic oxygen compounds, nucleic acids, and polyketides. Metabolites left unclassified by this step were assigned to classes according to the HMDB. Organic acids (including amino acids) and carbohydrate derivatives were the compound classes with the highest percentages in all disease severity indices (Fig. 1). Index 1 had the highest relative percentage of organic acids (43%), while Index 2 had the lowest (33%). Index 4 had the lowest relative percentage of carbohydrates (18%), while the other indices had relatively higher percentages (23%, 28%, and 26% for Indices 1, 2, and 3, respectively). Fatty acyls and benzenoids (including phenolic compounds) were also present in all indices, although at lower percentages than organic acids. Benzenoids had the highest relative percentage in Indices 1 and 4 and the lowest in Indices 2 and 3. Two compound classes were present in Indices 3 and 4 but not in Indices 1 and 2: polyketides in Index 3, and polyketides and nucleic acids in Index 4. Differences in the metabolites detected in oil palm stem tissues at different disease severity indices were then analysed using multivariate statistical analysis.
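As a minimal illustration of how relative class percentages such as those in Fig. 1 can be derived, the sketch below converts per-class metabolite counts into percentages of the total for one index. The counts are hypothetical placeholders, chosen only so that they sum to the 65 metabolites reported for index 1 and land near the reported 43% and 23% for organic acids and carbohydrates; they are not the study's data.

```python
from collections import Counter

def class_percentages(class_labels):
    """Return {class: percentage of metabolites}, rounded to one decimal."""
    counts = Counter(class_labels)
    total = sum(counts.values())
    return {cls: round(100.0 * n / total, 1) for cls, n in counts.items()}

# Hypothetical class assignments for one severity index (65 metabolites in total).
index1_labels = (
    ["organic acids"] * 28
    + ["carbohydrates"] * 15
    + ["benzenoids"] * 8
    + ["fatty acyls"] * 6
    + ["other"] * 8
)
percentages = class_percentages(index1_labels)
print(percentages)
```

Because each percentage is taken relative to the number of metabolites detected in that index, values are comparable between indices even though the totals differ (65, 51, 72, and 72).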
Multivariate Data Analysis (MVDA)
Metabolite profiles and significant compounds in this research were identified using multivariate data analysis. PCA was performed to classify the sample characteristics and analyse the metabolites that contributed to the variation in the data 16. In this study, PCA was performed for all disease severity indices (Fig. 2) to identify differences in total metabolite profiles, and paired PCA for indices 1–2, 1–3, and 1–4 (Supplementary Fig. 1) was performed to determine the differences in the metabolite profile between the healthy index (index 1) and each of the other disease severity levels.
The paired PCA is intended to identify potential differences in metabolites between 2 states (healthy and each disease severity level) for their application as biomarkers in the early diagnosis of BSR disease. Fig. 2 shows differences in metabolite profiles between the disease severity levels, although the groups are not fully separated in the 3D PCA visualization. In this analysis, the first two principal components (PC1 and PC2) cumulatively explain 51.1% of the total variation. The 4 disease severity groups separated mainly along PC1, which accounted for 35.7% of the total variation in the NMR data. Pairwise PCA between indices 1 and 2 shows a distinction between the two disease severity states, although the two groups still overlap, with the principal components cumulatively explaining 69.8% of the variation (Supplementary Fig. 1). The two groups are also separated mainly along PC1, which explains 42.5% of the variation. In the paired PCA between indices 1–3 and 1–4, the differences in metabolite profiles were quite clear, although there was still overlap between groups 1 and 4. The cumulative values of PC1 and PC2 were 72.1% and 56.3% for the 1–3 and 1–4 pairs, respectively. Separation in both pairs also occurs mainly along PC1, which explains 51.3% of the variation for indices 1–3 and 32.9% for indices 1–4. Overall, the PCA indicates that each disease severity index has a compound profile that differs from the others. The paired PCA showed a clear difference between the healthy plant (index 1) and the other severity indices, so there is potential for identifying BSR disease biomarkers for early detection through this study.
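The per-component and cumulative percentages quoted above are explained-variance ratios. The following sketch shows one way such values can be computed, using a singular value decomposition of the mean-centred data matrix in NumPy. The data are randomly generated placeholders, and the choice of mean-centring without further scaling is an assumption, since the scaling used in the study is not stated in this section.

```python
import numpy as np

def pca_explained_variance(X):
    """Return (scores, explained_variance_ratio) for a samples x features matrix."""
    Xc = X - X.mean(axis=0)                # mean-centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * S                         # sample coordinates on each PC
    var = S**2 / (X.shape[0] - 1)          # variance captured by each PC
    return scores, var / var.sum()

rng = np.random.default_rng(0)
# Hypothetical data: 12 samples x 6 binned NMR intensities (not the study's data).
X = rng.normal(size=(12, 6))
X[:6, 0] += 3.0                            # shift one group along the first feature

scores, ratio = pca_explained_variance(X)
cumulative = np.cumsum(ratio)
print("PC1: %.1f%%, PC1+PC2: %.1f%%" % (100 * ratio[0], 100 * cumulative[1]))
```

Read the same way, the overall model reported above (PC1 = 35.7%, PC1 + PC2 = 51.1%) implies that PC2 alone explains about 15.4% of the variation.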
In this study, PLS-DA 33,34 was used to identify metabolites that differ significantly among all disease severity indices, while OPLS-DA 35 was used to identify metabolites that differ significantly between two disease severity indices, namely, index 1 and each of the other severity indices (Fig. 3a). Based on the PLS-DA results, 20 metabolites that could significantly differentiate the four disease severity indices were identified (VIP score > 1.0; R2 = 0.83; Q2 = 0.15). These metabolites are listed in Table 3 and include organic acids, carbohydrates, organoheterocyclic compounds, and benzenoids. The relative concentration of each compound is represented by red for the highest relative concentration and blue for the lowest. From the OPLS-DA, 31, 25 and 33 significant metabolites were identified for indices 1–2, 1–3, and 1–4, respectively (VIP score > 1.0) (Supplementary Fig. 2). Some organic acid compounds, such as taurine and threonic acid, have relatively high concentrations at index 1 but very low concentrations at index 4. Carbohydrates such as L-arabitol and D-fructose, as well as organoheterocyclic compounds such as allantoin, had the highest concentration at index 1 and decreased at the other indices. This contrasts with the findings of Isha et al. 14, where the relative concentration of D-fructose in G. boninense-infected leaves increased compared to healthy leaves. This may indicate that in infected stem tissue, some of the carbon that makes up carbohydrate compounds is diverted to secondary metabolism 16. Carbohydrates can also act as an essential energy source in the biosynthesis of secondary metabolites 38, so their relative concentration decreases as the disease severity index increases.
However, for other carbohydrate-derived compounds, such as D-gluconic acid, xylitol, and D-mannose, the relative concentration at index 2 was actually higher than at index 1. This may be because these compounds act as signalling molecules in biotic stress 37. The accumulation of these compounds may also be related to their role as energy sources for pathogens 38.
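To make the VIP > 1.0 criterion used above concrete, the sketch below fits a two-class PLS model with the NIPALS algorithm and computes the variable importance in projection (VIP) for each feature. It is an illustration under stated assumptions, not the software or settings used in the study: the data are simulated, the response is a single 0/1 class vector (PLS1) rather than the four-class design, and the function name `pls1_vip` is hypothetical.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """Fit PLS1 by NIPALS and return the VIP score of each feature."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))
    ss = np.zeros(n_components)            # y-variance explained per component
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)             # unit-length weight vector
        t = X @ w                          # sample scores
        tt = t @ t
        p = X.T @ t / tt                   # X loadings
        q = (y @ t) / tt                   # y loading
        X = X - np.outer(t, p)             # deflate X
        y = y - q * t                      # deflate y
        W[:, a] = w
        ss[a] = q**2 * tt
    return np.sqrt(n_features * (W**2 @ ss) / ss.sum())

rng = np.random.default_rng(1)
# Hypothetical data: 20 samples x 5 metabolites; class 0 = healthy, 1 = infected.
y = np.repeat([0.0, 1.0], 10)
X = rng.normal(size=(20, 5))
X[:, 0] += 4.0 * y                         # only metabolite 0 differs by class

vip = pls1_vip(X, y)
significant = np.where(vip > 1.0)[0]
print(vip.round(2), significant)
```

Because the squared VIP scores average to 1 across features, a cutoff of 1.0 selects features that contribute more than the average feature to the model.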
Heat map analysis was performed to visualize the concentration of each metabolite at the different indices (Fig. 3b), covering all disease severity indices. Based on the dendrograms of all heat map combinations, metabolites are classified into two main groups, namely, metabolites with high concentrations in certain severity indices and metabolites with low concentrations in other severity indices. For example, the heat map for indices 1-2-3-4 shows two groups: compounds that tend to have relatively high concentrations at index 1 but decrease at subsequent indices, and compounds with relatively high concentrations at both indices 1 and 2 that decrease at subsequent indices. The first group consisted of L-arginine, ascorbic acid, oxidized L-glutathione and 2-hydroxyphenylacetic acid. The second group includes allantoin, D-gluconic acid, guanidinoacetic acid and other compounds. The heat map also shows that some sugar compounds, such as D-mannose, L-arabitol, and D-fructose, tend to have a high concentration at disease severity indices 3 and 4 compared to index 1 (healthy). This might indicate that these compounds have the potential to act as signalling molecules under biotic stress 37 as well as an energy source for pathogens 38.
Biomarkers are biological indicators of the presence, absence or status of a disease and should ideally be both sensitive and specific 39. In this study, biomarkers were identified in G. boninense-infected oil palm stem tissue at different disease severity indices. These biomarkers are expected to serve as a diagnostic tool for basal stem rot disease in oil palm at different disease stages. The methods used in this analysis are OPLS-DA and receiver operating characteristic (ROC) curve analysis. OPLS-DA was used to select significant compounds with a VIP cutoff > 1.0 40, while the ROC curve was used to assess the diagnostic power of the biomarkers 39. The analysis yielded 12 disease biomarkers for severity index 2, 12 for severity index 3, and 11 for severity index 4. These biomarkers comprised organic acids, carbohydrates, organoheterocyclic compounds, organic nitrogen compounds, and benzenoids. The biomarkers for disease severity indices 2, 3, and 4 compared to healthy trees are listed in Table 1.
Although the biomarkers share some compound groups with earlier work, namely fatty acid derivatives, sugars, and derivatives of phenolic compounds, none of the biomarkers identified in this study match those reported in earlier research using GC-MS 17 and LC-MS 22. Differences in biomarkers were also found when comparing the results of this study with biomarkers identified in G. boninense-infected oil palm leaves using NMR 14,16. These differences may be due to differences in the analytical instruments or in the tissue analysed. Despite this, the biomarker panels in this study and in previous studies share many similarities, with many consisting of carbohydrate derivatives, amino acids, phenolic compounds, and other organic acids.
Figure 4 shows the ROC curves and box plots of several biomarkers in this study. An AUC of 1 indicates that the classifier separates the positive and negative classes completely, so that a threshold exists at which no false positives or false negatives occur. Furthermore, the box plots show a significant difference between the concentrations of biomarkers at two different disease severity states, in this case the concentrations of L-arginine, 2-hydroxyphenylacetic acid, and sarcosine at disease severities of 2, 3, and 4, respectively, compared to healthy trees. Therefore, it is hoped that the biomarkers identified in this study can accurately predict the severity of BSR disease in oil palm stem tissue.
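The meaning of the AUC can be illustrated with a short sketch that computes it as the fraction of diseased/healthy sample pairs in which the diseased sample has the higher value (the Mann-Whitney formulation, with ties counted as one half). The concentrations below are hypothetical and serve only to contrast complete separation with partial overlap.

```python
def roc_auc(healthy, diseased):
    """AUC as the probability that a diseased sample scores above a healthy one.

    Ties count as one half (Mann-Whitney U formulation).
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(healthy) * len(diseased))

# Hypothetical relative concentrations of one candidate biomarker.
healthy = [0.8, 1.0, 1.1, 1.3]
diseased = [1.6, 1.9, 2.2, 2.4]
overlapping = [0.9, 1.2, 1.7, 2.0]

print(roc_auc(healthy, diseased))      # complete separation -> 1.0
print(roc_auc(healthy, overlapping))   # partial overlap -> 0.75
```

For a marker whose concentration decreases with disease, the values would be negated (or the two groups swapped) before computing the AUC, so that a higher score still corresponds to the diseased class.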
Based on pathway analysis using the KEGG database, 5 pathways in oil palm are potentially affected by BSR disease (p-value < 0.05 and pathway impact > 0.1): arginine biosynthesis; alanine, aspartate, and glutamate metabolism; glycine, serine, and threonine metabolism; beta-alanine metabolism; and arginine and proline metabolism (Table 2). Six metabolites were matched in the arginine biosynthesis pathway, namely, L-citrulline, L-glutamine, L-aspartate, L-argininosuccinate, L-arginine and L-ornithine. Arginine is a precursor in the biosynthesis of proline, polyamines, and NO, which play a role in the stress response of plants. Proline and NO have a regulatory function in plant development and act as signalling molecules that can mediate various responses to biotic and abiotic stress, while polyamines play a role in plant development and plant responses to stress 41.
The identification of 6 metabolites in this pathway indicates that the arginine biosynthesis pathway may be affected by BSR disease in oil palm stem tissue, where it may function in the response to biotic stress. In addition to arginine biosynthesis, the β-alanine pathway can also be affected by G. boninense infection. β-Alanine is a nonproteinogenic amino acid that plays a role in the general stress response in plants and protects plants from extreme temperature, hypoxia, drought, heavy metals, and biotic stress. β-Alanine can be converted into osmoprotective compounds, such as β-alanine betaine, in some species and into the antioxidant homoglutathione in other species 42. For example, in Arabidopsis thaliana, both drought and heat stress can increase β-alanine levels 43. The next metabolic pathway that can be affected by BSR disease is the metabolism of alanine, aspartate and glutamate. Asparagine, commonly known as a nitrogen transporter in plants, is an amino acid that plays a role in the stress response to pathogens. Aspartate is a precursor of asparagine biosynthesis, a conversion that requires the enzyme asparagine synthetase 44. This study found six metabolites in the β-alanine pathway and the alanine, aspartate, and glutamate pathway. These compounds were L-aspartate, β-alanine, pantothenate, L-asparagine, L-aspartate, and L-argininosuccinate. The other pathways that could be altered by BSR disease are glycine, serine, and threonine metabolism and arginine and proline metabolism (Fig. 5a).
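The two screening criteria applied to the pathways above (p-value < 0.05 and pathway impact > 0.1) can be sketched as follows. This section does not state which test produced the p-values, so a hypergeometric over-representation test is assumed here, and the impact values are simply supplied as inputs rather than derived from pathway topology. All pathway names, sizes, hit counts, and impact values are hypothetical.

```python
from math import comb

def hypergeom_pvalue(hits, pathway_size, query_size, background_size):
    """P(X >= hits) when drawing query_size metabolites from the background."""
    total = comb(background_size, query_size)
    upper = min(pathway_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, query_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

def select_pathways(rows, p_cutoff=0.05, impact_cutoff=0.1):
    """Keep pathways passing both the p-value and the impact threshold."""
    return [name for name, p, impact in rows if p < p_cutoff and impact > impact_cutoff]

# All numbers below are hypothetical; pathway sizes and impacts are not from the study.
background, query = 1000, 90
rows = [
    ("pathway A", hypergeom_pvalue(6, 18, query, background), 0.25),
    ("pathway B", hypergeom_pvalue(2, 30, query, background), 0.30),
    ("pathway C", hypergeom_pvalue(5, 12, query, background), 0.02),
]
print(select_pathways(rows))
```

In this toy example only pathway A passes both filters: pathway B has too few hits relative to its size to reach significance, and pathway C is significant but falls below the impact threshold.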
Network analysis of the metabolomics data in this research was conducted using the Debiased Sparse Partial Correlation (DSPC) algorithm 29. This algorithm is based on a recently proposed de-sparsified graphical lasso modelling procedure 28 and assumes that the number of actual connections between metabolites is smaller than the available sample size 45. In this research, network analysis was conducted on the metabolites in all indices combined and in each index separately (Fig. 5b).
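The idea of a partial correlation network can be illustrated with a simplified sketch. The code below is not DSPC: it computes ordinary partial correlations by inverting the sample covariance matrix, without the sparsity penalty or debiasing step, and therefore requires more samples than variables, a limitation that regularised estimators such as the graphical lasso are designed to address. The three simulated variables are hypothetical and form a chain in which the first and third are related only through the second.

```python
import numpy as np

def partial_correlations(X):
    """Partial correlation matrix from the inverse of the sample covariance."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

rng = np.random.default_rng(2)
# Hypothetical chain a -> b -> c: a and c are related only through b.
a = rng.normal(size=500)
b = a + 0.5 * rng.normal(size=500)
c = b + 0.5 * rng.normal(size=500)
X = np.column_stack([a, b, c])

corr = np.corrcoef(X, rowvar=False)
pcor = partial_correlations(X)
print("marginal a-c: %.2f, partial a-c given b: %.2f" % (corr[0, 2], pcor[0, 2]))
```

The marginal correlation between the first and third variable is high, but their partial correlation is close to zero once the intermediate variable is accounted for, so a partial correlation network would draw edges only between direct neighbours.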
For the metabolites from all indices, the metabolic network was constructed from 79 nodes and 100 edges. In this network, 5 metabolites have the highest node degree and betweenness centrality, namely, L-citrulline, 2-aminobutyric acid, 3-methyl-L-histidine, D-mannose, and 4-ethylphenol. The network contains both positive correlations (red edges) and negative correlations (blue edges) between metabolites. At index 1, several positive and negative correlations existed between metabolites (Supplementary Fig. 4). For instance, 4-hydroxyphenylacetic acid has a negative correlation with D-fucose but a positive correlation with myo-inositol. In the index 2 and index 4 networks, there are also both positive and negative correlations between metabolites. However, in the index 3 network, the correlations between metabolites are mostly positive (red edges), which means that an increase in the concentration of one metabolite is associated with an increase in the others. Glycogen, as the compound with the highest node degree, has a positive correlation with several metabolites, such as xylitol, lactate, and sarcosine. The metabolic networks for the various severity indices are summarised in Table 3.
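Node degree and betweenness centrality, the two measures used above to rank metabolites in the network, can be computed as in the following sketch for an undirected, unweighted graph. Betweenness is obtained with Brandes' algorithm and is left unnormalised. The edge list and node names are hypothetical and do not correspond to the networks in Fig. 5b.

```python
from collections import deque

def degree_and_betweenness(edges):
    """Node degree and unnormalised betweenness for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    betweenness = dict.fromkeys(adj, 0.0)
    for s in adj:                                  # Brandes' algorithm, one BFS per source
        dist = {s: 0}
        sigma = dict.fromkeys(adj, 0.0)            # number of shortest paths from s
        sigma[s] = 1.0
        preds = {n: [] for n in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:                  # first visit to w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:         # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):                  # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return degree, {n: b / 2.0 for n, b in betweenness.items()}

# Hypothetical network: a hub with three neighbours, one of which links to a fifth node.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
degree, betweenness = degree_and_betweenness(edges)
print(degree["hub"], betweenness["hub"], betweenness["c"])
```

In this toy graph the hub has the highest degree (3) and lies on 5 of the shortest paths between other nodes, while node "c" has a lower degree but still carries all paths to "d", which is why the two measures are reported together.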
Differences in the relative concentrations of metabolites among all indices could be observed in the altered metabolic pathway analysis (Fig. 6). In the arginine biosynthesis pathway, L-arginine had the highest relative concentration at index 1 and the lowest at the other indices. L-Ornithine, L-citrulline, and L-argininosuccinate had the lowest relative concentrations at index 2 compared to the other indices. L-Aspartate had the highest concentration at index 1 and similar concentrations at the other indices, while L-glutamine had the lowest relative concentration at indices 1 and 3 and the highest at indices 2 and 4 (Supplementary Fig. 3).