This section includes the following: (i) feature ranking, (ii) detailed outcomes of the top-performing model, (iii) results pertaining to model explainability, and (iv) a comprehensive discussion and comparative analysis. This structured presentation is aimed at providing a nuanced understanding of the study's outcomes and their implications.
3.1 Feature Ranking
In this investigation, three ML feature selection models (XGBoost, random forest, and extra trees) were evaluated. After preliminary exploration, the random forest model was found to perform best and was used for feature ranking. From the initial set of 48 features, the top ten were retained, as this minimal subset delivered optimal results. Figure 4 shows the top features, ranked by the random forest feature selection algorithm, for each comparison: (A) control vs RSV, (B) control vs influenza, (C) control vs COVID-19, (D) control vs all respiratory viruses, and (E) COVID-19 vs influenza/RSV. These plots give a concise view of the discriminative power of the selected features in differentiating among the specified conditions.
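The ranking step can be sketched as follows. This is a minimal illustration only: it assumes scikit-learn's impurity-based `feature_importances_` as the ranking criterion (the criterion is not stated above) and uses synthetic data in place of the metabolomics matrix. The 48-feature and top-ten sizes follow the text; the sample count and hyperparameters are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the 48-feature metabolite matrix (not the study data).
X, y = make_classification(n_samples=200, n_features=48, n_informative=10,
                           random_state=0)

# Fit a random forest and rank features by impurity-based importance
# (an assumed criterion; permutation importance would be an alternative).
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first
top_ten = ranking[:10]

# Reduced matrix that would be passed on to the classification models.
X_top = X[:, top_ten]
print("Top ten feature indices:", top_ten.tolist())
```

In a real pipeline the ranking would be computed on training data only, so that the selected subset does not leak information from the held-out folds.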
3.2 Classification Model Results
The evaluation comprised five distinct scenarios: control vs RSV, control vs influenza A, control vs COVID-19, control vs all respiratory viruses, and COVID-19 vs influenza A/RSV. In the initial phase, 20 ML models were trained with 5-fold cross-validation. The top ten performing models were selected for each scenario, and a stacking-based ensemble technique was used to enhance predictive accuracy.
Stacking achieved notable improvements in the evaluation metrics for the control vs all respiratory viruses and COVID-19 vs influenza A/RSV scenarios; for the remaining scenarios, no improvement was observed. Figure 5 shows the top ten performing models for each of the five scenarios.
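A stacking ensemble evaluated with 5-fold cross-validation might look like the sketch below. The three base learners, the logistic regression meta-learner, and the synthetic data are assumptions chosen for illustration; they are not necessarily the configuration used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a ten-feature scenario (not the study data).
X, y = make_classification(n_samples=200, n_features=10, n_informative=6,
                           random_state=0)

# Hypothetical choice of base learners for the ensemble.
base_models = [
    ("lda", LinearDiscriminantAnalysis()),
    ("svm", SVC()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# The meta-learner is trained on out-of-fold predictions of the base models.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Outer 5-fold cross-validation to estimate the ensemble's accuracy.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(stack, X, y, cv=cv, scoring="accuracy")
print(f"Stacked accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Comparing `scores` with the cross-validated scores of each base model alone shows whether stacking helps in a given scenario, which, as noted above, is not always the case.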
Figure 5(A) shows the outcomes for the control vs RSV scenario, in which linear discriminant analysis was the top-performing model, with an accuracy of 96.08%, precision of 96.13%, recall of 96.08%, specificity of 95.38%, F1-score of 96.07%, and AUC of 95.92%. Figure 5(B) shows SVM as the leading model in the control vs influenza A scenario, with an accuracy of 97.94%, precision of 98.01%, recall of 97.94%, specificity of 97.51%, F1-score of 97.93%, and AUC of 99.69%. In Fig. 5(C), for the control vs COVID-19 scenario, SVM was again the best model, with an accuracy of 95.96%, precision of 96.02%, recall of 95.96%, specificity of 95.4%, F1-score of 95.95%, and AUC of 97.23%. Figure 5(D) shows random forest as the top performer in the control vs all respiratory viruses scenario, with an accuracy of 98.1%, precision of 98.09%, recall of 98.1%, specificity of 94.48%, F1-score of 98.08%, and AUC of 97.78%. In Fig. 5(E), logistic regression was the best performer in the COVID-19 vs influenza A/RSV scenario, with an accuracy of 86.14%, precision of 85.97%, recall of 86.14%, specificity of 80.3%, F1-score of 85.97%, and AUC of 87.68%. The lower accuracy in this scenario was attributed to class imbalance: 55 samples for COVID-19 compared with 110 samples for influenza A/RSV.
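The relationship between these metrics can be made concrete with a small calculation from confusion-matrix counts. The counts below are hypothetical and only mirror the 55 vs 110 class sizes mentioned above. The sketch also shows that support-weighted recall always equals accuracy, which would be consistent with the identical recall and accuracy values reported for every scenario; that the study used weighted averaging is an inference, not something stated in the text.

```python
def binary_metrics(tp, fn, tn, fp):
    """Metrics for the positive class from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall of the positive class
    specificity = tn / (tn + fp)          # recall of the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Support-weighted recall: each class's recall weighted by its sample count.
    pos, neg = tp + fn, tn + fp
    weighted_recall = (pos * sensitivity + neg * specificity) / (pos + neg)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "weighted_recall": weighted_recall,
    }

# Hypothetical counts: 55 positive (COVID-19) and 110 negative (influenza A/RSV).
m = binary_metrics(tp=40, fn=15, tn=100, fp=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With an imbalanced split like this one, a model can reach a reasonable accuracy while its sensitivity for the smaller class lags behind, which is why specificity and sensitivity are reported alongside accuracy.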
Further detailed results for each case can be found in the Supplementary Material, including Table 3S to Table 7S. The supplementary tables offer comprehensive insights into the confusion matrices and AUC curves for the best-performing models in each scenario, as visually depicted in Fig. 3S and 4S.
3.3 Model Explainability According to SHAP Values
SHAP [43] helps understand the impact of each feature on the model's output for a particular prediction, offering valuable insights into the model's decision-making process. This method uniquely highlights the individual contribution of each feature towards a specific prediction, thereby providing a nuanced understanding of the global and local behaviors inherent in the model. By emphasizing transparency and elucidating the decision-making process, SHAP is aimed at instilling trust in the ML approach among end-users. SHAP not only enhances interpretability but also promotes a more informed and confident engagement with the model's predictions.
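The per-feature attribution that SHAP estimates can be illustrated with an exact, brute-force Shapley computation on a toy model. This is a sketch only: `toy_model`, the input values, and the all-zeros baseline are hypothetical, and the SHAP library relies on faster model-specific algorithms rather than this enumeration over all feature subsets.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    # Hypothetical score with one interaction term between features 0 and 2.
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]

x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print("Contributions:", phi)
print("Sum of contributions:", sum(phi))
print("Prediction minus baseline:", toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction and the baseline output, and the interaction term is split equally between the two features involved. This additivity is what allows SHAP values to be read as a decomposition of a single prediction.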
We used the random forest model to perform SHAP analysis in three unique scenarios for our research, considering all pertinent attributes. Figure 6 illustrates the effect of SHAP values on the model output across the scenarios. The horizontal axis shows the SHAP value, that is, the direction and magnitude of each feature's effect on the prediction: points to the right of zero push the prediction toward the positive class, and points to the left push it away. Color encodes the feature value, with red indicating higher values and blue indicating lower values.
In Fig. 6(A), for the control vs RSV scenario, the SHAP analysis highlights distinct feature effects on model predictions. Specifically, Met.SO (methionine sulfoxide) had a substantial positive effect on RSV predictions, indicating higher concentrations in RSV cases than in controls. Ile, Val, Asp, and Phe also showed considerable positive effects, emphasizing their influential roles in predicting RSV cases. In Fig. 6(B), for the control vs influenza A scenario, the SHAP analysis revealed LYSOC18:2 as the metabolite feature with the greatest effect on predicting influenza A cases. In Fig. 6(C), for control vs COVID-19, LYSOC18:2 again emerged as the dominant feature, in agreement with previous findings by Bennet et al. [16], supporting its value in distinguishing COVID-19 cases. Other notable metabolite features, including Kynurenine, Phe, Val, Tyr, and Asp, contributed significantly to the predictive model. For the control vs all respiratory viruses scenario, shown in Fig. 6(D), LYSOC18:2 was the most dominant feature, indicating its crucial role in discriminating respiratory virus cases collectively.
Finally, in the COVID-19 vs influenza A/RSV scenario, shown in Fig. 6(E), carnosine emerged as the most dominant feature for predicting COVID-19 cases. This analysis provided insight into the specific metabolite features driving the predictive capability of the model across the respiratory virus classification scenarios.
3.4 Discussion
Respiratory viruses, including influenza A, RSV, and COVID-19, pose major health challenges [44–46]. Our work focused on leveraging LC/MS-MS metabolomics data to predict the presence of respiratory viruses in individuals, by discerning dominant metabolites contributing to accurate classification. Applying a similar method to various diseases allowed us to explore distinct metabolite profiles and gain insights into the underlying biochemical dynamics across different pathological conditions. ML models can discern complex patterns within the data [47] and identify subtle metabolic changes associated with specific viral infections. This approach enables a more nuanced understanding of disease dynamics.
A comprehensive statistical analysis was conducted for control, normal, and all respiratory virus scenarios, by using chi-square tests, rank-sum tests, and t-tests. Twenty ML models were trained for five distinct scenarios: control vs RSV, control vs influenza A, control vs COVID-19, control vs all respiratory viruses, and COVID-19 vs influenza A/RSV. Feature ranking techniques were applied to select the top ten features. Standard scaling was used to normalize the data, and the dataset was split into five folds. Before model fitting, the SMOTE technique was used to address class imbalance.
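The scaling and oversampling steps can be sketched as below. This is a simplified, NumPy-only illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours), not the implementation used in the study. The function name `smote_sketch`, the synthetic data, and the choice of `k=5` are assumptions; the class sizes mirror the 55 vs 110 split reported for COVID-19 vs influenza A/RSV and are treated here as a single training fold.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority samples and their neighbours."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                    # a sample is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]      # k nearest neighbours per sample
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                   # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Synthetic training fold with ten features (not the study data).
rng = np.random.default_rng(1)
X_major = rng.normal(0.0, 1.0, size=(110, 10))
X_minor = rng.normal(1.0, 1.0, size=(55, 10))

# Standard scaling with statistics taken from the training fold only.
X_train = np.vstack([X_major, X_minor])
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_major_s = (X_major - mean) / std
X_minor_s = (X_minor - mean) / std

# Oversample the minority class until both classes have the same size.
synthetic = smote_sketch(X_minor_s, n_new=len(X_major_s) - len(X_minor_s))
X_minor_balanced = np.vstack([X_minor_s, synthetic])
print(X_major_s.shape, X_minor_balanced.shape)
```

Oversampling would normally be applied inside each training fold, after the split, so that synthetic samples derived from held-out data never reach the model during training.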
Among the 20 ML models, the top ten performers were selected, and a stacking ML model was trained by using the three most successful models. The outcomes of each model are illustrated in Fig. 5. Linear discriminant analysis performed best in the control vs RSV scenario, and SVM in the control vs influenza A scenario. In the control vs COVID-19 and control vs all respiratory viruses scenarios, SVM and random forest were the leading models, respectively. Logistic regression was the best performer in the COVID-19 vs influenza A/RSV scenario.
Furthermore, SHAP values were used to evaluate the influence of features on model output, thus revealing the dominant features contributing to positive predictions. Specifically, for COVID-19, features such as LYSOC18:2, Kynurenine, Phe, Val, Tyr, and Asp significantly influenced the predictive model. These findings provide valuable insights into the metabolic signatures associated with respiratory virus infections.
Table 2
Comparison of evaluation metrics with other work.
| Study | Model | Cases | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Bennet et al. [16] | Supervised machine learning | Control vs all respiratory virus | 96% | 98% | 86% |
| Bennet et al. [16] | Supervised machine learning | COVID-19 vs influenza A/RSV | 85% | 74% | 90% |
| Ours | Random forest | Control vs all respiratory virus | 98.10% | 98.10% | 94.48% |
| Ours | Logistic regression | COVID-19 vs influenza A/RSV | 86.14% | 86.14% | 80.3% |
Table 2 compares our results with those obtained by Bennet et al. [16]. For the control vs all respiratory viruses scenario, Bennet et al. achieved an accuracy of 96%, sensitivity of 98%, and specificity of 86%. Our random forest model had an accuracy of 98.10%, sensitivity of 98.10%, and specificity of 94.48%, that is, higher accuracy and specificity with comparable sensitivity. For COVID-19 vs influenza A/RSV, Bennet et al. reported an accuracy of 85%, sensitivity of 74%, and specificity of 90%. Our logistic regression model achieved an accuracy of 86.14%, sensitivity of 86.14%, and specificity of 80.3%, improving accuracy and sensitivity at the cost of lower specificity. These results indicate that the proposed models achieve higher accuracy than the referenced supervised ML approach, with comparable or higher sensitivity, and with specificity that is higher in one scenario and lower in the other.