In this study, sebum showed strong correlation with serum metabolites, revealing dysregulation across biofluids. Notably, in the control group of COVID-19 negative participants, a set of positive correlations between serum triglycerides and ceramides and skin lipids was visible. This positive correlation in the controls was dysregulated in the cohort of COVID-19 positive participants, which is evidence of multi-organ dyslipidemia due to COVID-19. The integrated analysis presented here also showed correlation between sebum lipids and DHEAS in the cohort of COVID-19 positive participants. DHEAS is an adrenal hormone with positive effects on the immune system and is also an antiglucocorticoid, so alterations in any DHEAS / sebum axis could be indicative of immune response, and may underpin the diagnostic differences seen in the sebum of COVID-19 positive and negative participants26,27. Sebum lipids have been identified as biomarkers in other pathologies, such as Parkinson’s disease, where sebum dysregulation has been linked to carnitines, a relationship that is also visible in this pathology28,29. Although sebomics is not yet as well investigated as blood-based metabolomics, we believe that it holds promise for investigating other conditions.
Saliva showed weaker correlation with serum, especially in the case of directly matched metabolites. COVID-19 positivity did change the correlation maps between the two biofluids, but from a less correlated starting point than sebum / serum, and with weaker diagnostic power overall. As a filtrate, saliva should be influenced by serum levels, but metabolite concentrations are lower than in blood30. Furthermore, the salivary biome is independent, has its own discrete functions, and is markedly more subject to direct contamination from food or medication. Indeed, the correlation of metabolites between saliva and blood has previously been found to be weak or in some cases non-existent31,32, and the results of this work are concordant with previous studies showing that the most accurate metabolomic analyses are those that target blood metabolites5. This is unsurprising, given that blood is homeostatically controlled in a way that saliva and sebum are not. Saliva and sebum showed weak correlation in the control (COVID-19 negative) group but, perhaps surprisingly, a series of negative correlations between phosphatidylcholines and sebum lipids was visible in the COVID-19 positive group. A direct causal relationship is challenging to identify given that saliva is a filtrate, but an indirect relationship is possible. One study found a relationship between reduced levels of phosphatidylcholines and Alzheimer’s disease33, but in general research on the metabolomics of saliva in areas other than oral health has been limited.
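As an illustration of how a cross-biofluid correlation map of the kind discussed above might be computed, the following is a minimal sketch in plain Python. It assumes Spearman rank correlation on participant-matched samples; the statistic actually used is not stated in this section, and the feature names and values are invented purely for demonstration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_map(block_a, block_b):
    """Correlate every feature in block_a with every feature in block_b.

    Each block maps a feature name to a list of values, with all lists
    in the same participant order (an assumption of this sketch).
    """
    return {
        (name_a, name_b): spearman(values_a, values_b)
        for name_a, values_a in block_a.items()
        for name_b, values_b in block_b.items()
    }


# Hypothetical demonstration data for five matched participants.
serum_demo = {
    "serum_TG_demo": [1.0, 2.0, 3.0, 4.0, 5.0],
    "serum_Cer_demo": [5.0, 3.0, 4.0, 1.0, 2.0],
}
sebum_demo = {"sebum_lipid_demo": [2.0, 4.0, 5.0, 8.0, 10.0]}

corr_demo = correlation_map(serum_demo, sebum_demo)
for pair, rho in corr_demo.items():
    print(pair, round(rho, 3))
```

Computing one such map per cohort (COVID-19 negative and positive) and comparing them pair by pair is one way to see which correlations change with infection status.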
The analysis of diagnostic capability of a reduced feature set for each biofluid showed declining accuracy for the biofluids in the order serum (diagnostic accuracy 0.97), sebum (0.88), and saliva (0.80). Serum therefore performed best in this comparison of matched biofluids, but sebum also performed relatively well, and in this reanalysis better than previously reported19. This was in large part due to the use of feature reduction: the original sebum dataset’s 998 features are likely to have led to overfitting (exceeding the number of samples by a multiple of 15), worse generalisation and worse performance on cross-validation. It should be stressed, however, that n for all three biofluids in this work was small, and so these results are indicative of relative performance only. Without a validation cohort the accuracies presented here should not be taken as indicative of absolute performance.
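To make the point about feature reduction and cross-validation concrete, the sketch below shows one way to keep feature selection inside each cross-validation fold, so that the held-out sample never influences which features are chosen. It is not the pipeline used in this work: the scoring rule (standardised difference in class means), the nearest-centroid classifier, leave-one-out cross-validation, and all data are assumptions chosen to keep the example dependency-free.

```python
import random
from statistics import mean, pstdev


def score_features(X, y):
    """Absolute difference in class means, scaled by the pooled spread."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos + neg) or 1.0  # guard against constant features
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores


def select_top_k(X, y, k):
    """Indices of the k highest-scoring features."""
    scores = score_features(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid(X_train, y_train, x_new, features):
    """Predict the label whose class centroid is closest on the chosen features."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [mean([r[j] for r in rows]) for j in features]
        dist = sum((x_new[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy with feature selection repeated in every fold."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        # Selection sees only the training rows of this fold.
        features = select_top_k(X_train, y_train, k)
        hits += nearest_centroid(X_train, y_train, X[i], features) == y[i]
    return hits / len(X)


# Tiny hand-made example: feature 0 separates the classes, feature 1 does not.
X_toy = [[0.0, 1.0], [0.1, 0.0], [1.0, 0.9], [1.1, 0.1]]
y_toy = [0, 0, 1, 1]
acc_toy = loocv_accuracy(X_toy, y_toy, k=1)

# Pure-noise example with far more features than samples (hypothetical sizes).
rng = random.Random(0)
y_noise = [i % 2 for i in range(20)]
X_noise = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in y_noise]
acc_noise = loocv_accuracy(X_noise, y_noise, k=5)

print("toy accuracy:", acc_toy)
print("noise-only accuracy:", acc_noise)
```

If the features were instead selected once on the full dataset before cross-validation, the noise-only estimate would tend to be optimistic, which is why the selection step is nested here.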
Diagnostic accuracy was also investigated for serum using a more limited set of metabolites, equivalent to that provided by the p180 Biocrates kit. This led to reduced accuracy, although serum still performed better than the other biofluids. This finding illustrates the trade-off between narrowly targeted analyses and widely targeted (or untargeted) analyses: whilst it is easier to validate a more tightly controlled panel of metabolites, a wider range can reveal additional biomarkers, especially during the initial discovery phase of biomarker identification. The biomarkers responsible for separation between positive and negative participants, as measured by VIP score, were the bile acid glycolithocholic acid 3-sulfate (GLCAS), two triglycerides (TG(22:4_32:2) and TG(18:3_33:2)), and the amino acid L-proline betaine. These findings are consistent with other studies reporting evidence of dyslipidemia, particularly increased triglyceride levels15,16,11. The dysregulation of GLCAS is also concordant with liver damage caused by COVID-1934, and dysregulation of bile acids (deoxycholic acid and ursodeoxycholic / hyodeoxycholic acid) has previously been reported as a key feature specific to COVID-19, differentiating between COVID-19 and other respiratory and inflammatory diseases in hospital-recruited patients35.
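For readers unfamiliar with VIP (variable importance in projection) scores, the sketch below shows how they can be derived from a single-response partial least squares model fitted with the NIPALS algorithm. It assumes a PLS-DA style setup with the class coded as 0 / 1 and mean-centred data without unit-variance scaling; the modelling details behind the VIP scores reported here are not given in this section, and the data are invented for illustration.

```python
from math import sqrt


def center_columns(X):
    """Subtract each column's mean from its values."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[v - m for v, m in zip(row, means)] for row in X]


def pls1_vip(X, y, n_components):
    """VIP scores from a single-response PLS (NIPALS) fit on centred data."""
    E = center_columns(X)
    y_mean = sum(y) / len(y)
    f = [v - y_mean for v in y]
    p = len(E[0])
    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: covariance of each feature with the response residual.
        w = [sum(row[j] * fi for row, fi in zip(E, f)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm == 0:
            break  # no covariance left to model
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        load = [sum(row[j] * ti for row, ti in zip(E, t)) / tt for j in range(p)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate X and y before extracting the next component.
        E = [[v - ti * lj for v, lj in zip(row, load)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        weights.append(w)
        explained.append(q * q * tt)  # response variance explained by this component
    total = sum(explained)
    if total == 0:
        raise ValueError("response is constant or unrelated to X")
    return [
        sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Hypothetical data: feature 0 tracks the class label, features 1 and 2 do not.
X_demo = [
    [1.0, 0.2, 3.1],
    [1.2, 0.1, 2.9],
    [0.9, 0.3, 3.0],
    [2.1, 0.2, 3.0],
    [2.0, 0.1, 3.1],
    [2.2, 0.3, 2.9],
]
y_demo = [0, 0, 0, 1, 1, 1]

vip_demo = pls1_vip(X_demo, y_demo, n_components=2)
print([round(v, 3) for v in vip_demo])
```

Because each weight vector has unit length, the squared VIP scores average to one across features, which is the origin of the common convention of treating VIP values above 1 as influential.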
Saliva performed less well in diagnostic terms, in spite of evidence of pathology-driven correlations with other biofluids, showing weaker ability to differentiate COVID-19 negative from COVID-19 positive participants in this analysis. It should be noted, however, that because it was not possible, for ethical reasons, to require abstinence from food or drink in the hospital setting, saliva would have been the biofluid most subject to environmental confounders such as the recent oral intake of food or medication. Whilst this is a clear limitation of the study, it also reflects the practical limitations of sampling during a pandemic, or indeed in any busy clinic.
Overall, whilst the integrated analysis herein of serum, sebum and saliva shows challenges in identifying reproducible metabolic biomarkers of COVID-19, it also shows the potential for non-invasive sampling in revealing relationships across biofluids and pathways. For diagnostic purposes where sensitivity and specificity are paramount, however, we believe that blood-based metabolomics will remain the best-in-class approach.