We identified protein biomarkers previously implicated in cardiometabolic disease that were significantly associated with severe illness from COVID-19, shedding light on biological pathways involved in COVID-19 pathology. We demonstrated that these protein biomarkers, measured early in the disease course, were more predictive of ICU admissions or death than established clinical risk factors. These findings suggest that proteomic profiling could improve the triage and treatment of patients hospitalized with COVID-19.
We found a set of seven protein biomarkers (IL6, IL-1RA, KIM1, ACE2, CTSL1, ADAMTS13, and VEGFD), along with two hospital laboratory tests (procalcitonin and LDH), that were predictive of ICU admission or death. These circulating biomarkers are likely related to host and viral factors influencing disease, including inflammation (IL6, IL-1RA) [31, 32], thrombosis (ADAMTS13, VEGFD) [33, 34], and viral entry (KIM1, ACE2, CTSL1) [35–37]. Elevated inflammatory markers, including IL-6 [31, 38–40], CRP, ferritin, and D-dimer, have been reported in severe COVID-19 [41]. Dexamethasone, an anti-inflammatory medication, and IL-6 receptor antagonists are current COVID-19 therapies shown to reduce the risk of poor outcomes in critically ill patients [42, 43].
Markers of thrombotic risk, namely LDH, D-dimer, fibrinogen, CRP, and low platelets, have been reported to be associated with poor prognosis in COVID-19 [44]. This observation is consistent with the association between lower ADAMTS13, an enzyme that degrades von Willebrand factor, and poor outcomes found in our study and other reports [33]. Low levels of ADAMTS13 have also been described in thrombotic thrombocytopenic purpura and syndromes of thrombotic microangiopathy caused by infection [45]. Microangiopathic thrombosis has been seen in autopsies of patients who have died of COVID-19, similar to what has been observed in other ARDS-causing diseases [46].
Three of the identified biomarkers, KIM1, ACE2, and CTSL1, are involved in host-virus interactions. KIM1, an indicator of renal insults, plays a role in viral entry and regulation of the host immune response to viral infections [47]. ACE2, the cellular receptor for SARS-CoV-2 [36, 38], undergoes shedding, leading to circulating ACE2, a biomarker of cardiovascular disease, diabetes, and death in patients with and without COVID-19 [48, 49]. The association of ACE2 with severity is supported by a recently reported rare genetic variant that is associated with a 37% reduction in ACE2 expression and a 40% reduction in risk of severe COVID-19 [50]. Finally, CTSL1 (cathepsin L) is one of the lysosomal proteases that can cleave the SARS-CoV-2 spike protein, a step necessary for cellular entry [37, 51].
The logistic regression and random forest models built with these seven biomarkers significantly outperformed all models developed from the clinical features and laboratory tests, suggesting that the biomarkers provide unique predictive value not captured by patient data extracted from the electronic health record. The biomarkers replaced known clinical risk factors for severe illness that had been selected in the model built without biomarkers (i.e., BMI, D-dimer, CRP, ALC, and troponin). Notably, BMI was replaced by IL-1RA, a biomarker that was strongly correlated with BMI (Fig. 3). IL-1RA, known to be highly expressed in white adipose tissue [52] and upregulated during inflammation, could serve as a better proxy than BMI for obesity-driven COVID-19 risk.
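As an illustration of how the discriminative performance of two models could be compared, the sketch below computes the area under the ROC curve (AUC) for two sets of predicted risks against the same outcomes. The choice of AUC as the metric, the function name `auc`, and all of the numbers are assumptions made for illustration; they are not the study's data or its reported evaluation procedure.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: sequence of 0/1 outcomes (1 = ICU admission or death; hypothetical).
    scores: sequence of predicted risks, in the same order as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count (positive, negative) pairs in which the positive case received
    # the higher predicted risk; ties count as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
clinical_risk = [0.60, 0.40, 0.70, 0.50, 0.30, 0.65, 0.20, 0.35]
biomarker_risk = [0.80, 0.55, 0.90, 0.45, 0.20, 0.58, 0.10, 0.60]

auc_clinical = auc(outcomes, clinical_risk)
auc_biomarker = auc(outcomes, biomarker_risk)
print(f"clinical-only AUC:   {auc_clinical:.4f}")
print(f"with biomarkers AUC: {auc_biomarker:.4f}")
```

In this toy example the second score ranks more outcome-positive patients above outcome-negative patients, so its AUC is higher. A formal comparison would also require a significance test on held-out data, which this sketch does not attempt.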
We recognize that standards of care and resource availability evolved quickly during the first wave of the pandemic. As data on the efficacy and side effects of COVID-19 therapies accrued, the use of remdesivir and hydroxychloroquine increased and decreased, respectively. Prone positioning, applied heterogeneously early in the pandemic, eventually became standard of care. It is possible that these exogenous factors contributed to differences in outcomes between the in-sample and out-of-sample cohorts. We expect that, as the SARS-CoV-2 virus mutates, the virulence pathways and host responses may change, as seen with both the Delta and Omicron variants [53]. The patients hospitalized with COVID-19 today are generally younger and consist of both unvaccinated and vaccinated patients with breakthrough infections or repeat infections. Newer COVID-19 therapies have emerged, which could influence the proteomic profile of patients and its association with COVID-19 severity.
By evaluating the models in a sample separate from that used to develop the models, we showed that the predictive value of the biomarkers was robust to changes in clinical protocols and patient characteristics during the highly dynamic study period. Compared to studies that develop and test predictive models within the same patient population, our approach provided a more rigorous assessment of the generalizability of our models and the conclusions derived from our analysis. Because this was one of the largest proteomic analyses performed in COVID-19 patients, we were able to conduct age-stratified, gender-stratified, and race/ethnicity-stratified analyses, demonstrating the strong performance of the model with the protein biomarkers across various demographic strata (Fig. S5, S6, and S7).
Similar to previous COVID-19 analyses [54–56], this study was limited by the precision with which COVID-19 severity could be captured and COVID-19-related outcomes could be tracked. We used ICU admission and death as proxies for severe illness from COVID-19; however, patients may have died or been admitted to the ICU for reasons independent of their COVID-19 status. Patients discharged alive could still have died outside the hospital or been admitted to ICUs at other hospitals. Another limitation was that the time between symptom onset and blood sample collection was not uniform across all patients. Despite this, we observed similar results when excluding patients with sample collection dates that were on or after the date of ICU admission or greater than 14 days following presentation to care (Fig. S2, Fig. S3, and Table S5). Finally, by only collecting discarded blood samples at a single time point, we were unable to perform longitudinal analyses; however, biomarkers that can be interpreted with single-time-point measurements may be more useful in clinical settings where only one lab draw is available.
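The exclusion rule used in the sensitivity analysis above can be written as a simple filter. The sketch below is one possible encoding, assuming a per-patient record with presentation, sample collection, and optional ICU admission dates; the `Patient` class, its field names, the function name, and the example dates are all hypothetical and do not reflect the study's actual data schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical; they are not the study's schema.
    patient_id: str
    presentation_date: date
    sample_date: date
    icu_admission_date: Optional[date] = None  # None if never admitted to ICU


def keep_for_sensitivity_analysis(p: Patient, max_days: int = 14) -> bool:
    """Return True if the sample was collected before any ICU admission and
    no more than `max_days` after presentation to care."""
    if p.icu_admission_date is not None and p.sample_date >= p.icu_admission_date:
        return False  # sample collected on or after the date of ICU admission
    if (p.sample_date - p.presentation_date).days > max_days:
        return False  # sample collected more than max_days after presentation
    return True


# Hypothetical example records, not taken from the study.
cohort = [
    Patient("A", date(2020, 4, 1), date(2020, 4, 2), date(2020, 4, 5)),
    Patient("B", date(2020, 4, 1), date(2020, 4, 6), date(2020, 4, 6)),  # sampled on ICU day
    Patient("C", date(2020, 4, 1), date(2020, 4, 20)),  # sampled 19 days after presentation
    Patient("D", date(2020, 4, 1), date(2020, 4, 15)),  # sampled exactly 14 days after
]
kept_ids = [p.patient_id for p in cohort if keep_for_sensitivity_analysis(p)]
print(kept_ids)
```

Here patient D is retained because the rule excludes only samples collected more than 14 days after presentation; whether the boundary day should be included is a design choice that this sketch fixes by assumption.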