Background: Machine learning (ML)-based risk stratification models built on electronic health record (EHR) data may help optimize treatment of COVID-19 patients, but they are often limited by a lack of clinical interpretability and by the cost of laboratory tests. We develop an ML-based tool that predicts adverse outcomes from EHR data and optimizes clinical utility under a given cost structure. This cohort study used deidentified EHR data from COVID-19 patients at ProMedica Healthcare in northwest Ohio and southeastern Michigan.
Methods: We tested the performance of various ML approaches for predicting either increasing ventilatory support or mortality. For each budget constraint, the set of model features was optimized via exhaustive search across all combinations of features.
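The budget-constrained exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the function name `best_subset_under_budget`, the cost values, and the toy additive score are all hypothetical, and in practice `score` would presumably wrap training and validating a model on the selected feature groups.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Optional, Tuple


def best_subset_under_budget(
    costs: Dict[str, float],
    score: Callable[[FrozenSet[str]], float],
    budget: float,
) -> Tuple[Optional[FrozenSet[str]], float]:
    """Enumerate every non-empty subset of feature groups and return the
    highest-scoring one whose total cost does not exceed the budget."""
    names = sorted(costs)
    best_subset, best_score = None, float("-inf")
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            # Skip subsets that are too expensive before scoring them.
            if sum(costs[g] for g in combo) > budget:
                continue
            s = score(frozenset(combo))
            if s > best_score:
                best_subset, best_score = frozenset(combo), s
    return best_subset, best_score


# Hypothetical costs (arbitrary units) and per-group gains, for illustration only.
EXAMPLE_COSTS = {"DCM": 0.0, "BMP": 10.0, "D-dimer": 25.0, "LDH": 8.0, "CRP": 7.0}
TOY_GAIN = {"DCM": 0.60, "BMP": 0.05, "D-dimer": 0.06, "LDH": 0.04, "CRP": 0.03}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for a validation metric; a real score would evaluate a fitted model."""
    return sum(TOY_GAIN[g] for g in subset)


if __name__ == "__main__":
    subset, value = best_subset_under_budget(EXAMPLE_COSTS, toy_score, budget=20.0)
    print(sorted(subset), round(value, 2))
```

The search evaluates up to 2^n - 1 subsets for n feature groups, which stays tractable only when features are grouped into a modest number of panels, as with the laboratory panels listed in this abstract.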
Results: The optimal sets of features for predicting ventilation under any budget constraint included demographics and comorbidities (DCM), basic metabolic panel (BMP), D-dimer, lactate dehydrogenase (LDH), erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), brain natriuretic peptide (BNP), and procalcitonin. For predicting mortality, the optimal sets included DCM, BMP, complete blood count, D-dimer, LDH, CRP, BNP, procalcitonin, and ferritin.
Conclusions: This study presents a quick, accurate, and cost-effective method for evaluating the risk of deterioration in patients with SARS-CoV-2 infection at the time of clinical evaluation.