Study population characteristics
In the discovery phase, we enrolled 112 severe or critically ill COVID-19 patients from three hospitals in Huangshi City, Hubei Province, China. Of these, 49 (43.75%) were critically ill and 31 (27.68%) died. The mean (SD) age of patients was 61.0 (14.9) years, and 73 (65.2%) patients were male. The most common symptoms were fever (81.2%), cough (76.8%), chest tightness (65.2%), fatigue (58.0%), shortness of breath (30.4%), phlegm (25.0%), and diarrhea (17.0%), and most patients had two or more symptoms (Table 1). There were 66 (58.9%) patients with one or more comorbidities (Table 1). Eighteen (16%) patients had abnormal chest imaging findings at admission. The laboratory measures at admission are presented in Table 2. Results from the imputed data were generally consistent with those from the original data (Table S1).
The characteristics of the 375 patients in the validation set are detailed in the original literature[12]. In brief, the mean (SD) age of these patients was 58.83 (16.46) years, and 224 (59.7%) were male. In the validation set, general, severe, and critically ill patients accounted for 52.5%, 7.2%, and 40.3%, respectively, and the mortality rates in the three groups were 6.09%, 51.85%, and 98.01%, respectively (Table S2). Overall, 174 (46.4%) patients in the validation set died during hospitalization (Table S2).
Feature selection
In the discovery set, 52 laboratory tests had sufficient (≥4) repeated measures for use in trajectory classification. In total, 3 covariates, 61 laboratory measures at admission, and 52 laboratory trajectory clusters were included in the Ranger model. SWSFS analysis identified the 15 top laboratory features with minimal out-of-bag (OOB) error. These comprised 11 laboratory measures at admission: platelet count (PLT), urea, creatine kinase (CK), fibrinogen, creatine kinase isoenzyme activity (CK-MB), aspartate aminotransferase (AST), activated partial thromboplastin time (APTT), albumin, standard deviation of erythrocyte distribution width (RDW-SD), neutrophils (%) and red blood cell count (RBC), as well as 4 trajectory clusters: the trajectories during hospitalization of white blood cell count (WBC), PLT large cell ratio (P-LCR), PLT distribution width (PDW) and AST (Fig. 1).
Cox regression adjusted for common covariates showed that patients at admission with a higher neutrophil proportion [hazard ratio (HR), 3.85; 95% confidence interval (CI), 1.70-8.70; P=0.0012], higher urea (HR, 5.20; 95% CI, 2.15-12.59; P=0.0003), higher CK (HR, 4.86; 95% CI, 1.78-13.25; P=0.0020) and CK-MB (HR, 3.57; 95% CI, 1.59-8.01; P=0.0020), lower PLT (HR, 0.28; 95% CI, 0.11-0.69; P=0.0057), higher AST (HR, 2.33; 95% CI, 1.11-4.89; P=0.0258), lower albumin (HR, 0.18; 95% CI, 0.04-0.78; P=0.0222), higher RDW-SD (HR, 2.63; 95% CI, 1.10-6.30; P=0.0297), lower RBC (HR, 0.40; 95% CI, 0.17-0.95; P=0.0369) and lower fibrinogen (HR, 0.47; 95% CI, 0.22-0.98; P=0.0445) had worse survival outcomes (Fig. 2). In addition, Cox regression for trajectory features showed that persistently higher and more varied WBC, P-LCR, PDW, and AST were associated with an increased hazard of death (Fig. 3). After correction for the false discovery rate, all variables except APTT remained significant (Table S3).
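As an illustration of the false discovery rate step, the following minimal sketch applies the Benjamini-Hochberg procedure to the ten admission-measure P values printed above. The choice of Benjamini-Hochberg is an assumption, since the text does not name the correction method, and the function name benjamini_hochberg is ours. The APTT and trajectory P values are not reported in the text, so they are left out and the adjusted values here will not match Table S3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Unadjusted P values copied from the text (ten admission measures only;
# APTT and the four trajectory features are not reported there).
reported_p = {
    "neutrophils (%)": 0.0012,
    "urea": 0.0003,
    "CK": 0.0020,
    "CK-MB": 0.0020,
    "PLT": 0.0057,
    "AST": 0.0258,
    "albumin": 0.0222,
    "RDW-SD": 0.0297,
    "RBC": 0.0369,
    "fibrinogen": 0.0445,
}

names = list(reported_p)
adjusted_p = dict(zip(names, benjamini_hochberg([reported_p[n] for n in names])))

for name in names:
    print(f"{name}: P={reported_p[name]:.4f}, adjusted={adjusted_p[name]:.4f}")
```

Because the adjustment depends on the number of tests, adding the five unreported P values would change every adjusted value; the sketch only shows the mechanics.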
Prediction forest model construction and assessment
The prediction forest was constructed in the discovery set using all 15 candidate prognostic features combined with covariates. With 3-fold internal cross-validation to control for over-fitting, the random forest model achieved an AUC of 100% (95% CI: 99%-100%) for predicting mortality within 28 days of hospital admission. Further, the prediction forest model was validated in the external validation set. The trajectory cluster of each laboratory measure in the validation set was classified by adding one case at a time to the trajectory model trained in the discovery set. In the validation set, the association between the baseline indicators and outcome was significant except for APTT, consistent with the results of the discovery set (Figure S1). The AUC in the external validation set reached 87% (95% CI: 84%-91%) (Fig. 4a), which was 14% higher than the AUC of the model using the covariates only (P=6.87×10⁻⁷). The optimal cut-off of the survival probability, determined by the Youden index, was 0.58; the corresponding sensitivity and specificity for predicting 28-day mortality were 0.73 and 0.88, respectively. At a fixed sensitivity of 0.90, the specificity was 0.62 (Fig. 4b).
In addition, to evaluate the stability of the modeling strategy, the discovery set was randomly split into a training set (55 samples) and a testing set (57 samples); the prediction forest was developed in the training set, internally validated in the testing set, and further evaluated in the external validation set. This procedure was repeated 1000 times. The mean AUC was 0.87 (95% CI: 0.76-0.98) in the testing set and 0.85 (95% CI: 0.80-0.89) in the validation set (Figure S2).
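The cut-off selection described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names are ours, the ten cases are invented toy data, and the scores are treated as predicted risk of death (higher means more likely to die), whereas the text reports the cut-off on the survival-probability scale.

```python
def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    labels: 1 = died, 0 = survived. A case is called positive when its
    score is greater than or equal to the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < thr)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(points):
    """Pick the threshold maximising J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


def specificity_at_sensitivity(points, min_sensitivity):
    """Best specificity among thresholds that reach the required sensitivity."""
    return max(spec for _, sens, spec in points if sens >= min_sensitivity)


# Toy data invented for illustration only (not from the study).
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.95, 0.85, 0.80, 0.60, 0.55, 0.40, 0.35, 0.30, 0.20, 0.10]

points = roc_points(labels, scores)
thr, sens, spec = youden_cutoff(points)
print(f"Youden cut-off={thr}, sensitivity={sens}, specificity={spec}")
print("specificity at sensitivity >= 0.90:",
      specificity_at_sensitivity(points, 0.90))
```

The two summaries answer different questions: the Youden index weighs sensitivity and specificity equally, while fixing sensitivity at 0.90 reflects a preference for missing few deaths at the cost of more false alarms.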
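The repeated-split stability check can be sketched as below. Everything apart from the 112/55/57 sample sizes, the 31 deaths and the 1000 repeats is an assumption: the single synthetic feature, the placeholder fit_stand_in model (which stands in for the prediction forest), and the use of the 2.5th and 97.5th percentiles as the interval, since the text does not say how the 95% CI was obtained.

```python
import random
import statistics


def auc(labels, scores):
    """AUC as the probability that a random death outscores a random
    survivor, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stand_in(x_train, y_train):
    """Hypothetical placeholder for the prediction forest: learn only the
    direction of association of one feature and score by signed value."""
    dead = [x for x, y in zip(x_train, y_train) if y == 1]
    alive = [x for x, y in zip(x_train, y_train) if y == 0]
    sign = 1.0 if statistics.mean(dead) >= statistics.mean(alive) else -1.0
    return lambda x: sign * x


def repeated_split_auc(x, y, n_train=55, n_repeats=1000, seed=0):
    """Refit on a random training half and score the held-out half."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    aucs = []
    while len(aucs) < n_repeats:
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        y_train = [y[i] for i in train]
        y_test = [y[i] for i in test]
        # Redraw the rare split in which one half lacks deaths or survivors.
        if len(set(y_train)) < 2 or len(set(y_test)) < 2:
            continue
        model = fit_stand_in([x[i] for i in train], y_train)
        aucs.append(auc(y_test, [model(x[i]) for i in test]))
    return aucs


# Synthetic stand-in data: 112 patients, 31 deaths, one invented feature.
data_rng = random.Random(1)
y = [1] * 31 + [0] * 81
x = [data_rng.gauss(1.0 if label else 0.0, 1.0) for label in y]

aucs = repeated_split_auc(x, y)
cuts = statistics.quantiles(aucs, n=40)  # cut points every 2.5%
print(f"mean AUC={statistics.mean(aucs):.2f}, "
      f"2.5th-97.5th percentile: {cuts[0]:.2f}-{cuts[-1]:.2f}")
```

In the study each fitted model was additionally scored on the external validation set; that step would amount to one more auc call per repeat on a fixed second dataset.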
Comparison with existing prognostic prediction models
Further, we evaluated the prognostic prediction models reported in published studies in our discovery dataset. Their C-index ranged from 0.64 to 0.74, and their AUC from 0.66 to 0.82 (Table S4).
Web-based application tool
To facilitate the application of our prediction forest model, we developed an online tool that can be accessed at http://106.15.72.70:3838/COSP. After the user uploads the values of the prognostic factors, the tool outputs the distribution of the likelihood that a given COVID-19 patient will die at a specific time point (Fig. 5).