Basic characteristics of the study population
The flowchart of the study procedure is shown in Figure 1, and the basic characteristics of the COVID-19 patients are summarised in Table 1. The patients in the training set had a median age of 42.0 years (IQR: 33.0-56.5) and a median BMI of 22.5 kg/m² (IQR: 20.3-25.0). Among them, 47 (45.2%) patients were men, and 23 (22.1%) patients had at least one comorbidity. During hospitalisation, 21 (20.2%) patients were classified as mild type, 72 (69.2%) as moderate type, and 11 (10.6%) as severe type. In the validation set, the median age and BMI were 59.0 years (IQR: 48.0-66.0) and 24.7 kg/m² (IQR: 22.1-27.0), respectively. Of these patients, 44 (64.7%) were men, and 56 (82.4%) had at least one comorbidity. During hospitalisation, 16 (23.5%) patients were classified as moderate type, 29 (42.7%) as severe type, and 23 (33.8%) as critical type. In both the training and validation sets, the most common clinical symptoms were fever and cough, followed by expectoration and shortness of breath.
Severity-associated markers of COVID-19
Table 2 presents the associations of clinical characteristics with the severity of COVID-19 in the training set. Among demographic characteristics and clinical symptoms, age, comorbidity, and fever were associated with the severity of COVID-19 (all P values < 0.05). Among dichotomous laboratory markers, higher levels of C-reactive protein (CRP), lactate dehydrogenase (LDH), serum amyloid A, fibrinogen (FIB), D-dimer, adenosine deaminase, reduced haemoglobin, and lower levels of lymphocyte, eosinophil, and platelet counts, calcium, phosphorus, albumin (ALB), albumin/globulin, prealbumin, total cholesterol, high density lipoprotein cholesterol, retinol binding protein, apolipoprotein A1, SaO2, and PaO2/FiO2 were associated with a higher risk of more severe COVID-19 (all P values < 0.05). Detailed results for the associations of continuous laboratory marker data with COVID-19 severity are summarised in Additional file 1: Table S1.
Model construction and evaluation
Based on the criteria described in the Methods, 18 candidate markers and 90 patients were selected for model construction. Because of their similar clinical function, D-dimer and FIB were combined into a single coagulation function variable, DFIB, which was defined as abnormal when either D-dimer or FIB was abnormal. Electrolyte disturbance was calculated as the number of abnormalities among calcium, phosphorus, potassium, sodium and chlorine. Thus, 16 markers were included in LASSO regression for further feature selection. After 1,000 bootstrap resamples, ALB, CRP, LDH, DFIB, comorbidity, lymphocyte count, eosinophil count, and electrolyte disturbance were finally selected as the predictors in the model. The detailed frequency of each marker in the 1,000 LASSO models is summarised in Additional file 1: Table S2.
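The two derived variables described above can be sketched as follows. This is a minimal illustration, not the study code: the function names, the dictionary keys, and the use of boolean abnormality flags as input are assumptions, and the reference ranges that define "abnormal" are taken to be those in the Methods.

```python
from typing import Mapping

# Electrolytes counted towards the disturbance score, as listed in the text.
# The key names are hypothetical and chosen for this sketch only.
ELECTROLYTES = ("calcium", "phosphorus", "potassium", "sodium", "chlorine")


def dfib(d_dimer_abnormal: bool, fib_abnormal: bool) -> int:
    """Combined coagulation marker: 1 if D-dimer or FIB is abnormal, else 0."""
    return int(bool(d_dimer_abnormal) or bool(fib_abnormal))


def electrolyte_disturbance(abnormal: Mapping[str, bool]) -> int:
    """Number of abnormal electrolytes, an integer from 0 to 5."""
    missing = [name for name in ELECTROLYTES if name not in abnormal]
    if missing:
        raise KeyError(f"missing electrolyte flags: {missing}")
    return sum(int(bool(abnormal[name])) for name in ELECTROLYTES)
```

Counting abnormalities rather than keeping five separate flags gives a single ordinal predictor, which is consistent with the 0 to 5 range reported for electrolyte disturbance.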
Table 3 presents the performance of each model in the internal and external validations. For the internal validation, all four models (logistic regression, ridge regression, support vector machine, and random forest) achieved high AUROCs, ranging from 0.919 (95% CI 0.793-0.955) to 0.973 (95% CI 0.935-0.993). For the external validation, the ridge regression model performed best, with the highest AUROC of 0.827 (95% CI 0.716-0.921). Therefore, the ridge regression model was considered the best model because of its high predictive power.
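For readers who want to see how an AUROC with a 95% CI of this kind can be obtained, the sketch below computes the AUROC as the probability that a randomly chosen severe or critical patient receives a higher model output than a randomly chosen mild or moderate patient, and derives the interval by a percentile bootstrap. The percentile bootstrap, the number of resamples, and the function names are assumptions made for illustration; the interval method actually used is the one described in the Methods.

```python
import random
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison; ties between classes count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels: Sequence[int], scores: Sequence[float],
                 n_resamples: int = 1000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUROC (assumed method, see lead-in)."""
    auroc(labels, scores)  # raises early if a class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples containing a single class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of around a hundred patients; a rank-based formula would be preferable for larger data sets.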
A risk score was then calculated according to the result of the ridge regression model using the following formula:
Risk score = 26.78×lactate dehydrogenase + 19.31×C-reactive protein + 17.16×DFIB + 19.81×albumin + 17.59×comorbidity + 9.19×eosinophil count + 4.83×electrolyte disturbance + 6.25×lymphocyte count
All markers except electrolyte disturbance were dichotomous (1 = abnormal, 0 = normal); electrolyte disturbance ranged from 0 to 5. Figure 2 presents the receiver operating characteristic curve (A) and calibration curve (B) of the risk score. The risk score showed good discrimination of severe or critical type, with an AUROC of 0.897 (95% CI 0.845-0.940). In addition, the calibration curve showed good agreement between the predicted and actual probabilities of severe or critical type. At the optimal cutoff value of 71, the risk score had a sensitivity of 87.1% and a specificity of 78.1% for predicting COVID-19 severity. Figure 3 presents the distribution of risk scores across the degrees of COVID-19 severity. The mild patients had the lowest median risk score of 9.19 (IQR: 0-26.82), followed by the moderate (median: 45.65, IQR: 19.56-76.91) and severe patients (median: 102.38, IQR: 81.37-120.92). The critical patients had the highest median risk score of 113.42 (IQR: 87.89-125.75). To help clinicians identify, at admission, patients who are likely to develop severe or critical COVID-19, we developed a web-based assessment system based on our risk score (Figure 4; website: http://www.gtrsp.com:8011/).
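The risk score formula and the cutoff can be applied as in the sketch below. The weights and the cutoff of 71 are those reported above; the dictionary keys, the function names, and the choice to treat a score equal to the cutoff as high risk are assumptions made for this illustration, and the code is not the implementation behind the web-based system.

```python
from typing import Mapping, Sequence, Tuple

# Weights from the reported risk score formula; key names are hypothetical.
WEIGHTS = {
    "lactate_dehydrogenase": 26.78,
    "c_reactive_protein": 19.31,
    "dfib": 17.16,
    "albumin": 19.81,
    "comorbidity": 17.59,
    "eosinophil_count": 9.19,
    "electrolyte_disturbance": 4.83,
    "lymphocyte_count": 6.25,
}
CUTOFF = 71  # optimal cutoff reported in the text


def risk_score(markers: Mapping[str, int]) -> float:
    """Weighted sum of markers (1 = abnormal, 0 = normal; electrolytes 0-5)."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = markers[name]
        upper = 5 if name == "electrolyte_disturbance" else 1
        if value not in range(upper + 1):
            raise ValueError(f"{name} must be an integer from 0 to {upper}")
        total += weight * value
    return total


def is_high_risk(score: float, cutoff: float = CUTOFF) -> bool:
    """Assumption: a score at or above the cutoff is classified as high risk."""
    return score >= cutoff


def sensitivity_specificity(scores: Sequence[float], severe: Sequence[bool],
                            cutoff: float = CUTOFF) -> Tuple[float, float]:
    """Sensitivity and specificity for severe or critical type at a cutoff."""
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, severe):
        predicted = is_high_risk(score, cutoff)
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome groups must be present")
    return tp / (tp + fn), tn / (tn + fp)
```

With these weights, the score ranges from 0 (all markers normal) to 140.24 (all dichotomous markers abnormal and all five electrolytes abnormal).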