Apply Four Laboratory Characteristics to Classify Critical Patients With COVID-19 After Admission

The sudden worldwide outbreak of COVID-19 has sharply increased the economic and treatment burden around the world. Confirmed patients of differing severity not only share limited healthcare resources simultaneously but also increase the risk of cross-infection among patients and health care workers. Hence, effective separation of critical COVID-19 patients from common COVID-19 patients is key to ensuring that critical patients receive treatment priority and to avoiding cross-infection in the hospital. A total of 105 patients with complete medical records were collected, including 84 blood samples from patients confirmed at the First Affiliated Hospital of the University of Science and Technology of China in Anhui and 25 blood samples from patients at two hospitals in Shantou. A series of machine learning tools was used to explore and validate the most significant laboratory characteristics. We then compared the resulting classification with three currently popular assessment systems for pneumonia by using three measures: the AUC, the NRI and the net benefit. We identified four significant potential laboratory characteristics for the classification of critical patients: C-reactive protein, albumin, globulin, and sodium levels. The results also suggested that the accuracy and predictive efficacy of these selected indicators were the highest. In conclusion, four easily available and low-cost laboratory characteristics appear to be important predictors for classifying critical patients after hospital admission. They guide therapeutic options and help clinicians make clinical decisions. Hence, we believe that such classification is essential for a more rational allocation of scarce medical resources.


Introduction
Coronavirus disease 2019 (COVID-19), a coronavirus pneumonia, is a highly infectious disease with an ongoing worldwide outbreak.
Symptoms of patients with COVID-19 commonly include fever, cough, fatigue and respiratory complications. Confirmed patients of differing severity not only share scarce healthcare resources simultaneously but also increase the risk of cross-infection among patients and health care workers. Hence, effective separation of critical COVID-19 patients from common COVID-19 patients is key to ensuring that critical patients receive treatment priority and to avoiding cross-infection [1].
Little attention has been paid to the classification of critical patients with COVID-19 who require immediate medical attention after hospital admission. No simple, operable classification system has been specifically designed for separating patients with COVID-19. Hence, clinicians have had to apply three main pneumonia severity scoring systems to classify patients with COVID-19 in clinical practice: the Clinical Pulmonary Infection Score (CPIS), Confusion-Urea-Respiratory Rate-Blood pressure-65 (CURB-65) and the pneumonia severity index (PSI). For example, Hankunyuan et al. reported that a higher proportion of older patients than of young and middle-aged patients with COVID-19 had PSI grade IV or V [2]. Wu suggested that the PSI can be used to stratify patients with COVID-19 after hospitalization [3]. Liu indicated that an increase in CURB-65 score occurred concomitantly with the aggravation of acute respiratory distress syndrome in patients with COVID-19 [4]. In a multicenter study in Zhejiang province, patients with COVID-19 were classified by PSI and CURB-65 together, which served as a supplementary classification system for clinical assessment after admission [5].
In this study, we aim to identify robust and interpretable laboratory biomarkers to separate critical patients from common patients by using three retrospective cohorts from 3 hospitals in 2 provinces in China. We believe that these biomarkers can support effective allocation of medical resources and treatment for patients and help avoid cross-infection in the hospital.

Study design and patients
This was a retrospective study of three cohorts of patients with COVID-19 who were initially diagnosed before admission according to the "Diagnosis and treatment protocol for novel coronavirus pneumonia (Trial version 6)" published by the National Health Commission of China [6]. We included 84 patients admitted from January 20 to February 20, 2020 to the First Affiliated Hospital of the University of Science and Technology of China, together with 13 patients from the First Affiliated Hospital of Shantou University Medical College and 12 patients from Shantou Central Hospital admitted from January 19 to February 20, 2020. All of these patients were confirmed to have SARS-CoV-2 infection by RT-PCR of respiratory tract samples by the Centers for Disease Control and Prevention. All patients with COVID-19 were hospitalized and admitted to the same ward, with no distinction made between common and critical patients [6].

Data collection
A total of 105 patients with complete medical records were collected. We reviewed all clinical data, laboratory characteristics and chest CT scans (see Table 1). The clinical data included demographic information, underlying comorbidities, symptoms and signs. Laboratory characteristics included routine blood tests, biomarkers for monitoring the function of multiple organs, and infection-related biomarkers. All data were collected within 24 h after admission. According to the guideline for patients with confirmed COVID-19 from the National Health Commission of China, patients with mild clinical presentations (no pneumonia) may not initially require hospitalization. Hence, we removed data for 3 patients with mild clinical presentations because of possible bias. Data for 81 patients with 35 variables were retained.

Statistical analysis
Some laboratory characteristics had missing data in the three datasets. After deleting 2 variables with a high missing rate (> 25%), we imputed the remaining data by using multiple imputation [7,8]. We also handled collinearity and filtered out mis-measured outliers by considering the results of the variance inflation factor and correlation analysis together [9,10]. Subsequently, a classification model including 35 candidate predictors was fitted by using the cforest implementation of the random forest (RF) classification model [11]. In this analysis, the importance of each conditioning factor can be measured quantitatively, and we found several variables with negative importance. We repeatedly ran a loop to remove variables with negative importance values. The importance of the selected variables was weighted by using the weight of evidence (WOE) method to improve classification accuracy [12]. Finally, four significant biomarkers were selected by using a generalized linear model (GLM) with stepwise selection based on the Bayesian information criterion. The prediction model was depicted as a nomogram.
Internal validation was conducted 100 times by splitting 80% of the data into a training set (n_train = 62 samples) and 20% into a test set (n_test = 15 samples). We then counted the total number of times each predictive variable was present in the resulting models. External validation used data for 25 patients with COVID-19 from 2 hospitals in Shantou.
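The repeated random splitting described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the function names, the fixed random seed and the `toy_selector` stand-in for the variable-selection step are assumptions introduced here, not part of the original analysis.

```python
import random
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def repeated_split_counts(
    n_samples: int,
    select_variables: Callable[[List[int]], Iterable[str]],
    n_repeats: int = 100,
    train_fraction: float = 0.8,
    seed: int = 0,
) -> Tuple[int, int, Counter]:
    """Repeat a random train/test split and count how often each
    variable is selected by the model fitted on the training part."""
    rng = random.Random(seed)  # assumed seed, for reproducibility only
    n_train = round(train_fraction * n_samples)
    indices = list(range(n_samples))
    counts: Counter = Counter()
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train_idx = indices[:n_train]  # the remaining indices form the test set
        # Count each variable at most once per repetition.
        counts.update(set(select_variables(train_idx)))
    return n_train, n_samples - n_train, counts


def toy_selector(train_idx: List[int]) -> List[str]:
    """Hypothetical placeholder for the selection step (for example a
    stepwise GLM); it ignores the data and always returns two names."""
    return ["CRP", "sodium"]
```

With 77 retained patients and an 80% training fraction, `repeated_split_counts(77, toy_selector)` yields the 62/15 split sizes reported in the text; in a real analysis `toy_selector` would be replaced by the actual model-fitting routine.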
We compared the new classification to the three currently popular assessment systems for pneumonia by using three measures: the area under the receiver operating characteristic (ROC) curve (AUC), the net reclassification index (NRI) and the net benefit. The AUC was used to describe the diagnostic ability of a binary classifier system [13]. The NRI was used to evaluate the improvement in risk prediction gained by adding a marker to a set of baseline predictors [14,15]. Decision curve analysis (DCA) was used to evaluate and compare prediction models that incorporate clinical consequences [16]. In this study, we used DCA to graphically describe the clinical usefulness of each classifier based on a potential threshold for misclassification (x axis) and the net benefit of using the model to risk-stratify patients (y axis) relative to assuming that no patient will be misclassified.
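For readers unfamiliar with these three measures, the sketch below shows how each can be computed from predicted probabilities and binary outcomes (1 = critical, 0 = common). It uses the standard textbook definitions and is not the code used in the study: the AUC is the proportion of critical/common pairs ranked correctly, the net benefit at threshold probability pt is TP/n − (FP/n) × pt/(1 − pt), and the NRI shown is the two-category version at a single threshold. The function names are assumptions introduced for illustration.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(probs: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Net benefit of classifying as critical when prob >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def categorical_nri(
    old_probs: Sequence[float],
    new_probs: Sequence[float],
    labels: Sequence[int],
    threshold: float,
) -> float:
    """Two-category NRI: net upward reclassification among events plus
    net downward reclassification among non-events."""
    up_e = down_e = up_n = down_n = 0
    for old, new, y in zip(old_probs, new_probs, labels):
        old_high, new_high = old >= threshold, new >= threshold
        if new_high and not old_high:  # moved to the higher-risk category
            if y == 1:
                up_e += 1
            else:
                up_n += 1
        elif old_high and not new_high:  # moved to the lower-risk category
            if y == 1:
                down_e += 1
            else:
                down_n += 1
    n_event = sum(1 for y in labels if y == 1)
    n_nonevent = len(labels) - n_event
    return (up_e - down_e) / n_event + (down_n - up_n) / n_nonevent
```

An NRI greater than 0 indicates that the new model reclassifies patients in the correct direction more often than the reference model, which is how the NRI values are interpreted in the Results.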
Statistical analyses were performed with R v3.6.3, and p < 0.05 was considered statistically significant. Table 1 describes participant characteristics. The three cohorts showed few major differences.

Results
After filtering for collinearity and outliers, 77 patients from Hefei were retained. The full model was approximated by a small model including the 14 most predictive variables by using RF. Only 13 predictive variables were retained after deleting variables with strength of evidence less than "very strong". Figure 1 shows the weight of evidence of the importance of variables. Finally, four significant biomarkers were selected: CRP (P = 0.001), ALB (P = 0.014), GLB (P = 0.013) and sodium (P = 0.006) (see Table 2). The nomogram is shown in Fig. 2, and the final classification model is described in formula (1):
Logit(p) = 76.579 + 0.064 × CRP − 0.259 × ALB + 0.287 × GLB − 0.567 × sodium    (1)
In internal validation, the random splitting was repeated 100 times, and the results are described in Table 3. CRP and sodium levels appeared 100 times, ALB level 72 times and GLB level 85 times, and so these variables were selected to build the model in the training dataset. Table 2 shows the results of the GLM analysis. Both analyses demonstrated that the four selected laboratory characteristics can be regarded as potential biomarkers for identifying critical patients.
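To illustrate how formula (1) turns four laboratory values into a predicted probability, the following sketch applies the inverse logit, p = 1 / (1 + exp(−Logit(p))). The coefficients are those reported in formula (1); the function names, the example patient values and the measurement units (CRP in mg/L, ALB and GLB in g/L, sodium in mmol/L) are assumptions made here for illustration, since the text does not state the units used when the model was fitted.

```python
import math


def critical_logit(crp: float, alb: float, glb: float, sodium: float) -> float:
    """Linear predictor of formula (1). Units are assumed, not stated
    in the source: CRP mg/L, ALB g/L, GLB g/L, sodium mmol/L."""
    return 76.579 + 0.064 * crp - 0.259 * alb + 0.287 * glb - 0.567 * sodium


def critical_probability(crp: float, alb: float, glb: float, sodium: float) -> float:
    """Predicted probability of critical disease via the inverse logit."""
    return 1.0 / (1.0 + math.exp(-critical_logit(crp, alb, glb, sodium)))


# Hypothetical patients, chosen only to show the direction of each effect.
low_risk = critical_probability(crp=10, alb=40, glb=28, sodium=140)
high_risk = critical_probability(crp=100, alb=30, glb=35, sodium=132)
```

Higher CRP and globulin raise the predicted probability, whereas higher albumin and sodium lower it, in line with the signs of the coefficients in formula (1).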
The AUC for the CPIS was the highest (AUC = 0.988), and that for the four biomarkers was lower, at 0.881 (Fig. 2). However, the ability to discriminate between patients with critical and common disease was better with the four biomarkers than with the CPIS, mainly because the CPIS overestimates the variance when the AUC is close to 1, which is not realistic in clinical trials [17]. DCA demonstrated that the prediction model built from the four biomarkers improved the accuracy of classification across the threshold probabilities of the three popular classifiers. Table 4 suggests that the new classification model performed the best because the values of all three NRIs were larger than 0. The new classification model was always superior to the other 3 models across a wide range of threshold probabilities (Fig. 3). For example, the largest difference between the new prediction model and the CPIS was at a threshold probability of around 0.41. At that threshold, the net benefit was about 0.29 for the new prediction model and 0.1 for the CPIS. At that threshold, when using the new prediction model rather than the CPIS to classify patients and make clinical decisions, the probability of more profitable treatment was 28% (95% CI 0.29 − 0.1).

Discussion
According to the results of our study, four easily available and low-cost laboratory characteristics appear to be important predictors for classifying critical patients after hospital admission. The new classification model based on these four laboratory characteristics was shown to have better discriminative ability than the other 3 currently popular systems. The results of the AUC, NRI and DCA analyses also demonstrated that it was the best classifier. Its discriminative ability was also externally validated.
CRP levels were positively correlated with the severity of COVID-19, which is consistent with some previous studies. CRP can activate the complement system to enhance the regulation of lymphocytes and promote phagocytosis by macrophages to eliminate invading pathogens [18,19]. Some studies of COVID-19 showed that the CRP level was significantly increased specifically in patients with severe disease [20,21]. The reason might be that inflammatory factors such as interleukin 6, interleukin 1 and tumor necrosis factor α promote the synthesis of CRP by hepatocytes [18]. Ko et al. found that CRP ≥ 2 mg/dl was one of the predictive factors for the development of pneumonia in Middle East respiratory syndrome (MERS), whereas CRP ≥ 4 mg/dl, low albumin level, male sex, hypertension, thrombocytopenia and lymphopenia were regarded as predictive factors for respiratory failure [22]. A recent retrospective study also showed that CRP levels on admission were significantly higher in patients with COVID-19 who died [23]. Liu et al. reported that IL-6 and CRP could be used as independent factors to predict the severity of COVID-19, and that patients were more likely to have severe complications when their CRP level exceeded 41.8 mg/L [24]. Wang also suggested that the CRP level can be regarded as an important biomarker in the early stage of COVID-19 because CRP could reflect lung lesions and disease severity [25].
Albumin was the second potential biomarker found in our study. It is a protein made in the liver and can be detected in the blood. Albumin prevents leakage of fluids from the blood into other organs [26]. An increasing number of studies have shown that low albumin levels are associated with poorer outcomes in patients with COVID-19 [27]. Albumin concentration has been suggested as an independent risk factor for mortality in patients with pneumonia and has also been found to be associated with COVID-19 [28,29]. A systematic review and meta-analysis showed that hypoalbuminemia increased the risk of severe COVID-19 [30].
Our study also found that lower sodium was a risk factor for severe COVID-19 infection. Sodium is considered a predictor in several scoring systems for assessing pneumonia, including the PSI and the Acute Physiology and Chronic Health Evaluation II. Hyponatremia is the most common electrolyte disorder in clinical practice, and severe hyponatremia is associated with increased mortality [31]. Berni et al. found that sodium was inversely correlated with IL-6 and directly correlated with the PaO2/FiO2 ratio in patients with COVID-19 [32]. Stephan J.L. Bakker hypothesized, on the basis of experimental and epidemiological data, that a low sodium balance may augment cellular damage at a given virus load and increase the risk of developing severe and fatal COVID-19 infection [33].
Finally, globulin was found to be positively correlated with the severity of COVID-19. Yafei Zhang demonstrated that the globulin level was significantly higher in patients with severe COVID-19 than in patients with mild disease because of promoted immunoglobulin synthesis [27].
In addition, the CPIS, a diagnostic algorithm, is mainly applied to ventilator-associated pneumonia and community-acquired pneumonia. Most studies have indicated that the CPIS has inaccurate sensitivity and specificity [34-37]. The CPIS has also been reported to have high inter-observer variability and to be unsuitable for multicenter studies [37,38]. The CURB-65 score consists of 5 separate elements: confusion, uremia, respiratory rate, blood pressure, and age ≥ 65. The CURB-65 is relatively simple to use. The PSI involves 20 clinical variables defining 5 classes of increasing risk of mortality and has been extensively validated. However, inappropriate weighting of age or inappropriate threshold values in both the PSI and CURB-65 can result in underestimation of severe pneumonia, especially in young people [39,40].
A major limitation of the current study is the insufficient sample size. As more raw data are collected in the future, we will be able to optimize our new model. Another limitation of our study is that we had to combine patients with a critical presentation with those with a severe presentation because there were only 6 patients with a critical clinical presentation in our study.
In conclusion, four easily available and low-cost laboratory characteristics appear to be important predictors for classifying critical patients after hospital admission. They guide therapeutic options and help clinicians make clinical decisions. Hence, we believe that such classification is essential for a more rational allocation of scarce medical resources.

Declarations
Acknowledgements
We thank the staff members of the three hospitals for their efforts in collecting the data used in this study, all patients who consented to donate their data for our analysis, and the medical staff members who continue to work on the front line of caring for patients.

Data availability
Mrs. AP Guo Tan had full access to all of the data in the study. After publication, the data will be made available to others on reasonable request after approval from the author (Mrs. AP Guo, guoanpingah@126.com).

Conflict of interest statement:
The authors have declared that no conflict of interest exists.

Study approval
This study was approved by the Medical Ethical Committees of the First Affiliated Hospital of the University of Science and Technology of China (Hefei, China), the Medical Ethical Committees of the First Affiliated Hospital of Shantou University Medical College (Shantou, China) and the Medical Ethical Committees of the Shantou Central Hospital (Shantou, China) (approval letters 2020-P-018, 2020-046 and 2020-019, respectively). Written informed consent was waived.

Contributors:
ZK and AG contributed to the concept and design of the study, acquisition and interpretation of data, drafting of the article and final approval of the version to be published; they contributed equally to this work and are joint first authors. DL, XZ, BY and JD were involved in data collection, verification and final approval of the version to be published. TY and YW were involved in the interpretation of the data and drafting of the article. HZ and CW contributed to obtaining funding, design of the study, acquisition of data, statistical analysis and interpretation of data, drafting of the article, and approval of the final version to be published. AG is the data guarantor. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Figure 1
Weight of evidence of importance of variables related to severity of COVID-19. CRP, C-reactive protein; ALB, albumin; GLB, globulin; BUN, blood urea nitrogen; PCT, procalcitonin; LDH, lactate dehydrogenase; ALT, alanine aminotransferase; AST, aspartate transaminase; BP, blood pressure; Na, sodium; cTN.I, cardiac troponin I; D.Bil, direct bilirubin; PT, prothrombin time.

Figure 2
Nomogram of the prediction model.