Prediction Model of Severe Coronavirus Disease 2019 (COVID-19) Cases Shows the Leading Risk Factor of Hypocalcemia

Background A striking characteristic of Coronavirus Disease 2019 (COVID-19) is the coexistence of clinically mild and severe cases. A comprehensive analysis of multiple risk factors predicting progression to severity is clinically meaningful.
Methods The patients were classified into moderate and severe groups. Univariate regression analysis was used to identify the epidemiological and clinical features related to severity. These features were treated as possible risk factors and entered into a backward-stepwise multiple logistic regression analysis to develop a multiple-factor prediction model for severe cases.
Results A total of 255 patients (mean age ± SD, 49.1 ± 14.6 years) were included, consisting of 184 (72.2%) moderate cases and 71 (27.8%) severe cases. The common symptoms were dry cough (78.0%), sputum (62.7%), and fever (59.2%). The less common symptoms were fatigue (29.4%), diarrhea (25.9%), and dyspnea (20.8%). The univariate regression analysis determined 23 possible risk factors. The multiple logistic regression identified seven risk factors closely related to the severity of COVID-19: dyspnea, exposure history in Wuhan (a high-transmission setting), C-reactive protein (CRP), aspartate aminotransferase (AST), calcium, lymphocytes, and age. The probability model for predicting severe COVID-19 was P = 1/{1 + exp[-(-1.78 + 1.02×age + 1.62×high-transmission-setting-exposure + 1.77×dyspnea + 1.54×CRP + 1.03×lymphocyte + 1.03×AST + 1.76×calcium)]}. Dyspnea (OR=5.91) and hypocalcemia (OR=5.79) were the leading risk factors, followed by exposure to a high-transmission setting (OR=5.04), CRP (OR=4.67), AST (OR=2.81), decreased lymphocyte count (OR=2.80), and age (OR=2.78).
Conclusions This prediction model can provide a theoretical basis for the early formulation of individualized diagnosis and treatment programs and for the prevention of severe disease.


Introduction
Since the end of 2019, an emerging infectious disease, Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global pandemic. As of July 1, 2020, COVID-19 had resulted in more than 10 million cases and more than 500,000 deaths globally [1]. A striking characteristic of COVID-19 is the coexistence of clinically mild and severe cases [2]. A cohort of 44,672 cases showed that 81%, 14%, and 5% were mild or moderate, severe, and critically ill cases, respectively [3]. As nearly the whole population is susceptible, the timely identification of risk factors for poor prognosis is critical for appropriate therapy and substantially influences the prognosis.
Previous studies have roughly delineated several risk factors for poor progression of COVID-19, such as age, diabetes, neutrophil-to-lymphocyte ratio, obesity, and dyspnea [4][5][6][7][8]. Nevertheless, these studies usually investigated the association between a single risk factor and a subjective classification of cases, which is not suitable for an early assessment of multiple risk factors in an affected population. SARS-CoV-2 often affects several vital organs via the widely distributed virus receptor angiotensin-converting enzyme 2 (ACE2) [9]. These vital organs often include the heart, liver, kidneys, and lungs, leading to numerous clinically abnormal manifestations, such as elevated inflammatory biomarkers, elevated serum enzymes indicating liver and cardiac damage, and electrolyte imbalance. As previous studies indicated, demographic features such as age also affect the progression of COVID-19. Clinically, it is not appropriate to assess the prognosis of COVID-19 cases using each risk factor in isolation. A comprehensive analysis of multiple risk factors is more representative of the condition of patients.
Multiple logistic regression is used to analyze the relationship between several predictors (independent variables) and an outcome (a dependent variable) that is often dichotomous (such as the presence or absence of the severe condition in the present study) [10]. We enrolled a cohort of patients in three hospitals in Wenzhou, China, and collected the patients' clinical features from the electronic medical records. This study aimed to develop a prediction model of severe COVID-19 cases using multiple logistic regression analysis of risk factors in the form of abnormal manifestations.

Study Patients
This study was approved by the Ethics Committee of The Ding Li Clinical College of Wenzhou Medical University and Sixth People's Hospital of Wenzhou and followed the Declaration of Helsinki. Written consent was obtained from the patients. The patients were diagnosed with COVID-19 according to the guidelines issued by the National Health Commission of the People's Republic of China [11].
The study enrolled 255 patients with COVID-19 aged over 16 years from January 17 to March 10, 2020, in three hospitals in Wenzhou. All patients were confirmed positive for SARS-CoV-2 by reverse transcription-polymerase chain reaction (RT-PCR) on nasopharyngeal swabs. The epidemiological investigation focused on the transmission mode, as assessed by a history of travel to or residence in high-epidemic areas and by close contact with confirmed patients within the past 14 days.

Clinical typing
Based on the guidelines mentioned above [11], each patient was assigned one of three clinical types: moderate, severe, or critically ill. The criteria of the moderate type include fever and respiratory symptoms, with or without pneumonia suggested by radiographic presentations. The criteria of the severe type include acute respiratory distress syndrome (ARDS), a respiratory rate of ≥30 breaths/min, an oxygen saturation of ≤93% in a resting state, and a ratio of arterial partial pressure of oxygen (PaO2) to fraction of inspired oxygen (FiO2) of ≤300 mmHg. The criteria of the critically ill type include the requirement of mechanical ventilation because of respiratory failure, shock, and organ failure that needs monitoring in the ICU. For the dichotomous outcome classification in this study, patients with a mild or moderate type were referred to as the moderate group, while patients with a severe or critically ill type were referred to as the severe group.
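The dichotomous grouping described above can be illustrated with a short Python sketch. It is only an illustration and not the procedure used by the treating physicians: the `Assessment` container and its field names are hypothetical, and the sketch assumes that meeting any single criterion of a type is sufficient to assign that type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    """Hypothetical container for the findings named in the typing criteria."""
    ards: bool = False
    respiratory_rate: Optional[float] = None  # breaths/min
    spo2_at_rest: Optional[float] = None      # oxygen saturation, %
    pao2_fio2: Optional[float] = None         # PaO2/FiO2, mmHg
    mechanical_ventilation: bool = False
    shock: bool = False
    organ_failure_in_icu: bool = False


def clinical_type(a: Assessment) -> str:
    """Return 'critically ill', 'severe', or 'moderate'.

    Assumption: meeting any one criterion of a type is enough.
    """
    if a.mechanical_ventilation or a.shock or a.organ_failure_in_icu:
        return "critically ill"
    if (
        a.ards
        or (a.respiratory_rate is not None and a.respiratory_rate >= 30)
        or (a.spo2_at_rest is not None and a.spo2_at_rest <= 93)
        or (a.pao2_fio2 is not None and a.pao2_fio2 <= 300)
    ):
        return "severe"
    # Mild and moderate are not distinguished; both fall in the moderate group.
    return "moderate"


def outcome(a: Assessment) -> int:
    """Dichotomous outcome: 1 = severe group, 0 = moderate group."""
    return 1 if clinical_type(a) in ("severe", "critically ill") else 0
```

For example, `outcome(Assessment(respiratory_rate=32))` places a patient in the severe group, whereas a patient with no criterion met is placed in the moderate group.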

Data Collection
The patients' data were collected by two trained physicians from the electronic medical records in the three hospitals, including demographic data, epidemiological data, laboratory examinations, and medical history of underlying diseases. The laboratory examinations consisted of routine etiological examinations, immunological biomarkers, and indices for monitoring lung, liver, myocardial, and renal function. Chest CT scans were also retrieved for all patients.

Statistical analysis
Continuous variables were presented as mean ± SD (standard deviation). Categorical variables were presented as counts and percentages. Means for continuous variables were compared using independent-samples t tests when values were normally distributed; otherwise, the Mann-Whitney U test was used. Proportions for categorical variables were compared between groups using the Fisher exact test. Univariate and multiple logistic regression analyses were applied to identify risk factors for severe COVID-19. The variables identified by the univariate analysis with a P value <0.05 were treated as possible risk factors and assigned a categorical value. These assigned values were entered into a backward-stepwise multiple logistic regression analysis. The Hosmer-Lemeshow test was used to evaluate the goodness of fit of the model by the degree of agreement between the fitted and observed values. A two-sided P value <0.05 was considered statistically significant. SPSS software version 20.0 (SPSS Inc., Chicago, Illinois, USA) was used for all statistical analyses.
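As an illustration of the goodness-of-fit check, the Hosmer-Lemeshow statistic can be sketched in plain Python. This is not the SPSS implementation used in the study: the function name, the equal-size grouping of patients sorted by predicted risk, and the default of ten groups are assumptions made for the sketch.

```python
def hosmer_lemeshow_statistic(observed, predicted, groups=10):
    """Return (statistic, number of groups used).

    observed  : 0/1 outcomes (1 = severe)
    predicted : fitted probabilities from the logistic model
    Patients are sorted by predicted risk and split into roughly
    equal-size groups (an assumption; software may bin differently).
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    statistic = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed_events = sum(y for _, y in chunk)
        expected_events = sum(p for p, _ in chunk)
        mean_risk = expected_events / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            # Squared gap between observed and expected events, scaled
            # by the binomial variance of the group.
            statistic += (observed_events - expected_events) ** 2 / variance
            used += 1
    return statistic, used
```

The statistic is conventionally compared with a chi-square distribution with (number of groups − 2) degrees of freedom; a small statistic (large P value) indicates agreement between fitted and observed values.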

Laboratory and pulmonary CT findings
Compared with the moderate group, the severe group exhibited more abnormal results in white blood cell counts, lymphocyte counts, alanine aminotransferase (ALT), aspartate aminotransferase (AST), lactate dehydrogenase (LDH), serum potassium, calcium, albumin, and C-reactive protein (CRP) (all P<0.01) (Table 2). On admission, 98.4% of patients had abnormal lung lesions on CT scans, and 72.2% had bilateral lung lobe involvement. The severe cases had a higher prevalence of bilateral lung lobe involvement than the moderate cases (P = 0.009).

Assignment of variables
We selected 23 possible risk factors that displayed a significant association with the severity of COVID-19 (P<0.05). These variables were assigned a categorical value and used as the independent variables for the logistic regression (Supplementary Table). The outcome was the dependent variable, which was specified as 1 for severe COVID-19 and 0 for moderate COVID-19.

Univariate and multiple logistic regression analysis
The univariate logistic regression analysis of the 23 variables showed that 21 variables had P<0.05 (Figure 1). Eleven variables with an OR >3 were selected as strong risk factors and entered into the backward-stepwise multiple logistic regression analysis. Finally, seven statistically significant variables were retained in the regression model (Table 3). These risk factors were dyspnea (odds ratio [OR]=5.91), decreased calcium (OR=5.79) (Figure 2), exposure to a high-transmission setting (Wuhan) (OR=5.04), elevated CRP (OR=4.67), elevated AST (OR=2.81), decreased lymphocytes (OR=2.80), and age (OR=2.78). To our knowledge, the association between hypocalcemia and severe COVID-19 was identified here for the first time (Figure 2).
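The way the fitted model turns risk factors into a predicted probability can be sketched as follows. The intercept and coefficients are those of the probability model reported in the abstract; everything else is an assumption for illustration: the standard logistic form P = 1/(1 + exp(-z)), the coding of each factor as 1 (present or abnormal) and 0 (absent or normal), and the hypothetical names `COEFFICIENTS` and `severe_probability`. The actual categorical coding is given in the supplementary table and is not reproduced here.

```python
import math

# Values taken from the probability model in the abstract.
INTERCEPT = -1.78
COEFFICIENTS = {
    "age": 1.02,
    "high_transmission_exposure": 1.62,
    "dyspnea": 1.77,
    "crp": 1.54,
    "lymphocyte": 1.03,
    "ast": 1.03,
    "calcium": 1.76,
}


def severe_probability(factors):
    """Predicted probability of severe COVID-19.

    factors maps a risk-factor name to 1 (present) or 0 (absent);
    omitted factors count as absent. Assumes 0/1 coding and the
    standard logistic link, which the source does not spell out.
    """
    unknown = set(factors) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown risk factors: {sorted(unknown)}")
    z = INTERCEPT + sum(
        beta * (1 if factors.get(name, 0) else 0)
        for name, beta in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Each odds ratio is the exponential of its coefficient, e.g. for dyspnea:
dyspnea_or = math.exp(COEFFICIENTS["dyspnea"])
```

Under these assumptions, `severe_probability({"dyspnea": 1, "calcium": 1})` gives the predicted risk for a patient with dyspnea and hypocalcemia only, and `dyspnea_or` agrees with the reported OR of 5.91 up to rounding of the coefficient.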

Discussion
The present study identified seven risk factors that were highly associated with the severity of COVID-19: dyspnea, age, decreased lymphocytes, elevated CRP, exposure to a high-transmission setting, elevated AST, and decreased calcium. Severe cases usually had dyspnea or hypoxemia one week after the onset of symptoms and progressed to acute respiratory distress syndrome (ARDS), septic shock, and multiple organ failure. Judging from the cases currently being treated, most patients had a good prognosis, and a few patients were critically ill. Considering the rapid progression of severe COVID-19, it is critical to predict and prevent the severe condition.
The univariate regression analysis showed that severity was associated with age, sex, comorbidities (hypertension and diabetes), dyspnea, respiratory rate, white blood cell count, lymphocyte count, CRP, hypokalemia, and LDH, which is consistent with the findings of several previous studies (Table 1, Table 2, and Figure 1) [12,13]. As mentioned above, most previous studies examined the possible risk factors individually, which makes it difficult to rank the risk factors by OR value. In contrast, the current multiple logistic regression model considered the possible risk factors jointly. The independent variables observed in this study could simultaneously reflect the possible injuries of multiple organs caused by COVID-19. These injuries contributed to the outcome in the form of a moderate or severe condition. Therefore, this model could control for confounding factors and screen out the significant risk factors that might have an impact on the severity of COVID-19.
To our knowledge, the present study is the first to observe a higher prevalence of hypocalcemia in severe COVID-19 cases. This association (OR=5.79) was among the two most powerful risk factors, alongside dyspnea (OR=5.91). The finding suggests that hypocalcemia might be an early warning factor for progression to critical illness in patients with COVID-19. A recent review of Ca2+ and virus infection explained that, during infection, the virus uses the host cell's environment to replicate and induces host cell dysfunction by capturing the calcium signaling system of the host cell [14]. By hijacking the Ca2+ system of the host's cells, the virus can inhibit T-cell reactivity, promote anti-apoptosis, and alter other functions, thereby affecting the occurrence and progression of the disease. Notably, this study also found that severity was associated with exposure to a high-transmission setting (OR=5.04), possibly because repeated exposure to multiple transmission sources might result in a stronger immune response, as previously reported [15].
The final prediction model included seven risk factors that broadly covered patients' systemic responses to SARS-CoV-2. Age has been widely reported as a major demographic feature that is highly related to severe COVID-19 [16]. Exposure to a high-transmission setting was representative of transmission [15]. Dyspnea represented the affected respiratory function. CRP and reduced lymphocytes represented the inflammatory response. AST represented the liver and cardiovascular tissues that are often attacked by the virus. Calcium represented electrolyte imbalance. This comprehensive representation was an important advantage of the multiple logistic regression over the univariate association. Although sex, cough, sputum, fatigue, hypertension, and diabetes were also associated with COVID-19 severity in the univariate analysis, these variables were not finally included in the multiple regression model, which might be due to the small number of cases and the relatively low prevalence of these variables in moderate cases. Nevertheless, several of them, such as hypertension and diabetes, are well-accepted risk factors for poor progression of COVID-19 and should be taken into consideration in predicting the prognosis of COVID-19.
The major shortcoming of this study was the limited number of samples. Other possible risk factors, such as obesity, diabetes, and hypertension, were not included due to the low prevalence of these comorbidities. The seven risk factors included in the prediction model were indicative of the systemic responses to SARS-CoV-2 infection. The combination of the seven risk factors could partly represent comorbidities with low prevalence.
To summarize, the present study has identified that age, dyspnea, exposure to a high-transmission setting, reduced lymphocytes, elevated CRP, elevated AST, and decreased calcium were highly associated with the severity of COVID-19. Based on the multivariable analysis of the risk factors, we developed a multiple logistic regression prediction model for severe COVID-19 cases. This quantitative prognosis prediction model can provide a theoretical basis for the early formulation of individualized diagnosis and treatment programs and for the prevention of severe conditions.

Declarations
Ethics approval and consent to participate
The study protocol was approved by the Ethics Committee of The Ding Li Clinical College of Wenzhou Medical University and Sixth People's Hospital of Wenzhou. Individual informed consent was obtained.

Funding
This study was funded by the Key Scientific and Technological Innovation Projects of Wenzhou (ZY202004).

Consent for publication
All authors agree.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare no competing interests.

Contributors
Chenchan Hu and Feifei Su designed the study and did the literature search. Jianyi Dai, Shushu Lu, Lianpeng Wu, and Dong Chen were responsible for disease diagnosis and treatment and data collection. Fan Zhou and Qifa Song analyzed data and wrote the manuscript.

Figure 1
Distribution of odds ratio of risk factors for COVID-19 severity analyzed by univariate logistic regression.

Figure 2
Distribution of serum calcium among the moderate and severe COVID-19 cases.
