The main conclusion of this study is that the prognosis of a patient with COVID-19 pneumonia can probably be predicted early by combining a widely validated comorbidity scale and an acute disease scale; the only "complementary examination" that we include is arterial saturation by pulse oximetry, a measurement that can be done at the patient's home as easily as taking blood pressure. We have chosen the most popular comorbidity scale, the Charlson Comorbidity Index[11] (age-adjusted version[9]), and, as a pneumonia severity scale, one of the CURB-65 family, the CRB scale;[3] other options will surely also be valid. The main point is to check the validity of an idea with strong clinical coherence: the prognosis of a patient essentially depends on the balance between the patient's resistance capacity and the aggressiveness of the acute problem.
The sample we study meets the requirements to be considered representative: confirmed cases, consecutively included, in the same phase of the disease (on admission to hospital), with homogeneous admission criteria, in a naturally delimited time frame, with prospective data collection and complete follow-up (minimal percentage of losses: 12/404, 3%). Furthermore, looking at the proportion of hospital beds occupied by COVID-19 patients (maximum 38%, Supplementary Material, Figure S1), we get the impression that the confounding effect that work overload could have on patient outcomes has been lower in our hospital than in other cases.[12, 13]
Baseline characteristics also support the idea of representativeness; they are very similar to those of other series:[12, 14, 15] predominantly male, with a mean age of 60 years, similar to that of the USA[16] and intermediate between that of China[17] (around 55 years) and that of the United Kingdom[18] (70 years). The comorbidity burden was low (Age-Charlson median 2 points), similar to that observed by Casas-Rojo[15] in Spain, and in other series that have evaluated age and the Charlson index separately: Italy,[19] USA,[20] Denmark,[21] or China.[13] In all of them, despite such different socio-geographic contexts, both characteristics were independent risk factors, which reinforces the suitability of combining them in Age-Charlson.
In the first clinical evaluation, CRB and SpO2 were abnormal in only 18% of the patients, but with a strong association with severity. SpO2 could be especially useful in COVID-19 patients, helping to detect what has been called "silent hypoxemia".[22, 23]
The most widespread model in which data on comorbidity and acute disease are combined in patients with pneumonia is the PSI scale.[24] However, it has a substantial disadvantage: it cannot be used outside a health centre, since 7 of its 19 variables require laboratory or radiology/ultrasound tests. There are very few studies with predictive models applicable in primary care that, at the same time, implement an intuitive idea: assessing the prognosis of potentially seriously ill patients requires considering not only the aggressiveness of the acute disease but also the burden of chronic disease that weakens them.[25] Generally, both components have been studied as alternatives,[26] and rarely as complementary.[27, 28] In patients with COVID-19, Petrilli[29] and ISARIC[18, 30] are the two groups whose work most closely resembles this study’s objective. Petrilli does not explicitly include a comorbidity scale but empirically reaches the same conclusions: age, comorbidity, oxygenation and inflammation parameters determine the need for hospitalisation and the development of severe disease; the relative weight of each possibly varies depending on the outcome and the population of interest. ISARIC-4C is based on the components of the Charlson Index and CURB-65, along with gender, obesity and CRP, to build a model with 8 predictor variables, 2 of which are biochemical, which limits its application outside the hospital context; unexpectedly, hypotension did not reach the final model. In polypathological COVID-19 patients, the usefulness of combining acute damage and comorbidity scales has also been partially reported, in this case not with the Charlson index but with a specific scale for polypathological patients (PROFUND).[31]
Regarding other variables that could be important, we have explored the baseline functional situation in terms of dependency for activities of daily living. Although it was significant in the univariate and bivariate analysis (Table 1), it ceased to be so in the multivariate analysis after incorporating the Age-Charlson scale; however, we think it deserves to be explored further. Casas-Rojo[15] report very similar results: 16% of dependency for activities of daily living, and association with worse evolution in the bivariate analysis; the multivariate analysis has yet to be published. Bernabeu-Wittel,[31] in a study focused on polypathological patients with COVID-19, incorporates functional status (Barthel index) into the assessment of comorbidity.
The rate of severe disease in this series is 27%; in other studies, it ranges between 15% and 37%.[29, 30, 32, 33] This variability may be due to differences in the selected sample, in the definition of severe disease or in the method used to build the model:
- Major differences in sampling: due to differences in the age of the patients (which we address below); or due to exclusively including patients diagnosed by chest CT[34] (which is more sensitive than plain radiography); or excluding patients who already present in a severe condition[35–38] (because their objective is to study the progression from non-severe to severe); or limiting follow-up to a short period that does not allow a significant proportion of included patients to reach the outcome of interest,[39, 40] thereby raising a significant risk of selection bias, which is discussed later.
- Important differences in the definition of severe disease: most of the predictive models developed in China[37, 40, 41] use the definition recommended by the National Health Commission of China, which is broader than ours. In this Chinese definition, a ratio of arterial oxygen pressure to inspiratory oxygen fraction (Pa/Fi) less than 300 is a sufficient criterion to diagnose severe pneumonia. So, for example, a patient with a PaO2 of 80 mmHg on a FiO2 of 0.3 (Pa/Fi: 80/0.3 = 267) would be considered severe under the Chinese definition, but not under ours. Our definition adopts the criteria routinely recommended when considering the admission of a patient with pneumonia to a high-dependency area or an Intensive Care Unit;[42–44] regarding FiO2, it requires a need for 0.6 or more. Other studies limit the definition to admission to an ICU or Intermediate Unit,[18] overlooking that, in order to admit a patient to these units, patient recoverability and availability of beds are assessed in addition to severity; this explains the variability in the use of ICUs and why a high percentage of severely ill patients are not treated in ICU,[45] approximately 60% in our series.
- Important differences in the strategy to build prediction models: some studies base their model on tools with little availability today, such as artificial intelligence or copyrighted computer applications.[34, 36, 37, 46, 47]
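The Pa/Fi example in the list above can be reproduced with a few lines of Python. This is only a sketch of the oxygenation criterion as described in this section; the function names are hypothetical, and the other criteria that either definition may include are deliberately left out.

```python
def pf_ratio(pao2_mmhg: float, fio2: float) -> float:
    """Ratio of arterial oxygen pressure (mmHg) to inspired oxygen fraction (0.21 to 1.0)."""
    if not 0.21 <= fio2 <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    return pao2_mmhg / fio2


def severe_by_pf_threshold(pao2_mmhg: float, fio2: float, threshold: float = 300.0) -> bool:
    """Oxygenation criterion described for the Chinese definition: Pa/Fi below 300."""
    return pf_ratio(pao2_mmhg, fio2) < threshold


def severe_by_fio2_requirement(fio2: float, min_fio2: float = 0.6) -> bool:
    """Oxygenation criterion described for this study: needing a FiO2 of 0.6 or more."""
    return fio2 >= min_fio2


# The patient from the text: PaO2 of 80 mmHg on a FiO2 of 0.3.
ratio = pf_ratio(80, 0.3)
print(round(ratio))                      # 267
print(severe_by_pf_threshold(80, 0.3))   # True: severe under the Pa/Fi < 300 criterion
print(severe_by_fio2_requirement(0.3))   # False: FiO2 is below 0.6
```

The same patient is therefore classified differently depending only on which oxygenation criterion is applied, which is one reason the reported rates of severe disease are hard to compare.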
The crude mortality rate in our series is 13% (of hospitalised patients). Again, direct comparison with other series is difficult, even though mortality is a more robust outcome than disease severity. In Spain, mortality in multicentre studies of hospitalised patients has been 21–28%,[15, 32] in the United Kingdom 30%,[30] and in Italy 20%.[48] Age distribution and incomplete follow-up are two factors that could explain differences not only in raw mortality but also in the performance of predictive models.
Mortality varies according to age in all series; in our study, it ranges from 0% under 40 years to almost 40% at ages above 80 years (Fig. 5). A partial solution to improve comparability could be the age-standardised mortality rate, that is, the mortality that a population would have if it had the age distribution of a reference population (e.g., the WHO World Standard Population),[49, 50] although it is not without criticism.[51] In this series, the age-standardised mortality rate with this reference population is 2.9 deaths per 100 COVID-19 patients admitted.
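As an illustration of how direct age standardisation works, the following Python sketch weights each age-specific mortality rate by the share of that age band in a reference population. The age bands, counts and weights are invented for the example; they are not the data of this series and not the actual WHO World Standard Population weights.

```python
def age_standardised_rate(deaths, patients, std_weights):
    """Direct standardisation: sum of age-specific rates weighted by the reference population."""
    if not (deaths.keys() == patients.keys() == std_weights.keys()):
        raise ValueError("All inputs must use the same age bands")
    total_weight = sum(std_weights.values())
    rate = 0.0
    for band in deaths:
        age_specific_rate = deaths[band] / patients[band]
        rate += age_specific_rate * std_weights[band] / total_weight
    return rate


# Hypothetical cohort (NOT the study data): an older hospital population.
patients = {"<40": 50, "40-79": 100, "80+": 50}
deaths = {"<40": 0, "40-79": 10, "80+": 20}
# Hypothetical reference population (NOT the WHO weights): much younger.
std_weights = {"<40": 0.60, "40-79": 0.35, "80+": 0.05}

crude = sum(deaths.values()) / sum(patients.values())
standardised = age_standardised_rate(deaths, patients, std_weights)
print(f"crude: {crude:.1%}, age-standardised: {standardised:.1%}")
```

With a young reference population the standardised rate falls well below the crude rate, which is the same direction of change seen between the crude and age-standardised figures reported above.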
Incomplete follow-up cannot be controlled in the analysis phase. In mortality studies published in the first months of the pandemic, follow-up was frequently limited to 2 weeks of hospital stay, so that only those patients who had died or been discharged during that time were analysed and those who remained hospitalised were excluded.[16, 18, 52] The lack of follow-up information on these patients most likely biases both the estimation of crude mortality and the estimated performance of predictive models of mortality.[1, 33] Fig. 2 shows the distribution of hospital stay in our series depending on whether the patient had moderate disease, severe disease but finally survived, or severe disease and finally died; as previously mentioned, the group of severe but surviving patients had the longest stay, well above 14 days, so they would be largely censored from the analysis if follow-up were limited to two weeks. These patients constitute "informative right censoring" because they are a "selective" loss of survivors, which leads to an inaccurately high estimate of mortality. They are also a "selective" loss of patients with a difficult prognosis (they were severely ill but survived), which would lead to prediction models that work better in the study than in real life, since the patients who remain in the study come mainly from the extremes of the severity spectrum: survivors after mild-moderate illness (left section of the graph) and the deceased (right section of the graph). Our series has only a 3% loss of included patients, unrelated to length of stay or outcome and due instead to transfer from the Emergency Department to another hospital because of their place of residence.
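The direction of this bias can be seen in a toy calculation. The Python sketch below uses an invented cohort (not the study data) in which severe survivors have the longest stays; restricting the analysis to patients whose outcome is known within 14 days drops exactly those survivors and inflates the crude mortality.

```python
def crude_mortality(patients):
    """Proportion of deaths among (stay_days, died) records."""
    return sum(died for _, died in patients) / len(patients)


def restrict_to_resolved(patients, max_days=14):
    """Keep only patients discharged or deceased within max_days (truncated follow-up)."""
    return [(stay, died) for stay, died in patients if stay <= max_days]


# Hypothetical cohort: (length of stay in days, died?)
cohort = (
    [(stay, False) for stay in (5, 6, 7, 8, 9, 10)]  # moderate disease, survived
    + [(20, False), (30, False)]                      # severe disease, survived after a long stay
    + [(8, True), (12, True)]                         # severe disease, died
)

complete = crude_mortality(cohort)
truncated = crude_mortality(restrict_to_resolved(cohort))
print(f"complete follow-up: {complete:.0%}, 14-day follow-up: {truncated:.0%}")
```

In this example the two long-stay survivors are the only patients excluded, so the mortality estimate rises from 20% to 25% even though no additional patient died.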
Concerning laboratory variables, we observed that high levels of CRP, PCT and LDH, or lymphopenia, reach statistical significance but not, from our perspective, clinical significance. This does not mean that they would not be useful in studies with different objectives; comorbidity variables are probably more decisive in countries with an older population, while variables of acute inflammatory damage are more decisive in countries with a young population.[29] This should be investigated but is beyond the scope of this study.
What use can these models have? An essential requirement to apply them with confidence is their validation in independent but representative samples. Once validated, they can have multiple applications, in both the clinical and the management areas:
- Support for clinical decisions when, after a routine initial assessment, the course of action is unclear. The predicted probability of severe disease can help to decide whether to admit the patient to hospital and to which department, or which specific treatment to prescribe.
- Support in decision-making for the management of the infrastructure necessary for care to function as efficiently and effectively as possible.
- Quality control, through the relationship between observed and expected mortality according to the model.[53]
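For the quality-control use listed above, one common way to relate observed and expected mortality is the observed/expected (O/E) ratio, where the expected number of deaths is the sum of the probabilities predicted by the model. The Python sketch below assumes that such predicted probabilities are available; the values are invented and the function name is hypothetical.

```python
def observed_expected_ratio(outcomes, predicted_probs):
    """Observed deaths divided by the deaths expected from the model's predicted probabilities."""
    if len(outcomes) != len(predicted_probs):
        raise ValueError("One predicted probability is needed per patient")
    expected = sum(predicted_probs)
    if expected == 0:
        raise ValueError("Expected number of deaths is zero")
    return sum(outcomes) / expected


# Hypothetical patients: 1 = died, 0 = survived, with hypothetical model predictions.
outcomes = [0, 0, 1, 0, 1]
predicted_probs = [0.1, 0.2, 0.7, 0.1, 0.9]

oe_ratio = observed_expected_ratio(outcomes, predicted_probs)
print(f"O/E ratio: {oe_ratio:.2f}")
```

A ratio close to 1 means that observed mortality matches what the model predicts for that case mix; values clearly above or below 1 would prompt a closer review, provided the model has been validated in the population concerned.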
Study limitations.
We have studied a sample from a single centre. The sample size is modest and, even though it is sufficient for the study’s objective, it still gives wide confidence intervals for relevant variables. It is possible that some severely ill patients were misclassified as non-severe, specifically those severely ill patients who survived after being treated outside the ICU, if they were not discussed in the COVID commission. In any case, if such misclassification exists, it would constitute a case of differential misclassification of the outcome, where the expected effect would be to bias the estimates towards the null, and therefore it would not weaken the study's conclusions. The proposed model needs to be validated, especially in patients evaluated outside the hospital and subsequently followed up until the end of the disease.