In this study, a SOFA-based, time-incorporated mortality risk prediction model (T-SOFA M3) for critically ill patients was developed and validated. We defined several novel temporal variables to describe the dynamic change of OD quantitatively from different perspectives. The results showed that a prediction model incorporating the time dimension and age using machine learning algorithms had significantly better predictive performance for ICU mortality than the original SOFA.
The SHAP algorithm was applied to evaluate the contribution of the time dimension. The results showed that the various OD types had different impacts, with the respiratory, central nervous, and cardiovascular systems ranking as the top three, consistent with the components used in the popular qSOFA score [5, 20]. Among all the temporal variables, the impact of SOFA OUTI, which reflects the duration of persistent (unalleviated) OD, ranked highest, even higher than that of renal dysfunction, indicating the importance of the time dimension. The better interpretability provided by SHAP can potentially further guide treatment and facilitate clinical decision-making. Furthermore, because our model is derived from the impact of each variable, it could be used for individual prediction and thus for risk stratification in ICU settings, which was not possible with previously proposed modified SOFA models [6, 13, 21–23].
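As an illustration of how per-patient SHAP values can be turned into a global ranking of variables, the sketch below averages the absolute SHAP value of each feature across patients and sorts the result. This is one common convention and is not confirmed to be the exact procedure used in this study. The function name, the feature names, and all numbers are hypothetical toy values, not study results.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value (largest first).

    shap_rows: one list of SHAP values per patient, ordered like feature_names.
    """
    if not shap_rows:
        raise ValueError("at least one row of SHAP values is required")
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        if len(row) != len(feature_names):
            raise ValueError("row length must match the number of features")
        for j, value in enumerate(row):
            totals[j] += abs(value)  # sign is ignored: only magnitude matters
    n_patients = len(shap_rows)
    importance = [(name, total / n_patients)
                  for name, total in zip(feature_names, totals)]
    return sorted(importance, key=lambda item: item[1], reverse=True)


# Hypothetical feature names and invented SHAP values for three patients.
features = ["respiration", "persistent_od_duration", "renal"]
toy_shap = [
    [0.30, -0.20, 0.05],
    [-0.40, 0.25, -0.10],
    [0.20, 0.15, 0.00],
]
ranking = rank_by_mean_abs_shap(toy_shap, features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the absolute value is taken before averaging, a variable that pushes risk up in some patients and down in others still receives a high rank, which suits a question about overall contribution rather than direction of effect.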
Although efforts to develop new predictive models based on the SOFA score have been repeatedly reported in the literature [6, 13, 20, 24–27], the T-SOFA M3 has several unique advantages. First, the training set was drawn from two combined datasets (MIMIC-III and eICU-CRD) with a large sample size, which may compensate for the remarkable heterogeneity among ICUs in the US. Second, the model was externally tested on both an open-source multi-center dataset and a local single-center dataset, demonstrating excellent predictive performance and geographical generalizability. Third, we collected total SOFA scores and individual component scores every 12 hours within 72 hours of ICU admission, a relatively infrequent schedule that is feasible in routine practice, indicating the model's potential for future clinical use.
In recent years, the application of the SOFA score has extended to a broader range of diseases. Although the SOFA score was initially developed to evaluate morbidity rather than to predict mortality, it has become a well-established tool for prognosis prediction. As a consequence, many modified SOFA models have emerged to overcome the limitations of the SOFA score. Modified SOFA systems incorporating SOFA score derivatives such as ΔSOFA, MAX SOFA, and Mean SOFA could significantly outperform SOFA at admission, suggesting that characteristics representing changes in OD over time could help improve predictive performance.
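The SOFA derivatives mentioned above can be computed from a time-ordered series of total SOFA scores, as in the minimal sketch below. Definitions of ΔSOFA differ between studies; here it is assumed to be the last score minus the first. The function name and the example scores are hypothetical.

```python
from statistics import mean


def sofa_derivatives(scores):
    """Summarize a time-ordered series of total SOFA scores for one patient."""
    if not scores:
        raise ValueError("at least one SOFA score is required")
    return {
        "admission_sofa": scores[0],
        "max_sofa": max(scores),
        "mean_sofa": mean(scores),
        # Assumed definition: last minus first. Other studies use
        # different reference points for the change in score.
        "delta_sofa": scores[-1] - scores[0],
    }


# Hypothetical patient with repeated total SOFA measurements.
example_scores = [6, 8, 9, 9, 7, 5]
summary = sofa_derivatives(example_scores)
print(summary)
```

In this invented series the admission score (6) understates the peak (9), which is the kind of information that a single score at admission cannot capture.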
Lilian Minne et al. systematically reviewed the performance of SOFA-based models for predicting mortality in critically ill patients and concluded that MAX SOFA had the best predictive performance (AUROC range = 0.792 to 0.922), while the performance of ΔSOFA varied across studies (AUROC range = 0.51 to 0.828) depending on the definition used. Another option was to optimize the SOFA by replacing individual components with more clinically reliable ones. Previous studies have attempted to use the Richmond Agitation-Sedation Scale (RASS) score as a substitute for the Glasgow Coma Scale (GCS) score, which tends to be inaccurately recorded. However, they did not find the RASS-modified SOFA to be superior to the original version. Adding other systems or variables to the original SOFA, such as the gastrointestinal system, was also considered and tested. Yehudit Aperstein et al. proposed the Novel Gastrointestinal Dysfunction Assessment Tool (resting energy expenditure daily balance, gastric residual volume, vomiting, and bowel movements) as the seventh organ dysfunction assessment criterion of the modified SOFA and demonstrated improved prediction accuracy. However, their modified SOFA score did not gain much popularity because the assessment of gastrointestinal dysfunction is subjective and inaccurate. In the T-SOFA M3, age was the only variable added to the data required by the original SOFA, since it is readily available and strongly correlated with mortality [32, 33].
Our study has some limitations. First, both the training and validation sets were extracted from retrospective data, which inevitably contained missing data and recording errors. Although multiple efforts were made to minimize the potential bias, some data distortion remained. Therefore, prospective studies are necessary to further validate the discriminative ability of the model. Moreover, our model requires 72 consecutive hours of SOFA data, which precludes its use in the "very early" (<24 hours) stage of ICU admission.