In this large external validation study examining the performance of SORT in patients undergoing abdominal surgery, we found that SORT accurately predicted risk of post-operative death. It performed particularly well in low risk patients, but under-predicted the risk of death in patients who were stratified as the highest risk. When SORT was used to identify patients at risk of adverse outcome, only 2.2% of the study population were identified as being high risk. In this high risk group 25% of patients had unplanned ICU admissions. These may have been avoidable had SORT been used to risk assess patients pre-operatively.
SORT was originally developed in 11,219 non-cardiac surgical patients3 identified in the NCEPOD enquiry titled ‘Knowing the Risk’.1 Its authors successfully validated it in a separate population of 5569 non-cardiac patients, with an AUROC of 0.91.3 There were 87 deaths by 30 days in this validation cohort, but SORT predicted only 73.3 The present study shows a similar trend; there were 29 deaths, but SORT predicted 25.
Several further external validation studies have assessed the ability of SORT to predict 30-day mortality, although none in a mixed population of abdominal surgery patients. Wong et al calculated SORT for 475 hepatectomies, reporting an AUROC of 0.82; however, SORT over-predicted the number of deaths, particularly in patients with the lowest risk profiles.8 Oliver et al assessed SORT in a mixed population of 1936 elective orthopaedic and general surgery procedures, reporting an AUROC of 0.85.9 Like the present study, both of these studies reported low mortality rates (0.3% and 1.7%, respectively) and therefore contained a high proportion of true negatives, which may have led to an over-estimation of the performance of SORT. Marufu et al assessed the performance of SORT in a population of hip fracture patients, who had a higher rate of death (5%). In this more balanced population SORT did not perform as effectively, with an AUROC of only 0.70.10
The predictive ability of risk stratification tools is frequently assessed using AUROCs and the c-statistic. However, in populations where the outcome of interest is infrequent, such as the low mortality rate seen in the present study, AUROCs may over-estimate the performance of the model. This is due to the impact of a large proportion of patients without the event (true negatives) in the calculation of specificity. In imbalanced populations the more appropriate analysis may be the PRC, where true negatives do not feature in the calculation of precision (positive predictive value) or recall (sensitivity).6 The present study is the first to assess the performance of SORT using PRC as well as a ROC curve, finding that the performance of SORT was significantly poorer when assessed using PRC. This was notable in patients with the highest risk profiles, where SORT under-quantified their risk. In lower risk patients, however, SORT performed well. Arguably risk prediction tools are more useful in these patients than in patients with higher risk profiles, as the latter will have risk factors for poor outcome, such as advanced age, complex co-morbidity or emergency surgery, which are readily identified by clinicians.
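The point about true negatives can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from the present study or any cited cohort; the class and function names are likewise illustrative. The two confusion matrices differ only in the number of true negatives, which changes specificity (an input to the ROC curve) but leaves precision and recall (the inputs to the PRC) unchanged.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a 2x2 confusion matrix at one risk threshold."""
    tp: int  # deaths correctly flagged as high risk
    fp: int  # survivors flagged as high risk
    fn: int  # deaths not flagged
    tn: int  # survivors not flagged


def specificity(c: Confusion) -> float:
    # True negatives appear in both numerator and denominator.
    return c.tn / (c.tn + c.fp)


def precision(c: Confusion) -> float:
    # Positive predictive value: true negatives do not appear.
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    # Sensitivity: true negatives do not appear.
    return c.tp / (c.tp + c.fn)


# Hypothetical cohorts: identical except for a tenfold rise in true negatives.
balanced = Confusion(tp=20, fp=80, fn=10, tn=890)
imbalanced = Confusion(tp=20, fp=80, fn=10, tn=8900)

for name, c in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(
        f"{name}: specificity={specificity(c):.3f} "
        f"precision={precision(c):.3f} recall={recall(c):.3f}"
    )
```

In this illustration specificity rises as survivors are added, while precision and recall stay fixed, which is why a ROC-based measure can look more favourable than a PRC-based one when deaths are rare.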
Several other tools have been designed to predict post-operative morbidity and mortality, such as ASA11 and the Portsmouth Physiological and Operative Severity Score for Mortality and Morbidity (P-POSSUM).12 In an external validation study of 5569 patients, SORT was superior to ASA at predicting mortality, although both performed well (AUROCs of 0.91 and 0.87, respectively).3 ASA is a population-based tool that defines physical status rather than operative risk and, although it is widely used, misclassifications are common, particularly amongst patients with multiple co-morbidities.13 The performance of SORT is yet to be compared to that of P-POSSUM. A limitation of P-POSSUM is that it requires laboratory data, a chest radiograph and an electrocardiogram, making it more difficult to calculate than SORT.
Once risk prediction is established as being accurate, the next question is which discrete level of risk qualifies a patient as ‘high risk’. The RCS recommend using a predicted risk of death of ≥ 5% to identify high risk patients.2 This represents a departure from previous guidance that categorised patients as high risk if they had a predicted risk of death of ≥ 10%.14 The present study is the first to assess ICU utilisation following the new recommendation of a threshold of 5%, and the first to use SORT to stratify patients. We demonstrate that lowering the threshold to 5% does not generate large volumes of new post-operative ICU admissions; only 2.2% of the study population met the criteria for direct ICU admission, and most of these had already been recognised as requiring post-operative ICU care. This group of additional ICU admissions represents only 0.45% of the study population. Of note, 25% of the high-risk group had unplanned ICU admissions. These patients represent a sub-group of high risk patients who could have been identified pre-operatively by SORT and electively admitted to ICU. However, there were also patients in the high risk group who were managed without ICU admission, and conversely patients in the standard risk group who had unplanned ICU admissions or died in hospital. In the standard risk group there were 16 deaths, suggesting that a predicted mortality threshold of 5% may yet be too high to safely identify all patients at risk of death.
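The effect of moving the threshold can be sketched as follows. This is an illustration only: the predicted risks are hypothetical values rather than study data, and `flag_high_risk` is an assumed helper name, not part of SORT or of any guideline.

```python
def flag_high_risk(predicted_risks, threshold=0.05):
    """Return True for each patient whose predicted mortality meets the threshold."""
    return [risk >= threshold for risk in predicted_risks]


# Hypothetical predicted risks of death, expressed as proportions.
risks = [0.002, 0.011, 0.048, 0.05, 0.073, 0.12]

flagged_at_5 = sum(flag_high_risk(risks, threshold=0.05))
flagged_at_10 = sum(flag_high_risk(risks, threshold=0.10))

print(f"high risk at >= 5%: {flagged_at_5} of {len(risks)}")
print(f"high risk at >= 10%: {flagged_at_10} of {len(risks)}")
```

Patients whose predicted risk falls between the two thresholds are the additional group that the lower threshold would direct to ICU.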
Historically post-operative ICU admission has been thought to be of benefit as it permits rapid recognition and treatment of life-threatening post-operative complications. A study of 572,598 general surgical procedures found that a patient who receives post-operative ward-based care but then requires unplanned ICU admission has twice the risk of 30-day mortality.15 However, in elective surgery a recent study of 44,814 patients found no association between direct admission to ICU following surgery and in-hospital mortality.16 These findings may be explained by advances in surgical and anaesthetic techniques that have reduced the physiological disturbance caused by surgery and therefore reduced the impact of ICU-based care. In the present study half of the patients in the standard risk group were admitted to ICU post-operatively. Given the acuity of the surgical procedures this is not an unexpected finding, but in the future a proportion of these patients may be eligible to receive critical care interventions, such as telemetry or vasopressors, outside of the traditional ICU.
Within the standard risk group 3.3% of patients had an unplanned ICU admission. These patients would not have been identified if risk stratification was restricted to SORT and the 5% mortality threshold. It is therefore important to highlight that risk tools serve to aid, as opposed to replace, clinical judgement. None of the previously described scores have been directly compared to clinical opinion, but when assessing pre-operative risk, guidelines recommend that risk tools are used in conjunction with surgical judgement.2 In keeping with this, the American College of Surgeons National Surgical Quality Improvement Programme risk tool has an in-built option to allow surgeons to modify risk calculations if they deem it necessary.18
Mortality is not the only outcome of importance to clinicians and patients. Prediction of complications and morbidity that allows accurate discussion of risk during surgical consent and pre-operative optimisation would also be of value and is a key area for further research. The creators of SORT have developed the SORT morbidity model, which they have validated in a mixed population of 527 elective surgery patients.19 It is yet to be further externally validated.
This study uses data collected from patients treated in five independent hospitals in the UK, a sector of healthcare that is traditionally thought to deliver simple treatments to stable patients. When comparing the demographics of this population to that of a contemporaneous NHS population of 16,788 surgical patients3 there are important similarities. High ASA classifications were common (ASA 3 and 4 were found in 19.9% and 2.7%, respectively, in the NHS study,3 and 14.4% and 1.5%, respectively, in the present study) and the majority of patients were undergoing major or complex-major operations (32.7% and 34.2% of patients, respectively, in the NHS study3 and 41.6% and 49.8%, respectively, in the present study). The mortality rate was also similar (1.8% in the NHS study and 0.88% in the present study) and comparable to reported rates of 1.4 to 1.9% in other large NHS-based population studies of surgical patients.20,21
There are some important limitations to the present study. SORT was initially developed to predict 30-day mortality, but the present study was limited to in-hospital death as we were unable to collect data on patient outcomes after discharge. We were also unable to capture the outcomes of patients who were transferred to the NHS or other healthcare providers. However, these cases represented only 1.6% of the study population. In some cases we were unable to determine the rationale for post-operative ICU admission, so these cases were excluded from this sub-analysis. In the remaining cases we assumed that ICU admissions categorised as unplanned were categorised according to clinical need. However, a proportion of these may represent elective admissions where the operating surgeon failed to book a bed, and so were not truly unplanned admissions. It was not possible to sub-classify procedure urgency beyond elective or unplanned, so we were unable to identify which patients were truly ‘expedited’ or ‘emergency’ procedures. This may mean that true ‘emergency procedures’ are under-represented in the study population, leading to under-estimation of the ICU capacity needed to implement the 5% risk threshold. It may also mean the performance of SORT described in the present study is not as good as it could be if all variations of procedure urgency were included.
In summary, this large study externally validates SORT in a population of patients undergoing major abdominal surgery. SORT performed particularly well in patients with low risk profiles, but under-predicted the number of deaths in patients with the highest risk. When SORT was used to identify patients with a predicted post-surgery mortality of ≥ 5%, and therefore requiring direct ICU admission, some patients who were stratified as standard risk ultimately required unplanned ICU admission. However, SORT did identify high risk patients who had unplanned ICU admissions, demonstrating the value of using SORT in conjunction with clinical judgement.