Baseline Characteristics of the Participants
Among the 780 cases included, 430 (55.13%) were diagnosed with AKI (AKI group), of which 159 (36.97%) were stage 3 AKI requiring postoperative CRRT. The cohort was predominantly male (n = 682, 87.44%), with a mean age of 50.7 years and a BMI of about 22.78 (Table 1).
Patients who did not develop AKI (Non-AKI group) had percentages of preoperative AKI and CKD comparable to those of the AKI group. Although CRRT was used more often in the AKI group (16.27% vs. 6.85%, p < 0.001), the biomarkers of renal function did not differ to a clinically significant degree. Meanwhile, the AKI group presented with more severe liver dysfunction and coagulopathy and a higher MELD score (median 30 vs. 22, p < 0.001). The AKI group also had fewer cases of hepatic malignancy (28.37% vs. 54.28%, p < 0.001) and a higher percentage of hepatic encephalopathy (HE) (32.33% vs. 11.7%, p < 0.001). The percentages of graft steatosis and ABO incompatibility were also significantly higher in the AKI group.
During LT, the AKI group tended to suffer greater blood loss and to require higher volumes of blood transfusion and higher doses of terlipressin, sodium bicarbonate and hemostatic medications. Consistent with this, the average intraoperative urine output of the AKI group was significantly lower (mean 2.61 vs. 3.70 ml·kg⁻¹·h⁻¹, p < 0.001).
A large majority of AKI cases (n = 288, 66.97%) were diagnosed within 24 hours after LT (Table 1), that is, before tacrolimus was introduced. Although we collected data on postoperative medications given before the diagnostic SCr (AKI group) or before the maximum recorded SCr (Non-AKI group) (Appendix 3 Table 3), the heterogeneity in the timing of diagnosis made these data unsuitable as predictors in our model.
The 6-month, 1-year and 2-year survival rates in the AKI group were 85.52%, 82.65% and 79.87%, respectively, significantly lower than those in the Non-AKI group (92.30%, 88.97% and 85.52%) (Figure 1).
Feature Importance and Model Performance
In the end, 14 predictors were selected (Appendix 1 Table 4) and used in each classifier to predict AKI. Across 1000 bootstrap test data sets, the GBM model achieved the greatest AUC (0.76, CI 0.70 to 0.82), the highest F1-score (0.73, CI 0.66 to 0.78), tied with ADA, and relatively balanced sensitivity (0.74, CI 0.66 to 0.8) and specificity (0.65, CI 0.55 to 0.73) (Figure 2). Since the GBM algorithm is more robust to outliers than ADA, we chose the GBM model for further analysis and application.
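As an illustration of how an AUC with a bootstrap confidence interval like the one above can be obtained, the following is a minimal sketch using a percentile bootstrap over test cases. It is not the study's code: the resampling scheme, the function names (`auc`, `bootstrap_auc_ci`) and the synthetic labels and scores are all assumptions made for the example.

```python
import random

def auc(labels, scores):
    """AUC as the chance that a random positive case outranks a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and recompute the AUC each time."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    return stats[lo_idx], stats[hi_idx]

# Synthetic data for illustration only (not the study's test set).
_rng = random.Random(42)
demo_labels = [i % 2 for i in range(80)]
demo_scores = [0.25 * y + 0.75 * _rng.random() for y in demo_labels]
demo_auc = auc(demo_labels, demo_scores)
demo_ci = bootstrap_auc_ci(demo_labels, demo_scores)
print(f"AUC {demo_auc:.2f}, 95% CI {demo_ci[0]:.2f} to {demo_ci[1]:.2f}")
```

The same resampling loop can be reused for F1-score, sensitivity and specificity by swapping the metric function.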
Since Kalisvaart’s AKI prediction score was built after excluding patients requiring preoperative CRRT (5), we first validated and compared this score and our GBM-based predictor in the complete test set, then compared them in a subset excluding patients who received preoperative CRRT. In our test set, the AKI prediction score showed a specificity of 1.0 (CI 1.0 to 1.0) but the lowest AUC (0.52, CI 0.45 to 0.6), F1-score (0.03, CI 0.0 to 0.08) and sensitivity (0.02, CI 0.00 to 0.04). These metrics did not improve even in the subset excluding patients who received preoperative CRRT. Meanwhile, the GBM model still demonstrated a higher AUC (0.74, CI 0.67 to 0.8) with acceptable specificity (0.68, CI 0.59 to 0.77) and sensitivity (0.64, CI 0.56 to 0.73) after exclusion of patients requiring pre-LT dialysis.
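The pattern reported for the AKI prediction score (perfect specificity with near-zero sensitivity and F1-score) is what a classifier produces when it flags almost no one as positive. The sketch below computes the three metrics from confusion-matrix counts. The counts are invented for illustration and are not the study's data; `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and F1-score from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}

# Hypothetical counts: 100 AKI and 100 non-AKI cases, with a score that
# flags only 2 patients, both of whom truly developed AKI.
conservative = binary_metrics(tp=2, fp=0, tn=100, fn=98)
print(conservative)  # sensitivity 0.02, specificity 1.0, F1 about 0.04
```

Because F1-score combines precision with sensitivity, a score that almost never predicts AKI keeps a low F1 even when every positive call it makes is correct.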
SHAP Values and Plots
The baseline for the Shapley values in our study is the average predicted AKI incidence across the test set, which was 52.08%. In our test set of 234 cases, 163 were correctly classified. The SHAP summary plot showed that preoperative IBIL, intraoperative urine output, time under general anesthesia, preoperative PLT and graft steatosis were the five most important features (Figure 3 A). Both kinds of SHAP plot revealed that higher IBIL, lower urine output, lower PLT, longer anesthesia time and graft steatosis above NASH CRN 1 were associated with higher SHAP values in the GBM model, indicating a higher probability of post-LT AKI (Figure 3).
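To make the role of the baseline concrete, the sketch below computes exact Shapley values for a toy three-feature model by enumerating feature subsets. It shows that the baseline is the mean prediction over a reference set and that baseline plus the sum of Shapley values equals the prediction for the individual. Everything here is an assumption for illustration: the toy model, its coefficients, the background rows and the patient values are invented, and the brute-force enumeration is not the algorithm used by SHAP for tree models such as GBM.

```python
from itertools import combinations
from math import factorial

def toy_model(x):
    """Hypothetical risk model (not the study's GBM); returns a score clipped to [0, 1]."""
    ibil, urine, steatosis = x
    score = 0.3 + 0.02 * ibil - 0.05 * urine + 0.1 * steatosis + 0.01 * ibil * steatosis
    return min(max(score, 0.0), 1.0)

def value(model, x, background, subset):
    """Mean prediction with features in `subset` fixed to x and the rest taken from background rows."""
    total = 0.0
    for b in background:
        mixed = [x[j] if j in subset else b[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley values by enumerating all subsets (feasible only for a few features)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(model, x, background, set(s) | {i})
                without_i = value(model, x, background, set(s))
                phi[i] += weight * (with_i - without_i)
    return phi

# Invented reference rows and patient: (IBIL, urine output, steatosis flag).
background = [(5.0, 3.5, 0), (20.0, 2.0, 1), (10.0, 4.0, 0), (40.0, 1.5, 1)]
patient = (30.0, 1.0, 1)

baseline = value(toy_model, patient, background, set())  # mean prediction over background
phi = shapley_values(toy_model, patient, background)
print("baseline:", baseline)
print("Shapley values:", phi)
print("baseline + sum:", baseline + sum(phi), "prediction:", toy_model(patient))
```

A positive Shapley value pushes the individual's prediction above the baseline and a negative one pulls it below, which is how the direction of each feature's effect is read from the plots.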
Four examples of correctly classified cases (Patient No. 104, No. 208, No. 224 and No. 229) were presented as SHAP decision plots and force plots in Figure 4. The SHAP decision plots simulated the path of the decision, with features entered in the order in which they become available in EMRs. The force plots mainly presented the major factors contributing to the final model output for a given individual. These plots increased the transparency of the predictions made by the GBM algorithm. An online risk calculator to further facilitate external validation is available at http://wb.aidcloud.cn/zssy/aki.html (Figure 5).
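The logic behind a decision plot and a force plot can be sketched without a plotting library: the decision path is the running sum of per-feature SHAP values starting from the baseline, and the force plot highlights the contributions with the largest magnitude. In the sketch below only the 52.08% baseline comes from the text; the feature order and the SHAP values are hypothetical, as are the helper names `decision_path` and `top_contributors`.

```python
from itertools import accumulate

def decision_path(baseline, contributions):
    """Running model output as features are added in the given order.

    contributions: list of (feature_name, shap_value) pairs.
    """
    values = [v for _, v in contributions]
    return list(accumulate(values, initial=baseline))

def top_contributors(contributions, k=3):
    """The k features with the largest absolute SHAP value, as a force plot would emphasise."""
    return sorted(contributions, key=lambda item: abs(item[1]), reverse=True)[:k]

baseline = 0.5208  # average predicted AKI incidence reported for the test set

# Hypothetical per-feature SHAP values for one patient, in an assumed order of availability.
contributions = [
    ("preoperative IBIL", 0.12),
    ("preoperative PLT", 0.04),
    ("graft steatosis", -0.03),
    ("time under general anesthesia", 0.05),
    ("intraoperative urine output", 0.09),
]

path = decision_path(baseline, contributions)
for (name, _), value_after in zip(contributions, path[1:]):
    print(f"after {name}: {value_after:.4f}")
print("largest contributions:", top_contributors(contributions))
```

The last value of the path is the model output for the individual, so the plot shows both the final prediction and how each feature moved it.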