Patient characteristics
The characteristics of patients in the training cohort (n = 343) and the internal validation cohort (n = 171) from Zhongshan Hospital are shown in Table 1. No differences were found between the two cohorts in baseline demographic variables, laboratory tests, or major intraoperative events. There were 16 (4.7%) Child-Pugh class B patients in the training cohort compared with 5 (2.9%) in the internal validation cohort. In the training cohort, 233 (67.9%), 75 (21.9%) and 35 (10.2%) patients were assigned to mALBI grade 1, 2a and 2b, respectively; the corresponding numbers in the internal validation cohort were 120 (70.2%), 41 (24.0%) and 10 (5.8%). The incidence of PHLF was 15.2% in the training cohort, comparable to 12.9% in the internal validation cohort (p = 0.485). The comparison of clinical characteristics between the training cohort and the external validation cohort is shown in Supplementary Table 1. The external validation cohort differed significantly from the training cohort in several variables, including patient age, baseline PT, INR, intraoperative hilar occlusion and intraoperative blood loss (Supplementary Table 1).
Establishment of nomogram for PHLF grade B – C
The results of univariate analysis are shown in Supplementary Table 2. Among the variables analysed, international normalized ratio (INR), cirrhosis, intraoperative blood loss, Child-Pugh class and mALBI grade showed independent predictive significance for PHLF grade B – C (Table 2). These independent risk factors were used to construct a nomogram for predicting severe PHLF (Figure 1). Receiver operating characteristic (ROC) analysis, decision curve analysis (DCA) and calibration curve analysis (CCA) were used to assess the predictive accuracy of the nomogram. The area under the ROC curve was 0.863 (p < 0.001, 95% CI, 0.812 – 0.914) for the nomogram, 0.753 (p < 0.001, 95% CI, 0.676 – 0.829) for ALBI scores and 0.718 (p < 0.001, 95% CI, 0.631 – 0.806) for Child-Pugh scores (Figure 2A). DCA suggested that the nomogram had greater predictive value than the ALBI score and the Child-Pugh score (Figure 2B).
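To make the reported areas under the ROC curve easier to interpret, the following is a minimal sketch of how an AUC can be computed from model scores and observed outcomes. It uses the rank-based (Mann-Whitney) definition: the AUC is the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it. The function name `roc_auc` and the toy data are illustrative assumptions; they are not the study data, and the software the authors used is not stated here.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: model outputs, higher meaning higher predicted risk.
    labels: 1 for patients with the outcome, 0 for patients without it.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 2 events, 2 non-events.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(roc_auc(toy_scores, toy_labels))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to ranking by chance and 1.0 to perfect separation, which is the scale on which the nomogram, ALBI and Child-Pugh values above are compared.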
Validation of the nomogram in internal and external independent cohorts
In the internal validation cohort, the area under the ROC curve was 0.823 (p < 0.001, 95% CI, 0.737 – 0.909) for the nomogram, 0.689 (p = 0.004, 95% CI, 0.572 – 0.805) for ALBI scores and 0.691 (p = 0.004, 95% CI, 0.555 – 0.827) for Child-Pugh scores (Figure 2C). In the external validation cohort, it was 0.740 (p = 0.001, 95% CI, 0.624 – 0.856) for the nomogram, 0.639 (p = 0.044, 95% CI, 0.514 – 0.765) for ALBI scores and 0.619 (p = 0.085, 95% CI, 0.482 – 0.857) for Child-Pugh scores (Figure 2E). Consistent with the training cohort, the DCA curves in both the internal and external validation cohorts showed that the nomogram provided a higher net benefit than the ALBI and Child-Pugh scores (Figure 2D and 2F). Calibration curves showed consistency between predictions and observations in the training and validation cohorts (Figure 3).
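The net benefit compared in the DCA curves can be illustrated with a short sketch. It assumes the standard decision-curve definition, in which the net benefit at a threshold probability pt is TP/n − (FP/n) × pt/(1 − pt), and it also computes the usual "treat all" reference strategy. The function names and the toy values are hypothetical and are not taken from the study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    Standard decision-curve formula: TP/n - (FP/n) * pt / (1 - pt),
    where pt is the threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - weight * fp / n


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every patient is classified as high risk."""
    return net_benefit([1.0] * len(labels), labels, threshold)


# Hypothetical toy data (not from the study).
toy_probs = [0.9, 0.7, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0]
for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_probs, toy_labels, pt), net_benefit_treat_all(toy_labels, pt))
```

A decision curve plots these values over a range of thresholds; a model is preferred at a given threshold when its net benefit exceeds both the "treat all" line and zero (the "treat none" strategy).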
Risk stratification of nomogram
The nomogram score corresponding to the maximum Youden index in the training cohort was taken as the cut-off value. Patients with a score higher than 137.02 were considered to be at high risk of PHLF grade B – C, and those with a lower score were considered to be at low risk. Based on this cut-off, we summarized the probabilities of PHLF grade B – C in the different risk groups in the training and validation cohorts (Table 3). Patients in the high-risk group had a significantly higher frequency of PHLF grade B – C than those in the low-risk group, in both the training cohort and the validation cohort (p < 0.001 for both).
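The cut-off selection described above can be sketched as follows. The Youden index is J = sensitivity + specificity − 1, and the cut-off is the score at which J is largest. This sketch assumes that patients with a score strictly above the cut-off are classified as high risk, matching the rule stated in the text. The function names and the toy scores are hypothetical; they do not reproduce the study data or the reported cut-off of 137.02.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is classified as high risk when score > cutoff.
    Candidate cut-offs are the observed scores; ties in J keep the lowest cut-off.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > cut and y == 0)
        sensitivity = tp / n_pos
        specificity = 1.0 - fp / n_neg
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def risk_group(score, cutoff):
    """Assign a risk group using a previously derived cut-off."""
    return "high" if score > cutoff else "low"


# Hypothetical toy nomogram scores (not from the study).
toy_scores = [90, 110, 120, 140, 150, 160]
toy_labels = [0, 0, 0, 1, 0, 1]
cutoff, j = youden_cutoff(toy_scores, toy_labels)
print(cutoff, j)
print([risk_group(s, cutoff) for s in toy_scores])
```

In this design the cut-off is derived once in the training data and then applied unchanged to the validation data through `risk_group`, so that the validation result is not tuned to the validation cohort.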