Study populations
A total of 446 consecutive patients, identified retrospectively, satisfied the inclusion criteria. They were randomly divided into a research cohort (n = 335) and a validation cohort (n = 111), and each cohort was analyzed separately in the models. Table 1 summarizes the patient demographics for the two cohorts. In the research cohort (median [IQR] age, 69 [63 - 74] years) and the validation cohort (median [IQR] age, 69 [64 - 74] years), the incidence of postoperative BF was 22.39% (n = 75) and 27.02% (n = 30), respectively. Age at biopsy, BMI, PSA, abnormal DRE, PI-RADS v2.1 category, ISUP grade and surgical technique were similar between the research and validation cohorts.
Table 2 summarizes the MRI characteristics for the two cohorts. The research and validation cohorts were similar in zonal location of the index lesion, maximum diameter of the index lesion, MRI EPE, seminal vesicle invasion and clinical stage (P > 0.05).
Prediction model
In the baseline model, the clinical variables PSA, GG3, GG4 and GG5 were independent predictors, and they remained statistically significant in the MRI model (Table 3). The risk of BF was positively associated with PSA and was higher with GG3, GG4, GG5 and a lesion of the central peripheral zone. In both the research and validation cohorts, the calibration plots showed that the MRI model fitted better than the baseline model (Figure 1).
Compared with the baseline model, the MRI model increased the AUC from 0.780 to 0.857 (P < 0.05) in the research cohort (Figure 2A and Table 4) and from 0.753 to 0.865 (P < 0.05) in the validation cohort (Figure 3A and Table 5).
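As a reminder of what the AUC measures, the following minimal sketch computes it as the probability that a randomly chosen patient with BF receives a higher predicted risk than a randomly chosen patient without BF, with ties counted as one half. The function name `auc` and the toy inputs are illustrative assumptions; this is not the software used in the study, and it does not perform the significance test behind the reported P values.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5.

    scores: predicted risks (hypothetical values for illustration)
    labels: 1 = BF observed, 0 = no BF
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie contributes half a concordant pair
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up risks and outcomes, not study data.
    toy_scores = [0.9, 0.4, 0.6, 0.1]
    toy_labels = [1, 1, 0, 0]
    print(auc(toy_scores, toy_labels))
```

The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a sort-based rank formula would give the same value more efficiently.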
The TPR and FPR of the models are shown in Figure 2B for the research cohort. The TPR and FPR of the calibrated risk models (Table 4) are shown in Table 5 and Figure 3B for the validation cohort. The FPR of the MRI model was lower than that of the baseline model, with the smallest loss of TPR.
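For readers who want to see how TPR and FPR follow from a chosen risk threshold, a minimal sketch is given here. It assumes a patient is classified as predicted BF when the predicted risk is at or above the threshold; the function name `tpr_fpr` and the toy data are hypothetical and do not come from the study.

```python
def tpr_fpr(risks, outcomes, threshold):
    """Return (TPR, FPR) when patients with risk >= threshold are flagged.

    risks: predicted probabilities of BF (illustrative values)
    outcomes: 1 = BF observed, 0 = no BF
    """
    tp = fp = fn = tn = 0
    for risk, outcome in zip(risks, outcomes):
        flagged = risk >= threshold
        if flagged and outcome == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif outcome == 1:
            fn += 1
        else:
            tn += 1

    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), fp / (fp + tn)


if __name__ == "__main__":
    # Made-up risks and outcomes, not study data.
    toy_risks = [0.9, 0.8, 0.3, 0.1, 0.6]
    toy_outcomes = [1, 1, 0, 0, 0]
    print(tpr_fpr(toy_risks, toy_outcomes, threshold=0.5))
```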
Decision Curve Analysis (DCA)
Figures 2C and D show the NBs and the NRs in the number of FPs for the research cohort, and Figures 3C and D show the same quantities for the validation cohort. We then applied the MRI model to the validation cohort. When the risk threshold exceeded 10%, the MRI model achieved a higher NB and a greater NR in the number of FPs than both the baseline model and the strategy of assuming BF in every patient (BF all). For instance, at a 20% risk cut-off, the NB was 14 (95% CI, 6-23) in the all model, 14 (95% CI, 7-23) in the baseline model, and 18 (95% CI, 11-28) in the MRI model, and the NR in the number of FPs was 0 in the all model, 19 (95% CI, 6-37) in the baseline model, and 32 (95% CI, 0-56) in the MRI model. The NB of the MRI model was equivalent to identifying 18 BFs per 100 men without any false-positive BF predictions, 4 more than the baseline model. Compared with assuming BF in all patients, the NR in the number of FPs with the MRI model was equivalent to 32 fewer false-positive BF predictions per 100 men, with no increase in the number of missed BFs. Overall, 66% (95% CI, 53%-90%) of BFs could be averted, which was better than the baseline model at this threshold (53% [95% CI, 33%-76%]).
In clinical practice, the risk threshold for RP can be determined after physicians and patients weigh the relative harm of a potentially unnecessary RP against the benefit of identifying postoperative BF. Thus, there is no single risk threshold for deciding who requires RP, but rather a range of risk thresholds. For example, at a 20% risk threshold, a total of 66% of operations could be avoided, while 89% of postoperative BFs would still be identified.
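The NB and NR quantities used in decision curve analysis can be illustrated with the standard formulas: NB = TP/n - (FP/n) x pt/(1 - pt), where pt is the risk threshold, and NR in FPs per 100 patients = (NB of the model - NB of the all strategy) / (pt/(1 - pt)) x 100. The sketch below is a minimal illustration under the assumption that these standard definitions apply here; the function names and toy data are hypothetical, and the code is not intended to reproduce the values reported above.

```python
def _odds(threshold):
    """Threshold odds pt / (1 - pt); the weight given to one false positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return threshold / (1.0 - threshold)


def net_benefit(risks, outcomes, threshold):
    """NB of a model: TP/n - (FP/n) * pt/(1 - pt), flagging risk >= threshold."""
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - _odds(threshold) * fp / n


def net_benefit_all(outcomes, threshold):
    """NB of the 'all' strategy, in which every patient is flagged."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * _odds(threshold)


def net_reduction_fp_per_100(risks, outcomes, threshold):
    """NR in false positives per 100 patients, relative to the 'all' strategy."""
    gain = net_benefit(risks, outcomes, threshold) - net_benefit_all(outcomes, threshold)
    return gain / _odds(threshold) * 100.0


if __name__ == "__main__":
    # Made-up risks and outcomes, not study data.
    toy_risks = [0.9, 0.8, 0.3, 0.1, 0.6]
    toy_outcomes = [1, 1, 0, 0, 0]
    pt = 0.20
    print(net_benefit(toy_risks, toy_outcomes, pt))
    print(net_benefit_all(toy_outcomes, pt))
    print(net_reduction_fp_per_100(toy_risks, toy_outcomes, pt))
```

In the toy data, the model at a 20% threshold leaves one of the three patients without BF unflagged while still flagging both patients with BF, so the NR works out to one fewer false positive among five patients, that is, 20 per 100.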