Comparison of the RENAL, PADUA, DAP, NePhRO, and SPARE scores for retroperitoneal laparoscopic partial nephrectomy.


Background: Nephrometry scores play a critical role in the preoperative evaluation of partial nephrectomy. Although score comparisons have been performed for transperitoneal or open surgery, systematic comparisons for retroperitoneal operations are lacking.
Methods: We retrospectively evaluated the clinical records of patients who underwent partial nephrectomy at one center by one surgeon. Scores were generated according to the imaging results, and each score was categorized into low-, intermediate- and high-complexity groups. Then, the differences in perioperative outcomes were compared among the groups. We assessed whether the scores and sex, body mass index (BMI), age, or American Society of Anesthesiologists (ASA) Physical Status classification could predict whether the warm ischemia time (WIT) was likely to be longer than 20 min and whether they could predict postoperative complications worse than Clavien–Dindo 1. The interobserver variability between two experienced surgeons for these scores was calculated with the intraclass correlation coefficient (ICC).
Results: A total of 107 patients were ultimately evaluated. The median tumor size was 36.88 mm, and the median WIT, total operation time (OT) and estimated blood loss (EBL) were 18.97 min, 80 min, and 100 ml, respectively. Significant differences in WIT were identified among the complexity groups for each scoring system. Significant differences were identified between the OTs associated with the DAP and RENAL score complexity groups. The scores included in this study were significantly associated with the probability of having a WIT >20 min and high-grade postoperative complications. Receiver operating characteristic (ROC) curves showed that there were no significant differences in their predictive power. NePhRO had the highest interobserver agreement (ICC 0.839), followed by DAP (0.827). RENAL (0.758) showed higher agreement than SPARE (0.724) and PADUA (0.667).
Conclusions: The scores included in this study were useful for preoperative assessment of retroperitoneal laparoscopic partial nephrectomy. No significant differences were observed among the scores in terms of their ability to predict prolonged warm ischemia time or high-grade postoperative complications. DAP is a good score in the retroperitoneal circumstance when its consistency is taken into account.


Introduction
Rapid advancements in imaging have allowed for an increasing number of renal carcinomas to be distinguished at the T1 stage. 1 Partial nephrectomy (PN) is the standard treatment for stage 1 renal cancer, and relative to radical nephrectomy, PN may preserve renal function to a greater extent. 2,3 In addition, PN has been found to have comparable cancer-specific survival rates and quality of life (QOL) relative to radical nephrectomy. 4 At present, retroperitoneal laparoscopic partial nephrectomy (RLPN) is the most widely utilized procedure in China for renal carcinomas. This procedure has been improved since it was first introduced in China.
By using this procedure, clinicians can avoid abdominal surgery, bowel mobilization, and drainage into the abdomen and decrease the risk of intraperitoneal organ injury. 5 Compared to radical nephrectomy (RN), PN is more challenging to perform and may be associated with a higher rate of short-term complications. 6,7 Thus, scoring systems such as RENAL, 8 PADUA, 9 DAP, 10 and NePhRO 11 have been utilized to evaluate the morphologic features of renal carcinomas by radiography and to assess their surgical complexity, 12,13 thereby helping clinicians make appropriate operative decisions.
From RENAL to the latest ABC 14 and SPARE 15 scoring systems, these systems are constantly being optimized. 16 These scoring systems are widely used and were designed with open or transperitoneal operations in mind. Unfortunately, there is no consensus regarding whether the application of these new scores has any apparent advantage; furthermore, there is no standard approach to choosing the most suitable scoring system. Systematic studies on the use of these scores in retroperitoneal laparoscopic partial nephrectomy (RLPN) are particularly limited. Thus, to investigate the use of these scores in RLPN, we retrospectively analyzed the correlations between the evaluated scores and intraoperative outcomes of RLPN performed by a single operator at a single center.

Patients and methods
We retrospectively analyzed data from patients who underwent retroperitoneal laparoscopic partial nephrectomy in Chaoyang Hospital between 2014 and 2017. The electronic medical record data were available for all included patients. All patients underwent enhanced abdominal computed tomography (CT) or enhanced magnetic resonance imaging (MRI). Enhanced CT scan thicknesses of 0.5 mm and 0.1 mm were used to ensure the definition of the reconstructed images, which included the transverse plane and reconstructed coronal data. Arterial, venous and delayed phase data were also included to ensure that the elements in the scores were accurately measured.
To ensure the accuracy of the scores, patients who received an enhanced CT at other institutions without qualified electronic image data were excluded. All the included patients had single unilateral tumors, and none of the included patients had metastatic or locally advanced tumors. Patients with polycystic kidney disease were not included, and patients with severe pelvic and spinal deformities that affected surgery were excluded. Patients with a history of retroperitoneal surgery on the affected side were not included.
The demographic features of the patients were collected and included sex, age, and BMI. Images were reviewed electronically by experienced urologists, and the morphological features of the tumors were identified and evaluated according to established criteria. 8,9,10,11,15 Then, scores were generated according to the tumor features observed. Each patient's American Society of Anesthesiologists (ASA) score, WIT, OT, EBL, and postoperative complications were obtained from the original operation data and electronic medical records. The scores were classified into low-, moderate- and high-complexity groups based on the criteria, and differences in perioperative outcomes were subsequently compared among the different score groups. The ability of the scores and factors such as sex, age, body mass index (BMI), and ASA Physical Status classification to predict whether WIT would last longer than 20 min and the presence of high-grade postoperative complications were also assessed. ROC curves were used to compare the predictive value of each score.
The images were scored by two experienced surgeons who were blinded to the patients' demographics, surgical procedure and outcome, and the scores of the other surgeon. Interobserver agreement was calculated with intraclass correlation coefficients (ICC).
All operations were performed by the same surgeon, who is an experienced urologist. All procedures were performed according to routine surgical practices. Some parts of the procedure were improved based on our experience. 17,18,19 ASA scores were collaboratively assigned by both the urologist and the anesthesiologist.
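As an illustration of the interobserver agreement calculation, the following is a minimal sketch of one common ICC form. The text does not state which ICC model was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater, in the Shrout and Fleiss notation) is an assumption, as are the function name `icc_2_1` and the example ratings, which are invented for illustration and are not study data.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per patient and one column per reader.
    The ICC form is an assumption; the paper does not specify it.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_err) / denom


# Hypothetical total scores assigned by two readers to five tumors.
example = [(4, 4), (7, 8), (10, 10), (6, 6), (9, 8)]
print(f"ICC(2,1) = {icc_2_1(example):.3f}")
```

A value close to 1 indicates that the two readers assign nearly identical scores; a different ICC form (for example, a consistency-type ICC) would use a different denominator.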

Statistical analysis
Continuous data are shown as medians and interquartile ranges (IQRs) because they were not normally distributed. Differences between complexity groups in WIT, OT, and EBL were analyzed using the Kruskal-Wallis test. Multivariable logistic regression models were used to estimate the probability of having a WIT > 20 min and postoperative complications based on the evaluated clinical characteristics, including age, sex, ASA score, BMI, and the RENAL, PADUA, DAP, Zonal NePhRO, and SPARE scores. Receiver operating characteristic (ROC) curves were generated for the probability of having a WIT of > 20 min and postoperative complications worse than Clavien-Dindo 1, and the differences among the scoring systems were compared by using the area under the curve (AUC) values. The differences of the areas under the ROC curves were compared with the Z test. Intraclass correlation coefficients were used to assess the interobserver variability of the scores of different readers. Statistical significance was set at P
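The AUC used above can be computed directly from its rank interpretation: it is the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below illustrates this for the WIT > 20 min endpoint. The function name `roc_auc` and the example values are hypothetical and are not study data, and the Z test for comparing AUCs is not shown.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], outcomes: Sequence[bool]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative case count as half a correct pair.
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    correct = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return correct / (len(pos) * len(neg))


# Hypothetical example: dichotomize WIT at 20 min, then score discrimination.
wit_minutes = [15.0, 18.0, 22.0, 25.0, 19.0, 30.0]
nephrometry_totals = [5, 6, 8, 10, 7, 9]
prolonged_wit = [w > 20 for w in wit_minutes]
print(f"AUC = {roc_auc(nephrometry_totals, prolonged_wit):.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of about a hundred cases; a rank-based implementation would be preferable for much larger data sets.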

Results
From January 2014 to December 2017, 242 patients underwent partial nephrectomy. After exclusion of patients who did not meet the inclusion criteria, a total of 107 patients who had complete perioperative data and the appropriate imaging data were ultimately included in this study and were subsequently evaluated. The demographic features of these patients are shown in Table 1. Table 2 shows the complexity distributions of the patients' RENAL, PADUA, DAP, NePhRO, and SPARE scores. The RENAL and SPARE scales demonstrated a preponderance of low- and middle-complexity groups, the DAP score demonstrated a preponderance of middle- and high-complexity groups, while the PADUA and NePhRO scores were more homogeneously distributed among the three groups. The differences in intraoperative outcomes (WIT, OT and EBL) among the low-, intermediate- and high-complexity groups were evaluated for all of the scoring systems. The WITs differed significantly among the three groups for each scoring system. The results of the Mann-Whitney U-test with Bonferroni correction showed that the WITs were significantly shorter in the low-complexity groups than in the high-complexity groups for each scoring system. In addition, only the DAP and NePhRO scoring systems had significantly shorter WITs in the low-complexity groups than in both the middle- and high-complexity groups. OT differed significantly among the three groups only for the RENAL, DAP, and SPARE scoring systems. There were no significant differences in EBL among the three groups for any scoring system.
Table 3 summarizes the multivariate model that was generated using sex, age, BMI, ASA score, tumor size, RENAL score, PADUA score, DAP score, NePhRO score, and SPARE score to predict whether WIT would last longer than 20 min and to predict postoperative complications worse than Clavien-Dindo 1.
The results of the multivariate regression analyses revealed that sex, RENAL score, PADUA score, DAP score, Zonal NePhRO score, and SPARE score exhibited a significant correlation with WIT > 20 min. In addition, these scores showed a significant correlation with postoperative complications worse than Clavien-Dindo 1 (not included in the table).
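The three-group comparison of WIT reported above relies on the Kruskal-Wallis test. A minimal sketch of the statistic follows, assuming exactly three complexity groups, so that the chi-square approximation has 2 degrees of freedom and the p-value reduces to exp(-H/2). The function name `kruskal_wallis_3` and the example values are hypothetical and are not study data.

```python
import math
from typing import Sequence, Tuple


def kruskal_wallis_3(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Tie-corrected Kruskal-Wallis H and its chi-square p-value for 3 groups."""
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("expected exactly three non-empty groups")

    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction

    # With 2 degrees of freedom, the chi-square survival function is exp(-x/2).
    return h, math.exp(-h / 2)


# Hypothetical WIT values (min) for low-, intermediate- and high-complexity groups.
h_stat, p_value = kruskal_wallis_3([[12, 15, 16], [17, 18, 19], [21, 24, 26]])
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A significant result only indicates that at least one group differs; the pairwise Mann-Whitney U-tests with Bonferroni correction described above are what localize the difference.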

Discussion
To obtain better surgical outcomes, preoperative evaluation of partial nephrectomy is critical, and a scoring system is an effective tool for evaluation. 13 Since the publication of the RENAL scoring system in 2009, 8 it has played an essential role in standardized preoperative evaluations, strengthening the comparability among different partial nephrectomies and facilitating communication. 20 The SPARE score may reflect the trend of score development. The SPARE scoring system streamlines the elements of the score, removing the less consistent element of polar location and removing the involvement of the urinary collecting system (UCS), which is difficult to determine. 15 In most previous nephrometry scoring systems, each element is assigned 1, 2, or 3 points. SPARE abandoned this approach and uses regression analysis to calculate different assignments. In previous studies on retroperitoneal laparoscopic scoring, early modifications adapted the RENAL score according to surgical experience, 21 and these may not be widely used. The DDD score is a novel score based on retroperitoneal PN. 22 D1 increases the weight of the tumor diameter in the score, and D3 incorporates the advantages of the ABC score. Although there are no studies on the consistency of the DDD score, it showed excellent consistency in our small-scale preliminary experiments. RNP may be the latest scoring system developed for retroperitoneal nephrectomy; 23 it adds elements of MAP, 24 which is inconsistent with the original intention of the authors who developed the MAP scoring system. The use of RNP requires further clinical validation.
Current research on comparisons of various scoring systems focuses on open surgery and robot-assisted laparoscopic surgery. However, systematic comparisons in the retroperitoneal laparoscopic environment are lacking, and currently, there are no comparisons of scores for RLPN. The novel DDD and RNP scoring systems have been designed for retroperitoneal laparoscopic partial nephrectomy and are more straightforward and easier to use than the previous nephrometry scoring systems. In single-center retrospective studies, 22,23 the predictive effects of the DDD and RNP scoring systems were similar to those of the RENAL scoring system. Whether the DDD and RNP scoring systems have any obvious advantages may require further verification.
RLPN has characteristics distinct from those of transperitoneal laparoscopic partial nephrectomy (TLPN). The retroperitoneal space is relatively narrow, and although it is more difficult to resect a tumor in the lower pole than in the upper pole, RLPN facilitates exposure of the renal artery without occlusion of the renal vein. Because of these characteristics and advantages, RLPN is a good surgical approach. If the surgeon has sufficient experience and the proper technique is applied, the operation time may be shorter and the blood loss lower with RLPN than with TLPN, and the postoperative results and oncologic effects of these techniques are similar. 25,26 Despite advances in robot-assisted surgery, retroperitoneal laparoscopic partial nephrectomy is still the standard and most popular procedure in many areas. 27 Currently, RENAL and PADUA are still the most widely used scores, DAP and NePhRO are the most popular scores among the second-generation scoring systems, and SPARE is the most recent innovation. After ten years of optimization, it remains unclear which score is most advantageous and most suitable for the retroperitoneal laparoscopic environment. Thus, the goal of our study was to perform a comparison of these scores.
In the preoperative evaluation of partial nephrectomy, we typically use scores to predict the difficulty of the operation, the warm ischemia time and the possibility of high-grade complications. Most studies consider complications of Clavien-Dindo grade ≥ 2, and a warm ischemia time (WIT) ≤ 20 min has been used as a criterion for Trifecta outcomes. 28 Therefore, we compared the predictive ability of the different scores for these factors.
Although there were some differences in the AUCs for predicting high-grade complications and a WIT > 20 min, the differences were not significant, which is broadly consistent with the results of existing research. In a previous study that compared the RENAL, PADUA, and NePhRO scores in open PN, 29 these scores were found to be significantly associated with ischemia time. Except for the C-index, the other scores were identified as being correlated with serious complications. In another study, RENAL and DAP were compared for laparoscopic partial nephrectomy, and DAP was found to be better correlated than RENAL with warm ischemia time and estimated blood loss. 30 Because the SPARE score is a novel score that is well accepted, we hypothesized that it may perform better than PADUA and RENAL. However, one study showed that SPARE had no significant advantage over the other two classic scores for predicting EBL and ischemia time outcomes in PN. 20 The SPARE score is less involved and may be easier to calculate than the PADUA score. Nevertheless, the ability of the SPARE score to predict complications in PN is similar to that of the PADUA score. 15
Another important aspect of scoring is standardization, which increases comparability and communication. Thus, the consistency of the score is also an important issue that we need to consider. 20
In our study of interobserver variability, DAP, with its relatively simple design and fewer scoring elements than the other systems, had relatively good consistency. The consistency of NePhRO was also excellent, possibly because its zoning concepts are more direct and aligned with the way clinicians think. PADUA, on the other hand, has more elements than the other systems, with poor consistency for polar location. Meanwhile, SPARE gains consistency after streamlining.
Streamlining the parameters of the currently used scoring systems may improve consistency if the selected parameters are easy for clinicians to grasp. However, if the selected parameters are not clearly defined and are not easy to learn, the consistency of the scoring system may decline. Therefore, to improve scoring, on the one hand, the scoring system should be simple and easy to learn and remember, and on the other hand, the scoring system should improve the ability to predict the difficulty and complications of surgery and should also improve consistency as much as possible. With the improvement of techniques in all aspects of surgery, scoring has become more challenging.
As far as we know, this is one of the first investigations to evaluate various scores solely for RLPN. We completed a comprehensive evaluation of the most commonly used scoring systems, providing theoretical support for the use of these scores in retroperitoneal circumstances. Our study had some limitations, which should be noted. First, the number of patients recruited from our single institution was relatively small. As these operations took place over a prolonged period, the surgeon continuously developed his operative skills, and there were very few operations with a WIT > 20 min. This result is close to the average observed in clinical institutions across China. In addition, some other widely used scores and scores specifically designed for RLPN, such as the ABC, DDD and RNP scores, should have been included. Finally, this was a single surgeon, single center retrospective study, which has inherent limitations in its research design.

Conclusion
We verified the capacity of the RENAL, PADUA, DAP, NePhRO, and SPARE scores to predict perioperative outcomes of RLPN. Despite ten years of unrelenting effort, the current scores still cannot replace RENAL and PADUA. DAP is a good score in the retroperitoneal circumstance when the consistency and ease of use of the score are taken into account. Larger prospective investigations are needed to validate these nephrometry scores for RLPN and to optimize the scores based on experimental data. In further studies, some of the new scores designed for the retroperitoneal environment will need to be evaluated with a large sample, and the scoring assignments may be more reasonable if the scores can be statistically calculated.

describes the AUC values for each nephrometric score in predicting the probability of having a WIT >20 min and postoperative complications worse than Clavien-Dindo 1.